A reliable multicast is any computer networking protocol that provides a ''reliable'' sequence of packets to multiple recipients simultaneously, making it suitable for applications such as multi-receiver file transfer.


Overview

Multicast is a network addressing method for the delivery of information to a group of destinations simultaneously. It uses the most efficient strategy to deliver the messages over each link of the network only once, creating copies only where the links to the multiple destinations split (typically at network switches and routers). However, like the User Datagram Protocol, multicast does not guarantee the delivery of a message stream. Messages may be dropped, delivered multiple times, or delivered out of order. A reliable multicast protocol adds the ability for receivers to detect lost and/or out-of-order messages and take corrective action (similar in principle to TCP), resulting in a gap-free, in-order message stream.
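The receiver-side behavior described above can be illustrated with a minimal Python sketch. It assumes that every message carries a sequence number assigned by the sender; the class name `InOrderReceiver` and its `receive` method are hypothetical and not taken from any particular protocol. Early arrivals are buffered, duplicates are dropped, and the receiver reports which sequence numbers are still missing so that some corrective action (for example, a retransmission request) can follow.

```python
from typing import Dict, List, Tuple


class InOrderReceiver:
    """Hypothetical helper: reorders sequence-numbered messages and reports gaps."""

    def __init__(self, first_seq: int = 0) -> None:
        self.next_seq = first_seq            # next sequence number owed to the application
        self.pending: Dict[int, bytes] = {}  # messages that arrived ahead of a gap

    def receive(self, seq: int, payload: bytes) -> Tuple[List[bytes], List[int]]:
        """Accept one message; return (payloads now deliverable, missing sequence numbers)."""
        # Stale messages (already delivered) and duplicates are silently ignored.
        if seq >= self.next_seq and seq not in self.pending:
            self.pending[seq] = payload

        # Deliver the longest gap-free run starting at next_seq.
        delivered: List[bytes] = []
        while self.next_seq in self.pending:
            delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1

        # Anything below the highest buffered number that is not buffered is a gap.
        missing: List[int] = []
        if self.pending:
            missing = [s for s in range(self.next_seq, max(self.pending))
                       if s not in self.pending]
        return delivered, missing
```

For example, if messages 0, 2 and 1 arrive in that order, message 0 is delivered at once, message 2 is held back while 1 is reported missing, and the arrival of 1 releases both 1 and 2 in order.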


Reliability

The exact meaning of ''reliability'' depends on the specific protocol instance. A minimal definition of reliable multicast is ''eventual delivery of all the data to all the group members, without enforcing any particular delivery order''. However, not all reliable multicast protocols ensure this level of reliability; many of them trade efficiency for reliability, in different ways. For example, while TCP makes the sender responsible for transmission reliability, multicast NAK-based protocols shift the responsibility to receivers: the sender never knows for sure that all the receivers have in fact received all the data. RFC 2887 explores the design space for bulk data transfer, with a brief discussion of the various issues and some hints at the possible different meanings of ''reliable''.
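The sender side of a NAK-based scheme might look like the following Python sketch. It is an illustration under stated assumptions, not the mechanism of any named protocol: `NakSender`, `send`, `handle_nak` and the `history_size` parameter are all hypothetical. The sender keeps no per-receiver state, only a bounded history of recent packets, and retransmits solely when a receiver reports a gap.

```python
from collections import OrderedDict
from typing import Iterable, List, Tuple


class NakSender:
    """Hypothetical sender for a NAK-based scheme: retransmits only on request."""

    def __init__(self, history_size: int = 1024) -> None:
        self.history_size = history_size
        self.history: "OrderedDict[int, bytes]" = OrderedDict()  # seq -> payload
        self.next_seq = 0

    def send(self, payload: bytes) -> Tuple[int, bytes]:
        """Assign the next sequence number and remember the packet for repairs."""
        seq = self.next_seq
        self.next_seq += 1
        self.history[seq] = payload
        if len(self.history) > self.history_size:
            self.history.popitem(last=False)  # evict the oldest packet
        return seq, payload

    def handle_nak(self, missing: Iterable[int]) -> List[Tuple[int, bytes]]:
        """Return the requested packets that are still held in the history."""
        return [(s, self.history[s]) for s in missing if s in self.history]
```

Because nothing is acknowledged positively, the sender in this sketch cannot tell whether a silent receiver has everything or has simply left the group, and a packet evicted from the history can no longer be repaired. This is one way in which a design can give up some reliability in exchange for lower sender load.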


Reliable Group Data Delivery

Reliable Group Data Delivery (RGDD) is a form of multicasting where an object is to be moved from a single source to a fixed set of receivers known before transmission begins. A variety of applications may need such delivery: Hadoop Distributed File System (HDFS) replicates any chunk of data two additional times to specific servers; VM replication to multiple servers may be required for scale-out of applications; and data replication to multiple servers may be necessary for load balancing, allowing multiple servers to serve the same data from their local cached copies. Such delivery is frequent within datacenters because of the large number of servers communicating while running highly distributed applications. RGDD may also occur across datacenters, where it is sometimes referred to as inter-datacenter Point to Multipoint (P2MP) transfers. Such transfers deliver huge volumes of data from one datacenter to multiple datacenters for various applications: search engines distribute search index updates periodically (e.g. every 24 hours), social media applications push new content to many cache locations across the world (e.g. YouTube and Facebook), and backup services make several geographically dispersed copies for increased fault tolerance. To maximize bandwidth utilization and reduce completion times of bulk transfers, a variety of techniques have been proposed for selection of multicast forwarding trees.
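As a rough illustration of what a forwarding tree is, the Python sketch below builds one for a fixed receiver set by taking the union of hop-count shortest paths from the source, found with breadth-first search. This is only a simple baseline chosen for illustration, not one of the proposed selection techniques, which typically also weigh link load and completion time. The function name `shortest_path_tree` and the adjacency-list graph format are assumptions.

```python
from collections import deque
from typing import Dict, Iterable, List, Optional, Set, Tuple


def shortest_path_tree(graph: Dict[str, List[str]],
                       source: str,
                       receivers: Iterable[str]) -> Set[Tuple[str, str]]:
    """Return the directed links (parent, child) of a hop-count shortest-path tree."""
    # Breadth-first search records, for every reachable node, its parent on a
    # shortest path back to the source.
    parent: Dict[str, Optional[str]] = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)

    # Walk back from each receiver; shared links are added to the set only once,
    # which mirrors sending the data over each link a single time.
    links: Set[Tuple[str, str]] = set()
    for receiver in receivers:
        if receiver not in parent:
            raise ValueError(f"receiver {receiver!r} is unreachable from {source!r}")
        node = receiver
        while parent[node] is not None:
            links.add((parent[node], node))
            node = parent[node]
    return links
```

On a topology where the source reaches three receivers through a shared first hop, the tree uses each shared link once, whereas three separate unicast transfers would cross the shared links repeatedly.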


Virtual synchrony

Modern systems like the Spread Toolkit, Quicksilver, and Corosync can achieve data rates of 10,000 multicasts per second or more, and can scale to large networks with huge numbers of groups or processes. Most distributed computing platforms support one or more of these models. For example, the widely supported object-oriented CORBA platforms all support transactions, and some CORBA products support transactional replication in the one-copy-serializability model. The CORBA Fault Tolerant Objects standard is based on the virtual synchrony model. Virtual synchrony was also used in developing the New York Stock Exchange fault-tolerance architecture, the French Air Traffic Control System, the US Navy AEGIS system, IBM's Business Process replication architecture for WebSphere, and Microsoft's Windows Clustering architecture for Windows Longhorn enterprise servers.


Systems that support virtual synchrony

Virtual synchrony was first supported by Cornell University in a system called the Isis Toolkit. Cornell's most current version, Vsync, was released in 2013 under the name Isis2 (the name was changed from Isis2 to Vsync in 2015 in the wake of a terrorist attack in Paris by an extremist organization called ISIS), with periodic updates and revisions since that time. The most current stable release is V2.2.2020, released on November 14, 2015; the V2.2.2048 release is currently available in beta form. Vsync aims at the massive data centers that support cloud computing. Other such systems include the Horus system, the Transis system, the Totem system, an IBM system called Phoenix, a distributed security key management system called Rampart, the Ensemble system, the Quicksilver system, the OpenAIS project, its derivative the Corosync Cluster Engine, and a number of products (including the IBM and Microsoft ones mentioned earlier).


Other existing or proposed protocols

* Pragmatic General Multicast (PGM)
* Tibco Software's TRDP (part of RV). Note: when Tibco acquired Talarian, they inherited a PGM implementation with SmartSockets (SmartPGM). TRDP pre-dates the development of SmartPGM.
* OpenDDS as an open source implementation since their 0.12 release
* Scalable Reliable Multicast (SRM)
* QuickSilver Scalable Multicast (QSM)
* SMART Multicast (Secure Multicast for Advanced Repeating of Television)
* Reliable Stream Protocol (RSP), a high-performance open source protocol for compute clusters
* TIPC Communication Groups


Library support

* JGroups (Java API): popular project/implementation
* Spread: C/C++ API, Java API
* RMF (C# API)
* hmbdc: open source (headers only) C++ middleware, ultra-low latency/high throughput, scalable and reliable inter-thread, IPC and network messaging


Further reading

* Reliable Distributed Systems: Technologies, Web Services and Applications. K.P. Birman. Springer Verlag (1997). ''Textbook, covers a broad spectrum of distributed computing concepts, including virtual synchrony.''
* Distributed Systems: Principles and Paradigms (2nd Edition). Andrew S. Tanenbaum, Maarten van Steen (2002). ''Textbook, covers a broad spectrum of distributed computing concepts, including virtual synchrony.''
* "The process group approach to reliable distributed computing". K.P. Birman. Communications of the ACM 36:12 (Dec. 1993). ''Written for non-experts.''
* "Group communication specifications: a comprehensive study". Gregory V. Chockler, Idit Keidar, Roman Vitenberg. ACM Computing Surveys 33:4 (2001). ''Introduces a mathematical formalism for these kinds of models, then uses it to compare their expressive power and their failure detection assumptions.''
* "The part-time parliament". Leslie Lamport. ACM Transactions on Computer Systems (TOCS), 16:2 (1998). ''Introduces the Paxos implementation of replicated state machines.''
* "Exploiting virtual synchrony in distributed systems". K.P. Birman and T. Joseph. Proceedings of the 11th ACM Symposium on Operating Systems Principles (SOSP), Austin, Texas, Nov. 1987. ''Earliest use of the term, but probably not the best exposition of the topic.''