HOME

TheInfoList



OR:

In
software architecture Software architecture is the fundamental structure of a software system and the discipline of creating such structures and systems. Each structure comprises software elements, relations among them, and properties of both elements and relations. ...
, publish–subscribe is a
messaging pattern In software architecture, a messaging pattern is an architectural pattern which describes how two different parts of an application, or different systems connect and communicate with each other. There are many aspects to the concept of messaging w ...
where senders of messages, called publishers, do not program the messages to be sent directly to specific receivers, called subscribers, but instead categorize published messages into classes without knowledge of which subscribers, if any, there may be. Similarly, subscribers express interest in one or more classes and only receive messages that are of interest, without knowledge of which publishers, if any, there are. Publish–subscribe is a sibling of the
message queue In computer science, message queues and mailboxes are software-engineering components typically used for inter-process communication (IPC), or for inter- thread communication within the same process. They use a queue for messaging – the ...
paradigm, and is typically one part of a larger
message-oriented middleware Message-oriented middleware (MOM) is software or hardware infrastructure supporting sending and receiving messages between distributed systems. MOM allows application modules to be distributed over heterogeneous platforms and reduces the complex ...
system. Most messaging systems support both the pub/sub and message queue models in their
API An application programming interface (API) is a way for two or more computer programs to communicate with each other. It is a type of software Interface (computing), interface, offering a service to other pieces of software. A document or standa ...
; e.g.,
Java Message Service The Jakarta Messaging API (formerly Java Message Service or JMS API) is a Java application programming interface (API) for message-oriented middleware. It provides generic messaging models, able to handle the producer–consumer problem, that can ...
(JMS). This pattern provides greater network
scalability Scalability is the property of a system to handle a growing amount of work by adding resources to the system. In an economic context, a scalable business model implies that a company can increase sales given increased resources. For example, a ...
and a more dynamic
network topology Network topology is the arrangement of the elements ( links, nodes, etc.) of a communication network. Network topology can be used to define or describe the arrangement of various types of telecommunication networks, including command and contro ...
, with a resulting decreased flexibility to modify the publisher and the structure of the published data.


Message filtering

In the publish-subscribe model, subscribers typically receive only a subset of the total messages published. The process of selecting messages for reception and processing is called ''filtering''. There are two common forms of filtering: topic-based and content-based. In a topic-based system, messages are published to "topics" or named logical channels. Subscribers in a topic-based system will receive all messages published to the topics to which they subscribe. The publisher is responsible for defining the topics to which subscribers can subscribe. In a content-based system, messages are only delivered to a subscriber if the attributes or content of those messages matches constraints defined by the subscriber. The subscriber is responsible for classifying the messages. Some systems support a hybrid of the two; publishers post messages to a topic while subscribers register content-based subscriptions to one or more topics.


Topologies

In many publish-subscribe systems, publishers post messages to an intermediary message broker or event bus, and subscribers register subscriptions with that broker, letting the broker perform the filtering. The broker normally performs a
store and forward Store and forward is a telecommunications technique in which information is sent to an intermediate station where it is kept and sent at a later time to the final destination or to another intermediate station. The intermediate station, or node in ...
function to route messages from publishers to subscribers. In addition, the broker may prioritize messages in a
queue __NOTOC__ Queue () may refer to: * Queue area, or queue, a line or area where people wait for goods or services Arts, entertainment, and media *''ACM Queue'', a computer magazine * ''The Queue'' (Sorokin novel), a 1983 novel by Russian author ...
before routing. Subscribers may register for specific messages at build time, initialization time or runtime. In GUI systems, subscribers can be coded to handle user commands (e.g., click of a button), which corresponds to build time registration. Some frameworks and software products use
XML Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing arbitrary data. It defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. T ...
configuration files to register subscribers. These configuration files are read at initialization time. The most sophisticated alternative is when subscribers can be added or removed at runtime. This latter approach is used, for example, in
database trigger A database trigger is procedural code that is automatically executed in response to certain events on a particular table or view in a database. The trigger is mostly used for maintaining the integrity of the information on the database. For example ...
s,
mailing list A mailing list is a collection of names and addresses used by an individual or an organization to send material to multiple recipients. The term is often extended to include the people subscribed to such a list, so the group of subscribers is re ...
s, and
RSS RSS ( RDF Site Summary or Really Simple Syndication) is a web feed that allows users and applications to access updates to websites in a standardized, computer-readable format. Subscribing to RSS feeds can allow a user to keep track of many di ...
. The
Data Distribution Service The Data Distribution Service (DDS) for real-time systems is an Object Management Group (OMG) machine-to-machine (sometimes called middleware or connectivity framework) standard that aims to enable dependable, high-performance, interoperable, re ...
(DDS) middleware does not use a broker in the middle. Instead, each publisher and subscriber in the pub/sub system shares meta-data about each other via
IP multicast IP multicast is a method of sending Internet Protocol (IP) datagrams to a group of interested receivers in a single transmission. It is the IP-specific form of multicast and is used for streaming media and other network applications. It uses speci ...
. The publisher and the subscribers cache this information locally and route messages based on the discovery of each other in the shared cognizance. In effect, brokerless architectures require publish/subscribe system to construct an overlay network which allows efficient decentralized routing from publishers to subscribers. It was shown by
Jon Kleinberg Jon Michael Kleinberg (born 1971) is an American computer scientist and the Tisch University Professor of Computer Science and Information Science at Cornell University known for his work in algorithms and networks. He is a recipient of the Nevanl ...
that efficient decentralised routing requires Navigable Small-World topologies. Such Small-World topologies are usually implemented by decentralized or federated publish/subscribe systems. Locality-aware publish/subscribe systems construct Small-World topologies that route subscriptions through short-distance and low-cost links thereby reducing subscription delivery times.


History

One of the earliest publicly described pub/sub systems was the "news" subsystem of the Isis Toolkit, described at the 1987
Association for Computing Machinery The Association for Computing Machinery (ACM) is a US-based international learned society for computing. It was founded in 1947 and is the world's largest scientific and educational computing society. The ACM is a non-profit professional member ...
(ACM) Symposium on Operating Systems Principles conference (SOSP '87), in a paper "Exploiting
Virtual Synchrony A reliable multicast is any computer networking protocol that provides a '' reliable'' sequence of packets to multiple recipients simultaneously, making it suitable for applications such as multi-receiver file transfer. Overview Multicast is a net ...
in
Distributed Systems A distributed system is a system whose components are located on different networked computers, which communicate and coordinate their actions by passing messages to one another from any system. Distributed computing is a field of computer sci ...
. 123–138".


Advantages


Loose coupling

Publishers are loosely coupled to subscribers, and need not even know of their existence. With the topic being the focus, publishers and subscribers are allowed to remain ignorant of system topology. Each can continue to operate as per normal independently of the other. In the traditional tightly coupled client–server paradigm, the client cannot post messages to the server while the server process is not running, nor can the server receive messages unless the client is running. Many pub/sub systems decouple not only the locations of the publishers and subscribers but also decouple them temporally. A common strategy used by
middleware analyst Middleware analysts are computer software engineers with a specialization in products that connect two different computer systems together. These products can be open-source or proprietary. As the term implies, the software, tools, and technologies ...
s with such pub/sub systems is to take down a publisher to allow the subscriber to work through the backlog (a form of
bandwidth throttling Bandwidth throttling consists in the intentional limitation of the communication speed (bytes or kilobytes per second) of the ingoing (received) data and/or in the limitation of the speed of outgoing (sent) data in a network node or in a network ...
).


Scalability

Pub/sub provides the opportunity for better
scalability Scalability is the property of a system to handle a growing amount of work by adding resources to the system. In an economic context, a scalable business model implies that a company can increase sales given increased resources. For example, a ...
than traditional client-server, through parallel operation, message caching, tree-based or network-based routing, etc. However, in certain types of tightly coupled, high-volume enterprise environments, as systems scale up to become data centers with thousands of servers sharing the pub/sub infrastructure, current vendor systems often lose this benefit; scalability for pub/sub products under high load in these contexts is a research challenge. Outside of the enterprise environment, on the other hand, the pub/sub paradigm has proven its scalability to volumes far beyond those of a single data center, providing Internet-wide distributed messaging through web syndication protocols such as
RSS RSS ( RDF Site Summary or Really Simple Syndication) is a web feed that allows users and applications to access updates to websites in a standardized, computer-readable format. Subscribing to RSS feeds can allow a user to keep track of many di ...
and
Atom Every atom is composed of a nucleus and one or more electrons bound to the nucleus. The nucleus is made of one or more protons and a number of neutrons. Only the most common variety of hydrogen has no neutrons. Every solid, liquid, gas, and ...
. These syndication protocols accept higher latency and lack of delivery guarantees in exchange for the ability for even a low-end web server to syndicate messages to (potentially) millions of separate subscriber nodes.


Disadvantages

The most serious problems with pub/sub systems are a side-effect of their main advantage: the decoupling of publisher from subscriber.


Message delivery issues

A pub/sub system must be designed carefully to be able to provide stronger system properties that a particular application might require, such as assured delivery. * The broker in a pub/sub system may be designed to deliver messages for a specified time, but then stop attempting delivery, whether or not it has received confirmation of successful receipt of the message by all subscribers. A pub/sub system designed in this way cannot guarantee delivery of messages to any applications that might require such assured delivery. Tighter coupling of the designs of such a publisher and subscriber pair must be enforced outside of the pub/sub architecture to accomplish such assured delivery (e.g. by requiring the subscriber to publish receipt messages). * A publisher in a pub/sub system may assume that a subscriber is listening, when in fact it is not. A factory may utilize a pub/sub system where equipment can publish problems or failures to a subscriber that displays and logs those problems. If the logger fails (crashes), equipment problem publishers won't necessarily receive notice of the logger failure, and error messages will not be displayed or recorded by any equipment on the pub/sub system. This is also a design challenge for alternative messaging architectures, such as a client/server system. In a client/server system, when an error logger fails, the system will receive an indication of the error logger (server) failure. However, the client/server system will have to deal with that failure by having redundant logging servers online, or by dynamically spawning fallback logging servers. This adds complexity to the client and server designs, as well as to the client/server architecture as a whole. However, in a pub/sub system, redundant logging subscribers that are exact duplicates of the existing logger can be added to the system to increase logging reliability without any impact to any other equipment on the system. In a pub/sub system, the feature of assured error message logging can be added incrementally, subsequent to implementing the basic functionality of equipment problem message logging. The pub/sub pattern scales well for small networks with a small number of publisher and subscriber nodes and low message volume. However, as the number of nodes and messages grows, the likelihood of instabilities increases, limiting the maximum scalability of a pub/sub network. Example throughput instabilities at large scales include: * Load surges—periods when subscriber requests saturate network throughput followed by periods of low message volume (underutilized network bandwidth) * Slowdowns—as more and more applications use the system (even if they are communicating on separate pub/sub channels) the message volume flow to an individual subscriber will slow For pub/sub systems that use brokers (servers), the argument for a broker to send messages to a subscriber is
in-band In telecommunications, in-band signaling is the sending of control information within the same band or channel used for data such as voice or video. This is in contrast to out-of-band signaling which is sent over a different channel, or even o ...
, and can be subject to security problems. Brokers might be fooled into sending notifications to the wrong client, amplifying denial of service requests against the client. Brokers themselves could be overloaded as they allocate resources to track created subscriptions. Even with systems that do not rely on brokers, a subscriber might be able to receive data that it is not authorized to receive. An unauthorized publisher may be able to introduce incorrect or damaging messages into the pub/sub system. This is especially true with systems that
broadcast Broadcasting is the distribution of audio or video content to a dispersed audience via any electronic mass communications medium, but typically one using the electromagnetic spectrum ( radio waves), in a one-to-many model. Broadcasting began ...
or
multicast In computer networking, multicast is group communication where data transmission is addressed to a group of destination computers simultaneously. Multicast can be one-to-many or many-to-many distribution. Multicast should not be confused with ...
their messages.
Encryption In cryptography, encryption is the process of encoding information. This process converts the original representation of the information, known as plaintext, into an alternative form known as ciphertext. Ideally, only authorized parties can decip ...
(e.g.
Transport Layer Security Transport Layer Security (TLS) is a cryptographic protocol designed to provide communications security over a computer network. The protocol is widely used in applications such as email, instant messaging, and voice over IP, but its use in securi ...
(SSL/TLS)) can prevent unauthorized access, but cannot prevent damaging messages from being introduced by authorized publishers. Architectures other than pub/sub, such as client/server systems, are also vulnerable to authorized message senders that behave maliciously.


See also

*
Atom Every atom is composed of a nucleus and one or more electrons bound to the nucleus. The nucleus is made of one or more protons and a number of neutrons. Only the most common variety of hydrogen has no neutrons. Every solid, liquid, gas, and ...
, another highly scalable web-syndication protocol *
Data Distribution Service The Data Distribution Service (DDS) for real-time systems is an Object Management Group (OMG) machine-to-machine (sometimes called middleware or connectivity framework) standard that aims to enable dependable, high-performance, interoperable, re ...
(DDS) *
Event-driven programming In computer programming, event-driven programming is a programming paradigm in which the flow of the program is determined by events such as user actions ( mouse clicks, key presses), sensor outputs, or message passing from other programs or t ...
*
High-level architecture The High Level Architecture (HLA) is a standard for distributed simulation, used when building a simulation for a larger purpose by combining (federating) several simulations. The standard was developed in the 1990s under the leadership of the US ...
*
Internet Group Management Protocol The Internet Group Management Protocol (IGMP) is a communications protocol used by hosts and adjacent routers on IPv4 networks to establish multicast group memberships. IGMP is an integral part of IP multicast and allows the network to direct mu ...
(IGMP) *
Message broker A message broker (also known as an integration broker or interface engine) is an intermediary computer program module that translates a message from the formal messaging protocol of the sender to the formal messaging protocol of the receiver. Mess ...
s *
Message queue In computer science, message queues and mailboxes are software-engineering components typically used for inter-process communication (IPC), or for inter- thread communication within the same process. They use a queue for messaging – the ...
*
Observer pattern In software design and engineering, the observer pattern is a software design pattern in which an object, named the subject, maintains a list of its dependents, called observers, and notifies them automatically of any state changes, usually by ca ...
* Producer–consumer problem *
Push technology Push technology or server push is a style of Internet-based communication where the request for a given transaction is initiated by the publisher or central server. It is contrasted with pull/get, where the request for the transmission of informat ...
*
RSS RSS ( RDF Site Summary or Really Simple Syndication) is a web feed that allows users and applications to access updates to websites in a standardized, computer-readable format. Subscribing to RSS feeds can allow a user to keep track of many di ...
, a highly scalable web-syndication protocol *
Usenet Usenet () is a worldwide distributed discussion system available on computers. It was developed from the general-purpose Unix-to-Unix Copy (UUCP) dial-up network architecture. Tom Truscott and Jim Ellis conceived the idea in 1979, and it was ...
*
WebSub WebSub (formerly PubSubHubbub) is an open protocol for distributed publish–subscribe communication on the Internet. Initially designed to extend the Atom (and RSS) protocols for data feeds, the protocol can be applied to any data type (e.g. HTM ...
, an implementation of pub/sub


References

{{DEFAULTSORT:Publish-subscribe pattern Architectural pattern (computer science) Distributed computing architecture Message-oriented middleware Software design patterns