Pastry (DHT)
   HOME

TheInfoList



OR:

:''This article describes the Pastry Distributed Hash Table. For the food, see
Pastry Pastry is baked food made with a dough of flour, water and shortening (solid fats, including butter or lard) that may be savoury or sweetened. Sweetened pastries are often described as '' bakers' confectionery''. The word "pastries" sugges ...
.'' Pastry is an
overlay network An overlay network is a computer network that is layered on top of another network. Structure Nodes in the overlay network can be thought of as being connected by virtual or logical links, each of which corresponds to a path, perhaps through ...
and routing network for the implementation of a
distributed hash table A distributed hash table (DHT) is a distributed system that provides a lookup service similar to a hash table: key–value pairs are stored in a DHT, and any participating node can efficiently retrieve the value associated with a given key. The ...
(DHT) similar to Chord. The key–value pairs are stored in a redundant
peer-to-peer Peer-to-peer (P2P) computing or networking is a distributed application architecture that partitions tasks or workloads between peers. Peers are equally privileged, equipotent participants in the network. They are said to form a peer-to-peer ...
network of connected
Internet The Internet (or internet) is the global system of interconnected computer networks that uses the Internet protocol suite (TCP/IP) to communicate between networks and devices. It is a '' network of networks'' that consists of private, pub ...
hosts. The protocol is bootstrapped by supplying it with the
IP address An Internet Protocol address (IP address) is a numerical label such as that is connected to a computer network that uses the Internet Protocol for communication.. Updated by . An IP address serves two main functions: network interface ident ...
of a peer already in the network and from then on via the routing table which is dynamically built and repaired. It is claimed that because of its redundant and decentralized nature there is no
single point of failure A single point of failure (SPOF) is a part of a system that, if it fails, will stop the entire system from working. SPOFs are undesirable in any system with a goal of high availability or reliability, be it a business practice, software appl ...
and any single node can leave the network at any time without warning and with little or no chance of data loss. The protocol is also capable of using a routing metric supplied by an outside program, such as
ping Ping may refer to: Arts and entertainment Fictional characters * Ping, a domesticated Chinese duck in the illustrated book '' The Story about Ping'', first published in 1933 * Ping, a minor character in ''Seinfeld'', an NBC sitcom * Ping, a c ...
or
traceroute In computing, traceroute and tracert are computer network diagnostic commands for displaying possible routes (paths) and measuring transit delays of packets across an Internet Protocol (IP) network. The history of the route is recorded as th ...
, to determine the best routes to store in its routing table.


Overview

Although the distributed hash table functionality of Pastry is almost identical to other DHTs, what sets it apart is the routing overlay network built on top of the DHT concept. This allows Pastry to realize the
scalability Scalability is the property of a system to handle a growing amount of work by adding resources to the system. In an economic context, a scalable business model implies that a company can increase sales given increased resources. For example, a ...
and
fault tolerance Fault tolerance is the property that enables a system to continue operating properly in the event of the failure of one or more faults within some of its components. If its operating quality decreases at all, the decrease is proportional to the ...
of other networks, while reducing the overall cost of routing a packet from one node to another by avoiding the need to flood packets. Because the routing metric is supplied by an external program based on the IP address of the target node, the metric can be easily switched to shortest hop count, lowest latency, highest bandwidth, or even a general combination of metrics. The hash table's key-space is taken to be circular, like the key-space in the Chord system, and node IDs are 128-bit unsigned integers representing position in the circular key-space. Node IDs are chosen randomly and uniformly so peers who are adjacent in node ID are geographically diverse. The routing overlay network is formed on top of the hash table by each peer discovering and exchanging state information consisting of a list of leaf nodes, a neighborhood list, and a routing table. The leaf node list consists of the ''L''/2 closest peers by node ID in each direction around the circle. In addition to the leaf nodes there is also the neighborhood list. This represents the ''M'' closest peers in terms of the routing metric. Although it is not used directly in the routing algorithm, the neighborhood list is used for maintaining locality principles in the routing table. Finally there is the routing table itself. It contains one entry for each address block assigned to it. To form the address blocks, the 128-bit key is divided up into digits with each digit being ''b'' bits long, yielding a numbering system with base 2''b''. This partitions the addresses into distinct levels from the viewpoint of the client, with level 0 representing a zero-digit common prefix between two addresses, level 1 a one-digit common prefix, and so on. The routing table contains the address of the closest known peer for each possible digit at each address level, except for the digit that belongs to the peer itself at that particular level. This results in the storage of 2^b - 1 contacts per level, with the number of levels scaling as (\log_)/b. Values of b\approx 4, L \approx 2^b and M \approx 2^b represent operating values on a typical network.


Routing

A packet can be routed to any address in the keyspace whether there is a peer with that node ID or not. The packet is routed toward its proper place on the circular ring and the peer whose node ID is closest to the desired destination will receive the packet. Whenever a peer receives a packet to route or wants to send a packet it first examines its leaf set and routes directly to the correct node if one is found. If this fails, the peer next consults its routing table with the goal of finding the address of a node which shares a longer prefix with the destination address than the peer itself. If the peer does not have any contacts with a longer prefix or the contact has died it will pick a peer from its contact list with the same length prefix whose node ID is numerically closer to the destination and send the packet to that peer. Since the number of correct digits in the address always either increases or stays the same — and if it stays the same the distance between the packet and its destination grows smaller — the routing protocol converges.


Applications built on Pastry

Pastry itself specifies how keys are distributed among the nodes and how the node responsible for holding a key can be found. Using this as a substrate for a higher
protocol Protocol may refer to: Sociology and politics * Protocol (politics), a formal agreement between nation states * Protocol (diplomacy), the etiquette of diplomacy and affairs of state * Etiquette, a code of personal behavior Science and technolog ...
enables Pastry to implement functionality such as a distributed file system, a subscription and publishing system, or any other system which can be reduced to storing values and retrieving them later.


PAST

PAST is a distributed file system layered on top of Pastry. A file is stored into the system by computing the hash of its filename. Then Pastry routes the contents of the file to the node in the circular keyspace closest to the hash obtained from the filename. This node will then send copies of the file to the ''k'' nodes nearest the actual key, most of which are likely to be leaf nodes of this node and thus directly reachable. Retrieval of data is accomplished by rehashing the file name and routing a request for the data over Pastry to the proper place in the keyspace. The request can be fulfilled by any of the ''k'' nodes that have copies of the data. This accomplishes both data redundancy and load distribution. Since adjacent nodes in the keyspace are geographically diverse the odds that all ''k'' of them will go offline at the same time is very small. More importantly, since the Pastry routing protocol seeks to minimize the distance traveled, the nearest node to the machine that made the request (according to the metric) is likely to be the one that responds with the data.


SCRIBE

SCRIBE is a decentralized publish/subscribe system that uses Pastry for its underlying route management and host lookup. Users create topics to which other users can subscribe. Once the topic has been created, the owner of the topic can publish new entries under the topic which will be distributed in a
multicast In computer networking, multicast is group communication where data transmission is addressed to a group of destination computers simultaneously. Multicast can be one-to-many or many-to-many distribution. Multicast should not be confused with ...
tree to all of the SCRIBE nodes that have subscribed to the topic. The system works by computing the hash of the topic name concatenated with the name of the user who owns the topic. This hash is then used as a Pastry key, and the publisher then routes packets to the node closest to the key using Pastry's routing protocol to create the root node of the topic on that node. People then subscribe to the topic by computing the key from the topic and publisher's name and then using Pastry to route a subscribe message to the topic towards the root node. When the root node receives the subscribe message from another node it adds the node ID to its list of children and begins acting as a forwarder of the topic. Decentralization is accomplished through having all nodes in the network snoop on subscribe messages going past them on their way to the topic's root node. If the topic is one to which the current node subscribes, it will stop forwarding the packet toward the root node and add the node trying to subscribe as one of its children. In this way a treelike structure is formed with the root node at the top sending out to the first few subscriber nodes and then each of these nodes forwarding the messages on to their children, and so on. Because packets from random nodes on the Pastry network destined for the same node often end up traveling along the same path very soon in their journey, they end up attaching to whatever part of the tree is nearest to them in the Pastry network. Since each hop along a pastry route represents what is locally the best route according to the routing metric in use, the subscribe message seeks out the closest portion of the tree and attaches itself there. Finally fault tolerance among members of the distribution tree is accomplished through the use of timeouts and keepalives with actual data transmissions doubling as keepalives to minimize traffic. If a child node does not hear from its parent for a while, it routes a new subscribe message toward the root node of the tree, reattaching itself wherever it bumps into the tree for that topic. If a parent doesn't hear from a child for a timeout period, it drops the child from its list of children. (If this action causes its child list to become empty, the parent stops acting as a forwarder altogether.) The only remaining failure point is that of the root node, and Pastry itself automatically overcomes this. Because Pastry duplicates keys among the few nodes closest to the key's actual value, the root node already has mirrors set up, lying dormant. If the root node goes offline, again detected through timeouts, the next-closest Pastry node will begin acting as the root node. When the creator of the topic tries to publish new material the old root node will be unreachable. The publisher will then fall back on the Pastry network and use it to route its publish message to the new root node. Once this has been done, the publisher caches a copy of the new root node's IP address to reduce the use of the Pastry network for future transmissions.


See also

*
Content addressable network The content-addressable network (CAN) is a distributed, decentralized P2P infrastructure that provides hash table functionality on an Internet-like scale. CAN was one of the original four distributed hash table proposals, introduced concurrently wi ...
(CAN) *
Chord (DHT) In computing, Chord is a protocol and algorithm for a peer-to-peer distributed hash table. A distributed hash table stores key-value pairs by assigning keys to different computers (known as "nodes"); a node will store the values for all the keys ...
*
Kademlia Kademlia is a distributed hash table for decentralized peer-to-peer computer networks designed by Petar Maymounkov and David Mazières in 2002. It specifies the structure of the network and the exchange of information through node lookups. Kademl ...
*
Tapestry (DHT) Tapestry is a peer-to-peer overlay network which provides a distributed hash table, routing, and multicasting infrastructure for distributed applications.{{cite journal , last=Zhao , first=Ben Y. , last2=Huang , first2=Ling , last3=Stribling , fir ...
*
Koorde In peer-to-peer networks, Koorde is a distributed hash table (DHT) system based on the Chord DHT and the De Bruijn graph (De Bruijn sequence). Inheriting the simplicity of Chord, Koorde meets hops per node (where is the number of nodes in the ...


References

* *{{cite journal , author= A. Rowstron , author2= A-M. Kermarrec , author3= M. Castro , author4= P. Druschel , name-list-style= amp , title=SCRIBE: The design of a large-scale event notification infrastructure , journal=NGC2001 UCL London , date=Nov 2001 , url=http://rowstron.azurewebsites.net/PAST/scribe.pdf


External links


Pastry projectOverSim simulator with Pastry implementation
Distributed data storage