HOME

TheInfoList



OR:

Mainline DHT is the name given to the
Kademlia Kademlia is a distributed hash table for decentralized peer-to-peer computer networks designed by Petar Maymounkov and David Mazières in 2002. It specifies the structure of the network and the exchange of information through node lookups. Kademli ...
-based
distributed hash table A distributed hash table (DHT) is a distributed system that provides a lookup service similar to a hash table: key–value pairs are stored in a DHT, and any participating node can efficiently retrieve the value associated with a given key. The m ...
(DHT) used by BitTorrent clients to find peers via the BitTorrent protocol. The idea of using a DHT for distributed tracking in BitTorrent was first implemented in Azureus 2.3.0.0 (now known as
Vuze Vuze (previously Azureus) is a BitTorrent client used to transfer files via the BitTorrent protocol. Vuze is written in Java, and uses the Azureus Engine. In addition to downloading data linked to .torrent files, Azureus allows users to view, p ...
) in May 2005, from which it gained significant popularity. Unrelated but around the same time,
BitTorrent, Inc. Rainberry, Inc., formerly known as BitTorrent, Inc., is an American company that is responsible for the ongoing development of the BitTorrent peer-to-peer protocol, as well as the ongoing development of μTorrent and BitTorrent Mainline, two c ...
released a similar DHT into their
client Client(s) or The Client may refer to: * Client (business) * Client (computing), hardware or software that accesses a remote service on another computer * Customer or client, a recipient of goods or services in return for monetary or other valuable ...
called Mainline DHT, and thus popularized the use of distributed tracking in the BitTorrent protocol. Measurement showed that by 2013, the concurrent number of users of Mainline DHT is from 16 million to 28 million, with intra-day changes of at least 10 million.


Description

Mainline DHT is based on the popular Kademlia DHT design. Previous to usage of a DHT for distributing peers, trackers were the only method of finding peers. The key feature of using the DHT over trackers is that the decentralized approach favours the nature of the BitTorrent protocol. The DHT operates by distributing lists of peers identified by the
SHA-1 In cryptography, SHA-1 (Secure Hash Algorithm 1) is a cryptographically broken but still widely used hash function which takes an input and produces a 160-bit (20-byte) hash value known as a message digest – typically rendered as 40 hexadecima ...
hash of the torrent.


Operation

The SHA-1 hash of a torrent, the ''infohash'', is synonymous with a Kademlia key, which is used for finding peers (values) in the overlay network. To find peers in a swarm, a node sends a ''get_peers'' query with the infohash as key (equivalent to a Kademlia ''FIND_VALUE'') to the closest known nodes (with respect to the key distance). Like Kademlia, if the node does not return the value (peers), it persists further in an iterative operation. However, after the search is exhausted, the client then also inserts the peer contact information for itself onto the responding nodes with IDs closest to the infohash of the torrent.


Token

Nodes use an additional measure known as a ''token'' to ensure others don't sign up other hosts for torrents. The return value for a query for peers includes this opaque value. For a node to announce that its controlling peer is downloading a torrent, it must present the token received from the same queried node in a recent query for peers. When a node attempts to "announce" a torrent, the queried node checks the token against the querying node's IP address. Mainline DHT uses the SHA-1 hash of the IP address concatenated onto a secret that changes every five minutes for a token value. Tokens up to ten minutes old are accepted.


KRPC

A node in the Mainline DHT consists of an IP and port combination. Nodes communicate via an RPC protocol called ''KRPC''. KRPC is a simple protocol that consists of nodes sending messages (queries, replies and errors) containing
bencode Bencode (pronounced like ''Bee-encode'') is the encoding used by the peer-to-peer file sharing system BitTorrent for storing and transmitting loosely structured data. It supports four different types of values: * byte strings, * integers, * list ...
d dictionaries over UDP. A KRPC message is a single dictionary with two keys common to every message and additional keys depending on the type of message. Every message has a key ''"t"'' with a string value representing a transaction ID. This transaction ID is generated by the querying node and is echoed in the response, so responses may be correlated with multiple queries to the same node. The transaction ID should be encoded as a short string of binary numbers, typically two octets are enough as they cover 2^16 outstanding queries. The other key contained in every KRPC message is ''"y"'' with a single character value describing the type of message. The value of the ''"y"'' key is one of ''"q"'' for query, ''"r"'' for response, or ''"e"'' for error.


Queries

Queries, or KRPC message dictionaries with a "y" value of "q", contain two additional keys; "q" and "a". Key "q" has a string value containing the method name of the query. Key "a" has a dictionary value containing named arguments to the query.


Responses

Responses, or KRPC message dictionaries with a "y" value of "r", contain one additional key "r". The value of "r" is a dictionary containing named return values. Response messages are sent upon successful completion of a query.


Errors

Errors, or KRPC message dictionaries with a "y" value of "e", contain one additional key "e". The value of "e" is a list. The first element is an integer representing the error code. The second element is a string containing the error message. Errors are sent when a query cannot be fulfilled.


Routing table

Buckets are structured differently from those in Kademlia. Instead of a list of 160 buckets, BitTorrent starts with only one bucket. When a bucket becomes full, one of two things can happen: * The bucket is split. * Old nodes are pinged (as in Kademlia). Splitting is an operation that occurs only if our own node ID falls within the range of the bucket. The bucket being split is replaced by two new buckets each with half the range of the old bucket and the nodes from the old bucket are distributed among the two new ones. There are two benefits to this bucket implementation: * Less memory is used for a routing table of less than 160 buckets. * When searching buckets, it isn't necessary to retrieve additional nodes from adjacent buckets because it is guaranteed that there are enough in the current bucket.


Implementations

Mainline DHT was first included in version 4.2.0 of the BitTorrent software (November 2005). Since then, it has been implemented by a number of other clients:


References

{{Reflist


External links


Official specification
an


BitTorrent Tech Talks: DHT

Mainline DHT Measurement
BitTorrent clients