The Chandy–Lamport algorithm is a
snapshot algorithm
A snapshot algorithm is used to create a consistent snapshot of the global state of a distributed system. Due to the lack of globally shared memory and a global clock, this is not trivially possible.
Example
Several computers work together in ...
that is used in
distributed systems
A distributed system is a system whose components are located on different networked computers, which communicate and coordinate their actions by passing messages to one another from any system. Distributed computing is a field of computer sci ...
for recording a consistent global state of an
asynchronous
Asynchrony is the state of not being in synchronization.
Asynchrony or asynchronous may refer to:
Electronics and computing
* Asynchrony (computer programming), the occurrence of events independent of the main program flow, and ways to deal with ...
system. It was developed by and named after
Leslie Lamport
Leslie B. Lamport (born February 7, 1941 in Brooklyn) is an American computer scientist and mathematician. Lamport is best known for his seminal work in distributed systems, and as the initial developer of the document preparation system LaTeX and ...
and
K. Mani Chandy.
[
Leslie Lamport, K. Mani Chandy]
''Distributed Snapshots: Determining Global States of a Distributed System''
In: ''ACM Transactions on Computer Systems 3''. Nr. 1, Februar 1985.
PDF; 1 MB
History
According t
“The distributed snapshot algorithm described here came about when I visited Chandy, who was then at the
University of Texas in Austin
The University of Texas at Austin (UT Austin, UT, or Texas) is a public research university in Austin, Texas. It was founded in 1883 and is the oldest institution in the University of Texas System. With 40,916 undergraduate students, 11,075 ...
. He posed the problem to me over dinner, but we had both had too much wine to think about it right then. The next morning, in the shower, I came up with the solution. When I arrived at Chandy's office, he was waiting for me with the same solution.”
Definition
The assumptions of the algorithm are as follows:
* There are no failures and all messages arrive intact and only once
* The communication channels are unidirectional and
FIFO ordered
* There is a communication path between any two processes in the system
* Any process may initiate the snapshot algorithm
* The snapshot algorithm does not interfere with the normal execution of the processes
* Each process in the system records its local state and the state of its incoming channels
The algorithm works using marker messages. Each process that wants to initiate a snapshot records its local state and sends a marker on each of its outgoing channels. All the other processes, upon receiving a marker, record their local state, the state of the channel from which the marker just came as empty, and send marker messages on all of their outgoing channels. If a process receives a marker after having recorded its local state, it records the state of the incoming channel from which the marker came as carrying all the messages received since it first recorded its local state.
Some of the assumptions of the algorithm can be facilitated using a more reliable communication protocol such as
TCP/IP
The Internet protocol suite, commonly known as TCP/IP, is a framework for organizing the set of communication protocols used in the Internet and similar computer networks according to functional criteria. The foundational protocols in the suit ...
. The algorithm can be adapted so that there could be multiple snapshots occurring simultaneously.
Algorithm
The Chandy–Lamport algorithm works like this:
# The observer process (the process taking a snapshot):
## Saves its own local state
## Sends a snapshot request message bearing a snapshot token to all other processes
# A process receiving the snapshot token ''for the first time'' on ''any'' message:
## Sends the observer process its own saved state
## Attaches the snapshot token to all subsequent messages (to help propagate the snapshot token)
# When a process that has already received the snapshot token receives a message that does not bear the snapshot token, this process will forward that message to the observer process. This message was obviously sent before the snapshot “cut off” (as it does not bear a snapshot token and thus must have come from before the snapshot token was sent out) and needs to be included in the snapshot.
From this, the observer builds up a complete snapshot: a saved state for each process and all messages “in the ether” are saved.
References
Distributed algorithms
{{DEFAULTSORT:Chandy-Lamport algorithm