# Distributed Snapshots

## Problem

We are going to design and to implement a distributed program that simulates the behavior of a banking system:

• every node of the network holds a money deposit, and
• every node may randomly withdraw money from its deposit and send it to some neighbor node which stores the money in its own deposit.

Let the network value be the sum of all money stored in some node or contained in some message of the network; by above operation this value remains constant but is unknown to every node.

The program is to determine the network value in every node without interrupting normal operation; when the value is known to every node, execution may terminate. An assertion should check whether the result is correct.

## Program

Initially all nodes in the network are in mode RUNNING; when you press the "Run" button, the network starts execution managing the deposit and exchanging messages as described above. The current deposit of each node and the values of the messages exchanged between nodes may be determined by moving the mouse pointer over a node respectively channel.

### Taking the Snapshot

At some random time, node 0 triggers the request to determine the value of the network. At this time, the execution of the network is interrupted, which gives you the possibility to examine the switch of the node's mode from RUNNING to SNAPPING.

We implement the solution of the problem using the Chandy-Lamport Algorithm for determining consistent global snapshots of a network 1. In our program, this algorithm is applied as follows:

• When a node switches to mode SNAPPING, it determines the current value of its local deposit in a variable snapValue and sends snap messages to all neighbor nodes.
• Each node in mode SNAPPING continues normal operation. However the value of each message from some input channel is added to snapValue until a snap message arrives through that channel.
• If snap messages have arrived through all input channels of a node, the current snapValue represents the node's part of the network value.

The following figure illustrates the basic idea of the algorithm: the stream of messages passing through a channel is split by the snap message into two parts; a node's snapValue eventually represents the value of the node after having processed all messages within the partition bounded by the snap messages of the incoming and outgoing channels. Since every message is contained in exactly one such partition, the sum of all snapValues represents the total network value.

Distributed Snapshots: Basic Idea

If you now press the "Run" button, the network continues execution until node 0 has determined its final snapValue. The node switches then from mode SNAPPING to mode BROADCASTING and you can determine the node's internal state.

Please note that at this time the other nodes may have not yet determined their corresponding values, some of them may be even still in the mode RUNNING (i.e. they have not yet been informed about taking the snapshot). Please take a look at the different node states.

The program then continues to determine the total value of the network:

• When a node has determined its final snapValue, it broadcasts this value (together with its node number) to all neighbors and continues normal operation.
• A node that receives some other node's snapValue determines whether it has already seen this value. If not, this value is added to the node's local variable totalValue and the received value is forwarded to all neighbor nodes.
• If a node has seen and forwarded the snapValue of every other node, it terminates execution.

The program thus ultimately terminates in a state where every node holds the same totalValue which represents the value of the network.

If you press "Run" again, the network will continue execution until every node has determined totalValue (which will take place at a network time of about 350).

After the program has finally terminated, you may take a look at the totalValue of every node and should find that they agree. The program implements an assertion that checks in every node, whether totalValue really equals the sum of all current deposit and message values of the network. If the program does not show an "Assertion failed" message in the bottom line, also these values agree.

You may reset the program and run it again with another random network value and different execution behavior.

This is the full source code of above program.

This is a section of the DAJ report.

Maintained by: Wolfgang Schreiner
Last Modification: November 13, 1997

[Up] [RISC-Linz] [University] [Search]