DAJ --- A Toolkit for the Simulation of Distributed Algorithms in Java -- Invariants


Invariants

In order to formulate the program invariants from which the correctness theorem can be proved, we need to refer to all the messages that have ever been sent through a channel:

Definition. [Channel Trace] The trace of channel c is a triple (C, inpos_c, outpos_c) such that C is an array that holds the stream of all messages that have been sent through c, the next message that will be read by the receiver is stored in C[inpos_c], and the next message sent by the sender will be written into C[outpos_c] with inpos_c <= outpos_c.

Clearly, a channel c is empty if and only if inpos_c = outpos_c.

We may think of a channel trace as an additional data structure where inpos_c and outpos_c are initialized as zero; the data structure is updated by every operation c.send(v) as follows:

C[outpos_c] = v;
outpos_c++

In addition, the operation v = c.receive() has the effect

inpos_c++
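The bookkeeping above can be sketched in Java as follows. This is only an illustrative model under stated assumptions, not DAJ code: the class ChannelTrace and its method names are hypothetical, the array C is modeled by a growing list, and receive() throws on an empty channel where a real receiver would wait.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a channel trace (C, inpos_c, outpos_c); not part of DAJ. */
class ChannelTrace<V> {
    private final List<V> messages = new ArrayList<>(); // the array C
    private int inpos = 0;                              // inpos_c, initially zero

    /** Effect of c.send(v): C[outpos_c] = v; outpos_c++ */
    void send(V v) {
        messages.add(v); // appending writes C[outpos_c] and advances outpos_c
    }

    /** Effect of v = c.receive(): yields C[inpos_c], then inpos_c++ */
    V receive() {
        if (isEmpty()) {
            // Assumption of this sketch: fail instead of waiting for a message.
            throw new IllegalStateException("channel is empty");
        }
        return messages.get(inpos++);
    }

    int inpos() { return inpos; }
    int outpos() { return messages.size(); } // outpos_c = number of messages ever sent
    boolean isEmpty() { return inpos == outpos(); }

    public static void main(String[] args) {
        ChannelTrace<Integer> c = new ChannelTrace<>();
        c.send(5);
        c.send(7);
        System.out.println(c.receive() + " " + c.inpos() + " " + c.outpos()); // prints "5 1 2"
    }
}
```

Since messages are only ever appended and inpos only advances up to the list size, the condition inpos_c <= outpos_c holds by construction in this model.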

We may then introduce a derived notion:

Definition. [Trace Value] The trace value tvalue(c, from, to) of a channel c in interval [from,to[ is the sum of the values of the messages in the trace of c between (including) `from` and (not including) `to`:
tvalue(c, from, to) := Sum_{from <= k < to} mvalue(C[k])
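As a small worked example, the trace value can be computed by a plain loop over the half-open interval. The sketch assumes that each trace entry is already reduced to its value mvalue(C[k]); the class name TraceValue and the sample numbers are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper; every trace entry stands for mvalue(C[k]). */
class TraceValue {
    /** tvalue(c, from, to): sum of the message values at positions from <= k < to. */
    static long tvalue(List<Integer> trace, int from, int to) {
        long sum = 0;
        for (int k = from; k < to; k++) {
            sum += trace.get(k);
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> trace = Arrays.asList(5, 7, 0, 3); // assumed sample values
        System.out.println(tvalue(trace, 1, 4)); // 7 + 0 + 3, prints 10
        System.out.println(tvalue(trace, 2, 2)); // empty interval, prints 0
    }
}
```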

This notion can be used to express the value of a network as follows:

Theorem. [Network Value] The value of a network equals the sum of all initial node deposits plus the sum of the values of all messages being processed by some node plus the sum of the values of the messages received by each node minus the sum of the values of the messages left by each node:
value_net =
Sum_i (d_i + mvalue(message_i)) +
Sum_c tvalue(c, inpos_c, outpos_c)

Proof: The result follows from the fact that the network value remains constant under every operation of the program; the proof of this proceeds analogously to the proof in Section A.

We now characterize the behavior of the program by the following set of propositions which we have to prove as invariants:

Definition. [Invariants] For every node i, we define the following propositions:
D_i (Deposit): the node deposit is non-negative and equals the initial deposit value plus (potentially) the value of the message just received plus the current trace value of all input channels minus the current trace value of all output channels:
deposit_i >= 0,
deposit_i = d_i
    + Sum_{c in in_i} tvalue(c, 0, inpos_c)
    - Sum_{c in out_i} tvalue(c, 0, outpos_c)

M_i (Snap Messages): if node i is in mode RUNNING, it has not yet sent a snap message to any of its output channels and it has not yet received a snap message from any of its input channels:
mode_i = RUNNING =>
    forall c in out_i forall k (~C[k] instanceOf SnapMessage)
    forall c in in_i forall k < inpos_c (~C[k] instanceOf SnapMessage)
If the node is not in mode RUNNING any more, it has sent exactly one snap message to each of its output channels and it has received exactly one snap message in every input channel for which the snapped flag is set:
mode_i /= RUNNING =>
    forall c in out_i exists ! snap (C[snap] instanceOf SnapMessage)
    forall c in in_i forall h ((c = in_i(h), snapped_i[h]) =>
        exists ! snap < inpos_c (C[snap] instanceOf SnapMessage))
If a node is not in mode RUNNING and c is an input or output channel of this node, we denote by snap_c the unique position of the snap message in the trace of c.
S_i (Snapping): if node i is not in mode RUNNING any more, the snap value equals the deposit plus the trace value of all output channels from the position of their snap messages sent by the node minus the trace value of all input channels from the position of their snap messages received by the node (if such a snap message was received); missingSnaps denotes the number of outstanding snap messages:
mode_i /= RUNNING =>
    snapValue_i = deposit_i
        + Sum_{c in out_i} tvalue(c, snap_c, outpos_c)
        - Sum_{c in in_i, c = in_i(h), snapped_i[h]} tvalue(c, snap_c, inpos_c),
    missingSnaps = Sum_{h, ~snapped[h]} 1

B_i (Broadcasting): if node i is in one of the modes BROADCASTING or TERMINATED, then its totalValue equals the sum of all received snapValues and missingValues denotes the number of all not yet received snapValues:
mode_i in {BROADCASTING, TERMINATED} =>
    forall_{j, done[j]} mode_j in {BROADCASTING, TERMINATED},
    forall c in in_i forall h (c = in(h) => snapped[h]),
    totalValue = Sum_{j, done[j]} snapValue_j,
    missingValues = Sum_{j, ~done[j]} 1

T_i (Terminated): if the node is in mode TERMINATED, no snap values are missing any more:
mode_i = TERMINATED =>
    missingValues_i = 0.
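To make the deposit invariant D_i concrete, the following sketch checks it on a tiny network. Everything in it is an assumption made for illustration and not DAJ code: two nodes, a single channel c from node 0 to node 1, initial deposits 100 and 50, and a receive step that adds the message value to the deposit at once, so the "message just received" case does not arise.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical two-node model: one channel c from node 0 to node 1. */
class TwoNodeNetwork {
    final int[] initial = {100, 50};                 // d_0, d_1 (assumed values)
    final int[] deposit = {100, 50};                 // deposit_0, deposit_1
    final List<Integer> trace = new ArrayList<>();   // the array C, entries are message values
    int inpos = 0;                                   // inpos_c; outpos_c is trace.size()

    long tvalue(int from, int to) {
        long sum = 0;
        for (int k = from; k < to; k++) sum += trace.get(k);
        return sum;
    }

    /** Node 0 sends a value through c. */
    void send(int amount) {
        if (amount < 0 || amount > deposit[0]) {
            throw new IllegalArgumentException("amount exceeds deposit");
        }
        deposit[0] -= amount;
        trace.add(amount);
    }

    /** Node 1 receives the next message from c and books its value. */
    void receive() {
        deposit[1] += trace.get(inpos++);
    }

    /** D_0 and D_1: c is the only output channel of node 0 and the only input channel of node 1. */
    boolean depositInvariantHolds() {
        int outpos = trace.size();
        boolean d0 = deposit[0] >= 0 && deposit[0] == initial[0] - tvalue(0, outpos);
        boolean d1 = deposit[1] >= 0 && deposit[1] == initial[1] + tvalue(0, inpos);
        return d0 && d1;
    }

    public static void main(String[] args) {
        TwoNodeNetwork net = new TwoNodeNetwork();
        net.send(30);
        System.out.println(net.depositInvariantHolds()); // message in transit, prints true
        net.receive();
        System.out.println(net.depositInvariantHolds()); // message delivered, prints true
    }
}
```

Such a check only tests the invariant on one run; it does not replace the proof that every program step preserves it.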

Distributed Snapshots: Proof

Let I be the conjunction of all propositions above (for all i). Assuming that I is a program invariant, we can prove Theorem B.5 as follows (see also Figure B.5):

Proof of Correctness Theorem: When node i is in mode TERMINATED, we know by T_i that missingValues_i = 0. By B_i, this means that totalValue_i = Sum_j snapValue_j, that every node j is in one of the modes BROADCASTING or TERMINATED, and that snapped[k] holds for every index k of an input channel. Together with S_j for every node j, we thus have

Sum_j snapValue_j = Sum_j (
    deposit_j
        + Sum_{c in out_j} tvalue(c, snap_c, outpos_c)
        - Sum_{c in in_j} tvalue(c, snap_c, inpos_c))

From M_j for every node j and the definition of the channel trace, we know for every channel c that 0 <= snap_c < inpos_c <= outpos_c. Since every channel has exactly one sender and one receiver node, we then have

Sum_j snapValue_j
= Sum_j deposit_j
+ Sum_c tvalue(c, snap_c, outpos_c)
- Sum_c tvalue(c, snap_c, inpos_c)
= Sum_j deposit_j + Sum_c tvalue(c, inpos_c, outpos_c)

By D_j for every node j, we know

Sum_j deposit_j + Sum_c tvalue(c, inpos_c, outpos_c) = Sum_j d_j

and, by Theorem B.5,

Sum_j d_j = value_net

which completes the proof.
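The step of the proof that merges the two channel sums relies on the identity tvalue(c, snap_c, outpos_c) - tvalue(c, snap_c, inpos_c) = tvalue(c, inpos_c, outpos_c), which holds whenever snap_c < inpos_c <= outpos_c. The following sketch checks it numerically for all such positions of one sample trace; the class name TelescopingCheck and the sample values are hypothetical.

```java
/** Hypothetical numeric check of the identity used to merge the channel sums. */
class TelescopingCheck {
    /** tvalue over an array of message values, interval [from, to[. */
    static long tvalue(int[] values, int from, int to) {
        long sum = 0;
        for (int k = from; k < to; k++) sum += values[k];
        return sum;
    }

    /** Checks the identity for every 0 <= snap < inpos <= outpos <= values.length. */
    static boolean identityHolds(int[] values) {
        for (int snap = 0; snap < values.length; snap++) {
            for (int inpos = snap + 1; inpos <= values.length; inpos++) {
                for (int outpos = inpos; outpos <= values.length; outpos++) {
                    long difference = tvalue(values, snap, outpos) - tvalue(values, snap, inpos);
                    if (difference != tvalue(values, inpos, outpos)) return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumed sample trace; position 1 plays the role of a snap message of value 0.
        int[] values = {4, 0, 6, 2, 9};
        System.out.println(tvalue(values, 1, 5) - tvalue(values, 1, 3)); // 17 - 6, prints 11
        System.out.println(tvalue(values, 3, 5));                        // 2 + 9, prints 11
        System.out.println(identityHolds(values));                       // prints true
    }
}
```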

Maintainer: Wolfgang Schreiner
Last Modification: October 1, 1998