Go backward to 3 The Termination Detection Algorithm Go up to Top Go forward to 5 Progress of the Algorithm |
The derivation of the termination detection algorithm by repeated refinement of its invariant gives good reason to believe that the algorithm indeed solves the stated problem. However, the development was based on "high-level" reasoning about a concurrent system with rather large inference steps. We cannot be really sure that the algorithm is correct unless we formalize our reasoning, i.e., create a map for our thoughts, and decompose the large inference steps into smaller ones whose correctness is much more self-evident. A formal verification of a concurrent algorithm is tedious, because at every step some process may perform an action that violates the invariant. We have to convince ourselves that this is not the case. During this process we get a much deeper insight into the algorithm and its behavior because many subtleties are exposed that may be overlooked in a high-level presentation.
The core property to be proved is defined by a formula of the form [][P]. The basic strategy to prove such a property is based on induction: we prove that
'
holds (where P'
is derived from P by
replacing every program variable by its primed counterpart).
However, even if P is invariant, one may not be able to prove this fact, because P on its own may be to weak to show the second part of above rule, i.e., because P is not inductive. The solution is to find a stronger property I such that I is inductive and thus [][I] can be proven. Since I => P, we can then conclude [][P].
The invariant constructed in the previous section is an attempt to such an inductive property but it will turn out that I alone is also not sufficient. Strengthening I to some inductive I' will however make the proof quite clumsy. Instead we will repeatedly apply the same principle that we use in the proof of P: we construct a sequence of inductive properties such that in the proof of inductiveness of a property we can use the previously proven properties as additional "axioms". In this way, we can "modularize" the proof by decomposing it into a number of layers that are thin enough to be managed reasonably well.
First, we have to make sure that the termination detection algorithm does not unappropriately interfere with the working of the original system. Every behavior of the extended system must be also a possible behavior of the original system; vice versa, every behavior of the original system should be also allowed by the extended system. This idea is expressed by
Proposition 1 (Equivalence)
SystemT implements System and vice
versa, i.e., for all n, act
, and chan
:
(varterm
in B: SystemT_{n}(act
,chan
,term
)) <=> System_{n}(act
,chan
).
Proof: SystemT is constructed from System
In general, implementation proofs are more complicated because one has to show that every step of the implementation simulates some step of the specification. However, the construction of SystemT from System makes equivalence such obvious that we content ourselves with above sketch. Because of equivalence, we refer in the context of this section to SystemT simply as "the system". The system property that the algorithm is expected to detect is formalized by
Definition 1 (Termination) The system is terminated if all processes are inactive and all channels are empty:
terminated_{_n}(inact
, inchan
) :<=>
/\
- forall i in Z_{n}: ~
act
_{i}- forall i in Z_{n}, j in Z_{n}:
chan
_{i, j} = < > .
An important property of termination is established by
Proposition 2 (Stability of Termination)
Once the system is terminated, it stays so, i.e.
for all n, act
, chan
:
[](terminated_{_n}(act
,chan
) => []terminated_{_n}(act
,chan
)).
Proof: It suffices to prove
[](terminated_{_n}(Init implies ~terminated_{_n}(act
,chan
) => terminated_{_n}(act
'
,chan
'
)).
act
, chan
).
Assume terminated_{_n}(act
, chan
).
We know ~act
_{k} for every
k in Z_{n}, thus Send_{i,
j}(act
_{i}, chan
) is not enabled.
From chan
_{i, j}= < > we know that
Receive_{i, j}(act
, chan
, cnt
, mark
) is not enabled. By
~act
_{i}, Inactivate_{i}(act
) does
not have any effect. Neither Start_{i}(run
,
sig
, mark
, tval
, tmark
) nor
Forward_{i}(act
_{i}, cnt
_{i},
sig
, mark
, tval
, tmark
, term
) modify
act
or chan
. []The fundamental property of the algorithm is then expressed by
Proposition 3 (Correctness)
If the system detects its termination, it has indeed terminated,
i.e., for all n, act
,
chan
, term
:
SystemT_{n}(act
,chan
,term
) =>
[](term
=> terminated_{_n}(act
,chan
)).
The rest of this section is devoted to setting up the stage for finally proving this property.
We will establish a number of system invariants that will be used in the proof
the invariant derived in the previous section, which in turn will be used to
prove the corrrectness property. In order to simplify the formulation of these
properties, we will from now on consider arbitrary but fixed n,
act
, chan
, term
(the specification parameters) and take
cnt
, sig
, mark
, tval
, tmark
(the local
variables of the algorithm) such that
holds. The following propositions are expressed in the context of this assumption, i.e., they are true only for behaviors of the system.
/\
- Init_{}(
act
,chan
,term
,run
,cnt
,sig
,mark
)- [][exists i in Z_{n}: Action_{n, i}(
act
,chan
,term
,run
,cnt
,sig
,mark
,tval
,tmark
)]
We start the construction of the proof by establishing a simple invariant that assures that ensures that the algorithm is "well-typed".
Proposition 4 (Type Correctness) The system preserves the asserted variable types, i.e.,
[](act
in Z_{n}->B/\chan
in Z_{n} x Z_{n}->Seq(MSG)/\term
in B/\run
in B/\cnt
in Z_{n}->N/\sig
in Z_{n}->B/\mark
in Z_{n}->B/\tval
in N/\tmark
in B).
Proof: We show
[]act
in Z_{n}->B. Init
implies act
_{i} in B for every
i in Z_{n}. The only actions that modify
act
are Receive and Inactivate both of which construct
act
'
from act
by replacing the mapping of some
i in Z_{n} by an element of B. Type
correctness for the other variables is shown analogously. []
We will not refer to above property explicitly any more but use it implicitly in many of our reasoning steps.
The following proposition guarantees that the algorithm does not make any mistakes when counting messages.
Proposition 5 (Message Count)
The system keeps in cnt
track of the number of messages pending in
communication channels, i.e.,
[](msgcnt_{n}(inwherechan
, incnt
))
- msgcnt_{n}(in
chan
, incnt
) :<=> count_{n}(chan
) = sum_{i in Z_n}cnt
_{i},- count_{n}(
chan
) := sum_{i in Z_n} sum_{j in Z_n} len_{}(chan
_{i, j}),- len_{}(
ch
) := such l in N: exists m_{0}, ..., m_{l-1}:ch
= < m_{0}, ..., m_{l-1} > .
Proof: Init implies cnt
_{i} = 0
and chan
_{i,j} = < > for every
i in Z_{n} and j in Z_{n}. Assume
msgcnt_{n}(count
, chan
).
We show msgcnt_{n}(chan
'
,
cnt
'
). The only actions that modify
chan
or cnt
are Send and Receive:
act
_{i}, chan
, cnt
)chan
'
) = count_{n}(chan
[(i,
j) |-> chan
_{i,
j} o < m > ]) = 1+count_{n}(chan
) = 1+sum_{k in Z_n}cnt
_{k} = sum_{k in Z_n}cnt
[i |-> cnt
_{i}+1]_{k} = cnt
'
.
act
, chan
, cnt
, mark
)The following property ensures that Process 0 indeed knows whether there is a token in the system or not.
Proposition 6 (Detection Run)
The system records in run
whether there is an ongoing termination
detection run, i.e.,
[](run
<=> exists k in Z_{n}:sig
_{k}).
It turns out that above invariant is itself not inductive; we therefore combine its proof with the proof of
Proposition 7 (Single Token) There is at most one token in the system, i.e.,
[](forall k in Z_{n}:sig
_{k} => forall j in Z_{n}: (sig
_{j} => j = k)).
Proof of Property 6 and of Property 7:
Init implies ~run
and
~sig
_{k} for every k in Z_{n}. Assume
run
<=> exists k in Z_{n}: sig
_{k}
and
forall k in Z_{n}:
sig
_{k} => forall j in Z_{n}:
(sig
_{j} => j = k).
run
'
<=> exists k in Z_{n}:
sig
'
_{k}.
The only actions that modify run
or
sig
are Start and Forward.
run
, sig
,
mark
, tval
, tmark
)run
'
and
sig
'
[i-_{n}1].
act
_{i},
cnt
_{i}, sig
, mark
, tval
, tmark
,
term
)sig
_{i} and thus run
.
Assume i != 0. Then run
'
= run
and
sig
'
= sig
[i |-> false, i-_{n}1 |-> true].
Thus we have run
'
and
sig
'[i-_{n}1]. Now assume
i = 0. Then we have ~run
'
and
sig
'
= sig
[i |-> false]. By
sig
_{i} and the induction hypothesis, we thus have
~sig
'
_{k} for every k in Z_{n}.
(Here we used in the proof of Proposition 6 the induction hypothesis of Proposition 7).
sig
'
_{k} => forall j in Z_{n}:
(sig
'
_{j} => j = k).
The only actions that modify sig
are Start and Forward.
run
, sig
,
mark
, tval
, tmark
)run
and therefore
~sig
_{j} for every j in Z_{n}. From
sig
'
= sig
[i-_{n}1 |-> true]
we then know
sig
'
_{j} => j = i-_{n}1
for every j in Z_{n}.
(Here we used in the proof of Proposition 7 the induction hypothesis of Proposition 6).
act
_{i},
cnt
_{i}, sig
, mark
, tval
, tmark
,
term
)sig
_{i} and thus
sig
_{j} => j = i for every
j in Z_{n}. Assume i != 0. Then
sig
'
= sig
[i |-> false, i-_{n}1 |-> true]
and therefore
sig
'
_{j} => j = i-_{n}1
for every j in Z_{n}. Now assume i = 0.
Then sig
'
= sig
[i |-> false]
and thus ~sig
_{j} for every j in Z_{n}. []Now we have set up enough machinery to proof the crucial fact of the termination detection algorithm:
Proposition 8 (Invariant) The system maintains the following invariant:
[]Inv_{n}(inwhereact
, incnt
, insig
, inmark
, intval
, intmark
)
- Inv_{n}(in
act
, incnt
, insig
, inmark
, intval
, intmark
) :<=>
forall k in Z_{n}:sig
_{k} =>
\/
- InvC_{n, k}(in
act
, incnt
, intval
)- 0 <
tval
+sum_{j in Z_n, j <= k}cnt
_{j}- exists j in Z_{n}: j <= k/\
mark
_{j}tmark
.- InvC_{n, k}(in
act
, incnt
, intval
) :<=>
/\
- forall j in Z_{n}: j > k => ~
act
_{j}tval
= sum_{j in Z_n, j > k}cnt
_{j}
Proof: Init implies ~sig
_{k} for every
k in Z_{n}. Assume Inv_{n}(act
, cnt
,
sig
, mark
, tval
, tmark
).
We show Inv_{n}(act
'
, cnt
'
,
sig
'
, mark
'
, tval
'
,
tmark
'
).
act
_{i}, chan
, cnt
)cnt
, thus only the first
or the second disjunct of the invariant can be invalidated. However, the
action implies
act
_{i} and thus by the invariant also i < k; it can therefore not affect the first disjunct. Since cnt
'
differs
from cnt
only in
cnt
'_{i} = cnt
_{i}+1, the
second disjunct cannot be invalidated either.
(The last remark explains why the second disjunct of the invariant has to establish that the denoted sum remains positive, not just non-zero.)
act
, chan
, cnt
, mark
)sig
'
_{k}. We have to show
If j <= k, we have
\/
- ...
- 0 <
tval
+sum_{l in Z_n, l <= k}cnt
[j |->cnt
_{j}-1]_{l}- exists l in Z_{n}: l <= k/\
mark
[j |-> true]_{l}tmark
mark
[j |-> true]_{j} and thus the third disjunct.
(This case is the cause of the third disjunct of the invariant: a process which currently holds or did not yet get the token receives a message and becomes marked.)
Assume j > k. We know sig
_{k} from
sig
'
_{k} and the action. We proceed
by case distinction on the consequence of the induction hypothesis.
act
_{l})/\ (tval
= sum_{l in Z_n,
l > k}cnt
_{l})cnt
[j |-> cnt
_{j}-1]_{l} = sum_{l in Z_n,
l <= k}cnt
_{l}.
The case gives us
tval
+sum_{l in Z_n, l <= k}cnt
_{l} = sum_{l in Z_n}cnt
_{l}.
From the action and Proposition 5, we know
sum_{l in Z_n}cnt
_{l} > 0 and thus the second disjunct.
(This case is the cause for the second disjunct of the invariant: a message is received by a process that previously held the token, which invalidates the token counter, but keeps the message count positive.)
tval
+sum_{l in Z_n,
l <= k}cnt
_{l})cnt
_{l} = sum_{l in Z_n,
l <= k}cnt
[j |-> cnt
_{j}-1]_{l}.
mark
_{l}, for some
l in Z_{n})mark
[j |-> true]_{l}.
tmark
act
)act
_{i}, which cannot invalidate the
invariant.
run
, sig
,
mark
, tval
, tmark
)sig
'
_{k}, therefore we have
sig
[i-_{n}1 |-> true]_{k}. By
Proposition 7 and i = 0, we know
k = i-_{n}1 = n-_{n}1.
We have to show
which is trivially true.
\/
/\
- forall j in Z_{n}: j > n-_{n}1 => ~
act
_{j}- 0 = sum_{j in Z_n, j > n-_n1}
cnt
_{j}- ...
act
_{i},
cnt
_{i}, sig
, mark
, tval
, tmark
,
term
)sig
'
= sig
[i |-> false] that
~sig
'
_{k} for all k in Z_{n}.
Now assume i != 0 and
take k in Z_{n} with sig
'
_{k}.
Since sig
'
= sig
[i |-> false,
i-_{n}1 |-> true],
Proposition 7 gives us k = i-_{n}1. We have to
show
The action implies
\/
/\
- forall j in Z_{n}: j > i-_{n}1 => ~
act
_{j}tval
+cnt
_{i} = sum_{j in Z_n, j > i-_n1}cnt
_{j}- 0 <
tval
+cnt
_{i}+sum_{j in Z_n, j <= i-_n1}cnt
_{j}- exists j in Z_{n}: j <= i-_{n}1/\
mark
[i |-> false]_{j}tmark
\/mark
_{i}.
sig
_{i}; we proceed by case distinction on
the consequence of the induction hypothesis.
act
_{j})/\ (tval
= sum_{j in Z_n,
j > i}cnt
_{j})act
_{i} and thus
forall j in Z_{n}:
j > i-_{n}1 => ~act
_{j}.
Furthermore we know
tval
+cnt
_{i} = (sum_{j in Z_n,
j > i}
cnt
_{j})+cnt
_{i} = sum_{j in Z_n, j > i-_n1}
cnt
_{j}.
tval
+sum_{j in Z_n,
j <= i}cnt
_{j})cnt
_{j} = cnt
_{i}+sum_{j in Z_n,
j <= i-_n1}cnt
_{j}.
mark
_{j}, for some
j in Z_{n})mark
[i |-> false]_{j}.
If j = i, we have mark
_{i}.
(The second part of this case is the cause of the fourth disjunct of the invariant. If the token reaches a process that is marked because it has previously received a message, the token gets marked.)
tmark
Finally we come to the heart of the matter: showing that our system invariant indeed implies the stated correctness property.
Proof of Proposition 3 (Correctness):
Init implies ~term
. Assume
term
=> terminated_{_n}(act
,
chan
). We show
term
'
=> terminated_{_n}(act
'
,
chan
'
).
If term
, then the induction hypothesis and Proposition 2
imply terminated_{_n}(act
'
,
chan
'
). Now assume ~term
and term
'
. We
then have
Forward_{i}(act
_{i}, cnt
_{i},
sig
, mark
, tval
, tmark
, term
) with
i = 0. Thus we know:
(act) | ~act _{0}, |
(sig) | sig _{0}, |
(tval) | tval ' = tval +cnt _{0}, |
(tmark) | tmark ' = tmark \/ mark _{0}, |
(last) | ~tmark ' /\ tval ' = 0. |
(tvaln) | tval +cnt _{0} = 0. |
(markn) | ~tmark /\ ~mark _{0}. |
(inact) | forall j in Z_{n}:
j > 0 => ~act _{j}, |
(tvals) | tval = sum_{j in Z_n,
j > 0}cnt _{j}. |
From (act) and (inact), we conclude
(cres1) | forall j in Z_{n}: ~act _{j}, |
(cres2) | sum_{j in Z_n}cnt _{j} = 0. |
act
= act
'
, and
chan
= chan
'
, (cres1) and (cres2) imply
terminated_{_n}(act
'
, chan
'
). []