Go backward to 2 The System ModelGo up to TopGo forward to 4 Correctness of the Algorithm

# 3 The Termination Detection Algorithm

We will extend the System interface to

SystemTn in N(`act` in Zn->B, `chan` in Zn x Zn->Seq(MSG), `term` in B)
such that `term` initialized by
Init(...) :<=>

/\
• ...
• ~`term`
signals termination of the system.

Basic Idea The derivation of the termination detection algorithm is based on the following idea (see Figure 2): If Process 0 wants to detect termination, it sends a signal to process n-1. If an inactive process i > 0 receives the signal, it forwards the signal to process i-1. An active process keeps the signal until it becomes inactive. The signal therefore represents a "token" that circulates through the ring of processes.

We model this circulation by introducing a variable

`sig` in Zn->B

initialized by

Init(...) :<=>

/\
• ...
• forall i in Zn: ~`sig`i

and by two actions

• Starti(...) :<=>

/\
• i = 0
• `sig``'` = `sig`[i-n1 |-> true]
• Forwardi(...) :<=>

/\
• `sig``'` = if i != 0 then `sig`[i |-> false, i-n1 |-> true] else `sig`[i |-> false]

where a-nb denotes the difference modulo n, i.e., 0-n1 = n-1).

(Modeling token circulation by an array of boolean signals reflects the behavior of the distributed algorithm closer than modeling it by a single integer position that "automatically" guarantees token unicity.)

Figure 2: Token Circulation

However, above specification allows Process 0 to submit a new token before it has received the token it has previously submitted. We therefore introduce another variable

`run` in B

that records whether there is a token in the system and then disables Start:

• Starti(...) :<=>

/\
• ...
• ~`run`/\ `run``'`
• Forwardi(...) :<=>

/\
• ...
• `run``'` = if i != 0 then `run` else false

However, when Process 0 finally receives the token, it can only deduce that each process has been inactive at some time in the past. Since a process mave have received a message after forwarding the token, it may have become active again. Only if the algorithm could maintain the invariant

forall k in Zn: `sig`k =>
forall j in Zn: j > k => ~`act`j,

Process 0 could deduce from the receipt of the token (i.e., `sig`0) and its own inactivity (i.e., `~``act`0) that all processes are currently inactive.

#### Message Counters

Even if Process 0 receives the token under above invariant, there may be still messages pending in the network that can cause processes to become active again. We therefore need some mean to keep track of the number of messages pending in the network. Since a process only knows about the messages this it itself sends or receives, this can be only achieved by introducing a distributed counter

`cnt` in Zn->Z

such that every process sending a message increases its counter and every process receiving a message decreases its counter (see Figure 3):

• Sendi, j(...) :<=>

/\
• ...
• `cnt``'` = `cnt`[i |-> `cnt`i+1],

/\
• ...
• `cnt``'` = `cnt`[j |-> `cnt`j-1]

Figure 3: Message Counters

Then clearly the total sum of the distributed counter equals the number of the messages in the network, i.e., the system maintains the invariant

sumi in Z_n`cnt`i = sumi in Z_nsumj in Z_nlen(`chan`i, j)

where len(`ch`) := such l in N: exists m0, ..., ml-1: `ch` = < m0, ..., ml-1 > .

#### Token Value

How can Process 0 learn about this sum? Apparently this is only possible, if the circulating token collects this information and delivers it to this process. We therefore introduce

`tval` in Z

modeling the value carried by the token. Process 0 initializes the value to 0 when submitting the token; each forwarding process adds the value of its counter (see Figure 4):

• Starti(...) :<=>

/\
• ...
• `tval``'` = 0
• Forwardi(...) :<=>

/\
• ...
• `tval``'` = `tval`+`cnt`i

Figure 4: Token Value

If the algorithm could maintain the invariant

forall k in Zn: `sig`k =>

/\
• forall j in Zn: j > k => ~`act`j
• `tval` = sumj in Z_n, j > k`cnt`j

then Process 0 could on receipt of the token deduce from ~`act`0 that all processes are inactive and from `cnt`0+`tval` = 0 that no message is in the network, i.e., that the system has terminated.

Unfortunately, the algorithm cannot maintain this invariant. We therefore have to disjoin the conclusion part (the "core") of the invariant with some weakening conditions such that the algorithm is able to maintain the weaker invariant but Process 0 can still conclude termination from the information stored locally and received by the token.

#### Awaking a Process

Figure 5: Invalidating the Core Invariant

Let us investigate how above invariant can be falsified by an action (i.e., the invariant holds in the state before the action but does not any more hold in the state after the action). Since all processes that have already forwarded the token are inactive, the only possibility is that such a process receives a message and becomes active again (see Figure 5). This however means that there must be still pending messages in the network. Since the invariant still holds before invalidation, a weaker version of the variant is

forall k in Zn: `sig`k =>

\/

• /\
• forall j in Zn: j > k => ~`act`j
• `tval` = sumj in Z_n, j > k`cnt`j
• 0 < `tval`+sumj in Z_n, j <= k`cnt`j

If Process 0 receives the token (i.e., `sig`0 holds), it can still conclude termination from ~`act`0 and `cnt`0+`tval` = 0.

#### Process Marks

Is it still possible for an action to falsify the weaker invariant? Clearly it can be invalidated, if a process that has not yet forwarded the token receives a message, which causes the summation term in above condition to drop by one (see Figure 6). A simple way to record whether such an incident has taken place is to associate to each process a marker that is set on message receipt. We therefore introduce

`mark` in Zn->B

which is initialized to false in every position and which is set to true when a message is received:

• Init(...) :<=>

/\
• ...
• forall i in Zn: ~`mark`i

/\
• ...
• `mark``'` = `mark`[j |-> true],

Then we can weaken the invariant further to

forall k in Zn: `sig`k =>

\/

• /\
• forall j in Zn: j > k => ~`act`j
• `tval` = sumj in Z_n, j > k`cnt`j
• 0 < `tval`+sumj in Z_n, j <= k`cnt`j
• exists j in Zn: j <= k/\ `mark`j

If Process 0 receives the token, i.e., `sig`0 holds, it can still conclude termination from ~`act`0, `cnt`0+`tval` = 0, and ~`mark`0.

Figure 6: Process Markers

#### Token Marker

Clearly this weakened invariant is still falsified when the marked process with the least index forwards the token (see Figure 7). Therefore the token itself has to record the information when it passes a marked process. We introduce a token marker

`tmark` in B

which is initialized by Process 0 and forwarded by each process as

• Starti(...) :<=>

/\
• ...
• ~`tmark``'`
• Forwardi(...) :<=>

/\
• ...
• `tmark``'` = `tmark`\/ `mark`i

Then we can weaken the invariant further by

forall k in Zn: `sig`k =>

\/

• /\
• forall j in Zn: j > k => ~`act`j
• `tval` = sumj in Z_n, j > k`cnt`j
• 0 < `tval`+sumj in Z_n, j <= k`cnt`j
• exists j in Zn: j <= k/\ `mark`j
• `tmark`

This invariant cannot be falsified any more by the activity of any process.

Process 0 can then conclude termination from `sig`0, ~`act`0, `cnt`0+`tval` = 0, `cnt`0+`tval` = 0, and ~`tmark`. By definition of `tmark``'` and `tval``'`, this can be stated as

Forwardi(...) :<=>

/\
• ...
• `term``'` <=> if i != 0 then `term` else (~`tmark``'`/\ `tval``'` = 0).

Figure 7: Token Marker

#### Resetting Process Markers

If Process 0 cannot yet conclude termination after a termination detection round, it may later start another round. Without resetting the process markers, however, such an attempt is doomed to fail. Fortunately, after it has marked the token, the process marker has fulfilled its task and can be reset:

Forwardi(...) :<=>

/\
• ...
• `mark``'` = `mark`[i |-> false]

Since this action takes place simultaneously with token transmission (i.e., ~`sig``'`i holds), it cannot falsify the invariant.

If the system terminates during a termination detection round, Process 0 may not yet conclude termination at the end of this round. However, after this run, no process is active, there are no messages in the network any more, and the sum of the message counters equals 0. Still some processes may be marked; the next termination detection round may thus fail, too. Anyway, after this run all processes are unmarked, and the next termination detection round will succeed.

#### Algorithm Specification

The complete algorithm is compiled in Specification 2. The existential quantifier

var `v` in T: ...

introduces a local program variable `v` in a formula like (exists v in T: ...) introduces a local mathematical variable v. The crucial difference between both kinds is that the "variable" `v` may have different values in different states of a program behavior while the "constant" v has always the same value in all states.

Specification 2: The Termination Detection Algorithm

SystemTn in N(`act` in Zn->B, `chan` in Zn x Zn->Seq(MSG), `term` in B) :<=>
(a system that may detect its termination)
let
• Init(in `act`, in `chan`, in `term`, in `run`, in `cnt`, in `sig`, in `mark`) :<=>

/\
• ...(as in the basic system)
• ~`term`/\ ~`run`
• forall i in Zn: `cnt`i = 0/\ ~`sig`i/\ ~`mark`i,
• Sendi, j(in `a`, io `chan`, io `cnt`) :<=>

/\
• ...(as in the basic system)
• `cnt``'` = `cnt`[i |-> `cnt`i+1],
• Receivei, j(io `act`, io `chan`, io `cnt`, io `mark`) :<=>

/\
• ...(as in the basic system)
• `cnt``'` = `cnt`[j |-> `cnt`j-1]
• `mark``'` = `mark`[j |-> true],
• Inactivatei(io `act`) :<=> ...(as in the basic system)
• Starti(io `run`, io `sig`, io `mark`, io `tval`, out `tmark`) :<=>
(Process 0 starts termination detection)
/\
• i = 0/\ ~`run`
• `run``'`
• `sig``'` = `sig`[i-n1 |-> true]
• `mark``'` = `mark`[i |-> false]
• `tval``'` = 0
• ~`tmark``'`,
• Forwardi(in `a`, in `c`, io `sig`, io `mark`, io `tval`, io `tmark`, io `term`) :<=> (process i gets token, forwards it, or detects termination)

/\
• ~`a`/\ `sig`i
• `run``'` = if i != 0 then `run` else false
• `sig``'` = if i != 0 then `sig`[i |-> false, i-n1 |-> true] else `sig`[i |-> false]
• `mark``'` = `mark`[i |-> false]
• `tval``'` = `tval`+`c`
• `tmark``'` = `tmark`\/ `mark`i
• `term``'` <=> if i != 0 then `term` else (~`tmark``'`/\ `tval``'` = 0),
• Actionn, i(io `act`, io `chan`, io `term`, io `run`, io `cnt`, io `sig`, io `mark`, io `tval`, io `tmark`) :<=>

\/
• exists j in Zn: Sendi, j(in `act`i, io `chan`, io `cnt`)
• exists j in Zn: Receivei, j(io `act`, io `chan`, io `cnt`, io `mark`)
• Inactivatei(io `act`)
• Starti(io `run`, io `sig`, io `mark`, out `tval`, out `tmark`)
• Forwardi(in `act`i, in `cnt`i, io `sig`, io `mark`, io `tval`, io `tmark`, io `term`):

var
(internal variables of the algorithm)

• `run` in B, `cnt` in Zn->Z, `sig` in Zn->B, `mark` in Zn->B,
• `tval` in Z, `tmark` in B:

/\
• Init(in `act`, in `chan`, in `term`, in `run`, in `cnt`, in `sig`, in `mark`)
• [][exists i in Zn: Actionn, i(io `act`, io `chan`, io `term`, io `run`, io `cnt`, io `sig`, io `mark`, io `tval`, io `tmark`)]

Maintainer: Wolfgang Schreiner
Last Modification: August 20, 1998