Invitation Algorithm for Leader Election

Very often, a network of processors has to solve a problem for which a certain coordination is needed. There should be a unique processor, known by all the others and that, typically, knows the others, that coordinates the work of the others by assigning them subtasks. In any case, the leader should distribute to the processors under its coordination some information defining the status of the problem to be solved.

Problem
Given a network of processors, make all of them agree on one that will act as a coordinator.
If the network is disconnected, each connected component should elect its own coordinator.

Assumption: the presence of failures in an asynchronous environment.

One strategy will be to give to all of them a priority number, and to elect as a leader the non-failed processor with the highest priority. Under the failure assumption, after a certain amount of time, the network can be split apart into more than one connected components. Each connected subset of processors should then agree on a (different) leader.

From time to time, each processor checks the existence of its coordinator. If the coordinator is not detected,
the processor will enter into a phase that triggers a new leader election.

On the other hand, some faulty processors can become alive again, constructing, in this way, bridges that will connect some portions of the network. New coordinators should be then elected.

*    *    *

Initially, each node is its own coordinator. Each coordinator periodically initiates a check to see whether there are other coordinators in the network. If so, then it tries to merge the groups of processes coordinated be these into its own group. In order to eliminate the possibility of a livelock, the coordinators will wait a certain amount of time, inverse proportional to their  priority, before starting the attempt of merging other groups. Then they invite the coordinators detected among neighbors, to join together with their groups.

The groups will have assigned a unique identification number, known by all member processes. The leader of the group distributes the definition of the problem to all others.

If, after a certain time, a node has no signals from the coordinator, it tries to see whether the coordinator has failed, in which case it cancels any previous relation and tries to recover from the loss. The same action happens when a process recovers from a failure. Basically, it creates its own group anew, group that contains only itself.

*      *      *

 A description of the states and actions associated to each processor   in order to modell the leader election algorithm. A Java program that implements the algorithm (zip archive). An applet that visualizes the evolution of the system. Please enable Java!

Presentation, description of the automaton and applet made by:
Ralf Hemmecke
Bogdan Matasaru
Cleopatra Pau
Petru Pau