Distribution is one of the most pervasive features of modern computing architectures. From the multiple specialized processors that make up a personal computer, to the network of computers in a modern automobile, to the Internet itself, modern systems require the coordination of computations on different physical platforms, each with its own fault characteristics, and connected with inherently unreliable links. The distributed computing community develops formal methods for designing and analyzing such systems. One of its main conceptual tools, initiated with a 1973 technical report by Edsger Dijkstra, is the notion of a distributed algorithm that is self-stabilizing. That is, the algorithm is allowed to start in any state, even one that does not satisfy its specification, but is guaranteed to converge to a conforming state within a finite time. Such a mechanism not only allows a distributed system to start smoothly, but also enables it to recover automatically from faults that may emerge during its operation, without the need for centralized external intervention to reset the system.
This volume is a brief but readable formal introduction to distributed self-stabilizing algorithms, with copious references to the literature down to 2019. The first chapter introduces the problem and the relevant literature. Chapter 2 provides the formal foundation, defining a network of computational nodes, a model of computation (featuring the concept of a daemon that regulates the execution of local components), the nature of self-stabilization, and measures of complexity. These definitions are couched in an approach to self-stabilization known as the atomic state model, which focuses on the states of the individual processes.
The next four chapters give examples of self-stabilizing algorithms for four important prototypical problems in distributed computing: graph coloring (assigning colors to nodes so that no two nodes have the same color), synchronous unison (ensuring that the clocks on all computing nodes agree), construction of a breadth-first search spanning tree, and the token ring (allowing nodes to take turns in executing). Each chapter defines the problem, presents the algorithm, proves that it stabilizes, and analyzes its time and space complexity.
An important technique in algorithm design is the composition of simpler algorithms with proven properties to form a more complex algorithm. Chapter 7 discusses techniques for achieving such composition.
The main body of the book focuses on the atomic state model of self-stabilization. An alternative approach to formalization focuses not on the states of the nodes, but on the communication mechanisms by which they exchange information, and the final chapter introduces this message-passing model.
The book includes a bibliography and index.
This concise volume will be a useful supplement to courses on distributed computing, and an accessible introduction for self-study.