Scenes from Paxos
Some sketches of things that happen in an implementation of a distributed consensus algorithm.
Steady state
Most of the time there’s an ongoing conversation that looks like this.
Sometimes there are no clients making requests, but the system’s conversation continues.
When nodes die
The system continues to run as long as more than half of the nodes are alive and able to talk to each other.
This is OK since two of the three nodes are still alive.
This is not OK since only one of the three nodes is alive. In particular, the leader here can’t tell whether the other two nodes are dead or whether it has just been disconnected from them and they’re still operating.
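The majority rule above can be sketched as a small check. This is only an illustration, not code from any particular implementation, and the name `has_quorum` is hypothetical.

```python
def has_quorum(reachable: int, cluster_size: int) -> bool:
    """Return True if strictly more than half of the nodes are reachable.

    `reachable` counts the nodes this node can currently talk to,
    including itself (hypothetical helper, for illustration only).
    """
    return 2 * reachable > cluster_size


if __name__ == "__main__":
    print(has_quorum(2, 3))  # two of three nodes
    print(has_quorum(1, 3))  # one of three nodes
```

Requiring a strict majority means that any two quorums share at least one node, which is why a lone node that cannot reach the others must stop making progress rather than guess.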
Nodes that are disconnected from the leader may attempt to become leader themselves. However, followers should remain faithful to their current leader for as long as they stay connected to it.
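One way to picture this faithfulness is a follower that refuses to back a new candidate until it has gone some time without hearing from its leader. The sketch below assumes a simple silence timeout as the test for "still connected"; the class, its method names and the five-second figure are all hypothetical rather than taken from a real implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Follower:
    leader: Optional[str] = None  # the leader this follower currently follows
    last_heard: float = 0.0       # time of the last message from that leader
    timeout: float = 5.0          # assumed: seconds of silence before giving up

    def hear_from_leader(self, leader: str, now: float) -> None:
        """Record a message from the leader, refreshing the connection."""
        self.leader = leader
        self.last_heard = now

    def connected_to_leader(self, now: float) -> bool:
        """The follower counts as connected if the leader spoke recently."""
        return self.leader is not None and now - self.last_heard < self.timeout

    def will_support(self, candidate: str, now: float) -> bool:
        """Back a candidate only if it is the current leader or the leader is silent."""
        if candidate == self.leader:
            return True
        return not self.connected_to_leader(now)


if __name__ == "__main__":
    follower = Follower()
    follower.hear_from_leader("n1", now=100.0)
    print(follower.will_support("n2", now=102.0))  # leader heard 2 s ago
    print(follower.will_support("n2", now=110.0))  # leader silent for 10 s
```

Passing `now` explicitly instead of reading a clock keeps the sketch deterministic and easy to test.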
When the leader dies…
… clever things happen to make sure that all the nodes carry on agreeing with each other and nothing is lost in the handover.
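In textbook single-decree Paxos, one of those clever things is the rule a would-be leader follows after collecting promises from a majority: if any promise reports a previously accepted value, the new leader must propose the one with the highest proposal number instead of its own. The sketch below shows only that rule. Whether the implementation described here works exactly this way is an assumption, and `choose_value` and the `Promise` shape are hypothetical names.

```python
from typing import List, Optional, Tuple

# A promise carries the highest-numbered proposal the node has already
# accepted, as (proposal_number, value), or None if it has accepted nothing.
Promise = Optional[Tuple[int, str]]


def choose_value(promises: List[Promise], own_value: str) -> str:
    """Pick the value a new leader should propose after gathering promises.

    `promises` is assumed to come from a majority of the nodes.
    """
    accepted = [p for p in promises if p is not None]
    if not accepted:
        # Nothing was accepted before, so the leader is free to use its own value.
        return own_value
    # Otherwise adopt the value attached to the highest proposal number.
    return max(accepted, key=lambda p: p[0])[1]


if __name__ == "__main__":
    print(choose_value([None, (3, "x"), (5, "y")], own_value="z"))
    print(choose_value([None, None], own_value="z"))
```

Because any majority overlaps with the majority that accepted an earlier value, the new leader is guaranteed to see that value in at least one promise, which is how nothing is lost in the handover.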