zookeeper 使用Zab(zookeeper atom broadcast).
Zab is a different protocol than Paxos, although it shares with it some key aspects, as for example:
- A leader proposes values to the followers
- Leaders wait for acknowledgements from a quorum of followers before considering a proposal committed (learned)
- Proposals include epoch numbers, which are similar to ballot numbers in Paxos
The main conceptual difference between Zab and Paxos is that it is primarily designed for primary-backup systems, like Zookeeper, rather than for state machine replication.
Paxos can be used for primary-backup replication by letting the primary be the leader. The problem with Paxos is that, if a primary concurrentlyproposes multiple state updates and fails, the new primary may apply uncommitted updates in an incorrect order. An example is presented in our DSN 2011 paper (Figure 1). In the example, a replica should only apply the state update B after applying A. The example shows that, using Paxos, a new primary and its follows may apply B after C, reaching an incorrect state that has not been reached by any of the previous primaries.
A workaround to this problem using Paxos is to sequentially agree on state updates: a primary proposes a state update only after it commits all previous state updates. Since there is at most one uncommitted update at a time, a new primary cannot incorrectly reorder updates. This approach, however, results in poor performance.
Zab does not need this workaround. Zab replicas can concurrently agree on the order of multiple state updates without harming correctness. This is achieved by adding one more synchronization phase during recovery compared to Paxos, and by using a different numbering of instances based on zxids.
Chubby VS Zookeeper：