Master-Workers Architecture and Leader Election Algorithm
Challenges with Leader Election
- Automatic, system-driven leader election is not a trivial problem to solve.
- Arriving at an agreement on a leader in a large cluster of nodes is even harder.
- By default, each node knows only about itself, so service registration and discovery are required.
- A failure detection mechanism is necessary to trigger automatic leader re-election in a cluster.
Problem statement
- Implement distributed algorithms for consensus and failover from scratch.
How will we achieve this?
Using ZooKeeper
Leader Election Algorithm
- Every node that connects to ZooKeeper volunteers to become the leader. Each node submits its candidacy by adding a Znode that represents itself under the election parent. Since ZooKeeper maintains a global order, it can name each Znode according to the order in which it was added.
- After a node finishes creating its Znode, it queries the current children of the election parent. Because of the order ZooKeeper provides, each node that queries the children of the election parent is guaranteed to see all the Znodes created before its own.
- If the Znode that the current node created has the smallest number, the node knows that it is now the leader. Otherwise, the node knows that it is not the leader and waits for instructions from the elected leader.
- This is how we break the symmetry and arrive at a global agreement on the leader node.
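The steps above can be sketched with a small in-memory simulation. This is only an illustration of the ordering logic, not a ZooKeeper client: `ElectionParent`, `Node`, the `c_` name prefix, and the ten-digit zero-padded counter are all hypothetical stand-ins chosen for this example, and the final step (removing the leader's Znode) is an assumed way to model a leader failure.

```python
class ElectionParent:
    """In-memory stand-in for the election parent (hypothetical, not a ZooKeeper API)."""

    def __init__(self, prefix="c_"):
        self.prefix = prefix      # assumed name prefix for candidate Znodes
        self._next_seq = 0        # models the global order kept by the coordinator
        self._children = []

    def create_sequential(self):
        # Name each child according to the order in which it was added.
        name = f"{self.prefix}{self._next_seq:010d}"
        self._next_seq += 1
        self._children.append(name)
        return name

    def get_children(self):
        return list(self._children)

    def delete(self, name):
        # Assumption: a failed node's Znode disappears from the parent.
        self._children.remove(name)


class Node:
    """A cluster node that volunteers for leadership."""

    def __init__(self, parent):
        self.parent = parent
        self.znode = None

    def volunteer(self):
        # Submit candidacy by adding a Znode under the election parent.
        self.znode = self.parent.create_sequential()

    def is_leader(self):
        # Zero-padded names sort in creation order, so the smallest name
        # belongs to the earliest candidate still present.
        return self.znode == min(self.parent.get_children())


parent = ElectionParent()
nodes = [Node(parent) for _ in range(3)]
for node in nodes:
    node.volunteer()

for node in nodes:
    print(node.znode, "leader" if node.is_leader() else "follower")

# Model a leader failure: once its Znode is gone, the next smallest takes over.
parent.delete(nodes[0].znode)
for node in nodes[1:]:
    print(node.znode, "leader" if node.is_leader() else "follower")
```

Every node runs the same check against the same ordered list of children, so no extra messages between nodes are needed to agree on who leads. In a real deployment the counter and the child list would live in ZooKeeper rather than in process memory.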
Reading material
- https://www.google.com/search?q=leader+selection+in+ldr+infrastructure&oq=leader+selection+in+ldr+infrastructure&gs_lcrp=EgZjaHJvbWUyBggAEEUYOTIGCAEQLhhA0gEJMTE0MjRqMGoxqAIAsAIA&sourceid=chrome&ie=UTF-8
- https://aws.amazon.com/builders-library/leader-election-in-distributed-systems/
- https://theses.hal.science/tel-03624018/file/FAVIER_Arnaud_2022.pdf
TODO