Zookeeper
High Performance Distributed System Coordination Service
A high performance coordination service designed specifically for distributed systems.
Provides us with an abstraction layer for higher level distributed algorithms.
Centralized service
Distributed applications need a centralized service for the following features:
- configuration information
- naming
- providing distributed synchronization
- providing group services
Zookeeper
ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.
Features
- open-source
- distributed and highly reliable
- centralized coordination service
It exposes a simple set of primitives that can be used by distributed applications to implement higher level services for
- synchronization
- configuration maintenance
- groups and naming
What makes Zookeeper a good solution?
- It is itself a distributed system, which gives it high availability and reliability.
- In production, it typically runs as a cluster with an odd number of nodes, at least three.
- Uses redundancy to tolerate failures and stay functional.
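The preference for an odd number of nodes follows from majority quorums: the cluster stays functional only while more than half of its servers can reach each other. The sketch below is plain arithmetic, not ZooKeeper code, and the function names are made up for illustration.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that can fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Print ensemble size, quorum size, and tolerated failures side by side.
    for n in (3, 4, 5, 6, 7):
        print(n, quorum_size(n), tolerated_failures(n))
```

A four-node cluster tolerates one failure, the same as a three-node cluster, so the extra even node adds cost without adding fault tolerance.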
Features and Benefits
- ZooKeeper allows distributed processes to coordinate with each other through a shared hierarchical namespace, which is organized like a standard file system and is kept in memory.
- ZooKeeper is a replicated service, and each client connects to a single ZooKeeper server. The client maintains a TCP connection through which it sends requests, gets responses, gets watch events, and sends heartbeats.
- ZooKeeper provides the following guarantees:
- Sequential Consistency: Updates from a client will be applied in the order that they were sent.
- Atomicity: Updates either succeed or fail. No partial results.
- Single System Image: A client will see the same view of the service regardless of the server that it connects to.
- Reliability: Once an update has been applied, it will persist from that time forward until a client overwrites the update.
- Timeliness: The client’s view of the system is guaranteed to be up-to-date within a certain time bound (configurable, on the order of seconds).
- Note: The default configuration recommended by ZooKeeper is assumed here.
- In case of failure, ZooKeeper uses the FastLeaderElection algorithm to elect a new leader and preserve reliability. Leader election lets the system recover quickly enough to prevent throughput from dropping substantially. According to the ZooKeeper documentation, it takes less than 200 ms to elect a new leader.
- Security
- ZooKeeper supports Kerberos authentication
- Authorization is done via ACLs
- Supports several types of restrictions
- Message digest
- Hostname
- IP address
- Can limit access by function
- Read, write, create, delete, admin
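As a rough illustration of the ACL points above, the sketch below builds the identifier used by the message digest scheme and checks a permission mask. It assumes the default SHA-1 based digest (`user:base64(sha1(user:password))`) and the permission bit values from ZooKeeper's `ZooDefs.Perms`. The helper names `digest_acl_id` and `is_allowed` are hypothetical and not part of any ZooKeeper client.

```python
import base64
import hashlib

# Permission bits; values assumed to match ZooKeeper's ZooDefs.Perms.
READ, WRITE, CREATE, DELETE, ADMIN = 1, 2, 4, 8, 16


def digest_acl_id(username: str, password: str) -> str:
    """Build a digest-scheme id: username:base64(sha1(username:password))."""
    raw = f"{username}:{password}".encode("utf-8")
    hashed = base64.b64encode(hashlib.sha1(raw).digest()).decode("ascii")
    return f"{username}:{hashed}"


def is_allowed(granted: int, requested: int) -> bool:
    """True if every requested permission bit is present in the granted mask."""
    return (granted & requested) == requested


if __name__ == "__main__":
    print(digest_acl_id("alice", "secret"))        # hypothetical credentials
    print(is_allowed(READ | WRITE, READ))          # reader/writer may read
    print(is_allowed(READ | WRITE, DELETE))        # but may not delete
```

Limiting access by function then amounts to granting each identity only the bits it needs, for example `READ` for consumers of configuration and `READ | WRITE` for the process that updates it.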
Distributed systems with Zookeeper
Instead of making the nodes in a cluster communicate directly with each other to coordinate the work, the nodes will communicate with the Zookeeper server instead.
Data model
The data model looks a lot like a tree and is very similar to a file system.

- Each element in the tree or file system is called a Znode.
- Properties of a Znode
- Hybrid between a file and a directory.
- Znodes can store any data inside (like a file)
- Znodes can have children znodes (like a directory)
- Types of a Znode
- Persistent - persists between sessions. e.g. if our application disconnects from ZooKeeper and then reconnects, a persistent znode that was created by our application stays intact with all its children and data.
- Ephemeral - is deleted when the session ends. Lets us identify if the node that created a znode is now dead.
- Properties of a Znode
- ZooKeeper manages information as a hierarchical system of nodes (much like a file system). Each node can contain data or can contain child nodes.
- ZooKeeper models a hierarchical filesystem
- A znode may contain data and/or other znodes
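The data model above can be sketched as a toy in-memory tree. This is not a ZooKeeper client and talks to no server; the class `ZnodeTree` and its methods are invented for illustration. It only shows that a znode holds data and children at the same time, and that ephemeral znodes disappear when their session ends.

```python
class ZnodeTree:
    """Toy in-memory model of a znode namespace (not a ZooKeeper client)."""

    def __init__(self):
        # path -> (data, owner session id, or None for persistent znodes)
        self.nodes = {"/": (b"", None)}

    def create(self, path, data=b"", ephemeral_owner=None):
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise KeyError(f"parent {parent} does not exist")
        if self.nodes[parent][1] is not None:
            raise ValueError("ephemeral znodes cannot have children")
        if path in self.nodes:
            raise KeyError(f"{path} already exists")
        self.nodes[path] = (data, ephemeral_owner)

    def children(self, path):
        """Names of the direct children of a znode, sorted."""
        prefix = path.rstrip("/") + "/"
        names = (p[len(prefix):] for p in self.nodes
                 if p != path and p.startswith(prefix))
        return sorted(n for n in names if "/" not in n)

    def close_session(self, session_id):
        """Ending a session removes every ephemeral znode it created."""
        owned = [p for p, (_, owner) in self.nodes.items() if owner == session_id]
        for p in owned:
            del self.nodes[p]


if __name__ == "__main__":
    tree = ZnodeTree()
    tree.create("/app", b"config-v1")            # persistent, holds data
    tree.create("/app/workers")                  # persistent, acts as a directory
    tree.create("/app/workers/w1", ephemeral_owner="session-1")
    print(tree.children("/app/workers"))
    tree.close_session("session-1")              # worker died or disconnected
    print(tree.children("/app/workers"))
```

Watching the children of a path such as `/app/workers` is how other processes would notice that a worker's session has ended.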
References
- https://zookeeper.apache.org/
- Zookeeper wiki: https://cwiki.apache.org/confluence/display/ZOOKEEPER/Index
- ZooKeeper Recipes and Solutions: https://zookeeper.apache.org/doc/current/recipes.html
- The Tao of ZooKeeper: https://cwiki.apache.org/confluence/display/ZOOKEEPER/Tao
TODO
Use cases
- Distributed locking
- Locking for coordination among threads
- Imagine a multithreaded program where a lock is needed to coordinate among threads.
- The java.util.concurrent package is the first solution most developers reach for, but it only coordinates threads inside a single JVM. How will we handle locking when the application needs to scale out?
- The problem gets bigger when the locking needs to be handled across requests coming from different networks and machines.
- This is where we can use Apache ZooKeeper.
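These notes do not spell out how such a lock works, so the following is only a sketch of the general idea behind the lock recipe on the ZooKeeper Recipes page: each contender creates an ephemeral sequential znode under a lock path, and the contender with the lowest sequence number holds the lock. The class `LockQueue` is a hypothetical in-memory stand-in for the server. It does not model watches or network failures.

```python
class LockQueue:
    """Toy model of the lock recipe: one ephemeral sequential znode per contender."""

    def __init__(self):
        self.next_seq = 0   # counter the server would append to znode names
        self.waiting = {}   # sequence number -> session id

    def enqueue(self, session_id):
        """Model creating an ephemeral sequential znode under the lock path."""
        seq = self.next_seq
        self.next_seq += 1
        self.waiting[seq] = session_id
        return seq

    def holder(self):
        """The contender with the lowest sequence number holds the lock."""
        if not self.waiting:
            return None
        return self.waiting[min(self.waiting)]

    def release(self, session_id):
        """Model unlock or session expiry: the session's znodes disappear."""
        self.waiting = {s: o for s, o in self.waiting.items() if o != session_id}


if __name__ == "__main__":
    lock = LockQueue()
    lock.enqueue("machine-a")
    lock.enqueue("machine-b")
    print(lock.holder())        # machine-a arrived first
    lock.release("machine-a")   # explicit unlock, or machine-a crashed
    print(lock.holder())        # the lock passes to machine-b
```

Because the znodes are ephemeral, a holder that crashes releases the lock automatically when its session ends. An in-process lock cannot offer that across machines.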
Used by
Popular technology used by many companies and projects (Kafka, Hadoop, HBase, etc.)