Building Zookeeper cluster

Zookeeper is a cluster management solution. Therefore, Zookeeper needs to be set up as a cluster. If not, it can be the Single Point of Failure. Zookeeper cluster is called “ensemble”.

How to set up ensemble (based on v3.4.10)

Basic config for a single Zookeeper is as follows.

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/tkstone/some/path/zoo1
clientPort=2181

To change it into ensemble, add some more options. For simple test, I configured 3 Zookeepers on the same host.

Zookeeper#1

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/tkstone/some/path/zoo1
clientPort=2181
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890

Zookeeper#2

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/tkstone/some/path/zoo2
clientPort=2182
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890

Zookeeper#3

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/tkstone/some/path/zoo3
clientPort=2183
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890

For the above configurations, only clientPort and dataDir are different. “dataDir” is where Zookeeper snapshot is written.

And the most import option is “server.id=zoo_ip:port1:port2

  • id : id for each Zookeeper. It must be unique among ensemble
  • port1 : used for leader’s listening port (only enabled for leader process)
  • port2 : used for leader election, enabled for all processes

This is not enough. Inside each dataDir, a file, named “myid”, must be created. The file must have id value inside it. For example, for Zookeeper#1, /home/tkstone/some/path/zoo1/myid must have value “1”

How it works

When Zookeeper starts up, it looks for myid file and recognize itself’s id. After that, it tries to connect to the other nodes to elect leader (based on server.id option)

By default, Zookeeper ensemble works only if the majority nodes are running. (i.e 2 for 3 nodes ensemble) If it is not met, Zookeeper doesn’t show znode data.

If the majority nodes are running, the leader is elected. The others are followers.

Considerations

To avoid split brain issue, Zookeeper ensemble works only if the majority nodes are running. But by experimental option (readonlymode.enabled), you can enable read only mode when the majority nodes are not running. (More on the option)

How to determine which node is the leader

If a process is chosen as the leader, the following log is written

[QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2182:Leader@371] - LEADING - LEADER ELECTION TOOK - 217

But the other processes have the following log

[QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:Follower@64] - FOLLOWING - LEADER ELECTION TOOK - 3711

For the above logs, Zookeeper#2 is the leader.

One thought on “Building Zookeeper cluster

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.