Building a Kafka cluster

Building a Kafka cluster is crucial for a production system.

A Kafka cluster gives the following advantages:

  • Failover support when a node goes down
  • Queue replication
  • Support for consumer scale-out (standalone Kafka also supports consumer scale-out)

Step 1 – Build Zookeeper ensemble

Kafka depends on Zookeeper for its configuration management. Therefore, Zookeeper also needs to run as a cluster. (Refer to Zookeeper ensemble)

Step 2 – Run multiple Kafka brokers

Kafka doesn’t need any cluster-specific configuration except for two settings:

  • broker.id : each Kafka process must have a unique broker id
  • zookeeper.connect : every broker in the cluster must connect to the same Zookeeper ensemble

For testing, I’ve also set different listeners and log.dirs values for each broker so that multiple Kafka processes can run on one PC.
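
As a minimal sketch of the settings above, the script below writes one properties file per broker. The broker ids 0, 1 and 2 match the ids that appear in the describe output later in this post; the ports (9092 to 9094), the /tmp/kafka-logs-N directories and the OUT_DIR variable are assumptions made for illustration, not values from my actual setup.

```shell
#!/bin/sh
# Write one minimal properties file per broker for a single-PC test cluster.
# Ports, log directories and OUT_DIR are illustrative assumptions.
OUT_DIR="${OUT_DIR:-./cluster-config}"
mkdir -p "$OUT_DIR"

for id in 0 1 2; do
  port=$((9092 + id))   # each broker needs its own listener port on one PC
  cat > "$OUT_DIR/server-$id.properties" <<EOF
# unique per broker
broker.id=$id
listeners=PLAINTEXT://localhost:$port
log.dirs=/tmp/kafka-logs-$id
# identical on every broker in the cluster
zookeeper.connect=localhost:2181
EOF
done
```

Each generated file could then be passed to its own ${KAFKA_HOME}/bin/kafka-server-start.sh process. With a real Zookeeper ensemble, zookeeper.connect would list every ensemble member as a comma-separated host:port string.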

Step 3 – Create cluster-enabled Topics

Creating a cluster-enabled Topic is no different from creating a standalone Topic.

${KAFKA_HOME}/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 2 --partitions 10 --topic test-topic4

replication-factor : sets data redundancy. The value is the total number of copies of each partition, not the number of extra copies. For example, if the value is 2, each partition is stored on 2 nodes, including the leader node.

partitions : the number of partitions in the Topic. A Topic has multiple partitions so that consumers can scale out.

After creating a Topic, you can check the status.

[tkstone@localhost bin]$ ./kafka-topics.sh --describe --zookeeper localhost:2181
Topic:test-topic4	PartitionCount:10	ReplicationFactor:2	Configs:
	Topic: test-topic4	Partition: 0	Leader: 2	Replicas: 2,1	Isr: 1,2
	Topic: test-topic4	Partition: 1	Leader: 0	Replicas: 0,2	Isr: 0,2
	Topic: test-topic4	Partition: 2	Leader: 1	Replicas: 1,0	Isr: 0,1
	Topic: test-topic4	Partition: 3	Leader: 2	Replicas: 2,0	Isr: 0,2
	Topic: test-topic4	Partition: 4	Leader: 0	Replicas: 0,1	Isr: 0,1
	Topic: test-topic4	Partition: 5	Leader: 1	Replicas: 1,2	Isr: 1,2
	Topic: test-topic4	Partition: 6	Leader: 2	Replicas: 2,1	Isr: 1,2
	Topic: test-topic4	Partition: 7	Leader: 0	Replicas: 0,2	Isr: 0,2
	Topic: test-topic4	Partition: 8	Leader: 1	Replicas: 1,0	Isr: 0,1
	Topic: test-topic4	Partition: 9	Leader: 2	Replicas: 2,0	Isr: 0,2

The above result shows which partition is placed on which nodes. Some important points are:

  • ISR means “in-sync replicas”. It lists the ids of the nodes that currently hold an up-to-date copy of the topic partition
  • Leader is the leader node for the partition. If the leader is shut down, one of the in-sync replicas is chosen as the new leader
  • Replicas is the list of all nodes assigned to the partition, the leader and its followers. If one of them is down, it drops out of Isr, so the Replicas and Isr values no longer match
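
The Replicas and Isr comparison described above can be automated. The following is a rough sketch, not part of Kafka itself: find_under_replicated is a hypothetical helper name, and it assumes the describe output keeps the "Partition:", "Replicas:" and "Isr:" labels shown above. It prints every partition whose Isr list is shorter than its Replicas list.

```shell
#!/bin/sh
# Hypothetical helper: reads `kafka-topics.sh --describe` output on stdin
# and prints the partitions that have fewer in-sync replicas than replicas.
find_under_replicated() {
  awk '
    /Partition: / {
      partition = ""; replicas = ""; isr = ""
      # Each label is followed by its value in the next whitespace-separated field.
      for (i = 1; i < NF; i++) {
        if ($i == "Partition:") partition = $(i + 1)
        if ($i == "Replicas:")  replicas  = $(i + 1)
        if ($i == "Isr:")       isr       = $(i + 1)
      }
      # Compare the number of ids in the two comma-separated lists.
      if (split(replicas, r, ",") > split(isr, s, ","))
        print "partition " partition ": replicas=" replicas " isr=" isr
    }'
}
```

It could be used by piping the describe command into it, for example ./kafka-topics.sh --describe --zookeeper localhost:2181 | find_under_replicated. With the healthy output shown above it prints nothing. kafka-topics.sh also has an --under-replicated-partitions option for --describe that serves a similar purpose.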

You can also check the status on Zookeeper.

[zk: localhost:2181(CONNECTED) 22] ls /brokers/topics/test-topic4/partitions
[3, 2, 1, 0, 7, 6, 5, 4, 9, 8]

[zk: localhost:2181(CONNECTED) 21] get /brokers/topics/test-topic4/partitions/0/state
{"controller_epoch":8,"leader":2,"version":1,"leader_epoch":5,"isr":[1,2]}

Next time, I’ll write a post on managing Kafka Topics.
