Once the Kafka cluster has been configured, we need to create a topic that supports failover and data replication.
To state the conclusion up front: if a topic's replication factor is 2 or more,
- Kafka supports automatic leader failover
- Data rebalancing is supported only as a manual operation
Test environment
- Kafka 2.12 (likely the Scala build version, as in kafka_2.12-x.y.z)
- 3 Kafka brokers (IDs: 0, 1, 2) on a single PC
To create a Topic
${KAFKA_HOME}/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 2 --partitions 2 --topic test-topic
--replication-factor 2 means that every published message is stored as 2 copies, including the one on the leader broker.
To check the Topic status
./kafka-topics.sh --describe --zookeeper localhost:2181 --topic test-topic
Topic:test-topic PartitionCount:2 ReplicationFactor:2 Configs:
Topic: test-topic Partition: 0 Leader: 1 Replicas: 1,2 Isr: 1,2
Topic: test-topic Partition: 1 Leader: 2 Replicas: 2,0 Isr: 2,0
The result above shows that partition 0 is located on brokers 1 and 2 (check the Replicas and Isr values) and partition 1 is located on brokers 2 and 0.
To check the status after a broker’s shutdown
What if a broker is shut down? For the test, I shut down broker 1. (This affects partition 0, which has a replica on broker 1 and uses it as its leader.)
${KAFKA_HOME}/bin/kafka-topics.sh --describe --zookeeper localhost:2181
Topic:test-topic PartitionCount:2 ReplicationFactor:2 Configs:
Topic: test-topic Partition: 0 Leader: 2 Replicas: 1,2 Isr: 2
Topic: test-topic Partition: 1 Leader: 2 Replicas: 2,0 Isr: 2,0
The result above shows that partition 1 is not affected, because it is located on brokers 2 and 0. But partition 0's leader has changed to broker 2. (The Leader value has changed from 1 to 2.)
The problem is …
In this state, the Kafka service continues. But partition 0's data is only on broker 2 (check the Isr value). So if broker 2 dies while broker 1 is down, partition 0's data is lost. Unfortunately, Kafka doesn't support automatic data rebalancing (as far as I know).
So we need to rebalance it manually.
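Spotting this state by eye gets harder as the number of partitions grows. As a minimal sketch, the describe output can be piped through a small filter that prints only the partitions whose Isr list is shorter than their Replicas list. The function name under_replicated is hypothetical, and the parsing assumes the "Label: value" layout shown in the output above; other Kafka versions may print additional fields. (kafka-topics.sh also has an --under-replicated-partitions option for --describe, which should give a similar view.)

```shell
#!/bin/sh
# under_replicated (hypothetical helper): reads `kafka-topics.sh --describe`
# output on stdin and prints the partitions whose ISR has fewer brokers
# than the replica list.
under_replicated() {
  awk '
    {
      topic = ""; part = ""; replicas = ""; isr = ""
      # Each partition line is a sequence of "Label: value" pairs.
      for (i = 1; i < NF; i++) {
        if ($i == "Topic:")     topic    = $(i + 1)
        if ($i == "Partition:") part     = $(i + 1)
        if ($i == "Replicas:")  replicas = $(i + 1)
        if ($i == "Isr:")       isr      = $(i + 1)
      }
      if (part == "") next            # skip the topic summary line
      if (split(isr, a, ",") < split(replicas, b, ","))
        print topic, part, "replicas=" replicas, "isr=" isr
    }'
}

# Usage (assumes the same cluster as above):
# ${KAFKA_HOME}/bin/kafka-topics.sh --describe --zookeeper localhost:2181 | under_replicated
```

With the describe output from the broker shutdown test, this should report only partition 0 of test-topic.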
To rebalance the topic manually
Kafka supports data rebalance in 3 steps.
1) to generate reassignment script (topics-to-move.json)
{"topics": [ {"topic": "test-topic"} ], "version":1 }
With the above file, run the following command.
./kafka-reassign-partitions.sh --zookeeper localhost:2181 --topics-to-move-json-file topics-to-move.json --broker-list "0,2" --generate
The output looks like this.
Current partition replica assignment
{"version":1,"partitions":[{"topic":"test-topic","partition":1,"replicas":[2,0],"log_dirs":["any","any"]},{"topic":"test-topic","partition":0,"replicas":[1,2],"log_dirs":["any","any"]}]}
Proposed partition reassignment configuration
{"version":1,"partitions":[{"topic":"test-topic","partition":1,"replicas":[0,2],"log_dirs":["any","any"]},{"topic":"test-topic","partition":0,"replicas":[2,0],"log_dirs":["any","any"]}]}
2) to commit reassignment
Save the proposed partition reassignment part of the output to a file (e.g. "final-topics-to-move.json").
And with the new file, run the following command.
./kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file final-topics-to-move.json --execute
This starts the reassignment. Once it finishes, the topic has been rebalanced.
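The copy-and-save part of step 2 can also be scripted instead of done by hand. The following is a rough sketch, not part of the Kafka tooling: extract_proposed is a hypothetical helper name, and it assumes the proposed JSON is printed on a single line right after the "Proposed partition reassignment configuration" heading, as in the output shown in step 1.

```shell
#!/bin/sh
# extract_proposed (hypothetical helper): reads the --generate output on stdin
# and prints the first non-empty line that follows the "Proposed partition
# reassignment configuration" heading, i.e. the proposed JSON.
extract_proposed() {
  awk '
    found && NF { print; exit }
    /^Proposed partition reassignment configuration/ { found = 1 }'
}

# Usage (assumes topics-to-move.json from step 1):
# ./kafka-reassign-partitions.sh --zookeeper localhost:2181 \
#   --topics-to-move-json-file topics-to-move.json \
#   --broker-list "0,2" --generate | extract_proposed > final-topics-to-move.json
```

It is still worth reading the resulting file before running --execute, since the proposal decides where the data will move.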
3) to verify
To verify the result, run the following command.
./kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file final-topics-to-move.json --verify
The output looks like this.
Status of partition reassignment:
Reassignment of partition test-topic-1 completed successfully
Reassignment of partition test-topic-0 completed successfully
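On a larger topic the reassignment may take a while, so the verify step may need to be repeated. As a sketch under the assumption that the output keeps the format shown above, a small filter can turn it into an exit status: it succeeds only when every "Reassignment of partition" line says "completed successfully". The name reassignment_done is hypothetical, and any other status text is simply treated as not finished.

```shell
#!/bin/sh
# reassignment_done (hypothetical helper): reads --verify output on stdin and
# succeeds only if there is at least one "Reassignment of partition ..." line
# and every such line reports "completed successfully".
reassignment_done() {
  awk '
    /^Reassignment of partition/ {
      total++
      if ($0 ~ /completed successfully/) ok++
    }
    END { exit !(total > 0 && ok == total) }'
}

# Usage: poll every 5 seconds until all partitions are done.
# until ./kafka-reassign-partitions.sh --zookeeper localhost:2181 \
#     --reassignment-json-file final-topics-to-move.json --verify | reassignment_done
# do sleep 5; done
```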
To check the status after the rebalance
Now, check the Topic status again.
${KAFKA_HOME}/bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test-topic
Topic:test-topic PartitionCount:2 ReplicationFactor:2 Configs:
Topic: test-topic Partition: 0 Leader: 0 Replicas: 0,2 Isr: 2,0
Topic: test-topic Partition: 1 Leader: 2 Replicas: 2,0 Isr: 2,0
Now we can verify that partition 0's data is replicated on brokers 2 and 0 (the Replicas and Isr values have changed).