Kafka clients come in two kinds: producers and consumers. Understanding how consumers work is essential when designing the overall architecture.
A topic is divided into several partitions (the count is set when the topic is created). Consumers are organized into groups, and each group contains one or more clients. (There can be multiple groups, and multiple clients within a group.)
Each group is identified by a group id and each client by a client id, both of which can be set in the client's source code.
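As a rough illustration of where the two ids live, the sketch below builds a consumer configuration as a plain Python dictionary. The keys `bootstrap.servers`, `group.id`, and `client.id` are standard Kafka consumer properties. The helper name `consumer_config` is hypothetical, and the values are borrowed from the example later in this post. Handing the dictionary to an actual client library is left out so that the snippet runs without a broker.

```python
def consumer_config(group_id, client_id, servers="localhost:9092"):
    """Return Kafka consumer settings as a plain dict (hypothetical helper)."""
    return {
        "bootstrap.servers": servers,  # broker address used in this post
        "group.id": group_id,          # identifies the consumer group
        "client.id": client_id,        # label for this client within the group
    }


config = consumer_config("testGroup01", "consumer01")
print(config)
```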
Rules for a single consumer group:
- Each partition is assigned to exactly one client in the group, which guarantees that a message is delivered to only one client
- A client can be assigned several partitions
- If partition count < client count, the surplus clients (client count minus partition count) are assigned no partition and receive no messages
- If partition count >= client count, every client is assigned at least one partition
- If a client joins or leaves the group, the partition-to-client mapping is rebalanced automatically
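The rules above can be sketched with a small simulation. This is not Kafka's actual assignment code (the real assignors run during a group rebalance and use their own strategies); it is a simplified round-robin model, and the function name `assign` and the client name `consumer04` are made up for illustration.

```python
def assign(partitions, clients):
    """Spread partitions over clients round-robin.

    Every partition gets exactly one owner; a client may own several
    partitions, or none if there are more clients than partitions.
    """
    assignment = {client: [] for client in clients}
    ordered = sorted(clients)
    if not ordered:
        return assignment
    for index, partition in enumerate(sorted(partitions)):
        owner = ordered[index % len(ordered)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2]

# One client in the group: it owns every partition.
one_client = assign(partitions, ["consumer01"])

# Three clients: one partition each.
three_clients = assign(partitions, ["consumer01", "consumer02", "consumer03"])

# Four clients (hypothetical consumer04): the surplus client stays idle.
four_clients = assign(
    partitions, ["consumer01", "consumer02", "consumer03", "consumer04"]
)

print(one_client)
print(three_clients)
print(four_clients)
```

Calling `assign` again with the new member list models what happens when a client joins or leaves the group.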
For example, I set up three clients in one group and created “test-topic” with three partitions.
First, I run only the first client and check the group status:
./kafka-run-class.sh kafka.admin.ConsumerGroupCommand --bootstrap-server localhost:9092 --describe --group testGroup01
TOPIC       PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  CONSUMER-ID                                      HOST        CLIENT-ID
test-topic  0          523             523             0    consumer01-3aca7bd2-d619-414d-984e-4336cd09646a  /127.0.0.1  consumer01
test-topic  1          519             519             0    consumer01-3aca7bd2-d619-414d-984e-4336cd09646a  /127.0.0.1  consumer01
test-topic  2          8               8               0    consumer01-3aca7bd2-d619-414d-984e-4336cd09646a  /127.0.0.1  consumer01
The result shows that all partitions are assigned to “consumer01”.
Next, I run all three clients and check the status again:
[tkstone@localhost bin]$ ./kafka-run-class.sh kafka.admin.ConsumerGroupCommand --bootstrap-server localhost:9092 --describe --group testGroup01
TOPIC       PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  CONSUMER-ID                                      HOST        CLIENT-ID
test-topic  2          12              12              0    consumer03-dcb1f2dd-f496-4586-8300-fcd96a37d359  /127.0.0.1  consumer03
test-topic  1          521             521             0    consumer02-6016e0f6-428e-47da-9c38-572b5dcdc739  /127.0.0.1  consumer02
test-topic  0          527             527             0    consumer01-3b206c3e-21a5-4f95-8c6a-a7056b3fb057  /127.0.0.1  consumer01
The result shows that each partition is now assigned to a different consumer (“consumer01”, “consumer02”, “consumer03”).
Rules for multiple consumer groups:
- Every partition is assigned to each group, so each message is broadcast to all groups
- Within each group, clients follow the single-group rules above
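To make the two rules concrete, here is a minimal model of delivery across groups. It is only a sketch under stated assumptions: the function `deliver`, the second group `testGroup02`, and its client `auditor01` are hypothetical, and each group's partition-to-client mapping is simply written out by hand rather than obtained from Kafka.

```python
def deliver(partition, groups):
    """Return, per group, the single client that receives a message
    written to `partition`.

    `groups` maps a group id to that group's {partition: client} mapping.
    """
    return {group_id: owners[partition] for group_id, owners in groups.items()}


groups = {
    # Three clients share the three partitions.
    "testGroup01": {0: "consumer01", 1: "consumer02", 2: "consumer03"},
    # Hypothetical second group with a single client owning everything.
    "testGroup02": {0: "auditor01", 1: "auditor01", 2: "auditor01"},
}

# A message on partition 1 reaches every group, but only one client in each.
receivers = deliver(1, groups)
print(receivers)
```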
Using these rules, we can implement either a peer-to-peer (point-to-point) queue or a publish-subscribe queue.
Peer-to-Peer architecture
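In this architecture, each message is handled by one consumer. All consumers belong to the same group, so even with multiple consumers a message is delivered to only one of them. This suits high throughput, because partitions are consumed in parallel, and automatic failover, because the partitions of a failed consumer are reassigned to the remaining ones.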
Publish-Subscribe architecture
In this architecture, every consumer receives the same message, which is achieved by placing each consumer in its own group. This architecture is needed when anonymous consumers connect to Kafka.