How does a Kafka consumer work?

Kafka clients consist of producers and consumers. Understanding how the consumer works is important for understanding the overall architecture.

A topic is composed of several partitions (the number is defined when creating the topic). On the consuming side, consumers are organized into groups, and each group contains one or more clients. (There can be multiple groups, and multiple clients in a group.)

Each group is identified by a group id and each client by a client id, both of which can be set in the source code.
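As a rough illustration of where these ids live, the sketch below builds the settings for three clients in one group. The keys `bootstrap.servers`, `group.id`, and `client.id` are Kafka's standard consumer property names; the helper `consumer_config` is hypothetical, and how such a dictionary is passed to an actual consumer depends on the client library in use.

```python
def consumer_config(group_id, client_id, servers="localhost:9092"):
    """Build consumer settings keyed by Kafka's standard property names."""
    return {
        "bootstrap.servers": servers,  # broker address, same as in this post
        "group.id": group_id,          # the consumer group this client joins
        "client.id": client_id,        # label identifying this client
    }


# Three clients that share one group id, as in the example in this post.
configs = [consumer_config("testGroup01", f"consumer0{i}") for i in (1, 2, 3)]
for cfg in configs:
    print(cfg["group.id"], cfg["client.id"])
```

Clients that share a group id split the partitions among themselves; clients with different group ids consume independently.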

Some rules when there is one consumer group

  • Each partition is assigned to exactly one client in the group, which guarantees that a message is delivered to only one client
  • A client can be assigned several partitions
  • If partition count < client count, the surplus clients (those beyond the partition count) are assigned no partition and receive no messages
  • If partition count >= client count, every client is assigned at least one partition
  • If a client is added to or removed from the group, the partition-to-client mapping is rebalanced automatically
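The rules above can be sketched with a small simulation. This is only a toy round-robin assignment written for illustration, not Kafka's actual assignor (the real mapping is decided by the configured partition assignment strategy); the function `assign_partitions` and the client name `consumer04` are hypothetical.

```python
def assign_partitions(partitions, clients):
    """Spread partitions over clients round-robin; each partition gets one owner."""
    assignment = {client: [] for client in clients}
    if not clients:
        return assignment
    ordered = sorted(clients)
    for i, partition in enumerate(sorted(partitions)):
        owner = ordered[i % len(ordered)]  # exactly one client per partition
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2]

# One client: it owns every partition.
one = assign_partitions(partitions, ["consumer01"])
# Three clients, three partitions: one partition each.
three = assign_partitions(partitions, ["consumer01", "consumer02", "consumer03"])
# Four clients, three partitions: the surplus client stays idle.
four = assign_partitions(
    partitions, ["consumer01", "consumer02", "consumer03", "consumer04"]
)

print(one)
print(three)
print(four)
```

Calling the function again with a different client list mimics what happens when a client joins or leaves: the whole mapping is recomputed.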

For example, I set up 3 clients in one group and created “test-topic” with 3 partitions.

If I run only the first client and check the status:

./kafka-run-class.sh kafka.admin.ConsumerGroupCommand --bootstrap-server localhost:9092 --describe --group testGroup01
TOPIC           PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG             CONSUMER-ID                                     HOST            CLIENT-ID
test-topic      0          523             523             0               consumer01-3aca7bd2-d619-414d-984e-4336cd09646a /127.0.0.1      consumer01
test-topic      1          519             519             0               consumer01-3aca7bd2-d619-414d-984e-4336cd09646a /127.0.0.1      consumer01
test-topic      2          8               8               0               consumer01-3aca7bd2-d619-414d-984e-4336cd09646a /127.0.0.1      consumer01

The result shows that all partitions are assigned to “consumer01”.

Next, I run all three clients and check the status again:

[tkstone@localhost bin]$ ./kafka-run-class.sh kafka.admin.ConsumerGroupCommand --bootstrap-server localhost:9092 --describe --group testGroup01

TOPIC           PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG             CONSUMER-ID                                     HOST            CLIENT-ID
test-topic      2          12              12              0               consumer03-dcb1f2dd-f496-4586-8300-fcd96a37d359 /127.0.0.1      consumer03
test-topic      1          521             521             0               consumer02-6016e0f6-428e-47da-9c38-572b5dcdc739 /127.0.0.1      consumer02
test-topic      0          527             527             0               consumer01-3b206c3e-21a5-4f95-8c6a-a7056b3fb057 /127.0.0.1      consumer01

The result shows that each partition is now assigned to a different consumer (“consumer01”, “consumer02”, “consumer03”).

Some rules when there are multiple consumer groups

  • Every partition is assigned to every group, so each message is broadcast to all groups
  • Within each group, the clients follow the single-group rules
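To make the two rules concrete, here is a minimal sketch of who receives a message written to a given partition. It uses a toy modulo rule to pick the owner inside each group, which is an assumption for illustration and not Kafka's real assignment logic; the function `deliver`, the group `testGroup02`, and the client `auditor01` are hypothetical.

```python
def deliver(partition, groups):
    """Return, per group, the single client that receives a message from `partition`."""
    receivers = {}
    for group_id, clients in groups.items():
        ordered = sorted(clients)  # assumes every group has at least one client
        receivers[group_id] = ordered[partition % len(ordered)]
    return receivers


groups = {
    "testGroup01": ["consumer01", "consumer02", "consumer03"],
    "testGroup02": ["auditor01"],  # hypothetical second group with one client
}

for partition in (0, 1, 2):
    print(partition, deliver(partition, groups))
```

Every group appears in every result (broadcast across groups), but each group contributes only one receiver (single delivery within a group).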

Using these rules, we can implement either a peer-to-peer queue or a publish-subscribe queue.

Peer-to-Peer architecture

[Figure: peer-to-peer architecture]

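In this architecture, each message is processed by exactly one consumer. All consumers belong to the same group, so even if the group has multiple consumers, a message is sent to only one of them. This is good for high throughput, because partitions are consumed in parallel, and for automatic failover, because the partitions of a failed consumer are reassigned to the remaining ones.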

Publish-Subscribe architecture

[Figure: publish-subscribe architecture]

In this architecture, every consumer receives every message, which follows from placing each consumer in its own group. This architecture is needed when anonymous consumers connect to Kafka.

 

 
