A Kafka log is not an informational log file but the repository where incoming queue messages are stored. Some queue software deletes a message once a consumer has acked it, but Kafka keeps its log regardless of consumer acks.
Still, the log is a set of files on disk, so it is limited by storage. This post shows how to manage the Kafka log. (The following contents were tested on version 2.12.)
Test environment
For testing, I created a topic with the following command.
./kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 2 --topic test-topic
The above command creates “test-topic” with 2 partitions. After creating the topic, I can find 2 directories inside the Kafka log directory (which is defined by the log.dirs parameter).
- test-topic-0
- test-topic-1
Inside each directory, I can find the following files.
[tkstone@localhost test-topic-0]$ ls
00000000000000015056.index  00000000000000015056.log  00000000000000015056.snapshot  00000000000000015056.timeindex  leader-epoch-checkpoint
The log file is “00000000000000015056.log”. (The name varies depending on the offset of the first message in the file.)
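The naming rule can be sketched in a few lines. This is only an illustration, assuming the file name is the first message's offset zero-padded to 20 digits, which is what the listing above shows; the helper names `segment_file_name` and `base_offset_of` are made up for this example and are not part of Kafka.

```python
def segment_file_name(base_offset: int, suffix: str = ".log") -> str:
    """Build a segment file name from the offset of its first message.

    Assumption: the offset is zero-padded to 20 digits, as in the listing.
    """
    if base_offset < 0:
        raise ValueError("offset must be non-negative")
    return f"{base_offset:020d}{suffix}"


def base_offset_of(file_name: str) -> int:
    """Recover the first-message offset from a segment file name."""
    return int(file_name.split(".", 1)[0])


if __name__ == "__main__":
    print(segment_file_name(15056))
    print(base_offset_of("00000000000000015056.timeindex"))
```

The same rule applies to the .index and .timeindex files that sit next to each log file, which is why all files of one segment share the same numeric prefix.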
How a log file is rolled
Before explaining log management, we need to understand how a log file is rolled. Rolling depends on the following options.
- Time based option
  - log.roll.ms
  - log.roll.hours (applied if log.roll.ms is not set)
- Size based option
  - log.segment.bytes
Time based rolling and size based rolling work independently.
- If time based rolling is set, the log file is rolled once it has been in use for longer than log.roll.ms or log.roll.hours. The age is counted from the first message written to the file, and the check happens when a new message arrives.
- If size based rolling is set, the log file is rolled when the next write would take the file size above log.segment.bytes.
- When a log file is rolled, a new log file is created, and its name is the offset of the first message it will hold.
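The two rolling rules above can be put side by side in a small sketch. It is a simplified model, not the broker's code: the real check has further conditions (for example full index files) that are left out, and the class `ActiveSegment`, the function `should_roll`, and all numbers are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActiveSegment:
    size_bytes: int
    first_message_ms: Optional[int]  # None while the file is still empty


def should_roll(seg: ActiveSegment, incoming_bytes: int, incoming_ms: int,
                segment_bytes: int, roll_ms: int) -> bool:
    """Return True if the current log file should be rolled before this write."""
    # Size rule: the write would push the file past log.segment.bytes.
    would_overflow = seg.size_bytes + incoming_bytes > segment_bytes
    # Time rule: the file has been in use longer than log.roll.ms,
    # counted from its first message. An empty file is never rolled by time.
    too_old = (seg.first_message_ms is not None
               and incoming_ms - seg.first_message_ms > roll_ms)
    return would_overflow or too_old


if __name__ == "__main__":
    one_mib = 1_048_576   # assumed log.segment.bytes
    one_hour = 3_600_000  # assumed log.roll.ms
    seg = ActiveSegment(size_bytes=1_047_553, first_message_ms=0)
    print(should_roll(seg, 2_000, 1_000, one_mib, one_hour))      # size rule fires
    print(should_roll(seg, 100, 1_000, one_mib, one_hour))        # neither rule fires
    print(should_roll(seg, 100, 3_600_001, one_mib, one_hour))    # time rule fires
```

Because either condition alone is enough, a busy partition usually rolls by size and a quiet one by time.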
Limiting log file usage
We need to limit log file usage (especially in production) because the log can otherwise consume the whole disk. The retention policy is defined with the following options.
- Time based option
  - log.retention.ms
  - log.retention.minutes (applied if log.retention.ms is not set)
  - log.retention.hours (applied if log.retention.minutes is not set)
- Size based option
  - log.retention.bytes
- Log delete interval
  - log.retention.check.interval.ms
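The fallback order of the three time based options can be sketched as follows. This is a minimal illustration, assuming the settings have been read into a plain dictionary; the function name `effective_retention_ms` is hypothetical, the 168 hour fallback is the documented default of log.retention.hours, and a negative value is treated as "no time limit".

```python
from typing import Mapping, Optional

MS_PER_MINUTE = 60_000
MS_PER_HOUR = 3_600_000
DEFAULT_RETENTION_HOURS = 168  # documented default of log.retention.hours


def effective_retention_ms(props: Mapping[str, str]) -> Optional[int]:
    """Resolve the time based retention in milliseconds.

    Order: log.retention.ms, then log.retention.minutes, then log.retention.hours.
    Returns None when the resolved value is negative (no time limit).
    """
    if "log.retention.ms" in props:
        ms = int(props["log.retention.ms"])
    elif "log.retention.minutes" in props:
        ms = int(props["log.retention.minutes"]) * MS_PER_MINUTE
    else:
        hours = int(props.get("log.retention.hours", DEFAULT_RETENTION_HOURS))
        ms = hours * MS_PER_HOUR
    return None if ms < 0 else ms


if __name__ == "__main__":
    print(effective_retention_ms({"log.retention.minutes": "5",
                                  "log.retention.hours": "1"}))  # 300000
    print(effective_retention_ms({}))                            # 604800000
    print(effective_retention_ms({"log.retention.ms": "-1"}))    # None
```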
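The time based option and the size based option work independently.
- If the time based policy is set, a log file is deleted when its last message time is older than the time option. (There can be multiple log files because of rolling.)
- If the size based policy is set, rolled log files are deleted, oldest first, as long as the log files that remain in the partition still add up to at least the size option. In practice this means a rolled log file is deleted when the current log file alone is bigger than the size option, and the current log file itself is not deleted.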
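The time based rule can be modelled in a few lines. This is a sketch under assumptions, not the broker's implementation: it assumes files are examined oldest first and that the scan stops at the first file that is still within retention. The `Segment` class, the function name, and the timestamps are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    base_offset: int
    last_message_ms: int  # timestamp of the newest message in the file


def segments_past_retention(segments: List[Segment], now_ms: int,
                            retention_ms: int) -> List[Segment]:
    """Return the log files that the time based policy would delete."""
    doomed = []
    for seg in sorted(segments, key=lambda s: s.base_offset):
        if now_ms - seg.last_message_ms <= retention_ms:
            break  # assumption: the first surviving file ends the scan
        doomed.append(seg)
    return doomed


if __name__ == "__main__":
    now_ms = 1_000_000
    retention_ms = 60_000  # hypothetical log.retention.ms of one minute
    segments = [
        Segment(0, now_ms - 300_000),     # last message 5 minutes ago
        Segment(1200, now_ms - 90_000),   # last message 90 seconds ago
        Segment(2608, now_ms - 10_000),   # last message 10 seconds ago
    ]
    doomed = segments_past_retention(segments, now_ms, retention_ms)
    print([s.base_offset for s in doomed])  # [0, 1200]
```

The check runs only every log.retention.check.interval.ms, so a file can outlive its retention time by up to one check interval.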
Time based log delete example: the current log file (00000000000000002608.log) is marked for deletion by renaming it with a .deleted suffix, a new 0-byte log file (00000000000000006227.log) is created, and about a minute later the marked files are removed.
[tkstone@localhost test-topic-0]$ ls -al
total 3912
drwxrwxr-x  2 tkstone tkstone      178 Mar 14 13:30 .
drwxrwxr-x 54 tkstone tkstone     4096 Mar 14 13:32 ..
-rw-rw-r--  1 tkstone tkstone 10485760 Mar 14 13:32 00000000000000002608.index
-rw-rw-r--  1 tkstone tkstone  3969756 Mar 14 13:32 00000000000000002608.log
-rw-rw-r--  1 tkstone tkstone       10 Mar 14 13:27 00000000000000002608.snapshot
-rw-rw-r--  1 tkstone tkstone 10485756 Mar 14 13:32 00000000000000002608.timeindex
-rw-rw-r--  1 tkstone tkstone       11 Mar 14 13:29 leader-epoch-checkpoint

[tkstone@localhost test-topic-0]$ ls -al
total 3912
drwxrwxr-x  2 tkstone tkstone      306 Mar 14 13:33 .
drwxrwxr-x 54 tkstone tkstone     4096 Mar 14 13:33 ..
-rw-rw-r--  1 tkstone tkstone     7232 Mar 14 13:33 00000000000000002608.index.deleted
-rw-rw-r--  1 tkstone tkstone  3969756 Mar 14 13:32 00000000000000002608.log.deleted
-rw-rw-r--  1 tkstone tkstone    10860 Mar 14 13:33 00000000000000002608.timeindex.deleted
-rw-rw-r--  1 tkstone tkstone 10485760 Mar 14 13:33 00000000000000006227.index
-rw-rw-r--  1 tkstone tkstone        0 Mar 14 13:33 00000000000000006227.log
-rw-rw-r--  1 tkstone tkstone       10 Mar 14 13:33 00000000000000006227.snapshot
-rw-rw-r--  1 tkstone tkstone 10485756 Mar 14 13:33 00000000000000006227.timeindex
-rw-rw-r--  1 tkstone tkstone       11 Mar 14 13:33 leader-epoch-checkpoint

[tkstone@localhost test-topic-0]$ ls -al
total 12
drwxrwxr-x  2 tkstone tkstone      178 Mar 14 13:34 .
drwxrwxr-x 54 tkstone tkstone     4096 Mar 14 13:34 ..
-rw-rw-r--  1 tkstone tkstone 10485760 Mar 14 13:33 00000000000000006227.index
-rw-rw-r--  1 tkstone tkstone        0 Mar 14 13:33 00000000000000006227.log
-rw-rw-r--  1 tkstone tkstone       10 Mar 14 13:33 00000000000000006227.snapshot
-rw-rw-r--  1 tkstone tkstone 10485756 Mar 14 13:33 00000000000000006227.timeindex
-rw-rw-r--  1 tkstone tkstone       11 Mar 14 13:33 leader-epoch-checkpoint
Size based log delete example: two log files exist, the current one (00000000000000011239.log) and a rolled one (00000000000000010284.log). The current file is already bigger than the size parameter (log.retention.bytes=524288), so the rolled file is deleted while the current log file remains.
[tkstone@localhost test-topic-0]$ ls -al
total 2208
drwxrwxr-x  2 tkstone tkstone      319 Mar 14 14:14 .
drwxrwxr-x 54 tkstone tkstone     4096 Mar 14 14:23 ..
-rw-rw-r--  1 tkstone tkstone     1904 Mar 14 14:14 00000000000000010284.index
-rw-rw-r--  1 tkstone tkstone  1047553 Mar 14 14:14 00000000000000010284.log
-rw-rw-r--  1 tkstone tkstone       10 Mar 14 14:08 00000000000000010284.snapshot
-rw-rw-r--  1 tkstone tkstone     2868 Mar 14 14:14 00000000000000010284.timeindex
-rw-rw-r--  1 tkstone tkstone 10485760 Mar 14 14:23 00000000000000011239.index
-rw-rw-r--  1 tkstone tkstone   740448 Mar 14 14:23 00000000000000011239.log
-rw-rw-r--  1 tkstone tkstone       10 Mar 14 14:14 00000000000000011239.snapshot
-rw-rw-r--  1 tkstone tkstone 10485756 Mar 14 14:23 00000000000000011239.timeindex
-rw-rw-r--  1 tkstone tkstone       12 Mar 14 14:10 leader-epoch-checkpoint

[tkstone@localhost test-topic-0]$ ls -al
total 2204
drwxrwxr-x  2 tkstone tkstone      306 Mar 14 14:23 .
drwxrwxr-x 54 tkstone tkstone     4096 Mar 14 14:23 ..
-rw-rw-r--  1 tkstone tkstone     1904 Mar 14 14:14 00000000000000010284.index.deleted
-rw-rw-r--  1 tkstone tkstone  1047553 Mar 14 14:14 00000000000000010284.log.deleted
-rw-rw-r--  1 tkstone tkstone     2868 Mar 14 14:14 00000000000000010284.timeindex.deleted
-rw-rw-r--  1 tkstone tkstone 10485760 Mar 14 14:23 00000000000000011239.index
-rw-rw-r--  1 tkstone tkstone   740448 Mar 14 14:23 00000000000000011239.log
-rw-rw-r--  1 tkstone tkstone       10 Mar 14 14:14 00000000000000011239.snapshot
-rw-rw-r--  1 tkstone tkstone 10485756 Mar 14 14:23 00000000000000011239.timeindex
-rw-rw-r--  1 tkstone tkstone       12 Mar 14 14:23 leader-epoch-checkpoint

[tkstone@localhost test-topic-0]$ ls -al
total 744
drwxrwxr-x  2 tkstone tkstone      178 Mar 14 14:24 .
drwxrwxr-x 54 tkstone tkstone     4096 Mar 14 14:25 ..
-rw-rw-r--  1 tkstone tkstone 10485760 Mar 14 14:23 00000000000000011239.index
-rw-rw-r--  1 tkstone tkstone   740448 Mar 14 14:23 00000000000000011239.log
-rw-rw-r--  1 tkstone tkstone       10 Mar 14 14:14 00000000000000011239.snapshot
-rw-rw-r--  1 tkstone tkstone 10485756 Mar 14 14:23 00000000000000011239.timeindex
-rw-rw-r--  1 tkstone tkstone       12 Mar 14 14:23 leader-epoch-checkpoint
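The outcome of this size based example can be checked by hand with the .log sizes from the listings. The sketch below is a simplified model rather than the broker's code: it assumes files are visited oldest first and that a file is deleted only if the files left afterwards still add up to at least log.retention.bytes. The function name `segments_over_size` is made up for this illustration.

```python
from typing import List, Tuple


def segments_over_size(segments: List[Tuple[int, int]],
                       retention_bytes: int) -> List[int]:
    """Return the base offsets the size based policy would delete.

    segments holds (base_offset, size_bytes) pairs for one partition.
    """
    # Bytes that would have to go to get down to the retention size.
    excess = sum(size for _, size in segments) - retention_bytes
    doomed = []
    for base_offset, size in sorted(segments):
        if excess - size < 0:
            break  # deleting this file would drop the log below the limit
        excess -= size
        doomed.append(base_offset)
    return doomed


if __name__ == "__main__":
    # .log sizes taken from the first listing of the size based example
    listing = [(10284, 1_047_553), (11239, 740_448)]
    print(segments_over_size(listing, 524_288))    # [10284]
    print(segments_over_size(listing, 1_000_000))  # []
```

With log.retention.bytes=524288 the rolled file 10284 goes and the current file 11239 stays, which matches the listings. The second call shows a consequence of this rule: with a hypothetical limit of 1,000,000 bytes nothing would be deleted, even though the two files together are larger than the limit, because the current file alone would be smaller than it.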