Uncle's Experience Sharing (137): Enabling Compression in Kafka

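Enabling compression in Kafka can greatly reduce disk usage and network transfer, and with less data to move around it can also lower CPU usage and GC time (although compressing and decompressing itself costs some CPU). The parameter that turns it on is compression.type: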

Specify the final compression type for a given topic. This configuration accepts the standard compression codecs ('gzip', 'snappy', 'lz4', 'zstd'). It additionally accepts 'uncompressed' which is equivalent to no compression; and 'producer' which means retain the original compression codec set by the producer.
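To get a rough feel for the disk and network savings, the sketch below compresses 1000 similar messages with the gzip command-line tool. This is not Kafka itself: the message layout and the helper name make_messages are made up for illustration, and real ratios depend on the data, the codec, and the producer's batch size.

```shell
# Hypothetical helper: emit 1000 similar JSON-like messages,
# roughly what a topic of log events might contain.
make_messages() {
  i=0
  while [ "$i" -lt 1000 ]; do
    printf '{"user_id":%d,"event":"page_view","url":"/products/%d"}\n' "$i" "$i"
    i=$((i + 1))
  done
}

# Size of the messages as-is, and after gzip (wc may pad with spaces, so strip them).
raw_bytes=$(make_messages | wc -c | tr -d ' ')
gzip_bytes=$(make_messages | gzip -c | wc -c | tr -d ' ')

echo "raw=$raw_bytes gzip=$gzip_bytes"
```

Repetitive payloads like these shrink a lot, which is where the savings come from; payloads that are already compressed (images, encrypted data) would gain much less.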

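compression.type can be set on both the broker and the producer. The recommendation is to pick one of the following:

  • Set it on the producer, and set the broker to producer;
  • Leave it unset on the producer, and set it in one place on the broker.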

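PS: If a codec is set in both places, the two compression algorithms may end up different. In that case the broker has to decompress every message it receives and recompress it before writing it to disk, which adds CPU overhead.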
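The interaction between the two settings can be sketched as a small shell function. This is a simplified model of the rule described above, not Kafka's actual code; the function name broker_recompresses is hypothetical, and lz4 and gzip are only example codecs.

```shell
# Hypothetical helper: does the broker have to recompress incoming batches?
#   $1 = producer compression.type (none, gzip, snappy, lz4, zstd)
#   $2 = broker compression.type   (producer, uncompressed, gzip, snappy, lz4, zstd)
broker_recompresses() {
  producer_codec=$1
  broker_codec=$2
  # The producer calls it "none" where the broker calls it "uncompressed".
  if [ "$producer_codec" = "none" ]; then
    producer_codec=uncompressed
  fi
  if [ "$broker_codec" = "producer" ] || [ "$broker_codec" = "$producer_codec" ]; then
    echo no    # batches are stored with the codec they arrived in
  else
    echo yes   # broker converts every batch to its own codec
  fi
}

broker_recompresses lz4 producer   # recommended setup 1, prints: no
broker_recompresses none gzip      # recommended setup 2, prints: yes (broker does the compressing)
broker_recompresses lz4 gzip       # mismatched codecs, prints: yes (decompress, then recompress)
```

In the second case the broker's work is the intended compression step; only the third case is the wasted decompress-and-recompress round trip that the note warns about.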

Once compression is enabled, you can confirm it with the following command:

bin/kafka-run-class.sh kafka.tools.DumpLogSegments --files /data/kafka/test-0/00000000000000000000.log --print-data-log | grep compresscodec
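On a large segment the grep output can be long. As a minimal sketch, the hypothetical helper below (count_codecs is not part of Kafka) reads the dump on stdin and counts how often each codec appears after a compresscodec: field. The exact spelling of the codec names differs between Kafka versions, so the helper just reports whatever token it finds.

```shell
# Hypothetical helper: count occurrences of each codec in DumpLogSegments output.
# Looks for the field "compresscodec:" and tallies the token that follows it.
count_codecs() {
  awk '{
         for (i = 1; i < NF; i++)
           if ($i == "compresscodec:") n[$(i + 1)]++
       }
       END { for (c in n) print c "=" n[c] }' | sort
}

# Usage against a real segment (commented out so the sketch runs without Kafka):
# bin/kafka-run-class.sh kafka.tools.DumpLogSegments \
#   --files /data/kafka/test-0/00000000000000000000.log --print-data-log | count_codecs
```

If more than one codec shows up in the result, the segment contains batches written under different settings, for example from before and after compression was enabled.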

A comparison of the various compression algorithms can be found in the references below.

References:
https://blog.cloudflare.com/squeezing-the-firehose
https://stackoverflow.com/questions/67537111/how-do-i-decide-between-lz4-and-snappy-compression

posted @ 2022-05-20 17:52  匠人先生