ZhangZhihui's Blog  

flush() on a Kafka Producer forces all buffered messages to be sent to the Kafka brokers and waits until they are delivered (or fail).

It’s important because Kafka producers are asynchronous by default.


✅ What flush() Does

1. Sends all messages in the producer’s buffer

When you call produce() (confluent_kafka) or send() (kafka-python), the message is usually queued in memory, not sent immediately.

flush() forces the producer to push all pending messages out to Kafka.

2. Blocks until delivery is complete

flush() waits until:

  • all messages have either been successfully acknowledged, or

  • failed permanently (e.g., due to a delivery error or timeout)

flush() returns only after every message has been processed, unless you pass a timeout and it expires first.
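The blocking behaviour is easiest to see with a delivery callback. Below is a minimal sketch, assuming the confluent_kafka client, where produce() accepts a callback and flush() accepts a timeout in seconds and returns the number of messages still queued. send_and_wait is a hypothetical helper name, and the producer is passed in as an argument so the function can be tried against a stub as well as a real Producer.

```python
def send_and_wait(producer, topic, payloads, timeout=10.0):
    """Produce every payload, then flush and report the outcome.

    `producer` is assumed to behave like confluent_kafka.Producer.
    """
    delivered, failed = [], []

    def on_delivery(err, msg):
        # Called from poll()/flush(), once per message, with its final outcome.
        if err is not None:
            failed.append((msg.value(), str(err)))
        else:
            delivered.append(msg.value())

    for payload in payloads:
        # produce() only enqueues the message; nothing is confirmed yet.
        producer.produce(topic, payload, callback=on_delivery)

    # flush() blocks while messages are pending and returns how many are
    # still queued; a non-zero value means the timeout expired first.
    remaining = producer.flush(timeout)
    return delivered, failed, remaining
```

With a real broker this could be called as send_and_wait(Producer({'bootstrap.servers': 'localhost:9092'}), 'my-topic', [b'hello']). A non-zero third value means some messages were still pending when the timeout ran out.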

3. Ensures a clean shutdown

If you don’t flush at the end of your program, some messages may still be in the buffer and therefore never sent.
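One way to make sure the flush is not forgotten is to wrap the producer in a context manager. This is a small sketch rather than a feature of either client: flushed_on_exit is a hypothetical helper, and it relies only on flush() accepting a timeout in seconds as its first argument, which holds for both confluent_kafka and kafka-python.

```python
from contextlib import contextmanager


@contextmanager
def flushed_on_exit(producer, timeout=10.0):
    """Yield the producer and flush it when the block exits, even on error."""
    try:
        yield producer
    finally:
        # Runs on normal exit and on exceptions, so buffered messages
        # get a chance to be sent before the program ends.
        producer.flush(timeout)
```

Usage would look like: with flushed_on_exit(Producer({'bootstrap.servers': 'localhost:9092'})) as p: p.produce('my-topic', b'hello').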


✅ Examples

confluent_kafka

from confluent_kafka import Producer

p = Producer({'bootstrap.servers': 'localhost:9092'})

p.produce('my-topic', b'hello')  # Only enqueues the message locally
p.flush()  # Blocks until 'hello' is delivered or delivery fails

kafka-python

from kafka import KafkaProducer

p = KafkaProducer(bootstrap_servers='localhost:9092')

p.send('my-topic', b'hello')  # Returns a future; the record is buffered
p.flush()  # Blocks until all buffered records are sent and acknowledged (or fail)

 


🤔 When should you call flush()?

✔️ At the end of your program

So no messages are left pending.

✔️ Before shutdown

To avoid data loss.

✔️ In scripts that produce only a few messages

To ensure delivery.

❌ Not needed in long-lived applications

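Kafka producers already send batches asynchronously and efficiently, so calling flush() after every message only blocks the caller and defeats batching. Flush once at shutdown instead.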
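For a long-lived producer loop, a common pattern with the Confluent client is to call poll(0) after each produce() and flush only once on the way out. The sketch below assumes confluent_kafka, where produce() raises BufferError when the local queue is full. produce_stream and max_retries are hypothetical names chosen for illustration.

```python
def produce_stream(producer, topic, payloads, max_retries=3):
    """Enqueue payloads without flushing per message; flush once at the end.

    Returns the number of payloads that were enqueued.
    """
    sent = 0
    try:
        for payload in payloads:
            for attempt in range(max_retries + 1):
                try:
                    producer.produce(topic, payload)
                    break
                except BufferError:
                    # Local queue is full: give up after max_retries,
                    # otherwise serve callbacks for up to 1 s to make room.
                    if attempt == max_retries:
                        raise
                    producer.poll(1.0)
            sent += 1
            # Non-blocking: serve any delivery callbacks that are ready.
            producer.poll(0)
    finally:
        # One flush at shutdown instead of one per message.
        producer.flush()
    return sent
```

The per-message cost here is a non-blocking poll(0), so batching stays intact; the only blocking call is the final flush().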


🔍 Difference from poll() (Confluent client)

  • poll() serves delivery callbacks for messages that have already completed, but it does not wait for the rest of the queue.

  • flush() keeps polling until the queue is empty or its timeout expires.
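The relationship between the two calls can be approximated in a few lines. This is only an illustration, not how the client implements flush() internally. It assumes confluent_kafka, where len(producer) reports the number of messages and requests still waiting to be delivered. drain_with_poll and step are hypothetical names.

```python
import time


def drain_with_poll(producer, timeout=10.0, step=0.1):
    """Poll repeatedly until nothing is pending or the timeout expires.

    Returns the number of items still pending, mirroring what flush() returns.
    """
    deadline = time.monotonic() + timeout
    # A single poll() call serves whatever callbacks are ready and returns;
    # looping until the queue is empty is what makes this flush-like.
    while len(producer) > 0 and time.monotonic() < deadline:
        producer.poll(step)
    return len(producer)
```

In real code, calling flush() directly is the better choice; the loop is shown only to make the difference from a single poll() concrete.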


📌 Summary

flush() blocks until every buffered message has been sent and either acknowledged or reported as failed.
It is essential for short scripts and for a clean shutdown, but high-throughput, long-lived services do not need it during normal operation.

 

posted on 2025-12-02 16:36  ZhangZhihuiAAA