Introduction
When developing applications that communicate with Apache Kafka, it’s essential to understand how message delivery works, especially when reliability and durability are important. Kafka provides a range of options that allow you to fine-tune how the producer waits for acknowledgements from the brokers. This tutorial dives deep into how acknowledgements work in Kafka and how you can configure your Spring Boot application’s producer accordingly using the application.properties file.
Message Lifecycle in Kafka: From Producer to Broker
When a Kafka producer sends a message, it is first sent to a Kafka broker. The broker then stores the message in a Kafka topic, which is a logical container for storing records.
Each topic in Kafka is divided into smaller units called partitions. Partitions are key to Kafka’s scalability and parallelism. When the producer sends a message, its partitioner decides which partition the message goes to: records with a key are typically routed by hashing the key, so the same key always lands in the same partition, while records without a key are spread across the available partitions. The message is then stored in that specific partition.
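To make the key-based strategy concrete, here is a minimal sketch of hash-based partition selection. It is not Kafka's actual partitioner: Kafka's default hashes the serialized key with murmur2, whereas this sketch uses `Arrays.hashCode` as a stand-in, and `PartitionSketch` and `partitionFor` are hypothetical names chosen for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class PartitionSketch {
    // Hypothetical helper: maps a record key to a partition index.
    // Assumption: a simple array hash stands in for Kafka's murmur2 hash.
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        int hash = Arrays.hashCode(keyBytes);
        // floorMod keeps the result in [0, numPartitions) even for negative hashes.
        return Math.floorMod(hash, numPartitions);
    }

    public static void main(String[] args) {
        int partitions = 6; // assumed partition count for the example topic
        for (String key : new String[] {"order-1", "order-2", "order-1"}) {
            System.out.println(key + " -> partition " + partitionFor(key, partitions));
        }
    }
}
```

The property that matters is determinism: the same key always maps to the same partition, which is what preserves per-key ordering.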
Kafka Clusters and Replication
If your Kafka setup consists of a cluster (i.e., multiple brokers), and the topic is created with a replication factor greater than one, then every partition of that topic will be replicated across multiple brokers.
In such a setup:
- One broker acts as the leader of each partition.
- The other brokers that hold replicated copies are known as followers.
- The leader handles all writes and, by default, all reads.
- Followers replicate the leader’s log to stay updated. Replicas that are fully caught up (the leader included) are referred to as in-sync replicas (ISR).
This replication is crucial for fault tolerance. If the leader broker fails, one of the in-sync replicas is elected as the new leader, allowing Kafka to continue operating without data loss — but only if the message had already been replicated to those followers.
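The condition at the end of that paragraph can be illustrated with a toy model. This is only a sketch of the idea, not how brokers are implemented: `Replica` is a hypothetical in-memory log, and the "crash" is simulated by promoting the follower.

```java
import java.util.ArrayList;
import java.util.List;

final class FailoverSketch {
    /** Toy replica: an append-only list of messages (hypothetical type). */
    static final class Replica {
        final List<String> log = new ArrayList<>();
    }

    /**
     * Writes a message to the leader, optionally copies it to the follower,
     * then simulates a leader crash by promoting the follower.
     * Returns true if the message is still present on the new leader.
     */
    static boolean survivesLeaderCrash(String message, boolean replicatedBeforeCrash) {
        Replica leader = new Replica();
        Replica follower = new Replica();

        leader.log.add(message);            // the leader stores the write
        if (replicatedBeforeCrash) {
            follower.log.add(message);      // the follower caught up in time
        }

        Replica newLeader = follower;       // the old leader is gone
        return newLeader.log.contains(message);
    }

    public static void main(String[] args) {
        System.out.println("replicated first: " + survivesLeaderCrash("m1", true));
        System.out.println("crash first:      " + survivesLeaderCrash("m1", false));
    }
}
```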
Understanding Acknowledgements in Kafka
Acknowledgements (acks) are how Kafka producers know whether a message has been successfully received and stored.
In Kafka clients before version 3.0, producers wait by default for an acknowledgement only from the leader broker (acks=1); from 3.0 onward the default is acks=all. Leader-only acknowledgement is often fine, but it opens up a potential reliability issue: what if the leader goes down before replicating the message to any followers? In such a case, the message could be permanently lost.
To mitigate this, Kafka provides the acks configuration property, which you can set to control how many acknowledgements the producer waits for before considering a message “successfully sent.”
Kafka Producer Acknowledgement Options
Option 1: acks=all – Maximum Reliability
spring.kafka.producer.acks=all
When this option is set, the Kafka producer waits until all in-sync replicas have acknowledged the message. This provides the strongest durability guarantee available in Kafka: an acknowledged message is not lost as long as at least one of those in-sync replicas remains alive.
However, since the producer must wait for multiple acknowledgements, this approach comes with increased latency. It’s ideal for mission-critical use cases where data integrity is paramount—such as financial transactions, billing, or order processing systems.
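To complement this setup, it’s recommended to set min.insync.replicas. Note that this is a topic-level (or broker-level default) setting, not a producer property, so it is configured on the topic or broker rather than under spring.kafka.producer.properties:
min.insync.replicas=2
With acks=all, this means that at least two in-sync replicas (the leader included) must be available for a write to be accepted. If fewer are in sync, the broker rejects the write with an error instead of storing it on too few replicas.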
You can also add retries:
spring.kafka.producer.retries=5
This allows the producer to automatically retry sending messages if transient failures occur.
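For reference, the Spring Boot keys above correspond to the plain Kafka producer settings `acks` and `retries`. The sketch below assembles those settings as a `java.util.Properties` object using only the JDK. The broker address `localhost:9092` and the names `ProducerAcksConfig` and `producerProps` are assumptions for illustration; in a real application the resulting properties would be handed to a Kafka producer, which is left out here so the snippet runs without extra dependencies.

```java
import java.util.Properties;

final class ProducerAcksConfig {
    // Hypothetical helper: builds producer settings for a given acks mode.
    static Properties producerProps(String acks, int retries) {
        // Kafka accepts "0", "1", and "all" ("-1" is an alias for "all").
        if (!acks.equals("0") && !acks.equals("1")
                && !acks.equals("all") && !acks.equals("-1")) {
            throw new IllegalArgumentException("unsupported acks value: " + acks);
        }
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed address
        props.setProperty("acks", acks);
        props.setProperty("retries", Integer.toString(retries));
        return props;
    }

    public static void main(String[] args) {
        // Mirrors spring.kafka.producer.acks=all and spring.kafka.producer.retries=5.
        Properties props = producerProps("all", 5);
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```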
Option 2: acks=1 – Balanced Trade-off
spring.kafka.producer.acks=1
This was the default configuration in Kafka clients before version 3.0 (newer clients default to acks=all). Here, the producer only waits for an acknowledgement from the leader broker. This approach provides a balance between performance and reliability.
- It is faster than acks=all since it doesn’t wait for followers.
- But if the leader crashes before followers replicate the message, data loss may occur.
This configuration works well when some message loss is acceptable, but you still want confirmation that your message was stored somewhere.
Retries can still be useful:
spring.kafka.producer.retries=3
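Conceptually, a retries value of 3 allows one initial attempt plus up to three more. The sketch below models that behaviour with a plain retry loop. It is a simplified illustration, not the Kafka client's internal retry logic (which also handles backoff and timeouts), and `RetrySketch` and `sendWithRetries` are hypothetical names.

```java
import java.util.function.Supplier;

final class RetrySketch {
    // Hypothetical helper: runs the send once, then retries up to `retries` times.
    static <T> T sendWithRetries(Supplier<T> attempt, int retries) {
        if (retries < 0) {
            throw new IllegalArgumentException("retries must be >= 0");
        }
        RuntimeException lastFailure = null;
        for (int attemptNo = 0; attemptNo <= retries; attemptNo++) {
            try {
                return attempt.get();   // success: stop retrying
            } catch (RuntimeException e) {
                lastFailure = e;        // treated as transient: try again
            }
        }
        throw lastFailure;              // every attempt failed
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated send that fails twice before the broker acknowledges it.
        String result = sendWithRetries(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("transient broker error");
            }
            return "acknowledged";
        }, 3);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```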
Option 3: acks=0 – Maximum Performance
spring.kafka.producer.acks=0
In this configuration, the producer does not wait for any acknowledgment from Kafka brokers. As a result, this setup offers the highest performance but with no durability guarantees.
Messages could be lost without the producer ever knowing. This is only acceptable in scenarios where data loss is tolerable, and throughput is more important than durability.
A real-world example of this would be an application that sends real-time location updates or sensor data—if you lose one out of ten messages per second, you can still understand the overall direction or trend.
Performance vs Durability: Will More Brokers Slow Down the Producer?
It’s a common misconception that adding more brokers will slow down your producer, especially when using acks=all. But the truth is more nuanced.
Kafka does not wait for acknowledgements from every broker in the cluster. Instead, it only waits for acks from the in-sync replicas of the partition to which the message is sent. The in-sync replica set can never be larger than the replication factor set on the topic, and it shrinks temporarily when a replica falls behind or goes offline.
For example, if your topic has a replication factor of 3, then only 3 brokers will participate in replication for each partition, regardless of whether your cluster has 5, 10, or 100 brokers.
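The point can be expressed as a small model: the number of acknowledgements awaited depends on the acks mode and the partition's in-sync replica count, never on the cluster size. This is an illustrative sketch only; `AckCountSketch` and `acksAwaited` are hypothetical names, and the `clusterSize` parameter is deliberately unused to underline that it has no effect.

```java
final class AckCountSketch {
    /**
     * Hypothetical model: how many replicas must have the message
     * before the producer's send is acknowledged.
     * clusterSize is accepted but intentionally ignored.
     */
    static int acksAwaited(String acks, int isrSize, int clusterSize) {
        switch (acks) {
            case "0":
                return 0;        // fire and forget
            case "1":
                return 1;        // leader only
            case "all":
            case "-1":
                return isrSize;  // every replica currently in sync
            default:
                throw new IllegalArgumentException("unsupported acks value: " + acks);
        }
    }

    public static void main(String[] args) {
        int isrSize = 3; // assumed: replication factor 3, all replicas in sync
        for (int brokers : new int[] {5, 10, 100}) {
            System.out.println(brokers + " brokers -> waits for "
                    + acksAwaited("all", isrSize, brokers) + " replicas");
        }
    }
}
```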
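Also, Kafka allows you to set the minimum number of in-sync replicas that must be available for a write to succeed. This is a topic-level (or broker-level default) setting rather than a producer property:
min.insync.replicas=2
With acks=all, the leader still waits for every replica currently in the ISR, so this setting does not shorten the wait. What it does is set a floor: if fewer than 2 replicas (including the leader) are in sync, the broker rejects the write instead of accepting it with too little redundancy. With a replication factor of 3, the topic therefore keeps accepting writes when one replica is down, while still guaranteeing that every acknowledged message exists on at least two brokers.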
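As a rough model of how acks and min.insync.replicas interact on the broker side: the minimum is enforced only for acks=all, and it decides whether a write is accepted at all rather than how many acknowledgements are awaited. The sketch below is an assumption-laden simplification, and `MinIsrSketch` and `writeAccepted` are hypothetical names.

```java
final class MinIsrSketch {
    /**
     * Hypothetical model of the broker's accept/reject decision.
     * isrSize counts the leader; 0 means the partition has no leader.
     */
    static boolean writeAccepted(String acks, int isrSize, int minInsyncReplicas) {
        if (isrSize < 1) {
            return false; // no leader available, nothing can be written
        }
        boolean requiresFullIsr = "all".equals(acks) || "-1".equals(acks);
        // min.insync.replicas only applies when the producer asks for acks=all.
        return !requiresFullIsr || isrSize >= minInsyncReplicas;
    }

    public static void main(String[] args) {
        int minIsr = 2; // mirrors min.insync.replicas=2
        System.out.println("acks=all, ISR=3: " + writeAccepted("all", 3, minIsr));
        System.out.println("acks=all, ISR=2: " + writeAccepted("all", 2, minIsr));
        System.out.println("acks=all, ISR=1: " + writeAccepted("all", 1, minIsr));
        System.out.println("acks=1,   ISR=1: " + writeAccepted("1", 1, minIsr));
    }
}
```

In the third case the producer would see an error (and may retry) rather than a silent write to a single replica.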
Summary: Choosing the Right Acknowledgement Strategy
| acks Value | Behavior | Use Case | Trade-offs |
| --- | --- | --- | --- |
| 0 | Don’t wait for any acknowledgement | High-frequency, non-critical data (e.g., GPS updates) | Very fast, but risk of message loss |
| 1 | Wait for leader acknowledgement | General purpose | Good balance of speed and safety |
| all | Wait for all in-sync replicas | Financial or critical systems | Very reliable but slower |