Learnitweb

Idempotent producer in Kafka

Kafka producers and brokers communicate over a network, which means there is always a risk of network failure, broker issues, or delayed acknowledgements. These issues can lead to duplicate messages in Kafka topics, especially when retries are enabled.

To solve this, Kafka offers idempotent producers, which ensure each message is delivered only once, even during failures or retries. Let’s dive into how this works and how to configure it in your Spring Boot application.

Why You Need an Idempotent Kafka Producer

Scenario Without Idempotence:

  1. A Kafka producer sends a message to a Kafka broker.
  2. The broker writes the message to a topic and sends back an acknowledgement (ACK).
  3. If the ACK is lost due to network issues, the producer doesn’t know if the message was delivered.
  4. The producer retries and sends the same message again.
  5. Kafka stores this message again, resulting in duplicate messages in the topic.

This kind of duplication may not affect all applications equally, but for critical domains like banking, inventory management, or order processing, duplicates can cause:

  • Double payments
  • Multiple shipments
  • Inconsistent system states

What Does “Idempotent Producer” Mean?

An idempotent producer is one that avoids writing duplicate messages to Kafka, even if it retries sending the same message due to failures.

With Idempotence Enabled:

  • Kafka assigns a Producer ID (PID) and sequence numbers to each message batch.
  • The broker checks the sequence number of incoming messages.
  • If a message with the same sequence number already exists, Kafka broker discards the duplicate.
  • The producer receives an ACK as if it was accepted again.

This guarantees exactly-once delivery semantics per partition, as long as messages are not lost at the producer side.

How to Enable Idempotence in Kafka Producer

Spring Boot application.properties Configuration:

To enable idempotence explicitly:

spring.kafka.producer.enable-idempotence=true
spring.kafka.producer.acks=all
spring.kafka.producer.retries=3
spring.kafka.producer.max-in-flight-requests-per-connection=5

Explanation of Required Properties:

  1. enable-idempotence=true
    Enables the idempotent producer feature.
  2. acks=all
    Ensures the broker sends an ACK only when all in-sync replicas have received the message. This provides stronger delivery guarantees.
  3. retries=3 or more
    Allows the producer to retry sending the message. Retrying is essential for fault tolerance.
  4. max-in-flight-requests-per-connection=5 or fewer
    Limits the number of unacknowledged requests that the producer can send to a broker at the same time. If this is greater than 5, message reordering might happen, and idempotence will be disabled.

Java Code Example: Idempotent Producer Configuration

You can also configure the producer directly in code:

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class IdempotentProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        // Enable Idempotence
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.RETRIES_CONFIG, "3");
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, "5");

        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        // send messages as usual
    }
}

What Happens If Configuration Is Incorrect?

Configuring a Kafka producer for idempotence requires multiple properties to work together in a consistent and compatible way. If any of the related properties are misconfigured, idempotence can be silently disabled or cause runtime errors. Let’s break it down thoroughly.

1. Silent Disablement When Not Explicitly Set

If you do not explicitly set enable.idempotence=true, Kafka versions 2.5 and later will enable it by default.

However, this default behavior is fragile because if you override other related configurations (like acks, retries, or max.in.flight.requests.per.connection) and use incompatible values, Kafka will automatically disable idempotence, without throwing any errors or warnings in some cases.

This can create a false sense of safety — you may believe your producer is idempotent, but in reality, duplicates could still occur.


2. Runtime Configuration Exceptions (Explicitly Set)

If you explicitly set enable.idempotence=true in your Spring Boot application.properties or producer configuration, Kafka will enforce consistency and throw an exception if related properties are in conflict.

For example, you may see an error like:

org.apache.kafka.common.config.ConfigException: 
Cannot enable idempotence if acks != 'all' or if retries < 1 or if max.in.flight.requests.per.connection > 5

This helps catch misconfigurations early at startup, so it’s considered a best practice to explicitly set this property to true.

3. Incorrect Property Values That Break Idempotence

Let’s go through common incorrect settings that break idempotence:

a. acks != all

Setting acks to "1" or "0" will break idempotent behavior.

  • acks=1: The producer will proceed after one broker confirms the message — this allows faster throughput but risks message loss and no idempotence guarantees.
  • acks=0: No acknowledgements are expected. This offers no reliability at all, and is incompatible with idempotence.

Correct: You must set acks=all for idempotent producers.

b. retries=0

If you set retries=0, then the producer will not retry failed sends. This defeats the purpose of having idempotence enabled, because idempotence is most useful during retries.

Correct: Set retries to a positive number (e.g., 3, 5, or Integer.MAX_VALUE for infinite retries).

c. max.in.flight.requests.per.connection > 5

This setting controls how many unacknowledged requests can be sent at the same time.

  • If set above 5, the producer may reorder messages during retries.
  • Reordering makes it impossible for the broker to guarantee exactly-once semantics, so Kafka will disable idempotence in this case.

Correct: Set this value to 5 or less. Kafka enforces this rule strictly when enable.idempotence=true.

4. In Spring Boot Applications

Spring Boot abstracts away many Kafka properties using spring.kafka.producer.*, but it’s still your responsibility to ensure consistency between these properties.

If Spring Boot’s auto-configuration detects inconsistent values while enable-idempotence is true, it will pass the values to Kafka’s native ProducerConfig, which will in turn throw an exception.

To avoid surprises, always validate that all of these settings are present and consistent:

spring.kafka.producer.enable-idempotence=true
spring.kafka.producer.acks=all
spring.kafka.producer.retries=3
spring.kafka.producer.max-in-flight-requests-per-connection=5

5. Why Silent Disablement Is Dangerous

  • In test environments, everything may seem fine because messages aren’t duplicated often.
  • In production, under network latency or broker restart scenarios, duplicates will appear, leading to inconsistent data.
  • Without enabling idempotence properly, your consumers will need to implement their own deduplication logic, which adds complexity and performance overhead.

6. How to Validate Configuration

  • Check Kafka logs: Look for lines that indicate whether idempotence is enabled or disabled.
  • Programmatic verification: Use the producer’s metrics or config to confirm values at runtime.
  • Simulate network failures: Temporarily shut down brokers or introduce delays to validate behavior during retries.