Learnitweb

Kafka Consumer – Multiple Consumers in a Consumer Group

Introduction

Let’s understand the setup with a common microservices architecture.

Components:

  1. Producer: products microservice
  2. Kafka Topic: Stores messages (e.g., product-created)
  3. Consumers:
    • email-notification microservice
    • sms-notification microservice
    • push-notification microservice

When the products microservice publishes a ProductCreated event:

  • The message is sent to a Kafka topic.
  • All three different consumer microservices (email, SMS, push) listen to the same topic.
  • Each receives its own copy of the message.

Each microservice is different and needs to independently handle the event (e.g., send an email, send an SMS, etc.).
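This fan-out works because each microservice subscribes under its own consumer group, and Kafka delivers every message to each group. A minimal Python sketch of that delivery model (plain data structures, no Kafka client — the group names are the ones from the example above):

```python
# Simulate Kafka's fan-out: every consumer group receives its own copy
# of each message published to the topic.
def fan_out(messages, groups):
    """Return a mapping from each consumer group to the full message list."""
    return {group: list(messages) for group in groups}

events = ["ProductCreated:101", "ProductCreated:102"]
groups = ["email-notification", "sms-notification", "push-notification"]

delivered = fan_out(events, groups)

# Each group sees every event independently of the others.
for group, msgs in delivered.items():
    assert msgs == events
```

The point of the sketch: delivery across groups is broadcast, while (as the next sections show) delivery within a group is divided up.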

The Problem with Scaling One Consumer Type

Let’s say you want to scale only the email-notification microservice for better performance or reliability.

What happens if you start 4 instances of it?

If each instance listens to the topic independently:

  • Each instance receives the same message.
  • The same email will be sent four times — not ideal!

Solution: Kafka Consumer Groups

Kafka consumer groups provide a scalable and fault-tolerant mechanism for processing messages in parallel, without duplication. Let’s dive deeper into how they work and why they are essential in distributed microservice architectures.

What is a Kafka Consumer Group?

A consumer group is a collection of Kafka consumer instances (often part of the same microservice) that work together to consume messages from one or more Kafka topics.

Each consumer in the group:

  • Shares the same group ID as the others.
  • Is assigned a subset of the topic’s partitions.
  • Reads messages exclusively from those partitions.

Kafka guarantees that:

  • Each partition is consumed by only one consumer in a group at a time.
  • But a single consumer can consume messages from multiple partitions.

Example:

Suppose:

  • A topic has 4 partitions: P0, P1, P2, P3.
  • You start 4 instances of the email-notification microservice in the same consumer group.

Result:

  • Kafka assigns one partition to each consumer.
  • Each instance processes a unique subset of messages.

If you start 2 instances, Kafka assigns:

  • 2 partitions to each consumer (e.g., C1: P0 & P1, C2: P2 & P3).
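The assignments above can be reproduced with a simple contiguous (range-style) split of partitions across consumers. Kafka's real assignors (range, round-robin, sticky) are more involved, but for this example the outcome is the same:

```python
def range_assign(partitions, consumers):
    """Range-style assignment: split partitions into contiguous chunks,
    with earlier consumers taking the larger chunks when the count
    doesn't divide evenly. Extra consumers receive no partitions."""
    n, k = len(partitions), len(consumers)
    assignment, start = {}, 0
    for i, consumer in enumerate(consumers):
        size = n // k + (1 if i < n % k else 0)
        assignment[consumer] = partitions[start:start + size]
        start += size
    return assignment

partitions = ["P0", "P1", "P2", "P3"]

# 4 consumers: one partition each.
print(range_assign(partitions, ["C1", "C2", "C3", "C4"]))
# {'C1': ['P0'], 'C2': ['P1'], 'C3': ['P2'], 'C4': ['P3']}

# 2 consumers: two partitions each.
print(range_assign(partitions, ["C1", "C2"]))
# {'C1': ['P0', 'P1'], 'C2': ['P2', 'P3']}
```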

Partitioning and Load Distribution

Kafka topics are divided into partitions for scalability. These partitions allow messages to be distributed and consumed in parallel.

  • Messages are written to a partition based on a partitioning strategy (e.g., round-robin, key-based).
  • Each consumer in a group is assigned a set of partitions.
  • Thus, consumer groups scale horizontally by partition.

Important: There is no benefit to running more consumers than partitions in a group — Kafka won’t assign the extra consumers any partitions, so they simply remain idle.
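Key-based partitioning is what guarantees that all events for the same key (e.g., the same product ID) land on the same partition, preserving per-key ordering. Kafka's default partitioner hashes the key bytes with murmur2; the sketch below uses CRC32 as a simplified, deterministic stand-in for illustration:

```python
import zlib

def pick_partition(key, num_partitions):
    """Key-based partitioning: the same key always maps to the same
    partition. Kafka's default partitioner uses murmur2 on the key
    bytes; CRC32 here is a simplified stand-in."""
    if key is None:
        # With no key, Kafka spreads records across partitions
        # (sticky/round-robin); we just pick partition 0 here.
        return 0
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# All events for product-101 land on the same partition,
# so they are consumed in order by a single consumer.
assert pick_partition("product-101", 4) == pick_partition("product-101", 4)
```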

Fault Tolerance: Rebalancing

Kafka automatically rebalances partitions across the available consumers in a group when:

  • A new consumer joins the group.
  • A consumer crashes or disconnects.

Rebalancing is crucial for:

  • Ensuring continuous processing without human intervention.
  • Redistributing the load dynamically.

However, rebalancing briefly pauses consumption for the group, so it’s important to handle it gracefully in your application code (e.g., by using idempotent processing or storing offsets externally).
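Idempotent processing is the simplest of these safeguards: track which message IDs have already been handled and skip repeats, so a redelivery after a rebalance does no harm. A minimal in-memory sketch (a real service would persist the seen IDs, e.g., in a database; the class and field names are illustrative):

```python
class IdempotentEmailHandler:
    """Skips messages that were already processed, so a redelivery
    after a rebalance does not send a duplicate email."""

    def __init__(self):
        self.processed_ids = set()   # persist this in production
        self.emails_sent = 0

    def handle(self, message_id, payload):
        if message_id in self.processed_ids:
            return False             # duplicate: ignore it
        self.processed_ids.add(message_id)
        self.emails_sent += 1        # stand-in for actually sending
        return True

handler = IdempotentEmailHandler()
handler.handle("evt-1", "ProductCreated:101")
handler.handle("evt-1", "ProductCreated:101")  # redelivered after a rebalance
assert handler.emails_sent == 1                # only one email goes out
```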

Why Use Consumer Groups?

  1. Parallel Processing
    Scale your microservice horizontally by spinning up more consumers in the same group — Kafka divides the work among them.
  2. High Throughput
    Multiple consumers process messages concurrently, reducing end-to-end processing time.
  3. Avoid Redundancy
    Within a consumer group, each message is delivered to only one consumer, eliminating duplicate processing across instances. (Kafka’s default guarantee is at-least-once, so occasional redeliveries after failures are still possible — another reason to process idempotently.)
  4. Logical Separation
    Different consumer groups can serve different purposes — e.g., logging, indexing, sending notifications — each receives its own copy of every message.

Real-World Analogy

Think of a consumer group like a team of workers:

  • Each worker (consumer) is assigned a specific set of files (partitions).
  • No two workers in the same team work on the same file.
  • When a new worker joins, the manager (Kafka) reassigns files to balance the workload.
  • If a worker leaves, others pick up the slack.

This dynamic adjustment is what makes Kafka highly adaptable in modern distributed systems.

Kafka Rebalancing: How It Works

Kafka provides a powerful mechanism for horizontally scaling consumer microservices using consumer groups. This mechanism ensures efficient and coordinated consumption of messages from Kafka topics, especially when dealing with high throughput and multiple microservices. Let’s break it down and explore the critical points.

Key Concepts and Workflow

  1. Start with one consumer reading from all partitions.
    When you initially deploy a Kafka consumer microservice (like email-notification), it reads messages from all the partitions of a topic. For example, if the topic product-created-events has three partitions, a single consumer instance will consume messages from all three partitions. This setup is ideal for low message volume or initial development.
  2. As load increases, add more consumers to the same consumer group.
    To handle more messages, you can scale your application by deploying additional instances of the same microservice. Each instance should use the same consumer group ID. Kafka will automatically detect the new consumer and initiate rebalancing to distribute partitions among available consumer instances.
  3. Kafka will automatically rebalance and redistribute partitions.
    Rebalancing ensures that each partition is assigned to one consumer within the group. For example, if you now have two consumer instances and three partitions, one might read from two partitions, and the other from one. This distribution is dynamic and happens every time a consumer joins or leaves the group.
  4. Ensure that the number of consumers is less than or equal to the number of partitions to avoid idle consumers.
    A very important rule: one partition can be assigned to only one consumer in a group at a time. If you start more consumers than there are partitions in the topic, the extra consumers will remain idle. For instance, with a topic having 3 partitions, only up to 3 consumers can be actively consuming in a single consumer group.
  5. Each consumer must regularly send heartbeat signals to Kafka to stay active in the group.
    Kafka uses heartbeats to monitor the health of consumers in a group. Each consumer sends a heartbeat to the group coordinator (a Kafka broker) at regular intervals (default is 3 seconds). This tells Kafka the consumer is alive and ready to consume messages.
  6. On failure or disconnection, Kafka removes the consumer and reassigns partitions.
    If Kafka stops receiving heartbeats from a consumer (e.g., due to crash or network failure), it assumes the consumer is inactive. Kafka then triggers rebalancing and reassigns the lost consumer’s partitions to the remaining active members. This ensures fault tolerance and high availability.
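The failure handling in steps 5–6 can be illustrated by re-running a partition assignment after a consumer drops out of the group. The sketch below uses a simple contiguous (range-style) split; Kafka's real rebalance protocol additionally involves the group coordinator, heartbeats, and session timeouts:

```python
def rebalance(partitions, consumers):
    """Contiguous (range-style) split of partitions across consumers."""
    n, k = len(partitions), len(consumers)
    assignment, start = {}, 0
    for i, consumer in enumerate(consumers):
        size = n // k + (1 if i < n % k else 0)
        assignment[consumer] = partitions[start:start + size]
        start += size
    return assignment

partitions = ["P0", "P1", "P2"]
consumers = ["C1", "C2", "C3"]

before = rebalance(partitions, consumers)  # one partition each

# C2 stops sending heartbeats and is removed from the group;
# Kafka rebalances its partition across the survivors.
consumers.remove("C2")
after = rebalance(partitions, consumers)

assert before == {"C1": ["P0"], "C2": ["P1"], "C3": ["P2"]}
assert after == {"C1": ["P0", "P1"], "C3": ["P2"]}
```

No partition is left unowned after the failure — the surviving consumers simply pick up the extra load, which is the fault tolerance described above.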

Summary Table

Topic Partitions | Active Consumer Instances | Result
3 | 1 | One consumer reads from all 3 partitions
3 | 2 | One consumer gets 1 partition, the other gets 2
3 | 3 | One-to-one partition-to-consumer mapping
3 | 4 | The fourth consumer remains idle

This dynamic consumer scaling strategy and partition rebalancing make Kafka an excellent choice for building robust and scalable event-driven microservices.