Learnitweb

Apache Kafka Broker and Leader-Follower Roles

Introduction

Apache Kafka is a distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications. At the heart of Kafka’s architecture lies the Kafka Broker, which plays a crucial role in managing and processing event streams. This tutorial provides a detailed understanding of what a Kafka Broker is, how it functions, and why it is important in a distributed system.

What is a Kafka Broker?

A Kafka Broker is essentially a server running the Kafka process. It is responsible for receiving messages from producers, storing them, and serving them to consumers in a Kafka cluster. The broker is a key component in Kafka’s architecture, ensuring efficient message handling and replication.

Kafka Broker as a Server

You can think of a Kafka Broker as a Kafka Server, which runs on either a physical or virtual machine. This means you can run a Kafka Broker on your personal computer, a dedicated server, or in a cloud environment such as Amazon Web Services (AWS) or Google Cloud Platform (GCP). Because brokers can run in any of these environments, it is straightforward to add more brokers as load grows, which is what makes Kafka scalable and resilient.

Running Kafka Broker in a Cluster

To achieve high availability and fault tolerance, multiple Kafka Brokers can be deployed in a Kafka Cluster. In a clustered setup:

  • If one broker fails, the remaining brokers continue to process requests.
  • The system remains available and operational without significant downtime.

How Kafka Brokers Work Behind the Scenes

Leader-Follower Model

Kafka Brokers operate using a leader-follower model, ensuring data consistency and reliability. Here’s how it works:

  1. Leader Broker: Each Kafka topic is divided into partitions, and for each partition one of the brokers hosting a replica is assigned as the leader.
  2. Follower Brokers: The other brokers hosting a replica of that partition act as followers and copy the leader’s data by fetching it from the leader. The number of replicas is set by the topic’s replication factor.
  3. Dynamic Role Assignment: If the leader broker goes offline, one of the in-sync follower brokers is promoted to leader to ensure continued operations.

This dynamic assignment of leadership ensures fault tolerance and prevents a single point of failure.
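
The failover step described above can be illustrated with a small toy model. This is only a sketch: the `Partition` class and `handle_broker_failure` method are hypothetical names invented for illustration, and promoting the first remaining follower is a simplification, not how Kafka actually elects a new leader.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Toy model of one partition's replica set (not Kafka's real data structures)."""
    leader: int                                   # broker id currently leading the partition
    followers: list = field(default_factory=list)  # broker ids holding follower replicas

    def handle_broker_failure(self, broker_id: int) -> None:
        """Drop a failed broker and promote a follower if the leader was lost."""
        if broker_id in self.followers:
            self.followers.remove(broker_id)
        elif broker_id == self.leader:
            if not self.followers:
                raise RuntimeError("no follower available to take over")
            # Simplification: promote the first remaining follower.
            self.leader = self.followers.pop(0)


partition = Partition(leader=1, followers=[2, 3])
partition.handle_broker_failure(1)   # the leader broker goes offline
print(partition.leader, partition.followers)  # prints: 2 [3]
```

After Broker 1 fails, Broker 2 leads the partition and Broker 3 remains a follower, so clients still have a leader to talk to.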

Kafka Leaders and Followers Explained

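In a Kafka cluster, leadership is assigned per partition rather than per cluster: for any given partition, one Kafka Broker acts as the leader, and the other brokers holding a replica of that partition act as followers. The same broker can be the leader for some partitions and a follower for others.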

What is a Leader in Apache Kafka?

A leader is responsible for handling all write requests and, by default, all read requests for its partition. The leader manages operations for its partition, similar to how a store manager oversees a specific section.

  • Kafka clients must interact with the leader when writing messages to a partition.
  • Once the leader has stored a message, the followers fetch it from the leader and append it to their own copies for redundancy and consistency.
  • Followers append messages in the same order as the leader, so every replica holds the same sequence of messages.
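
As a rough illustration of the write-then-replicate flow, the sketch below models each replica as an in-memory list. The `Replica` class and its `append` and `fetch_from` methods are hypothetical names for this example only; real brokers persist the log to disk and replicate over the network.

```python
class Replica:
    """Toy in-memory log for one replica of a partition."""

    def __init__(self, broker_id):
        self.broker_id = broker_id
        self.log = []

    def append(self, record):
        """Leader-side write: store the record and return its offset."""
        self.log.append(record)
        return len(self.log) - 1

    def fetch_from(self, leader):
        """Follower-side replication: copy records missing locally, in log order."""
        missing = leader.log[len(self.log):]
        self.log.extend(missing)
        return len(missing)


leader = Replica(broker_id=1)
follower = Replica(broker_id=2)

# Clients write only to the leader.
for event in ["order-created", "order-paid"]:
    leader.append(event)

# The follower catches up by copying what it does not have yet.
copied = follower.fetch_from(leader)
print(copied, follower.log == leader.log)  # prints: 2 True
```

Because the follower only ever copies the tail of the leader's log, both replicas end up with the same records in the same order.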

What is a Follower in Apache Kafka?

A follower is a replica of a partition that copies all data from the leader.

  • Followers do not accept writes from producers, which prevents inconsistencies between replicas.
  • They enhance fault tolerance and redundancy in Kafka clusters.
  • If a leader broker fails, one of the in-sync followers is automatically promoted to leader.

With only a single broker, there is nowhere to place additional replicas, so data redundancy is lost and a single failure can lead to system downtime. Having multiple brokers ensures that data remains available even if one goes offline.

Balancing Leadership Across Brokers

To prevent bottlenecks, Kafka distributes leadership roles dynamically across multiple brokers.

  • Each partition has a designated leader broker. The initial leader is assigned at the time of topic creation and can change later, for example after a broker failure.
  • Different partitions in a topic may have different leader brokers to balance the load.
  • Kafka ensures that if a leader broker fails, a follower broker takes over seamlessly.

For example, if Topic A has three partitions:

  • Partition 0 may have Broker 1 as a leader.
  • Partition 1 may have Broker 2 as a leader.
  • Partition 2 may have Broker 3 as a leader.

This way, no single broker bears the entire workload, and the system remains efficient.
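
The Topic A example can be reproduced with a simple round-robin assignment. This is a minimal sketch under the assumption that leaders are handed out in strict rotation; `assign_leaders` is a hypothetical helper, and Kafka's actual placement logic takes more factors into account.

```python
def assign_leaders(num_partitions, broker_ids):
    """Spread partition leadership over brokers in round-robin order."""
    return {p: broker_ids[p % len(broker_ids)] for p in range(num_partitions)}


# Topic A: three partitions on a cluster with Brokers 1, 2 and 3.
leaders = assign_leaders(num_partitions=3, broker_ids=[1, 2, 3])
print(leaders)  # prints: {0: 1, 1: 2, 2: 3}
```

With more partitions than brokers, the rotation wraps around, so each broker leads roughly the same number of partitions.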

Producer and Consumer Communication

A Kafka Broker is responsible for handling communication between producers (applications that send messages) and consumers (applications that receive messages):

  • Producers send data to the broker, which stores it in topic partitions.
  • Consumers retrieve data from the broker, reading events from partitions.
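
To make the producer and consumer roles concrete, here is a toy in-memory stand-in for a broker. `ToyBroker`, `produce`, and `consume` are hypothetical names for this sketch and are not part of any Kafka client library; the topic name `orders` is also made up.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a broker; not a Kafka client or server."""

    def __init__(self):
        # (topic, partition) -> ordered list of records
        self.logs = defaultdict(list)

    def produce(self, topic, partition, record):
        """Append a record to a topic partition and return its offset."""
        log = self.logs[(topic, partition)]
        log.append(record)
        return len(log) - 1

    def consume(self, topic, partition, offset):
        """Return all records stored from the given offset onward."""
        return self.logs[(topic, partition)][offset:]


broker = ToyBroker()
broker.produce("orders", 0, "order-1")
broker.produce("orders", 0, "order-2")
print(broker.consume("orders", 0, offset=1))  # prints: ['order-2']
```

The producer and the consumer never talk to each other directly. Both only address the broker, and the consumer chooses where to start reading by passing an offset.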

Message Reliability and Replication

Kafka ensures data reliability using replication:

  • Each partition can have multiple copies (replicas) on different brokers. The number of copies is set by the topic’s replication factor.
  • The leader broker manages writes and, by default, reads, while followers replicate the data.
  • If a leader broker fails, a follower automatically takes over as the new leader.
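
The sketch below shows one way replicas could be spread over brokers for a given replication factor. `place_replicas` is a hypothetical helper, and placing replicas on consecutive brokers is an assumption made for illustration; it is not Kafka's actual assignment algorithm.

```python
def place_replicas(num_partitions, broker_ids, replication_factor):
    """Give each partition `replication_factor` replicas on distinct brokers."""
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed the number of brokers")
    placement = {}
    for p in range(num_partitions):
        # The first broker in each list is treated as the leader, the rest as followers.
        placement[p] = [
            broker_ids[(p + i) % len(broker_ids)] for i in range(replication_factor)
        ]
    return placement


placement = place_replicas(num_partitions=3, broker_ids=[1, 2, 3], replication_factor=2)
print(placement)  # prints: {0: [1, 2], 1: [2, 3], 2: [3, 1]}
```

With a replication factor of 2, every partition survives the loss of any one broker, because its second copy always lives on a different broker.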