Introduction
In this tutorial, you will learn about Kafka topics and their role in event-driven architectures. Kafka topics serve as storage for published messages and allow multiple microservices to consume events efficiently. Kafka is widely used in distributed systems for real-time data streaming and event-driven applications.
What Is a Kafka Topic?
A Kafka topic is a logical storage unit where Kafka stores all published messages. Topics enable microservices to communicate asynchronously by publishing and consuming events. They help decouple producers and consumers, allowing systems to scale independently.
Example Scenario
Consider a Products Microservice that publishes an event whenever a new product is created. Several microservices consume this event, such as:
- SMS Notification Microservice: listens to events from the Kafka topic and sends SMS notifications to users whenever a new product is added.
- Email Notification Microservice: subscribes to the topic and sends email notifications to registered users.
- Push Notification Microservice: receives events from the topic and sends push notifications to mobile applications, informing users about new products.
Instead of sending events directly to these microservices, the Products Microservice publishes the event to a Kafka topic. Each interested microservice then consumes the event from the topic at its own pace, ensuring reliable event processing.
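The decoupling described above can be sketched with a simplified in-memory model. This is purely illustrative Python, not real Kafka client code: the Topic class and its methods are invented here to show how one published event can be read independently by multiple consumers, each at its own pace.

```python
# Simplified in-memory sketch of a Kafka topic (not real Kafka client code):
# a producer appends events, and each consumer reads independently.

class Topic:
    def __init__(self, name):
        self.name = name
        self.log = []  # append-only list of events

    def publish(self, event):
        self.log.append(event)

    def read_from(self, offset):
        # Each consumer keeps its own offset, so consumers never
        # interfere with one another.
        return self.log[offset:]

# The Products Microservice publishes each event exactly once...
topic = Topic("product-created-event-topic")
topic.publish({"productId": 1, "title": "Keyboard"})
topic.publish({"productId": 2, "title": "Mouse"})

# ...and each notification service consumes the same events independently,
# tracking its own position in the log.
sms_events = topic.read_from(0)    # SMS service reads from the beginning
email_events = topic.read_from(0)  # Email service reads the same events separately
print(len(sms_events), len(email_events))  # 2 2
```

Because the events stay in the topic after being read, adding a new consumer (say, a Push Notification Microservice) requires no change to the producer.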
Kafka Topics and Partitions
Kafka topics are divided into partitions, and each partition can be replicated across multiple Kafka brokers for fault tolerance. With a replication factor greater than one, this ensures high availability and resilience: if one broker fails, the event data remains accessible from replicas on other brokers.
Benefits of Partitioning
- Parallel Processing: Multiple instances of the same microservice can read from different partitions, improving scalability and performance by distributing the load among multiple consumers.
- Load Distribution: Events are distributed across partitions, allowing better resource utilization and ensuring that no single consumer is overwhelmed with too many events.
- Fault Tolerance: Replicating partitions across multiple Kafka brokers ensures that data remains available even in case of hardware failures.
- Efficient Scaling: By increasing the number of partitions, the system can handle higher loads and distribute processing tasks across multiple consumers efficiently.
For example, if a topic has three partitions, different microservices can consume data from separate partitions simultaneously, increasing throughput and reducing latency.
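How an event lands in a particular partition can be sketched as key-based hashing. Kafka's default partitioner applies a murmur2 hash to the record key; the sketch below substitutes MD5 as a deterministic stand-in, so the exact partition numbers are illustrative, but the core property holds: the same key always maps to the same partition.

```python
import hashlib

NUM_PARTITIONS = 3

def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Kafka's default partitioner applies murmur2 to the key bytes;
    # MD5 is used here purely as a deterministic stand-in.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# Events with the same key always map to the same partition,
# which preserves per-key ordering; different keys spread the load.
assert partition_for("product-42") == partition_for("product-42")
assert 0 <= partition_for("product-7") < NUM_PARTITIONS
```

This is why choosing a good key matters: all events for one key are ordered within a single partition, while distinct keys distribute work across consumers.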
Naming and Managing Topics
- Every topic has a unique name (e.g., product-created-event-topic). The name should be meaningful and describe the type of events the topic stores.
- A topic's partition count is defined at creation and can be increased later, but it cannot be decreased, because existing records cannot be safely redistributed across fewer partitions.
Each partition is physically stored on a Kafka broker’s hard disk as a sequence of messages, ensuring durability and persistence.
Kafka provides various configuration options for topics, such as retention policies, replication factors, and compression settings, which can be adjusted based on system requirements.
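As a config example, a topic with the settings discussed above can be created with Kafka's bundled kafka-topics.sh tool; the broker address, partition count, replication factor, and retention value here are illustrative and require a running Kafka cluster.

```shell
# Create a topic with 3 partitions, replicated to 2 brokers,
# and a 7-day retention period (604,800,000 ms); values are illustrative.
kafka-topics.sh --create \
  --bootstrap-server localhost:9092 \
  --topic product-created-event-topic \
  --partitions 3 \
  --replication-factor 2 \
  --config retention.ms=604800000
```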
Kafka Topic Data Structure
Each partition consists of ordered events stored in sequence, similar to rows in a database table. Kafka topics use a log-based storage mechanism where messages are appended to the end of the log and read sequentially by consumers.
Understanding Offsets
Each event in a partition has a unique offset, starting from 0. The offset acts as an identifier and helps consumers track their position in the partition. New events are always appended at the end, and offsets are immutable:
- Offset 0 → First event in the partition.
- Offset 1 → Second event in the partition.
- Offset N → Latest event stored in the partition.
Unlike traditional databases, Kafka does not allow individual events to be updated or deleted once they are stored; records are removed only by retention policies or compaction. This ensures a clear and reliable event history.
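The offset semantics above can be sketched for a single partition. This is an illustrative model, not client code: offsets are simply sequential positions in an append-only log, and a consumer resumes by remembering the next position to read.

```python
# Sketch of offset semantics in one partition (illustrative, not client code):
# offsets are assigned sequentially as records are appended.

partition = []  # append-only log for one partition

def append(record):
    offset = len(partition)  # next offset = current length of the log
    partition.append(record)
    return offset

first = append("event-A")   # offset 0
second = append("event-B")  # offset 1
third = append("event-C")   # offset 2

# A consumer that has processed up to offset 1 resumes at offset 2.
resume_from = 1 + 1
print(first, second, third, partition[resume_from])  # 0 1 2 event-C
```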

Message Retention in Kafka
- Messages persist in Kafka topics even after being consumed, allowing multiple consumers to process the same data independently.
- By default, messages are retained for 7 days. This can be changed with the broker-level settings log.retention.hours and log.retention.bytes, or per topic with retention.ms and retention.bytes.
- Retention policies can be customized based on business needs, allowing messages to be stored indefinitely or deleted after a specified period.
- Kafka also supports compacted topics, where only the latest version of each key is retained, useful for maintaining stateful information like user profiles.
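Log compaction can be sketched as "keep the latest record per key." The snippet below is a simplified illustration of the outcome, not Kafka's actual cleaner (which runs in the background and only compacts older log segments); the user-profile keys and values are made up for the example.

```python
# Sketch of log compaction (illustrative): keep only the newest value per key.
log = [
    ("user-1", "name=Alice"),
    ("user-2", "name=Bob"),
    ("user-1", "name=Alice Smith"),  # newer value for user-1
]

def compact(records):
    latest = {}
    for key, value in records:  # later entries overwrite earlier ones
        latest[key] = value
    return list(latest.items())

print(compact(log))
# [('user-1', 'name=Alice Smith'), ('user-2', 'name=Bob')]
```

After compaction, a new consumer replaying the topic still sees the current state for every key, which is what makes compacted topics suitable for stateful data like user profiles.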