
Understanding Sharding in Elasticsearch (and How It Differs from Segments)

Why Sharding Is Needed

Let us start with a practical scenario.

  • Imagine we have an Elasticsearch cluster consisting of multiple machines (nodes), and we are storing data in a products index. As the application grows, the number of product documents increases steadily, and the index becomes too large for a single node to handle efficiently.
  • If we expect the data volume to keep growing in the near future, relying on one node becomes risky. Disk usage increases, indexing slows down, and complex search queries start consuming more CPU and memory.
  • To solve this problem, Elasticsearch allows us to divide a large index into smaller pieces and distribute them across multiple nodes. This process is called sharding, and it enables horizontal scalability.

What Exactly Is Sharding?

Sharding is a logical and physical data distribution mechanism in Elasticsearch.

  • Sharding is the process of splitting one large index into smaller, manageable units. Instead of storing all documents in a single place, Elasticsearch spreads them across multiple shards.
  • Each of these units is called a primary shard. When we talk about “number of shards” for an index, we are usually referring to the number of primary shards.
  • Primary shards allow Elasticsearch to distribute data and workload across multiple nodes. This makes indexing faster and allows search queries to be executed in parallel.

In simple terms, sharding is the foundation that allows Elasticsearch to scale out rather than scale up.
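
Which shard a document ends up on is decided by a routing formula: roughly shard = hash(_routing) % number_of_primary_shards, where _routing defaults to the document ID (Elasticsearch actually uses a murmur3 hash). The Python sketch below imitates the idea; it is an illustration of the concept, not Elasticsearch's real code:

    import hashlib

    NUMBER_OF_PRIMARY_SHARDS = 4

    def pick_shard(routing_value: str, num_shards: int = NUMBER_OF_PRIMARY_SHARDS) -> int:
        # md5 stands in for Elasticsearch's murmur3 hash in this sketch.
        digest = hashlib.md5(routing_value.encode("utf-8")).hexdigest()
        return int(digest, 16) % num_shards

    for doc_id in ["product-1", "product-2", "product-3"]:
        print(f"{doc_id} -> shard {pick_shard(doc_id)}")

Because the formula is deterministic, the same routing value always maps to the same shard, which is how Elasticsearch can later retrieve a document by ID without scanning every shard.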

Why Are They Called Primary Shards?

At this point, a very natural question arises.

  • Why do we call them primary shards? The term “primary” exists because Elasticsearch also supports another type of shard called a replica shard.
  • Primary shards are the original shards where data is first written. Replica shards are copies of these primary shards and are used for fault tolerance and read scalability.
  • Although replicas are important, the concept of primary shards comes first. That is why sharding discussions usually begin with primary shards, and replication is introduced later.

For now, it is enough to remember that primary shards are the main building blocks of an index.
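
As a small preview (a sketch using the official elasticsearch Python client with 8.x method names, against a hypothetical local cluster), one practical difference between the two shard types is that the replica count can be changed on a live index, while the primary shard count cannot:

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # adjust URL/auth for your cluster

    # Assuming a "products" index already exists: replicas can be
    # added or removed at any time...
    es.indices.put_settings(
        index="products",
        settings={"number_of_replicas": 2},
    )

    # ...but "number_of_shards" (the primary shard count) is a static
    # setting; trying to change it the same way would fail.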

How Elasticsearch Sees the Index vs How Developers See It

Let us make this distinction very clear, because it often causes confusion.

  • When you create an index, you explicitly tell Elasticsearch how many shards it should have. For example, you might say: “Create the products index with 4 shards.”
  • Internally, Elasticsearch will create 4 primary shards for that index. These shards may be placed on different nodes in the cluster. Note that the primary shard count is fixed once the index is created; changing it later requires reindexing (or the shrink/split APIs).
  • As developers, however, we always interact with the index as if it were a single logical unit. We index documents into products, and we search products, without worrying about which shard the data lives on.

So while Elasticsearch sees multiple shards behind the scenes, you and I see one logical index.
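
Here is a minimal sketch of that workflow using the official elasticsearch Python client (8.x method names; the sample document fields are made up for illustration):

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # adjust URL/auth for your cluster

    # Explicitly tell Elasticsearch: create "products" with 4 primary shards.
    es.indices.create(
        index="products",
        settings={"number_of_shards": 4, "number_of_replicas": 1},
    )

    # From here on, we only ever name the logical index.
    es.index(index="products", id="1", document={"name": "laptop", "price": 799})
    es.search(index="products", query={"match": {"name": "laptop"}})

    # Which of the 4 shards holds document "1" is Elasticsearch's concern, not ours.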

Shards and Elasticsearch Nodes

To build intuition, let us start with a simplified model.

  • Each shard belongs to an Elasticsearch instance (node). In a real cluster, Elasticsearch decides shard placement dynamically, but conceptually, it helps to imagine each shard being managed by a node.
  • Shards can move between nodes if needed. Elasticsearch can rebalance shards automatically when nodes are added or removed.
  • This dynamic shard management is what makes Elasticsearch flexible and resilient. You do not need to manually redistribute data when the cluster changes.
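
You can observe shard placement yourself with the cat shards API. A sketch with the Python client (the h parameter selects columns; exact output varies by version):

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")

    # One row per shard copy: index name, shard number, primary/replica
    # flag (p/r), state, and the node currently hosting it.
    print(es.cat.shards(index="products", h="index,shard,prirep,state,node", v=True))

If you add or remove a node and run this again, the node column changes as Elasticsearch rebalances, with no manual data movement on your part.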

Are Shards and Segments the Same Thing?

This is one of the most common and important questions.

“Sharding sounds a lot like segments. Are they the same?”

The answer is no, and the distinction is crucial.

Shards vs Segments: A Clear Separation of Responsibilities

Let us break this down carefully.

Shards (Elasticsearch Responsibility)

  • Shards are created and managed by Elasticsearch.
  • Shards exist to enable horizontal scalability. They allow large indices to be split across multiple nodes.
  • Each shard is a complete, independent unit of data distribution within the cluster.

Segments (Lucene Responsibility)

  • Elasticsearch is built on top of Apache Lucene.
  • Each shard is internally a single Lucene index.
  • Lucene manages data inside a shard by creating smaller immutable units called segments.
  • Segments are never rewritten in place: their immutability makes reads fast, while new documents simply go into new segments, keeping writes efficient as well.

So the hierarchy looks like this:

Index → Shards (Elasticsearch) → Segments (Lucene)
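
All three levels can be seen at once through the cat segments API, which lists the Lucene segments inside every shard of an index. A sketch with the Python client (exact columns vary by version):

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")

    # One row per Lucene segment: the index it belongs to, the shard
    # number, primary or replica, the segment name, and its live doc count.
    print(es.cat.segments(index="products", h="index,shard,prirep,segment,docs.count", v=True))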

Why You Usually Don’t Need to Worry About Segments

Segments are an internal implementation detail.

  • Lucene automatically creates and manages segments to optimize indexing and searching.
  • Segments are immutable, which makes searches very fast and reliable.
  • Elasticsearch handles segment creation, merging, and cleanup transparently.

If the concept of segments feels confusing at this stage, that is completely fine.

  • As an application developer or Elasticsearch user, you rarely need to think about segments directly.
  • Your primary concern should be index design, shard count, and replication strategy.

Segments are important internally, but they are not something you usually control directly.