In this tutorial, we are going to perform a very important practical experiment that will help you visually and conceptually understand how Elasticsearch distributes data across nodes and shards, and how it still gives you a single combined search result. This is one of the core ideas behind Elasticsearch’s scalability and performance.
We will create an index with two primary shards, insert some documents, observe how those documents are distributed across the cluster, and finally send a search request to just one node and still receive results from the entire cluster.
This experiment assumes that you already have a three-node Elasticsearch cluster running using Docker (for example: es01, es02, es03).
Step 1: Make Sure Your Three-Node Cluster Is Running
Before we begin, ensure that:
- All three Elasticsearch containers are up and running.
- If you had stopped Docker earlier, start the cluster again using your `docker compose up` command.
- Verify that `es01`, `es02`, and `es03` are all running.
This is important because shard distribution and replication only make sense when multiple nodes are available in the cluster.
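A quick way to confirm that all three nodes have joined the cluster is the `_cat/nodes` API. In Kibana Dev Tools, run:

GET /_cat/nodes?v

You should see one row each for `es01`, `es02`, and `es03`. If any node is missing, fix that before continuing.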
Step 2: Create the products Index with Two Primary Shards
Now open Kibana Dev Tools (Dev Console) and create a new index called `products` with the following configuration:
- 2 primary shards
- 1 replica per primary shard
Run the following request:
PUT /products
{
"settings": {
"number_of_shards": 2,
"number_of_replicas": 1
}
}
If everything is correct, Elasticsearch will respond with an acknowledgment that the index has been created successfully.
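The acknowledgment looks like this:

{
  "acknowledged": true,
  "shards_acknowledged": true,
  "index": "products"
}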
At this point, conceptually, what you have told Elasticsearch is:
- “Please split my `products` index into two independent primary shards.”
- “For each primary shard, also maintain one replica copy on some other node for high availability.”
Step 3: Check How Shards Are Distributed Across Nodes
Now let us inspect how Elasticsearch has actually placed these shards in the cluster.
Run the following command:
GET /_cat/shards/products?v
You will see output similar to this (exact node names may vary):
- Shard `0` (primary) might be on `es02`
- Shard `1` (primary) might be on `es03`
- The replicas might be on `es01`
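For example, the raw output might look something like this (the IPs, store sizes, and node assignments will differ in your cluster):

index    shard prirep state   docs store ip         node
products 0     p      STARTED    0  225b 172.18.0.3 es02
products 0     r      STARTED    0  225b 172.18.0.2 es01
products 1     p      STARTED    0  225b 172.18.0.4 es03
products 1     r      STARTED    0  225b 172.18.0.2 es01

Here `p` marks a primary shard and `r` a replica.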
Conceptually, the situation looks like this:
- Primary Shard 0 → es02
- Primary Shard 1 → es03
- Replica copies → es01
This already shows one very important idea:
Elasticsearch does not store the entire index on a single machine. Instead, it splits the index into shards and distributes them across nodes.
Step 4: Insert Some Documents into the Index
Now let us insert a few documents. We will add four products.
You can run requests like these:
POST /products/_doc/1
{
"name": "Product One",
"price": 100
}
POST /products/_doc/2
{
"name": "Product Two",
"price": 200
}
POST /products/_doc/3
{
"name": "Product Three",
"price": 300
}
POST /products/_doc/4
{
"name": "Product Four",
"price": 400
}
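As a side note, the same four documents can be indexed in a single round trip with the bulk API, where each action line is followed by the document itself:

POST /products/_bulk
{ "index": { "_id": "1" } }
{ "name": "Product One", "price": 100 }
{ "index": { "_id": "2" } }
{ "name": "Product Two", "price": 200 }
{ "index": { "_id": "3" } }
{ "name": "Product Three", "price": 300 }
{ "index": { "_id": "4" } }
{ "name": "Product Four", "price": 400 }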
Now you have four documents in your products index.
Step 5: Refresh the Index (Optional but Useful for Observation)
Sometimes, immediately after inserting documents, Elasticsearch may not show the updated counts in certain low-level APIs because of its near real-time nature.
To make sure everything is visible immediately, you can run:
POST /products/_refresh
Important conceptual note:
Refreshing is not mandatory for search to work, but it is often useful when you are learning and want to immediately see updated statistics.
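After the refresh, you can verify that all four documents are visible with a quick count:

GET /products/_count

The response should report "count": 4.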
Step 6: Check How Documents Are Distributed Across Shards
Now again check the shard status:
GET /_cat/shards/products?v
You might see something like:
- Shard 0 has 3 documents
- Shard 1 has 1 document
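In the raw output, the docs column makes this visible (the exact values will vary):

index    shard prirep state   docs store ip         node
products 0     p      STARTED    3  4.5kb 172.18.0.3 es02
products 1     p      STARTED    1  4.1kb 172.18.0.4 es03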
This may look slightly unbalanced, but that is completely normal and expected because:
- You have inserted only four documents.
- Elasticsearch uses a hashing algorithm on the document ID to decide which shard a document should go to (see the formula after this list).
- With a small number of documents, distribution will not look perfectly even.
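In its simplified, documented form, the routing rule looks like this (by default, `_routing` is the document's `_id`):

shard_num = hash(_routing) % number_of_primary_shards

Because the hash output is effectively random for typical IDs, documents spread evenly across shards as the dataset grows.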
Very important conceptual understanding:
If you inserted 100 or 10,000 documents, you would see a much more even distribution, such as a 51/49 or 5,020/4,980 split.
So do not worry about small imbalances when the dataset is tiny.
Step 7: Send a Search Request to Only One Node
Now comes the most interesting part of this experiment.
Let us say you send a search request directly to only one node, for example es03.
From your terminal (or browser), you can call:
GET http://localhost:9202/products/_search
(Here 9202 is just an example port mapped to es03. Your port mapping may differ.)
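From a terminal, the equivalent request might look like this (a sketch assuming plain HTTP with port 9202 mapped to es03; if your cluster has security enabled, add https and credentials):

curl -s "http://localhost:9202/products/_search?pretty"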
What you will see is:
- All four products are returned in the search result.
Step 8: Understand What Just Happened Behind the Scenes
This is the core concept this lecture is trying to demonstrate.
Even though:
- Your data is physically distributed across multiple nodes.
- Shard 0 is on one node and Shard 1 is on another node.
Still:
- You sent the request to only one node.
- That node did not have all the data locally.
- But Elasticsearch automatically:
- Scattered the search request to all relevant shards across the cluster.
- Collected the partial results from each shard.
- Merged them together.
- Returned a single combined response to you.
This is known as the scatter-gather or fan-out / fan-in search execution model.
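You can see a trace of this fan-out in the search response itself: the `_shards` section reports how many shards participated in the query.

"_shards": {
  "total": 2,
  "successful": 2,
  "skipped": 0,
  "failed": 0
}

Here `total: 2` confirms that both primary shards (or their replicas) were consulted, even though you contacted only one node.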
In simple words:
To the user, the cluster behaves like a single big database, even though internally the data is split and distributed across many machines.
Step 9: Why This Is So Powerful
This design gives Elasticsearch some extremely powerful properties:
- Transparency to the user, because you always interact with the cluster as if it were a single system.
- Horizontal scalability, because you can add more nodes and more shards as your data grows.
- High performance, because search is executed in parallel across multiple shards.
- High availability, because replica copies exist on other nodes.
