
Understanding Yellow Cluster Status in Elasticsearch

In this tutorial, we will look at why an Elasticsearch cluster shows a YELLOW status, especially when you are running a single-node Docker setup, and how to fix or intentionally control this behavior.
This is a very common point of confusion for beginners, and once you understand it properly, cluster health indicators will start making a lot more sense.

Prerequisites and Setup Context

  • We are using a single-node Docker cluster with Elasticsearch and Kibana.
  • This setup is intentional for learning, and there is nothing wrong with it.
  • In upcoming lectures, multi-node YAML configurations will be introduced, but for now we focus on why cluster health behaves the way it does in a single-node environment.

Make sure Elasticsearch and Kibana containers are up and running before proceeding.

Creating the Index and Observing the Problem

  • We create an index (for example, products) using the Kibana Dev Tools console; the exact request is shown just after this list.
  • The index creation succeeds without any errors.
  • However, when we check cluster health, the status is YELLOW.
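
For reference, the creation request from the first bullet is a single line in Dev Tools (products is just the example index name used throughout this tutorial):

PUT products

With default settings, this creates the index with one primary shard and one replica, which is exactly what triggers the behavior discussed next.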

At this point, the natural question is:

Why is the cluster not GREEN if everything looks fine?

Checking Cluster Health

Use the cluster health API:

GET _cluster/health

The response shows:

  • status: yellow
  • No failed shards
  • No data loss

So something is clearly not fully optimal, but nothing is actually broken. Here is the full health response from this single-node setup:

{
  "cluster_name": "docker-cluster",
  "status": "yellow",
  "timed_out": false,
  "number_of_nodes": 1,
  "number_of_data_nodes": 1,
  "active_primary_shards": 33,
  "active_shards": 33,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 1,
  "unassigned_primary_shards": 0,
  "delayed_unassigned_shards": 0,
  "number_of_pending_tasks": 0,
  "number_of_in_flight_fetch": 0,
  "task_max_waiting_in_queue_millis": 0,
  "active_shards_percent_as_number": 97.05882352941177
}
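
If several indices exist and you are not sure which one is responsible for the yellow status, the same API accepts a level parameter that breaks the health down per index (an optional variant, not needed for the rest of this tutorial):

GET _cluster/health?level=indices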

Inspecting Shard Allocation (The Real Reason)

To understand why the cluster is yellow, we inspect shard allocation.

All shards (can be noisy)

GET _cat/shards?v

This output can be confusing if many indices exist.

Focus only on one index (recommended)

GET _cat/shards/products?v

Now the picture becomes clear.

index    shard prirep state      docs store dataset ip         node
products 0     p      STARTED       1 6.3kb   6.3kb 172.18.0.3 e062efcb532a
products 0     r      UNASSIGNED                               

Understanding prirep Column

In the shard output, focus on the prirep column:

  • p → primary shard
  • r → replica shard

You will see two entries for the products index:

  • One primary shard (p) → assigned and running
  • One replica shard (r) → UNASSIGNED

Why Replica Shard Is Unassigned

  • By default, every index is created with:
    • number_of_shards = 1
    • number_of_replicas = 1
  • A replica shard must be placed on a different node than its primary shard.
  • Since this is a single-node cluster, Elasticsearch has no other node where it can place the replica.

So Elasticsearch is effectively saying:

“Your data is available, but I cannot provide high availability.”

That is exactly what YELLOW means.
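
You can also ask Elasticsearch to state this reason explicitly. The cluster allocation explain API reports why a particular shard copy is unassigned; here is a minimal request for the replica of shard 0 of the products index:

GET _cluster/allocation/explain
{
  "index": "products",
  "shard": 0,
  "primary": false
}

On a single-node cluster, the response should include a same_shard decision saying that a copy of this shard is already allocated to the only node, which is exactly the constraint described above.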

What YELLOW Cluster Status Actually Means

  • Primary shards are assigned and working
  • Replica shards are missing or unassigned
  • No data loss
  • High availability is not guaranteed

This is expected behavior for a single-node cluster.

Fix Option 1: Disable Replicas (Recommended for Single Node)

If you do not need high availability (which is perfectly fine for local learning), you can set replicas to zero.

Step 1: Delete the index

DELETE products

Step 2: Recreate index with zero replicas

PUT products
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  }
}
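
Before moving on, you can optionally read the settings back to confirm the replica count:

GET products/_settings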

Step 3: Check cluster health again

GET _cluster/health

Cluster status becomes GREEN

Why Is the Cluster GREEN Now?

  • We have one primary shard
  • That shard is assigned and running
  • There are no replica shards expected
  • Therefore, Elasticsearch has no unmet requirements
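
As a side note, deleting and recreating the index is not the only way to reach this state. number_of_replicas is a dynamic setting, so on an existing index you can simply lower it in place with the update index settings API (shown here for the same products index):

PUT products/_settings
{
  "index": {
    "number_of_replicas": 0
  }
}

The number of primary shards, on the other hand, is fixed when an index is created, which is why the experiments in this tutorial delete and recreate the index instead.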

Experiment: Multiple Primary Shards on a Single Node

Now let’s try something interesting.

Delete the index again

DELETE products

Create index with two primary shards and no replicas

PUT products
{
  "settings": {
    "number_of_shards": 2,
    "number_of_replicas": 0
  }
}

Check cluster health

GET _cluster/health

Still GREEN

Check shard allocation

GET _cat/shards/products?v

You will see:

  • Shard 0 → Primary
  • Shard 1 → Primary
  • Both on the same node

Is This Allowed? Yes — and Here Is Why

  • Replica shards exist for high availability, so a replica must be placed on a different node than its primary.
  • Primary shards are about splitting and scaling the data, not about availability.
  • Nothing forbids multiple primary shards of the same index from sharing a node, so Elasticsearch places both primaries on the single node here; with more nodes available, it would spread them out.

This is completely valid behavior.

Why Multiple Primary Shards Can Help

  • Elasticsearch can index documents in parallel across multiple primary shards (a quick way to see this in action is shown just after this list).
  • Even on the same node, this can improve throughput in some scenarios.
  • This is more about CPU parallelism, not data distribution.
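
If you are curious whether documents really reach both primaries, index a few throwaway documents and check the docs column again. Elasticsearch routes each document to a shard based on a hash of its _id, so with a handful of documents both shards will usually receive some, although the split will rarely be perfectly even. The document bodies below are just placeholders:

POST products/_doc?refresh=true
{
  "name": "test product 1"
}

POST products/_doc?refresh=true
{
  "name": "test product 2"
}

POST products/_doc?refresh=true
{
  "name": "test product 3"
}

GET _cat/shards/products?v

The refresh=true parameter simply makes the new documents visible in the shard statistics right away.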

⚠️ Important warning
Do not assume that more shards automatically mean better performance.

Performance must be measured, not guessed.

Creating 10 shards on one node does not guarantee 10× performance improvement.

Common Beginner Questions

  • How many primary shards should I use?
  • How many replica shards do I really need?
  • When should I increase or decrease shard count?

These are important capacity-planning questions, and they depend heavily on:

  • Data size
  • Query patterns
  • Hardware
  • Traffic characteristics