Learnitweb

How HashMap Internally Works in Java?

1. Key Components of HashMap

1.1 Array of Buckets

A HashMap uses an internal array, often referred to as a bucket array, to store the key-value mappings. Each element in this array is called a bucket, and each bucket can hold multiple entries due to collisions.

  • The array is of type Node<K, V>[], where each node holds a key-value pair along with the hash and a pointer to the next node.
  • When a key-value pair is inserted, it is hashed and the result determines the index (bucket) in the array where the entry will go.
  • Initially, the default array capacity is 16. This can grow dynamically if more entries are added.

This array is the backbone of the HashMap.

1.2 Node<K,V> Class

Every key-value pair inserted into the map is wrapped inside a node object. Here’s the simplified source code:

static class Node<K, V> implements Map.Entry<K, V> {
    final int hash;
    final K key;
    V value;
    Node<K, V> next;
}
  • hash: Precomputed hash code of the key for quick comparisons.
  • key: The actual key object.
  • value: The corresponding value associated with the key.
  • next: A reference to the next node in the bucket (used for chaining in case of collisions).

This Node class allows HashMap to implement separate chaining to handle collisions.

2. Insertion Workflow (put Method)

2.1 Computing Hash

Before a key can be inserted, a hash code must be calculated. Java calls key.hashCode() and then applies an additional transformation:

int h = key.hashCode();
int hash = (h) ^ (h >>> 16);
  • This XOR operation helps in evenly distributing hash codes across the bucket array and reduces clustering.
  • The goal is to reduce the chance of collisions by mixing the high-order and low-order bits of the hash code.

This process is known as hash spreading.

2.2 Determining Bucket Index

Once the hash code is generated, HashMap determines the array index (bucket) using:

index = (n - 1) & hash;
  • Here, n is the length of the bucket array.
  • The bitwise AND operation ensures that the result is within bounds (0 to n-1) and is faster than mod (%) operator.
  • This also works best when n is a power of 2 (which is always true in HashMap).

2.3 Handling Insertion in Bucket

There are 3 sub-cases to handle when placing the key-value pair in the selected bucket.

Case 1: Empty Bucket

If no entry exists at the computed index, a new Node is simply created and placed there:

table[index] = newNode(hash, key, value, null);

This is the most efficient scenario: no traversal, no collision.

Case 2: Key Already Exists (Update)

If the key already exists in the same bucket (determined using equals()):

  • The value is updated.
  • No new node is added.
  • HashMap returns the old value.
if (existingNode.key.equals(key)) {
    existingNode.value = newValue;
}

Case 3: Collision Handling

If another entry already exists in the bucket but has a different key:

  • HashMap will traverse the list (next references) or tree if it was previously treeified.
  • It appends the new node at the end if key is unique.
  • This is where performance can degrade if collisions are frequent and the chain is long.

3. Collision Handling in HashMap

3.1 What is a Collision?

A collision occurs when two different keys have the same bucket index after hashing. This is very common when:

  • Keys have poor hashCode() implementations.
  • The size of the bucket array is too small.

3.2 Linked List Chaining (Pre Java 8 & Default)

Initially, HashMap handles collisions using separate chaining, where each bucket contains a linked list of nodes.

  • New entries are added at the end of the list.
  • Time complexity becomes O(n) for lookup or insert if the chain is long.

This is acceptable for small maps but inefficient for large datasets.

3.3 Tree-Based Buckets (Java 8+)

From Java 8 onwards, if a single bucket’s chain exceeds 8 nodes and the bucket array has at least 64 entries, the linked list is converted into a Red-Black Tree.

  • Tree operations like insert, search, and delete are O(log n).
  • This greatly improves performance in cases with high collisions (e.g., DoS attacks using hash collisions).

When entries drop below 6, the bucket is converted back to a linked list.


4. Resizing the HashMap

4.1 Load Factor

The load factor defines when the HashMap should resize (grow).

loadFactor = number_of_entries / capacity
  • Default load factor = 0.75
  • Resize threshold = capacity × load factor

This balances time and space complexity.

4.2 Resize Trigger

When the number of entries exceeds the threshold:

  • A new array with double the capacity is created.
  • All old entries are rehashed and moved to the new array.
  • This is a time-consuming process (O(n)), especially if the map is large.

Java 8 optimized this by avoiding unnecessary object creation during transfer.

5. Retrieval Workflow (get Method)

To retrieve a value using a key, HashMap does:

  1. Compute the hash for the key.
  2. Use (n - 1) & hash to get the bucket index.
  3. Access the corresponding bucket:
    • If it’s a linked list, iterate over the nodes using equals() to find the key.
    • If it’s a tree, perform a binary search based on the key’s compareTo method (or hash order).
  4. Return the associated value.

Time complexity is O(1) in ideal cases, O(n) in worst-case linked list scenario, or O(log n) in tree mode.

6. Important Thresholds and Parameters

ParameterDefault / ValuePurpose
Initial Capacity16Initial size of the bucket array
Load Factor0.75Determines when to resize
Treeify Threshold8Converts bucket to a Red-Black Tree
Untreeify Threshold6Converts tree back to linked list
Min Treeify Capacity64Treeify only if capacity ≥ 64

These thresholds help balance performance and memory consumption.

7. Handling Null Keys and Values

HashMap has special handling for null:

  • Null keys are allowed, but only one null key.
  • All null keys are mapped to bucket index 0.
  • Multiple null values are allowed.
map.put(null, "value"); // valid
map.put("key", null);   // valid

This is unlike Hashtable, which does not allow null keys or values.

8. Time Complexity Summary

OperationBest CaseAverage CaseWorst Case (pre-Java 8)Worst Case (Java 8+)
put()O(1)O(1)O(n)O(log n)
get()O(1)O(1)O(n)O(log n)
remove()O(1)O(1)O(n)O(log n)
  • Average time complexity is O(1) when hash distribution is good.
  • In adversarial conditions or with poor hash functions, it can degrade.

9. Thread Safety

  • HashMap is not thread-safe.
  • Concurrent modifications from multiple threads can cause:
    • Infinite loops
    • Lost updates
    • Data inconsistency

To make it thread-safe:

  1. Use Collections.synchronizedMap(new HashMap<>())
  2. Use ConcurrentHashMap for high concurrency support and better scalability.

10. Java 8+ Improvements

EnhancementDescription
Red-Black TreeFaster performance for buckets with many collisions.
Better Hash FunctionImproves distribution and reduces clustering.
Resizing OptimizationReduces unnecessary node creation and improves transfer speed.

These changes made HashMap more robust and performant under high load.

11. Common Pitfalls to Avoid

  • Overriding equals() but not hashCode() leads to bugs.
  • Using mutable keys can break retrieval (hash changes after insertion).
  • Ignoring resize cost can hurt performance.

12. Simplified Version of put() Method

public V put(K key, V value) {
    if (table == null || size > threshold) {
        resize();
    }

    int hash = hash(key);
    int index = (table.length - 1) & hash;

    for (Node<K, V> e = table[index]; e != null; e = e.next) {
        if (e.hash == hash && (e.key == key || key.equals(e.key))) {
            V oldValue = e.value;
            e.value = value;
            return oldValue;
        }
    }

    addNode(hash, key, value, index);
    return null;
}

13. Summary Table

FeatureDescription
Data StructureArray + Linked List + Red-Black Tree
Collision HandlingSeparate chaining (Linked list → Tree)
Load FactorTriggers resize when 75% full
Thread-SafetyNot thread-safe
Null Handling1 null key allowed, multiple null values
PerformanceO(1) avg case, O(log n) worst (Java 8+)