1. Key Components of HashMap
1.1 Array of Buckets
A HashMap
uses an internal array, often referred to as a bucket array, to store the key-value mappings. Each element in this array is called a bucket, and each bucket can hold multiple entries due to collisions.
- The array is of type
Node<K, V>[]
, where each node holds a key-value pair along with the hash and a pointer to the next node. - When a key-value pair is inserted, it is hashed and the result determines the index (bucket) in the array where the entry will go.
- Initially, the default array capacity is 16. This can grow dynamically if more entries are added.
This array is the backbone of the HashMap.
1.2 Node<K,V> Class
Every key-value pair inserted into the map is wrapped inside a node object. Here’s the simplified source code:
static class Node<K, V> implements Map.Entry<K, V> { final int hash; final K key; V value; Node<K, V> next; }
- hash: Precomputed hash code of the key for quick comparisons.
- key: The actual key object.
- value: The corresponding value associated with the key.
- next: A reference to the next node in the bucket (used for chaining in case of collisions).
This Node
class allows HashMap to implement separate chaining to handle collisions.
2. Insertion Workflow (put Method)
2.1 Computing Hash
Before a key can be inserted, a hash code must be calculated. Java calls key.hashCode()
and then applies an additional transformation:
int h = key.hashCode(); int hash = (h) ^ (h >>> 16);
- This XOR operation helps in evenly distributing hash codes across the bucket array and reduces clustering.
- The goal is to reduce the chance of collisions by mixing the high-order and low-order bits of the hash code.
This process is known as hash spreading.
2.2 Determining Bucket Index
Once the hash code is generated, HashMap determines the array index (bucket) using:
index = (n - 1) & hash;
- Here,
n
is the length of the bucket array. - The bitwise AND operation ensures that the result is within bounds (
0
ton-1
) and is faster thanmod
(%
) operator. - This also works best when
n
is a power of 2 (which is always true in HashMap).
2.3 Handling Insertion in Bucket
There are 3 sub-cases to handle when placing the key-value pair in the selected bucket.
Case 1: Empty Bucket
If no entry exists at the computed index, a new Node
is simply created and placed there:
table[index] = newNode(hash, key, value, null);
This is the most efficient scenario: no traversal, no collision.
Case 2: Key Already Exists (Update)
If the key already exists in the same bucket (determined using equals()
):
- The value is updated.
- No new node is added.
- HashMap returns the old value.
if (existingNode.key.equals(key)) { existingNode.value = newValue; }
Case 3: Collision Handling
If another entry already exists in the bucket but has a different key:
- HashMap will traverse the list (
next
references) or tree if it was previously treeified. - It appends the new node at the end if key is unique.
- This is where performance can degrade if collisions are frequent and the chain is long.
3. Collision Handling in HashMap
3.1 What is a Collision?
A collision occurs when two different keys have the same bucket index after hashing. This is very common when:
- Keys have poor
hashCode()
implementations. - The size of the bucket array is too small.
3.2 Linked List Chaining (Pre Java 8 & Default)
Initially, HashMap handles collisions using separate chaining, where each bucket contains a linked list of nodes.
- New entries are added at the end of the list.
- Time complexity becomes O(n) for lookup or insert if the chain is long.
This is acceptable for small maps but inefficient for large datasets.
3.3 Tree-Based Buckets (Java 8+)
From Java 8 onwards, if a single bucket’s chain exceeds 8 nodes and the bucket array has at least 64 entries, the linked list is converted into a Red-Black Tree.
- Tree operations like insert, search, and delete are O(log n).
- This greatly improves performance in cases with high collisions (e.g., DoS attacks using hash collisions).
When entries drop below 6, the bucket is converted back to a linked list.
4. Resizing the HashMap
4.1 Load Factor
The load factor defines when the HashMap should resize (grow).
loadFactor = number_of_entries / capacity
- Default load factor = 0.75
- Resize threshold = capacity × load factor
This balances time and space complexity.
4.2 Resize Trigger
When the number of entries exceeds the threshold:
- A new array with double the capacity is created.
- All old entries are rehashed and moved to the new array.
- This is a time-consuming process (O(n)), especially if the map is large.
Java 8 optimized this by avoiding unnecessary object creation during transfer.
5. Retrieval Workflow (get Method)
To retrieve a value using a key, HashMap does:
- Compute the hash for the key.
- Use
(n - 1) & hash
to get the bucket index. - Access the corresponding bucket:
- If it’s a linked list, iterate over the nodes using
equals()
to find the key. - If it’s a tree, perform a binary search based on the key’s
compareTo
method (or hash order).
- If it’s a linked list, iterate over the nodes using
- Return the associated value.
Time complexity is O(1) in ideal cases, O(n) in worst-case linked list scenario, or O(log n) in tree mode.
6. Important Thresholds and Parameters
Parameter | Default / Value | Purpose |
---|---|---|
Initial Capacity | 16 | Initial size of the bucket array |
Load Factor | 0.75 | Determines when to resize |
Treeify Threshold | 8 | Converts bucket to a Red-Black Tree |
Untreeify Threshold | 6 | Converts tree back to linked list |
Min Treeify Capacity | 64 | Treeify only if capacity ≥ 64 |
These thresholds help balance performance and memory consumption.
7. Handling Null Keys and Values
HashMap has special handling for null
:
- Null keys are allowed, but only one null key.
- All null keys are mapped to bucket index 0.
- Multiple null values are allowed.
map.put(null, "value"); // valid map.put("key", null); // valid
This is unlike Hashtable
, which does not allow null keys or values.
8. Time Complexity Summary
Operation | Best Case | Average Case | Worst Case (pre-Java 8) | Worst Case (Java 8+) |
---|---|---|---|---|
put() | O(1) | O(1) | O(n) | O(log n) |
get() | O(1) | O(1) | O(n) | O(log n) |
remove() | O(1) | O(1) | O(n) | O(log n) |
- Average time complexity is O(1) when hash distribution is good.
- In adversarial conditions or with poor hash functions, it can degrade.
9. Thread Safety
HashMap
is not thread-safe.- Concurrent modifications from multiple threads can cause:
- Infinite loops
- Lost updates
- Data inconsistency
To make it thread-safe:
- Use
Collections.synchronizedMap(new HashMap<>())
- Use
ConcurrentHashMap
for high concurrency support and better scalability.
10. Java 8+ Improvements
Enhancement | Description |
---|---|
Red-Black Tree | Faster performance for buckets with many collisions. |
Better Hash Function | Improves distribution and reduces clustering. |
Resizing Optimization | Reduces unnecessary node creation and improves transfer speed. |
These changes made HashMap
more robust and performant under high load.
11. Common Pitfalls to Avoid
- Overriding
equals()
but nothashCode()
leads to bugs. - Using mutable keys can break retrieval (hash changes after insertion).
- Ignoring resize cost can hurt performance.
12. Simplified Version of put()
Method
public V put(K key, V value) { if (table == null || size > threshold) { resize(); } int hash = hash(key); int index = (table.length - 1) & hash; for (Node<K, V> e = table[index]; e != null; e = e.next) { if (e.hash == hash && (e.key == key || key.equals(e.key))) { V oldValue = e.value; e.value = value; return oldValue; } } addNode(hash, key, value, index); return null; }
13. Summary Table
Feature | Description |
---|---|
Data Structure | Array + Linked List + Red-Black Tree |
Collision Handling | Separate chaining (Linked list → Tree) |
Load Factor | Triggers resize when 75% full |
Thread-Safety | Not thread-safe |
Null Handling | 1 null key allowed, multiple null values |
Performance | O(1) avg case, O(log n) worst (Java 8+) |