Learnitweb

Two Heaps Coding Pattern

1. Introduction

The Two Heaps pattern is a powerful approach used for solving problems that require efficient retrieval of extreme elements (like the minimum or maximum) in a dynamic dataset, often in real-time.

The idea is to use two heaps (priority queues):

  1. Max-heap – Maintains the smaller half of the data.
    • Top element is the maximum of the smaller half.
  2. Min-heap – Maintains the larger half of the data.
    • Top element is the minimum of the larger half.

By carefully balancing the two heaps, we can efficiently solve problems like:

  • Finding the median of a stream
  • Sliding window maximum/minimum
  • Kth largest/smallest element in a dynamic dataset
  • Median maintenance in online algorithms

2. When to Use the Two Heaps Pattern

Use the Two Heaps approach when:

  1. You need frequent retrieval of min/max elements.
  2. The dataset is dynamic, i.e., numbers are added or removed over time.
  3. You need efficient insertion and extraction of extreme elements.
  4. Balancing two halves of data is required (like median).

Problem Indicators:

  • “Find median of stream of numbers”
  • “Find running median”
  • “Sliding window median”
  • “Kth largest/smallest element in a changing array”

3. Core Idea

The pattern works by maintaining two heaps:

  1. Max-heap (left half)
    • Stores the smaller half of the numbers.
    • The root contains the largest element of the left half.
  2. Min-heap (right half)
    • Stores the larger half of the numbers.
    • The root contains the smallest element of the right half.

Rules:

  • The max-heap can have at most one more element than the min-heap.
  • The top of the heaps gives direct access to the median:
    • If the heaps have equal size → median = average of top elements
    • If max-heap larger → median = top of max-heap

4. Step-by-Step Example: Median of a Stream

Stream: 5, 15, 1, 3

Step 1: Initialize heaps

  • Max-heap: empty
  • Min-heap: empty

Step 2: Add 5

  • Max-heap is empty → add to max-heap
  • Max-heap: [5]
  • Min-heap: []
  • Median: 5

Step 3: Add 15

  • 15 > 5 → add to min-heap
  • Max-heap: [5]
  • Min-heap: [15]
  • Median: (5 + 15)/2 = 10

Step 4: Add 1

  • 1 < 5 → add to max-heap
  • Max-heap: [5, 1]
  • Min-heap: [15]
  • Median: top of max-heap = 5

Step 5: Add 3

  • 3 < 5 → add to max-heap → [5, 1, 3]
  • Balance heaps → move top of max-heap (5) to min-heap
  • Max-heap: [3, 1]
  • Min-heap: [5, 15]
  • Median: (3 + 5)/2 = 4

Observation: By keeping heaps balanced, median is efficiently retrieved in O(1) time.


5. Java Implementation: Median of a Stream

import java.util.*;

public class MedianFinder {
    private PriorityQueue<Integer> maxHeap; // smaller half
    private PriorityQueue<Integer> minHeap; // larger half

    public MedianFinder() {
        maxHeap = new PriorityQueue<>(Collections.reverseOrder());
        minHeap = new PriorityQueue<>();
    }

    public void addNum(int num) {
        if (maxHeap.isEmpty() || num <= maxHeap.peek()) {
            maxHeap.offer(num);
        } else {
            minHeap.offer(num);
        }

        // Balance heaps
        if (maxHeap.size() > minHeap.size() + 1) {
            minHeap.offer(maxHeap.poll());
        } else if (minHeap.size() > maxHeap.size()) {
            maxHeap.offer(minHeap.poll());
        }
    }

    public double findMedian() {
        if (maxHeap.size() == minHeap.size()) {
            return (maxHeap.peek() + minHeap.peek()) / 2.0;
        } else {
            return maxHeap.peek();
        }
    }

    public static void main(String[] args) {
        MedianFinder mf = new MedianFinder();
        mf.addNum(5);
        System.out.println(mf.findMedian()); // 5
        mf.addNum(15);
        System.out.println(mf.findMedian()); // 10
        mf.addNum(1);
        System.out.println(mf.findMedian()); // 5
        mf.addNum(3);
        System.out.println(mf.findMedian()); // 4
    }
}

Time Complexity:

  • Adding a number → O(log n) for insertion and balancing
  • Finding median → O(1)

Space Complexity: O(n) for storing all numbers


6. Other Applications of Two Heaps

  1. Sliding Window Median
    • Maintain two heaps for the current window.
    • Efficiently add/remove elements as the window slides.
  2. Kth Largest/Smallest Element in Dynamic Data
    • Use min-heap of size k for Kth largest element
    • Use max-heap of size k for Kth smallest element
  3. Median Maintenance in Streaming Data
    • Useful in online algorithms where numbers are continuously arriving.
  4. Dynamic Order Statistics
    • Two heaps maintain approximate order without fully sorting the dataset.

7. Why Two Heaps Are Efficient

  • Direct access to extreme values (max or min) via heap root
  • Efficient insertion and extraction: O(log n)
  • Balancing ensures quick median retrieval without full sorting
  • Works well for online algorithms where data changes dynamically

8. Stepwise Tips

  1. Decide which heap stores smaller half (max-heap) and which stores larger half (min-heap).
  2. Always insert into appropriate heap based on value.
  3. After each insertion, balance the heaps so the size difference is ≤ 1.
  4. To find median:
    • If heaps equal → median = average of top elements
    • If max-heap larger → median = max-heap top
  5. For sliding window problems, remove outdated elements from heaps efficiently (often using a delayed removal map).

9. Edge Cases to Consider

  • Stream with only one number → median is that number
  • Negative numbers → heap ordering works with default comparators
  • All numbers identical → median is that repeated number
  • Large streams → always balance heaps to prevent size overflow

10. Key Takeaways

  • Two Heaps pattern is ideal for problems requiring dynamic min/max or median computation.
  • Max-heap stores the smaller half, min-heap stores the larger half.
  • Heaps must be balanced for correct median retrieval.
  • Supports O(log n) insertion and O(1) median retrieval.
  • Widely used in competitive programming, real-time analytics, and online algorithms.