Learnitweb

Top K Elements Coding Pattern

1. Introduction

The Top K Elements pattern is a common coding pattern used to find the largest or smallest K elements from a dataset, array, or stream.

This pattern is widely used in:

  • Competitive programming
  • Data analysis
  • Real-time streaming data
  • Machine learning feature selection

Problem Statement:

Given an array or dataset, find the K largest or K smallest elements efficiently.


2. When to Use the Top K Pattern

Use this pattern when:

  1. You need to retrieve a subset of extreme elements (top/bottom K) from an array or stream.
  2. The input dataset is too large to sort entirely.
  3. You are dealing with streaming data and want online Top K computation.
  4. The problem requires priority-based selection, like most frequent elements.

Problem Indicators:

  • “Find K largest or smallest elements”
  • “Find K most frequent elements”
  • “Median in a stream”
  • “Sliding window top K”

3. Core Idea

The main idea is to use a heap (priority queue) to keep track of the Top K elements:

  1. Min-heap for K largest elements:
    • Maintains the smallest of the top K elements at the root.
    • If a new number is larger than the root, replace root → ensures K largest are always in the heap.
  2. Max-heap for K smallest elements:
    • Maintains the largest of the top K smallest elements at the root.
    • If a new number is smaller than the root, replace root → ensures K smallest are always in the heap.

4. Step-by-Step Example: Top 3 Largest Elements

Input: [3, 1, 5, 12, 2, 11], K = 3

Approach using Min-Heap:

  1. Initialize a min-heap of size K.
  2. Iterate over the array:
    • Add first K elements: [3, 1, 5] → heap becomes [1, 3, 5]
    • Next element 12 > heap root (1) → remove 1, add 12 → heap [3, 12, 5]
    • Next element 2 < heap root (3) → ignore
    • Next element 11 > heap root (3) → remove 3, add 11 → heap [5, 12, 11]

Result: Heap contains [5, 12, 11] → Top 3 largest elements


5. Java Implementation: Top K Largest Elements

import java.util.*;

public class TopKElements {
    public static int[] topKLargest(int[] nums, int k) {
        PriorityQueue<Integer> minHeap = new PriorityQueue<>();
        
        for (int num : nums) {
            minHeap.offer(num);
            if (minHeap.size() > k) {
                minHeap.poll(); // remove smallest
            }
        }

        int[] result = new int[k];
        int i = 0;
        for (int num : minHeap) {
            result[i++] = num;
        }
        return result;
    }

    public static void main(String[] args) {
        int[] nums = {3, 1, 5, 12, 2, 11};
        int k = 3;
        System.out.println(Arrays.toString(topKLargest(nums, k)));
    }
}

Time Complexity: O(n log k) – Each insertion/removal in heap of size K is O(log k)
Space Complexity: O(k) – Heap stores K elements


6. Variants of Top K Problems

  1. Top K Smallest Elements
    • Use max-heap to maintain K smallest elements.
    • Replace root if a smaller element is encountered.
  2. Top K Frequent Elements
    • Count frequencies using HashMap
    • Use min-heap of size K to keep K most frequent elements
  3. Median / Top K Median
    • Use two heaps (max-heap + min-heap) to maintain Top K or median dynamically
  4. Sliding Window Top K
    • Maintain a heap for each window or use TreeMap for O(log K) insertion/removal

7. Top K Frequent Elements – Example

Problem: Find top 2 frequent elements in [1,1,1,2,2,3].

public int[] topKFrequent(int[] nums, int k) {
    Map<Integer, Integer> freq = new HashMap<>();
    for (int num : nums) freq.put(num, freq.getOrDefault(num, 0) + 1);

    PriorityQueue<Integer> minHeap = new PriorityQueue<>(
        (a, b) -> freq.get(a) - freq.get(b)
    );

    for (int num : freq.keySet()) {
        minHeap.offer(num);
        if (minHeap.size() > k) minHeap.poll();
    }

    int[] result = new int[k];
    int i = 0;
    while (!minHeap.isEmpty()) result[i++] = minHeap.poll();
    return result;
}

8. Tips and Observations

  1. Use min-heap for largest K elements, max-heap for smallest K.
  2. For frequency-based Top K, maintain heap based on frequency.
  3. For streaming data, always maintain a heap of size K → O(log K) per insertion.
  4. Sorting entire array → O(n log n), heap approach → O(n log k) → better when k << n.
  5. Use TreeMap or multiset for sliding window Top K for efficient removal.

9. Common Edge Cases

  • K = 0 → return empty array
  • K ≥ array length → return the entire array
  • Negative numbers → heap works as usual
  • Duplicate elements → include duplicates in Top K unless otherwise specified

10. Key Takeaways

  1. The Top K pattern is a heap-based optimization to find K largest/smallest elements efficiently.
  2. Min-heap keeps track of K largest; max-heap keeps track of K smallest.
  3. Heap size is always maintained at K to optimize time and space.
  4. Variants include:
    • Top K frequent elements
    • Sliding window Top K
    • Streaming Top K elements
  5. Time Complexity: O(n log K)
  6. Space Complexity: O(K)