1. Introduction
The Top K Elements pattern is a common coding pattern used to find the largest or smallest K elements from a dataset, array, or stream.
This pattern is widely used in:
- Competitive programming
- Data analysis
- Real-time streaming data
- Machine learning feature selection
Problem Statement:
Given an array or dataset, find the K largest or K smallest elements efficiently.
2. When to Use the Top K Pattern
Use this pattern when:
- You need to retrieve a subset of extreme elements (top/bottom K) from an array or stream.
- The input dataset is too large to sort entirely.
- You are dealing with streaming data and want online Top K computation.
- The problem requires priority-based selection, like most frequent elements.
Problem Indicators:
- “Find K largest or smallest elements”
- “Find K most frequent elements”
- “Median in a stream”
- “Sliding window top K”
3. Core Idea
The main idea is to use a heap (priority queue) to keep track of the Top K elements:
- Min-heap for K largest elements:
- Maintains the smallest of the top K elements at the root.
- If a new number is larger than the root, replace root → ensures K largest are always in the heap.
- Max-heap for K smallest elements:
- Maintains the largest of the top K smallest elements at the root.
- If a new number is smaller than the root, replace root → ensures K smallest are always in the heap.
4. Step-by-Step Example: Top 3 Largest Elements
Input: [3, 1, 5, 12, 2, 11]
, K = 3
Approach using Min-Heap:
- Initialize a min-heap of size K.
- Iterate over the array:
- Add first K elements:
[3, 1, 5]
→ heap becomes[1, 3, 5]
- Next element 12 > heap root (1) → remove 1, add 12 → heap
[3, 12, 5]
- Next element 2 < heap root (3) → ignore
- Next element 11 > heap root (3) → remove 3, add 11 → heap
[5, 12, 11]
- Add first K elements:
Result: Heap contains [5, 12, 11]
→ Top 3 largest elements
5. Java Implementation: Top K Largest Elements
import java.util.*; public class TopKElements { public static int[] topKLargest(int[] nums, int k) { PriorityQueue<Integer> minHeap = new PriorityQueue<>(); for (int num : nums) { minHeap.offer(num); if (minHeap.size() > k) { minHeap.poll(); // remove smallest } } int[] result = new int[k]; int i = 0; for (int num : minHeap) { result[i++] = num; } return result; } public static void main(String[] args) { int[] nums = {3, 1, 5, 12, 2, 11}; int k = 3; System.out.println(Arrays.toString(topKLargest(nums, k))); } }
Time Complexity: O(n log k) – Each insertion/removal in heap of size K is O(log k)
Space Complexity: O(k) – Heap stores K elements
6. Variants of Top K Problems
- Top K Smallest Elements
- Use max-heap to maintain K smallest elements.
- Replace root if a smaller element is encountered.
- Top K Frequent Elements
- Count frequencies using HashMap
- Use min-heap of size K to keep K most frequent elements
- Median / Top K Median
- Use two heaps (max-heap + min-heap) to maintain Top K or median dynamically
- Sliding Window Top K
- Maintain a heap for each window or use TreeMap for O(log K) insertion/removal
7. Top K Frequent Elements – Example
Problem: Find top 2 frequent elements in [1,1,1,2,2,3]
.
public int[] topKFrequent(int[] nums, int k) { Map<Integer, Integer> freq = new HashMap<>(); for (int num : nums) freq.put(num, freq.getOrDefault(num, 0) + 1); PriorityQueue<Integer> minHeap = new PriorityQueue<>( (a, b) -> freq.get(a) - freq.get(b) ); for (int num : freq.keySet()) { minHeap.offer(num); if (minHeap.size() > k) minHeap.poll(); } int[] result = new int[k]; int i = 0; while (!minHeap.isEmpty()) result[i++] = minHeap.poll(); return result; }
8. Tips and Observations
- Use min-heap for largest K elements, max-heap for smallest K.
- For frequency-based Top K, maintain heap based on frequency.
- For streaming data, always maintain a heap of size K → O(log K) per insertion.
- Sorting entire array → O(n log n), heap approach → O(n log k) → better when k << n.
- Use TreeMap or multiset for sliding window Top K for efficient removal.
9. Common Edge Cases
- K = 0 → return empty array
- K ≥ array length → return the entire array
- Negative numbers → heap works as usual
- Duplicate elements → include duplicates in Top K unless otherwise specified
10. Key Takeaways
- The Top K pattern is a heap-based optimization to find K largest/smallest elements efficiently.
- Min-heap keeps track of K largest; max-heap keeps track of K smallest.
- Heap size is always maintained at K to optimize time and space.
- Variants include:
- Top K frequent elements
- Sliding window Top K
- Streaming Top K elements
- Time Complexity: O(n log K)
- Space Complexity: O(K)