Learnitweb

LeetCode Problem 451: Sort Characters By Frequency

1. Problem Overview

The “Sort Characters By Frequency” problem asks us to rearrange characters in a given string such that characters appear in descending order of their frequency (the number of times they occur).

If two characters have the same frequency, their order relative to each other does not matter.

Problem Statement (LeetCode 451)

Given a string s, sort it in decreasing order based on the frequency of the characters.
Return the sorted string.

Example 1:

Input: s = "tree"
Output: "eert"
Explanation: 'e' appears twice while 't' and 'r' appear once.

Example 2:

Input: s = "cccaaa"
Output: "cccaaa"
Explanation: 'c' and 'a' both appear 3 times, so they can appear in any order.

Example 3:

Input: s = "Aabb"
Output: "bbAa"
Explanation: 'b' appears twice, 'A' and 'a' once each.

2. Problem Breakdown

  1. We need to count how many times each character appears.
  2. Then, we must sort the characters based on these frequencies.
  3. Finally, construct the output string with characters repeated according to their frequency.

This is essentially a frequency counting and sorting problem.


3. Step-by-Step Approach

Let’s understand the logic in detail:

Step 1: Count frequency of each character

We can use a HashMap<Character, Integer> to store how many times each character appears in the string.

Example for "tree":

t → 1
r → 1
e → 2

Step 2: Sort characters based on frequency

There are two common approaches to sort based on frequency:

  1. Using a Priority Queue (Max-Heap)
    • Insert each entry (character, frequency) into a max-heap that sorts by frequency in descending order.
    • Then, repeatedly remove the top element (highest frequency) and append that character multiple times to a result.
  2. Using Bucket Sort
    • Since the maximum possible frequency is the length of the string, we can create buckets where index i contains all characters that appeared i times.
    • Then iterate from highest frequency to lowest, appending characters to the result.

We’ll cover both methods.


4. Approach 1: Using Priority Queue (Max Heap)

Algorithm Steps

  1. Build a frequency map.
  2. Add all entries to a PriorityQueue that sorts by frequency (in descending order).
  3. Poll the queue, appending each character frequency times.
  4. Return the final string.

Java Code

import java.util.*;

public class SortCharactersByFrequency {
    public String frequencySort(String s) {
        // Step 1: Count frequencies
        Map<Character, Integer> freqMap = new HashMap<>();
        for (char c : s.toCharArray()) {
            freqMap.put(c, freqMap.getOrDefault(c, 0) + 1);
        }

        // Step 2: Create a max-heap (PriorityQueue) sorted by frequency
        PriorityQueue<Map.Entry<Character, Integer>> maxHeap =
                new PriorityQueue<>((a, b) -> b.getValue() - a.getValue());

        maxHeap.addAll(freqMap.entrySet());

        // Step 3: Build the output string
        StringBuilder result = new StringBuilder();
        while (!maxHeap.isEmpty()) {
            Map.Entry<Character, Integer> entry = maxHeap.poll();
            char c = entry.getKey();
            int count = entry.getValue();

            for (int i = 0; i < count; i++) {
                result.append(c);
            }
        }

        return result.toString();
    }

    public static void main(String[] args) {
        SortCharactersByFrequency obj = new SortCharactersByFrequency();
        System.out.println(obj.frequencySort("tree"));   // eert
        System.out.println(obj.frequencySort("cccaaa")); // cccaaa
        System.out.println(obj.frequencySort("Aabb"));   // bbAa
    }
}

Complexity Analysis

  • Time Complexity:
    • Counting frequencies: O(n)
    • Building heap: O(k log k), where k is number of unique characters
    • Constructing output: O(n)
      Overall: O(n + k log k)
  • Space Complexity:
    • Frequency map: O(k)
    • Heap: O(k)
    • Output string: O(n)
      Overall: O(n)

5. Approach 2: Using Bucket Sort

This approach can be faster in practice since it avoids heap overhead.

Idea

  • Maximum frequency can be at most n (length of the string).
  • Create an array of buckets where each bucket at index i stores characters that appear i times.

Steps

  1. Count character frequencies using HashMap.
  2. Create a bucket array: List<Character>[] buckets = new List[s.length() + 1];
  3. Fill each bucket with corresponding characters.
  4. Traverse from highest frequency bucket to lowest and build result string.

Java Code

import java.util.*;

public class SortCharactersByFrequency {
    public String frequencySort(String s) {
        // Step 1: Count frequency
        Map<Character, Integer> freqMap = new HashMap<>();
        for (char c : s.toCharArray()) {
            freqMap.put(c, freqMap.getOrDefault(c, 0) + 1);
        }

        // Step 2: Create buckets
        List<Character>[] buckets = new List[s.length() + 1];
        for (char c : freqMap.keySet()) {
            int freq = freqMap.get(c);
            if (buckets[freq] == null) {
                buckets[freq] = new ArrayList<>();
            }
            buckets[freq].add(c);
        }

        // Step 3: Build the result
        StringBuilder result = new StringBuilder();
        for (int i = s.length(); i >= 0; i--) {
            if (buckets[i] != null) {
                for (char c : buckets[i]) {
                    for (int j = 0; j < i; j++) {
                        result.append(c);
                    }
                }
            }
        }

        return result.toString();
    }

    public static void main(String[] args) {
        SortCharactersByFrequency obj = new SortCharactersByFrequency();
        System.out.println(obj.frequencySort("tree"));   // eert
        System.out.println(obj.frequencySort("cccaaa")); // cccaaa
        System.out.println(obj.frequencySort("Aabb"));   // bbAa
    }
}

Complexity Analysis

  • Time Complexity:
    • Counting frequencies: O(n)
    • Bucket traversal: O(n)
      Overall: O(n)
  • Space Complexity:
    • Frequency map: O(k)
    • Buckets: O(n)
      Overall: O(n)

6. Comparison of Both Approaches

AspectPriority QueueBucket Sort
Time ComplexityO(n + k log k)O(n)
Space ComplexityO(n)O(n)
ImplementationSimpler conceptuallySlightly more manual setup
Best ForWhen k is smallWhen n is large

7. Key Takeaways

  • Always start by counting frequency using a HashMap.
  • For sorting by frequency:
    • Use a heap when the dataset is small.
    • Use bucket sort when performance is crucial.
  • The result string is built by repeating each character based on its frequency.
  • Similar logic can be extended for problems involving sorting by occurrence or rank.