1. Understanding the Problem
You have:
- An array with 1,000,000 elements (e.g.,
int[],double[], orString[]) - Sufficient main memory (RAM) to hold the array
- A need to sort efficiently
Since this array easily fits in memory (a million integers take only ~4 MB), you don’t need any external or distributed sorting approach.
All sorting can be done in-memory — which allows for O(n log n) performance and sub-second execution on modern CPUs.
2. Java Sorting Algorithms Overview
Java provides several efficient in-memory sorting methods inside the java.util.Arrays class.
Let’s discuss the key ones:
A. Arrays.sort()
This is the standard sorting method.
For primitive types (int[], double[], etc.):
- Uses Dual-Pivot Quicksort (developed by Vladimir Yaroslavskiy)
- Sorts in-place (no extra array)
- Very fast on average
- Time complexity: O(n log n)
- Space complexity: O(log n) (recursion stack)
For objects (String[], Integer[], custom objects):
- Uses Timsort
- Stable (maintains relative order of equal elements)
- Works well for partially sorted data
- Time complexity: O(n log n)
Example:
int[] arr = new Random().ints(1_000_000, 0, 1_000_000).toArray(); Arrays.sort(arr);
This will complete in a few hundred milliseconds.
B. Arrays.parallelSort()
Introduced in Java 8, this method improves performance by utilizing multiple CPU cores.
It internally:
- Splits the array into subarrays.
- Sorts them in parallel using multiple threads (via ForkJoinPool).
- Merges the sorted parts efficiently.
When to use:
- Arrays with more than 100,000 elements.
- Systems with multi-core CPUs (almost all modern machines).
- Data that fits in memory.
Advantages:
- Can be 2x–3x faster than
Arrays.sort()on large arrays. - Automatically uses available cores.
Example:
int[] arr = new Random().ints(1_000_000, 0, 1_000_000).toArray();
long start = System.currentTimeMillis();
Arrays.parallelSort(arr);
long end = System.currentTimeMillis();
System.out.println("Parallel sort time: " + (end - start) + " ms");
On a quad-core CPU, you’ll likely see something around 100–200 ms sorting time.
3. Comparison: Arrays.sort() vs Arrays.parallelSort()
| Feature | Arrays.sort() | Arrays.parallelSort() |
|---|---|---|
| Algorithm (Primitives) | Dual-Pivot Quicksort | Parallel Merge Sort |
| Multithreaded | No (single-threaded) | Yes (uses multiple cores) |
| Stability | Stable for objects | Stable for objects |
| Memory Usage | Low | Slightly higher (temporary buffers) |
| Speed (for large arrays) | Fast | Faster on multi-core CPUs |
| Recommended For | Up to ~100k elements | 1 million+ elements |
Conclusion:
For a million elements, Arrays.parallelSort() is the best choice.
4. Complete Example: Sorting a Million Records
Example with Primitive Array
import java.util.Arrays;
import java.util.Random;
public class MillionRecordSort {
public static void main(String[] args) {
// Generate 1 million random integers
int[] arr = new Random().ints(1_000_000, 0, 1_000_000).toArray();
// Sort using parallel sort
long start = System.currentTimeMillis();
Arrays.parallelSort(arr);
long end = System.currentTimeMillis();
System.out.println("Sorting completed in " + (end - start) + " ms");
// Optional: verify sorted order
boolean sorted = true;
for (int i = 1; i < arr.length; i++) {
if (arr[i - 1] > arr[i]) {
sorted = false;
break;
}
}
System.out.println("Array sorted: " + sorted);
}
}
Typical Output:
Sorting completed in 120 ms Array sorted: true
Example with Custom Objects
If you have an array of custom objects (e.g., Employee records):
class Employee {
String name;
int salary;
Employee(String name, int salary) {
this.name = name;
this.salary = salary;
}
}
public class SortEmployees {
public static void main(String[] args) {
Employee[] employees = new Employee[1_000_000];
Random random = new Random();
for (int i = 0; i < employees.length; i++) {
employees[i] = new Employee("Emp" + i, random.nextInt(100000));
}
long start = System.currentTimeMillis();
Arrays.parallelSort(employees, (a, b) -> Integer.compare(a.salary, b.salary));
long end = System.currentTimeMillis();
System.out.println("Sorting employees completed in " + (end - start) + " ms");
}
}
This sorts employees by salary using a lambda comparator.
The underlying sort algorithm here is Parallel Timsort — stable and optimized for object arrays.
5. When to Stick to Arrays.sort()
Although Arrays.parallelSort() is generally faster for large arrays, there are cases where Arrays.sort() may be better:
- Very small arrays (< 100k elements)
Overhead of multi-threading can outweigh parallel benefits. - Single-core systems
No parallel benefit available. - Limited memory environments
parallelSort()uses temporary arrays during merging.
In such cases, the regular Arrays.sort() using Dual-Pivot Quicksort performs extremely well.
6. Practical Tips
- Avoid writing your own sort algorithm.
Java’s built-in sorting functions are heavily optimized and tuned over years of testing. - Use primitives instead of wrappers (
int[]vsInteger[]) when possible.
Primitives sort faster and consume less memory. - Parallel sorting scales with CPU cores.
On 8-core systems, you’ll see substantial speed improvements. - For partially sorted data, object sorting with Timsort (
Arrays.sort(Object[])) is especially efficient. - Benchmark both
Arrays.sort()andArrays.parallelSort()on your actual dataset.
Data distribution sometimes affects which performs better.
7. Summary
| Scenario | Recommended Sort | Reason |
|---|---|---|
| Array fits in memory (like yours) | Arrays.parallelSort() | Fastest due to multi-threading |
| Single-threaded environment | Arrays.sort() | Avoids thread overhead |
| Sorting primitive data (int[], long[]) | Dual-Pivot Quicksort (default) | In-place and efficient |
| Sorting objects with custom comparator | Parallel Timsort | Stable and parallelized |
| Data is already mostly sorted | Timsort (default for objects) | Detects runs, faster |
