Learnitweb

Parallel streams in Java

1. Introduction

Parallel computing entails fragmenting a task into smaller segments, tackling these segments concurrently (in parallel, with each one operating within its own thread), and subsequently combining the outcomes of the individual segment solutions.

Implementing parallelism in Java applications that use collections is problematic because most collections are not thread-safe. Multiple threads cannot modify such a collection without risking thread interference or memory consistency errors. The Collections class provides synchronization wrappers such as synchronizedSet(Set<T> s) and synchronizedList(List<T> list), but synchronization introduces thread contention, which limits how much of the work can actually run in parallel.
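
To make the trade-off concrete, here is a minimal sketch in which two threads add to one list wrapped with Collections.synchronizedList. The class name SynchronizedWrapperDemo and the helper fillFromTwoThreads are hypothetical names used only for this illustration. The wrapper keeps the result correct, but every add() call has to acquire the same lock, so the threads spend part of their time waiting for each other.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SynchronizedWrapperDemo {

    // Hypothetical helper: two threads each add perThread elements to one shared list.
    static int fillFromTwoThreads(int perThread) {
        // The wrapper serializes each add() call, so no update is lost,
        // but both threads contend for the same lock.
        List<Integer> shared = Collections.synchronizedList(new ArrayList<>());

        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                shared.add(i);
            }
        };

        Thread first = new Thread(task);
        Thread second = new Thread(task);
        first.start();
        second.start();
        try {
            // join() makes the final size visible to the calling thread.
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for workers", e);
        }
        return shared.size();
    }

    public static void main(String[] args) {
        System.out.println(fillFromTwoThreads(10_000)); // prints 20000
    }
}
```

Replacing the wrapper with a plain ArrayList would remove the contention, but the final size could then be wrong or the program could throw an exception.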

Parallel streams were introduced in Java 8 to make use of multiple processor cores. A stream in Java is sequential unless parallel execution is explicitly requested.

When a stream executes in parallel, the Java runtime partitions the stream into multiple substreams. Aggregate operations iterate over and process these substreams in parallel and then combine the results. The order in which elements are processed, however, is not under our control. If the code depends on elements being processed in a particular order, use a sequential stream instead.
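
The effect on ordering can be seen with a small sketch. The class EncounterOrderDemo and the helper doubledInOrder are hypothetical names chosen for this example. It contrasts forEach, whose output order may differ between runs on a parallel stream, with forEachOrdered and collect, which respect the encounter order of an ordered source such as a List, usually at the cost of some parallelism.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class EncounterOrderDemo {

    // Hypothetical helper: doubles each element in parallel and collects into a list.
    // For an ordered source such as a List, collect(toList()) keeps encounter order.
    static List<Integer> doubledInOrder(List<Integer> source) {
        return source.parallelStream()
                .map(n -> n * 2)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);

        // forEach on a parallel stream: the printed order may change between runs.
        numbers.parallelStream().forEach(n -> System.out.print(n + " "));
        System.out.println();

        // forEachOrdered: elements are printed in encounter order.
        numbers.parallelStream().forEachOrdered(n -> System.out.print(n + " "));
        System.out.println();

        // The collected list is in encounter order as well.
        System.out.println(doubledInOrder(numbers));
    }
}
```

Note that only the order of the results is preserved here. The order in which the individual elements are actually processed is still decided by the runtime.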

Incorrect usage of parallelStream can lead to issues such as race conditions and deadlocks.

You can create a parallel stream in the following ways:

  • Collection.parallelStream – Returns a possibly parallel Stream with this collection as its source. It is allowable for this method to return a sequential stream.
  • BaseStream.parallel – Returns an equivalent stream that is parallel. May return itself, either because the stream was already parallel, or because the underlying stream state was modified to be parallel.
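
Both approaches can be sketched as follows. The class CreateParallelStreamDemo and its two helper methods are hypothetical names for illustration only. BaseStream.isParallel reports whether the stream would execute in parallel when a terminal operation is invoked.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

class CreateParallelStreamDemo {

    // Way 1: ask the collection for a (possibly) parallel stream.
    static boolean fromCollection(List<String> source) {
        Stream<String> stream = source.parallelStream();
        return stream.isParallel();
    }

    // Way 2: switch an existing sequential stream to parallel mode.
    static boolean fromExistingStream(List<String> source) {
        Stream<String> stream = source.stream().parallel();
        return stream.isParallel();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "b", "c");

        System.out.println(fromCollection(words));
        System.out.println(fromExistingStream(words));

        // A plain stream() is sequential by default.
        System.out.println(words.stream().isParallel());
    }
}
```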

2. Example

Let us write a program that uses a parallel stream to sum a list of integers.

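import java.util.Arrays;
import java.util.List;

class ParallelStream {
    public static void main(String[] args) {

        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

        // Unbox to an IntStream and sum; the runtime combines the partial sums.
        int sum = numbers.parallelStream()
                .mapToInt(Integer::intValue)
                .sum();

        System.out.println(sum); // prints 55
    }
}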

3. A few important points

  • High memory consumption – Parallel streams can use more memory than sequential ones, because the work is split into subtasks that run on multiple threads and hold intermediate results until they are combined.
  • Non-deterministic behavior – The order in which elements are processed, and therefore the output of order-sensitive operations such as forEach, may vary from one execution to the next.
  • Concurrency issues – Incorrect usage of parallel streams can lead to concurrency issues such as race conditions and deadlocks.
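
A common source of such race conditions is mutating a shared, non-thread-safe collection from inside a parallel pipeline. The sketch below, with the hypothetical class SharedStateDemo and hypothetical helpers evensUnsafe and evensSafe, contrasts that pattern with letting the stream collect the results itself. It is one possible way to avoid the problem, not the only one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SharedStateDemo {

    // Unsafe pattern, shown for contrast and never called here: ArrayList is not
    // thread-safe, so concurrent add() calls may lose elements or throw an exception.
    static List<Integer> evensUnsafe(int limit) {
        List<Integer> result = new ArrayList<>();
        IntStream.rangeClosed(1, limit)
                .parallel()
                .filter(n -> n % 2 == 0)
                .forEach(n -> result.add(n)); // race condition on the shared list
        return result;
    }

    // Safer pattern: no shared mutable state. The stream builds partial
    // results per subtask and merges them itself.
    static List<Integer> evensSafe(int limit) {
        return IntStream.rangeClosed(1, limit)
                .parallel()
                .filter(n -> n % 2 == 0)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(evensSafe(100_000).size()); // prints 50000
    }
}
```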

4. When should you use parallel streams?

The following are some cases in which a parallel stream is a good fit:

  • Large data sets: Parallel streams are especially advantageous for managing large data sets that can be efficiently split into smaller segments for simultaneous processing. This approach enhances the utilization of multiple cores or processors.
  • CPU-Intensive Operations: When the data operations are computationally demanding and can gain from parallel execution, utilizing parallel streams can result in enhanced performance.
  • Independent Operations: Parallel streams can be employed when the operations on the elements of a stream are independent of one another. This independence allows for increased parallelism without requiring synchronization.
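
As a rough illustration of a workload that meets all three criteria, the sketch below counts primes in a range. The class PrimeCountDemo and its methods are hypothetical names for this example. Each primality check is CPU-bound and depends only on its own argument, so no synchronization is needed. The timings it prints depend on the hardware and the JVM, and a single System.nanoTime comparison is not a reliable benchmark, so treat the numbers as indicative only.

```java
import java.util.stream.LongStream;

class PrimeCountDemo {

    // CPU-bound check that depends only on its argument (no shared state).
    static boolean isPrime(long n) {
        if (n < 2) {
            return false;
        }
        for (long d = 2; d * d <= n; d++) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }

    // Counts primes in [2, upperInclusive], sequentially or in parallel.
    static long countPrimes(long upperInclusive, boolean parallel) {
        LongStream candidates = LongStream.rangeClosed(2, upperInclusive);
        if (parallel) {
            candidates = candidates.parallel();
        }
        return candidates.filter(PrimeCountDemo::isPrime).count();
    }

    public static void main(String[] args) {
        long limit = 2_000_000;

        long t0 = System.nanoTime();
        long sequentialCount = countPrimes(limit, false);
        long t1 = System.nanoTime();
        long parallelCount = countPrimes(limit, true);
        long t2 = System.nanoTime();

        // Both counts are the same; only the elapsed time is expected to differ.
        System.out.println("sequential: " + sequentialCount + " in " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("parallel:   " + parallelCount + " in " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

For a small range, or for a cheap per-element operation, the cost of splitting the work and merging the results can outweigh the gain, which is why measuring on realistic data matters.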

5. Conclusion

Parallel streams in Java offer a robust mechanism for enhancing the performance of data processing, particularly with large data sets and CPU-intensive tasks. By utilizing multiple cores or processors, parallel streams enable concurrent execution of independent operations, optimizing resource usage. However, it is crucial to apply parallel streams wisely, as not all tasks will benefit from parallel execution, and misuse can lead to issues like contention and increased overhead.

When used correctly, parallel streams can greatly boost your application’s performance. By analyzing your data and understanding the required operations, you can determine when parallelism is appropriate for optimal results. Like any powerful tool, thoughtful consideration and thorough testing are key to ensuring that the advantages of parallel streams are fully realized without encountering significant downsides.