Fork/Join Parallelism in the Wild: Documenting Patterns and Anti-Patterns in Java Programs using the Fork/Join Framework
Parallel programming is frequently claimed to be hard, and all kinds of approaches have been proposed to tame its complexity. The Fork/Join programming style introduced with Cilk enables the parallel decomposition of problems in a recursive divide-and-conquer style, and on the surface it looks very simple, with a minimalistic approach built on just two language constructs: fork and join. But is it actually simple to use? To find out, Mattias started to dig through the Java open source projects on GitHub and tried to identify common patterns. Next week, he will present our findings at PPPJ’14.
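To make the fork and join constructs concrete, here is a minimal sketch of how they look in Java's Fork/Join framework. The class names `ForkJoinFib` and `Fib` are made up for illustration and are not taken from the paper or the corpus. Naive Fibonacci is used only because it is the classic divide-and-conquer toy example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ForkJoinFib {
    // Hypothetical example task: naive recursive Fibonacci.
    static class Fib extends RecursiveTask<Long> {
        private final int n;

        Fib(int n) { this.n = n; }

        @Override
        protected Long compute() {
            if (n < 2) {
                return (long) n;          // base case, no further decomposition
            }
            Fib left = new Fib(n - 1);
            Fib right = new Fib(n - 2);
            left.fork();                  // fork: schedule subtask asynchronously
            right.fork();
            // join: wait for the results; joined in reverse order of forking
            return right.join() + left.join();
        }
    }

    static long fib(int n) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new Fib(n));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(fib(20));
    }
}
```

This version forks a task for every recursive call, all the way down to the base case, which is simple to write but creates far more tasks than there is useful work per task.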
The preprint of the paper is available below. Additionally, Mattias made the information on the corpus and how to obtain it available.
Abstract
Now that multicore processors are commonplace, developing parallel software has escaped the confines of high-performance computing and enters the mainstream. The Fork/Join framework, for instance, is part of the standard Java platform since version 7. Fork/Join is a high-level parallel programming model advocated to make parallelizing recursive divide-and-conquer algorithms particularly easy. While, in theory, Fork/Join is a simple and effective technique to expose parallelism in applications, it has not been investigated before whether and how the technique is applied in practice. We therefore performed an empirical study on a corpus of 120 open source Java projects that use the framework for roughly 362 different tasks.
On the one hand, we confirm the frequent use of four best-practice patterns (Sequential Cutoff, Linked Subtasks, Leaf Tasks, and avoiding unnecessary forking) in actual projects. On the other hand, we also discovered three recurring anti-patterns that potentially limit parallel performance: sub-optimal use of Java collections when splitting tasks into subtasks as well as when merging the results of subtasks, and finally the inappropriate sharing of resources between tasks. We document these anti-patterns and study their impact on performance.
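As a rough illustration of what some of these patterns can look like in code, the sketch below sums an array. It reflects one plausible reading of the pattern names and is not code from the paper or the corpus. The names `CutoffSum`, `SumTask`, and the `CUTOFF` value are assumptions made for this example. It switches to a sequential loop below a threshold (Sequential Cutoff), forks only one of the two subtasks and computes the other in the current thread (avoiding unnecessary forking), and splits by index range instead of copying sub-collections.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class CutoffSum {
    // Assumed threshold for illustration; a real value would need tuning.
    static final int CUTOFF = 1_000;

    static class SumTask extends RecursiveTask<Long> {
        private final int[] data;   // only read, never modified by the tasks
        private final int lo;       // inclusive
        private final int hi;       // exclusive

        SumTask(int[] data, int lo, int hi) {
            this.data = data;
            this.lo = lo;
            this.hi = hi;
        }

        @Override
        protected Long compute() {
            // Sequential cutoff: small ranges are summed directly.
            if (hi - lo <= CUTOFF) {
                long sum = 0;
                for (int i = lo; i < hi; i++) {
                    sum += data[i];
                }
                return sum;
            }
            // Split by index range; no elements are copied.
            int mid = lo + (hi - lo) / 2;
            SumTask left = new SumTask(data, lo, mid);
            SumTask right = new SumTask(data, mid, hi);
            left.fork();                       // run left half asynchronously
            long rightSum = right.compute();   // run right half in this thread
            return left.join() + rightSum;
        }
    }

    static long sum(int[] data) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new SumTask(data, 0, data.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = new int[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        System.out.println(sum(data));
    }
}
```

Passing the same array with different index bounds keeps the splitting step cheap. Copying each half into a new collection before creating the subtasks would add allocation and copying work at every level of the recursion.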
- Fork/Join Parallelism in the Wild: Documenting Patterns and Anti-Patterns in Java Programs using the Fork/Join Framework; Mattias De Wael, Stefan Marr, Tom Van Cutsem; in ‘Proceedings of the 2014 International Conference on Principles and Practices of Programming on the Java Platform: Virtual Machines, Languages, and Tools’, pp. 39-50.
- Paper: PDF
- BibTeX: BibSonomy
- Corpus and additional material: online appendix