1 Introduction and Background
With the start of the multicore era, concurrent programming became more and more important. Unfortunately, it also complicates the software development significantly. Common concurrency problems include correctness issues such as data races and deadlocks as well as performance issues caused by resource contention and sequential bottlenecks. To avoid these problems, a wide variety of different concurrency models have been proposed.
Van Roy and Haridi  categorize the main models into declarative, message-passing, and shared-state concurrency. Shared-state concurrency includes, e. g., software transactional memory (STM) [Shavit and Touitou, 1995] and thread-based models. Message-passing concurrency includes, e. g., the actor model [Agha, 1986] and communicating sequential processes [Hoare, 1978]. In the category of declarative concurrency fall approaches such as various kinds of data flow programming [Johnston et al., 2004] and more recent reactive programming approaches [Bainomugisha et al., 2013].
Each of these concurrency models solves its own set of problems. Consequently, applications started to combine a variety of concurrency models to address different functional and performance requirements. A recent study of Scala programs [Tasharofi et al., 2013] investigated why developers use thread-based abstractions in actor-based applications. While the actor model avoids race conditions by design, because actors cannot share mutable state, this property can severely affect performance. Thus, developers combine it with threads. In addition, some problems are expressed more naturally by defining computations on shared memory. Unfortunately, combining concurrency models can lead to unexpected bugs, e. g., race conditions or deadlocks [Swalens et al., 2014].
While the use of multiple concurrency models in the same application has been recognized as an issue [Haller, 2015, Swalens et al., 2014, Tasharofi et al., 2013], the question of how to engineer and debug such systems has not been approached. Yet software tools are an integral part of application development since they help developers to cope with the complexity of software. On the one hand, tools such as inspectors, debuggers, and profilers need to be able to expose the high-level concurrency concepts such as transactions or asynchronous messages to the developers. On the other hand, tools need to overcome the fundamental performance overhead of collecting the runtime information on data access patterns, data races, conflicting updates in transactions, or actors that are sequential bottlenecks. To aid the development of complex concurrent applications, high-level debuggers are sorely needed to help programmers to find errors as well as to achieve a better understanding of the dynamic application behavior.
2 A Meta-Level Interface to Enable Tooling of Complex Concurrent Systems
To bring tooling to the wide range of concurrency models, we envision a meta-level interface that allows tools to connect to the language implementation and observe program execution as well as interact with it by using a common set of basic concepts that underlie a wide range of concurrency abstractions. This idea is reminiscent to classic metaobject protocols (MOP) [Kiczales et al., 1991], which reify the concepts of a language and its implementation to enable tooling as well as dynamic adaptation from within a program. This idea can also be seen in widely used VM tooling interfaces [Würthinger et al., 2010].
Figure 1 visualizes the relation of the envisioned meta-level interface with respect to concurrency abstractions, tools, and instrumentation at the level of language implementation. The key for making the interface independent of concrete concurrency abstractions is to identify the fundamental building blocks for each abstraction. For instance for message-passing approaches, the exchange of messages and the involved mailboxes or channels are essential, because they allow developers to reason about the information flow between actors or processes. Shared memory models on the other hand have other basic notions. In an STM, it is relevant to know in which order transactions have been executed, which ones caused transaction aborts or retries. To understand applications with locks, one needs however information on which resources are protected by locks, and how and when the locks are acquired. Declarative concurrency models are again different, in that they typically express data dependencies explicitly.
3 Open Research Issues
For such a meta-level interface, we need to distill the set of basic building blocks instead of merely enumerating all high-level concepts. Thus, one of the challenges is to identify the common core concepts shared between concurrency abstractions so that a wide range of concepts can be represented. However, this is not the only challenge. The next section details the problems that need to be tackled to realize the vision of a common meta-level interface for debugging.
To build high-level debuggers and other tools for the development of complex concurrent systems, we face issues in three areas: how to debug programs that employ multiple concurrency models, how to provide instrumentation for such tooling that minimizes interference with program execution, and how to decouple these mechanisms from concrete concurrency abstractions. We detail these problem briefly in the remainder of this section.
3.1 Debuggers for Programs Employing Multiple Concurrency Models
Prior research has studied debugging support for many of the individual concurrency models. However, debuggers for programs that employ multiple concurrency models are unexplored. Generally, McDowell and Helmbold  distinguish classic breakpoint-based debuggers and event-based ones. Event-based debuggers typically record the history of events in an application and might provide analyses on top of these events. Depending on the concurrency model, an event may be an API call, a memory load/store, or a message send/receive.
As an approach that is especially relevant in the field of concurrency, research on breakpoint-based debuggers led to time-traveling debuggers [Barr and Marron, 2014, Pluquet et al., 2009, Pothier et al., 2007], which preserve the dynamic program history to enable forward and backward step-by-step execution and state inspection. Focusing on specific models, breakpoint-based debuggers have been investigated for thread-based concurrency models, STM [Zyulkyarov et al., 2010], and event loops [Gonzalez Boix et al., 2011].
However, it remains open how to build a uniform debugger that supports a wide variety of concurrency models. We see this as an open question both from the user perspective and the implementation perspective. Considering typical scenarios, such a debugger needs to be able to help developers to understand their programs from different perspectives or levels of granularity. For instance in an actor program, one might want to continue execution until a message is processed completely, or one might want to see which messages caused which effects to see races or bottlenecks. In a transaction, one might want to complete a whole atomic block, or in a lock-based part of the program continue execution until a lock is taken or released. All these are different concepts that have to be carefully represented to be understandable for the user, but must also be realized in a way generic that is enough so that the new en vogue concurrency concept of the day can be supported.
3.2 Low-Interference Program Instrumentation
For debugging, it is also still an open question how to record concurrent executions, while minimizing the necessary data and instrumentation. Instrumentation of concurrent programs is generally problematic, because it can interfere with the execution so that data races might never appear with the instrumentation. Thus, it is essential to minimize general interference. Part of that is to minimize the amount of recorded events [Barr and Marron, 2014, Hofer et al., 2016, Lengauer et al., 2015, Pothier et al., 2007]. This could for instance go in the direction of techniques for debugging message-passing systems. Honda and Yonezawa  propose to raise the abstraction level to debug systems on the level of object groups. This approach structures the event history in terms of the tasks performed collectively by a group of objects, which is one relevant approach to present the high-level concepts to developers.
The main goal is, however, to identify techniques that are applicable to a wide range of concurrency models to be able to represent high-level concepts to developers. Thus, relevant techniques include recording for replay [Liu et al., 2015, Prähofer et al., 2011], concurrency bug reproduction [Huang and Zhang, 2012], dataflow [Tripp et al., 2011] and data dependency discovery [Bond et al., 2013], as well as recording of performance-related information [David et al., 2014, Hofer et al., 2015a]. So far, however, the known precise approaches cause interferences in the execution or do not scale to complex systems. Imprecise, i. e., probabilistic approaches are potentially a solution. Existing work investigates the use of hardware support [Hofer et al., 2015b, Inoue and Nakatani, 2009], and discusses how to derive high-level information [Jin et al., 2010, Mytkowicz et al., 2009], but has not yet been investigated in the context of multiple high-level concurrency models.
3.3 Decoupling Infrastructure from Concrete Concurrency Models
As mentioned in section 2, our goal is to define a meta-level interface that can be used to observe and to interact with the execution of complex concurrent systems that use a wide range of concurrency abstractions. We envision the use of a meta-level interface based on the insight that tools always have a tight connection to meta-level programming. Thus, it is natural that programs are either manipulated (statically) or executed and monitored (dynamically) by a tool, which can also be the program itself. The code of the tool interacts with the code of the program by means of such a meta-level interface, which is tightly bound to the design of the programming language. Examples are VM tooling interfaces [Würthinger et al., 2010] and metaobject protocols (MOP) [Kiczales et al., 1991]. MOPs have been studied in the field of distributed computing [Cazzola, 2003, Garbinato et al., 1994, McAffer, 1995] as well as to realize individual concurrency models on the same VM [Marr and D’Hondt, 2012]. However, it is still open which basic concepts such an interface needs to make it possible to build a wide range of concurrency abstractions on top of it and to provide the necessary information about interactions of concurrency abstractions for the purpose of building tools such as debuggers. Thus, we need to find ways to decouple the meta-level interface from concrete concurrency models, and to abstract from them.
For the development of complex concurrent systems, we need better tools! Currently, debugging and tooling is only available on the implementation level and does not capture high-level concepts. With the complexity of concurrent software especially when it mixes and matches different abstractions, it is essential to preserve these abstractions also during debugging in order to understand a program’s behavior.
We see a meta-level interface that abstracts from concrete concurrency models as a way forward. However, there are many open research questions: starting from how to build debuggers for different interacting concurrency abstractions, over the question of how to avoid problematic interference with program execution, as well as how to abstract from concrete concurrency models for such an interface.
Recently, we started to investigate these issues in the context of the Truffle and Graal language implementation platform [Würthinger et al., 2013] of Oracle Labs. We intent to focus on this research, but with the large number of open questions, we welcome collaborations and exchange of ideas that can lead to solving today’s issues of building complex concurrent applications.
This research is funded by a collaboration grant of the Austrian Science Fund (FWF) with the project I2491-N31 and the Research Foundation Flanders (FWO Belgium).
M. D. Bond, M. Kulkarni, M. Cao, M. Zhang, M. Fathi Salmi, S. Biswas, A. Sengupta, and J. Huang. Octet: Capturing and controlling cross-thread dependences efficiently. In Proc. of OOPSLA, pages 693–712. ACM, 2013. doi: 10.1145/2509136.2509519.
F. David, G. Thomas, J. Lawall, and G. Muller. Continuously measuring critical section pressure with the free-lunch profiler. In Proc. of OOPSLA, pages 291–307. ACM, 2014. doi: 10.1145/2660193.2660210.
B. Garbinato, R. Guerraoui, and K. Mazouni. Distributed programming in garf. In R. Guerraoui, O. Nierstrasz, and M. Riveill, editors, Object-Based Distributed Programming, volume 791 of LNCS, pages 225–239. Springer Berlin Heidelberg, 1994. doi: 10.1007/BFb0017543.
E. Gonzalez Boix, C. Noguera, T. Van Cutsem, W. De Meuter, and T. D’Hondt. Reme-d: A reflective epidemic message-oriented debugger for ambient-oriented applications. In Proc. of SAC, pages 1275–1281. ACM, 2011. doi: 10.1145/1982185.1982463.
P. Haller. High-Level Concurrency Libraries: Challenges for Tool Support, October 2015. Keynote of Eclipse Technology eXchange Workshop, collocated with SPLASH ’15. http://2015.splashcon.org/event/etx2015-keynote-by-philipp-haller.
P. Hofer, D. Gnedt, and H. Mössenböck. Efficient Dynamic Analysis of the Synchronization Performance of Java Applications. In Proc. of the WODA Workshop, pages 14–18. ACM, 2015a. doi: 10.1145/2823363.2823367.
P. Hofer, D. Gnedt, A. Schörgenhumer, and H. Mössenböck. Efficient Tracing and Versatile Analysis of Lock Contention in Java Applications on the Virtual Machine Level. In Proc. of the ACM/SPEC Conf. on Performance Engineering (ICPE’16), March 2016.
Y. Honda and A. Yonezawa. Debugging concurrent systems based on object groups. In S. Gjessing and K. Nygaard, editors, Proc. of ECOOP, volume 322 of LNCS, pages 267–282. Springer Berlin Heidelberg, 1988. doi: 10.1007/3-540-45910-3_16.
J. Huang and C. Zhang. Lean: Simplifying concurrency bug reproduction via replay-supported execution reduction. In Proc. of OOPSLA, volume 47, pages 451–466. ACM, Oct. 2012. doi: 10.1145/2398857.2384649.
S. Marr and T. D’Hondt. Identifying a unifying mechanism for the implementation of concurrency abstractions on multi-language virtual machines. In Proc. of TOOLS, volume 7304 of LNCS, pages 171–186. Springer, May 2012. doi: 10.1007/978-3-642-30561-0_13.
J. McAffer. Meta-level programming with coda. In M. Tokoro and R. Pareschi, editors, Proc. of ECOOP’95, volume 952 of LNCS, pages 190–214. Springer Berlin Heidelberg, 1995. doi: 10.1007/3-540-49538-X_10.
H. Prähofer, R. Schatz, C. Wirth, and H. Mössenböck. A comprehensive solution for deterministic replay debugging of softplc applications. Industrial Informatics, IEEE Transactions on, 7(4):641–651, November 2011. ISSN 1551-3203. doi: 10.1109/TII.2011.2166768.
S. Tasharofi, P. Dinges, and R. E. Johnson. Why do scala developers mix the actor model with other concurrency models? In ECOOP 2013 – Object-Oriented Programming, volume 7920 of LNCS, pages 302–326. Springer Berlin Heidelberg, 2013. doi: 10.1007/978-3-642-39038-8_13.
T. Würthinger, C. Wimmer, A. Wöß, L. Stadler, G. Duboscq, C. Humer, G. Richards, D. Simon, and M. Wolczko. One VM to Rule Them All. In Proc. of Onward!, pages 187–204. ACM, 2013. doi: 10.1145/2509578.2509581.
T. Würthinger, M. L. Van De Vanter, and D. Simon. Multi-level virtual machine debugging using the java platform debugger architecture. In Perspectives of Systems Informatics, volume 5947 of LNCS, pages 401–412. Springer Berlin Heidelberg, 2010. doi: 10.1007/978-3-642-11486-1_34.
F. Zyulkyarov, T. Harris, O. S. Unsal, A. Cristal, and M. Valero. Debugging programs that use atomic blocks and transactional memory. In Proc. of PPoPP, pages 57–66. ACM, 2010. doi: 10.1145/1693453.1693463.