
SOMns 0.2 Release with CSP, STM, Threads, and Fork/Join

Since SOMns is a pure research project, we don’t normally do releases for it. However, we have added many different concurrency abstractions since December and have plans for bigger changes. So, it seems like a good time to wrap up another step and get the system into a somewhat stable shape.

The result is SOMns v0.2, a release that adds support for communicating sequential processes, shared-memory multithreading, fork/join, and a toy STM. We also improved a variety of things under the hood.

Note, SOMns is still not meant for ‘users’. It is, however, a stable platform for concurrency research and student projects. If you’re interested in working with it, drop us a line, or check out the getting started guide.

0.2.0 – 2017-03-07 Extended Concurrency Support

Concurrency Support

  • Added basic support for shared-memory multithreading and fork/join
    programming (PR #52)

    • the object model now uses a global safepoint to synchronize layout changes
    • array strategies are not yet thread-safe
  • Added Lee and Vacation benchmarks (PR #78)

  • Added a configuration flag for actor tracing, -atcfg=
    For example, -atcfg=mt:mp:pc turns off tracing of message timestamps, message parameters, and promises

  • Added Validation benchmarks and a new Harness.

  • Added basic Communicating Sequential Processes support.
    See PR #84.

  • Added CSP version of PingPong benchmark.

  • Added simple STM implementation. See s.i.t.Transactions and PR #81 for details.

  • Added breakpoints for channel operations in PR #99.

  • Fixed an isolation issue for actors. The check that an actor is only created
    from a value was broken (issue #101, PR #102)

  • Optimized the processing of common single messages by avoiding the allocation
    and use of an object buffer (issue #90)

Interpreter Improvements

  • Turned writes to method arguments into errors. Previously, such writes led to
    confusing setter sends and ‘message not understood’ errors.

  • Simplified AST inlining and used objects to represent variable information,
    improving the details displayed in the debugger (PR #80).

  • Made instrumentation more robust by defining the number of arguments of an
    operation explicitly.

  • Added parse-time specialization of primitives. This provides very early
    knowledge about the program, which might be unreliable, but should be good
    enough for tooling. (See issue #75 and PR #88)

  • Added an option to show methods after parsing in IGV with
    -im/--igv-parsed-methods (issue #110)

Communicating Sequential Processes for Newspeak/SOMns

One possible way of modeling concurrent systems is Tony Hoare’s classic approach of having isolated processes communicate via channels, which is called Communicating Sequential Processes (CSP). Today, we see this approach used, for instance, in Go and Clojure.

While Newspeak’s specification and implementation come with support for Actors, I also want to experiment with other abstractions, and CSP happens to be an interesting one, since it models systems with blocking synchronization, also known as channels with rendezvous semantics. I am not saying that CSP is better than actors in any specific case. Instead, I want to find out where CSP’s abstractions provide a tangible benefit.

But the reason for this post is a different one. One of my biggest quibbles with most CSP implementations is that they don’t take isolation seriously. Usually, they merely provide lightweight concurrency and channels, but they rarely ensure that different processes don’t share any mutable memory. So, the door for low-level race conditions is wide open. The standard argument of language and library implementers is that guaranteeing isolation is not worth the performance overhead that comes with it. For me, concurrency is hard enough, so I prefer to have the guarantee of proper isolation. Of course, another part of the argument is that you might need shared memory for some problems, but I think we have more disciplined approaches for those problems, too.

Isolated Processes in Newspeak

Ok, so how can we realize isolated processes in Newspeak? As it turns out, it is pretty simple. Newspeak already has the notion of values. Values are deeply immutable objects. This means values can only contain values themselves, which in turn means that if you receive a value from a concurrent entity, you are guaranteed that its state never changes.

In SOMns, you can use the Value mixin to mark a class as having value semantics. This means that none of the fields of the object are allowed to be mutable, and that we need to check that fields are only initialized with values in the object’s constructor. Since Newspeak uses nested classes pretty much everywhere, we also need to check that the outer scope of a value class does not have any mutable state. Once that is verified, an object can be a proper deeply immutable value and can be shared without introducing any data races between concurrent entities.
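
For readers who think in Java terms, a very loose analogy (this is not SOMns code, and it does not capture the mixin or the outer-scope check) is a record whose fields are themselves immutable:

// Loose Java analogy of a deeply immutable value: the fields are final
// and of immutable types, so no observer can ever see the state change,
// which makes instances safe to share between concurrent entities.
public record Point(double x, double y) {
  public Point translate(double dx, double dy) {
    // 'Mutation' produces a new value instead of changing this one.
    return new Point(x + dx, y + dy);
  }
}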

Using this as a foundation, we can require that all classes that represent CSP processes are values. This gives us the guarantee that a process does not have access to any shared mutable state by itself. Note that this is only about the class side. The object side can actually be a normal object and have mutable state, which means that within a process, we can have normal mutable state/objects.

Using Newspeak’s notion of values feels like a very natural solution to me. Alternative approaches could use a magic operator that cuts off the lexical scope. This is something I have seen, for instance, in AmbientTalk with its isolates. While such a magic isolate keyword gives some extra flexibility, it is also a new concept. Having to ensure that a process’ class is a value requires that its outer lexical scope is a value, and thus restricts a bit how we structure our modules, but it doesn’t require any new concepts. Another drawback is that it is often not obvious whether the lexical scope is a value, but I think that’s something where an IDE should help and provide the necessary insight.

In code, this looks then a bit like this:

class ExampleModule = Value ()(
  class DoneProcess new: channelOut = Process (
  | private channelOut = channelOut. |
  )(
    public run = ( channelOut write: #done )
  )
  
  public start = (
    processes spawn: DoneProcess
               with: {Channel new out}
  )
)

So, we have a class DoneProcess with a run method that defines what the process does. Our processes module allows us to spawn the process with arguments, in this case the output end of a channel.

Channels

The other aspect we need to think about is how to design channels so that they preserve isolation. As a first step, I only allow values to be sent on a channel. This ensures isolation and requires merely a simple and efficient check of whether the provided object is a value.
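
To make this concrete, here is a minimal Java sketch of such a value-only rendezvous channel (not the SOMns implementation; Value is a hypothetical marker interface, and where SOMns checks dynamically, the sketch uses a static type bound):

import java.util.concurrent.SynchronousQueue;

// Hypothetical marker interface for deeply immutable values.
interface Value {}

final class Channel<T extends Value> {
  // A SynchronousQueue has no capacity: every write blocks until a
  // matching read arrives, which gives CSP-style rendezvous semantics.
  private final SynchronousQueue<T> queue = new SynchronousQueue<>();

  // The bound T extends Value is the isolation check: only deeply
  // immutable objects can ever cross the channel.
  public void write(T value) throws InterruptedException { queue.put(value); }
  public T read() throws InterruptedException { return queue.take(); }
}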

However, this approach is also very restrictive. Because of the deeply immutable semantics of values, they are quite inflexible in my experience.

When thinking of what it means to be a value, imagine a bunch of random objects: they can all point to values, but values can never point back to any mutable object. That’s a very nice property from the concurrency perspective, but in practice it means that I often feel the need to represent data twice. Once in mutable form, for instance for constructing complex data structures, and a second time as values so that I can send the data to another process.

A possible solution might be objects with copy-on-transfer semantics, or actual ownership transfer. This could be modeled either with a new type of transfer object or with a copying channel. Perhaps there are other options out there. But for the moment, I am already happy to see that we can get proper CSP semantics by merely checking that a process is constructed only from values and that channels only pass on values.

Since the implementation is mostly a sketch, there are of course more things to be done. For instance, it does not yet support any nondeterminism, which would require an alt or select operation on channels.

Why is Concurrent Programming hard?

In short, I think it is hard because, on the one hand, there is no single concurrency abstraction that fits all problems, and, on the other hand, the various abstractions are rarely designed to be used in combination with each other.

But let us start at the beginning; otherwise, the terminology might get in the way. For the purpose of this discussion, I distinguish concurrent programming and parallel programming.

Concurrent programming and its corresponding programming abstractions focus on the correctness of a computation and the consistency of state in the context of parallel or interleaved execution.

Parallel programming and its corresponding programming abstractions focus on structuring an algorithm in a way that parallel computational resources are used efficiently.

Thus, while both programming approaches are related and often used in combination, their goals, and consequently their main abstractions, are different. I am not claiming that parallel programming is solved and widely understood, but it is comparably easy to apply it to an isolated problem when performance is an issue. The emphasis here is on isolated, because the integration into an existing system is the hard part and can expose all kinds of concurrency issues, for which concurrent programming techniques are required as a solution.

When do Concurrent Programming Abstractions Break Down?

The first question might be: why and when do we use concurrent programming? The main reason is the desire to increase “performance”, either by increasing throughput, by reducing latency, or by improving interactivity by moving operations off the critical path. Sometimes, a concurrent design also happens to map best onto a problem by aligning the program structure with the domain model, for instance in terms of tasks or processes, and is thus chosen as the solution.

A classic example calling for concurrent execution is user interfaces. Independent of a particular solution, the overall goal is to move a computational or I/O task out of the loop that processes user-generated events, to maintain the interactivity of an application. This offloading can be done by using some library for asynchronous I/O and computation. To give but a few examples, C#’s async/await is frequently used for this purpose, as are Java’s ExecutorService and Clojure’s future.
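
To illustrate the offloading pattern, here is a minimal Java sketch using ExecutorService (loadReport, showReport, and onUiThread are made-up placeholders for the example):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ReportAction {
  private static final ExecutorService background =
      Executors.newSingleThreadExecutor();

  // Called on the UI thread: submit the slow work to a background
  // thread so that the event loop stays responsive.
  void onButtonClicked() {
    background.submit(() -> {
      String report = loadReport();          // slow I/O, off the UI thread
      onUiThread(() -> showReport(report));  // hand the result back to the UI
    });
  }

  // Placeholders; a real UI toolkit would provide the equivalents.
  private String loadReport()         { return "..."; }
  private void showReport(String r)   { System.out.println(r); }
  private void onUiThread(Runnable r) { r.run(); }
}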

In another simple scenario, the application is already parallel and some form of execution monitoring needs to be added, either to steer optimizations or even for billing purposes. Depending on the concrete scenario, various solution approaches are available. In a non-performance-critical setting, a simple atomically modified counter can be sufficient. When performance matters, it might be more appropriate to gather the initial counts local to a single thread. However, this requires later communication to build the sum of all thread-local values, which might lead to consistency issues, because it is harder to get one globally consistent snapshot of all local counters. Depending on the requirements, all kinds of solutions in between could be devised. If, for instance, the exact count does not matter, a scalable non-zero indicator might suffice. Either way, the problem remains the same: an existing parallel program needs to be changed, which potentially introduces concurrency issues.
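
In Java terms, the two ends of this spectrum could look roughly as follows: an AtomicLong gives one globally consistent counter at the cost of contended updates, while a LongAdder spreads increments over thread-local cells and only sums them on demand, so a read is not guaranteed to be a consistent snapshot while updates are in flight:

import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

final class Counters {
  // Simple and consistent: every increment synchronizes on one cell,
  // which can become a bottleneck under heavy contention.
  static final AtomicLong simple = new AtomicLong();

  // Scalable: increments go to striped, mostly thread-local cells;
  // sum() combines them later and is only a weak snapshot.
  static final LongAdder scalable = new LongAdder();

  static void onEvent() {
    simple.incrementAndGet();
    scalable.increment();
  }

  static void report() {
    System.out.println("consistent: " + simple.get()
        + ", weakly consistent: " + scalable.sum());
  }
}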

Keeping something like an independent counter consistent is, however, rather trivial compared to making parallel or concurrent operations on a large and complex shared data structure yield consistent and correct results. Imagine a tree or graph representation of a program, as frequently employed by IDEs. In such a scenario, various subsystems might want to change or annotate the graph, for instance to include inferred types, add test coverage or history information, apply refactorings, or simply account for the changes done by the user in the editor. Often the relevant subsystems work concurrently. One could of course make the graph immutable and have ‘updates’ produce strictly new versions of it. However, for various reasons other choices might be made, and then the question arises how consistent updates are possible. Solutions could potentially include locks or software transactional memory (STM).
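
A conventional lock-based variant of such a shared program graph might look like the following Java sketch (the node and annotation representation is made up for the example; an STM-based variant would replace the lock with transactions):

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class ProgramGraph {
  // node -> (annotation key -> annotation value), e.g. inferred types,
  // coverage data, or history information.
  private final Map<String, Map<String, Object>> annotations = new HashMap<>();
  private final ReadWriteLock lock = new ReentrantReadWriteLock();

  // Many subsystems may read concurrently...
  Object annotationOf(String node, String key) {
    lock.readLock().lock();
    try {
      Map<String, Object> a = annotations.get(node);
      return a == null ? null : a.get(key);
    } finally { lock.readLock().unlock(); }
  }

  // ...but a writer, e.g. the type inferencer or a refactoring,
  // excludes everyone else while it updates.
  void annotate(String node, String key, Object value) {
    lock.writeLock().lock();
    try {
      annotations.computeIfAbsent(node, n -> new HashMap<>()).put(key, value);
    } finally { lock.writeLock().unlock(); }
  }
}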

When making the decision for how to manage the consistency of such a concurrently updated graph, the rest of the system unfortunately has to be considered as well. Suddenly, our inconspicuous counter might need to take into account that the STM might retry transactions. Similarly, the library for asynchronous tasks might suddenly need to retract a task from the run queue when a transaction is retried. This is the point that makes concurrent programming really hard. Such ‘design’ decisions are not strictly local anymore, and the question arises, not only for STM but for all concurrent programming abstractions: do they compose well?

Huge Number of Different Abstractions

The question of whether concurrent programming abstractions compose is not at all straightforward to answer. As indicated in the previous section, there are many different possible requirements in any given situation, so that even the design of a simple counter becomes a complex undertaking. Over the decades, the huge number of tradeoffs has resulted in many different variations of a few at least superficially related concepts. The tradeoffs are also not only about performance, for instance in terms of how many guarantees a framework provides. Often, somewhat philosophical points enter the discussion. For instance, some people argue that blocking operations better preserve the local sequential view on a system and are therefore simpler to program, often however at the cost of potential deadlocks. On the other hand, a completely asynchronous, non-blocking design might be deadlock-free, but, depending on how the language exposes it, one might end up in callback hell, and the code becomes hardly maintainable. Yet another aspect is whether or not to allow nondeterminism. It can be easier to reason about a strictly deterministic system. However, such a language or framework might restrict expressiveness so much that not all conceivable applications can be expressed in it.

To give a few examples: the futures of Clojure and Java are blocking, which always introduces the risk of deadlocks when other blocking abstractions are used in conjunction. The futures offered by AmbientTalk and E (called promises) are inherently non-blocking, to fit the overall nature of these two languages as non-blocking and deadlock-free. Consequently, however, both types of futures are used differently, and one might argue that one is preferable over the other in certain situations.
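
Both styles can be sketched with Java’s CompletableFuture: get() gives the blocking style of Clojure’s and Java’s futures, while thenAccept() gives the non-blocking, callback-based style in the spirit of AmbientTalk’s and E’s promises:

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

final class FutureStyles {
  public static void main(String[] args)
      throws ExecutionException, InterruptedException {
    CompletableFuture<Integer> f = CompletableFuture.supplyAsync(() -> 6 * 7);

    // Non-blocking style: register a callback and keep going.
    CompletableFuture<Void> printed =
        f.thenAccept(r -> System.out.println("async result: " + r));

    // Blocking style: simple sequential code, but the calling thread
    // waits here and can take part in deadlocks.
    System.out.println("blocking result: " + f.get());

    printed.join(); // only so the demo does not exit before the callback runs
  }
}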

The situation is similar when it comes to concrete implementations of communicating sequential processes. Personally, I consider the strict isolation between processes, and therefore the enforcement that any form of communication has to go via explicit channels, a major property that can simplify reasoning about concurrent semantics and, for instance, makes sure that programs are free from low-level data races. However, Go, for instance, chose to adopt the notion of communicating via channels, but its goroutines are not isolated. JCSP, a Java library, goes the same way. The occam-pi language, on the other hand, chose to stick with the notion of fully isolated processes. The same design discussion can be had for implementations of the actor model: AmbientTalk and Erlang go with fully isolated processes, while Akka, for instance, makes the pragmatic decision that it cannot guarantee isolation because it runs on top of a JVM.

This discussion could go on for quite a while. Wikipedia currently lists more than 60 concurrent programming languages, most of which implement some specific variation of a concept. In previous work, we identified roughly a hundred concepts that are related to concurrent programming.

It can of course be argued that a single language will not support all of them, and thus applications will perhaps only have to cope with a handful of concurrent programming abstractions. However, looking at large open-source applications such as IDEs, it seems that the various subsystems from time to time start to introduce their own abstractions. NetBeans, for instance, has various representations of asynchronous-task or future-like abstractions, and there are at least two implementations of somewhat ‘transactional’ systems, one in the refactoring subsystem and another one in the profiling library. They seem to implement something along the lines of STM in different degrees of complexity. And this again raises the question of how these different abstractions interact with each other. A look at the NetBeans bug tracker yields more than 4000 bugs that contain the word “deadlock” and more than 500 bugs with the phrase “race condition”. While most of these bugs are marked as closed, this is probably a good indication that concurrent programming is hard and error-prone.

Concurrent Programming Abstractions Not Designed for Use in Combination

Usually, concurrent programming is considered hard because low-level abstractions such as threads and locks are used. While NetBeans uses these to a significant extent, it also uses considerably higher-level concepts such as futures, asynchronous tasks, and STM. I would argue that the problem is not necessarily the abstraction level, but that various concurrent programming abstractions are used together even though they have not been designed for that purpose. While each abstraction in isolation is well tailored to its purpose, and thus reduces accidental complexity, concurrency often does not remain confined to modules or subsystems, and thus the interaction between the abstractions causes significant accidental complexity.

As far as I am aware, few languages have been designed from the ground up with concurrency in mind, and even fewer with the interaction of concurrent programming abstractions in mind. While, for instance, Java was designed with threads in mind and has the synchronized keyword to facilitate thread-based programming, its memory model and the java.util.concurrent libraries were only added in Java 5. Arguably, Java’s libraries are so low-level that languages such as Clojure and Scala try to close the gap. Clojure, consequently, was designed from the start with concurrent programming in mind. It started out with atoms, agents, and STM to satisfy the different use cases for concurrency. However, even though Clojure was designed with them from the start, they do not interact well. Atoms are considered low-level and do not regard transactions or agents at all. The STM, on the other hand, accounts for agents by deferring message sends until the end of a transaction, to make sure that a transaction can be retried safely. With only these three abstractions, Clojure could actually be considered a fine example. However, these abstractions were apparently not sufficient to cover all use cases equally well, and futures and promises, as well as CSP in the form of the core.async library, were added. Unfortunately, these abstractions were not designed to integrate well with the existing ones. Instead, they were merely added, and interactions can cause, for instance, unexpected deadlocks or race conditions (for more details see this paper).

To give a more academic example, one that might not be governed mostly by pragmatic concerns, Haskell might be a reasonable candidate. Unfortunately, even in Haskell the notion of adding instead of integrating seems to be the prevalent one. I am not a Haskell expert, but its STM shows the same symptoms as Clojure’s, albeit in a slightly different way. The standard Control.Concurrent package comes, for instance, with MVar and Chan as abstractions for mutable state and communication channels. But instead of integrating the STM with these, it introduces its own variants, TMVar and TChan. Performance considerations might have led to this situation. However, from the perspective of engineering large applications, this can hardly be ideal, because the question of whether these abstractions can be used together in the same application without problems remains unanswered.

Conclusion

I think that concurrent programming is hard because the abstractions we use today are not prepared for the job. They are good for one specific task, but they are not easily used in conjunction with each other. Instead, interactions can lead, for instance, to unexpected race conditions or deadlocks. And just to support the claim that interaction is an issue: it is not just NetBeans that uses a variety of concurrent programming concepts. Eclipse looks similar, and so do MonoDevelop and SharpDevelop. A study in the Scala world also suggests that application developers choose to combine the actor model with other abstractions, for instance for performance reasons.

So, what’s the solution? I think we need to design languages and libraries that properly integrate a variety of concurrent programming abstractions. How that should look concretely, I don’t know yet. The work of Joeri De Koster shows what solutions could look like for actor languages, and together with Janwillem Swalens, we are extending this work to a wider set of languages. Personally, I still believe that the ownership-based metaobject protocol is a useful foundation for experimenting with various concurrent programming abstractions on top of one language. But we will see.

For comments, suggestions, ideas, or complaints that I did not consider your language that already solves the problem, please catch me on Twitter @smarr or send me a mail.

Towards Composable Concurrency Abstractions

One of the big questions that came up during my PhD was: ok, now you have your fancy ownership-based metaobject protocol, and you can implement actors, agents, communicating sequential processes, software transactional memory, and many others, but now what? How are you going to use all of these in concert in one application? Finding a satisfying answer is unfortunately far from trivial.

Since I am far from the first person to think about these problems, we, that is, Tom, Joeri, and most notably Janwillem, put our heads together to figure out what the main issues are and what solutions others have come up with. Janwillem took the lead and started to write down our first preliminary findings in a paper for the PLACES workshop, co-located with ETAPS in April.

Below, you can find the preprint and abstract of the paper. It is only a first small step, but I hope it won’t be the last one, because in the end the OMOP is only going to be useful if we can actually figure out how to combine the various concurrent programming models it enables in a safe and efficient manner.

Abstract

In the past decades, many different programming models for managing concurrency in applications have been proposed, such as the actor model, Communicating Sequential Processes, and Software Transactional Memory. The ubiquity of multi-core processors has made harnessing concurrency even more important. We observe that modern languages, such as Scala, Clojure, or F#, provide not one, but multiple concurrency models that help developers manage concurrency. Large end-user applications are rarely built using just a single concurrency model. Programmers need to manage a responsive UI, deal with file or network I/O, asynchronous workflows, and shared resources. Different concurrency models facilitate different requirements. This raises the issue of how these concurrency models interact, and whether they are composable. After all, combining different concurrency models may lead to subtle bugs or inconsistencies.

In this paper, we perform an in-depth study of the concurrency abstractions provided by the Clojure language. We study all pairwise combinations of the abstractions, noting which ones compose without issues, and which do not. We make an attempt to abstract from the specifics of Clojure, identifying the general properties of concurrency models that facilitate or hinder composition.

  • Towards Composable Concurrency Abstractions; Janwillem Swalens, Stefan Marr, Joeri De Koster, Tom Van Cutsem; in Proceedings of the Workshop on Programming Language Approaches to Concurrency and Communication-cEntric Software (PLACES), 2014, co-located with ETAPS.
  • Paper: PDF
  • BibTeX: BibSonomy