Tag Archives: Newspeak

SOMns 0.2 Release with CSP, STM, Threads, and Fork/Join

Since SOMns is a pure research project, we don’t usually do releases for it. However, we have added many different concurrency abstractions since December and have plans for bigger changes. So, it seems like a good time to wrap up another step and get the system into a somewhat stable shape.

The result is SOMns v0.2, a release that adds support for communicating sequential processes, shared-memory multithreading, fork/join, and a toy STM. We also improved a variety of things under the hood.

Note, SOMns is still not meant for ‘users’. It is, however, a stable platform for concurrency research and student projects. If you’re interested in working with it, drop us a line, or check out the getting started guide.

0.2.0 – 2017-03-07 Extended Concurrency Support

Concurrency Support

  • Added basic support for shared-memory multithreading and fork/join
    programming (PR #52)

    • the object model now uses a global safepoint to synchronize layout changes
    • array strategies are not yet thread-safe
  • Added Lee and Vacation benchmarks (PR #78)

  • Configuration flag for actor tracing, -atcfg=
    example: -atcfg=mt:mp:pc turns off message timestamps, message parameters and promises

  • Added Validation benchmarks and a new Harness.

  • Added basic Communicating Sequential Processes support.
    See PR #84.

  • Added CSP version of PingPong benchmark.

  • Added simple STM implementation. See s.i.t.Transactions and PR #81 for details.

  • Added breakpoints for channel operations in PR #99.

  • Fixed isolation issue for actors. The test that an actor is only created
    from a value was broken (issue #101, PR #102)

  • Optimize processing of common single messages by avoiding allocation and
    use of object buffer (issue #90)

Interpreter Improvements

  • Turn writes to method arguments into errors. Before, they led to
    confusing setter sends and ‘message not understood’ errors.

  • Simplified AST inlining and use objects to represent variable info to improve
    details displayed in debugger (PR #80).

  • Make instrumentation more robust by defining number of arguments of an
    operation explicitly.

  • Add parse-time specialization of primitives. This enables very early
    knowledge about the program, which might be unreliable, but should be good
    enough for tooling. (See Issue #75 and PR #88)

  • Added option to show methods after parsing in IGV with
    -im/--igv-parsed-methods (issue #110)

Communicating Sequential Processes for Newspeak/SOMns

One possible way of modeling concurrent systems is Tony Hoare’s classic approach of having isolated processes communicate via channels, known as Communicating Sequential Processes (CSP). Today, we see this approach used, for instance, in Go and Clojure.

While Newspeak’s specification and implementation come with support for Actors, I also want to experiment with other abstractions, and CSP happens to be an interesting one, since it models systems with blocking synchronization, also known as channels with rendezvous semantics. I am not saying CSP is better in any specific case than actors. Instead, I want to find out where CSP’s abstractions provide a tangible benefit.

But the reason for this post is another one. One of my biggest quibbles with most CSP implementations is that they don’t take isolation seriously. Usually, they provide merely lightweight concurrency and channels, but they rarely ensure that different processes don’t share any mutable memory. So, the door for low-level race conditions is wide open. The standard argument of language or library implementers is that guaranteeing isolation is not worth the performance overhead that comes with it. For me, concurrency is hard enough, so I prefer to have the guarantee of proper isolation. Of course, another part of the argument is that you might need shared memory for some problems, but I think we have a more disciplined approach for those problems, too.

Isolated Processes in Newspeak

Ok, so how can we realize isolated processes in Newspeak? As it turns out, it is pretty simple. Newspeak already has the notion of values. Values are deeply immutable objects. This means values can only contain values themselves, which in turn means that if you receive some value from a concurrent entity, you are guaranteed that its state never changes.

In SOMns, you can use the Value mixin to mark a class as having value semantics. This means that none of the fields of the object are allowed to be mutable, and that we need to check that fields are only initialized with values in the object’s constructor. Since Newspeak uses nested classes pretty much everywhere, we also need to check that the outer scope of a value class does not have any mutable state. Once that is verified, an object can be a proper deeply immutable value, and can be shared without introducing any data races between concurrent entities.
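
To make this concrete, here is a minimal Java sketch of the kind of check such a constructor could perform. The names are hypothetical stand-ins and not the actual SOMns code; the point is that a shallow check is enough, because every field must itself already be a verified value and is therefore deeply immutable.

import java.util.List;

// Hypothetical sketch of a value check; class and method names are illustrative only.
public final class ValueCheck {
  /** Stand-in for the class of objects already verified to be deeply immutable. */
  static class ImmutableObject { }

  /** Returns true if obj may be referenced from a deeply immutable value. */
  static boolean isValue(Object obj) {
    return obj == null
        || obj instanceof Long || obj instanceof Double    // boxed primitives
        || obj instanceof Boolean || obj instanceof String
        || obj instanceof ImmutableObject;
  }

  /** Conceptually invoked at the end of a Value-classed object's constructor. */
  static void checkAllFieldsAreValues(List<Object> fields) {
    for (Object f : fields) {
      if (!isValue(f)) {
        throw new IllegalStateException("Value classes may only hold values in their fields.");
      }
    }
  }
}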

Using this as a foundation, we can require that all classes that represent CSP processes are values. This gives us the guarantee that a process does not have access to any shared mutable state by itself. Note, this is only about the class side. The object side can actually be a normal object and have mutable state, which means that within a process we can have normal mutable state and objects.

Using the value notion of Newspeak feels like a very natural solution to me. Alternative approaches could use a magic operator that cuts off the lexical scope. This is something I have seen, for instance, in AmbientTalk with its isolates. While this magic isolate keyword gives some extra flexibility, it is also a new concept. Having to ensure that a process’ class is a value requires that its outer lexical scope is a value, and thus restricts a bit how we structure our modules, but it doesn’t require any new concepts. One drawback here is that it is often not obvious whether the lexical scope is a value, but I think that’s something where an IDE should help and provide the necessary insights.

In code, this then looks a bit like this:

class ExampleModule = Value ()(
  class DoneProcess new: channelOut = Process (
  | private channelOut = channelOut. |
  )(
    public run = ( channelOut write: #done )
  )
  
  public start = (
    processes spawn: DoneProcess
               with: {Channel new out}
  )
)

So, we have a class DoneProcess, which has a run method that defines what the process does. Our processes module allows us to spawn the process with arguments, which in this case is the output end of a channel.

Channels

The other aspect we need to think about is how to design channels so that they preserve isolation. As a first step, I’ll only allow values to be sent on a channel. This ensures isolation, and checking whether the provided object is a value is simple and efficient.
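
As a rough illustration (in Java, since that is what SOMns is implemented in), a rendezvous channel that only passes values could look like the sketch below. The Value marker interface and the isValue check are hypothetical stand-ins for the actual SOMns machinery; SynchronousQueue provides the blocking handoff that gives the channel rendezvous semantics.

import java.util.concurrent.SynchronousQueue;

// Minimal sketch, not SOMns' implementation: a rendezvous channel that only passes values.
public final class ValueChannel {
  /** Hypothetical marker for objects already verified to be deeply immutable. */
  public interface Value { }

  private final SynchronousQueue<Object> queue = new SynchronousQueue<>();

  private static boolean isValue(Object o) {
    return o instanceof Value || o instanceof Long || o instanceof Double || o instanceof String;
  }

  /** Blocks until a reader is ready; rejects non-values to preserve isolation. */
  public void write(Object item) throws InterruptedException {
    if (!isValue(item)) {
      throw new IllegalArgumentException("Only values may be sent over a channel.");
    }
    queue.put(item);
  }

  /** Blocks until a writer provides an item. */
  public Object read() throws InterruptedException {
    return queue.take();
  }
}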

However, this approach is also very restrictive. Because of the deeply immutable semantics of values, they are quite inflexible in my experience.

When thinking of what it means to be a value, imagine a bunch of random objects: they can all point to values, but values can never point back to any mutable object. That’s a very nice property from the concurrency perspective, but in practice it means that I often feel the need to represent data twice: once as mutable objects, for instance while constructing complex data structures, and a second time as values so that I can send the data to another process.

A possible solution might be objects with copy-on-transfer semantics, or actual ownership transfer. This could be modeled either with a new type of transfer objects, or a copying channel. Perhaps there are other options out there. But for the moment, I am already happy with seeing that we can have proper CSP semantics by merely checking that a process is constructed from values only and that channels only pass on values.

Since the implementation is mostly a sketch, there are of course more things that need to be done. For instance, it doesn’t yet support any nondeterminism, which requires an alt or select operation on channels.

Can we get the IDE for free, too?

What do we need for full IDE Integration for Truffle Languages?

With the Truffle language implementation framework, we got a powerful foundation for implementing languages as simple interpreters. In combination with the Graal compiler, Truffle interpreters execute their programs as very efficient native code.

Now that we get just-in-time compilation essentially “for free”, can we get IDE integration for our Truffle languages as well?

In case you wonder, this is inspired by the language server protocol project of Microsoft, RedHat, Eclipse Che, and others. Their goal is to develop a language-agnostic protocol that connects so-called language servers to IDEs. That made me wonder whether we could provide the infrastructure needed for such a language server as part of Truffle.

In the remainder of this post, I'll briefly discuss which IDE features would be desirable as a start, which of them Truffle currently supports, and how far we could get with a language-agnostic API as part of Truffle.

1. Which IDE Features would be desirable?

Generally, I am thinking about features available in IDEs such as Eclipse, Visual Studio, NetBeans, or Pharo. On the one hand, there are tools that help to understand the execution of a program. Typically, this includes debugging support and inspecting values, but it can also be about profiling to identify performance issues. Such execution-related aspects are already covered by Truffle today. The framework comes with support for a debugger and a profiler. The debugger can be used across Truffle languages, for instance in NetBeans or in a web-based experiment of mine.

Features that are not strictly related to execution are, however, not supported. In the research community, this is an area where Language Workbench projects [1] excel. They often come with their own domain-specific languages to define language grammars, and use transformation or compilation approaches to generate a wide range of tools from such specifications.

The tools I find most essential for an IDE include:

  • support for highlighting (syntactic and semantic)
  • code browsing, structural representation
  • code completion (incl. signature information, and API documentation)
  • reporting of parsing and compilation errors
  • reporting of potential bugs and code quality issues
  • quick fix functionality
  • refactoring support
Code Completion for SOMns in VS Code

For code completion, as in the figure above, one needs of course language-specific support, ideally taking the current context into account, and adjusting the set of proposals to what is possible at that lexical location.

Go to Definition for SOMns in VS Code

Similarly, for the go to definition example above, it is most useful if the language semantics are taken into account. My prototype currently does not take into account that the receiver of a println in this case is a literal, which would allow it to reduce the set of possible definitions to the one in the String class.

Parsing Error in SOMns in VS Code

Other features are simpler mappings of already available functionality. The error message above is shown on parsing errors and helps to understand why the parser complains. This type of error is also reported by typical language implementations, for instance on the command line, to enable basic programming.

Navigating a file based on its program entities

The other features are clearly more language-specific, as is the last example above, the browsing of a file based on the entities it defines. However, most of these elements can be mapped onto a common set of concepts that can be handled in a language-agnostic way in an IDE.

While the DynSem project might bring all these generation-based features to Truffle languages, I wonder whether we can do more in a bottom-up approach based on the interpreters we already got. Ensō, a self-describing DSL workbench, seems to go the route of interpretation over transformation as well.

2. What does Truffle currently support?

As mentioned above, Truffle currently focuses on providing a framework geared towards tooling for language execution. This is mainly the implementation framework for the languages themselves, but it also includes support for language instrumentation, which can be used to implement debuggers, profilers, tools for collecting dynamic metrics, coverage tracking, dynamic code reloading, etc.

The framework is based on the idea that AST nodes, i.e., the basic executable elements to which a parser transforms an input program, can be tagged. An instrumentation tool can then act based on these tags and for instance add extra behavior to a program or track its execution. AST nodes are correlated to their lexical representation in a program by so-called SourceSections. A SourceSection encodes the coordinates in the program file.

Unfortunately, this is where the support from the Truffle framework ends. Tooling aspects revolving around the lexical aspects of programs are currently not supported.

3. Can we provide a language-agnostic API to implement tooling focused on lexical aspects?

The most basic thing currently missing in Truffle for any kind of lexical support is a mapping from a location in a source to the corresponding semantic entity. Two underlying features are needed for that. First, we would need to actually retain information about code that is not part of a method, i.e., not part of the ASTs built for Truffle. Second, we would need a simple lookup data structure from a location in the source to the corresponding element. To implement code completion, for instance, we need to identify the code entity at the current position in the source, as well as its context, so that we can propose sensible completion options.
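
As a rough sketch of such a lookup structure (illustrative only, not a Truffle API), one could keep the retained entities sorted by start offset and answer queries with the innermost entity covering a given position:

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch only: maps source offsets to the innermost "semantic entity" covering them.
// Entity is a hypothetical stand-in for whatever a language retains
// (class, method, slot declaration, ...), identified here by a name and a character range.
public final class SourceIndex {
  public record Entity(String name, int start, int end) { }

  private final List<Entity> entities = new ArrayList<>();

  public void add(Entity e) {
    entities.add(e);
    entities.sort(Comparator.comparingInt(Entity::start));
  }

  /** Returns the innermost entity whose range covers the given offset, or null. */
  public Entity lookup(int offset) {
    Entity best = null;
    for (Entity e : entities) {
      if (e.start() <= offset && offset < e.end()
          && (best == null || e.end() - e.start() < best.end() - best.start())) {
        best = e;  // prefer the smallest, i.e., innermost, covering range
      }
    }
    return best;
  }
}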

Let's go over the list of things I wanted.

3.1 Support for Highlighting (Syntactic and Semantic)

This needs two things. First, a set of common tags to identify concepts such as keywords, literals, or method definitions. Second, an API to communicate that information to Truffle during parsing, so that it can be stored, for instance, along with the Source information and be used for highlighting.
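
A hypothetical sketch of what such an API could look like is shown below. None of these types exist in Truffle; they only illustrate the shape of a language-agnostic tag set that a parser reports into and an IDE queries from.

import java.util.ArrayList;
import java.util.List;

// Hypothetical highlighting API, not part of Truffle.
public final class Highlighting {
  /** A small, language-agnostic set of tags an IDE can map to colors. */
  public enum Tag { KEYWORD, LITERAL, COMMENT, METHOD_DEFINITION, ARGUMENT, LOCAL_VARIABLE }

  public record Span(int startOffset, int length, Tag tag) { }

  private final List<Span> spans = new ArrayList<>();

  /** Called by a language's parser while it processes a Source. */
  public void reportHighlight(int startOffset, int length, Tag tag) {
    spans.add(new Span(startOffset, length, tag));
  }

  /** Queried by the IDE front end, e.g., via a language server. */
  public List<Span> allSpans() {
    return List.copyOf(spans);
  }
}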

To support semantic instead of purely syntactic highlighting, Truffle would furthermore need support for dynamic tags. Currently, Truffle assumes that tags do not change after AST creation. For many dynamic languages, this is however too early: only executing the code determines whether an operation is, for instance, a field read or an access to a global.

3.2 Code Browsing, Structural Representation

Communicating structural information might be a bit more challenging in a language-agnostic way. One could go with a superset of concepts common to various languages and then provide an API that records this information on a per-Source basis, which could then be queried from an IDE.

One challenge here, especially from the perspective of editing, would be to choose data structures that can be evolved and updated easily and efficiently while a source file is being edited. Assuming the editor provides information about which parts of a source have changed, it should be possible to leverage that.

Note, this also requires a parser that is flexible enough to parse such chunks. This is however something that would be language-specific, especially since Truffle leaves the parsing aspect completely to the languages.

3.3 Code Completion (incl. signature information, and API documentation)

For code completion, one needs a mapping from source locations to the 'lexically dominating' entities. By that I mean not necessarily the structure of an AST, but, as with highlighting, a mapping from a source location to the most relevant element from a user's perspective. Assuming we have that for highlighting already, we would need language-specific lookup routines to determine the relevant elements for code completion. Those should then probably also return all the language-specific information about signatures (names and properties of arguments, e.g., types) as well as API documentation.
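
A rough sketch of how the language-specific part could plug into shared infrastructure is given below; the names are made up for illustration and do not correspond to an existing API. The shared side resolves the lexically dominating entity for the cursor, and the language implementation supplies the proposals.

import java.util.List;

// Hypothetical interface a language could implement; not part of Truffle.
public interface CompletionProvider {
  /** What the IDE displays for one proposal. */
  record CompletionItem(String label, String signature, String documentation) { }

  /**
   * Called with the name of the entity that lexically dominates the cursor
   * (e.g., the enclosing class or method) and the cursor offset; returns the
   * language-specific proposals, including signature and documentation details.
   */
  List<CompletionItem> completionsAt(String enclosingEntity, int offset);
}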

3.4 Reporting of Parsing and Compilation Errors

Depending on how far one wants to take that, this could be as simple as an API to report one or more parsing or compilation exceptions.

I see two relevant aspects here. The first is that the current PolyglotEngine design of Truffle does not actually expose a parse() operation; it is kept sealed off under the hood. Second, depending on the language and the degree of convenience one wants to provide to users, the parser might want to continue parsing after the first error and report multiple issues in one go. This can make the parser much more complex, but for compilation issues, i.e., structural or typing issues unrelated to the syntax, one might want to report all issues instead of aborting after the first one. Such features might require very different engineering decisions compared to implementations that abort on the first error, but they would improve the programming experience dramatically.
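
To illustrate the error-collecting style, here is a small sketch of a recursive-descent parser that records a diagnostic and resynchronizes instead of aborting. The toy grammar and all names are made up; the input is assumed to end with an EOF token.

import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of error recovery in a hand-written parser; not SOMns code.
final class RecoveringParser {
  record Diagnostic(int position, String message) { }
  enum TokenKind { IDENTIFIER, NUMBER, PERIOD, EOF }
  record Token(TokenKind kind, int position, String text) { }

  private final List<Token> tokens;
  private final List<Diagnostic> diagnostics = new ArrayList<>();
  private int index;

  RecoveringParser(List<Token> tokens) { this.tokens = tokens; }

  /** Parses the whole input and returns all collected issues in one go. */
  List<Diagnostic> parseAll() {
    while (current().kind() != TokenKind.EOF) {
      try {
        parseStatement();
      } catch (IllegalStateException e) {
        diagnostics.add(new Diagnostic(current().position(), e.getMessage()));
        synchronize();  // skip to the next '.' so later statements are still checked
      }
    }
    return diagnostics;
  }

  private void parseStatement() {
    expect(TokenKind.IDENTIFIER);
    expect(TokenKind.PERIOD);  // a real grammar would do much more here
  }

  private void expect(TokenKind kind) {
    if (current().kind() != kind) {
      throw new IllegalStateException("expected " + kind + ", found " + current().kind());
    }
    index++;
  }

  private void synchronize() {
    while (current().kind() != TokenKind.PERIOD && current().kind() != TokenKind.EOF) {
      index++;
    }
    if (current().kind() == TokenKind.PERIOD) { index++; }
  }

  private Token current() { return tokens.get(index); }
}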

3.5 Reporting of Potential Bugs and Code Quality Issues

This doesn't seem to be fundamentally different from the previous issue. The question is whether an API for querying such information is something that belongs into the PolyglotEngine, or whether there should be another entry point for such tooling altogether. Since I have a strong dynamic languages background, I'd argue the PolyglotEngine is the right place. I want to execute code to learn more about the behavior of a program. I want to run unit tests to get the best feedback and information (including types and semantic highlighting) about my code. So, I'd say it belongs all in there.

3.6 Quick-Fix Functionality and Refactoring Support

I haven't really experimented with these aspects, but there seems to be a language-specific and a language-agnostic component to it. The language-specific component would be to identify the entities that need to be changed by a quick fix or refactoring, as well as determining the replacement. The actual operation however seems to be fairly language-independent and could be a service provided by a common infrastructure to change Source objects/files.
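
The language-agnostic part could then be little more than applying a set of text edits to a source. A minimal sketch, with hypothetical types that are not an existing Truffle facility:

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch of a language-agnostic edit service: a quick fix or refactoring boils down
// to a list of replacements, applied back to front so earlier offsets stay valid.
final class TextEditService {
  record TextEdit(int start, int end, String replacement) { }

  static String apply(String source, List<TextEdit> edits) {
    List<TextEdit> sorted = new ArrayList<>(edits);
    sorted.sort(Comparator.comparingInt(TextEdit::start).reversed());
    StringBuilder sb = new StringBuilder(source);
    for (TextEdit e : sorted) {
      sb.replace(e.start(), e.end(), e.replacement());  // assumes non-overlapping edits
    }
    return sb.toString();
  }
}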

4. Conclusion

To me it seems that there is huge potential for Truffle to provide more language-agnostic infrastructure to realize standard, and perhaps non-standard, IDE features by offering additional APIs to be implemented by languages. Getting the basic features is reasonably straightforward and would help anyone using a language that doesn't traditionally come with IDE support.

However, there are also a couple of challenges that might be at odds with Truffle as a framework for languages that are mostly tuned for peak performance. In my own experience, adapting the SOMns parser to provide all the necessary information for highlighting, code completion, go to definition, etc., requires quite a few design decisions that depart from the straightforward parsing I did before to directly construct the AST. On the one hand, I need to retain much more information than before. My Truffle ASTs are very basic and contain only elements relevant for execution; for editing in an IDE, however, we want all the declarative elements as well. On the other hand, one probably wants a parser that is incremental and perhaps works on the chunk that the editor identified as changed. If the parser wasn't designed from the start to work like that, this seems to require quite pervasive changes. Similarly, one would need a different approach to parsing to be able to continue after a parse error. On top of that comes the question of storing the desired information in an efficient data structure. Perhaps that is something where persistent data structures would come in handy.

While there are challenges and tradeoffs, for language implementers like me, this would be a great thing. I'd love to experiment with my language and still get the benefits of an IDE. Perhaps not exactly for free, but with a reasonable effort. While Language Workbenches provide such features, I personally prefer the bottom up approach. Instead of specifying my language, I'd rather express the one truth of its semantics as an interpreter. In the end, I want to execute programs. So, let's start there.

PS

I'd like to thank Michael L. Van De Vanter for discussions on the topic. The code for my experiments with the language server protocol and SOMns is available on GitHub as part of the SOMns-vscode project. So far, I haven't tried to integrate it with Truffle, however.

And I have to admit, I am currently mostly focusing on the MetaConc project, where we aim a little higher and try to go beyond what people have come to expect when debugging concurrent applications.

References

  1. Erdweg, S.; van der Storm, T.; Völter, M.; Boersma, M.; Bosman, R.; Cook, W. R.; Gerritsen, A.; Hulshout, A.; Kelly, S.; Loh, A.; Konat, G. D. P.; Molina, P. J.; Palatnik, M.; Pohjonen, R.; Schindler, E.; Schindler, K.; Solmi, R.; Vergu, V. A.; Visser, E.; van der Vlist, K.; Wachsmuth, G. H. & van der Woning, J. (2013), The State of the Art in Language Workbenches, Springer, pp. 197-217.

Type Hierarchies and Guards in Truffle Languages

Continuing my notes on Truffle and Graal, this post is based on my observations in SOMns and changes to its message dispatch mechanism. Specifically, I refactored the main message dispatch chain in SOMns. As in Self and Newspeak, all interactions with objects are message sends. Thus, field access and method invocation are essentially the same, which means that message sending is key to good performance.

In my previous design, I structured the dispatch chain in a way that, I thought, would reduce the necessary runtime checks. This design came from TruffleSOM, where the class hierarchy is much simpler and where it still seems to work well.

My naive design distinguished two cases. The first is that the receiver is a standard Java object, for instance a boxed primitive such as a long or double, or some other Java object that is used directly. The second is that the receiver is an object from my own hierarchy of Smalltalk objects under SAbstractObject.

The hierarchy is a little more involved: it includes the abstract class, a class SObjectWithClass for objects that have a Smalltalk class, a class for objects without fields, and one for objects with fields, which is in turn subclassed by classes for mutable and immutable objects. There are a few more details to it, but I think you get the idea.

So, with that, I thought, let's structure the dispatch chain like this, starting with a message send node as its root:

MsgSend
  -> JavaRcvr
  -> JavaRcvr
  -> CheckIsSOMObject
        \-> UninitializedJavaRcvr
  -> SOMRcvr
  -> SOMRcvr
  -> UninitializedSOMRcvr

This represents a dispatch chain for a message send site that has seen four different receivers, two primitive types, and two Smalltalk types. This could be the case for instance for the polymorphic + message.

The main idea was to split the chain in two parts so that I avoid checking for the SOM object more than once, and then can just cast the receiver to SObjectWithClass in the second part of the chain to be able to read the Smalltalk class from it.

Now it turns out this is not the best idea. The main problem is that SObjectWithClass is not a leaf class in my SOMns hierarchy (it is in TruffleSOM, where the design originates). This means that, at runtime, the check, i.e., the guard for SObjectWithClass, can be expensive. When I looked at the compilation in IGV, I saw many instanceof checks that could not be removed and resulted in runtime traversals of the class hierarchy to confirm that a specific concrete class was indeed a subclass of SObjectWithClass.

In order to avoid these expensive checks, I refactored the dispatch nodes to extract the guard into its own node that does only the minimal amount of work for each specific case. And it only ever checks for the specific leaf class of my hierarchy that is expected for a specific receiver.

This also means the new dispatch chain is no longer separated into two parts as it was before. Instead, the nodes are simply added in the order in which the different receiver types are observed over time.

Overall, the performance impact is rather large. I saw a gain of 10% on the Richards benchmark and about 20% on DeltaBlue. Unfortunately, my refactoring also changed a few other details besides the changes related to instanceof and casts. It also made the guards for objects with fields depend on the object layout instead of the class, which avoids having multiple guards for essentially the same constraint further down the road.

So, the main take-away here is that the choice of guard types can have a major performance impact. I also had a couple of other @Specialization nodes that were using non-leaf classes, for instance like this:

@Specialization
public Object doSOMObject(SObjectWithClass rcvr) {...}

This looks inconspicuous at first, but fixing those and a few other things resulted in overall runtime reduction on multiple benchmarks between 20% and 30%.
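
For illustration, a fixed version might look roughly like the sketch below: one specialization per concrete leaf class instead of a single one for the non-leaf SObjectWithClass, so that the guards compile to cheap exact class checks. The node and the nested leaf classes are placeholders, not the actual SOMns code.

import com.oracle.truffle.api.dsl.Specialization;
import com.oracle.truffle.api.nodes.Node;

// Sketch with placeholder classes standing in for the leaf classes of the SOMns hierarchy.
abstract class ExampleDispatchNode extends Node {
  static final class SMutableObject { /* stand-in: objects with mutable fields */ }
  static final class SImmutableObject { /* stand-in: objects with immutable fields */ }

  abstract Object execute(Object rcvr);

  // Guarding on final (leaf) classes lets the compiler emit an exact class check
  // instead of an instanceof test that may traverse the class hierarchy.
  @Specialization
  Object doMutable(SMutableObject rcvr) {
    return rcvr;
  }

  @Specialization
  Object doImmutable(SImmutableObject rcvr) {
    return rcvr;
  }
}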

A good way to find these issues is to see in IGV that instanceof or checked cast snippets are inlined and not completely removed. Often they are already visible in the list of phases when the snippets are resolved. Another way to identify them is the use of the Graal option -Dgraal.option.TraceTrufflePerformanceWarnings=true (I guess that would be -G:+TraceTrufflePerformanceWarnings when mx is used). The output names the specific non-leaf node checks that have been found in the graph. Not all of them are critical, because they can be removed by later phases. To check that, you can use the id of the node from the output and search for it in the corresponding IGV graph using for instance id=3235 in the search field.

Optimizing Communicating Event-Loop Languages with Truffle

For the past few months, I have been busy implementing a fast actor language for the JVM. The language is essentially Newspeak with a smaller class library and without providing access to the underlying platform, since such access could lead to violations of the language’s guarantees.

My main focus for this project so far has been the actor model. Or more precisely, the communicating event-loop model à la E or AmbientTalk. The research goal is to show that the actor model can be made more practical for multicore programming, without breaking any of the actor guarantees such as deadlock-freedom and absence of low-level data races. We have been working on this already in the past, but so far performance was not a concern. This time around, performance is the main focus.

Coming Monday, I’ll give a presentation on first results at the AGERE!’15 workshop in Pittsburgh. The talk will focus on how we can bring some of the classic JIT compiler optimizations to asynchronous message sends, and how we can reduce the cost of isolation using Truffle.

Below are the abstract and the link to the work-in-progress paper, or extended abstract if you will. However, since it was written a few weeks ago, here is another little update on the current state:

SOMns vs. JVM Actor Frameworks

The plot shows the results of running the first few Savina benchmarks in a 4-core configuration, with Scalaz as the baseline, alongside Akka, Jetlang, and SOMns. Scalaz seems to be the best performing actor framework at the moment. While SOMns is a little slower, it actually provides all the guarantees one would expect from an actor language, which the JVM frameworks typically do not. So, I think we are on a good way to make concurrent programming safe and fast.

Abstract

Communicating Event-Loop Languages similar to E and AmbientTalk are recently gaining more traction as a subset of actor languages. With the rise of JavaScript, E’s notion of vats and non-blocking communication based on promises entered the mainstream. For implementations, the combination of dynamic typing, asynchronous message sending, and promise resolution pose new optimization challenges.

This paper discusses these challenges and presents initial experiments for a Newspeak implementation based on the Truffle framework. Our implementation is on average 1.65x slower than Java on a set of 14 benchmarks. Initial optimizations improve the performance of asynchronous messages and reduce the cost of encapsulation on microbenchmarks by about 2x. Parallel actor benchmarks further show that the system scales based on the workload characteristics. Thus, we conclude that Truffle is a promising platform also for communicating event-loop languages.

  • Optimizing Communicating Event-Loop Languages with Truffle; Stefan Marr, Hanspeter Mössenböck; Presentation at the AGERE!’15 Workshop, co-located with SPLASH’15.
  • Paper: PDF, HTML
  • BibTex: BibSonomy

Slides