A 10-Year Journey, Stop 5: Growing the SOM Family
SOM, the Simple Object Machine, has been a steady companion for much of my research. As mentioned earlier, all this work on virtual machines started for me with CSOM, a C-based implementation of a simple Smalltalk language. From the beginning, SOM was meant as a vehicle for teaching language implementation techniques as well as for doing research on related topics. As such, it is kept simple. The interpreter implementations do not aim to be fast. Instead, concepts are supposed to be expressed explicitly and consistently, so that the code base is accessible for students. Similarly, the language is kept simple and includes dynamic typing, objects, classes, closures, and non-local returns. With these features, the core of typical object-oriented languages is easily covered. One might wonder about exceptions, but their dynamic semantics are very similar to those of non-local returns and are thus covered, too.
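To give a flavor of the language, here is a small sketch in SOM syntax; the class and selector names are made up for illustration. The ^ inside the block returns from the enclosing method, not just from the block. That is the non-local return, and exceptions rely on the same kind of stack unwinding:

    Finder = (
        | items |

        contains: needle = (
            "The ^ inside the block below is a non-local return:
             it exits contains: itself, unwinding the do: loop,
             much like throwing an exception would."
            items do: [ :each |
                each = needle ifTrue: [ ^true ] ].
            ^false
        )
    )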
Originally, SOM was implemented in Java. Later, CSOM, SOM++ (a C++-based implementation), and AweSOM (a Smalltalk-based implementation) joined the family. Some of this history is documented on the old project pages at the HPI, where much of this work was done.
When I picked up maintaining the SOM family for my own purposes, I added PySOM, a Python-based implementation, and JsSOM, implemented in JavaScript. As part of the work on building fast language implementations, I also added TruffleSOM, a SOM implementation using the Truffle framework on top of the JVM, as well as RPySOM, an RPython-based bytecode interpreter for SOM, and RTruffleSOM, a Truffle-like AST interpreter implemented in RPython.
A Fast SOM
For TruffleSOM and RTruffleSOM, the focus was on performance. This means clarity and simplicity got somewhat compromised. In the code base, that's usually visible in the form of concepts being implemented multiple times to cover different use cases or special cases. Otherwise, the language features haven't really changed. The only thing that got extended is the set of basic operations implemented for SOM, which we call primitives, i.e., built-in operations such as basic mathematical operations, bit operations, and similar things that either cannot be expressed in the language or are hard to express efficiently.
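For illustration, in SOM a primitive is declared in the Smalltalk code as a method whose body is just the keyword primitive; the VM then supplies the actual implementation in the host language. A sketch along the lines of the core library's Integer class (the exact set of selectors varies between implementations):

    Integer = (
        "Declared here in Smalltalk, but implemented
         by the VM in the host language."
        + argument = primitive
        & argument = primitive
        sqrt = primitive
    )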
The main reason to extend SOM's set of primitives was to support a wide set of benchmarks. With Are We Fast Yet, I started a project to compare the performance of a common set of object-oriented language features across different programming languages. One of the main goals for me was to be able to understand how fast TruffleSOM and RTruffleSOM are, for instance compared to state-of-the-art Java or JavaScript VMs.
Well, let’s have a look at the results:
The figure shows the performance of various SOM implementations relative to Java 1.8, i.e., the HotSpot C2 compiler. To be specific, it shows peak performance, discounting warmup and compilation cost. As another reference point for a common dynamic language, I also included Node.js 8.1 as a JavaScript VM.
As the numbers show, TruffleSOM and RTruffleSOM reach about the same performance on these benchmarks. Compared to Java, both are about 2-4x slower. Looking at the results for Node.js, I would argue that I managed to reach the performance of state-of-the-art dynamic language VMs with my little interpreters.
The simple SOM implementations are much slower, however. SOM and SOM++ are about 500x slower than Java. That is quite a bit slower even than the Java interpreter, which is only about 10-50x slower than just-in-time-compiled and highly optimized Java. The slowness of SOM and SOM++ is very much expected because of their focus on teaching. Besides the many small design choices that are not optimal for performance, there is also the bytecode set, which is designed to be fairly minimal and thus causes a high overhead compared to the optimized bytecode sets used by Java, Ruby, or Smalltalk-80.
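To see where that overhead comes from, consider a simple expression like 1 + 2. With a minimal bytecode set, even integer addition is a generic message send that needs a method lookup, while the JVM has a dedicated iadd instruction for it. Roughly, and using the instruction names of SOM's bytecode set, the expression compiles to:

    PUSH_CONSTANT 1     "push the literal 1 onto the operand stack"
    PUSH_CONSTANT 2     "push the literal 2"
    SEND +              "a full, dynamically dispatched send of #+"

Every such send goes through the generic lookup and invocation machinery, which is exactly what the optimized bytecode sets of Java, Ruby, or Smalltalk-80 avoid for common operations.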
Making SOM++ Fast with Eclipse OMR
As shown with TruffleSOM and RTruffleSOM, meta-compilation approaches are one possible way to gain state-of-the-art performance. Another promising approach is to reuse existing VM technology in the form of components to improve existing systems. One of the most interesting systems in that field is currently Eclipse OMR. The goal of this project, which is currently driven by IBM, is to enable languages such as Ruby or Python to use the technology behind IBM's J9 Java Virtual Machine. At some point, they decided to pick up SOM++ as a showcase for their technology. They first integrated their garbage collector, and later added some basic support for their JIT compiler. My understanding is that it currently compiles each bytecode of a SOM method into the J9 IR using the JitBuilder project, which allows a little bit of inlining but does not do much optimization. The result is a 4-5x speedup over the basic SOM++ interpreter. For someone implementing languages, such a speedup is great and nothing to sneeze at, even if we start from a super slow system. As a result, you reach the performance of optimized interpreters while still maintaining a minimal bytecode set and the general simplicity of the system. Of course, minus the complexity of the JIT compiler itself.
To reach the same performance as TruffleSOM and RTruffleSOM, there is quite a bit more work to be done. I'd guess SOM++ OMR would need more profiling information to guide the JIT compiler, and it probably will also need a few other tricks, such as an efficient object model and stack representation, to really achieve the same speed. But anyway, to me it is super cool to see someone else picking up SOM for their purposes and building something new with it 🙂.
Other Uses of SOM
And while we are at it: over the years, a few other projects spun off from SOM. There was NXTalk for the Lego Mindstorms system. My own ActorSOM++ implemented a simple actor language as part of SOM. And more recently, there is SOMns, a Newspeak implementation derived from TruffleSOM. You might have noticed that it's actually a bit faster than TruffleSOM itself :) And it supports all kinds of concurrency models: actors, CSP, STM, fork/join, as well as classic threads and locks.
Similar to SOM++ OMR, the Mu Micro VM project picked up a SOM implementation to showcase their own technology. Specifically, they used RPySOM, the RPython-based bytecode interpreter mentioned above, for their experiments.
Guido Chari forked TruffleSOM to build TruffleMate and experiment with making really all parts of a language runtime reflectively accessible, while maintaining excellent performance.
And last, but not least, Richard Roberts is currently working on a Grace implementation on top of SOMns.
So there are quite a few things happening around SOM and its various offspring. I hope the next 10 years are going to be as much fun as the last.
And with that, I'll end this series of blog posts. If you're interested in learning more, check out the community section on the SOM homepage, ask me on Twitter @smarr, or send me an email.