Irrationally Annoyed: The SIGPLAN Blog Post Writing 30 Years of PL Research Out of Existence
Though, after multiple rounds of edits, I now assume there is simply a lack of awareness. What follows is therefore an attempt to raise awareness of the various academic, but also industrial, efforts around the performance of dynamic languages.
Disclaimer: this blog post isn’t attempting to be complete in any way, but I hope it provides enough pointers for people interested in language implementation technology for dynamic languages to go deeper and possibly even contribute.
Languages vs. Language Implementations
Before going into technical details, it is important to distinguish between a language and its implementation. Especially when we talk about programming language research, the two are rarely the same.
While the languages mentioned in the post (Python, R, PHP) all have “standard” implementations, which are by far the most widely used ones, they are far from the only implementations. Let’s just pick three for each language: there are PyPy, IronPython, and Jython for Python; pqR, Renjin, and FastR for R; and HippyVM, PeachPie, and Quercus for PHP.
How do I know? I didn’t benchmark all of these language implementations myself, but I worked on comparing compilers across languages in the Are We Fast Yet project. One important insight from that work is that language design can definitely have a performance impact. But the ingenuity of compiler writers and language implementers has eliminated all major performance hurdles over the last 30 years.
This means that claims about the languages themselves, for instance that they are “all notoriously slow and bloated” and “not designed to be fast or space-efficient”, are rather misguided.
Though, one may argue that these techniques have not been successful enough, because people don’t actually use the language implementations that benefit from them.
Fine, but I believe this is only partially a technical problem. Some of it is a social issue, too.
Others have spent time identifying potential reasons why people do not pick up our research. For instance, Laurie wrote blog posts on why not more users are happy with our VMs [1, 2].
Is Python doomed? Of course not!
So, what about Python then? The blog post repeats a claim that Python is incredibly slow when it comes to matrix multiplication. And later, without much context, it claims that PyPy is perhaps 2 times faster than Python, but sometimes slower.
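For context, the kind of code such claims usually measure is a naive triple-loop multiply over plain lists. A minimal sketch (the function name and test matrices are mine, not from the cited benchmark):

```python
def matmul(a, b):
    """Naive triple-loop matrix multiply over lists of lists."""
    n, m, p = len(a), len(b), len(b[0])
    result = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            s = 0
            for k in range(m):
                s += a[i][k] * b[k][j]
            result[i][j] = s
    return result

print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# → [[19, 22], [43, 50]]
```

On CPython, every iteration of that inner loop pays for interpretation and boxed arithmetic; a tracing JIT like PyPy’s compiles it to machine code, which is why a blanket “perhaps 2 times faster” figure undersells it, even without auto-vectorization.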
One could have asked the PyPy community, or looked at their blog. As far as I know, PyPy doesn’t have auto-vectorization, so it indeed has a hard time reaching the performance of vectorized code, but it is much faster than what’s implied. Such broad claims are not just unjustified and bad style; they are also painfully unscientific. No, Python is not necessarily slow and bloated. Maps, hidden classes, and object shapes make it possible to store even integers rather efficiently. With the previously mentioned storage strategies, this works extremely well for your huge arrays of integers, too.
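To make the storage point concrete: storage strategies let a VM keep a homogeneous list of integers as raw machine words instead of boxed objects, and switch representation only when a non-integer shows up. CPython’s standard array module exposes the same unboxing idea explicitly, which lets us sketch the saving (the exact byte counts are implementation-dependent, and the boxed total is approximate because CPython shares small integer objects):

```python
import array
import sys

n = 100_000
boxed = list(range(n))                # every element is a full Python int object
unboxed = array.array('q', range(n))  # raw 64-bit machine words, no boxing

# Approximate footprint: the list's pointer array plus each int object.
boxed_bytes = sys.getsizeof(boxed) + sum(sys.getsizeof(x) for x in boxed)
unboxed_bytes = sys.getsizeof(unboxed)
print(boxed_bytes, unboxed_bytes)
```

A VM with storage strategies gets roughly the unboxed representation automatically for plain lists, without the programmer opting in.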
Are Dynamic Languages Single-Threaded?
This one, I do indeed take personally.
First of all, being effectively single-threaded can be a Good Thing™. Why would you want to deal with shared-memory race conditions? This is a language design question, and so, in the end, a matter of taste. And, if one really wants to, some forms of shared memory are possible, even without introducing all of its drawbacks.
That aside, Python, R, and others have fairly nice ways of supporting parallelism, including multiprocess approaches. And, to blow my own horn a little more: if you really want shared-memory multithreading, you can have that efficiently, too, with our work on thread-safe object models and thread-safe storage strategies.
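As a concrete example of the multiprocess route, CPython’s standard multiprocessing module sidesteps the global interpreter lock by giving each worker its own interpreter process. The prime-counting workload here is just an illustrative stand-in for CPU-bound work:

```python
from multiprocessing import Pool

def count_primes(limit):
    """CPU-bound work: count the primes below limit, naively."""
    count = 0
    for k in range(2, limit):
        if all(k % d for d in range(2, int(k ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    # Each worker is a separate process with its own interpreter,
    # so the four calls run in parallel despite the GIL.
    with Pool(4) as pool:
        print(pool.map(count_primes, [50_000] * 4))
```

No shared mutable state crosses process boundaries here, which is exactly the kind of race-free parallelism the design favors.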
Where to go from here?
To sum up: yes, metal languages will continue to be important, but so will the irrational exuberance languages. We as a community should lead the way in developing systems (at the hardware and software levels) that will make them run faster.
Yes, we should. And the PL community spent a lot of effort doing this for the last 30 years. So, please join us. We are out there and could use help!
MPLR has a keynote coming up titled “Hardware Support for Managed Languages: An Old Idea Whose Time Has Finally Come?”. Perhaps worth registering for. Attendance is free.
And of course, SPLASH is around the corner, too. With DLS and VMIL, SPLASH has a strong history of providing venues for discussing the implementation of dynamic languages. OOPSLA, as in the past, has pretty cool papers on dynamic language implementation.
Other relevant venues include, for instance, ICOOOLPS and MoreVMs.
My Introduction to Efficient and Safe Implementations of Dynamic Languages has many pointers to research, some of which also made it into an early overview of the field.
For questions or differences of opinion, find me on Twitter @smarr.
Part of my annoyance with the blog post is its dismissive and condescending tone.
Some of these so-called “scripting languages” are essentially moribund, like Perl (1987)
This seems a completely unnecessary point to make. It is also unclear how this is assessed. The Perl community is active, though yes, a lot of effort went into Raku.
Post Post Scriptum
Python programmers need to distinguish between time and memory spent in pure Python (optimizable) from time and memory spent in C libraries (not so much). They need help tracking down expensive and insidious traffic across the language boundaries (copying and serialization).
We could also try to get rid of the boundary instead. Or give more power to people like Stephen Kell, and avoid copying and serialization in a different way.