Towards a Synthetic Benchmark to Assess VM Startup, Warmup, and Cold-Code Performance

One of the hard problems in language implementation research is benchmarking. Some people argue that we should benchmark only applications that actually matter to people. This, however, has various issues. Often, such applications are embedded in larger systems, and it’s hard to isolate the relevant parts. In many cases, these applications also cannot be made available to other researchers. And, of course, things change over time, which means maintaining projects like DaCapo, Renaissance, or JetStream is a huge effort.

Which brought me to the perhaps futile question of how we could have more realistic synthetic benchmarks. ACDC and ACDC-JS are synthetic garbage collection benchmarks. While they don’t seem to be widely used, they seem to have been a useful tool for specific tasks. Based on observing metrics for a range of relevant programs, these synthetic benchmarks were constructed to be configurable and allow us to measure a range of realistic behaviors.

I am currently interested in the startup, warmup, and cold-code performance of virtual machines, and want to study their performance issues. To me it seems that I need to look at large programs to get interesting and relevant results. By large, I mean millions of lines of code, because that’s where our systems currently struggle. So, how could we go about creating a synthetic benchmark for huge code bases?

Generating Random Code that Looks Real

In my last two blog posts [1, 2], I looked at the shape of large code bases in Pharo and Ruby to obtain data for a new kind of synthetic benchmark. I want to try to generate random code that looks “real”. And by looking real I mean for the moment that it is shaped in a way that is similar to real code. This means methods have a realistic length and a realistic number of local variables and arguments. Classes should have a realistic number of methods and instance variables. In the last two blog posts, I looked at the corresponding static code metrics to get an idea of what large code bases actually look like.

In this post, I am setting out to use the data to get a random number generator that can be used to generate “realistic looking” code bases. Of course, this doesn’t mean that the code does anything realistic.

Small steps… One at a time… 👨🏼‍🔬

So, let’s get started by looking at what the length of methods looks like in large code bases.

Before we get started, just one more simplification: I will only consider methods that have 1 to 30 lines (of code). Setting an upper bound will make some of the steps here simpler, and plots more legible.

And, perhaps a little silly but nonetheless an issue, it spares me from having to change one of the language implementations I am interested in, which is unfortunately limited to 128 bytecodes and 128 literals (constants, method names, etc.), which in practice translates to something like 30 lines of code. While this could be fixed, let’s assume 30 lines of code per method ought to be enough for anybody…

Length of Methods

When it comes to measuring the length of methods, there are plenty of possible ways to go about it. Pharo counts the non-empty lines. And for Ruby, I counted either all lines, or only the lines that are neither empty nor just comments.

The histogram below shows the results for methods with 1-30 lines.

Distribution of the length of methods.

Despite the differences in languages and metrics, we see a pretty similar shape, perhaps with the exception of the methods with 1-3 lines.

Generating Realistic Method Length from Uniform Random Numbers

Before actually generating method length randomly, let’s define the goal a bit more clearly.

In the end, I do want to be able to generate a code base where the length of methods has a distribution very similar to what we see for Ruby and Pharo.

Though, the random number generators we have in most systems generate uniformly distributed numbers, typically in the range from 0 to 1. This means each number between 0 and 1 is equally likely to be picked. To get other kinds of distributions, for instance the normal distribution, we can use what is called the inverse cumulative distribution function. When we feed our uniformly distributed numbers into this function, we end up with random numbers that are distributed according to the distribution that we want.
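As a small illustration of the idea, here is a Python sketch for the exponential distribution. I picked it only because its inverse cumulative distribution function has a simple closed form. The function name and the rate parameter are made up for this example and are not part of the analysis in this post.

```python
import math
import random


def exponential_from_u(u, rate):
    """Map a uniform number u in [0, 1) to an exponentially distributed one.

    The cumulative distribution function is F(x) = 1 - exp(-rate * x).
    Solving F(x) = u for x gives the inverse used below.
    """
    return -math.log(1.0 - u) / rate


rng = random.Random(42)  # fixed seed, only to make the run repeatable
samples = [exponential_from_u(rng.random(), rate=2.0) for _ in range(100_000)]

# The mean of an exponential distribution is 1 / rate,
# so the sample mean should be close to 0.5.
sample_mean = sum(samples) / len(samples)
print(sample_mean)
```

For method lengths, no such closed form is at hand, which is why the rest of this post works with a lookup table instead.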

One of the options to do this would be:

  1. determine the cumulative distribution of the method length
  2. approximate a function to represent the cumulative distribution
  3. and invert the function

I found this post here helpful. Though, I struggled defining a good enough function to get results I liked.

So, instead, let’s do it the pedestrian way:

  1. calculate the cumulative sum of the number of methods per method length (cumulative distribution)
  2. normalize it by the total number of methods
  3. use the result to look up the desired method length for a uniform random number
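A minimal Python sketch of these three steps could look as follows. The counts are made up for illustration, and the function names are mine. The real tables in this post come from the Pharo and Ruby data.

```python
from itertools import accumulate


def cumulative_distribution(counts):
    """Turn per-length method counts into a cumulative distribution.

    counts[i] is the number of methods with length i + 1.
    """
    total = sum(counts)
    return [running / total for running in accumulate(counts)]


def length_from_u(u, cum_dist):
    """Return the first length whose cumulative share is larger than u."""
    for length, share in enumerate(cum_dist, start=1):
        if u < share:
            return length
    return len(cum_dist)  # only reached if u >= 1.0


# Made-up counts for methods of length 1 to 4, for illustration only.
example_counts = [10, 40, 30, 20]
example_cum = cumulative_distribution(example_counts)
print(example_cum)
print(length_from_u(0.05, example_cum), length_from_u(0.75, example_cum))
```

The linear scan is the simplest possible lookup. With only a handful of entries per table, it is unlikely to matter much.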

Determining the Cumulative Distribution

Ok, so, the first step is to determine the cumulative distribution. Since we have the three different cases for Pharo, Ruby with all lines and lines of code, this is slightly more interesting.

Percentage of methods with a specific length in lines or LOC.

The plot above shows the percentage of methods that have a length smaller than or equal to a specific size.

So, the next question is, which metrics should I choose? Since the data is a bit noisy, especially for small methods, let’s try and see what the different types of means give us.

Different means applied to the cumulative percentage of methods for a given length.

From the above plot, the geometric mean seems a good option, mostly because I want neither too high nor too low a number of methods with a single line.
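To make the combination step concrete, here is a Python sketch that applies the geometric mean position by position to several cumulative tables. The three input tables are hypothetical and much shorter than the real ones, and the helper names are my own.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def combine(tables):
    """Combine cumulative tables of equal size, position by position."""
    return [geometric_mean(column) for column in zip(*tables)]


# Hypothetical cumulative percentages for lengths 1 to 3 from three corpora.
corpus_a = [0.02, 0.30, 1.0]
corpus_b = [0.08, 0.25, 1.0]
corpus_c = [0.09, 0.24, 1.0]

combined = combine([corpus_a, corpus_b, corpus_c])
print(combined)
```

A convenient property is that the result is still usable as a cumulative table: if every input is non-decreasing and ends at 1, so is the combination.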

Using the geometric mean gives us the following partial cumulative distribution table:

length cum.perc
1 0.0520631
2 0.2647386
3 0.5137505
4 0.6068893
5 0.6851248
6 0.7377313
7 0.7861208
8 0.8200427
9 0.8490800

In R, the language I use for these blog posts, I can then use something like the following to take a uniform random number from the range of 0-1 to determine the desired method length in the range of 1-30 lines (u being here the random number):

loc_from_u <- function (u) {
  # Position returns the index of the first element for which u < e holds
  Position(function (e) { u < e }, cumulative_distribution_tbl)
}

There are probably more efficient ways of going about it. I suppose a binary search would be a good option, too.

The general idea is that with our random number u, we find the first position in our array with the cumulative distribution where u is smaller than the value in the array at that position. The position then corresponds to the desired length of a method.

Three examples of methods generated, for 100, 1,000, or 100,000 methods.

As a test, the three plots above are generated from 100, 1,000, and 100,000 uniformly distributed random numbers, and it looks pretty good. Comparing to the very first set of plots in this post, this seems like a workable and relatively straightforward approach.

To use these results and generate methods of realistic sizes in other languages, the full cumulative distribution is as follows: [0.0520631241473676, 0.264738601144803, 0.51375051909561, 0.606889305644881, 0.685124787391578, 0.737731315373305, 0.786120782303596, 0.820042695503066, 0.849080035429476, 0.872949669419948, 0.893437469528804, 0.909716501217452, 0.923766913966731, 0.935357118689879, 0.945074445934092, 0.953001059092301, 0.959743413722937, 0.965457396618992, 0.97072530951053, 0.975142363341172, 0.979105371695575, 0.982654867280203, 0.985723232507825, 0.988399223471247, 0.990960559703172, 0.993172997617124, 0.9951015059492, 0.996855138214434, 0.998541672752458, 1].
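As one example of how this could look in another language, here is a Python sketch that uses the table above together with a binary search from the standard bisect module. This is a port of the idea, not the code used for the plots in this post, and the guard for u = 1.0 is my own addition.

```python
from bisect import bisect_right
import random

# The cumulative distribution from above, for method lengths 1 to 30.
CUMULATIVE_DISTRIBUTION = [
    0.0520631241473676, 0.264738601144803, 0.51375051909561,
    0.606889305644881, 0.685124787391578, 0.737731315373305,
    0.786120782303596, 0.820042695503066, 0.849080035429476,
    0.872949669419948, 0.893437469528804, 0.909716501217452,
    0.923766913966731, 0.935357118689879, 0.945074445934092,
    0.953001059092301, 0.959743413722937, 0.965457396618992,
    0.97072530951053, 0.975142363341172, 0.979105371695575,
    0.982654867280203, 0.985723232507825, 0.988399223471247,
    0.990960559703172, 0.993172997617124, 0.9951015059492,
    0.996855138214434, 0.998541672752458, 1,
]


def loc_from_u(u):
    """Return a method length in 1..30 for a uniform number u in [0, 1]."""
    # bisect_right gives the index of the first entry that is larger than u.
    index = bisect_right(CUMULATIVE_DISTRIBUTION, u)
    # Guard against u == 1.0, which would fall past the end of the table.
    return min(index, len(CUMULATIVE_DISTRIBUTION) - 1) + 1


rng = random.Random(1)  # fixed seed, only to make the run repeatable
lengths = [loc_from_u(rng.random()) for _ in range(10)]
print(lengths)
```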

Method Arguments and Local Variables

With the basics down, we can look at the number of arguments and local variables of methods. One thing I haven’t really thought about in the previous posts is that there’s a connection between the various metrics. They are not independent of each other.

Perhaps this is most intuitive for the number of local variables a method has. We wouldn’t expect a method with a single line of code to have many local variables, while longer methods may tend to have more local variables, too.

Number of Method Arguments

Let’s start out by looking at how method length and number of arguments relate to each other.

I’ll use the cumulative distribution for these plots, since that’s what I am looking for in the end.

Percentage of methods with a specific number of arguments.

The two plots above show one line for each method length from 1 to 30 (so, this is where limiting the method length actually becomes handy). Since there are many lines, I highlight only every third length, including length 1. The bluest blue is length 1, and the red is length 30.

We can see here differences between the languages. For instance, for methods with only 1 line in Pharo, only ≈45% of them have no argument. While for Ruby methods, that’s perhaps around 70%.

The other interesting bit that is clearly visible is that the number of arguments doesn’t have a simple direct relationship to length. Indeed, longer methods seem more likely to have fewer arguments, while medium-length methods are more likely to have a few more arguments. At least for the Ruby data this seems to be the case.

So, from these plots, I conclude that I actually need a different cumulative distribution table for each method length. Since we saw how they look for method length, I won’t include the details. Though, of course happy to share the data if anyone wants it.
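A sketch of what one table per method length could look like in Python is shown below. All numbers in the tables are invented for illustration, and the names are hypothetical. The real tables would come from the Pharo or Ruby data.

```python
from bisect import bisect_right
import random

# Hypothetical tables: for each method length, the cumulative share of
# methods with 0, 1, 2, ... arguments. The numbers are made up.
ARGS_BY_LENGTH = {
    1: [0.70, 0.95, 1.0],
    2: [0.55, 0.85, 0.97, 1.0],
    3: [0.50, 0.80, 0.95, 1.0],
}


def pick(cum_dist, u):
    """Index of the first entry in cum_dist that is larger than u."""
    return min(bisect_right(cum_dist, u), len(cum_dist) - 1)


def num_args_for(length, u):
    """Number of arguments for a method of the given length."""
    return pick(ARGS_BY_LENGTH[length], u)


rng = random.Random(7)  # fixed seed, only to make the run repeatable
for length in (1, 2, 3):
    print(length, num_args_for(length, rng.random()))
```

A generator would first draw the method length, and then use a second random number with the table for that length to draw the number of arguments.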

Number of Method Locals

Next up, let’s look at the number of locals.

Percentage of methods with a specific number of local variables.

For the Pharo data, it’s not super readable, but basically 100% of methods of length 1 have zero local variables. Compared to the plot on arguments, we also see a pretty direct relationship to length, because the blue-to-red gradient comes out nicely in the plot.

In the case of Ruby, this seems to be similar, but perhaps not as clean as for Pharo. The different y-axis start points are also interesting, because they indicate that longer methods in Pharo are more likely to have local variables than in Ruby.

For generating code, I suppose one needs to select the distributions that are most relevant for one’s goal.

Classes: Number of Methods and Fields

After looking at properties for methods, let’s look at classes. I fear, these various metrics are pretty tangled up, and one could probably find many more interesting relationships between them, but I’ll restrict myself for this post to the most basic ones. First I’ll look at the cumulative distribution for the number of methods per class, and then look at the number of instance variables classes have depending on their size.

Number of Methods per Class

I’ll restrict my analysis here to classes with a maximum of 100 methods, because the data I have does not include enough classes with more than 100 methods.

Percentage of classes with a specific number of methods.

As we saw in the previous post, we can see here that Ruby has many more classes with only one or two methods. On the other hand, it seems to have slightly fewer larger classes. Expressed differently, about 60% of all Ruby methods (which includes closures) have 1 or 2 arguments, while in the case of Pharo (where closures were ignored), we need about 5-6 arguments to reach the same level.

Number of Fields for Classes with a Specific Number of Methods

For the number of fields of a class, I can easily look at the relation to the number of methods, too.

Percentage of classes with a specific number of fields.

In the two plots above, we see that there is an almost clear relationship between the number of methods and fields. A class having more methods seems to indicate that it may have more fields. For both languages, there’s some middle ground where things are not as clear, but at least for classes with fewer methods, it seems to hold well.

Conclusion

The most important insight for me from this exercise is that I can generate code that has a realistic shape relatively straightforwardly based on the data collected.

At least, it seems easy and reliable to get a random number distribution of the desired shape.

In addition, we saw that there are indeed interdependencies between the different metrics. This is not too surprising, but something one needs to keep in mind when generating “realistic” code.

So, where from here? Well, I already have a code generator that can generate millions of lines of code that use basic arithmetic operations. The next step would be to fit this code into a shape that’s more realistic. One problem I had before is that my generated code started stressing things like the method lookup in a way real code doesn’t. Shaping things more realistically will help avoid optimizing things that may not matter. Then again, we see pathological cases also in real code.

Ending this post, there are of course more open questions, for instance:

  • how do I generate realistic behavior?
  • do large chunks of generated code allow me to study warmup and cold-code performance in a meaningful way? Or asked differently, does the generated code behave similarly enough to real code?
  • which other metrics are relevant to generate realistic code?

Though, I fear, I’ll need to wait until spring or summer to revisit those questions.

For suggestions, comments, or questions, find me on Twitter @smarr.

The Shape of 6M Lines of Ruby

Following up on my last blog post, I am going to look at how Ruby is used to get a bit of an impression of whether there are major differences between Ruby and Smalltalk in their usage.

Again, I am going to look into the structural aspects of code bases. This means, looking at classes, methods, modules, and files.

Methodology

Not being a Ruby expert, I searched for large Ruby on Rails applications that could be of relevance. I found 10 that sounded promising: Diaspora, Discourse, Errbit, Fat Free CRM, GitLab, Kandan, Redmine, Refinery CMS, Selfstarter, Spree.

For each, I checked out the git repository (see version detail in appendix), and installed the Gems in a local directory. Since there’s a lot of overlap, I moved all gems into a single directory, and only kept the latest version to avoid counting the same, or sufficiently similar code multiple times.

With these projects and their dependencies, I had in the end 10 projects and 861 gems. Looking exclusively at the *.rb files, the analysis considered 50,865 files, with a total of 6,081,070 lines.

To analyze the code, I am building on top of the parser gem. The code to determine the statistics can be found in the ruby-stats project on GitHub.

Size of the Overall Code Base

Looking at the 50,865 files with their overall 6,081,070 lines, the first thing I noticed is that only about 64% of the lines are code, i.e., they are not empty and are not just comments. However, only 46% of all lines are attributed to some form of method or closure, which seemed unexpected to me.

2% of the code lines are simply in the direct body of a file, 6% are directly in modules, and 11% are directly in classes. And there are 19 gems that don’t define a single method or closure. The examples I looked at looked like either meta gems, including others (rspec), gems with JavaScript (babel-source), or data (mime-types-data).

In total, there are 625,761 methods (incl. closures), 32,897 classes, and 11,057 modules defined in all projects.

Of the 50,865 files, 12,150 were classified as tests, for which I more or less checked whether the file name or path contains a variant of “test” or “spec”.
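Such a check could be sketched in Python roughly as follows. This is an assumption about what “a variant of test or spec” means, not the exact rule implemented in ruby-stats.

```python
import re

# Assumption: a file counts as a test if "test" or "spec" appears anywhere
# in its path, ignoring case. The real check may differ in its details.
TEST_PATTERN = re.compile(r"test|spec", re.IGNORECASE)


def is_test_file(path):
    """Crude classification of a file as test code, based on its path."""
    return TEST_PATTERN.search(path) is not None


print(is_test_file("spec/models/user_spec.rb"))
print(is_test_file("lib/parser/lexer.rb"))
print(is_test_file("lib/inspector.rb"))
```

Note that a check this simple also matches paths such as lib/inspector.rb, which is one reason to read the number of test files as an approximation.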

To get an impression of how files and classes are used by projects, let’s look at the number of files per project as a histogram:

Number of files per project.

The histograms show how many projects have a specific number of files in them. There are fewer than 20 projects with just a single *.rb file. The largest project is GitLab with more than 9,500 files. The next project is Discourse, with about 3,300 files.

Number of classes per project.

When looking at classes, 823 out of 871 projects have at least one. In the histogram above, we can see that most of the projects that have classes, have indeed rather few of them. Discourse with about 2,000 classes and GitLab with about 3,000 classes again have the most.

Number of modules per project.

The use of modules seems to be somewhat similar as we can see in the histograms above.

When looking at methods per project, we see that the results look a bit different. There also seem to be some strange patterns and spikes, especially in the range from 1 to 100 methods per project.

Number of methods per project

Structure of Classes

When looking at the defined classes, we can see in the following histograms that there are many classes that have no or very few methods.

Number of methods per class

However, there’s also a bunch of classes with more than 200 methods. Most of these classes are for Ruby parsers of the different versions of Ruby. Others are unit test classes in the Redmine project.

While Ruby and Smalltalk are two very different programming systems, the languages have some similarities. So, let’s see whether classes have a similar number of methods:

The percent of classes with a given number of methods.

The above plot is similar to our histograms before. But instead of showing the number of classes, it shows the percent of classes with a specific number of methods. By normalizing the values, we can more easily compare between the two corpora. Just to make the semantics of the plot clear: the lengths of all bars together add up to 100% for Ruby and Pharo separately.
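The normalization itself is simple. Here is a Python sketch with made-up data, where each list holds the number of methods of one class. The function name and the numbers are only for illustration.

```python
from collections import Counter


def percent_histogram(values):
    """Map each distinct value to the percentage of items that have it."""
    counts = Counter(values)
    total = len(values)
    return {value: 100.0 * counts[value] / total for value in sorted(counts)}


# Made-up methods-per-class data for two corpora of different size.
small_corpus = [1, 1, 2, 5]
large_corpus = [2, 2, 3, 4, 4, 4, 10, 12]

print(percent_histogram(small_corpus))
print(percent_histogram(large_corpus))
```

Since each histogram adds up to 100% on its own, the two corpora can be compared even though they contain a very different number of classes.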

One artifact of how the data is collected is that Pharo does not show any classes without methods, because I collected it per method, and didn’t get details for classes separately.

The major difference we can see is that Ruby has many more classes with only one or two methods. On the other hand, it seems to have slightly fewer large classes, but then also ends up having a few really large classes. As mentioned previously, the really large classes grouping around 430-ish methods are all variants of Ruby parsers. I’d assume there to be a large amount of code duplication between those classes.

Number of direct fields in a class.

The histograms above show how many classes have a specific number of fields that they access directly, for instance, with expressions like @count.

We can see that an overwhelming number of classes do not access fields at all, which seems a bit surprising to me. Though, there also seem to be a number of classes that have many fields. The two largest classes have 180 (RBPDF) and 52 distinct fields (csv.Parser).

I’ll refrain from a direct comparison with Pharo here, because it’s not really clear to me how to do this in a comparable way. The only way that would seem somewhat comparable would be to build the inheritance hierarchy, and resolve mixins, but so far, I haven’t implemented either.

Number of class fields, i.e., static fields per class.

However, we can look at the use of class variables with the double-at syntax: @@count. Here, the situation looks very different. Only 189 classes have one class field, and only 61 have more than one. This means, 117,079 classes don’t use class fields at all.

Structure of Methods

After looking at classes, let’s investigate the methods a bit closer.

Lines of code per method.

Let’s first look at the lines of code per method. This means, at how many non-empty lines there are that do not only contain comments.
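To make the metric concrete, here is a simplified Python sketch of such a count. It works on plain text and only recognizes lines that consist of a # comment, so it ignores =begin/=end blocks and anything a real parser would know. The actual analysis builds on the parser gem. Whether the def and end lines count towards a method is also an assumption of this example.

```python
def count_lines_of_code(source):
    """Count lines that are neither blank nor only a '#' comment."""
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            count += 1
    return count


# A small Ruby method, embedded as text for the example.
example = """def total(items)
  # sum up all prices

  items.sum { |i| i.price }
end
"""

print(len(example.splitlines()))     # all lines, incl. blank and comments
print(count_lines_of_code(example))  # lines of code only
```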

There don’t seem to be any empty methods, but there are almost an equal number of methods with 1 (87,217) or 2 lines of code (82,467).

The largest method in the corpus is parser.Lexer.advance. I suppose, unsurprisingly that’s the Ruby parser again with 8,888 lines of code. It also has 55 local variables.

The other methods with over 3,000 lines of code are actually blocks in specs. There are 5 of them in the mongoid gem, one in grape, and one in Discourse.

Number of lines per method (incl. blank lines and comments).

When looking at the data for method length in lines, which also counts blank lines and comments, the results seem a bit wonky. From the previous results, I would expect that there are no empty methods, which indeed is the case.

Then we got 88,803 methods with just one line, which seems in line with expectations. However, we got 2,267 methods with two lines, and 185,876 methods with three lines, which seems a little odd. Perhaps there is some code formatting convention at play.

The rest looks reasonably similar to the lines of code results. The huge methods are again the parser this time with 12,619 lines, and the spec blocks.

Comparing to Pharo is a little bit of an issue, because neither the line count nor the lines of code metric match what Pharo gives me. Pharo reports the number of non-empty lines, including comments. So, Pharo’s metric is somewhere between the lines and lines of code I got here for Ruby.

Percentage of methods with a specific length in lines or LOC.

While the metrics are not identical, having both the lines and lines of code for Ruby lets us draw at least one conclusion from the comparison. There seems to be a tendency for longer methods in Ruby. At least in the range from 30 to 250 lines, there seem to be more methods with this size in Ruby.

Percent of methods with a specific number of arguments.

When it comes to arguments, Ruby seems to have a few more methods/blocks/lambdas without any argument, but a bit fewer with one argument. When it comes to methods/blocks/lambdas with many arguments, Pharo seems to have a few more of those. Though, the numbers here are not entirely comparable, because the Pharo numbers do not actually include blocks/closures.

The Ruby methods with the largest number of arguments (16 and 17) are RBPDF.Text and RBPDF.Image.

Percent of methods with a specific number of local variables.

In both languages, a lot of methods don’t have any local variables. However, in the Ruby corpus there are three methods with more than 50 local variables. That is the very long Lexer.advance method in the Ruby parser, a Markdown code processing method, and RBPDF’s writeHTML method.

Conclusion

For me, the main takeaway from this exercise is that when it comes to structural metrics, there are visible differences between Ruby and Pharo code. This isn’t surprising, since they are different languages, with different features, communities, and style guides.

However, there also seem to be similarities that are worth noting. Overall, the number of methods in a class seems to be fairly similar. And while Ruby methods might have a small tendency of being larger when they are large, the majority of methods isn’t actually large, and here both languages seem to show fairly similar method sizes.

The difference in the usage of arguments may or may not be explainable with syntax, such as implicit block arguments, or that I didn’t actually consider closures in Pharo. The use of local variables however, seems to be fairly similar between both languages.

Not sure there are any big lessons to be learned yet, but one could probably go further and study other metrics to gain additional insights. I’d probably start with class hierarchy, mixins, and other features that require either a bit of dynamic evaluation, or implementing the Ruby semantics in the tool determining the metrics.

Appendix

The following table contains the details on the projects included in this analysis.

Project Commit URL
Diaspora d2acad1 https://github.com/diaspora/diaspora
Discourse f040b5d https://github.com/discourse/discourse
Errbit cf792c0 https://github.com/errbit/errbit
Fat Free CRM 4e72e0c https://github.com/fatfreecrm/fat_free_crm
GitLab 21e08b6 https://github.com/gitlabhq/gitlabhq
Kandan 380efaf https://github.com/kandanapp/kandan
Redmine 988a36b https://github.com/edavis10/redmine
Refinery CMS 1b73e0b https://github.com/refinery/refinerycms
Selfstarter 740075f https://github.com/apigy/selfstarter
Spree 901cb64 https://github.com/spree/spree

The Shape of 1.7M Lines of Code

Recently, I was wondering what large code bases look like when it comes to the basic properties a compiler might care about. And here I am not thinking about dynamic properties, but simply static properties such as length of methods, number of methods per class, number of fields, and so on.

I think there are a whole bunch of studies that ask questions related to this. And a quick search led me to a report titled Characterizing Pharo Code by Zaitsev et al., which also comes with the code the authors used to answer their questions.

Though, the report focuses on more high-level questions than what I had in mind. With a bit of extra effort, I managed to collect the data I was looking for.

Methodology

The report by Zaitsev et al. selected Pharo projects that represent a variety of different domains, widely used and less widely used projects, small and large ones, as well as active and less active projects. I kept the same selection of projects, but with a slightly more recent set of commits to look at.

Furthermore, I included the whole Pharo 8.0 base system and all loaded dependencies, which didn’t seem to be the case in the original analysis.

A full list of projects and commits is included at the end of this post. Overall, the analysis considers 183 projects, with 1,403 packages in total. A “project” is here a set of Pharo packages that are related by name. This includes for instance Moose, a platform for software and data analysis, Seaside, a web application framework, Roassal, scripting for visualizations, and various other packages, including the Pharo system itself.

Since Pharo has the classic introspection/reflection facilities of Smalltalk systems, I use them to collect the structural metrics, including lines of code, number of methods, classes, arguments, and local variables.

Size of the Overall Code Base

As mentioned earlier, the code base under investigation is composed of 183 projects. These projects contain 22,294 classes, of which 3,474 classes are unit tests. Overall, there are 275,602 methods in the system, of which 35,746 are on test classes.

This means about 16% of the classes and 13% of the methods are related to tests. Since this seems to be a rather small number, I’ll keep the test code in the analysis even though the code may have different general properties.

To get an impression of how classes are distributed over packages and projects, let’s look at the following plot.

Number of classes per package and per project.

The first two graphs are histograms that show how many packages have a specific number of classes in them. We only record something if there’s a method, and a method needs a class. So, there are no packages without any classes. But there are plenty of packages with only 1 class. The number of packages that have a high number of classes decreases rapidly. The second histogram shows all packages that have 75 or more classes, and we see there are two packages with around 600 classes: Bloc, Brick.

The third histogram looks at the same data but this time by project. A project can consist of multiple packages, but it turns out, there are many projects with very few classes, and only very few projects with many classes. To make these details better visible, the second and third histograms use a bin size of 25 instead of 1.

When looking at the following plots, we see that the results look a bit different for methods.

Number of methods per package and per project

The first histogram (on the left) shows how many packages have 1, 2, and up to 99 methods in them. There seem to be about 15 packages with just a single method in them, and about 40 with 2 methods. Interestingly, the number of methods per package seems to show fewer similarities to the power law or Pareto distribution than the number of classes.

Looking at the second histogram, which only considers the packages with 100 or more methods, we see a shape more similar to the power law.

When looking at the data at the granularity of projects, in the third histogram, we see many projects with very few methods, and only very few projects with many methods.

In this corpus, the projects Bloc, Glamorous Toolkit, SmaCC, and Spec all have more than 10,000 methods.

Structure of Classes

Let us assume for the rest of this post that this is a single code base. In Pharo, it would feel like a single code base anyway, since everything is in the image and can be accessed and modified easily.

Number of methods per class.

The two histograms above show the number of classes that have a particular number of methods. On the left, we see all classes with fewer than 50 methods. It turns out that a lot of classes have a single method, and even though there are considerably fewer, there are quite a number of classes with 40 to 50 methods. In the histogram on the right, with a bin size of 10, we see that there are still plenty of classes with 50 to 100 methods, after which we then find fewer and fewer classes. The classes Morph, Object, and VBNetParser each have more than 700 methods, and thus have the most methods.

Number of direct fields in a class.

The histograms above show how many classes have a specific number of fields that they directly declare. For comparison below, we’ll look at the total number of fields of a class, considering all fields of its superclasses.

For direct fields, we see that many classes do not have any fields, but plenty of them have some fields. In the histogram on the right, we see quite a number of classes with 15 or more fields (177 in total). The classes with more than 100 fields are PRPillarGrammar, PRPillarGrammarOld, PPYAMLGrammar, and FamixGenerator.

Number of all fields in a class including from superclasses.

When considering all fields, including the ones in the superclass hierarchy, things look a little different. On the left, we see the number of classes that have fewer than 30 fields. Since we now count the fields from the superclass hierarchy, we see there’s a spike at three fields. For classes with 30 or more fields in total, in the histogram on the right, we see a few more spikes, but at a smaller level. The class with the most fields is PPYAMLGrammar and has 123 fields.

Number of class fields, i.e., static fields per class.

When it comes to class fields, the situation looks very different. Only 357 classes have one class field, and only 69 have more than one.

Number of superclasses per class.

Since the number of fields depends on the superclass hierarchy, let’s have a look at the numbers for using inheritance.

The data looks a bit strange. We have a few classes that have no superclass. This is a quirk in Pharo’s reflection system. These classes are not classic classes but traits. The few classes that have a single superclass are a bit special, and reflect Pharo’s metalevel architecture. The most important one is Object. Its superclass is ProtoObject, where the hierarchy terminates. The other classes are what can be considered dynamic proxies, used for intercepting message sends/method calls.

Only a few hierarchies turn out to be deep, which includes widgets and some test classes with 11 or 12 superclasses.

Structure of Methods

After looking at classes, let’s investigate the methods a bit closer.

Lines of code per method.

There’s indeed one method in a mock class that has no code. Not sure what’s going on there, but the method might simply not be a source method. I didn’t check. Though, there are 1,797 methods with one line of code. At first this seemed a little strange, too, but since Pharo considers method signatures as part of the method, these are essentially empty methods. With this, it’s unsurprising that most methods have 2 lines, which includes accessors and all kinds of other short methods.

If I recall correctly, Smalltalkers advise against methods with more than 6 or 7 lines. From the data distribution, the advice seems to be widely ignored. At least, there doesn’t seem to be a major step after 6-7 lines. There are 29 methods with more than 1,000 lines. The 9 methods with more than 5,000 lines all seem to carry various kinds of data, things like JSON and JavaScript strings.

Lines of code per class.

Looking at the lines of code by aggregating them per class reveals a mostly similar picture. Many tiny classes, and few large classes.

Number of arguments per method.

When looking at the number of arguments a method takes, we see a huge number not taking any at all (the receiver is not considered). About half of the methods have 1 argument, which seems plausible considering setters have one argument. The two methods with 15 arguments are methods to test the bytecode compiler.

Number of local variables in a method.

A lot of methods don’t have any local variables. That is probably not surprising given the number of getters and setters we may assume. And, it seems people don’t actually go all out when it comes to local variables. 36 variables seem sufficient for everyone, and the particular method seems to rotate an elliptical arc, and thus implements a somewhat complex algorithm.

Number of literals in a method.

Finally, let’s have a look at the number of literals per method. Literals include any kind of numbers, constants, and constructs that have a specific syntax in the language, e.g., arrays. However, they also include the names of methods, which are used for the message sends. Thus, the number somewhat correlates with the number of message sends a method may possibly have. Though, that’s probably not a perfect correlation because of the other kinds of literals as well as all kinds of optimizations in the bytecode set.

Conclusion

Ok, so, what to do with this data? I am not quite sure yet. Though, there are a few bits and pieces in here that are interesting. And, since I recently started generating large code bases to assess the performance of cold code, i.e., interpreter speed, I think some of these bits will allow me to generate more “natural” code.

Other details suggest having a good look at various optimizations classic interpreters do. For example, SOMns optimizes the accessor methods to object fields already, and thus avoids a full method/function call for them. Not sure whether that’s an optimization applied by many languages, though HotSpot does it under the term “fast accessor methods”.

Would also be interesting to see how these numbers compare across languages. Python and Ruby come to mind as similar class-based dynamic languages.

There might be more to gain from this data, but that’s for another day.

Appendix

The following table contains the details on the projects included in this analysis.

Project Commit URL
DrTests 010eb9b https://github.com/juliendelplanque/DrTests
Mustache 728feda https://github.com/noha/mustache
PetitParser bd108b9 https://github.com/moosetechnology/PetitParser
Pillar 4d8a285 https://github.com/pillar-markup/pillar
Seaside e0c73a5 https://github.com/SeasideSt/Seaside
Spec2 988c6d7 https://github.com/pharo-spec/Spec
PolyMath 473b0b0 https://github.com/PolyMathOrg/PolyMath
Telescope 8c47cfc https://github.com/TelescopeSt/TelescopeCytoscape
Voyage f4f9d28 https://github.com/pharo-nosql/voyage
Bloc a8c7ecb https://github.com/pharo-graphics/Bloc
DataFrame 7422404 https://github.com/PolyMathOrg/DataFrame
Roassal2 d65a87a https://github.com/ObjectProfile/Roassal2
Roassal3 167de2d https://github.com/ObjectProfile/Roassal3
Moose fc8fb07 https://github.com/moosetechnology/Moose
GToolkit e3c98fc https://github.com/feenkcom/gtoolkit
Iceberg 7e78a75 https://github.com/pharo-vcs/iceberg
