Tomoharu noticed in his work on the eJSVM, a JavaScript virtual machine for embedded systems, that quite a bit of memory is needed for the data that helps us to represent JavaScript objects efficiently. So, we started to look into how the memory use could be reduced without sacrificing performance.

Objects in Dynamic Languages

In languages like JavaScript, Python, and Ruby, objects are much more flexible than in many other languages including Java, C++, or C#, because fields can be added and possibly even removed dynamically as needed.

I’ll use JavaScript for our examples. Let’s imagine we are working with a sensor that can determine location and movement information. Not all data may be available at every point when we access the sensor. Indeed, most likely we may just have the current longitude and latitude. Perhaps sometimes, we have access to precise GPS coordinates, which also give us altitude. In even more rare cases, the sensor might even be moving, which gives us a bearing and speed. If we imagine this in code, we might get something that looks vaguely like the following code.

1
2
3
4
5
6
7
8
9
10
11
12
let location = {}
location.longitude = 51.28;  // getLong()
location.latitude  =  1.08;  // getLat()

if (hasAltitude()) {
  location.altitude = getAltitude();
}

if (isMoving()) {
  location.bearing = getBearing();
  location.speed   = getSpeed();
}

Thus, when we access the sensor, we create a new JavaScript object and then add the longitude and latitude fields. Depending on the available data, we may still add the field for altitude as well as bearing and speed. However, if the data is not available, our JavaScript object will only have longitude and latitude.

In addition to adding fields arbitrarily, in JavaScript, we can even delete them. So, how do our modern language implementations implement this efficiently?

Hidden Class: Finding Structure in Dynamic Programs

The most direct way to implement objects, where one can add and remove fields arbitrarily, would probably be some kind of hash table or a list of field names and their values. However, neither of these two approaches is as efficient as directly accessing a known memory offset for a specific field as it’s possible in languages with less flexible objects.

To gain the same efficiency in dynamic languages, hidden classes were invented. The key idea is that a language implementation can determine at run time the structure of objects, create a kind of map or hidden class that can tell us which fields an object has, possibly even the types stored in a field, and most importantly where the field can be found in memory. This works well because most code is much less dynamic than what the language would allow for.

For our example code from above, fig. 1 shows us how this may look like in an implementation. We start out with our location object being empty. The object only contains is a pointer to an empty hidden class. Once we execute the code on line 2, the longitude field is added. We store the value 51.28 into an array that we use to store all field values. Since it’s the first, it’s stored at index 0, and we record this in the hidden class. However, we really want to be able to reuse hidden classes easily. So, instead of changing the existing hidden class, we create a new one, which records longitude being stored at index 0.

Basically the same happens when we execute line 3, and add the latitude to the object. We need to expand the array by one slot to hold the value 1.08, and create a new hidden class that includes that latitude is stored at index 1.

The next time we would execute those lines again, we wouldn’t actually need to create new hidden classes but can lookup them up based on the fields that we are adding.

Fig. 1. When a field is added to an object, the object needs to change the hidden class, which says where the field is to be stored. We may also need to expand the array to have enough space to store the field's value.

Though, so far, we only looked at the first three lines of code. Lines 6, 10, and 11 add more fields, but do so conditionally. Focusing only on the graph of hidden classes, fig. 2 shows what that would look like.

Fig. 2. Hidden class graph for our example program. For the case that there's no altitude but movement information, the hidden class graph has a branch since the altitude field is not present.

The first three hidden classes are the same as before. In the case that we have neither an altitude nor a movement, we simply stay in the third hidden class. However, if we have any of these additional details, the so-called hidden class graph would further evolve. In this particular case, we would even introduce a branch depending on whether we have the altitude details. If we first have the altitude, in the simplest kind of hidden class graph, we would end up with bearing and speed being stored at different indexes in the object.

Optimizing Hidden Class Graphs

In his previous work on the eJSVM, Tomoharu already relied on JavaScript programs showing fairly stable patterns in their behavior. This means, the difference based on user input and between different runs of a program is relatively minor, when one observes a good sample of program executions. Thus, we came up with the idea of using a classic profile-guided optimization approach to optimize the hidden class graph of an application.

The basic idea is that during execution of a representative set of runs, we can use the garbage collector to gather statistics about the kind of objects used in a program, their hidden classes, and which of the hidden classes are used most. With this information, we can optimize the hidden class graph of a program for future executions.

Specifically, we apply the following optimizations:

  1. move branches in the graph to reduce memory use for the most-used hidden classes
  2. eliminate hidden classes that are only used temporarily before changing to another one
  3. merge identical branches of the graph from unrelated allocation sites

Let’s go through these optimizations step by step. For our example, let’s assume we profiled a number of representative executions and found out that indeed altitude and movement are rarely available. Thus, most objects will only have longitude and latitude fields. Furthermore, our data also tells us that it basically never happens that we have movement information without also having the altitude.

This means, we can apply our first optimization, and “move” the branch for adding altitude in the hidden class graph. Figure 3 highlights the change in red. By moving the branch for adding altitude to the hidden class after adding bearing and speed, we can merge the two branches since they are now identical. This means, the dotted hidden classes in the figure can be dropped. In the paper, we go into a bit more detail of how to make this correct without changing JavaScript semantics.

Fig. 3. By moving the rarely used branch without altitude information, the two branches become identical, and we can drop the dotted elements from the graph.

As the second optimization, we eliminate “temporary” hidden classes, which are rarely used over longer time spans. The prime example for these temporary hidden classes are the ones that are only used between adding fields one after another. As highlighted with dotted lines in fig. 4, the hidden class between adding longitude and latitude, as well as the ones between adding bearing, speed, and finally altitude can be removed. This leaves in the end, only three hidden classes in our graph.

Fig. 4. By removing intermediate classes which are only in use very briefly, the hidden class graph can be shrunk to just three hidden classes.

The third optimization is relevant for larger programs and not directly visible in our example. However, often different code paths would create the same kind of objects. Thus, the hidden classes would be basically the same, which allows us to merge these identical parts of the hidden class graph.

Results

For the evaluation, we used a variation of the Are We Fast Yet benchmarks. Here, I’ll just quickly look at the memory savings.

For these measurements, we first collected the profiling information for each benchmark, optimized the hidden class graph, and then used the result to guide the execution.

As can be seen in fig. 5, the optimized hidden class graphs are quite a bit smaller. We reduce their memory use by about 62% on average.

Fig. 5. Memory used by hidden classes and related data structures. Overall, this meta data uses about 62% less memory on average with our optimizations.

By reducing the overall number of hidden classes in the system, we also noticed a few speedups in our benchmarks. Figure 6 shows that the reducing in hidden classes reduces the cache misses for eJSVM’s single-entry lookup caches, especially for larger benchmarks such as CD and Havlak. However, there are other effects. For instance with the hidden classes known up front, we can size objects correctly on allocation, avoid the extra array to store the fields, and reduce the number of times the array needs to be expanded because of frequent transitions.

Fig. 6. Reduction in lookup/inline cache misses. eJSVM has single-entry lookup caches, which means the overall reduction in hidden classes can improve performance.

Profile Guided Offline Optimization of Hidden Class Graphs for JavaScript VMs in Embedded Systems

In the paper, we discuss a lot more details, background, and evaluation. Of course, there are various corner cases to be considered, for example things like JavaScript’s prototype objects, how to make sure that the JavaScript semantics don’t break even though the hidden classes change, and a few other bits.

Please give the paper a read, attend our presentation at VMIL’22, and find some of us for questions, comments, and suggestions on Twitter @profrejones and @smarr.

Abstract

JavaScript is increasingly used for the Internet of Things (IoT) on embedded systems. However, JavaScript’s memory footprint is a challenge, because normal JavaScript virtual machines (VMs) do not fit into the small memory of IoT devices. In part this is because a significant amount of memory is used by hidden classes, which are used to represent JavaScript’s dynamic objects efficiently.

In this research, we optimize the hidden class graph to minimize their memory use. Our solution collects the hidden class graph and related information for an application in a profiling run, and optimizes the graph offline. We reduce the number of hidden classes by avoiding introducing intermediate ones, for instance when properties are added one after another. Our optimizations allow the VM to assign the most likely final hidden class to an object at its creation. They also minimize re-allocation of storage for property values, and reduce the polymorphism of inline caches.

We implemented these optimizations in a JavaScript VM, eJSVM, and found that offline optimization can eliminate 61.9% of the hidden classes on average. It also improves execution speed by minimizing the number of hidden class transitions for an object and reducing inline cache misses.

  • Profile Guided Offline Optimization of Hidden Class Graphs for JavaScript VMs in Embedded Systems
    T. Ugawa, S. Marr, R. Jones; In Proceedings of the 14th ACM SIGPLAN International Workshop on Virtual Machines and Intermediate Languages, VMIL'22, p. 11, ACM, 2022.
  • Paper: PDF
  • DOI: 10.1145/3563838.3567678
  • BibTex: bibtex
    @inproceedings{Ugawa:2022:HCGOpt,
      abstract = {JavaScript is increasingly used for the Internet of Things (IoT) on embedded systems. However, JavaScript's memory footprint is a challenge, because normal JavaScript virtual machines (VMs) do not fit into the small memory of IoT devices. In part this is because a significant amount of memory is used by hidden classes, which are used to represent JavaScript's dynamic objects efficiently.
        
      In this research, we optimize the hidden class graph to minimize their memory use. Our solution collects the hidden class graph and related information for an application in a profiling run, and optimizes the graph offline. We reduce the number of hidden classes by avoiding introducing intermediate ones, for instance when properties are added one after another. Our optimizations allow the VM to assign the most likely final hidden class to an object at its creation. They also minimize re-allocation of storage for property values, and reduce the polymorphism of inline caches.
        
      We implemented these optimizations in a JavaScript VM, eJSVM, and found that offline optimization can eliminate 61.9% of the hidden classes on average. It also improves execution speed by minimizing the number of hidden class transitions for an object and reducing inline cache misses.},
      author = {Ugawa, Tomoharu and Marr, Stefan and Jones, Richard},
      booktitle = {Proceedings of the 14th ACM SIGPLAN International Workshop on Virtual Machines and Intermediate Languages},
      day = {5},
      doi = {10.1145/3563838.3567678},
      keywords = {EmbeddedSystems HiddenClasses InlineCaching IoT JavaScript MeMyPublication OfflineOptimization VirtualMachine myown},
      location = {Auckland, New Zealand},
      month = dec,
      pages = {11},
      pdf = {https://stefan-marr.de/downloads/vmil22-ugawa-et-al-profile-guided-offline-optimization-of-hidden-class-graphs.pdf},
      publisher = {ACM},
      series = {VMIL'22},
      title = {Profile Guided Offline Optimization of Hidden Class Graphs for JavaScript VMs in Embedded Systems},
      year = {2022},
      month_numeric = {12}
    }
    

Acknowledgments

Thanks to Tomoharu and Richard for suggestions and corrections on this blog post.