What if we could see all concurrency bugs in the debugger?

Multiverse Debugging: Non-Deterministic Debugging for Non-Deterministic Programs

In the past, we have worked on various bits and pieces to improve debugging. For example, we tried to understand what bugs are possible, how debugging of actor systems works, how we could support all kinds of concurrency models in debuggers, and more recently, how to avoid some issues with non-determinism by recording and replaying the execution of buggy actor applications.

However, one of the big issues with debugging is that non-determinism makes it practically impossible to see what a program could be doing at any given point in time. When debugging a sequential program, we can see what is happening at every moment, and for the most part, programs are deterministic and will reliably do the same thing at a given point in the program. But with non-determinism in the mix, even with a debugger, we can’t be sure what a program would do the next time we inspect it. Non-determinism can, for instance, cause operations to be scheduled differently or messages to arrive in a different order. This means we can’t really just go and look at a program in a debugger to see what it would be doing.

So, what if we could see all concurrency bugs in the debugger? What if we could simply explore the different non-deterministic paths a program could take?

With Multiverse Debugging, we explore exactly that idea.

Based on the AmbientTalk language, we designed and built a debugger that allows us to explore all possible universes, i.e., all states a program can reach. Instead of staying in a single universe, we can ask the What If? question at every point in our programs and start exploring what may happen if messages are received in different orders or actor B is scheduled before actor A, and how the final results of such non-deterministic variations differ.

To get an idea of what that could look like, see the demo video below, come to our ECOOP’19 presentation next Thursday, or talk to us during the ECOOP poster session!

So far, this is early work and a brave new idea paper, but despite the various challenges ahead, we think it can make a difference.

You can find the abstract and preprint of the paper below.

Abstract

Many of today’s software systems are parallel or concurrent. With the rise of Node.js and more generally event-loop architectures, many systems need to handle concurrency. However, its non-deterministic behavior makes it hard to reproduce bugs. Today’s interactive debuggers unfortunately do not support developers in debugging non-deterministic issues. They only allow us to explore a single execution path. Therefore, some bugs may never be reproduced in the debugging session, because the right conditions are not triggered. As a solution, we propose multiverse debugging, a new approach for debugging non-deterministic programs that allows developers to observe all possible execution paths of a parallel program and debug it interactively. We introduce the concepts of multiverse breakpoints and stepping, which can halt a program in different execution paths, i.e. universes. We apply multiverse debugging to AmbientTalk, an actor-based language, resulting in Voyager, a multiverse debugger implemented on top of the AmbientTalk operational semantics. We provide a proof of non-interference, i.e., we prove that observing the behavior of a program by the debugger does not affect the behavior of that program and vice versa. Multiverse debugging establishes the foundation for debugging non-deterministic programs interactively, which we believe can aid the development of parallel and concurrent systems.

  • Multiverse Debugging: Non-deterministic Debugging for Non-deterministic Programs
    C. Torres Lopez, R. Gurdeep Singh, S. Marr, E. Gonzalez Boix, C. Scholliers; In 33rd European Conference on Object-Oriented Programming, ECOOP'19, p. 27:1–27:30, Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik, 2019.
  • Paper: PDF
  • DOI: 10.4230/LIPIcs.ECOOP.2019.27
  • Appendix: online appendix
  • BibTeX: bibtex
    @inproceedings{TorresLopez:2019:MD,
      abstract = {Many of today's software systems are parallel or concurrent. With the rise of Node.js and more generally event-loop architectures, many systems need to handle concurrency. However, its non-deterministic behavior makes it hard to reproduce bugs. Today's interactive debuggers unfortunately do not support developers in debugging non-deterministic issues. They only allow us to explore a single execution path. Therefore, some bugs may never be reproduced in the debugging session, because the right conditions are not triggered. As a solution, we propose multiverse debugging, a new approach for debugging non-deterministic programs that allows developers to observe all possible execution paths of a parallel program and debug it interactively. We introduce the concepts of multiverse breakpoints and stepping, which can halt a program in different execution paths, i.e. universes. We apply multiverse debugging to AmbientTalk, an actor-based language, resulting in Voyager, a multiverse debugger implemented on top of the AmbientTalk operational semantics. We provide a proof of non-interference, i.e., we prove that observing the behavior of a program by the debugger does not affect the behavior of that program and vice versa. Multiverse debugging establishes the foundation for debugging non-deterministic programs interactively, which we believe can aid the development of parallel and concurrent systems.},
      acceptancerate = {0.37},
      appendix = {https://doi.org/10.4230/DARTS.5.2.4},
      author = {Torres Lopez, Carmen and Gurdeep Singh, Robbert and Marr, Stefan and Gonzalez Boix, Elisa and Scholliers, Christophe},
      booktitle = {33rd European Conference on Object-Oriented Programming},
      day = {15},
      doi = {10.4230/LIPIcs.ECOOP.2019.27},
      isbn = {978-3-95977-111-5},
      issn = {1868-8969},
      keywords = {Actors AmbientTalk Concurrency Debugging FormalSemantics Formalism MeMyPublication Multiverse NonDeterminism Redex myown},
      month = jul,
      number = {27},
      pages = {27:1--27:30},
      pdf = {https://stefan-marr.de/downloads/ecoop19-torres-lopez-et-al-multiverse-debugging-non-deterministic-debugging-for-non-deterministic-programs.pdf},
      publisher = {Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
      series = {ECOOP'19},
      title = {{Multiverse Debugging: Non-deterministic Debugging for Non-deterministic Programs}},
      volume = {134},
      year = {2019},
      month_numeric = {7}
    }
    

SOMns 0.7.0 Release with Extension Modules and Artifacts

It has been a while since we put together a release for SOMns. And it has been even longer since I last wrote about it on this blog.

Since I recently described how to put together academic artifacts based on a CI setup, I wanted to use the opportunity to announce that this setup is now part of the latest SOMns release. This means SOMns now comes out of the box with the basic elements to generate artifacts for reproducible research.

Additionally, SOMns 0.7.0 accumulates all the changes made over the last year and a half or so.

Major new features include extension modules, support for snapshot writing, and various fixes for concurrency issues. With the update to GraalVM 19, we should also eventually get Windows support, though the build scripts have not yet been made compatible.

The number of contributors to SOMns is slowly growing, and I’d like to thank everyone.

0.7.0 - Extension Modules, Snapshots, Artifacts v7

New Features

  • Add support for extension modules #252
  • Object heap serialization #278, #271, #276
  • Optimized tracing and restoring replay support #261, #257
  • Added VirtualBox artifact #296, #299

Concurrency Issues

  • Swap tracing buffers for blocked threads #297
  • Renew Safepoint assumption only when invalid #291
  • Blocking primitives cannot participate in safepoints #288, #286
  • Fix object layout races #285

General Maintenance

  • Update Truffle to >19.0.1 release #298
  • Update Truffle to 1.0.0-rc5 #266, #295, #267
  • Make the current instance of SomLanguage accessible for language server #277
  • VMM’18 Demo and Kompos improvements #270
  • Add Kent CI and update ReBench, Checkstyle, and Kompos dependencies #258
  • Improve readability of Travis CI Logs by using folding #281
  • Update basic-setup documentation #284
  • Update list of publications in README #283
  • Use CompilerDirectives::castExact instead of ValueProfiles #256
  • Move Actor mailbox loop to Truffle #250
  • Change Warmup in harnesses to simply disregard measurements #249
  • Remove All.ns #248

Bug Fixes

  • Set promise data fields to null on resolution #300, #294
  • Fix Travis Error codes and Codespeed URL #287
  • Minimize the Truffle-related building #282
  • Remove dead code #275
  • Fix potential NPE in ExpressionNode.toString #269
  • Fix replay test flakiness #268
  • Redesign Timer Prim #264
  • Trace TimerPrim #262
  • Ensure charAt: and file read/write are compatible with PE #260
  • Signal exceptions without helper methods #251, #214

Generating an Artifact From a Benchmarking Setup as Part of CI

Disclaimer: The artifact for which I put this automation together was rejected. I take this as a reminder that the technical bits still require good documentation to be useful.

In the programming language community, as well as in other research communities, we strive to follow scientific principles. One of them is that others should be able to verify the results that we report. One way of enabling verification of our results is by making all, or at least most, elements of our systems available. Such an artifact can then be used, for instance, to rerun benchmarks, to experiment with the system, or even to build on top of it and address entirely different research questions.

Unfortunately, it can be time consuming to put such artifacts together. And the way we do artifact evaluation does not necessarily help with it: you submit your research paper, and then at some later point learn whether it is accepted or not. Only if it is accepted do we start preparing the artifact. Because it can be a lot of work and the deadlines are tight, the result may be less than perfect, which is rather unfortunate.

So, how can we reduce the time it takes to create artifacts?

1. More Automation

For a long time now, I have worked on gradually automating the various elements in my benchmarking and paper-writing setup. It started early on with ReBench (docs), a tool to define benchmarking experiments. The goal was to enable others and myself to re-execute experiments with the same parameters and build setup. However, in the context of an artifact, this is only one element.

Perhaps more importantly, with an artifact we want to ensure that others do not run into any kind of issues during the setup of the experiments, avoiding problems with unavailable software dependencies, version conflicts, and the usual mess of our software ecosystems.

One way of avoiding these issues is to set up the whole experiment in a systems virtual machine. This means all software dependencies are included, and someone using the artifact only needs software that can execute the virtual machine image.

VirtualBox is one popular open source solution for this kind of systems virtual machine. Unfortunately, setting up a virtual machine for an artifact is time consuming.

Let’s see how we can automate it.

2. Making Artifacts Part of Continuous Integration

2.1 Packer: Creating a VirtualBox with a Script

Initially, I started using Vagrant, which allows us to script the “provisioning” of virtual machine images. This means we can use it to install the necessary software for our benchmarking setup in a VirtualBox image. Vagrant also supports systems such as Docker and VMWare, but I’ll stick to VirtualBox for now.

Unfortunately, my attempt at using Vagrant was less than successful. While I was able to generate an image with all the software needed for my benchmarks, the image would not boot correctly when I tested it. It might have been me, or some fluke with the Vagrant VirtualBox image repository.

Inspired by a similar post on creating artifacts, I looked into packer.io, which allows us to create a full VirtualBox image from scratch. Thus, we have full control over what ends up in an artifact and can script the process in a way that can be run as part of CI. With a fully automated setup, I can create an artifact on our local GitLab CI runner, either as part of the normal CI process or perhaps weekly, since building a VM image takes about two hours.

As a small optimization, I split the creation of the image into two steps. The first step creates a base image with a minimal Lubuntu installation, which can be used as a common base for different artifacts. The second step creates the concrete artifact by executing shell scripts inside the VM, which install dependencies and build all experiments so that the VM image is ready for development or benchmarking.
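
To make this concrete, the two steps boil down to two packer builds. The following is a minimal sketch, using the file names that appear later in this post; the contents of the .json configurations are of course project specific:

# Step 1: build the minimal Lubuntu base image (see Section 3)
packer build artifact-base-1604.json

# Step 2: build the concrete artifact on top of the resulting base image
packer build artifact.json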

2.2 Fully Automated Benchmarking as Stepping Stone

Before going into setting up the VM image, we need some basics.

My general setup relies on two elements: a GitLab CI runner, and an automated benchmarking setup.

The fully automated benchmarking setup is useful in its own right. We have used it successfully for many of our research projects. It executes a set of benchmarks for every change pushed to our repository.

Since I am using ReBench for this, running benchmarks on the CI system is nothing more than executing the already configured set of benchmarks:

rebench benchmark.conf ci-benchmarks

For convenience, the results are reported to a Codespeed instance, where one can see the impact of any changes on performance.

Since ReBench also builds the experiments before running them, we are already halfway to a useful artifact.

2.3 Putting the Artifact Together

Since we could take any existing VirtualBox image as a starting point, let’s start with preparing the artifact before looking at how I create my base image.

In my CI setup, creating the VirtualBox image boils down to:

packer build artifact.json
mv artifact/ ~/artifacts  # to make the artifact accessible

The key here is of course the artifact.json file, which describes where the base image is, and what to do with it to turn it into the artifact.

The following is an abbreviated version of what I am using to create an artifact for SOMns:

"builders" : [ {
  "type": "virtualbox-ovf",
  "format": "ova",
  "source_path": "base-image.ova",
  ...
} ],
"provisioners": [ {
  "execute_command":
    "echo 'artifact' | {{.Vars}} sudo -S -E bash -eux '{{.Path}}'",
  "scripts": [
    "provision.sh"
  ],
  "type": "shell"
}]

In the actual file, there is a bit more going on, but the key idea is that we take an existing VirtualBox image, boot it, and run a number of shell scripts in it.

These shell scripts do the main work. For a typical paper of mine, they would roughly:

  1. configure package repositories, e.g. Node.js and R
  2. install packages, e.g., Python, Java, LaTeX
  3. checkout the latest version of the experiment repo
  4. run ReBench to build everything and execute a benchmark to see that it works. I do this with rebench --setup-only benchmark.conf
  5. copy the evaluation parts of the paper repository into the VM
  6. build the evaluation plots of the paper with KnitR, R, and LaTeX
  7. link the useful directories, README files, and others on the desktop
  8. and for the final touch, set a project-specific background image (a sketch of steps 5 to 8 follows further below).

A partial script might look something like the following:

wget -O- https://deb.nodesource.com/setup_8.x | bash -

apt-get update
apt-get install -y openjdk-8-jdk openjdk-8-source \
                   python-pip ant nodejs

pip install git+https://github.com/smarr/ReBench

git clone ${GIT_REPO} ${REPO_NAME}

cd ${REPO_NAME}
git checkout ${COMMIT_SHA}
git submodule update --init --recursive
rebench --setup-only ${REBENCH_CONF} SOMns

It configures the Node.js apt repositories and then installs the required packages. Afterwards, it clones the project, checks out the desired commit, and uses ReBench’s --setup-only mode to build everything and do a quick benchmark run as a sanity check. There are a few more things to be done, as can be seen, for instance, in the SOMns artifact.
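
The remaining steps (5 to 8) are more project specific. The following is a rough, hypothetical sketch; the paper-evaluation directory, the evaluation.Rnw file, and the wallpaper path are placeholders and not taken from an actual artifact:

# 5. copy the evaluation parts of the paper repository into the VM
cp -r /tmp/paper-evaluation "$HOME/evaluation"

# 6. build the evaluation plots with KnitR, R, and LaTeX
cd "$HOME/evaluation"
Rscript -e 'knitr::knit("evaluation.Rnw")'
pdflatex evaluation.tex

# 7. link the useful directories and README files on the desktop
mkdir -p "$HOME/Desktop"
ln -s "$HOME/${REPO_NAME}" "$HOME/Desktop/${REPO_NAME}"
ln -s "$HOME/${REPO_NAME}/README.md" "$HOME/Desktop/README.md"

# 8. set a project-specific background image; pcmanfm needs a running LXDE session,
#    so in practice this may be written into the desktop configuration instead
pcmanfm --set-wallpaper="$HOME/evaluation/background.png"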

This gives us a basic artifact that can be rebuilt whenever needed. It can of course also be adapted to fit new projects easily.

The overall image is usually between 4 and 6 GB in size, and the build process, including the minimal benchmark run, takes about two hours. Afterwards, we have a tested artifact.

What remains is writing a good introduction and overview, so that others can use it, verify the results, and perhaps even regenerate the plots in the paper with their own data.

3. Creating a Base Image

As mentioned before, we can use any VirtualBox image as a base image. We might already have one from previous artifacts and now simply want to increase automation, or we can use one of the images offered by the community. We can also build one specifically for our purpose.

For artifacts, size matters. Having huge VM images makes downloads slow and storage difficult, and requires users to have sufficient free disk space. Therefore, we may want to ensure that the image only contains what we need.

With packer, we can automate the image creation, including the initial installation of the operating system, which gives us the freedom we need. The packer community provides various examples that are a useful foundation for custom images. Inspired by bento and an Idris artifact, I put together scripts for my own base images. These scripts download an Ubuntu server installation disk, create a VirtualBox VM, and then start the installation. An example configuration is artifact-base-1604.json, which creates a Lubuntu 16.04 core VM. The configuration sets various details including memory size, number of cores, hostname, username, password, etc. Perhaps worthwhile to highlight are the following two settings:

"hard_drive_nonrotational": true,
"hard_drive_discard": true,

This instructs VirtualBox to create the hard drive as an SSD, which hopefully ensures that the disk only uses the space it actually needs and therefore minimizes the size of our artifacts. I am not entirely sure this is without drawbacks, but so far it seems that disk zeroing and the other tricks commonly used to reduce the size of VM images are not necessary with these settings.
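
For comparison, the classic manual approach that these settings hopefully make unnecessary looks roughly like the sketch below; the disk file name is a placeholder, and VBoxManage’s --compact only works on VDI disks:

# inside the guest: overwrite the free space with zeros, then remove the filler file
dd if=/dev/zero of=/zerofill bs=1M || true
rm -f /zerofill

# on the host: compact the disk so the zeroed blocks are released
VBoxManage modifyhd artifact-disk.vdi --compact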

In addition to the artifact-base-1604.json file, the preseed.cfg instructs the Ubuntu installer to configure the system and install useful packages such as an SSH server, a minimal Lubuntu core system, Firefox, a PDF viewer, and a few other things. After these are installed successfully, the basic-setup.sh script disables automatic updates, configures the SSH server for use in a VM, enables password-less sudo, and installs the VirtualBox guest support.
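
To give an impression, here is a hypothetical excerpt of what such a basic-setup.sh could contain; the package names are those of Ubuntu 16.04, and the artifact user name is an assumption:

# disable automatic updates so they do not interfere with benchmarks
apt-get remove -y unattended-upgrades

# enable password-less sudo for the artifact user
echo 'artifact ALL=(ALL) NOPASSWD: ALL' > /etc/sudoers.d/artifact
chmod 0440 /etc/sudoers.d/artifact

# avoid slow SSH logins in the VM by disabling DNS lookups
echo 'UseDNS no' >> /etc/ssh/sshd_config

# install the VirtualBox guest additions from the Ubuntu repositories
apt-get install -y virtualbox-guest-dkms virtualbox-guest-utils virtualbox-guest-x11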

The result is packaged up as a *.ova file, which can be loaded directly by VirtualBox and becomes the base image for my artifacts.
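
Loading the result for a quick manual check is then a one-liner; the file name is just an example:

VBoxManage import artifact-base-1604.ova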

4. What’s Next?

With this setup, we automate the recurring and time consuming task of creating VirtualBox images that contain our artifacts. In my case, such artifacts contain the sources and benchmarking infrastructure, as well as the scripts to recreate all numbers and plots in the paper.

That means the artifact still lacks documentation of how the pieces can be used, how SOMns is implemented, and what one would need to do to change things. Thus, for the next artifact, I hope that having this automation will allow me to focus on writing better documentation instead of putting together all the bits and pieces manually. For new projects, I can hopefully reuse this infrastructure and get the artifact created by the CI server from day 1. Whether that actually works, I’ll hopefully see soon.

Docker would also be worth looking into as a more lightweight alternative to VirtualBox. Last year I asked academic Twitter, and containers seemed, by a small margin, to be the preferred solution. Ideally, most of the scripts could be reused and simply executed in a suitable Docker container, though I haven’t tried it yet.
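
As a first experiment, one might try to reuse the existing provisioning script directly in a container. A rough sketch, assuming the script is self-contained enough to run in a plain Ubuntu image:

# run the existing provisioning script inside an Ubuntu 16.04 container
docker run --rm -it -v "$PWD":/artifact -w /artifact ubuntu:16.04 \
    bash -c "apt-get update && apt-get install -y sudo wget git && ./provision.sh"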

Acknowledgements

I’d like to thank Richard and Guido for comments on a draft.
