This post is co-authored with Fabio Niephaus.

The Truffle framework allows us to write language interpreters in an easy way. In combination with the Graal compiler and its partial evaluator, such Truffle interpreters can be as fast as custom VMs. Crucial to achieving this performance are so-called specializations, which are used to define highly optimized, speculative implementations of the very basic operations of a language.

Writing such specializations is generally pretty straightforward, but there is at least one common pitfall. When designing specializations, we need to remind ourselves that the parameter types of specializations are technically guards. This means that the activation semantics of specializations depend not only on explicit guards, but also on the semantics of Java’s type system.

Pitfall for Specializations

Let’s have a look at the following code example. It sketches a Truffle node that can be used to check whether an object is some kind of number.

public abstract class IsNumberNode extends Node {

  public abstract int executeEvaluated(Object o);

  @Specialization
  protected final int doInteger(final int o) {
    return 1;
  }

  @Specialization
  protected final int doFloat(final float o) {
    return 2;
  }

  @Specialization
  protected final int doObject(final Object o) {
    return 0;
  }
}

Truffle generates a concrete implementation for this abstract class. To use it, the executeEvaluated(Object) method can be called, which will automatically select one of the three specializations for int, float, and Object based on the given argument.

Next, let’s see this node in action:

IsNumberNode n = IsNumberNodeGen.create();

n.executeEvaluated(42);            // --> 1
n.executeEvaluated(44.3f);         // --> 2 (note the f suffix: 44.3 alone is a double)
n.executeEvaluated(new Object());  // --> 0

n.executeEvaluated(22.7f);         // --> 2

Great, so the node works as expected, right? Let’s double check:

IsNumberNode n = IsNumberNodeGen.create();

n.executeEvaluated(new Object());  // --> 0
n.executeEvaluated(44.3f);         // --> 0
n.executeEvaluated(42);            // --> 0

This time, the node seems to always return 0. But why?

The first time the node is invoked, it sees an Object and returns the correct result. Additionally, and this is the important side effect, this invocation also activates the doObject(Object) specialization inside the node. When the node is invoked again, it first checks whether any of the previously activated specializations matches the given argument. In our example, the float and int values are also Java Objects once boxed, so doObject(Object) matches and the node always returns 0. This also explains the behavior of the node in the previous series of invocations. There, the node was called with an int, then a float, and only then an Object. Each call activated the matching specialization, so all three were active and the node returned the expected result for all invocations.
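The activation behavior described above can be imitated in plain Java. The following is only a minimal sketch under stated assumptions: it is not the code that Truffle generates, and the names ModelNode and Spec are hypothetical. It simply keeps an "active" flag per specialization, consults active specializations first, and activates a new one only if none of them matches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Simplified, hand-written model of specialization activation.
// Assumption: this is NOT the code Truffle generates; ModelNode and Spec are hypothetical names.
final class ModelNode {

  // One specialization: a name, the guard implied by its parameter type, and its result.
  static final class Spec {
    final String name;
    final Predicate<Object> guard;
    final int result;
    boolean active; // set once the specialization has been activated

    Spec(String name, Predicate<Object> guard, int result) {
      this.name = name;
      this.guard = guard;
      this.result = result;
    }
  }

  // Declaration order mirrors the IsNumberNode example.
  private final List<Spec> specs = new ArrayList<>();

  ModelNode() {
    specs.add(new Spec("doInteger", o -> o instanceof Integer, 1));
    specs.add(new Spec("doFloat", o -> o instanceof Float, 2));
    specs.add(new Spec("doObject", o -> true, 0)); // every boxed value is an Object
  }

  int execute(Object o) {
    // Fast path: consult only the specializations that are already active.
    for (Spec s : specs) {
      if (s.active && s.guard.test(o)) {
        return s.result;
      }
    }
    // Slow path: activate the first declared specialization whose guard matches.
    for (Spec s : specs) {
      if (s.guard.test(o)) {
        s.active = true;
        return s.result;
      }
    }
    throw new IllegalStateException("no specialization for " + o);
  }
}
```

In this model, calling execute(new Object()) on a fresh ModelNode activates only doObject, so a later execute(42) returns 0. On another fresh instance, calling execute(42) first returns 1.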

One reason for these specialization semantics is that we need to carefully balance the benefits of specializations and the cost of falling back to a more general version of an operation. This falling back, or more technically deoptimizing, can have a high run-time overhead, because it might require recompilation of methods by the just-in-time compiler. Thus, once we have seen the need for a more general specialization, we try to continue to use it, and only activate another specialization when none of the previously used ones is sufficient. Another important reason for this approach is to minimize the number of guards that need to be checked at run time. The general assumption here is that an argument has a high chance of matching a previously activated specialization.

In case we do not actually want the Java semantics, as in our example, the doObject(Object) specialization needs to be guarded. This means we need to be sure that it cannot be called with and activated by ints and floats. Here’s what this could look like in our example:

public abstract class IsNumberNode extends Node {
  // ...

  protected final boolean isInteger(final Object o) {
    return o instanceof Integer;
  }

  protected final boolean isFloat(final Object o) {
    return o instanceof Float;
  }

  @Specialization(guards = {"!isInteger(o)", "!isFloat(o)"})
  protected final int doObject(final Object o) {
    return 0;
  }
}

These guards are parameters of the @Specialization annotation, and one can use helper methods that perform instanceof checks to guard the specialization accordingly.

For nodes with many specializations, this can become very tedious, because we need to repeat all implicit and explicit guards for such specializations. To avoid this in cases where there is only one such fallback specialization, the Truffle framework provides the @Fallback annotation as a shortcut. It implicitly takes the guards of all other specializations and negates them. Thus, we can write the following for our example:

public abstract class IsNumberNode extends Node {
  // ...

  @Fallback
  protected final int doObject(final Object o) {
    return 0;
  }
}
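To make the "negate all guards" idea concrete, here is a small plain-Java illustration. It is a sketch only and says nothing about how the Truffle DSL implements @Fallback internally. The class FallbackGuardSketch and its members are hypothetical names, and the two listed guards stand in for the parameter types of doInteger(int) and doFloat(float).

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Plain-Java illustration only; this is not how the Truffle DSL is implemented.
// FallbackGuardSketch, OTHER_GUARDS, and fallbackMatches are hypothetical names.
final class FallbackGuardSketch {

  // Guards implied by the parameter types of doInteger(int) and doFloat(float).
  static final List<Predicate<Object>> OTHER_GUARDS = Arrays.<Predicate<Object>>asList(
      o -> o instanceof Integer,
      o -> o instanceof Float);

  // The fallback applies only if none of the other guards accepts the value.
  static boolean fallbackMatches(Object o) {
    for (Predicate<Object> guard : OTHER_GUARDS) {
      if (guard.test(o)) {
        return false;
      }
    }
    return true;
  }
}
```

With these guards, fallbackMatches(42) and fallbackMatches(44.3f) are false, while a plain Object or a double such as 44.3 is accepted by the fallback.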

How to Avoid Specialization Pitfalls?

As the example demonstrates, the described problem can occur when there are specializations for types that are in the same class hierarchy, especially when there is a specialization for the most general type, Object.

At the moment, Truffle users can only manually check if they have nodes with such specializations to avoid this issue. But perhaps we can do a little better.

A testing tool that ensures coverage for all specializations as well as all possible combinations would be very useful. This would allow us to find erroneous or undesired generalization relationships between specializations, and could also ensure that a node provides all required specializations. Especially for beginners, it would also be nice to have a visual tool to inspect specializations and their activation behavior. Perhaps it could be part of IGV.
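As a rough idea of what a check over all combinations might look like, the sketch below runs a set of inputs through a fresh node for every possible call order and reports whether any input ever produces a different result. This is a hypothetical helper, not an existing Truffle tool. A node is modeled here simply as a Function from Object to Integer, and OrderSensitivityCheck and isOrderIndependent are made-up names. Note that it only detects order dependence. It does not know which result is the correct one.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical test helper, not part of Truffle.
// A "node" is modeled as a Function<Object, Integer>; the factory creates a fresh one.
final class OrderSensitivityCheck {

  // Returns true if every input yields the same result in every call order.
  static boolean isOrderIndependent(Supplier<Function<Object, Integer>> factory,
                                    List<Object> inputs) {
    Map<Integer, Integer> firstSeen = new HashMap<>(); // input index -> first observed result
    return permute(factory, inputs, new ArrayList<>(), new boolean[inputs.size()], firstSeen);
  }

  // Enumerates all permutations of input indices by backtracking.
  private static boolean permute(Supplier<Function<Object, Integer>> factory,
                                 List<Object> inputs, List<Integer> order,
                                 boolean[] used, Map<Integer, Integer> firstSeen) {
    if (order.size() == inputs.size()) {
      // A complete call order: replay it against a fresh node.
      Function<Object, Integer> node = factory.get();
      for (int idx : order) {
        int result = node.apply(inputs.get(idx));
        Integer previous = firstSeen.putIfAbsent(idx, result);
        if (previous != null && previous != result) {
          return false; // same input, different result in another order
        }
      }
      return true;
    }
    for (int i = 0; i < inputs.size(); i++) {
      if (used[i]) {
        continue;
      }
      used[i] = true;
      order.add(i);
      boolean ok = permute(factory, inputs, order, used, firstSeen);
      order.remove(order.size() - 1); // undo the choice (removes by index)
      used[i] = false;
      if (!ok) {
        return false;
      }
    }
    return true;
  }
}
```

A node affected by the pitfall would make this check return false for inputs such as an int, a float, and a plain Object, because the float yields a different result once the Object has been seen first. The number of permutations grows factorially, so this only suits small input sets.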

Depending on how commonly one actually wants such generalization or subsumption semantics of specializations, one could consider using Truffle’s annotation processors to perform extra checks. They already perform various checks and trigger errors, for example, for syntax errors in guard definitions. Perhaps they could also generate a warning or an info message when they detect specializations for types that are part of the same class hierarchy, to make users aware of this issue. Thus, if generalization or subsumption is less common, one might simply indicate it explicitly, perhaps in addition to the existing replaces parameter for the @Specialization annotation.