Theory and Practice of Language Implementation, part 1
Thanks to the financial support of FWO and a grant from the summer school itself, I was able to go to Eugene, Oregon and learn about the theory and practice of language implementation.
The first part of this summer school had three different topics. Ras Bodik presented highly interesting approaches to program synthesis. Olivier Danvy taught us how to use continuations and how to transform a program into continuation-passing style. Matt Might tried to explain control-flow analysis (CFA) for higher-order programs, starting with 0-CFA (zero-CFA) and moving on to k-CFA.
Algorithmic Program Synthesis
The program synthesis lectures were, from my perspective, the most comprehensible and interesting ones. Ras gave a high-level overview of different approaches that utilize expert knowledge, i.e. domain insights, to guide the synthesis of algorithms. He presented two different directions: deductive and selective synthesis. Deductive synthesis has its roots in the formal world. The idea is to define a problem in a formal language and to automatically derive a proof for it. As a by-product of this proof, an algorithm to solve the problem is synthesized. This approach is limited mainly by the formal skills of the programmers, and thus has not been successful at larger scale.
Selective synthesis tries to circumvent these limitations by preserving the programmers' usual tools as the basic foundation. One example he explained in more detail was SKETCH. Here, the programmer describes a space of possible programs, and the synthesizer chooses a candidate from this space by testing/verifying whether it fulfills a specification, which also has to be given by the programmer. An example could be an optimized sorting algorithm tailored to a specific memory model: the programmer implements the basic sketch and leaves some holes open, which need to be synthesized. As a specification, she can give a simple bubble sort, or alternatively a checker that tests whether the resulting data is sorted. He concluded his lectures with a demonstration of angelic programming, which uses the idea of sketching a solution and automatically deriving an implementation.
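To make the selective idea concrete, here is a minimal toy version in Haskell (my own illustration, not SKETCH itself, which is C-like): the sketch leaves a hole for a single constant, the specification is doubling, and the "synthesizer" simply enumerates candidate hole values and keeps the first one that agrees with the specification on a set of test inputs.

```haskell
-- Toy selective synthesis (illustration only, not SKETCH):
-- the sketch is "f x = x * HOLE"; we search for a value of the
-- hole such that f agrees with the specification on all tests.
sketch :: Int -> Int -> Int
sketch hole x = x * hole

-- The specification: here, simply doubling.
spec :: Int -> Int
spec x = x + x

-- Enumerate candidate hole values and pick the first one that
-- satisfies the spec on all test inputs.
synthesize :: [Int] -> Maybe (Int -> Int)
synthesize tests =
  case [h | h <- [0 .. 10], all (\x -> sketch h x == spec x) tests] of
    (h : _) -> Just (sketch h)
    []      -> Nothing

main :: IO ()
main = case synthesize [0 .. 5] of
  Just f  -> print (f 21)   -- prints 42: the hole was filled with 2
  Nothing -> putStrLn "no candidate found"
```

Real sketching systems replace this brute-force enumeration with constraint solving, but the division of labor is the same: the programmer supplies the sketch and the specification, and the tool fills the holes.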
Continuations to Go
Olivier was the first lecturer to ask us to do exercises, which was a good way to force us to really think about the content of his lectures. First he taught us the three basic steps to transform a program into continuation-passing style (CPS): 1. name all intermediate results, 2. sequentialize their evaluation, and 3. introduce the actual continuations. Then he gave us some exercises, for instance to build the list of suffixes of a list, and the list of prefixes. In a language supporting linked lists, the list of suffixes can be realized by sharing the data between all suffixes: it consists of pointers into the original list, each pointing to the element where the corresponding suffix starts. For the list of prefixes, it is not possible to share the data; but by implementing it in continuation-passing style, you can share the computation of the prefixes, similar to sharing the data in the case of the suffixes.
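Here is a sketch of the two exercises in Haskell (the function names and details are my own reconstruction): suffixes shares the data, while prefixes shares the computation through a continuation-style functional accumulator that is extended constructor by constructor.

```haskell
-- Suffixes: each suffix is a pointer into the original list,
-- so all the data is shared between the suffixes.
suffixes :: [a] -> [[a]]
suffixes []            = [[]]
suffixes xs@(_ : rest) = xs : suffixes rest

-- Prefixes: the data cannot be shared, but the function k,
-- which prepends the prefix built so far, is shared and
-- extended at each step -- sharing the computation instead.
prefixes :: [a] -> [[a]]
prefixes xs = go xs id
  where
    go []       k = [k []]
    go (y : ys) k = k [] : go ys (k . (y :))
```

With this convention, prefixes [1,2,3] yields [[],[1],[1,2],[1,2,3]] and suffixes [1,2,3] yields [[1,2,3],[2,3],[3],[]].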
Another example he asked us to implement was the convolution of two lists: the result is a list of pairs matching each element of the first list with the corresponding element of the second list reversed. The interesting thing here is that it can be implemented recursively with n recursion steps for lists of n elements, without explicitly reversing the second list. The solution is simple once you figure out that you can traverse the second list at return time of the recursion, which amounts to a reversal. Later he pointed out that an implementation of this, after CPS transformation and defunctionalization, results in the plain, usual zip/reverse as it would be done in Haskell.
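One way to write this in Haskell (my reconstruction; it assumes both lists have the same length): the recursion descends along the first list, and the second list is consumed pair by pair while the recursion returns.

```haskell
-- Convolution in one recursive pass: the second list is traversed
-- at return time of the recursion, which amounts to reversing it
-- without ever building an intermediate reversed list.
-- Assumes xs and ys have the same length.
convolve :: [a] -> [b] -> [(a, b)]
convolve xs ys = fst (walk xs)
  where
    walk []       = ([], ys)              -- bottom: start consuming ys
    walk (a : as) = let (r, b : bs) = walk as
                    in ((a, b) : r, bs)   -- pair up on the way back out

-- Essentially what CPS transformation plus defunctionalization
-- turns the above into: the obvious two-pass version.
convolve' :: [a] -> [b] -> [(a, b)]
convolve' xs ys = zip xs (reverse ys)
```

For example, convolve [1,2,3] [4,5,6] yields [(1,6),(2,5),(3,4)].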
Control-flow Analysis of Higher-Order Languages
Well, and then there was the control-flow analysis (CFA). At the moment, I have not completely figured out how to read the slides, since my Greek is rather limited. What I was able to take away from this course is that there are several different techniques, of which 0-CFA is a robust and efficient one. In general, the k in k-CFA denotes the amount of call history that is considered in the analysis. Usually only 0-CFA and 1-CFA are applied in practice, i.e. in compilers, since the state space of the analysis explodes very fast as k grows. The concrete semantics of a program would be described by an infinite-CFA. He also talked about possible optimizations, like applying (abstract) garbage collection to eliminate unreachable states/possibilities during the CFA, thereby increasing the efficiency of the analysis significantly.
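As a small illustration of what the k buys (my own example, phrased in Haskell but meant as an untyped higher-order program under analysis): an identity function is applied at two call sites. 0-CFA analyzes the function body in a single abstract context, so the values flowing in from both call sites are merged and both sites appear to possibly receive either result; 1-CFA remembers the last call site and keeps the two flows apart.

```haskell
-- Illustration of context (in)sensitivity in k-CFA; read this as
-- a program being analyzed, not as something the type checker
-- would need to worry about.
identity :: a -> a
identity x = x

siteA :: Int
siteA = identity 1   -- 0-CFA: {1, 2} may flow out of identity here
                     -- 1-CFA: only 1, since the call site is tracked

siteB :: Int
siteB = identity 2   -- 0-CFA: {1, 2} again; 1-CFA: only 2
```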