This post is a follow-up to my first report on the TiC’10 summer school. It mainly covers the talks about X10 and formal aspects of concurrency.
Vijay Saraswat gave an overview of X10, Matthew Parkinson introduced separation logic and deny-guarantee reasoning, and Ganesan Ramalingam reported on Analysis and Verification of Concurrent Programs.
Vijay Saraswat: The X10 Language: Design and Implementation
The talk started with a brief description of the usual application scenarios IBM tries to address. They are looking for a language that allows programming homogeneous multi- or many-core systems as well as potentially heterogeneous cluster or HPC systems. The main goal is to design an expressive, productive, and efficient programming model for application developers. The means to that end are a type system, static analysis, a comfortable debugger, and optimized standard libraries.
X10 is part of the PGAS family, but compared to UPC or Co-Array Fortran, X10 is an asynchronous PGAS (APGAS) language, which means that accesses across place boundaries are handled asynchronously by the runtime.
Some of the language properties discussed are briefly summarized below. In an APGAS language, writes are only local; for remote writes you need to use a place-shift operation. Its semantics is synchronous, but the implementation is usually asynchronous, via messages. In that regard, Vijay also mentioned a tail-call-like optimization to minimize the number of messages that need to be exchanged. The atomic semantics is, well, rough. They use a per-place lock for atomic blocks, but atomic operations on primitive values like integers are mapped to the atomic operations provided by the underlying system. It is thus inherently ‘broken’, i.e., it is up to the programmer to ensure that different atomic blocks do not interfere with each other.
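To make the atomicity issue concrete, here is a minimal Python sketch of the per-place-lock model (the class and method names are my own, not X10 API): atomic blocks within a place all take the same lock, while a primitive mapped to a hardware atomic would bypass that lock entirely, so the two forms do not synchronize with each other.

```python
import threading

class Place:
    """Toy model of an X10 place: all atomic blocks in a place share one lock."""
    def __init__(self):
        self.lock = threading.Lock()   # the per-place lock
        self.counter = 0

    def atomic_increment(self):
        # 'atomic { counter++; }' -- serialized against every other atomic
        # block in this place via the shared per-place lock
        with self.lock:
            self.counter += 1

    def fast_increment(self):
        # A primitive mapped to a hardware atomic bypasses the place lock
        # entirely (modeled here as an unsynchronized update), so it does NOT
        # synchronize with atomic_increment -- mixing the two is the
        # programmer's problem, exactly as described above.
        self.counter += 1

place = Place()
threads = [threading.Thread(target=place.atomic_increment) for _ in range(100)]
for t in threads: t.start()
for t in threads: t.join()
print(place.counter)  # 100: the lock-based path alone is safe
```

The demo only exercises the lock-based path, which is safe on its own; the point is that nothing in the model forces `fast_increment` to respect the per-place lock.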
The implementation uses the notion of workers, which represent threads of the underlying system. There are 1 to n workers per place, and activities are executed by workers. At the moment they use a model similar to Java’s fork/join, where the number of workers can grow over time. A problematic point is the implementation of the when(C) construct, which currently requires spawning new threads, since the thread executing the activity that uses the when construct will actually block on it. They will later change to a Cilk-like work-stealing system, which will make it possible to use a constant number of threads.
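The blocking behavior of when(C) can be sketched with a condition variable in Python (class and method names are mine; this is an analogy, not the X10 runtime's actual code): the worker running the activity really does sit inside a wait loop until the condition becomes true, which is why extra threads are needed to keep the place making progress.

```python
import threading

class WhenDemo:
    """Toy version of X10's 'when (C) S': block the current worker until C holds."""
    def __init__(self):
        self.cond = threading.Condition()
        self.ready = False
        self.result = None

    def when_ready(self):
        # The worker executing this activity blocks right here -- which is why
        # the current runtime has to spawn additional threads while an
        # activity sits inside a 'when'.
        with self.cond:
            while not self.ready:
                self.cond.wait()
            self.result = "ran"

    def make_ready(self):
        # Any state change that might make the condition true must wake the waiters.
        with self.cond:
            self.ready = True
            self.cond.notify_all()

demo = WhenDemo()
t = threading.Thread(target=demo.when_ready)
t.start()
demo.make_ready()
t.join()
print(demo.result)  # ran
```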
The place structure is static from the beginning and currently flat. This might change in the future: they are thinking about a hierarchical model as a better fit for GPGPUs, and the plan is to explore two-level hierarchical places.
The design with workers and activities allows the compiler to coarsen concurrency and thus remove the overhead of very fine-grained activities.
Another, more flexible language construct they introduced recently is labeled finishes. They allow the implementation of server-like programs where you do not want a purely hierarchical relation between finish scopes but need more flexibility: an async can indicate on which finish it wants to announce its completion.
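A labeled finish can be sketched as a named completion-counting scope that asyncs register with explicitly, instead of announcing to the lexically enclosing finish. This Python sketch is my own analogy (the `Finish` class and `async_` method are invented names, not X10 API):

```python
import threading

class Finish:
    """Toy 'finish' scope: counts spawned asyncs and waits for their completion."""
    def __init__(self):
        self.lock = threading.Condition()
        self.pending = 0

    def async_(self, body):
        # Spawn an activity that announces its completion to THIS finish,
        # regardless of which finish lexically encloses the spawn site.
        with self.lock:
            self.pending += 1
        def run():
            try:
                body()
            finally:
                with self.lock:
                    self.pending -= 1
                    self.lock.notify_all()
        threading.Thread(target=run).start()

    def wait(self):
        with self.lock:
            while self.pending:
                self.lock.wait()

results = []
server_scope = Finish()           # a long-lived, explicitly named finish
for i in range(5):
    server_scope.async_(lambda i=i: results.append(i))
server_scope.wait()
print(sorted(results))  # [0, 1, 2, 3, 4]
```

For a server-like program, such a scope can outlive the code that spawned into it, which is exactly the flexibility a strict finish hierarchy lacks.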
Matthew Parkinson: Deny-Guarantee Reasoning
The largest portion of his lectures was about the foundations of his own work; thus, he introduced separation logic. The goal is to have a logical foundation for reasoning about programs that use pointer data structures, i.e., a heap as in typical object-oriented programs. As far as I understood, this logic provides the means to establish statements about the behavior of programs. More precisely, it is possible to prove certain semantic assertions about their stacks and heaps.
One thing that can be done is to check whether the sets of memory locations modified by the parallel parts of a program are disjoint, and thus whether the program is race-free.
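The disjointness idea can be illustrated with a crude footprint check in Python (entirely my own toy encoding, far simpler than what separation logic actually proves): two parallel parts cannot race if neither writes a location the other reads or writes.

```python
def race_free(reads_a, writes_a, reads_b, writes_b):
    """Crude footprint check: two parallel parts are race-free if neither
    writes a location the other part reads or writes -- the intuition behind
    the disjointness that separation logic's '*' connective enforces."""
    a_w, b_w = set(writes_a), set(writes_b)
    return not (a_w & (set(reads_b) | b_w)) and not (b_w & set(reads_a))

print(race_free({"x"}, {"y"}, {"x"}, {"z"}))  # True: shared reads of x, disjoint writes
print(race_free({"x"}, {"y"}, {"y"}, {"z"}))  # False: b reads y while a writes it
```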
Rely-guarantee reasoning is a logical framework for reasoning about the composition of concurrent programs and is independent of separation logic. However, it is not able to capture the typical fork/join parallelism of many programs.
Deny-guarantee reasoning is a reformulation of both that explicitly supports dynamic parallelism and thus enables formal reasoning about this kind of concurrent program.
Well, I think I understand what it is useful for. The next thing I would need is a good textbook introducing all the basics very patiently. It might be necessary to prove that my barrier/phaser algorithm is actually correct :-/
Ganesan Ramalingam: Analysis & Verification of Concurrent Programs
Another Greek talk *sigh*. Anyway, let’s see what it is good for: the goal of the work Mr. Ramalingam presented is to verify programs against some correctness criterion and, for instance, find concurrency bugs. To this end, they are using a technique called concurrent shape analysis, which allows static verification in the presence of a dynamic heap and a statically unbounded number of threads by using abstract interpretation.
Before going into the details, he presented a typical stack example and showed how it breaks once concurrency comes into play. To be able to reason about the correctness of the implementation, a sequential specification of the stack is necessary. In a concurrent setting, this implies that the operations have to be linearizable in some form. The definition: an interleaved execution is said to be linearizable iff it is equivalent to a legal sequential execution that preserves the order of non-overlapping operations.
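The definition can be turned directly into a (hopelessly inefficient, but instructive) brute-force check. This Python sketch is my own encoding, not the technique from the lectures: it tries every ordering of the operations, keeps only those that respect the real-time order of non-overlapping operations, and replays them against a sequential stack specification.

```python
from itertools import permutations

def legal_stack_run(seq):
    """Replay (op, arg, result) triples against a sequential stack spec."""
    stack = []
    for op, arg, result in seq:
        if op == "push":
            stack.append(arg)
        else:  # pop; modeled as returning None on an empty stack
            if (stack.pop() if stack else None) != result:
                return False
    return True

def linearizable(history):
    """history: (start, end, op, arg, result) tuples with real-time intervals.
    Brute force over all orders -- only feasible for tiny histories."""
    for order in permutations(history):
        # the chosen order must respect the real-time order of
        # non-overlapping operations ...
        if any(b[1] < a[0] for i, a in enumerate(order) for b in order[i + 1:]):
            continue
        # ... and must be a legal run of the sequential specification
        if legal_stack_run([(op, arg, res) for _, _, op, arg, res in order]):
            return True
    return False

# push(1) overlaps pop(): ordering the push first makes pop() == 1 legal
print(linearizable([(0, 2, "push", 1, None), (1, 3, "pop", None, 1)]))   # True
# here pop() strictly follows push(1) yet claims to return 2: impossible
print(linearizable([(0, 1, "push", 1, None), (2, 3, "pop", None, 2)]))   # False
```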
Well, and then he went on to explain the technique in detail.
In the second lecture, he started with serializability, similar to what is done in databases. His goal is to use the techniques presented earlier to synthesize correct locking schemes for originally sequential data structures, avoiding the trivial solution while still providing an efficient scheme. The simplified idea is to compute a sequential proof, introduce the locks, and afterwards eliminate redundant locks to regain performance.
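As a toy stand-in for that last step (not the actual analysis from the lectures; the data and names below are invented), one can sketch the "introduce, then eliminate" idea in Python: first guard every shared location an operation touches, then drop locks on locations that only a single operation ever accesses, since those can never be contended.

```python
from collections import Counter

def synthesize_locks(ops):
    """ops: {operation name: set of shared locations it accesses}.
    Step 1 (naive): guard every accessed location with its own lock.
    Step 2 (eliminate): a lock on a location touched by only one
    operation can never be contended, so it is redundant and dropped."""
    uses = Counter(loc for locs in ops.values() for loc in set(locs))
    return {op: {loc for loc in locs if uses[loc] > 1}
            for op, locs in ops.items()}

ops = {"push": {"top", "size"}, "pop": {"top", "size"}, "peek_stats": {"hits"}}
print(synthesize_locks(ops))
# push/pop must lock 'top' and 'size'; 'hits' ends up needing no lock at all
```

The real technique reasons about proofs rather than raw access sets, but the shape of the optimization is the same: start safe, then peel away synchronization that the analysis shows is unnecessary.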