Trends in Concurrency 2010, part 3
This is the last post about the TiC’10 summer school and covers the two remaining lectures. Mark Moir talked about concurrent data structures and transactional memory, and last but not least, Madan Musuvathi presented the work they did at Microsoft Research to improve the testing of concurrent applications.
Mark Moir: Concurrent Data Structures and Transactional Memory
Stacks are the basic data structure in computer science, or so it seems; at the very least, they are used to introduce concurrency issues like the ABA problem. After presenting the lock-free stack with a version counter, which, he noted, some people consider not safe enough even with 64-bit counters, he presented a strategy for implementing a much more scalable stack. The idea is that a pair of push and pop operations leaves the actual stack unchanged as a result. Thus, such pairs can meet up and eliminate each other in parallel, independently of the actual stack, which avoids bottlenecks and contention. However, such a strategy still has to satisfy a correctness criterion, which he defines as linearizability with respect to a specific sequential semantics.
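To make the discussion a bit more concrete, here is a minimal Treiber-style lock-free stack sketch in Java (my own illustration, not code from the lecture). On the JVM, garbage collection sidesteps the ABA problem, which is why no version counter appears here; with manual memory management, the CAS in pop() would typically operate on a tagged pointer instead.

```java
import java.util.concurrent.atomic.AtomicReference;

// Minimal Treiber-style lock-free stack (illustrative sketch).
public class LockFreeStack<T> {
    private static final class Node<T> {
        final T value;
        Node<T> next;
        Node(T value) { this.value = value; }
    }

    private final AtomicReference<Node<T>> head = new AtomicReference<>();

    public void push(T value) {
        Node<T> node = new Node<>(value);
        Node<T> old;
        do {
            old = head.get();
            node.next = old;
        } while (!head.compareAndSet(old, node));   // retry on contention
    }

    public T pop() {
        Node<T> old;
        do {
            old = head.get();
            if (old == null) return null;           // empty stack
        } while (!head.compareAndSet(old, old.next));
        return old.value;
    }
}
```

The elimination idea from the lecture would add a side channel, for example a java.util.concurrent.Exchanger, where a pusher and a popper can hand a value over directly when the CAS on head is contended, leaving the stack itself untouched.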
Afterwards, he discussed and defined the non-blocking progress conditions, i.e., wait-freedom, lock-freedom, and obstruction-freedom. The general idea is that weakening the requirements can enable more efficient or simpler implementation strategies.
As an example of how weakening the requirements makes it possible to find an implementation strategy that better matches the performance requirements, he introduced their SNZI construct, a counter-like data structure. SNZI stands for Scalable NonZero Indicator.
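The interface is deliberately weaker than a counter: it only has to answer whether the surplus of arrivals over departures is non-zero. Below is a flat, non-scalable sketch of that interface (my own illustration); the actual SNZI algorithm composes such objects into a tree so that the contended root is only updated when the indicator flips between zero and non-zero.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Flat (non-scalable) sketch of the SNZI interface: query() only needs
// to report whether arrivals currently outnumber departures. The real
// SNZI arranges such objects hierarchically so most arrive/depart calls
// touch only a leaf, which is what makes it scale.
public class FlatIndicator {
    private final AtomicInteger surplus = new AtomicInteger();

    public void arrive()   { surplus.incrementAndGet(); }
    public void depart()   { surplus.decrementAndGet(); }
    public boolean query() { return surplus.get() > 0; }
}
```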
In the second lecture, he reported on their experience with transactional memory: not only software transactional memory (STM), but also hardware TM (HTM) and hybrid approaches. He presented several interesting insights they gained while experimenting with the TM provided by Sun's Rock processor. Especially interesting was his report on the various factors that led to problems in the HTM system. For instance, one of the main problems was that a transaction could fail because the branch predictor went down the wrong path and speculatively executed instructions, which in turn led to transaction aborts.
Thereafter, he discussed the performance they managed to achieve with the different TM and lock-based systems. The result is that it very much depends on the algorithm and its parallel behavior. The general conclusion, however, is that STM, HTM, and hybrid approaches do not yet provide the performance necessary for general applications.
Another experiment he reported on was the attempt to simplify the implementation of concurrent data structures, which was mostly a success. Such data structures still need careful tuning with TM, but TM enables simplified algorithms and, depending on the use case and algorithm, even performance improvements.
Madan Musuvathi: Concurrency Testing: Challenges, Algorithms, and Tools
First, Madan tried to wake us up by engaging us in a discussion about the tradeoffs of testing vs. verification. This is obviously about the effort you have to spend to achieve reasonable results. However, when it comes to critical systems, where failures might cause the loss of life, it ends up weighing ethical concerns against economic decisions. So, yes, he achieved his goal, and of course people expressed their strong opinions.
He then continued by giving a demo of CHESS, a tool for debugging concurrent programs. It tries to explore the potentially problematic instruction interleavings with different algorithms, since a brute-force approach would not be feasible for real programs. The tool supports two different modes. In the fast mode, it inserts scheduling points before every synchronization operation, volatile access, and interlocked operation. This finds many bugs in practice, but not all. The second mode, the data-race mode, inserts scheduling points before every memory access, which allows it to identify race conditions, since it captures all sequentially consistent executions. What the tool does not do is generate and test different inputs, which is a hard problem in its own right.
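As a purely illustrative example (not from the talk), here is the kind of ordering bug such a tool is good at finding: the failure only occurs in the one interleaving where the reader runs between the writer's two statements, so naive stress testing rarely hits it, while a scheduler that inserts scheduling points before every volatile access can enumerate exactly that interleaving.

```java
// Illustrative ordering bug: the flag is published before the data is
// ready, so the assertion fails only under one specific interleaving.
public class OrderingBugDemo {
    static volatile boolean initialized = false;
    static volatile int data = 0;

    public static void main(String[] args) throws InterruptedException {
        Thread writer = new Thread(() -> {
            initialized = true;   // bug: flag set before data is written
            data = 42;
        });
        Thread reader = new Thread(() -> {
            if (initialized && data != 42) {
                throw new AssertionError("read uninitialized data");
            }
        });
        writer.start();
        reader.start();
        writer.join();
        reader.join();
    }
}
```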
The remainder of the talk discussed two different strategies for dealing with the exploding number of possible interleavings. The first is a reduction approach: the goal is to eliminate behaviorally equivalent interleavings, and he described a number of opportunities for doing so. The second approach is to prioritize interleavings that are likely to exhibit problematic behavior. His claim is that random is the best heuristic here, but it can of course be combined with the reduction approach.
In the second lecture, he detailed the randomization approach and presented another tool called Cuzz, which allows disciplined randomization of schedules. He then went into the theory behind this approach to establish an estimate of how likely it is to find bugs this way. The general idea is that bugs differ in complexity when it comes to the necessary preconditions and possible constellations of interleavings. Another point is that many bugs share the same root cause. This is expressed by a metric called 'bug depth', which is the number of ordering constraints that are sufficient to trigger the bug. In his experience, most bugs have a rather small bug depth. With the bug depth, he can calculate a lower bound on the probability of finding any given bug in the program.
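As a back-of-the-envelope sketch, assuming I noted the bound correctly as roughly 1/(n * k^(d-1)) for a program with n threads, k scheduling steps, and a bug of depth d, the numbers come out surprisingly usable for small depths:

```java
// Sketch of the probabilistic guarantee behind the randomized scheduling
// approach, ASSUMING a lower bound of 1 / (n * k^(d-1)) per run for a bug
// of depth d, n threads, and k scheduling steps (my notes, not a quote).
public class BugDepthBound {
    static double lowerBound(int threads, int steps, int depth) {
        return 1.0 / (threads * Math.pow(steps, depth - 1));
    }

    public static void main(String[] args) {
        int n = 4, k = 1000;   // hypothetical program size
        for (int d = 1; d <= 3; d++) {
            double p = lowerBound(n, k, d);
            // 1/p is the expected number of independent runs to hit the bug.
            System.out.printf("depth %d: p >= %.2e, ~%.0f runs%n", d, p, 1.0 / p);
        }
    }
}
```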
His claim is that, with all the optimizations applied to their algorithms, Cuzz beats this theoretical bound in practice. Most programs exhibit a lot of bugs, and these bugs are triggered often, which allows Cuzz to find many of them within a few hundred runs.