Make it run; make it right; make it fast; make it small.
The anonymous mantra above captures the evolution of all software (except for the “make it bloat” phase of overly mature products). Distributed processing has passed the novelty stage where just making it run was a thrill. Most improvements in the past decade have been targeted at making it right; in other words, ensuring correct, consistent data in the face of increasingly complicated user interaction scenarios. Pervasive Internet availability and the proliferation of handheld devices are currently motivating research into making software fast and small.
The move of both casual and mission-critical applications to the Internet has made improved performance of distributed software a primary goal. However, as in many other domains, designers have continually faced a tradeoff between making software fast and making it right: one attribute could be increased only by decreasing the other.
The authors of this paper have broken through this dilemma. Their speculative techniques vastly improve performance without sacrificing correctness. Moreover, these are not just theoretical or hypothetical performance gains suggested by isolated laboratory experiments. The authors implement their techniques in off-the-shelf commodity software and benchmark the changes. Theirs is proven, real-world success.
Their core finding is that it is cheaper to spend central processing unit (CPU) cycles and random access memory (RAM), both abundant resources, on checkpoints that are usually discarded than to sit idle waiting for a synchronous remote procedure call to complete, because time is the scarce resource. Because of their higher latency, Internet-based applications show even greater performance improvements than local area network applications. It’s that rare case where an increase in need is met with an increase in satisfaction.
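The trade described above (spend memory on a checkpoint, keep working on a predicted answer, and roll back only if the prediction was wrong) can be illustrated with a toy user-level sketch. This is not the authors' Speculator library or its API. The names `slow_rpc`, `speculate`, and `apply_result` are hypothetical, and a deep copy of a dictionary stands in for a real checkpoint.

```python
import copy
import time
from concurrent.futures import ThreadPoolExecutor


def slow_rpc(value, latency=0.05):
    """Hypothetical stand-in for a synchronous remote procedure call."""
    time.sleep(latency)
    return value


def speculate(state, rpc, predicted, continuation, pool):
    """Run `continuation` on a predicted result while the real call is in flight.

    Returns (final_state, committed), where `committed` is True when the
    prediction matched the actual result.
    """
    checkpoint = copy.deepcopy(state)   # costs CPU and RAM, not waiting time
    pending = pool.submit(rpc)          # the real call proceeds in the background
    continuation(state, predicted)      # speculative work overlaps the latency
    actual = pending.result()
    if actual == predicted:
        return state, True              # prediction held; checkpoint is discarded
    restored = checkpoint               # prediction failed; roll back
    continuation(restored, actual)      # redo the work with the real answer
    return restored, False


def apply_result(state, result):
    """Hypothetical continuation: record the value the caller acted on."""
    state["log"].append(result)


with ThreadPoolExecutor(max_workers=1) as pool:
    # Correct prediction: the speculative state is kept.
    state_a, committed_a = speculate(
        {"log": []}, lambda: slow_rpc("cached"), "cached", apply_result, pool)
    # Wrong prediction: the state is restored from the checkpoint and redone.
    state_b, committed_b = speculate(
        {"log": []}, lambda: slow_rpc("fresh"), "cached", apply_result, pool)
```

In the common case the prediction holds and the checkpoint is thrown away, which is exactly the "usually discarded" cost the authors accept. The sketch makes no attempt to contain side effects that escape the checkpointed state, which any real system would have to handle.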
The paper is 15 pages long, with one full page of bibliographic references (the references alone provide a great survey of the field). It is easy to read, and is more like a Communications of the ACM article than a research paper. A download of the base BlueFS file system the authors modified is available at http://notrump.eecs.umich.edu/group/group.html, the home page of the Pervasive Computing Research Group at the University of Michigan. However, source code for the Speculator library and the modified file system is not available. The authors have revealed enough information to recreate their speculative library, but it would be great to see their work released under an open source license.