Data flow frameworks are algebraic formulations of the kind of questions compilers solve when they attempt to optimize programs. Gary Kildall was the first to see that a compiler could decide whether an expression like “a+b*c” has a constant value at a particular place in a program (and thus might have better code generated) by solving a set of algebraic equations using a straightforward and generalizable algorithm. Since this invention--the most important advance in compiler theory in the 1970s--dozens of uses, extensions, and variations have been proposed and used. As might be expected, all these developments have created a welter of notational and theoretical inconsistency. Marlowe and Ryder survey data flow frameworks (their bibliography contains 98 items) and organize them so that the rest of us do not have to.
The paper is primarily tutorial and is arranged like a textbook; indeed, it begins with a page-long table of contents. After a brief motivational section, Marlowe and Ryder present an algebraic definition of data flow frameworks that covers a page of small type and four pages of explanation. Although this section seems forbidding, it simplifies the rest of the paper. As the paper surveys the many uses of data flow, each is rewritten in the terminology of the basic definition so that techniques can be compared and evaluated without having to wade through the idiosyncratic terminology of the many individual papers. Finally, the survey considers a variety of properties that data flow frameworks might have; these properties generally concern how accurate the optimization results of a framework are and how much work must be applied to achieve the results. Marlowe and Ryder provide examples for properties that are not otherwise illustrated in the literature; these illustrations are original and useful.
This paper does a signal service. Although it is not simple--the subject, after all, is a technical one--the plan is orderly, the writing is clear, and the evaluations are solid. Some small errors (mostly typographical) occur, and the editing could have been a little more solid. Also, some paragraphs (for example, paragraph 5.6.1) remind me of Old Testament genealogies--simple lists of references with little context. Undoubtedly, these lists are forced by the conventions of journal papers. If I have to work with a data flow framework problem in the future, the first thing I will do is use this paper to understand how the new problem fits with what has gone before. I also encourage Marlowe and Ryder to expand this paper into a full textbook with practical programming examples and evaluations of the use of data flow frameworks in real programs.