Data mining involves finding meaningful patterns in large data sets. It presents interesting problems, and good solutions would be valuable to most businesses.

This paper tackles the problem of computing the strength of approximate functional dependencies, which occur when the values in one column of a table are partly predicted by the values in other columns. The authors review well-known measures of the strength of such dependencies and compare three in detail: a normalized conditional entropy, the τ of Piatetsky-Shapiro, and the g₃ of Kivinen and Mannila. They prove that entropy is the only measure that satisfies five plausible axioms, extending the work of Shannon and Weaver [1]. They then show that the three measures are independent by presenting cases that maximize the differences between them.
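To make the two measures the reviewer is confident about concrete, here is a minimal sketch (not the paper's code; the toy table and function names are the reviewer's own) of g₃, the minimum fraction of rows that must be removed for the dependency X → Y to hold exactly, and the conditional entropy H(Y | X), which is zero exactly when X → Y holds:

```python
from collections import Counter, defaultdict
from math import log2

def g3(rows, x_idx, y_idx):
    """Kivinen-Mannila g3: smallest fraction of rows to delete
    so that the functional dependency X -> Y holds exactly."""
    groups = defaultdict(Counter)
    for row in rows:
        groups[row[x_idx]][row[y_idx]] += 1
    # For each X-value, keep the rows with its most common Y-value.
    kept = sum(max(counts.values()) for counts in groups.values())
    return 1 - kept / len(rows)

def conditional_entropy(rows, x_idx, y_idx):
    """H(Y | X) in bits; 0 iff X -> Y is an exact dependency."""
    n = len(rows)
    groups = defaultdict(Counter)
    for row in rows:
        groups[row[x_idx]][row[y_idx]] += 1
    h = 0.0
    for counts in groups.values():
        group_size = sum(counts.values())
        for c in counts.values():
            h -= (c / n) * log2(c / group_size)
    return h

# Hypothetical toy table (city, country): city -> country almost holds.
table = [
    ("Paris", "France"),
    ("Paris", "France"),
    ("Paris", "Texas"),   # the one exception
    ("Lyon",  "France"),
]
print(g3(table, 0, 1))                  # 0.25: remove 1 of 4 rows
print(conditional_entropy(table, 0, 1))
```

On this table g₃ is 0.25 while the conditional entropy is nonzero; the paper's point is that such measures can be made to disagree arbitrarily, which is what its extremal cases demonstrate.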

The authors computed the differences between the three measures on four well-known data sets. Figures 7 through 9 use bar charts to show the means, ranges, and standard deviations of these differences; box plots or histograms would have served the reader better. Even so, the distribution of differences says little about the relations between the underlying numerical data, which scattergrams reveal best. This one section is not worth reading, but theorists will find the rest of the paper worthwhile.