Computing Reviews
On approximation measures for functional dependencies
Giannella C., Robertson E. Information Systems 29(6): 483-507, 2004. Type: Article
Date Reviewed: Jul 6 2005

Data mining involves finding meaningful patterns in large data sets. It presents interesting problems, and good solutions to these would be valuable to most businesses.

This paper tackles the problem of computing the strength of approximate functional dependencies. These occur if values in one column in a table are partly predicted by values in other columns. The authors’ research reviews well-known ways to measure the strength of such dependencies. It compares three in detail: a normalized conditional entropy, the τ of Piatetsky-Shapiro, and the g3 of Kivinen and Mannila. The authors prove that entropy is the only possible measure that satisfies five plausible axioms, extending Shannon and Weaver’s work [1]. They then show that the three measures are independent, by presenting cases that maximize the differences between them.
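To make the three measures concrete, here is a minimal Python sketch that scores a dependency x -> y over a small table, with 0 meaning the dependency holds exactly. It is an illustration under stated assumptions, not the paper's own formulation: the function names are hypothetical, the entropy is normalized as H(y | x) / H(y), and τ is turned into an error score as 1 - τ. The exact definitions in the paper may differ.

```python
from collections import Counter
from math import log2


def _counts(rows, x, y):
    """Count joint (x, y) values and the marginal x and y values."""
    joint = Counter((r[x], r[y]) for r in rows)
    left = Counter(r[x] for r in rows)
    right = Counter(r[y] for r in rows)
    return joint, left, right


def g3(rows, x, y):
    """Smallest fraction of rows to delete so that x -> y holds exactly."""
    joint, _, _ = _counts(rows, x, y)
    best = {}  # for each x value, the count of its most frequent y value
    for (xv, _), c in joint.items():
        best[xv] = max(best.get(xv, 0), c)
    return 1 - sum(best.values()) / len(rows)


def normalized_conditional_entropy(rows, x, y):
    """H(y | x) / H(y). Assumed normalization; 0 when x determines y."""
    joint, left, right = _counts(rows, x, y)
    n = len(rows)
    h_y = -sum(c / n * log2(c / n) for c in right.values())
    h_y_given_x = -sum(c / n * log2(c / left[xv]) for (xv, _), c in joint.items())
    return h_y_given_x / h_y if h_y > 0 else 0.0


def tau_error(rows, x, y):
    """1 - tau, with tau = (pdep(x, y) - pdep(y)) / (1 - pdep(y)).

    Reporting 1 - tau as an error score is an assumption of this sketch.
    """
    joint, left, right = _counts(rows, x, y)
    n = len(rows)
    pdep_y = sum((c / n) ** 2 for c in right.values())
    pdep_xy = sum(c * c / (n * left[xv]) for (xv, _), c in joint.items())
    if pdep_y == 1:  # y is constant, so every x trivially determines it
        return 0.0
    return 1 - (pdep_xy - pdep_y) / (1 - pdep_y)


# Hypothetical table: zip "A" mostly, but not always, maps to city "P".
rows = [
    {"zip": "A", "city": "P"},
    {"zip": "A", "city": "P"},
    {"zip": "A", "city": "Q"},
    {"zip": "B", "city": "Q"},
]
for measure in (g3, normalized_conditional_entropy, tau_error):
    print(measure.__name__, measure(rows, "zip", "city"))
```

On this toy table, deleting the single row that maps "A" to "Q" makes the dependency exact, so g3 is one quarter. The other two scores weigh the same violation differently, which is the kind of disagreement the paper examines.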

The authors computed the differences between the three measures on four well-known data sets. Figures 7 through 9 use bar charts to show the means, ranges, and standard deviations of these differences; box-plots or histograms would have been more helpful to the reader. Yet, the distribution of differences cannot tell us much about the relations between numerical data. Scattergrams reveal these best. This one section is not worth reading, but theorists will find the rest of the paper worthwhile.

Reviewer: Richard Botting. Review #: CR131464 (0512-1356)
[1] Shannon, C.E.; Weaver, W. The mathematical theory of communication. The University of Illinois Press, Urbana, IL, 1963.
Other reviews under "Data Mining": Date
Survey of text mining
Berry M., Springer-Verlag New York, Inc., Secaucus, NJ, 2003.  272, Type: Book (9780387955636)
Mar 25 2004
Data mining and knowledge discovery handbook
Maimon O., Rokach L., Springer-Verlag New York, Inc., Secaucus, NJ, 2005.  1419, Type: Book (9780387244358)
Jan 10 2006
Data mining the Web: uncovering patterns in Web content, structure, and usage
Markov Z., Larose D., Wiley-Interscience, 2007.  218, Type: Book (9780471666554)
Nov 23 2007
