Computing Reviews
Security procedures for classification mining algorithms
Johnsten T., Raghavan V. Database and application security (Proceedings of the fifteenth annual working conference, Niagara, Ontario, Canada, Jul 15-18, 2001), 285-297. 2002. Type: Proceedings
Date Reviewed: Jan 9 2004

Classification mining algorithms can predict the categories or labels of unseen data. In database terms, such an algorithm infers or approximates a functional dependency between the target column (categories/labels) and other columns. This affects database security when the target columns, or the values of target columns, are protected sensitive data elements.
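
To make this kind of inference concrete, here is a minimal sketch that is not taken from the paper under review. It shows how a very simple classifier (a majority-vote lookup) can approximate a dependency between visible columns and a target column. The table, the column names, and the helpers `learn_rule` and `predict` are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical table: (department, job_title) are visible columns and
# salary_band is the target column. These rows have a known band.
known_rows = [
    ("sales", "manager", "high"),
    ("sales", "clerk", "low"),
    ("sales", "clerk", "low"),
    ("it", "engineer", "high"),
    ("it", "engineer", "high"),
    ("it", "clerk", "low"),
]

def learn_rule(rows):
    """Map each combination of visible values to its most common label."""
    counts = defaultdict(Counter)
    for *visible, label in rows:
        counts[tuple(visible)][label] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}

def predict(rule, visible, default=None):
    """Return the learned label for the visible values, or `default` if unseen."""
    return rule.get(tuple(visible), default)

rule = learn_rule(known_rows)

# A tuple whose salary_band is protected: the visible columns alone
# are enough for the learned rule to produce a guess.
guess = predict(rule, ("it", "engineer"))
print(guess)
```

If the learned rule is accurate, hiding the target value of a tuple does little to protect it, which is the disclosure risk the review describes.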

In this paper, the authors first review the evaluation algorithm Exact_OB1, presented by Johnsten and Raghavan [1], which assesses the risk of disclosure of protected data with respect to decision-region-based classification algorithms. They then introduce the evaluation algorithm Exact_OB2, for extended decision-region-based classification algorithms. Because the execution time of Exact_OB2 is potentially high, they also present an alternative algorithm, APPROX_EVAL, which approximates the exact evaluation of Exact_OB2. The experimental results suggest that APPROX_EVAL provides an effective and efficient evaluation of the disclosure risks of protected data. Throughout the paper, a decision tree is used to illustrate the risk of disclosure of protected data and the effect of the security policies implemented. It is worth noting that all of these evaluation algorithms are applicable only when the data elements (tuples) are integers or categorical values. They are not applicable to continuous data elements.

Security policies are implemented to remove unauthorized inferences from the data, so that classification mining algorithms cannot correctly infer or predict the values of protected data elements. However, implementing such policies may also remove some legitimate inferences from the data. Data miners need to be aware of which dependencies have been removed, and to be much more careful about their findings if they are working with sanitized data. This also poses an interesting question: What is the impact of security policy on data mining?
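
The trade-off described above can be illustrated with a small sketch. It is an assumption-laden toy and not the procedure from the paper: column suppression stands in for a security policy, and the table, the column names, and the helper `rule_accuracy` are hypothetical. It measures how well a majority-vote rule recovers a protected column (`band`) and a legitimate one (`parking`) before and after the `title` column is hidden.

```python
from collections import Counter, defaultdict

# Hypothetical table. "band" is the protected column; "parking" is a
# column that data miners are legitimately allowed to predict.
rows = [
    {"dept": "sales", "title": "manager", "band": "high", "parking": "reserved"},
    {"dept": "sales", "title": "clerk", "band": "low", "parking": "general"},
    {"dept": "sales", "title": "clerk", "band": "low", "parking": "general"},
    {"dept": "it", "title": "manager", "band": "high", "parking": "reserved"},
    {"dept": "it", "title": "clerk", "band": "low", "parking": "general"},
    {"dept": "it", "title": "clerk", "band": "low", "parking": "general"},
]

def rule_accuracy(rows, visible, target):
    """Fraction of rows whose target matches a majority-vote rule on `visible`."""
    counts = defaultdict(Counter)
    for row in rows:
        key = tuple(row[col] for col in visible)
        counts[key][row[target]] += 1
    # In each group, the majority label is right for exactly its own count.
    correct = sum(c.most_common(1)[0][1] for c in counts.values())
    return correct / len(rows)

# Before sanitization both visible columns are available.
before_band = rule_accuracy(rows, ("dept", "title"), "band")
before_parking = rule_accuracy(rows, ("dept", "title"), "parking")

# Assumed policy: suppress "title" so that "band" is harder to infer.
after_band = rule_accuracy(rows, ("dept",), "band")
after_parking = rule_accuracy(rows, ("dept",), "parking")

print(before_band, after_band)
print(before_parking, after_parking)
```

In this toy table, hiding `title` weakens the inference of the protected column, but it weakens the legitimate inference by the same amount, because both depend on the suppressed column.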

This paper is of interest to both the database security and data mining communities. Some of the terms used in this paper are not clearly explained; readers would do better to read Johnsten and Raghavan’s earlier paper [1] first.

Reviewer: Donghui Wu
Review #: CR128880 (0406-0714)

[1] Johnsten, T.; Raghavan, V. Impact of decision-region based classification mining algorithms on database security. In Proceedings of 13th IFIP WG 11.3 Working Conference on Database Security, Atluri, V., Hale, J., Eds. Kluwer Academic, 1999, 177–191.
