Computing Reviews
Explaining mixture models through semantic pattern mining and banded matrix visualization
Adhikari P., Vavpetić A., Kralj J., Lavrać N., Hollmén J. Machine Learning 105(1): 3-39, 2016. Type: Article
Date Reviewed: Jan 3 2017

Data analysis is concerned with making data comprehensible and amenable to interpretation by domain specialists. The central contribution of this paper is a three-part approach to data analysis.

First, the data is clustered using mixture models that combine different probability distributions. Mixture models are particularly suitable for analyzing heterogeneous data. One example is the DNA copy number amplification data that originally motivated this work, which is used to identify chromosomal regions implicated in the development of various cancers. Second, the clustered data is mined for semantic patterns. Background knowledge in the form of ontologies is used in this process, which results in names that domain specialists can use to explain the results of clustering. Finally, the rules are used to create banded matrices that expose the structure of the data in a visually accessible form. The approach was applied to several public datasets (NY Daily, Tweets, and Cities) as well as to DNA copy number amplification data.
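The clustering step can be illustrated with a minimal sketch. It assumes binary (0/1) data and a mixture of multivariate Bernoulli distributions fitted by expectation-maximization; the review does not say which distributions the paper combines, so that choice is an assumption made here for illustration. The function name `fit_bernoulli_mixture`, the `seeds` parameter, the smoothing constant `eps`, and the toy data are all hypothetical and are not taken from the paper.

```python
import math


def fit_bernoulli_mixture(rows, seeds, n_iter=25, eps=0.01):
    """Fit a Bernoulli mixture to 0/1 rows with EM (illustrative sketch only)."""
    n_cols = len(rows[0])
    k = len(seeds)
    weights = [1.0 / k] * k
    # Assumed initialisation: soften the chosen seed rows so that no
    # probability starts at exactly 0 or 1.
    theta = [[0.25 + 0.5 * rows[s][j] for j in range(n_cols)] for s in seeds]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each row,
        # computed in log space for numerical stability.
        resp = []
        for x in rows:
            log_p = [
                math.log(weights[c])
                + sum(
                    math.log(theta[c][j] if x[j] else 1.0 - theta[c][j])
                    for j in range(n_cols)
                )
                for c in range(k)
            ]
            top = max(log_p)
            unnorm = [math.exp(v - top) for v in log_p]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: re-estimate mixing weights and per-column probabilities.
        # eps smooths the estimates so they stay strictly inside (0, 1).
        for c in range(k):
            mass = sum(r[c] for r in resp)
            weights[c] = max(mass / len(rows), 1e-12)
            theta[c] = [
                (sum(r[c] * x[j] for r, x in zip(resp, rows)) + eps)
                / (mass + 2 * eps)
                for j in range(n_cols)
            ]
    # Hard assignment: each row goes to its most probable component.
    labels = [max(range(k), key=lambda c: r[c]) for r in resp]
    return labels, theta


# Toy data (invented): two groups of rows with different active columns.
rows = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1],
]
labels, theta = fit_bernoulli_mixture(rows, seeds=[0, 5])
print(labels)
```

Grouping the rows by their assigned label already brings the 1s of this toy matrix close to a diagonal band, which gives a rough sense of why clustering and banded visualization fit together. The semantic pattern mining and the actual banding procedure described in the paper are not reproduced here.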

The results indicated that the method is highly versatile and provides an effective way to summarize and present large amounts of data in a form that is likely to lead to useful insight. Of particular interest is the use of visualization to help explain the clusters.

As well as presenting the central data analysis methodology, the paper includes a useful review of related literature, addressing mixture models, multi-resolution data analysis, semantic pattern mining, and data visualization using banded matrices.

The work as a whole is likely to interest anyone concerned with mining large datasets for useful information or with visualizing their inner structure.

Reviewer: Edel Sherratt
Review #: CR144984 (1703-0190)
Pattern Recognition (I.5)
Data Mining (H.2.8 ...)
Clustering (I.5.3)
Learning (I.2.6)