Computing Reviews, the leading online review service for computing literature.

Feature selection and enhanced krill herd algorithm for text document clustering
Abualigah L., Springer International Publishing, New York, NY, 2019. 165 pp. Type: Book (978-3-030106-73-7)

Date Reviewed: May 10 2019

This monograph, which grew out of the author’s PhD thesis, studies text document clustering with the help of the krill herd (KH) algorithm. KH is a relatively new bio-inspired algorithm. Its essence is to solve optimization problems by simulating the behavior of a krill herd. Multiple fronts (multiple objectives) of potential solutions are pursued at the same time to avoid being trapped in a local optimum. In its original form, the minimum distances of each krill individual from the food and from the highest density of the herd serve as the objective function for krill movement. The time-dependent position of each krill individual is determined by three main factors: (i) movement induced by the presence of other individuals, (ii) foraging activity, and (iii) random diffusion.
Here, the author adapts the KH algorithm to text document clustering. Table 4.7 (p. 83) establishes the relationship between the general KH algorithm (KHA) as an optimization method and its application to the text document clustering problem (TDCP). The key mappings between the terms in the two domains (KHA and TDCP) are: “maximum action item” in KHA is “partitioning” in TDCP; “food distance” in KHA is “centroid distance” in TDCP; “krill individual” in KHA is “document” in TDCP; and “food” in KHA is “cluster centroid” in TDCP. This mapping helps the reader understand the KH algorithm and its application to the text clustering problem.
The book contains six chapters. After an introduction, the author discusses what KHA is and why it is used in TDCP. Chapter 3 is a comprehensive literature review that covers topics such as similarity measures, weighting schemes, feature selection, dimension reduction, partitioning, heuristic algorithms, and hybrid techniques used in TDCP. The key concepts and methodology are presented in chapter 4, followed by detailed experimental results in chapter 5 and conclusions in chapter 6.
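To make the three movement factors concrete, here is a minimal Python sketch of one KH-style position update for a generic minimization problem. It is an illustration under stated assumptions, not the book's implementation: the function name `kh_step`, the coefficients `n_max`, `v_f`, and `d_max`, the time step `dt`, the use of the single best krill as the only source of induced motion, and the inverse-fitness weighting used to place the "food" are all simplifications chosen for this example. The fitness function is assumed to return non-negative values.

```python
import random


def kh_step(positions, fitness, n_max=0.01, v_f=0.02, d_max=0.005,
            dt=1.0, rng=None):
    """Apply one simplified krill-herd-style update and return new positions.

    positions: list of krill, each a list of floats.
    fitness:   callable returning a non-negative score (lower is better).
    All coefficient names and defaults are assumptions of this sketch.
    """
    rng = rng or random.Random()
    scores = [fitness(p) for p in positions]
    dim = len(positions[0])

    # Best krill in the herd (lowest score).
    best_index = min(range(len(positions)), key=scores.__getitem__)
    best = positions[best_index]

    # "Food": centre of the herd, weighted by inverse fitness so that
    # better krill pull the food position toward themselves.
    weights = [1.0 / (s + 1e-12) for s in scores]
    total = sum(weights)
    food = [sum(w * p[d] for w, p in zip(weights, positions)) / total
            for d in range(dim)]

    new_positions = []
    for p in positions:
        # (i) movement induced by other individuals (here: the best krill only)
        induced = [n_max * (b - x) for b, x in zip(best, p)]
        # (ii) foraging activity: attraction toward the food position
        foraging = [v_f * (f - x) for f, x in zip(food, p)]
        # (iii) random diffusion
        diffusion = [d_max * rng.uniform(-1.0, 1.0) for _ in range(dim)]
        new_positions.append([x + dt * (n + f + d)
                              for x, n, f, d in zip(p, induced, foraging, diffusion)])
    return new_positions


if __name__ == "__main__":
    def sphere(p):
        # Toy objective: squared distance from the origin.
        return sum(x * x for x in p)

    herd = [[2.0, 2.0], [-1.0, 3.0], [0.5, -0.5]]
    for _ in range(50):
        herd = kh_step(herd, sphere, rng=random.Random(0))
    print(herd)
```

In a clustering setting, the objective would be replaced by a clustering quality measure such as a distance to cluster centroids; how the book encodes candidate solutions and scores them is not shown here.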
The book is well written, with high-quality tables and graphs. Each chapter ends with a collection of references, including the most recent work in the area. The book should be very useful for scholars who want to study the general field of text document clustering. It is also a good reference for those who work in text document clustering and use genetic algorithms.

Reviewer: Xiannong Meng	Review #: CR146567 (1908-0305)

Feature Evaluation And Selection (I.5.2 ... )

Clustering (H.3.3 ... )

Record Classification (H.3.2 ... )

Content Analysis And Indexing (H.3.1 )

Information Search And Retrieval (H.3.3 )

Optimization (G.1.6 )

Reproduction in whole or in part without permission is prohibited. Copyright 1999-2024 ThinkLoud®