Computing Reviews

Feature selection and enhanced krill herd algorithm for text document clustering
Abualigah L., Springer International Publishing, New York, NY, 2019. 165 pp. Type: Book
Date Reviewed: 05/10/19

This monograph, which comes out of the author’s PhD thesis, studies text document clustering with the help of the krill herd (KH) algorithm.

KH is a relatively new class of bio-inspired algorithms. In essence, the algorithm solves optimization problems by simulating the behavior of a krill herd. Multiple fronts (multi-objectives) of potential solutions are pursued at the same time to avoid being trapped in a local optimum.

In its original form, "the minimum distances of each individual krill from food and from highest density of the herd are considered as the objective function for the krill movement. The time-dependent position of the krill individuals is formulated by three main factors: (i) movement induced by the presence of other individuals, (ii) foraging activity, and (iii) random diffusion."
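To make the three factors concrete, here is a minimal Python sketch of one position update, in which each krill moves by the sum of an induced-motion term, a foraging term, and a random-diffusion term. It is a simplified illustration, not the book's implementation: the function names (`krill_step`, `minimise`, `sphere`), the parameter values, the toy objective, and the reduction of the induced motion to attraction toward the current best krill are all assumptions made for this example.

```python
import random


def sphere(x):
    """Toy non-negative objective to minimise (an assumption for illustration)."""
    return sum(v * v for v in x)


def _unit(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = sum(v * v for v in vec) ** 0.5
    return [v / norm for v in vec] if norm > 0.0 else [0.0] * len(vec)


def krill_step(positions, induced, foraging, objective, rng, it, max_it,
               n_max=0.01, v_f=0.02, d_max=0.005, inertia=0.5, dt=1.0):
    """Move every krill once; `induced` and `foraging` are updated in place."""
    fitness = [objective(p) for p in positions]
    best = positions[min(range(len(positions)), key=fitness.__getitem__)]
    # "Food" is taken as the fitness-weighted centre of the herd.
    # The 1/fitness weighting assumes a non-negative objective.
    weights = [1.0 / (f + 1e-12) for f in fitness]
    total = sum(weights)
    dim = len(positions[0])
    food = [sum(w * p[d] for w, p in zip(weights, positions)) / total
            for d in range(dim)]

    moved = []
    for i, x in enumerate(positions):
        toward_best = _unit([b - a for a, b in zip(x, best)])
        toward_food = _unit([f - a for a, f in zip(x, food)])
        # (i) movement induced by other individuals (simplified to the best krill)
        induced[i] = [n_max * a + inertia * n
                      for a, n in zip(toward_best, induced[i])]
        # (ii) foraging activity: attraction toward the food position
        foraging[i] = [v_f * b + inertia * f
                       for b, f in zip(toward_food, foraging[i])]
        # (iii) random diffusion, shrinking as the iterations progress
        scale = d_max * (1.0 - it / max_it)
        diffusion = [scale * rng.uniform(-1.0, 1.0) for _ in range(dim)]
        moved.append([xi + dt * (n + f + d)
                      for xi, n, f, d in zip(x, induced[i], foraging[i], diffusion)])
    return moved


def minimise(objective, dim=2, n_krill=15, iterations=100, seed=0):
    """Run the simplified herd; return (initial best value, best value seen)."""
    rng = random.Random(seed)
    positions = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
                 for _ in range(n_krill)]
    induced = [[0.0] * dim for _ in range(n_krill)]
    foraging = [[0.0] * dim for _ in range(n_krill)]
    initial_best = min(objective(p) for p in positions)
    best_value = initial_best
    for it in range(iterations):
        positions = krill_step(positions, induced, foraging, objective,
                               rng, it, iterations)
        best_value = min(best_value, min(objective(p) for p in positions))
    return initial_best, best_value
```

Calling `minimise(sphere)` returns the best objective value in the initial herd together with the best value seen over the run. Because the second value is tracked as a running minimum, it can never be worse than the first.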

Here, the author adapts the KH algorithm for text document clustering. Table 4.7 (p. 83) establishes the relationship among the general KH algorithm (KHA), optimization, and its applications in the text document clustering problem (TDCP). The key mappings between the terms in the two domains (KHA and TDCP) are as follows: “maximum action item” in KHA is “partitioning” in TDCP; “food distance” in KHA is “centroid distance” in TDCP; “krill individual” in KHA is “document” in TDCP; and “food” in KHA is “cluster centroid” in TDCP. This mapping helps the reader understand the KH algorithm and its application to the text clustering problem.
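The mapping can be illustrated with a small Python sketch that treats each document as a term-frequency vector (the "krill individual"), each cluster centroid as "food," and the cosine distance between the two as the "food distance." This is only an illustration under those assumptions: the helper names (`tf_vector`, `cosine_distance`, `centroid`, `assign`), the raw term-frequency weighting, the choice of cosine distance, and the four sample documents are invented for this example and are not taken from the book.

```python
import math
from collections import Counter


def tf_vector(text, vocab):
    """Represent a document as raw term frequencies over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[term]) for term in vocab]


def cosine_distance(a, b):
    """1 - cosine similarity; defined as 1.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def centroid(vectors):
    """Cluster centroid ("food"): the component-wise mean of member vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def assign(doc_vectors, centroids):
    """Partitioning: give each document the index of its nearest centroid."""
    return [min(range(len(centroids)),
                key=lambda k: cosine_distance(vec, centroids[k]))
            for vec in doc_vectors]


# Hypothetical sample documents, two per topic.
docs = [
    "krill swarm optimization",
    "swarm optimization algorithm",
    "text document clustering",
    "document clustering text features",
]
vocab = sorted({word for doc in docs for word in doc.lower().split()})
vectors = [tf_vector(doc, vocab) for doc in docs]
centroids = [centroid(vectors[:2]), centroid(vectors[2:])]
labels = assign(vectors, centroids)
```

In a KH-based clusterer, a step like `assign` would be repeated as the centroids move, with the herd dynamics deciding how candidate solutions are updated between iterations.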

The book contains six chapters. After an introduction, the author discusses what KHA is and why it is used in TDCP. Chapter 3 is a comprehensive literature review that covers topics such as similarity measures, weighting schemes, feature selection, dimension reduction, partitioning, heuristic algorithms, and hybrid techniques used in TDCP. The key concepts and methodology are presented in chapter 4, followed by detailed experimental results in chapter 5 and conclusions in chapter 6.

The book is well written, with high-quality tables and graphs. Each chapter ends with a collection of references, including the most recent work in the area. The book should be very useful for scholars who want to study the general field of text document clustering. It is also a good reference for those who work in text document clustering and use genetic algorithms.

Reviewer: Xiannong Meng
Review #: CR146567 (1908-0305)

Reproduction in whole or in part without permission is prohibited.   Copyright 2024 ComputingReviews.com™