Computing Reviews, the leading online review service for computing literature.

Rule based systems for big data : a machine learning approach
Liu H., Gegov A., Cocea M., Springer International Publishing, New York, NY, 2015. 121 pp. Type: Book (978-3-319236-95-7)

Date Reviewed: May 23 2016

This monograph, written by researchers at the University of Portsmouth, addresses specific issues arising from their research on rule-based systems for big data. The preface ingeniously opens with a quote from Sun Tzu's legendary The art of war, which uses the metaphor of shapeless water to describe the ever-changing conditions of war. The allegory works on two levels here: on the one hand, there is the huge variability and complexity of big data flows, which shift like water; on the other hand, there are the efforts to discipline the data in order to learn from it, a task that is essentially a battle of knowledge against uncertainty and ignorance. From this perspective, the authors' choice to equip their arsenal with learning and rule-based methodologies is fully justified. And this is exactly the aim of the book: to review the advances in these methodologies, as a short manual of weapons and tactics against big data complexity.
The book follows a consistent taxonomical organization, emphasizing the terminology and descriptions of fundamental notions. The basic notion here is, of course, the classification rule-based system, a subclass of the wider class of expert systems. Such systems consist of a set of “if (condition)-then (outcome)” logical expressions, which are used for storing knowledge, supporting decisions, and making predictions. More specifically, the book focuses on data-based approaches within a machine learning framework. The text is easy to read and well organized, gradually unfolding the most important aspects encountered in the theory and practice of rule-based systems.
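To make the “if (condition)-then (outcome)” form concrete, here is a minimal Python sketch of an ordered rule list in which the first matching rule decides the class. It is an illustration only and is not taken from the book: the `Rule` class, the `classify` function, and the toy weather attributes are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass
class Rule:
    """One 'if (condition) then (outcome)' expression (hypothetical structure)."""
    description: str
    condition: Callable[[Record], bool]
    outcome: str


def classify(rules: List[Rule], record: Record, default: str) -> str:
    """Return the outcome of the first rule whose condition holds.

    Rules are checked in order, so earlier rules take priority.
    If no rule fires, the default class is returned.
    """
    for rule in rules:
        if rule.condition(record):
            return rule.outcome
    return default


# Toy, made-up rules for a "play outside?" decision (assumed data, not from the book).
RULES = [
    Rule("if outlook is rainy and it is windy",
         lambda r: r["outlook"] == "rainy" and r["windy"], "no"),
    Rule("if humidity is above 80",
         lambda r: r["humidity"] > 80, "no"),
    Rule("if outlook is sunny",
         lambda r: r["outlook"] == "sunny", "yes"),
]

if __name__ == "__main__":
    sample = {"outlook": "sunny", "humidity": 65, "windy": False}
    print(classify(RULES, sample, default="no"))
```

Because each rule carries a readable description and the decision is traceable to a single rule, even this toy version shows why such systems are considered easier to interpret than black-box models.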
I especially appreciated the authors' consistency in comparing the different types of methods and documenting their own preferences, as well as their insistence on the importance of interpretability, a concern that is taken for granted in statistics but is often underestimated by machine learning practitioners who use black-box methodologies almost blindly.
The material is organized in nine chapters, each followed by its own bibliography. The first chapter introduces the background of rule-based systems. The basic notions of machine learning are discussed first, along with categorizations of methods and algorithms and related issues like overfitting, accuracy, scaling, computational efficiency, and interpretability. The same chapter introduces the categorization and terminology of rule-based systems. The second chapter expands on this background with brief overviews of the mathematical and computer science fields related to rule-based systems. It is essential for the reader to appreciate the strong foundations of the current methods in set and graph theory (discrete mathematics), probability theory and statistics, logic, and algorithms. The chapter closes with two sections describing frameworks for single and ensemble rule-based classification systems.
Chapter 3 describes the algorithms belonging to two basic approaches for rule generation: the divide and conquer approach, which constructs decision trees, and the separate and conquer approach, which produces an ordered set of rules. The application of the algorithms is illustrated by an example on real data, and the advantages and disadvantages of both approaches are discussed in detail. Chapter 4 deals with ways of handling the overfitting problem, that is, the generation of too many complicated rules that threaten the efficiency and the predictive accuracy of systems.
The pruning methods described aim to simplify the classification rules and fall into two categories, pre- and post-pruning, which are illustrated by examples and discussed with respect to their advantages and disadvantages. Chapter 5 explores another important aspect of classification rules: their representation. After discussing tree structures and linear lists, the authors focus on networks, which are graph structures consisting of colored nodes arranged in layers and connected with weighted edges. The various methods are compared with respect to efficiency and interpretability. Chapter 6 discusses improving accuracy by combining algorithms, which is called ensemble learning. Three approaches (parallel, sequential, and hybrid learning) are presented and compared with respect to efficiency and accuracy, and interesting notions like variance and bias also emerge.
The important aspect of interpretability of methods is discussed in chapter 7. The factors affecting interpretability are categorized with respect to modeling and human evaluation, and the advantages of rule-based methods are highlighted in contrast to black-box models, such as neural networks. Three case studies presented in chapter 8 illustrate the methods and aspects of the preceding chapters on large data sets. Finally, chapter 9 concludes the book with interesting discussions on theoretical, practical, methodological, and philosophical issues and future directions that round out the book’s presentation of a unified framework for research on classification rule-based systems.
Overall, the book is recommended to researchers and practitioners who wish to apply sound methods for understanding and exploiting their big data, and to those who plan to direct their research toward rule-based methodologies.

Reviewer: Lefteris Angelis	Review #: CR144438 (1608-0562)

Learning (I.2.6 )

Rule-Based Databases (H.2.4 ... )

Database Applications (H.2.8 )

Systems (H.2.4 )

Reproduction in whole or in part without permission is prohibited. Copyright 1999-2024 ThinkLoud®