Computing Reviews, the leading online review service for computing literature.


Predictive data mining models
Olson D., Wu D., Springer International Publishing, New York, NY, 2017. 102 pp. Type: Book

Date Reviewed: Jan 31 2018

In an age of artificial intelligence connected primarily to large amounts of constantly changing data, this book’s topic is extremely interesting. It focuses on a demonstration of predictive methods and tools using real business-related data. My expectations for this book were rather high. I wondered whether it is intended more for beginners in the field or for advanced users, and whether it could serve as material for my (graduate) students. However, after reading the book, the intended audience is still not clear to me.

The book is divided into eight chapters covering introductory motivation (chapter 1, “Knowledge Management”), the datasets used (chapter 2), selected methods and tools (chapter 3, “Basic Forecasting Tools”; chapter 4, “Multiple Regression”; chapter 5, “Regression Tree Models”; chapter 6, “Autoregressive Models”; chapter 7, “Classification Tools”), and a short discussion of big data in relation to predictive models (chapter 8). The book also provides electronic supplementary material, which consists of four CSV files with data for replicating some of the analyses presented in the book.

The authors provide demonstrations on three (rather small) business-related time series datasets (the price of gold, crude oil prices, and stock indices) and describe several approaches for predicting future values on these datasets. However, the book does not explain the methods or algorithms used, does not describe the tools or how to perform the listed analyses with them, and does not cover the principles of predictive data modeling methods in general. It resembles a follow-up text for one specific type of analysis performed on these datasets, and is nearly devoid of any information usable for different but similar cases. All reasoning is left to the reader, which has little value for a beginner; moreover, even a knowledgeable user of a particular tool could have problems because there is not enough information on the tool settings.
The main focus is on a description of the results obtained, without general comments that could help readers understand the described concepts in depth or transfer them to different use cases. Instead of describing how to use the mentioned tools for the analysis and how to properly set up the described methods, the authors provide, for example, an installation guide for the tools used (Part 3.7). Many general tools, such as R, Excel, Weka, MATLAB, and others, are mentioned in the course of the analysis. In fact, the analysis could be carried out in a single tool, because almost any of the listed tools is able to perform all parts of it.

The book could be used as a reference, but because of the shortage of general information, the applicability of the described approaches in different scenarios is limited. There is no information on the properties of the described methods or their inner workings (often not even on an intuitive level). One could use the book as an example of an analysis and an index of methods to look up in further detail; however, in some places, the authors use nonstandard names for common concepts, such as “coincidence matrix” instead of “confusion matrix” or “correct classification rate” instead of “accuracy.” When listing possible approaches, several standard ones are missing, including measures of the difference between predicted and actual values (for example, mean absolute percentage error (MAPE), symmetric mean absolute percentage error (SMAPE), and root mean squared error (RMSE)) and popular classification measures (for example, precision, recall, sensitivity, and specificity).

In many places, the text lacks structure and flow. In some parts, it is a mere list of statements describing reports returned by the analysis tools used. Overall, the text could serve as a nice report on one specific analysis, or as a list of tools and methods to look up in further detail when performing basic prediction on time series data.
Still, it is usable as a reference, mainly for researchers who want to gain some practical information on data analysis and examples of the analysis of real business-related data.
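The review names several standard measures without defining them. As a reader's aid, the following is a minimal Python sketch of the usual textbook definitions. The function names and the toy data are illustrative assumptions and come from neither the book nor the review. SMAPE in particular has several variants; the one sketched here divides by the mean of the absolute actual and forecast values.

```python
import math


def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero (the measure is undefined there).
    """
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def smape(actual, forecast):
    """Symmetric MAPE, in percent (variant with (|a| + |f|) / 2 as denominator).

    Assumes actual and forecast are never both zero at the same position.
    """
    terms = (abs(f - a) / ((abs(a) + abs(f)) / 2.0) for a, f in zip(actual, forecast))
    return 100.0 * sum(terms) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error, in the units of the data."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn), the four cells of a binary confusion matrix."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_measures(y_true, y_pred, positive=1):
    """Accuracy, precision, recall (= sensitivity), and specificity.

    Assumes each denominator is nonzero, i.e. both classes occur in y_true
    and at least one positive prediction is made.
    """
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),  # also called sensitivity
        "specificity": tn / (tn + fp),
    }


if __name__ == "__main__":
    # Toy values chosen only for illustration.
    actual = [100.0, 200.0, 400.0]
    forecast = [110.0, 180.0, 400.0]
    print("MAPE :", mape(actual, forecast))
    print("SMAPE:", smape(actual, forecast))
    print("RMSE :", rmse(actual, forecast))

    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 0, 1, 0, 1, 0, 0, 0]
    print(classification_measures(y_true, y_pred))
```

MAPE and SMAPE are scale-free, which makes them comparable across series such as gold prices and stock indices, while RMSE stays in the units of the data and penalizes large errors more heavily. What the book calls the “correct classification rate” corresponds to the `accuracy` entry above.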

Reviewer: M. Bielikova	Review #: CR145818 (1804-0166)

Data Mining (H.2.8 ... )

Time Series Analysis (G.3 ... )

Content Analysis And Indexing (H.3.1 )

Learning (I.2.6 )


