Computing Reviews
Deep learning
Goodfellow I., Bengio Y., Courville A., The MIT Press, Cambridge, MA, 2016. 800 pp. Type: Book (978-0-262-03561-3)
Date Reviewed: Jun 21, 2017

Neural net technology, like some teenagers, has grown in fits and starts. Originating as a biological model [1], it was first promoted as a computational tool, the “perceptron,” by Frank Rosenblatt [2]. Minsky and Papert vigorously opposed the model [3] in order to promote the symbolic approach to artificial intelligence (AI), leading to a hiatus that ended when several persistent researchers repopularized the method as “parallel distributed processing” [4]. The field flourished under the general rubric of neural nets, perhaps best represented by C. M. Bishop’s landmark text, Neural networks for pattern recognition [5], but then fell by the wayside with the growth of graphical probabilistic models. Researchers in the neural net period already recognized that, to avoid the weaknesses noted by Minsky and Papert, networks needed multiple layers, each reorganizing the knowledge presented by the previous one; this emphasis on “deep” networks has led to the latest catchword for the field, “deep learning.” Written by leaders in developing the latest techniques for training and applying deep networks, this volume is a comprehensive summary of the state of the technology that deserves a place alongside Rosenblatt [2], Rumelhart and McClelland [4], and Bishop [5] as a milestone in the field.

The book is organized into three parts. Part 1 provides background information on the underlying mathematics (linear algebra and probability theory), the characteristics of the numerical computation on which so much of modern machine learning rests, and a brief introduction to machine learning in general. Part 2 expounds the current state of the art in deep learning, and is addressed to readers who want to apply this technique in their own work. It describes specific kinds of networks (feedforward, convolutional, recurrent, and recursive) and the important tools of regularization and optimization, and gives practical pointers and a series of example applications. Part 3 traces major current research challenges in deep learning. At 236 pages, this section is nearly as long as Part 2 (314 pages), showing the authors’ desire to capture as much of the current activity in the field as they can. These last eight chapters (more than in either of the previous sections) will guide many new PhD students in their search for a dissertation topic.

This volume is not an introduction to deep learning. It does not simplify formal details, and its terseness will often lead the reader to follow its prolific references (over 800 publications, extending through 2015) to the original papers on which the exposition rests. Nor, in spite of the practical orientation of Parts 1 and 2, is it a self-contained textbook. It includes no exercises, and though the front matter points to a website with the promise of exercises, the site actually contains a wiki to which readers can contribute exercises, a resource that at the time of this review is not populated. The work is best characterized as a handbook: a broad, carefully documented survey of the field that will be an excellent reference for mature practitioners, and that could serve, with the addition of exercises and explanatory lectures, as the backbone of an academic course.


Reviewer: H. Van Dyke Parunak. Review #: CR145362 (1708-0516)
1) McCulloch, W. S.; Pitts, W. A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics 5, 4 (1943), 115–133.
2) Rosenblatt, F. The perceptron: a probabilistic model for information storage and organization in the brain. Psychological Review 65, 6 (1958), 386–408.
3) Minsky, M.; Papert, S. A. Perceptrons. MIT Press, Cambridge, MA, 1969.
4) Rumelhart, D. E.; McClelland, J. L. (Eds.) Parallel distributed processing. MIT Press, Cambridge, MA, 1986.
5) Bishop, C. M. Neural networks for pattern recognition. Oxford University Press, Oxford, UK, 1995.
Categories: Learning (I.2.6); Classifier Design and Evaluation (I.5.2 ...); Applications and Expert Systems (I.2.1)
Other reviews under "Learning":
- Learning in parallel networks: simulating learning in a probabilistic system. Hinton G. (ed), BYTE 10(4): 265-273, 1985. Type: Article. Date: Nov 1, 1985
- Macro-operators: a weak method for learning. Korf R., Artificial Intelligence 26(1): 35-77, 1985. Type: Article. Date: Feb 1, 1986
- Inferring (mal) rules from pupils’ protocols. Sleeman D., Progress in artificial intelligence (Orsay, France), 1985. Type: Proceedings. Date: Dec 1, 1985
more...
