Computing Reviews
The art of R programming : a tour of statistical software design
Matloff N., No Starch Press, San Francisco, CA, 2011. 400 pp. Type: Book (978-1-593273-84-2)
Date Reviewed: Apr 19 2012

R is a scripting language for manipulating and analyzing statistical data. Although a great deal of material is available on the Internet (R is open source), it is difficult to become familiar with every area (for example, reading data, manipulating and analyzing it, and plotting) because a very large number of libraries and routines exist. This book tries to bring some order to the R world. Rather than being divided into separate parts, the chapters resemble a sequence of lessons that start from the basics and work up to advanced topics, such as how to interface R to other programming languages, how to write parallel R code, and how to debug R programs.

Chapter 1 is a simple introduction to R, discussing very basic topics such as how to issue commands (interactive or batch mode) and introducing the reader to functions, data structures, and so forth.

Chapters 2 through 6 discuss the core of R. People new to the language will probably find this the most interesting part of the book. These chapters share a similar structure: each starts with basic operations, moves on to more advanced topics, and then works through “extended examples” that describe in great detail how to get the best out of the language in a variety of scenarios.

Chapter 2 introduces the reader to the fundamental data type in R, the vector. Topics covered include vector declaration, recycling, indexing, and filtering.

Chapter 3 discusses matrices and related operations (for example, indexing, linear algebra, filtering, and how to change the size of an existing matrix). An interesting example shows how to manipulate images by representing a picture as a matrix, which makes a number of operations (for example, deleting some pixels) very simple. The chapter also introduces the apply() function, one of the most used R functions, which instructs R to call a user-defined function on each row or column of a matrix. Practical applications of apply() include a function that finds the outliers (data points that differ substantially from most of the observations) in a collection of data and a function that finds the closest pair of vertices in a graph.

Chapter 4 introduces R lists, data structures that can combine objects of different types in a manner similar to the Python dictionary.
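The vector, matrix, and list operations mentioned above can be sketched in a few lines of R. This is an illustrative sketch, not code taken from the book: all variable names and data values are made up, and the helper farthest_from_median() is a hypothetical function that treats the value farthest from the row median as the outlier, which is only one possible definition.

```r
# Vectors: creation, recycling, indexing, and filtering
x <- c(5, 12, 13, 8)
y <- x + c(1, 2)            # the shorter vector is recycled: 6 14 14 10
middle <- x[2:3]            # indexing by position: 12 13
large <- x[x > 8]           # filtering with a logical condition: 12 13

# apply() calls a function on each row (MARGIN = 1) or column (MARGIN = 2)
m <- matrix(1:6, nrow = 2)  # 2 x 3 matrix, filled column by column
row_sums <- apply(m, 1, sum)  # 9 12
col_max <- apply(m, 2, max)   # 2 4 6

# Hypothetical outlier finder: position of the value farthest from the median
farthest_from_median <- function(row) which.max(abs(row - median(row)))
obs <- rbind(c(1, 2, 50, 3), c(10, -40, 11, 12))
outlier_pos <- apply(obs, 1, farthest_from_median)  # 3 2

# Lists: components of different types, accessed by name
emp <- list(name = "Joe", salary = 55000, union = TRUE)
emp$salary
```

Writing the row function separately and passing it to apply() keeps the per-row logic testable on its own, which is the usual way apply() is combined with user-defined functions.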

Chapter 5 describes data frames, the heterogeneous analogs of matrices for 2D data (just as lists are the heterogeneous analogs of vectors), while chapter 6 introduces the factor and table data types, which are very useful when dealing with tabular data.
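As a rough illustration of how data frames, factors, and tables fit together, consider the following sketch. The column names and values are invented for the example and do not come from the book.

```r
# A data frame holds columns of different types, one row per observation
kids <- c("Jack", "Jill", "Ann")
ages <- c(12, 10, 11)
group <- factor(c("a", "b", "a"))   # a factor stores categorical data
d <- data.frame(kids, ages, group, stringsAsFactors = FALSE)

older <- d[d$ages > 10, ]           # filter rows, as with a matrix
group_levels <- levels(d$group)     # "a" "b"

# table() counts the observations in each factor level
counts <- table(d$group)            # a: 2, b: 1

# tapply() applies a function to the values within each level
mean_age <- tapply(d$ages, d$group, mean)   # a: 11.5, b: 10
```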

Next, the author covers the basic structures of R as a programming language, such as variable scope and control statements (chapter 7). Chapter 8 covers built-in math and statistical distribution functions, while chapter 9 introduces object-oriented programming. With respect to object orientation, it should be noted that R is very different from object-oriented programming languages such as C++ or Java (as the author points out, the syntax somewhat resembles that of Perl).
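To give a sense of how R differs from C++ or Java here, the sketch below uses S3 classes, one of the object systems available in R. The generic describe() and the class name "employee" are hypothetical names chosen for illustration. Note that methods are ordinary functions selected by the generic according to the object's class attribute, rather than members declared inside a class definition.

```r
# A generic function dispatches on the class attribute of its first argument
describe <- function(obj, ...) UseMethod("describe")

# Fallback used when no class-specific method exists
describe.default <- function(obj, ...) "an R object"

# Method for objects of the hypothetical class "employee"
describe.employee <- function(obj, ...) paste(obj$name, "earns", obj$salary)

# An S3 object is just a list with a class attribute attached
joe <- structure(list(name = "Joe", salary = 55000), class = "employee")

joe_text <- describe(joe)    # dispatches to describe.employee
other_text <- describe(42)   # dispatches to describe.default
```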

The remaining chapters cover input and output, including how to load data from a remote location (chapter 10); string manipulation, including regular expressions (chapter 11); graphics (chapter 12); advice on debugging R code (chapter 13); tips on writing fast code (chapter 14); how to interface R to C, C++, and Python (chapter 15); and how to parallelize R programs (chapter 16). The book concludes with two appendixes describing how to install R and related libraries.

Apart from its use in statistics, R is very well known for producing high-quality charts. Chapter 12 is relatively basic, so the reader who is interested in learning the nuts and bolts of graphics in R should probably refer to other resources [1,2].

Overall, the author does a very good job, covering a number of important concepts in great detail. As pointed out elsewhere [3], the most captivating aspect of the book is perhaps the author’s thoughtful manner of exposition.

Reviewer:  Michele Mazzucco Review #: CR140072 (1209-0889)
1) Murrell, P. R graphics, second edition. CRC Press, Boca Raton, FL, 2011.
2) Yau, N. Visualize this: the FlowingData guide to design, visualization, and statistics. Wiley, Indianapolis, IN, 2011, http://book.flowingdata.com/.
3) Rickert, J. Review of "The art of R programming" by Norman Matloff. http://blog.revolutionanalytics.com/2011/11/review-of-the-art-of-r-programming-by-norman-matloff.html (04/18/2012).
Statistical Software (G.3 ... )
General (D.1.0 )
Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®