Author: Joseph Adler
Publisher: O'Reilly
ISBN: 978-0-596-80170-0

Title: R in a Nutshell: A Desktop Quick Reference

R is a leading open source statistical analysis package that has gained acceptance among both academic and business users over the past several years.

R in a Nutshell is organized into four parts. The first part covers the basics of the R language, starting with how to install R on different operating systems and the details of the user interface. For Microsoft Excel users, there is also guidance on how to use R with Excel.

A key strength of R is its wide range of freely available, comprehensive packages. The book explains how to take maximum advantage of the reusability inherent in R.

Using R to its fullest requires some knowledge of the language internals. The next section covers the features of R as a programming language. Starting with an overview of the language, the book describes R syntax, objects, symbols, environments, and functions.

In addition, advanced features of the R language, including support for object-oriented programming, are discussed clearly. Practical guidance on real-life performance issues that arise when analyzing very large data sets is also provided in detail.

A significant barrier for a new user is getting data into the system and getting output in the required format. A slightly more experienced user will also want to connect R to different databases. With many sources of data available online, the book gives an example of real-time retrieval of data from Yahoo! Finance.

There is a detailed part on preparing and visualizing data for analysis. Although data preparation can take a significant share of the total analysis time, many other books attach less importance to data preparation issues.

The next two chapters present an overview of the data visualization features in R, starting with the basics and following with an extensive explanation of lattice graphics.

While many books on R rely on the data sets that come with the software, this book draws examples from outside the standard data sets, which highlight the different contexts in which R can be used.

The book provides good coverage of basic statistical concepts, including correlation, covariance, and probability distributions, as well as techniques such as principal components analysis, factor analysis, experimental design, and other standard statistical tests.

The section on regression covers a range of techniques, including semi-parametric and nonparametric methods as well as machine learning algorithms, and explains them well through several examples. While the book’s emphasis is on R, there is extensive explanation of how to interpret the results of the statistical tests.

Several data mining techniques for association, clustering, and classification, including logistic regression, linear discriminant analysis, classification trees, neural networks, and support vector machines, are covered in good detail.

The chapter on time series covers standard models such as ARIMA. The final chapter deals with the Bioconductor project, beginning with an example based on a data set from the Gene Expression Omnibus website and building on that example.

R in a Nutshell is a concise but still comprehensive desktop reference to R. A key highlight of the book is that it demonstrates the power of the software to handle statistical analysis in different contexts, using comprehensive data sets and code examples. The book is strongly recommended for both the novice and the experienced user interested in serious data analysis.
