Chapter 1 Introduction to R

R is a language and environment for statistical analysis and data science. The R language was created by and (initially) for statisticians in the early 1990s within the Statistics Department of the University of Auckland. The R software platform is free to download from several mirrored sites listed here. R has increasingly become the software platform of choice in many fields, especially the natural, agricultural, and environmental sciences.

As an R user you can write code that uses inbuilt functions to import, manipulate and analyse data. R also has very flexible data visualisation functionality. As a language, R allows people to create their own functions, which means that for pretty much any statistical procedure you read about, someone, somewhere, has written it up as an R function and made it available for everyone to use. When someone has written functions that they want to share, they create a package, which any R user can then install and use. There are many thousands of R packages, and the number continues to grow rapidly. To get an idea of the different domains in which people have created R packages, some of the more popular packages have been grouped into topics, although there are several package repositories. Staff at UWA are also active contributors of R packages.
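As a minimal sketch of what writing your own function and using a package look like, consider the example below. The function name `standard_error` and the `heights` data are made up for illustration, and `ggplot2` is simply one well-known package chosen as an example; it is not required for this course.

```r
# A small user-defined function: the standard error of the mean.
# (Hypothetical example; the name and arguments are our own choice.)
standard_error <- function(x, na.rm = TRUE) {
  if (na.rm) {
    x <- x[!is.na(x)]  # drop missing values before computing
  }
  sd(x) / sqrt(length(x))
}

# Made-up data with one missing value
heights <- c(1.62, 1.75, NA, 1.80, 1.68)
standard_error(heights)

# Installing and loading a contributed package.
# install.packages() needs an internet connection and only has to be run once,
# so it is left commented out here.
# install.packages("ggplot2")
# library(ggplot2)
```

Once a function like this is defined, it can be called just like any inbuilt function. Collecting several such functions together, with documentation, is essentially what a package is.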

It is possible to use R for more than just statistics. For example, this entire document was created using R! That’s right, all text, data, figures, notes and references are written in R. This is done using the R packages knitr and bookdown, which are two of several packages that allow document creation from within R. The real beauty of this is that you can create a report based on your data manipulation, analysis, plots and tables. If the data change (say you update the data or want the same analysis with a different data set) then you can reproduce the entire report by changing a couple of lines of code and then hitting the “run” button. As you can imagine, this saves a lot of pointing, clicking, cutting, pasting, typing and hours, and if you do it well it can reduce the potential for errors.
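To give a rough feel for how a report can be regenerated from code, here is a small sketch using knitr. It assumes knitr is installed; the data frame `report_data`, its values, and the report text are all invented for illustration, and this is not how this document itself is built (that uses bookdown on top of knitr). In practice you would normally write the report in its own .Rmd file rather than build it from a character vector.

```r
library(knitr)  # assumes knitr is installed

# Hypothetical input data: swap in a different data frame to regenerate the report.
report_data <- data.frame(plot = c("A", "B", "C"), yield = c(4.1, 5.3, 4.8))

# Build a tiny R Markdown document as lines of text.
fence <- strrep("`", 3)  # the three-backtick chunk delimiter
rmd_lines <- c(
  "# Yield summary",
  "",
  "The data set has `r nrow(report_data)` rows.",  # inline R code, evaluated by knit()
  "",
  paste0(fence, "{r}"),
  "summary(report_data$yield)",
  fence
)

# Write the source to a temporary file and knit it to Markdown.
rmd_path <- tempfile(fileext = ".Rmd")
md_path  <- tempfile(fileext = ".md")
writeLines(rmd_lines, rmd_path)
knit(rmd_path, output = md_path, quiet = TRUE)

# The knitted report, with code results filled in
report <- readLines(md_path)
cat(report, sep = "\n")
```

If `report_data` changes, rerunning the script updates both the sentence and the summary, with no copying and pasting of results.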

More recently, a lot of work has gone into making R more user-friendly, so that scientists and others who may not have strong programming skills can make use of R’s extensive capability. One tool that helps make R even more accessible is RStudio, which is what we will be using in this course.

Due to R’s popularity there are many, many manuals, tutorials and other learning resources online for users at all levels. Rather than give you a long list of references, let’s use the resources identified by RStudio as the most useful, available at https://www.rstudio.com/online-learning/. For more face-to-face learning, UWA periodically offers courses on R programming in general and on specific statistical techniques. See http://www.cas.maths.uwa.edu.au/courses.