Part 1: Introduction and getting started

The aim of this session is to provide you with the basic tools to get you started with R. If you have already worked through the materials that were provided to you in advance of this training (the Introduction to R: basic tools), then this session should be a useful refresher. If this is the first time that you have used R / RStudio then shame on you for not doing your homework, but you should be able to get up to speed.

This session will reinforce some of what was covered in the Introduction to R: basic tools and cover a number of specific but key activities:

1. Why use R

R was initially developed by Robert Gentleman and Ross Ihaka of the Department of Statistics at the University of Auckland. R is increasingly becoming the default software package in many areas of science. There are a number of reasons for this:

For these reasons R is becoming widely used in many areas of scientific activity and quantitative research.

R can be found at the CRAN website:

2. Working in R

There are two key points about working in R:

  1. When working in R, either writing your own code or copying and pasting from these materials, you should write the code into a script or document. Go to File > New File > R Script to open a new R file.

The reason for this is that you get used to using the R console, and running the code will help your understanding of the code’s functionality. To run the code in the R console, a quick way to enter it is to highlight the code (with the mouse or using the keyboard controls) and then press Ctrl-Enter in RStudio (Ctrl-R in the Windows R GUI) or Cmd-Enter on a Mac.

  2. Learning R is like learning to drive. You may pass your test, but to become a good driver it is time behind the wheel that counts. The importance of learning by doing and getting your hands dirty cannot be overstated. Some of the code might look a bit fearsome when first viewed, especially in later sessions, BUT the only really effective way to understand it is to give it a try.

A further minor point is that, in the code, comments are prefixed by # and are ignored by R when entered into the console.
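As a brief illustration of how comments behave, here is a minimal sketch. The object name x is made up for this example and does not appear elsewhere in these materials.

```r
# A whole-line comment: R ignores everything after the # symbol
x <- 10   # a comment can also follow code on the same line
x * 2     # only the code before the # is evaluated
## [1] 20
```

Lines starting with ## in these materials show what R prints in the console. Because they begin with #, they are also ignored if pasted in along with the code.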

If you have worked your way through the Introduction to R: basic tools then you will have come across a few things that are recapped here:

  1. Assignment: this is the basic process of giving R objects values
vals <- c(4.3,7.1,6.3,5.2,3.2,2.1)
  2. Operations: having assigned values to an object, these can be manipulated
vals*2
## [1]  8.6 14.2 12.6 10.4  6.4  4.2
sum(vals)
## [1] 28.2
mean(vals)
## [1] 4.7
  3. Indexing: individual elements of R objects with multiple data elements can be referred to:
vals[1]    # first element
## [1] 4.3
vals[1:3]   # a subset of elements 1 to 3
## [1] 4.3 7.1 6.3
sqrt(vals[1:3]) # square roots of the subset
## [1] 2.073644 2.664583 2.509980
vals[c(5,3,2)]  # a subset of elements 5,3,2 - note the ordering
## [1] 3.2 6.3 7.1
  4. There are many different data types in R: character, logical, integer etc. There are too many to cover here.

  5. There are many different data classes in R: vectors, matrices, factors, lists
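
The data types and classes named in the list above can be explored with a few base R functions. The following is only a sketch: the object names (v, m, f, l) and the values are invented for illustration and are not part of the course data.

```r
# Basic data types (illustrative values only)
class("hello")    # "character"
class(TRUE)       # "logical"
class(3L)         # "integer" (the L suffix asks for an integer)
class(4.3)        # "numeric"

# Common data classes
v <- c(4.3, 7.1, 6.3)                  # vector: elements all of one type
m <- matrix(1:6, nrow = 2)             # matrix: 2 rows, 3 columns
f <- factor(c("low", "high", "low"))   # factor: categorical data with levels
l <- list(name = "a", values = v)      # list: elements can differ in type

is.matrix(m)    # TRUE
dim(m)          # 2 3 (rows, then columns)
levels(f)       # "high" "low" (levels are sorted alphabetically by default)
l$values[2]     # 7.1 (indexing works inside list elements too)
```

The function class() is a quick way to check what kind of object you are working with when code does not behave as expected.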

2.2 R packages

When you install R / RStudio it already comes with a large number of tools (referred to as base functionality).

However, one of the joys of R is the community of users. Users share what they do and create in R in a number of ways. One of these is through packages. Packages are collections of related functions that have been created, tested and supported with help files. These are bundled into a package that other R users can download from the CRAN repository.

There are 1000s of packages in R. These contain sets of tools and can be written by anyone. The number of packages is continually growing. Once packages are installed, they can be loaded as libraries. The background to R, along with documentation and information about packages as well as the contributors, can be found at the R Project website http://www.r-project.org.
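
The install-then-load workflow described above might look like the sketch below. The package name sf is chosen purely as an example and is an assumption, not something this session is known to require. Installing needs an internet connection, so that line is commented out here.

```r
# Install a package from CRAN (only needs doing once per R installation).
# "sf" is an example package name, assumed for illustration.
# install.packages("sf")

# Load an installed package so its functions are available in this session.
# The stats package ships with R, so this works on any installation.
library(stats)

# Check whether a package is installed before trying to load it
if (requireNamespace("sf", quietly = TRUE)) {
  library(sf)
} else {
  message("Package 'sf' is not installed; run install.packages(\"sf\") first")
}
```

Note the difference between the two steps: install.packages() downloads a package once, whereas library() has to be called in every new R session in which you want to use it.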

Packages can be found at the CRAN website: https://cran.r-project.org/web/packages/