1 What is R

  • R is a system for statistical analyses and graphics created by Ross Ihaka and Robert Gentleman. It is free, open-source software, with versions for Windows, Mac OS X, and Linux operating systems.

  • The R language allows the user, for instance, to program loops to successively analyse several data sets. It is also possible to combine in a single program different statistical functions to perform more complex analyses.
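
As a minimal sketch of the looping idea mentioned above, the following code applies the same analysis (here simply the mean) to several data sets in turn. The list `datasets` and its values are made up for illustration and do not come from any real study.

```r
# Three small, made-up data sets stored in a named list (illustrative values only)
datasets <- list(
  a = c(2.1, 3.4, 1.9, 5.0),
  b = c(10, 12, 9, 11, 13),
  c = c(0.5, 0.7, 0.6)
)

# Pre-allocate a named numeric vector to hold one result per data set
means <- numeric(length(datasets))
names(means) <- names(datasets)

# Loop over the data sets and apply the same analysis to each one
for (name in names(datasets)) {
  means[name] <- mean(datasets[[name]])
}

means
```

The same result could probably be obtained more compactly with sapply(datasets, mean); the explicit loop is shown because it generalises to analyses that need several steps per data set.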

2 Getting and Installing Interactive R Binaries

  • If you are using a Mac or Windows machine, you will probably want to download the files yourself and then run the installer:
    • Visit the official R website.
    • Click “Download”.
    • Choose a mirror site. You will probably want to pick a site that is geographically close, because it is likely to also be close on the Internet, and thus fast.
    • Find the right binary for your platform and run the installer.

3 How to start

  • If you are using Windows, launch R from the Start Menu. On a Mac, double-click the R icon in the Applications folder. When you launch R in Windows, you will see a window with the R console.

3.1 A sample R session

1+2
## [1] 3
20/4
## [1] 5
5%%3
## [1] 2
5%/%3
## [1] 1
2%/%3
## [1] 0
floor(8.9)
## [1] 8
ceiling(8.1)
## [1] 9
abs(-3)
## [1] 3
sign(-3)
## [1] -1
sign(3)
## [1] 1
2^(1/2)
## [1] 1.414214
sqrt(2)
## [1] 1.414214
log(64, base = 4)
## [1] 3
log2(64)
## [1] 6
log10(64)
## [1] 1.80618
log(64)
## [1] 4.158883
exp(2.3026)
## [1] 10.00015
sin(pi)
## [1] 1.224606e-16
factorial(10)
## [1] 3628800
prod(1:10)
## [1] 3628800
gamma(10+1)
## [1] 3628800
pi
## [1] 3.141593
1==1
## [1] TRUE
2>=1
## [1] TRUE
1<=1
## [1] TRUE
1>2
## [1] FALSE
3<2
## [1] FALSE
2!=3
## [1] TRUE
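
The operators %/% and %% used in the session above are integer division and remainder. The short sketch below shows how the two fit together and why sin(pi) does not print as exactly zero. The variable names x, y, q and r are chosen only for this illustration.

```r
x <- 17
y <- 5

# Integer quotient and remainder
q <- x %/% y
r <- x %% y

# The two operators always satisfy x == q * y + r
q * y + r == x

# For a negative dividend, the remainder takes the sign of the divisor
(-5) %/% 3
(-5) %% 3

# sin(pi) is not exactly zero in floating point, so compare with a tolerance
isTRUE(all.equal(sin(pi), 0))
```

Comparing floating-point results with all.equal() rather than == is usually the safer choice when a value is the result of a computation.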
  • A vector is a single entity consisting of an ordered collection of numbers. You can create one with the function c(), which combines its arguments into a vector.
    • For example
    age <- c(1,3,5,2,11,9,3,9,12,3) # mo.
    weight <- c(4.4,5.3,7.2,5.2,8.5,7.3,6.0,10.4,10.2,6.1) # kg.

    or

    age = c(1,3,5,2,11,9,3,9,12,3)
    weight = c(4.4,5.3,7.2,5.2,8.5,7.3,6.0,10.4,10.2,6.1)
    data.frame(age, weight)
    ##    age weight
    ## 1    1    4.4
    ## 2    3    5.3
    ## 3    5    7.2
    ## 4    2    5.2
    ## 5   11    8.5
    ## 6    9    7.3
    ## 7    3    6.0
    ## 8    9   10.4
    ## 9   12   10.2
    ## 10   3    6.1
    mean(weight)
    ## [1] 7.06
    sd(weight)
    ## [1] 2.077498
    cor(age,weight)
    ## [1] 0.9075655
    plot(age,weight, main = 'Scatter plot of infant weight (kg) by age (mo)')
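
To make clear what mean(), sd() and cor() compute, the sketch below reproduces them from their textbook formulas using the same age and weight vectors. The names ending in _by_hand are introduced here for illustration only.

```r
age <- c(1, 3, 5, 2, 11, 9, 3, 9, 12, 3)
weight <- c(4.4, 5.3, 7.2, 5.2, 8.5, 7.3, 6.0, 10.4, 10.2, 6.1)
n <- length(weight)

# Sample mean: sum of the values divided by their count
mean_by_hand <- sum(weight) / n

# Sample standard deviation: note the n - 1 in the denominator
sd_by_hand <- sqrt(sum((weight - mean_by_hand)^2) / (n - 1))

# Pearson correlation: sample covariance scaled by both standard deviations
cor_by_hand <- sum((age - mean(age)) * (weight - mean_by_hand)) /
  ((n - 1) * sd(age) * sd_by_hand)

# Each comparison should be TRUE up to floating-point tolerance
all.equal(mean_by_hand, mean(weight))
all.equal(sd_by_hand, sd(weight))
all.equal(cor_by_hand, cor(age, weight))
```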

4 Getting help

  • R has a built-in help facility. The functions in the following table show how to get more information, using the function solve as an example.

Table R help functions.

Function                             Action
help.start()                         General help.
help("solve") or ?solve              Help on function solve (the quotation marks are optional).
help.search("solve") or ??solve      Search the help system for instances of the string solve.
example("solve")                     Examples of function solve (the quotation marks are optional).
RSiteSearch("solve")                 Search for the string solve in online help manuals and archived mailing lists.
apropos("solve", mode="function")    List all available functions with solve in their name.
data()                               List all available example datasets contained in currently loaded packages.
vignette()                           List all available vignettes for currently installed packages.
vignette("solve")                    Display specific vignettes for topic solve.
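
The help functions in the table can also be used programmatically. The sketch below is one possible way to look up functions by name and to inspect their arguments without opening a help page. The variable name hits is an assumption made for this example, and the exact list returned depends on which packages are attached.

```r
# List functions whose names start with "solve" among the attached packages
hits <- apropos("^solve", mode = "function")
hits

# Show the arguments of solve() without opening its help page
args(solve)
```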

5 R commands, case sensitivity, etc.

  • Technically R is an expression language with a very simple syntax. It is case sensitive, so A and a are different symbols and would refer to different variables.
    • For example
    A <- 3
    A
    ## [1] 3
    • and
    a
    ## Error in eval(expr, envir, enclos): object 'a' not found
  • The set of symbols which can be used in R names depends on the operating system and country within which R is being run. Normally all alphanumeric symbols are allowed plus . and _, with the restriction that a name must start with . or a letter, and if it starts with . the second character must not be a digit. Names are effectively unlimited in length.
    • For example
    A.B_C <- 1
    A.B_C
    ## [1] 1
    .A <- 2 #Not recommended
    .A
    ## [1] 2
  • Commands are separated either by a semi-colon ;, or by a newline.
    • For example
    1 + 2; 3 * 2
    ## [1] 3
    ## [1] 6
    • or
    1 + 2
    ## [1] 3
    3 * 2
    ## [1] 6
  • Comments can be put almost anywhere. A comment starts with a hash mark #, and everything from there to the end of the line is ignored.
    • For example
    runif(10, min = 0, max = 1) # Generate 10 samples from Uniform(0, 1)
    ##  [1] 0.99704079 0.62063864 0.96419561 0.61418344 0.63675383 0.38017155
    ##  [7] 0.64714804 0.71036844 0.24920401 0.08042532
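
The naming rules described above can be checked from within R. The sketch below relies on make.names(), which converts any string into a syntactically valid name; a string that comes back unchanged was already valid. The helper is_valid_name and the candidate strings are hypothetical and only serve as an illustration.

```r
# A name is syntactically valid if make.names() leaves it unchanged
is_valid_name <- function(x) x == make.names(x)

# Candidates: two valid names, a leading digit, a dot followed by a digit,
# and a name containing a space
candidates <- c("A.B_C", ".A", "2abc", ".2a", "my var")

data.frame(name = candidates, valid = is_valid_name(candidates))
```

Only the first two candidates should be reported as valid, in line with the rule that a name must start with a letter or with a dot that is not followed by a digit.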

6 Workspace

  • The workspace is your current R working environment and includes any user defined objects (vectors, matrices, functions, data frames, or lists). At the end of an R session, you can save an image of the current workspace that’s automatically reloaded the next time R starts. Commands are entered interactively at the R user prompt. You can use the up and down arrow keys to scroll through your command history. Doing so allows you to select a previous command, edit it if desired, and resubmit it using the Enter key.

Table Functions for managing the R workspace.

Function                          Action
getwd()                           List the current working directory.
setwd("mydirectory")              Change the current working directory to mydirectory.
ls()                              List the objects in the current workspace.
rm(objectlist)                    Remove (delete) one or more objects.
help(options)                     Learn about available options.
options()                         View or set current options.
history(#)                        Display your last # commands (default = 25).
savehistory("myfile")             Save the command history to myfile (default = .Rhistory).
loadhistory("myfile")             Reload a command history (default = .Rhistory).
save.image("myfile")              Save the workspace to myfile (default = .RData).
save(objectlist,file="myfile")    Save specific objects to a file.
load("myfile")                    Load a workspace into the current session (default = .RData).
q()                               Quit R. You’ll be prompted to save the workspace.
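
The following sketch strings several of the workspace functions together: it creates two objects, saves them to a file, removes them, and loads them back. The object names ws_x and ws_y and the temporary file path are assumptions chosen for illustration; in practice you would pick your own file name.

```r
# Create two objects in the workspace
ws_x <- 1:5
ws_y <- c("a", "b")

# Save only these objects to a temporary file (path used for illustration)
ws_file <- tempfile(fileext = ".RData")
save(ws_x, ws_y, file = ws_file)

# Remove the objects and confirm that they are gone
rm(ws_x, ws_y)
exists("ws_x")

# Restore them from the file and list the matching objects
load(ws_file)
ls(pattern = "^ws_")
```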

7 Input and output

  • By default, launching R starts an interactive session with input from the keyboard and output to the screen. But you can also process commands from a script file (a file containing R statements) and direct output to a variety of destinations.

7.1 Input

  • The source("filename") function submits a script to the current session. If the filename doesn’t include a path, the file is assumed to be in the current working directory.
    • For example
    source("script1.R") 
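
If no script file is at hand, the behaviour of source() can be tried out with a throwaway script. In the sketch below a temporary file stands in for script1.R; the file contents and the object names script_total and script_msg are made up for this example.

```r
# Write a two-line script to a temporary file (stands in for script1.R)
script_file <- tempfile(fileext = ".R")
writeLines(
  c("script_total <- sum(1:10)",
    "script_msg <- 'script finished'"),
  script_file
)

# Run the script; the objects it creates appear in the current session
source(script_file)
script_total
## [1] 55
```

Calling source(script_file, echo = TRUE) would additionally print each statement as it is evaluated, which can help when debugging a longer script.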

7.2 Graphic Output

  • To redirect graphic output, use one of the functions listed in the following table. Use dev.off() to return output to the terminal.

Table Functions for saving graphic output

Function                        Output
pdf("filename.pdf")             PDF file
win.metafile("filename.wmf")    Windows metafile (Windows only)
png("filename.png")             PNG file
jpeg("filename.jpg")            JPEG file
bmp("filename.bmp")             BMP file
postscript("filename.ps")       PostScript file

For example

age <- c(1,3,5,2,11,9,3,9,12,3)
weight <- c(4.4,5.3,7.2,5.2,8.5,7.3,6.0,10.4,10.2,6.1)
jpeg(file = "example_1.jpg")
plot(age, weight)
dev.off()
## png 
##   2
age <- c(1,3,5,2,11,9,3,9,12,3)
weight <- c(4.4,5.3,7.2,5.2,8.5,7.3,6.0,10.4,10.2,6.1)
pdf(file = "example_1.pdf")
plot(age, weight)
dev.off()
## png 
##   2
age <- c(1,3,5,2,11,9,3,9,12,3)
weight <- c(4.4,5.3,7.2,5.2,8.5,7.3,6.0,10.4,10.2,6.1)
png(file = "example_1.png")
plot(age, weight)
dev.off()
## png 
##   2
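
A slightly more defensive variant of the examples above is sketched here. It writes to a temporary file, sets the page size explicitly, and checks afterwards that the file exists. The path plot_file and the chosen dimensions are assumptions for illustration only.

```r
age <- c(1, 3, 5, 2, 11, 9, 3, 9, 12, 3)
weight <- c(4.4, 5.3, 7.2, 5.2, 8.5, 7.3, 6.0, 10.4, 10.2, 6.1)

# Temporary output path, used only for illustration
plot_file <- tempfile(fileext = ".pdf")

# Open the PDF device; width and height are given in inches
pdf(file = plot_file, width = 6, height = 4)
plot(age, weight, xlab = "Age (mo)", ylab = "Weight (kg)")

# Close the device so that the file is written completely
invisible(dev.off())

file.exists(plot_file)
```

Forgetting dev.off() is a common reason for empty or unreadable output files, because the file is only finalised when the device is closed.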

8 Packages

8.1 What are packages?

  • R comes with extensive capabilities right out of the box. But some of its most exciting features are available as optional modules that you can download and install. There are thousands of user-contributed modules called packages that you can download from http://cran.r-project.org/web/packages. They provide a tremendous range of new capabilities, from the analysis of geostatistical data to protein mass spectra processing to the analysis of psychological tests!

8.1.1 Installing a package

  • To install a package for the first time, use the install.packages() command. For example, the gclus package contains functions for creating enhanced scatter plots. You can download and install the package with the command:
install.packages("gclus")
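  • You only need to install a package once. But like any software, packages are often updated by their authors. Use the command update.packages() to update all the packages that you’ve installed. Its first argument is a library location rather than a package name, so use the oldPkgs argument to restrict the update to specific packages. For example
update.packages(oldPkgs = "gclus")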
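
Because a package only needs to be installed once, scripts often check whether it is already available before calling install.packages(). The sketch below shows one way to do this. The helper install_if_missing is hypothetical and not part of R; it uses requireNamespace(), which reports whether a package can be loaded.

```r
# Hypothetical helper: install a package only when it is not yet available
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report whether the package can be loaded after the attempt
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with R, so this call does not trigger a download
install_if_missing("stats")
```

Calling install_if_missing("gclus") would follow the same path as install.packages("gclus") on a machine where gclus is not yet installed.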

8.2 Installing a package from GitHub

  • For example
#install.packages("devtools")
library(devtools)
install_github("xliusufe/sqrtn")

8.2.1 Installing a package from local files
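
  • A package that has been downloaded as a source archive can be installed with install.packages() by passing the file path and setting repos = NULL. The sketch below assumes a hypothetical file named mypackage_0.1.0.tar.gz in the working directory; the file name is a placeholder, not a real package.

```r
# Hypothetical path to a source archive that was downloaded beforehand
pkg_file <- "mypackage_0.1.0.tar.gz"

if (file.exists(pkg_file)) {
  # repos = NULL tells R to install from the file instead of a repository
  install.packages(pkg_file, repos = NULL, type = "source")
} else {
  message("File not found: ", pkg_file)
}
```

Installing from source may require build tools to be present on the machine if the package contains compiled code.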

8.3 Loading a package

  • Installing a package downloads it from a CRAN mirror site and places it in your library. To use it in an R session, you need to load the package using the library() command. For example
library("gclus")
library("sqrtn")
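
To see the effect of library(), you can inspect the search path, which lists the attached packages. The sketch below uses the stats package, which ships with R, so that it runs without installing anything. It also shows the :: operator as an alternative way to call a single function from a package.

```r
# Attach a package that ships with R and confirm it is on the search path
library("stats")
"package:stats" %in% search()

# Call a function through its package name using the :: operator
stats::median(c(3, 1, 2))
## [1] 2
```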

8.4 Learning about a package

  • When you load a package, a new set of functions and datasets becomes available. Small illustrative datasets are provided along with sample code, allowing you to try out the new functionalities. The help system contains a description of each function (along with examples), and information on each dataset included. Entering help(package="package_name") provides a brief description of the package and an index of the functions and datasets it includes.
    • For example
    help(package="gclus")
    help(package="sqrtn")

    or

    ?gclus
    ??gclus
    ?sqrtn
    ??sqrtn
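
Besides the help pages, the contents of a package can be listed directly. The sketch below uses the stats and datasets packages, which ship with R, as stand-ins for gclus or sqrtn; the variable name pkg_data is an assumption made for this example.

```r
# List the first few objects provided by an attached package
head(ls("package:stats"))

# List the data sets that a package provides, with their titles
pkg_data <- data(package = "datasets")
head(pkg_data$results[, c("Item", "Title")])
```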

9 A short tutorial

  • Let’s use the sequence operator to produce a vector with every integer between 1 and 50:
1:50
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
## [24] 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46
## [47] 47 48 49 50
x <-  1:50
x
##  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
## [24] 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46
## [47] 47 48 49 50

Notice the numbers in the brackets on the left-hand side of the results. These indicate the index of the first element shown in each row.

  • When you perform an operation on two vectors, R will match the elements of the two vectors pairwise and return a vector. For example
c(1, 2, 3, 4) + c(10, 20, 30, 40)
## [1] 11 22 33 44
c(1, 2, 3, 4) - c(1, 1, 1, 1)
## [1] 0 1 2 3
c(1, 2, 3, 4) * c(10, 20, 30, 40)
## [1]  10  40  90 160
c(1, 2, 3, 4) / c(0.5, 0.5, 0.5, 0.5)
## [1] 2 4 6 8
  • If the two vectors aren’t the same size, R will repeat the smaller sequence multiple times:
 c(1, 2, 3, 4) + 1
## [1] 2 3 4 5
1 / c(1, 2, 3, 4, 5)
## [1] 1.0000000 0.5000000 0.3333333 0.2500000 0.2000000
c(1, 2, 3, 4) + c(10, 100)
## [1]  11 102  13 104
  • In R, you can also enter expressions with characters:
 "Hello world."
## [1] "Hello world."
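
The repetition of the shorter vector described above works silently only when the longer length is a multiple of the shorter one. The sketch below illustrates what happens otherwise: R still returns a result but issues a warning. The names res and warn_msg are introduced here for illustration, and withCallingHandlers() is used only to capture the warning text.

```r
# Lengths 4 and 2: the shorter vector is repeated exactly twice, no warning
c(1, 2, 3, 4) + c(10, 100)

# Lengths 3 and 2: the repetition is incomplete, so R computes a result
# but also issues a warning, which is recorded here instead of printed
warn_msg <- NULL
res <- withCallingHandlers(
  c(1, 2, 3) + c(10, 20),
  warning = function(w) {
    warn_msg <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)

res
warn_msg
```

Such a warning often points to a mistake in the lengths of the inputs, so it is usually worth checking rather than ignoring.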

10 References

  • Kabacoff, R. I. (2011). “R in Action”. Manning Publications Co.
  • Baeza, S. (2015). “R For Beginners”. CreateSpace Independent Publishing Platform.
  • Adler, J. (2010). “R in a nutshell: A desktop quick reference”. O’Reilly Media, Inc.