R-studio tutorial

Krung Sinapiromsaran
October 2014

Outline

  • R programming language
  • R IDE
  • R commands
  • R Studio
    • Console Pane
    • Viewer Pane
    • Extra viewing Pane
    • Source Pane

R programming language

R is a computing environment, similar to MATLAB.

  • R is a high-level language based on Scheme and S.
  • R communicates with a user via
    • the console, to execute one command at a time
    • a script, to execute a selection of R statements
    • interactive mode, to execute GUI commands

R history

  • The S language was developed at Bell Labs and was commercialized as S-PLUS (http://www.insightful.com/) in 1988.
  • R, created by Ross Ihaka and Robert Gentleman, combines ideas from S and Scheme.

R data structures

  • R is a dynamically typed language.
  • A variable in R can hold any type of data structure, such as integer, double, character, logical (boolean), vector, matrix, array, data.frame, list, etc.
    • primitive types such as integer, double, character, logical
    • list-like types such as vector, list of objects
    • array types such as matrix, array of more than 2 dimensions
    • tabular types such as data.frame
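
The dynamic typing described above can be checked with class(). This is a minimal sketch; the variable names are illustrative and not part of the original slides.

```r
# Inspect the class of a few values (names are made up for this sketch).
n_int <- 15L          # integer (note the L suffix)
n_dbl <- 15           # double, the default for numeric literals
s_chr <- "fifteen"    # character
b_lgl <- TRUE         # logical, R's name for boolean

class(n_int)   # "integer"
class(n_dbl)   # "numeric"
class(s_chr)   # "character"
class(b_lgl)   # "logical"

# The same name can later be bound to a value of another type.
v <- 15
v <- c("a", "b")
class(v)       # "character"
```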

R advantages

  • Free
  • Available on many platforms
  • Excellent development team, with releases in April and October
  • Source always available
  • Fast for vectorized calculations
  • Comprehensive R Archive Network (CRAN) ~ 1100 packages
  • On-line documents with examples
  • High-quality graphics

R workspace

  • In each R session, variables and functions are stored as objects in active memory, called the workspace.
  • Operators and functions can be applied to these objects.
  • To import or export data, a user calls the read.* and write.* functions.
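
As a rough illustration of how objects live in a workspace, the sketch below uses a separate environment (ws, a hypothetical name) so that it does not disturb the real workspace. In an interactive session a user would typically just call ls() and rm() with no envir argument.

```r
# Work in a fresh environment so the sketch leaves the real workspace alone.
ws <- new.env()
assign("N", 15, envir = ws)
assign("square", function(x) x^2, envir = ws)

ls(ws)                                      # "N" "square"
exists("N", envir = ws, inherits = FALSE)   # TRUE
rm("N", envir = ws)                         # remove one object
ls(ws)                                      # "square"
```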

R architecture

[Figure: R architecture]

R variables

N <- 15; N
[1] 15

Create an object holding the value 15 and bind it to the name N.

5 -> n; n
[1] 5

R is case-sensitive: N and n are different variables. A user can use either <- or -> to assign a value to a variable.

R vector

x <- c(3, 5, 7); x
[1] 3 5 7

Generate a vector of numbers using the function c().

y <- 1:30; y
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
[24] 24 25 26 27 28 29 30

A user can generate a vector of consecutive integers with the : operator. The bracketed number at the start of each output line, such as [1], is the index of the first element printed on that line.
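
The bracketed indices in the printout correspond to positions that can be used for element access. A small sketch, reusing the vector y from above:

```r
y <- 1:30
y[1]          # 1
y[24]         # 24, the element that starts the second printed line
y[c(1, 30)]   # 1 30
length(y)     # 30
```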

R vectors

x <- c(T, F, F); x
[1]  TRUE FALSE FALSE

Generate a vector of logical (boolean) values.

y <- c(1.2, 4.3, 10.1, 11.2); y
[1]  1.2  4.3 10.1 11.2

Generate a vector of real numbers.

R IDE

  1. R standard console (default)

  2. R Commander (R GUI that requires additional packages)

  3. R Studio (R GUI offered as both a desktop client and a browser-based server)

  4. RKWard (stand-alone R GUI)

  5. Other editors such as Emacs (http://ess.r-project.org/)

R commands

args(ls)
function (name, pos = -1L, envir = as.environment(pos), all.names = FALSE, 
    pattern) 
NULL
  • help(ls)
    • Show the help file for the function ls()

R sample commands

Commands               Description
help.start()           Start the on-line help
help.search("matrix")  Search for "matrix" in R libraries
demo()                 Start the R demonstration
q()                    Exit the current session

R example command

example(boxplot)

boxplt> ## boxplot on a formula:
boxplt> boxplot(count ~ spray, data = InsectSprays, col = "lightgray")

boxplt> # *add* notches (somewhat funny here):
boxplt> boxplot(count ~ spray, data = InsectSprays,
boxplt+         notch = TRUE, add = TRUE, col = "blue")

boxplt> boxplot(decrease ~ treatment, data = OrchardSprays,
boxplt+         log = "y", col = "bisque")

boxplt> rb <- boxplot(decrease ~ treatment, data = OrchardSprays, col = "bisque")

boxplt> title("Comparing boxplot()s and non-robust mean +/- SD")

boxplt> mn.t <- tapply(OrchardSprays$decrease, OrchardSprays$treatment, mean)

boxplt> sd.t <- tapply(OrchardSprays$decrease, OrchardSprays$treatment, sd)

boxplt> xi <- 0.3 + seq(rb$n)

boxplt> points(xi, mn.t, col = "orange", pch = 18)

boxplt> arrows(xi, mn.t - sd.t, xi, mn.t + sd.t,
boxplt+        code = 3, col = "pink", angle = 75, length = .1)

boxplt> ## boxplot on a matrix:
boxplt> mat <- cbind(Uni05 = (1:100)/21, Norm = rnorm(100),
boxplt+              `5T` = rt(100, df = 5), Gam2 = rgamma(100, shape = 2))

boxplt> boxplot(as.data.frame(mat),
boxplt+         main = "boxplot(as.data.frame(mat), main = ...)")

boxplt> par(las = 1) # all axis labels horizontal

boxplt> boxplot(as.data.frame(mat), main = "boxplot(*, horizontal = TRUE)",
boxplt+         horizontal = TRUE)

boxplt> ## Using 'at = ' and adding boxplots -- example idea by Roger Bivand :
boxplt> 
boxplt> boxplot(len ~ dose, data = ToothGrowth,
boxplt+         boxwex = 0.25, at = 1:3 - 0.2,
boxplt+         subset = supp == "VC", col = "yellow",
boxplt+         main = "Guinea Pigs' Tooth Growth",
boxplt+         xlab = "Vitamin C dose mg",
boxplt+         ylab = "tooth length",
boxplt+         xlim = c(0.5, 3.5), ylim = c(0, 35), yaxs = "i")

boxplt> boxplot(len ~ dose, data = ToothGrowth, add = TRUE,
boxplt+         boxwex = 0.25, at = 1:3 + 0.2,
boxplt+         subset = supp == "OJ", col = "orange")

boxplt> legend(2, 9, c("Ascorbic acid", "Orange juice"),
boxplt+        fill = c("yellow", "orange"))

boxplt> ## more examples in  help(bxp)
boxplt> 
boxplt> 
boxplt> 

R system commands

  • R uses the current working directory to read and write files.
getwd()
[1] "/media/krung/KrungPassport/Data/data2014/Projects/Seagate/TrainingTimeSeriesOct2014/RTraining/TimeSeries"
  • setwd("path")
    • Change the working directory to path
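
A common pattern is to save the current directory before changing it and restore it afterwards. In this sketch, tempdir() stands in for a real project path and is only an assumption for illustration.

```r
old <- getwd()        # remember the current working directory
setwd(tempdir())      # tempdir() is used here only as a safe example path
getwd()               # now the session's temporary directory
setwd(old)            # restore the original directory
```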

R Input/Output commands

  • R writes data to a text file via write.table()
x <- 1:4
write.table(x, "x.dat")
  • R reads data stored in text (ASCII) via read.table()
y <- read.table("x.dat"); y
  x
1 1
2 2
3 3
4 4

R read functions

read.csv(file, header = TRUE, sep = ",", quote="\"", dec=".", fill = TRUE, comment.char="",  ...)

read.csv2(file, header = TRUE, sep = ";", quote="\"", dec=",", fill = TRUE, comment.char="",  ...)

read.delim(file, header = TRUE, sep = "\t", quote="\"", dec=".", fill = TRUE, comment.char="", ...)

read.delim2(file, header = TRUE, sep = "\t", quote="\"", dec=",", fill = TRUE, comment.char="", ...)
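
A minimal round-trip sketch with read.csv() and its counterpart write.csv(). The data frame contents and the temporary file are assumptions made for illustration.

```r
# Hypothetical data; tempfile() avoids writing into the working directory.
d <- data.frame(id = c(1022, 1203, 2101),
                name = c("Narudom", "Kantarat", "Sathit"))
f <- tempfile(fileext = ".csv")

write.csv(d, f, row.names = FALSE)            # comma-separated, with a header line
d2 <- read.csv(f, stringsAsFactors = FALSE)   # header = TRUE and sep = "," by default
d2
unlink(f)                                     # remove the temporary file
```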

R scan function

scan(file = "", what = double(), nmax = -1, 
     n = -1, sep = "", dec = ".", 
     quote=if(identical(sep,"\n"))""else"'\"",
     skip = 0, nlines = 0, na.strings = "NA",
     flush = FALSE, fill = FALSE, 
     strip.white = FALSE,
     quiet = FALSE, blank.lines.skip = TRUE, 
     multi.line = TRUE, fileEncoding = "",
     comment.char = "", allowEscapes = FALSE,
     encoding = "unknown", text)

mydata<-scan("data.dat", what=list("", 0, 0)) 
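
Since data.dat is not available here, the following sketch feeds scan() an in-memory string through its text argument instead. The three records are made up, and the what list tells scan() that each record holds one string followed by two numbers.

```r
# Three records, each with a name followed by two numbers (made-up values).
txt <- "a 1 2.5
b 3 4.5
c 5 6.5"

mydata <- scan(text = txt, what = list("", 0, 0), quiet = TRUE)
mydata[[1]]   # "a" "b" "c"
mydata[[2]]   # 1 3 5
mydata[[3]]   # 2.5 4.5 6.5
```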

R write function

write.table(x, file = "", append = FALSE,
            quote = TRUE, sep = " ",
            eol = "\n", na = "NA", dec = ".", 
            row.names = TRUE,
            col.names = TRUE, 
            qmethod = c("escape", "double"),
            fileEncoding = "")

write(x, file="data.txt")
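
Unlike write.table(), write() emits plain values without row or column names. A small sketch, assuming a temporary file and using the ncolumns argument of write():

```r
f <- tempfile()
write(1:10, file = f, ncolumns = 5)   # five values per line, separated by spaces
readLines(f)                          # "1 2 3 4 5" "6 7 8 9 10"
unlink(f)
```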

R object creation

v1 <- as.vector(1:10, mode="double"); class(v1); v1
[1] "numeric"
 [1]  1  2  3  4  5  6  7  8  9 10
lv <- factor(1:3); lv
[1] 1 2 3
Levels: 1 2 3

R matrix

m1 <- matrix(1:10, nrow=3, ncol=3); m1
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

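Generate a \( 3 \times 3 \) matrix from the vector 1:10. The matrix is filled column by column; only the first 9 values fit, so the value 10 is dropped and R issues a warning.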
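
The fill order can be changed with the byrow argument. The sketch below uses 1:9 so that the data fits exactly; m2 and m3 are illustrative names.

```r
m2 <- matrix(1:9, nrow = 3, ncol = 3)                 # filled column by column
m3 <- matrix(1:9, nrow = 3, ncol = 3, byrow = TRUE)   # filled row by row

dim(m2)                # 3 3
m2[2, 3]               # 8
m3[2, 3]               # 6
identical(m3, t(m2))   # TRUE: filling by row equals the transpose
```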

R array

a1 <- array(1:10, c(2, 3, 2)); a1
, , 1

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

, , 2

     [,1] [,2] [,3]
[1,]    7    9    1
[2,]    8   10    2

Generate a \( 2 \times 3 \times 2 \) array from the vector 1:10. The array needs 12 elements, so R recycles the values 1 and 2 from the start of the vector.
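
Elements of the array are addressed with one index per dimension. A short sketch based on a1 from above:

```r
a1 <- array(1:10, c(2, 3, 2))
dim(a1)        # 2 3 2
length(a1)     # 12
a1[1, 3, 2]    # 1, the first recycled value
a1[2, 3, 2]    # 2
a1[, , 1]      # the first 2 x 3 slice
```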

R data frame

  • An R data frame is created implicitly by the read.*() functions, or a user can call data.frame() to create one.
id <- c(1022, 1203, 2101)
name <- c("Narudom", "Kantarat", "Sathit")
d1 <- data.frame(id, name); d1
    id     name
1 1022  Narudom
2 1203 Kantarat
3 2101   Sathit
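
Columns and rows of a data frame can be selected in several ways. This is a brief sketch using the same d1; the threshold 1100 is arbitrary.

```r
id <- c(1022, 1203, 2101)
name <- c("Narudom", "Kantarat", "Sathit")
d1 <- data.frame(id, name)

d1$name                 # the name column
d1[2, ]                 # the second row
d1[d1$id > 1100, ]      # rows whose id exceeds 1100 (two rows)
nrow(d1); ncol(d1)      # 3 and 2
```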

R list

  • An R list can contain components of any data type.
id <- c(1022, 1203, 2101)
name <- c("Narudom", "Kantarat", "Sathit")
l1 <- list(id, name, d1); l1
[[1]]
[1] 1022 1203 2101

[[2]]
[1] "Narudom"  "Kantarat" "Sathit"  

[[3]]
    id     name
1 1022  Narudom
2 1203 Kantarat
3 2101   Sathit
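
Components of a list are extracted with [[ ]] or, when they are named, with $. The names in this sketch (l2, id, name) are chosen for illustration.

```r
l2 <- list(id = c(1022, 1203, 2101),
           name = c("Narudom", "Kantarat", "Sathit"))

names(l2)          # "id" "name"
l2$id              # 1022 1203 2101
l2[["name"]][2]    # "Kantarat"
l2[[1]][3]         # 2101
length(l2)         # 2
```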

R user defined object

  • R also allows a user to create objects of specialized classes, such as a time series built with ts().
ts1 <- ts(c(1203, 3012, 4120, 2310), frequency=1)
ts1
Time Series:
Start = 1 
End = 4 
Frequency = 1 
[1] 1203 3012 4120 2310
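
The start and frequency arguments of ts() attach calendar information to the values. The sketch below treats the same four numbers as quarterly data starting in 2014, which is an assumption made only for illustration.

```r
# Same values, interpreted as four quarters of 2014 (hypothetical dating).
ts2 <- ts(c(1203, 3012, 4120, 2310), start = c(2014, 1), frequency = 4)

class(ts2)       # "ts"
frequency(ts2)   # 4
start(ts2)       # 2014 1
end(ts2)         # 2014 4
```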

R useful functions

z <- rep(c(2,1), 4); z
[1] 2 1 2 1 2 1 2 1

Generate a vector of 8 elements by repeating c(2, 1) four times.

w <- seq(from=-1,to=1,by=0.2); w
 [1] -1.0 -0.8 -0.6 -0.4 -0.2  0.0  0.2  0.4  0.6  0.8  1.0

Generate a sequence of elements from -1 to 1 in increments of 0.2.

R expression

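  • R allows a user to add two vectors of unequal length. It recycles elements of the shorter vector, and issues a warning when the longer length is not a multiple of the shorter one (as in both examples here).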
z + w
 [1] 1.0 0.2 1.4 0.6 1.8 1.0 2.2 1.4 2.6 1.8 3.0
w + (1:3)
 [1] 0.0 1.2 2.4 0.6 1.8 3.0 1.2 2.4 3.6 1.8 3.0
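
A small sketch of the recycling rule with made-up vectors: when the longer length is a multiple of the shorter one, recycling is silent; otherwise the result is still computed but R signals a warning, which is captured here with tryCatch().

```r
a <- c(10, 20, 30, 40, 50, 60)
b <- c(1, 2)
a + b     # 11 22 31 42 51 62, b is recycled three times without a warning

# Lengths 3 and 2: capture the warning message instead of the result.
msg <- tryCatch(c(1, 2, 3) + c(10, 20),
                warning = function(w) conditionMessage(w))
msg
```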

R arithmetic functions

Commands       Description
sum(x)         Sum of the elements of x
prod(x)        Product of the elements of x
max(x)         Return the largest element of x
min(x)         Return the smallest element of x
which.max(x)   Return the index of the largest element of x
which.min(x)   Return the index of the smallest element of x
length(x)      Return the number of elements of x
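
The functions in this table can be tried on a small vector; the values below are arbitrary.

```r
x <- c(3, 5, 7, 1)
sum(x)         # 16
prod(x)        # 105
max(x)         # 7
min(x)         # 1
which.max(x)   # 3
which.min(x)   # 4
length(x)      # 4
```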

R statistical functions

Commands    Description
range(x)    Return a vector of the minimum and maximum of x
mean(x)     The arithmetic mean of x
median(x)   The median of x
var(x)      The variance of x
cor(x)      The correlation matrix of x (x a matrix or data frame)
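
A brief sketch of these functions on arbitrary data. Note that cor() with a single argument expects a matrix or data frame, so a two-column matrix m (an illustrative name) is used for it.

```r
x <- c(2, 4, 4, 6)
range(x)    # 2 6
mean(x)     # 4
median(x)   # 4
var(x)      # 8/3, about 2.667 (sample variance, divisor n - 1)

# Two perfectly correlated columns (b is twice a).
m <- cbind(a = c(1, 2, 3, 4), b = c(2, 4, 6, 8))
cor(m)      # 2 x 2 matrix with all entries equal to 1
```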

R functions dealing with vectors

Commands        Description
pmin(x, y, …)   A vector whose ith element is the minimum of x[i], y[i], …
pmax(x, y, …)   A vector whose ith element is the maximum of x[i], y[i], …
cumsum(x)       A vector whose ith element is the sum from x[1] to x[i]
cumprod(x)      A vector whose ith element is the product from x[1] to x[i]
cummin(x)       A vector whose ith element is the minimum from x[1] to x[i]
cummax(x)       A vector whose ith element is the maximum from x[1] to x[i]
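
The element-wise and cumulative functions can be compared on two short vectors; the values are arbitrary.

```r
x <- c(3, 1, 4, 1, 5)
y <- c(2, 7, 1, 8, 2)

pmin(x, y)    # 2 1 1 1 2
pmax(x, y)    # 3 7 4 8 5
cumsum(x)     # 3 4 8 9 14
cumprod(x)    # 3 3 12 12 60
cummin(x)     # 3 1 1 1 1
cummax(x)     # 3 3 4 4 5
```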

R functions

Commands       Description
round(x, n)    Round the elements of x to n decimals
rev(x)         Reverse the elements of x
sort(x)        Sort the elements of x in increasing order
log(x, base)   Compute the logarithm of x with the given base
choose(n, k)   Compute the number of combinations of k elements chosen from n
na.omit(x)     Remove the observations with missing data (NA)
na.fail(x)     Return an error message if x contains at least one NA
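
A short sketch exercising the functions in this table on arbitrary inputs. The na.fail() call is wrapped in tryCatch() so that the error is captured rather than stopping the script.

```r
round(3.14159, 2)                   # 3.14
rev(c(1, 2, 3))                     # 3 2 1
sort(c(3, 1, 2))                    # 1 2 3
log(8, base = 2)                    # 3
choose(5, 2)                        # 10
as.vector(na.omit(c(1, NA, 3)))     # 1 3

# na.fail() signals an error when an NA is present.
failed <- tryCatch({ na.fail(c(1, NA)); FALSE }, error = function(e) TRUE)
failed                              # TRUE
```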

R functions dealing with matrices and tables

Commands        Description
scale(x)        If x is a matrix, center and scale its columns
match(x, y)     Return a vector of the same length as x giving the position in y of each element of x (NA if absent)
which(x == a)   Return a vector of the indices of x for which the comparison is TRUE
table(x)        Return a table with the counts of the different values of x
table(x, y)     Contingency table of x and y
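
The lookup and counting functions can be illustrated on small character vectors; the values are arbitrary.

```r
x <- c("b", "a", "c", "a")
y <- c("a", "b")

match(x, y)       # 2 1 NA 1: position in y of each element of x
which(x == "a")   # 2 4
tab <- table(x)   # counts: a = 2, b = 1, c = 1
tab[["a"]]        # 2
```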

R other functions

Commands          Description
subset(x, …)      Return a selection of x with respect to criteria
sample(x, size)   Randomly draw size elements from the vector x without replacement
unique(x)         If x is a vector or a data frame, return a similar object without duplicates
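
A minimal sketch of these three functions. The data frame d and the seed are assumptions for illustration; the sampled values depend on the random number generator, so no output is shown for sample().

```r
d <- data.frame(id = c(1022, 1203, 2101),
                name = c("Narudom", "Kantarat", "Sathit"))
subset(d, id > 1100)        # the two rows with id above 1100

set.seed(1)                 # fix the seed so that the draw is repeatable
s <- sample(1:10, 3)        # three distinct values drawn from 1:10
s

unique(c(1, 2, 2, 3, 1))    # 1 2 3
```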

R plot

[Figure: R plot]

R Studio

  • R Studio is a window-based application composed of the standard window components:
    • Control box
    • Title bar
    • Menu bar
  • R Studio Server interacts with a user through a web browser.

R Studio main window

R Studio

  • Console Pane shows the R command box
  • Viewer Pane shows Files/Plots/Packages/Help/Viewer
  • Extra viewing Pane shows Environment/History/Presentation

R Studio main window

R Studio

  • The Source Pane appears when a user selects the New File command.

R Studio plot sin function

plot(sin, -3*pi, 3*pi)

R Studio plot with options

plot(sin,-3*pi,3*pi,type="l",col="red",lwd=5)
title("Plot sin from -3 pi to 3 pi")

R Studio plot

y <- c(3.1, 4.2, 2.1, 3.2, 4.2, 5.1)
plot(y, type="b", col="blue", lwd=5)
title("Plot y from data")

Summary

  • A user learns how to write R statements.
  • Many R IDEs are available, such as R Studio.
  • R has many primitive data types, and a user can easily combine them into larger structures.
  • R Studio is composed of four panes:
    • Console pane
    • Viewer pane
    • Extra viewing pane
    • Source pane