1 What is R?

  • R is a dialect of the S language. S was developed by John Chambers and colleagues at Bell Labs as an internal statistical analysis environment.
  • Created in 1991 by Ross Ihaka and Robert Gentleman. Announced to the public in 1993.
  • It is free, with a very active and vibrant user community.
  • Facilitates data manipulation, calculation, and graphics.
  • The base system provides the most fundamental functions
    • Need packages for other specific tasks
    • Packages are user contributed.
  • More info: https://www.r-project.org/about.html

1.1 Getting started

In this class we are going to use RStudio Cloud for in-class exercises and assignments. I have sent you an email invitation to join the course home page in RStudio Cloud. Everything we do in RStudio Cloud can also be run on your local computer.

2 R Objects

2.1 RAM and HDD

RAM and HDD are both types of computer storage. RAM (random access memory) holds the programs and data that the CPU needs in real time; it is the working memory of the computer. Data in RAM is volatile and is erased once the computer is switched off. An HDD (hard disk drive) provides permanent storage and is used to store user files.

2.2 Types of objects

  1. character: “LSU”
  2. numeric: 5.5342
  3. integer: 7L (a bare 7 is stored as numeric; the L suffix makes it an integer)
  4. complex: 1+4i
  5. logical: TRUE, FALSE or T,F for short
  6. factor: categorical variables, male or female
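
One way to check which of these types a value has is class(). The following is only an illustrative sketch; the variable names are made up for this example.

```r
# class() reports the type of each example value from the list above
class_chr <- class("LSU")      # "character"
class_num <- class(5.5342)     # "numeric"
class_7   <- class(7)          # "numeric": a bare 7 is stored as a double
class_int <- class(7L)         # "integer": the L suffix requests an integer
class_cpx <- class(1+4i)       # "complex"
class_lgl <- class(TRUE)       # "logical"
class_fct <- class(factor(c("male","female")))  # "factor"
```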

3 Interactive Session

3.1 Assignment operator

  • The “<-” symbol is the assignment operator
x <- 1
y <- 2
x+y
## [1] 3
text <- "hello"
print(text)
## [1] "hello"

3.2 Listing objects

ls(): Lists all the objects

ls()
## [1] "text" "x"    "y"

rm(list=ls()): clears all the objects

rm(list=ls())
ls()
## character(0)

3.3 Vector: a collection of elements of the same class

a <- c("a","b","c")
print(a)
## [1] "a" "b" "c"
b <- 1:10 # : operator is used to create integer sequence
print(b)
##  [1]  1  2  3  4  5  6  7  8  9 10
print(b[4:8])
## [1] 4 5 6 7 8
d  <- c(TRUE,FALSE)
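
Because all elements of a vector share one class, R converts mixed inputs to a common class. A small sketch of this behavior (the variable name mixed is only for illustration):

```r
mixed <- c(1, "a", TRUE)   # numeric, character and logical combined
print(mixed)               # every element has been converted to character
## [1] "1"    "a"    "TRUE"
mixed_class <- class(mixed)
print(mixed_class)
## [1] "character"
```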

3.4 Factors: Used to represent categorical data

  • Can be ordered (agree, disagree, …) or unordered (male, female)
  • More descriptive and easier to interpret
  • You can think of a factor as an integer vector in which each integer has a label
fcts <- factor(c("male","female","male","female"),levels=c("male","female"))
print(fcts)
## [1] male   female male   female
## Levels: male female
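
The integer codes behind a factor can be inspected with as.integer(). The sketch below also builds an ordered factor; the survey responses in resp are made-up values for illustration.

```r
fcts <- factor(c("male","female","male","female"),levels=c("male","female"))
codes <- as.integer(fcts)  # codes follow the order given in 'levels'
print(codes)
## [1] 1 2 1 2

# ordered factor: levels are listed from lowest to highest
resp <- factor(c("agree","disagree","neutral","agree"),
               levels=c("disagree","neutral","agree"), ordered=TRUE)
agree_is_higher <- resp[1] > resp[2]  # comparison works because levels are ordered
print(agree_is_higher)
## [1] TRUE
```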

3.5 Missing Values

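  • Missing values are denoted by NA (not available); undefined numeric results are denoted by NaN (not a number)
  • is.na(): checks whether an element is missing; it returns TRUE for both NA and NaN
  • is.nan(): checks whether an element is NaN; it returns FALSE for NA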
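
A short sketch of how the two checks differ, using a made-up vector v that holds one NA and one NaN:

```r
v <- c(1, NA, 0/0)   # 0/0 evaluates to NaN
na_check  <- is.na(v)
nan_check <- is.nan(v)
print(na_check)      # TRUE for both NA and NaN
## [1] FALSE  TRUE  TRUE
print(nan_check)     # TRUE only for NaN
## [1] FALSE FALSE  TRUE

# na.rm=TRUE drops missing values before the calculation
clean_mean <- mean(v, na.rm=TRUE)
print(clean_mean)
## [1] 1
```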

4 R Scripts

  • A script is a good way to keep track of what you’re doing
  • If you have a long analysis, and you want to be able to recreate it later, a good idea is to type it into a script
  • R scripts are saved as ‘.R’ files and can be opened later to rerun the analysis
  • Any text that follows the character ‘#’ on a line is ignored when scripts are executed. These are called comments
  • To create an R Script use the following steps in RStudio
    • File > New File > R Script
    • Save the file
  • To run the script press: Ctrl + Alt + R
  • To run a particular line(s), click on the line(s) and press: Ctrl + Enter

4.1 First R Script

rm(list=ls()) # removes all objects from the workspace

# creates a vector of 25 random numbers between 0 and 1
random_data <- runif(25, 0, 1)

# calculate mean
random_data_mean = mean(random_data)

# calculate standard deviation
random_data_sd = sd(random_data)

# print results
print(paste("Mean: ",random_data_mean,", SD: ",random_data_sd))
## [1] "Mean:  0.50355591090396 , SD:  0.312337265044136"
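
Because runif() draws random numbers, the mean and standard deviation change each time the script runs. If the aim is to recreate an analysis exactly, one option is to fix the random seed first. A minimal sketch, where 42 is an arbitrary seed chosen for illustration:

```r
set.seed(42)                   # fix the starting point of the random number generator
first_run <- runif(25, 0, 1)

set.seed(42)                   # resetting the seed restarts the same sequence
second_run <- runif(25, 0, 1)

same_numbers <- identical(first_run, second_run)
print(same_numbers)
## [1] TRUE
```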

5 R Markdown

  • R Markdown is a file format for making dynamic documents with R.
  • An R Markdown document is written in markdown (an easy-to-write plain text format) and contains chunks of embedded R code.
  • You can ‘Knit’ an R Markdown document to generate an HTML or PDF file that includes both the content and the output of any embedded R code chunks.
  • To create an R Markdown document use the following steps in RStudio
    • File > New File > R Markdown
    • Save the file
  • ‘#’, ‘##’, . . . indicate header levels
  • List items should be preceded by ‘*’
  • R code chunks are included between ```{r} and ```
  • To hide the R code in the output, set echo=FALSE in the chunk header: {r echo=FALSE}
  • More details: https://rmarkdown.rstudio.com/articles_intro.html

6 Control Structures

Control structures allow the programmer to control the flow of execution of a program. We are going to talk about ‘if else’, ‘for’, ‘while’, etc.

6.1 if-else conditions

Suppose you want to assign a credit rating based on the probability of default. If the probability of default is less than 1%, the credit rating is ‘A’. If it is at least 1% but less than 5%, the rating is ‘B’. A probability of default of 5% or greater is assigned a rating of ‘C’.

creditrating = NA
probofdefault = 0.002

if(probofdefault<0.01) {
  creditrating = "A"
} else if(probofdefault<0.05) {
  creditrating = "B"
} else {
  creditrating = "C"
}

print(paste("Credit Rating:",creditrating))
## [1] "Credit Rating: A"
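
The same rule can be applied to several probabilities at once with base R's ifelse(), which works element by element. This is a sketch rather than part of the original example; the values in probs are hypothetical and include the two boundary cases 1% and 5%.

```r
# hypothetical default probabilities, including both boundary values
probs <- c(0.002, 0.01, 0.03, 0.05, 0.2)

# nested ifelse() mirrors the if / else if / else chain
ratings <- ifelse(probs < 0.01, "A",
           ifelse(probs < 0.05, "B", "C"))
print(ratings)
## [1] "A" "B" "B" "C" "C"
```

Note that exactly 1% falls into ‘B’ and exactly 5% into ‘C’, because both comparisons use a strict “less than”.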

6.2 for loop

Print the square roots of the numbers from 1 to 10

for(i in 1:10) {
  print(sqrt(i))
}
## [1] 1
## [1] 1.414214
## [1] 1.732051
## [1] 2
## [1] 2.236068
## [1] 2.44949
## [1] 2.645751
## [1] 2.828427
## [1] 3
## [1] 3.162278
numbervec <- c(5,7,9)

for(i in numbervec)  {
  print(i)
}
## [1] 5
## [1] 7
## [1] 9

6.3 while loop

Count by 5 up to 100
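
A minimal sketch of such a loop, assuming the count starts at 5 and that 100 itself is included (the variable name count is chosen for illustration):

```r
count <- 5
while(count <= 100) {
  print(count)         # prints 5, 10, 15, and so on
  count <- count + 5   # without this update the loop would never end
}
```

The condition is checked before every pass, so the loop stops as soon as count exceeds 100.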

6.5 break

Iterate through the given list of letters and stop if the letter is equal to ‘f’

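# named letter_vec so that R's built-in 'letters' constant is not overwritten
letter_vec <- c("s","r","e","x","f","a","o")

for(letter in letter_vec) {
  if(letter=="f") {
    break  # leave the loop as soon as "f" is reached
  }
  print(letter)
}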
## [1] "s"
## [1] "r"
## [1] "e"
## [1] "x"

7 Functions

A function is a block of code that is used to perform a task. A function typically takes an input, executes the block of code, and returns a value.

More: https://www.youtube.com/watch?v=Pi0Yf-jn7O8

7.1 A function to add 2 numbers

add2 <- function(a,b) {
  output = a+b
  return(output)
}
add2(2,6)
## [1] 8

7.2 A function to calculate the present value

presentvalue <- function(fv,r,n) {
  pv = fv/(1+r)^n
  return(pv)
}
presentvalue(245,0.05,3)
## [1] 211.6402
cf = c(-10,5,4,5,1)
r = 0.14
npv = 0

for(index in 1:length(cf))  {
 pv = cf[index]/(1+r)^(index-1)
 npv = npv+pv
}

print(npv)
## [1] 1.430773
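
Since arithmetic in R works element by element, the loop above can also be written without a loop by passing vectors to presentvalue(). This is an alternative sketch, not a replacement for the loop; the names periods and npv_vectorized are introduced here for illustration.

```r
presentvalue <- function(fv,r,n) {
  pv = fv/(1+r)^n
  return(pv)
}

cf = c(-10,5,4,5,1)
r = 0.14

periods = seq_along(cf) - 1   # 0, 1, 2, 3, 4: the first cash flow is not discounted
npv_vectorized = sum(presentvalue(cf, r, periods))
print(npv_vectorized)
## [1] 1.430773
```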

7.2.1 Reusing functions

  1. Open a new R script
  2. Write your function
  3. Save it as an R file
  4. Use ‘source’ command to import the function

Write a function to calculate the net present value of a project

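# contents of npv.R
npv <- function(cashflows,r) {
  total <- 0
  for(t in seq_along(cashflows)) {
    # the cash flow at position t is discounted t-1 periods
    total <- total+cashflows[t]/(1+r)^(t-1)
  }
  return(total)
}
# in another script: import the function from npv.R
source('npv.R')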
cf = c(-100,120,34,56,23,54)
r = 0.08

npv(cf,r)
## [1] 138.3724
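
The four steps can also be tried end to end in a single session. The sketch below writes a function to a temporary file instead of npv.R, so it runs without any existing file; script_path is an illustrative name, and the function body is a compact variant of the loop version.

```r
# steps 1-3: write the function definition to a temporary .R file
script_path <- tempfile(fileext = ".R")
writeLines(c(
  "npv <- function(cashflows,r) {",
  "  sum(cashflows/(1+r)^(seq_along(cashflows)-1))",
  "}"
), script_path)

# step 4: source() runs the file, which defines npv() in the session
source(script_path)
result <- npv(c(-100,120,34,56,23,54), 0.08)
print(result)
## [1] 138.3724
```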

8 Coding Standards

Make your code readable and easy to understand.