Introduction to R for Finance

Chapter 1: The Basics

1.1 Your first R script

Type R code in the script pane on the right to solve each exercise, then press the Submit button. The lines in the script are executed, and you are told whether the code is correct. Output appears in the R console. You can also run code directly in the console.

# Addition!
3 + 5
## [1] 8

# Subtraction!
6 - 4
## [1] 2

1.2 Arithmetic in R(1)

The ^ operator raises the number on its left (the base) to the power of the number on its right; for example, 3^2 is 9. The modulo operator (%%) returns the remainder after dividing the number on its left by the number on its right. For example, 9 %% 3 is 0, because 3 * 3 is 9 with nothing left over.

# Addition 
2 + 2
## [1] 4

# Subtraction
4 - 1
## [1] 3

# Multiplication
3 * 4
## [1] 12

# Division
4 / 2
## [1] 2

# Exponentiation
2^4
## [1] 16

# Modulo
7 %% 3
## [1] 1

1.4 Assignment and variables(1)

A variable stores a value or an object in R. Typing the variable’s name later gives easy access to its value. Use <- to assign a value to a variable, for example: my_money <- 100

# Assign 200 to savings
savings <- 200

# Print the value of savings to the console
savings
## [1] 200

1.5 Assignment and variables(2)

You can do arithmetic in the script with the variables you’ve assigned. To save the result, put the expression on the right-hand side of <- and the new variable name on the left:

# Assign 100 to my_money
my_money <- 100

# Assign 200 to dans_money
dans_money <- 200

# Add my_money and dans_money
my_money + dans_money
## [1] 300

# Add my_money and dans_money again, save the result to our_money
our_money <- my_money + dans_money

1.6 Financial returns(1)

Financial returns: a return of 5% means you end up with 105% of your starting cash, which is a multiplier of 1.05. In general, multiplier = 1 + (return / 100).

# Variables for starting_cash and 5% return during January
starting_cash <- 200
jan_ret <- 5
jan_mult <- 1 + (jan_ret / 100)

# How much money do you have at the end of January?
post_jan_cash <- starting_cash * jan_mult

# Print post_jan_cash
post_jan_cash
## [1] 210

# January 10% return multiplier
jan_ret_10 <- 10
jan_mult_10 <- 1 + (jan_ret_10/100)

# How much money do you have at the end of January now?
post_jan_cash_10 <- starting_cash * jan_mult_10

# Print post_jan_cash_10
post_jan_cash_10
## [1] 220

1.7 Financial returns(2)

To add another 2% return the following month, just multiply by another multiplier: $100 * 1.05 * 1.02 = $107.10. The product of the multipliers gives the total return over the 2 months: 1.05 * 1.02 = 1.071, which means you earned 7.1% in total over 2 months.

# Starting cash and returns 
starting_cash <- 200
jan_ret <- 4
feb_ret <- 5

# Multipliers
jan_mult <- 1 + (jan_ret / 100)
feb_mult <- 1 + (feb_ret / 100)

# Total cash at the end of the two months
total_cash <- starting_cash * jan_mult * feb_mult

# Print total_cash
total_cash
## [1] 218.4
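
The multiplier arithmetic generalizes to any number of months. The following is a minimal sketch, assuming a hypothetical named vector monthly_ret of percentage returns (the variable names are illustrative and not part of the course exercises). It reproduces the 5% then 2% example from the text.

```r
# Hypothetical monthly returns in percent: 5% in January, 2% in February
monthly_ret <- c(jan = 5, feb = 2)

# Convert each return to a multiplier: 5% -> 1.05, 2% -> 1.02
monthly_mult <- 1 + monthly_ret / 100

# Multiply all multipliers together to get the total multiplier
total_mult <- prod(monthly_mult)

# Ending cash for a $100 starting balance
ending_cash <- 100 * total_mult

# Total return over the whole period, in percent
total_ret <- (total_mult - 1) * 100
```

Using prod() avoids writing out one multiplication per month, so the same code would work for 12 months of returns.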

1.8 Data type exploration

R has several basic data types:

Numerics are decimal numbers, such as 4.5.
Integers are whole numbers, and must be written with an L suffix, such as 4L.
Logicals are the boolean values TRUE and FALSE, which must be written in capitals.
Characters are text values, such as "hello world".

# Apple's stock price is a numeric
apple_stock <- 150.45

# Bond credit ratings are characters
credit_rating <- "AAA"

# You like the stock market. TRUE or FALSE?
my_answer <- FALSE

# Print my_answer
my_answer
## [1] FALSE

1.9 What’s that data type?

class(my_var) returns the data type (or class) of a variable. For example, with variables a, b, and c already assigned:

class(a)
## [1] "logical"

class(b)
## [1] "numeric"

class(c)
## [1] "character"
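
As a self-contained illustration of class(), the sketch below assigns one variable of each basic type and checks it. The variable names (is_open, price, rating, shares) are made up for this example.

```r
# Illustrative variables, one per basic data type
is_open <- TRUE     # logical
price   <- 150.45   # numeric
rating  <- "AAA"    # character
shares  <- 4L       # integer (note the L suffix)

class(is_open)
## [1] "logical"

class(price)
## [1] "numeric"

class(rating)
## [1] "character"

class(shares)
## [1] "integer"
```

Without the L suffix, 4 would be stored as a numeric rather than an integer.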

Chapter 2: Vectors and Matrices

2.1 c()combine

Create a vector with the combine function, c(). Each element you add is separated by a comma. For example, this is a vector of Apple’s stock prices from December 2016:

apple_stock <- c(109.49, 109.90, 109.11, 109.95, 111.03, 112.12)

And this is a character vector of bond credit ratings:

credit_rating <- c("AAA", "AA", "BBB", "BB", "B")

# Another numeric vector
ibm_stock <- c(159.82, 160.02, 159.84)

# Another character vector
finance <- c("stocks", "bonds", "investments")

# A logical vector
logic <- c(TRUE, FALSE, TRUE)

2.2 coerce it

A vector can hold only one data type. If you combine different types, the lower-ranking type is coerced to the higher-ranking type. The hierarchy is: logical < integer < numeric < character. Logicals are coerced differently depending on the highest data type present. c(TRUE, 1.5) returns c(1, 1.5), where TRUE is coerced to the numeric 1 (FALSE would become 0). On the other hand, c(TRUE, "this_char") is converted to c("TRUE", "this_char").

a <- c(1L, "I am a character")

b <- c(TRUE, "Hello")

c <- c(FALSE, 2)

a is a character vector, b is a character vector, c is a numeric vector.
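
One way to confirm these coercion results is to ask R directly with class(). This is a small sketch; the third vector is named c_vec here (an illustrative name) only to avoid reusing the name of the c() function.

```r
a <- c(1L, "I am a character")
b <- c(TRUE, "Hello")
c_vec <- c(FALSE, 2)

class(a)
## [1] "character"

class(b)
## [1] "character"

class(c_vec)
## [1] "numeric"

# The logical FALSE was coerced to the number 0
c_vec
## [1] 0 2
```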

2.3 Vector names()

You can add a name to each return inside a vector with the names() function:

ret <- c(5, 2)
names(ret) <- c("Jan", "Feb")

# Vectors of 12 months of returns, and month names
ret <- c(5, 2, 3, 7, 8, 3, 5, 9, 1, 4, 6, 3)
months <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

# Add names to ret
names(ret) <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

# Print out ret to see the new names!
ret
## Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec 
##   5   2   3   7   8   3   5   9   1   4   6   3

2.4 Visualize your vector

The plot() function is one of the many ways to create a graph from your data in R. Passing in a vector puts its values on the y-axis, and the x-axis is an index created from the order of the elements in the vector. Inside plot(), you can change the type of graph using type =. The default is "p" for points, but you can also change it to "l" for a line.

# Look at the data
apple_stock <- c(109.49, 109.90, 109.11, 109.95, 111.03, 112.12, 113.95, 113.30, 115.19, 115.19, 115.82, 115.97, 116.64, 116.95, 117.06, 116.29, 116.52, 117.26, 116.76, 116.73, 115.82)

# Plot the data points
plot(apple_stock)


# Plot the data as a line graph
plot(apple_stock, type = "l")

2.5 Weighted average (1)

Your total portfolio return is the weighted average of the returns of the stocks you hold. Suppose 40% of your portfolio is in Apple stock, which earned 5%, and 60% is in IBM stock, which earned 7%. To calculate the total return, multiply the return of each stock by its weight and sum the results: 5 * .4 + 7 * .6 = 6.2. In R:

portf_ret <- apple_ret * apple_weight + ibm_ret * ibm_weight

# Weights and returns
micr_ret <- 7
sony_ret <- 9
micr_weight <- .2
sony_weight <- .8

# Portfolio return
portf_ret <- micr_ret * micr_weight + sony_ret * sony_weight

2.6 Weighted average (2)

The same arithmetic works with vectors. First, calculate ret * weight, which multiplies the vectors element by element to create a new vector, ret_X_weight. Then use sum() to add up the elements of that vector.

# Weights, returns, and company names
ret <- c(7, 9)
weight <- c(.2, .8)
companies <- c("Microsoft", "Sony")

# Assign company names to your vectors
names(ret) <- companies
names(weight) <- companies

# Multiply the returns and weights together 
ret_X_weight <- ret * weight

# Print ret_X_weight
ret_X_weight
## Microsoft      Sony 
##       1.4       7.2

# Sum to get the total portfolio return
portf_ret <- sum(ret_X_weight)

# Print portf_ret
portf_ret
## [1] 8.6
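
Base R also provides weighted.mean(), which can serve as a cross-check on the manual calculation. The sketch below assumes the weights sum to 1, as they do in this example; weighted.mean() divides by the sum of the weights, so the two results agree only in that case. The names manual and builtin are illustrative.

```r
ret <- c(Microsoft = 7, Sony = 9)
weight <- c(Microsoft = .2, Sony = .8)

# Manual calculation: multiply element-wise, then sum
manual <- sum(ret * weight)

# Built-in equivalent; it divides by sum(weight), which is 1 here
builtin <- weighted.mean(ret, weight)

manual
## [1] 8.6
```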

2.7 Weighted average (3)

To give equal weight to two companies (50/50):

ret <- c(7, 9)
weight <- .5
ret_X_weight <- ret * weight
ret_X_weight
## [1] 3.5 4.5

ret is a vector of length 2, and weight is a vector of length 1. R reuses the .5 in weight twice to make it the same length as ret, then performs the element-wise arithmetic. This is called recycling.

# Print ret
ret
## Microsoft      Sony 
##         7         9

# Assign 1/3 to weight
weight <- 1/3

# Create ret_X_weight
ret_X_weight <- ret * weight

# Calculate your portfolio return
portf_ret <- sum(ret_X_weight)

# Both vectors have length 2 here, so this is plain element-wise multiplication
ret * c(.2, .6)
## Microsoft      Sony 
##       1.4       5.4
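
Recycling also applies when the lengths differ and neither is 1. The sketch below uses a hypothetical three-stock return vector, ret3 (not part of the exercise data), to show what happens when a vector of length 3 is multiplied by a vector of length 2: R reuses the first weight for the third element and issues a warning, which is suppressed here so the result can be inspected.

```r
# Hypothetical returns for three stocks
ret3 <- c(7, 9, 11)

# Length-1 weight: .5 is reused for every element
ret3 * .5
## [1] 3.5 4.5 5.5

# Length 3 times length 2: .2 is recycled for the third element, and R warns
# that the longer length is not a multiple of the shorter one
res <- suppressWarnings(ret3 * c(.2, .6))
res
## [1] 1.4 5.4 2.2
```

The warning is a useful signal: a silent mismatch between returns and weights would produce a wrong portfolio return.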

2.8 Vector subsetting

You can access specific pieces of a vector, such as a specific month from the vector of 12 months of returns:

ret <- c(5, 2, 3, 7, 8, 3, 5, 9, 1, 4, 6, 3)

Select the first month: ret[1]
Select the first month by name: ret["Jan"]
Select the first three months: ret[1:3] or ret[c(1, 2, 3)]

# Define ret
ret <- c(5, 2, 3, 7, 8, 3, 5, 9, 1, 4, 6, 3)
names(ret) <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
ret
## Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec 
##   5   2   3   7   8   3   5   9   1   4   6   3

# First 6 months of returns
ret[1:6]
## Jan Feb Mar Apr May Jun 
##   5   2   3   7   8   3

# Just March and May
ret[c("Mar", "May")]
## Mar May 
##   3   8

# Omit the first month of returns
ret[-1]
## Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec 
##   2   3   7   8   3   5   9   1   4   6   3

2.9 Create a matrix!

Matrices are like vectors, but with two dimensions.

matrix(data = c(2, 3, 4, 5), nrow = 2, ncol = 2)
##      [,1] [,2]
## [1,]    2    4
## [2,]    3    5

The data for the matrix is passed in as a vector using c(), and is then converted to a matrix by specifying the number of rows and columns (also known as the dimensions). An equivalent form:

my_vector <- c(2, 3, 4, 5)
matrix(data = my_vector, nrow = 2, ncol = 2)

# A vector of 9 numbers
my_vector <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

# 3x3 matrix
my_matrix <- matrix(data = my_vector, nrow = 3, ncol = 3)

# Print my_matrix
my_matrix
##      [,1] [,2] [,3]
## [1,]    1    4    7
## [2,]    2    5    8
## [3,]    3    6    9

# Filling across using byrow = TRUE
matrix(data = c(2, 3, 4, 5), nrow = 2, ncol = 2, byrow = TRUE)
##      [,1] [,2]
## [1,]    2    3
## [2,]    4    5

2.10 Matrix <- bind vectors

You will often create matrices by combining multiple vectors. The easiest way is to use the functions cbind() and rbind() (column bind and row bind, respectively). For example, combine vectors of Apple, IBM, and Microsoft stock prices:

# Define Vectors
apple <- c(109.49, 109.90, 109.11, 109.95, 111.03, 112.12, 113.95, 113.30, 115.19, 115.19,
           115.82, 115.97, 116.64, 116.95, 117.06, 116.29, 116.52, 117.26, 116.76, 116.73,
           115.82)
ibm <- c(159.82, 160.02, 159.84, 160.35, 164.79, 165.36, 166.52, 165.50, 168.29, 168.51, 
         168.02, 166.73, 166.68, 167.60, 167.33, 167.06, 166.71, 167.14, 166.19, 166.60, 
         165.99)
micr <- c(59.20, 59.25, 60.22, 59.95, 61.37, 61.01, 61.97, 62.17, 62.98, 62.68, 62.58,
          62.30, 63.62, 63.54, 63.54, 63.55, 63.24, 63.28, 62.99, 62.90, 62.14)

# cbind the vectors together
cbind_stocks <- cbind(apple, ibm, micr)

# Print cbind_stocks
cbind_stocks
##        apple    ibm  micr
##  [1,] 109.49 159.82 59.20
##  [2,] 109.90 160.02 59.25
##  [3,] 109.11 159.84 60.22
##  [4,] 109.95 160.35 59.95
##  [5,] 111.03 164.79 61.37
##  [6,] 112.12 165.36 61.01
##  [7,] 113.95 166.52 61.97
##  [8,] 113.30 165.50 62.17
##  [9,] 115.19 168.29 62.98
## [10,] 115.19 168.51 62.68
## [11,] 115.82 168.02 62.58
## [12,] 115.97 166.73 62.30
## [13,] 116.64 166.68 63.62
## [14,] 116.95 167.60 63.54
## [15,] 117.06 167.33 63.54
## [16,] 116.29 167.06 63.55
## [17,] 116.52 166.71 63.24
## [18,] 117.26 167.14 63.28
## [19,] 116.76 166.19 62.99
## [20,] 116.73 166.60 62.90
## [21,] 115.82 165.99 62.14

# rbind the vectors together
rbind_stocks <- rbind(apple, ibm, micr)

# Print rbind_stocks
rbind_stocks
##         [,1]   [,2]   [,3]   [,4]   [,5]   [,6]   [,7]   [,8]   [,9]
## apple 109.49 109.90 109.11 109.95 111.03 112.12 113.95 113.30 115.19
## ibm   159.82 160.02 159.84 160.35 164.79 165.36 166.52 165.50 168.29
## micr   59.20  59.25  60.22  59.95  61.37  61.01  61.97  62.17  62.98
##        [,10]  [,11]  [,12]  [,13]  [,14]  [,15]  [,16]  [,17]  [,18]
## apple 115.19 115.82 115.97 116.64 116.95 117.06 116.29 116.52 117.26
## ibm   168.51 168.02 166.73 166.68 167.60 167.33 167.06 166.71 167.14
## micr   62.68  62.58  62.30  63.62  63.54  63.54  63.55  63.24  63.28
##        [,19]  [,20]  [,21]
## apple 116.76 116.73 115.82
## ibm   166.19 166.60 165.99
## micr   62.99  62.90  62.14

2.11 Visualize your matrix

Plotting matrices of Apple and Microsoft stocks to see the relationships between their stock prices in December 2016.

# Define matrix
apple_micr_matrix <- cbind(apple, micr)

# View the data
apple_micr_matrix
##        apple  micr
##  [1,] 109.49 59.20
##  [2,] 109.90 59.25
##  [3,] 109.11 60.22
##  [4,] 109.95 59.95
##  [5,] 111.03 61.37
##  [6,] 112.12 61.01
##  [7,] 113.95 61.97
##  [8,] 113.30 62.17
##  [9,] 115.19 62.98
## [10,] 115.19 62.68
## [11,] 115.82 62.58
## [12,] 115.97 62.30
## [13,] 116.64 63.62
## [14,] 116.95 63.54
## [15,] 117.06 63.54
## [16,] 116.29 63.55
## [17,] 116.52 63.24
## [18,] 117.26 63.28
## [19,] 116.76 62.99
## [20,] 116.73 62.90
## [21,] 115.82 62.14

# Scatter plot of Microsoft vs Apple
plot(apple_micr_matrix)

2.12 cor()relation

A correlation of -1 is perfect negative correlation, 1 is perfect positive correlation, and 0 means no linear correlation at all. The cor() function calculates the correlation between two vectors, or creates a correlation matrix when given a matrix.

cor(apple, micr)
## [1] 0.9477011

A value this close to 1 indicates a strong positive correlation.

cor(apple_micr_matrix)
##           apple      micr
## apple 1.0000000 0.9477011
## micr  0.9477011 1.0000000

# Correlation of Apple and IBM
cor(apple, ibm)
## [1] 0.8872467

# stock matrix
stocks <- cbind(apple, micr, ibm)

# cor() of all three
cor(stocks)
##           apple      micr       ibm
## apple 1.0000000 0.9477010 0.8872467
## micr  0.9477010 1.0000000 0.9126597
## ibm   0.8872467 0.9126597 1.0000000
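
To make the endpoints of the correlation scale concrete, here is a small sketch with made-up vectors (x, y_up, y_down, and y_none are illustrative and not stock data). It constructs one perfectly positive, one perfectly negative, and one uncorrelated pair.

```r
x <- c(1, 2, 3, 4, 5)

# y_up rises in lockstep with x: perfect positive correlation
y_up <- 2 * x + 10
cor(x, y_up)
## [1] 1

# y_down falls as x rises: perfect negative correlation
y_down <- -3 * x
cor(x, y_down)
## [1] -1

# y_none is symmetric around the middle of x: no linear relationship
y_none <- c(1, -1, 0, -1, 1)
cor(x, y_none)
## [1] 0
```

Real stock prices rarely reach these extremes, which is why values such as 0.89 or 0.95 already count as strong correlation.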

2.13 Matrix subsetting

To select and subset a matrix, use my_matrix[row, col]. Using stocks from the last example:

First row, first column: stocks[1, 1]
Entire first row (leave col empty): stocks[1, ]
First two rows: stocks[1:2, ] or stocks[c(1, 2), ]
Entire first column (leave row empty): stocks[, 1]
Entire column by name: stocks[, "apple"]

# Third row
stocks[3, ]
##  apple   micr    ibm 
## 109.11  60.22 159.84

# Fourth and fifth row of the ibm column
stocks[4:5, "ibm"]
## [1] 160.35 164.79

# apple and micr columns
stocks[, c("apple", "micr")]
##        apple  micr
##  [1,] 109.49 59.20
##  [2,] 109.90 59.25
##  [3,] 109.11 60.22
##  [4,] 109.95 59.95
##  [5,] 111.03 61.37
##  [6,] 112.12 61.01
##  [7,] 113.95 61.97
##  [8,] 113.30 62.17
##  [9,] 115.19 62.98
## [10,] 115.19 62.68
## [11,] 115.82 62.58
## [12,] 115.97 62.30
## [13,] 116.64 63.62
## [14,] 116.95 63.54
## [15,] 117.06 63.54
## [16,] 116.29 63.55
## [17,] 116.52 63.24
## [18,] 117.26 63.28
## [19,] 116.76 62.99
## [20,] 116.73 62.90
## [21,] 115.82 62.14

Chapter 3: Data Frames

Same structure as a matrix but it can include multiple data types.

3.1 Create your first data.frame()

Future cash flows of two companies:

# Variables
company <- c("A", "A", "A", "B", "B", "B", "B")
cash_flow <- c(1000, 4000, 550, 1500, 1100, 750, 6000)
year <- c(1, 3, 4, 1, 2, 4, 5)

# Data frame
cash <- data.frame(company, cash_flow, year)

# Print cash
cash
##   company cash_flow year
## 1       A      1000    1
## 2       A      4000    3
## 3       A       550    4
## 4       B      1500    1
## 5       B      1100    2
## 6       B       750    4
## 7       B      6000    5

3.2 Making head()s and tail()s of your data with some str()ucture

Three very useful functions for inspecting a data frame:

head() returns the first few rows of a data frame. By default, 6. To change this, use head(cash, n = ).
tail() returns the last few rows of a data frame. By default, 6. To change this, use tail(cash, n = ).
str() checks the structure of an object. It shows the data type of the object you pass in (here, data.frame) and lists each column variable along with its data type.

# Call head() for the first 4 rows
head(cash, n = 4)
##   company cash_flow year
## 1       A      1000    1
## 2       A      4000    3
## 3       A       550    4
## 4       B      1500    1

# Call tail() for the last 3 rows
tail(cash, n = 3)
##   company cash_flow year
## 5       B      1100    2
## 6       B       750    4
## 7       B      6000    5

# Call str()
str(cash)
## 'data.frame':    7 obs. of  3 variables:
##  $ company  : Factor w/ 2 levels "A","B": 1 1 1 2 2 2 2
##  $ cash_flow: num  1000 4000 550 1500 1100 750 6000
##  $ year     : num  1 3 4 1 2 4 5

3.3 Naming your columns / rows

Change your column names with the colnames() function and row names with the rownames() function.

# Fix your column names
colnames(cash) <- c("company", "cash_flow", "year")

# Print out the column names of cash
colnames(cash)
## [1] "company"   "cash_flow" "year"
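
rownames() works the same way as colnames(). The sketch below uses a shortened copy of the cash data and hypothetical row labels (flow_1, flow_2, flow_3) to show how rows can be named and then selected by name.

```r
# A shortened copy of the cash data, for illustration
cash <- data.frame(
  company = c("A", "A", "A"),
  cash_flow = c(1000, 4000, 550),
  year = c(1, 3, 4)
)

# Hypothetical row labels; any unique character vector of the right length works
rownames(cash) <- c("flow_1", "flow_2", "flow_3")

rownames(cash)
## [1] "flow_1" "flow_2" "flow_3"

# Rows can now be selected by name
cash["flow_2", "cash_flow"]
## [1] 4000
```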

3.4 Accessing and subsetting data frames (1)

Select the first row: cash[1, ]
Select the first column: cash[, 1]
Select the first column by name: cash[, "company"]

# Third row, second column
cash[3, 2]
## [1] 550

# Fifth row of the "year" column
cash[5, "year"]
## [1] 2

3.5 Accessing and subsetting data frames (2)

$ is a shortcut for selecting a specific column from a data frame. You can delete a column by assigning NULL to it.

# Select the year column
cash$year
## [1] 1 3 4 1 2 4 5

# Select the cash_flow column and multiply by 2
cash$cash_flow * 2
## [1]  2000  8000  1100  3000  2200  1500 12000

# Delete the company column
cash$company <- NULL

# Print cash again
cash
##   cash_flow year
## 1      1000    1
## 2      4000    3
## 3       550    4
## 4      1500    1
## 5      1100    2
## 6       750    4
## 7      6000    5

3.6 Accessing and subsetting data frames (3)

Only interested in the cash flows from company A? For more flexibility, try subset():

subset(cash, company == "A")

The first argument is the name of your data frame (cash). The second is the condition; do not put the column name company in quotes. The == is the equality operator. It tests where two things are equal and returns a logical vector, which subset() uses to pick the rows to keep.

# Restore cash
company <- c("A", "A", "A", "B", "B", "B", "B")
cash_flow <- c(1000, 4000, 550, 1500, 1100, 750, 6000)
year <- c(1, 3, 4, 1, 2, 4, 5)

cash <- data.frame(company, cash_flow, year)

# Rows about company B
subset(cash, company == "B")
##   company cash_flow year
## 4       B      1500    1
## 5       B      1100    2
## 6       B       750    4
## 7       B      6000    5

# Rows with cash flows due in 1 year
subset(cash, year == 1)
##   company cash_flow year
## 1       A      1000    1
## 4       B      1500    1

3.7 Adding new columns

Create a new column in your data frame using data_frame$new_column

# Quarter cash flow scenario
cash$quarter_cash <- cash$cash_flow * 0.25  
cash
##   company cash_flow year quarter_cash
## 1       A      1000    1        250.0
## 2       A      4000    3       1000.0
## 3       A       550    4        137.5
## 4       B      1500    1        375.0
## 5       B      1100    2        275.0
## 6       B       750    4        187.5
## 7       B      6000    5       1500.0
# Double year scenario
cash$double_year <- cash$year * 2
cash
##   company cash_flow year quarter_cash double_year
## 1       A      1000    1        250.0           2
## 2       A      4000    3       1000.0           6
## 3       A       550    4        137.5           8
## 4       B      1500    1        375.0           2
## 5       B      1100    2        275.0           4
## 6       B       750    4        187.5           8
## 7       B      6000    5       1500.0          10

3.8 Present value of projected cash flows (1)

To calculate the present value of $100 to be received 1 year from now at a 5% interest rate, discount the cash flow by the interest rate for each year you have to wait:

present_value <- cash_flow * (1 + interest / 100) ^ -year

# Present value of $4000, in 3 years, at 5%
present_value_4k <- 4000 * (1.05)^(-3)

# Present value of all cash flows
cash$present_value <- cash$cash_flow * (1.05)^(-cash$year)

# Print out cash
cash
##   company cash_flow year quarter_cash double_year present_value
## 1       A      1000    1        250.0           2      952.3810
## 2       A      4000    3       1000.0           6     3455.3504
## 3       A       550    4        137.5           8      452.4864
## 4       B      1500    1        375.0           2     1428.5714
## 5       B      1100    2        275.0           4      997.7324
## 6       B       750    4        187.5           8      617.0269
## 7       B      6000    5       1500.0          10     4701.1570
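
The present value formula can be wrapped in a small helper to work through the $100 example. This is a sketch only: the function name present_value() and its argument names are assumptions chosen to mirror the formula, with interest given in percent.

```r
# Hypothetical helper; interest is in percent, year is the number of years
present_value <- function(cash_flow, interest, year) {
  cash_flow * (1 + interest / 100) ^ -year
}

# $100 received 1 year from now at 5% is worth about $95.24 today
present_value(100, interest = 5, year = 1)
## [1] 95.2381

# The function is vectorized, so it handles several cash flows at once
pv_two <- present_value(c(1000, 4000), interest = 5, year = c(1, 3))
```

Because the arithmetic is vectorized, the same helper could be applied to the cash_flow and year columns of a data frame in one call.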

3.9 Present value of projected cash flows (2)

Calculate how much company A and company B individually contribute to the total present value.

# Total present value of cash
total_pv <- sum(cash$present_value)
total_pv
## [1] 12604.71

# Company B information
cash_B <- subset(cash, company == "B")
cash_B
##   company cash_flow year quarter_cash double_year present_value
## 4       B      1500    1        375.0           2     1428.5714
## 5       B      1100    2        275.0           4      997.7324
## 6       B       750    4        187.5           8      617.0269
## 7       B      6000    5       1500.0          10     4701.1570

# Total present value of cash_B
total_pv_B <- sum(cash_B$present_value)
total_pv_B
## [1] 7744.488
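
The same steps can be repeated for company A, and dividing each company's total by the overall total gives its share of the present value. The sketch below rebuilds the data so it runs on its own; the names cash_A, total_pv_A, share_A, and share_B are illustrative.

```r
company <- c("A", "A", "A", "B", "B", "B", "B")
cash_flow <- c(1000, 4000, 550, 1500, 1100, 750, 6000)
year <- c(1, 3, 4, 1, 2, 4, 5)
cash <- data.frame(company, cash_flow, year)
cash$present_value <- cash$cash_flow * (1.05)^(-cash$year)

total_pv <- sum(cash$present_value)

# Total present value for company A
cash_A <- subset(cash, company == "A")
total_pv_A <- sum(cash_A$present_value)

# Total present value for company B
total_pv_B <- sum(subset(cash, company == "B")$present_value)

# Share of the total contributed by each company (the shares sum to 1)
share_A <- total_pv_A / total_pv
share_B <- total_pv_B / total_pv
```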

Chapter 4: Factors

4.1 Create a factor

Use the factor() function to create factors out of categorical variables.

# credit_rating character vector
credit_rating <- c("BB", "AAA", "AA", "CCC", "AA", "AAA", "B", "BB")

# Create a factor from credit_rating
credit_factor <- factor(credit_rating)

# Print out your new factor
credit_factor
## [1] BB  AAA AA  CCC AA  AAA B   BB 
## Levels: AA AAA B BB CCC

# Call str() on credit_rating
str(credit_rating)
##  chr [1:8] "BB" "AAA" "AA" "CCC" "AA" "AAA" "B" "BB"

# Call str() on credit_factor
str(credit_factor)
##  Factor w/ 5 levels "AA","AAA","B",..: 4 2 1 5 1 2 3 4

4.2 Factor levels

Accessing unique levels of your factor by using the levels() function.

# Identify unique levels
levels(credit_factor)
## [1] "AA"  "AAA" "B"   "BB"  "CCC"

# Rename the levels of credit_factor
levels(credit_factor) <- c("2A", "3A", "1B", "2B", "3C")

# Print credit_factor
credit_factor
## [1] 2B 3A 2A 3C 2A 3A 1B 2B
## Levels: 2A 3A 1B 2B 3C

4.3 Factor Summary

To present a table of counts of each bond credit rating, use the summary() function.

# Summarize the character vector, credit_rating
summary(credit_rating)
##    Length     Class      Mode 
##         8 character character

# Summarize the factor, credit_factor
summary(credit_factor)
## 2A 3A 1B 2B 3C 
##  2  2  1  2  1
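
summary() only counts values when it is given a factor. If you have a plain character vector and want counts without converting it first, base R's table() is one alternative. A brief sketch, using the original credit_rating values (the name rating_counts is illustrative):

```r
credit_rating <- c("BB", "AAA", "AA", "CCC", "AA", "AAA", "B", "BB")

# table() tallies each unique value, even for a plain character vector
rating_counts <- table(credit_rating)
rating_counts

# The counts agree with summary() on the factor version
factor_counts <- summary(factor(credit_rating))
```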

4.4 Visualize your Factor

Visualize a table of counts (such as the data from the last example) by using the plot() function.

# Visualize your factor!
plot(credit_factor)

4.5 Bucketing a numeric variable into a factor

Create a factor from a numeric vector with the cut() function (for example, bucketing a list of bonds by rank).

# Define AAA_rank
AAA_rank <- c(31, 48, 100, 53, 85, 73, 62, 74, 42, 38, 97, 61, 48, 86, 44, 9, 43,
              18, 62, 38, 23, 37, 54, 80, 78, 93, 47, 100, 22, 22, 18, 26, 81, 17,
              98, 4, 83, 5, 6, 52, 29, 44, 50, 2, 25, 19, 15, 42, 30, 27)

# Create 4 buckets for AAA_rank using cut()
AAA_factor <- cut(x = AAA_rank, breaks = c(0, 25, 50, 75, 100))

# Rename the levels 
levels(AAA_factor) <- c("low", "medium", "high", "very_high")

# Print AAA_factor
AAA_factor
##  [1] medium    medium    very_high high      very_high high      high     
##  [8] high      medium    medium    very_high high      medium    very_high
## [15] medium    low       medium    low       high      medium    low      
## [22] medium    high      very_high very_high very_high medium    very_high
## [29] low       low       low       medium    very_high low       very_high
## [36] low       very_high low       low       high      medium    medium   
## [43] medium    low       low       low       low       medium    medium   
## [50] medium   
## Levels: low medium high very_high

# Plot AAA_factor
plot(AAA_factor)

4.6 Create an ordered factor

Ordering a factor controls the order of its levels, including on a plot. There are two options. When creating a factor, specify ordered = TRUE and list the unique levels in order from least to greatest. For an existing unordered factor, use the ordered() function.

# Use unique() to find unique words
unique(credit_rating)
## [1] "BB"  "AAA" "AA"  "CCC" "B"

# Create an ordered factor
credit_factor_ordered <- factor(credit_rating, ordered = TRUE, levels = c("AAA", "AA", "BB", "B", "CCC"))

# Plot credit_factor_ordered
plot(credit_factor_ordered)
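
The second option, converting an existing unordered factor with ordered(), can be sketched as follows. The level order is the same one used with factor() in this section; the data is rebuilt here so the example runs on its own.

```r
credit_rating <- c("BB", "AAA", "AA", "CCC", "AA", "AAA", "B", "BB")

# An existing unordered factor
credit_factor <- factor(credit_rating)

# Convert it, listing the levels in the desired order
credit_factor_ordered <- ordered(credit_factor,
                                 levels = c("AAA", "AA", "BB", "B", "CCC"))

is.ordered(credit_factor_ordered)
## [1] TRUE

# Ordered factors support comparisons that follow the level order:
# element 2 is AAA and element 1 is BB, and AAA comes before BB
credit_factor_ordered[2] < credit_factor_ordered[1]
## [1] TRUE
```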

4.7 Subsetting a factor

To remove a factor level from your analysis entirely, tell R to drop it: add drop = TRUE when subsetting. Without it, the level is kept even when no elements use it.

# Remove the bonds at positions 3 and 7. Don't drop any levels.
keep_level <- credit_factor[-c(3,7)]

# Plot keep_level
plot(keep_level)

# Remove the bonds at positions 3 and 7. Drop any level left empty.
drop_level <- credit_factor[-c(3,7), drop = TRUE]

# Plot drop_level
plot(drop_level)
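
The effect of drop = TRUE is easiest to see by comparing levels() before and after. The sketch below uses the original rating labels in a factor named ratings (an illustrative name); position 7 holds the only "B" bond, so removing it leaves that level empty.

```r
ratings <- factor(c("BB", "AAA", "AA", "CCC", "AA", "AAA", "B", "BB"))

# Without drop = TRUE, the now-empty "B" level is kept
keep_level <- ratings[-c(3, 7)]
levels(keep_level)
## [1] "AA"  "AAA" "B"   "BB"  "CCC"

# With drop = TRUE, the empty level is removed
drop_level <- ratings[-c(3, 7), drop = TRUE]
levels(drop_level)
## [1] "AA"  "AAA" "BB"  "CCC"

# summary() reports the empty level with a count of 0
summary(keep_level)
```

On a plot, the kept level appears as an empty bar, while the dropped level does not appear at all.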

4.8 stringsAsFactors

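By default, R versions before 4.0.0 convert character columns to factors when creating a data frame. To stop this, pass stringsAsFactors = FALSE (this has been the default since R 4.0.0):

cash <- data.frame(company, cash_flow, year, stringsAsFactors = FALSE)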

# Variables
credit_rating <- c("AAA", "A", "BB")
bond_owners <- c("Dan", "Tom", "Joe")

# Create the data frame of character vectors, bonds
bonds <- data.frame(credit_rating, bond_owners, stringsAsFactors = FALSE)

# Use str() on bonds
str(bonds)
## 'data.frame':    3 obs. of  2 variables:
##  $ credit_rating: chr  "AAA" "A" "BB"
##  $ bond_owners  : chr  "Dan" "Tom" "Joe"

# Create a factor column in bonds called credit_factor from credit_rating
bonds$credit_factor <- factor(bonds$credit_rating, ordered = TRUE, levels = c("AAA","A","BB"))

# Use str() on bonds again
str(bonds)
## 'data.frame':    3 obs. of  3 variables:
##  $ credit_rating: chr  "AAA" "A" "BB"
##  $ bond_owners  : chr  "Dan" "Tom" "Joe"
##  $ credit_factor: Ord.factor w/ 3 levels "AAA"<"A"<"BB": 1 2 3

Chapter 5: Lists

5.1 Create a list

A list can store any type of data, and its elements do not all have to be of the same type.

# List components
name <- "Apple and IBM"
apple <- c(109.49, 109.90, 109.11, 109.95, 111.03)
ibm <- c(159.82, 160.02, 159.84, 160.35, 164.79)
cor_matrix <- cor(cbind(apple, ibm))

# Create a list
portfolio <- list(name, apple, ibm, cor_matrix)

# View your first list
portfolio
## [[1]]
## [1] "Apple and IBM"
## 
## [[2]]
## [1] 109.49 109.90 109.11 109.95 111.03
## 
## [[3]]
## [1] 159.82 160.02 159.84 160.35 164.79
## 
## [[4]]
##           apple       ibm
## apple 1.0000000 0.9131575
## ibm   0.9131575 1.0000000

5.2 Named Lists

You can name the elements as you create the list, using the form name = value:

my_list <- list(my_words = words, my_numbers = numbers)

Or, if the list was already created, you can use names():

my_list <- list(words, numbers)
names(my_list) <- c("my_words", "my_numbers")

# Add names to your portfolio
names(portfolio) <- c("portfolio_name", "apple", "ibm", "correlation")

# Print portfolio
portfolio
## $portfolio_name
## [1] "Apple and IBM"
## 
## $apple
## [1] 109.49 109.90 109.11 109.95 111.03
## 
## $ibm
## [1] 159.82 160.02 159.84 160.35 164.79
## 
## $correlation
##           apple       ibm
## apple 1.0000000 0.9131575
## ibm   0.9131575 1.0000000

5.3 Access elements in a list

To access elements in a list, use [ ]. This always returns another list. To pull out the data inside a single element, use [[ ]], or $ followed by the element’s name if the list is named.

# Second and third elements of portfolio
portfolio[c(2,3)]
## $apple
## [1] 109.49 109.90 109.11 109.95 111.03
## 
## $ibm
## [1] 159.82 160.02 159.84 160.35 164.79

# Use $ to get the correlation data
portfolio$correlation
##           apple       ibm
## apple 1.0000000 0.9131575
## ibm   0.9131575 1.0000000
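
The difference between [ ] and [[ ]] shows up as soon as you check the class of the result. A minimal sketch with a two-element version of portfolio:

```r
portfolio <- list(portfolio_name = "Apple and IBM",
                  apple = c(109.49, 109.90, 109.11, 109.95, 111.03))

# Single brackets return a list of length 1
class(portfolio["apple"])
## [1] "list"

# Double brackets (and $) return the element itself
class(portfolio[["apple"]])
## [1] "numeric"

# So numeric functions work on the [[ ]] result
mean(portfolio[["apple"]])
## [1] 109.896
```

Calling mean() on portfolio["apple"] instead would not work as intended, because the argument would still be a list.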

5.4 Adding to a list

Say you want to add your friend Dan’s favorite movie to your list. You can do so using $ like you did when adding new columns to data frames.

# Add weight: 20% Apple, 80% IBM
portfolio$weight <- c(apple = .2, ibm = .8)

# Print portfolio
portfolio
## $portfolio_name
## [1] "Apple and IBM"
## 
## $apple
## [1] 109.49 109.90 109.11 109.95 111.03
## 
## $ibm
## [1] 159.82 160.02 159.84 160.35 164.79
## 
## $correlation
##           apple       ibm
## apple 1.0000000 0.9131575
## ibm   0.9131575 1.0000000
## 
## $weight
## apple   ibm 
##   0.2   0.8

# Change the weight variable: 30% Apple, 70% IBM
portfolio$weight <- c(apple = .3, ibm = .7)

# Print portfolio to see the changes
portfolio
## $portfolio_name
## [1] "Apple and IBM"
## 
## $apple
## [1] 109.49 109.90 109.11 109.95 111.03
## 
## $ibm
## [1] 159.82 160.02 159.84 160.35 164.79
## 
## $correlation
##           apple       ibm
## apple 1.0000000 0.9131575
## ibm   0.9131575 1.0000000
## 
## $weight
## apple   ibm 
##   0.3   0.7

5.5 Removing from a list

To remove an element, assign NULL to it. For example, to remove dans_movie:

my_list$dans_movie <- NULL

If your list is not named, you can also remove elements by position using my_list[1] <- NULL or my_list[[1]] <- NULL.

# Take a look at portfolio
portfolio
## $portfolio_name
## [1] "Apple and IBM"
## 
## $apple
## [1] 109.49 109.90 109.11 109.95 111.03
## 
## $ibm
## [1] 159.82 160.02 159.84 160.35 164.79
## 
## $correlation
##           apple       ibm
## apple 1.0000000 0.9131575
## ibm   0.9131575 1.0000000
## 
## $weight
## apple   ibm 
##   0.3   0.7

# Remove the microsoft stock prices from your portfolio
# (portfolio has no microsoft element here, so this leaves it unchanged)
portfolio$microsoft <- NULL

5.6 Split it

split() divides a data frame into separate data frames, one per group. Create a grouping to split on, and use split() to create a list of data frames. unsplit() reverses the operation.

# Define grouping from year
grouping <- cash$year

# Split cash on your new grouping
split_cash <- split(cash, grouping)

# Look at your split_cash list
split_cash
## $`1`
##   company cash_flow year quarter_cash double_year present_value
## 1       A      1000    1          250           2       952.381
## 4       B      1500    1          375           2      1428.571
## 
## $`2`
##   company cash_flow year quarter_cash double_year present_value
## 5       B      1100    2          275           4      997.7324
## 
## $`3`
##   company cash_flow year quarter_cash double_year present_value
## 2       A      4000    3         1000           6       3455.35
## 
## $`4`
##   company cash_flow year quarter_cash double_year present_value
## 3       A       550    4        137.5           8      452.4864
## 6       B       750    4        187.5           8      617.0269
## 
## $`5`
##   company cash_flow year quarter_cash double_year present_value
## 7       B      6000    5         1500          10      4701.157

# Unsplit split_cash to get the original data back.
original_cash <- unsplit(split_cash, grouping)

# Print original_cash
original_cash
##   company cash_flow year quarter_cash double_year present_value
## 1       A      1000    1        250.0           2      952.3810
## 2       A      4000    3       1000.0           6     3455.3504
## 3       A       550    4        137.5           8      452.4864
## 4       B      1500    1        375.0           2     1428.5714
## 5       B      1100    2        275.0           4      997.7324
## 6       B       750    4        187.5           8      617.0269
## 7       B      6000    5       1500.0          10     4701.1570

5.7 Split-Apply-Combine

Split-apply-combine means to split your data frame by a grouping, apply some transformation to each group, and then recombine the pieces into one data frame.

# Print split_cash
split_cash
## $`1`
##   company cash_flow year quarter_cash double_year present_value
## 1       A      1000    1          250           2       952.381
## 4       B      1500    1          375           2      1428.571
## 
## $`2`
##   company cash_flow year quarter_cash double_year present_value
## 5       B      1100    2          275           4      997.7324
## 
## $`3`
##   company cash_flow year quarter_cash double_year present_value
## 2       A      4000    3         1000           6       3455.35
## 
## $`4`
##   company cash_flow year quarter_cash double_year present_value
## 3       A       550    4        137.5           8      452.4864
## 6       B       750    4        187.5           8      617.0269
## 
## $`5`
##   company cash_flow year quarter_cash double_year present_value
## 7       B      6000    5         1500          10      4701.157

# Print the cash_flow column of B in split_cash
split_cash$B$cash_flow
## NULL

# Set the cash_flow column of company A in split_cash to 0
split_cash$A$cash_flow <- 0

# Use the grouping to unsplit split_cash
cash_no_A <- unsplit(split_cash, grouping)

# Print cash_no_A
cash_no_A
##   company cash_flow year quarter_cash double_year present_value
## 1       A      1000    1        250.0           2      952.3810
## 2       A      4000    3       1000.0           6     3455.3504
## 3       A       550    4        137.5           8      452.4864
## 4       B      1500    1        375.0           2     1428.5714
## 5       B      1100    2        275.0           4      997.7324
## 6       B       750    4        187.5           8      617.0269
## 7       B      6000    5       1500.0          10     4701.1570
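
The output above is unchanged because the split was made by year. As a minimal sketch of the same split-apply-combine steps with a company grouping, the block below rebuilds the first three columns of the data frame from the printed values. The names company_grouping and split_by_company are made up for this illustration and are not part of the original exercise.

```r
# Rebuild the first three columns of the data frame printed above
cash <- data.frame(
  company   = c("A", "A", "A", "B", "B", "B", "B"),
  cash_flow = c(1000, 4000, 550, 1500, 1100, 750, 6000),
  year      = c(1, 3, 4, 1, 2, 4, 5)
)

# Split: one data frame per company (assumption: group by company, not year)
company_grouping <- cash$company
split_by_company <- split(cash, company_grouping)

# Apply: the list elements are now named "A" and "B"
split_by_company$B$cash_flow
## [1] 1500 1100  750 6000
split_by_company$A$cash_flow <- 0

# Combine: unsplit() puts the rows back in their original order
cash_no_A <- unsplit(split_by_company, company_grouping)
cash_no_A
```

With this grouping, the three rows of company A should come back with a cash_flow of 0, while the rows of company B keep their values.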

5.8 Attributes

Common attributes include row and column names, dimensions, and class. Use the attributes() function to return a list of the attributes of the object you pass in. To access a specific attribute, use the attr() function.

# my_matrix and my_factor
my_matrix <- matrix(c(1,2,3,4,5,6), nrow = 2, ncol = 3)
rownames(my_matrix) <- c("Row1", "Row2")
colnames(my_matrix) <- c("Col1", "Col2", "Col3")

my_factor <- factor(c("A", "A", "B"), ordered = TRUE, levels = c("A", "B"))

# attributes of my_matrix
attributes(my_matrix)
## $dim
## [1] 2 3
## 
## $dimnames
## $dimnames[[1]]
## [1] "Row1" "Row2"
## 
## $dimnames[[2]]
## [1] "Col1" "Col2" "Col3"

# Just the dim attribute of my_matrix
attr(my_matrix, which = "dim")
## [1] 2 3

# attributes of my_factor
attributes(my_factor)
## $levels
## [1] "A" "B"
## 
## $class
## [1] "ordered" "factor"
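
attr() can also be used on the left side of an assignment to set an attribute. The sketch below attaches a custom attribute to a matrix; the attribute name "currency" and its value are invented purely for illustration.

```r
my_matrix <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3)

# Read a single attribute
attr(my_matrix, which = "dim")

# Attach a custom attribute ("currency" is a made-up name for this example)
attr(my_matrix, which = "currency") <- "USD"

# The new attribute is now listed alongside dim
names(attributes(my_matrix))
```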

Quiz 1

Question 1: Matrices are basically vectors with multiple dimensions.
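
One way to see this, as a small sketch: giving a plain vector a dim attribute turns it into a matrix. The variable name prices and its values are hypothetical.

```r
# A plain numeric vector (values are made up)
prices <- c(10, 20, 30, 40, 50, 60)
is.matrix(prices)

# Setting the dim attribute reshapes it into 2 rows and 3 columns
dim(prices) <- c(2, 3)
is.matrix(prices)

# Same result as building the matrix directly
same_as_matrix <- matrix(c(10, 20, 30, 40, 50, 60), nrow = 2, ncol = 3)
all.equal(prices, same_as_matrix)
```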

Question 2: A data frame has the same structure as a matrix, except that a data frame can hold a different data type in each column.
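
A short sketch of the difference, using made-up values: a matrix coerces mixed input to a single type, while a data frame keeps one type per column. The names as_matrix and as_frame are only for this illustration.

```r
# A matrix holds one type: mixing text and numbers coerces everything to character
as_matrix <- cbind(School = c("A", "B"), Points = c(14, 15))
class(as_matrix[, "Points"])

# A data frame keeps a separate type for each column
as_frame <- data.frame(School = c("A", "B"), Points = c(14, 15))
class(as_frame$Points)
```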

Question 3:

fake_stock <- c(75.43, 76.57, 77.11, 76.88, 76.98, 77.31, 78.42, 78.56, 79.13, 77.45, 77.87, 77.95, 78.76, 78.78, 79.40, 79.07, 78.65, 77.09, 76.76, 76.13, 75.48)

plot(fake_stock)

my_vector <- c(45, 57, 61, 63, 68, 75, 89, 93, 98)

my_matrix <- matrix(data = my_vector, nrow = 3, ncol = 3)

my_matrix
##      [,1] [,2] [,3]
## [1,]   45   63   89
## [2,]   57   68   93
## [3,]   61   75   98

School <- c("A", "A", "A", "A", "B", "B", "B", "B")
Points <- c(14, 16, 18, 10, 15, 10, 20, 11)
Quarter <- c(1, 2, 3, 4, 1, 2, 3, 4)

game <- data.frame(School, Points, Quarter)

game
##   School Points Quarter
## 1      A     14       1
## 2      A     16       2
## 3      A     18       3
## 4      A     10       4
## 5      B     15       1
## 6      B     10       2
## 7      B     20       3
## 8      B     11       4

fake_stock <- c(75.43, 76.57, 77.11, 76.88, 76.98, 77.31, 78.42, 78.56)
credit_factor <- factor(fake_stock)
credit_factor
## [1] 75.43 76.57 77.11 76.88 76.98 77.31 78.42 78.56
## Levels: 75.43 76.57 76.88 76.98 77.11 77.31 78.42 78.56

name <- "Utah and Texas"

Utah <- c(45, 55, 48, 50, 52)

Texas <- c(48, 58, 40, 60, 50)

cor_matrix <- cor(cbind(Utah, Texas))

game <- list(name, Utah, Texas, cor_matrix)

game
## [[1]]
## [1] "Utah and Texas"
## 
## [[2]]
## [1] 45 55 48 50 52
## 
## [[3]]
## [1] 48 58 40 60 50
## 
## [[4]]
##            Utah     Texas
## Utah  1.0000000 0.5691546
## Texas 0.5691546 1.0000000

Introduction to R for Finance

Gabe Nichols

6/6/17
