Type R code to solve the equations. To add 3 and 4, type 3 + 4 at the prompt and press Enter. The result is returned as [1] 7. The exercise demonstrated 3 + 5 and asked you to perform 6 - 4.
# Addition!
3 + 5
## [1] 8
# Subtraction!
6 - 4
## [1] 2
This exercise showed how to perform basic arithmetic. A helpful hint: clicking on a line of code in the script and then pressing Command + Enter executes just that line in the R Console.
# Addition
2 + 2
## [1] 4
# Subtraction
4 - 1
## [1] 3
# Multiplication
3 * 4
## [1] 12
# Division
4 / 2
## [1] 2
# Exponentiation
2 ^ 4
## [1] 16
# Modulo
7 %% 3
## [1] 1
This section reviewed the order-of-operations rule, PEMDAS: parentheses, exponents, multiplication and division, then addition and subtraction.
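As a minimal sketch of how the order of operations plays out in R, the lines below compare the same numbers with and without parentheses. The variable names are illustrative and not part of the course exercises.

```r
# Multiplication is evaluated before addition
no_parens <- 2 + 3 * 4
no_parens
## [1] 14
# Parentheses are evaluated first
with_parens <- (2 + 3) * 4
with_parens
## [1] 20
# Exponentiation is evaluated before multiplication
power_first <- 2 * 3 ^ 2
power_first
## [1] 18
```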
A variable allows you to store a value or an object in R. You can then later use this variable’s name to easily access the value or the object that is stored within this variable. You use <- to assign a variable:
# Assign 200 to savings
savings <- 200
# Print the value of savings to the console
savings
## [1] 200
You can assign values to variables and then use those variables in arithmetic.
# Assign 100 to my_money
my_money <- 100
# Assign 200 to dans_money
dans_money <- 200
# Add my_money and dans_money
my_money + dans_money
## [1] 300
# Add my_money and dans_money again, save the result to our_money
our_money <- my_money + dans_money
You can use multipliers to calculate financial returns. Multiplier = 1 + (return/100).
# Variables for starting_cash and 5% return during January
starting_cash <- 200
jan_ret <- 5
jan_mult <- 1 + (jan_ret / 100)
# How much money do you have at the end of January?
post_jan_cash <- starting_cash * jan_mult
# Print post_jan_cash
post_jan_cash
## [1] 210
# January 10% return multiplier
jan_ret_10 <- 10
jan_mult_10 <- 1 + (jan_ret_10 / 100)
# How much money do you have at the end of January now?
post_jan_cash_10 <- starting_cash * jan_mult_10
# Print post_jan_cash_10
post_jan_cash_10
## [1] 220
You find the total return over two or more months by multiplying the multipliers together.
# Starting cash and returns
starting_cash <- 200
jan_ret <- 4
feb_ret <- 5
# Multipliers
jan_mult <- 1 + (jan_ret / 100)
feb_mult <- 1 + (feb_ret / 100)
# Total cash at the end of the two months
total_cash <- starting_cash * jan_mult * feb_mult
# Print total_cash
total_cash
## [1] 218.4
Three basic data types are covered here: numeric data, which are numbers with decimals (a whole number is stored as an integer only if you add L after it, as in 5L); logical data, which are the values TRUE and FALSE and must be capitalized; and character data, which is text and must be entered in quotation marks.
# Apple's stock price is a numeric
apple_stock <- 150.45
# Bond credit ratings are characters
credit_rating <- "AAA"
# You like the stock market. TRUE or FALSE?
my_answer <- TRUE
# Print my_answer
my_answer
## [1] TRUE
You can determine what data type a variable is by entering class(my_var). This will return the data type (or class) of whatever variable you pass in.
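A short sketch of class() on each of the data types described above. The values mirror the earlier examples, and shares is a made-up variable added here only to show an integer.

```r
# Illustrative values; shares is a hypothetical variable
apple_stock <- 150.45
credit_rating <- "AAA"
my_answer <- TRUE
shares <- 10L
# Ask R for the class of each variable
class(apple_stock)
## [1] "numeric"
class(credit_rating)
## [1] "character"
class(my_answer)
## [1] "logical"
class(shares)
## [1] "integer"
```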
A vector is created using the combine function, c(). Each element you add is separated by a comma.
# Another numeric vector
ibm_stock <- c(159.82, 160.02, 159.84)
# Another character vector
finance <- c("stocks", "bonds", "investments")
# A logical vector
logic <- c(TRUE, FALSE, TRUE)
A vector can hold only one data type. If you combine more than one data type in a vector, the lower-ranking type is coerced into the higher-ranking type. The hierarchy for coercion is: logical < integer < numeric < character
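The coercion hierarchy can be checked with a few mixed vectors. This is only a sketch, and the variable names are hypothetical.

```r
# Logical and numeric: TRUE is coerced to the number 1
mixed_num <- c(1.5, TRUE)
mixed_num
## [1] 1.5 1.0
# Numeric and character: the number is coerced to text
mixed_chr <- c(1.5, "bond")
class(mixed_chr)
## [1] "character"
# Logical, integer and character: everything becomes character
mixed_all <- c(TRUE, 2L, "AAA")
class(mixed_all)
## [1] "character"
```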
You can add names to each value in your vector. You do this using names()
# Vectors of 12 months of returns, and month names
ret <- c(5, 2, 3, 7, 8, 3, 5, 9, 1, 4, 6, 3)
months <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
# Add names to ret
names(ret) <- months
# Print out ret to see the new names!
ret
## Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
## 5 2 3 7 8 3 5 9 1 4 6 3
You can graph your data with the plot() function. Passing in a vector puts its values on the y-axis; the x-axis is an index built from the order of the elements in the vector. Inside plot(), you can change the graph type with the type argument. The default is "p" for points, but you can also use "l" for a line.
# Define apple_stock
apple_stock <- c(109.49, 109.90, 109.11, 109.95, 111.03, 112.12, 113.95, 113.30, 115.19, 115.19, 115.82, 115.97, 116.64, 116.95, 117.06, 116.29, 116.52, 117.26, 116.76, 116.73, 115.82)
# Look at the data
apple_stock
## [1] 109.49 109.90 109.11 109.95 111.03 112.12 113.95 113.30 115.19 115.19
## [11] 115.82 115.97 116.64 116.95 117.06 116.29 116.52 117.26 116.76 116.73
## [21] 115.82
# Plot the data points
plot(apple_stock)
# Plot the data as a line graph
plot(apple_stock, type = "l")
## 2.5 Weighted Average
A weighted average lets you calculate your portfolio return over a time period. To calculate it, multiply the return of each stock in your portfolio by the weight of that stock, then sum the results.
# Weights and returns
micr_ret <- 7
sony_ret <- 9
micr_weight <- .2
sony_weight <- .8
# Portfolio return
portf_ret <- micr_ret * micr_weight + sony_ret * sony_weight
# Weights, returns, and company names
ret <- c(7, 9)
weight <- c(.2, .8)
companies <- c("Microsoft", "Sony")
# Assign company names to your vectors
names(ret) <- c("Microsoft", "Sony")
names(weight) <- c("Microsoft", "Sony")
# Multiply the returns and weights together
ret_X_weight <- ret * weight
# Print ret_X_weight
ret_X_weight
## Microsoft Sony
## 1.4 7.2
# Sum to get the total portfolio return
portf_ret <- sum(ret_X_weight)
# Print portf_ret
portf_ret
## [1] 8.6
Matrices are similar to vectors, except they are in 2 dimensions. The actual data for the matrix is passed in as a vector using c(), and is then converted to a matrix by specifying the number of rows and columns (also known as the dimensions).
# A vector of 9 numbers
my_vector <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
# 3x3 matrix
my_matrix <- matrix(data = my_vector, nrow = 3, ncol = 3)
# Print my_matrix
my_matrix
## [,1] [,2] [,3]
## [1,] 1 4 7
## [2,] 2 5 8
## [3,] 3 6 9
# Filling across using byrow = TRUE
matrix(data = c(2, 3, 4, 5), nrow = 2, ncol = 2, byrow = TRUE)
## [,1] [,2]
## [1,] 2 3
## [2,] 4 5
You can create matrices by combining multiple vectors by using the functions cbind() and rbind().
apple <- c(109.49, 109.90, 109.11, 109.95, 111.03, 112.12, 113.95, 113.30, 115.19, 115.19, 115.82, 115.97, 116.64, 116.95, 117.06, 116.29, 116.52, 117.26, 116.76, 116.73, 115.82)
ibm <- c(159.82, 160.02, 159.84, 160.35, 164.79, 165.36, 166.52, 165.50, 168.29, 168.51, 168.02, 166.73, 166.68, 167.60, 167.33, 167.06, 166.71, 167.14, 166.19, 166.60, 165.99)
micr <- c(59.20, 59.25, 60.22, 59.95, 61.37, 61.01, 61.97, 62.17, 62.98, 62.68, 62.58, 62.30, 63.62, 63.54, 63.54, 63.55, 63.24, 63.28, 62.99, 62.90, 62.14)
# cbind the vectors together
cbind_stocks <- cbind(apple, ibm, micr)
# Print cbind_stocks
cbind_stocks
## apple ibm micr
## [1,] 109.49 159.82 59.20
## [2,] 109.90 160.02 59.25
## [3,] 109.11 159.84 60.22
## [4,] 109.95 160.35 59.95
## [5,] 111.03 164.79 61.37
## [6,] 112.12 165.36 61.01
## [7,] 113.95 166.52 61.97
## [8,] 113.30 165.50 62.17
## [9,] 115.19 168.29 62.98
## [10,] 115.19 168.51 62.68
## [11,] 115.82 168.02 62.58
## [12,] 115.97 166.73 62.30
## [13,] 116.64 166.68 63.62
## [14,] 116.95 167.60 63.54
## [15,] 117.06 167.33 63.54
## [16,] 116.29 167.06 63.55
## [17,] 116.52 166.71 63.24
## [18,] 117.26 167.14 63.28
## [19,] 116.76 166.19 62.99
## [20,] 116.73 166.60 62.90
## [21,] 115.82 165.99 62.14
# rbind the vectors together
rbind_stocks <- rbind(apple, ibm, micr)
# Print rbind_stocks
rbind_stocks
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
## apple 109.49 109.90 109.11 109.95 111.03 112.12 113.95 113.30 115.19
## ibm 159.82 160.02 159.84 160.35 164.79 165.36 166.52 165.50 168.29
## micr 59.20 59.25 60.22 59.95 61.37 61.01 61.97 62.17 62.98
## [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18]
## apple 115.19 115.82 115.97 116.64 116.95 117.06 116.29 116.52 117.26
## ibm 168.51 168.02 166.73 166.68 167.60 167.33 167.06 166.71 167.14
## micr 62.68 62.58 62.30 63.62 63.54 63.54 63.55 63.24 63.28
## [,19] [,20] [,21]
## apple 116.76 116.73 115.82
## ibm 166.19 166.60 165.99
## micr 62.99 62.90 62.14
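You can plot matrices using plot(). For a two-column matrix, this draws a scatter plot with the first column on the x-axis and the second column on the y-axis.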
apple <- c(109.49, 109.90, 109.11, 109.95, 111.03, 112.12, 113.95, 113.30, 115.19, 115.19, 115.82, 115.97, 116.64, 116.95, 117.06, 116.29, 116.52, 117.26, 116.76, 116.73, 115.82)
micr <- c(59.20, 59.25, 60.22, 59.95, 61.37, 61.01, 61.97, 62.17, 62.98, 62.68, 62.58, 62.30, 63.62, 63.54, 63.54, 63.55, 63.24, 63.28, 62.99, 62.90, 62.14)
apple_micr_matrix <- cbind(apple, micr)
# View the data
apple_micr_matrix
## apple micr
## [1,] 109.49 59.20
## [2,] 109.90 59.25
## [3,] 109.11 60.22
## [4,] 109.95 59.95
## [5,] 111.03 61.37
## [6,] 112.12 61.01
## [7,] 113.95 61.97
## [8,] 113.30 62.17
## [9,] 115.19 62.98
## [10,] 115.19 62.68
## [11,] 115.82 62.58
## [12,] 115.97 62.30
## [13,] 116.64 63.62
## [14,] 116.95 63.54
## [15,] 117.06 63.54
## [16,] 116.29 63.55
## [17,] 116.52 63.24
## [18,] 117.26 63.28
## [19,] 116.76 62.99
## [20,] 116.73 62.90
## [21,] 115.82 62.14
# Scatter plot of Microsoft vs Apple
plot(apple_micr_matrix)
The cor() function will calculate the correlation between two vectors, or will create a correlation matrix when given a matrix.
apple <- c(109.49, 109.90, 109.11, 109.95, 111.03, 112.12, 113.95, 113.30, 115.19, 115.19, 115.82, 115.97, 116.64, 116.95, 117.06, 116.29, 116.52, 117.26, 116.76, 116.73, 115.82)
ibm <- c(159.82, 160.02, 159.84, 160.35, 164.79, 165.36, 166.52, 165.50, 168.29, 168.51, 168.02, 166.73, 166.68, 167.60, 167.33, 167.06, 166.71, 167.14, 166.19, 166.60, 165.99)
micr <- c(59.20, 59.25, 60.22, 59.95, 61.37, 61.01, 61.97, 62.17, 62.98, 62.68, 62.58, 62.30, 63.62, 63.54, 63.54, 63.55, 63.24, 63.28, 62.99, 62.90, 62.14)
# Correlation of Apple and IBM
cor(apple,ibm)
## [1] 0.8872467
# stock matrix
stocks <- cbind (apple, micr, ibm)
cor(stocks)
## apple micr ibm
## apple 1.0000000 0.9477010 0.8872467
## micr 0.9477010 1.0000000 0.9126597
## ibm 0.8872467 0.9126597 1.0000000
You can select elements from a matrix and subset it. The basic structure is: my_matrix[row, col]. Leaving row or col empty selects every row or every column.
apple <- c(109.49, 109.90, 109.11, 109.95, 111.03, 112.12, 113.95, 113.30, 115.19, 115.19, 115.82, 115.97, 116.64, 116.95, 117.06, 116.29, 116.52, 117.26, 116.76, 116.73, 115.82)
ibm <- c(159.82, 160.02, 159.84, 160.35, 164.79, 165.36, 166.52, 165.50, 168.29, 168.51, 168.02, 166.73, 166.68, 167.60, 167.33, 167.06, 166.71, 167.14, 166.19, 166.60, 165.99)
micr <- c(59.20, 59.25, 60.22, 59.95, 61.37, 61.01, 61.97, 62.17, 62.98, 62.68, 62.58, 62.30, 63.62, 63.54, 63.54, 63.55, 63.24, 63.28, 62.99, 62.90, 62.14)
# Third row
stocks[3, ]
## apple micr ibm
## 109.11 60.22 159.84
# Fourth and fifth row of the ibm column
stocks[4:5,"ibm"]
## [1] 160.35 164.79
# apple and micr columns
stocks[,c("apple", "micr")]
## apple micr
## [1,] 109.49 59.20
## [2,] 109.90 59.25
## [3,] 109.11 60.22
## [4,] 109.95 59.95
## [5,] 111.03 61.37
## [6,] 112.12 61.01
## [7,] 113.95 61.97
## [8,] 113.30 62.17
## [9,] 115.19 62.98
## [10,] 115.19 62.68
## [11,] 115.82 62.58
## [12,] 115.97 62.30
## [13,] 116.64 63.62
## [14,] 116.95 63.54
## [15,] 117.06 63.54
## [16,] 116.29 63.55
## [17,] 116.52 63.24
## [18,] 117.26 63.28
## [19,] 116.76 62.99
## [20,] 116.73 62.90
## [21,] 115.82 62.14
A data frame is a table. It is like a matrix but it can hold different types of data.
# Variables
company <- c("A", "A", "A", "B", "B", "B", "B")
cash_flow <- c(1000, 4000, 550, 1500, 1100, 750, 6000)
year <- c(1, 3, 4, 1, 2, 4, 5)
# Data frame
cash <- data.frame(company, cash_flow, year)
# Print cash
cash
## company cash_flow year
## 1 A 1000 1
## 2 A 4000 3
## 3 A 550 4
## 4 B 1500 1
## 5 B 1100 2
## 6 B 750 4
## 7 B 6000 5
You can create data frames with all types of data.
head() - Returns the first few rows of a data frame. By default, 6. To change this, use head(cash, n = ).
tail() - Returns the last few rows of a data frame. By default, 6. To change this, use tail(cash, n = ).
str() - Checks the structure of an object. This function shows you the data type of the object you pass in (here, data.frame) and lists each column variable along with its data type.
# Call head() for the first 4 rows
head (cash, n = 4)
## company cash_flow year
## 1 A 1000 1
## 2 A 4000 3
## 3 A 550 4
## 4 B 1500 1
# Call tail() for the last 3 rows
tail (cash, n = 3)
## company cash_flow year
## 5 B 1100 2
## 6 B 750 4
## 7 B 6000 5
# Call str() on cash
str(cash)
## 'data.frame': 7 obs. of 3 variables:
## $ company : Factor w/ 2 levels "A","B": 1 1 1 2 2 2 2
## $ cash_flow: num 1000 4000 550 1500 1100 750 6000
## $ year : num 1 3 4 1 2 4 5
You can name columns by using colnames(), and you can name rows by using rownames().
# Fix your column names
colnames(cash) <- c("company", "cash_flow", "year")
# Print cash to check the column names
cash
## company cash_flow year
## 1 A 1000 1
## 2 A 4000 3
## 3 A 550 4
## 4 B 1500 1
## 5 B 1100 2
## 6 B 750 4
## 7 B 6000 5
You can subset your data frame or access certain columns by using [ ].
# Third row, second column
cash[3,2]
## [1] 550
# Fifth row of the "year" column
cash[5, "year"]
## [1] 2
# Select the year column
cash$year
## [1] 1 3 4 1 2 4 5
The $ operator is a shortcut for selecting a specific column from a data frame.
# Select the cash_flow column and multiply by 2
cash$cash_flow * 2
## [1] 2000 8000 1100 3000 2200 1500 12000
# Delete the company column
cash$company <- NULL
# Print cash again
cash
## cash_flow year
## 1 1000 1
## 2 4000 3
## 3 550 4
## 4 1500 1
## 5 1100 2
## 6 750 4
## 7 6000 5
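The first argument you pass to subset() is the name of your data frame; the second is the condition that rows must satisfy. The == is the equality operator. It tests where two things are equal and returns a logical vector. Note that the company column was deleted from cash above, so company == "B" here is evaluated against the standalone company vector defined earlier, which has one entry per row of cash.
# Rows about company B
subset(cash, company == "B")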
## cash_flow year
## 4 1500 1
## 5 1100 2
## 6 750 4
## 7 6000 5
# Rows with cash flows due in 1 year
subset (cash, year == 1)
## cash_flow year
## 1 1000 1
## 4 1500 1
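To see the logical vector that == produces, and how it relates to subset(), here is a small sketch. The data frame flows and the variable due_in_1 are made up for illustration.

```r
# A small illustrative data frame (flows is a hypothetical name)
flows <- data.frame(cash_flow = c(1000, 4000, 1500), year = c(1, 3, 1))
# The comparison alone returns one TRUE or FALSE per row
due_in_1 <- flows$year == 1
due_in_1
## [1]  TRUE FALSE  TRUE
# Indexing the rows with that logical vector ...
flows[due_in_1, ]
# ... selects the same rows as subset()
subset(flows, year == 1)
```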
You can add new columns in your data frame by assigning the new information to data_frame$new_column.
# Quarter cash flow scenario
cash$quarter_cash <- cash$cash_flow * .25
cash
## cash_flow year quarter_cash
## 1 1000 1 250.0
## 2 4000 3 1000.0
## 3 550 4 137.5
## 4 1500 1 375.0
## 5 1100 2 275.0
## 6 750 4 187.5
## 7 6000 5 1500.0
# Double year scenario
cash$double_year <- cash$year * 2
cash
## cash_flow year quarter_cash double_year
## 1 1000 1 250.0 2
## 2 4000 3 1000.0 6
## 3 550 4 137.5 8
## 4 1500 1 375.0 2
## 5 1100 2 275.0 4
## 6 750 4 187.5 8
## 7 6000 5 1500.0 10
The general formula for calculating the present value is: present_value <- cash_flow * (1 + interest / 100) ^ -year
# Present value of $4000, in 3 years, at 5%
present_value_4k <- 4000 * (1 + 5 / 100) ^ -3
# Present value of all cash flows
cash$present_value <- cash$cash_flow * (1 + 5 / 100) ^ -cash$year
# Print out cash
cash
## cash_flow year quarter_cash double_year present_value
## 1 1000 1 250.0 2 952.3810
## 2 4000 3 1000.0 6 3455.3504
## 3 550 4 137.5 8 452.4864
## 4 1500 1 375.0 2 1428.5714
## 5 1100 2 275.0 4 997.7324
## 6 750 4 187.5 8 617.0269
## 7 6000 5 1500.0 10 4701.1570
You can use the sum() function to add up the elements of your present value calculations.
# Total present value of cash
total_pv <- sum(cash$present_value)
# Company B information
cash_B <- subset(cash, company == "B")
# Total present value of cash_B
total_pv_B <- sum(cash_B$present_value)
To create a factor in R, use the factor() function, and pass in a vector that you want to be converted into a factor.
# credit_rating character vector
credit_rating <- c("BB", "AAA", "AA", "CCC", "AA", "AAA", "B", "BB")
# Create a factor from credit_rating
credit_factor <- factor (credit_rating)
# Print out your new factor
credit_factor
## [1] BB AAA AA CCC AA AAA B BB
## Levels: AA AAA B BB CCC
# Call str() on credit_rating
str (credit_rating)
## chr [1:8] "BB" "AAA" "AA" "CCC" "AA" "AAA" "B" "BB"
# Call str() on credit_factor
str (credit_factor)
## Factor w/ 5 levels "AA","AAA","B",..: 4 2 1 5 1 2 3 4
You can access and rename your factor levels by using the levels() function.
# Identify unique levels
levels (credit_factor)
## [1] "AA" "AAA" "B" "BB" "CCC"
# Rename the levels of credit_factor
levels (credit_factor) <- c("2A", "3A", "1B", "2B", "3C")
# Print credit_factor
credit_factor
## [1] 2B 3A 2A 3C 2A 3A 1B 2B
## Levels: 2A 3A 1B 2B 3C
You can summarize factors using the summary() function. For a factor, it counts how many elements fall in each level.
# Summarize the character vector, credit_rating
summary (credit_rating)
## Length Class Mode
## 8 character character
# Summarize the factor, credit_factor
summary (credit_factor)
## 2A 3A 1B 2B 3C
## 2 2 1 2 1
You can use plot() to create a bar graph of your factor.
# Visualize your factor!
plot (credit_factor)
## 4.5 Bucketing a numeric variable into a factor
You can create a factor from a numeric vector by using cut(). The ( in the factor levels means we do not include the number beside it in that group, and the ] means that we do include that number in the group.
AAA_rank <- c(31, 48, 100, 53, 85, 73, 62, 74, 42, 38, 97, 61, 48, 86, 44, 9, 43, 18, 62, 38, 23, 37, 54, 80, 78, 93, 47, 100, 22, 22, 18, 26, 81, 17, 98, 4, 83, 5, 6, 52, 29, 44, 50, 2, 25, 19, 15, 42, 30, 27)
# Create 4 buckets for AAA_rank using cut()
AAA_factor <- cut(x = AAA_rank, breaks = c(0, 25, 50, 75, 100))
# Rename the levels
levels(AAA_factor) <- c("low", "medium", "high", "very_high")
# Print AAA_factor
AAA_factor
## [1] medium medium very_high high very_high high high
## [8] high medium medium very_high high medium very_high
## [15] medium low medium low high medium low
## [22] medium high very_high very_high very_high medium very_high
## [29] low low low medium very_high low very_high
## [36] low very_high low low high medium medium
## [43] medium low low low low medium medium
## [50] medium
## Levels: low medium high very_high
# Plot AAA_factor
plot(AAA_factor)
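Because the levels above are renamed right away, the ( and ] labels that cut() generates are never printed. A minimal sketch with made-up scores shows them, including which bucket a boundary value such as 25 falls into.

```r
# Hypothetical scores sitting on and next to the bucket boundaries
scores <- c(25, 26, 50)
buckets <- cut(x = scores, breaks = c(0, 25, 50))
# Default labels: ( excludes the left number, ] includes the right number
levels(buckets)
## [1] "(0,25]"  "(25,50]"
# 25 lands in (0,25], while 26 and 50 land in (25,50]
as.character(buckets)
## [1] "(0,25]"  "(25,50]" "(25,50]"
```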
## 4.6 Create an ordered factor
You can assign an order to your factor by adding ordered = TRUE when you create the factor and listing the levels in the order you want.
# Use unique() to find unique words
unique(credit_rating)
## [1] "BB" "AAA" "AA" "CCC" "B"
# Create an ordered factor
credit_factor_ordered <- factor(credit_rating, ordered = TRUE, levels = c("AAA", "AA", "BB", "B", "CCC"))
# Plot credit_factor_ordered
plot (credit_factor_ordered)
You can subset a factor using [ ]. Subsetting keeps every level, even one that no longer appears in the data; to drop unused levels, add drop = TRUE.
# Remove the bonds at positions 3 and 7. Keep the now-unused 1B level.
keep_level <- credit_factor[-c(3, 7)]
# Plot keep_level
plot(keep_level)
# Remove the bonds at positions 3 and 7. Drop the now-unused 1B level.
drop_level <- credit_factor[-c(3, 7), drop = TRUE]
# Plot drop_level
plot(drop_level)
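## 4.8 stringsAsFactors
In R versions before 4.0.0, the default behavior when creating data frames is to convert all character columns into factors (since R 4.0.0 the default is stringsAsFactors = FALSE). To turn off the conversion explicitly: cash <- data.frame(company, cash_flow, year, stringsAsFactors = FALSE)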
# Variables
credit_rating <- c("AAA", "A", "BB")
bond_owners <- c("Dan", "Tom", "Joe")
# Create the data frame of character vectors, bonds
bonds <- data.frame(credit_rating, bond_owners, stringsAsFactors = FALSE)
# Use str() on bonds
str(bonds)
## 'data.frame': 3 obs. of 2 variables:
## $ credit_rating: chr "AAA" "A" "BB"
## $ bond_owners : chr "Dan" "Tom" "Joe"
# Create a factor column in bonds called credit_factor from credit_rating
bonds$credit_factor <- factor(bonds$credit_rating, ordered = TRUE, levels = c("AAA","A","BB"))
# Use str() on bonds again
str(bonds)
## 'data.frame': 3 obs. of 3 variables:
## $ credit_rating: chr "AAA" "A" "BB"
## $ bond_owners : chr "Dan" "Tom" "Joe"
## $ credit_factor: Ord.factor w/ 3 levels "AAA"<"A"<"BB": 1 2 3
You can create a list in R to hold together items of different data types by using the list() function.
# List components
name <- "Apple and IBM"
apple <- c(109.49, 109.90, 109.11, 109.95, 111.03)
ibm <- c(159.82, 160.02, 159.84, 160.35, 164.79)
cor_matrix <- cor(cbind(apple, ibm))
# Create a list
portfolio <- list (name, apple, ibm, cor_matrix)
# View your first list
portfolio
## [[1]]
## [1] "Apple and IBM"
##
## [[2]]
## [1] 109.49 109.90 109.11 109.95 111.03
##
## [[3]]
## [1] 159.82 160.02 159.84 160.35 164.79
##
## [[4]]
## apple ibm
## apple 1.0000000 0.9131575
## ibm 0.9131575 1.0000000
You can name the elements as you create a list with the form name = value. If the list was already created, you can use names():
# Add names to your portfolio
names(portfolio) <- c("portfolio_name", "apple", "ibm", "correlation")
# Print portfolio
portfolio
## $portfolio_name
## [1] "Apple and IBM"
##
## $apple
## [1] 109.49 109.90 109.11 109.95 111.03
##
## $ibm
## [1] 159.82 160.02 159.84 160.35 164.79
##
## $correlation
## apple ibm
## apple 1.0000000 0.9131575
## ibm 0.9131575 1.0000000
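The name = value form mentioned above could look like the following sketch. It rebuilds the same portfolio in one step; named_portfolio is a hypothetical name chosen so the existing portfolio list is left untouched.

```r
# Same components as the portfolio above
apple <- c(109.49, 109.90, 109.11, 109.95, 111.03)
ibm <- c(159.82, 160.02, 159.84, 160.35, 164.79)
# Name each element while creating the list
named_portfolio <- list(
  portfolio_name = "Apple and IBM",
  apple = apple,
  ibm = ibm,
  correlation = cor(cbind(apple, ibm))
)
# Check the names
names(named_portfolio)
## [1] "portfolio_name" "apple"          "ibm"            "correlation"
```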
To access the elements in the list, use [ ]. This will always return another list. To pull out a single element itself rather than a list, use [[ ]] or $.
# Second and third elements of portfolio
portfolio[c(2, 3)]
## $apple
## [1] 109.49 109.90 109.11 109.95 111.03
##
## $ibm
## [1] 159.82 160.02 159.84 160.35 164.79
# Use $ to get the correlation data
portfolio$correlation
## apple ibm
## apple 1.0000000 0.9131575
## ibm 0.9131575 1.0000000
You can use $ to add new elements to a list such as my_list. You can also use c() to add elements to the list, which is useful if you want to add several elements at once.
# Add weight: 20% Apple, 80% IBM
portfolio$weight <- c(apple = .20, ibm = .80)
# Print portfolio
portfolio
## $portfolio_name
## [1] "Apple and IBM"
##
## $apple
## [1] 109.49 109.90 109.11 109.95 111.03
##
## $ibm
## [1] 159.82 160.02 159.84 160.35 164.79
##
## $correlation
## apple ibm
## apple 1.0000000 0.9131575
## ibm 0.9131575 1.0000000
##
## $weight
## apple ibm
## 0.2 0.8
# Change the weight variable: 30% Apple, 70% IBM
portfolio$weight <- c(apple = .30, ibm = .70)
# Print portfolio to see the changes
portfolio
## $portfolio_name
## [1] "Apple and IBM"
##
## $apple
## [1] 109.49 109.90 109.11 109.95 111.03
##
## $ibm
## [1] 159.82 160.02 159.84 160.35 164.79
##
## $correlation
## apple ibm
## apple 1.0000000 0.9131575
## ibm 0.9131575 1.0000000
##
## $weight
## apple ibm
## 0.3 0.7
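The c() approach is not shown above, so here is a minimal sketch under assumed names: my_list, currency and n_stocks are all hypothetical. Wrapping the new elements in list() keeps each one as a single element instead of flattening it.

```r
# A small named list to start from (hypothetical contents)
my_list <- list(name = "Apple and IBM", weight = c(apple = .30, ibm = .70))
# Add two elements at once with c()
my_list <- c(my_list, list(currency = "USD", n_stocks = 2))
# The list now has four named elements
names(my_list)
## [1] "name"     "weight"   "currency" "n_stocks"
```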
Assigning NULL to an element is the easiest way to remove it from your list. If your list is not named, you can also remove elements by position using my_list[1] <- NULL or my_list[[1]] <- NULL.
# Take a look at portfolio
portfolio
## $portfolio_name
## [1] "Apple and IBM"
##
## $apple
## [1] 109.49 109.90 109.11 109.95 111.03
##
## $ibm
## [1] 159.82 160.02 159.84 160.35 164.79
##
## $correlation
## apple ibm
## apple 1.0000000 0.9131575
## ibm 0.9131575 1.0000000
##
## $weight
## apple ibm
## 0.3 0.7
# Remove the microsoft stock prices from your portfolio (the portfolio above has no microsoft element, so this leaves the list unchanged)
portfolio$microsoft <- NULL
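Removal by position, as described above, can be sketched on a small unnamed list. The list my_list and its contents are made up for illustration.

```r
# An unnamed list with three elements (hypothetical contents)
my_list <- list("Apple and IBM", c(109.49, 109.90), c(159.82, 160.02))
# Remove the first element by position
my_list[[1]] <- NULL
# Two elements remain, and the old second element is now first
length(my_list)
## [1] 2
my_list[[1]]
## [1] 109.49 109.90
```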
You can use split() to divide a data frame into a list of data frames, one for each value of a grouping variable. You can reverse the split by using unsplit().
# Define grouping from year
grouping <- cash$year
# Split cash on your new grouping
split_cash <- split (cash, grouping)
# Look at your split_cash list
split_cash
## $`1`
## cash_flow year quarter_cash double_year present_value
## 1 1000 1 250 2 952.381
## 4 1500 1 375 2 1428.571
##
## $`2`
## cash_flow year quarter_cash double_year present_value
## 5 1100 2 275 4 997.7324
##
## $`3`
## cash_flow year quarter_cash double_year present_value
## 2 4000 3 1000 6 3455.35
##
## $`4`
## cash_flow year quarter_cash double_year present_value
## 3 550 4 137.5 8 452.4864
## 6 750 4 187.5 8 617.0269
##
## $`5`
## cash_flow year quarter_cash double_year present_value
## 7 6000 5 1500 10 4701.157
# Unsplit split_cash to get the original data back.
original_cash <- unsplit (split_cash, grouping)
# Print original_cash
original_cash
## cash_flow year quarter_cash double_year present_value
## 1 1000 1 250.0 2 952.3810
## 2 4000 3 1000.0 6 3455.3504
## 3 550 4 137.5 8 452.4864
## 4 1500 1 375.0 2 1428.5714
## 5 1100 2 275.0 4 997.7324
## 6 750 4 187.5 8 617.0269
## 7 6000 5 1500.0 10 4701.1570
You can split your data frame by a grouping, apply some transformation to each group, and then recombine those pieces back into one data frame. This is referred to in R as split-apply-combine.
# Print split_cash
split_cash
## $`1`
## cash_flow year quarter_cash double_year present_value
## 1 1000 1 250 2 952.381
## 4 1500 1 375 2 1428.571
##
## $`2`
## cash_flow year quarter_cash double_year present_value
## 5 1100 2 275 4 997.7324
##
## $`3`
## cash_flow year quarter_cash double_year present_value
## 2 4000 3 1000 6 3455.35
##
## $`4`
## cash_flow year quarter_cash double_year present_value
## 3 550 4 137.5 8 452.4864
## 6 750 4 187.5 8 617.0269
##
## $`5`
## cash_flow year quarter_cash double_year present_value
## 7 6000 5 1500 10 4701.157
# Try to print the cash_flow column of company B (split_cash was split on year, so it has no element named B and this returns NULL)
split_cash$B$cash_flow
## NULL
# Try to set the cash_flow column of company A to 0 (no year group is named A, so the existing groups are left unchanged)
split_cash$A$cash_flow <- 0
# Use the grouping to unsplit split_cash
cash_no_A <- unsplit(split_cash, grouping)
# Print cash_no_A
cash_no_A
## cash_flow year quarter_cash double_year present_value
## 1 1000 1 250.0 2 952.3810
## 2 4000 3 1000.0 6 3455.3504
## 3 550 4 137.5 8 452.4864
## 4 1500 1 375.0 2 1428.5714
## 5 1100 2 275.0 4 997.7324
## 6 750 4 187.5 8 617.0269
## 7 6000 5 1500.0 10 4701.1570
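The listing above splits on year, so the company-level change never reaches the data. A hedged sketch of the same split-apply-combine steps with company as the grouping is shown below; cash_by_co, split_by_co and cash_zero_A are hypothetical names, and the data frame is rebuilt here because cash no longer has a company column.

```r
# Rebuild a small data frame that still has the company column
company <- c("A", "A", "A", "B", "B", "B", "B")
cash_flow <- c(1000, 4000, 550, 1500, 1100, 750, 6000)
year <- c(1, 3, 4, 1, 2, 4, 5)
cash_by_co <- data.frame(company, cash_flow, year)
# Split: one data frame per company
split_by_co <- split(cash_by_co, cash_by_co$company)
# Apply: set the cash flows of company A to 0
split_by_co$A$cash_flow <- 0
# Combine: put the pieces back in the original row order
cash_zero_A <- unsplit(split_by_co, cash_by_co$company)
cash_zero_A$cash_flow
## [1]    0    0    0 1500 1100  750 6000
```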
You can use the attributes() function to return a list of attributes about the object you pass in. To access a specific attribute, you can use the attr() function.
# my_matrix and my_factor
my_matrix <- matrix(c(1,2,3,4,5,6), nrow = 2, ncol = 3)
rownames(my_matrix) <- c("Row1", "Row2")
colnames(my_matrix) <- c("Col1", "Col2", "Col3")
my_factor <- factor(c("A", "A", "B"), ordered = TRUE, levels = c("A", "B"))
# attributes of my_matrix
attributes (my_matrix)
## $dim
## [1] 2 3
##
## $dimnames
## $dimnames[[1]]
## [1] "Row1" "Row2"
##
## $dimnames[[2]]
## [1] "Col1" "Col2" "Col3"
# Just the dim attribute of my_matrix
attr (my_matrix, which = "dim")
## [1] 2 3
# attributes of my_factor
attributes (my_factor)
## $levels
## [1] "A" "B"
##
## $class
## [1] "ordered" "factor"
The first week’s quiz is brief and simple, just to get your feet wet. Complete the tasks below and include them at the end of your first week's RMarkdown file. Then publish it on RPubs.com and email me the link for grading.
A vector is a collection of data that is all of the same type. Vectors contain only one row or one column. Matrices are also collections of data that is all the same type. The difference is that matrices have both rows and columns.
Matrices are like tables of data that contain only one data type. Data frames are also tables of data but data frames can contain different types of data.
stocks is a vector of the stocks that I own.
stocks <- c("amazon", "apple", "starbucks")
stocks
## [1] "amazon" "apple" "starbucks"
stock_portfolio is a matrix created from stocks and their symbols.
stocks <- c("amazon", "apple", "starbucks")
symbols <- c("AMZN", "AAPL", "SBUX")
stock_portfolio <- cbind(stocks, symbols)
stock_portfolio
## stocks symbols
## [1,] "amazon" "AMZN"
## [2,] "apple" "AAPL"
## [3,] "starbucks" "SBUX"
stock_values is a data frame made by combining stocks, symbols and their current prices.
stocks <- c("amazon", "apple", "starbucks")
symbols <- c("AMZN", "AAPL", "SBUX")
prices <- c(978, 149, 62)
stock_values <- data.frame(stocks, symbols, prices)
stock_values
## stocks symbols prices
## 1 amazon AMZN 978
## 2 apple AAPL 149
## 3 starbucks SBUX 62
stock_list is a list of my stocks, their symbols and their prices.
stocks <- c("amazon", "apple", "starbucks")
symbols <- c("AMZN", "AAPL", "SBUX")
prices <- c(978, 149, 62)
stock_list <- list(stocks, symbols, prices)