In this first lab, you will analyze the monthly stock returns of Starbucks (ticker: SBUX).
Let us get started by downloading the monthly price data with the read.csv() function from the following URL:
"http://iDataScienceR.com/files/sbuxPrices.csv"
(Type ?read.table in the R console to consult the help file for this function)
In the read.csv() function, you should indicate that the data in the CSV file has a header (header argument) and that strings should not be interpreted as factors (stringsAsFactors argument).
# Assign the URL to the CSV file
dataurl <- "http://iDataScienceR.com/files/sbuxPrices.csv"
# Load the data frame using read.csv
sbux <- read.csv(dataurl, header = TRUE, stringsAsFactors = FALSE)
# "sbux" should be a data frame object.
# Data frames are rectangular data objects that typically contain observations in
# rows and variables in columns
Before we analyze the loaded price data, it is a good idea to have (at least) a quick look at it.
R has a number of functions that help you do that:
The str() function compactly displays the structure of an R object.
It is arguably one of the most useful R functions.
The head() and tail() functions show you the first and the last part of an R object, respectively.
The class() function shows you the class of an R object.
So let us use them:
str(sbux)
## 'data.frame': 181 obs. of 2 variables:
## $ Date : chr "3/31/1993" "4/1/1993" "5/3/1993" "6/1/1993" ...
## $ Adj.Close: num 1.13 1.15 1.43 1.46 1.41 1.44 1.63 1.59 1.32 1.32 ...
head(sbux)
## Date Adj.Close
## 1 3/31/1993 1.13
## 2 4/1/1993 1.15
## 3 5/3/1993 1.43
## 4 6/1/1993 1.46
## 5 7/1/1993 1.41
## 6 8/2/1993 1.44
tail(sbux)
## Date Adj.Close
## 176 10/1/2007 25.37
## 177 11/1/2007 22.24
## 178 12/3/2007 19.46
## 179 1/2/2008 17.98
## 180 2/1/2008 17.10
## 181 3/3/2008 16.64
class(sbux$Date)
## [1] "character"
What is the class of the Date column of the sbux data frame?
You can use square brackets to extract data from the sbux data frame, like this: sbux[rows, columns].
To specify which rows or columns to extract, you have several options:
sbux[1:5, "Adj.Close"]
## [1] 1.13 1.15 1.43 1.46 1.41
sbux[1:5, 2]
## [1] 1.13 1.15 1.43 1.46 1.41
sbux$Adj.Close[1:5]
## [1] 1.13 1.15 1.43 1.46 1.41
All of the above expressions will extract the first five closing prices.
If you do not wish to provide anything for the rows (or columns) then all rows (or columns) will be selected:
e.g. sbux[, "Adj.Close"]
(Homework: Check this yourself by typing the different options in the R console)
Note that in the above operations, the dimension information was lost.
To preserve the dimension information, add the drop = FALSE argument:
closing_prices <- sbux[, "Adj.Close", drop = FALSE]
head(closing_prices)
## Adj.Close
## 1 1.13
## 2 1.15
## 3 1.43
## 4 1.46
## 5 1.41
## 6 1.44
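The effect of drop = FALSE can be checked on a small example. The sketch below uses a toy data frame (the name toy is made up here; its values are copied from the head() output above) so that it runs without downloading anything:

```r
# Toy stand-in for sbux: first three rows as shown by head(sbux)
toy <- data.frame(
  Date = c("3/31/1993", "4/1/1993", "5/3/1993"),
  Adj.Close = c(1.13, 1.15, 1.43),
  stringsAsFactors = FALSE
)

# Selecting a single column drops the dimensions and returns a plain vector
as_vector <- toy[, "Adj.Close"]

# With drop = FALSE the result stays a one-column data frame
as_frame <- toy[, "Adj.Close", drop = FALSE]

class(as_vector)  # numeric vector
class(as_frame)   # data frame
dim(as_vector)    # NULL, because a vector has no dimensions
dim(as_frame)     # 3 rows, 1 column
```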
It will often be useful to select stock data between certain dates.
Advanced users are advised to look at the xts package.
However, base R also provides sufficient functionality to do this.
The which() function returns the indices for which a condition is TRUE.
For example:
which(sbux$Date == "3/1/1994") returns the position of the date 3/1/1994, which in this case is the row number in the sbux data frame.
# Find indices associated with the dates 3/1/1994 and 3/1/1995
index1 <- which(sbux$Date == "3/1/1994")
index2 <- which(sbux$Date == "3/1/1995")
# Extract prices between 3/1/1994 and 3/1/1995
some_prices <- sbux[index1:index2, "Adj.Close"]
# The sbux data frame is already loaded in your workspace
# Create a new data frame that contains the price data with the dates as the row names
sbux_prices <- sbux[, "Adj.Close", drop = FALSE]
rownames(sbux_prices) <- sbux$Date
head(sbux_prices)
## Adj.Close
## 3/31/1993 1.13
## 4/1/1993 1.15
## 5/3/1993 1.43
## 6/1/1993 1.46
## 7/1/1993 1.41
## 8/2/1993 1.44
# With Dates as rownames, you can subset directly on the dates.
# Extract the prices on the dates 3/1/1994 and 3/1/1995.
price1 <- sbux_prices["3/1/1994", ]
price2 <- sbux_prices["3/1/1995", ]
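Matching on exact date strings only works when the string is present in the Date column. As a possible alternative in base R, the sketch below converts the character dates with as.Date() and selects a range with ordinary comparisons. The toy data frame and its prices are made up for illustration and are not the real SBUX values.

```r
# Hypothetical data: made-up prices, dates in the same m/d/Y format as sbux$Date
toy <- data.frame(
  Date = c("1/3/1994", "2/1/1994", "3/1/1994", "4/4/1994", "5/2/1994"),
  Adj.Close = c(1.45, 1.77, 1.69, 1.50, 1.72),
  stringsAsFactors = FALSE
)

# Convert the character column to the Date class
dates <- as.Date(toy$Date, format = "%m/%d/%Y")

# Logical vector: TRUE for rows that fall inside the window
in_range <- dates >= as.Date("1994-02-15") & dates <= as.Date("1994-04-30")

# Keep only the rows inside the window (here the March and April rows)
toy[in_range, ]
```

The window boundaries do not need to be dates that occur in the data, which is the main advantage over which() with an exact string match.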
Next, the Starbucks closing prices are plotted as a function of time.
This plot was generated with plot(sbux$Adj.Close), the basic plotting function.
However, we should be able to generate a nicer plot, right?
For one thing, a line plot makes much more sense for price time series data.
# Now add all relevant arguments to the plot function below to get a nicer plot
plot(sbux$Adj.Close, type="l", col="blue", lwd=2,
ylab="Adjusted close", main="Monthly closing price of STARBUCKS")
That is a much nicer plot indeed! You can further improve the plot by adding a legend.
Let’s add a legend:
plot(sbux$Adj.Close, type = "l", col = "blue", lwd = 2,
ylab = "Adjusted close", main = "Monthly closing price of STARBUCKS")
legend(x = 'topleft', legend = 'SBUX', lty = 1, lwd = 2, col = 'blue')
If you denote by $P_t$ the stock price at the end of month $t$, the simple return is given by:
$R_t = (P_t - P_{t-1}) / P_{t-1}$, the percentage price difference.
Your task in this exercise is to compute the simple returns for every time point $t$.
The fact that R is vectorized makes that relatively easy.
In case you would like to calculate the price difference over time, you can use:
sbux_prices[2:n, 1] - sbux_prices[1:(n-1), 1]
Think about why this indeed calculates the price difference for all time periods.
The first vector contains all prices, except the price on the first day.
The second vector contains all prices except the price on the last day.
Given the fact that R takes the element-wise difference of these vectors, you get $P_t - P_{t-1}$ for every $t$.
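To see the shifted-vector trick at work, here is a small sketch on a plain numeric vector (the name prices is introduced here; the values are the first six prices shown by head(sbux)). It also checks the result against base R's diff(), which computes the same lagged differences.

```r
# First six adjusted closing prices from head(sbux)
prices <- c(1.13, 1.15, 1.43, 1.46, 1.41, 1.44)
n <- length(prices)

# Element-wise difference of the two shifted vectors
manual_diff <- prices[2:n] - prices[1:(n - 1)]

# diff() is the built-in equivalent
builtin_diff <- diff(prices)
all.equal(manual_diff, builtin_diff)

# Dividing by the lagged price gives the simple returns
simple_ret <- manual_diff / prices[1:(n - 1)]
simple_ret
```

Both approaches yield n - 1 values, because the first price has no predecessor.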
sbux_prices <- sbux[, "Adj.Close", drop = FALSE]
# Denote n the number of time periods:
n <- nrow(sbux_prices)
sbux_ret <- ((sbux_prices[2:n, 1] - sbux_prices[1:(n-1), 1])/sbux_prices[1:(n-1), 1])
# Notice that sbux_ret is not a data frame object
class(sbux_ret)
## [1] "numeric"
Notice that sbux_ret is a vector.
Remember that you can add drop = FALSE, if you want a data frame as output.
The vector sbux_ret now contains the simple returns of Starbucks.
It would be convenient to have the dates as names for the elements of that vector.
Remember that the trading dates were in the first column of the sbux data frame.
To set the names of a vector, you can use names(vector) <- some_names.
Remember that we are dealing with closing prices.
The first return in sbux_ret is thus realized on the second day, whose date is stored in sbux[2, 1].
sbux_prices <- sbux[, "Adj.Close", drop = FALSE]
# Denote n the number of time periods:
n <- nrow(sbux_prices)
sbux_ret <- ((sbux_prices[2:n, 1] - sbux_prices[1:(n - 1), 1])/sbux_prices[1:(n - 1), 1])
# Notice that sbux_ret is not a data frame object
class(sbux_ret)
## [1] "numeric"
# Now add dates as names to the vector and print the first elements of
# sbux_ret to the console to check
names(sbux_ret) <- sbux[2:n, 1]
head(sbux_ret)
## 4/1/1993 5/3/1993 6/1/1993 7/1/1993 8/2/1993 9/1/1993
## 0.01769912 0.24347826 0.02097902 -0.03424658 0.02127660 0.13194444
Notice how R now nicely prints the dates above each return.
Much more convenient, isn't it?
As you might remember from class, the relation between single-period and multi-period returns is multiplicative for simple returns.
That is not very convenient.
The yearly gross return, for example, is the product of the twelve monthly gross returns.
Therefore, in practice you will often use continuously compounded returns.
These returns have an additive relationship between single and multi-period returns and are defined as:
$r_t = \ln(1 + R_t)$,
with $R_t$ the simple return and $r_t$ the continuously compounded return at moment $t$.
Continuously compounded returns can be computed easily in R by realizing that $r_t = \ln(P_t / P_{t-1})$ and
$\ln(P_t / P_{t-1}) = \ln(P_t) - \ln(P_{t-1})$.
In R, the log price can be easily computed through log(price).
Notice how the log() function in R actually computes the natural logarithm.
# Denote n the number of time periods:
n <- nrow(sbux_prices)
sbux_ret <- ((sbux_prices[2:n, 1] - sbux_prices[1:(n-1), 1])/sbux_prices[1:(n-1), 1])
# Compute continuously compounded 1-month returns
sbux_ccret <- log(sbux_prices[2:n, 1]) - log(sbux_prices[1:(n-1), 1])
# Assign names to the continuously compounded 1-month returns
names(sbux_ccret) <- sbux[2:n, 1]
# Show sbux_ccret
head(sbux_ccret)
## 4/1/1993 5/3/1993 6/1/1993 7/1/1993 8/2/1993 9/1/1993
## 0.01754431 0.21791250 0.02076199 -0.03484673 0.02105341 0.12393690
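The additive property of continuously compounded returns can be verified numerically. The sketch below uses a plain vector named prices (a name introduced here, holding the first six prices from head(sbux)) and compares the multi-period return obtained from simple returns with the one obtained from continuously compounded returns.

```r
# First six adjusted closing prices from head(sbux)
prices <- c(1.13, 1.15, 1.43, 1.46, 1.41, 1.44)
n <- length(prices)

# Single-period simple and continuously compounded returns
simple_ret <- prices[2:n] / prices[1:(n - 1)] - 1
cc_ret <- log(prices[2:n]) - log(prices[1:(n - 1)])

# Simple returns aggregate by multiplying gross returns
total_simple <- prod(1 + simple_ret) - 1
all.equal(total_simple, prices[n] / prices[1] - 1)

# Continuously compounded returns aggregate by plain addition
total_cc <- sum(cc_ret)
all.equal(total_cc, log(prices[n] / prices[1]))

# The two definitions of the cc return agree
all.equal(cc_ret, log(1 + simple_ret))
```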
You would like to compare the simple and the continuously compounded returns.
In the next exercise, you will do that by generating two graphs.
In this exercise, you will just have a quick look at the data.
It would be nice to have the simple and continuously compounded returns next to each other in a matrix, with n - 1 rows (one per return) and two columns.
You can use the cbind() function to paste the two vectors that contain both types of returns next to each other in a matrix.
# Denote n the number of time periods:
n <- nrow(sbux_prices)
sbux_ret <- ((sbux_prices[2:n, 1] - sbux_prices[1:(n - 1), 1])/sbux_prices[1:(n - 1), 1])
# Compute continuously compounded 1-month returns
sbux_ccret <- log(sbux_prices[2:n, 1]) - log(sbux_prices[1:(n - 1),1])
names(sbux_ccret) <- sbux[2:n, 1]
head(sbux_ccret)
## 4/1/1993 5/3/1993 6/1/1993 7/1/1993 8/2/1993 9/1/1993
## 0.01754431 0.21791250 0.02076199 -0.03484673 0.02105341 0.12393690
# Compare the simple and cc returns
head(cbind(sbux_ret, sbux_ccret))
## sbux_ret sbux_ccret
## 4/1/1993 0.01769912 0.01754431
## 5/3/1993 0.24347826 0.21791250
## 6/1/1993 0.02097902 0.02076199
## 7/1/1993 -0.03424658 -0.03484673
## 8/2/1993 0.02127660 0.02105341
## 9/1/1993 0.13194444 0.12393690
Notice that the continuously compounded returns are always somewhat smaller than the simple returns (why?).
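One way to approach the "why" is the inequality ln(1 + R) <= R, which holds for every R > -1 with equality only at R = 0. The sketch below checks it on a handful of illustrative return values (chosen here, not taken from the data) and shows that the gap widens as the return moves away from zero.

```r
# Illustrative simple returns, from large losses to large gains
R <- c(-0.5, -0.1, -0.01, 0, 0.01, 0.1, 0.5)

# Corresponding continuously compounded returns
r <- log(1 + R)

# Side-by-side comparison; gap = simple minus cc
comparison <- cbind(simple = R, cc = r, gap = R - r)
comparison

# The cc return never exceeds the simple return
all(r <= R)
```

For small monthly returns the two numbers are close, which is why the difference is most visible in months with large price moves.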
Let us compare both types of returns now in a graphical analysis.
In this exercise, we will create a plot that contains both the simple and continuously compounded returns.
This makes it easy to compare both types of returns.
Have a look at the sample code below.
First of all, we have to plot the simple returns as a function of time.
The argument type = "l" specifies a line plot, col = "blue" specifies that the simple returns line is blue, lwd = 2 specifies the line thickness, ylab = "Return" specifies that "Return" is the label of the y-axis, and main specifies the plot's main title.
# Plot the returns on the same graph
plot(sbux_ret, type = "l", col = "blue", lwd = 2, ylab = "Return",
main = "Monthly Returns on SBUX")
# Add horizontal line at zero
abline(h = 0)
# Add a legend
legend(x = "bottomright", legend = c("Simple", "CC"), lty = 1,
lwd = 2, col = c("blue", "red"))
# Add the continuously compounded returns
lines(sbux_ccret, col = "red", lwd = 2)
Have a close look at the plot and notice that the continuously compounded returns are always slightly smaller than the simple returns.
In the next section we will calculate whether investing in Starbucks would have been a good idea.
Would it have been a good idea to invest in the SBUX stock over the period in our data set?
In case you invested $1 in SBUX on 3/31/1993 (the first day in sbux), how much would that dollar be worth on 3/3/2008 (the last day in sbux)?
What was the evolution of the value of that dollar over time?
R helps us by quickly coming up with an answer to these questions.
Remember that when you use simple returns, the total return over a period can be obtained by taking the cumulative product of the gross returns.
R has a handy cumprod() function that calculates that cumulative product.
# Compute gross returns
sbux_gret <- 1 + sbux_ret
# Compute future values
sbux_fv <- cumprod(sbux_gret)
# Plot the evolution of the $1 invested in SBUX as a function of time
plot(sbux_fv, type = "l", col = "blue", lwd = 2, ylab = "Dollars",
main = "FV of $1 invested in SBUX")
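As a sanity check on the cumprod() approach, the cumulative product of gross returns should equal each price divided by the initial price. The sketch below verifies this on a plain vector named prices (introduced here, holding the first six prices from head(sbux)).

```r
# First six adjusted closing prices from head(sbux)
prices <- c(1.13, 1.15, 1.43, 1.46, 1.41, 1.44)
n <- length(prices)

# Gross returns: 1 + simple return
gross_ret <- prices[2:n] / prices[1:(n - 1)]

# Value over time of $1 invested at the first price
fv <- cumprod(gross_ret)

# Each future value equals the current price relative to the first price
all.equal(fv, prices[2:n] / prices[1])
```

Applied to the full data set, the same identity implies that the final value of the dollar is 16.64 / 1.13, roughly 14.73 dollars, using the first and last prices shown by head(sbux) and tail(sbux).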
Your workspace contains the vector sbux with the adjusted closing price data for Starbucks stock over the period December 2004 through December 2005.
Type sbux in the R console to have a look at the data.
Use the data in sbux.
Hint
Remember that you can access the first element of the sbux vector with sbux[1].
The simple return is the second Starbucks price minus the first price, divided by the first price.
Answer: (sbux[2] - sbux[1]) / sbux[1]
Hint
Use the vector sbux with the adjusted closing price data for Starbucks stock over the period December 2004 through December 2005.
Do you still remember how you calculated the simple return in the previous exercise?
Well, the continuously compounded return is just the natural logarithm of one plus the simple return.
Assume that all twelve months have the same return as the simple monthly return between the end of December 2004 and the end of January 2005.
Hint
Use the vector sbux with the adjusted closing price data for Starbucks stock over the period December 2004 through December 2005.
In the first exercise you calculated the simple return between December 2004 and January 2005.
Have a look at the Wikipedia article on compound interest and think about how that applies to this situation.
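If every month earns the same simple return R, compounding over twelve months gives an annual return of (1 + R)^12 - 1. The sketch below illustrates this with two made-up prices (p_dec and p_jan are hypothetical names and values, not the actual SBUX data).

```r
# Hypothetical prices, NOT the actual SBUX data
p_dec <- 30
p_jan <- 27

# Simple monthly return between the two prices
monthly_ret <- (p_jan - p_dec) / p_dec

# Annualized simple return, assuming the same return in all twelve months
annual_ret <- (1 + monthly_ret)^12 - 1
annual_ret

# The continuously compounded version simply scales by twelve
annual_cc <- 12 * log(1 + monthly_ret)
all.equal(annual_cc, log(1 + annual_ret))
```

Note that multiplying the simple monthly return by twelve would not give the right answer, because it ignores compounding.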
Use the data in sbux and compute the actual simple annual return between December 2004 and December 2005.
Use sbux[1] to extract the first price and sbux[length(sbux)] to extract the last price.
To get the simple annual return, calculate the price difference and divide by the initial price.
Answer:
-2.15%
Use the data sbux and compute the actual annual continuously compounded return between December 2004 and December 2005.
Use sbux with the adjusted closing price data for Starbucks stock over the period December 2004 through December 2005.
Hint
Do you still remember how you calculated the annual Starbucks return in the previous exercise?
Well, the continuously compounded annual return is just the natural logarithm of one plus that return.
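The relation between the actual annual simple and continuously compounded returns can be sketched as follows. The vector sbux_toy is a hypothetical stand-in with thirteen made-up month-end prices; it does not contain the real December 2004 through December 2005 data.

```r
# Hypothetical month-end prices (13 values), NOT the actual SBUX data
sbux_toy <- c(30.0, 27.0, 26.5, 25.8, 24.9, 27.2, 25.9,
              26.1, 24.6, 25.0, 28.1, 30.3, 29.4)

first_price <- sbux_toy[1]
last_price <- sbux_toy[length(sbux_toy)]

# Actual simple annual return: price difference over the initial price
annual_simple <- (last_price - first_price) / first_price

# Continuously compounded annual return
annual_cc <- log(1 + annual_simple)

# Equivalent: difference of log prices, or the sum of monthly cc returns
all.equal(annual_cc, log(last_price) - log(first_price))
all.equal(annual_cc, sum(diff(log(sbux_toy))))
```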