1 General directions for this Workshop

You will work in RStudio. Create an R Notebook document to write whatever is asked in this workshop.

You have to solve CHALLENGE exercises.

It is STRONGLY RECOMMENDED that you write your OWN NOTES as if this were your notebook. Your own workshop/notebook will be very helpful for your further study.

Keep saving your .Rmd file, and ONLY SUBMIT the .html version of your .Rmd file.

2 Data management

2.1 Data collection

We will download price data using a flexible list of tickers from Excel.

Download the InputW6.xlsx file from Canvas / Modules / Week 7. Save this file in the directory where you save this .Rmd file.

You have to install the package IntroCompFinR from a specific web repository. Go to the console and type:

install.packages(“IntroCompFinR”, repos = “http://R-Forge.R-project.org”)

Now load the required libraries for this workshop:

library(quantmod)
library(dplyr)
library(PerformanceAnalytics)
library(IntroCompFinR)
library(readxl)

Save your .Rmd file in the in the same directory where you saved the InputsW6.xlsx file.

We read a list of tickers from an Excel file. The Excel file has one Sheet with a list of tickers along with 4 different portfolios with their respective weights. You can download the Excel file “InputsW6.xlsx” from the course site. Save this file in your working directory.

input.df<-read_excel("InputW6.xlsx",sheet = "ticklist")
tickers.list<-input.df$ticker

Now we have the input of the program in the data frame input.df. The first and only one column is the list of tickers.

We can get monthly data from Yahoo from 2016 to Sep 21, 2021:

getSymbols(tickers.list,
           from = "2016-01-01", to = "2021-09-21",
           periodicity = "monthly",
           src = "yahoo")
## [1] "AMZN" "JPM"  "WMT"  "TSLA"

Since the getSymbols function creates an xts - zoo object for each asset, it is convenient to merge those objects into one single object. We will call this object prices.zoo using the merge function.

However, since we are reading the tickers from an Excel file, the name of the objects are the ticker names, which are stored in tickers.list. Although we know this, we cannot get access to these xts-xoo objects directly. Let’s explore why this is the case. If try to run the following line:

# prices.zoo <- merge(tickers.list)

This will not work since the merge function expect a list of objects separated by commas as parameters, and we are sending the string vector contained in tickers.list. We can also try:

# prices.zoo <- merge(tickers.list[1], tickers.list[2], tickers.list[3], tickers.list[4])

This will not work either since merge expect a list of objects, and we are sending as parameters the tickers as words or strings, not as object names.

Then, we have to find a way to create a list of xts-zoo objects and named them as the ticker names stored in tickers.list. We can use the function get to create these objects from the string names:

p1<-get(tickers.list[1])
p2<-get(tickers.list[2])
p3<-get(tickers.list[3])
p4<-get(tickers.list[4])

prices.zoo <- merge(p1, p2, p3, p4)

Now p1, p2, p3, and p4 will be xts-zoo objects, and the prices.zoo will have all these objects merged. Note that the merge fuction works for xts and zoo objects only. We can now remove the objects we created:

rm(p1, p2, p3, p4) 

We solved the problem in the case of exactly 4 tickers. However, what about solving this problem for any number of tickers? In this case, we can think in storing a list of objects. The function lapply applies a function to a list. This list can be any type of variable. Then, we can create a list of objects xts-zoo objects using the function get and the tickers.list as parameters of the lapply function:

objList <- lapply(tickers.list, get)
class(objList)
## [1] "list"

Now the list objList is a list of xts-zoo objects. Check how the lapply function applied the get function to the string vector tickers.list. Then the get function will be applied to each element of the tickers.list vector and the result will be assigned to the objList. objList will be a list of xts-zoo objects that contains the data frames with the price data for each stock.

We do not have to specify how many objects we send to this list. Then, if we specified 10 tickers in the Excel file, then this list will be a list of 10 xts-zoo objects, which will have the price information of 10 tickers.

To access each member of an object in R, we have to use double squared bracket and the element number. For example, if I want to display the first observations of the first xls-zoo object in the list, we can do the following:

head(objList[[1]])
##            AMZN.Open AMZN.High AMZN.Low AMZN.Close AMZN.Volume AMZN.Adjusted
## 2016-01-01    656.29    657.72   547.18     587.00   130200900        587.00
## 2016-02-01    578.15    581.80   474.00     552.52   124144800        552.52
## 2016-03-01    556.29    603.24   538.58     593.64    94009500        593.64
## 2016-04-01    590.49    669.98   585.25     659.59    78464200        659.59
## 2016-05-01    663.92    724.23   656.00     722.79    90614500        722.79
## 2016-06-01    720.90    731.50   682.12     715.62    74540900        715.62

Now we have to do a merge of the objects contained in this list. Since we know that we have only 4 tickers, then we can do the following:

prices.zoo <- merge(objList[[1]], objList[[2]], objList[[3]], objList[[4]])

This will work ok, but what if I want to generalize the code for any number of tickers specified in the Excel file? One effective way to do this is using the function do.call. This function allows you to call any R function, but instead of writing out the arguments one by one, I can use a list to hold the arguments of the function:

prices.zoo <- do.call(merge, objList)

We generalized our code to ANY number of tickers! The do.call function is similar to the lapply function, but the do.call function runs a function and send a list as parameters of the function.

Then the prices.zoo object will be a xts-zoo objects with many columns for all prices for all tickers.

I select only adjusted prices and drop NA values:

prices.df <- as.data.frame(na.omit(Ad(prices.zoo)))

We used the function na.omit(). This function will drop any observation that has at least one missing value for any of the stock prices. In this case, if we have a Mexican stock and US stocks, then the holidays of any country will be dropped for all stocks, including the observations of the Mexican stocks.

2.2 Return calculation

Now we calculate the monthly compounded returns. We do this as the difference in the log prices for asset i.

\[r_{i_t}=\Delta\left( log \left( {price_{i_t}} \right) \right)\]

returns.df <- diff(log(as.zoo(prices.df)))
## 
## NOTE: quantmod::as.zoo.data.frame() is deprecated
##   Use as.zoo(x, order.by = as.Date(rownames(x))) instead.
##   This note is printed once. To see it for every call, set
##   options(quantmod.deprecate.as.zoo.data.frame = TRUE)

2.3 Q Descriptive statistics and visualization

We take a look to the return data.

head(returns.df)
##            AMZN.Adjusted JPM.Adjusted  WMT.Adjusted TSLA.Adjusted
## 2016-02-01   -0.06053515  -0.04859589 -0.0003013582   0.003810721
## 2016-03-01    0.07178343   0.05056476  0.0318947236   0.179948013
## 2016-04-01    0.10534537   0.06504493 -0.0165614733   0.046721800
## 2016-05-01    0.09150022   0.03960460  0.0568258690  -0.075597957
## 2016-06-01   -0.00996940  -0.04914260  0.0384519563  -0.050296449
## 2016-07-01    0.05860212   0.02902441 -0.0006851006   0.100785316
tail(returns.df)
##            AMZN.Adjusted JPM.Adjusted WMT.Adjusted TSLA.Adjusted
## 2021-04-01    0.11392015   0.01032580  0.033761898    0.06029256
## 2021-05-01   -0.07307644   0.07148312  0.015038903   -0.12637239
## 2021-06-01    0.06518361  -0.05442591 -0.003216999    0.08354800
## 2021-07-01   -0.03326961  -0.02447084  0.010790936    0.01097385
## 2021-08-01    0.04213390   0.05817519  0.038194974    0.06822418
## 2021-09-01   -0.05500340   0.02311315 -0.057010583    0.05263259

To avoid scientific notations for outputs, I define the following:

options(scipen=100)
options(digits=2)

I display descriptive statistics of returns using the table.Stats function from the PerformanceAnalytics package:

returns.zoo<-as.zoo(returns.df)
table.Stats(returns.df) 
##                 AMZN.Adjusted JPM.Adjusted WMT.Adjusted TSLA.Adjusted
## Observations          68.0000      68.0000      68.0000       68.0000
## NAs                    0.0000       0.0000       0.0000        0.0000
## Minimum               -0.2259      -0.2544      -0.1692       -0.2539
## Quartile 1            -0.0240      -0.0250      -0.0171       -0.0560
## Median                 0.0268       0.0248       0.0134        0.0397
## Arithmetic Mean        0.0253       0.0172       0.0128        0.0443
## Geometric Mean         0.0225       0.0148       0.0116        0.0325
## Quartile 3             0.0741       0.0588       0.0445        0.1182
## Maximum                0.2382       0.1935       0.1110        0.5547
## SE Mean                0.0093       0.0083       0.0060        0.0195
## LCL Mean (0.95)        0.0068       0.0006       0.0009        0.0053
## UCL Mean (0.95)        0.0438       0.0339       0.0248        0.0832
## Variance               0.0059       0.0047       0.0024        0.0259
## Stdev                  0.0765       0.0688       0.0493        0.1610
## Skewness              -0.0656      -0.7124      -0.5615        0.7223
## Kurtosis               1.1806       2.7823       1.4022        0.8105

We can do a Boxplot of the returns to better appreciate the median, the quartiles Q1 and Q3, volatility and extreme values of these returns:

chart.Boxplot(returns.zoo)

Box plots help us to understand risk of financial instruments.

Learn what is a box plot and provide your own INTERPRETATION of this plot (use your own words); respond in CAPITAL LETTERS.

3 Simulating portfolios of 4 assets

Using these 4 assets you have to simulate 1000 portfolios with random weights for each asset, but the sum of the 4 random weights must be equal to 1.

Design an algorithm to create a vector of 4 rows and 1,000 columns with random weights, where the sum of each column is equal to 1. We can use the function runif which generates a random number following the uniform probability distribution. Here is an example of this algorithm (remember that this is an example, but there are many different ways to write this algoritm)

# I create an empty vector 
W<-c()
# I do a loop from 1 to 3 to create 1,000 columns of 4 random weights following
#   the uniform probability distribution:
for(i in 1:4)  {
  W<-rbind(W,runif(1000))
}
# The problem I have now is that some of the 1,000 portfolios
#  might end up having a sum of weights higher than one. # I can do a simple "trick" by 
#   dividing each of the 4 weights by the sum of these 4
#   weights. And I can do this # for all 1000 portfolios: 

# I first create a vector with the sum of weights for all portfolios:
sumw <- colSums(W)
# I do another loop to divide each weight by the sum of weights: 
for(i in 1:4)  {
  W[i,]<-W[i,]/sumw
  # In each iteration I divide one raw of W2 by the vector sumw, 
  #  which is the sum of the weights of all 1000 portfolios
}

# I check that the sum of weights is 1 (I do this only for 10 portfolios)
colSums(W[,1:10])
##  [1] 1 1 1 1 1 1 1 1 1 1
# All sums are equal to 1, as expected

# Then each column of this matrix represents a random portfolio without allowing for short sales. 

Then each of the 1,000 columns of W represent one random portfolio of 4 assets. I can calculate the expected return and expected risk of each of these portfolios according to Markowitz portfolio theory.

# I create a matrix for the historical returns of the assets:
ret.mat <- as.matrix(returns.df)
# I generate the expected returns of the assets as their geometric means:
exp.rets<-exp(colMeans(ret.mat)) - 1
# I generate the variance-covariance matrix:
COV<-var(ret.mat)
# I estimate the expected returns of all 1,000 portfolios using matrix algebra:
ERP<-t(W)%*%exp.rets
# I estimate the expected variance and risk of the portfolios using matrix algebra:
VARP<-t(W)%*%COV%*%W
# I get the diagonal of this matrix and take square root to get the expected risk
#   of the 1,000 portfolios:
RISKP<-sqrt(diag(VARP))
# I remove the VARP object since it has 1,000 times 1,000 cels! 1 million cells!
rm(VARP)

Now I can plot all the expected risk vs return of the 1,000 portfolios:

plot(RISKP,ERP[,1], main="Random portfolios of 4 assets",xlab="Volatility",ylab="Exp Return")

4 Q CHALLENGE 1 - Calculate the Portfolio of Minimum Variance

Write code to identify which of the 1,000 portfolios is the portfolio with the lowest expected variance. You have to show the weights of each asset and its corresponding expected risk and expected variance. Hint: you can use sub-setting of data frames.

5 The Global minimum variance portfolio

Although we can get an idea of the global minimum variance portfolio with the simulation, it is a good idea to find a way to estimate the exact GMV portfolio. We can calculate the global minimum variance portfolio using the gmvportws function which is part of the IntroCompFinR library. This can be done very easy:

gmvportws = globalMin.portfolio(exp.rets, COV) 

Remember that the calculation of the Global Minimum Variance Portfolio is an optimization problem stated as the portfolio with the minimum risk of all possible combinations subject to the sum of return weights of the portfolio is equal to 1.

According to Portfolio Theory, the expected risk of a portfolio \(P\) is calculated as the squared root of the Portfolio Variance.

\[SD(P) = \sqrt{(Var(P))}\]

The portfolio variance can be calculated with the following matrix multiplication:

\[Var(P) = W' * \sum_{cov} * W\] where:

\(W\) is the weight vector of the stocks, and \(W'\) is the tranposed matrix of weights. Then, the optimization problem to calculate the Global Minimum Variance Portfolio is defined as:

Minimize:

\[W' * \sum_{cov} * W\]

subject to:

\(1'*W = 1\)

\(1'\) is the transpose vector (horizontal vector of 1’s) that is multiplied times the vertical vector \(W\). The result of this matrix multiplication is the sum of all weights. The function globalMin.portfolio gets the variance-covariance matrix \(COVMAT\), represented in the code as covmat, and the vector of expected returns to calculate the solution of this optimization problem.

You can check the solution of this problem in the note Portfolio Theory - Part 3 or any Finance book that covers Portfolio theory.

The globalMin.portfolio function returns a portfolio class object, which has different attributes:

# Print the whole object
gmvportws
## Call:
## globalMin.portfolio(er = exp.rets, cov.mat = COV)
## 
## Portfolio expected return:     0.016 
## Portfolio standard deviation:  0.042 
## Portfolio weights:
## AMZN.Adjusted  JPM.Adjusted  WMT.Adjusted TSLA.Adjusted 
##        0.1314        0.2740        0.6005       -0.0059
# Check the attributes
attributes(gmvportws)
## $names
## [1] "call"    "er"      "sd"      "weights"
## 
## $class
## [1] "portfolio"
# To see the weights of the portfolio
gmvportws$weights
## AMZN.Adjusted  JPM.Adjusted  WMT.Adjusted TSLA.Adjusted 
##        0.1314        0.2740        0.6005       -0.0059
# To see the class of the object
class(gmvportws)
## [1] "portfolio"

We can run the same function, but now without allowing for short sales setting the parameter shorts = FALSE.

gmvportwos <- globalMin.portfolio(exp.rets, COV, shorts = FALSE)

# Print the object
gmvportwos
## Call:
## globalMin.portfolio(er = exp.rets, cov.mat = COV, shorts = FALSE)
## 
## Portfolio expected return:     0.016 
## Portfolio standard deviation:  0.042 
## Portfolio weights:
## AMZN.Adjusted  JPM.Adjusted  WMT.Adjusted TSLA.Adjusted 
##          0.13          0.27          0.60          0.00

6 The Efficient Frontier

The efficient frontier calculates the portfolio of minimum variance for different levels of expected returns. It uses a similar optimization solution than in the case of the GMV Portfolio, but it adds the restriction that the expected return must be a specific return. We use the efficient.frontier function:

ef <- efficient.frontier(exp.rets, COV, nport = 100, 
                         alpha.min = -0.5, 
                         alpha.max = 1.5, shorts = TRUE)

The efficient.frontier function returns a Markowitz class object, which has different attributes:

attributes(ef)
## $names
## [1] "call"    "er"      "sd"      "weights"
## 
## $class
## [1] "Markowitz"

Finally, we create a visualization of the efficient frontier and include the assets of the portfolio in the plot.

plot(ef, plot.assets = TRUE, col = "blue", pch = 16)

This frontier is not actually the “efficient” frontier since there are some portfolios with the same level of risk with lower expected return. We can generate only portfolios that are actually efficient by changing the alpha.min and alpha.max parameters.

The efficient.frontier function calculates the frontier using 2 portfolios: the GMV portfolio and the tangent/optimal portfolio. These 2 portfolios are efficient. Using these 2 portfolios the function calculates intermediate efficient portfolios by considering each portfolio as a financial instrument and generating linear combinations of these 2 portfolios with weights alpha for one portfolio and (1-alpha) for the second portfolio.

When we specify alpha.min and alpha.max, the function will do a loop to generate several linear combinations of efficient portfolios to generate portfolios between the GMV and the optimal portfolio. When alpha=1 it is assigning 100% to the GMV portfolio and 0% to the tangent portfolio; when alpha=-0.5 it shorts GMV in 50% and assign 150% to the tangent portfolio.

ef <- efficient.frontier(exp.rets, COV, nport = 100, 
                         alpha.min = -0.5, 
                         alpha.max = 1, shorts = TRUE)

plot(ef, plot.assets = TRUE, col = "blue", pch = 16)

With your own words, INTERPRET this frontier. Write your interpretation in CAPITAL LETTERS.

7 The Tangent (optimal) portfolio

Now that you are more familiar with R objects and Portfolio Theory, you have to look at the following code, run it, and write as comments what each line of code does:

r.free = 0.0
tan.port <- tangency.portfolio(exp.rets, COV, r.free)
tanportweights<-getPortfolio(er=exp.rets,cov.mat=COV,weights=tan.port$weights)
plot(tanportweights, col="blue")

gmin.port<-globalMin.portfolio(exp.rets, COV)
efrontier <- efficient.frontier(exp.rets, COV, alpha.min=-0.5,
                                alpha.max=1.5, nport=30)
attributes(efrontier)
## $names
## [1] "call"    "er"      "sd"      "weights"
## 
## $class
## [1] "Markowitz"
plot(efrontier)

plot(efrontier, plot.assets=TRUE, col="blue", pch=16)
points(gmin.port$sd, gmin.port$er, col="green", pch=16, cex=2)
points(tan.port$sd, tan.port$er, col="red", pch=16, cex=2)
text(gmin.port$sd, gmin.port$er, labels="GLOBAL MIN", pos=2)
text(tan.port$sd, tan.port$er, labels="TANGENCY", pos=2)
sr.tan = (tan.port$er - r.free)/tan.port$sd
abline(a=r.free, b=sr.tan, col="green", lwd=2)

Make sure you explain what each function does and which are the parameters. For example, the plot function receives the object efrontier of the class Markowitz. A Markowitz object contains…

8 Q CHALLENGE 2. Frontiers of 2-asset portfolios with different correlations

You have to simulate 5 frontiers of 2-asset portfolios where each frontier you assume a hypothetical correlation between the 2 assets, starting from -1 up to +1 jumping by 0.5. Do the following:

  1. Download monthly price data for WMT and AMZN from Yahoo from Jan 2016 to date. Calculate cc returns

  2. Calculate the expected returns and store them in a vector

  3. Calculate the Variance-Covariance matrix of these 2 stock returns

  4. Create a vector wa for simulating different weights for WMT. Create this vector with sequential values from 0 to 1. Create the vector wb equal to the complement of wa (1-wa) that represent the % invested in AMZN.

  5. Then, with these weight vectors of 11 elements you have to estimate the expected return, variance and expected standard deviation of each of these 11 portfolios.

  6. Do a scatter plot for the portfolio frontier, so that you can see the relationship between risk and return of these 11 portfolios.

  7. You have to simulate 5 portfolio frontiers using the weight vectors with positive weights. Each of these portfolio frontiers will have a hypothetical correlation between WMT and AMZN. Create the first portfolio frontier with a hypothetical correlation of -1, then the 2nd portfolio frontier with a correlation of -0.5, then the 3rd with a correlation of 0, then the 4rd frontier with a correlation of 0.5, and finally, the 5th frontier with a correlation of 1.

Graph the 5 frontiers in one plot. What do you observe? EXPLAIN with your own words.

  1. Encapsulate the code you wrote in a function that receives the two tickers, initial date and end date, and do the previous steps finishing in the simulation of 5 portfolio frontiers with different levels of correlation.