1 Introduction

This is a possible solution to Workshop 7. Remember that it can be many different good solutions for this workshop. I will illustrate how I solved it. The best way to learn is compare your attempt for W7 with this solution, and identify what you missed and how you can improve your code. Improve your code not only by copying or replacing with my code, but try to modify it according to your own logic.

Remember that the only way to learn programming is PRACTICE, PRACTICE AND PRACTICE… AND LEARN FROM YOUR MISTAKES …

2 Challenge 1 - Writing a function for the Market Model

You have to write a function that estimates the market model for a stock, but now it has to return:

  • beta coefficients (beta0 and beta1),
  • standard errors of beta coefficients,
  • p-values of beta coefficients, and
  • the # of valid (non-missing) observations used to run the regression.

Here are the specifications of the function:

  1. Your function has to receive 2 parameters: 1) the continuously compounded (cc) return of a stock, and 2) the cc returns of the market.

  2. Using the lm function, run a simple regression model according to the market model. Remember that the dependent variable must the stock return, and the independent variable must be the market returns.

  3. Your function must return a vector with 7 values. The structure of this vector is illustrated in the following example:

b0 se(b0) p-value(b0) b1 se(b1) p-value(b1) # Non-missing
0.01 0.02 0.30 1.3 0.10 0.0001 35

The non-missing observations can be calculated using the attriute df.residual of the regression model. You just have to add 2 to this value since the degrees of freedom of the regression residual is N-2, where N is the number of non-missing observations.

Test your function with at least 2 stocks (select any stocks you want), and use the correct market index. Display the vectors and INTERPRET the result of the beta coefficients of only ONE stock.

2.1 Solution to Challenge 1

I define the function to run a market regression model. I first identify the inputs, outputs and the process needed to get the output:

The inputs of the function: 1) a vector/column of stock cc returns, 2) a vector/column of market ret

The output of the function: 1) a vector with the following results: + beta0 + standard deviation of beta0 + p-value of beta0 + beta1 + standard deviation of beta1 + p-value of beta1 + number of non-missing observations

Processes/tasks of the function

  1. Calculate Betas:

    • Run the regression model with lm function
    • Get/identify where to find the betas, std errors and p-values
    • Get the valid non-missing observations
  2. Create/fill-out a vector with the beta results and the # of non-missing observations

  3. Return this vector

Here is the function:

# I first clean my environment:
rm(list=ls())

mktmodel <-function(stockret,mktret) {
# I run the market regression model with the lm function:
  model<-lm(stockret ~ mktret)
# I save the summary of the model:  
  sm<-summary(model)
# The summary of the regression model contains the coefficients matrix where 
#   the beta coefficients and their standard errors, pvalues and tvalues are stored.

# For beta0 and beta1 I need to extract: the beta coefficient, standard error, and pvalue
  
# The information of beta0 is located in the first row of the coefficients matrix, 
#    and beta1 information is located in the 2nd row
  
# The first column of the matrix has the actual beta coefficients
# The 2nd column has the standard error of the coefficients
# The 3rd column has the t-values
# The 4rd column has the p-values

# I can extract any value of a matrix using the [#row,#column] operator in the matrix   
  
# I can extract each value and then assemble all in a vector:
  
beta0 = sm$coefficients[1,1]
sebeta0 = sm$coefficients[1,2]
pvaluebeta0 = sm$coefficients[1,4]
beta1 = sm$coefficients[2,1]
sebeta1 = sm$coefficients[2,2]
pvaluebeta1 = sm$coefficients[2,4]
    
# I construct the vector with the beta coefficients, standard errors and corresponding 
#    p-values:  
result.vector<- c(beta0,sebeta0,pvaluebeta0,beta1,sebeta1,pvaluebeta1)

# Another faster way to do the same is the following: 
# result.vector<- sm$coefficients[1,c(1,2,4)],sm$coefficients[2,c(1,2,4)])

# Finally I add the number of observation as the last element of the vector:  
 result.vector<-c(result.vector,model$df.residual+2)
# Why I added 2 to the degrees of freedom?
# The degrees of freedom of the residuals ia actually the valid # of observations minus
 #  the # of beta coefficients (in this case, 2) 
 
# I give names to the columns of the vector:  
  names(result.vector)<-c("b0","seb0","pb0","b1","seb1","pb1","N")
# I return the vector as result of the function  
  return(result.vector)
}

To avoid scientific notations for outputs, I define the following:

options(scipen=100)
options(digits=2)

Now I can test my function with any stock:

library(quantmod)
# I pick Apple and the S&P500:
getSymbols(c("AAPL","^GSPC"),
           from = "2018-01-01", to = "2023-03-31",
           periodicity = "monthly",
           src = "yahoo")
## [1] "AAPL"  "^GSPC"
#I merge only adjusted prices into one dataset:
adjprices<-merge(Ad(AAPL),Ad(GSPC))
# I calculate returns of the adjprices:
returns<-diff(log(as.zoo(adjprices)))
# I display the first 5 rows of the return dataset:
head(returns)
##            AAPL.Adjusted GSPC.Adjusted
## 2018-02-01        0.0619       -0.0397
## 2018-03-01       -0.0557       -0.0273
## 2018-04-01       -0.0151        0.0027
## 2018-05-01        0.1229        0.0214
## 2018-06-01       -0.0056        0.0048
## 2018-07-01        0.0276        0.0354
# Now I test the function:
modelo1<- mktmodel(returns$AAPL.Adjusted,returns$GSPC.Adjusted)
# Now modelo1 will be the vector with the results: 
modelo1
##              b0            seb0             pb0              b1            seb1 
##  0.015374996919  0.008078114088  0.061803068510  1.251948132474  0.150758267109 
##             pb1               N 
##  0.000000000015 62.000000000000

3 Challenge 2 - Running market models with a loop using your function

Using the Excel file InputW7.xlsx posted in the course site, you have to bring mothly stock prices, and a series for the market index. Bring data from Jan 2018 to Dec 2021. In one Sheet you can find the stock tickers, and in the other Sheet you find the market ticker.

Write the corresponding code to estimate the market models of all stocks. The specifications of the program are the following:

  1. Write a loop to run the a market regression model for each stock.

  2. You must use the function you defined in the previous part to run the market regression models.

  3. Make sure your program runs with any number of tickers specified in the Excel file. I recommend you to start testing your program with few tickers (2 or 3), so create another excel file with only 2 or 3 stocks.

  4. Your program must create a Matrix with the results of the models for all stocks. The structure of this matrix is the following:

MATRIX RESULTS:

STOCK b0 se(b0) p-value(b0) b1 se(b1) p-value(b1) # Non-missing
ALFA 0.01 0.02 0.30 1.3 0.10 0.0001 35
….

For the stock names, use the rownames of the matrix.

Then, each row will be the regression results for each stock, and each column will have specific values of the betas, std errors, p-values, and the # of non-missing values used in the regression. Note that it is possible that some stocks might not have prices/returns for some periods.

3.1 Solution to Challenge II

I load the readxl packages:

library(readxl)

You have to place the Excel file in the same folder where you saved your .Rmd.

I read the stock tickers and the market ticker from the Excel file:

tickers.df<-read_excel("InputW7.xlsx",sheet = "tickers")
mkticker.df<-read_excel("InputW7.xlsx",sheet = "mkt")

I put the market ticker and the list of tickers in one vector:

tickers.list<-c(mkticker.df$ticker,tickers.df$ticker)
class(tickers.list)
## [1] "character"
tickers.list
##  [1] "^MXX"         "AC.MX"        "ACCELSAB.MX"  "ACTINVRB.MX"  "AEROMEX.MX"  
##  [6] "ALFAA.MX"     "ALPEKA.MX"    "ALSEA.MX"     "AMXL.MX"      "ARA.MX"      
## [11] "QQQQ.MX"      "ASURB.MX"     "AUTLANB.MX"   "AXTELCPO.MX"  "CMOCTEZ.MX"  
## [16] "CREAL.MX"     "CULTIBAB.MX"  "EDOARDOB.MX"  "ELEKTRA.MX"   "FEMSAUBD.MX" 
## [21] "VVV.MX"       "FIBRAPL14.MX" "FINDEP.MX"    "FFF.MX"       "FRAGUAB.MX"  
## [26] "FUNO11.MX"    "GAPB.MX"      "GCARSOA1.MX"  "GCC.MX"       "HOMEX.MX"    
## [31] "HOTEL.MX"     "ICHB.MX"      "LALAB.MX"     "LAMOSA.MX"    "MINSAB.MX"   
## [36] "MONEXB.MX"    "OMAB.MX"      "PASAB.MX"     "PINFRA.MX"    "POCHTECB.MX" 
## [41] "POSADASA.MX"  "VITROA.MX"    "VOLARA.MX"    "WALMEX.MX"

Now I bring from the web the price data of the tickers contained in the tickers.list vector. I will get the price data of each ticker using a loop since it might be possible that some tickers does not exist in Yahoo or there is no data for the dates specified.

for (t in tickers.list) {
  try(getSymbols(t, 
             from = "2018-01-01", to = "2023-03-31",
             periodicity = "monthly",
             src = "yahoo") )
  
}
## Error in getSymbols.yahoo(Symbols = "QQQQ.MX", env = <environment>, verbose = FALSE,  : 
##   Unable to import "QQQQ.MX".
## QQQQ.MX download failed after two attempts. Error message:
## HTTP error 404.
## Error in getSymbols.yahoo(Symbols = "VVV.MX", env = <environment>, verbose = FALSE,  : 
##   Unable to import "VVV.MX".
## VVV.MX download failed after two attempts. Error message:
## HTTP error 404.
## Error in getSymbols.yahoo(Symbols = "FFF.MX", env = <environment>, verbose = FALSE,  : 
##   Unable to import "FFF.MX".
## FFF.MX download failed after two attempts. Error message:
## HTTP error 404.
## Error in getSymbols.yahoo(Symbols = "LALAB.MX", env = <environment>, verbose = FALSE,  : 
##   Unable to import "LALAB.MX".
## LALAB.MX download failed after two attempts. Error message:
## HTTP error 404.

The try() function is used to avoid the program to stop in case there is an error due to lack of information in Yahoo Finance. Whatever function is run inside the try function is run in a “safe” mode, so that in case of an error, your program will not stop running.

Market tickers in Yahoo starts with the character ^, but when bringing the data with getSymbols, the character ^ is deleted to name the corresponding object. Then, I create a market ticker string without the ^character:

nchar(mkticker.df$ticker)
## [1] 4

I copy the mkticker from character 2 to the last character given by the function nchar, which gives the number of characters contained in a string

mktick<-substr(mkticker.df$ticker,2,nchar(mkticker.df$ticker))
mktick
## [1] "MXX"

I assign the market ticker to a vector that I will be filling out later

t.list<-c(mktick)

Now I check which tickers were brought from Yahoo with a loop of the existing objects. The ls() function brings the R objects that exist in the Globa Environment. In this list of objects I have to identify those objects that have a name equal to any of the tickers specified in the tickers.list vector. Then, I can try the following loop to construct a vector of tickers that could be brought from Yahoo Finance:

for(t in ls()) {
  if (t %in% tickers.list){
     t.list  <- c(t.list , t)
  }
}
t.list
##  [1] "MXX"          "AC.MX"        "ACCELSAB.MX"  "ACTINVRB.MX"  "AEROMEX.MX"  
##  [6] "ALFAA.MX"     "ALPEKA.MX"    "ALSEA.MX"     "AMXL.MX"      "ARA.MX"      
## [11] "ASURB.MX"     "AUTLANB.MX"   "AXTELCPO.MX"  "CMOCTEZ.MX"   "CREAL.MX"    
## [16] "CULTIBAB.MX"  "EDOARDOB.MX"  "ELEKTRA.MX"   "FEMSAUBD.MX"  "FIBRAPL14.MX"
## [21] "FINDEP.MX"    "FRAGUAB.MX"   "FUNO11.MX"    "GAPB.MX"      "GCARSOA1.MX" 
## [26] "GCC.MX"       "HOMEX.MX"     "HOTEL.MX"     "ICHB.MX"      "LAMOSA.MX"   
## [31] "MINSAB.MX"    "MONEXB.MX"    "OMAB.MX"      "PASAB.MX"     "PINFRA.MX"   
## [36] "POCHTECB.MX"  "POSADASA.MX"  "VITROA.MX"    "VOLARA.MX"    "WALMEX.MX"

I reassign t.list to ticker.list

tickers.list<-t.list

Now I create one object for each ticker using the vector of tickers tickers.list. I use the lapply function to apply the get function to each element of the tickers.list

objList <- lapply(tickers.list, get)
class(objList)
## [1] "list"

This will be a List object. Each element will be an xts-zoo object for each ticker.

Now I merge all objects using the function call. I do this to generalize my program for any number of tickers:

prices.zoo <- do.call(merge, objList)

The do.call function is similar to the lapply function. For the do.call function, the first argument is the function, and the second argument is the list of objects.

Now prices.zoo will be an xts-zoo object with all the prices for all tickers

I can drop the objects from my environment using the same do.call function: Before dropping the objects, I first create a list from the vector, since the function do.call needs a list object:

tickerlist=as.list(tickers.list)
do.call(rm,tickerlist)

I calculate cc returns for all adjusted price columns:

returns.df <-as.data.frame(diff(log(Ad(prices.zoo))))

Check that I used the function Ad() which selects only the columns for adjusted prices. This is a function from the quantmod package. At the end I apply the as.data.frame to end up with a data frame instead of a zoo object.

I rename the columns according to the tickerlist that contains the market ticker and all to stock tickers. I did this since the column names had the word ajusted, which make a big name for each ticker.

colnames(returns.df)<-tickerlist

Now I am ready to start running the market regression models for all tickers.

Now I will program an algorithm to automatically run all regression models according to the tickers read from the beginning of this program.

I define a structure where I will be saving the results of the regression models. I can define a matrix with the following rows and columns:

MATRIX RESULTS:

STOCK b0 se(b0) p-value(b0) b1 se(b1) p-value(b1) # Non-missing
ALFA 0.01 0.02 0.30 1.3 0.10 0.0001 35
….

Then, each row will be the regression results for each stock, and each column will have specific values of the betas, std errors, p-values, and the # of non-missing values used in the regression. I store the # of valid observations since some stocks might not have prices/returns for several months.

I can construct this matrix using the function mktmodel I defined in the previous part. The function returns a vector with this structure, so it is easy to construct a matrix using a loop and this function.

I will use a ‘for’ to do the loop ( I can also use ‘while’ or ‘repeat’ command)

# I assign an object to a null list. I will use this object to construct the matrix with
#   the set of all vectors of results:
matrix.results<-c()
for(i in 2:ncol(returns.df)) {
  # I am looping over i, from the value 2 to the last # of column of rets.mat 
  # I start in 2 since column 1 contains the market return. From column 2 I have 
  #  the stock returns until the last column. Then, if I have 50 stocks, then 
  #  the rets.mat matrix will have 51 columns since column 1 contains the mkt returns

  # I call my function that estimates the market regression model for the stock i:
  m <- mktmodel(returns.df[,i], returns.df[,1])
  # the column 1 of the matrix rets.mat contains the market returns, which is the
  #   independent variable of the regression model
  
  # Now I save the vector m into the matrix.results. I can use rbind to be 
  # appending the vectors in the matrix results:
  matrix.results<-rbind(matrix.results,m)
}

I display the first rows of the matrix results.m that contains ALL coefficients, standard errors, and p-values of ALL regression market models for ALL stocks. I also change the column names according to the regression parameters I need:

colnames(matrix.results)<-c("b0","seb0","pb0","b1","seb1","pb1","N")
head(matrix.results)
##        b0   seb0   pb0    b1  seb1        pb1  N
## m  0.0054 0.0054 0.320  0.58 0.097 0.00000013 62
## m  0.0124 0.0055 0.027  0.13 0.100 0.18573411 62
## m -0.0057    NaN   NaN -0.28   NaN        NaN  2
## m  0.0257 0.0916 0.780  2.15 1.666 0.20194033 62
## m -0.0128 0.0136 0.348  1.39 0.247 0.00000050 62
## m -0.0055 0.0137 0.693  0.92 0.250 0.00052677 62

I change the row names of the matrix according to the stock tickers. The first column has the market returns.

dim(matrix.results)
## [1] 39  7
rownames(matrix.results)<-tickers.list[2:length(tickers.list)]

I check the first 5 stocks of the matrix:

head(matrix.results)
##                  b0   seb0   pb0    b1  seb1        pb1  N
## AC.MX        0.0054 0.0054 0.320  0.58 0.097 0.00000013 62
## ACCELSAB.MX  0.0124 0.0055 0.027  0.13 0.100 0.18573411 62
## ACTINVRB.MX -0.0057    NaN   NaN -0.28   NaN        NaN  2
## AEROMEX.MX   0.0257 0.0916 0.780  2.15 1.666 0.20194033 62
## ALFAA.MX    -0.0128 0.0136 0.348  1.39 0.247 0.00000050 62
## ALPEKA.MX   -0.0055 0.0137 0.693  0.92 0.250 0.00052677 62

I convert the matrix into a data frame since it is easier to do subsetting and manipulation with a dataframe vs a matrix

results.df<-as.data.frame(matrix.results)
head(results.df)
##                  b0   seb0   pb0    b1  seb1        pb1  N
## AC.MX        0.0054 0.0054 0.320  0.58 0.097 0.00000013 62
## ACCELSAB.MX  0.0124 0.0055 0.027  0.13 0.100 0.18573411 62
## ACTINVRB.MX -0.0057    NaN   NaN -0.28   NaN        NaN  2
## AEROMEX.MX   0.0257 0.0916 0.780  2.15 1.666 0.20194033 62
## ALFAA.MX    -0.0128 0.0136 0.348  1.39 0.247 0.00000050 62
## ALPEKA.MX   -0.0055 0.0137 0.693  0.92 0.250 0.00052677 62

4 Challenge 3 (optional)

  1. Which stocks are significantly offering returns over the market?

  2. Which stocks are significantly less risky than the market?

Responses:

4.1 (a) Selecting stocks that offer returns over the market

I need to select those stocks with beta0 that is significantly bigger than zero.

I can calculate the maximum value of the beta1 95% confidence interval.

I sort the matrix according to beta0 and select those that have positive and significant beta0:

# I create a copy of the matrix result as data frame. I do this since it is easier
#    to sort data frames compared to matrices
sortedbybeta0 = as.data.frame(matrix.results)
# I sort the matrix according to the beta0m which is located in column 1: 
sortedbybeta0<-sortedbybeta0[order(sortedbybeta0$b0,decreasing=TRUE),]
# I add a new column for the minimum beta0 of the 95% confidence intervale: 
sortedbybeta0$minb0<- sortedbybeta0$b0 - 2*sortedbybeta0$seb0
# I select those rows with minb0>0, which are those stocks with significantly returns
#   over the market:
selectedbybeta0<-sortedbybeta0[sortedbybeta0$minb0>0,]
selectedbybeta0
##                 b0   seb0   pb0   b1 seb1     pb1  N  minb0
## FIBRAPL14.MX 0.015 0.0065 0.029 0.46 0.12 0.00022 62 0.0016
## ACCELSAB.MX  0.012 0.0055 0.027 0.13 0.10 0.18573 62 0.0014
## NA              NA     NA    NA   NA   NA      NA NA     NA

4.2 (b) Which stocks are significantly less risky than the market?

I need to select the stocks that have beta1 that is SIGNIFICANTLY less than 1.

I need to identify those stocks that are significantly less risky than the market. Then, I need to focus on the beta1 coefficients, and see which stocks have a 95% C.I. where the maximum of this interval is less than one.

I sort the matrix according to the maximum of the 95%C.I. of beta1, and select those with values less than one.

# I first convert the matrix.results into a data frame:
sortedbymaxb1<-as.data.frame(matrix.results)
# I add the maximum value of the beta1 95%C.I.:
sortedbymaxb1$maxb1<-sortedbymaxb1$b1 + 2* sortedbymaxb1$seb1
# I sort the data frame by the maximum value of the beta1 95%C.I.:
sortedbymaxb1<-sortedbymaxb1[order(sortedbymaxb1$maxb1,decreasing=FALSE),]
# I select those stocks that are significantly less risky than the market:
selectedbymaxb1<-sortedbymaxb1[sortedbymaxb1$maxb1<1,]
if (nrow(selectedbymaxb1)==0) {
  print("There is no stock that is significantly less risky than the market")
} else {
  print("The stocks that are significantly less risky than the market are:")
  selectedbymaxb1
}
## [1] "The stocks that are significantly less risky than the market are:"
##                    b0   seb0   pb0     b1  seb1        pb1  N maxb1
## EDOARDOB.MX   0.00000 0.0000   NaN  0.000 0.000        NaN 62  0.00
## POCHTECB.MX   0.00305 0.0067 0.649 -0.118 0.121 0.33489133 62  0.12
## MONEXB.MX     0.00771 0.0094 0.416 -0.194 0.171 0.25986732 62  0.15
## CMOCTEZ.MX    0.00148 0.0051 0.771  0.115 0.092 0.21878434 62  0.30
## CULTIBAB.MX  -0.00286 0.0137 0.835 -0.175 0.248 0.48264873 62  0.32
## ACCELSAB.MX   0.01244 0.0055 0.027  0.134 0.100 0.18573411 62  0.33
## PASAB.MX     -0.01137 0.0060 0.063  0.117 0.109 0.28768387 62  0.34
## FRAGUAB.MX    0.01214 0.0073 0.104  0.123 0.134 0.36041081 62  0.39
## ICHB.MX       0.01655 0.0112 0.147  0.042 0.205 0.83755968 62  0.45
## FINDEP.MX     0.00456 0.0131 0.729  0.078 0.238 0.74426886 62  0.55
## ELEKTRA.MX    0.00692 0.0097 0.477  0.255 0.176 0.15281539 62  0.61
## HOTEL.MX     -0.01660 0.0094 0.083  0.312 0.172 0.07405111 62  0.65
## FIBRAPL14.MX  0.01458 0.0065 0.029  0.465 0.118 0.00022414 62  0.70
## LAMOSA.MX     0.01695 0.0105 0.112  0.341 0.191 0.07952535 62  0.72
## AUTLANB.MX   -0.00018 0.0085 0.983  0.460 0.155 0.00430646 62  0.77
## AC.MX         0.00538 0.0054 0.320  0.582 0.097 0.00000013 62  0.78
## POSADASA.MX  -0.00702 0.0080 0.384  0.496 0.146 0.00117397 62  0.79
## WALMEX.MX     0.00774 0.0057 0.178  0.603 0.103 0.00000022 62  0.81
## NA                 NA     NA    NA     NA    NA         NA NA    NA