1 General directions for this Workshop

You will work in RStudio. Create an R Notebook document to write whatever is asked in this workshop.

You have to solve CHALLENGE exercises.

It is STRONGLY RECOMMENDED that you write your OWN NOTES as if this were your notebook. Your own workshop/notebook will be very helpful for your further study.

Keep saving your .Rmd file, and ONLY SUBMIT the .html version of your .Rmd file.

2 Review Solution of Workshop 6

Go to Canvas, download the Workshpo 6 Solution and compare with yours. Carefully READ the solution, take your notes and learn from your mistakes. Be ready to report the difference between your solution and my solution.

3 Data management and data validation

We need to read a list of tickers from an Excel file, and the ticker for the market that is also located in the same Excel file. Our program must read any list of tickers, get the corresponding stock price data from Yahoo, calculate returns and then run all market regression models of these stocks.

I load the packages needed:

library(readxl)
library(quantmod)

Download the InputW7.xlsx file from the course site and place it in the same folder where you saved this .Rmd file.

We read the stock tickers and the market ticker from the Excel file:

tickers.df<-read_excel("InputW7.xlsx",sheet = "tickers")
mkticker.df<-read_excel("InputW7.xlsx",sheet = "mkt")

We put the market ticker and the list of tickers in one vector:

tickers.list<-c(mkticker.df$ticker,tickers.df$ticker)
class(tickers.list)
## [1] "character"
tickers.list
##  [1] "^MXX"         "AC.MX"        "ACCELSAB.MX"  "ACTINVRB.MX"  "AEROMEX.MX"  
##  [6] "ALFAA.MX"     "ALPEKA.MX"    "ALSEA.MX"     "AMXL.MX"      "ARA.MX"      
## [11] "QQQQ.MX"      "ASURB.MX"     "AUTLANB.MX"   "AXTELCPO.MX"  "CMOCTEZ.MX"  
## [16] "CREAL.MX"     "CULTIBAB.MX"  "EDOARDOB.MX"  "ELEKTRA.MX"   "FEMSAUBD.MX" 
## [21] "VVV.MX"       "FIBRAPL14.MX" "FINDEP.MX"    "FFF.MX"       "FRAGUAB.MX"  
## [26] "FUNO11.MX"    "GAPB.MX"      "GCARSOA1.MX"  "GCC.MX"       "HOMEX.MX"    
## [31] "HOTEL.MX"     "ICHB.MX"      "LALAB.MX"     "LAMOSA.MX"    "MINSAB.MX"   
## [36] "MONEXB.MX"    "OMAB.MX"      "PASAB.MX"     "PINFRA.MX"    "POCHTECB.MX" 
## [41] "POSADASA.MX"  "VITROA.MX"    "VOLARA.MX"    "WALMEX.MX"

Now bring the price data (from 2018,from Yahoo) of the tickers contained in the tickers.list vector. We will get the price data of each ticker using a loop since it might be possible that some tickers does not exist in Yahoo or there is no data for the dates specified.

for (t in tickers.list) {
  try(getSymbols(t, 
             from = "2018-01-01",
             periodicity = "monthly",
             src = "yahoo") )
  
}
## Error in getSymbols.yahoo(Symbols = "QQQQ.MX", env = <environment>, verbose = FALSE,  : 
##   Unable to import "QQQQ.MX".
## el argumento "conn" está ausente, sin valor por omisión
## Error in getSymbols.yahoo(Symbols = "VVV.MX", env = <environment>, verbose = FALSE,  : 
##   Unable to import "VVV.MX".
## el argumento "conn" está ausente, sin valor por omisión
## Error in getSymbols.yahoo(Symbols = "FFF.MX", env = <environment>, verbose = FALSE,  : 
##   Unable to import "FFF.MX".
## el argumento "conn" está ausente, sin valor por omisión

The try() function is used to avoid the program to stop in case there is an error due to lack of information in Yahoo Finance. Whatever function is run inside the try function is run in a “safe” mode, so that in case of an error, your program will NOT STOP running.

Market tickers in Yahoo starts with the character ^, but when bringing the data with getSymbols, the character ^ is deleted to name the corresponding object. Then, I create a market ticker string without the ^character:

mkticker.df$ticker
## [1] "^MXX"
nchar(mkticker.df$ticker)
## [1] 4

I copy the mkticker from character 2 to the last character given by the function nchar, which gives the number of characters contained in a string

mktick<-substr(mkticker.df$ticker,2,nchar(mkticker.df$ticker))
mktick
## [1] "MXX"

I assign the market ticker to a vector that I will be filling out later

t.list<-c(mktick)

Now I check which tickers were brought from Yahoo with a loop of the existing objects. The ls() function brings the R objects that exist in the Globa Environment. In this list of objects I have to identify those objects that have a name equal to any of the tickers specified in the tickers.list vector. Then, I can try the following loop to construct a vector with only tickers that could be downloaded from Yahoo Finance:

for(t in ls()) {
  if (t %in% tickers.list){
     t.list  <- c(t.list , t)
  }
}
t.list
##  [1] "MXX"          "AC.MX"        "ACCELSAB.MX"  "ACTINVRB.MX"  "AEROMEX.MX"  
##  [6] "ALFAA.MX"     "ALPEKA.MX"    "ALSEA.MX"     "AMXL.MX"      "ARA.MX"      
## [11] "ASURB.MX"     "AUTLANB.MX"   "AXTELCPO.MX"  "CMOCTEZ.MX"   "CREAL.MX"    
## [16] "CULTIBAB.MX"  "EDOARDOB.MX"  "ELEKTRA.MX"   "FEMSAUBD.MX"  "FIBRAPL14.MX"
## [21] "FINDEP.MX"    "FRAGUAB.MX"   "FUNO11.MX"    "GAPB.MX"      "GCARSOA1.MX" 
## [26] "GCC.MX"       "HOMEX.MX"     "HOTEL.MX"     "ICHB.MX"      "LALAB.MX"    
## [31] "LAMOSA.MX"    "MINSAB.MX"    "MONEXB.MX"    "OMAB.MX"      "PASAB.MX"    
## [36] "PINFRA.MX"    "POCHTECB.MX"  "POSADASA.MX"  "VITROA.MX"    "VOLARA.MX"   
## [41] "WALMEX.MX"

The first ticker is the market ticker. I reassign t.list to ticker.list

tickers.list<-t.list

Now I create one object for each ticker using the vector of tickers tickers.list. I use the lapply function to apply the get function to each element of the tickers.list

objList <- lapply(tickers.list, get)
class(objList)
## [1] "list"

This will be a List object. Each element will be an xts-zoo object for each ticker.

Now I merge all objects using the function call. I do this to generalize my program for any number of tickers:

prices.zoo <- do.call(merge, objList)

The do.call function is similar to the lapply function. For the do.call function, the first argument is the function, and the second argument is the list of objects.

Now prices.zoo will be an xts-zoo object with all the prices for all tickers

I can drop the objects from my environment using the same do.call function: Before dropping the objects, I first create a list from the vector, since the function do.call needs a list object:

tickerlist=as.list(tickers.list)
do.call(rm,tickerlist)

I calculate cc returns for all adjusted price columns:

returns.df <-as.data.frame(diff(log(Ad(prices.zoo))))

Check that I used the function Ad() which selects only the columns for adjusted prices. This is a function from the quantmod package. At the end I apply the as.data.frame to end up with a data frame instead of a zoo object.

I rename the columns according to the tickerlist that contains the market ticker and all to stock tickers. I did this since the column names had the word ajusted, which make a big name for each ticker.

# I create a list of tickers without .MX:
tickers.list<-sub(".MX","",tickerlist,fixed=TRUE)
tickers.list
##  [1] "MXX"       "AC"        "ACCELSAB"  "ACTINVRB"  "AEROMEX"   "ALFAA"    
##  [7] "ALPEKA"    "ALSEA"     "AMXL"      "ARA"       "ASURB"     "AUTLANB"  
## [13] "AXTELCPO"  "CMOCTEZ"   "CREAL"     "CULTIBAB"  "EDOARDOB"  "ELEKTRA"  
## [19] "FEMSAUBD"  "FIBRAPL14" "FINDEP"    "FRAGUAB"   "FUNO11"    "GAPB"     
## [25] "GCARSOA1"  "GCC"       "HOMEX"     "HOTEL"     "ICHB"      "LALAB"    
## [31] "LAMOSA"    "MINSAB"    "MONEXB"    "OMAB"      "PASAB"     "PINFRA"   
## [37] "POCHTECB"  "POSADASA"  "VITROA"    "VOLARA"    "WALMEX"
# The sub function identifies a substring and replace it with other substring
# Now I assign the column names to returns.df:
colnames(returns.df)<-tickers.list
# I can see the first rows of the first 3 tickers
head(returns.df[,1:3])
##                    MXX          AC     ACCELSAB
## 2018-01-01          NA          NA           NA
## 2018-02-01 -0.06168296 -0.03166973 0.0000000000
## 2018-03-01 -0.02807023 -0.03999766 0.0038722217
## 2018-04-01  0.04728310  0.02784821 0.0009657171
## 2018-05-01 -0.07949962 -0.07700438 0.0038535693
## 2018-06-01  0.06502423  0.03890219 0.0377403280

Now I am ready to start running the market regression models for all tickers.

4 The Market Regression Model

The market model can be estimated by running a pulled time-series regression model using the market returns as the independent variable, and the individual asset returns as the dependent variable:

\[R_{i(t)}=\hat{\alpha}+\hat{\beta}R_{M(t)}+\varepsilon_{t}\]

I am assuming you know what a regression model is. If you do not remember, I strongly recommend you to read my note Basics of Linear Regression Models that is posted in the course site/materials.

According to the efficient market hypothesis, the expected value of the constant alpha (or beta0) in this regression model should be zero. Why? because if we find an individual asset that has a significant positive alpha coefficient, then this individual asset is expected to over perform the market. According to the efficient market hypothesis, the most efficient or more optimal portfolio in a financial market is the market portfolio. If there is an individual asset with a positive and significant alpha, it is expected that sooner or later (sooner more than later), its alpha will tend to zero. However, in reality, the efficient market hypothesis does not always hold.

Let’s write a function to estimate the market model for one stock and see whether you can find a stock or a portfolio with a significant alpha.

Now I will explain you how to run a regression model in R.

4.1 Running a regression model

I will explain how to run a regression model using specific tickers, and later you have to write a function to run a the market regression model.

We have monthly returns of stocks and the market in a data frame. Then we can estimate a market regression model.

It is recommended that you read the Note Basics of Regression Models that is posted in the course site to do a review about regression.

In a regression model we try to explain a dependent variable with one or more independent variables. In the market model, we have the market returns as the independent or explanatory variable, and the stock return as the dependent variable. In R, you have to use the function lm, linear model, to estimate a regression model. We can run a market regression model for Alfa as follows:

alfamod <- lm(ALFAA ~ MXX, data=returns.df)

As you see, we indicate first the dependent variable ALFAA, the Alfa returns, and then the independent variable MXX, the market returns. We separate the dependent variables from the independent variables with the ~ character. In this case, we have the variables in the data frame returns.df. We will get the object alfamod, which is an lm object:

class(alfamod)
## [1] "lm"
attributes(alfamod)
## $names
##  [1] "coefficients"  "residuals"     "effects"       "rank"         
##  [5] "fitted.values" "assign"        "qr"            "df.residual"  
##  [9] "na.action"     "xlevels"       "call"          "terms"        
## [13] "model"        
## 
## $class
## [1] "lm"
summary(alfamod)
## 
## Call:
## lm(formula = ALFAA ~ MXX, data = returns.df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.37110 -0.04982 -0.00296  0.06040  0.35749 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.01183    0.01589  -0.745     0.46    
## MXX          1.63257    0.31167   5.238  3.4e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1134 on 49 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.359,  Adjusted R-squared:  0.3459 
## F-statistic: 27.44 on 1 and 49 DF,  p-value: 3.402e-06

We can save the regression summary in another variable and display the regression coefficients:

s <-summary(alfamod)
class(s)
## [1] "summary.lm"
s$coefficients
##                Estimate Std. Error    t value     Pr(>|t|)
## (Intercept) -0.01182964 0.01588871 -0.7445312 0.4601111500
## MXX          1.63256990 0.31166865  5.2381589 0.0000034016

Now s is a summary regression object that has different attributes. One attribute is the matrix of coefficients that has not only coefficients beta zero and beta1 but also it has standard errors (standard deviations of betas), t-values and p-values. We can display only the beta coefficients as follows:

s$coefficients[,1]
## (Intercept)         MXX 
## -0.01182964  1.63256990

In this case, we are displaying all rows of column 2 of the matrix. If we want to display the standard errors of the betas, we do the same but replace 2 instead of 1 since standard errors are in column 2 of the coefficient matrix (do this in your code).

I can also use the function coefficients to directly estimate the beta coefficients in a vector:

betasalfa <- coefficients(lm(ALFAA ~ MXX, data=returns.df))

Now I have the vector betasalfa with beta0 and beta1.

4.2 Estimating the Market regression model for more than one stock

In case we have many stocks that we want to calculate their market model, how can we run several regression models using a loop?

We first transform the data frame of returns into a matrix using the function as.matrix. We save the result in a matrix class object called rets.mat.

rets.mat <- as.matrix(returns.df)
class(rets.mat)
## [1] "matrix" "array"
#head(rets.mat)
ncol(rets.mat)
## [1] 41

Now how can I run one regression model for each stock? I can program a loop like the following:

results.m = matrix(nrow=ncol(rets.mat)-1, ncol=2)
# I do a loop from 2 (column 2) to the last column of rets.mat:
for(i in 2:ncol(rets.mat)) {
  results.m[i-1,] = coefficients(lm(rets.mat[,i] ~ rets.mat[,1]))
}
results.m
##                [,1]        [,2]
##  [1,]  0.0014660607  0.60510650
##  [2,]  0.0134491011  0.23222576
##  [3,] -0.0006891994  0.35967785
##  [4,]  0.0422959248  3.01773634
##  [5,] -0.0118296391  1.63256990
##  [6,] -0.0023540184  1.24635279
##  [7,] -0.0094179601  2.42866707
##  [8,]  0.0043199289  0.89002931
##  [9,] -0.0101553162  0.73591058
## [10,]  0.0018439813  1.42496949
## [11,] -0.0007374100  0.63238749
## [12,] -0.0115051389  0.51112498
## [13,]  0.0022544041  0.17167559
## [14,] -0.0577933522  0.38492180
## [15,] -0.0102846610  0.07534960
## [16,]  0.0000000000  0.00000000
## [17,]  0.0124517084  0.30859177
## [18,] -0.0024164004  0.87909511
## [19,]  0.0127423520  0.53182733
## [20,]  0.0054081621  0.10804375
## [21,]  0.0063560471  0.27448500
## [22,] -0.0012802330  1.57257344
## [23,]  0.0100875074  1.70861178
## [24,] -0.0005150236  1.18338505
## [25,]  0.0043900341  0.97012421
## [26,] -0.0515237294  0.28558389
## [27,] -0.0202730592  0.60713259
## [28,]  0.0183167564  0.24867533
## [29,] -0.0104377374  1.05046418
## [30,]  0.0213882849  0.40393896
## [31,]  0.0832745367  1.15633658
## [32,] -0.0013350823  0.02178048
## [33,]  0.0110702466  1.65967036
## [34,] -0.0135872772  0.09995274
## [35,] -0.0032922417  0.87868039
## [36,]  0.0002283817 -0.04124777
## [37,] -0.0092384513  0.68837301
## [38,] -0.0205243390  1.04913607
## [39,]  0.0124858386  2.13712111
## [40,]  0.0105056323  0.49050861

I first define an empty matrix results.m with 2 columns and 2 rows since ncol(rets.mat) will be 3. I will fill out the content of this matrix in the loop.
In this case the function coefficients is extracting only the beta coefficients of the regression lm(rets.mat[,i] ~ rets.mat[,1])). As you see, in this case the loop executes 2 iterations since the number of columns of the matrix rets.mat is only 3 columns. I start with column 2 since column one has the market returns, and I have stock returns in column 2 and 3. If I expand the list of tickers that I get from getSymbols this code will still run since the loop will stop until the number of columns of the matrix rets.mat.

For each interation I am running the market regression for the corresponding column i. Let’s analyze the code lm(rets.mat[,i] ~ rets.mat[,1])). In this case, rets.mat[,i] means that I am considering as the dependent variable all returns of the column i of the matrix. In the case of rets.mat[,1] I am indicating that the independent variable will be the column 1 of the rets.mat matrix since the market return is located in column 1.
The result will be saved in results.m.

5 CHALLENGE 1 - Writing a function for the Market Model

In the previous workshop we learned how to run a market regression model for one or more stocks. We also wrote a function to run a market regression model the returns the beta coefficients, and another function that returns the standard errors.

Now you have to write a function that estimates the market model for a stock, but now it has to return:

  • beta coefficients (beta0 and beta1),
  • standard errors of beta coefficients,
  • p-values of beta coefficients, and
  • the # of valid (non-missing) observations used to run the regression.

Here are the specifications of the function:

  1. Your function has to receive 2 parameters: 1) the continuously compounded (cc) return of a stock, and 2) the cc returns of the market.

  2. Using the lm function, run a simple regression model according to the market model. Remember that the dependent variable must the stock return, and the independent variable must be the market returns.

  3. Your function must return a vector with 7 values. The structure of this vector is illustrated in the following example:

b0 se(b0) p-value(b0) b1 se(b1) p-value(b1) # Non-missing
0.01 0.02 0.30 1.3 0.10 0.0001 35

The non-missing observations can be calculated using the attriute df.residual of the regression model. You just have to add 2 to this value since the degrees of freedom of the regression residual is N-2, where N is the number of non-missing observations.

Test your function with at least 2 stocks (select any stocks you want), and use the correct market index. Display the vectors and INTERPRET the result of the beta coefficients of only ONE stock.

6 CHALLENGE 2 - Running market models with a loop using your function

Using the Excel file InputW7.xlsx posted in the course site, you have to bring mothly stock prices, and a series for the market index. Bring data from Jan 2018 to Dec 2021. In one Sheet you can find the stock tickers, and in the other Sheet you find the market ticker.

Write the corresponding code to estimate the market models of all stocks. The specifications of the program are the following:

  1. Write a loop to run the a market regression model for each stock.

  2. You must use the function you defined in the previous part to run the market regression models.

  3. Make sure your program runs with any number of tickers specified in the Excel file. I recommend you to start testing your program with few tickers (2 or 3), so create another excel file with only 2 or 3 stocks.

  4. Your program must create a Matrix with the results of the models for all stocks. The structure of this matrix is the following:

MATRIX RESULTS:

STOCK b0 se(b0) p-value(b0) b1 se(b1) p-value(b1) # Non-missing
ALFA 0.01 0.02 0.30 1.3 0.10 0.0001 35
….

For the stock names, use the rownames of the matrix.

Then, each row will be the regression results for each stock, and each column will have specific values of the betas, std errors, p-values, and the # of non-missing values used in the regression. Note that it is possible that some stocks might not have prices/returns for some periods.

7 CHALLENGE 3 - Selection of stocks based on market regressions

(OPTIONAL)

  1. Once you have the final matrix with all the regression results, Write a code to select and display which of the stocks are SIGNIFICANTLY offering returns over the market AND that the number of non-missing observations is equal or greater than 30. As comments, explain what is the condition needed to find statistical evidence to say that a stock is offering returns over the market.

  2. Write a code to select and display the stock that are SIGNIFICANTLY less risky than the market according to the market model regressions. As comment, explain what is the condition to say that the stock is statistically less risky than the market.

8 Reading

Re-read the Note “Basics of Regression Models” posted in the course site, and also do your own research about the CAPM model.

9 Datacamp Activity (OPTIONAL)

If you want to refresh your knowledge about loops, functions or the apply function, you can check Datacamp.com. Go to datacamp.com site. The following chapters can help you with this:

Course : Intermediate R for Finance, chapters :

  1. Loops
  2. Functions
  3. Apply