1 General directions for this Workshop

You will work in RStudio. Create an R Notebook document (File -> New File -> R Notebook), where you have to write whatever is asked in this workshop.

You have to replicate all the steps explained in this workshop, and ALSO you have to do whatever is asked. Any QUESTION or any STEP you need to do will be written in CAPITAL LETTERS. For ANY QUESTION, you have to RESPOND IN CAPITAL LETTERS right after the question.

It is STRONGLY RECOMMENDED that you write your OWN NOTES as if this were your notebook. Your own workshop/notebook will be very helpful for your further study.

You have to keep saving your .Rmd file, and ONLY SUBMIT the .html version of your .Rmd file. Open RStudio and open a new R script. Write your name and the Workshop name as comments at the top. Respond to questions in CAPITAL LETTERS in your R script. Follow directions.

2 The Linear regression model

The simple linear regression model is used to understand the linear relationship between two variables assuming that one variable, the independent variable (IV), can be used as a predictor of the other variable, the dependent variable (DV). In this part we illustrate a simple regression model with the Market Model.

The Market Model states that the expected return of a stock is given by its alpha coefficient (b0) plus its market beta coefficient (b1) multiplied by the market return. In mathematical terms:

\[ E[R_i] = α + β(R_M) \]

We can express the same equation using B0 as alpha and B1 as market beta:

\[ E[R_i] = β_0 + β_1(R_M) \]

We can estimate the alpha and market beta coefficients by running a simple linear regression model in which the market return is the independent variable and the stock return is the dependent variable. It is strongly recommended to use continuously compounded returns instead of simple returns to estimate the market regression model. The market regression model can be expressed as:

\[ r_{(i,t)} = b_0 + b_1*r_{(M,t)} + ε_t \]

Where:

\(ε_t\) is the error at time t. Appealing to the Central Limit Theorem, this error is assumed to behave like a normally distributed random variable ∼ N(0, \(σ_ε\)); that is, the error term \(ε_t\) is expected to have mean=0 and a specific standard deviation \(σ_ε\) (also called volatility).

\(r_{(i,t)}\) is the return of the stock i at time t.

\(r_{(M,t)}\) is the market return at time t.

\(b_0\) and \(b_1\) are called regression coefficients.
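
To see what this model implies, here is a minimal sketch that simulates returns from the market regression model and then recovers the coefficients with lm(). All parameter values (b0_true, b1_true, sigma_e, n) and variable names are assumptions made up for illustration; they are not estimates from real data.

```r
# Hypothetical illustration: simulate r_i = b0 + b1 * r_M + e
# using assumed (made-up) parameter values.
set.seed(123)                    # arbitrary seed, only for reproducibility
n       <- 60                    # assumed number of monthly observations
b0_true <- 0.01                  # assumed alpha
b1_true <- 1.5                   # assumed market beta
sigma_e <- 0.05                  # assumed standard deviation of the error

r_M <- rnorm(n, mean = 0.005, sd = 0.04)   # simulated market returns
e   <- rnorm(n, mean = 0, sd = sigma_e)    # error term ~ N(0, sigma_e)
r_i <- b0_true + b1_true * r_M + e         # simulated stock returns

# Estimate b0 and b1 from the simulated data
sim_reg <- lm(r_i ~ r_M)
coef(sim_reg)
```

The estimated coefficients should be close to, but not exactly equal to, the assumed values, because the error term adds noise to every observation.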

Now it’s time to use real data to better understand this model. Download monthly prices for Alfa (ALFAA.MX) and the Mexican market index IPCyC (^MXX) from Yahoo Finance from January 2016 to January 2021.

3 Running a market regression model with real data

3.1 Data collection

We first load the quantmod package and download monthly price data for Alfa and the Mexican market index. We also merge both datasets into one:

# load package quantmod
library(quantmod)

# Download the data
getSymbols(c("ALFAA.MX", "^MXX"), from="2016-01-01", to= "2021-01-31", periodicity="monthly", src="yahoo")
## [1] "ALFAA.MX" "^MXX"
#Merge both xts-zoo objects into one dataset, but selecting only adjusted prices:

adjprices<-Ad(merge(ALFAA.MX,MXX))

3.2 Return calculation

We calculate continuously compounded returns for both Alfa and the IPCyC:

returns <- diff(log(adjprices))
# Drop the NAs (the first row has no previous price to compare against):
returns <- na.omit(returns)

# Rename the columns:
colnames(returns) <- c("ALFAA", "MXX")
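
To make the difference between simple and continuously compounded returns concrete, here is a small sketch on a plain numeric vector. The prices are made-up numbers, not real market data, and the object names (prices, simple_ret, cc_ret) are only for illustration. Because this is a plain vector rather than an xts object, diff() returns no leading NA here.

```r
# Hypothetical prices (made-up numbers, not real market data)
prices <- c(100, 110, 99, 105)

# Simple returns: P_t / P_(t-1) - 1
simple_ret <- diff(prices) / head(prices, -1)

# Continuously compounded returns: log(P_t) - log(P_(t-1))
cc_ret <- diff(log(prices))

# Continuously compounded returns add up to the return of the whole period
sum(cc_ret)
log(prices[4] / prices[1])

# Both definitions are linked by r = log(1 + R)
all.equal(cc_ret, log(1 + simple_ret))
```

This additivity over time is one reason why continuously compounded returns are convenient for statistical models.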

3.3 Q Visualize the relationship

Do a scatter plot putting the IPCyC returns as the independent variable (X) and the stock return as the dependent variable (Y). We also add the line that best represents the relationship between the stock returns and the market returns. Type:

plot.default(x=returns$MXX,y=returns$ALFAA)
abline(lm(returns$ALFAA ~ returns$MXX),col='blue')

# As you can see, I indicated that the market returns go on the X axis and
#   the Alfa returns on the Y axis.
# In the market model, the independent variable is the market return, while
#   the dependent variable is the stock return

Sometimes graphs can be deceiving. In this case, the X axis and the Y axis have different ranges, so it is better to draw a graph where both axes span a similar range. Here we widen the X axis with the xlim argument. We again add the line that best represents the relationship between the stock returns and the market returns. Type:

plot.default(x=returns$MXX,y=returns$ALFAA, xlim=c(-0.30,0.30) )
abline(lm(returns$ALFAA ~ returns$MXX),col='blue')
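
As a possible variation, the sketch below forces exactly the same range on both axes by passing one symmetric limit to xlim and ylim. It uses simulated returns so that it runs on its own; the vectors mkt and stock, and the values used to generate them, are assumptions for illustration and not the Alfa or IPCyC data.

```r
# Simulated returns (made-up data, only to make the example self-contained)
set.seed(1)
mkt   <- rnorm(60, mean = 0, sd = 0.04)
stock <- 1.5 * mkt + rnorm(60, mean = 0, sd = 0.08)

# One symmetric range that covers both variables
lims <- c(-1, 1) * max(abs(c(mkt, stock)))

# Same limits on both axes, so the slope is not visually distorted
plot(x = mkt, y = stock, xlim = lims, ylim = lims,
     xlab = "Market return", ylab = "Stock return")
abline(lm(stock ~ mkt), col = "blue")
```

With identical axis ranges, a line steeper than 45 degrees suggests a slope greater than 1.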

WHAT DOES THE PLOT TELL YOU? BRIEFLY EXPLAIN

Using the lm() function, run a simple regression model to see how the monthly returns of the stock are related to the market return. The first argument of the function is a formula: the DEPENDENT VARIABLE (in this case, the stock return) goes to the left of the ~ symbol, and the INDEPENDENT VARIABLE, also named the EXPLANATORY VARIABLE (in this case, the market return), goes to the right.

What you will get is called the Market Regression Model. You are trying to examine how the market returns can explain stock returns from January 2016 to January 2021.

Assign your market model to an object named “reg”.

reg <- lm(ALFAA ~ MXX, data=returns)
# Or by referring to the return columns directly:
reg <- lm(returns$ALFAA ~ returns$MXX)
summary(reg)
## 
## Call:
## lm(formula = returns$ALFAA ~ returns$MXX)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.31927 -0.04455 -0.01824  0.03004  0.34653 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.01541    0.01235  -1.248    0.217    
## returns$MXX  1.90225    0.27196   6.995 2.99e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.09568 on 58 degrees of freedom
## Multiple R-squared:  0.4576, Adjusted R-squared:  0.4482 
## F-statistic: 48.92 on 1 and 58 DF,  p-value: 2.993e-09
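
The numbers printed by summary() can also be extracted as objects. The following is a minimal sketch on simulated data (the vectors x and y, the model fit, and the values used to generate them are assumptions for illustration, not the Alfa regression above). It shows how the t values and the 95% confidence intervals are built from the estimates and their standard errors.

```r
# Simulated data (made-up values, only to make the example self-contained)
set.seed(42)
x   <- rnorm(60, sd = 0.04)
y   <- 0.01 + 1.2 * x + rnorm(60, sd = 0.06)
fit <- lm(y ~ x)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- coef(summary(fit))
coef_table

# The t value is the estimate divided by its standard error
t_manual <- coef_table[, "Estimate"] / coef_table[, "Std. Error"]
t_manual

# 95% confidence intervals computed by R
ci <- confint(fit, level = 0.95)
ci

# The same intervals by hand: estimate +/- critical t * standard error
t_crit    <- qt(0.975, df = df.residual(fit))
manual_ci <- cbind(lower = coef_table[, "Estimate"] - t_crit * coef_table[, "Std. Error"],
                   upper = coef_table[, "Estimate"] + t_crit * coef_table[, "Std. Error"])
manual_ci
```

The same calls, coef(summary(reg)) and confint(reg), can be applied to any model fitted with lm().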

4 Q Respond to the following questions:

  1. What are the standard errors of the beta coefficients (b0 and b1)? What are they for?

  2. What is the total sum of squares (SST)? (provide the result and explain the formula)

  3. What is the sum of squared errors (SSE)? (provide the result and explain the formula)

  4. What is the sum of squared regression differences (SSR)? (provide the result and explain the formula)

  5. What is the coefficient of determination of the regression (the R-squared)? (provide the result and explain the formula)

  6. Interpret the results of the beta coefficients (b0 and b1) and their corresponding p-values in your own words.

  7. What are the t-values of the beta coefficients (b0 and b1)?

  8. Interpret the 95% confidence interval of the beta coefficients (b0 and b1).
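
As a hint for the sums of squares, here is a minimal sketch of how SST, SSE and SSR can be computed from any model fitted with lm(). It runs on simulated data; the vectors x and y, the model fit, and the generating values are assumptions for illustration, so the numbers are not the answers for the Alfa regression.

```r
# Simulated data (made-up values, only to make the example self-contained)
set.seed(7)
x   <- rnorm(60, sd = 0.04)
y   <- 0.01 + 1.2 * x + rnorm(60, sd = 0.06)
fit <- lm(y ~ x)

# Total sum of squares: squared distances of y from its mean
sst <- sum((y - mean(y))^2)

# Sum of squared errors: squared distances of y from the regression line
sse <- sum(residuals(fit)^2)

# Sum of squared regression differences: fitted values versus the mean of y
ssr <- sum((fitted(fit) - mean(y))^2)

c(SST = sst, SSE = sse, SSR = ssr)

# The total variation splits into explained and unexplained parts
all.equal(sst, ssr + sse)

# R-squared is the explained share of the total variation
r2_manual <- ssr / sst
r2_manual
summary(fit)$r.squared
```

The decomposition SST = SSR + SSE holds here because the model includes an intercept.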

5 Quiz

Go to Canvas and do QUIZ 1. You will be able to try this quiz up to 2 times.

The grade of this Workshop will be the following:

If you submit an ORIGINAL and COMPLETE R file with your OWN RESPONSES, then the grade of your Workshop will be the grade of your Quiz (the maximum of your attempts).

If you DO NOT submit a file OR you submit a very, very similar version of another student’s Workshop, then your Workshop grade will be HALF of what you get in the Quiz.