import yfinance as yf
import numpy as np
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt
Workshop 3, Econometric Models
1 Q Estimating the CAPM model for a stock
2 The CAPM model
The Capital Asset Pricing Model states that the expected return of a stock is given by the risk-free rate plus its beta coefficient multiplied by the market premium return. In mathematical terms:
E[R_i] = R_f + β_1(R_M − R_f )
We can express the same equation as:
(E[R_i] − R_f ) = β_1(R_M − R_f )
Then, we are saying that the expected premium return of a stock is equal to the market premium return multiplied by the stock's market beta coefficient. You can estimate the beta coefficient of the CAPM with a regression model, using continuously compounded returns instead of simple returns. However, you must include the intercept β_0 in the regression equation:
(r_i − r_f ) = β_0 + β_1(r_M − r_f ) + ε
Where ε ∼ N(0, σ_ε); the error is a random shock with an expected value of 0 and a specific standard deviation or volatility. This error represents the combined effect of all the factors that influence stock returns but cannot be explained by the model (by the market).
In the market model, the dependent variable was the stock return and the independent variable was the market return. Unlike the market model, here the dependent variable is the stock return minus the risk-free rate (the stock premium return), and the independent variable is the market premium return, which is equal to the market return minus the risk-free rate. Let’s run this model with a couple of stocks.
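To make the regression equation concrete, the following is a minimal sketch that simulates premium returns from the CAPM regression and then recovers the coefficients by least squares. All the numbers (the true β_0 and β_1, the means and the volatilities) are hypothetical values chosen only for illustration; they are not estimates for any real stock.

```python
import numpy as np

rng = np.random.default_rng(42)     # fixed seed so the sketch is reproducible
n = 500                             # number of simulated months (arbitrary)
beta0_true, beta1_true = 0.0, 1.5   # hypothetical parameters, not real estimates

# Simulated market premium returns (r_M - r_f) and random shocks (epsilon)
mkt_prem = rng.normal(0.01, 0.05, n)
eps = rng.normal(0.0, 0.08, n)

# Stock premium returns generated by the CAPM regression equation
stock_prem = beta0_true + beta1_true * mkt_prem + eps

# np.polyfit with degree 1 returns [slope, intercept] of the least-squares line
beta1_hat, beta0_hat = np.polyfit(mkt_prem, stock_prem, 1)
print(f"beta0_hat = {beta0_hat:.4f}, beta1_hat = {beta1_hat:.4f}")
```

The estimated coefficients should be close to, but not exactly equal to, the true values, because the shocks ε add noise to every observation.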
3 Data collection
We load the libraries to collect, process and visualize stock data from Yahoo Finance:
3.1 Download stock data
We download monthly stock data for Apple, Tesla and the S&P 500 from December 2019 to December 2023 from Yahoo Finance using the yfinance library, and obtain continuously compounded returns for each:
data = yf.download('^GSPC AAPL TSLA', start='2019-12-01', end='2023-12-31', interval='1mo')
# Get adjusted close prices
adjprices = data['Adj Close']
# Calculate continuously compounded returns for the 3 prices:
returns = np.log(adjprices) - np.log(adjprices.shift(1))
# Drop the first month since it has NaN values:
returns = returns.dropna()
I have monthly returns from Jan 2020:
returns
AAPL TSLA ^GSPC
Date
2020-01-01 0.052602 0.441578 -0.001629
2020-02-01 -0.124201 0.026424 -0.087860
2020-03-01 -0.069944 -0.242781 -0.133668
2020-04-01 0.144424 0.400210 0.119421
2020-05-01 0.078964 0.065730 0.044287
2020-06-01 0.140190 0.257109 0.018221
2020-07-01 0.152834 0.281421 0.053637
2020-08-01 0.194234 0.554719 0.067719
2020-09-01 -0.106370 -0.149762 -0.040018
2020-10-01 -0.061888 -0.100372 -0.028056
2020-11-01 0.089481 0.380309 0.102146
2020-12-01 0.110196 0.217731 0.036449
2021-01-01 -0.005517 0.117344 -0.011199
2021-02-01 -0.084562 -0.161038 0.025757
2021-03-01 0.008806 -0.011270 0.041563
2021-04-01 0.073453 0.060293 0.051097
2021-05-01 -0.053514 -0.126372 0.005471
2021-06-01 0.096197 0.083548 0.021971
2021-07-01 0.062958 0.010974 0.022493
2021-08-01 0.040115 0.068224 0.028578
2021-09-01 -0.068965 0.052633 -0.048738
2021-10-01 0.057001 0.362230 0.066858
2021-11-01 0.098461 0.027238 -0.008369
2021-12-01 0.073062 -0.079968 0.042689
2022-01-01 -0.015837 -0.120597 -0.054018
2022-02-01 -0.056856 -0.073397 -0.031863
2022-03-01 0.057156 0.213504 0.035148
2022-04-01 -0.102178 -0.213125 -0.092068
2022-05-01 -0.057505 -0.138340 0.000053
2022-06-01 -0.083469 -0.118657 -0.087652
2022-07-01 0.172804 0.280480 0.087201
2022-08-01 -0.033093 -0.075250 -0.043367
2022-09-01 -0.127556 -0.038314 -0.098049
2022-10-01 0.103956 -0.153347 0.076835
2022-11-01 -0.035243 -0.155866 0.052358
2022-12-01 -0.128762 -0.457813 -0.060782
2023-01-01 0.104829 0.340916 0.059921
2023-02-01 0.021393 0.171905 -0.026459
2023-03-01 0.113647 0.008471 0.034451
2023-04-01 0.028575 -0.233184 0.014536
2023-05-01 0.043647 0.216022 0.002479
2023-06-01 0.091525 0.249689 0.062719
2023-07-01 0.012704 0.021392 0.030664
2023-08-01 -0.044658 -0.035588 -0.017875
2023-09-01 -0.091510 -0.030929 -0.049946
2023-10-01 -0.002573 -0.219832 -0.022225
2023-11-01 0.106443 0.178464 0.085424
2023-12-01 0.014808 0.034390 0.043279
3.2 Download risk-free data from the FED
We download the monthly risk-free rate for the US (the 3-month Treasury bill rate), which is the TB3MS series in FRED. We do this with the pandas_datareader library:
import pandas_datareader.data as pdr
import datetime
# I define start as the month Jan 2020
start = datetime.datetime(2020,1,1)
# I define the end month as Dec 2023
end = datetime.datetime(2023,12,31)
Tbills = pdr.DataReader('TB3MS','fred',start,end)
We see the content of Tbills:
Tbills
TB3MS
DATE
2020-01-01 1.52
2020-02-01 1.52
2020-03-01 0.29
2020-04-01 0.14
2020-05-01 0.13
2020-06-01 0.16
2020-07-01 0.13
2020-08-01 0.10
2020-09-01 0.11
2020-10-01 0.10
2020-11-01 0.09
2020-12-01 0.09
2021-01-01 0.08
2021-02-01 0.04
2021-03-01 0.03
2021-04-01 0.02
2021-05-01 0.02
2021-06-01 0.04
2021-07-01 0.05
2021-08-01 0.05
2021-09-01 0.04
2021-10-01 0.05
2021-11-01 0.05
2021-12-01 0.06
2022-01-01 0.15
2022-02-01 0.33
2022-03-01 0.44
2022-04-01 0.76
2022-05-01 0.98
2022-06-01 1.49
2022-07-01 2.23
2022-08-01 2.63
2022-09-01 3.13
2022-10-01 3.72
2022-11-01 4.15
2022-12-01 4.25
2023-01-01 4.54
2023-02-01 4.65
2023-03-01 4.69
2023-04-01 4.92
2023-05-01 5.14
2023-06-01 5.16
2023-07-01 5.25
2023-08-01 5.30
2023-09-01 5.32
2023-10-01 5.34
2023-11-01 5.27
2023-12-01 5.24
The TB3MS series is given as an annual rate in percentage points. I divide it by 100 and by 12 to get a monthly simple rate, since I am using monthly returns for the stocks:
rfrate = Tbills / 100 / 12
Now I get the continuously compounded return from the simple return:
rfrate = np.log(1+rfrate)
I used the formula to get cc returns from simple returns, which is the natural log of the growth factor (1+rfrate).
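The plots and regressions in the following sections use columns named TSLA_Premr and GSPC_Premr, which are the returns minus the risk-free rate. The following is a minimal sketch of how these premium returns could be computed. To keep it self-contained, it rebuilds the first three months of returns and Tbills by hand from the tables above; with the full data you would use the downloaded objects directly. The column name AAPL_Premr is an assumption, chosen to match the other two names.

```python
import numpy as np
import pandas as pd

# First three months copied from the tables above
dates = pd.to_datetime(['2020-01-01', '2020-02-01', '2020-03-01'])
returns = pd.DataFrame({'AAPL': [0.052602, -0.124201, -0.069944],
                        'TSLA': [0.441578, 0.026424, -0.242781],
                        '^GSPC': [-0.001629, -0.087860, -0.133668]},
                       index=dates)
Tbills = pd.DataFrame({'TB3MS': [1.52, 1.52, 0.29]}, index=dates)

# Annual percentage -> monthly simple rate -> monthly cc rate
rfrate = np.log(1 + Tbills / 100 / 12)

# Premium return = cc return minus the cc risk-free rate.
# pandas aligns both series by date, so both indexes must use the same dates.
returns['TSLA_Premr'] = returns['TSLA'] - rfrate['TB3MS']
returns['AAPL_Premr'] = returns['AAPL'] - rfrate['TB3MS']   # assumed column name
returns['GSPC_Premr'] = returns['^GSPC'] - rfrate['TB3MS']

print(returns[['TSLA_Premr', 'AAPL_Premr', 'GSPC_Premr']])
```

If the dates of the stock returns and the T-bill series do not match exactly (for example, because of time zones or a different day of the month), the subtraction produces NaN values, so it is worth checking the result before running the regression.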
4 Visualize the relationship
We do a scatter plot with the S&P 500 premium returns as the independent variable (X) and the Tesla premium returns as the dependent variable (Y). We also add the regression line that best represents the relationship between the stock premium returns and the market premium returns:
import seaborn as sb
plt.clf()
x = returns['GSPC_Premr']
y = returns['TSLA_Premr']
# I plot the (x,y) values along with the regression line that fits the data:
sb.regplot(x=x,y=y)
plt.xlabel('Market Premium returns')
plt.ylabel('TSLA Premium returns')
plt.show()
Sometimes graphs can be deceiving. In this case, the X axis and the Y axis have different ranges, so it is better to draw the graph so that one unit on the X axis has the same length as one unit on the Y axis. We again add the regression line that best represents the relationship between the stock premium returns and the market premium returns. Type:
plt.clf()
sb.regplot(x=x,y=y)
# I adjust the scale of the X axis so that the magnitude of each unit of X is equal to that of the Y axis
plt.xticks(np.arange(-1,1,0.2))
# I label the axis:
plt.xlabel('Market Premium returns')
plt.ylabel('TSLA Premium returns')
plt.show()
QUESTION: WHAT DOES THE PLOT TELL YOU? BRIEFLY EXPLAIN
5 Q Estimating the CAPM model for a stock
Use the premium returns to run the CAPM regression model for each stock.
We run the CAPM for TESLA:
import statsmodels.formula.api as smf
# I estimate the OLS regression model:
mkmodel = smf.ols('TSLA_Premr ~ GSPC_Premr',data=returns).fit()
# I display the summary of the regression:
print(mkmodel.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:             TSLA_Premr   R-squared:                       0.353
Model:                            OLS   Adj. R-squared:                  0.339
Method:                 Least Squares   F-statistic:                     25.14
Date:                Tue, 27 Aug 2024   Prob (F-statistic):           8.39e-06
Time:                        06:55:31   Log-Likelihood:                 17.511
No. Observations:                  48   AIC:                            -31.02
Df Residuals:                      46   BIC:                            -27.28
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      0.0295      0.025      1.185      0.242      -0.021       0.080
GSPC_Premr     2.2058      0.440      5.014      0.000       1.320       3.091
==============================================================================
Omnibus:                        0.149   Durbin-Watson:                   1.461
Prob(Omnibus):                  0.928   Jarque-Bera (JB):                0.002
Skew:                           0.006   Prob(JB):                        0.999
Kurtosis:                       2.974   Cond. No.                         17.8
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
I used simpler Python code to run this regression: the formula function ols from statsmodels.formula.api instead of the OLS function. With the formula interface the intercept is included automatically, so I do not have to add a vector of 1’s to the X variable.
The beta0 coefficient of the model is 0.0295, while beta1 is 2.2058.
The 95% confidence interval for beta0 goes from -0.0207 to 0.0797, while the 95% confidence interval for beta1 goes from 1.3203 to 3.0913.
These 95% confidence intervals for beta0 and beta1 can be roughly estimated by subtracting and adding about 2 standard errors to each estimated coefficient. Why? Because, thanks to the Central Limit Theorem, the estimated beta coefficients behave approximately like normally distributed variables, since they can be expressed as linear combinations of random variables.
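The rule of thumb can be checked with the numbers reported in the regression table for beta1. The sketch below compares the rough interval (plus or minus 2 standard errors) with the interval that uses the critical value of the t distribution with 46 degrees of freedom (the Df Residuals in the table). It assumes scipy is installed; the inputs are the rounded values printed in the table, so the result can differ slightly from the reported interval in the last decimals.

```python
from scipy import stats

# Values copied from the regression table for GSPC_Premr
b1 = 2.2058
se_b1 = 0.440
df_resid = 46

# Rough interval: coefficient plus/minus 2 standard errors
rough = (b1 - 2 * se_b1, b1 + 2 * se_b1)

# Interval with the critical t value for 95% confidence (two-sided)
t_crit = stats.t.ppf(0.975, df_resid)
exact = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)

print(f"Rough 95% C.I.: {rough[0]:.4f} to {rough[1]:.4f}")
print(f"t critical value: {t_crit:.4f}")
print(f"95% C.I. with t: {exact[0]:.4f} to {exact[1]:.4f}")
```

The critical t value is very close to 2, which is why the rough interval is a good approximation of the interval reported by statsmodels.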
6 CHALLENGE 1
Answer the following questions about Tesla’s CAPM model:
(a) INTERPRET THE RESULTS OF THE COEFFICIENTS (b0 and b1), THEIR STANDARD ERRORS, P-VALUES AND 95% CONFIDENCE INTERVALS.
(b) ACCORDING TO THE EFFICIENT MARKET HYPOTHESIS, WHAT IS THE EXPECTED VALUE OF b0 IN THE CAPM REGRESSION MODEL?
(c) ACCORDING TO YOUR RESULTS, IS TESLA SIGNIFICANTLY RISKIER THAN THE MARKET? WHAT IS THE t-test YOU NEED TO DO TO ANSWER THIS QUESTION? Do the test and provide your interpretation. (Hint: Here you have to change the null hypothesis for b1: H0: b1 = 1; Ha: b1 ≠ 1)
7 CHALLENGE 2
Follow the same procedure to get Apple’s CAPM and answer the following questions:
(a) INTERPRET THE RESULTS OF THE COEFFICIENTS (b0 and b1), THEIR STANDARD ERRORS, P-VALUES AND 95% CONFIDENCE INTERVALS.
(b) ACCORDING TO THE EFFICIENT MARKET HYPOTHESIS, WHAT IS THE EXPECTED VALUE OF b0 IN THE CAPM REGRESSION MODEL?
(c) ACCORDING TO YOUR RESULTS, IS APPLE SIGNIFICANTLY RISKIER THAN THE MARKET? WHAT IS THE t-test YOU NEED TO DO TO ANSWER THIS QUESTION? Do the test and provide your interpretation. (Hint: Here you have to change the null hypothesis for b1: H0: b1 = 1; Ha: b1 ≠ 1)
WHAT IS THE EFFICIENT MARKET HYPOTHESIS? BRIEFLY DESCRIBE WHAT THIS HYPOTHESIS SAYS.
YOU HAVE TO DO YOUR OWN RESEARCH
8 READING
Read carefully: Basics of Linear Regression Models.
9 Quiz 3 and W3 submission
Go to Canvas and answer Quiz 3 about Linear Regression. You can attempt this quiz up to 3 times. The questions in this quiz cover concepts from the readings for this Workshop. The Workshop will be graded as follows:
Complete (100%): If you submit an ORIGINAL and COMPLETE HTML file with all the activities, with your notes, and with your OWN RESPONSES to the questions.
Incomplete (75%): If you submit an ORIGINAL HTML file with ALL the activities but you did NOT RESPOND to the questions, and/or you did not do all the activities and responded to only some of the questions.
Very Incomplete (10%-70%): If you complete from 10% to 75% of the workshop, or you completed more but parts of your work are a copy-paste from other workshops.
Not submitted (0%)