Abstract
In this workshop we learn how to make predictions with multiple regression models in the context of Finance. We also learn how to run panel-data multiple regressions in which the timing of the dependent variable is shifted, so we can examine how financial ratios relate to stock returns one quarter or one year later. You will work in RStudio. Create an R Notebook document to write whatever is asked in this workshop.
At the beginning of the R Notebook write Workshop 8 - Financial Econometrics I and your name (as we did in previous workshop).
You have to replicate all the steps explained in this workshop, and ALSO you have to do whatever is asked. Any QUESTION or any STEP you need to do will be written in CAPITAL LETTERS. For ANY QUESTION, you have to RESPOND IN CAPITAL LETTERS right after the question.
It is STRONGLY RECOMMENDED that you write your OWN NOTES as if this were your notebook. Your own workshop/notebook will be very helpful for your further study.
Keep saving your .Rmd file, and ONLY SUBMIT the .html version of your .Rmd file.
We will use the dataset http://www.apradie.com/datos/datamx2020q4.xlsx, which has quarterly financial data for all Mexican firms from 2000 to 2020. You have to download the market data and merge it with this dataset. You can check the solution of W6 to help you with this part. You have to:
# Load the package readxl
library(readxl)
# Download the excel file from a web site:
download.file("http://www.apradie.com/datos/datamx2020q4.xlsx", "dataw7.xlsx", mode="wb")
dataset <- read_excel("dataw7.xlsx")
# Download market data from Yahoo Finance:
library(quantmod)
getSymbols("^MXX", from="2000-01-01", to="2019-12-31", periodicity="monthly", src="yahoo")
## [1] "^MXX"
I first have to collapse the market data from monthly to quarterly data since the panel dataset has quarterly data. Once the market data is in quarters, I calculate returns and merge it with the panel dataset that has financial variables of firms:
# Collapse monthly data to quarterly data using the last value of each quarter
QMXX <- to.quarterly(MXX, indexAt = 'startof')
# The to.quarterly function collapses the dataset from monthly to quarterly
# taking the values of the last month of each quarter.
# In the option indexAt I indicate that I want to keep the first month of the
# quarter as index. I do this since the panel data uses the first month for
# each quarter
# Select the Adjusted price column
QMXX <- Ad(QMXX)
# Change column name
colnames(QMXX) <- c("IPC")
# I calculate market quarterly continuously compounded returns:
QMXX$IPCrets <- diff(log(QMXX))
# Create a data frame object from QMXX
QMXX.df <- data.frame(quarter=index(QMXX), coredata(QMXX))
# Convert the quarter column to a Date type of column:
dataset$quarter <- as.Date(dataset$quarter)
# I merge the market dataset with the panel dataset:
dataset <- merge(dataset, QMXX.df, by="quarter")
# This is a many-to-1 type of data merging since the market return is merged N
# times according to the N firms in the panel dataset
# Convert the data set from a data frame to a panel data frame using the
# pdata.frame function from the plm library:
library(plm)
paneldata <- pdata.frame(dataset, index = c("firmcode", "quarter"))
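As a rough illustration of what the collapsing step does, the sketch below uses base R only and six made-up monthly prices (hypothetical values, not IPC data). It keeps the last price of each quarter and then computes the continuously compounded return as the difference of log prices, which is the same idea as the to.quarterly and diff(log()) steps above.

```r
# Hypothetical monthly closing prices covering two quarters (not real IPC data)
monthly <- data.frame(
  month = as.Date(c("2019-01-01", "2019-02-01", "2019-03-01",
                    "2019-04-01", "2019-05-01", "2019-06-01")),
  price = c(100, 102, 105, 103, 108, 110)
)

# Label each month with its calendar quarter, e.g. "2019 Q1"
monthly$quarter <- paste(format(monthly$month, "%Y"), quarters(monthly$month))

# Keep the last price observed within each quarter
last_price <- sapply(split(monthly$price, monthly$quarter),
                     function(p) p[length(p)])

# Continuously compounded return: r_t = log(P_t) - log(P_{t-1})
ccret <- diff(log(last_price))

last_price
ccret
```

The first quarter has no return because there is no previous quarter to compare with, which is why the first value of IPCrets in the real data is missing.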
Install the statar package before working in this section. This package has a better winsorize function. You only specify the minimum and maximum percentile you want to use for the winsorization:
# Keep only active firms
paneldata <- paneldata[paneldata$status == "active", ]
# Calculate the book value of equity, that is total assets minus total liabilities:
paneldata$bookvalue <- paneldata$totalassets - paneldata$totalliabilities
# Calculate the market value, that is the # of shares times the original stock price:
paneldata$marketvalue <- paneldata$originalhistoricalstockprice * paneldata$sharesoutstanding
# Calculate book-to-market ratio:
paneldata$bmr <- paneldata$bookvalue / paneldata$marketvalue
# Calculate the bmr Winsorized:
# We will use the winsorize function from the statar package.
# This function can work with panel data (the previous function from the robustHD
# package cannot)
library(statar)
# We only specify the 2 percentiles for the winsorization in both sides of the
# distribution:
paneldata$bmr_w <- winsorize(paneldata$bmr, probs = c(0.02,0.98))
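To see what winsorization does to the extreme values, here is a minimal base R sketch. The helper winsorize_sketch is a hypothetical name written for this illustration; it is not the statar function, and statar may differ in details such as how the percentiles are computed. The data are made up.

```r
# Hand-written winsorization helper (hypothetical, not statar::winsorize)
winsorize_sketch <- function(x, probs = c(0.02, 0.98)) {
  # Percentile cut-offs computed on the non-missing values
  cutoffs <- quantile(x, probs = probs, na.rm = TRUE)
  # Values below the lower cut-off are raised to it;
  # values above the upper cut-off are lowered to it
  pmin(pmax(x, cutoffs[1]), cutoffs[2])
}

# Made-up ratios with one extreme value on each side and one missing value
x <- c(-50, 0.2, 0.4, 0.5, 0.6, 0.7, 0.9, 1.1, 1.3, 40, NA)
x_w <- winsorize_sketch(x, probs = c(0.10, 0.90))

range(x, na.rm = TRUE)    # original range, driven by the two outliers
range(x_w, na.rm = TRUE)  # range after winsorizing
```

Only the two extreme observations are replaced; the values in the middle of the distribution and the missing value are left as they are.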
Now we will learn how to make predictions for multiple regression models using the predict.lm function. Do the following:
1. Learn about earnings per share (EPS). Do your research on the Internet or in books. EXPLAIN in your own words what earnings per share is and how it can be estimated.
EARNINGS PER SHARE IS EQUAL TO EARNINGS DIVIDED BY THE # OF SHARES. THE MEASURE FOR EARNINGS IS NET INCOME. HOWEVER, SOME ANALYSTS ALSO USE OTHER OPERATIONAL MEASURES FOR EARNINGS SUCH AS EARNINGS BEFORE INTEREST AND TAXES (EBIT). IF WE WANT TO MEASURE OPERATIONAL EARNINGS AND CALCULATE IT FOR MANY FIRMS, IT IS RECOMMENDED TO USE EBIT AS A MEASURE OF EARNINGS.
THE FORMULA OF EPS USING NET INCOME AS A MEASURE OF EARNINGS IS:
\[ EPS_{t}=\frac{NETINCOME_{t}}{\#OFSHARES_{t}} \]
THE FORMULA FOR EPS USING EBIT AS A MEASURE OF EARNINGS IS:
\[ EPS_{t}=\frac{EBIT_{t}}{\#OFSHARES_{t}} \]
IN A HYPOTHETICAL SCENARIO, IF ALL THE EARNINGS OF PERIOD t WERE PAID OUT TO INVESTORS, EPS WOULD BE THE AMOUNT OF EARNINGS PAID FOR EACH SHARE OWNED BY INVESTORS.
IF WE WANT TO CALCULATE EPS FOR MORE THAN ONE FIRM, IT IS STRONGLY RECOMMENDED TO CALCULATE EARNINGS PER SHARE DEFLATED BY STOCK PRICE TO MAKE THE MEASURE COMPARABLE AMONG FIRMS.
THE PROBLEM IS THAT EPS IS NOT A STANDARDIZED MEASURE SINCE EACH FIRM HAS DIFFERENT LEVELS OF STOCK PRICES, AND THE STOCK PRICE DOES NOT TELL US ANYTHING ABOUT THE VALUE OF THE FIRM. WE NEED TO MULTIPLY STOCK PRICE TIMES THE # OF SHARES OUTSTANDING TO CALCULATE THE MARKET VALUE OF THE FIRM.
THEN, IF WE DIVIDE EPS BY THE STOCK PRICE, THEN WE GET THE AMOUNT OF EARNINGS THAT EACH $1.00 OF THE MARKET VALUE WOULD RECEIVE AS EARNINGS.
FOR EXAMPLE, IMAGINE THAT THERE ARE 2 FIRMS THAT COMPETE IN THE INFORMATION TECHNOLOGY INDUSTRY. BOTH ARE ABOUT THE SAME SIZE IN TERMS OF MARKET VALUE, BUT WITH DIFFERENT STOCK PRICES:
FIRM 1: STOCK PRICE = $10.00, # OF SHARES= 100 MILLION; MARKET VALUE = $1,000 MILLION
FIRM 2: STOCK PRICE = $100.00, # OF SHARES= 10 MILLION; MARKET VALUE = $1,000 MILLION
LET’S CALCULATE EPS FOR BOTH FIRMS USING EBIT FOR THE LAST PERIOD. IMAGINE THAT BOTH FIRMS HAD EXACTLY THE SAME EARNINGS, WITH EBIT = $100 MILLION:
FIRM 1: EPS = EBIT / # OF SHARES = $100 MILLION / 100 MILLION = $1.00 PER SHARE
FIRM 2: EPS = EBIT / # OF SHARES = $100 MILLION / 10 MILLION = $10.00 PER SHARE
FIRM 2 HAS AN EPS THAT IS 10 TIMES THE EPS OF FIRM 1! THIS COMPARISON IS MISLEADING SINCE BOTH FIRMS HAD THE SAME EARNINGS AND BOTH HAVE THE SAME MARKET VALUE!
LET’S NOW CALCULATE EARNINGS PER SHARE DEFLATED BY PRICE FOR EACH FIRM:
FIRM 1: EPSP = EPS / STOCK PRICE = $1.00 /$10.00 = $0.10
FIRM 2: EPSP = EPS / STOCK PRICE = $10.00 / $100.00 = $0.10
NOW IT MAKES SENSE THAT EPSP IS THE SAME FOR BOTH FIRMS!
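The arithmetic of this two-firm example can be checked in R. This is only a sketch with the hypothetical numbers of the example (amounts in millions); the column names are chosen for the illustration.

```r
# The two hypothetical firms from the example (amounts in millions)
firms <- data.frame(
  firm   = c("FIRM 1", "FIRM 2"),
  price  = c(10, 100),    # stock price in $
  shares = c(100, 10),    # millions of shares outstanding
  ebit   = c(100, 100)    # EBIT in $ millions
)

firms$marketvalue <- firms$price * firms$shares  # $ millions
firms$eps  <- firms$ebit / firms$shares          # $ of EBIT per share
firms$epsp <- firms$eps / firms$price            # earnings per $1.00 of stock price
firms
```

EPS differs by a factor of 10 between the firms, while EPSP is 0.10 for both, which matches the fact that both firms have the same earnings and the same market value.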
# Calculate eps
paneldata$eps <- paneldata$ebit / paneldata$sharesoutstanding
Generate a new variable called epsp, defined as earnings per share divided by stock price.
Winsorize this epsp variable and name the winsorized variable epsp_w. Find the best way to do this winsorization (on the left and/or on the right side of the distribution).
# Calculate epsp
paneldata$epsp <- paneldata$eps / paneldata$originalhistoricalstockprice
# Winsorize epsp
paneldata$epsp_w <- winsorize(paneldata$epsp, probs = c(0.02,0.98))
## 0.88 % observations replaced at the bottom
## 0.88 % observations replaced at the top
# Calculate cc returns for all stocks
paneldata$stockreturn <- diff(log(paneldata$adjustedstockprice))
# Keep only the observations of the last quarter of 2019
lastq <- as.data.frame(paneldata[(paneldata$quarter=="2019-10-01"),])
# Construct the regression model:
# first write the dependent variable (y), then all explanatory variables
reg1 <- lm(stockreturn ~ epsp_w + bmr_w, data = lastq)
s_reg1 <- summary(reg1)
s_reg1
##
## Call:
## lm(formula = stockreturn ~ epsp_w + bmr_w, data = lastq)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.42027 -0.07916 -0.01485 0.07058 0.62443
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.05782 0.02709 2.134 0.0351 *
## epsp_w -0.01946 0.14011 -0.139 0.8898
## bmr_w -0.03106 0.01915 -1.622 0.1077
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1584 on 110 degrees of freedom
## (39 observations deleted due to missingness)
## Multiple R-squared: 0.02496, Adjusted R-squared: 0.007232
## F-statistic: 1.408 on 2 and 110 DF, p-value: 0.249
INTERPRETATION:
USING DATA FOR THE LAST QUARTER OF 2019 FOR ACTIVE MEXICAN PUBLIC FIRMS (THE lastq DATA SET), AFTER CONSIDERING THE EFFECT OF BOOK-TO-MARKET RATIO (BMR) ON STOCK RETURNS, THE EFFECT OF EARNINGS PER SHARE DEFLATED BY PRICE (EPSP) IS NEGATIVE, BUT IT IS NOT SIGNIFICANT (P-VALUE=0.89). IN OTHER WORDS, THERE IS NO STATISTICAL EVIDENCE OF A RELATIONSHIP BETWEEN EPSP AND STOCK RETURNS.
AFTER CONSIDERING THE EFFECT OF EPSP, THE EFFECT OF BMR IS NEGATIVE AND ONLY MARGINALLY SIGNIFICANT (P-VALUE=0.108, JUST ABOVE THE 10% LEVEL). IN OTHER WORDS, THERE IS WEAK STATISTICAL EVIDENCE OF A NEGATIVE RELATIONSHIP BETWEEN BMR AND STOCK RETURNS. FOR EACH +1 UNIT CHANGE IN BMR, THE EXPECTED CHANGE IN QUARTERLY STOCK RETURN IS ABOUT -3.1 PERCENTAGE POINTS.
THE INTERPRETATION OF THE BETA0 COEFFICIENT IS THAT WHEN EPSP AND BMR ARE EQUAL TO ZERO, THE EXPECTED STOCK RETURN IS ABOUT 5.8%, AND IT IS SIGNIFICANTLY DIFFERENT FROM ZERO AT THE 5% LEVEL. THIS INTERPRETATION DOES NOT MAKE MUCH SENSE, SINCE THE ONLY WAY THAT BMR=0 IS THAT THE BOOK VALUE IS EQUAL TO ZERO (THIS CAN ONLY HAPPEN WHEN A FIRM IS GOING BANKRUPT). THEN, IN THE CASE OF MULTIPLE REGRESSION, WE CAN USE BETA0 FOR PREDICTION, AND IN MANY CASES THERE IS NO NEED TO PROVIDE A MEANINGFUL INTERPRETATION OF BETA0.
THE VALUES OF THE REGRESSION COEFFICIENTS ARE STORED IN THE s_reg1 R OBJECT, IN THE coefficients ELEMENT. WE CAN ACCESS THESE COEFFICIENTS AS FOLLOWS:
# THE MATRIX WHERE THE BETA COEFFICIENTS AND THEIR STANDARD ERRORS, T-VALUES AND P-VALUES
# ARE STORED IS s_reg1$coefficients:
s_reg1$coefficients
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.05781582 0.02709340 2.1339446 0.03506885
## epsp_w -0.01945593 0.14011492 -0.1388569 0.88981710
## bmr_w -0.03105855 0.01914788 -1.6220356 0.10765855
# WE CAN ACCESS THE BETA COEFFICIENTS AS FOLLOWS.
# BETA0 COEFFICIENT IS LOCATED IN THE ROW #1 AND COLUMN #1 [1,1] OF THE MATRIX
# coefficients:
BETA0 = s_reg1$coefficients[1,1]
# BETA1 IS LOCATED IN THE ROW #2 AND COLUMN #1 [2,1] OF THE MATRIX coefficients:
BETA1 = s_reg1$coefficients[2,1]
# BETA2 IS LOCATED IN THE ROW #3 AND COLUMN #1 [3,1] OF THE MATRIX coefficients:
BETA2 = s_reg1$coefficients[3,1]
NOW THAT WE HAVE THE VALUES OF THE BETA COEFFICIENTS, I CAN MANUALLY ESTIMATE THE EXPECTED VALUE OF STOCK RETURN WHEN EPSP_W = 0.05 AND BMR_W = 0.80:
EXPECTED STOCK RETURN = BETA0 + BETA1 * EPSP_W + BETA2 * BMR_W
SUBSTITUTING EPSP_W = 0.05 AND BMR_W = 0.80:
EXPECTED STOCK RETURN = BETA0 + BETA1 * 0.05 + BETA2 * 0.80
WE CAN DO THIS IN R AS FOLLOWS:
stockreturn_prediction = BETA0 + BETA1*0.05 + BETA2 * 0.80
# The prediction for stock return when EPSP_W=0.05 and BMR_W=0.80 is:
stockreturn_prediction
## [1] 0.03199618
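Written out with the coefficient estimates of the regression output (rounded to five decimals), the substitution is:

```latex
\[
\begin{aligned}
E[stockreturn] &= \beta_0 + \beta_1 (0.05) + \beta_2 (0.80) \\
               &\approx 0.05782 + (-0.01946)(0.05) + (-0.03106)(0.80) \\
               &\approx 0.05782 - 0.00097 - 0.02485 \\
               &\approx 0.0320
\end{aligned}
\]
```

That is, an expected quarterly return of about 3.2%, which agrees with the value printed by R up to rounding.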
I CAN ALSO GET THIS PREDICTION WITH THE predict.lm FUNCTION. FIRST I NEED TO CREATE A DATA FRAME WITH THE VALUES FOR THE 2 INDEPENDENT VARIABLES. I CAN DO THIS AS FOLLOWS:
new_x <- data.frame(epsp_w=c(0.05), bmr_w=c(0.8))
# new_x will have 0.05 for EPSP_W and 0.80 for BMR_W:
new_x
## epsp_w bmr_w
## 1 0.05 0.8
NOW I CAN RUN THE predict.lm FUNCTION WITH THE new_x DATA FRAME TO GET THE PREDICTION FOR STOCK RETURN:
stockreturn_prediction = predict.lm(reg1, newdata = new_x)
stockreturn_prediction
## 1
## 0.03199618
I GOT THE SAME PREDICTION AS THE PREVIOUS MANUAL PREDICTION.
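The equivalence between the manual calculation and predict() holds for any linear model. The sketch below shows it with simulated data (the variables x1, x2 and y and the coefficient values are made up for the illustration; they are not the workshop data).

```r
# Simulated data, only to illustrate the mechanics
set.seed(123)
n <- 100
toy <- data.frame(x1 = rnorm(n), x2 = runif(n))
toy$y <- 0.05 - 0.02 * toy$x1 - 0.03 * toy$x2 + rnorm(n, sd = 0.1)

fit <- lm(y ~ x1 + x2, data = toy)
b <- coef(fit)  # named vector: (Intercept), x1, x2

# Manual prediction: beta0 + beta1 * x1 + beta2 * x2
manual <- b["(Intercept)"] + b["x1"] * 0.05 + b["x2"] * 0.80

# Same prediction through predict(); the column names of newdata
# must match the names of the regressors in the model
auto <- predict(fit, newdata = data.frame(x1 = 0.05, x2 = 0.80))

all.equal(unname(manual), unname(auto))
```

Accessing the coefficients by name with coef() is an alternative to indexing the summary matrix by row and column, and it is less prone to picking the wrong row.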
I CAN ALSO CALCULATE THE PREDICTION AND THE 95% CONFIDENCE INTERVAL OF THE PREDICTION AS FOLLOWS:
pr_reg1 <- predict.lm(reg1, new_x, interval = "confidence")
pr_reg1
## fit lwr upr
## 1 0.03199618 -0.002305707 0.06629807
IN THE pr_reg1 OBJECT I HAVE THE PREDICTION (THE fit VALUE) AND ALSO THE LOWER (lwr) AND UPPER (upr) LIMITS OF THE 95% CONFIDENCE INTERVAL OF THE STOCK RETURN PREDICTION.
I CAN JOIN THE SPECIFIC VALUES OF THE INDEPENDENT VARIABLES WITH THE PREDICTION AND THE 95% C.I. OF THE PREDICTION:
# Join both objects in order to have a better perception
pr_reg1.df <- cbind(new_x, pr_reg1)
pr_reg1.df
## epsp_w bmr_w fit lwr upr
## 1 0.05 0.8 0.03199618 -0.002305707 0.06629807
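To see where the lwr and upr values come from, the following sketch rebuilds a 95% confidence interval by hand as the prediction plus and minus a t critical value times the standard error of the prediction. It uses simulated data (x1, x2 and y are made-up names and values), not the workshop data.

```r
# Simulated data, only to illustrate the mechanics
set.seed(42)
n <- 80
toy <- data.frame(x1 = rnorm(n), x2 = runif(n))
toy$y <- 0.05 - 0.02 * toy$x1 - 0.03 * toy$x2 + rnorm(n, sd = 0.1)
fit <- lm(y ~ x1 + x2, data = toy)
new_obs <- data.frame(x1 = 0.05, x2 = 0.80)

# Ask predict() for the standard error of the fitted mean as well
p <- predict(fit, newdata = new_obs, se.fit = TRUE)

# 95% C.I. = prediction +/- t critical value * standard error
t_crit <- qt(0.975, df = p$df)
by_hand <- c(fit = unname(p$fit),
             lwr = unname(p$fit - t_crit * p$se.fit),
             upr = unname(p$fit + t_crit * p$se.fit))

# Interval computed directly by predict()
built_in <- predict(fit, newdata = new_obs, interval = "confidence")

by_hand
built_in
```

Note that interval = "confidence" gives the interval for the expected (average) value of the dependent variable. Using interval = "prediction" instead gives the interval for a single new observation, which is always wider.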
# Construct the model using the whole paneldata object
reg2 <- lm(stockreturn ~ IPCrets + bmr_w, data = paneldata)
s_reg2 <- summary(reg2)
s_reg2
##
## Call:
## lm(formula = stockreturn ~ IPCrets + bmr_w, data = paneldata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.26380 -0.07630 -0.00448 0.07449 1.56551
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.032240 0.002972 10.85 <2e-16 ***
## IPCrets 0.830298 0.023419 35.45 <2e-16 ***
## bmr_w -0.032087 0.002315 -13.86 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1675 on 7096 degrees of freedom
## (5061 observations deleted due to missingness)
## Multiple R-squared: 0.1712, Adjusted R-squared: 0.171
## F-statistic: 733.1 on 2 and 7096 DF, p-value: < 2.2e-16
avg_market_return = mean(paneldata$IPCrets,na.rm=TRUE)
# The avg market return is:
avg_market_return
## [1] 0.02230853
new_x2 <- data.frame(bmr_w=0.8, IPCrets = avg_market_return)
# Now I do the same but including the 95% C.I. for the prediction:
pr_reg2 <- predict.lm(reg2, new_x2, interval = "confidence")
pr_reg2
## fit lwr upr
## 1 0.02509361 0.0211449 0.02904231
pr_reg2.df <- cbind(new_x2, pr_reg2)
pr_reg2.df
## bmr_w IPCrets fit lwr upr
## 1 0.8 0.02230853 0.02509361 0.0211449 0.02904231
Check the 95% CI of the predictions. Provide a brief INTERPRETATION of the output and the graph
I START CREATING A VECTOR WITH VALUES FOR THE DIFFERENT MARKET RETURNS:
IPCrets = seq(from=-0.02, to=0.02, by=0.01)
IPCrets
## [1] -0.02 -0.01 0.00 0.01 0.02
THE seq FUNCTION CREATES A SEQUENCE OF NUMBERS. IN THIS CASE, FROM -0.02 TO +0.02 JUMPING BY 0.01.
I CALCULATE THE MEDIAN OF BMR_W:
median_bmrw = median(paneldata$bmr_w,na.rm = TRUE)
median_bmrw
## [1] 0.6896754
I CALCULATED THE MEDIAN SINCE THE DISTRIBUTION OF BMRW IS NOT QUITE NORMAL. REMEMBER THAT FOR A VARIABLE THAT DOES NOT FOLLOW A NORMAL DISTRIBUTION, THE MEDIAN (THE 50TH PERCENTILE) IS USUALLY A BETTER MEASURE OF CENTRAL TENDENCY THAN THE MEAN, SINCE IT IS LESS AFFECTED BY EXTREME VALUES.
NOW I CREATE A DATA FRAME WITH A COLUMN WITH THE DIFFERENT VALUES OF MARKET RETURN (FROM -2% TO +2%), AND WITH A CONSTANT VALUE FOR BMRW EQUAL TO ITS MEDIAN:
new_x2b <- data.frame(IPCrets, bmr_w=median_bmrw)
new_x2b
## IPCrets bmr_w
## 1 -0.02 0.6896754
## 2 -0.01 0.6896754
## 3 0.00 0.6896754
## 4 0.01 0.6896754
## 5 0.02 0.6896754
USING THIS DATA FRAME WITH VALUES FOR IPCrets AND BMRW, NOW I ESTIMATE THE STOCK RETURN PREDICTIONS AND THEIR CORRESPONDING 95% C.I. ACCORDING TO THE REGRESSION MODEL:
# I get the predictions according to the regression model stored in reg2:
pr_reg2b <- predict.lm(reg2, new_x2b, interval = "confidence")
# I display the predictions for each combination of market return and bmrw:
pr_reg2b
## fit lwr upr
## 1 -0.006495126 -0.01091005 -0.002080198
## 2 0.001807850 -0.00244553 0.006061231
## 3 0.010110826 0.00597411 0.014247542
## 4 0.018413802 0.01434500 0.022482600
## 5 0.026716779 0.02266470 0.030768856
# I put together the values for the independent variables and the predictions:
pr_reg2b.df <- cbind(new_x2b, pr_reg2b)
# I change the name of the columns accordingly:
colnames(pr_reg2b.df) <- c("IPCrets", "bmr_w", "Stockreturn", "lwr", "upr")
pr_reg2b.df
## IPCrets bmr_w Stockreturn lwr upr
## 1 -0.02 0.6896754 -0.006495126 -0.01091005 -0.002080198
## 2 -0.01 0.6896754 0.001807850 -0.00244553 0.006061231
## 3 0.00 0.6896754 0.010110826 0.00597411 0.014247542
## 4 0.01 0.6896754 0.018413802 0.01434500 0.022482600
## 5 0.02 0.6896754 0.026716779 0.02266470 0.030768856
I FINALLY PLOT THE PREDICTIONS OF STOCK RETURNS AND THEIR 95% C.I. FOR THE DIFFERENT VALUES OF MARKET RETURN FROM -2% TO +2%:
library(ggplot2)
ggplot(pr_reg2b.df, aes(x = IPCrets, y=Stockreturn))+
geom_point(size = 2) + geom_line() +
geom_errorbar(aes(ymax = upr, ymin=lwr))
I CAN DO THE SAME USING THE prediction LIBRARY:
library(prediction)
pr_reg3.df = prediction(reg2, at = list(IPCrets=seq(from=-0.02, to=0.02, by=0.01), bmr_w=median_bmrw))
pr_reg3.df
## IPCrets bmr_w x
## -0.02 0.6897 -0.006495
## -0.01 0.6897 0.001808
## 0.00 0.6897 0.010111
## 0.01 0.6897 0.018414
## 0.02 0.6897 0.026717
I GOT THE SAME PREDICTION WITH THE prediction FUNCTION.
In the previous part we examined whether the market return and the BMR influence the stock return. We ran the models using contemporaneous values of the variables. What does this mean? We examined whether the BMR and the market return of a period influence the stock return in the SAME period. In multiple regression models in R, we can also examine non-contemporaneous (lagged or future) effects of the independent variables. For example, we can examine whether the BMR and the market return influence future stock returns (1 quarter or 4 quarters later). In Finance it is important to examine lagged effects of independent variables on the dependent variable. An important reason is that public firms usually release their financial statements 1 or 2 months after the end of the accounting period. For example, the financial statements for Q4 of 2020 may be released as late as February or March 2021.
Let’s work with an example.
Which are the lagged variables we can use in R? We can use the lag function and the plm function. The plm stands for panel-data linear model.
lag(variable, #) refers to the LAG number # of the variable. Lag # 1 refers to the value of the variable 1 period ago.
Examples:
lag(bmr_w,1) refers to the previous value of the winsorized book-to-market ratio in the dataset. If the dataset has quarters, then it is the bmr_w value 1 quarter ago.
lag(bmr,4) refers to the value of book-to-market ratio ONE year ago if the dataset has quarterly data.
If you want to go forward (ahead) instead, you can still use the lag function, but you need to specify a negative #. For example:
lag(bmr,-4) refers to the future value of book-to-market ratio ONE year in the future if the dataset has quarterly data.
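The idea of lags and leads within each firm can be illustrated with a small base R sketch. The helper shift is a hypothetical function written only for this illustration; with a pdata.frame, the lag function of the plm package takes care of this firm by firm. The two-firm panel below is made up.

```r
# Hypothetical two-firm quarterly panel, already sorted by firm and quarter
panel <- data.frame(
  firm    = rep(c("A", "B"), each = 4),
  quarter = rep(1:4, times = 2),
  bmr     = c(0.5, 0.6, 0.7, 0.8, 1.5, 1.4, 1.3, 1.2)
)

# Shift a vector k periods: k > 0 looks back (lag), k < 0 looks ahead (lead)
shift <- function(x, k) {
  n <- length(x)
  if (k >= 0) {
    c(rep(NA, k), x[seq_len(n - k)])
  } else {
    c(x[seq_len(n + k) - k], rep(NA, -k))
  }
}

# Apply the shift separately within each firm so values never cross firms
panel$bmr_lag1  <- ave(panel$bmr, panel$firm, FUN = function(x) shift(x, 1))
panel$bmr_lead1 <- ave(panel$bmr, panel$firm, FUN = function(x) shift(x, -1))
panel
```

The first quarter of each firm has no lagged value and the last quarter has no lead value, so those cells are missing. This is why the number of observations drops when lagged or future variables are used in a regression.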
Using the same panel dataset we used in the previous part, do the following models:
We will use the plm function from the plm package.
# Load the package
library(plm)
# Use the lag() function with -1 indicating to go forward 1 period
# Using a negative number shifts the variable forward (a lead) instead of backward
model1 <- plm(lag(stockreturn, -1) ~ IPCrets + bmr_w, data=paneldata, model="pooling")
s_model1 <- summary(model1)
s_model1
## Pooling Model
##
## Call:
## plm(formula = lag(stockreturn, -1) ~ IPCrets + bmr_w, data = paneldata,
## model = "pooling")
##
## Unbalanced Panel: n = 146, T = 1-78, N = 7039
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -1.3348414 -0.0774005 -0.0027878 0.0818477 1.4644899
##
## Coefficients:
## Estimate Std. Error t-value Pr(>|t|)
## (Intercept) 0.0017384 0.0032410 0.5364 0.5917
## IPCrets 0.2748376 0.0253327 10.8491 < 2.2e-16 ***
## bmr_w 0.0114935 0.0025282 4.5461 5.557e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 237.61
## Residual Sum of Squares: 233.09
## R-Squared: 0.019058
## Adj. R-Squared: 0.018779
## F-statistic: 68.3469 on 2 and 7036 DF, p-value: < 2.22e-16
This model can also be constructed as:
model1a <- plm(stockreturn ~ lag(IPCrets) + lag(bmr_w), data = paneldata, model="pooling")
summary(model1a)
## Pooling Model
##
## Call:
## plm(formula = stockreturn ~ lag(IPCrets) + lag(bmr_w), data = paneldata,
## model = "pooling")
##
## Unbalanced Panel: n = 146, T = 1-78, N = 7039
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -1.3348414 -0.0774005 -0.0027878 0.0818477 1.4644899
##
## Coefficients:
## Estimate Std. Error t-value Pr(>|t|)
## (Intercept) 0.0017384 0.0032410 0.5364 0.5917
## lag(IPCrets) 0.2748376 0.0253327 10.8491 < 2.2e-16 ***
## lag(bmr_w) 0.0114935 0.0025282 4.5461 5.557e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 237.61
## Residual Sum of Squares: 233.09
## R-Squared: 0.019058
## Adj. R-Squared: 0.018779
## F-statistic: 68.3469 on 2 and 7036 DF, p-value: < 2.22e-16
INTERPRETATION:
CONSIDERING QUARTERLY DATA OF MEXICAN FIRMS FROM 2000 TO 2019, THE EFFECT OF THE MARKET RETURN ON FUTURE STOCK RETURN 1 QUARTER LATER IS POSITIVE AND SIGNIFICANT, AFTER CONSIDERING THE EFFECT OF BOOK-TO-MARKET RATIO. FOR EACH +1 PERCENTAGE POINT MOVEMENT IN THE MARKET RETURN IN ONE QUARTER, THE STOCK RETURN ONE QUARTER LATER IS EXPECTED TO MOVE BY ABOUT +0.27 PERCENTAGE POINTS.
AFTER CONSIDERING THE EFFECT OF THE MARKET RETURN ON FUTURE STOCK RETURN ONE QUARTER LATER, THE EFFECT OF BOOK-TO-MARKET RATIO ON FUTURE STOCK RETURN IS POSITIVE AND SIGNIFICANT. FOR EACH +1 UNIT MOVEMENT OF BMRW IN A QUARTER, THE STOCK RETURN ONE QUARTER LATER IS EXPECTED TO MOVE BY ABOUT +1.15 PERCENTAGE POINTS.
model2 <- plm(lag(stockreturn, -4) ~ bmr_w + epsp_w, data = paneldata, model="pooling")
s_model2 <- summary(model2)
s_model2
## Pooling Model
##
## Call:
## plm(formula = lag(stockreturn, -4) ~ bmr_w + epsp_w, data = paneldata,
## model = "pooling")
##
## Unbalanced Panel: n = 126, T = 1-76, N = 4756
##
## Residuals:
## Min. 1st Qu. Median 3rd Qu. Max.
## -1.3128451 -0.0740470 -0.0028452 0.0789605 1.4889600
##
## Coefficients:
## Estimate Std. Error t-value Pr(>|t|)
## (Intercept) 0.0019973 0.0041118 0.4857 0.62717
## bmr_w 0.0089931 0.0035967 2.5004 0.01244 *
## epsp_w -0.0066547 0.0343372 -0.1938 0.84634
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Total Sum of Squares: 157.48
## Residual Sum of Squares: 157.26
## R-Squared: 0.0013965
## Adj. R-Squared: 0.00097633
## F-statistic: 3.3235 on 2 and 4753 DF, p-value: 0.03611
INTERPRETATION:
CONSIDERING THE EFFECT OF BMRW ON FUTURE STOCK RETURN ONE YEAR LATER, THE EFFECT OF EPSPW ON FUTURE STOCK RETURN ONE YEAR LATER IS NEGATIVE, BUT IT IS NOT SIGNIFICANT. IN OTHER WORDS, IT SEEMS THAT EARNINGS PER SHARE (DEFLATED BY PRICE) OF THE CURRENT QUARTER ARE NOT SIGNIFICANTLY RELATED TO STOCK RETURNS 1 YEAR LATER.
CONSIDERING THE EFFECT OF EPSPW ON FUTURE STOCK RETURN ONE YEAR LATER, THE EFFECT OF BMRW ON FUTURE STOCK RETURNS ONE YEAR LATER IS POSITIVE AND SIGNIFICANT. FOR EACH +1 UNIT MOVEMENT IN BMRW, THE STOCK RETURN ONE YEAR LATER IS EXPECTED TO MOVE BY ABOUT +0.90 PERCENTAGE POINTS. THIS SOUNDS WEIRD SINCE A HIGH VALUE OF BMRW IS USUALLY NOT A GOOD SIGN OF THE MARKET PERFORMANCE OF A STOCK, BUT IT MIGHT BE THAT FIRMS THAT HAVE HIGH VALUES OF BMRW IN ONE QUARTER IMPLEMENT ACTIONS SO THAT, WITHIN 1 YEAR, THEY SIGNIFICANTLY IMPROVE THEIR PERFORMANCE. THIS IS JUST A GUESS AT ONE POSSIBLE EXPLANATION OF THIS SIGNIFICANT RELATIONSHIP.
library(prediction)
newx_model2 <- data.frame(bmr_w = seq(from=0.6, to=1.6, by=0.1), epsp_w=mean(paneldata$epsp_w, na.rm=TRUE))
#pr1_model2_1 <- predict.lm(model2, newx_model2, interval = "confidence")
pr1_model2 <- prediction_summary(model=model2, at=newx_model2,level=0.95)
colnames(pr1_model2) <- c("bmr_w","epsp_w", "Predicted_return")
# Variance of each coefficient = its standard error squared:
var_b0 <- s_model2$coefficients[1,2]^2
var_b1 <- s_model2$coefficients[2,2]^2
var_b2 <- s_model2$coefficients[3,2]^2
# Rough term for the covariance among coefficients. NOTE: this is the sample variance of
# the three estimates, not the covariance between the coefficient estimators, so the SE
# below is only an approximation:
cov_coeff <- cov(matrix(c(s_model2$coefficients[1,1], s_model2$coefficients[2,1],
                          s_model2$coefficients[3,1])))
# Approximate standard error of each prediction:
pr1_model2$SE <- sqrt(var_b0 + pr1_model2$bmr_w^2*var_b1 +
                      pr1_model2$epsp_w^2*var_b2 + 2*cov_coeff)
## Warning in var_b0 + pr1_model2$bmr_w^2 * var_b1 + pr1_model2$epsp_w^2 * : Recycling array of length 1 in vector-array arithmetic is deprecated.
## Use c() or as.vector() instead.
# Approximate 95% C.I.: prediction +/- 2 standard errors
pr1_model2$lwr <- pr1_model2$Predicted_return - 2*pr1_model2$SE
pr1_model2$upr <- pr1_model2$Predicted_return + 2*pr1_model2$SE
pr1_model2
## bmr_w epsp_w Predicted_return NA NA NA NA NA SE lwr upr
## 0.6 0.06349 0.006971 NA NA NA NA NA 0.01221 -0.01746 0.03140
## 0.7 0.06349 0.007870 NA NA NA NA NA 0.01228 -0.01670 0.03244
## 0.8 0.06349 0.008769 NA NA NA NA NA 0.01236 -0.01596 0.03349
## 0.9 0.06349 0.009669 NA NA NA NA NA 0.01245 -0.01523 0.03457
## 1.0 0.06349 0.010568 NA NA NA NA NA 0.01255 -0.01453 0.03567
## 1.1 0.06349 0.011467 NA NA NA NA NA 0.01266 -0.01385 0.03678
## 1.2 0.06349 0.012366 NA NA NA NA NA 0.01277 -0.01318 0.03791
## 1.3 0.06349 0.013266 NA NA NA NA NA 0.01290 -0.01253 0.03907
## 1.4 0.06349 0.014165 NA NA NA NA NA 0.01303 -0.01190 0.04023
## 1.5 0.06349 0.015064 NA NA NA NA NA 0.01318 -0.01129 0.04142
## 1.6 0.06349 0.015964 NA NA NA NA NA 0.01333 -0.01069 0.04262
ggplot(pr1_model2, aes(x = bmr_w, y=Predicted_return))+
geom_point(size = 2) + geom_line() +
geom_errorbar(aes(ymax = upr, ymin=lwr))
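The standard errors computed above are an approximation. In general, the standard error of a predicted mean is the square root of x'Vx, where x is the vector of regressor values (with a leading 1 for the intercept) and V is the full variance-covariance matrix of the coefficients, so every pairwise covariance between coefficients enters the formula. The sketch below shows this calculation with lm on simulated data (x1, x2 and y are made-up names and values) and checks it against predict(). It is an assumption of this sketch, not something run in this workshop, that calling vcov(model2) on the plm model would provide the corresponding matrix V.

```r
# Simulated data, only to illustrate the calculation
set.seed(7)
n <- 120
toy <- data.frame(x1 = runif(n, 0.2, 2), x2 = rnorm(n, 0.06, 0.1))
toy$y <- 0.002 + 0.009 * toy$x1 - 0.007 * toy$x2 + rnorm(n, sd = 0.15)
fit <- lm(y ~ x1 + x2, data = toy)

# Grid of x1 values with x2 held at its sample mean
grid <- data.frame(x1 = seq(0.6, 1.6, by = 0.1), x2 = mean(toy$x2))

# Design matrix for the grid: a column of 1s plus the regressors
X <- cbind(1, grid$x1, grid$x2)
V <- vcov(fit)  # variance-covariance matrix of the coefficients

pred <- as.vector(X %*% coef(fit))               # predicted values
se   <- unname(sqrt(diag(X %*% V %*% t(X))))     # sqrt(x' V x) for each row

# Compare with the standard errors reported by predict()
check <- predict(fit, newdata = grid, se.fit = TRUE)
all.equal(se, unname(check$se.fit))
```

With these standard errors, the interval limits can be built as the prediction plus and minus roughly 2 standard errors, in the same way as the lwr and upr columns above.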