EXECUTIVE SUMMARY

This analysis uses Fama-French factor estimates on S&P 500 equities and then explores which model best captures volatility in such factors.

Because 12-month lagged returns on investment are highly predictable, we can use different models to predict volatility. The primary aim is to test Black’s theory that “negative returns are followed by larger increases in volatility than positive returns.”

The theory suggests that negative returns (a decrease in price) change a company’s debt/equity ratio, increasing its leverage. As seen in the final graphs, the best approach fits an ARMA(1,1) to returns and then a GARCH(1,1) or ARCH(12) to the residuals, capturing volatility over a 12-month period. This indicates that volatility in monthly returns can be predicted 12 months out.

This can be seen in the graph below, where the blue line shows realized variance in the stock market and the red line shows the model.

CONTENTS

Methodology

Monthly and daily returns to the investment (CMA) factor from 1963-07-01 to 2019-12-31 were taken from Kenneth French’s web page and selected as the independent variable. CMA is the last factor in the FF 5-factor model.

When examining any model that measures volatility, the widely accepted practice is to choose the most ‘parsimonious’ model, meaning one that needs few inputs. With the fewest inputs possible, the model is better able to predict accurately and is less prone to fitting noise.
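
One common way to make parsimony concrete, offered here as an illustration rather than as the selection rule used in this analysis, is the Akaike information criterion, which penalizes each additional estimated parameter. Here \(k\) is the number of estimated parameters and \(\hat{L}\) is the maximized likelihood:

\[
\mathrm{AIC} = 2k - 2\ln\hat{L}
\]

For the ARMA(1,1) fit reported later in this report, taking \(k = 4\) (ar1, ma1, intercept and the innovation variance) and a log likelihood of \(-1423.37\):

\[
\mathrm{AIC} = 2(4) - 2(-1423.37) = 2854.74
\]

This agrees with the reported aic = 2854.75 up to rounding of the log likelihood. Among candidate models fit to the same data, a lower AIC is preferred.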

For this analysis realized variance will be synonymous with realized volatility.

Models: For Measuring Monthly Volatility

ARIMA

An ARMA(1,1) is fit to the return series. The fit is used to estimate the monthly persistence of expected returns to CMA and the half-life of the expected return series in months.

Here the blue line represents the estimated variance \(\sigma_t^2\) of ARCH(12) and the red line represents the estimated variance \(\sigma_t^2\) of GARCH(1,1).

The estimated variance processes appear stationary, since their unconditional mean and variance do not appear to vary over time. This is also supported by the ADF test, which rejects the null hypothesis of nonstationarity for both processes.

## 
## Call:
## arima(x = dfm$CMA, order = c(1, 0, 1))
## 
## Coefficients:
##          ar1      ma1  intercept
##       0.4258  -0.3125     0.2737
## s.e.  0.2513   0.2639     0.0908
## 
## sigma^2 estimated as 3.899:  log likelihood = -1423.37,  aic = 2854.75
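
The monthly persistence and half-life of expected returns follow from the AR coefficient in the output above. As a sketch, assume the usual ARMA(1,1) form \(r_t = c + \phi r_{t-1} + \varepsilon_t + \theta \varepsilon_{t-1}\) (the symbols \(r_t\), \(c\), \(\phi\), \(\theta\) are notation introduced here), under which the expected return follows an AR(1) with coefficient \(\phi\). The persistence is then \(\phi = 0.4258\), and the half-life \(h\) solves \(\phi^h = 1/2\):

\[
h = \frac{\ln(0.5)}{\ln(\phi)} = \frac{\ln(0.5)}{\ln(0.4258)} \approx 0.81 \text{ months}
\]

The standard error on ar1 is 0.2513, so both the persistence and the implied half-life are imprecisely estimated.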

ARCH(12) and GARCH(1,1)

An ARCH(12) and a GARCH(1,1) process are estimated for the residuals from this ARMA(1,1).
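
For reference, the two variance specifications can be written as follows. This is the standard textbook form, assumed here to match the parameter names (omega, alpha, beta) in the estimation output, and it ignores the small estimated mean mu:

\[
\varepsilon_t = \sigma_t \eta_t, \qquad \eta_t \sim \text{i.i.d.}(0, 1)
\]

\[
\text{ARCH(12):} \quad \sigma_t^2 = \omega + \sum_{i=1}^{12} \alpha_i \varepsilon_{t-i}^2
\]

\[
\text{GARCH(1,1):} \quad \sigma_t^2 = \omega + \alpha_1 \varepsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2
\]

The GARCH(1,1) uses three variance parameters where the ARCH(12) uses thirteen, because the lagged variance term \(\beta_1 \sigma_{t-1}^2\) carries the effect of all earlier squared residuals.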

The models do a good job of accounting for volatility clustering because the absolute value of the normalized residuals \(\eta_t = \varepsilon_t/\sigma_t\) appears to be white noise. As seen in the final two autocorrelation graphs, there is no autocorrelation in \(|\eta_t|\), which further supports this conclusion.
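
The white-noise check on \(|\eta_t|\) can be sketched in base R. This is a minimal illustration on a simulated GARCH(1,1) series, not the CMA data: the parameter values are rounded from the GARCH(1,1) estimates in this report, the variable names are hypothetical, and the true simulated \(\sigma_t\) stands in for a fitted one.

```r
# Illustrative simulation only: parameters are rounded from the GARCH(1,1)
# table in this report; the series itself is synthetic, not the CMA data.
set.seed(1)
n     <- 2000
omega <- 0.164
alpha <- 0.148
beta  <- 0.811

z      <- rnorm(n)                        # i.i.d. standard normal shocks
sigma2 <- numeric(n)                      # conditional variance path
eps    <- numeric(n)                      # simulated residuals

sigma2[1] <- omega / (1 - alpha - beta)   # start at the unconditional variance
eps[1]    <- sqrt(sigma2[1]) * z[1]
for (t in 2:n) {
  sigma2[t] <- omega + alpha * eps[t - 1]^2 + beta * sigma2[t - 1]
  eps[t]    <- sqrt(sigma2[t]) * z[t]
}

eta <- eps / sqrt(sigma2)                 # normalized residuals

# First-order autocorrelation of |eps| versus |eta|
acf_eps <- acf(abs(eps), plot = FALSE)$acf[[2]]
acf_eta <- acf(abs(eta), plot = FALSE)$acf[[2]]

# Ljung-Box test for joint autocorrelation up to lag 12
lb_eps <- Box.test(abs(eps), lag = 12, type = "Ljung-Box")
lb_eta <- Box.test(abs(eta), lag = 12, type = "Ljung-Box")

print(c(acf_abs_eps = acf_eps, acf_abs_eta = acf_eta))
print(c(p_abs_eps = lb_eps$p.value, p_abs_eta = lb_eta$p.value))
```

If a model captures volatility clustering, the autocorrelation of the absolute normalized residuals should be close to zero and the Ljung-Box p-value should not be small, while the raw absolute residuals should show clear autocorrelation.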

Examining the first two graphs, we can see that the blue and red lines move together, indicating that volatility in monthly returns on investment should be modeled with an ARMA(1,1) followed by either a GARCH(1,1) or an ARCH(12). This suggests that volatility in monthly returns can be predicted 12 months out.

ARCH(12):

##             Estimate  Std. Error       t value     Pr(>|t|)
## mu      -0.006960401  0.06318379 -1.101612e-01 9.122815e-01
## omega    1.060035709  0.23981639  4.420197e+00 9.861091e-06
## alpha1   0.142833116  0.05039829  2.834087e+00 4.595686e-03
## alpha2   0.169528664  0.05521210  3.070498e+00 2.137019e-03
## alpha3   0.036143891  0.05062770  7.139154e-01 4.752795e-01
## alpha4   0.014270114  0.04466797  3.194709e-01 7.493695e-01
## alpha5   0.084051593  0.04785289  1.756458e+00 7.901026e-02
## alpha6   0.167005266  0.06086364  2.743925e+00 6.070938e-03
## alpha7   0.018541487  0.04609738  4.022244e-01 6.875189e-01
## alpha8   0.000000010  0.05734819  1.743734e-07 9.999999e-01
## alpha9   0.055421944  0.04606210  1.203201e+00 2.288987e-01
## alpha10  0.000000010  0.03805501  2.627775e-07 9.999998e-01
## alpha11  0.030656948  0.03724640  8.230848e-01 4.104598e-01
## alpha12  0.000000010  0.03808755  2.625530e-07 9.999998e-01

GARCH(1,1):

##            Estimate  Std. Error    t value     Pr(>|t|)
## mu     -0.006960401  0.06197135 -0.1123164 9.105725e-01
## omega   0.163941864  0.06377344  2.5706918 1.014956e-02
## alpha1  0.147764970  0.03016989  4.8977629 9.693392e-07
## beta1   0.811201475  0.03521324 23.0368341 0.000000e+00
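
As a worked check on the two tables above, the sum of the ARCH and GARCH coefficients measures how persistent variance shocks are. This assumes the standard specifications, in which the process is covariance stationary when that sum is below one and the unconditional variance is \(\omega\) divided by one minus the sum.

\[
\text{GARCH(1,1):} \quad \alpha_1 + \beta_1 = 0.1478 + 0.8112 = 0.9590, \qquad \frac{\omega}{1 - \alpha_1 - \beta_1} = \frac{0.1639}{0.0410} \approx 4.0
\]

\[
\text{ARCH(12):} \quad \sum_{i=1}^{12} \alpha_i \approx 0.7185, \qquad \frac{\omega}{1 - \sum_{i=1}^{12} \alpha_i} = \frac{1.0600}{0.2815} \approx 3.77
\]

Both sums are below one, and both implied unconditional variances are close to the ARMA(1,1) residual variance of 3.899. For the GARCH(1,1), the half-life of a variance shock is

\[
\frac{\ln(0.5)}{\ln(0.9590)} \approx 16.5 \text{ months}
\]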

## 
##  Augmented Dickey-Fuller Test
## 
## data:  fit.arch@sigma.t^2
## Dickey-Fuller = -4.4185, Lag order = 8, p-value = 0.01
## alternative hypothesis: stationary
## 
##  Augmented Dickey-Fuller Test
## 
## data:  fit.garch@sigma.t^2
## Dickey-Fuller = -3.8668, Lag order = 8, p-value = 0.01566
## alternative hypothesis: stationary

ARCH(12):

GARCH(1,1):

 

Models: Comparing Volatility in Model to Realized Variance

First order autocorrelation of \(RV_t\):

acf(newdfm$RV, plot = FALSE)$acf[[2]]
## [1] 0.6910482

First order autocorrelation of \(\varepsilon_t^2\):

acf(res^2, plot = FALSE)$acf[[2]]
## [1] 0.2937863

Correlation between \(RV_t\) and \(\varepsilon_t^2\):

cor(newdfm$RV, res^2)
## [1] 0.535606

Correlation between \(RV_t\) and \(\sigma_t^2\) from GARCH:

cor(newdfm$RV, fit.garch@sigma.t^2)
## [1] 0.6232504

Correlation between \(\varepsilon_t^2\) and \(\sigma_t^2\) from GARCH:

cor(res^2, fit.garch@sigma.t^2)
## [1] 0.4098962

ARMA(1,1) on \(RV_t\):

## 
## Call:
## arima(x = newdfm$RV, order = c(1, 0, 1))
## 
## Coefficients:
##          ar1      ma1  intercept
##       0.9220  -0.4976     2.7399
## s.e.  0.0181   0.0388     0.7367
## 
## sigma^2 estimated as 9.158:  log likelihood = -1713.29,  aic = 3434.59
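
As a rough check on predictability 12 months out, the AR coefficient in the output above implies how quickly forecasts of \(RV_t\) revert to the mean. This is a sketch under the standard ARMA(1,1) forecast recursion, with \(\phi\) denoting ar1 and \(\mu\) the unconditional mean (notation introduced here):

\[
E_t[RV_{t+k}] - \mu = \phi^{k-1} \left( E_t[RV_{t+1}] - \mu \right), \qquad k \geq 1
\]

With \(\phi = 0.9220\):

\[
\phi^{11} \approx 0.41, \qquad \frac{\ln(0.5)}{\ln(0.9220)} \approx 8.5 \text{ months}
\]

So roughly 41% of the one-month-ahead forecast deviation from the mean remains in the 12-month-ahead forecast, and the deviation halves about every 8.5 months.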

Correlation between \(v_t\) and \(RV_t\):

## [1] 0.7409121

Here the blue line represents \(v_t\) and the red line represents \(\sigma_t^2\) of GARCH(1,1).