LAM Chun Ming (58555905)
please find the full version here https://rpubs.com/TCML/1254698
library('TSA')
library('fGarch')
library('parallel')
library('rugarch')
library('quantmod')
library('tseries')
library('forecast')
library('MSwM')
getSymbols("VIX", from = '2006-01-01', to = '2024-10-22')
Warning: VIX contains missing values. Some functions will not work if objects contain missing values in the middle of the series. Consider using na.omit(), na.approx(), na.fill(), etc to remove or replace them.
[1] "VIX"
vix = as.numeric(VIX[,6])
vix = na.omit(vix)
dvix = diff(vix)
ARCH(p) model is given as follows \[ h_t=\alpha_0 + \sum_{i=1}^{p}\alpha_i\varepsilon_{t-i}^2 \\ \varepsilon_t =z_t \sqrt{h_t} , \space \space z_t \sim i.i.d(0,1) \] we can use the fact that we can always decompose \(\varepsilon_t^2 = h_t + v_t\) where \(v_t\) is some arbitrary error, and show that this is the same as AR(p) on the squared error (or the de-meaned process) \(\varepsilon_t^2\)
\[ \varepsilon_t^2 = \alpha_0 + \sum_{i=1}^p\alpha_i\varepsilon_{t-i}^2 +v_t \]
because, this the equivalent to the AR(p) process on the squared error term, we can run PACF on the squared error, as we would determined a AR process, following which we use an MLE method to fit the actual parameters
here we first find the de-meaned process by first fitting an ARIMA model
# uses EACF on the differenced VIX (to remove stationary)
eacf(dvix)
AR/MA
0 1 2 3 4 5 6 7 8 9 10 11 12 13
0 x o o o x o x o o o o o o o
1 x x o o x o o o o o o o o o
2 x o o o x o o o o o o o o o
3 x x o o o o x x o o o o o o
4 x x x x o o o o o o o o o o
5 x x o x x o o o o o o o o o
6 x x o x x x x o o o o o o o
7 x x o x x o x o o o o o o o
m_arima = arima(vix, order = c(3,1,2))
# checking model goodness of fit - cannot reject the Null Hypothesis at 5% level - i.e. there is no serial dependency in the lag
Box.test(m_arima$residuals,lag = 10)
Box-Pierce test
data: m_arima$residuals
X-squared = 13.222, df = 10, p-value = 0.2115
# taking residuals
res= m_arima$residuals
after fitting the ARIMA process, we fit an ARCH model, by considering it’s AR equivalence
pacf(res^2)
# PACF suggests AR of 3 <=> ARCH(3),
# fitting model
model_ARCH3 = garchFit(~garch(3,0), data = res , trace = F)
# here all values are significant
summary(model_ARCH3)
Title:
GARCH Modelling
Call:
garchFit(formula = ~garch(3, 0), data = res, trace = F)
Mean and Variance Equation:
data ~ garch(3, 0)
<environment: 0x000001bff65eb650>
[data = res]
Conditional Distribution:
norm
Coefficient(s):
mu omega alpha1 alpha2 alpha3
-2.1084e+01 9.2171e+04 6.9268e-02 1.5041e-01 2.4924e-01
Std. Errors:
based on Hessian
Error Analysis:
Estimate Std. Error t value Pr(>|t|)
mu -2.108e+01 1.178e+01 -1.790 0.073387 .
omega 9.217e+04 5.942e+03 15.512 < 2e-16 ***
alpha1 6.927e-02 3.124e-02 2.217 0.026603 *
alpha2 1.504e-01 4.881e-02 3.081 0.002060 **
alpha3 2.492e-01 6.832e-02 3.648 0.000264 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Log Likelihood:
-5742.805 normalized: -7.297083
Description:
Fri Dec 6 16:27:06 2024 by user: Toby Lam
Standardised Residuals Tests:
Statistic p-Value
Jarque-Bera Test R Chi^2 1.775535e+05 0.0000000
Shapiro-Wilk Test R W 6.888432e-01 0.0000000
Ljung-Box Test R Q(10) 1.193094e+01 0.2897054
Ljung-Box Test R Q(15) 1.381473e+01 0.5396245
Ljung-Box Test R Q(20) 1.664868e+01 0.6756541
Ljung-Box Test R^2 Q(10) 7.056540e-01 0.9999660
Ljung-Box Test R^2 Q(15) 7.393540e-01 1.0000000
Ljung-Box Test R^2 Q(20) 1.106019e+00 1.0000000
LM Arch Test R TR^2 5.042072e-01 0.9999997
Information Criterion Statistics:
AIC BIC SIC HQIC
14.60687 14.63653 14.60679 14.61828
fitted at the
\[ ARCH: h_t = 91270 + 0.06927 \varepsilon_{t-1}^2 + 0.1504\varepsilon_{t-2}^2 + 0.2492\varepsilon_{t-3}^2\\ AR: \varepsilon_t^2 = 91270 + 0.06927 \varepsilon_{t-1}^2 + 0.1504\varepsilon_{t-2}^2 + 0.2492\varepsilon_{t-3}^2 + v_t \\ \]
\[ \varepsilon_t = z_t \sqrt{h_t} \ \ s.t.: z_t \sim i.i.d(0,1)\\ h_t = \alpha_0 + \sum_{i=1}^m \alpha_i\varepsilon_{t-i}^2 + \sum_{j=1}^s \beta_j h_{t-j} \]
again decomposing \(\varepsilon_t^2 = h_t + v_t\),
\[ \varepsilon_t^2 = \alpha_0 + \sum_{i=1}^m \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^s \beta_j (\varepsilon_{t-j}^2-v_{t-j}) + v_t \]
by regrouping into \(\varepsilon_{t-j}^2 \ \& \ v_{t-j}\) , note that \(I\{\cdot\}\) is the indicator function
\[ \varepsilon_t^2 = \alpha_0 + \sum_{i=1}^m \tilde\alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^s \tilde\beta_j v_{t-j} + v_t\\ s.t. : \ \tilde\alpha_i = \alpha_iI\{i\leq m\} + \beta_i I\{i\leq s\}\\ \tilde \beta_j = -\beta_j \]
model = garchFit(~garch(1,1), data = res, trace = F)
summary(model)
Title:
GARCH Modelling
Call:
garchFit(formula = ~garch(1, 1), data = res, trace = F)
Mean and Variance Equation:
data ~ garch(1, 1)
<environment: 0x000001bff16e69c8>
[data = res]
Conditional Distribution:
norm
Coefficient(s):
mu omega alpha1 beta1
-1.3696e+01 1.2860e+04 6.8775e-02 8.2912e-01
Std. Errors:
based on Hessian
Error Analysis:
Estimate Std. Error t value Pr(>|t|)
mu -1.370e+01 1.131e+01 -1.211 0.226068
omega 1.286e+04 2.779e+03 4.628 3.69e-06 ***
alpha1 6.878e-02 1.817e-02 3.785 0.000153 ***
beta1 8.291e-01 3.446e-02 24.059 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Log Likelihood:
-5707.395 normalized: -7.25209
Description:
Fri Dec 6 16:43:44 2024 by user: Toby Lam
Standardised Residuals Tests:
Statistic p-Value
Jarque-Bera Test R Chi^2 3.000255e+05 0.0000000
Shapiro-Wilk Test R W 6.927818e-01 0.0000000
Ljung-Box Test R Q(10) 1.143145e+01 0.3249062
Ljung-Box Test R Q(15) 1.325227e+01 0.5828200
Ljung-Box Test R Q(20) 1.599566e+01 0.7168933
Ljung-Box Test R^2 Q(10) 5.349208e-02 1.0000000
Ljung-Box Test R^2 Q(15) 8.769032e-02 1.0000000
Ljung-Box Test R^2 Q(20) 1.235536e-01 1.0000000
LM Arch Test R TR^2 7.811317e-02 1.0000000
Information Criterion Statistics:
AIC BIC SIC HQIC
14.51435 14.53807 14.51429 14.52347
fitted model
\[ h_t = 12860 + 0.06878 \varepsilon_{t-1}^2 + 0.8291 h_{t-1} \]
model = garchFit(~garch(1,1), data = res, cond.dist = 'std', trace = F)
summary(model)
Title:
GARCH Modelling
Call:
garchFit(formula = ~garch(1, 1), data = res, cond.dist = "std",
trace = F)
Mean and Variance Equation:
data ~ garch(1, 1)
<environment: 0x000001bfe7628170>
[data = res]
Conditional Distribution:
std
Coefficient(s):
mu omega alpha1 beta1 shape
-1.12649 28.30229 0.63266 0.72633 2.69061
Std. Errors:
based on Hessian
Error Analysis:
Estimate Std. Error t value Pr(>|t|)
mu -1.12649 2.41367 -0.467 0.64071
omega 28.30229 141.73943 0.200 0.84173
alpha1 0.63266 0.19732 3.206 0.00134 **
beta1 0.72633 0.02888 25.152 < 2e-16 ***
shape 2.69061 0.28504 9.439 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Log Likelihood:
-5404.256 normalized: -6.866907
Description:
Fri Dec 6 16:45:13 2024 by user: Toby Lam
Standardised Residuals Tests:
Statistic p-Value
Jarque-Bera Test R Chi^2 1.976041e+07 0
Shapiro-Wilk Test R W 3.459989e-02 0
Ljung-Box Test R Q(10) 5.426508e-02 1
Ljung-Box Test R Q(15) 5.818304e-02 1
Ljung-Box Test R Q(20) 8.217111e-02 1
Ljung-Box Test R^2 Q(10) 1.324736e-02 1
Ljung-Box Test R^2 Q(15) 2.005019e-02 1
Ljung-Box Test R^2 Q(20) 2.694674e-02 1
LM Arch Test R TR^2 1.598610e-02 1
Information Criterion Statistics:
AIC BIC SIC HQIC
13.74652 13.77618 13.74644 13.75792
the fitted model \[ h_t = 28.30 + 0.6327\varepsilon_{t-1}^2 + 0.7263 h_{t-1} \]
following work done in part 1a)
# test for stationarity in vix data ; we cannot reject the Null Hypothesis data is non stationary
adf.test(vix)
Augmented Dickey-Fuller Test
data: vix
Dickey-Fuller = -1.8035, Lag order = 9, p-value = 0.6615
alternative hypothesis: stationary
# differencing to remove stationarity, test again; we can reject the Null Hypothesis - i.e. the data is stationary
adf.test(dvix)
Warning in adf.test(dvix) : p-value smaller than printed p-value
Augmented Dickey-Fuller Test
data: dvix
Dickey-Fuller = -8.9308, Lag order = 9, p-value = 0.01
alternative hypothesis: stationary
# use EACF to select for model - this produces candidate model ARMA(0,1) & ARMA(2,1)
eacf(dvix)
AR/MA
0 1 2 3 4 5 6 7 8 9 10 11 12 13
0 x o o o x o x o o o o o o o
1 x x o o x o o o o o o o o o
2 x o o o x o o o o o o o o o
3 x x o o o o x x o o o o o o
4 x x x x o o o o o o o o o o
5 x x o x x o o o o o o o o o
6 x x o x x x x o o o o o o o
7 x x o x x o x o o o o o o o
# try to fit for ARMA(0,1)
m1 = arima(vix, order = c(0,1,1))
# test for validity - can reject the null hypothesis at 5% - there is serial dependencies in the residual
Box.test(m1$residuals,lag = 10)
Box-Pierce test
data: m1$residuals
X-squared = 18.194, df = 10, p-value = 0.05178
# try second candidate
m2 = arima(vix, order = c(2,1,1))
# test for validity - can reject the null hypothesis at 5% - there is serial dependencies in the residual
Box.test(m2$residuals,lag = 10)
Box-Pierce test
data: m2$residuals
X-squared = 20.766, df = 10, p-value = 0.02279
# try third candidate
m3 = arima(vix, order = c(3,1,2))
# test for validity - cannot reject the null hypothesis at 5% - there is no serial dependencies in the residual
Box.test(m3$residuals,lag = 10)
Box-Pierce test
data: m3$residuals
X-squared = 13.222, df = 10, p-value = 0.2115
# accept the ARMA(3,2) on the differenced data
first fit the model using an MLE method
model = garchFit(~arma(3,2)+garch(1,1), data = dvix, trace = F)
summary(model)
Title:
GARCH Modelling
Call:
garchFit(formula = ~arma(3, 2) + garch(1, 1), data = dvix, trace = F)
Mean and Variance Equation:
data ~ arma(3, 2) + garch(1, 1)
<environment: 0x000001bff358d200>
[data = dvix]
Conditional Distribution:
norm
Coefficient(s):
mu ar1 ar2 ar3 ma1 ma2 omega alpha1 beta1
-4.6882e+00 9.6053e-01 -4.1495e-01 7.9452e-02 -1.0000e+00 3.4651e-01 1.5574e+04 9.4341e-02 7.8623e-01
Std. Errors:
based on Hessian
Error Analysis:
Estimate Std. Error t value Pr(>|t|)
mu -4.688e+00 4.010e+00 -1.169 0.2423
ar1 9.605e-01 2.276e-01 4.220 2.44e-05 ***
ar2 -4.149e-01 1.951e-01 -2.127 0.0334 *
ar3 7.945e-02 5.023e-02 1.582 0.1137
ma1 -1.000e+00 2.258e-01 -4.428 9.49e-06 ***
ma2 3.465e-01 1.849e-01 1.874 0.0610 .
omega 1.557e+04 2.951e+03 5.277 1.31e-07 ***
alpha1 9.434e-02 2.328e-02 4.053 5.05e-05 ***
beta1 7.862e-01 3.796e-02 20.714 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Log Likelihood:
-5672.15 normalized: -7.216475
Description:
Fri Dec 6 16:45:58 2024 by user: Toby Lam
Standardised Residuals Tests:
Statistic p-Value
Jarque-Bera Test R Chi^2 3.863743e+05 0.0000000
Shapiro-Wilk Test R W 6.770924e-01 0.0000000
Ljung-Box Test R Q(10) 7.165589e+00 0.7097254
Ljung-Box Test R Q(15) 9.637014e+00 0.8419305
Ljung-Box Test R Q(20) 1.156437e+01 0.9302548
Ljung-Box Test R^2 Q(10) 7.277833e-02 1.0000000
Ljung-Box Test R^2 Q(15) 1.149614e-01 1.0000000
Ljung-Box Test R^2 Q(20) 2.076563e-01 1.0000000
LM Arch Test R TR^2 1.078846e-01 1.0000000
Information Criterion Statistics:
AIC BIC SIC HQIC
14.45585 14.50929 14.45559 14.47640
fitted model
\[ X_t = 0.9605 X_{t-1} -0.4149 X_{t-2} + 0.07945 X_{t-3} - \varepsilon_{t-1} + 0.3465 \varepsilon_{t-2}+\varepsilon_t\\ h_t = 15570 + 0.09434 \varepsilon_{t-1}^2 + 0.7862 h_{t-1} \]
here the p value is 0.1105, the sample statistic has a 11.05% probability to be observed given an assumed Null Hypothesis. therefore we cannot reject the Null Hypothesis at the 5% level. That is to say, given our sample, we do not see sufficient statistical evidence that the population daily log percentage return is different from 0.
getSymbols("WMT",from = "2018-01-02", to = "2024-11-01")
[1] "WMT"
# find the log percentage returns
wmt = WMT$WMT.Close
lret_wmt = log(wmt/lag(wmt))*100
# format data
lret_wmt = na.omit(lret_wmt)
lret_wmt = as.numeric(lret_wmt)
# find the mean
mean(lret_wmt)
[1] 0.053156
# conduct t test
t.test(lret_wmt)
One Sample t-test
data: lret_wmt
t = 1.5966, df = 1718, p-value = 0.1105
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-0.01214204 0.11845404
sample estimates:
mean of x
0.053156
Taking the Ljung Box test shows that, we can reject the Null Hypothesis for both the log returns and the log squared returns. that is, there is serial dependencies in both time series
Box.test(lret_wmt, lag = 10, type = "Ljung")
Box-Ljung test
data: lret_wmt
X-squared = 39.364, df = 10, p-value = 2.192e-05
Box.test(lret_wmt^2, lag = 10, type = "Ljung")
Box-Ljung test
data: lret_wmt^2
X-squared = 278.83, df = 10, p-value < 2.2e-16
the model seems adequate - the QQ plot shows reasonable good fit for most of the central quantile data points, Ljung Box test on the residuals also shows no serial dependencies in the residual
# use EACF
eacf(lret_wmt)
AR/MA
0 1 2 3 4 5 6 7 8 9 10 11 12 13
0 x o o x o o o x x o o o o o
1 o o o x o o o o x o o o o o
2 x o o x o o o o x o o o o o
3 x o x x o o o o x o o o o o
4 x x x x o o o o x o o o o o
5 x x o x x o o x x o o o o o
6 x x o x x x o o o o o o o o
7 x x x x x x x x o o o o o o
# candidate ARMA(1,0) - LB test sufficient: we can reject the null hypothesis at 5% - i.e. there is serial dependencies in the residual
m1 = arima(lret_wmt, c(1,0,0))
Box.test(m1$residuals, lag = 10 , type = "Ljung")
Box-Ljung test
data: m1$residuals
X-squared = 27.644, df = 10, p-value = 0.002058
# candidate ARMA(3,1) - LB test sufficient: we can reject the null hypothesis at 5% - i.e. there is serial dependencies in the residual
m2 = arima(lret_wmt, c(3,0,1))
Box.test(m2$residuals, lag = 10 , type = "Ljung")
Box-Ljung test
data: m2$residuals
X-squared = 24.656, df = 10, p-value = 0.006038
# candidate ARMA(5,2)
m3 = arima(lret_wmt, c(5,0,2))
Box.test(m3$residuals, lag = 10 , type = "Ljung")
Box-Ljung test
data: m3$residuals
X-squared = 13.308, df = 10, p-value = 0.2069
#fit ARMA - GARCH model
model_C1 = garchFit(~arma(5,2)+garch(1,1),lret_wmt,trace = F)
Warning in arima(.series$x, order = c(u, 0, v), include.mean = include.mean) :
possible convergence problem: optim gave code = 1
summary(model_C1)
Title:
GARCH Modelling
Call:
garchFit(formula = ~arma(5, 2) + garch(1, 1), data = lret_wmt,
trace = F)
Mean and Variance Equation:
data ~ arma(5, 2) + garch(1, 1)
<environment: 0x000001bff6582df0>
[data = lret_wmt]
Conditional Distribution:
norm
Coefficient(s):
mu ar1 ar2 ar3 ar4 ar5 ma1 ma2 omega alpha1 beta1
0.1290380 -0.3900016 -0.7518651 -0.0465791 -0.0684031 0.0069665 0.3222041 0.6914670 0.2000378 0.1174106 0.7797547
Std. Errors:
based on Hessian
Error Analysis:
Estimate Std. Error t value Pr(>|t|)
mu 0.129038 0.061355 2.103 0.0355 *
ar1 -0.390002 0.180382 -2.162 0.0306 *
ar2 -0.751865 0.181482 -4.143 3.43e-05 ***
ar3 -0.046579 0.039658 -1.175 0.2402
ar4 -0.068403 0.030796 -2.221 0.0263 *
ar5 0.006966 0.031635 0.220 0.8257
ma1 0.322204 0.177503 1.815 0.0695 .
ma2 0.691467 0.175721 3.935 8.32e-05 ***
omega 0.200038 0.043933 4.553 5.28e-06 ***
alpha1 0.117411 0.023050 5.094 3.51e-07 ***
beta1 0.779755 0.037184 20.970 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Log Likelihood:
-2848.271 normalized: -1.656935
Description:
Fri Dec 6 16:49:17 2024 by user: Toby Lam
Standardised Residuals Tests:
Statistic p-Value
Jarque-Bera Test R Chi^2 1.801635e+04 0.0000000
Shapiro-Wilk Test R W 8.817975e-01 0.0000000
Ljung-Box Test R Q(10) 4.577784e+00 0.9175425
Ljung-Box Test R Q(15) 5.656693e+00 0.9849556
Ljung-Box Test R Q(20) 9.070822e+00 0.9820724
Ljung-Box Test R^2 Q(10) 1.597135e+00 0.9985996
Ljung-Box Test R^2 Q(15) 2.281165e+00 0.9999296
Ljung-Box Test R^2 Q(20) 2.745289e+00 0.9999981
LM Arch Test R TR^2 1.699656e+00 0.9997460
Information Criterion Statistics:
AIC BIC SIC HQIC
3.326667 3.361539 3.326586 3.339570
plot(model_C1,which = 13)
fitted model
\[ X_t = 0.1290 - 0.3900 X_{t-1} -0.7519 X_{t-2} -0.04658 X_{t-3} -0.06840 X_{t-4} +0.006966 X_{t-5} + 0.3222\varepsilon_{t-1} + 0.6915 \varepsilon_{t-2}\\ h_t = 0.2000 + 0.1174 \varepsilon_{t-1}^2 +0.7798 h_{t-1} \]
fitting the IGARCH(1,1) model to the time series \(z_t\)
zret = lret_wmt - mean(lret_wmt)
spec1=ugarchspec(variance.model=list(model="iGARCH",garchOrder=c(1,1)),mean.model=list(armaOrder=c(0,0)))
mm=ugarchfit(zret,spec=spec1)
mm
*---------------------------------*
* GARCH Model Fit *
*---------------------------------*
Conditional Variance Dynamics
-----------------------------------
GARCH Model : iGARCH(1,1)
Mean Model : ARFIMA(0,0,0)
Distribution : norm
Optimal Parameters
------------------------------------
Estimate Std. Error t value Pr(>|t|)
mu 0.000000 0.029637 0.0000 1.0e+00
omega 0.007722 0.001926 4.0087 6.1e-05
alpha1 0.044440 0.006565 6.7697 0.0e+00
beta1 0.955560 NA NA NA
Robust Standard Errors:
Estimate Std. Error t value Pr(>|t|)
mu 0.000000 0.040574 0.0000 1.000000
omega 0.007722 0.005223 1.4785 0.139270
alpha1 0.044440 0.025872 1.7177 0.085847
beta1 0.955560 NA NA NA
LogLikelihood : -2901.029
Information Criteria
------------------------------------
Akaike 3.3787
Bayes 3.3883
Shibata 3.3787
Hannan-Quinn 3.3823
Weighted Ljung-Box Test on Standardized Residuals
------------------------------------
statistic p-value
Lag[1] 2.537 0.1112
Lag[2*(p+q)+(p+q)-1][2] 2.643 0.1754
Lag[4*(p+q)+(p+q)-1][5] 3.876 0.2699
d.o.f=0
H0 : No serial correlation
Weighted Ljung-Box Test on Standardized Squared Residuals
------------------------------------
statistic p-value
Lag[1] 0.7211 0.3958
Lag[2*(p+q)+(p+q)-1][5] 0.8096 0.9011
Lag[4*(p+q)+(p+q)-1][9] 1.1217 0.9806
d.o.f=2
Weighted ARCH LM Tests
------------------------------------
Statistic Shape Scale P-Value
ARCH Lag[3] 0.003869 0.500 2.000 0.9504
ARCH Lag[5] 0.123037 1.440 1.667 0.9824
ARCH Lag[7] 0.314495 2.315 1.543 0.9920
Nyblom stability test
------------------------------------
Joint Statistic: 1.2822
Individual Statistics:
mu 0.03649
omega 0.26378
alpha1 0.18205
Asymptotic Critical Values (10% 5% 1%)
Joint Statistic: 0.846 1.01 1.35
Individual Statistic: 0.35 0.47 0.75
Sign Bias Test
------------------------------------
Adjusted Pearson Goodness-of-Fit Test:
------------------------------------
group statistic p-value(g-1)
1 20 133.5 3.218e-19
2 30 145.2 2.071e-17
3 40 154.1 1.303e-15
4 50 172.4 1.204e-15
Elapsed time : 0.0635159
the fitted model \[ h_t = 0.007722 + 0.04444 \varepsilon_t^2 + 0.9556 h_{t-1} \]
under the Ljung Box test, we cannot reject the Null Hypothesis at the 5% level - that is there is no serial correlation in the standardized residuals
zstd = as.numeric(sigma(mm))
sres_z = zret/zstd
Box.test(sres_z,lag = 10, type = "Ljung")
Box-Ljung test
data: sres_z
X-squared = 11.714, df = 10, p-value = 0.3046
getSymbols('^GSPC', from = '2018-01-02', to = '2024-11-01')
[1] "GSPC"
lret_wmt = log(WMT$WMT.Close/lag(WMT$WMT.Close))*100
lret_spx =log(GSPC$GSPC.Close/lag(GSPC$GSPC.Close))*100
lret_comb = merge(lret_spx,lret_wmt)
lret_comb = na.omit(lret_comb)
lret_wmt = as.numeric(lret_comb$WMT.Close)
lret_spx = as.numeric(lret_comb$GSPC.Close)
m_capm = lm(lret_wmt~lret_spx)
summary(m_capm)
Call:
lm(formula = lret_wmt ~ lret_spx)
Residuals:
Min 1Q Median 3Q Max
-13.0862 -0.5708 -0.0232 0.5513 8.5045
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.03180 0.02983 1.066 0.287
lret_spx 0.48974 0.02376 20.613 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.236 on 1717 degrees of freedom
Multiple R-squared: 0.1984, Adjusted R-squared: 0.1979
F-statistic: 424.9 on 1 and 1717 DF, p-value: < 2.2e-16
fitted model \[ X_t = 0.03180 + 0.4897R_{m,t}+\varepsilon_t \]
we let the null hypothesis be that the sensitivity to market returns \(\beta\) (how much percentage point movement of the market returns affect percentage point movement of the stock return), is 0. whilst the alternative hypothesis is that it is not 0
as the p value for the \(\beta\) from the t-test is 2e-16. it means that we can reject the Null Hypothesis at the 5% level- that is to say there is statistical evidence, that the return of the stock is sensitive to the market returns
# create dummy variable c1
c1 = rep(0,length(lret_spx))
idx=c(1:length(lret_spx))[lret_spx>0]
c1[idx] = 1
# create dummy variable nsp
lret_nsp = c1*lret_spx
we let the null hypothesis be that the alternative abnormal return in a positive market regime \(\tilde\alpha\) is 0. whilst the alternative hypothesis for is that it is not 0
as the p value for the \(\tilde{\alpha}\) from the t-test is 0.0987. it means that we cannot reject the Null Hypothesis at the 5% level- that is to say there is not sufficient statistical evidence, that there exists an alternative abnormal return in stock returns in a positive market regime
m_tcapm1 = lm(lret_wmt~c1+lret_spx)
summary(m_tcapm1)
Call:
lm(formula = lret_wmt ~ c1 + lret_spx)
Residuals:
Min 1Q Median 3Q Max
-13.0930 -0.5732 -0.0111 0.5641 8.5389
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.10143 0.05163 1.964 0.0496 *
c1 -0.13105 0.07933 -1.652 0.0987 .
lret_spx 0.52390 0.03149 16.638 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.236 on 1716 degrees of freedom
Multiple R-squared: 0.1997, Adjusted R-squared: 0.1987
F-statistic: 214 on 2 and 1716 DF, p-value: < 2.2e-16
fitted model at 5% level of significance, \(I\{\cdot\}\) is the indicator variable
\[ X_t = 0.1014 -0.1311 I\{R_{m,t} > 0\} + 0.5239 R_{m,t} + \varepsilon_t \]
we let the null hypothesis be that the alternative - sensitivity to market returns in a positive market regime \(\tilde{\beta}\) (how much percentage point movement of the market returns affect percentage point movement of the stock return, given market return is positive), whilst the alternative hypothesis is that it is not 0
as the p value for the \(\tilde{\beta}\) from the t-test is 0.612. it means that we cannot reject the Null Hypothesis at the 5% level- that is to say there is not sufficient statistical evidence, that there exists an alternative sensitivity of stock returns to market returns during a positive market regime
m_tcapm1 = lm(lret_wmt~lret_nsp+lret_spx)
summary(m_tcapm1)
Call:
lm(formula = lret_wmt ~ lret_nsp + lret_spx)
Residuals:
Min 1Q Median 3Q Max
-13.1081 -0.5760 -0.0210 0.5527 8.5039
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.01844 0.03979 0.463 0.643
lret_nsp 0.03221 0.06351 0.507 0.612
lret_spx 0.47517 0.03728 12.747 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.236 on 1716 degrees of freedom
Multiple R-squared: 0.1985, Adjusted R-squared: 0.1976
F-statistic: 212.5 on 2 and 1716 DF, p-value: < 2.2e-16
fitted model \[ X_t = 0.4752 R_{m,t} + 0.03221 R_{m,t} I \{R_{m,t} > 0\}+ \varepsilon_t \]
we let the null hypothesis be that the alternative abnormal return in a positive market regime \(\tilde\alpha\) is 0. whilst the alternative hypothesis for is that it is not 0.
similarly we let the null hypothesis be that the alternative - sensitivity to market returns in a positive market regime \(\tilde{\beta}\) (how much percentage point movement of the market returns affect percentage point movement of the stock return, given market return is positive), whilst the alternative hypothesis is that it is not 0
as the p value for the \(\tilde{\alpha}\) from the t-test is 0.0906. it means that we cannot reject the Null Hypothesis at the 5% level- that is to say there is not sufficient statistical evidence, that there exists an alternative abnormal return in stock returns in a positive market regime
as the p value for the \(\tilde{\beta}\) from the t-test is 0.5289. it means that we cannot reject the Null Hypothesis at the 5% level- that is to say there is not sufficient statistical evidence, that there exists an alternative sensitivity of stock returns to market returns during a positive market regime
m_tcapm1 = lm(lret_wmt~c1+lret_nsp+lret_spx)
summary(m_tcapm1)
Call:
lm(formula = lret_wmt ~ c1 + lret_nsp + lret_spx)
Residuals:
Min 1Q Median 3Q Max
-13.1205 -0.5729 -0.0119 0.5641 8.5392
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.08676 0.05665 1.531 0.1259
c1 -0.13471 0.07955 -1.693 0.0906 .
lret_nsp 0.04009 0.06365 0.630 0.5289
lret_spx 0.50673 0.04166 12.164 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.236 on 1715 degrees of freedom
Multiple R-squared: 0.1998, Adjusted R-squared: 0.1984
F-statistic: 142.8 on 3 and 1715 DF, p-value: < 2.2e-16
fitted model \[ X_t = 0.08676 -0.1347 I\{R_{m,t}>0\} + 0.5067 R_{m,t} + 0.04009 R_{m,t} I\{R_{m,t}>0\}+ \varepsilon_t \]
the markov chain regime switching model (MCSM), supposes that there is a set of unobservable states in the system, in which the behaviors of the time series is different under. these states jump to other states in the markov chain, as describe by the markov transition probability
there is no need to assume or specific prior states like we have to in threshold models, the states can also be unobservant unlike threshold models
#Model with only intercept
mod<-lm(lret_wmt~lret_nsp + lret_spx)
#fit an constant model
msm_model = msmFit(mod, k=2, sw=c(T,T,T,T))
summary(msm_model)
Markov Switching Model
Call: msmFit(object = mod, k = 2, sw = c(T, T, T, T))
Coefficients:
Regime 1
---------
Estimate Std. Error t value Pr(>|t|)
(Intercept)(S) -0.0404 0.2919 -0.1384 0.889924
lret_nsp(S) 0.2564 0.2415 1.0617 0.288372
lret_spx(S) 0.4423 0.1523 2.9041 0.003683 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.381196
Multiple R-squared: 0.1869
Standardized Residuals:
Min Q1 Med Q3 Max
-13.431221722 -0.052082713 -0.002405205 0.043898711 8.411864474
Regime 2
---------
Estimate Std. Error t value Pr(>|t|)
(Intercept)(S) 0.0915 0.0334 2.7395 0.006153 **
lret_nsp(S) -0.1876 0.0688 -2.7267 0.006397 **
lret_spx(S) 0.5346 0.0412 12.9757 < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.8499172
Multiple R-squared: 0.2297
Standardized Residuals:
Min Q1 Med Q3 Max
-2.225361e+00 -5.124874e-01 -8.764533e-05 4.986231e-01 2.312729e+00
Transition probabilities:
Regime 1 Regime 2
Regime 1 0.6915957 0.02446478
Regime 2 0.3084043 0.97553522
fitted model at 5%
\[ S_1 : X_t = -0.0404 +0.4423 R_{m,t} +0.2564 R_{m,t} I\{R_{m,t}> 0\} + \varepsilon_{t,1}\\ S_2 : X_t = 0.0915 + 0.5346 R_{m,t} -0.1867 R_{m,t} I\{R_{m,t}> 0\} + \varepsilon_{t,2}\\ \]
\(\alpha_1 \ \& \ \alpha_2\) represent the abnormal returns in the different states, \(\beta_1 \ \& \ \beta_2\) represents stock return’s normal sensitivity to the market returns. If \(\gamma_i \neq 0\) then \(\beta_i\) can be interpreted as the sensitivity during a negative market regime. \(\sigma_1^2 \ \& \ \sigma_2^2\) represents the level of innovation or return shocks in the two different periods. Sometimes it is also possible to interpret the states for example in state 1 the constant term \(\mu\) is negative which suggests a market downturn state.
Yes, this is observed in state 2, where there is a different \(\beta\) for positive and negative market returns
the columns of the matrix indicate the current state, whilst the rows indicate the next state. hence position [1,1] indicate the probability of the to remain in the current state, given we are currently in state one.
similarly [1,2] is the probability to be in state 1 in the next time step given that we are currently in state 1. by definition column sum of the matrix = 1
tmat = msm_model@transMat
tmat
[,1] [,2]
[1,] 0.6915957 0.02446478
[2,] 0.3084043 0.97553522
Expected duration of state 1 is 40.88 periods (trading days), expected duration of state 2 is 3.242 periods (trading days)
duration = 1/(1-diag(tmat))
duration
[1] 3.242497 40.875078