In this problem we will analyze Microsoft stock price (GOOG/NASDAQ_MSFT). In this example I will work with the close prices of MSFT. First let’s import the data from Quandl.
library(Quandl)
library(tseries)
library(forecast)
library(urca)
## Warning: package 'urca' was built under R version 3.2.4
library(fGarch)
## Warning: package 'fGarch' was built under R version 3.2.4
## Warning: package 'timeSeries' was built under R version 3.2.4
## Warning: package 'fBasics' was built under R version 3.2.4
library(rugarch)
## Warning: package 'rugarch' was built under R version 3.2.4
library(stargazer)
ms=Quandl("GOOG/NASDAQ_MSFT", type="zoo")
msft=ms[,4]
str(msft)
## 'zoo' series from 1986-03-13 to 2016-03-22
## Data: num [1:7584] 0.1 0.1 0.1 0.1 0.1 0.1 0.09 0.09 0.09 0.09 ...
## Index: Date[1:7584], format: "1986-03-13" "1986-03-14" "1986-03-17" "1986-03-18" ...
head(msft)
## 1986-03-13 1986-03-14 1986-03-17 1986-03-18 1986-03-19 1986-03-20
## 0.1 0.1 0.1 0.1 0.1 0.1
tail(msft)
## 2016-03-15 2016-03-16 2016-03-17 2016-03-18 2016-03-21 2016-03-22
## 53.59 54.35 54.66 53.49 53.86 54.07
plot(msft, xlab="", ylab="", main="Microsoft Corporation (MSFT) stock prices")
Now let’s log transform it and take first difference:
lmsft = log(msft)
dlmsft = diff(lmsft)
plot(dlmsft, xlab="", ylab="", main="Difference of lof transformed MSFT")
Let’s see is series is stationary by ADF test:
adf.test(dlmsft)
## Warning in adf.test(dlmsft): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: dlmsft
## Dickey-Fuller = -19.671, Lag order = 19, p-value = 0.01
## alternative hypothesis: stationary
At 5% level we can reject the null hypothesis of non-stationarity. We can see that the data is stationary and now let’s plot the ACF and PACF of the model to estimate appropriate ARMA model:
par(mfrow=c(1,2))
acf(as.data.frame(dlmsft),type='correlation',lag=15, main="dCPI")
acf(as.data.frame(dlmsft),type='partial',lag=15, main="dCPI")
Since it is daily (business days) series data I included 15 lags. From ACF and PACF I assume that the model will be with
m1 = arima(coredata(dlmsft), order=c(4,0,5))
tsdiag(m1,gof.lag=15)
ARIMA(5,0,4) model seems to be appropriate. Now let’s also check the auto arima command:
auto.arima(coredata(dlmsft), max.p = 5, max.q = 5, max.P = 5, max.Q = 5)
## Warning in auto.arima(coredata(dlmsft), max.p = 5, max.q = 5, max.P = 5, :
## Unable to fit final model using maximum likelihood. AIC value approximated
## Series: coredata(dlmsft)
## ARIMA(5,1,4)
##
## Coefficients:
## ar1 ar2 ar3 ar4 ar5 ma1 ma2
## -1.6636 -1.6851 -0.9901 -0.1202 -0.0472 0.6383 -0.0365
## s.e. 0.0328 0.0413 0.0387 0.0225 0.0116 0.0316 0.0152
## ma3 ma4
## -0.7122 -0.8326
## s.e. 0.0236 0.0261
##
## sigma^2 estimated as 0.000539: log likelihood=17771.7
## AIC=-35516.04 AICc=-35516.01 BIC=-35446.71
According to the auto.atima ARIMA(5,1,4) is an appropriate model. Now let’s perfomr the Lyung Box test to ARIMA(5,0,4) model:
Box.test(m1$residuals, lag=15, type="Ljung")
##
## Box-Ljung test
##
## data: m1$residuals
## X-squared = 7.3751, df = 15, p-value = 0.9464
So according to ACF, PACF and following LB test we can say that ARIMA(5,0,4) model is appropriate and that residuals are not serially corelated.
acf(m1$residuals^2)
pacf(m1$residuals^2)
Based on PACF and ACF we can consider ARCH(8) model.
arch = garchFit(~ arma(5,4) + garch(8,0), data = dlmsft, trace = FALSE)
## Warning in arima(.series$x, order = c(u, 0, v), include.mean =
## include.mean): possible convergence problem: optim gave code = 1
arch
##
## Title:
## GARCH Modelling
##
## Call:
## garchFit(formula = ~arma(5, 4) + garch(8, 0), data = dlmsft,
## trace = FALSE)
##
## Mean and Variance Equation:
## data ~ arma(5, 4) + garch(8, 0)
## <environment: 0x0b8f5860>
## [data = dlmsft]
##
## Conditional Distribution:
## norm
##
## Coefficient(s):
## mu ar1 ar2 ar3 ar4
## 0.00024590 0.44350186 0.64343584 0.54331937 -0.89230874
## ar5 ma1 ma2 ma3 ma4
## -0.00406812 -0.46558966 -0.66570730 -0.54197106 0.92100260
## omega alpha1 alpha2 alpha3 alpha4
## 0.00017557 0.16224530 0.12026554 0.06234907 0.11713034
## alpha5 alpha6 alpha7 alpha8
## 0.05990453 0.03171972 0.09657060 0.05183601
##
## Std. Errors:
## based on Hessian
##
## Error Analysis:
## Estimate Std. Error t value Pr(>|t|)
## mu 2.459e-04 5.839e-05 4.211 2.54e-05 ***
## ar1 4.435e-01 3.021e-02 14.679 < 2e-16 ***
## ar2 6.434e-01 1.883e-02 34.179 < 2e-16 ***
## ar3 5.433e-01 1.801e-02 30.173 < 2e-16 ***
## ar4 -8.923e-01 3.276e-02 -27.235 < 2e-16 ***
## ar5 -4.068e-03 1.503e-02 -0.271 0.78668
## ma1 -4.656e-01 2.724e-02 -17.091 < 2e-16 ***
## ma2 -6.657e-01 1.229e-02 -54.182 < 2e-16 ***
## ma3 -5.420e-01 1.718e-02 -31.540 < 2e-16 ***
## ma4 9.210e-01 2.932e-02 31.409 < 2e-16 ***
## omega 1.756e-04 7.262e-06 24.177 < 2e-16 ***
## alpha1 1.622e-01 1.534e-02 10.579 < 2e-16 ***
## alpha2 1.203e-01 1.620e-02 7.425 1.13e-13 ***
## alpha3 6.235e-02 1.252e-02 4.979 6.40e-07 ***
## alpha4 1.171e-01 1.473e-02 7.954 1.78e-15 ***
## alpha5 5.990e-02 1.199e-02 4.994 5.90e-07 ***
## alpha6 3.172e-02 1.206e-02 2.629 0.00855 **
## alpha7 9.657e-02 1.470e-02 6.568 5.10e-11 ***
## alpha8 5.184e-02 1.175e-02 4.411 1.03e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log Likelihood:
## 18651.65 normalized: 2.459666
##
## Description:
## Wed Mar 23 14:10:37 2016 by user: bahundja
So here is the summary for the ARCH(8) model discussed in part (b):
plot(arch@residuals, type="l", xlab="", ylab="", main="Residuals")
plot(arch@residuals/arch@sigma.t, type="l", xlab="", ylab="", main="Rtandardized Residuals")
plot((arch@residuals/arch@sigma.t)^2, type="l", xlab="", ylab="", main="Squared Standardized Residuals")
Now let’s plot the ACF and PACF’s and check adequacy. First, let’s start with Standardized Residuals:
par(mfrow=c(1,2))
acf(arch@residuals/arch@sigma.t,type="correlation", lag.max=15, xlab="",ylab="",main="ACF Standardized Residuals")
acf(arch@residuals/arch@sigma.t,type="partial", lag.max=15, xlab="",ylab="",main="PACF for Standardized Residuals")
And ACF/PACF of Squared Standardized Residuals will be:
par(mfrow=c(1,2))
acf((arch@residuals/arch@sigma.t)^2, type="correlation", lag.max=15, ylab="", main="ACF for Squared Standardized Residuals")
acf((arch@residuals/arch@sigma.t)^2, type="partial", lag.max=15, ylab="", main="PACF for Squared Standardized Residuals")
We can observe that residuals are not serially corelated.
Q10 <- arima(arch@residuals/arch@sigma.t, order=c(10,0,0))
tsdiag(Q10,gof.lag=15)
In this part we have to estimate a GARCH(m, s) model with Student-t innovations and perform model checking. Now let’s let’s start with GARCH(1,1)
garch = garchFit( ~ garch(1,1), data=dlmsft, trace=FALSE)
par(mfrow=c(1,2))
acf(garch@residuals/garch@sigma.t, type="correlation", lag.max=15, ylab="", main="ACF for standardized residuals")
acf(garch@residuals/garch@sigma.t, type="partial", lag.max=15, ylab="", main="PACF for standardized residuals")
par(mfrow=c(1,2))
acf((garch@residuals/garch@sigma.t)^2, type="correlation", lag.max=15, ylab="", main="ACF for squared standardized residuals")
acf((garch@residuals/garch@sigma.t)^2, type="partial", lag.max=15, ylab="", main="PACF for squared standardized residuals")