Q2

We retrieve the stock price data for Microsoft and transform the price series into log differences, \(y_t = \Delta \log \text{MSFT}_t\).

First we load all of the packages needed for this analysis.

library("Quandl")
library("tseries")
library("urca")
library("fGarch")
library("stargazer")
library("rugarch")

Then I apply the log-difference transformation to the MSFT stock price data, using only the closing prices, and plot the data for initial checks.

msft<-Quandl("GOOG/NASDAQ_MSFT", type="zoo")
msft2<-msft$Close
logmsft<-log(msft2)
diffmsft<- diff(logmsft,1)
par(mfrow=c(1,2))
plot(logmsft,xlab="Time", ylab="Log of MSFT")
plot(diffmsft,xlab="Time", ylab="Log Differences of MSFT")

We can see that \(\Delta \log \text{MSFT}_t\) on the right looks stationary, while the raw log-price series on the left looks nonstationary. If anything, the left chart suggests a structural break in the trend around 2000.
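
This eyeball check can be backed up with a formal unit-root test using the already loaded tseries package; a minimal sketch with the default lag selection (na.omit is added defensively in case of missing observations):

# ADF tests: the log level should fail to reject a unit root,
# while the log differences should reject it
adf.test(na.omit(coredata(logmsft)))
adf.test(na.omit(coredata(diffmsft)))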

Estimate an ARMA Model

Now I use the ACF, PACF, and Ljung-Box statistics to identify and estimate a model for the \(\Delta \log \text{MSFT}_t\) series.

par(mfrow=c(1,2))
acf(coredata(diffmsft),type='correlation',na.action=na.pass,lag.max=24,main="ACF of diffMSFT")
acf(coredata(diffmsft),type='partial',na.action=na.pass,lag.max=24,main="PACF of diffMSFT")

Looking at both the ACF and PACF, an ARMA(4,4) model looks like a reasonable candidate, although it is an unusually high-order configuration. We will test ARMA(4,4) along with a few alternatives using the Ljung-Box statistics.

ARMA44 <- arima(coredata(diffmsft), order=c(4,0,4))
tsdiag(ARMA44,gof.lag=24)

Let's also test ARMA(4,5).

ARMA45 <- arima(coredata(diffmsft), order=c(4,0,5))
tsdiag(ARMA45,gof.lag=24)

And ARMA(3,2).

ARMA32 <- arima(coredata(diffmsft), order=c(3,0,2))
tsdiag(ARMA32,gof.lag=24)

Next we do a BIC check on all three specifications.

BIC(ARMA44)
## [1] -35508.46
BIC(ARMA45)
## [1] -35510.42
BIC(ARMA32)
## [1] -35531.75

Comparing the three BIC values, they are all very close, but ARMA(3,2) has the lowest. It is also the order suggested by the most significant spikes in the ACF and PACF. So we estimate our model as an ARMA(3,2).
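
The tsdiag() output above already plots Ljung-Box p-values, but the statistics can also be computed directly with Box.test(); a minimal sketch, where the lag choices of 10 and 20 are assumptions and fitdf = 5 accounts for the five estimated ARMA parameters:

# Ljung-Box tests on the ARMA(3,2) residuals, adjusting the degrees of
# freedom for the 3 AR and 2 MA coefficients
Box.test(ARMA32$residuals, lag = 10, type = "Ljung-Box", fitdf = 5)
Box.test(ARMA32$residuals, lag = 20, type = "Ljung-Box", fitdf = 5)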

Identify and Estimate an ARCH Model with Normal Innovations

An ARCH(1) model for the conditional variance of the shocks \(a_t\) looks like this: \[ \sigma_t^2 = \omega + \alpha_1 a_{t-1}^2 \]
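
More generally, an ARCH(m) model, like the ARCH(8) estimated later in this problem, allows \(m\) lags of the squared shocks:

\[ \sigma_t^2 = \omega + \alpha_1 a_{t-1}^2 + \cdots + \alpha_m a_{t-m}^2 \]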

First we examine the ACF and PACF of the squared residuals from our ARMA(3,2) model.

par(mfrow=c(1,2))
acf(ARMA32$residuals^2,type='correlation',na.action=na.pass,lag.max=12,main="ACF of Squared Residuals")
acf(ARMA32$residuals^2,type='partial',na.action=na.pass,lag.max=12,main="PACF of Squared Residuals")
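
A formal counterpart to this visual check is a Ljung-Box test on the squared residuals (a McLeod-Li-type test for ARCH effects); a minimal sketch, with the lag choice an assumption:

# Significant autocorrelation in the squared residuals indicates ARCH effects
Box.test(ARMA32$residuals^2, lag = 12, type = "Ljung-Box")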

Looking at the ACF and PACF, the suggested orders could be quite high. Focusing on the most significant lags suggests an ARMA(2,2); if we tried to capture every significant lag, we could go as high as ARMA(8,8).

Now we look at the Ljung-Box statistics and BIC.

sARMA22 <- arima(ARMA32$residuals^2, order=c(2,0,2))
tsdiag(sARMA22,gof.lag=24)

Let's also test ARMA(8,8).

sARMA88 <- arima(ARMA32$residuals^2, order=c(8,0,8))
tsdiag(sARMA88,gof.lag=24)

BIC(sARMA22)
## [1] -71689.05
BIC(sARMA88)
## [1] -71654.46

Our ARMA(2,2) specification for the squared residuals has the lower BIC. Recall that if \(a_t\) follows a GARCH(m,s) process, then \(a_t^2\) admits an ARMA(\(\max(m,s), s\)) representation, so this finding points toward low-order GARCH-type dynamics. We use these order suggestions for the ARCH and GARCH models in the next sections.

Identify and Estimate an ARCH Model

First we fit an ARCH(8) model, specified as garch(8,0) with an ARMA(3,2) mean equation, and plot the residuals, the conditional standard deviation, the standardized residuals, and the squared standardized residuals.

arch1<-garchFit(~arma(3,2) + garch(8,0), data=diffmsft, trace=FALSE)
## Warning in arima(.series$x, order = c(u, 0, v), include.mean =
## include.mean): possible convergence problem: optim gave code = 1
## Warning in sqrt(diag(fit$cvar)): NaNs produced
par(mfrow=c(2,2))
plot(arch1@residuals,type="l", xlab="",ylab="",main="Residuals")
plot(arch1@sigma.t,type="l", xlab="",ylab="",main="Conditional Standard Deviation")
plot(arch1@residuals/arch1@sigma.t, type="l", xlab="",ylab="",main="Standardized Residuals")
plot((arch1@residuals/arch1@sigma.t)^2, type="l", xlab="",ylab="",main="Squared Standardized Residuals")

Now we plot the ACF and PACF charts.

check1 <- arch1@residuals/arch1@sigma.t
check2 <- (arch1@residuals/arch1@sigma.t)^2
par(mfcol=c(2,2), cex=0.6, mar=c(2,2,3,1))
acf(check1, type="correlation", lag.max=24, ylab="", main="ACF for standardized residuals")
acf(check1, type="partial", lag.max=24, ylab="", main="PACF for standardized residuals")
acf(check2, type="correlation", lag.max=24, ylab="", main="ACF for squared standardized residuals")
acf(check2, type="partial", lag.max=24, ylab="", main="PACF for squared standardized residuals")

The standardized residuals show little autocorrelation, so the model appears adequately specified. However, we should also look at the Ljung-Box Q statistics.

This is the Q(10) statistic for the standardized residuals

scheck1 <- arima(check1, order=c(10,0,0))
tsdiag(scheck1,gof.lag=24)

This is the Q(20) statistic for the standardized residuals

scheck1 <- arima(check1, order=c(20,0,0))
tsdiag(scheck1,gof.lag=24)

This is the Q(10) statistic for the squared standardized residuals

scheck2 <- arima(check2, order=c(10,0,0))
tsdiag(scheck2,gof.lag=24)

This is the Q(20) statistic for the squared standardized residuals

scheck2 <- arima(check2, order=c(20,0,0))
tsdiag(scheck2,gof.lag=24)

Let's continue checking model adequacy with the QQ plot and the Jarque-Bera test below.

par(mfrow=c(2,3), cex=0.6, mar=c(4,4,3,1))
plot(arch1, which=3)
plot(arch1, which=9)
plot(arch1, which=2)
plot(arch1, which=10)
plot(arch1, which=11)
plot(arch1, which=13)

The QQ plot shows significant deviations in the tails, indicating that the data are not normally distributed and have fat tails. Let's look at the Jarque-Bera test.

skewness(diffmsft)
## [1] -0.4579968
## attr(,"method")
## [1] "moment"
kurtosis(diffmsft)
## [1] 14.45592
## attr(,"method")
## [1] "excess"
normalTest(diffmsft, method="jb")
## 
## Title:
##  Jarque - Bera Normalality Test
## 
## Test Results:
##   STATISTIC:
##     X-squared: 66325.5024
##   P VALUE:
##     Asymptotic p Value: < 2.2e-16 
## 
## Description:
##  Tue Mar 22 15:56:05 2016 by user: rakim

The kurtosis() function already reports excess kurtosis, so the series has excess kurtosis of about 14.46 along with a slight negative skew (about -0.46). The Jarque-Bera test rejects the null of normality, confirming the QQ-plot result that \(y_t\) is not normally distributed.

Estimate a GARCH(m,s) Model with Student-T Innovations

Let's start from basic low-order GARCH specifications. Here we estimate a GARCH(2,2) model with skewed Student-t innovations (cond.dist="sstd").

garch22<-garchFit(~garch(2,2), data=coredata(diffmsft), trace=FALSE, cond.dist="sstd")
## Warning in sqrt(diag(fit$cvar)): NaNs produced
par(mfrow=c(2,2))
plot(garch22@residuals,type="l", xlab="",ylab="",main="Residuals")
plot(garch22@sigma.t,type="l", xlab="",ylab="",main="Conditional Standard Deviation")
plot(garch22@residuals/garch22@sigma.t, type="l", xlab="",ylab="",main="Standardized Residuals")
plot((garch22@residuals/garch22@sigma.t)^2, type="l", xlab="",ylab="",main="Squared Standardized Residuals")

After fitting the GARCH(2,2) model, we now perform model checking, starting with the ACF and PACF charts.

check1 <- garch22@residuals/garch22@sigma.t
check2 <- (garch22@residuals/garch22@sigma.t)^2
par(mfcol=c(2,2), cex=0.6, mar=c(2,2,3,1))
acf(check1, type="correlation", lag.max=24, ylab="", main="ACF for standardized residuals")
acf(check1, type="partial", lag.max=24, ylab="", main="PACF for standardized residuals")
acf(check2, type="correlation", lag.max=24, ylab="", main="ACF for squared standardized residuals")
acf(check2, type="partial", lag.max=24, ylab="", main="PACF for squared standardized residuals")

Our GARCH(2,2) model passes the ACF and PACF checks: the autocorrelations stay well inside the confidence bounds shown as blue dashed lines.

This is the Q(3) statistic for the standardized residuals

scheck2 <- arima(check1, order=c(3,0,0))
tsdiag(scheck2,gof.lag=24)
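
To compare the competing fGarch fits on a common scale, the normalized information criteria stored in each fitted object can be pulled out; a minimal sketch, assuming the @fit$ics slot that garchFit populates (fGarch reports AIC/BIC normalized by the number of observations):

# Normalized information criteria for the ARCH(8) and GARCH(2,2) fits
rbind(ARCH8 = arch1@fit$ics, GARCH22 = garch22@fit$ics)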

Estimate a TGARCH(m,s) Model

Now we estimate a TGARCH(2,2) model, again using the garchFit function but with leverage=TRUE.

tgarch<- garchFit(~garch(2,2), data=coredata(diffmsft), trace=FALSE, leverage=TRUE)
## Warning in sqrt(diag(fit$cvar)): NaNs produced
par(mfrow=c(2,2))
plot(tgarch@residuals,type="l", xlab="",ylab="",main="Residuals")
plot(tgarch@sigma.t,type="l", xlab="",ylab="",main="Conditional Standard Deviation")
plot(tgarch@residuals/tgarch@sigma.t, type="l", xlab="",ylab="",main="Standardized Residuals")
plot((tgarch@residuals/tgarch@sigma.t)^2, type="l", xlab="",ylab="",main="Squared Standardized Residuals")

After fitting the TGARCH(2,2) model, we again perform model checking, starting with the ACF and PACF charts.

check1 <- tgarch@residuals/tgarch@sigma.t
check2 <- (tgarch@residuals/tgarch@sigma.t)^2
par(mfcol=c(2,2), cex=0.6, mar=c(2,2,3,1))
acf(check1, type="correlation", lag.max=24, ylab="", main="ACF for standardized residuals")
acf(check1, type="partial", lag.max=24, ylab="", main="PACF for standardized residuals")
acf(check2, type="correlation", lag.max=24, ylab="", main="ACF for squared standardized residuals")
acf(check2, type="partial", lag.max=24, ylab="", main="PACF for squared standardized residuals")

Looking at the ACF and PACF charts, the TGARCH(2,2) appears adequate. We run another check using the Q(10) test.

scheck2 <- arima(check1, order=c(10,0,0))
tsdiag(scheck2,gof.lag=24)

The TGARCH(2,2) specification also passes our Q(10) check. The leverage term captures the asymmetry visible in the standardized residuals: negative shocks increase volatility more than positive shocks of the same size, which is in line with the stylized facts of financial time series.
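
The asymmetry can also be read off the estimated leverage parameters directly; a minimal sketch (in fGarch's leverage parameterization, a positive gamma means negative shocks raise the conditional variance more than positive shocks of the same magnitude):

# Estimated leverage (asymmetry) parameters from the TGARCH(2,2) fit
coef(tgarch)[c("gamma1", "gamma2")]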

Summary Table

Using the stargazer package, I show all of the models from this problem side by side.

stargazer(ARMA32,arch1,garch22,tgarch, align=TRUE, type="html", column.sep.width="0pt", report="vc*",
single.row=TRUE, header=FALSE, model.numbers=FALSE, dep.var.caption="", dep.var.labels="Microsoft Daily Returns", column.labels=c("ARMA(3,2)", "ARCH(8)", "GARCH(2,2)", "TGARCH(2,2)"))
Microsoft Daily Returns

                    ARMA(3,2)     ARCH(8)       GARCH(2,2)    TGARCH(2,2)
Model class         ARIMA         GARCH         GARCH         GARCH
mu                                0.0002        0.001***      0.001***
ar1                 0.062         0.979
ar2                 0.144         -0.242
ar3                 -0.019
ma1                 -0.101        -1.000
ma2                 -0.195        0.230
intercept           0.001***
omega                             0.0002***     0.00000***    0.00001***
alpha1                            0.161***      0.077***      0.074***
alpha2                            0.121***      0.000         0.000
alpha3                            0.063***      0.000
alpha4                            0.112***
alpha5                            0.055***
alpha6                            0.028**
alpha7                            0.096***
alpha8                            0.053***
gamma1                                                        0.109***
gamma2                                                        0.041
beta1                                           0.684         0.642***
beta2                                           0.238         0.273***
skew                                            1.041***
shape                                           4.837***
Observations        7,582         7,582         7,582         7,582
Log Likelihood      17,797.140    -18,630.460   -19,263.610   -18,817.680
sigma2              0.001
Akaike Inf. Crit.   -35,580.290   -4.911        -5.079        -4.962
Bayesian Inf. Crit.               -4.898        -5.071        -4.954
Note: *p<0.1; **p<0.05; ***p<0.01

Re-estimate the Best Model Using the rugarch Package. Create and Plot Rolling 1-Day-Ahead Forecasts

We pick the TGARCH(2,2) as the best model based on the ACF and PACF checks. In rugarch, a threshold/leverage GARCH can be specified with the gjrGARCH variance model, a close analogue of the leveraged garchFit model above. We then create and plot 100 rolling 1-day-ahead forecasts.

tgarchspec<-ugarchspec(mean.model=list(armaOrder=c(0,0), archm=TRUE, archpow=2, include.mean=TRUE), variance.model=list(model="gjrGARCH", garchOrder=c(2,2)))
t1<- ugarchfit(data=coredata(diffmsft), spec=tgarchspec,out.sample=200)
forecast<- ugarchforecast(t1,n.roll=100)
plot(forecast,which="all")
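
The forecasts can also be extracted numerically using rugarch's sigma() and fitted() extractor methods on the forecast object; a minimal sketch:

# Conditional sigma forecasts: rows are the forecast horizon, columns the rolling origins
sig <- sigma(forecast)
# 1-step-ahead sigma and mean forecasts at the first few rolling origins
head(sig[1, ])
head(fitted(forecast)[1, ])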

This marks the end of Problem 2.