The data is the Federal Housing Finance Agency (FHFA) House Price Index (HPI) which is a broad measure of the movement of single-family house prices. It is a weighted, repeat-sales index, meaning that it measures average price changes in repeat sales or refinancing on the same properties. This information is obtained by reviewing repeat mortgage transactions on single-family properties whose mortgages have been purchased or securitized by Fannie Mae or Freddie Mac.
The technical methodology for devising the index, collection, and publishing the data is at
This time series begins January 1991 and ends at August 2016. Both the seasonally adjusted index ‘index_sa’ and the non-seasonally adjusted index ‘index_nsa’ set the index value at 100 for January 1991. The dataset contains monthly time series from January 1991 to August 2016 for the East South Central Division of America.
We will perform analysis on the aggregate seasonally adjusted index values. The index value is begins at 100 from January 1991.
library(readr)
library(TSA)
## Loading required package: leaps
## Loading required package: locfit
## locfit 1.5-9.1 2013-03-22
## Loading required package: mgcv
## Loading required package: nlme
## This is mgcv 1.8-17. For overview type 'help("mgcv-package")'.
## Loading required package: tseries
##
## Attaching package: 'TSA'
## The following object is masked from 'package:readr':
##
## spec
## The following objects are masked from 'package:stats':
##
## acf, arima
## The following object is masked from 'package:utils':
##
## tar
library(lmtest)
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
library(forecast)
##
## Attaching package: 'forecast'
## The following objects are masked from 'package:TSA':
##
## fitted.Arima, plot.Arima
## The following object is masked from 'package:nlme':
##
## getResponse
library(rugarch)
## Loading required package: parallel
##
## Attaching package: 'rugarch'
## The following object is masked from 'package:stats':
##
## sigma
library(FitAR)
## Loading required package: lattice
## Loading required package: ltsa
## Loading required package: bestglm
##
## Attaching package: 'FitAR'
## The following object is masked from 'package:forecast':
##
## BoxCox
library(fGarch)
## Loading required package: timeDate
##
## Attaching package: 'timeDate'
## The following objects are masked from 'package:TSA':
##
## kurtosis, skewness
## Loading required package: timeSeries
##
## Attaching package: 'timeSeries'
## The following object is masked from 'package:zoo':
##
## time<-
## Loading required package: fBasics
##
## Rmetrics Package fBasics
## Analysing Markets and calculating Basic Statistics
## Copyright (C) 2005-2014 Rmetrics Association Zurich
## Educational Software for Financial Engineering and Computational Science
## Rmetrics is free software and comes with ABSOLUTELY NO WARRANTY.
## https://www.rmetrics.org --- Mail to: info@rmetrics.org
ensd_hp_index <-read_csv("C:/Users/Bhargab/Desktop/EastSouthCentralDivision.csv")
## Parsed with column specification:
## cols(
## hpi_type = col_character(),
## hpi_flavor = col_character(),
## frequency = col_character(),
## level = col_character(),
## place_name = col_character(),
## place_id = col_character(),
## yr = col_integer(),
## period = col_integer(),
## index_nsa = col_double(),
## index_sa = col_double()
## )
ensd_hp_index <- ensd_hp_index$index_sa
class(ensd_hp_index)
## [1] "numeric"
ensd_hp_index_ts <- ts(as.vector(ensd_hp_index), start=1991,frequency = 12)
class(ensd_hp_index_ts)
## [1] "ts"
ensd_hp_index_ts = log(ensd_hp_index_ts)
Plotting a Time Series Plot of the data
#------------------ Time Series Plot of the data -------------------------------#
plot(ensd_hp_index_ts,type='o',ylab= 'House Price Index',
main="Time Series Plot of House Price Index")
The time series plot shows an upward trend. After 2007, the series drops and later rises since 2011. Lot of points are closely related to each other and hence there is an indication of a strong correlation. The values are seasonally adjusted and hence we will not be considering the seasonal aspect of the timeseries.
Plotting the ACF Plot of the Time Series
#------------------ ACF & PACF Plot of the Time Series ----------------------------#
acf(ensd_hp_index_ts,main="ACF Plot of House Price Index",lag.max = 72)
pacf(ensd_hp_index_ts,main="PACF Plot of House Price Index")
The ACF plot supports our conclusion of high correlation among successive points. It also shows a strong evidence of an existence of a trend which was clearly seen in the time series plot. Furthermode, the PACF plot shows one significant correlation on the plot.
Calculation of the degree of correlation on 1st Lag
#-------------------------- Calculating the degree of correlation on Ist Lag -----------#
y=ensd_hp_index_ts
x=zlag(ensd_hp_index_ts)
index = 2:length(x)
cor(y[index],x[index])
## [1] 0.9995936
Correaltion coeficient : 0.9995806.
The correlation value is significantly high and supports our conclusion drawn from the timeseries plot.
Due to the presence of an upward trend from 1990 to 2007 and further a downward trend till 2011 and again an upward trend from 2011 to 2016, the data cannot be considered to be stationary. Hence we need to apply differencing to acquire a stationary time series to enable us to fit the time series models to the data.
#-------------------- De-trend the data of HPI for East South Central Division--------------#
#-------------------- Ist order differencing -------------------------------------------#
ensd_hp_index_ts_d1 = diff(ensd_hp_index_ts,differences = 1)
adf.test(ensd_hp_index_ts_d1, alternative = "stationary")
##
## Augmented Dickey-Fuller Test
##
## data: ensd_hp_index_ts_d1
## Dickey-Fuller = -3.2168, Lag order = 6, p-value = 0.08542
## alternative hypothesis: stationary
#------------------ Time Series Plot of the first difference ---------------------------#
plot(ensd_hp_index_ts_d1,type='o',ylab= 'House Price Index',
main="Time Series Plot of Ist difference HPI")
The time series plot shows a change in variance at few occasions. Also, it shows a decline and a rise further. A lot of places the points are closer to each other and hence there is an indication of a strong correlation. Also the ADF test is statistically insignificant indicating Non-stationarity. Hence we need to difference once more and check if we acquire a stationary series.
#-------------------- IInd order differencing -------------------------------------------#
ensd_hp_index_ts_d2 = diff(ensd_hp_index,differences = 2)
adf.test(ensd_hp_index_ts_d2, alternative = "stationary")
## Warning in adf.test(ensd_hp_index_ts_d2, alternative = "stationary"): p-
## value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: ensd_hp_index_ts_d2
## Dickey-Fuller = -10.813, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary
#------------------ Time Series Plot of the IInd difference ---------------------------#
plot(ensd_hp_index_ts_d2,type='o',ylab= 'House Price Index',
main="Time Series Plot of IInd difference of HPI")
The time series plot shows a change in variance. A lot of places the points are closer to each other and hence there is an indication of a strong correlation. The ADF test is statistically significant (at 5% significance) which indicates Stationarity of the second difference time series.
ACF Plot of IInd difference
#--------------------------------ACF Plot of IInd difference ----------------------------#
acf(ensd_hp_index_ts_d2,main = 'ACF for IInd Difference HPI',lag.max = 36)
The ACF plot shows a jagged pattern with the first lag to be significant towards the negative side. Also, there are few more slightly significant lags in the higher order. In order to get a better view of this ACF plot, we can try to have a look at the Alternative Bound ACF plot.
#---------------------- Alternative Bound ACF -----------------------------------------#
acf(ensd_hp_index_ts_d2,ci.type='ma',xaxp=c(0,20,10),
main="Alternative Bounds for IInd Difference HPI")
The Alternative Bound ACF plot shows a Jagged pattern with the first lag to be significant towards the negative side.The alternative bound acf also shows the lag 12 to be slightly significant. Also, the 24th lag is quite significant. This makes us suspect seasonality. But, since the values are seasonally adjusted we will keep the seasonal aspect out of scope.
#---------------------------------- PACF Plot -----------------------------------------#
pacf(ensd_hp_index_ts_d2,main = 'PACF for IInd Difference HPI')
The PACF shows initial lags to be significant with a decay. But later some of the lags like 10, 22 & 23 are found to be slightly significant.
In order to get a clear view of the candidate models lets have a look at the EACF table
eacf(ensd_hp_index_ts_d2)
## AR/MA
## 0 1 2 3 4 5 6 7 8 9 10 11 12 13
## 0 x x o o o o o o o o o x x o
## 1 x x x o o o o o o o o x o o
## 2 x x o x o o o o o o o x o o
## 3 x x o x x o o o o o x x o o
## 4 x x x o o o o o o o o o o o
## 5 x o x x o o o o o o o o o o
## 6 x o o o x o o o o o o o o o
## 7 x o o o x x o o x o o o o o
The eacf plot shows a decent vertex at (0,2). Hence, the candidate models selected will be ARIMA(0,2,2) ARIMA(0,2,3) & ARIMA(1,2,3).
For further investigation of the candidate models, let’s plot the BIC table.
res = armasubsets(y=ensd_hp_index_ts_d2,nar=8,nma=8,y.name='test',ar.method='ols')
plot(res)
The BIC table shows slight higher orders for AR and MA components at -320. We will consider the lower orders only for our analysis. The candidate models selected will be ARIMA(1,2,1) ARIMA(2,2,1) and ARIMA(3,2,1).
So, from the overall analysis the candidate models selected are as follows. 1. ARIMA(0,2,2) 2. ARIMA(0,2,3) 3. ARIMA(1,2,3) 4. ARIMA(1,2,1) 5. ARIMA(2,2,1) 6. ARIMA(3,2,1)
We will fit each of the candidate models selected in the previous step. Model evaluation will be done on the basis of the AIC, BIC and significance of the coefficients.
m022_hpi = arima(ensd_hp_index_ts,order=c(0,2,2))
AIC(m022_hpi)
## [1] -2293.366
BIC(m022_hpi)
## [1] -2282.195
coeftest(m022_hpi)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ma1 -1.425427 0.049780 -28.634 < 2.2e-16 ***
## ma2 0.525131 0.049264 10.660 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The coefficients for both MA(1) and MA(2) are significant for the model ARIMA(0,2,2). AIC = -2238.063 BIC = -2226.962
m023_hpi = arima(ensd_hp_index_ts,order=c(0,2,3))
AIC(m023_hpi)
## [1] -2291.405
BIC(m023_hpi)
## [1] -2276.511
coeftest(m023_hpi)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ma1 -1.419553 0.057532 -24.6741 < 2.2e-16 ***
## ma2 0.508687 0.096359 5.2791 1.298e-07 ***
## ma3 0.011649 0.058857 0.1979 0.8431
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The coefficients for both MA(1) and MA(2) are significant. But, MA(3) is insignificant for the model ARIMA(0,2,3)
AIC = -2236.07 BIC = -2221.268
m123_hpi = arima(ensd_hp_index_ts,order=c(1,2,3))
AIC(m123_hpi)
## [1] -2291.755
BIC(m123_hpi)
## [1] -2273.137
coeftest(m123_hpi)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 -0.931035 0.044877 -20.7462 < 2.2e-16 ***
## ma1 -0.473879 0.060861 -7.7863 6.902e-15 ***
## ma2 -0.848178 0.046764 -18.1375 < 2.2e-16 ***
## ma3 0.517523 0.048800 10.6050 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The coefficients for all AR(1), MA(1) MA(2) and MA(3) are significant for the model ARIMA(1,2,3).
AIC = -2236.191 BIC = -2217.688
m121_hpi = arima(ensd_hp_index_ts,order=c(1,2,1))
AIC(m121_hpi)
## [1] -2273.219
BIC(m121_hpi)
## [1] -2262.048
coeftest(m121_hpi)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 -0.414054 0.053599 -7.7251 1.118e-14 ***
## ma1 -0.891116 0.022397 -39.7876 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The coefficients for both AR(1) and MA(1) are significant for the model ARIMA(1,2,1).
AIC = -2218.742 BIC = -2207.64
m221_hpi = arima(ensd_hp_index_ts,order=c(2,2,1))
AIC(m221_hpi)
## [1] -2286.143
BIC(m221_hpi)
## [1] -2271.248
coeftest(m221_hpi)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 -0.538376 0.060247 -8.9362 < 2.2e-16 ***
## ar2 -0.237084 0.059657 -3.9741 7.065e-05 ***
## ma1 -0.849125 0.029420 -28.8620 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The coefficients for all AR(1) AR(2) and MA(1) are significant for the model ARIMA(2,2,1).
AIC = -2231.219 BIC = -2216.417
m321_hpi = arima(ensd_hp_index_ts,order=c(3,2,1))
AIC(m321_hpi)
## [1] -2286.468
BIC(m321_hpi)
## [1] -2267.85
coeftest(m321_hpi)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 -0.581442 0.066324 -8.7667 < 2.2e-16 ***
## ar2 -0.308873 0.075588 -4.0863 4.384e-05 ***
## ar3 -0.098225 0.063807 -1.5394 0.1237
## ma1 -0.824080 0.036938 -22.3097 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The coefficients for AR(1) AR(2) and MA(1) are significant. But, AR(3) is insignificant for the model ARIMA(3,2,1).
AIC = -2231.589 BIC = -2213.086
Looking at the significance of the coefficients and the AIC and BIC values the candidate models would be
1. ARIMA(0,2,2)
2. ARIMA(1,2,3)
3. ARIMA(2,2,1)
4. ARIMA(1,2,1)
In order to filter these models further. Let's try to analyse the residuals of the model selected above.
Function for Residual Analysis for ARIMA Models
# Function that performs residual analysis
residual.analysis <- function(model, std = TRUE){
library(TSA)
library(FitAR)
if (std == TRUE){
res.model = rstandard(model)
}else{
res.model = residuals(model)
}
par(mfrow=c(3,2))
plot(res.model,type='o',ylab='Standardised residuals',
main="Time series plot of standardised residuals")
abline(h=0)
hist(res.model,main="Histogram of standardised residuals")
qqnorm(res.model,main="QQ plot of standardised residuals")
qqline(res.model, col = 2)
acf(res.model,main="ACF of standardised residuals")
print(shapiro.test(res.model))
k=0
LBQPlot(res.model, lag.max = length(model$residuals)-1, StartLag = k + 1, k = 0,
SquaredQ = FALSE)
par(mfrow=c(1,1))
}
residual.analysis(m022_hpi)
##
## Shapiro-Wilk normality test
##
## data: res.model
## W = 0.96462, p-value = 7.855e-07
## Warning in (ra^2)/(n - (1:lag.max)): longer object length is not a multiple
## of shorter object length
residual.analysis(m123_hpi)
##
## Shapiro-Wilk normality test
##
## data: res.model
## W = 0.96439, p-value = 7.277e-07
## Warning in (ra^2)/(n - (1:lag.max)): longer object length is not a multiple
## of shorter object length
residual.analysis(m221_hpi)
##
## Shapiro-Wilk normality test
##
## data: res.model
## W = 0.96154, p-value = 2.888e-07
## Warning in (ra^2)/(n - (1:lag.max)): longer object length is not a multiple
## of shorter object length
residual.analysis(m121_hpi)
##
## Shapiro-Wilk normality test
##
## data: res.model
## W = 0.95824, p-value = 1.042e-07
## Warning in (ra^2)/(n - (1:lag.max)): longer object length is not a multiple
## of shorter object length
The residues of all the models fail the Shapiro Wilk test. Hence, the residues are not normally distributed. The Ljung Box Test plot shows a good seperation from the p-value reference line for the models ARIMA(0,2,2) and ARIMA(1,2,3).
The QQ-Plot showing a fat tails makes us suspect conditional heteoscedasticity in the timeseries
Since, ARIMA(0,2,2) and ARIMA(1,2,3) show a decent plot for Ljung-Box Test we will continue further with those models.
McLeod-Li Test to examine conditional heteroscedasticity on Residues of ARIMA(0,2,2)
McLeod.Li.test(y=residuals(m022_hpi))
McLeod.Li.test(y=residuals(m123_hpi))
Plots for the residues of both the models show that most of the points lie below the p-value reference line. This confirms that the time series contain a significant degree of conditional heteroscedasticity. We will try to fit a GARCH/ARCH models to the residues of the models.
m022residuals = residuals(m022_hpi)
m123residuals = residuals(m123_hpi)
We will try to estimate the orders for GARCH and ARCH components using the EACF plot of squared residues of ARIMA(0,2,2) #EACF of the squared residuals of ARIMA(0,2,2)
eacf(m022residuals^2)## GARCH(1,0) GARCH(1,1)
## AR/MA
## 0 1 2 3 4 5 6 7 8 9 10 11 12 13
## 0 o o o o x x x x o o o x o x
## 1 x o o o x x o o o o o x o x
## 2 x x o o x o o o o o o o o x
## 3 x x o o x o x o o o o o o x
## 4 x x o x x o x o o o o o o x
## 5 x x o x x o o x o o o o o x
## 6 x x o x x o o o o o o o o x
## 7 x x x x o o x o o x x o o o
eacf(abs(m022residuals))## GARCH(1,1) GARCH(2,1) GARCH(2,2)
## AR/MA
## 0 1 2 3 4 5 6 7 8 9 10 11 12 13
## 0 x o x x x x x x o o x x x x
## 1 x o o o o o o o o o o o o o
## 2 x x o o o o o o o o o o o o
## 3 x x o o o o o o o o o o o o
## 4 x x x o o o o o o o o x o o
## 5 x x o x x o o o o o o x o o
## 6 x x o x x o o o o o o o o o
## 7 x o x o o o o o o o o o o o
Looking at the both the EACF plots the candidate models selected for residues of ARIMA(0,2,2) are GARCH(1,0) GARCH(1,1) GARCH(2,1) GARCH(2,2) #EACF of squared residues of ARIMA(1,2,3)
eacf(m123residuals^2)## GARCH(1,0) GARCH(1,1)
## AR/MA
## 0 1 2 3 4 5 6 7 8 9 10 11 12 13
## 0 o o o o x x x x o o o x o x
## 1 x o o o x o o o o o o x o x
## 2 x x o o x o o o o o o o o x
## 3 x x o o x o x o o o o o o x
## 4 x x o x x o x o o o o o o x
## 5 x x o x x o o x o o o o o x
## 6 x x o x x o o o o o o o o x
## 7 x x x x o o x o o o o o o o
eacf(abs(m123residuals))## GARCH(1,1) GARCH(2,1) GARCH(2,2)
## AR/MA
## 0 1 2 3 4 5 6 7 8 9 10 11 12 13
## 0 o o x x x x x x x o x x x x
## 1 x o o o o o o o o o o o o o
## 2 x x o o o o o o o o o o o o
## 3 x x x o o o o o o o o o o o
## 4 x x x o o o o o o o o o o o
## 5 x o o x o o o o o o o o o o
## 6 x x o x o o o o o o o o o o
## 7 x o o x o o o o o o o o o o
Looking at the both the EACF plots the candidate models selected for residues of ARIMA(1,2,3) are GARCH(1,0) GARCH(1,1) GARCH(2,1) GARCH(2,2) #GARCH Model Estimation #Fitting GARCH(1,0) on the residues of ARIMA(0,2,2)
a022garch10=garch(m022residuals,order=c(0,1),trace=F)
a022garch11=garch(m022residuals,order=c(1,1),trace=F)
## Warning in sqrt(pred$e): NaNs produced
a022garch21=garch(m022residuals,order=c(1,2),trace=F)
## Warning in sqrt(pred$e): NaNs produced
a022garch22=garch(m022residuals,order=c(2,2),trace=F)
## Warning in garch(m022residuals, order = c(2, 2), trace = F): singular
## information
a123garch10=garch(m123residuals,order=c(0,1),trace=F)
a123garch11=garch(m123residuals,order=c(1,1),trace=F)
a123garch21=garch(m123residuals,order=c(1,2),trace=F)
## Warning in sqrt(pred$e): NaNs produced
a123garch22=garch(m123residuals,order=c(2,2),trace=F)
## Warning in garch(m123residuals, order = c(2, 2), trace = F): singular
## information
Function to perform Residual Analysis and Coefficient Estimation on GARCH models
garchresidual.analysis <- function(model, std = TRUE){
library(TSA)
library(FitAR)
res.model = residuals(model)
par(mfrow=c(2,2))
plot(res.model,type='o',ylab='Standardised residuals', main="Time series plot of standardised residuals")
abline(h=0)
hist(res.model,main="Histogram of standardised residuals")
qqnorm(res.model,main="QQ plot of standardised residuals")
qqline(res.model, col = 2)
print(shapiro.test(res.model))
print(summary(model))
par(mfrow=c(1,1))
}
garchresidual.analysis(a022garch21)
##
## Shapiro-Wilk normality test
##
## data: res.model
## W = 0.9946, p-value = 0.3583
##
##
## Call:
## garch(x = m022residuals, order = c(1, 2), trace = F)
##
## Model:
## GARCH(1,2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.66120 -0.59355 -0.05831 0.61438 3.05384
##
## Coefficient(s):
## Estimate Std. Error t value Pr(>|t|)
## a0 2.718e-07 6.059e-07 0.449 0.6537
## a1 3.264e-11 8.238e-02 0.000 1.0000
## a2 2.919e-01 1.323e-01 2.207 0.0273 *
## b1 7.718e-01 5.876e-02 13.134 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Diagnostic Tests:
## Jarque Bera Test
##
## data: Residuals
## X-squared = 2.1499, df = 2, p-value = 0.3413
##
##
## Box-Ljung test
##
## data: Squared.Residuals
## X-squared = 0.31631, df = 1, p-value = 0.5738
Shapiro Wilk Test fails to prove normality of the residues at 5% significance after fitting GARCH(1,2). Even the Jarque Bera Test supports the conclusion from Shapiro Wilk Test. Box-Ljung test also fails to find significant evidence to prove that the residues are uncorrelated.
garchresidual.analysis(a022garch22)
##
## Shapiro-Wilk normality test
##
## data: res.model
## W = 0.98103, p-value = 0.0004448
##
##
## Call:
## garch(x = m022residuals, order = c(2, 2), trace = F)
##
## Model:
## GARCH(2,2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.23176 -0.55133 -0.06646 0.61574 4.16010
##
## Coefficient(s):
## Estimate Std. Error t value Pr(>|t|)
## a0 7.070e-06 NA NA NA
## a1 2.457e-01 NA NA NA
## a2 2.002e-01 NA NA NA
## b1 1.624e-01 NA NA NA
## b2 2.277e-01 NA NA NA
##
## Diagnostic Tests:
## Jarque Bera Test
##
## data: Residuals
## X-squared = 35.432, df = 2, p-value = 2.024e-08
##
##
## Box-Ljung test
##
## data: Squared.Residuals
## X-squared = 1.5545, df = 1, p-value = 0.2125
Shapiro Wilk Test => p-value 0.2394. The Shapiro Wilk test provides significant evidence to prove normality of the residues. Jarque Bera Test => p-value 0.0585. The Jarque Bera Test too complements the conclusions of Shapiro Wilk Test. Box-Ljung test => p-value 0.9654. Box-Ljung test gives a significant evidence to that the residues after the GARCH(2,2) fit are uncorrelated
garchresidual.analysis(a022garch10)
##
## Shapiro-Wilk normality test
##
## data: res.model
## W = 0.96629, p-value = 1.426e-06
##
##
## Call:
## garch(x = m022residuals, order = c(0, 1), trace = F)
##
## Model:
## GARCH(0,1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.23810 -0.61465 -0.07201 0.59207 4.60399
##
## Coefficient(s):
## Estimate Std. Error t value Pr(>|t|)
## a0 2.713e-05 2.420e-06 11.207 <2e-16 ***
## a1 1.592e-01 7.312e-02 2.177 0.0295 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Diagnostic Tests:
## Jarque Bera Test
##
## data: Residuals
## X-squared = 94.792, df = 2, p-value < 2.2e-16
##
##
## Box-Ljung test
##
## data: Squared.Residuals
## X-squared = 0.06308, df = 1, p-value = 0.8017
Shapiro Wilk Test fails to prove normality of the residues after fitting GARCH(1,0). Even the Jarque Bera Test supports the conclusion from Shapiro Wilk Test.
Box-Ljung test though provides significant evidence to prove that the residues are uncorrelated.
garchresidual.analysis(a022garch11)
##
## Shapiro-Wilk normality test
##
## data: res.model
## W = 0.95778, p-value = 9.448e-08
##
##
## Call:
## garch(x = m022residuals, order = c(1, 1), trace = F)
##
## Model:
## GARCH(1,1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.3588 -0.5823 -0.0631 0.6017 2.9558
##
## Coefficient(s):
## Estimate Std. Error t value Pr(>|t|)
## a0 1.024e-14 3.970e-07 0.000 1.00000
## a1 2.219e-01 8.331e-02 2.663 0.00774 **
## b1 8.371e-01 5.238e-02 15.981 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Diagnostic Tests:
## Jarque Bera Test
##
## data: Residuals
## X-squared = 284.8, df = 2, p-value < 2.2e-16
##
##
## Box-Ljung test
##
## data: Squared.Residuals
## X-squared = 0.61124, df = 1, p-value = 0.4343
Shapiro Wilk Test fails to prove normality of the residues after fitting GARCH(1,1). Even the Jarque Bera Test supports the conclusion from Shapiro Wilk Test. Box-Ljung test though provides significant evidence to prove that the residues are uncorrelated. #Calling the residual analysis for GARCH(1,0) on ARIMA(1,2,3)
garchresidual.analysis(a123garch10)
##
## Shapiro-Wilk normality test
##
## data: res.model
## W = 0.96781, p-value = 2.404e-06
##
##
## Call:
## garch(x = m123residuals, order = c(0, 1), trace = F)
##
## Model:
## GARCH(0,1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.23986 -0.58354 -0.06563 0.59652 4.66774
##
## Coefficient(s):
## Estimate Std. Error t value Pr(>|t|)
## a0 2.655e-05 2.373e-06 11.191 <2e-16 ***
## a1 1.717e-01 7.145e-02 2.404 0.0162 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Diagnostic Tests:
## Jarque Bera Test
##
## data: Residuals
## X-squared = 90.22, df = 2, p-value < 2.2e-16
##
##
## Box-Ljung test
##
## data: Squared.Residuals
## X-squared = 0.05295, df = 1, p-value = 0.818
Shapiro Wilk Test fails to prove normality of the residues after fitting GARCH(1,0). Even the Jarque Bera Test supports the conclusion from Shapiro Wilk Test. Box-Ljung test though provides significant evidence to prove that the residues are uncorrelated. #Calling the residual analysis for GARCH(1,1) on ARIMA(1,2,3)
garchresidual.analysis(a123garch11)
##
## Shapiro-Wilk normality test
##
## data: res.model
## W = 0.99475, p-value = 0.3771
##
##
## Call:
## garch(x = m123residuals, order = c(1, 1), trace = F)
##
## Model:
## GARCH(1,1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.99795 -0.62922 -0.08498 0.64675 3.41328
##
## Coefficient(s):
## Estimate Std. Error t value Pr(>|t|)
## a0 2.038e-07 2.404e-07 0.848 0.39663
## a1 7.318e-02 2.740e-02 2.671 0.00756 **
## b1 9.223e-01 3.133e-02 29.440 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Diagnostic Tests:
## Jarque Bera Test
##
## data: Residuals
## X-squared = 4.249, df = 2, p-value = 0.1195
##
##
## Box-Ljung test
##
## data: Squared.Residuals
## X-squared = 0.91241, df = 1, p-value = 0.3395
Shapiro Wilk Test => p-value 0.3438.The Shapiro Wilk test provides significant evidence to prove normality of the residues. Jarque Bera Test => p-value 0.0824. The Jarque Bera Test too complements the conclusions of Shapiro Wilk Test. Box-Ljung test => p-value 0.402. Box-Ljung test gives a significant evidence to that the residues after the GARCH(1,1) fit are uncorrelated.
The summary of the estimates too show that all the components of the GARCH(1,1) model are significant #Calling the residual analysis for GARCH(2,1) on ARIMA(1,2,3)
garchresidual.analysis(a123garch21)
##
## Shapiro-Wilk normality test
##
## data: res.model
## W = 0.98919, p-value = 0.02312
##
##
## Call:
## garch(x = m123residuals, order = c(1, 2), trace = F)
##
## Model:
## GARCH(1,2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.11669 -0.55211 -0.07255 0.58230 3.58727
##
## Coefficient(s):
## Estimate Std. Error t value Pr(>|t|)
## a0 1.955e-06 1.197e-06 1.633 0.102
## a1 2.069e-01 1.314e-01 1.575 0.115
## a2 2.384e-01 1.706e-01 1.397 0.162
## b1 6.220e-01 7.616e-02 8.167 2.22e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Diagnostic Tests:
## Jarque Bera Test
##
## data: Residuals
## X-squared = 13.539, df = 2, p-value = 0.001148
##
##
## Box-Ljung test
##
## data: Squared.Residuals
## X-squared = 2.7303, df = 1, p-value = 0.09846
Only beta1 coefficient is significant. The alpha1, beta1 coefficient estimates are insignificant. Shapiro Wilk Test => p-value 0.3773. The Shapiro Wilk test provides significant evidence to prove normality of the residues. Jarque Bera Test => p-value 0.1298. The Jarque Bera Test too complements the conclusions of Shapiro Wilk Test.
Box-Ljung test => p-value 0.874. Box-Ljung test gives a significant evidence to that the residues after the GARCH(2,1) fit are uncorrelated #Calling the residual analysis for GARCH(2,2) on ARIMA(1,2,3)
garchresidual.analysis(a123garch22)
##
## Shapiro-Wilk normality test
##
## data: res.model
## W = 0.99369, p-value = 0.2304
##
##
## Call:
## garch(x = m123residuals, order = c(2, 2), trace = F)
##
## Model:
## GARCH(2,2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.8587 -0.6463 -0.0878 0.6410 3.4776
##
## Coefficient(s):
## Estimate Std. Error t value Pr(>|t|)
## a0 3.783e-07 NA NA NA
## a1 1.074e-13 NA NA NA
## a2 8.875e-02 NA NA NA
## b1 7.897e-01 NA NA NA
## b2 1.078e-01 NA NA NA
##
## Diagnostic Tests:
## Jarque Bera Test
##
## data: Residuals
## X-squared = 4.7792, df = 2, p-value = 0.09167
##
##
## Box-Ljung test
##
## data: Squared.Residuals
## X-squared = 0.06437, df = 1, p-value = 0.7997
Shapiro Wilk Test fails to prove normality of the residues after fitting GARCH(2,2). Even the Jarque Bera Test supports the conclusion from Shapiro Wilk Test. However, Box-Ljung test though provides significant evidence to prove that the residues are uncorrelated.
Looking at all the models that are fit for GARCH model specification, the best model selected is ARIMA(1,2,3) + GARCH(1,1).
The above model is the best as all its coefficients are significant and we have significant evidence that the residues are uncorrelated and normally distributed.
We will use the model to forecast for next 24 months.
spec = ugarchspec(variance.model = list(model="sGARCH",garchOrder=c(1,1)),
mean.model = list(armaOrder=c(1,3)))
mugarchfit = ugarchfit(spec,ensd_hp_index_ts_d2)
coef(mugarchfit)
## mu ar1 ma1 ma2 ma3
## 0.002212167 -0.964952483 -0.426686040 -0.873817337 0.504498774
## omega alpha1 beta1
## 0.004045258 0.089563943 0.909436041
forecast_24 = ugarchforecast(mugarchfit,data=diff2.log,n.ahead = 24)
forecast24ts = ts(fitted(forecast_24)[,1],start = c(2016,9),frequency = 12)
sigma(forecast_24)
## 1970-11-03 05:30:00
## T+1 1.200875
## T+2 1.201958
## T+3 1.203040
## T+4 1.204119
## T+5 1.205196
## T+6 1.206271
## T+7 1.207345
## T+8 1.208416
## T+9 1.209485
## T+10 1.210552
## T+11 1.211617
## T+12 1.212680
## T+13 1.213741
## T+14 1.214800
## T+15 1.215857
## T+16 1.216912
## T+17 1.217966
## T+18 1.219017
## T+19 1.220066
## T+20 1.221113
## T+21 1.222159
## T+22 1.223202
## T+23 1.224244
## T+24 1.225283
fcast = c(ensd_hp_index_ts_d2,forecast24ts)
class(fcast)
## [1] "numeric"
fcast = ts(fcast,start = 1991)
class(fcast)
## [1] "ts"
forecast_24_2db = diffinv(fcast,xi = ensd_hp_index_ts_d2[1:2],differences = 2)
forecast_24_2db = ts(forecast_24_2db,start=1991,frequency = 12)
plot(forecast_24_2db,type='o',ylab= 'House Price Index',
main="Time Series Plot of House Price Index with the forecasted values")
For the House Price Index dataset of the East Souch Central Division of America we performed Time Series Analysis. The data intially had a trend which we elminated after 2 original differences. After fitting the ARIMA models, evidences for conditional heteroscedasticity were observed in the residues. After further investigation we could fit GARCH component to the residuals of ARIMA model sucessfully. The best model fit through this process is ARIMA(1,2,3) + GARCH(1,1).
The House Price Index is forecasted for the next 24 months.