Financial Econometrics - HOMEWORK 7
Group Members
- Pham Thi Truc Na - MAMAIU20098
- Nguyen Ngoc Kim Chi - MAMAIU20034
- Nguyen Ngoc Khanh Minh - MAMAIU20026
- Le Ngoc Yen - MAMAIU20059
Libraries
## Loading required package: carData
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
##
## Attaching package: 'aTSA'
## The following object is masked from 'package:graphics':
##
## identify
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
##
## Attaching package: 'forecast'
## The following object is masked from 'package:aTSA':
##
## forecast
Constructing ARMA Models
Building an ARMA model for the house price changes involves three stages: identification, estimation and diagnostic checking. As a preliminary step, we check that the series is stationary using the augmented Dickey-Fuller test.
#Import data
library(readxl)
UKHP <- read_excel("E:/FE/UKHP.xls", col_types = c("date", "numeric"))
View(UKHP)
#Create variable:
names(UKHP)[2]= "hp"
UKHP$dhp = c(NA, 100*diff(UKHP$hp)/UKHP$hp[1:(nrow(UKHP)-1)]) #percentage change; parentheses needed because ":" binds tighter than "-"
#Constructing ARMA Models
UKHP = UKHP[-1,] ##drop the first observation, whose dhp is NA
##1. Check stationarity
library(tseries)
adf.test(UKHP$dhp)
##
## Attaching package: 'tseries'
## The following objects are masked from 'package:aTSA':
##
## adf.test, kpss.test, pp.test
## Warning in adf.test(UKHP$dhp): p-value smaller than printed p-value
##
## Augmented Dickey-Fuller Test
##
## data: UKHP$dhp
## Dickey-Fuller = -5.1732, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary
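The p-value is below 0.01 (the warning indicates that the true p-value is smaller than the printed one), so we reject the null hypothesis of a unit root and treat the percentage changes in house prices as stationary.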
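As a rough illustration of what the test does, the sketch below runs the simplest Dickey-Fuller regression (constant, no trend, no lagged differences) by hand in base R. The series `y`, the seed and the AR coefficient are assumptions made for the example, not the UKHP data. `adf.test` from tseries additionally includes a trend and lagged differences, so its statistic will differ.

```r
# Simulated stationary AR(1) series (assumption: stands in for dhp)
set.seed(1)
y <- as.numeric(arima.sim(model = list(ar = 0.5), n = 300))

# Dickey-Fuller regression: diff(y)_t = alpha + gamma * y_{t-1} + e_t
dy   <- diff(y)
ylag <- y[-length(y)]
df_reg <- lm(dy ~ ylag)

# t-statistic on gamma; under H0 (unit root) gamma = 0
df_stat <- summary(df_reg)$coefficients["ylag", "t value"]

# Asymptotic 5% Dickey-Fuller critical value for the constant-only case
crit_5pct <- -2.86
reject_unit_root <- df_stat < crit_5pct

print(c(statistic = df_stat, critical = crit_5pct))
print(reject_unit_root)
```

The statistic is compared with Dickey-Fuller critical values rather than the usual t table, because under the null the t-ratio does not follow a t distribution.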
Estimating Autocorrelation Coefficients
The second stage is carried out by looking at the autocorrelation and partial autocorrelation coefficients to identify any structure in the data.
-> The ACF dies away rather slowly, while only the first two PACF values seem strongly significant.
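The coefficients behind such plots can also be inspected numerically. The sketch below is a minimal example on a simulated AR(2) series (the series `x`, the seed and the coefficients are assumptions, not the UKHP data) using base R's `acf()` and `pacf()` with plotting switched off.

```r
# Simulated AR(2) series (assumption: stands in for dhp)
set.seed(42)
x <- as.numeric(arima.sim(model = list(ar = c(0.25, 0.35)), n = 500))

# Sample ACF (lag 0 included) and PACF (starts at lag 1), without plotting
acf_vals  <- acf(x, lag.max = 12, plot = FALSE)$acf[, 1, 1]
pacf_vals <- pacf(x, lag.max = 12, plot = FALSE)$acf[, 1, 1]

# Approximate 95% band under the null of no autocorrelation
band <- 1.96 / sqrt(length(x))

# Lags whose PACF lies outside the band
significant_pacf_lags <- which(abs(pacf_vals) > band)

print(round(acf_vals[2:7], 3))
print(round(pacf_vals[1:6], 3))
print(significant_pacf_lags)
```

For a pure AR(p) process the PACF should cut off after lag p while the ACF decays gradually, which is the pattern described above.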
Using Information Criteria to Decide on Model Orders
Information criteria trade off goodness of fit (the residual variance) against the number of estimated parameters; the model with the lowest AIC or SBIC value should be chosen.
##
## Attaching package: 'psych'
## The following object is masked from 'package:car':
##
## logit
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.822373 0.059625 13.7923 < 2.2e-16 ***
## ma1 -0.541725 0.087676 -6.1787 6.462e-10 ***
## intercept 0.428586 0.141410 3.0308 0.002439 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## [1] 933.4199
## [1] 948.5675
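To make the two printed criteria concrete, the sketch below recomputes them from the log-likelihood of a fitted model. It is a minimal example on simulated data (the series `x`, the seed and the coefficients are assumptions, not the UKHP data); in R's `arima()` the parameter count `k` includes the innovation variance.

```r
# Simulated ARMA(1,1) series (assumption: stands in for dhp)
set.seed(7)
x <- as.numeric(arima.sim(model = list(ar = 0.8, ma = -0.5), n = 300))
fit <- arima(x, order = c(1, 0, 1))

ll <- logLik(fit)
k  <- attr(ll, "df")   # ar1, ma1, intercept and the innovation variance
n  <- length(x)

# AIC = -2 logL + 2k ; SBIC = -2 logL + k log(n)
aic_manual  <- -2 * as.numeric(ll) + 2 * k
sbic_manual <- -2 * as.numeric(ll) + log(n) * k

# Same values via AIC(); k = log(n) turns the penalty into the SBIC one
print(c(AIC = AIC(fit), manual = aic_manual))
print(c(SBIC = AIC(fit, k = log(n)), manual = sbic_manual))
```

Because log(n) exceeds 2 for any sample with more than 7 observations, the SBIC penalises extra parameters more heavily and tends to select smaller models than the AIC.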
aic_table = array(NA, c(6,6,2))
for (ar in 0:5) {
  for (ma in 0:5) {
    arma = arima(UKHP$dhp, order = c(ar,0,ma))
    aic_table[ar+1, ma+1, 1] = AIC(arma)
    aic_table[ar+1, ma+1, 2] = AIC(arma, k = log(nrow(UKHP)))
  }
}
## Warning in arima(UKHP$dhp, order = c(ar, 0, ma)): possible convergence problem:
## optim gave code = 1
## Warning in arima(UKHP$dhp, order = c(ar, 0, ma)): possible convergence problem:
## optim gave code = 1
## Warning in arima(UKHP$dhp, order = c(ar, 0, ma)): possible convergence problem:
## optim gave code = 1
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1001.2637 977.4020 935.5083 931.4545 930.2949 929.9505
## [2,] 958.7914 933.4199 925.6487 923.7888 924.4999 929.3072
## [3,] 922.4600 924.4080 926.2427 926.4088 926.1038 925.6015
## [4,] 924.4016 926.4344 928.1834 925.9574 925.4296 918.6900
## [5,] 926.2610 928.2587 914.1016 918.3987 927.4242 918.0929
## [6,] 928.2454 927.6420 923.3094 918.1430 920.1053 927.6499
## [1] 17
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1008.8375 988.7627 950.6558 950.3889 953.0163 956.4588
## [2,] 970.1521 948.5675 944.5832 946.5102 951.0082 959.6024
## [3,] 937.6076 943.3425 948.9641 952.9171 956.3990 959.6836
## [4,] 943.3361 949.1558 954.6917 956.2526 959.5117 956.5590
## [5,] 948.9824 954.7670 944.3968 952.4808 965.2931 959.7488
## [6,] 954.7536 957.9371 957.3914 956.0119 961.7612 973.0927
## [1] 3
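For the ARMA(1,1) model estimated first, the AIC is 933.4199 and the SBIC is 948.5675.
The results of the loop are shown in the two tables above, with the AIC values in the first table and the SBIC values in the second.
Row and column indices start at 1, so the AR order is the row index minus 1 and the MA order is the column index minus 1.
The values 17 and 3 printed under the tables are the positions of the minima, counted down the columns: position 17 is row 5, column 3, and position 3 is row 3, column 1.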
Therefore, in this case, the criteria choose different models: AIC selects an ARMA(4,2), while SBIC selects the smaller ARMA(2,0) model.
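The step from the printed positions 17 and 3 to the selected orders can be checked directly. The sketch below copies the values from the two tables above and converts the position of each minimum into (AR, MA) orders; the helper name `best_order` is hypothetical.

```r
# Values copied from the two tables above (rows: AR 0-5, columns: MA 0-5)
aic <- matrix(c(
  1001.2637, 977.4020, 935.5083, 931.4545, 930.2949, 929.9505,
   958.7914, 933.4199, 925.6487, 923.7888, 924.4999, 929.3072,
   922.4600, 924.4080, 926.2427, 926.4088, 926.1038, 925.6015,
   924.4016, 926.4344, 928.1834, 925.9574, 925.4296, 918.6900,
   926.2610, 928.2587, 914.1016, 918.3987, 927.4242, 918.0929,
   928.2454, 927.6420, 923.3094, 918.1430, 920.1053, 927.6499
), nrow = 6, byrow = TRUE)

sbic <- matrix(c(
  1008.8375, 988.7627, 950.6558, 950.3889, 953.0163, 956.4588,
   970.1521, 948.5675, 944.5832, 946.5102, 951.0082, 959.6024,
   937.6076, 943.3425, 948.9641, 952.9171, 956.3990, 959.6836,
   943.3361, 949.1558, 954.6917, 956.2526, 959.5117, 956.5590,
   948.9824, 954.7670, 944.3968, 952.4808, 965.2931, 959.7488,
   954.7536, 957.9371, 957.3914, 956.0119, 961.7612, 973.0927
), nrow = 6, byrow = TRUE)

# Hypothetical helper: position of the minimum, converted to (AR, MA) orders
best_order <- function(tab) {
  pos <- arrayInd(which.min(tab), dim(tab))   # 1-based (row, column)
  c(ar = pos[1, 1] - 1, ma = pos[1, 2] - 1)
}

print(which.min(aic))     # 17
print(best_order(aic))    # ar = 4, ma = 2
print(which.min(sbic))    # 3
print(best_order(sbic))   # ar = 2, ma = 0
```

`which.min()` counts positions column by column, which is why the helper uses `arrayInd()` to recover the row and column before subtracting 1.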
Forecasting Using ARMA model
We now compare ARMA(2,0) and ARMA(4,2) to see which model fits better.
- First, we estimate the ARMA(2,0) model on the sample up to December 2015
##
## Ljung-Box test
##
## data: Residuals from ARIMA(2,0,0) with non-zero mean
## Q* = 5.1323, df = 8, p-value = 0.7433
##
## Model df: 2. Total lags used: 10
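The call that produced the output above is not shown; its layout resembles that of `checkresiduals()` from the forecast package, which is an assumption here. A comparable test can be run in base R with `Box.test()`, as in the minimal sketch below on simulated data (the series `x`, the seed and the coefficients are assumptions, not the UKHP data).

```r
# Simulated AR(2) series (assumption: stands in for dhp)
set.seed(123)
x <- as.numeric(arima.sim(model = list(ar = c(0.25, 0.35)), n = 300))
fit <- arima(x, order = c(2, 0, 0))

# Ljung-Box test on the residuals: 10 lags, 2 estimated ARMA coefficients,
# so the chi-squared statistic has 10 - 2 = 8 degrees of freedom
lb <- Box.test(residuals(fit), lag = 10, type = "Ljung-Box", fitdf = 2)
print(lb)

# H0: no residual autocorrelation up to lag 10; a large p-value does not reject it
residuals_look_white <- lb$p.value > 0.05
print(residuals_look_white)
```

The `fitdf` argument plays the role of "Model df" in the output above: it removes one degree of freedom per estimated AR or MA coefficient.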
### Forecasting
ar2 = arima(UKHP$dhp[UKHP$Month <= "2015-12-01"], order = c(2,0,0))
dynamic_fc = predict(ar2, n.ahead = 27)
coeftest(ar2)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 0.235302 0.054366 4.3281 1.504e-05 ***
## ar2 0.340531 0.054500 6.2482 4.151e-10 ***
## intercept 0.441567 0.137422 3.2132 0.001313 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
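The forecasting step can be sketched as follows. This is a minimal example on simulated data (the series `x`, the seed, the mean of 0.4 and the coefficients are assumptions, not the UKHP data); it holds out the last 27 observations, mirroring `n.ahead = 27` in the code above, and uses the `$pred` and `$se` components returned by `predict()`.

```r
# Simulated AR(2) series with non-zero mean (assumption: stands in for dhp)
set.seed(2024)
x <- 0.4 + as.numeric(arima.sim(model = list(ar = c(0.25, 0.35)), n = 300))

# Hold out the last h observations for out-of-sample evaluation
h <- 27
train <- x[1:(length(x) - h)]
test  <- x[(length(x) - h + 1):length(x)]

fit <- arima(train, order = c(2, 0, 0))
fc  <- predict(fit, n.ahead = h)   # list with $pred (forecasts) and $se
pred <- as.numeric(fc$pred)
se   <- as.numeric(fc$se)

# Approximate 95% forecast interval and out-of-sample RMSE
lower <- pred - 1.96 * se
upper <- pred + 1.96 * se
rmse_out <- sqrt(mean((test - pred)^2))

print(head(cbind(forecast = pred, lower = lower, upper = upper)))
print(rmse_out)
```

Dynamic forecasts feed earlier forecasts back into the recursion, so they drift toward the estimated mean and their standard errors grow with the horizon.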
- Second, we estimate the ARMA(4,2) model on the same sample period
##
## Ljung-Box test
##
## data: Residuals from ARIMA(4,0,2) with non-zero mean
## Q* = 14.429, df = 4, p-value = 0.006044
##
## Model df: 6. Total lags used: 10
### Forecasting
ar42 = arima(UKHP$dhp[UKHP$Month <= "2015-12-01"], order = c(4,0,2))
dynamic_fc = predict(ar42, n.ahead = 27)
coeftest(ar42)
##
## z test of coefficients:
##
## Estimate Std. Error z value Pr(>|z|)
## ar1 1.2147060 0.0537964 22.5797 < 2.2e-16 ***
## ar2 -0.8316660 0.0877442 -9.4783 < 2.2e-16 ***
## ar3 -0.1658027 0.0878208 -1.8880 0.059030 .
## ar4 0.3850525 0.0538454 7.1511 8.61e-13 ***
## ma1 -0.9886525 0.0098747 -100.1199 < 2.2e-16 ***
## ma2 0.9987283 0.0187421 53.2879 < 2.2e-16 ***
## intercept 0.4431573 0.1428588 3.1021 0.001922 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
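One further check that may be worth running on the larger model is whether its MA part is invertible. The sketch below takes the two MA estimates from the output above and computes the roots of the MA polynomial with base R's `polyroot()`; it assumes the sign convention used by `arima()`, where the polynomial is written as 1 + theta1*z + theta2*z^2.

```r
# MA coefficients copied from the ARMA(4,2) output above
ma <- c(-0.9886525, 0.9987283)

# Roots of the MA polynomial 1 + theta1*z + theta2*z^2
ma_roots <- polyroot(c(1, ma))

# Invertibility requires every root to lie outside the unit circle (modulus > 1)
print(Mod(ma_roots))
```

Moduli only marginally above 1 would suggest an MA part close to the invertibility boundary, which is one more reason to treat the larger model with caution.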
Calculate the error measures: MAE, MAPE and RMSE
## ME RMSE MAE MPE MAPE MASE
## Training set 0.00105131 1.013463 0.7716421 -20.5119 317.6721 0.7936192
## ACF1
## Training set -0.0004932139
## ME RMSE MAE MPE MAPE MASE
## Training set -0.0008804006 0.9790295 0.7440369 -190.0471 490.0513 0.7652277
## ACF1
## Training set -0.02676992
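The definitions behind the main columns above can be written out by hand. The sketch below is a minimal example with toy numbers (an assumption, not the UKHP data), and the helper name `error_measures` is hypothetical.

```r
# Hypothetical helper: error measures from actual and fitted values
error_measures <- function(actual, fitted) {
  e <- actual - fitted
  c(ME   = mean(e),
    RMSE = sqrt(mean(e^2)),
    MAE  = mean(abs(e)),
    MPE  = 100 * mean(e / actual),
    MAPE = 100 * mean(abs(e / actual)))
}

# Toy numbers (assumption, not the UKHP data)
actual <- c(2, 4, 5)
fitted <- c(1, 5, 5)
m <- error_measures(actual, fitted)
print(round(m, 4))
```

MPE and MAPE divide each error by the actual value, so they become very large when the series takes values close to zero, as monthly percentage changes often do. That is consistent with the MAPE figures above 300 in the output, and it suggests relying on RMSE and MAE for the comparison.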
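ARMA(2,0) is the preferred model, since:
- Its residuals appear to follow a white noise process (Ljung-Box p-value = 0.7433 > 0.05), whereas the ARMA(4,2) residuals do not (p-value = 0.006044).
- All of its parameters are significant at the 5% level, whereas ar3 in the ARMA(4,2) model is not (p-value = 0.059).
- It is also the model selected by the SBIC.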