Electricity.RData examined in HW1 Exercise 2 (d)-(f). Fit the model using first order differenced log transformed series.a: By visually checking, decide what SARIMA models seem appropriate, i.e., specify p, d, q and P, D, Q in SARIMA model, ARIMA(p, d, q) ∗ ARIMA(P,D, Q)s. Choose the most appropriate two models and explain your answer
load("Electricity.RData")
# first order differenced log transformed series
diff.log.electricity <- diff(log(electricity), lag=1)
acf(diff.log.electricity, lag=60, main = "acf")
Comment: When looking at the ACF plot, we can see that there are plenty repetitive peaks and trough throughout the time series data. They can be detected through very large ACF at s, 2s, 3s, etc., but they do not tail off to zero. This implies that this data is non-stationary seasonal series, and hence, we need to apply further seasonal difference.
sdiff.diff.log.electricity <- diff(diff.log.electricity, lag=12)
par(mfrow=c(1,2))
acf(sdiff.diff.log.electricity, lag=60)
pacf(sdiff.diff.log.electricity, lag=60)
Comment: After further differencing diff.log.electricity, we see that the ACF lags tail-off to 0 rather quick. This means that the data is finally a stationary seasonal series. The PACF also tails off to 0, with repetitive peaks below/above the blue dashed lines. Hence, we will try ARIMA(4,1,3) for both the non-seasonal part, and ARIMA(0,1,1) for the non-seasonal part. Another model to try would be ARIMA(1,1,1) for both non-seasonal and seasonal part.
b: Fit the two SARIMA models which you chose in part (a) using sarima function in R and interpret the result.
sarima(electricity, p=4, d=1, q=3, P=0, D=1, Q=1, S=12)
## initial value 9.219916
## iter 2 value 9.006146
## iter 3 value 8.959792
## iter 4 value 8.903809
## iter 5 value 8.878712
## iter 6 value 8.874496
## iter 7 value 8.868160
## iter 8 value 8.867186
## iter 9 value 8.866672
## iter 10 value 8.866515
## iter 11 value 8.866413
## iter 12 value 8.866387
## iter 13 value 8.866360
## iter 14 value 8.866321
## iter 15 value 8.866269
## iter 16 value 8.866080
## iter 17 value 8.864435
## iter 18 value 8.863320
## iter 19 value 8.862893
## iter 20 value 8.862720
## iter 21 value 8.862314
## iter 22 value 8.861819
## iter 23 value 8.861033
## iter 24 value 8.859630
## iter 25 value 8.857495
## iter 26 value 8.856444
## iter 27 value 8.855050
## iter 28 value 8.852176
## iter 29 value 8.851106
## iter 30 value 8.850572
## iter 31 value 8.850514
## iter 32 value 8.850444
## iter 33 value 8.850334
## iter 34 value 8.850231
## iter 35 value 8.850204
## iter 36 value 8.850179
## iter 37 value 8.850167
## iter 38 value 8.850149
## iter 39 value 8.850137
## iter 40 value 8.850124
## iter 41 value 8.850104
## iter 42 value 8.850099
## iter 43 value 8.850098
## iter 44 value 8.850097
## iter 44 value 8.850097
## iter 44 value 8.850097
## final value 8.850097
## converged
## initial value 8.849589
## iter 2 value 8.849580
## iter 3 value 8.849514
## iter 4 value 8.849467
## iter 5 value 8.849450
## iter 6 value 8.849432
## iter 7 value 8.849428
## iter 8 value 8.849425
## iter 9 value 8.849424
## iter 10 value 8.849424
## iter 11 value 8.849423
## iter 12 value 8.849420
## iter 13 value 8.849415
## iter 14 value 8.849410
## iter 15 value 8.849407
## iter 16 value 8.849407
## iter 16 value 8.849407
## iter 16 value 8.849407
## final value 8.849407
## converged
## $fit
##
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D,
## Q), period = S), include.mean = !no.constant, transform.pars = trans, fixed = fixed,
## optim.control = list(trace = trc, REPORT = 1, reltol = tol))
##
## Coefficients:
## ar1 ar2 ar3 ar4 ma1 ma2 ma3 sma1
## 0.0278 -0.7623 0.4217 -0.1139 -0.4654 0.5371 -0.7767 -0.7047
## s.e. 0.1175 0.0954 0.0752 0.0814 0.1072 0.0612 0.1293 0.0339
##
## sigma^2 estimated as 47334220: log likelihood = -3932.78, aic = 7883.55
##
## $degrees_of_freedom
## [1] 375
##
## $ttable
## Estimate SE t.value p.value
## ar1 0.0278 0.1175 0.2366 0.8131
## ar2 -0.7623 0.0954 -7.9932 0.0000
## ar3 0.4217 0.0752 5.6043 0.0000
## ar4 -0.1139 0.0814 -1.3991 0.1626
## ma1 -0.4654 0.1072 -4.3430 0.0000
## ma2 0.5371 0.0612 8.7704 0.0000
## ma3 -0.7767 0.1293 -6.0046 0.0000
## sma1 -0.7047 0.0339 -20.7951 0.0000
##
## $AIC
## [1] 20.00902
##
## $AICc
## [1] 20.00997
##
## $BIC
## [1] 20.0992
sarima(electricity, p=1, d=1, q=1, P=1, D=1, Q=1, S=12)
## initial value 9.227746
## iter 2 value 9.004840
## iter 3 value 8.963150
## iter 4 value 8.935167
## iter 5 value 8.924893
## iter 6 value 8.906515
## iter 7 value 8.903222
## iter 8 value 8.883118
## iter 9 value 8.875670
## iter 10 value 8.875570
## iter 11 value 8.875512
## iter 12 value 8.875500
## iter 12 value 8.875500
## iter 12 value 8.875500
## final value 8.875500
## converged
## initial value 8.871965
## iter 2 value 8.871869
## iter 3 value 8.871862
## iter 4 value 8.871790
## iter 5 value 8.871780
## iter 6 value 8.871779
## iter 7 value 8.871779
## iter 7 value 8.871779
## iter 7 value 8.871779
## final value 8.871779
## converged
## $fit
##
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D,
## Q), period = S), include.mean = !no.constant, transform.pars = trans, fixed = fixed,
## optim.control = list(trace = trc, REPORT = 1, reltol = tol))
##
## Coefficients:
## ar1 ma1 sar1 sma1
## 0.4204 -0.8864 -0.0447 -0.6962
## s.e. 0.0638 0.0334 0.0658 0.0422
##
## sigma^2 estimated as 49511784: log likelihood = -3941.34, aic = 7892.69
##
## $degrees_of_freedom
## [1] 379
##
## $ttable
## Estimate SE t.value p.value
## ar1 0.4204 0.0638 6.5947 0.0000
## ma1 -0.8864 0.0334 -26.5052 0.0000
## sar1 -0.0447 0.0658 -0.6791 0.4975
## sma1 -0.6962 0.0422 -16.4995 0.0000
##
## $AIC
## [1] 20.03221
##
## $AICc
## [1] 20.03247
##
## $BIC
## [1] 20.08231
c: Check model diagnostics of two candidate models and choose the final one based on diagnostics plots and AIC/ BIC comparison
Comment:
ARIMA (4,1,3) * ARIMA(0,1,1): We can see that most of the p-values for Ljung-Box statistic are fairly high as they’re plotted right above the cutoff point (blue dashed line), although there are some that are located just right on the line. This means that we cannot reject the null hypothesis that specifies our model does not show lack of fit (model is fine), since it’s higher than the cutoff line (0.05)
ARIMA(1,1,1) * ARIMA(1,1,1): The other model, produced randomly, produced a set of p-values lower than the cutoff line of 0.05, though one (lag 9) can be seen plotted right above the blue dashed line. This means that our null hypothesis is rejected, and we can conclude that our model shows lack of fit.
d: Make a prediction for next 4 values based on your final model. Make sure that you predict the values in original scale, not transformed values.
sarima.for(electricity, n.ahead=4, p=4, d=1, q=3, P=0, D=1, Q=1, S=12)
## $pred
## Jan Feb Mar Apr
## 2006 357481.2 316015.9 325577.4 304470.9
##
## $se
## Jan Feb Mar Apr
## 2006 6879.987 7893.337 8203.880 8454.532
gtemp in astsa package analyzed in HW1 Exercise 4. Among 130 time series observations, we will use last 10 series (121-130) for the model validation and rest of them (1-120) for the model train. In other words, we aim to predict last 10 series based on fitted model using first 120 observations.data(gtemp)
train.gtemp = ts(gtemp[1:120]) # train set: for the model selection and fit
test.gtemp = ts(gtemp[121:130]) # test set: only for the evaluation (for calculating prediction error)
a: By visually checking first 120 time series plot, decide which ESM model seems the most appropriate. For example, answer should be like “Additive error + no trend + additive seasonality”. Explain your choice.
plot.ts(train.gtemp)
Comment: Looking at the plot, this dataset appears to have an increasing trend, although there seems to be no sign of seasonality. This is because we don’t see any repetition of data at a certain period of the time interval. Hence, the proper model for this dataset would be additive error + additive trend + no seasonality, or AAN
b: Fit the model for the first 120 data which you chose in part (a). Use ets function with the specification of model input with three letters. Report autoplot() and interpret how your chosen model decompose the data.
train.gtemp.ets = ets(train.gtemp)
summary(train.gtemp.ets)
## ETS(A,N,N)
##
## Call:
## ets(y = train.gtemp)
##
## Smoothing parameters:
## alpha = 0.3775
##
## Initial states:
## l = -0.2644
##
## sigma: 0.0999
##
## AIC AICc BIC
## 25.59878 25.80568 33.96126
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 0.01414113 0.09905458 0.0810756 NaN Inf 0.9067666 0.1060173
autoplot(train.gtemp.ets)
Comment: When looking at the summary, we can see that the alpha and beta values are very small. This means that the slope and trend components change very little over time. This can be backed up by the plot generated as there’s not a lot of range for both the level and slope.
c: Make a prediction for next 10 values based on your final model. Calculate the prediction error using test set. As we know the true Y from the test set, it is the mean of squared error.
train.gtemp.ets <- ets(train.gtemp, model="AAN")
pred.ets.gtemp <- forecast(train.gtemp.ets, h=10)
plot(pred.ets.gtemp)
mean((pred.ets.gtemp$mean - as.vector(test.gtemp))^2)
## [1] 0.01915633
pred.ets.gtemp
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 121 0.3740666 0.2462632 0.5018700 0.1786081 0.5695250
## 122 0.3791551 0.2467692 0.5115410 0.1766883 0.5816219
## 123 0.3842436 0.2474254 0.5210618 0.1749982 0.5934891
## 124 0.3893321 0.2482176 0.5304467 0.1735160 0.6051482
## 125 0.3944206 0.2491337 0.5397075 0.1722235 0.6166178
## 126 0.3995092 0.2501635 0.5488548 0.1711046 0.6279137
## 127 0.4045977 0.2512977 0.5578976 0.1701456 0.6390497
## 128 0.4096862 0.2525286 0.5668437 0.1693345 0.6500379
## 129 0.4147747 0.2538493 0.5757001 0.1686605 0.6608889
## 130 0.4198632 0.2552534 0.5844730 0.1681142 0.6716122
e: Compare prediction performance between ESM and ARIMA
electricity data analyzed in Exercise 1. Among 396 time series observations, we will use last 26 series (371-396) for the model validation and rest of them (1-370) for the model train. In other words, we aim to predict last 26 series based on fitted model using first 370 observations.train.electricity = ts(electricity[1:370], freq=12) # train set: for the model selection and fit
test.electricity = ts(electricity[371:396], freq=12) # test set: only for the evaluation (for calculating prediction error)
a: By visually checking first 370 time series plot, decide which ESM model seems the most appropriate. Explain your choice.
plot(train.electricity)
Comment: Looking at the plot, this dataset appears to have an increasing trend, with a multiplicative seasonality. This is because we can see that the peaks and troughs of the data tend increase (vary in frequency and amplitude over time) Hence, the proper model for this dataset would be multiplicative error + additive trend + multiplicative seasonality, or MAM
b: Fit the model for the first 370 data which you chose in part (a). Use ets function with the specification of model input with three letters. Report autoplot() and interpret how your chosen model decompose the data.
electricity.ets <- ets(train.electricity, model = "MAM")
summary(electricity.ets)
## ETS(M,Ad,M)
##
## Call:
## ets(y = train.electricity, model = "MAM")
##
## Smoothing parameters:
## alpha = 0.6427
## beta = 5e-04
## gamma = 0.0067
## phi = 0.9786
##
## Initial states:
## l = 153445.5704
## b = 661.7965
## s = 1.0085 0.9265 0.9389 0.9852 1.1337 1.1436
## 1.0381 0.9614 0.9009 0.9693 0.935 1.0588
##
## sigma: 0.0276
##
## AIC AICc BIC
## 8678.796 8680.745 8749.239
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 624.7038 6691.059 4901.356 0.1873471 2.054026 0.5697099 0.07684258
autoplot(electricity.ets)
Comment: The small values of β and γ mean that the slope and seasonal components change very little over time. This can be backed up by the small range for season and slope. The seasonal part is very clear as it is indicated by the regular pattern by the season plot. The trend slope seems to have a decreasing trend, although it only ranges from 200 to 600 - short period. The level, however, shows a large scale ranging from 150000 to 300000.
c: Make a prediction for next 26 values based on your final model. Calculate the prediction error using validation set.
train.electricity.ets <- ets(train.electricity, model="MAM")
pred.ets.electricity <- forecast(train.electricity.ets, h=26)
plot(pred.ets.electricity)
mean((pred.ets.electricity$mean - as.vector(test.electricity))^2)
## [1] 142228849
pred.ets.electricity
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Nov 31 303293.6 292555.1 314032.0 286870.5 319716.6
## Dec 31 330488.9 316574.8 344403.1 309209.0 351768.8
## Jan 32 346744.6 330142.6 363346.6 321354.0 372135.1
## Feb 32 306258.0 290014.0 322501.9 281414.9 331101.0
## Mar 32 317535.3 299197.8 335872.7 289490.6 345579.9
## Apr 32 295170.2 276837.4 313503.0 267132.6 323207.8
## May 32 314895.6 294052.9 335738.3 283019.4 346771.8
## Jun 32 340250.6 316420.0 364081.1 303804.9 376696.2
## Jul 32 374844.9 347221.8 402468.0 332599.0 417090.8
## Aug 32 372008.6 343297.5 400719.6 328098.8 415918.4
## Sep 32 323057.0 297044.8 349069.3 283274.7 362839.4
## Oct 32 307804.4 282031.4 333577.3 268388.1 347220.7
## Nov 32 303462.6 277093.8 329831.4 263135.0 343790.2
## Dec 32 330669.2 300946.4 360392.0 285212.1 376126.3
## Jan 33 346929.7 314738.7 379120.6 297697.9 396161.4
## Feb 33 306418.0 277122.8 335713.1 261614.8 351221.1
## Mar 33 317697.6 286453.4 348941.8 269913.7 365481.5
## Apr 33 295317.9 265486.2 325149.6 249694.3 340941.5
## May 33 315049.8 282404.4 347695.3 265122.9 364976.7
## Jun 33 340413.7 304274.0 376553.3 285142.8 395684.5
## Jul 33 375020.7 334274.1 415767.3 312704.2 437337.3
## Aug 33 372179.4 330835.2 413523.6 308948.8 435409.9
## Sep 33 323202.2 286527.6 359876.8 267113.2 379291.2
## Oct 33 307939.7 272276.5 343603.0 253397.5 362482.0
## Nov 33 303593.2 267721.9 339464.5 248732.8 358453.6
## Dec 33 330808.5 290975.8 370641.2 269889.7 391727.3
d: Now, you fit SARIMA for the first 370 data using the chosen model in Exercise 1. Predict the next 26 values based on this SARIMA model. Just borrow your answer from Exercise 1, fit the model, and calculate the prediction error.
# ARIMA(5,1,0) * ARIMA(0,1,1)12
sarima(train.electricity, p=4, d=1, q=3, P=0, D=1, Q=1, S=12)
## initial value 9.193982
## iter 2 value 8.985677
## iter 3 value 8.971106
## iter 4 value 8.902322
## iter 5 value 8.879036
## iter 6 value 8.864474
## iter 7 value 8.858320
## iter 8 value 8.853862
## iter 9 value 8.853425
## iter 10 value 8.848685
## iter 11 value 8.847490
## iter 12 value 8.846848
## iter 13 value 8.846755
## iter 14 value 8.846753
## iter 15 value 8.846752
## iter 16 value 8.846745
## iter 17 value 8.846730
## iter 18 value 8.846657
## iter 19 value 8.846636
## iter 20 value 8.846610
## iter 21 value 8.846597
## iter 22 value 8.846580
## iter 23 value 8.846570
## iter 24 value 8.846562
## iter 25 value 8.846509
## iter 26 value 8.846293
## iter 27 value 8.846085
## iter 28 value 8.845994
## iter 29 value 8.845906
## iter 30 value 8.845795
## iter 31 value 8.845632
## iter 32 value 8.845155
## iter 33 value 8.844371
## iter 34 value 8.843856
## iter 35 value 8.843532
## iter 36 value 8.843497
## iter 37 value 8.843333
## iter 38 value 8.843048
## iter 39 value 8.842535
## iter 40 value 8.839372
## iter 41 value 8.838742
## iter 42 value 8.836111
## iter 43 value 8.834603
## iter 44 value 8.833865
## iter 45 value 8.833446
## iter 46 value 8.832701
## iter 47 value 8.832591
## iter 48 value 8.832507
## iter 49 value 8.832445
## iter 50 value 8.832425
## iter 51 value 8.832419
## iter 52 value 8.832413
## iter 53 value 8.832387
## iter 54 value 8.832366
## iter 55 value 8.832357
## iter 56 value 8.832356
## iter 56 value 8.832356
## iter 56 value 8.832356
## final value 8.832356
## converged
## initial value 8.830843
## iter 2 value 8.830777
## iter 3 value 8.830762
## iter 4 value 8.830699
## iter 5 value 8.830687
## iter 6 value 8.830656
## iter 7 value 8.830638
## iter 8 value 8.830623
## iter 9 value 8.830595
## iter 10 value 8.830574
## iter 11 value 8.830552
## iter 12 value 8.830545
## iter 13 value 8.830543
## iter 14 value 8.830539
## iter 15 value 8.830532
## iter 16 value 8.830526
## iter 17 value 8.830524
## iter 17 value 8.830524
## iter 17 value 8.830524
## final value 8.830524
## converged
## $fit
##
## Call:
## stats::arima(x = xdata, order = c(p, d, q), seasonal = list(order = c(P, D,
## Q), period = S), include.mean = !no.constant, transform.pars = trans, fixed = fixed,
## optim.control = list(trace = trc, REPORT = 1, reltol = tol))
##
## Coefficients:
## ar1 ar2 ar3 ar4 ma1 ma2 ma3 sma1
## 0.1452 -0.8058 0.3933 -0.1263 -0.5854 0.6606 -0.7558 -0.6922
## s.e. 0.0987 0.1240 0.0741 0.0628 0.0898 0.1570 0.0914 0.0363
##
## sigma^2 estimated as 45540557: log likelihood = -3659.06, aic = 7336.12
##
## $degrees_of_freedom
## [1] 349
##
## $ttable
## Estimate SE t.value p.value
## ar1 0.1452 0.0987 1.4705 0.1423
## ar2 -0.8058 0.1240 -6.5000 0.0000
## ar3 0.3933 0.0741 5.3059 0.0000
## ar4 -0.1263 0.0628 -2.0117 0.0450
## ma1 -0.5854 0.0898 -6.5205 0.0000
## ma2 0.6606 0.1570 4.2083 0.0000
## ma3 -0.7558 0.0914 -8.2715 0.0000
## sma1 -0.6922 0.0363 -19.0840 0.0000
##
## $AIC
## [1] 19.9351
##
## $AICc
## [1] 19.93619
##
## $BIC
## [1] 20.02993
# prediction
sarima.electricity.pred <- sarima.for(train.electricity, n.ahead=26, p=4, d=1, q=3, P=0, D=1, Q=1, S=12)
names(sarima.electricity.pred)
## [1] "pred" "se"
mean((sarima.electricity.pred$pred - as.vector(test.electricity))^2)
## [1] 85495410
e: Compare prediction performance between ESM and ARIMA.
Comment: We can see that both prediction performance between ESM and ARIMA aren’t that different from each other. When looking at both plots, they produce basically the same forecast, although the prediction error for ARIMA is less than the prediction error rate for our ESM model.
usdhkd.RData contains daily USD/HKD (US dollar to Hong Kong dollar) exchange rate from January 1, 2005 to March 7, 2006load("usdhkd.RData")
a: Generate the time series plot on exchange rate. Generate ACF and PACF from both raw data and squared data, respectively. Make comments based on your observation.
plot.ts(usdhkd)
#ACF/PACF
par(mfrow=c(1,2))
acf(usdhkd)
acf(usdhkd^2)
#ACF/PACF^2
par(mfrow=c(1,2))
pacf(usdhkd)
pacf(usdhkd^2)
Comment: For the ACF for the original time series, we can see that only one lag crosses the blue dashed lines, making them insignificant. However, when we squared the data, we see a few of these lags cross the blue dashed lines, making them significant. The same goes to out PACF plots, only this time, the original time series data have more lags that cross the blue dashed lines than the squared time series data.
b: Fit total four mdoels – ARCH(1), ARCH(2), GARCH(1,1) and GARCH(2,2). Then choose the best one based on diagnostics test on residuals and AIC criteria. There is no one correct model and choose the one that you think the optimal
# ARCH(1)
usdhkd.garch1 <- garchFit(~garch(1,0), data=usdhkd, trace=F)
summary(usdhkd.garch1)
##
## Title:
## GARCH Modelling
##
## Call:
## garchFit(formula = ~garch(1, 0), data = usdhkd, trace = F)
##
## Mean and Variance Equation:
## data ~ garch(1, 0)
## <environment: 0x7fec26e93f88>
## [data = usdhkd]
##
## Conditional Distribution:
## norm
##
## Coefficient(s):
## mu omega alpha1
## -0.00117994 0.00045482 0.49888267
##
## Std. Errors:
## based on Hessian
##
## Error Analysis:
## Estimate Std. Error t value Pr(>|t|)
## mu -1.180e-03 1.053e-03 -1.120 0.262606
## omega 4.548e-04 4.296e-05 10.588 < 2e-16 ***
## alpha1 4.989e-01 1.336e-01 3.733 0.000189 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log Likelihood:
## 977.8896 normalized: 2.268885
##
## Description:
## Mon May 10 21:12:04 2021 by user:
##
##
## Standardised Residuals Tests:
## Statistic p-Value
## Jarque-Bera Test R Chi^2 4337.001 0
## Shapiro-Wilk Test R W 0.8287354 0
## Ljung-Box Test R Q(10) 20.34018 0.02619446
## Ljung-Box Test R Q(15) 26.32464 0.03474572
## Ljung-Box Test R Q(20) 30.61091 0.06054118
## Ljung-Box Test R^2 Q(10) 9.38141 0.4963288
## Ljung-Box Test R^2 Q(15) 10.35731 0.7966824
## Ljung-Box Test R^2 Q(20) 11.16525 0.9418239
## LM Arch Test R TR^2 9.46025 0.663197
##
## Information Criterion Statistics:
## AIC BIC SIC HQIC
## -4.523850 -4.495547 -4.523946 -4.512675
# ARCH(2)
usdhkd.garch2 <- garchFit(~garch(2,0), data=usdhkd, trace=F)
summary(usdhkd.garch2)
##
## Title:
## GARCH Modelling
##
## Call:
## garchFit(formula = ~garch(2, 0), data = usdhkd, trace = F)
##
## Mean and Variance Equation:
## data ~ garch(2, 0)
## <environment: 0x7fec29e640e0>
## [data = usdhkd]
##
## Conditional Distribution:
## norm
##
## Coefficient(s):
## mu omega alpha1 alpha2
## -0.00046702 0.00026598 0.54327969 0.48701087
##
## Std. Errors:
## based on Hessian
##
## Error Analysis:
## Estimate Std. Error t value Pr(>|t|)
## mu -4.670e-04 8.699e-04 -0.537 0.591
## omega 2.660e-04 3.473e-05 7.658 1.89e-14 ***
## alpha1 5.433e-01 1.323e-01 4.108 3.99e-05 ***
## alpha2 4.870e-01 1.108e-01 4.395 1.11e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log Likelihood:
## 997.0851 normalized: 2.313422
##
## Description:
## Mon May 10 21:12:05 2021 by user:
##
##
## Standardised Residuals Tests:
## Statistic p-Value
## Jarque-Bera Test R Chi^2 1450.006 0
## Shapiro-Wilk Test R W 0.8744017 0
## Ljung-Box Test R Q(10) 18.6254 0.04528692
## Ljung-Box Test R Q(15) 26.25545 0.03542182
## Ljung-Box Test R Q(20) 28.93381 0.08907092
## Ljung-Box Test R^2 Q(10) 8.642008 0.5663814
## Ljung-Box Test R^2 Q(15) 11.90956 0.6858611
## Ljung-Box Test R^2 Q(20) 16.36083 0.6940018
## LM Arch Test R TR^2 9.24412 0.68195
##
## Information Criterion Statistics:
## AIC BIC SIC HQIC
## -4.608283 -4.570547 -4.608454 -4.593384
# GARCH(1,1)
usdhkd.garch11 = garchFit(~garch(1,1), data=usdhkd, trace=F)
summary(usdhkd.garch11)
##
## Title:
## GARCH Modelling
##
## Call:
## garchFit(formula = ~garch(1, 1), data = usdhkd, trace = F)
##
## Mean and Variance Equation:
## data ~ garch(1, 1)
## <environment: 0x7fec289a3ed0>
## [data = usdhkd]
##
## Conditional Distribution:
## norm
##
## Coefficient(s):
## mu omega alpha1 beta1
## -1.6576e-04 1.5468e-05 2.4065e-01 8.0449e-01
##
## Std. Errors:
## based on Hessian
##
## Error Analysis:
## Estimate Std. Error t value Pr(>|t|)
## mu -1.658e-04 8.902e-04 -0.186 0.8523
## omega 1.547e-05 1.904e-05 0.812 0.4167
## alpha1 2.406e-01 1.362e-01 1.767 0.0772 .
## beta1 8.045e-01 1.052e-01 7.648 2.04e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log Likelihood:
## 999.3168 normalized: 2.3186
##
## Description:
## Mon May 10 21:12:05 2021 by user:
##
##
## Standardised Residuals Tests:
## Statistic p-Value
## Jarque-Bera Test R Chi^2 2690.144 0
## Shapiro-Wilk Test R W 0.858535 0
## Ljung-Box Test R Q(10) 20.78275 0.02266047
## Ljung-Box Test R Q(15) 24.94071 0.05074532
## Ljung-Box Test R Q(20) 26.92712 0.1373269
## Ljung-Box Test R^2 Q(10) 3.768855 0.9571724
## Ljung-Box Test R^2 Q(15) 6.011632 0.9795517
## Ljung-Box Test R^2 Q(20) 8.392027 0.9889398
## LM Arch Test R TR^2 4.156909 0.9804356
##
## Information Criterion Statistics:
## AIC BIC SIC HQIC
## -4.618639 -4.580903 -4.618810 -4.603740
# GARCH (2,2)
usdhkd.garch22 = garchFit(~garch(2,2), data=usdhkd, trace=F)
## Warning in sqrt(diag(fit$cvar)): NaNs produced
summary(usdhkd.garch22)
##
## Title:
## GARCH Modelling
##
## Call:
## garchFit(formula = ~garch(2, 2), data = usdhkd, trace = F)
##
## Mean and Variance Equation:
## data ~ garch(2, 2)
## <environment: 0x7fec2b3659a8>
## [data = usdhkd]
##
## Conditional Distribution:
## norm
##
## Coefficient(s):
## mu omega alpha1 alpha2 beta1 beta2
## -1.0068e-04 1.3247e-05 2.4447e-01 1.0000e-08 5.5623e-01 2.4114e-01
##
## Std. Errors:
## based on Hessian
##
## Error Analysis:
## Estimate Std. Error t value Pr(>|t|)
## mu -1.007e-04 8.768e-04 -0.115 0.90858
## omega 1.325e-05 9.320e-06 1.421 0.15519
## alpha1 2.445e-01 8.728e-02 2.801 0.00509 **
## alpha2 1.000e-08 NA NA NA
## beta1 5.562e-01 NA NA NA
## beta2 2.411e-01 NA NA NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log Likelihood:
## 1000.581 normalized: 2.321534
##
## Description:
## Mon May 10 21:12:05 2021 by user:
##
##
## Standardised Residuals Tests:
## Statistic p-Value
## Jarque-Bera Test R Chi^2 2936.158 0
## Shapiro-Wilk Test R W 0.859941 0
## Ljung-Box Test R Q(10) 21.39029 0.01853056
## Ljung-Box Test R Q(15) 25.39566 0.04487562
## Ljung-Box Test R Q(20) 27.34351 0.1258733
## Ljung-Box Test R^2 Q(10) 3.013569 0.9811029
## Ljung-Box Test R^2 Q(15) 5.023137 0.9919287
## Ljung-Box Test R^2 Q(20) 7.216473 0.9959124
## LM Arch Test R TR^2 3.459938 0.9913337
##
## Information Criterion Statistics:
## AIC BIC SIC HQIC
## -4.615226 -4.558622 -4.615607 -4.592877
Comment: Although all four models have residuals that are not normally distributed (p-value < 0.05), I believe that GARCH(1,1) would be the best model out of all four because it has only one insignificant p-value for its residuals, while the others have more than one. Moreover, it has the lowest AIC and BIC values out of all models.
c: Make a prediction for next 5 volatility based on your final model
predict(usdhkd.garch11, n.ahead=5, plot=TRUE, conf=.9)
## meanForecast meanError standardDeviation lowerInterval upperInterval
## 1 -0.0001657554 0.01384428 0.01384428 -0.02293757 0.02260606
## 2 -0.0001657554 0.01468953 0.01468953 -0.02432788 0.02399637
## 3 -0.0001657554 0.01552382 0.01552382 -0.02570016 0.02536865
## 4 -0.0001657554 0.01635032 0.01635032 -0.02705965 0.02672813
## 5 -0.0001657554 0.01717167 0.01717167 -0.02841064 0.02807913
Comment: The prediction error is 1.91%, which is acceptable.
d: Now, you fit ARIMA for the first 120 data using the chosen model in Homework1 Exercise 4. Predict the next 10 values based on this ARIMA model. Fit the model, and calculate the prediction error.