Note: conditions for a white noise series (violating either one suggests the series is not white noise): (1) no single large spike lies outside the 95% bounds; (2) no more than 5% of spikes lie outside the bounds.
x1 might not be white noise (its lags appear autocorrelated), since the large spike at lag 12 seems to cross the 95% bounds. However, this is borderline, so we may have to rely on the Ljung-Box test to see whether the p-value falls below 0.05.
x2 also might not be white noise, given the relatively large spikes at lags 2 and 6 that cross the bounds, but again the Ljung-Box test is required to confirm this.
x3 looks like white noise, since all lags are within the bounds and no large spikes fall outside them. However, the spike at lag 20 is borderline, so we cannot be completely sure; the Ljung-Box test is required once again.
It is clear, however, that none of these series has more than 5% of its spikes outside the bounds, so the 2nd condition is satisfied for all three. The 1st condition, about one or more large spikes crossing the bounds, cannot be settled visually, so the Ljung-Box test is required to confirm whether each series really is white noise.
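A minimal sketch of that check (the names x1, x2, x3 for the three plotted series are assumed; they are not defined in this document):
# hypothetical objects: x1, x2, x3 hold the three random-number series
for (x in list(x1, x2, x3)) {
  print(Box.test(x, lag = 10, type = "Ljung-Box"))  # H0: no autocorrelation (white noise)
}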
The critical values are centred on a mean of zero because we expect each autocorrelation of a white noise series to be close to zero; a spike that deviates far enough from zero to cross the bounds may indicate the data are autocorrelated (not white noise).
The autocorrelations and the critical values differ between the figures because the sample sizes differ: the bounds are +-2/sqrt(N), where N is the sample size, so the larger the N, the narrower the bounds. The fluctuation of the spikes at each lag also tends to shrink as the sample size grows.
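A quick sketch of how the bounds shrink with N (the sample sizes 36, 360, and 1,000 are assumed from the standard version of this exercise):
# 95% bounds for the ACF of a white noise series: +-2/sqrt(N)
N <- c(36, 360, 1000)
data.frame(N = N, bound = 2 / sqrt(N))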
ibmclose). Use R to plot the daily closing prices for IBM stock and the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.
ggAcf shows that both conditions for white noise are violated: the autocorrelations are large at every lag and decay very slowly, which is the classic signature of a non-stationary series. ggPacf shows a single very large spike at lag 1, violating the 1st condition, which alone is enough to say the data are not white noise (not stationary). The series should therefore be differenced.
autoplot(ibmclose) # + autolayer(ibmclose)
ggAcf(ibmclose)
ggPacf(ibmclose)
Box.test(ibmclose, type = c("Ljung-Box"))
##
## Box-Ljung test
##
## data: ibmclose
## X-squared = 367.11, df = 1, p-value < 2.2e-16
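As a sketch of the implied fix, forecast::ndiffs() estimates how many first differences are needed, and the differenced series can be re-checked:
ndiffs(ibmclose)                  # estimated number of first differences required
ggAcf(diff(ibmclose))             # ACF of the differenced series
Box.test(diff(ibmclose), type = "Ljung-Box")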
The data sets are generated with the following code.
# simulate an AR(1) with phi_1 = 0.6 and error variance 1
y <- ts(numeric(100))
e <- rnorm(100)
for(i in 2:100)
  y[i] <- 0.6*y[i-1] + e[i]
y
## Time Series:
## Start = 1
## End = 100
## Frequency = 1
## [1] 0.000000000 0.818123693 1.750733181 1.935355795 0.778304985
## [6] 0.187865212 0.695718462 1.491346666 1.070980186 0.551181763
## [11] 1.208260372 0.955434253 0.545495272 0.670635464 0.298774950
## [16] 0.650644785 -0.213995766 -1.383048208 -0.636606908 0.965388454
## [21] 1.248386521 2.622791257 1.109780687 1.010431007 1.197863621
## [26] 0.892816207 1.688625561 -0.407721928 -2.231313452 -0.783508432
## [31] -1.094433287 -1.082789709 -0.151499491 -1.022531250 -0.817622554
## [36] -0.686311827 -0.001718324 -0.091683729 -1.511514193 -1.123975817
## [41] -1.104596824 1.505670479 1.790316999 -1.116293798 -1.057907490
## [46] -1.177534024 -2.986856518 -0.738838020 0.063507723 -0.534658415
## [51] -0.879884098 -0.526992087 -0.288159465 -0.415629111 0.308349963
## [56] 0.488099034 -1.381228842 0.752162099 0.948320015 1.911475750
## [61] 1.596326986 0.974267160 0.903681052 -0.267089739 0.083034640
## [66] -0.631678318 1.408528651 0.670087945 0.170090229 0.891703469
## [71] 0.068699791 0.296097411 -1.030075412 0.842359841 0.372255176
## [76] 0.115854416 -0.741962074 -1.217073061 -1.407538114 -0.676621702
## [81] -1.316654921 -0.355456752 1.167305332 -1.129610403 -1.143347940
## [86] -0.034693579 -0.341072559 -0.380792818 -0.274650591 1.105549231
## [91] 0.724730742 -0.691651596 -0.442948976 -0.821664212 0.340567110
## [96] 0.511240722 -2.047177517 -0.650056179 -0.626510300 -1.465252847
The following conditions hold for an AR(1) model:
1. When phi_1 = 0, y_t is white noise.
2. When phi_1 = 1 and c = 0, y_t is a random walk.
3. When phi_1 = 1 and c != 0, y_t is a random walk with drift.
4. When phi_1 < 0, y_t tends to oscillate around the mean.
Normally, we restrict AR(1) models to stationary data, which requires -1 < phi_1 < 1.
As you can see, as the parameter approaches 0, y_t tends to revolve around 0 and behaves like white noise; as the parameter approaches 1, y_t tends to become a random walk.
Mathematically, this makes sense. When phi_1 = 0, the equation becomes y_t = e_t, so y_t is indeed white noise (the error term is the only thing driving y_t). It is apparent from ggAcf(y2) and autoplot(y2) that the series has no particular seasonality or trend, and all spikes are within the bounds.
When phi_1 = 1, the equation becomes y_t = y_t-1 + e_t. Defining y't = y_t - y_t-1, we get y't = e_t: the first difference (the change between consecutive observations) equals the error term, so the differenced series is white noise. It is apparent from ggAcf(y3) and autoplot(y3) that the series has long periods of apparent trends up or down, as well as sudden and unpredictable changes in direction.
# 0.6
autoplot(y)
# lower - 0.5
y2 <- ts(numeric(100))
e2 <- rnorm(100)
for(i in 2:100)
y2[i] <- 0.5*y2[i-1] + e2[i]
autoplot(y2)
# 0.3
for(i in 2:100)
y2[i] <- 0.3*y2[i-1] + e2[i]
autoplot(y2)
# 0
for(i in 2:100)
y2[i] <- 0*y2[i-1] + e2[i]
# param = 0 graphs
autoplot(y2)
ggAcf(y2)
# higher - 0.7
y3 <- ts(numeric(100))
e3 <- rnorm(100)
for(i in 2:100)
y3[i] <- 0.7*y3[i-1] + e3[i]
autoplot(y3)
# higher - 0.8
for(i in 2:100)
y3[i] <- 0.8*y3[i-1] + e3[i]
autoplot(y3)
# higher - 1
for(i in 2:100)
y3[i] <- 1*y3[i-1] + e3[i]
# param = 1 graphs
autoplot(y3)
ggAcf(y3)
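As a quick check of the identity y't = e_t above, differencing the phi_1 = 1 series should leave approximately white noise:
y3diff <- diff(y3)     # y'_t = y_t - y_{t-1} = e_t for a random walk
autoplot(y3diff)
ggAcf(y3diff)          # spikes should now stay within the bounds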
# c. Write your own code to generate data from an MA(1) model with theta_1 = 0.6 and var = 1.
The plot below shows the simulated time series.
## MA(1) with theta_1 = 0.6 and error variance 1
ma.sim <- arima.sim(n=100, model=list(ma=0.6), sd=1)
autoplot(ma.sim)
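Since the exercise asks for hand-written code, here is an equivalent manual simulation (a sketch; the variable names are my own):
ma1 <- ts(numeric(100))
e <- rnorm(100)                    # white noise errors with variance 1
for(i in 2:100)
  ma1[i] <- e[i] + 0.6*e[i-1]      # MA(1): y_t = e_t + theta_1 * e_{t-1}
autoplot(ma1)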
# d. Produce a time plot for the series. How does the plot change as you change theta_1?
As theta_1 increases, the lag-1 autocorrelation grows and the series looks smoother, with consecutive observations moving together more; compare theta_1 = 0.95 with theta_1 = 0.05. Unlike an AR(1) as phi_1 approaches 1, though, an MA(1) remains stationary for every value of theta_1.
## set up plot region
par(mfrow=c(3,3))
## loop over values of theta_1
for(q in c(0.05, 0.1, 0.3, 0.7, 0.8, 0.95)) {
  MA.pos <- list(order=c(0,0,1), ma=q)
  ma.sim <- arima.sim(n=100, model=MA.pos, sd=1)
  plot.ts(ma.sim, ylab=paste("theta_1 = ", q, sep=""))
  acf(ma.sim)
  pacf(ma.sim)
}
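The theoretical ACF explains this behaviour: for an MA(1), rho_1 = theta_1/(1 + theta_1^2) and rho_k = 0 for k > 1, so a larger theta_1 only strengthens the lag-1 correlation. A quick confirmation with stats::ARMAacf():
# theoretical autocorrelations at lags 0-2 for several values of theta_1
sapply(c(0.05, 0.5, 0.95), function(th) ARMAacf(ma = th, lag.max = 2))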
# e. Generate data from an ARMA(1,1) model with phi_1 = 0.6, theta_1 = 0.6, var = 1
The data are generated below.
arma.pos <- list(order=c(1,0,1), ar=0.6, ma=0.6)
arma.sim <- arima.sim(n=100, model=arma.pos, sd=1)  # sd is an argument of arima.sim, not of the model list
Arima(arma.sim)
## Series: arma.sim
## ARIMA(0,0,0) with non-zero mean
##
## Coefficients:
## mean
## 0.446
## s.e. 0.157
##
## sigma^2 estimated as 2.489: log likelihood=-186.99
## AIC=377.99 AICc=378.11 BIC=383.2
Box.test(arma.sim, type = c("Ljung-Box"))
##
## Box-Ljung test
##
## data: arma.sim
## X-squared = 49.59, df = 1, p-value = 1.895e-12
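Note that Arima() called with no order argument defaults to order = c(0,0,0), which is why the fit above contains no AR or MA terms. A sketch fitting the order that actually generated the data:
Arima(arma.sim, order = c(1,0,1))  # fit the true ARMA(1,1) order
# alternatively, auto.arima(arma.sim) would select an order automatically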
# f. Generate data from an AR(2) model with phi_1 = -0.8, phi_2 = 0.3, var = 1
The data are generated below.
# arima.sim() rejects these coefficients because the AR(2) is non-stationary,
# so the series is simulated manually below:
#ar.pos <- list(order=c(2,0,0), ar=c(-0.8, 0.3), sd=1)
#ar.sim <- arima.sim(n=100, model=ar.pos)
#Arima(ar.sim)
y <- ts(numeric(100))
e <- rnorm(100)
for(i in 3:100)
y[i] <- (-0.8*y[i-1] + 0.3*y[i-2] + e[i])
head(y)
## Time Series:
## Start = 1
## End = 6
## Frequency = 1
## [1] 0.0000000 0.0000000 -0.0883549 -0.2750319 0.9716661 -1.2132951
For the ARMA(1,1), the data keep revolving around the mean through the whole period (stationary), whereas the AR(2) data oscillate in a strong cyclic pattern whose amplitude grows toward the end (non-stationary).
autoplot(arma.sim)
autoplot(y)
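The non-stationarity of the AR(2) can be confirmed from its characteristic polynomial: stationarity requires both roots of 1 - phi_1 z - phi_2 z^2 to lie outside the unit circle. A quick check with the coefficients used above:
# roots of 1 + 0.8z - 0.3z^2 (phi_1 = -0.8, phi_2 = 0.3)
abs(polyroot(c(1, 0.8, -0.3)))  # one modulus is below 1, so the process is not stationary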
# 8.8. Consider austa, the total international visitors to Australia (in millions) for the period 1980-2015.
# a. Use auto.arima() to find an appropriate ARIMA model. What model was selected? Check that the residuals look like white noise. Plot forecasts for the next 10 periods.
ARIMA(0,1,1) with drift was selected.
The Ljung-Box test fails to reject the null hypothesis that the residuals are uncorrelated, so the residuals are consistent with white noise. They also look roughly normally distributed.
Thus, we can conclude that the model should forecast reliably.
The forecast plot shows an upward trend over the next 10 periods.
model <- auto.arima(austa)
checkresiduals(model)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,1,1) with drift
## Q* = 2.297, df = 5, p-value = 0.8067
##
## Model df: 2. Total lags used: 7
pred <- forecast(model, h = 10)
autoplot(pred)
summary(pred)
##
## Forecast method: ARIMA(0,1,1) with drift
##
## Model Information:
## Series: austa
## ARIMA(0,1,1) with drift
##
## Coefficients:
## ma1 drift
## 0.3006 0.1735
## s.e. 0.1647 0.0390
##
## sigma^2 estimated as 0.03376: log likelihood=10.62
## AIC=-15.24 AICc=-14.46 BIC=-10.57
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.0008313383 0.1759116 0.1520309 -1.069983 5.513269 0.7461559
## ACF1
## Training set -0.000571993
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2016 7.108647 6.873183 7.344111 6.748536 7.468758
## 2017 7.282129 6.895831 7.668428 6.691337 7.872922
## 2018 7.455612 6.962652 7.948571 6.701695 8.209529
## 2019 7.629094 7.048756 8.209432 6.741543 8.516644
## 2020 7.802576 7.146393 8.458758 6.799031 8.806120
## 2021 7.976058 7.251932 8.700184 6.868603 9.083513
## 2022 8.149540 7.363321 8.935760 6.947121 9.351960
## 2023 8.323023 7.479266 9.166779 7.032609 9.613436
## 2024 8.496505 7.598893 9.394117 7.123725 9.869284
## 2025 8.669987 7.721572 9.618402 7.219512 10.120462
The Ljung-Box test for both models fails to reject the null hypothesis that the residuals are uncorrelated, so the residuals are consistent with white noise, and they look roughly normally distributed.
Thus, we can conclude that both models should forecast reliably.
The forecast plot in b. shows a flat trend over the next 10 periods, unlike a., where the trend was increasing.
model2 <- Arima(austa, order = c(0,1,1), include.drift = FALSE)
checkresiduals(model2)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,1,1)
## Q* = 3.8403, df = 6, p-value = 0.6983
##
## Model df: 1. Total lags used: 7
pred2 <- forecast(model2, h = 10)
autoplot(pred2)
summary(pred2)
##
## Forecast method: ARIMA(0,1,1)
##
## Model Information:
## Series: austa
## ARIMA(0,1,1)
##
## Coefficients:
## ma1
## 0.4936
## s.e. 0.1265
##
## sigma^2 estimated as 0.04833: log likelihood=3.73
## AIC=-3.45 AICc=-3.08 BIC=-0.34
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.1150344 0.2136411 0.1714887 3.817629 5.649995 0.8416532
## ACF1
## Training set -0.1892256
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2016 7.015837 6.734107 7.297567 6.584969 7.446705
## 2017 7.015837 6.509440 7.522234 6.241369 7.790305
## 2018 7.015837 6.357426 7.674248 6.008885 8.022789
## 2019 7.015837 6.234446 7.797228 5.820803 8.210871
## 2020 7.015837 6.128347 7.903327 5.658539 8.373135
## 2021 7.015837 6.033643 7.998031 5.513701 8.517973
## 2022 7.015837 5.947300 8.084374 5.381651 8.650023
## 2023 7.015837 5.867430 8.164244 5.259501 8.772173
## 2024 7.015837 5.792765 8.238909 5.145310 8.886364
## 2025 7.015837 5.722403 8.309271 5.037701 8.993973
model3 <- Arima(austa, order = c(0,1,0), include.drift = FALSE)
checkresiduals(model3)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,1,0)
## Q* = 5.1592, df = 7, p-value = 0.6405
##
## Model df: 0. Total lags used: 7
pred3 <- forecast(model3, h = 10)
autoplot(pred3)
summary(pred3)
##
## Forecast method: ARIMA(0,1,0)
##
## Model Information:
## Series: austa
## ARIMA(0,1,0)
##
## sigma^2 estimated as 0.06423: log likelihood=-1.62
## AIC=5.24 AICc=5.36 BIC=6.8
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.1674969 0.2498996 0.1981155 5.471861 6.479997 0.9723354
## ACF1
## Training set 0.2583422
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2016 6.858953 6.534151 7.183755 6.362211 7.355695
## 2017 6.858953 6.399614 7.318292 6.156454 7.561452
## 2018 6.858953 6.296379 7.421527 5.998571 7.719335
## 2019 6.858953 6.209349 7.508557 5.865469 7.852437
## 2020 6.858953 6.132673 7.585233 5.748204 7.969702
## 2021 6.858953 6.063354 7.654552 5.642189 8.075717
## 2022 6.858953 5.999607 7.718299 5.544697 8.173209
## 2023 6.858953 5.940274 7.777632 5.453955 8.263951
## 2024 6.858953 5.884547 7.833359 5.368727 8.349179
## 2025 6.858953 5.831839 7.886067 5.288117 8.429789
Given that d > 0, the constant is omitted by default anyway, but I set include.constant = FALSE explicitly. The estimation method also needs to be changed to maximum likelihood ('ML') or conditional sum of squares ('CSS'); otherwise Arima() returns an error for this model.
model4 <- Arima(austa, order = c(2,1,3), include.constant = FALSE, method = 'CSS')
checkresiduals(model4)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(2,1,3)
## Q* = 3.634, df = 3, p-value = 0.3038
##
## Model df: 5. Total lags used: 8
pred4 <- forecast(model4, h = 10)
## Warning in predict.Arima(object, n.ahead = h): MA part of model is not
## invertible
autoplot(pred4)
summary(pred4)
##
## Forecast method: ARIMA(2,1,3)
##
## Model Information:
## Series: austa
## ARIMA(2,1,3)
##
## Coefficients:
## ar1 ar2 ma1 ma2 ma3
## 0.5138 0.7818 -0.1307 -0.9960 -0.4001
## s.e. 0.3718 0.5367 0.4625 0.5151 0.3345
##
## sigma^2 estimated as 0.03799: part log likelihood=9.24
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.04981184 0.1779344 0.1390249 2.155024 4.803313 0.6823234
## ACF1
## Training set -0.1202229
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2016 7.416168 7.111898 7.720437 6.950828 7.881508
## 2017 8.036829 7.415544 8.658113 7.086655 8.987002
## 2018 8.778174 7.819677 9.736671 7.312279 10.244069
## 2019 9.644357 8.308738 10.979975 7.601705 11.687009
## 2020 10.669041 8.886710 12.451371 7.943201 13.394880
## 2021 11.872771 9.565477 14.180064 8.344070 15.401471
## 2022 13.292422 10.361632 16.223213 8.810165 17.774679
## 2023 14.963006 11.294913 18.631099 9.353142 20.572870
## 2024 16.931343 12.389743 21.472942 9.985566 23.877119
## 2025 19.248862 13.674560 24.823164 10.723703 27.774021
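The non-invertibility warning from the CSS fit can be checked directly: invertibility requires all roots of the MA polynomial 1 + theta_1 z + theta_2 z^2 + theta_3 z^3 to lie outside the unit circle. A sketch using the fitted coefficients:
# any root with modulus <= 1 means the MA part is not invertible
abs(polyroot(c(1, model4$coef[c("ma1", "ma2", "ma3")])))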
model4 <- Arima(austa, order = c(2,1,3), include.constant = FALSE, method = 'ML')
checkresiduals(model4)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(2,1,3)
## Q* = 3.6694, df = 3, p-value = 0.2994
##
## Model df: 5. Total lags used: 8
pred4 <- forecast(model4, h = 10)
autoplot(pred4)
summary(pred4)
##
## Forecast method: ARIMA(2,1,3)
##
## Model Information:
## Series: austa
## ARIMA(2,1,3)
##
## Coefficients:
## ar1 ar2 ma1 ma2 ma3
## 0.0004 0.9996 0.4633 -0.9893 -0.4625
## s.e. 0.0031 0.0031 0.2283 0.0329 0.2261
##
## sigma^2 estimated as 0.03568: log likelihood=9.24
## AIC=-6.48 AICc=-3.48 BIC=2.85
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.03972287 0.1724412 0.1402048 1.685829 4.592996 0.6881143
## ACF1
## Training set -0.07513543
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2016 7.178948 6.932158 7.425739 6.801515 7.556382
## 2017 7.316096 6.877422 7.754770 6.645201 7.986991
## 2018 7.510224 6.933117 8.087331 6.627615 8.392833
## 2019 7.647387 6.956180 8.338593 6.590278 8.704496
## 2020 7.841488 7.046627 8.636348 6.625853 9.057122
## 2021 7.978666 7.090007 8.867325 6.619579 9.337753
## 2022 8.172740 7.194381 9.151099 6.676469 9.669011
## 2023 8.309934 7.247714 9.372154 6.685409 9.934459
## 2024 8.503981 7.359766 9.648197 6.754055 10.253908
## 2025 8.641190 7.419069 9.863312 6.772117 10.510264
Note that neither of these models has residuals resembling white noise: the Ljung-Box p-values are essentially zero, so the residuals are strongly autocorrelated, and they do not look normally distributed either. Since these assumptions are not met, the models are not reliable for prediction. Note that for (0,0,0) the point forecast is flat from the start, whereas for (0,0,1) it becomes flat for h > 1.
model5 <- Arima(austa, order = c(0,0,1), include.constant = TRUE)
checkresiduals(model5)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,0,1) with non-zero mean
## Q* = 116.97, df = 5, p-value < 2.2e-16
##
## Model df: 2. Total lags used: 7
pred5 <- forecast(model5, h = 10)
autoplot(pred5)
summary(pred5)
##
## Forecast method: ARIMA(0,0,1) with non-zero mean
##
## Model Information:
## Series: austa
## ARIMA(0,0,1) with non-zero mean
##
## Coefficients:
## ma1 mean
## 1.0000 3.5515
## s.e. 0.0907 0.3090
##
## sigma^2 estimated as 0.9347: log likelihood=-50.64
## AIC=107.28 AICc=108.03 BIC=112.04
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.01997167 0.9395462 0.8063899 -26.87733 42.56419 3.957698
## ACF1
## Training set 0.8109528
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2016 4.997220 3.741604 6.252836 3.076921 6.917519
## 2017 3.551491 1.799302 5.303680 0.871749 6.231232
## 2018 3.551491 1.799302 5.303680 0.871749 6.231232
## 2019 3.551491 1.799302 5.303680 0.871749 6.231232
## 2020 3.551491 1.799302 5.303680 0.871749 6.231232
## 2021 3.551491 1.799302 5.303680 0.871749 6.231232
## 2022 3.551491 1.799302 5.303680 0.871749 6.231232
## 2023 3.551491 1.799302 5.303680 0.871749 6.231232
## 2024 3.551491 1.799302 5.303680 0.871749 6.231232
## 2025 3.551491 1.799302 5.303680 0.871749 6.231232
model6 <- Arima(austa, order = c(0,0,0), include.constant = TRUE)
checkresiduals(model6)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,0,0) with non-zero mean
## Q* = 137.9, df = 6, p-value < 2.2e-16
##
## Model df: 1. Total lags used: 7
pred6 <- forecast(model6, h = 10)
autoplot(pred6)
summary(pred6)
##
## Forecast method: ARIMA(0,0,0) with non-zero mean
##
## Model Information:
## Series: austa
## ARIMA(0,0,0) with non-zero mean
##
## Coefficients:
## mean
## 3.5414
## s.e. 0.2972
##
## sigma^2 estimated as 3.27: log likelihood=-71.9
## AIC=147.8 AICc=148.17 BIC=150.97
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 5.00254e-13 1.783072 1.580774 -52.92811 82.83469 7.758317
## ACF1
## Training set 0.9099807
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2016 3.541428 1.223915 5.858941 -0.002902594 7.085758
## 2017 3.541428 1.223915 5.858941 -0.002902594 7.085758
## 2018 3.541428 1.223915 5.858941 -0.002902594 7.085758
## 2019 3.541428 1.223915 5.858941 -0.002902594 7.085758
## 2020 3.541428 1.223915 5.858941 -0.002902594 7.085758
## 2021 3.541428 1.223915 5.858941 -0.002902594 7.085758
## 2022 3.541428 1.223915 5.858941 -0.002902594 7.085758
## 2023 3.541428 1.223915 5.858941 -0.002902594 7.085758
## 2024 3.541428 1.223915 5.858941 -0.002902594 7.085758
## 2025 3.541428 1.223915 5.858941 -0.002902594 7.085758
The Ljung-Box test fails to reject white noise for the residuals, so the model should be a reliable predictor. Note that this model's forecast has an increasing trend.
model7 <- Arima(austa, order = c(0,2,1), include.constant = FALSE)
checkresiduals(model7)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,2,1)
## Q* = 3.8054, df = 6, p-value = 0.703
##
## Model df: 1. Total lags used: 7
pred7 <- forecast(model7, h = 10)
autoplot(pred7)
summary(pred7)
##
## Forecast method: ARIMA(0,2,1)
##
## Model Information:
## Series: austa
## ARIMA(0,2,1)
##
## Coefficients:
## ma1
## -0.7263
## s.e. 0.2277
##
## sigma^2 estimated as 0.03969: log likelihood=6.74
## AIC=-9.48 AICc=-9.09 BIC=-6.43
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.02907781 0.1907537 0.1567877 1.216128 4.984081 0.7695018
## ACF1
## Training set 0.1284994
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2016 7.179291 6.923960 7.434622 6.788796 7.569786
## 2017 7.499628 7.086148 7.913108 6.867265 8.131992
## 2018 7.819966 7.248051 8.391881 6.945297 8.694634
## 2019 8.140304 7.403196 8.877411 7.012995 9.267612
## 2020 8.460641 7.549896 9.371387 7.067776 9.853506
## 2021 8.780979 7.687705 9.874253 7.108960 10.452997
## 2022 9.101316 7.816610 10.386023 7.136527 11.066106
## 2023 9.421654 7.936765 10.906543 7.150712 11.692596
## 2024 9.741992 8.048388 11.435595 7.151848 12.332135
## 2025 10.062329 8.151718 11.972940 7.140302 12.984357
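To compare the candidate models on a single scale, here is a sketch collecting AICc from the fitted objects above (model4 must be the ML refit, since the CSS fit reports no AICc; note also that AICc is only comparable between models with the same order of differencing):
sapply(list(auto = model, m011 = model2, m010 = model3, m213 = model4,
            m001 = model5, m000 = model6, m021 = model7),
       function(m) m$aicc)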