Chapter 7, Exercise 1.
library(fpp2)
## Loading required package: ggplot2
## Loading required package: forecast
## Registered S3 method overwritten by 'quantmod':
## method from
## as.zoo.data.frame zoo
## Loading required package: fma
## Loading required package: expsmooth
summary(ses(pigs,h=4))
##
## Forecast method: Simple exponential smoothing
##
## Model Information:
## Simple exponential smoothing
##
## Call:
## ses(y = pigs, h = 4)
##
## Smoothing parameters:
## alpha = 0.2971
##
## Initial states:
## l = 77260.0561
##
## sigma: 10308.58
##
## AIC AICc BIC
## 4462.955 4463.086 4472.665
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 385.8721 10253.6 7961.383 -0.922652 9.274016 0.7966249 0.01282239
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Sep 1995 98816.41 85605.43 112027.4 78611.97 119020.8
## Oct 1995 98816.41 85034.52 112598.3 77738.83 119894.0
## Nov 1995 98816.41 84486.34 113146.5 76900.46 120732.4
## Dec 1995 98816.41 83958.37 113674.4 76092.99 121539.8
stan<-sd((ses(pigs, h=4))$residuals)
ses(pigs,h=4)$mean[1]-1.96*stan
## [1] 78679.97
ses(pigs,h=4)$mean[1]+1.96*stan
## [1] 118952.8
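For comparison, the 95% bounds that ses() itself produces can be read straight off the forecast object (a quick sketch; the lower and upper components hold the 80% and 95% limits):
fc <- ses(pigs, h = 4)
c(fc$lower[1, "95%"], fc$upper[1, "95%"]) # 78611.97 and 119020.8 per the summary above, slightly wider than the manual interval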
fts <- function(y, h) {forecast(ets(y), h = h)}
summary(fts(qcement,h=4))
##
## Forecast method: ETS(M,A,M)
##
## Model Information:
## ETS(M,A,M)
##
## Call:
## ets(y = y)
##
## Smoothing parameters:
## alpha = 0.7505
## beta = 0.003
## gamma = 1e-04
##
## Initial states:
## l = 0.5043
## b = 0.0081
## s = 1.0297 1.0488 1.0164 0.9052
##
## sigma: 0.0473
##
## AIC AICc BIC
## 6.783846 7.591021 37.843192
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 0.0006237224 0.07829922 0.05603263 -0.08782872 3.606505 0.5485176
## ACF1
## Training set -0.03632985
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2014 Q2 2.521670 2.368825 2.674515 2.287913 2.755427
## 2014 Q3 2.610968 2.412978 2.808958 2.308169 2.913767
## 2014 Q4 2.571906 2.344302 2.799510 2.223816 2.919996
## 2015 Q1 2.268522 2.042601 2.494443 1.923006 2.614039
summary(snaive(qcement, h=4))
##
## Forecast method: Seasonal naive method
##
## Model Information:
## Call: snaive(y = qcement, h = 4)
##
## Residual sd: 0.1297
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 0.03419651 0.1338564 0.1021528 2.338509 6.768085 1 0.613902
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2014 Q2 2.528 2.356456 2.699544 2.265646 2.790354
## 2014 Q3 2.637 2.465456 2.808544 2.374646 2.899354
## 2014 Q4 2.565 2.393456 2.736544 2.302646 2.827354
## 2015 Q1 2.229 2.057456 2.400544 1.966646 2.491354
The ETS and seasonal naive forecasts are not identical, but they are very close.
A manual implementation of the ses function:
ses_manual <- function(y, alpha, l0){
  # simple exponential smoothing: update the level through each observation,
  # then use the final level as the point forecast
  y_hat <- l0
  for(index in 1:length(y)){
    y_hat <- alpha*y[index] + (1 - alpha)*y_hat
  }
  cat("Forecast result by ses_manual function: ",
      as.character(y_hat),
      sep = "\n")
}
The code for ses wouldn't knit. I did not get the same values, although I am not sure whether I implemented it correctly; I'll come back to this.
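One way to check the manual implementation is to feed it the optimized parameters from the fitted model (a sketch; I'm assuming the alpha and l entries of the ets object's par component):
fit <- ses(pigs, h = 4)
ses_manual(pigs, alpha = fit$model$par["alpha"], l0 = fit$model$par["l"])
# should land close to fit$mean[1], i.e. 98816.41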
str(books)
## Time-Series [1:30, 1:2] from 1 to 30: 199 172 111 209 161 119 195 195 131 183 ...
## - attr(*, "dimnames")=List of 2
## ..$ : NULL
## ..$ : chr [1:2] "Paperback" "Hardcover"
summary(books)
## Paperback Hardcover
## Min. :111.0 Min. :128.0
## 1st Qu.:167.2 1st Qu.:170.5
## Median :189.0 Median :200.5
## Mean :186.4 Mean :198.8
## 3rd Qu.:207.2 3rd Qu.:222.0
## Max. :247.0 Max. :283.0
head(books)
## Time Series:
## Start = 1
## End = 6
## Frequency = 1
## Paperback Hardcover
## 1 199 139
## 2 172 128
## 3 111 172
## 4 209 139
## 5 161 191
## 6 119 168
autoplot(books) + xlab("day") + ylab("count")
Based on a first look at the plot, there appears to be an upward trend in both series over the month. As it's only one month of data, it's hard to see any seasonality.
books_weeklyts <- ts(books, frequency=7)
print("Decomposition of Paperback data")
## [1] "Decomposition of Paperback data"
plot(decompose(books_weeklyts[,1]))
print("Decomposition of Hardcover data")
## [1] "Decomposition of Hardcover data"
plot(decompose(books_weeklyts[,2]))
There is an upward trend in both series, and there also looks to be weekly seasonality in each.
autoplot(ses(books[,1],h =4 ))
autoplot(ses(books[,2],h=4))
library(caret)
## Loading required package: lattice
library(Metrics)
##
## Attaching package: 'Metrics'
## The following objects are masked from 'package:caret':
##
## precision, recall
## The following object is masked from 'package:forecast':
##
## accuracy
rmse(books[,1],ses(books[,1])$fitted)
## [1] 33.63769
rmse(books[,2],ses(books[,2])$fitted)
## [1] 31.93101
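The same training RMSEs can also be pulled from forecast::accuracy without loading Metrics (a sketch):
accuracy(ses(books[, 1]))["Training set", "RMSE"]
accuracy(ses(books[, 2]))["Training set", "RMSE"]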
holt.pback <- holt(books[, 1], h = 4)
holt.hcover <- holt(books[, 2], h = 4)
summary(holt.pback)
##
## Forecast method: Holt's method
##
## Model Information:
## Holt's method
##
## Call:
## holt(y = books[, 1], h = 4)
##
## Smoothing parameters:
## alpha = 1e-04
## beta = 1e-04
##
## Initial states:
## l = 170.699
## b = 1.2621
##
## sigma: 33.4464
##
## AIC AICc BIC
## 318.3396 320.8396 325.3456
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -3.717178 31.13692 26.18083 -5.508526 15.58354 0.6602122
## ACF1
## Training set -0.1750792
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 209.4668 166.6035 252.3301 143.9130 275.0205
## 32 210.7177 167.8544 253.5811 145.1640 276.2715
## 33 211.9687 169.1054 254.8320 146.4149 277.5225
## 34 213.2197 170.3564 256.0830 147.6659 278.7735
summary(holt.hcover)
##
## Forecast method: Holt's method
##
## Model Information:
## Holt's method
##
## Call:
## holt(y = books[, 2], h = 4)
##
## Smoothing parameters:
## alpha = 1e-04
## beta = 1e-04
##
## Initial states:
## l = 147.7935
## b = 3.303
##
## sigma: 29.2106
##
## AIC AICc BIC
## 310.2148 312.7148 317.2208
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.1357882 27.19358 23.15557 -2.114792 12.1626 0.6908555
## ACF1
## Training set -0.03245186
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 31 250.1739 212.7390 287.6087 192.9222 307.4256
## 32 253.4765 216.0416 290.9113 196.2248 310.7282
## 33 256.7791 219.3442 294.2140 199.5274 314.0308
## 34 260.0817 222.6468 297.5166 202.8300 317.3334
The RMSE for Holt's method is 31.13692 for Paperback and 27.19358 for Hardcover, a fair bit smaller than the SES values of 33.63769 and 31.93101 respectively. Holt's method therefore seems the better fit.
pback <- ses(books[, 1])
hcover <- ses(books[, 2])
autoplot(pback) +
autolayer(fitted(pback), series = "SES Paperback") +
xlab("Days") + ylab("Sales") + ggtitle("SES Paperback")
autoplot(holt.pback) +
autolayer(fitted(holt.pback), series = "Holt Paperback") +
xlab("Days") + ylab("Sales") + ggtitle("Holt's Method Paperback")
autoplot(hcover) +
autolayer(fitted(hcover), series = "SES Hardcover") +
xlab("Days") + ylab("Sales") + ggtitle("Hardcover Prediction")
autoplot(holt.hcover) +
autolayer(fitted(holt.hcover), series = "Holt Hardcover") +
xlab("Days") + ylab("Sales") + ggtitle("Holt's Method Hardcover")
I believe Holt's method does a much better job of capturing the upward trend of the two time series.
hpci.low <- holt.pback$mean[1] - 1.96 * 31.13692
hpci.high <- holt.pback$mean[1] + 1.96 * 31.13692
hhci.low <- holt.hcover$mean[1] - 1.96 * 27.19358
hhci.high <- holt.hcover$mean[1] + 1.96 * 27.19358
spci.low <- pback$mean[1] - 1.96 * 33.63769
spci.high <- pback$mean[1] + 1.96 * 33.63769
shci.low <- hcover$mean[1] - 1.96 * 31.93101 # use the hardcover SES forecast here
shci.high <- hcover$mean[1] + 1.96 * 31.93101
hpci <- c(hpci.low, hpci.high)
hhci <- c(hhci.low, hhci.high)
spci <- c(spci.low, spci.high)
shci <- c(shci.low, shci.high)
con.int <- data.frame(hpci, spci, hhci, shci)
con.int
## hpci spci hhci shci
## 1 148.4384 141.1798 196.8745 144.5249
## 2 270.4951 273.0395 303.4733 269.6944
The Holt CIs are narrower than the SES CIs, though not by as much as I would have expected from the earlier plots. I suspect the Holt intervals can be narrower because the trend (the slant) is modelled explicitly, whereas SES has to widen its intervals to cover the upward trend.
def <- holt(eggs, h = 100)
autoplot(eggs) +
autolayer(def, series = "Default Holt") +
ggtitle("Default Holt") + xlab("Years") + ylab("Prices")
def.damp <- holt(eggs, h = 100, damped = TRUE)
autoplot(eggs) +
autolayer(def.damp, series = "Damped Holt") +
ggtitle("Default Damped Holt") + xlab("Years") + ylab("Prices")
rmse(eggs,def.damp$fitted)
## [1] 26.54019
rmse(eggs,def$fitted)
## [1] 26.58219
The RMSE for the damped Holt only slightly outperforms the default Holt. However, the damped method does stop the predictions from going negative, so I trust that forecast far more.
def.lam <- holt(eggs, lambda = "auto", h = 100)
autoplot(eggs) +
autolayer(def.lam, series = "BoxCox Holt") +
ggtitle("Box Cox Transform Holt") + xlab("Years") + ylab(" Prices")
damp.lam <- holt(eggs, lambda = "auto", damped = TRUE, h = 100)
autoplot(eggs) +
autolayer(damp.lam, series = "BoxCox Holt Damped") +
ggtitle("Box Cox Transform Holt Damped") + xlab("Years") + ylab("Prices")
rmse(eggs,damp.lam$fitted)
## [1] 26.53321
rmse(eggs,def.lam$fitted)
## [1] 26.39376
The Box-Cox transformed default Holt model slightly outperforms the transformed damped model, and it has the best RMSE of the four. There isn't a radical improvement, but there is an improvement.
library(readxl)
retaildata <- read_excel("Downloads/retail.xlsx",skip = 1)
myts <- ts(retaildata[, "A3349413L"], frequency = 12, start = c(1982, 4))
Multiplicative decomposition is common with economic time series, and since this is retail (dollar) data the seasonal swings grow with the level of the series, so proportional (percentage) changes describe it better for forecasting.
fitholt <- hw(myts, seasonal = "multiplicative")
fitdamped <- hw(myts, damped = TRUE, seasonal = "multiplicative")
autoplot(myts) +
autolayer(fitholt, series = "HW Multiplicative", PI = FALSE) +
autolayer(fitdamped, series = "HW Multiplicative Damped", PI = FALSE) +
xlab("Year") + ylab("Money") + guides(colour=guide_legend(title="Forecast"))
Forecasts are incredibly similar.
rmse(myts, fitholt$fitted) # fitted values live in $fitted on hw() forecast objects
rmse(myts, fitdamped$fitted)
The RMSEs are very close; the difference is negligible.
checkresiduals(fitholt)
##
## Ljung-Box test
##
## data: Residuals from Holt-Winters' multiplicative method
## Q* = 38.47, df = 8, p-value = 6.162e-06
##
## Model df: 16. Total lags used: 24
checkresiduals(fitdamped)
##
## Ljung-Box test
##
## data: Residuals from Damped Holt-Winters' multiplicative method
## Q* = 39.296, df = 7, p-value = 1.716e-06
##
## Model df: 17. Total lags used: 24
Definitely not white noise: a large number of spikes fall outside the bounds, and there is a clear pattern in both ACF plots.
autoplot(ukcars) + xlab("Quarter") + ylab("Passenger Car Production (thousands)") + ggtitle("Quarterly UK Passenger Car Production (1st QT 1977 - 1st QT 2005)")
No strong overall trend, though potentially a slight upward one. There seem to be cyclic periods of production, and there could be seasonal effects as well.
p1 <- ggsubseriesplot(ukcars) + xlab("Quarter") + ylab("Passenger Car Production (thousands)")
p2 <- ggAcf(ukcars)
plot(p1)
plot(p2)
Strong seasonality of lower production in the third quarter.
fit_stl <- stl(ukcars, s.window = 13, robust = TRUE)
autoplot(fit_stl) + xlab("Quarter") + ylab("Passenger Car Production (thousands)") + ggtitle("STL Decomposition of UK Passenger Car Production")
ssadj <- seasadj(fit_stl)
fit.stlf <- stlf(ukcars, s.window = 13, robust = TRUE, etsmodel="AAN", damped=TRUE)
fcast.fit.stlf <- forecast(fit.stlf, h = 8)
autoplot(fcast.fit.stlf) + xlab("Quarter") + ylab("Passenger Car Production (thousands)") + ggtitle("AAN Forecast for UK Passenger Car Production")
fit.hlm <- stlf(ukcars, s.window = 13, robust = TRUE, etsmodel="AAN", damped=FALSE)
fcast.hlm <- forecast(fit.hlm, h = 8)
autoplot(fcast.hlm) + xlab("Quarter") + ylab("Passenger Car Production (thousands)") + ggtitle("Holt's Linear Method for UK Passenger Car Production")
decomp.fit <- stl(ukcars, s.window = "periodic", robust = TRUE)
ukcars.sadjst <- seasadj(decomp.fit)
ukcars.stl.fit <- stlf(ukcars.sadjst, etsmodel="AAN", damped=TRUE)
ukcars.stl.fcst <- forecast(ukcars.stl.fit, h=2); ukcars.stl.fcst
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2005 Q2 393.1714 362.6002 423.7426 346.4167 439.9261
## 2005 Q3 409.4102 372.0259 446.7945 352.2358 466.5846
summary(ukcars.stl.fcst)
##
## Forecast method: STL + ETS(A,Ad,N)
##
## Model Information:
## ETS(A,Ad,N)
##
## Call:
## ets(y = na.interp(x), model = etsmodel, damped = TRUE, allow.multiplicative.trend = allow.multiplicative.trend)
##
## Smoothing parameters:
## alpha = 0.7037
## beta = 1e-04
## phi = 0.9243
##
## Initial states:
## l = 341.6239
## b = -5.0799
##
## sigma: 23.8549
##
## AIC AICc BIC
## 1257.950 1258.743 1274.314
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 1.551267 23.32113 18.48987 0.07585736 5.955086 0.602576 0.02262668
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2005 Q2 393.1714 362.6002 423.7426 346.4167 439.9261
## 2005 Q3 409.4102 372.0259 446.7945 352.2358 466.5846
ukcars.hw.fit <- hw(ukcars.sadjst, damped=FALSE)
ukcars.hw.fcst <- forecast(ukcars.hw.fit, h=2); ukcars.hw.fcst
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2005 Q2 406.2469 372.2952 440.1986 354.3222 458.1715
## 2005 Q3 406.4128 366.5619 446.2637 345.4662 467.3595
summary(ukcars.hw.fcst)
##
## Forecast method: Holt-Winters' additive method
##
## Model Information:
## Holt-Winters' additive method
##
## Call:
## hw(y = ukcars.sadjst, damped = FALSE)
##
## Smoothing parameters:
## alpha = 0.6145
## beta = 1e-04
## gamma = 3e-04
##
## Initial states:
## l = 341.3029
## b = -0.478
## s = -4.9584 1.2504 0.636 3.072
##
## sigma: 26.4927
##
## AIC AICc BIC
## 1284.470 1286.217 1309.016
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 1.706018 25.53765 20.42416 0.007349269 6.556794 0.6656134
## ACF1
## Training set 0.01923023
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2005 Q2 406.2469 372.2952 440.1986 354.3222 458.1715
## 2005 Q3 406.4128 366.5619 446.2637 345.4662 467.3595
ukcars.ets.fit <- ets(ukcars)
summary(ukcars.ets.fit)
## ETS(A,N,A)
##
## Call:
## ets(y = ukcars)
##
## Smoothing parameters:
## alpha = 0.6199
## gamma = 1e-04
##
## Initial states:
## l = 314.2568
## s = -1.7579 -44.9601 21.1956 25.5223
##
## sigma: 25.9302
##
## AIC AICc BIC
## 1277.752 1278.819 1296.844
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set 1.313884 25.23244 20.17907 -0.1570979 6.629003 0.6576259
## ACF1
## Training set 0.02573334
plot(ukcars.stl.fcst)
plot(ukcars.hw.fcst)
plot(forecast(ukcars.ets.fit, h = 2))
All three look reasonable. The ETS(A,N,A) model slightly beats Holt-Winters on training RMSE, but the STL + ETS(A,Ad,N) forecast has the lowest RMSE overall, as the sketch below pulls together.
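A compact comparison of the training RMSEs (a sketch; accuracy works on forecast objects and fitted models alike):
c(STL_ETS = accuracy(ukcars.stl.fcst)["Training set", "RMSE"],
  HW = accuracy(ukcars.hw.fcst)["Training set", "RMSE"],
  ETS = accuracy(ukcars.ets.fit)["Training set", "RMSE"])
# roughly 23.3, 25.5, and 25.2 per the summaries above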
checkresiduals(ukcars.stl.fcst)
## Warning in checkresiduals(ukcars.stl.fcst): The fitted degrees of freedom is
## based on the model used for the seasonally adjusted data.
##
## Ljung-Box test
##
## data: Residuals from STL + ETS(A,Ad,N)
## Q* = 24.138, df = 3, p-value = 2.337e-05
##
## Model df: 5. Total lags used: 8
Residuals look OK apart from some autocorrelation issues, but there is no clear pattern in the residuals themselves. The histogram suggests the residuals might be slightly left-skewed.
str(visitors)
## Time-Series [1:240] from 1985 to 2005: 75.7 75.4 83.1 82.9 77.3 ...
autoplot(visitors)+ylab("Short-term Oversea Visitors")+xlab("Year")
There is a positive trend with a strong seasonal pattern, and the seasonal spikes recur even on the downslopes of the trend.
visit.test<- window(visitors, start = 2004)
visit.fcast <- hw(visit.test,seasonal = "multiplicative", damped = FALSE, h=12)
## Warning in ets(x, "MAM", alpha = alpha, beta = beta, gamma = gamma, phi = phi, :
## Seasonal component could not be estimated
visit.fcast1 <- hw(visitors, seasonal = "multiplicative", damped = FALSE, h=12)
plot(visit.fcast1)
Multiplicative seasonality is necessary because the size of the seasonal swings (the distance between the high and low points) grows with the level of the series.
v.ets <- ets(visit.test)
v.ets_fcast <-forecast(v.ets,h=24)
plot(v.ets_fcast)
checkresiduals(v.ets)
##
## Ljung-Box test
##
## data: Residuals from ETS(A,N,N)
## Q* = 1.3521, df = 3, p-value = 0.7168
##
## Model df: 2. Total lags used: 5
v.box<-ets(visit.test, model = "ZZZ", lambda = "auto")
v.box_fcast <- forecast(v.box)
plot(v.box_fcast)
checkresiduals(v.box)
##
## Ljung-Box test
##
## data: Residuals from ETS(A,N,N)
## Q* = 1.3525, df = 3, p-value = 0.7167
##
## Model df: 2. Total lags used: 5
v.snaive<-snaive(visit.test, h=24)
v.snaive_fcast<- forecast(v.snaive, h=24)
plot(v.snaive_fcast)
checkresiduals(v.snaive)
##
## Ljung-Box test
##
## data: Residuals from Seasonal naive method
## Q* = 1.4962, df = 3, p-value = 0.6832
##
## Model df: 0. Total lags used: 3
v.stl <- stl(visitors, s.window = "periodic")
v.stl_fcast <- forecast(v.stl,h=24)
plot(v.stl_fcast)
summary(v.stl_fcast)
##
## Forecast method: STL + ETS(A,A,N)
##
## Model Information:
## ETS(A,A,N)
##
## Call:
## ets(y = na.interp(x), model = etsmodel, allow.multiplicative.trend = allow.multiplicative.trend)
##
## Smoothing parameters:
## alpha = 0.4891
## beta = 1e-04
##
## Initial states:
## l = 116.9363
## b = 1.4201
##
## sigma: 20.3762
##
## AIC AICc BIC
## 2768.215 2768.472 2785.619
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.00602107 20.20565 15.48178 -0.0439509 6.231597 0.5717274
## ACF1
## Training set 0.05185533
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## May 2005 401.2570 375.1438 427.3701 361.3204 441.1935
## Jun 2005 416.4592 387.3886 445.5298 371.9996 460.9189
## Jul 2005 471.8865 440.1316 503.6413 423.3216 520.4514
## Aug 2005 443.2371 409.0069 477.4674 390.8865 495.5878
## Sep 2005 435.2928 398.7535 471.8320 379.4108 491.1747
## Oct 2005 472.7590 434.0474 511.4706 413.5547 531.9633
## Nov 2005 490.5152 449.7460 531.2844 428.1641 552.8663
## Dec 2005 560.3010 517.5724 603.0296 494.9533 625.6488
## Jan 2006 464.0268 419.4240 508.6296 395.8128 532.2408
## Feb 2006 496.6757 450.2737 543.0777 425.7099 567.6414
## Mar 2006 488.9345 440.7998 537.0692 415.3188 562.5502
## Apr 2006 454.2768 404.4689 504.0847 378.1022 530.4514
## May 2006 418.2968 366.8695 469.7241 339.6456 496.9481
## Jun 2006 433.4991 380.5012 486.4969 352.4459 514.5523
## Jul 2006 488.9263 434.4025 543.4502 405.5394 572.3133
## Aug 2006 460.2770 404.2682 516.2858 374.6189 545.9351
## Sep 2006 452.3326 394.8766 509.7887 364.4612 540.2040
## Oct 2006 489.7989 430.9306 548.6671 399.7677 579.8301
## Nov 2006 507.5551 447.3072 567.8030 415.4139 599.6963
## Dec 2006 577.3409 515.7436 638.9381 483.1360 671.5457
## Jan 2007 481.0667 418.1485 543.9848 384.8416 577.2917
## Feb 2007 513.7155 449.5031 577.9280 415.5111 611.9200
## Mar 2007 505.9744 440.4927 571.4560 405.8288 606.1199
## Apr 2007 471.3167 404.5894 538.0439 369.2662 573.3672
summary(v.snaive_fcast)
##
## Forecast method: Seasonal naive method
##
## Model Information:
## Call: snaive(y = visit.test, h = 24)
##
## Residual sd: 27.039
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 42.025 48.10855 42.025 8.609435 8.609435 1 -0.4227005
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## May 2005 341.3 279.6464 402.9536 247.0090 435.5910
## Jun 2005 367.3 305.6464 428.9536 273.0090 461.5910
## Jul 2005 472.0 410.3464 533.6536 377.7090 566.2910
## Aug 2005 405.8 344.1464 467.4536 311.5090 500.0910
## Sep 2005 395.6 333.9464 457.2536 301.3090 489.8910
## Oct 2005 449.9 388.2464 511.5536 355.6090 544.1910
## Nov 2005 479.9 418.2464 541.5536 385.6090 574.1910
## Dec 2005 593.1 531.4464 654.7536 498.8090 687.3910
## Jan 2006 462.4 400.7464 524.0536 368.1090 556.6910
## Feb 2006 501.6 439.9464 563.2536 407.3090 595.8910
## Mar 2006 504.7 443.0464 566.3536 410.4090 598.9910
## Apr 2006 409.5 347.8464 471.1536 315.2090 503.7910
## May 2006 341.3 254.1087 428.4913 207.9524 474.6476
## Jun 2006 367.3 280.1087 454.4913 233.9524 500.6476
## Jul 2006 472.0 384.8087 559.1913 338.6524 605.3476
## Aug 2006 405.8 318.6087 492.9913 272.4524 539.1476
## Sep 2006 395.6 308.4087 482.7913 262.2524 528.9476
## Oct 2006 449.9 362.7087 537.0913 316.5524 583.2476
## Nov 2006 479.9 392.7087 567.0913 346.5524 613.2476
## Dec 2006 593.1 505.9087 680.2913 459.7524 726.4476
## Jan 2007 462.4 375.2087 549.5913 329.0524 595.7476
## Feb 2007 501.6 414.4087 588.7913 368.2524 634.9476
## Mar 2007 504.7 417.5087 591.8913 371.3524 638.0476
## Apr 2007 409.5 322.3087 496.6913 276.1524 542.8476
summary(v.box_fcast)
##
## Forecast method: ETS(A,N,N)
##
## Model Information:
## ETS(A,N,N)
##
## Call:
## ets(y = visit.test, model = "ZZZ", lambda = "auto")
##
## Box-Cox transformation: lambda= 1
##
## Smoothing parameters:
## alpha = 0.3487
##
## Initial states:
## l = 422.2688
##
## sigma: 62.3167
##
## AIC AICc BIC
## 180.4562 182.4562 182.7740
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 7.50738 58.29191 43.28511 0.3536385 9.599327 1.029985 0.07718639
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## May 2005 465.1511 385.2891 545.0131 343.0127 587.2895
## Jun 2005 465.1511 380.5737 549.7286 335.8011 594.5011
## Jul 2005 465.1511 376.1076 554.1946 328.9708 601.3314
## Aug 2005 465.1511 371.8551 558.4471 322.4672 607.8350
## Sep 2005 465.1511 367.7882 562.5141 316.2473 614.0549
## Oct 2005 465.1511 363.8844 566.4178 310.2771 620.0252
## Nov 2005 465.1511 360.1257 570.1765 304.5286 625.7737
## Dec 2005 465.1511 356.4969 573.8053 298.9788 631.3234
## Jan 2006 465.1511 352.9854 577.3168 293.6085 636.6937
## Feb 2006 465.1511 349.5806 580.7216 288.4013 641.9009
## Mar 2006 465.1511 346.2733 584.0290 283.3431 646.9591
## Apr 2006 465.1511 343.0555 587.2467 278.4220 651.8803
## May 2006 465.1511 339.9204 590.3819 273.6272 656.6750
## Jun 2006 465.1511 336.8618 593.4404 268.9496 661.3526
## Jul 2006 465.1511 333.8745 596.4277 264.3809 665.9213
## Aug 2006 465.1511 330.9537 599.3485 259.9139 670.3883
## Sep 2006 465.1511 328.0951 602.2071 255.5421 674.7601
## Oct 2006 465.1511 325.2950 605.0072 251.2596 679.0426
## Nov 2006 465.1511 322.5498 607.7524 247.0612 683.2410
## Dec 2006 465.1511 319.8564 610.4458 242.9421 687.3601
## Jan 2007 465.1511 317.2121 613.0901 238.8980 691.4042
## Feb 2007 465.1511 314.6143 615.6879 234.9249 695.3773
## Mar 2007 465.1511 312.0605 618.2417 231.0192 699.2830
## Apr 2007 465.1511 309.5486 620.7536 227.1776 703.1246
summary(v.ets_fcast)
##
## Forecast method: ETS(A,N,N)
##
## Model Information:
## ETS(A,N,N)
##
## Call:
## ets(y = visit.test)
##
## Smoothing parameters:
## alpha = 0.3488
##
## Initial states:
## l = 423.299
##
## sigma: 62.3167
##
## AIC AICc BIC
## 180.4562 182.4562 182.7740
##
## Error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 7.497218 58.29191 43.28807 0.3514231 9.600129 1.030055 0.07710875
##
## Forecasts:
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## May 2005 465.1445 385.2825 545.0065 343.0061 587.2829
## Jun 2005 465.1445 380.5627 549.7263 335.7878 594.5012
## Jul 2005 465.1445 376.0927 554.1963 328.9515 601.3375
## Aug 2005 465.1445 371.8366 558.4524 322.4424 607.8466
## Sep 2005 465.1445 367.7664 562.5226 316.2175 614.0715
## Oct 2005 465.1445 363.8596 566.4294 310.2426 620.0464
## Nov 2005 465.1445 360.0980 570.1910 304.4897 625.7993
## Dec 2005 465.1445 356.4665 573.8225 298.9358 631.3532
## Jan 2006 465.1445 352.9525 577.3365 293.5616 636.7274
## Feb 2006 465.1445 349.5452 580.7438 288.3507 641.9383
## Mar 2006 465.1445 346.2356 584.0534 283.2890 647.0000
## Apr 2006 465.1445 343.0156 587.2734 278.3645 651.9245
## May 2006 465.1445 339.8784 590.4106 273.5665 656.7225
## Jun 2006 465.1445 336.8178 593.4712 268.8857 661.4033
## Jul 2006 465.1445 333.8285 596.4605 264.3141 665.9749
## Aug 2006 465.1445 330.9058 599.3832 259.8442 670.4448
## Sep 2006 465.1445 328.0454 602.2436 255.4696 674.8195
## Oct 2006 465.1445 325.2435 605.0455 251.1844 679.1047
## Nov 2006 465.1445 322.4966 607.7924 246.9833 683.3057
## Dec 2006 465.1445 319.8016 610.4875 242.8616 687.4274
## Jan 2007 465.1445 317.1556 613.1334 238.8150 691.4740
## Feb 2007 465.1445 314.5562 615.7328 234.8395 695.4495
## Mar 2007 465.1445 312.0008 618.2882 230.9315 699.3576
## Apr 2007 465.1445 309.4874 620.8016 227.0875 703.2015
The STL forecast gave a considerably better RMSE, around 20, where the seasonal naive was about 48 and the other two were about 58.
When using cross-validation instead of a single train/test split, the seasonal naive method fit better than before, but STL still outperformed the other two.
aus.test <- window(ausbeer,start = 2008)
aus.ets <-ets(aus.test)
aus.snaive <- snaive(aus.test)
aus.stl <- stlf(aus.test)
The accuracy function wouldn't knit.
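A sketch of the comparison that chunk was meant to run, using tsCV on the full series instead of a single split and reusing the fts helper defined back in the qcement exercise (h = 4 is my assumption for quarterly data):
e_ets <- tsCV(ausbeer, fts, h = 4)
e_snaive <- tsCV(ausbeer, snaive, h = 4)
e_stlf <- tsCV(ausbeer, stlf, h = 4)
sqrt(mean(e_ets^2, na.rm = TRUE)) # CV RMSE for ETS
sqrt(mean(e_snaive^2, na.rm = TRUE)) # CV RMSE for seasonal naive
sqrt(mean(e_stlf^2, na.rm = TRUE)) # CV RMSE for STL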
STL gives the best forecast results for the data set.
ets(bicoal)
## ETS(M,N,N)
##
## Call:
## ets(y = bicoal)
##
## Smoothing parameters:
## alpha = 0.8205
##
## Initial states:
## l = 542.665
##
## sigma: 0.1262
##
## AIC AICc BIC
## 595.2499 595.7832 600.9253
ets(chicken)
## ETS(M,N,N)
##
## Call:
## ets(y = chicken)
##
## Smoothing parameters:
## alpha = 0.98
##
## Initial states:
## l = 159.8322
##
## sigma: 0.1691
##
## AIC AICc BIC
## 635.2382 635.6018 641.9836
ets(eggs)
## ETS(M,N,N)
##
## Call:
## ets(y = eggs)
##
## Smoothing parameters:
## alpha = 0.8198
##
## Initial states:
## l = 278.8889
##
## sigma: 0.1355
##
## AIC AICc BIC
## 1043.286 1043.553 1050.916
ets(dole)
## ETS(M,Ad,M)
##
## Call:
## ets(y = dole)
##
## Smoothing parameters:
## alpha = 0.697
## beta = 0.124
## gamma = 0.303
## phi = 0.902
##
## Initial states:
## l = 2708.6621
## b = 836.017
## s = 1.0404 0.8893 0.9103 1.0301 1.0576 1.0584
## 0.9801 0.9632 1.021 0.9838 1.0145 1.0514
##
## sigma: 0.0935
##
## AIC AICc BIC
## 10602.67 10604.30 10676.19
ets(usdeaths)
## ETS(A,N,A)
##
## Call:
## ets(y = usdeaths)
##
## Smoothing parameters:
## alpha = 0.5972
## gamma = 0.0019
##
## Initial states:
## l = 9195.6403
## s = -62.6129 -270.0351 263.3823 -89.4907 1005.529 1662.647
## 795.2585 333.326 -551.161 -737.5102 -1552.872 -796.4611
##
## sigma: 294.4663
##
## AIC AICc BIC
## 1141.016 1149.587 1175.166
ets(lynx)
## ETS(M,N,N)
##
## Call:
## ets(y = lynx)
##
## Smoothing parameters:
## alpha = 0.9999
##
## Initial states:
## l = 2372.8047
##
## sigma: 0.9594
##
## AIC AICc BIC
## 2058.138 2058.356 2066.346
summary(ets(bicoal))
## ETS(M,N,N)
##
## Call:
## ets(y = bicoal)
##
## Smoothing parameters:
## alpha = 0.8205
##
## Initial states:
## l = 542.665
##
## sigma: 0.1262
##
## AIC AICc BIC
## 595.2499 595.7832 600.9253
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 0.07050762 59.53 46.46042 -0.9663802 10.1218 0.9982543 0.03586774
summary(ets(chicken))
## ETS(M,N,N)
##
## Call:
## ets(y = chicken)
##
## Smoothing parameters:
## alpha = 0.98
##
## Initial states:
## l = 159.8322
##
## sigma: 0.1691
##
## AIC AICc BIC
## 635.2382 635.6018 641.9836
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set -2.116415 13.38863 9.53233 -4.930494 13.11671 0.9922021 0.02763414
summary(ets(eggs))
## ETS(M,N,N)
##
## Call:
## ets(y = eggs)
##
## Smoothing parameters:
## alpha = 0.8198
##
## Initial states:
## l = 278.8889
##
## sigma: 0.1355
##
## AIC AICc BIC
## 1043.286 1043.553 1050.916
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -2.800547 26.58125 19.24163 -2.956833 10.02174 0.9491609
## ACF1
## Training set 0.02759551
summary(ets(dole))
## ETS(M,Ad,M)
##
## Call:
## ets(y = dole)
##
## Smoothing parameters:
## alpha = 0.697
## beta = 0.124
## gamma = 0.303
## phi = 0.902
##
## Initial states:
## l = 2708.6621
## b = 836.017
## s = 1.0404 0.8893 0.9103 1.0301 1.0576 1.0584
## 0.9801 0.9632 1.021 0.9838 1.0145 1.0514
##
## sigma: 0.0935
##
## AIC AICc BIC
## 10602.67 10604.30 10676.19
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 569.2395 16559.97 10000.7 0.5960691 5.910033 0.2424801 0.5556989
summary(ets(usdeaths))
## ETS(A,N,A)
##
## Call:
## ets(y = usdeaths)
##
## Smoothing parameters:
## alpha = 0.5972
## gamma = 0.0019
##
## Initial states:
## l = 9195.6403
## s = -62.6129 -270.0351 263.3823 -89.4907 1005.529 1662.647
## 795.2585 333.326 -551.161 -737.5102 -1552.872 -796.4611
##
## sigma: 294.4663
##
## AIC AICc BIC
## 1141.016 1149.587 1175.166
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -1.262752 264.2916 205.3357 -0.06186187 2.360075 0.4699475
## ACF1
## Training set -0.008409405
summary(ets(lynx))
## ETS(M,N,N)
##
## Call:
## ets(y = lynx)
##
## Smoothing parameters:
## alpha = 0.9999
##
## Initial states:
## l = 2372.8047
##
## sigma: 0.9594
##
## AIC AICc BIC
## 2058.138 2058.356 2066.346
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE ACF1
## Training set 8.975647 1198.452 842.0649 -52.12968 101.3686 1.013488 0.3677583
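To compare the six models at a glance (a sketch; my understanding is that accuracy applied to a fitted ets model returns its training set measures):
series <- list(bicoal = bicoal, chicken = chicken, eggs = eggs,
               dole = dole, usdeaths = usdeaths, lynx = lynx)
sapply(series, function(x) accuracy(ets(x))["Training set", c("RMSE", "MAPE", "MASE")])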
Relative to its scale, usdeaths fits particularly well (MAPE ≈ 2.4%, MASE ≈ 0.47), while lynx fits poorly (MAPE above 100%, MASE above 1).
plot(forecast(ets(lynx)))
plot(forecast(ets(dole)))
The dole series doesn't show obvious seasonality; on average the trend goes up, but there are drops in the data without any clear pattern, almost like a random walk.
checkresiduals(ets(dole))
##
## Ljung-Box test
##
## data: Residuals from ETS(M,Ad,M)
## Q* = 277.99, df = 7, p-value < 2.2e-16
##
## Model df: 17. Total lags used: 24
The Q* value is high (p-value < 2.2e-16), so the residuals are not white noise either.
ets(visitors,model ="MAM")
## ETS(M,A,M)
##
## Call:
## ets(y = visitors, model = "MAM")
##
## Smoothing parameters:
## alpha = 0.6146
## beta = 2e-04
## gamma = 0.192
##
## Initial states:
## l = 92.9631
## b = 2.2221
## s = 0.9378 1.0666 1.0669 0.9625 1.3768 1.113
## 1.0012 0.8219 0.9317 1.0046 0.8755 0.8413
##
## sigma: 0.0536
##
## AIC AICc BIC
## 2603.654 2606.411 2662.825
HoltWinters(visitors,seasonal ="multiplicative")
## Holt-Winters exponential smoothing with trend and multiplicative seasonal component.
##
## Call:
## HoltWinters(x = visitors, seasonal = "multiplicative")
##
## Smoothing parameters:
## alpha: 0.5390725
## beta : 0
## gamma: 0.6288391
##
## Coefficients:
## [,1]
## a 481.6302879
## b 2.3322698
## s1 0.7541358
## s2 0.8326265
## s3 1.0546914
## s4 0.9066325
## s5 0.8977256
## s6 0.9993470
## s7 1.0364483
## s8 1.2568825
## s9 0.9408039
## s10 1.0217838
## s11 0.9992030
## s12 0.8635915
I did not get the same model from HoltWinters() as from ets() on the visitors data.
Chapter 8, Exercise 1. The figures show the autocorrelation of each series at each lag. Since all of the spikes lie within the critical bounds, the series show no significant autocorrelation and are consistent with white noise.
The critical bounds depend on sample size: they are ±1.96/√T, so as the sample size increases the bounds shrink toward zero.
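A quick check of the bounds, assuming the three series have 36, 360, and 1,000 points (the lengths used in the exercise, if I recall correctly):
1.96 / sqrt(c(36, 360, 1000))
# about 0.327, 0.103, and 0.062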
autoplot(ibmclose) + ylab("Close Price")+xlab("Day")
acf(ibmclose)
pacf(ibmclose)
There is a trend element in the series: the ACF decays very slowly and stays far outside the bounds, indicating strong autocorrelation.
The PACF shows one dominant spike at lag 1, with only small positive and negative values afterwards. Together these indicate a non-stationary series that should be differenced.
plot(BoxCox(usnetelec,lambda = 0), main = "US Net Electricity Generation")
plot(BoxCox(usgdp,lambda = 1/2),main = "US GDP" )
plot(BoxCox(mcopper,lambda = 1/3), main = "Monthly Copper Prices")
plot(BoxCox(enplanements,lambda = 0), main = "Monthly US Domestic Enplanements")
plot(BoxCox(visitors,lambda = 1/3),main = "Monthly Australian Overseas Visitors")
acf(enplanements)
pacf(enplanements)
acf(diff(enplanements))
pacf(diff(enplanements))
There is a clear sign of seasonality in the ACF plot. After differencing by one period (a backshift of lag 1), most of the autocorrelation has been removed, so differencing of order 1 covers most of it.
The retail ts chunk wouldn't knit.
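A sketch of what it was meant to do, reusing the myts series defined earlier (lambda = "auto" is my assumption):
retail_bc <- BoxCox(myts, lambda = "auto")
ggtsdisplay(diff(diff(retail_bc, lag = 12))) # seasonal difference, then a first difference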
y <- ts(numeric(100))
e <- rnorm(100)
for(i in 2:100)
y[i] <- 0.6*y[i-1] + e[i]
plot(y)
As ϕ1 decreases, the autocorrelation weakens and the data condenses toward noise-like fluctuation around zero; as ϕ1 approaches 1, the series wanders more like a random walk.
y2 <- ts(numeric(100))
e <- rnorm(100)
for(i in 2:100)
y2[i] <- 0.1*y2[i-1] + e[i]
plot(y2)
input1 <- 0.6
yO <- ts(numeric(100))
e <- rnorm(100)
for(i in 2:100)
yO[i] <- input1*yO[i-1] + e[i]
plot(yO)
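To see the effect of ϕ1 side by side, the same recursion can be wrapped in a helper (a sketch):
sim_ar1 <- function(phi, n = 100) {
  y <- ts(numeric(n))
  e <- rnorm(n)
  for (i in 2:n) y[i] <- phi * y[i - 1] + e[i]
  y
}
autoplot(cbind(phi_0.1 = sim_ar1(0.1), phi_0.6 = sim_ar1(0.6), phi_0.9 = sim_ar1(0.9)),
         facets = TRUE)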
input1 <- 0.6
input2 <- 0.1
input3 <- 1.5
## Test 1
y1 <- ts(numeric(100))
e <- rnorm(100)
for(i in 2:100)
  y1[i] <- input1*y1[i-1] + e[i]
plot(y1)
y2 <- ts(numeric(100))
e <- rnorm(100)
for(i in 2:100)
  y2[i] <- input2*y2[i-1] + e[i]
plot(y2)
y3 <- ts(numeric(100))
e <- rnorm(100)
for(i in 2:100)
  y3[i] <- input3*y3[i-1] + e[i]
plot(y3)
The y3 plot becomes exponential: with ϕ1 = 1.5 the process is non-stationary, so the explosion is expected rather than a mistake.
y <- ts(numeric(100))
e <- rnorm(100)
for(i in 2:100)
y[i] <- 0.6*y[i-1] + e[i]
arima(y)
##
## Call:
## arima(x = y)
##
## Coefficients:
## intercept
## -0.1883
## s.e. 0.1258
##
## sigma^2 estimated as 1.582: log likelihood = -164.83, aic = 333.67
yAr <- ts(numeric(100))
e <- rnorm(100)
for(i in 2:100)
yAr[i] <- 0.6*yAr[i-1] + 0.6*e[i-1] + e[i]
plot(yAr)
yAr2 <- ts(numeric(100))
e <- rnorm(100)
for(i in 2:100)
yAr2[i] <- -0.8*yAr2[i-1] + 0.3*e[i-1] + e[i]
plot(yAr2)
plot(yAr)
The second simulated series is much more volatile, oscillating in sign because of the negative coefficient.
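As I read the exercise, the second model should be an AR(2) with ϕ1 = -0.8 and ϕ2 = 0.3, which violates the stationarity conditions; a minimal sketch:
yAR2 <- ts(numeric(100))
e <- rnorm(100)
for(i in 3:100)
  yAR2[i] <- -0.8*yAR2[i-1] + 0.3*yAR2[i-2] + e[i]
plot(yAR2) # oscillates with exponentially growing amplitude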
autoplot(wmurders) + ylab("Women Murder Rate")+ xlab("Years")
wmurders1 <- ets(wmurders, model="ZZZ", damped=NULL, alpha=NULL, beta=NULL,
gamma=NULL, phi=NULL, lambda=NULL, biasadj=FALSE,
additive.only=FALSE, restrict=TRUE,
allow.multiplicative.trend=FALSE)
wmurders1
## ETS(M,N,N)
##
## Call:
## ets(y = wmurders, model = "ZZZ", damped = NULL, alpha = NULL,
## beta = NULL, gamma = NULL, phi = NULL, additive.only = FALSE,
## lambda = NULL, biasadj = FALSE, restrict = TRUE, allow.multiplicative.trend = FALSE)
##
## Smoothing parameters:
## alpha = 0.9599
##
## Initial states:
## l = 2.4177
##
## sigma: 0.0617
##
## AIC AICc BIC
## 49.36113 49.83172 55.38313
wmurders2 <- wmurders %>%
Arima(order=c(0,1,1), seasonal=c(0,1,1)) %>%
residuals() %>% ggtsdisplay()
wmurders3 <- arima(wmurders, order=c(3,1,1))
wmurders3
##
## Call:
## arima(x = wmurders, order = c(3, 1, 1))
##
## Coefficients:
## ar1 ar2 ar3 ma1
## 0.0683 0.3011 -0.0608 -0.1179
## s.e. 1.7595 0.1635 0.5328 1.7569
##
## sigma^2 estimated as 0.04104: log likelihood = 9.5, aic = -8.99
auto.arima(wmurders)
## Series: wmurders
## ARIMA(1,2,1)
##
## Coefficients:
## ar1 ma1
## -0.2434 -0.8261
## s.e. 0.1553 0.1143
##
## sigma^2 estimated as 0.04632: log likelihood=6.44
## AIC=-6.88 AICc=-6.39 BIC=-0.97
No, a constant should not be included: with two orders of differencing (d = 2), a constant would induce a quadratic trend in the long-term forecasts.
wmAuto <- auto.arima(wmurders)
wmAuto
## Series: wmurders
## ARIMA(1,2,1)
##
## Coefficients:
## ar1 ma1
## -0.2434 -0.8261
## s.e. 0.1553 0.1143
##
## sigma^2 estimated as 0.04632: log likelihood=6.44
## AIC=-6.88 AICc=-6.39 BIC=-0.97
acf(wmurders)
pacf(wmurders)
The Pacf plot wouldn't knit.
wmurders.ts <- ts(wmurders)
forecast(wmurders.ts)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 56 2.592549 2.387657 2.797442 2.279194 2.905905
## 57 2.592549 2.308273 2.876826 2.157786 3.027313
## 58 2.592549 2.246455 2.938644 2.063243 3.121855
## 59 2.592549 2.193944 2.991155 1.982935 3.202164
## 60 2.592549 2.147433 3.037665 1.911803 3.273296
## 61 2.592549 2.105202 3.079897 1.847215 3.337883
## 62 2.592549 2.066218 3.118881 1.787594 3.397505
## 63 2.592549 2.029805 3.155294 1.731906 3.453193
## 64 2.592549 1.995493 3.189606 1.679431 3.505668
## 65 2.592549 1.962938 3.222161 1.629642 3.555457
checkresiduals(wmurders.ts)
## Warning in modeldf.default(object): Could not find appropriate degrees of
## freedom for this model.
forecast(wmurders,h=3)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2005 2.592549 2.387657 2.797442 2.279194 2.905905
## 2006 2.592549 2.308273 2.876826 2.157786 3.027313
## 2007 2.592549 2.246455 2.938644 2.063243 3.121855
plot(forecast(wmurders, h=3))
auto.arima(wmurders)
## Series: wmurders
## ARIMA(1,2,1)
##
## Coefficients:
## ar1 ma1
## -0.2434 -0.8261
## s.e. 0.1553 0.1143
##
## sigma^2 estimated as 0.04632: log likelihood=6.44
## AIC=-6.88 AICc=-6.39 BIC=-0.97
autoplot(forecast(auto.arima(wmurders),h=3))
The ARIMA model's forecast shows a downward trend, while the first model's does not; the prediction intervals also differ slightly.
aus <- auto.arima(austa)
aus
## Series: austa
## ARIMA(0,1,1) with drift
##
## Coefficients:
## ma1 drift
## 0.3006 0.1735
## s.e. 0.1647 0.0390
##
## sigma^2 estimated as 0.03376: log likelihood=10.62
## AIC=-15.24 AICc=-14.46 BIC=-10.57
checkresiduals(aus)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,1,1) with drift
## Q* = 2.297, df = 5, p-value = 0.8067
##
## Model df: 2. Total lags used: 7
plot(forecast(aus, h=10))
aus2 <- forecast(arima(austa, order = c(0,1,1)),h=10)
plot(aus2)
aus3 <- forecast(arima(austa, order = c(1,2,3)),h=10)
plot(aus3)
aus3 <- forecast(arima(austa))
plot(aus3)
The forecast looks very inaccurate when the constant is removed.
aus4 <- forecast(arima(austa, order=c(0,0,1)),h=10)
plot(aus4)
aus4 <- forecast(arima(austa))
plot(aus4)
The forecast looks unreliable when the constant is removed.
aus5 <- forecast(arima(austa, order=c(0,2,1)),h=10)
plot(aus5)
plot(BoxCox(usgdp,lambda = 0), main = "US GDP")
usgdp1 <- BoxCox(usgdp,lambda = 0)
auto.arima(usgdp1)
## Series: usgdp1
## ARIMA(2,1,2) with drift
##
## Coefficients:
## ar1 ar2 ma1 ma2 drift
## 1.3594 -0.7732 -1.1002 0.6102 0.0084
## s.e. 0.1489 0.1774 0.2197 0.2285 0.0007
##
## sigma^2 estimated as 8.503e-05: log likelihood=773.76
## AIC=-1535.52 AICc=-1535.15 BIC=-1514.73
arima(usgdp1,order = c(0,0,1))
##
## Call:
## arima(x = usgdp1, order = c(0, 0, 1))
##
## Coefficients:
## ma1 intercept
## 1.0000 8.3967
## s.e. 0.0186 0.0373
##
## sigma^2 estimated as 0.08289: log likelihood = -43.93, aic = 93.87
arima(usgdp1,order = c(1,1,1))
##
## Call:
## arima(x = usgdp1, order = c(1, 1, 1))
##
## Coefficients:
## ar1 ma1
## 0.8512 -0.4140
## s.e. 0.0579 0.1117
##
## sigma^2 estimated as 9.845e-05: log likelihood = 753.49, aic = -1500.97
usgdp.best <- auto.arima(usgdp1)
checkresiduals(usgdp.best)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(2,1,2) with drift
## Q* = 6.3724, df = 3, p-value = 0.09483
##
## Model df: 5. Total lags used: 8
forecast(usgdp.best,h=25)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2006 Q2 9.350972 9.339154 9.362789 9.332898 9.369045
## 2006 Q3 9.360099 9.341097 9.379101 9.331038 9.389161
## 2006 Q4 9.368788 9.343212 9.394364 9.329673 9.407904
## 2007 Q1 9.377005 9.345848 9.408162 9.329355 9.424655
## 2007 Q2 9.384918 9.349435 9.420401 9.330651 9.439185
## 2007 Q3 9.392785 9.354123 9.431446 9.333656 9.451913
## 2007 Q4 9.400821 9.359781 9.441862 9.338055 9.463587
## 2008 Q1 9.409127 9.366114 9.452139 9.343345 9.474909
## 2008 Q2 9.417665 9.372764 9.462565 9.348995 9.486334
## 2008 Q3 9.426311 9.379400 9.473223 9.354567 9.498056
## 2008 Q4 9.434927 9.385815 9.484039 9.359816 9.510037
## 2009 Q1 9.443414 9.391979 9.494849 9.364751 9.522077
## 2009 Q2 9.451753 9.398015 9.505491 9.369568 9.533938
## 2009 Q3 9.459988 9.404096 9.515881 9.374508 9.545469
## 2009 Q4 9.468198 9.410357 9.526039 9.379738 9.556658
## 2010 Q1 9.476452 9.416851 9.536054 9.385300 9.567605
## 2010 Q2 9.484788 9.423549 9.546027 9.391131 9.578445
## 2010 Q3 9.493199 9.430373 9.556025 9.397115 9.589283
## 2010 Q4 9.501650 9.437234 9.566066 9.403134 9.600166
## 2011 Q1 9.510097 9.444065 9.576129 9.409110 9.611084
## 2011 Q2 9.518507 9.450845 9.586169 9.415027 9.621987
## 2011 Q3 9.526871 9.457594 9.596148 9.420921 9.632821
## 2011 Q4 9.535200 9.464354 9.606047 9.426850 9.643551
## 2012 Q1 9.543518 9.471165 9.615871 9.432863 9.654173
## 2012 Q2 9.551847 9.478049 9.625644 9.438983 9.664710
autoplot(forecast(usgdp.best,h=25))
ets(usgdp1)
## ETS(A,A,N)
##
## Call:
## ets(y = usgdp1)
##
## Smoothing parameters:
## alpha = 0.9959
## beta = 1e-04
##
## Initial states:
## l = 7.3524
## b = 0.0084
##
## sigma: 0.0099
##
## AIC AICc BIC
## -883.5955 -883.3357 -866.2552
usgdp.ets <- forecast(ets(usgdp1),h=25)
autoplot(usgdp.ets)
The results are very similar between the two forecasts.
autoplot(austourists)
Sharp seasonal data with an upward trend
acf(austourists)
There is definitely autocorrelation; many values lie outside the critical bounds.
pacf(austourists)
Similar result to the ACF, so the series is not white noise.
plot(diff(austourists, lag=4, differences=1), xlab="Year", ylab="Total Night Visitors", main="International Tourists to Australia", col="blue")
After the seasonal difference there is perhaps a slight upward trend remaining, and the seasonality is now inconsistent.
auto.arima(austourists)
## Series: austourists
## ARIMA(1,0,0)(1,1,0)[4] with drift
##
## Coefficients:
## ar1 sar1 drift
## 0.4705 -0.5305 0.5489
## s.e. 0.1154 0.1122 0.0864
##
## sigma^2 estimated as 5.15: log likelihood=-142.48
## AIC=292.97 AICc=293.65 BIC=301.6
No, the models are not the same; the auto.arima model has the smaller AIC.
I can't get the backshift notation chunk to work.
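For reference, my reading of the backshift form for the fitted ARIMA(1,0,0)(1,1,0)[4] with drift:
$(1 - \phi_1 B)(1 - \Phi_1 B^4)(1 - B^4)\, y_t = c + \varepsilon_t$
with $\phi_1 = 0.4705$ and $\Phi_1 = -0.5305$ from the fit above.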
mel1 <- ma(usmelec, order=12)
plot(mel1, col="red", main="Electricity Monthly Total Net Generation", xlab="Year", ylab="Electricity Generation (billions kWh")
Data doesn’t seem like it needs a log transformation.
melLog <- BoxCox(usmelec, lambda=0)
plot(BoxCox(mel1, lambda=0), main="Electricity Monthly Total Net Generation", xlab="Year", ylab="Electricity Generation (billions kWh)")
There is still some seasonal pattern left in the series, so it is not stationary.
tsdisplay(diff(usmelec, 1))
tsdisplay(diff(melLog, 1))
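The amount of differencing needed can also be checked directly (a sketch using the forecast package's unit-root helpers):
nsdiffs(usmelec) # suggested number of seasonal differences
ndiffs(diff(usmelec, lag = 12)) # first differences needed after seasonal differencing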
auto.arima(usmelec)
## Series: usmelec
## ARIMA(1,0,2)(0,1,1)[12] with drift
##
## Coefficients:
## ar1 ma1 ma2 sma1 drift
## 0.9717 -0.4374 -0.2774 -0.7061 0.3834
## s.e. 0.0163 0.0483 0.0493 0.0310 0.0868
##
## sigma^2 estimated as 57.67: log likelihood=-1635.13
## AIC=3282.26 AICc=3282.44 BIC=3307.22
arima(usmelec, order=c(1,1,1))
##
## Call:
## arima(x = usmelec, order = c(1, 1, 1))
##
## Coefficients:
## ar1 ma1
## -0.1626 0.3242
## s.e. 0.2454 0.2344
##
## sigma^2 estimated as 555.7: log likelihood = -2220.85, aic = 4447.69
The auto-selected ARIMA performs far better in terms of AIC (3282 vs 4448), though AIC comparisons are only strictly valid between models with the same order of differencing.
autoMel <- auto.arima(usmelec)
checkresiduals(autoMel)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(1,0,2)(0,1,1)[12] with drift
## Q* = 42.725, df = 19, p-value = 0.001413
##
## Model df: 5. Total lags used: 24
tsdisplay(autoMel$residuals)
melARIMAauto <- arima(usmelec) # with no order given this fits ARIMA(0,0,0), a mean-only model, hence the flat forecasts below
autoMelF <- forecast(melARIMAauto)
autoMelF
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jul 2013 259.5576 171.3822 347.733 124.705 394.4102
## Aug 2013 259.5576 171.3822 347.733 124.705 394.4102
## Sep 2013 259.5576 171.3822 347.733 124.705 394.4102
## Oct 2013 259.5576 171.3822 347.733 124.705 394.4102
## Nov 2013 259.5576 171.3822 347.733 124.705 394.4102
## Dec 2013 259.5576 171.3822 347.733 124.705 394.4102
## Jan 2014 259.5576 171.3822 347.733 124.705 394.4102
## Feb 2014 259.5576 171.3822 347.733 124.705 394.4102
## Mar 2014 259.5576 171.3822 347.733 124.705 394.4102
## Apr 2014 259.5576 171.3822 347.733 124.705 394.4102
## May 2014 259.5576 171.3822 347.733 124.705 394.4102
## Jun 2014 259.5576 171.3822 347.733 124.705 394.4102
## Jul 2014 259.5576 171.3822 347.733 124.705 394.4102
## Aug 2014 259.5576 171.3822 347.733 124.705 394.4102
## Sep 2014 259.5576 171.3822 347.733 124.705 394.4102
## Oct 2014 259.5576 171.3822 347.733 124.705 394.4102
## Nov 2014 259.5576 171.3822 347.733 124.705 394.4102
## Dec 2014 259.5576 171.3822 347.733 124.705 394.4102
## Jan 2015 259.5576 171.3822 347.733 124.705 394.4102
## Feb 2015 259.5576 171.3822 347.733 124.705 394.4102
## Mar 2015 259.5576 171.3822 347.733 124.705 394.4102
## Apr 2015 259.5576 171.3822 347.733 124.705 394.4102
## May 2015 259.5576 171.3822 347.733 124.705 394.4102
## Jun 2015 259.5576 171.3822 347.733 124.705 394.4102
These flat forecasts come from the mean-only ARIMA(0,0,0) fit above, not from the auto-selected model; forecast(autoMel) would produce seasonal, non-constant forecasts.
melA <- arima(usmelec, order=c(1,1,1))
tsdisplay(melA$residuals)
autoplot(mcopper, col="dodgerblue") + ylab("Monthly Copper Prices") + xlab("Year")
mc <-BoxCox(mcopper, lambda=0)
autoplot(mc, col="dodgerblue") + ylab("Monthly Copper Prices") + xlab("Year")
auto.arima(mc)
## Series: mc
## ARIMA(0,1,1)
##
## Coefficients:
## ma1
## 0.3756
## s.e. 0.0385
##
## sigma^2 estimated as 0.003634: log likelihood=782.84
## AIC=-1561.69 AICc=-1561.66 BIC=-1553.02
arima(mc,order = c(1,1,1))
##
## Call:
## arima(x = mc, order = c(1, 1, 1))
##
## Coefficients:
## ar1 ma1
## -0.0083 0.3825
## s.e. 0.1007 0.0912
##
## sigma^2 estimated as 0.003628: log likelihood = 782.85, aic = -1559.69
arima(mc,order = c(0,0,1))
##
## Call:
## arima(x = mc, order = c(0, 0, 1))
##
## Coefficients:
## ma1 intercept
## 0.9451 6.7305
## s.e. 0.0091 0.0267
##
## sigma^2 estimated as 0.1062: log likelihood = -169.03, aic = 344.06
Based on the AIC, the auto-selected model was the best, though the ARIMA(1,1,1) was very close.
mc1 <- arima(mc) # with no order given this fits ARIMA(0,0,0), a mean-only model
mc.fcast <- forecast(mc) # forecasts the series directly, defaulting to an ETS model; mc1 is never used, so the two forecast tables below are identical
mc.fcast
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 2007 8.133809 8.052148 8.215470 8.008919 8.258699
## Feb 2007 8.126612 8.004955 8.248269 7.940553 8.312670
## Mar 2007 8.120296 7.964292 8.276299 7.881709 8.358883
## Apr 2007 8.114754 7.927141 8.302367 7.827825 8.401683
## May 2007 8.109891 7.892420 8.327362 7.777298 8.442485
## Jun 2007 8.105624 7.859604 8.351644 7.729369 8.481879
## Jul 2007 8.101880 7.828388 8.375371 7.683611 8.520149
## Aug 2007 8.098594 7.798571 8.398618 7.639748 8.557441
## Sep 2007 8.095711 7.770005 8.421418 7.597586 8.593837
## Oct 2007 8.093182 7.742577 8.443787 7.556978 8.629386
## Nov 2007 8.090962 7.716193 8.465731 7.517802 8.664122
## Dec 2007 8.089014 7.690772 8.487256 7.479956 8.698072
## Jan 2008 8.087305 7.666247 8.508363 7.443352 8.731258
## Feb 2008 8.085805 7.642553 8.529057 7.407909 8.763701
## Mar 2008 8.084489 7.619635 8.549343 7.373556 8.795422
## Apr 2008 8.083334 7.597443 8.569226 7.340227 8.826441
## May 2008 8.082321 7.575929 8.588713 7.307861 8.856781
## Jun 2008 8.081432 7.555051 8.607813 7.276402 8.886462
## Jul 2008 8.080652 7.534770 8.626534 7.245797 8.915506
## Aug 2008 8.079967 7.515049 8.644885 7.216000 8.943935
## Sep 2008 8.079366 7.495855 8.662877 7.186963 8.971770
## Oct 2008 8.078839 7.477158 8.680521 7.158647 8.999032
## Nov 2008 8.078377 7.458928 8.697826 7.131011 9.025743
## Dec 2008 8.077971 7.441138 8.714803 7.104020 9.051922
autoplot(mc.fcast)+ ylab("Monthly Copper Prices") + xlab("Year")
mc2<- ets(mc)
mc.fcast1<- forecast(mc2)
mc.fcast1
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## Jan 2007 8.133809 8.052148 8.215470 8.008919 8.258699
## Feb 2007 8.126612 8.004955 8.248269 7.940553 8.312670
## Mar 2007 8.120296 7.964292 8.276299 7.881709 8.358883
## Apr 2007 8.114754 7.927141 8.302367 7.827825 8.401683
## May 2007 8.109891 7.892420 8.327362 7.777298 8.442485
## Jun 2007 8.105624 7.859604 8.351644 7.729369 8.481879
## Jul 2007 8.101880 7.828388 8.375371 7.683611 8.520149
## Aug 2007 8.098594 7.798571 8.398618 7.639748 8.557441
## Sep 2007 8.095711 7.770005 8.421418 7.597586 8.593837
## Oct 2007 8.093182 7.742577 8.443787 7.556978 8.629386
## Nov 2007 8.090962 7.716193 8.465731 7.517802 8.664122
## Dec 2007 8.089014 7.690772 8.487256 7.479956 8.698072
## Jan 2008 8.087305 7.666247 8.508363 7.443352 8.731258
## Feb 2008 8.085805 7.642553 8.529057 7.407909 8.763701
## Mar 2008 8.084489 7.619635 8.549343 7.373556 8.795422
## Apr 2008 8.083334 7.597443 8.569226 7.340227 8.826441
## May 2008 8.082321 7.575929 8.588713 7.307861 8.856781
## Jun 2008 8.081432 7.555051 8.607813 7.276402 8.886462
## Jul 2008 8.080652 7.534770 8.626534 7.245797 8.915506
## Aug 2008 8.079967 7.515049 8.644885 7.216000 8.943935
## Sep 2008 8.079366 7.495855 8.662877 7.186963 8.971770
## Oct 2008 8.078839 7.477158 8.680521 7.158647 8.999032
## Nov 2008 8.078377 7.458928 8.697826 7.131011 9.025743
## Dec 2008 8.077971 7.441138 8.714803 7.104020 9.051922
autoplot(mc.fcast1)+ ylab("Monthly Copper Prices") + xlab("Year")
str(hsales)
## Time-Series [1:275] from 1973 to 1996: 55 60 68 63 65 61 54 52 46 42 ...
autoplot(hsales)+ ylab("Monthly One-Family Home Sales") + xlab("Year")
hsales.bc <- BoxCox(hsales, lambda=0) ## Log Transform
autoplot(hsales.bc) + ylab("Monthly One-Family Home Sales") + xlab("Year")
home <- BoxCox(hsales,lambda = 3)
autoplot(home) + ylab("Monthly One-Family Home Sales") + xlab("Year")
home2 <-BoxCox(hsales, lambda=.5)
autoplot(home2) + ylab("Monthly One-Family Home Sales") + xlab("Year")
home3 <- BoxCox(hsales, lambda=-2)
autoplot(home3) + ylab("Monthly One-Family Home Sales") + xlab("Year")
None of these transformations made the series stationary.
auto.arima(hsales)
## Series: hsales
## ARIMA(1,0,0)(1,1,0)[12] with drift
##
## Coefficients:
## ar1 sar1 drift
## 0.8867 -0.4320 -0.0228
## s.e. 0.0294 0.0569 0.1642
##
## sigma^2 estimated as 27.92: log likelihood=-811.38
## AIC=1630.76 AICc=1630.92 BIC=1645.05
arima(hsales, order=c(0,0,1))
##
## Call:
## arima(x = hsales, order = c(0, 0, 1))
##
## Coefficients:
## ma1 intercept
## 0.7465 52.2712
## s.e. 0.0309 0.8445
##
## sigma^2 estimated as 64.5: log likelihood = -963.54, aic = 1933.08
arima(hsales, order=c(1,1,1))
##
## Call:
## arima(x = hsales, order = c(1, 1, 1))
##
## Coefficients:
## ar1 ma1
## 0.1713 -0.0142
## s.e. 0.1945 0.1902
##
## sigma^2 estimated as 40.03: log likelihood = -894.29, aic = 1794.58
The auto-selected model is the best by AIC.
hsales.est <- auto.arima(hsales)
checkresiduals(hsales.est)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(1,0,0)(1,1,0)[12] with drift
## Q* = 39.66, df = 21, p-value = 0.008177
##
## Model df: 3. Total lags used: 24
acf(hsales)
pacf(hsales)
A lot of the values lie outside the critical region.
hsales.stlf <- stlf(hsales, method="arima")
hsales.stlf_fcast <- forecast(hsales.stlf,h=24)
plot(hsales.stlf_fcast)
The forecast now seems to be more focused but perhaps overfitted.
autoplot(sheep) + ylab("Sheep Population") + xlab("Year")
auto.arima(sheep)
## Series: sheep
## ARIMA(3,1,0)
##
## Coefficients:
## ar1 ar2 ar3
## 0.4210 -0.2018 -0.3044
## s.e. 0.1193 0.1363 0.1243
##
## sigma^2 estimated as 4991: log likelihood=-407.56
## AIC=823.12 AICc=823.71 BIC=832.22
ARIMA(3,1,0) was selected as the best model.
sheep1 <- auto.arima(sheep)
forecast(arima(sheep)) # no order given, so this is the mean-only ARIMA(0,0,0); forecast(sheep1) would use the ARIMA(3,1,0) fit
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 1940 1856.671 1573.106 2140.237 1422.995 2290.347
## 1941 1856.671 1573.106 2140.237 1422.995 2290.347
## 1942 1856.671 1573.106 2140.237 1422.995 2290.347
## 1943 1856.671 1573.106 2140.237 1422.995 2290.347
## 1944 1856.671 1573.106 2140.237 1422.995 2290.347
## 1945 1856.671 1573.106 2140.237 1422.995 2290.347
## 1946 1856.671 1573.106 2140.237 1422.995 2290.347
## 1947 1856.671 1573.106 2140.237 1422.995 2290.347
## 1948 1856.671 1573.106 2140.237 1422.995 2290.347
## 1949 1856.671 1573.106 2140.237 1422.995 2290.347
pacf(sheep)
The PACF shows significant spikes at the first few lags, consistent with the AR(3) structure, so the chosen model seems appropriate.