Loading Libraries

Do exercises 8.1, 8.2, 8.3, 8.5, 8.6, and 8.7 in Hyndman. Please submit both the RPubs link as well as your .Rmd file.

Additional Reference: https://www.linkedin.com/pulse/reading-acf-pacf-plots-missing-manual-cheatsheet-saqib-ali/

# Libraries
# install.packages("forecast")
library(forecast)
library(fpp2)
library(ggplot2)

8.1a

Figure 8.31 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers.

Explain the differences among these figures. Do they all indicate that the data are white noise?

Figure 8.31: Left: ACF for a white noise series of 36 numbers. Middle: ACF for a white noise series of 360 numbers. Right: ACF for a white noise series of 1,000 numbers.

Ans:

Each ACF plot shows the sample autocorrelations of a white noise series. With only 36 observations, sampling variability is large, so some spikes look sizeable and it is harder to judge by eye whether the errors are random. As the sample size grows, the sample autocorrelations become less sensitive to noise and settle closer to zero. All three figures are consistent with white noise, since the spikes stay within the critical bounds.


8.1b

Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?

Ans:

The critical bounds are ±1.96/√T, where T is the series length, so they shrink as the sample size grows: widest for 36 numbers and narrowest for 1,000. The autocorrelations differ because each is estimated from a finite random sample; with more observations the estimates concentrate closer to zero.
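The bounds can be computed directly; a minimal sketch of the ±1.96/√T formula:

# Critical bounds for the ACF of a white noise series of length T
n <- c(36, 360, 1000)
round(1.96 / sqrt(n), 3)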


8.2

A classic example of a non-stationary series is the daily closing IBM stock price series (data set ibmclose). Use R to plot the daily closing prices for IBM stock and the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.

Ans:

A non-stationary series shows trend (and possibly seasonality). The time plot of ibmclose trends clearly, and the ACF decays only gradually across many lags rather than dropping quickly inside the confidence bounds, as it would for a stationary series. The PACF has a single large spike near 1 at lag 1, typical of a random-walk-like series. All of this indicates the series should be differenced.

autoplot(ibmclose) + ggtitle('IBM Close Series')

ggAcf(ibmclose)

ggPacf(ibmclose)
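As a quick numeric check (a minimal sketch; ndiffs() estimates the number of differences needed via a unit-root test), the ACF of the differenced series should collapse toward zero:

ndiffs(ibmclose)
ggAcf(diff(ibmclose))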

8.3a

For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data.
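Since the same transform-and-difference pipeline repeats for each series below, an optional helper can bundle the steps (a sketch; stationarize_acf is a name introduced here, not from the text):

# Box-Cox transform, difference ndiffs() times, and plot the resulting ACF
stationarize_acf <- function(x, name = deparse(substitute(x))) {
  lambda <- BoxCox.lambda(x)
  d <- ndiffs(BoxCox(x, lambda))
  y <- diff(BoxCox(x, lambda), differences = d)
  ggAcf(y) + ggtitle(paste0(name, ': ACF after Box-Cox and ', d, ' difference(s)'))
}
# e.g. stationarize_acf(usnetelec)

The per-series blocks below spell the same steps out explicitly.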

usnetelec

autoplot(usnetelec) + ggtitle('USNETELEC Time Series')

lambda <- BoxCox.lambda(usnetelec)
cat('Lambda: ',lambda,'\n')
## Lambda:  0.5167714
autoplot(BoxCox(usnetelec,lambda)) + ggtitle('USNETELEC with Box-Cox Transformation Lambda =', lambda)

## ACF with Box-Cox Transformation
ggAcf((BoxCox(usnetelec,lambda))) + ggtitle('USNETELEC ACF with Box-Cox Transformation Lambda =', lambda)

## Differencing Order
cat('Differencing Order:',ndiffs(BoxCox(usnetelec, lambda)))
## Differencing Order: 2
## Time series with Box-Cox transformation + differencing (ndiffs returned 2, so difference twice).
autoplot(diff(BoxCox(usnetelec,lambda), differences=2)) + ggtitle('Stationary USNETELEC with Box-Cox Transformation + Second-Order Differencing')

## ACF with Box-Cox + Differencing
ggAcf(diff(BoxCox(usnetelec,lambda), differences=2)) + ggtitle('USNETELEC ACF with Box-Cox Transformation + Second-Order Differencing')

8.3b

usgdp

autoplot(usgdp) + ggtitle('usgdp Time Series')

lambda <- BoxCox.lambda(usgdp)
cat('Lambda: ',lambda,'\n')
## Lambda:  0.366352
autoplot(BoxCox(usgdp,lambda)) + ggtitle('usgdp with Box-Cox Transformation Lambda =', lambda)

## ACF with Box-Cox Transformation
ggAcf((BoxCox(usgdp,lambda))) + ggtitle('usgdp ACF with Box-Cox Transformation Lambda =', lambda)

## Differencing Order
cat('Differencing Order:',ndiffs(BoxCox(usgdp, lambda)))
## Differencing Order: 1
## Time series with Box-Cox transformation + differencing.
autoplot(diff(BoxCox(usgdp,lambda))) + ggtitle('Stationary usgdp  with Box-Cox Transformation + Differencing')

## ACF with Box-Cox + Differencing
ggAcf(diff(BoxCox(usgdp,lambda)))  + ggtitle('usgdp ACF with Box-Cox Transformation + Differencing')

8.3c

mcopper

autoplot(mcopper) + ggtitle('mcopper Time Series')

lambda <- BoxCox.lambda(mcopper)
cat('Lambda: ',lambda,'\n')
## Lambda:  0.1919047
autoplot(BoxCox(mcopper,lambda)) + ggtitle('mcopper with Box-Cox Transformation Lambda =', lambda)

## ACF with Box-Cox Transformation
ggAcf((BoxCox(mcopper,lambda))) + ggtitle('mcopper ACF with Box-Cox Transformation Lambda =', lambda)

## Differencing Order
cat('Differencing Order:',ndiffs(BoxCox(mcopper, lambda)))
## Differencing Order: 1
## Time series with Box-Cox transformation + differencing.
autoplot(diff(BoxCox(mcopper,lambda))) + ggtitle('Stationary mcopper  with Box-Cox Transformation + Differencing')

## ACF with Box-Cox + Differencing
ggAcf(diff(BoxCox(mcopper,lambda)))  + ggtitle('mcopper ACF with Box-Cox Transformation + Differencing')

8.3d

enplanements

autoplot(enplanements) + ggtitle('enplanements Time Series')

lambda <- BoxCox.lambda(enplanements)
cat('Lambda: ',lambda,'\n')
## Lambda:  -0.2269461
autoplot(BoxCox(enplanements,lambda)) + ggtitle('enplanements with Box-Cox Transformation Lambda =', lambda)

## ACF with Box-Cox Transformation
ggAcf((BoxCox(enplanements,lambda))) + ggtitle('enplanements ACF with Box-Cox Transformation Lambda =', lambda)

## Differencing Order
cat('Differencing Order:',ndiffs(BoxCox(enplanements, lambda)))
## Differencing Order: 1
## Time series with Box-Cox transformation + differencing.
autoplot(diff(BoxCox(enplanements,lambda))) + ggtitle('Stationary enplanements  with Box-Cox Transformation + Differencing')

## ACF with Box-Cox + Differencing
ggAcf(diff(BoxCox(enplanements,lambda)))  + ggtitle('enplanements ACF with Box-Cox Transformation + Differencing')

8.3e

visitors

autoplot(visitors) + ggtitle('visitors Time Series')

lambda <- BoxCox.lambda(visitors)
cat('Lambda: ',lambda,'\n')
## Lambda:  0.2775249
autoplot(BoxCox(visitors,lambda)) + ggtitle('visitors with Box-Cox Transformation Lambda =', lambda)

## ACF with Box-Cox Transformation
ggAcf((BoxCox(visitors,lambda))) + ggtitle('visitors ACF with Box-Cox Transformation Lambda =', lambda)

## Differencing Order
cat('Differencing Order:',ndiffs(BoxCox(visitors, lambda)))
## Differencing Order: 1
## Time series with Box-Cox transformation + differencing.
autoplot(diff(BoxCox(visitors,lambda))) + ggtitle('Stationary visitors  with Box-Cox Transformation + Differencing')

## ACF with Box-Cox + Differencing
ggAcf(diff(BoxCox(visitors,lambda)))  + ggtitle('visitors ACF with Box-Cox Transformation + Differencing')

8.5

For your retail data (from Exercise 3 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.

retaildata <- readxl::read_excel("retail.xlsx", skip=1)
myts_retail <- ts(retaildata[,"A3349873A"],frequency=12, start=c(1982,4))



autoplot(myts_retail) + ggtitle('myts_retail Time Series')

lambda <- BoxCox.lambda(myts_retail)
cat('Lambda: ',lambda,'\n')
## Lambda:  0.1276369
autoplot(BoxCox(myts_retail,lambda)) + ggtitle('myts_retail with Box-Cox Transformation Lambda =', lambda)

## ACF with Box-Cox Transformation
ggAcf((BoxCox(myts_retail,lambda))) + ggtitle('myts_retail ACF with Box-Cox Transformation Lambda =', lambda)

## Differencing Order
cat('Differencing Order:',ndiffs(BoxCox(myts_retail, lambda)))
## Differencing Order: 1
## Time series with Box-Cox transformation + differencing.
autoplot(diff(BoxCox(myts_retail,lambda))) + ggtitle('Stationary myts_retail  with Box-Cox Transformation + Differencing')

## ACF with Box-Cox + Differencing
ggAcf(diff(BoxCox(myts_retail,lambda)))  + ggtitle('myts_retail ACF with Box-Cox Transformation + Differencing')
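The retail series is monthly with strong seasonality, so a seasonal difference may be needed in addition to the first difference; nsdiffs() checks this (a minimal sketch):

## Seasonal differencing order, and ACF after a seasonal plus a first difference
nsdiffs(BoxCox(myts_retail, lambda))
ggAcf(diff(diff(BoxCox(myts_retail, lambda), lag=12)))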

8.6a

Use R to simulate and plot some data from simple ARIMA models.

Use the following R code to generate data from an AR(1) model with ϕ1 = 0.6 and σ2 = 1. The process starts with y1 = 0.


y <- ts(numeric(100))
e <- rnorm(100)
for(i in 2:100)
y[i] <- 0.6*y[i-1] + e[i]
head(y)
## Time Series:
## Start = 1 
## End = 6 
## Frequency = 1 
## [1]  0.0000000  0.3501332  0.4034377 -0.9546627 -1.2348876  1.2125804
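Base R's stats::arima.sim() provides a built-in cross-check for these simulations (a sketch; it draws its own innovations, so the values differ from the loop above):

set.seed(42)  # seed chosen arbitrarily for reproducibility
autoplot(arima.sim(model = list(ar = 0.6), n = 100)) + ggtitle('AR(1) via arima.sim, phi = 0.6')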

8.6b

Produce a time plot for the series. How does the plot change as you change ϕ1?


AR1 <- function(phi){
  y <- ts(numeric(100))
  e <- rnorm(100)
  for(i in 2:100)
    y[i] <- phi*y[i-1] + e[i]
  return(y)
}
autoplot(AR1(0.6), series='phi = 0.6') + autolayer(AR1(0.9), series='phi = 0.9') + autolayer(AR1(0.3), series='phi = 0.3')

As ϕ1 increases toward 1, the series becomes more persistent, with longer swings away from the mean; smaller values of ϕ1 look closer to white noise.
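Because each call to AR1() draws fresh random errors, differences between the curves mix the effect of ϕ1 with sampling noise. A fairer comparison reuses one set of innovations (a sketch; AR1_e is a helper name introduced here):

# AR(1) simulation with innovations supplied by the caller
AR1_e <- function(phi, e){
  y <- ts(numeric(length(e)))
  for(i in 2:length(e))
    y[i] <- phi*y[i-1] + e[i]
  y
}
set.seed(1)
e <- rnorm(100)
autoplot(AR1_e(0.3, e), series='phi = 0.3') + autolayer(AR1_e(0.9, e), series='phi = 0.9')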

8.6c

Write your own code to generate data from an MA(1) model with θ1 = 0.6 and σ2 = 1.

  y <- ts(numeric(100))
  e <- rnorm(100)
  for(i in 2:100)
    y[i] <- 0.6*e[i-1] + e[i]
head(y)
## Time Series:
## Start = 1 
## End = 6 
## Frequency = 1 
## [1]  0.00000000 -1.00435240 -0.05666559  2.38971581  2.16117888 -0.29459048

8.6d

Produce a time plot for the series. How does the plot change as you change θ1?

MA1 <- function(theta){
  y <- ts(numeric(100))
  e <- rnorm(100)
  for(i in 2:100)
    y[i] <- theta*e[i-1] + e[i]
  return(y)
}
autoplot(MA1(0.6), series='theta = 0.6') + autolayer(MA1(0.9), series='theta = 0.9') + autolayer(MA1(0.3), series='theta = 0.3')

Changing θ1 mainly changes the lag-1 correlation and the variance, which equals (1 + θ1^2)σ2; unlike the AR(1) case, the series stays short-memory for any θ1.

8.6e

Generate data from an ARMA(1,1) model with ϕ1=0.6, θ1=0.6 and σ2=1.

ARMA1_1 <- function(phi, theta){
  y <- ts(numeric(100))
  e <- rnorm(100)
  for(i in 2:100)
    y[i] <- phi*y[i-1] + theta*e[i-1] + e[i]
  return(y)
}
autoplot(ARMA1_1(0.6, 0.6))

8.6f

Generate data from an AR(2) model with ϕ1 = −0.8 , ϕ2= 0.3 and σ2 = 1. (Note that these parameters will give a non-stationary series.)

AR2 <- function(phi1, phi2){
  y <- ts(numeric(100))
  e <- rnorm(100)
  for(i in 3:100)
    y[i] <- phi1*y[i-1] + phi2*y[i-2] + e[i]
  return(y)
}
autoplot(AR2(-0.8,0.3))

8.6g

Graph the latter two series and compare them.

autoplot(ARMA1_1(0.6, 0.6), series='ARMA(1,1)') + autolayer(AR2(-0.8, 0.3), series='AR(2)')

The ARMA(1,1) series fluctuates around a stable mean, while the AR(2) series oscillates with exponentially growing amplitude, confirming that ϕ1 = −0.8, ϕ2 = 0.3 gives a non-stationary process.

8.7

Consider wmurders, the number of women murdered each year (per 100,000 standard population) in the United States.

ggtsdisplay(wmurders)

8.7a

By studying appropriate graphs of the series in R, find an appropriate ARIMA(p,d,q) model for these data.

Ans:

After one difference the series appears stationary with no drift (see the ACF below), so d should be 1. Both the ACF and the PACF show significant spikes out to lag 2, suggesting either an MA(2) or an AR(2) component. My guess would be either ARIMA(0,1,2) or ARIMA(2,1,0).

# First we try to make this series stationary
ggtsdisplay(diff(wmurders))
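A unit-root test supports this choice (a minimal sketch; forecast::ndiffs() applies a KPSS test by default):

ndiffs(wmurders)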

8.7b

Should you include a constant in the model? Explain.

Ans:

No. The differenced series shows no drift, and with d = 1 a constant would add a linear trend to the long-run forecasts, which the data do not support.
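This can be checked empirically by refitting with a constant and comparing AICc (a sketch using forecast::Arima()'s include.constant argument):

fit_no_c <- Arima(wmurders, order=c(0,1,2))
fit_c    <- Arima(wmurders, order=c(0,1,2), include.constant=TRUE)
c(no_constant = fit_no_c$aicc, with_constant = fit_c$aicc)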

8.7c

Write this model in terms of the backshift operator.
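For the ARIMA(0,1,2) model without a constant chosen above, the backshift form is

$$(1 - B)\,y_t = (1 + \theta_1 B + \theta_2 B^2)\,\varepsilon_t$$

which expands to $y_t = y_{t-1} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2}$.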

8.7d

Fit the model using R and examine the residuals. Is the model satisfactory?

ARIMA_0_1_2 <- Arima(wmurders, order=c(0,1,2))
checkresiduals(ARIMA_0_1_2)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,1,2)
## Q* = 9.7748, df = 8, p-value = 0.2812
## 
## Model df: 2.   Total lags used: 10
ARIMA_2_1_0 <- Arima(wmurders, order=c(2,1,0))
checkresiduals(ARIMA_2_1_0)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(2,1,0)
## Q* = 9.9569, df = 8, p-value = 0.2681
## 
## Model df: 2.   Total lags used: 10

Both models pass the Ljung-Box test (p-values of 0.28 and 0.27), and the residuals look like white noise, so either model is satisfactory.

8.7e

Forecast three times ahead. Check your forecasts by hand to make sure that you know how they have been calculated.

Forecast_ARIMA_0_1_2 <- forecast(ARIMA_0_1_2, h=3)
Forecast_ARIMA_0_1_2
##      Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
## 2005       2.458450 2.195194 2.721707 2.055834 2.861066
## 2006       2.477101 2.116875 2.837327 1.926183 3.028018
## 2007       2.477101 1.979272 2.974929 1.715738 3.238464
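These values can be reproduced by hand. For ARIMA(0,1,2), each forecast is the previous value plus the remaining MA terms, so only the last two residuals matter and the 2006 and 2007 forecasts coincide (a sketch):

# Hand-check of the three forecasts from the model's coefficients and residuals
theta <- coef(ARIMA_0_1_2)          # ma1, ma2
e <- residuals(ARIMA_0_1_2)
f1 <- as.numeric(tail(wmurders, 1)) + theta['ma1']*tail(e, 1) + theta['ma2']*tail(e, 2)[1]
f2 <- f1 + theta['ma2']*tail(e, 1)
f3 <- f2                            # MA(2) terms vanish beyond two steps
round(c(f1, f2, f3), 6)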

8.7f

Create a plot of the series with forecasts and prediction intervals for the next three periods shown.

autoplot(Forecast_ARIMA_0_1_2)

8.7g

Does auto.arima() give the same model you have chosen? If not, which model do you think is better?

autofit_arima <- auto.arima(wmurders, d=1, stepwise = FALSE)
autofit_arima
autoplot(forecast(autofit_arima, h=3))
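To decide between the hand-picked and automatic models, AICc on the same differencing order is a direct comparison (a sketch):

c(ARIMA_0_1_2 = ARIMA_0_1_2$aicc, ARIMA_2_1_0 = ARIMA_2_1_0$aicc, auto = autofit_arima$aicc)

The model with the lowest AICc is preferred, provided its residuals also pass the Ljung-Box test.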