Data Pre-Processing and Exponential Smoothing
Figure 8.31 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers.
Explain the differences among these figures. Do they all indicate that the data are white noise?
Why are the critical values at different distances from the mean of zero? Why are the autocorrelations different in each figure when they each refer to white noise?
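The spacing of the critical values can be illustrated numerically. The sketch below assumes the usual large-sample 95% bounds for the ACF of white noise, ±1.96/√T, where T is the series length; the helper name acf_bounds is hypothetical.

```r
# Approximate 95% critical values for the ACF of white noise: +/- 1.96 / sqrt(T).
# acf_bounds is an illustrative helper, not a function from any package.
acf_bounds <- function(n) qnorm(0.975) / sqrt(n)

n_obs  <- c(36, 360, 1000)
bounds <- acf_bounds(n_obs)
print(data.frame(n = n_obs, bound = round(bounds, 3)))
```

The bounds narrow as T grows, which is why they sit at different distances from zero. The autocorrelations themselves differ because each figure is estimated from a different random sample, and shorter samples give noisier estimates.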
A classic example of a non-stationary series is the daily closing IBM stock price series (data set ibmclose). Use R to plot the daily closing prices for IBM stock and the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.
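library(fpp2)  # attaches forecast, ggplot2 and the ibmclose and austa data sets
ibmclose %>%
  ggtsdisplay()
A stationary series would fluctuate randomly around a constant level within a fixed range. Here the time plot drifts with no constant level, the ACF decays very slowly and stays significant for more than 25 lags, and the PACF has a large, significant spike at lag 1. Each of these indicates a non-stationary series that should be differenced.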
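The effect of differencing can be sketched in base R. The example below uses a simulated random walk as a stand-in for a non-stationary price series (an assumption, this is not the ibmclose data) and counts how many of the first 25 autocorrelations fall outside the approximate 95% bounds before and after differencing.

```r
set.seed(1)
# Simulated random walk standing in for a non-stationary price series
# (an assumption: this is not the ibmclose data).
rw <- cumsum(rnorm(500))

# Sample autocorrelations at lags 1..25 (the first element, lag 0, is dropped)
acf_level <- acf(rw, lag.max = 25, plot = FALSE)$acf[-1]
acf_diff  <- acf(diff(rw), lag.max = 25, plot = FALSE)$acf[-1]

# Approximate 95% bounds for white noise
bound <- qnorm(0.975) / sqrt(length(rw))

# Number of lags (out of 25) outside the bounds
n_sig <- c(level       = sum(abs(acf_level) > bound),
           differenced = sum(abs(acf_diff) > bound))
print(n_sig)
```

For the undifferenced series most or all lags should be significant, while the differenced series should look much closer to white noise.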
Use R to simulate and plot some data from simple ARIMA models.
Use the following R code to generate data from an AR(1) model with \(\phi_1\) = 0.6 and \(\sigma^2\) = 1. The process starts with \(y_1\) = 0.
y <- ts(numeric(100))
e <- rnorm(100)
for (i in 2:100)
  y[i] <- 0.6 * y[i-1] + e[i]
Produce a time plot for the series. How does the plot change as you change \(\phi_1\)?
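One way to anticipate the answer is to compare the spread of the simulated series across values of \(\phi_1\). The sketch below wraps the loop above in a hypothetical helper, sim_ar1, and assumes \(\sigma^2\) = 1. For |\(\phi_1\)| < 1 the series is stationary with variance \(\sigma^2 / (1 - \phi_1^2)\); at \(\phi_1\) = 1 it is a random walk; for |\(\phi_1\)| > 1 it explodes.

```r
set.seed(42)
# sim_ar1 is a hypothetical helper wrapping the AR(1) loop; it is not from a package.
sim_ar1 <- function(phi, n = 100) {
  y <- numeric(n)
  e <- rnorm(n)
  for (i in 2:n) y[i] <- phi * y[i - 1] + e[i]
  y
}

phis <- c(-0.5, 0, 0.6, 1, 2)

# Largest absolute value reached by each simulated series
spread <- sapply(phis, function(phi) max(abs(sim_ar1(phi))))
names(spread) <- paste0("phi=", phis)
print(signif(spread, 3))

# Theoretical variance for the stationary cases: 1 / (1 - phi^2) when sigma^2 = 1
stationary <- abs(phis) < 1
theo_var <- 1 / (1 - phis[stationary]^2)
print(theo_var)
```

Negative values of \(\phi_1\) make successive observations alternate in sign, so the plot looks jagged; positive values close to 1 make it smoother and more persistent.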
# function to plot an AR(1) series for a given phi
plot_ar <- function(phi) {
  y <- ts(numeric(100))
  e <- rnorm(100)
  for (i in 2:100)
    y[i] <- phi * y[i-1] + e[i]
  autoplot(y) +
    ggtitle(paste0('AR(1), phi = ', phi))
}
# phi = -2
plot_ar(-2)
# phi = -1
plot_ar(-1)
# phi = -0.5
plot_ar(-0.5)
# phi = 0
plot_ar(0)
# phi = 0.5
plot_ar(0.5)
# phi = 1
plot_ar(1)
# phi = 2
plot_ar(2)
Write your own code to generate data from an MA(1) model with \(\theta_1\) = 0.6 and \(\sigma^2\) = 1.
# function to plot an MA(1) series for a given theta
plot_ma <- function(theta) {
  y <- ts(numeric(100))
  e <- rnorm(100)
  for (i in 2:100)
    y[i] <- e[i] + theta * e[i-1]
  autoplot(y) +
    ggtitle(paste0('MA(1), theta = ', theta))
}
Produce a time plot for the series. How does the plot change as you change \(\theta_1\)?
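The theoretical moments of an MA(1) process help explain what the plots show. The sketch below assumes \(\sigma^2\) = 1, so the variance is \(1 + \theta_1^2\) and the lag-1 autocorrelation is \(\theta_1 / (1 + \theta_1^2)\). It cross-checks the formula against ARMAacf() from base R's stats package.

```r
thetas <- c(-2, -0.5, 0, 0.5, 0.6, 2)

# Theoretical variance and lag-1 autocorrelation of an MA(1) with sigma^2 = 1
variance <- 1 + thetas^2
rho1     <- thetas / (1 + thetas^2)
print(data.frame(theta = thetas, variance = variance, rho1 = round(rho1, 4)))

# Cross-check one value against stats::ARMAacf (element 1 is lag 0, element 2 is lag 1)
rho1_check <- unname(ARMAacf(ma = 0.6, lag.max = 1)[2])
print(rho1_check)
```

Unlike the AR(1) case, the series never explodes: a larger |\(\theta_1\)| only widens the range of the plot. Note that \(\theta_1\) and \(1/\theta_1\) give the same lag-1 autocorrelation, so the pattern of the plot for \(\theta_1\) = 2 resembles that for \(\theta_1\) = 0.5 on a larger scale.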
# theta = 0.6
plot_ma(0.6)
# theta = -2
plot_ma(-2)
# theta = -1
plot_ma(-1)
# theta = -0.5
plot_ma(-0.5)
# theta = 0
plot_ma(0)
# theta = 0.5
plot_ma(0.5)
# theta = 1
plot_ma(1)
# theta = 2
plot_ma(2)
Generate data from an ARMA(1,1) model with \(\phi_1\) = 0.6, \(\theta_1\) = 0.6 and \(\sigma^2\) = 1.
# function to plot an ARMA(1,1) series for given phi and theta
plot_arma <- function(phi, theta) {
  y <- ts(numeric(100))
  e <- rnorm(100)
  for (i in 2:100)
    y[i] <- phi * y[i-1] + e[i] + theta * e[i-1]
  autoplot(y) +
    ggtitle(paste0('ARMA(1,1), phi = ', phi, ', theta = ', theta))
}
plot_arma(0.6,0.6)
Generate data from an AR(2) model with \(\phi_1\) = -0.8, \(\phi_2\) = 0.3 and \(\sigma^2\) = 1. (Note that these parameters will give a non-stationary series.)
# function to plot an AR(2) series for given phi_1 and phi_2
plot_ar2 <- function(phi1, phi2) {
  y <- ts(numeric(100))
  e <- rnorm(100)
  for (i in 3:100)
    y[i] <- phi1 * y[i-1] + phi2 * y[i-2] + e[i]
  autoplot(y) +
    ggtitle(paste0('AR(2), phi_1 = ', phi1, ', phi_2 = ', phi2))
}
plot_ar2(-0.8,0.3)
Graph the latter two series and compare them.
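The difference between the two series can be traced to the roots of their AR polynomials. The sketch below uses base R's polyroot(); the helper name ar_roots is hypothetical. An AR process is stationary only if every root of \(1 - \phi_1 z - \phi_2 z^2\) lies outside the unit circle.

```r
# Moduli of the roots of the AR polynomial 1 - phi_1 z - phi_2 z^2 - ...
# ar_roots is an illustrative helper, not a function from any package.
ar_roots <- function(phi) Mod(polyroot(c(1, -phi)))

roots_arma <- ar_roots(0.6)           # AR part of the ARMA(1,1) model
roots_ar2  <- ar_roots(c(-0.8, 0.3))  # AR(2) model

print(roots_arma)
print(roots_ar2)
```

The ARMA(1,1) model has its single root outside the unit circle, so the series stays in a bounded range. The AR(2) model has one root with modulus below 1, so its series oscillates with growing amplitude, in line with the note that these parameters give a non-stationary series.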
Consider austa, the total international visitors to Australia (in millions) for the period 1980-2015.
autoplot(austa)
Use auto.arima() to find an appropriate ARIMA model. What model was selected? Check that the residuals look like white noise. Plot forecasts for the next 10 periods.
(fit <- auto.arima(austa))
## Series: austa
## ARIMA(0,1,1) with drift
##
## Coefficients:
##          ma1   drift
##       0.3006  0.1735
## s.e.  0.1647  0.0390
##
## sigma^2 estimated as 0.03376: log likelihood=10.62
## AIC=-15.24 AICc=-14.46 BIC=-10.57
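Written out, the selected model has the following form. This is a sketch that assumes the forecast package's convention that, for a model with d = 1 and no AR terms, the drift coefficient is the constant \(c\) in the differenced equation:

\[
y_t - y_{t-1} = c + \varepsilon_t + \theta_1 \varepsilon_{t-1},
\qquad \hat{c} = 0.1735,\quad \hat{\theta}_1 = 0.3006,\quad \hat{\sigma}^2 = 0.03376 .
\]

Under that reading, the point forecasts rise by about 0.17 million visitors per year.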
checkresiduals(fit)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,1,1) with drift
## Q* = 2.297, df = 5, p-value = 0.8067
##
## Model df: 2. Total lags used: 7
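The Ljung-Box p-value of 0.8067 is well above 0.05, so there is no evidence of autocorrelation left in the residuals. The sketch below recomputes that p-value from the reported statistic and shows how the same kind of test can be run with base R's Box.test(). The simulated white noise stands in for the model residuals (an assumption, it is not the austa fit); lag = 7 and fitdf = 2 mirror the "Total lags used" and "Model df" values in the output.

```r
# Recompute the reported p-value from Q* = 2.297 on 5 degrees of freedom
p_reported <- pchisq(2.297, df = 5, lower.tail = FALSE)
print(round(p_reported, 4))

set.seed(123)
# Simulated white noise standing in for model residuals
# (an assumption: these are not the residuals of the austa fit).
res <- rnorm(36)

# Ljung-Box test with 7 lags and 2 estimated parameters, so df = 7 - 2 = 5
lb <- Box.test(res, lag = 7, type = "Ljung-Box", fitdf = 2)
print(lb)
```

The recomputed value should agree with the p-value printed by checkresiduals().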
fit %>%
  forecast(h=10) %>%
  autoplot()
Plot forecasts from an ARIMA(0,1,1) model with no drift and compare these to part a. Remove the MA term and plot again.
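What to expect from these two fits can be sketched with base R's arima(), which does not add a drift term when d > 0. The series below is simulated (an assumption, it is not the austa data). Without drift, the ARIMA(0,1,1) point forecasts are flat; once the MA term is removed the model is a random walk, and the forecasts equal the last observation.

```r
set.seed(7)
# Simulated ARIMA(0,1,1) series without drift (an assumption: not the austa data)
x <- cumsum(arima.sim(model = list(ma = 0.3), n = 100))

fit_ma <- arima(x, order = c(0, 1, 1))  # no drift term is estimated
fit_rw <- arima(x, order = c(0, 1, 0))  # MA term removed: a random walk

fc_ma <- as.numeric(predict(fit_ma, n.ahead = 10)$pred)
fc_rw <- as.numeric(predict(fit_rw, n.ahead = 10)$pred)

print(range(fc_ma))         # both ends equal: the forecasts are flat
print(range(fc_rw))         # both ends equal the last observation
print(x[length(x)])
```

By contrast, the model with drift from part a produces forecasts that climb by the estimated drift at every step.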
# ARIMA(0,1,1) without drift
Arima(austa, order = c(0,1,1), include.drift = FALSE) %>%
  forecast(h=10) %>%
  autoplot()
# MA term removed: ARIMA(0,1,0) without drift
Arima(austa, order = c(0,1,0), include.drift = FALSE) %>%
  forecast(h=10) %>%
  autoplot()
Plot forecasts from an ARIMA(2,1,3) model with drift. Remove the constant and see what happens.
Arima(austa, order = c(2,1,3), include.drift = TRUE) %>%
  forecast(h=10) %>%
  autoplot()
# Arima(austa, order = c(2,1,3), include.constant = FALSE) %>%
#   forecast(h=10) %>%
#   autoplot()
Plot forecasts from an ARIMA(0,0,1) model with a constant. Remove the MA term and plot again.
# ARIMA(0,0,1) with a constant
Arima(austa, order = c(0,0,1), include.constant = TRUE) %>%
  forecast(h=10) %>%
  autoplot()
# MA term removed: ARIMA(0,0,0) with a constant
Arima(austa, order = c(0,0,0), include.constant = TRUE) %>%
  forecast(h=10) %>%
  autoplot()
Plot forecasts from an ARIMA(0,2,1) model with no constant.
Arima(austa, order = c(0,2,1), include.constant = FALSE) %>%
  forecast(h=10) %>%
  autoplot()