Figure 8.31 shows the ACFs for 36 random numbers, 360 random numbers and 1,000 random numbers.
Left: ACF for a white noise series of 36 numbers. Middle: ACF for a white noise series of 360 numbers. Right: ACF for a white noise series of 1,000 numbers.
It seems that the longer the time series, the closer the critical values for autocorrelation get to zero: the range of values showing no significant difference from zero (between the blue lines) gets successively narrower as the number of data points increases in each plot.
All of the plots look like white noise, since almost none of the spikes extend beyond the critical values and there is no discernible pattern in any of the plots.
From Hyndman:
For a white noise series, we expect 95% of the spikes in the ACF to lie within \(\pm 2/\sqrt{T}\) where \(T\) is the length of the time series.1
Since the critical values depend on \(T\), the longer the time series, the smaller the absolute value of the critical values will be.
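For the three series here that gives bounds of roughly \(\pm 0.33\), \(\pm 0.11\), and \(\pm 0.06\) respectively, which can be checked with a one-liner:

2 / sqrt(c(36, 360, 1000))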
A classic example of a non-stationary series is the daily closing IBM stock price series (data set ibmclose). Use R to plot the daily closing prices for IBM stock and the ACF and PACF. Explain how each plot shows that the series is non-stationary and should be differenced.
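The plots discussed below can be produced with the ggtsdisplay function, which shows the time plot, ACF and PACF together (a minimal sketch, assuming the fpp2 package is loaded):

library(fpp2)
# time plot, ACF and PACF of the IBM daily closing prices in one display
ggtsdisplay(ibmclose)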
A downward trend is clearly evident in the time plot, which rules out stationarity.

The ACF plot shows spikes at every lag that fall well outside the critical values, indicating that the series is not white noise and therefore not stationary.

Because the first lag of a PACF plot is the same as the first lag of the ACF plot, the PACF also shows a large significant spike at lag 1, again indicating that the series is not stationary.
For the following series, find an appropriate Box-Cox transformation and order of differencing in order to obtain stationary data.
usnetelec
usgdp
mcopper
enplanements
visitors

In homework 2 we examined Box-Cox transformations for the first four of these series and discovered that "The enplanements series is the only one of the four that shows a clear seasonality that increases with the increase in the level of the series, so it is the only one of the four series for which a Box-Cox transformation is warranted and useful."2 So we will only use a Box-Cox transformation on the enplanements time series, and possibly the visitors time series if it seems warranted after examining plots for increasing seasonality.
usnetelec
Description: Annual US net electricity generation (billion kwh) for 1949-2003
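The differencing recommendation below and the plot of the differenced series discussed next presumably come from calls along these lines (a sketch, assuming fpp2 is loaded):

ndiffs(usnetelec)
# first difference only
usnetelec %>% diff() %>% ggtsdisplay()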
## [1] 1
We still see a lot of variation after differencing only once, and there is one negative spike that extends below the critical value at lag 14, so let's see if a second difference is better.
Not surprisingly, since the ndiffs function suggested only first-order differencing, that made things even worse. So let's see if a log transformation or Box-Cox transformation helps in this case, even though a Box-Cox transformation did not seem warranted.
## [1] 2
# after log transformation and Differencing twice
usnetelec %>% log() %>% diff() %>% diff() %>% ggtsdisplay()

The ndiffs function suggested second-order differencing after a log transformation, but that made things worse again, so let's try the Box-Cox transformation…
## [1] 2
# after Box-Cox transformation and Differencing once
usnetelec %>% BoxCox(lambda = "auto") %>% diff() %>% ggtsdisplay()

Interesting! Even though a Box-Cox transformation did not seem warranted in this case based on the plots, after trial and error the best solution for making the usnetelec time series stationary seems to be a Box-Cox transformation with first-order differencing, even though the ndiffs function suggested second-order differencing after the Box-Cox transformation. However, it is only marginally better than simple first-order differencing, so for simplicity's sake we may choose to go with the original first-order differencing instead.
usgdp
Description: Quarterly US GDP. 1947:1 - 2006.1.
## [1] 2
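The recommendation above is presumably ndiffs(usgdp); the twice-differenced series discussed below might be displayed with:

usgdp %>% diff() %>% diff() %>% ggtsdisplay()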
After my experience with the usnetelec series, I decided to try log transformation and Box-Cox too just in case, but second-order differencing alone resulted in the best outcome.
mcopper
Description: Monthly copper prices. Copper, grade A, electrolytic wire bars/cathodes, LME, cash (pounds/ton)
Source: UNCTAD http://stats.unctad.org/Handbook.
## [1] 1
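As above, the recommendation is presumably ndiffs(mcopper), and the first-differenced series might be displayed with:

mcopper %>% diff() %>% ggtsdisplay()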
Once again I tried a log transformation and Box-Cox too just in case, but again first-order differencing alone resulted in the best outcome. So it seems that the ndiffs function does result in the best solution.
enplanements
Description: Domestic Revenue Enplanements (millions): 1996-2000.
Source: Department of Transportation, Bureau of Transportation Statistics, Air Carrier Traffic Statistic Monthly.
Here a Box-Cox transformation is definitely warranted because the seasonal variability increases with the level of the series.
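The two recommendations below (seasonal differencing, then a further first-order difference) presumably come from calls like:

enplanements %>% BoxCox(lambda = "auto") %>% nsdiffs()
enplanements %>% BoxCox(lambda = "auto") %>% diff(lag = 12) %>% ndiffs()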
## [1] 1
## [1] 1
The nsdiffs function recommends seasonal differencing after the Box-Cox transformation, and the ndiffs function recommends a further first-order difference, which results in the following series…
# after Box-Cox transformation and Differencing once
enplanements %>% BoxCox(lambda = "auto") %>% diff(lag=12) %>% diff() %>% ggtsdisplay()

cbind("enplanements" = enplanements,
"BoxCox\nTransformed" = BoxCox(enplanements, lambda = "auto"),
"Seasonally\ndifferenced" =
diff(BoxCox(enplanements, lambda = "auto"),12),
"Doubly\n differenced" =
diff(diff(BoxCox(enplanements, lambda = "auto"),12),1)) %>%
autoplot(facets=TRUE) +
xlab("Year") + ylab("") +
ggtitle("Domestic Revenue Enplanements (millions)")visitorsDescription: Monthly Australian short-term overseas visitors. May 1985-April 2005
Source: Hyndman, R.J., Koehler, A.B., Ord, J.K., and Snyder, R.D., (2008) Forecasting with exponential smoothing: the state space approach, Springer.
Again a Box-Cox transformation definitely seems to be warranted in this case due to seasonal variability that increases with the level of the series.
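Again, the recommendations below presumably come from the same pair of calls:

visitors %>% BoxCox(lambda = "auto") %>% nsdiffs()
visitors %>% BoxCox(lambda = "auto") %>% diff(lag = 12) %>% ndiffs()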
## [1] 1
## [1] 1
Once again, the nsdiffs function recommends seasonal differencing after the Box-Cox transformation, and the ndiffs function recommends a further first-order difference, which results in the following series…
# after Box-Cox transformation and Seasonal Differencing
visitors %>% BoxCox(lambda = "auto") %>% diff(lag=12) %>% diff() %>% ggtsdisplay()

cbind("visitors" = visitors,
"BoxCox\nTransformed" = BoxCox(visitors, lambda = "auto"),
"Seasonally\ndifferenced" =
diff(BoxCox(visitors, lambda = "auto"),12),
"Doubly\n differenced" =
diff(diff(BoxCox(visitors, lambda = "auto"),12),1)) %>%
autoplot(facets=TRUE) +
xlab("Year") + ylab("") +
ggtitle("Monthly Australian short-term overseas visitors")For your retail data (from Exercise 3 in Section 2.10), find the appropriate order of differencing (after transformation if necessary) to obtain stationary data.
retaildata <- readxl::read_excel("retail.xlsx", skip=1)
retail <- ts(retaildata[,"A3349335T"],
frequency=12, start=c(1982,4))
autoplot(retail)

A Box-Cox transformation definitely seems to be warranted due to seasonal variability that increases with the level of the series.
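The recommendations below presumably come from the same pair of calls used in Exercise 8.3:

retail %>% BoxCox(lambda = "auto") %>% nsdiffs()
retail %>% BoxCox(lambda = "auto") %>% diff(lag = 12) %>% ndiffs()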
## [1] 1
## [1] 1
As with the last two time series in Exercise 8.3, the nsdiffs function recommends seasonal differencing after the Box-Cox transformation, and the ndiffs function recommends a further first-order difference, which results in the following series…
# after Box-Cox transformation and Seasonal Differencing
retail %>% BoxCox(lambda = "auto") %>% diff(lag=12) %>% diff() %>% ggtsdisplay()

There's still a lot of autocorrelation in the ACF plot, but after trying a second-order seasonal difference, and a second-order difference after the seasonal difference, the plot above is still the best result.
cbind("Retail Sales" = retail,
"BoxCox\nTransformed" = BoxCox(retail, lambda = "auto"),
"Seasonally\ndifferenced" =
diff(BoxCox(retail, lambda = "auto"),12),
"Doubly\n differenced" =
diff(diff(BoxCox(retail, lambda = "auto"),12),1)) %>%
autoplot(facets=TRUE) +
xlab("Year") + ylab("") +
ggtitle("Retail Sales")Use R to simulate and plot some data from simple ARIMA models.
set.seed(123)
y <- ts(numeric(100))
e <- rnorm(100)
# for(i in 2:100)
# y[i] <- 0.6*y[i-1] + e[i]
head(cbind(y, e), 10)

## Time Series:
## Start = 1
## End = 10
## Frequency = 1
## y e
## 1 0 -0.56047565
## 2 0 -0.23017749
## 3 0 1.55870831
## 4 0 0.07050839
## 5 0 0.12928774
## 6 0 1.71506499
## 7 0 0.46091621
## 8 0 -1.26506123
## 9 0 -0.68685285
## 10 0 -0.44566197
AR1 <- function(phi, y, e){
  for(i in 2:100)
    y[i] <- phi*y[i-1] + e[i]
  return(y)
}
phi_0.6 <- AR1(0.6, y, e)
head(phi_0.6)

## Time Series:
## Start = 1
## End = 6
## Frequency = 1
## [1] 0.0000000 -0.2301775 1.4206018 0.9228695 0.6830094 2.1248706
# par(mfrow=c(3,2))
phi_neg1 <- AR1(-1, y, e) %>% autoplot() + ggtitle('phi_neg1')
phi_neg0.9 <- AR1(-0.9, y, e) %>% autoplot() + ggtitle('phi_neg0.9')
phi_neg0.6 <- AR1(-0.6, y, e) %>% autoplot() + ggtitle('phi_neg0.6')
phi_neg0.3 <- AR1(-0.3, y, e) %>% autoplot() + ggtitle('phi_neg0.3')
phi_0 <- AR1(0, y, e) %>% autoplot() + ggtitle('phi_0')
phi_0.3 <- AR1(0.3, y, e) %>% autoplot() + ggtitle('phi_0.3')
phi_0.6 <- AR1(0.6, y, e) %>% autoplot() + ggtitle('phi_0.6')
phi_0.8 <- AR1(0.8, y, e) %>% autoplot() + ggtitle('phi_0.8')
phi_0.9 <- AR1(0.9, y, e) %>% autoplot() + ggtitle('phi_0.9')
phi_1.0 <- AR1(1.0, y, e) %>% autoplot() + ggtitle('phi_1.0')
grid.arrange(phi_neg1, phi_neg0.9, phi_neg0.6, phi_neg0.3,
phi_0, phi_0.3, phi_0.6, phi_0.8, phi_0.9, phi_1.0,
nrow = 5)

Negative \(\phi_1\) values result in a plot that fluctuates wildly up and down, since each y value is a negative multiple of the previous y value plus error.
\(\phi_1 = 0\) results in a stationary plot that is pure white noise since it is equivalent to plotting the errors. The values fluctuate up and down between -2 and 2 since the errors are taken from a normal distribution with mean 0 and standard deviation of 1 and so the vast majority of values will be within 2 standard deviations of the mean.
At \(\phi_1 = 0.6\) and above we start to see the plot smooth out more, as each value is affected more and more by the previous y value, until at \(\phi_1 = 1\) each y value equals the previous y value plus the error, which is equivalent to a random walk.
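The MA1 helper used below is not shown above; a sketch analogous to the AR1 function (assumed, since the original definition is not included) would be:

MA1 <- function(theta, y, e){
  # MA(1): each value is the current error plus theta times the previous error
  for(i in 2:100)
    y[i] <- e[i] + theta*e[i-1]
  return(y)
}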
# par(mfrow=c(3,2))
theta_neg0.9 <- MA1(-0.9, y, e) %>% autoplot() + ggtitle('theta_neg0.9')
theta_neg0.6 <- MA1(-0.6, y, e) %>% autoplot() + ggtitle('theta_neg0.6')
theta_neg0.3 <- MA1(-0.3, y, e) %>% autoplot() + ggtitle('theta_neg0.3')
theta_neg0.1 <- MA1(-0.1, y, e) %>% autoplot() + ggtitle('theta_neg0.1')
theta_0 <- MA1(0, y, e) %>% autoplot() + ggtitle('theta_0')
theta_0.1 <- MA1(0.1, y, e) %>% autoplot() + ggtitle('theta_0.1')
theta_0.3 <- MA1(0.3, y, e) %>% autoplot() + ggtitle('theta_0.3')
theta_0.6 <- MA1(0.6, y, e) %>% autoplot() + ggtitle('theta_0.6')
theta_0.9 <- MA1(0.9, y, e) %>% autoplot() + ggtitle('theta_0.9')
theta_1 <- MA1(1, y, e) %>% autoplot() + ggtitle('theta_1')
grid.arrange(theta_neg0.9, theta_neg0.6, theta_neg0.3,
theta_neg0.1, theta_0, theta_0.1,
theta_0.3, theta_0.6, theta_0.9, theta_1,
nrow = 5)

Negative \(\theta_1\) values result in a plot with sharper up-and-down fluctuations, since each value is the current error minus a multiple of the previous error, giving negative autocorrelation at lag 1.
At \(\theta_1 = 0\) the plot is equivalent to plotting the errors, with the previous error having no influence on the current value.
For positive \(\theta_1\) values, the higher the \(\theta_1\) value the more the previous error influences the current value, until at \(\theta_1 = 1\) the current error and the previous error have equal weight, which corresponds to a non-invertible model.
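The original code chunk generating the next two series is not shown; a sketch of how they might be generated, reusing the e errors from above (parameter values as given in the exercise), is:

# ARMA(1,1): y[t] = 0.6*y[t-1] + e[t] + 0.6*e[t-1]
arma11 <- ts(numeric(100))
for(i in 2:100)
  arma11[i] <- 0.6*arma11[i-1] + e[i] + 0.6*e[i-1]

# AR(2): y[t] = -0.8*y[t-1] + 0.3*y[t-2] + e[t]
ar2 <- ts(numeric(100))
for(i in 3:100)
  ar2[i] <- -0.8*ar2[i-1] + 0.3*ar2[i-2] + e[i]

grid.arrange(autoplot(arma11), autoplot(ar2), nrow = 1)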
In the first plot, with \(\phi_1 = 0.6\) and \(\theta_1 = 0.6\), both the previous y value and the previous error moderately influence the current value, so we end up with a fairly random-looking but stationary series. In the second plot, with \(\phi_1 = -0.8\) and \(\phi_2 = 0.3\), the current y value is strongly influenced by the negative of the previous value, so the series alternates between positive and negative values; and because these coefficients violate the stationarity conditions for an AR(2) process, the absolute value of y increases exponentially over time.
Consider wmurders, the number of women murdered each year (per 100,000 standard population) in the United States.
## [1] 1
The data are annual (frequency of one), so there is no seasonality and a non-seasonal ARIMA model can be used in this case.
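The frequency shown above is presumably frequency(wmurders); the differencing recommendation below and the plots discussed next presumably come from calls like:

ndiffs(wmurders)
wmurders %>% diff() %>% diff() %>% ggtsdisplay()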
## [1] 2
The ndiffs function recommends second-order differencing, which is confirmed by the plots above. The ACF plot has two significant spikes at the beginning and then none after that, but it is also somewhat sinusoidal with alternating positive and negative spikes; the PACF has only one significant spike after lag 1, but since it is not in the first few lags we can disregard it. So in this case the ACF and PACF lead us to think an \(\text{ARIMA}(0,2,1)\) model might be appropriate. We should keep in mind, though, that if both \(p\) and \(q\) are positive the plots would not be helpful in finding these values.
Since \(d = 2\), if \(c = 0\) the long-term forecasts will follow a straight line, and if \(c \neq 0\) they will follow a quadratic trend. So in this case I would not include a constant, since a quadratic trend does not seem appropriate for this model.3
\[(1-B)^2y_t=(1+\theta_1B)\varepsilon_t\]
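The model summary and residual checks below would come from something along these lines (a sketch; the original chunk is not shown):

fit <- Arima(wmurders, order = c(0,2,1))
fit
checkresiduals(fit)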
## Series: wmurders
## ARIMA(0,2,1)
##
## Coefficients:
## ma1
## -0.8995
## s.e. 0.0669
##
## sigma^2 estimated as 0.04747: log likelihood=5.24
## AIC=-6.48 AICc=-6.24 BIC=-2.54
##
## Ljung-Box test
##
## data: Residuals from ARIMA(0,2,1)
## Q* = 13.04, df = 9, p-value = 0.1608
##
## Model df: 1. Total lags used: 10
There are no apparent patterns in the residuals and their distribution is nearly normal, so the model seems to be satisfactory.
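The three-period forecasts below are presumably produced with:

forecast(fit, h = 3)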
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2005 2.504087 2.224876 2.783299 2.077070 2.931105
## 2006 2.418792 2.003603 2.833981 1.783815 3.053769
## 2007 2.333496 1.799788 2.867204 1.517260 3.149732
\[ \begin{align} (1-B)^2y_t &= (1+\theta_1B)\varepsilon_t \\ y_t-2y_{t-1}+y_{t-2} &= \varepsilon_t + \theta_1 \varepsilon_{t-1} \\ y_t &= 2y_{t-1} - y_{t-2} + \varepsilon_t + \theta_1 \varepsilon_{t-1} \end{align} \]

Setting future errors to zero, and using the last fitted residual for \(\varepsilon_T\) in the first step, gives the point forecasts:
fc <- ts(numeric(3))
theta <- -0.8995
y <- wmurders[54:55]       # last two observations
e <- residuals(fit)[55]    # last fitted residual
fc[1] <- 2*y[2] - y[1] + theta*e   # future errors set to zero
fc[2] <- 2*fc[1] - y[2]
fc[3] <- 2*fc[2] - fc[1]
fc

## Time Series:
## Start = 1
## End = 3
## Frequency = 1
## [1] 2.504087 2.418791 2.333495
The manually calculated forecasts match the fitted forecasts.
Does auto.arima() give the same model you have chosen? If not, which model do you think is better?

## Series: wmurders
## ARIMA(1,2,1)
##
## Coefficients:
## ar1 ma1
## -0.2434 -0.8261
## s.e. 0.1553 0.1143
##
## sigma^2 estimated as 0.04632: log likelihood=6.44
## AIC=-6.88 AICc=-6.39 BIC=-0.97
No, the auto.arima function chose an \(\text{ARIMA}(1,2,1)\) model, which is not the same model we chose. This is not surprising: if that is the ideal model, then, as Hyndman notes, since both \(p\) and \(q\) are positive the ACF and PACF plots would not help us choose appropriate values for \(p\) and \(q\) anyway. Additionally, the auto.arima model gives slightly better AIC and AICc values, so I think it is the better model.
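For reference, the auto.arima output above would presumably come from:

auto.arima(wmurders)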
Hyndman, R.J., & Athanasopoulos, G. (2018) Forecasting: principles and practice, 2nd edition, OTexts: Melbourne, Australia. OTexts.com/fpp2. Accessed on February 23, 2020. section 2.9 White noise↩
Hyndman, R.J., & Athanasopoulos, G. (2018) Forecasting: principles and practice, 2nd edition, OTexts: Melbourne, Australia. OTexts.com/fpp2. Accessed on February 23, 2020. section 8.5 Non-seasonal ARIMA models↩