The autocorrelations in each of these figures have far different average absolute magnitudes. However, even though some of the autocorrelations in figure 1 approach .4, they still can be assumed to have been generated from a white noise process because only lag 12 reaches the critical value.
With a white noise process, there should be about 1 in 20 lags that exceeds the critical values at 5% confidence. That is why you see lag 12 in the first figure barley past the dotted line, even though these are known to be white noise processes. If we didn’t know for certain this was a white noise process, I would like to have an intuitive reason a particular lag exceeds the critical value before paying any attention to it, assuming it exceeds the critical value by a small amount. Same with figure 2 and lags 2 and 6, though I would pay a little more attention since the lag index is lower, though lag 12 from figure 1 could indicate seasonality if it were monthly data.
The value of linear correlation is asymptotically normal and inversely proportional to the square root (approximately) of the number of observations. This is why we see much smaller critical values and correlations for 1000 observations. White noise is a random process, so we expect to see different values every time a new series is generated. By extension, the lag correlations will also be different for each series.
library(fpp2)
autoplot(ibmclose)
ACF
ggAcf(ibmclose)
The ACF follows a pattern typically seen with ranodm walks, which are not stationary processes. The ACF shows a lag 1 autocorrelation near 1, then slowly tapers off. This means the series is not at all mean reverting and has long periods where the series stays around the same level.
ggPacf(ibmclose)
The partial autocorrelation, at lag 1, of virtually 1 is indicitive of a unit root. Series with unit roots are not stationary. The fact that this is a unit root can also be seen by plotting an ARIMA (1,0,0) model using this series
fit <- Arima(ibmclose, c(1,0,0))
autoplot(fit)
When we difference the series, it becomes white noise instead of a random walk.
ibmclose %>% diff %>% ggPacf()
library(urca)
autoplot(usnetelec)
There is some non-linearity early in the series. Though it is not clear if variance depends on the level of the series, the period of fluctuation in the 70s and 80s could be smoothed with a transformation.
BoxCox.lambda(usnetelec)
## [1] 0.5167714
We’ll go with a square root transform based on the BoxCox.lambda function.
Difference the series and check for stationarity
usnetelec_bc <- BoxCox(usnetelec, .5)
autoplot(usnetelec_bc)
usnetelec_bc %>% diff %>% ggtsdisplay()
usnetelec_bc %>% diff %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 3 lags.
##
## Value of test-statistic is: 0.4656
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
We seem to still have a problem with heteroskedacity. We’ll try 2 differences.
usnetelec_bc %>% diff %>% diff %>% ggtsdisplay()
usnetelec_bc %>% diff %>% diff %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 3 lags.
##
## Value of test-statistic is: 0.0707
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
The series is now stationary. It should be noted that the second difference has introduced negative autocorrelation. The high absolute ACF and PACF terms do not necessarily mean we have reduced our RMSE on predictions; they are merely a consquence of the second order difference.
autoplot(usgdp)
This series is defintely non-linear, so a Box-Cox should help
BoxCox.lambda(usgdp)
## [1] 0.366352
We’ll go with a cube root.
Difference the series and check for stationarity
usgdp_bc <- BoxCox(usgdp, .3333)
usgdp_bc %>%diff %>% ggtsdisplay()
usgdp_bc %>% diff %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 4 lags.
##
## Value of test-statistic is: 0.1521
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
The first difference made our series stationary so we can stop here.
autoplot(mcopper)
This series is highly heteroskedastic, so again, a Box-Cox is probably necesary.
BoxCox.lambda(mcopper)
## [1] 0.1919047
This is close enough to zero to go with a log transform.
copper_bc <- BoxCox(mcopper, 0)
autoplot(copper_bc)
Difference the series and check for stationarity
copper_bc %>% diff %>% ggtsdisplay()
copper_bc %>% diff %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 6 lags.
##
## Value of test-statistic is: 0.0425
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
We’ve succesfully made this series stationary based on the test. We won’t difference anymore, but would proceed with a model noting the heteroskedasticity. We would keep GARCH in mind if ARIMA didn’t work out.
autoplot(enplanements)
This data appears to be seasonal, with the difference depending on level.
BoxCox.lambda(enplanements)
## [1] -0.2269461
The log transform should help.
Difference the series and check for stationarity
enplanements_bc <- BoxCox(enplanements, 0)
enplanements_bc %>%diff %>% ggtsdisplay()
enplanements_bc %>% diff %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 0.0129
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
There is definitely quarterly seasonality here. Our series looks quite good as far as stationarity goes.
autoplot(visitors)
We see a similar type of series to the previous one.
BoxCox.lambda(visitors)
## [1] 0.2775249
This could be either a log or cube root. We’ll go with log to make interpreation easier.
Difference the series and check for stationarity
visitors_bc <- BoxCox(visitors, 0)
visitors_bc %>%diff %>% ggtsdisplay()
visitors_bc %>% diff %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 4 lags.
##
## Value of test-statistic is: 0.0781
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
The seasonality is either bi-yearly or quarterly. Our series is now stationary after 1 difference.
library(tidyverse)
retaildata <- read_csv("https://raw.githubusercontent.com/TheFedExpress/Data/master/retail.csv", skip=1)
myts <- ts(retaildata[[12]],frequency=12, start=c(1982,4))
autoplot(myts)
The series clearly needs to be both transofmred and differenced. Seasonality depends on level and there is no mean reversion.
BoxCox.lambda(myts)
## [1] 0.0106194
The log transform will work best.
ts_tc <- BoxCox(myts, 0)
ts_tc %>%diff %>% ggtsdisplay()
ts_tc %>% diff %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 5 lags.
##
## Value of test-statistic is: 0.0254
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
We now have a stationary series with a predictable seasonal pattern.
sim_ar1 <- function (phi){
y <- ts(numeric(100))
e <- rnorm(100)
for(i in 2:100)
y[i] <- phi*y[i-1] + e[i]
plot <- autoplot(y) + labs(title = paste0('AR(1) ', phi, ' Process', collapse = ' '))
return(plot)
}
library(gridExtra)
ar_plots <- lapply(seq(.1, .9, .1), sim_ar1)
grid.arrange(grobs = ar_plots, ncol = 3, nrow = 3)
As we increase phi, the series becomes less choppy. It will stay around a given level for a longer period of time. This is because each observation is more likely to be similar to the last observation.
We make a similar function to the AR(1) for an MA(1) process. We use 500 points just to have more opportunity to find a pattern.
sim_ma1 <- function(theta = .6) {
y <- ts(numeric(500))
e <- rnorm(500)
y[1] <- e[1]
for (i in 2:500){
y[i] <- e[i] + theta * e[i-1]
}
return(autoplot(y) + labs(title = paste0('MA(1) ', theta, ' Process', collapse = ' ')))
}
ma_plots <- lapply(seq(.1, .9, .1), sim_ma1)
grid.arrange(grobs = ma_plots, ncol = 3, nrow = 3)
The MA(1) process only depends on the random error and the most recent point so there the difference between these plots is less obvious. As we increase theta, say to .9, extreme observations become more likely. This is because if the previous error was high, and the current error is high .9 * the previous high error + the current high error.
We use a similar function to generate an ARMA(1,1) process with theta and phi both equal to .6.
sim_arma11 <- function(phi = .6, theta = .6) {
y <- ts(numeric(100))
e <- rnorm(100)
y[1] <- e[1]
for (i in 2:100){
y[i] <- phi*y[i-1] + theta * e[i-1] + e[i]
}
return(autoplot(y))
}
This time we use the framework to create an AR(2) process with \({ \phi }_{ 1 }=-.8, {\phi}_{2}=.3\)
sim_ar2 <- function(phi1 = -.8, phi2 = .3) {
y <- ts(numeric(100))
e <- rnorm(100)
for (i in 3:100){
y[i] <- phi1*y[i-1] + phi2*y[i-2] + e[i]
}
return(autoplot(y))
}
sim_arma11()
sim_ar2()
The ARMA process has some cycles but is generally mean reverting. The non-stationary process just ballons out of control.I believe it is something like \({ 1.1 }^{ 100 }*e\)
autoplot(wmurders)
This series will have to be differenced but is somewhat unlike the previous series we’ve worked with in that it doesn’t have a trend pattern. It is more cyclical like a random walk. We’ll start with a single difference.
wmurders_d <- wmurders %>% diff
sd(wmurders_d)
## [1] 0.2156076
wmurders_d %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 3 lags.
##
## Value of test-statistic is: 0.4697
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
wmurders_d %>% ggtsdisplay()
This exceeds the 5% critical value for a non-stationary series, but the diagnostic plots mostly pass the eyeball test, only showing heteroskedasticity for concern. We’ll try a Box-Cox to deal with that.
Lambda Value
BoxCox.lambda(wmurders)
## [1] -0.09529835
wmurders_bc <- BoxCox(wmurders, 0) %>% diff
wmurders_bc %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 3 lags.
##
## Value of test-statistic is: 0.5406
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
wmurders_bc %>% ggtsdisplay()
The variation in the series looks somewhat better, at least around 1970. We’ll try a double difference on the data just to see if we can induce stationarity.
wmurders_dd <- wmurders %>% diff %>% diff
sd(wmurders_dd)
## [1] 0.3204323
wmurders_dd %>% ur.kpss() %>% summary()
##
## #######################
## # KPSS Unit Root Test #
## #######################
##
## Test is of type: mu with 3 lags.
##
## Value of test-statistic is: 0.0458
##
## Critical value for a significance level of:
## 10pct 5pct 2.5pct 1pct
## critical values 0.347 0.463 0.574 0.739
wmurders_dd %>% ggtsdisplay()
The double difference has increased the variance. We don’t want to over-difference, make a slight improvement on the KPSS test statistic.
Acf(wmurders_bc)
Pacf(wmurders_bc)
The (2,1,2) model seems most appropriate. It is unclear if the model is MA(2), AR(2), or both so if we had to pick just one model, having both MA and AR terms seems best.
Using backshift notation our model looks like:
\((1-{ \phi }_{ 1 }B-{ \phi }_{ 2 }{ B }^{ 2 }){ (1-B) }{ y }_{ t }=(1+{ \theta }_{ 1 }B+{ \theta }_{ 2 }{ B }^{ 2 }){ \varepsilon }_{ t }\)
This time series does not have a trend component so we want the forecast of the different to go to a contant in the long term. As such we do not want a contant. Intuitively, we see the series hasn’t increased in the last 50 years, so we do not want the contant increase that adding a constant in a differenced series would provide.
mod1 <- Arima(wmurders_bc, c(2,1,2))
summary(mod1)
## Series: wmurders_bc
## ARIMA(2,1,2)
##
## Coefficients:
## ar1 ar2 ma1 ma2
## -0.3075 0.1513 -0.7414 -0.1054
## s.e. 0.8091 0.2841 0.8144 0.6583
##
## sigma^2 estimated as 0.003666: log likelihood=74.72
## AIC=-139.44 AICc=-138.16 BIC=-129.58
##
## Training set error measures:
## ME RMSE MAE MPE MAPE
## Training set -0.003131109 0.05767707 0.04412008 -198.5935 393.3589
## MASE ACF1
## Training set 0.664639 -0.01914654
checkresiduals(mod1)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(2,1,2)
## Q* = 8.2433, df = 6, p-value = 0.2208
##
## Model df: 4. Total lags used: 10
The residuals look okay, but none of the terms are significant so we’d probably be better off choosing a different model. We’ll try ARIMA(1,1,1)
mod1 <- Arima(wmurders_bc, c(1,1,1))
summary(mod1)
## Series: wmurders_bc
## ARIMA(1,1,1)
##
## Coefficients:
## ar1 ma1
## -0.2954 -0.7892
## s.e. 0.1533 0.1190
##
## sigma^2 estimated as 0.003613: log likelihood=74.09
## AIC=-142.17 AICc=-141.68 BIC=-136.26
##
## Training set error measures:
## ME RMSE MAE MPE MAPE
## Training set -0.002751289 0.05841585 0.04456503 -332.7413 531.9883
## MASE ACF1
## Training set 0.6713418 0.01707485
checkresiduals(mod1)
##
## Ljung-Box test
##
## data: Residuals from ARIMA(1,1,1)
## Q* = 10.394, df = 8, p-value = 0.2385
##
## Model df: 2. Total lags used: 10
This model is satisfactory; the residuals and parameters look good. The AR1 and MA1 parameters didn’t change much, but their standard errors did. Both terms are now signficant. We’ve also lowered our AIC.
Hand-calculated predictions
library(kableExtra)
params <- mod1$coef %>% as.vector()
e0 <- residuals(mod1)[length(residuals(mod1))]
y_t0 <- mod1$x[54]
y_t1<- mod1$x[53]
pred1 <- (1+params[1]) * y_t0 - params[1]*y_t1 + params[2]*e0
pred2 <- (1+params[1]) * pred1 - params[1]*y_t0
pred3 <- (1+params[1]) * pred2 - params[1]*pred1
data.frame(Predictions = c(pred1, pred2, pred3)) %>% kable %>% kable_styling()
| Predictions |
|---|
| -0.0407358 |
| -0.0368982 |
| -0.0380317 |
forecast function
forecast(mod1, 3)
## Point Forecast Lo 80 Hi 80 Lo 95 Hi 95
## 2005 -0.04073580 -0.1177691 0.03629752 -0.1585481 0.07707650
## 2006 -0.03689824 -0.1142063 0.04040977 -0.1551306 0.08133416
## 2007 -0.03803174 -0.1174452 0.04138176 -0.1594842 0.08342073
The hand calculated predictions check out.
autoplot(wmurders_bc) + autolayer(forecast(mod1, 3), series = 'Forecast') + ggtitle('(2,1,2) model of wmurders')
The interval on the forecast is quite high, nearly equal to the range between the highest and lowested observed points.
auto.arima(wmurders_bc) %>% summary
## Series: wmurders_bc
## ARIMA(1,1,1)
##
## Coefficients:
## ar1 ma1
## -0.2954 -0.7892
## s.e. 0.1533 0.1190
##
## sigma^2 estimated as 0.003613: log likelihood=74.09
## AIC=-142.17 AICc=-141.68 BIC=-136.26
##
## Training set error measures:
## ME RMSE MAE MPE MAPE
## Training set -0.002751289 0.05841585 0.04456503 -332.7413 531.9883
## MASE ACF1
## Training set 0.6713418 0.01707485
The auto.arima function chose the same model as us. When not using the transfomrmed data, the auto.arima function differed from what I would normally do:
auto.arima(wmurders) %>% summary
## Series: wmurders
## ARIMA(1,2,1)
##
## Coefficients:
## ar1 ma1
## -0.2434 -0.8261
## s.e. 0.1553 0.1143
##
## sigma^2 estimated as 0.04632: log likelihood=6.44
## AIC=-6.88 AICc=-6.39 BIC=-0.97
##
## Training set error measures:
## ME RMSE MAE MPE MAPE MASE
## Training set -0.01065956 0.2072523 0.1528734 -0.2149476 4.335214 0.9400996
## ACF1
## Training set 0.02176343
The auto.arima function’s algorithm will difference twice, even in borderline cases. It does not take into account interpretability or any other factors, just the test statistic on the KPSS test.