A stationary time series is one whose statistical properties (mean, variance, and autocorrelation) do not change over time.
The stationary series are:
- (b) Daily change in the Google stock price in 2015
- (g) Annual total of Canadian Lynx furs traded
Series (b) is stationary because differencing removes trend and stabilizes the mean.
Series (g) is stationary because, although it shows strong cycles, they are aperiodic: their timing is irregular and cannot be predicted in the long run.
The remaining series are non-stationary due to:
- Trends
- Seasonality
- Changing variance
The backshift operator (B) is used to represent lagged values in a time series.
By definition:
- B y_t = y_{t-1}
- B^2 y_t = y_{t-2}
For seasonal data:
- B^{12} y_t = y_{t-12}
This operator simplifies the notation for differencing.
The first difference can be written as:
y'_t = y_t - y_{t-1} = (1 - B)y_t
The second difference can be written as:
y''_t = y_t - 2y_{t-1} + y_{t-2} = (1 - B)^2 y_t
A d-th order difference is written as:
(1 - B)^d y_t
A seasonal difference combined with a first difference is:
(1 - B)(1 - B^m)y_t
This expands to:
y_t - y_{t-1} - y_{t-m} + y_{t-m-1}
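The backshift identities above can be checked numerically. The following is a minimal sketch in base R, assuming a simulated series `y` and a seasonal period `m = 12` (both illustrative choices, not data from the exercises); it compares the output of `diff()` with the expanded formulas.

```r
# Simulated series for illustration only (not data from the exercises)
set.seed(1)
y <- cumsum(rnorm(60))
n <- length(y)
m <- 12  # assumed seasonal period

# First difference: (1 - B) y_t = y_t - y_{t-1}
d1 <- diff(y)
d1_manual <- y[2:n] - y[1:(n - 1)]

# Second difference: (1 - B)^2 y_t = y_t - 2 y_{t-1} + y_{t-2}
d2 <- diff(y, differences = 2)
d2_manual <- y[3:n] - 2 * y[2:(n - 1)] + y[1:(n - 2)]

# Seasonal difference followed by a first difference: (1 - B)(1 - B^m) y_t
ds <- diff(diff(y, lag = m))
idx <- (m + 2):n
ds_manual <- y[idx] - y[idx - 1] - y[idx - m] + y[idx - m - 1]

# Each comparison should report TRUE
all.equal(d1, d1_manual)
all.equal(d2, d2_manual)
all.equal(ds, ds_manual)
```

Each differencing step shortens the series, which is why the manual versions start at index 2, 3, and m + 2 respectively.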
An autoregressive (AR) model is a time series model where the current value of a variable is explained using its past values.
An autoregressive model of order p, denoted AR(p), is written as:
y_t = c + φ₁y_{t-1} + φ₂y_{t-2} + … + φ_p y_{t-p} + ε_t
where ε_t is white noise.
In an AR model, the variable is regressed on its own previous values. This is similar to multiple regression, but instead of using external predictors, we use lagged values of the same variable.
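To make the regression analogy concrete, here is a small sketch in base R, assuming simulated AR(1) data with a hypothetical coefficient of 0.7. It regresses the series on its own first lag with `lm()`; the slope should land close to the value used in the simulation.

```r
# Simulate an AR(1) series with an assumed coefficient (illustrative only)
set.seed(42)
phi <- 0.7
y <- as.numeric(arima.sim(model = list(ar = phi), n = 2000))
n <- length(y)

# Build the "predictor" from the series itself: y_{t-1} explains y_t
lagged <- data.frame(y_t = y[2:n], y_lag1 = y[1:(n - 1)])

# Ordinary regression of the series on its first lag
ar_fit <- lm(y_t ~ y_lag1, data = lagged)
phi_hat <- unname(coef(ar_fit)["y_lag1"])
phi_hat  # estimated slope, expected to be near phi
```

This is only a demonstration of the idea; in practice the coefficients are estimated with a dedicated ARIMA routine rather than `lm()`.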
For an AR(1) model:
- If φ₁ = 0 and c = 0 → the series is white noise
- If φ₁ = 1 and c = 0 → the series is a random walk
- If φ₁ = 1 and c ≠ 0 → the series is a random walk with drift
- If φ₁ < 0 → the series tends to oscillate around its mean
For the model to be stationary:
- In AR(1): -1 < φ₁ < 1
- In AR(2): -1 < φ₂ < 1, φ₁ + φ₂ < 1, and φ₂ - φ₁ < 1
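The stationarity conditions above are special cases of a general rule: an AR(p) model is stationary when all roots of 1 - φ₁z - … - φ_p z^p lie outside the unit circle. The sketch below checks this in base R; `is_stationary_ar` is a hypothetical helper name, and the small tolerance is an assumption to guard against rounding at the boundary.

```r
# Hypothetical helper: TRUE if the AR coefficients give a stationary model.
# phi is the vector (phi_1, ..., phi_p); tol guards against rounding error.
is_stationary_ar <- function(phi, tol = 1e-8) {
  # polyroot() takes coefficients in increasing order: 1 - phi_1 z - ... - phi_p z^p
  roots <- polyroot(c(1, -phi))
  all(Mod(roots) > 1 + tol)
}

is_stationary_ar(0.5)          # TRUE:  -1 < phi_1 < 1
is_stationary_ar(1)            # FALSE: random walk, root on the unit circle
is_stationary_ar(c(0.5, 0.3))  # TRUE:  AR(2) with phi_1 + phi_2 < 1
is_stationary_ar(c(0.8, 0.3))  # FALSE: AR(2) with phi_1 + phi_2 > 1
```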
Autoregressive models are flexible and can capture many different time series patterns depending on the parameter values.
A non-seasonal ARIMA model combines autoregression (AR), differencing (I), and moving average (MA).
An ARIMA(p, d, q) model is written as:
y'_t = c + φ₁y'_{t-1} + … + φ_p y'_{t-p} + θ₁ε_{t-1} + … + θ_q ε_{t-q} + ε_t
where:
- y'_t = the differenced series (it may have been differenced more than once)
- p = number of autoregressive terms
- d = number of differences
- q = number of moving average terms
ARIMA models are flexible and can represent many types of time series patterns by combining AR, differencing, and MA components.
Some special cases include:
- ARIMA(0,0,0): white noise
- ARIMA(0,1,0): random walk
- ARIMA(p,0,0): autoregressive model
- ARIMA(0,0,q): moving average model
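As a quick illustration of the ARIMA(0,1,0) case, the sketch below (base R, simulated data, illustrative names) builds a random walk by accumulating white noise and then shows that one round of differencing recovers the noise.

```r
# White noise, i.e. ARIMA(0,0,0) with c = 0 (simulated for illustration)
set.seed(123)
e <- rnorm(200)

# Random walk: y_t = y_{t-1} + e_t, i.e. ARIMA(0,1,0)
rw <- cumsum(e)

# Differencing once, (1 - B) y_t, gives back the white noise terms
recovered <- diff(rw)
all.equal(recovered, e[-1])  # first noise term is lost to differencing
```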
library(fpp3)
# Fit ARIMA model on Egyptian exports
fit <- global_economy |>
  filter(Code == "EGY") |>
  model(ARIMA(Exports))
## Warning: 1 error encountered for ARIMA(Exports)
## [1] The `urca` package must be installed to use this functionality. It can be installed with install.packages("urca")
report(fit)
## Series: Exports
## Model: NULL model
## NULL model
fit |>
  forecast(h = 10) |>
  autoplot(global_economy)
## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
## -Inf
## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
## -Inf
## Warning: Removed 10 rows containing missing values or values outside the scale range
## (`geom_line()`).
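The ARIMA() function is meant to select the model automatically. In the run shown above, however, the fit failed because the `urca` package, which ARIMA() needs for its unit root tests, was not installed.
As a result, report() shows a NULL model instead of the estimated autoregressive (AR) and moving average (MA) coefficients.
For the same reason the forecasts are all missing, so the plot contains no forecast line or uncertainty intervals (hence the warning about 10 removed rows). After installing `urca` with install.packages("urca") and re-running the code, the report should list the selected order and parameter estimates, and the plot should show the forecasts with their prediction intervals.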
Once the order of an ARIMA model (p, d, q) is selected, the parameters are estimated using maximum likelihood estimation (MLE).
MLE finds the parameter values that maximize the probability of observing the data.
In ARIMA models, MLE is used to estimate the coefficients (c, φ, and θ) by maximizing the likelihood of the observed data.
This is similar to least squares estimation, where we minimize the sum of squared errors.
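The link between MLE and least squares can be seen on simulated data. This sketch uses `stats::arima()` from base R (not the fable `ARIMA()` function used elsewhere in this document) and an assumed AR(1) coefficient of 0.6; `method = "CSS"` minimizes a conditional sum of squares, while `method = "ML"` maximizes the likelihood.

```r
# Simulated AR(1) data with an assumed coefficient (illustrative only)
set.seed(2024)
y <- arima.sim(model = list(ar = 0.6), n = 500)

# Same model, two estimation methods
fit_ml  <- arima(y, order = c(1, 0, 0), method = "ML")   # maximum likelihood
fit_css <- arima(y, order = c(1, 0, 0), method = "CSS")  # conditional sum of squares

# The two AR coefficient estimates are typically very close
estimates <- c(ML  = unname(coef(fit_ml)["ar1"]),
               CSS = unname(coef(fit_css)["ar1"]))
estimates
```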
To select the best model, information criteria such as AIC, AICc, and BIC are used.
These criteria balance model fit and complexity. Lower values indicate a better model.
AICc is generally preferred because it adjusts for small sample sizes.
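Important note:
- These criteria are used to select p and q
- They are NOT reliable for selecting d (degree of differencing), because differencing changes the data on which the likelihood is computed, so values from models with different d are not comparable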
Therefore, differencing (d) is usually determined first, and then AICc is used to choose the best values of p and q.
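The following sketch shows how AICc can be computed and compared across candidate models with the same d. It uses base R `stats::arima()` on simulated data; the helper name `aicc` and the candidate orders are assumptions made for illustration.

```r
# Hypothetical helper: small-sample corrected AIC for a stats::arima() fit
aicc <- function(fit) {
  k <- length(coef(fit)) + 1  # estimated coefficients plus the error variance
  n <- fit$nobs               # number of observations used in the fit
  AIC(fit) + 2 * k * (k + 1) / (n - k - 1)
}

# Simulated AR(1) data with an assumed coefficient (illustrative only)
set.seed(7)
y <- arima.sim(model = list(ar = 0.7), n = 300)

# Candidate (p, d, q) orders, all with the same d so the values are comparable
orders <- list(c(0, 0, 0), c(1, 0, 0), c(2, 0, 0), c(0, 0, 1))
fits <- lapply(orders, function(o) arima(y, order = o, method = "ML"))

scores <- sapply(fits, aicc)
names(scores) <- sapply(orders, paste, collapse = ",")
scores                    # lower is better
names(which.min(scores))  # order with the lowest AICc
```

The correction term is always positive and shrinks as the sample size grows, so AICc and AIC agree closely for long series.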
The ARIMA() function in the fable package uses the Hyndman-Khandakar algorithm to automatically select the best ARIMA model.
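The algorithm works as follows:
- The number of differences d (0 ≤ d ≤ 2) is determined using repeated KPSS unit root tests.
- After differencing the data d times, p and q are chosen by minimizing the AICc. Rather than trying every combination, the algorithm uses a stepwise search: it starts from a small set of initial models and varies p and q by ±1 (and includes or excludes the constant c) until no lower AICc is found.
- The parameters of each candidate model are estimated by MLE.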
This allows the function to efficiently find a suitable ARIMA model without manually testing all combinations.
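The two-stage idea (choose d first, then p and q by AICc) can be sketched in base R. This is a simplified illustration and not the Hyndman-Khandakar algorithm itself: it swaps the KPSS tests for `stats::PP.test()` (whose null hypothesis is a unit root) and replaces the stepwise search with a small exhaustive grid. The function name `select_arima_sketch` and its arguments are hypothetical.

```r
# Simplified, illustrative search; NOT the Hyndman-Khandakar algorithm itself
select_arima_sketch <- function(y, max_d = 2, max_order = 2, alpha = 0.05) {
  # Step 1: difference until the unit root test rejects non-stationarity.
  # Assumption: PP.test stands in for the repeated KPSS tests.
  d <- 0
  x <- as.numeric(y)
  while (d < max_d && PP.test(x)$p.value > alpha) {
    x <- diff(x)
    d <- d + 1
  }

  # Step 2: with d fixed, choose p and q by minimizing AICc over a small grid
  best <- list(p = NA, d = d, q = NA, aicc = Inf)
  for (p in 0:max_order) {
    for (q in 0:max_order) {
      fit <- tryCatch(arima(y, order = c(p, d, q), method = "ML"),
                      error = function(e) NULL)
      if (is.null(fit)) next  # skip candidates that fail to fit
      k <- length(coef(fit)) + 1
      score <- AIC(fit) + 2 * k * (k + 1) / (fit$nobs - k - 1)
      if (score < best$aicc) best <- list(p = p, d = d, q = q, aicc = score)
    }
  }
  best
}

# Usage on simulated stationary data (illustrative only)
set.seed(99)
y <- arima.sim(model = list(ar = 0.5), n = 300)
choice <- select_arima_sketch(y)
choice
```

An exhaustive grid is affordable here only because the orders are capped at 2; the stepwise search exists precisely to avoid fitting every combination.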
library(fpp3)
# Fit ARIMA model automatically
fit_auto <- global_economy |>
  filter(Code == "CAF") |>
  model(ARIMA(Exports))
## Warning: 1 error encountered for ARIMA(Exports)
## [1] The `urca` package must be installed to use this functionality. It can be installed with install.packages("urca")
report(fit_auto)
## Series: Exports
## Model: NULL model
## NULL model
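The ARIMA() function is designed to select a model by combining unit root tests, AICc minimization, and MLE.
In the run above, however, the unit root tests could not be carried out because the `urca` package was not installed, so report() returned a NULL model rather than a selected model order and estimated parameters. Installing `urca` with install.packages("urca") and re-running the code is needed to obtain the fitted model.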
This demonstrates how ARIMA modelling can be automated while still following proper statistical procedures.
Forecasts from ARIMA models are calculated by using past values, past errors, and replacing future errors with zero.
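The process for computing forecasts involves:
- Expanding the ARIMA equation so that y_t is on the left-hand side and all other terms are on the right.
- Rewriting the equation by replacing t with T + h.
- On the right-hand side, replacing future observations with their forecasts, future errors with zero, and past errors with the corresponding residuals.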
This process is repeated for each forecast step.
Prediction intervals are based on the standard deviation of the residuals and increase as the forecast horizon increases.
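The forecasting recursion can be reproduced by hand for a simple case. The sketch below uses base R `stats::arima()` on simulated AR(1) data (assumed coefficient 0.8 and mean 5, both illustrative) and compares the manual forecasts with `predict()`. Note that `stats::arima()` reports the series mean μ as "intercept", so the model is written as y_t - μ = φ₁(y_{t-1} - μ) + ε_t.

```r
# Simulated AR(1) data around an assumed mean of 5 (illustrative only)
set.seed(10)
y <- arima.sim(model = list(ar = 0.8), n = 200) + 5
fit <- arima(y, order = c(1, 0, 0), method = "ML")

phi <- unname(coef(fit)["ar1"])
mu  <- unname(coef(fit)["intercept"])  # stats::arima() stores the mean here

# Step-by-step forecasts: future errors are replaced by zero and
# future observations by their own forecasts
h <- 5
manual <- numeric(h)
prev <- as.numeric(y[length(y)])  # last observed value, y_T
for (i in seq_len(h)) {
  manual[i] <- mu + phi * (prev - mu)
  prev <- manual[i]
}

# Built-in forecasts and their standard errors
auto <- predict(fit, n.ahead = h)
cbind(manual  = manual,
      predict = as.numeric(auto$pred),
      se      = as.numeric(auto$se))
```

The `se` column grows with the horizon, which is what makes the prediction intervals widen.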
library(fpp3)
# Fit ARIMA model
fit_forecast <- global_economy |>
  filter(Code == "EGY") |>
  model(ARIMA(Exports))
## Warning: 1 error encountered for ARIMA(Exports)
## [1] The `urca` package must be installed to use this functionality. It can be installed with install.packages("urca")
# Generate forecasts
fc <- fit_forecast |>
  forecast(h = 10)
# Plot forecasts
autoplot(fc, global_economy)
## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
## -Inf
## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
## -Inf
## Warning: Removed 10 rows containing missing values or values outside the scale range
## (`geom_line()`).
The ARIMA model generates forecasts by using past observations and estimated model parameters.
Future errors are assumed to be zero, which allows forecasts to be computed step-by-step.
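In the run above no forecasts were produced, because the model could not be fitted without the `urca` package; the plot warning reports 10 removed rows, one for each forecast step. Once the model is fitted, the forecast plot shows predicted values along with prediction intervals, which widen as the forecast horizon increases due to increasing uncertainty.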