The differences among the three ACF plots are due to sample size. As the sample size increases, the ACF becomes more stable and closer to zero, and the confidence bounds become narrower. All three indicate white noise because there is no consistent pattern in the autocorrelations, and most values lie within the confidence bounds. The differences seen are due to random variation in smaller samples.
The critical values (blue dashed lines) are at different distances from zero because they depend on the sample size: they are drawn at ±1.96/√T, where T is the number of observations, so they move closer to zero as T grows. The autocorrelations are different in each figure because of sampling variability. Even though all datasets are white noise, each sample is different, so the sample autocorrelations will vary randomly. Larger samples produce autocorrelations that are closer to zero.
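A minimal base R sketch of how the bounds scale with sample size. The sample sizes 36, 360 and 1000 are assumed for illustration and are not read off the plots above; the seed is arbitrary.

```r
# Illustrative sample sizes (assumed, not taken from the figures)
n <- c(36, 360, 1000)

# Approximate 95% white-noise bounds: +/- 1.96 / sqrt(T)
crit <- 1.96 / sqrt(n)
data.frame(n = n, lower = -crit, upper = crit)

# For one simulated white-noise sample, compute the share of
# sample autocorrelations (lags 1 to 20) that fall inside the bounds
set.seed(1)
x <- rnorm(360)
r <- acf(x, lag.max = 20, plot = FALSE)$acf[-1]  # drop lag 0
share_inside <- mean(abs(r) <= 1.96 / sqrt(360))
share_inside
```

For white noise, roughly 95% of the spikes are expected to fall inside the bounds, so an occasional spike outside them is not evidence against white noise.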
# Filter Amazon stock
amazon <- gafa_stock %>%
filter(Symbol == "AMZN")
# Plot time series
autoplot(amazon, Close) +
labs(title = "Amazon Daily Closing Prices")
# ACF
amazon %>%
ACF(Close) %>%
autoplot()
## Warning: Provided data has an irregular interval, results should be treated
## with caution. Computing ACF by observation.
# PACF
amazon %>%
PACF(Close) %>%
autoplot()
## Warning: Provided data has an irregular interval, results should be treated
## with caution. Computing ACF by observation.
The time plot shows a clear trend, the ACF decays very slowly, and the PACF
is dominated by a large spike at the first lag. All three point to a
non-stationary series, so the series should be differenced.
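The effect of differencing can be sketched in base R with a simulated random walk. This is a stand-in for illustration only, not the AMZN data, and the seed and series length are arbitrary.

```r
# Simulated random walk (a stand-in, not the Amazon closing prices)
set.seed(42)
rw <- cumsum(rnorm(500))

# Sample autocorrelations at lags 1 to 10, before and after differencing
acf_level <- acf(rw, lag.max = 10, plot = FALSE)$acf[-1]
acf_diff  <- acf(diff(rw), lag.max = 10, plot = FALSE)$acf[-1]

round(acf_level, 2)  # slow decay from a value near 1
round(acf_diff, 2)   # small values, as for white noise
```

The level series shows the slowly decaying ACF typical of non-stationary data, while the first differences behave like white noise.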
turkey_gdp <- global_economy %>%
filter(Country == "Turkey")
autoplot(turkey_gdp, GDP) +
labs(title = "Turkish GDP", y = "GDP", x = "Year")
turkey_gdp %>%
features(GDP, guerrero)
# Guerrero's lambda
lambda <- turkey_gdp %>%
features(GDP, guerrero) %>%
pull(lambda_guerrero)
lambda
## [1] 0.1572187
turkey_gdp %>%
features(GDP, unitroot_ndiffs)
autoplot(turkey_gdp, box_cox(GDP, lambda)) +
labs(title = "Box-Cox Turkish GDP",
y = "Transformed GDP", x = "Year")
tas_accom <- aus_accommodation %>%
filter(State == "Tasmania")
autoplot(tas_accom, Takings)
# Box-Cox
tas_accom %>%
features(Takings, guerrero)
# Differencing
tas_accom %>%
features(Takings, unitroot_ndiffs)
tas_accom %>%
features(Takings, unitroot_nsdiffs)
autoplot(souvenirs)
## Plot variable not specified, automatically selected `.vars = Sales`
# Box-Cox
souvenirs %>%
features(Sales, guerrero)
# Differencing
souvenirs %>%
features(Sales, unitroot_ndiffs)
souvenirs %>%
features(Sales, unitroot_nsdiffs)
set.seed(1234)
myseries <- aus_retail %>%
filter(`Series ID` == sample(aus_retail$`Series ID`, 1))
autoplot(myseries, Turnover)
# Box-Cox transformation
myseries %>%
features(Turnover, guerrero)
# Regular differencing
myseries %>%
features(Turnover, unitroot_ndiffs)
# Seasonal differencing
myseries %>%
features(Turnover, unitroot_nsdiffs)
y <- numeric(100)
e <- rnorm(100)
for(i in 2:100)
y[i] <- 0.6*y[i-1] + e[i]
sim <- tsibble(idx = seq_len(100), y = y, index = idx)
autoplot(sim, y) +
labs(title = "Simulated AR(1)",
x = "Time", y = "y")
As ϕ1 increases toward 1, the series becomes smoother and shows more
persistence. When ϕ1 is close to 0, the series looks like white noise,
and when ϕ1 is negative the series oscillates around its mean.
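One way to quantify this is to compare the lag-1 sample autocorrelation across several values of ϕ1. The sketch below uses base R only; the helper name `simulate_ar1`, the grid of ϕ1 values, the seed, and the longer series length are all assumptions made for illustration.

```r
# Hypothetical helper: simulate an AR(1) series with the same recursion as above
simulate_ar1 <- function(phi, n = 100) {
  y <- numeric(n)
  e <- rnorm(n)
  for (i in 2:n) {
    y[i] <- phi * y[i - 1] + e[i]
  }
  y
}

set.seed(123)
phis <- c(0, 0.3, 0.6, 0.9)

# Lag-1 sample autocorrelation for each phi (long series to reduce noise)
lag1 <- sapply(phis, function(phi) {
  acf(simulate_ar1(phi, n = 2000), lag.max = 1, plot = FALSE)$acf[2]
})
data.frame(phi = phis, lag1_acf = round(lag1, 2))
```

For a stationary AR(1) process the theoretical lag-1 autocorrelation equals ϕ1, so the estimates should track the chosen values closely.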
y <- numeric(100)
e <- rnorm(100)
for(i in 2:100){
y[i] <- 0.6*e[i-1] + e[i]
}
sim_ma1 <- tsibble(idx = seq_len(100), y = y, index = idx)
autoplot(sim_ma1, y) +
labs(title = "MA(1) Series with theta = 0.6", x = "Time", y = "y")
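As θ1 changes, only the short-run dependence in the series changes: an
MA(1) process has lag-1 autocorrelation θ1/(1 + θ1²) and zero
autocorrelation at all longer lags. Larger positive θ1 gives stronger
positive lag-1 correlation, up to a maximum of 0.5 at θ1 = 1.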
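The lag-1 autocorrelation of an MA(1) process can be checked against `ARMAacf()` from base R's stats package. The grid of θ1 values below is an arbitrary choice for illustration.

```r
# Arbitrary grid of MA(1) parameters
thetas <- c(-0.9, -0.3, 0.3, 0.6, 0.9)

# Closed-form lag-1 autocorrelation of an MA(1): theta / (1 + theta^2)
rho1_formula <- thetas / (1 + thetas^2)

# Theoretical autocorrelations from stats::ARMAacf (element 1 is lag 0)
rho1_armaacf <- unname(sapply(thetas, function(th) ARMAacf(ma = th, lag.max = 2)[2]))
rho2_armaacf <- unname(sapply(thetas, function(th) ARMAacf(ma = th, lag.max = 2)[3]))

data.frame(theta = thetas,
           rho1_formula = round(rho1_formula, 4),
           rho1_armaacf = round(rho1_armaacf, 4),
           rho2_armaacf = round(rho2_armaacf, 4))
```

The two lag-1 columns agree, and the lag-2 column is zero for every θ1, which is why an MA(1) series only shows dependence between adjacent observations.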
y <- numeric(100)
e <- rnorm(100)
for(i in 2:100){
y[i] <- 0.6 * y[i-1] + e[i] + 0.6 * e[i-1]
}
sim_arma11 <- tsibble(idx = seq_len(100), y = y, index = idx)
y <- numeric(100)
e <- rnorm(100)
for(i in 3:100){
y[i] <- -0.8 * y[i-1] + 0.3 * y[i-2] + e[i]
}
sim_ar2 <- tsibble(idx = seq_len(100), y = y, index = idx)
autoplot(sim_arma11, y) +
labs(title = "ARMA(1,1) Series", x = "Time", y = "y")
autoplot(sim_ar2, y) +
labs(title = "AR(2) Series", x = "Time", y = "y")
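Overall, the ARMA(1,1) series looks stable, while the AR(2) series
oscillates with growing amplitude. This is expected: with ϕ1 = −0.8 and
ϕ2 = 0.3, the stationarity condition ϕ2 − ϕ1 < 1 is violated
(0.3 + 0.8 = 1.1), so the AR(2) process is non-stationary.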
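The difference in stability can be checked from the roots of the AR characteristic polynomial 1 − ϕ1 z − ϕ2 z², which must all lie outside the unit circle for stationarity. The helper name `ar_is_stationary` below is hypothetical; `polyroot()` and `Mod()` are base R.

```r
# Hypothetical helper: TRUE if all roots of 1 - phi_1 z - ... - phi_p z^p
# lie strictly outside the unit circle
ar_is_stationary <- function(phi) {
  roots <- polyroot(c(1, -phi))
  all(Mod(roots) > 1)
}

ar_is_stationary(0.6)            # AR part of the ARMA(1,1) above
ar_is_stationary(c(-0.8, 0.3))   # AR(2) coefficients used above

# Moduli of the AR(2) roots: one of them is inside the unit circle
Mod(polyroot(c(1, 0.8, -0.3)))
```

The first call returns `TRUE` and the second returns `FALSE`, consistent with the explosive oscillation seen in the AR(2) plot.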
data <- aus_airpassengers
fit <- data %>%
model(ARIMA(Passengers))
report(fit)
## Series: Passengers
## Model: ARIMA(0,2,1)
##
## Coefficients:
## ma1
## -0.8963
## s.e. 0.0594
##
## sigma^2 estimated as 4.308: log likelihood=-97.02
## AIC=198.04 AICc=198.32 BIC=201.65
gg_tsresiduals(fit)
## Warning: `gg_tsresiduals()` was deprecated in feasts 0.4.2.
## ℹ Please use `ggtime::gg_tsresiduals()` instead.
## This warning is displayed once per session.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
fc <- fit %>% forecast(h = 10)
autoplot(fc, data) +
labs(title = "ARIMA Forecasts for Air Passengers",
y = "Passengers (millions)")
\((1 - B)^2 y_t = (1 + \theta_1 B)\varepsilon_t\), where \(B\) is the backshift operator and \(\theta_1\) is the MA parameter, estimated as \(-0.8963\). The selected ARIMA(0,2,1) model uses two differences and has no constant.
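As a worked step, the ARIMA(0,2,1) model reported by `report(fit)` can be expanded from backshift notation into an explicit recursion. Expanding \((1 - B)^2 = 1 - 2B + B^2\) and substituting the estimate of the MA coefficient from the output above gives:

```latex
\begin{aligned}
(1 - B)^2 y_t &= (1 + \theta_1 B)\,\varepsilon_t \\
y_t - 2y_{t-1} + y_{t-2} &= \varepsilon_t + \theta_1 \varepsilon_{t-1} \\
y_t &= 2y_{t-1} - y_{t-2} + \varepsilon_t - 0.8963\,\varepsilon_{t-1}
\end{aligned}
```

Each observation therefore extrapolates the most recent change, \(y_{t-1} + (y_{t-1} - y_{t-2})\), and then adjusts for the previous forecast error.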
fit_rw <- data %>%
model(ARIMA(Passengers ~ 1 + pdq(0,1,0)))
fc_rw <- fit_rw %>% forecast(h = 10)
autoplot(fc_rw, data) +
labs(title = "ARIMA(0,1,0) with Drift Forecast")
# With constant (drift): the constant must be requested explicitly with `1 +`
fit_212 <- data %>%
model(ARIMA(Passengers ~ 1 + pdq(2,1,2)))
fc_212 <- fit_212 %>% forecast(h = 10)
autoplot(fc_212, data) +
labs(title = "ARIMA(2,1,2) with Drift Forecast")
# Without constant (no drift)
fit_212_nodrift <- data %>%
model(ARIMA(Passengers ~ pdq(2,1,2) + 0))
## Warning: 1 error encountered for ARIMA(Passengers ~ pdq(2, 1, 2) + 0)
## [1] non-stationary AR part from CSS
fc_212_nodrift <- fit_212_nodrift %>% forecast(h = 10)
autoplot(fc_212_nodrift, data) +
labs(title = "ARIMA(2,1,2) without Drift Forecast")
## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
## -Inf
## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
## -Inf
## Warning: Removed 10 rows containing missing values or values outside the scale range
## (`geom_line()`).
fit_021 <- data %>%
model(ARIMA(Passengers ~ 1 + pdq(0,2,1)))
fc_021 <- fit_021 %>% forecast(h = 10)
autoplot(fc_021, data) +
labs(title = "ARIMA(0,2,1) with Constant Forecast")
The ARIMA(0,2,1) model applies second differencing. Adding a constant to
a twice-differenced model induces a quadratic trend, so the forecasts
curve and can behave unrealistically at longer horizons.
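To see why a constant combined with second differencing produces curvature, the noise-free forecast recursion can be iterated directly in base R. The constant `c0` and the two starting values below are hypothetical numbers chosen for illustration, not estimates from the fitted model.

```r
# Hypothetical constant and last two observations (not fitted values)
c0 <- 0.5
y <- c(10, 11)

# Noise-free recursion implied by (1 - B)^2 y_t = c0:
# y_t = 2 * y_{t-1} - y_{t-2} + c0
for (h in 3:12) {
  y[h] <- 2 * y[h - 1] - y[h - 2] + c0
}

y                          # path bends upward
diff(y)                    # first differences keep growing
diff(y, differences = 2)   # second differences are constant and equal to c0
```

Constant second differences are the defining property of a quadratic, which is why the forecast path bends rather than following a straight line.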
us_gdp <- global_economy %>%
filter(Country == "United States")
autoplot(us_gdp, GDP)
us_gdp %>%
features(GDP, guerrero)
fit_arima <- us_gdp %>%
model(ARIMA(box_cox(GDP, 0)))
report(fit_arima)
## Series: GDP
## Model: ARIMA(0,2,1)
## Transformation: box_cox(GDP, 0)
##
## Coefficients:
## ma1
## -0.6353
## s.e. 0.1138
##
## sigma^2 estimated as 0.0004278: log likelihood=139.76
## AIC=-275.53 AICc=-275.3 BIC=-271.48
fit_alt <- us_gdp %>%
model(
arima1 = ARIMA(box_cox(GDP, 0) ~ pdq(0,1,1)),
arima2 = ARIMA(box_cox(GDP, 0) ~ pdq(1,1,1)),
arima3 = ARIMA(box_cox(GDP, 0) ~ pdq(2,1,2))
)
# Compare the candidate models; report() only supports a single model
glance(fit_alt)
gg_tsresiduals(fit_arima)
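The best model is ARIMA(1,1,1) because it has the lowest
residual variance among the three candidate models. Because all three
use the same order of differencing, their AICc values can also be compared
directly. The residuals appear to be uncorrelated and centered around zero.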
fc_arima <- fit_alt %>%
select(arima2) %>%
forecast(h = 10)
autoplot(us_gdp, GDP) +
autolayer(fc_arima) +
labs(title = "ARIMA(1,1,1) Forecasts for US GDP")
The forecasts show a smooth upward trend, which is consistent with historical GDP growth. The forecasts look reasonable.
fit_ets <- us_gdp %>%
model(ETS(GDP))
fc_ets <- fit_ets %>% forecast(h = 10)
autoplot(fc_ets, us_gdp) +
labs(title = "ETS Forecasts for US GDP")
The ETS model (without transformation) also produces upward-trending
forecasts.