Author: Farhod Ibragimov
library(fpp3)
library(future)
library(patchwork)
plan(multisession)
aus_livestock dataset.?aus_livestock
## starting httpd help server ... done
aus_livestock |>
distinct(Animal)
aus_pig |>
autoplot()+
labs(title = 'Counts of pigs slaughtered for meat production in Victoria')
## Plot variable not specified, automatically selected `.vars = Count`
aus_pig |>
features(Count, feat_stl)
From the plot and STL diagnostics can say following
Trend: fluctuates over time and does not show persistent level over years. Also, estimated trend strength is 0.90, indicating that the series is strongly dominated by long-term changes over time.
Cyclicity: There appears to be longer-term cycles (5-7 years) rather than a clear consistent upward or downward trend.
Seasonality: From the plot alone I
can’t confirm presence of seasonality. However the seasonal strength is
0.49, suggesting a moderate annual seasonal pattern, although the
seasonal effect is not extremely strong.
The seasonal peak occurs in month 1, indicating that the highest
seasonal slaughter counts happens in January.
To check if there is seasonality I would like to apply STL decomposition
and ACF on seasonal component.
aus_pig |>
model(STL(Count~season(window = 'periodic'))) |>
components() |>
ACF(season_year) |>
autoplot()
As we see ACF of the estimated seasonal component shows strong spikes
at lags 12 and 24, this suggests that seasonal patter repeats
annually.
This result is aligned with the STL diagnostics analysis, which
estimated the seasonal strength to be 0.49, suggesting a moderate
seasonal effect in the series.
Overall the trend (0.90) is dominant feature, while seasonal dominance
is 0.49, suggesting a moderate seasonal pattern.
1a. Use the ETS() function to estimate the
equivalent model for simple exponential smoothing. Find the optimal
values of alpha and l_0, and generate
forecasts for the next four months.
fit_pigs <- aus_pig |>
model(
SES = ETS(Count~ error('A') + trend('N') + season('N'))
)
fit_pigs |>
tidy()
report(fit_pigs)
## Series: Count
## Model: ETS(A,N,N)
## Smoothing parameters:
## alpha = 0.3221247
##
## Initial states:
## l[0]
## 100646.6
##
## sigma^2: 87480760
##
## AIC AICc BIC
## 13737.10 13737.14 13750.07
The optimal smoothing parameter was alpha = 0.322,
indicating that about 32% of the new observation is used when updating
the level estimate. The model’s memory of previous levels does not fade
away quickly giving more weights to previous observations.
The estimated initial level was l_0 = 100,646.6.
fc_pigs <- fit_pigs |>
forecast(h = '4 months')
fc_pigs |>
autoplot(aus_pig |> filter(year(Month) >= '2012 Jan'), level = NULL)+
labs(title = "Pigs slautering 4 months forecast")
fc_pigs
The forecast points for the next four months are all 95,186.56
slaughtered pigs. This is because simple exponential smoothing is a
level-only model, so multi-step forecasts remain constant.
Since the STL features suggested strong trend and moderate seasonality,
ETS models like AAN, ANA, AAA or AAdA than simple exponential smoothing
may be more appropriate.
1b. Compute a 95% prediction interval for the first forecast using y_hat + - 1.96s where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.
residuals_sd <- fit_pigs |>
augment() |>
pull(.innov) |>
sd()
residuals_sd
## [1] 9344.666
fc_point <- fc_pigs |>
pull(.mean)
y_hat <- fc_point[1]
y_hat
## [1] 95186.56
lower_ci <- y_hat - residuals_sd*1.96
upper_ci <- y_hat + residuals_sd*1.96
print(c(lower_ci, upper_ci))
## [1] 76871.01 113502.10
fc_pigs_intervals <- fc_pigs |>
hilo(level = 95) |>
unpack_hilo('95%')
fc_pigs_intervals |>
slice(1) |>
select(.model, Month, .mean, `95%_lower`, `95%_upper`)
The residual standard deviation was SD = 9344.66. With first forecast y_hat = 95186.56 the manual 95% prediction interval is approximately (76,817, 113,556).
R produced the interval (76,854.79, 113,518.30).
The two intervals are very similar, with only minor differences because
the ETS model uses its own estimate of the innovation variance rather
than the simple residual standard deviation.
global_economy contains the annual
Exports from many countries. Select one country to
analyse.?global_economy
canada_exports <- global_economy |>
filter(Code == 'CAN') |>
select(Exports)
canada_exports
a. Plot the Exports series and discuss the main features of the data.
canada_exports |>
autoplot(Exports)+
labs(title = "Canadian exports (% of GDP)",
y = 'Exports (% of GDP)')
The Canadian exports series shows a strong upward trend from the
1960s until around 2000, followed by a noticeable decline and
stabilization in later years. The series contains sharp growth from 1992
till 2000, followed sharp drop in exports until 2009 and exports appear
to stabilize after 2009. Since the data are annual, no seasonal pattern
is present.
Overall, the plot suggests that long-term movements play an important
role in the series, which may suggest using models that include a trend
component.
b. Use an ETS(A,N,N) and ETS(A,A,N) models to forecast the series, and plot the forecasts.
fit_canada <- canada_exports |>
model(
SES = ETS(Exports~ error('A') + trend('N') + season('N')),
AAN = ETS(Exports~ error('A') + trend('A') + season('N'))
)
fc_canada <- fit_canada |>
forecast(h = 7)
fc_canada |> autoplot(canada_exports, level = NULL)
b. Compute the RMSE values for the training data.
fit_canada |>
accuracy() |>
select(.model, .type, RMSE)
fit_canada |>
tidy()
d. Compare the results to those from an ETS(A,A,N) model.
(Remember that the trended model is using one more parameter than the
simpler model.) Discuss the merits of the two forecasting methods for
this data set.
Both models RMSE values for the training data are very similar , with
SES = 1.618 and AAN = 1.626. Since the SES model has a slightly lower
RMSE and uses fewer parameters, it provides a more parsimonious fit to
the data.
Both models have smoothing parameters alpha very close to 1,
therefore the level estimate puts almost all weight on the most recent
observation. As a result, the SES model behave similarly to a NAIVE
forecasting approach.
In the ETS(A,A,N) model, the trend parameter beta = 0.24 allows
the trend to update gradually. Because the alpha = 0.99 (more
weights on recent observations) for ETS(A,A,N) model and the most recent
observations (after 2000) show decline and stabilization in exports, the
estimated trend becomes slightly negative, leading to a downward
forecast.
e. Compare the forecasts from both methods. Which do you think is best?
The Canadian exports series shows long-term changes and structural
shifts, particularly around 2000 and 2009. Because the direction of the
trend changes over time and overall it is moving upward, model such as
ETS(A,A,N) may not capture the series well. The SES model adapts to the
current level without imposing a trend, which probably explains its
slightly better RMSE performance.
fc_one <- fc_canada |>
filter(Year == 2018)
fc_one_SES <- fc_one |>
filter(.model == 'SES') |>
pull(.mean)
fc_one_AAN <- fc_one |>
filter(.model == 'AAN') |>
pull(.mean)
print(c(fc_one_SES, fc_one_AAN))
## [1] 30.89403 30.77976
fit_canada
residuals_ANN_sd <- fit_canada |>
select(SES) |>
augment() |>
pull(.innov) |>
sd()
residuals_ANN_sd
## [1] 1.613758
residuals_AAN_sd <- fit_canada |>
select(AAN) |>
augment() |>
pull(.innov) |>
sd()
residuals_AAN_sd
## [1] 1.639708
lower_fc_SES <- fc_one_SES - residuals_ANN_sd*1.96
upper_fc_SES <- fc_one_SES + residuals_ANN_sd*1.96
print(c(lower_fc_SES, upper_fc_SES))
## [1] 27.73107 34.05700
#fc_canada
fc_canada_intervals_SES <- fc_canada |>
filter(.model == 'SES') |>
hilo(level = 95) |>
unpack_hilo('95%')
fc_canada_intervals_SES |>
slice(1) |>
select(.model, Year, .mean, `95%_lower`, `95%_upper`)
lower_fc_AAN <- fc_one_AAN - residuals_AAN_sd*1.96
upper_fc_AAN <- fc_one_AAN + residuals_AAN_sd*1.96
print(c(lower_fc_AAN, upper_fc_AAN))
## [1] 27.56593 33.99359
fc_canada_intervals_AAN <- fc_canada |>
filter(.model == 'AAN') |>
hilo(level = 95) |>
unpack_hilo('95%')
fc_canada_intervals_AAN |>
slice(1) |>
select(.model, Year, .mean, `95%_lower`, `95%_upper`)
library(dplyr)
library(knitr)
library(kableExtra)
##
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
##
## group_rows
comparison_table <- tibble(
.model = c("SES", "AAN"),
Forecast = c(fc_one_SES, fc_one_AAN),
Manual_lower = c(lower_fc_SES, lower_fc_AAN),
Manual_upper = c(upper_fc_SES, upper_fc_AAN)
) |>
left_join(
bind_rows(
fc_canada_intervals_SES |> slice(1),
fc_canada_intervals_AAN |> slice(1)
) |>
select(.model, R_lower = `95%_lower`, R_upper = `95%_upper`),
by = ".model"
)
library(knitr)
library(kableExtra)
comparison_table |>
select(.model, Year, Forecast, Manual_lower, Manual_upper, R_lower, R_upper) |>
kable(
digits = 3,
col.names = c(
"Model", "Year", "Forecast",
"Manual lower", "Manual upper",
"R lower", "R upper"
)
) |>
kable_styling(full_width = FALSE) |>
row_spec(0:nrow(comparison_table), background = "white")
| Model | Year | Forecast | Manual lower | Manual upper | R lower | R upper |
|---|---|---|---|---|---|---|
| SES | 2018 | 30.894 | 27.731 | 34.057 | 27.667 | 34.121 |
| AAN | 2018 | 30.780 | 27.566 | 33.994 | 27.478 | 34.082 |
The manually calculated prediction intervals for both models are very
close to the intervals calculated by R.
There are small differences because the manual calculation uses the
residual standard deviation, while the ETS in R computes prediction
intervals using the estimated innovation variance from the state-space
model.
global_economy data set using an ETS model. Experiment
with the various options in the ETS() function to see how
much the forecasts change with damped trend, or with a Box-Cox
transformation. Try to develop an intuition of what each is doing to the
forecasts.china_gdp <- global_economy |>
filter(Country == 'China') |>
select(GDP) |>
mutate(GDP_bil = GDP / 1e9)
#china_gdp
lambda <- china_gdp |>
features(GDP_bil, features = guerrero) |>
pull(lambda_guerrero)
lambda
## [1] -0.03446284
china_gdp <- china_gdp |>
mutate(GDP_bil_bxcx = box_cox(GDP_bil, lambda))
china_gdp |> autoplot(GDP_bil)+
labs(title = "China GDP in billions",
y = 'GDP (billions)')
Chinese GDP shows very strong long-term growth, especially after around 2000 meaning the economy is growing faster over time.
fit_china <- china_gdp |>
# stretch_tsibble(.init = 10) |>
model(
SES = ETS(GDP_bil~error('A') + trend('N') + season('N')),
Holt = ETS(GDP_bil~error('A') + trend('A') + season('N')),
Damp = ETS(GDP_bil~error('A') + trend('Ad') + season('N')),
Damp_bxcx= ETS(log(GDP_bil)~error('A') + trend('Ad') + season('N'))
)
fit_china |>
forecast(h = 12) |>
autoplot(china_gdp, level = NULL)+
scale_y_continuous()+
guides(color = guide_legend(title = 'Forecast'))
# fit_china|>
# accuracy()
SES (ETS(A,N,N)): it assumes that the series fluctuates around a constant level with no trend. Because Chinese GDP shows strong long-term growth (especially after 2000), the forecast becomes flat and just repeats last predicted level in the model.
Holt (ETS(A,A,N)): this one adds a trend component to the level. This pushes the forecast to continue the recent growth of GDP, producing a constant increasing straight line forecast.
Damped trend (ETS(A,Ad,N)): similar to
Holt but assumes the trend will gradually slow down in the future. The
forecast still increases, but the growth rate becomes smaller and the
line eventually will become flat.
This model is a better option than two previous ones, because we cannot
assume that economic growth is constant and forever.
Damped trend with Box-Cox
transformation: applies a log transformation before
fitting the damped trend model. This models proportional growth instead
of absolute growth, so when converted back to the original scale the
forecast increases much faster and appears almost exponential.
When the model is fitted to log(GDP_bil), it is
mathematically calculates percentage growth instead of absolute
increases.
In other words the model assumes GDP grows by a certain percent each
year instead of by a fixed amount. When the forecasts are converted back
to the original scale, the increases become larger over time because
they are applied to a growing base. That is why the forecast curve rises
much faster and looks kinda exponential.
I think this issue can be fixed with applying lower damping coefficient
phi, and I will try it below.
fit_china_phi_adj <- china_gdp |>
# stretch_tsibble(.init = 10) |>
model(
SES = ETS(GDP_bil~error('A') + trend('N') + season('N')),
Holt = ETS(GDP_bil~error('A') + trend('A') + season('N')),
Damp = ETS(GDP_bil~error('A') + trend('Ad', phi = 0.9) + season('N')),
Damp_bxcx= ETS(log(GDP_bil)~error('A') + trend('Ad', phi = 0.55) + season('N'))
)
fit_china_phi_adj |>
forecast(h = 12) |>
autoplot(china_gdp, level = NULL)+
scale_y_continuous()+
guides(color = guide_legend(title = 'Forecast'))
I applied different damping coefficients phi to both
Damp (phi = 0.9) and Damp_bxcx
(phi = 0.55) models:
Damped trend (ETS(A,Ad,N)) the growth still continues but becomes slightly flatter compared with Holt’s model.
Damped trend with Box–Cox transformation: by lowering the damping parameter to phi = 0.55, the influence of the trend is reduced much faster, so the forecast increases more slowly and produces a more realistic path.
# fit_china_phi_adj |>
# accuracy()
8.7 Find an ETS model for the Gas data from
aus_production and forecast the next few years.
Why is multiplicative seasonality necessary here? Experiment with making
the trend damped. Does it improve the forecasts?
gas_production <- aus_production |>
select(Gas)
#gas_production
gas_production |>
autoplot(Gas)+
labs(title = 'Australian gas production',
y = 'Gas (petajoules)')
As we see from the plot gas production variance and trend looks
constant and stable until around 1970 with small seasonal
fluctuations.
But after 1970 the series shows a strong upward trend, and the seasonal
variation becomes larger as the level of the series increases. This
suggests that the seasonal variations are proportional to the level
indicating multiplicative seasonality.
lambda_gas <- gas_production |>
features(Gas, features = guerrero) |>
pull(lambda_guerrero)
#lambda_gas
gas_production <- gas_production |>
mutate(Gas_bxcx = box_cox(Gas, lambda_gas))
# gas_production
# gas_production |>
# autoplot(Gas_bxcx)
gas_production |>
autoplot(Gas_bxcx)+
labs(title = 'BoxCox transformed gas production series')
As we see applying BoxCox transformation improved variation in series making it more stable and consistent over time.
gas_production |>
model(
stl = STL(Gas~season(window = 'periodic'))
) |>
components() |>
ACF(season_year, lags = 48) |>
autoplot()
I applied STL decomposition to estimate seasonal component of the
series. After that I used ACF function on estimated seasonal
component.
And as we see lags 4, 8, 12 and other multiplies of 4 show strong
positive correlations, also there are strong negative autocorrelations
at lags 2, 6 ,10, 14 and etc.
Since the data are quarterly, ACF also confirms that the seasonal pattern repeats every 4 quarters. This confirms the presence of a strong yearly seasonal cycle in the gas production series.
Adding this to my previous findings from plots of regular and BoxCox transformed data, this gives additional evidence of multiplicative seasonality.
fit_gas <- gas_production |>
model(
AAA = ETS(Gas~ error('A') + trend('A') + season('A')),
AAM = ETS(Gas~ error('A') + trend('A')+ season('M')),
AAdM = ETS(Gas~ error('A') + trend('Ad')+ season('M')),
)
fc_gas <- fit_gas |>
forecast(h = '2 year')
fc_gas |>
autoplot(gas_production |>
filter(year(Quarter) >= 2007),
level = NULL) +
geom_point(data = fc_gas,
aes( y =.mean, color = .model), size = 1.7) +
scale_color_manual(values = c("AAA" = "#D55E00", "AAM" = "#0072B2", "AAdM" = "#009E73")) +
guides(colour = guide_legend(override.aes = list(linewidth = 1.2))) +
labs(title = "Gas Production Forecast Comparison",
subtitle = "Comparing Additive vs Multiplicative Seasonality",
y = "Petajoules")
Careful look at the plot shows that all three models follow the same
quarterly seasonal cycle, but the multiplicative seasonal models fit the
data structure better than the additive one. Since the seasonal swings
increase with the level of gas production, I think AAM and
AAdM methods are more appropriate than AAA
method.
The AAdM method is slightly more cautious since the damped
trend prevents the forecast from growing constantly over time.
fit_gas |>
accuracy()
The results show that models with multiplicative seasonality perform
better than the additive model. The AAA method has the highest
errors, while AAM and AAdM both have lower and similar
RMSEs.
Among them, i would prefer AAdM AAdM provides
the best overall accuracy and produces more realistic forecasts due to
the damped trend.
For this exercise I would like to choose “Supermarket and grocery stores” time series.
supermarket <- aus_retail |>
filter(Industry == 'Supermarket and grocery stores') |>
summarise(Turnover = sum(Turnover))
#supermarket
This series have 441 monthly observations of overall Australian supermarket and grocery stores sales turnovers in a time range of April 1982 to November 2018.
a. Why is multiplicative seasonality necessary for this series?
supermarket |>
autoplot(Turnover)+
labs(title = 'Australian Supermarket and grocery stores sales',
y = 'Turnover ($ million AUD)')
The plot clearly shows that seasonal variations growing wider as the trend of series increases. This strongly suggests presence of multiplicative seasonality.
lambda_supermarket <- supermarket |>
features(Turnover, features = guerrero) |>
pull(lambda_guerrero)
p1 <- supermarket |>
autoplot(Turnover)+
labs(title = 'Supermarkets Turnover time series')
p2 <- supermarket |>
autoplot(box_cox(Turnover, lambda_supermarket))+
labs(y = 'BoxCox transformed', title = paste0('BoxCox transformed Turnover. Lambda = ', round(lambda_supermarket, 3), ')'))
p1/p2
After applying the Box–Cox transformation (lambda = 0.184), the variance becomes more stable and the seasonal fluctuations appear more constant over time. This suggests that the transformation successfully stabilizes the variance and transformed the series from multiplicative closer to additive behavior.
lambda_supermarket
## [1] 0.1842339
b. Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.
fit_retail <- supermarket |>
model(
MAM = ETS(Turnover ~ error("M") + trend("A") + season("M")),
MAdM = ETS(Turnover ~ error("M") + trend("Ad") + season("M"))
)
fit_retail |>tidy() |>
filter(term %in% c("alpha", "beta", "gamma", "phi")) |>
pivot_wider(names_from = .model, values_from = estimate)
fit_retail |>
forecast(h = '3 year') |>
autoplot(supermarket |> filter(Month >= yearmonth('2012 Jan')), level = NULL)
Estimated smoothing parameters for both models are quite similar. The
level smoothing parameter alpha is slightly smaller in the
damped model, while beta (trend smoothing) is a bit larger for
MAM model. Seasonal smoothing gamma is almost the same in both
models, suggesting the seasonal pattern is fairly stable. The damping
parameter phi = 0.98 in MAdM is very close to 1, which means
the trend is only very lightly damped and behaves almost like the
regular trend model.
Alss from the plot we don’t see significant difference for 3 years
fit_retail |>
accuracy()
c. Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?
fit_retail |>
accuracy()
Both models produce very similar results, but the MAM model
performs slightly better. Its RMSE (76.46) is a little bit lower than
the damped trend model MAdM (77.66), which means the regular
multiplicative Holt-Winters fits the data slightly better on the
training set.
The estimated smoothing parameters are also similar between models, and
the damping parameter phi = 0.98 is very close to 1, indicating
that trend damping very light. Because of this and the lower RMSE, the
MAM model is preferred.
d. Check that the residuals from the best method look like white noise.
fit_retail |>
select(MAM) |>
gg_tsresiduals()
## Warning: `gg_tsresiduals()` was deprecated in feasts 0.4.2.
## ℹ Please use `ggtime::gg_tsresiduals()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
From the plot above I can say that”
The variance of residuals is not constant and stable overall. As we see, variance of residuals gradually becomes smaller over the years.
Distribution of residuals appears to be close to normal and symmetric centered around mean zero.
ACF plot shows high significant autocorrelations above confidence
bounds, especially at seasonal lags. This suggests that some
autocorrelation remains in the residuals and the model has not fully
captured the structure of the data. The residual variance also appears
somewhat larger in earlier years and slightly smaller later, indicating
mild heteroscedasticity.
This suggests the MAM model did not capture all structure in the data, therefore the residuals are not fully white noise.
Let’s confirm it with Ljuang-Box test
fit_retail |>
augment() |>
filter(.model == "MAM") |> # Use your best model name here
features(.innov, ljung_box, lag = 24)
Since p-value is 0 (or less than 0.05) and Q value is very high (713.81), the Ljung-Box test rejecting the null hypothesis. This confirms that residuals are not white noise, and there is still some significant information or pattern left in the errors that the MAM model didn’t capture.
e. Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?
train_supermarket <- supermarket |>
filter(Month<=yearmonth('2010 Dec'))
test_supermarket <- supermarket |>
filter(Month> yearmonth('2010 Dec'))
fit_MAM_NAIVE <- train_supermarket |>
model(
MAM = ETS(Turnover~error('M') + trend('A') + season('M')),
S_NAIVE = SNAIVE(Turnover)
)
fc_MAM_NAIVE <- fit_MAM_NAIVE |>
forecast(test_supermarket)
fc_MAM_NAIVE |> autoplot(supermarket, level = NULL)
fc_MAM_NAIVE |>
accuracy(test_supermarket) |>
select(.model, RMSE, MAE, MAPE)
As we see from the forecasts plot and accuracy results MAM
model clearly outperforms the seasonal naive benchmark model on the test
set. The test RMSE for MAM is 764.86, while the seasonal naive
model has a much larger RMSE of 1563.51. Similar improvements appear in
MAE and MAPE as well.
This means the Holt-Winters multiplicative model captures the trend and
seasonality much better than simply repeating last year’s seasonal
pattern, so it is so far better model than seasonal naive approach.
fit_STL_ETS <- train_supermarket |>
model(
MAM = ETS(Turnover~ error('M') + trend('A')+ season('M')),
STL_ETS_periodic = decomposition_model(
STL(log(Turnover) ~ season(window = 'periodic')),
ETS(season_adjust ~ error('A') + trend('A') + season('N'))
),
STL_ETS_SW15 = decomposition_model(
STL(log(Turnover) ~ season(window = 15, degree = 0)),
ETS(season_adjust ~ error('A') + trend('A') + season('N'))
)
)
So here I applied 3 models:
MAM: the best model from previous exercise
STL_ETS_periodic: first applies a log transformation to stabilize variance, then uses STL decomposition to split the series into trend, seasonal, and remainder components. The seasonal component is estimated using a periodic seasonal window, meaning the seasonal pattern is assumed to repeat the same way every year. After removing seasonality, an ETS model with additive trend and no seasonal component is fitted to the seasonally adjusted data. Finally, forecasts from ETS are combined with the seasonal component and transformed back to the original scale.
STL_ETS_SW15 : this one has the same steps as STL_ETS_periodic, except it uses a seasonal smoothing window of 15, which lets the seasonal pattern to change slower over time. Instead of being fixed, the seasonal component is estimated with local smoothing, so seasonal effects can gradually evolve.
fc_STL_ETS <- fit_STL_ETS |>
forecast(test_supermarket)
fc_STL_ETS |>
accuracy(test_supermarket) |>
select(.model, RMSE, MAE, MAPE) |>
arrange(RMSE)
As we see STL_ETS_SW15 (model used log transformation, STL decomposition with seasonal window = 15)has the lowest RMSE (514.01) compare to another models including MAM model from previous exercise.
fc_STL_ETS |>
autoplot(supermarket |> filter(Month >= yearmonth('2000 Jan')), level = NULL) +
facet_wrap(~ .model, ncol = 1) +
guides(colour = "none") +
labs(title = "Model Comparison: Test Set Performance",
subtitle = "Data after 2012 Dec",
y = "Turnover")
From the plot we see that the MAM model underestimates the
future values and underfits the upward trend, so its forecasts stay
below the actual series.
Both STL_ETS models follow the structure of the data much better,
capturing the trend and seasonal movements more accurately. But
STL_ETS_SW15 produces forecasts closest to the observed values,
especially it alligns much better right before 2016.
Also STL_ETS_SW15 produced slightly overestimated forecasts after 2016, and I think it makes sense to try a damped additive trend inside the ETS part of the STL_ETS_SW15 model and try different values of the damping parameter phi to improve forecast points and accuracy.
Let’s try it below.
fit_STL_ETS_ETSdamped <- train_supermarket |>
model(
MAM = ETS(Turnover~ error('M') + trend('A')+ season('M')),
STL_ETS_periodic = decomposition_model(
STL(log(Turnover) ~ season(window = 'periodic')),
ETS(season_adjust ~ error('A') + trend('A') + season('N'))
),
STL_ETS_SW15 = decomposition_model(
STL(log(Turnover) ~ season(window = 15, degree = 0)),
ETS(season_adjust ~ error('A') + trend('A') + season('N'))
),
STL_ETS_SW15_Damped = decomposition_model(
STL(log(Turnover) ~ season(window = 15)),
ETS(season_adjust ~ error('A') + trend('Ad', phi = 0.993) + season('N'))
)
)
fit_STL_ETS_ETSdamped |> accuracy()
The STL models performs better than MAM model on
the training data. Setting the seasonal component to vary with window =
15 already improves RMSE substantially. Adding a damped trend slightly
improves the fit further, giving the lowest training RMSE (64.71).
This suggests that a lightly damped trend helps control the growth of
the series and produces a slightly better model.
fc_with_ETSdamped <- fit_STL_ETS_ETSdamped |>
forecast(test_supermarket)
fc_with_ETSdamped |>
accuracy(test_supermarket) |>
select(.model, RMSE, MAE, MAPE) |>
arrange(RMSE)
As we see the best result on the test data so far was obtained with
phi = 0.993. The new STL_ETS_SW15_Damped model
significantly improves forecast accuracy, reducing RMSE to 329.34,
compared with 514.01 for the STL_ETS_SW15 model. It also
performs much better than the original MAM model, which had
RMSE = 764.86.
This suggests that adding a damped trend to the ETS component can help
to control the trend growth and produces forecasts that align much
closer with the observed data.
fc_with_ETSdamped |>
autoplot(supermarket, level = NULL) +
facet_wrap(~ .model, ncol = 1) +
guides(colour = "none") +
labs(title = "Model comparison with test set",
subtitle = "Test set data after 2012 Dec",
y = "Turnover")
From the plot we see the new STL_ETS_SW15_Damped forecast linealigns with original data much better compared to other models.