1-) Consider the number of pigs slaughtered in Victoria, available in the aus_livestock dataset.
a. Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of α and ℓ0, and generate forecasts for the next four months.
library(fpp3)
library(ggplot2)
# Use the ETS() function to estimate the equivalent model for simple exponential smoothing
fit <- aus_livestock |>
  filter(State == "Victoria", Animal == "Pigs") |>
  model(ETS(Count ~ error("A") + trend("N") + season("N")))
# Generate forecasts for the next four months
fc <- fit |>
  forecast(h = 4)
fc |>
  autoplot(aus_livestock) +
  ggtitle("Number of Pigs Slaughtered in Victoria")
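The optimal values of α and ℓ0 are not printed above; a minimal way to inspect them from the fitted model, using the fit object from the chunk above:
# Report the estimated ETS(A,N,N) parameters, including the optimal
# smoothing parameter alpha and the initial level l[0]
fit |> report()
# Or extract the estimates as a tidy table of coefficients
fit |> tidy()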
# A fable: 4 x 6 [1M]
# Key: Animal, State, .model [1]
Animal State .model Month
<fct> <fct> <chr> <mth>
1 Pigs Victoria "ETS(Count ~ error(\"A\") + trend(\"N\") + season(… 2019 Jan
2 Pigs Victoria "ETS(Count ~ error(\"A\") + trend(\"N\") + season(… 2019 Feb
3 Pigs Victoria "ETS(Count ~ error(\"A\") + trend(\"N\") + season(… 2019 Mar
4 Pigs Victoria "ETS(Count ~ error(\"A\") + trend(\"N\") + season(… 2019 Apr
# ℹ 2 more variables: Count <dist>, .mean <dbl>
b. Compute a 95% prediction interval for the first forecast using ŷ ± 1.96s, where s is the standard deviation of the residuals. Compare your interval with the interval produced by R.
# Standard deviation of the residuals
s <- residuals(fit)$.resid |> sd()
y_hat <- fc$.mean[1]
# Computing y_hat ± 1.96s
lower_limit <- y_hat - 1.96 * s
upper_limit <- y_hat + 1.96 * s
cat("The prediction interval with y_hat ± 1.96s is [", lower_limit, ", ", upper_limit, "]\n")
The prediction interval with y_hat ± 1.96s is [ 76871.01 , 113502.1 ]
# Interval produced by R
fc |> hilo(95) |> pull('95%') |> head(1)
<hilo[1]>
[1] [76854.79, 113518.3]95
Comment: the 95% prediction interval for the first forecast computed with ŷ ± 1.96s is slightly narrower than the one produced by R (its lower limit is higher and its upper limit is lower). This is likely because R bases its interval on the model's estimated forecast variance, which accounts for the degrees of freedom used in estimating the parameters, rather than on the raw standard deviation of the residuals.
2-) Data set global_economy contains the annual Exports from many countries. Select one country to analyse.
a. Plot the Exports series and discuss the main features of the data.
global_economy |>
  filter(Country == "Benin") |>
  autoplot(Exports) +
  labs(y = "% of GDP", title = "Exports: Benin")
Comment: Overall, there is an increasing trend throughout the years, with no apparent seasonality. Exports rose sharply for almost a decade after 1960, when the country became independent from France. From 1970 through 1990 the increase settled into a steadier trend, and the same holds from 1991 through 2010.
b. Use an ETS(A,N,N) model to forecast the series, and plot the forecasts.
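No code is shown for this part; a minimal sketch of how the ETS(A,N,N) fit and forecasts might be produced for the Benin series (the object name fit_ann and the 5-year horizon are my choices):
# Fit simple exponential smoothing, i.e. ETS(A,N,N), to Benin's exports
fit_ann <- global_economy |>
  filter(Country == "Benin") |>
  model(ANN = ETS(Exports ~ error("A") + trend("N") + season("N")))
# Forecast the next few years and plot them against the series
fit_ann |>
  forecast(h = 5) |>
  autoplot(global_economy |> filter(Country == "Benin")) +
  labs(y = "% of GDP", title = "Exports: Benin, ETS(A,N,N) forecasts")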
d. Compare the results to those from an ETS(A,A,N) model. (Remember that the trended model is using one more parameter than the simpler model.) Discuss the merits of the two forecasting methods for this data set.
ETS(A,A,N) Merits:
Captures Trend: This model includes an additive trend component, making it suitable for time series data with a linear trend. It can effectively model changes in the level of the series over time.
Flexibility: The additive trend allows for more flexibility in forecasting, as it can adapt to changes in the trend over time.
Benchmark for Trended Series: Due to its ability to capture trends, ETS(A,A,N) is often considered a good benchmark for trended time series.
ETS(A,N,N) Merits:
Simplicity: This model only includes an additive error component, with no trend or seasonality, so it is easy to implement and understand. However, our data set shows a clear trend, and not capturing it in the model would make the forecasts less accurate.
Suitable for Stable Series: ETS(A,N,N) is ideal for time series that do not exhibit any trend or seasonal patterns, where the level remains relatively constant over time; that is not the case for this data set.
Efficient: The simplicity of the model makes it computationally efficient, which is advantageous when dealing with large data sets or when quick forecasts are required.
e. Compare the forecasts from both methods. Which do you think is best?
The RMSE is lower for ETS(A,A,N), which suggests that this model forecasts better. I believe the ETS(A,A,N) method is the better choice, since it captures the increasing trend in the data and has the lower RMSE.
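The comparison code is not shown; a sketch of how both models might be fitted side by side and compared (the names benin, fit_both, ANN and AAN are mine):
benin <- global_economy |> filter(Country == "Benin")
# Fit both candidate models on the same series
fit_both <- benin |>
  model(
    ANN = ETS(Exports ~ error("A") + trend("N") + season("N")),
    AAN = ETS(Exports ~ error("A") + trend("A") + season("N"))
  )
# Compare the training RMSE of the one-step fits
fit_both |> accuracy() |> select(.model, RMSE)
# Compare the forecasts visually
fit_both |>
  forecast(h = 5) |>
  autoplot(benin, level = NULL) +
  labs(y = "% of GDP", title = "Exports: Benin, ETS(A,N,N) vs ETS(A,A,N)")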
f. Calculate a 95% prediction interval for the first forecast for each model, using the RMSE values and assuming normal errors. Compare your intervals with those produced using R.
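The interval code is not shown either; a sketch of the manual calculation for the ETS(A,N,N) model, reusing fit_both and benin from the sketch above:
# Training RMSE of the ETS(A,N,N) model
rmse_ann <- fit_both |> accuracy() |> filter(.model == "ANN") |> pull(RMSE)
# First forecast and its 95% interval assuming normal errors
fc_ann <- fit_both |> forecast(h = 5) |> filter(.model == "ANN")
y_hat_ann <- fc_ann$.mean[1]
c(lower = y_hat_ann - 1.96 * rmse_ann, upper = y_hat_ann + 1.96 * rmse_ann)
# Interval produced by R for the same forecast
fc_ann |> hilo(95) |> pull("95%") |> head(1)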
- The prediction interval using the ETS(A,N,N) method:
The prediction interval with y_hat_bj_2 ± 1.96s is [ 22.07896 , 32.6629 ]
# Interval produced by R
fc_bj_2 |> hilo(95) |> pull('95%') |> head(1)
<hilo[1]>
[1] [21.93391, 32.80796]95
Comment:
- ETS(A,N,N) vs. R: my interval computed from the RMSE is slightly narrower than the one produced by R; the spread (upper limit minus lower limit) is 10.5 versus 10.72.
- ETS(A,A,N) vs. R: likewise, my interval computed from the RMSE is narrower than the one produced by R; the spread is 10.58 versus 10.98.
3-) Forecast the Chinese GDP from the global_economy data set using an ETS model. Experiment with the various options in the ETS() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.
[Hint: use a relatively large value of h when forecasting, so you can clearly see the differences between the various options when plotting the forecasts.]
Comment: The Chinese GDP exhibits an upward trend mainly after 1990 without any seasonality. Therefore, using additive errors and “N” for seasonality would be appropriate.
Comment: Experimenting with the damping parameter phi in the damped trend models was instructive, as it controls how strongly the trend is flattened in the forecasts. With phi set to 0.95 the forecasts were noticeably larger; given the very large GDP values, I opted for phi = 0.8 to keep them more realistic. Log and Box-Cox transformations without damping tended to over-forecast substantially as the forecast horizon increased. The damped model on the log scale and the damped Holt's method with phi = 0.8 behaved similarly, and the damped Holt's method without any transformation produced forecasts between those of Holt's method and simple exponential smoothing.
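The code for these experiments is not shown; a sketch of how the different options might be set up (the model names, phi = 0.8, and the 30-year horizon are my choices):
china <- global_economy |> filter(Country == "China")
# Box-Cox parameter chosen by the Guerrero method
lambda_cn <- china |> features(GDP, guerrero) |> pull(lambda_guerrero)
fit_china <- china |>
  model(
    Holt = ETS(GDP ~ error("A") + trend("A") + season("N")),
    Damped = ETS(GDP ~ error("A") + trend("Ad", phi = 0.8) + season("N")),
    BoxCox = ETS(box_cox(GDP, lambda_cn) ~ error("A") + trend("A") + season("N")),
    DampedBoxCox = ETS(box_cox(GDP, lambda_cn) ~ error("A") + trend("Ad", phi = 0.8) + season("N"))
  )
# A long horizon makes the differences between the options visible
fit_china |>
  forecast(h = 30) |>
  autoplot(china, level = NULL) +
  labs(title = "Chinese GDP: ETS forecasts under different options")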
4-) Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?
Comment: Multiplicative seasonality is essential in this case because the seasonal pattern’s variation seems to be proportional to the time series level. As the trend increases, the amplitude of the seasonal pattern also grows.
Comment: Although the RMSE improved with the damped multiplicative method, its AIC was slightly worse. When deciding whether damping enhanced the model, you can base your judgment either on the RMSE or on the AIC, AICc, and BIC.
Since the three fitted methods have roughly the same number of parameters to estimate, using RMSE for model comparison is reasonable; on that basis, the damped multiplicative method shows a slight improvement.
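No code is shown for this part; a sketch of the multiplicative Holt-Winters model and its damped variant on the Gas series (the model names and the 3-year horizon are mine):
fit_gas <- aus_production |>
  model(
    HWmult = ETS(Gas ~ error("M") + trend("A") + season("M")),
    HWdamped = ETS(Gas ~ error("M") + trend("Ad") + season("M"))
  )
# Compare training RMSE and information criteria
fit_gas |> accuracy() |> select(.model, RMSE)
fit_gas |> glance() |> select(.model, AIC, AICc, BIC)
# Forecast the next few years
fit_gas |>
  forecast(h = "3 years") |>
  autoplot(aus_production, level = NULL) +
  labs(title = "Australian gas production: ETS forecasts")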
5-) Recall your retail time series data (from Exercise 7 in Section 2.10).
a. Why is multiplicative seasonality necessary for this series?
Multiplicative seasonality is essential in this context because the seasonal pattern’s variation seems to be proportional to the time series level. As the trend increases over time, the amplitude of the seasonality also grows.
b. Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.
Comment: The damped method appears to be more stable, whereas the multiplicative method shows a greater increase in trend and results in higher forecasts.
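The fitting code is not shown; a sketch assuming myseries has already been created as in Exercise 7 of Section 2.10 (the model name damped and the 3-year horizon are mine; multi matches the output later in this exercise):
# myseries is assumed to be the retail series created in Exercise 7 (Section 2.10)
fit_hw <- myseries |>
  model(
    multi = ETS(Turnover ~ error("M") + trend("A") + season("M")),
    damped = ETS(Turnover ~ error("M") + trend("Ad") + season("M"))
  )
fit_hw |>
  forecast(h = "3 years") |>
  autoplot(myseries, level = NULL) +
  labs(title = "Retail turnover: Holt-Winters multiplicative vs damped")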
c. Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?
d. Check that the residuals from the best method look like white noise.
# A tibble: 1 × 5
State Industry .model lb_stat lb_pvalue
<chr> <chr> <chr> <dbl> <dbl>
1 Tasmania Cafes, restaurants and takeaway food servic… multi… 28.0 0.260
Comment: For both tests, the results are not significant at the 0.05 level, so we cannot reject the null hypothesis of uncorrelated residuals; the residuals are indistinguishable from a white noise series.
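The code behind these checks is not shown; a sketch of how they might be run on the multiplicative model, reusing fit_hw from the sketch above (lag = 24 is a common choice for monthly data):
# Residual diagnostic plots for the multiplicative model
fit_hw |> select(multi) |> gg_tsresiduals()
# Ljung-Box and Box-Pierce tests on the innovation residuals
fit_hw |> select(multi) |> augment() |> features(.innov, ljung_box, lag = 24)
fit_hw |> select(multi) |> augment() |> features(.innov, box_pierce, lag = 24)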
e. Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?
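The code that produced the accuracy tables below is not shown; a sketch of how the training split and the two models might be set up (the names myseries_train, fit and fc follow their later usage; multi and snaive match the output):
# Training data up to the end of 2010
myseries_train <- myseries |> filter(year(Month) <= 2010)
# Holt-Winters multiplicative model and the seasonal naive benchmark
fit <- myseries_train |>
  model(
    multi = ETS(Turnover ~ error("M") + trend("A") + season("M")),
    snaive = SNAIVE(Turnover)
  )
# Forecast over the test period and compute training accuracy
fc <- fit |> forecast(new_data = anti_join(myseries, myseries_train))
fit |> accuracy() |> select(.type, .model, RMSE)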
# A tibble: 2 × 3
.type .model RMSE
<chr> <chr> <dbl>
1 Training multi 1.18
2 Training snaive 2.90
fc |> accuracy(myseries) |> select(.type, .model, RMSE)
# A tibble: 2 × 3
.type .model RMSE
<chr> <chr> <dbl>
1 Test multi 3.99
2 Test snaive 9.13
Comment: The multiplicative method forecasts the data more effectively: its test-set RMSE is considerably lower than that of the seasonal naïve approach, indicating that it is the more suitable method here.
6-) For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?
lambda <- myseries_train |>
  features(Turnover, guerrero) |>
  pull(lambda_guerrero)
# STL decomposition applied to the Box-Cox transformed data
myseries_train |>
  model(STL(box_cox(Turnover, lambda) ~ season(window = "periodic"), robust = TRUE)) |>
  components() |>
  autoplot() +
  ggtitle("STL with Box-Cox")
# Compute the seasonally adjusted data: STL decomposition applied to the Box-Cox transformed data
dcmp <- myseries_train |>
  model(STL_box = STL(box_cox(Turnover, lambda) ~ season(window = "periodic"), robust = TRUE)) |>
  components()
# Replace Turnover with the seasonally adjusted data
myseries_train$Turnover <- dcmp$season_adjust
# Model the seasonally adjusted data
fit <- myseries_train |>
  model(ETS(Turnover ~ error("M") + trend("A") + season("M")))
# Check the residuals
fit |>
  gg_tsresiduals() +
  ggtitle("Residual Plots for Australian Retail Turnover")
# Produce forecasts for the test data (join by key and index only, since
# Turnover in myseries_train was replaced by the seasonally adjusted values)
fc <- fit |> forecast(new_data = anti_join(myseries, myseries_train, by = c("State", "Industry", "Month")))
fit |> accuracy() |> select(.model, .type, RMSE)
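To answer the comparison question, one could also look at the test-set accuracy of these forecasts; note that fc here is on the Box-Cox, seasonally adjusted scale while myseries holds the original Turnover, so the resulting RMSE is only indicative and not directly comparable to the earlier test-set values without back-transforming.
# Test-set accuracy for comparison with the earlier multiplicative and seasonal naive results
fc |> accuracy(myseries) |> select(.model, .type, RMSE)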