DATA 624: Assignment 05
Setup
library(fpp3)
set.seed(314159)
Assignment 05
Do exercises 8.1, 8.5, 8.6, 8.7, 8.8, 8.9 in Hyndman. Please submit both the link to your Rpubs and the .pdf
file with your run code.
Exercise 8.1
- Consider the number of pigs slaughtered in Victoria, available in the aus_livestock dataset.
data <- aus_livestock |>
  filter(State == "Victoria" & Animal == "Pigs")
- Use the ETS() function to estimate the equivalent model for simple exponential smoothing. Find the optimal values of \(\alpha\) and \(\ell_0\), and generate forecasts for the next four months.
8.1.a Solution
The ETS() function is a formidable force. It not only chooses an optimal model, but optimizes the parameters as well. All we need to do is look over its shoulder.
fit <- data |>
  model(ets = ETS(Count))
report(fit)
Series: Count
Model: ETS(A,N,A)
Smoothing parameters:
alpha = 0.3579401
gamma = 0.0001000139
Initial states:
l[0] s[0] s[-1] s[-2] s[-3] s[-4] s[-5] s[-6]
95487.5 2066.235 6717.311 -2809.562 -1341.299 -7750.173 -8483.418 5610.89
s[-7] s[-8] s[-9] s[-10] s[-11]
-579.8107 1215.464 -2827.091 1739.465 6441.989
sigma^2: 60742898
AIC AICc BIC
13545.38 13546.26 13610.24
fit |>
  forecast(h = 4) |>
  autoplot(data)
# It's tiny. Let's zoom in a bit
fit |>
  forecast(h = 4) |>
  autoplot(data) +
  coord_cartesian(xlim = c(yearmonth("2015-01"), yearmonth("2020-01")))
The model chosen as the best fit is ETS(A,N,A): an additive error component and an additive seasonal component. The optimal values are \(\hat{\alpha}=0.3579\) and \(\hat{\gamma}=0.0001\), with initial level \(\hat{\ell}_0=95487.5\).
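One note: strictly speaking, simple exponential smoothing corresponds to ETS(A,N,N); the automatic search preferred a seasonal model here. If we wanted to force the SES-equivalent model rather than let ETS() search, a minimal sketch:
# force the SES-equivalent ETS(A,N,N) instead of letting ETS() search;
# report() will show its optimised alpha and initial level l[0]
fit_ses <- data |>
  model(ses = ETS(Count ~ error("A") + trend("N") + season("N")))
report(fit_ses)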
- Compute a 95% prediction interval for the first forecast using \(\hat{y}_1 \pm 1.96s\), where \(s\) is the standard deviation of the residuals. Compare your interval with the interval produced by R.
8.1.b Solution
f <- forecast(fit, h = 4)
r <- residuals(fit)
s <- sd(r$.resid)
pi95 <- f$.mean[1] + c(-1.96, 1.96) * s
f |>
  autoplot() +
  geom_hline(yintercept = pi95, color = "red") +
  labs(title = "95% Prediction Interval")
It matches the first (January 2019) 95% forecast interval from forecast() perfectly.
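Rather than just eyeballing the plot, we can pull R's interval with hilo() (from fabletools, which fpp3 loads) and compare the numbers directly; a quick sketch:
# extract R's 95% interval for the first forecast
f |>
  hilo(level = 95) |>
  slice(1) |>
  pull(`95%`)
# our manual interval
pi95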
Exercise 8.5
Data set global_economy contains the annual Exports from many countries. Select one country to analyse. Let's look at Ethiopia's exports. Alas, there isn't much data on Ethiopia, so let's try Japan instead.
data <- global_economy |>
  filter(Code == "JPN")
- Plot the Exports series and discuss the main features of the data.
8.5.a Solution
data |>
  autoplot(Exports) +
  labs(title = "Japan's Exports")
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_line()`).
It looks like there is a strong, positive trend with a humongous valley spanning approximately 1990-2007. This period is known as the Lost Decades. Just as Japan was climbing out of the Lost Decades, the 2008 financial crisis hit and exports dropped sharply again for several more years. Nonetheless, Japan's exports are on the rise.
- Use an ETS(A,N,N) model to forecast the series, and plot the forecasts.
8.5.b Solution
fit_ann <- data |>
  filter(!is.na(Exports)) |>
  model(ets = ETS(Exports ~ error("A") + trend("N") + season("N")))
# let's increase the forecast window so that it is more visible
fit_ann |>
  forecast(h = 10) |>
  autoplot(data)
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_line()`).
Wow, that forecast fans out like the bell of an instrument. Using the error component alone does not inspire much confidence in the prediction.
- Compute the RMSE values for the training data.
8.5.c Solution
report(fit_ann)
Series: Exports
Model: ETS(A,N,N)
Smoothing parameters:
alpha = 0.9414732
Initial states:
l[0]
10.638
sigma^2: 1.6541
AIC AICc BIC
263.1026 263.5555 269.2318
accuracy(fit_ann)
# A tibble: 1 × 11
Country .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
<fct> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Japan ets Training 0.104 1.26 0.883 0.258 7.26 0.981 0.990 0.00173
The RMSE is ~1.263.
- Compare the results to those from an ETS(A,A,N) model. (Remember that the trended model is using one more parameter than the simpler model.) Discuss the merits of the two forecasting methods for this data set.
8.5.d Solution
fit_aan <- data |>
  filter(!is.na(Exports)) |>
  model(ets = ETS(Exports ~ error("A") + trend("A") + season("N")))
fit_aan |>
  report() |>
  accuracy()
Series: Exports
Model: ETS(A,A,N)
Smoothing parameters:
alpha = 0.9094125
beta = 0.0001000003
Initial states:
l[0] b[0]
10.50228 0.1047452
sigma^2: 1.7047
AIC AICc BIC
266.7104 267.8868 276.9256
# A tibble: 1 × 11
Country .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
<fct> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Japan ets Training -0.00389 1.26 0.880 -0.702 7.30 0.978 0.987 0.0246
Let's put together a comparison table.

Model | AIC | AICc | BIC | RMSE |
---|---|---|---|---|
ETS(A,N,N) | 263.1 | 263.6 | 269.2 | 1.263 |
ETS(A,A,N) | 266.7 | 267.9 | 276.9 | 1.259 |
The ETS(A,N,N) model has a lower AIC, AICc, and BIC, but its RMSE is slightly higher than the ETS(A,A,N) model's. Which should win? Keep in mind that training RMSE almost always improves when a parameter is added, whereas the information criteria penalize the extra parameter, so AICc is usually the safer guide. My inclination is still to go with the lower RMSE, but that's probably because I understand RMSE better than I do AIC, AICc, and BIC.
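The comparison table above was assembled by hand; glance() can pull the same information criteria straight from the fitted mables. A small sketch using the fit_ann and fit_aan objects from earlier:
# gather both models' information criteria into one tibble
bind_rows(glance(fit_ann), glance(fit_aan)) |>
  mutate(model = c("ETS(A,N,N)", "ETS(A,A,N)")) |>
  select(model, AIC, AICc, BIC)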
- Compare the forecasts from both methods. Which do you think is best?
8.5.e Solution
fit_ann |>
  forecast(h = 10) |>
  autoplot(data) +
  labs(title = "ETS(A,N,N)") +
  coord_cartesian(ylim = c(5, 25))
fit_aan |>
  forecast(h = 10) |>
  autoplot(data) +
  labs(title = "ETS(A,A,N)") +
  coord_cartesian(ylim = c(5, 25))
Isn't it interesting that ETS(A,A,N) captures the apparent trend? I guess this is really no surprise, since ETS(A,N,N) does not have a trend component. I'm starting to lean in favor of the ETS(A,A,N) model: I like the fact that it captures the trend, and it seems like Japan is on the rise again. But I am not an economist! Please don't invest in the Japanese Yen just because I like ETS(A,A,N).
- Calculate a 95% prediction interval for the first forecast for each model, using the RMSE values and assuming normal errors. Compare your intervals with those produced using R.
8.5.f Solution
# ETS(A,N,N)
f_ann <- forecast(fit_ann, h = 10)
r_ann <- residuals(fit_ann)
s_ann <- sd(r_ann$.resid)
pi_ann <- f_ann$.mean[1] + c(-1.96, 1.96) * s_ann
f_ann |>
  autoplot() +
  geom_hline(yintercept = pi_ann, color = "red") +
  labs(title = "ETS(A,N,N): 95% Prediction Interval")
# ETS(A,A,N)
f_aan <- forecast(fit_aan, h = 10)
r_aan <- residuals(fit_aan)
s_aan <- sd(r_aan$.resid)
pi_aan <- f_aan$.mean[1] + c(-1.96, 1.96) * s_aan
f_aan |>
  autoplot() +
  geom_hline(yintercept = pi_aan, color = "red") +
  labs(title = "ETS(A,A,N): 95% Prediction Interval")
Both manual intervals line up with the first 95% prediction interval from forecast(). The point forecast line for ETS(A,N,N) is parallel to our horizontal interval lines, while the ETS(A,A,N) point forecast is not, because it follows the apparent trend in the data rather than running parallel to the x-axis.
Exercise 8.6
Forecast the Chinese GDP from the global_economy data set using an ETS model. Experiment with the various options in the ETS() function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.
[Hint: use a relatively large value of \(h\) when forecasting, so you can clearly see the differences between the various options when plotting the forecasts.]
Let’s isolate the Chinese GDP data.
data <- global_economy |>
  filter(Code == "CHN")
8.6 Solution
Let’s plot the data and see what we’ve got.
data |>
  autoplot(GDP) +
  labs(title = "Chinese GDP")
Oh, gosh! That is a curve begging for a Box-Cox style transformation; a logarithm should do. Let's try to straighten it out before we do anything else.
data_t <- data |>
  mutate(GDP = log(GDP))
data_t |>
  autoplot(GDP) +
  labs(title = "Chinese GDP: Log Transformed")
Wow, a Box-Cox transformation won't do much better than a straight-up log transformation. Let's try the ETS model with the log transformed data. It's very clear that the transformed trend is linear and steadily increasing, so we will specify an additive trend component. The seasonality, if there is any, is not clear, so we will leave the rest up to ETS().
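As a sanity check on the log (which is just Box-Cox with \(\lambda = 0\)), we could ask the Guerrero method what \(\lambda\) it would pick; a value near 0 would support the log. A quick sketch:
# Guerrero's method suggests an optimal Box-Cox lambda for GDP
data |>
  features(GDP, features = guerrero) |>
  pull(lambda_guerrero)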
fit_t <- data_t |>
  model(ets = ETS(GDP ~ trend("A"))) |>
  report()
Series: GDP
Model: ETS(A,A,N)
Smoothing parameters:
alpha = 0.9999
beta = 0.1079782
Initial states:
l[0] b[0]
24.77007 0.04320226
sigma^2: 0.0088
AIC AICc BIC
-33.07985 -31.92600 -22.77763
Okay, ETS() has chosen an ETS(A,A,N) model. Let's see what happens when we use a damped trend.
fit_td <- data_t |>
  model(ets = ETS(GDP ~ trend("Ad"))) |>
  report()
Series: GDP
Model: ETS(A,Ad,N)
Smoothing parameters:
alpha = 0.9998999
beta = 0.2480956
phi = 0.979999
Initial states:
l[0] b[0]
24.82881 0.003249391
sigma^2: 0.0091
AIC AICc BIC
-30.46070 -28.81364 -18.09804
Damping has had a negative effect on the fit: the AIC, AICc, and BIC have all increased (by roughly 2.6, 3.1, and 4.7 points, respectively).
Let's try operating on the original data. We'll let ETS() find the best fit.
fit <- data |>
  model(ets = ETS(GDP)) |>
  report()
Series: GDP
Model: ETS(M,A,N)
Smoothing parameters:
alpha = 0.9998998
beta = 0.3119984
Initial states:
l[0] b[0]
45713434615 3288256682
sigma^2: 0.0108
AIC AICc BIC
3102.064 3103.218 3112.366
Oh my goodness, ETS() has chosen an ETS(M,A,N) model. The information criteria scores have exploded: the AIC, AICc, and BIC have all increased by ~3000 points. To be fair, these scores are not directly comparable to the log-scale models, because the likelihood is computed on a different scale of the data. Either way, we are going to stick with the log transformed data and the ETS(A,A,N) model.
First we will plot the forecasts for the log transformed, non-dampened data.
fit_t |>
  forecast(h = 10) |>
  autoplot(data_t) +
  labs(title = "Chinese GDP: Log Transformed, Non-Damped")
Now we will plot the forecasts for the log transformed, dampened data.
fit_td |>
  forecast(h = 10) |>
  autoplot(data_t) +
  labs(title = "Chinese GDP: Log Transformed and Damped")
It looks as if the damped trend produces a less confident forecast: its fan of prediction intervals is wider. The non-damped trend has a narrower, more confident forecast. We shall stick with the non-damped trend on the log transformed data.
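One caveat: because we log transformed the data with mutate(), the forecasts above live on the log scale. fable can instead carry the transformation inside the model formula, in which case forecast() back-transforms automatically; a sketch:
# modelling log(GDP) directly; forecasts are back-transformed to dollars
data |>
  model(ets = ETS(log(GDP) ~ trend("A"))) |>
  forecast(h = 10) |>
  autoplot(data) +
  labs(title = "Chinese GDP: Forecasts on the Original Scale")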
Exercise 8.7
Find an ETS model for the Gas data from aus_production and forecast the next few years. Why is multiplicative seasonality necessary here? Experiment with making the trend damped. Does it improve the forecasts?
aus_production |>
  autoplot(Gas)
Ah, I remember this guy. I haven't searched for a model yet, but I can already see from the plot that the seasonal swings grow with the level of the series. For that reason, the seasonal component will be multiplicative. And it very clearly has a steep, positive linear trend from 1970 to 2010. The only real variable is damping, and since we've been asked to experiment with it, we shall.
8.7 Solution
fit <- aus_production |>
  model(ets = ETS(Gas)) |>
  report()
Series: Gas
Model: ETS(M,A,M)
Smoothing parameters:
alpha = 0.6528545
beta = 0.1441675
gamma = 0.09784922
Initial states:
l[0] b[0] s[0] s[-1] s[-2] s[-3]
5.945592 0.07062881 0.9309236 1.177883 1.074851 0.8163427
sigma^2: 0.0032
AIC AICc BIC
1680.929 1681.794 1711.389
A champion has been found. Sure enough, it has a multiplicative seasonal component as well as a multiplicative error component. Its trend is additive. The optimal values are \(\hat{\alpha}=0.6529\), \(\hat{\beta}=0.1442\), and \(\hat{\gamma}=0.0978\).
Let's try it with a damped trend component and see whether it still fits as well.
fit <- aus_production |>
  model(ets = ETS(Gas ~ error("M") + trend("Ad") + season("M"))) |>
  report()
Series: Gas
Model: ETS(M,Ad,M)
Smoothing parameters:
alpha = 0.6489044
beta = 0.1551275
gamma = 0.09369372
phi = 0.98
Initial states:
l[0] b[0] s[0] s[-1] s[-2] s[-3]
5.858941 0.09944006 0.9281912 1.177903 1.07678 0.8171255
sigma^2: 0.0033
AIC AICc BIC
1684.028 1685.091 1717.873
What do we see? The smoothing values, not surprisingly, have changed subtly. And all of the information criteria scores have slightly increased, but only by a small fraction (\(<0.4\%\)). So damping does not improve the fit here; let's also look at the forecasts themselves, as sketched below.
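The exercise also asks whether damping improves the forecasts, not just the fit. Plotting both models' point forecasts together makes the difference visible; a sketch (the filter just zooms in on recent years):
# fit both trend variants in one mable and compare their point forecasts
aus_production |>
  model(
    `ETS(M,A,M)`  = ETS(Gas ~ error("M") + trend("A") + season("M")),
    `ETS(M,Ad,M)` = ETS(Gas ~ error("M") + trend("Ad") + season("M"))
  ) |>
  forecast(h = "3 years") |>
  autoplot(filter(aus_production, year(Quarter) >= 2000), level = NULL)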
Exercise 8.8
Recall your retail time series data (from Exercise 7 in Section 2.10).
data <- aus_retail |>
  filter(`Series ID` == sample(aus_retail$`Series ID`, 1))
Let’s plot the data and see what we have.
data |>
  autoplot(Turnover) +
  labs(title = "Retail Turnover")
- Why is multiplicative seasonality necessary for this series?
8.8.a Solution
The amplitude of the seasonal component increases as the series grows, which is a clear indicator that the seasonal component is multiplicative. But I do wonder whether multiplicative is ideal for the entire series: the amplitude does grow from the beginning of the series to ~2010, but from ~2010 on it is fairly constant.
- Apply Holt-Winters’ multiplicative method to the data. Experiment with making the trend damped.
8.8.b Solution
First, we shall try the multiplicative method without damping.
data |>
  model(ets = ETS(Turnover ~ error("M") + trend("A") + season("M"))) |>
  report()
Series: Turnover
Model: ETS(M,A,M)
Smoothing parameters:
alpha = 0.5276247
beta = 0.0001000352
gamma = 0.08266459
Initial states:
l[0] b[0] s[0] s[-1] s[-2] s[-3] s[-4] s[-5]
4.760311 0.06749661 0.9355303 0.8488176 0.9066948 1.470268 1.01972 0.9194645
s[-6] s[-7] s[-8] s[-9] s[-10] s[-11]
0.9372326 1.006957 1.000908 1.004499 1.028769 0.9211401
sigma^2: 0.0056
AIC AICc BIC
2838.026 2839.473 2907.540
And now we shall try Holt-Winters with a damped trend.
data |>
  model(ets = ETS(Turnover ~ error("M") + trend("Ad") + season("M"))) |>
  report()
Series: Turnover
Model: ETS(M,Ad,M)
Smoothing parameters:
alpha = 0.5510153
beta = 0.000102758
gamma = 0.07987571
phi = 0.9786211
Initial states:
l[0] b[0] s[0] s[-1] s[-2] s[-3] s[-4] s[-5]
5.065674 0.07156032 0.9354592 0.8503412 0.8922841 1.474074 0.9996284 0.9053009
s[-6] s[-7] s[-8] s[-9] s[-10] s[-11]
0.9452715 1.005326 1.011572 1.010507 1.042879 0.927357
sigma^2: 0.0057
AIC AICc BIC
2844.284 2845.905 2917.887
The model with the damped trend has a higher AIC, AICc, and BIC. So, our champion is ETS(M,A,M).
Let's see what ETS() finds if we don't specify the component types. And then we will compare.
data |>
  model(ets = ETS(Turnover)) |>
  report()
Series: Turnover
Model: ETS(M,A,M)
Smoothing parameters:
alpha = 0.5276247
beta = 0.0001000352
gamma = 0.08266459
Initial states:
l[0] b[0] s[0] s[-1] s[-2] s[-3] s[-4] s[-5]
4.760311 0.06749661 0.9355303 0.8488176 0.9066948 1.470268 1.01972 0.9194645
s[-6] s[-7] s[-8] s[-9] s[-10] s[-11]
0.9372326 1.006957 1.000908 1.004499 1.028769 0.9211401
sigma^2: 0.0056
AIC AICc BIC
2838.026 2839.473 2907.540
Uh oh, it looks like ETS() has chosen ETS(M,A,M). This doesn't look good for the model with damping.
Let’s compare the information criteria scores:
Model | AIC | AICc | BIC |
---|---|---|---|
ETS(M,A,M) | 2838 | 2839 | 2908 |
ETS(M,Ad,M) | 2844 | 2846 | 2918 |
The model with the damped trend has a higher AIC, AICc, and BIC. So, our champion indeed is ETS(M,A,M).
- Compare the RMSE of the one-step forecasts from the two methods. Which do you prefer?
8.8.c Solution
Okay, let's compare the RMSE of the one-step forecasts from the two methods. One-step? I know accuracy() calculates the RMSE, but for a single step? Let's see. Ah, from the accuracy() documentation: "Accuracy measures can be computed directly from models as the one-step-ahead fitted residuals are available."
# Non-damped
data |>
  model(ets = ETS(Turnover ~ error("M") + trend("A") + season("M"))) |>
  accuracy()
# A tibble: 1 × 12
State Industry .model .type ME RMSE MAE MPE MAPE MASE RMSSE
<chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Tasmania Electrical… ets Trai… -0.0114 1.33 0.944 -0.495 5.62 0.458 0.480
# ℹ 1 more variable: ACF1 <dbl>
# Damped
data |>
  model(ets = ETS(Turnover ~ error("M") + trend("Ad") + season("M"))) |>
  accuracy()
# A tibble: 1 × 12
State Industry .model .type ME RMSE MAE MPE MAPE MASE RMSSE
<chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Tasmania Electrical a… ets Trai… 0.0925 1.34 0.947 0.194 5.61 0.459 0.480
# ℹ 1 more variable: ACF1 <dbl>
The RMSE of the non-damped model is 1.3333, while the RMSE of the damped model is 1.3351. So close! But the non-damped model has a slightly lower RMSE, so, if that is all I am basing my decision on, then I will opt for the non-damped model.
- Check that the residuals from the best method look like white noise.
8.8.d Solution
data |>
  model(ets = ETS(Turnover ~ error("M") + trend("A") + season("M"))) |>
  gg_tsresiduals()
Wow, the residuals look very close to normally distributed. The ACF shows a few lags where the correlation pokes just past the significance bounds, but only barely. And the residual line plot looks like white noise. Let's confirm by checking the mean.
c <- data |>
  model(ets = ETS(Turnover ~ error("M") + trend("A") + season("M"))) |>
  components()
mean(c$remainder, na.rm = TRUE)
[1] 0.0004808833
Alright! The mean is tiny. I’m comfortable calling it white noise.
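Eyeballing the plots is a fine start; a Ljung-Box test gives a formal check. A sketch (lag = 24 is a common choice for monthly data):
# formal white-noise check on the innovation residuals;
# a large p-value means no evidence of remaining autocorrelation
data |>
  model(ets = ETS(Turnover ~ error("M") + trend("A") + season("M"))) |>
  augment() |>
  features(.innov, ljung_box, lag = 24)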
- Now find the test set RMSE, while training the model to the end of 2010. Can you beat the seasonal naïve approach from Exercise 7 in Section 5.11?
8.8.e Solution
data |>
  filter(year(Month) < 2011) |>
  model(ets = ETS(Turnover ~ error("M") + trend("A") + season("M"))) |>
  accuracy()
# A tibble: 1 × 12
State Industry .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
<chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Tasm… Electri… ets Trai… 0.0293 1.14 0.823 -0.522 5.96 0.482 0.511 0.0471
The RMSE reported above is ~1.144, though note from the .type column that this is the training RMSE of the model fitted through 2010 (see the sketch below for a true test-set evaluation). The seasonal naïve approach from Exercise 7 in Section 5.11 had an RMSE of 2.24. So, yes indeed! We can beat the seasonal naïve approach.
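A genuine test-set RMSE requires forecasting the held-out months and scoring the forecasts against them; a sketch, assuming the series runs through the end of aus_retail (2018):
# train through 2010, forecast the held-out months, score on the test set
train <- data |>
  filter(year(Month) < 2011)
fit_hw <- train |>
  model(ets = ETS(Turnover ~ error("M") + trend("A") + season("M")))
fit_hw |>
  forecast(h = "8 years") |>
  accuracy(data)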
Exercise 8.9
For the same retail data, try an STL decomposition applied to the Box-Cox transformed series, followed by ETS on the seasonally adjusted data. How does that compare with your best previous forecasts on the test set?
8.9 Solution
Find an optimized lambda for the Box-Cox transformation.
lambda <- data |>
  features(Turnover, features = guerrero) |>
  pull(lambda_guerrero)
data_t <- data |>
  mutate(Turnover = box_cox(Turnover, lambda))
data_t |>
  autoplot(Turnover)
Well, welly, welly, that's done a pretty good job. Let's try the STL decomposition. Actually, I'm not sure I understand what is being asked. What is the STL decomposition for? We already have the transformed data. Let's walk through it; perhaps it will make sense.
decomped <- data_t |>
  model(stl = STL(Turnover)) |>
  components()
Okay, I see the season_adjust data. But why aren't we trying to use the S in ETS? All will be revealed. We shall use the season_adjust data with ETS.
decomped |>
  select(season_adjust) |>
  model(ets = ETS(season_adjust ~ error("M") + trend("A"))) |>
  accuracy()
# A tibble: 1 × 10
.model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 ets Training -0.00307 0.0930 0.0710 -0.156 2.31 0.396 0.415 0.0566
The RMSE from 8.8.e is ~1.144. Our RMSE, calculated from the seasonally adjusted, Box-Cox transformed data, is 0.093. Ladies and gentlemen, we have a new champion, with one asterisk noted below. Well, all was revealed. There is wisdom behind the ways of our wily professors.
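The asterisk: the 0.093 RMSE is measured on the Box-Cox scale, while the ~1.144 from 8.8.e is on the original Turnover scale, so the two are not directly comparable. For an apples-to-apples test-set comparison we could wrap the pipeline in decomposition_model(), which models the seasonally adjusted series with ETS, the seasonal component with SNAIVE (its default), and returns forecasts re-seasonalized and back-transformed to the original scale; a sketch:
# STL on the Box-Cox scale, ETS on season_adjust, forecasts on the raw scale
data |>
  filter(year(Month) < 2011) |>
  model(
    stl_ets = decomposition_model(
      STL(box_cox(Turnover, lambda)),
      ETS(season_adjust ~ error("M") + trend("A"))
    )
  ) |>
  forecast(h = "8 years") |>
  accuracy(data)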