# A tibble: 12 × 11
.model Country .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Drift Australia Test 29.1 35.5 29.1 7.23 7.23 NaN NaN 0.210
2 Drift Canada Test 33.3 37.2 33.3 6.09 6.09 NaN NaN -0.229
3 Drift Japan Test 14.7 17.9 14.7 2.44 2.44 NaN NaN -0.229
4 Drift USA Test 75.9 76.2 75.9 12.7 12.7 NaN NaN -0.561
5 Mean Australia Test 35.7 42.3 35.7 8.89 8.89 NaN NaN 0.216
6 Mean Canada Test 90.4 92.9 90.4 16.7 16.7 NaN NaN -0.0799
7 Mean Japan Test 100. 101. 100. 16.8 16.8 NaN NaN -0.534
8 Mean USA Test 82.9 83.3 82.9 13.9 13.9 NaN NaN -0.423
9 Naive Australia Test 34.7 41.5 34.7 8.64 8.64 NaN NaN 0.216
10 Naive Canada Test 46.2 51.0 46.2 8.46 8.46 NaN NaN -0.0799
11 Naive Japan Test 36.3 37.8 36.3 6.06 6.06 NaN NaN -0.534
12 Naive USA Test 82.1 82.5 82.1 13.8 13.8 NaN NaN -0.423
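For reference, here is a rough sketch of how an accuracy table like this could be produced (the four-year hold-out and the names fits and fc are assumptions; hh_budget.train and acc match the names used below):

library(fpp3)
# Hold out the last four years of each country's series as a test set (assumed split)
hh_budget.train <- hh_budget |> filter(Year <= max(Year) - 4)
# Fit the three benchmark methods to Wealth for every country
fits <- hh_budget.train |>
  model(Mean = MEAN(Wealth), Naive = NAIVE(Wealth), Drift = RW(Wealth ~ drift()))
# Forecast the held-out years and measure test-set accuracy
fc <- fits |> forecast(h = 4)
acc <- fc |> accuracy(hh_budget)
acc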
print(paste0("Average accuracy of drift (RMSE): ", trunc(100*sum(acc[1:4, ]$RMSE/4))/100))
[1] "Average accuracy of drift (RMSE): 41.69"
print(paste0("Average accuracy of mean (RMSE): ", trunc(100*sum(acc[5:8, ]$RMSE/4))/100))
[1] "Average accuracy of mean (RMSE): 79.86"
print(paste0("Average accuracy of naive (RMSE): ", trunc(100*sum(acc[9:12, ]$RMSE/4))/100))
[1] "Average accuracy of naive (RMSE): 53.21"
Across all four countries, the model that performed best (in terms of RMSE) was the drift method, with its average accuracy listed above. The mean model performed worst.
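The same per-model averages could also be computed with a grouped summary instead of row indexing (a sketch, assuming acc is the accuracy tibble shown above):

library(dplyr)
# Average test-set RMSE across the four countries, one row per method
acc |>
  group_by(.model) |>
  summarise(avg_RMSE = mean(RMSE)) |>
  arrange(avg_RMSE)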
(d). Check the residuals of your preferred method. Do they resemble white noise?
hh_budget.train[hh_budget.train$Country =="Australia", ] |>model(RW(Wealth ~drift())) |>gg_tsresiduals() +labs(title ="Residuals for Australia")
hh_budget.train[hh_budget.train$Country =="Canada", ] |>model(RW(Wealth ~drift())) |>gg_tsresiduals() +labs(title ="Residuals for Canada")
hh_budget.train[hh_budget.train$Country =="Japan", ] |>model(RW(Wealth ~drift())) |>gg_tsresiduals() +labs(title ="Residuals for Japan")
hh_budget.train[hh_budget.train$Country =="USA", ] |>model(RW(Wealth ~drift())) |>gg_tsresiduals() +labs(title ="Residuals for USA")
The residuals somewhat resemble white noise, particularly for the first three countries (Australia, Canada, and Japan), whose residual histograms show values scattered around 0. The histogram for the USA, however, has a strong negative skew, suggesting a non-normal distribution. Autocorrelation is low for all four countries, so the drift model captures the temporal structure of the data reasonably well. A log transformation might be in order!
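A more formal check of the white-noise assumption could use a Ljung-Box test on the innovation residuals (a sketch; the choice of lag = 10 is an assumption):

# Ljung-Box test for each country's drift-model residuals;
# large p-values are consistent with white noise
hh_budget.train |>
  model(Drift = RW(Wealth ~ drift())) |>
  augment() |>
  features(.innov, ljung_box, lag = 10)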
11. Analysis of Bricks data (from aus_production)
(a). Use an STL decomposition to calculate the trend-cycle and seasonal indices
Above was about the best I could do to keep the remainder “small” while keeping the seasonality “regular”. It feels as though an additional, broader seasonality is reflected in the remainder. A large trend window was chosen to smooth the trend-cycle, as there are roughly 200 observations to work with.
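A sketch of the kind of STL specification described here (the specific window values are assumptions, not necessarily the ones used above):

# Drop the trailing missing quarters of Bricks, then decompose with STL:
# a large trend window smooths the trend-cycle, a moderate seasonal window
# keeps the seasonal indices fairly regular
bricks <- aus_production |>
  filter(!is.na(Bricks)) |>
  select(Quarter, Bricks)
stl_fit <- bricks |>
  model(STL(Bricks ~ trend(window = 21) + season(window = 11)))
components(stl_fit) |> autoplot()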
(b). Compute and plot the seasonally adjusted data
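One way to plot the seasonally adjusted series against the original (a sketch, reusing the stl_fit object from the decomposition sketch above):

# Overlay the seasonally adjusted series (blue) on the raw data (grey)
components(stl_fit) |>
  as_tsibble() |>
  autoplot(Bricks, colour = "grey") +
  geom_line(aes(y = season_adjust), colour = "blue") +
  labs(title = "Seasonally adjusted brick production")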
The results are not dramatically better, but they are still an improvement over the non-robust STL decomposition: the autocorrelations are slightly smaller at most lags, and the residual histogram is less skewed.
(g). Compare forecasts from decomposition_model() with those from SNAIVE(), using a test set comprising the last 2 years of data
# A tibble: 2 × 10
.model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 snaive Test 17.2 30.6 24.8 4.18 6.06 NaN NaN -0.236
2 stlf Test 5.25 19.5 14.4 1.29 3.54 NaN NaN -0.104
The STL decomposition model is considerably more accurate than the seasonal naive model, in both RMSE and MAE. Wow!
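For reference, a minimal sketch of how this comparison could be set up, reusing the bricks series and STL windows from the sketches above (the model names match the table; everything else is an assumption):

# Hold out the last 2 years (8 quarters) as a test set
bricks_train <- bricks |> filter(Quarter <= max(Quarter) - 8)
# STL decomposition forecast vs a plain seasonal naive model
fits <- bricks_train |>
  model(
    stlf = decomposition_model(
      STL(Bricks ~ trend(window = 21) + season(window = 11)),
      NAIVE(season_adjust)
    ),
    snaive = SNAIVE(Bricks)
  )
fits |> forecast(h = 8) |> accuracy(bricks)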