5.1) Produce forecasts for the following series using whichever of NAIVE(y), SNAIVE(y) or RW(y ~ drift()) is more appropriate in each case:
Australian Population (global_economy)
As the data trend shows an increase with no seasonal components, Drift method would be ideal in forecasting.
Bricks (aus_production)
Seasonal Naive is the most appropriate due to the trend of the Brick data propogating in a seasonal and quarterly interval.
NSW Lambs (aus_livestock)
Due to the seasonal component showing no significance and the fluxuation in trend, Naive method would be ideal.
Household wealth (hh_budget).
Due to a pattern in the trend and no seasonality, drift method would be ideal.
Australian takeaway food turnover (aus_retail)
With the seasonal component being minimal, Seasonal Naive method would be ideal.
5.2) Use the Facebook stock price (data set gafa_stock) to do the following:
Since there is no seasonality component, it would not be ideal to use SNaive. The Mean and Naive will not demonstrate the trend in the data, therefore will be bad methods to implement.
5.3) Apply a seasonal naïve method to the quarterly Australian beer production data from 1992. Check if the residuals look like white noise, and plot the forecasts. The following code will help.
Both tests have a p-value less than .05, which states the residuals are not explained by the white noise.
## Warning: Removed 4 rows containing missing values (`geom_line()`).
## Warning: Removed 4 rows containing missing values (`geom_point()`).
## Warning: Removed 4 rows containing non-finite values (`stat_bin()`).
Box-Pierce Test
## # A tibble: 1 × 3
## .model bp_stat bp_pvalue
## <chr> <dbl> <dbl>
## 1 SNAIVE(Beer) 34.4 0.000160
Ljung-Box Test
## # A tibble: 1 × 3
## .model lb_stat lb_pvalue
## <chr> <dbl> <dbl>
## 1 SNAIVE(Beer) 37.8 0.0000412
5.4) Repeat the previous exercise using the Australian Exports series from global_economy and the Bricks series from aus_production. Use whichever of NAIVE() or SNAIVE() is more appropriate in each case.
Since there is no seasonality, NAIVE method would be appropriate.
## Warning: Removed 1 row containing missing values (`geom_line()`).
## Warning: Removed 1 rows containing missing values (`geom_point()`).
## Warning: Removed 1 rows containing non-finite values (`stat_bin()`).
Both tests produce a p-value greater than .05, which states the residuals resemble white noise.
## # A tibble: 1 × 4
## Country .model bp_stat bp_pvalue
## <fct> <chr> <dbl> <dbl>
## 1 Australia NAIVE(Exports) 14.6 0.148
## # A tibble: 1 × 4
## Country .model lb_stat lb_pvalue
## <fct> <chr> <dbl> <dbl>
## 1 Australia NAIVE(Exports) 16.4 0.0896
There is a seasonality component in Bricks, so SNAIVE method would be appropriate.
## Warning: Removed 24 rows containing missing values (`geom_line()`).
## Warning: Removed 24 rows containing missing values (`geom_point()`).
## Warning: Removed 24 rows containing non-finite values (`stat_bin()`).
## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
## -Inf
## Warning in max(ids, na.rm = TRUE): no non-missing arguments to max; returning
## -Inf
## Warning: Removed 8 rows containing missing values (`()`).
## Warning: Removed 20 rows containing missing values (`geom_line()`).
Both tests a p-value less than 0.05, which states the residuals do not resemble white noise.
## # A tibble: 1 × 3
## .model bp_stat bp_pvalue
## <chr> <dbl> <dbl>
## 1 SNAIVE(Bricks) 292. 0
## # A tibble: 1 × 3
## .model lb_stat lb_pvalue
## <chr> <dbl> <dbl>
## 1 SNAIVE(Bricks) 301. 0
5.7) For your retail time series (from Exercise 7 in Section 2.10):
set.seed(12345678)
myseries <- aus_retail |>
filter(`Series ID` == sample(aus_retail$`Series ID`,1))
myseries_train <- myseries |>
filter(year(Month) < 2011)
autoplot(myseries, Turnover) +
autolayer(myseries_train, Turnover, colour = "red")
fit <- myseries_train |>
model(SNAIVE(Turnover))
fit |> gg_tsresiduals()
## Warning: Removed 12 rows containing missing values (`geom_line()`).
## Warning: Removed 12 rows containing missing values (`geom_point()`).
## Warning: Removed 12 rows containing non-finite values (`stat_bin()`).
fc <- fit |>
forecast(new_data = anti_join(myseries, myseries_train))
## Joining with `by = join_by(State, Industry, `Series ID`, Month, Turnover)`
fc |> autoplot(myseries)
fit |> accuracy()
## # A tibble: 1 × 12
## State Industry .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Norther… Clothin… SNAIV… Trai… 0.439 1.21 0.915 5.23 12.4 1 1 0.768
fc |> accuracy(myseries)
## # A tibble: 1 × 12
## .model State Industry .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 SNAIVE(T… Nort… Clothin… Test 0.836 1.55 1.24 5.94 9.06 1.36 1.28 0.601
The accuracy measures are very sensitive to the amount of training data used. With quality and quantity training data, the models can forecast better accuracy.