library(fpp3)
library(plotly)
library(tsibble)
Produce forecasts for the following series using whichever of ‘NAIVE(y)’, ‘SNAIVE(y)’ or ‘RW(y ~ drift())’ is more appropriate in each case:
The Australian population is trending steadily upwards with no seasonality, so the drift method is the most appropriate benchmark.
aus_pop <- global_economy %>% filter(Country=='Australia') %>% select(Population)
aus_pop %>% model(RW(Population ~ drift())) %>% forecast(h=15) %>% autoplot(aus_pop)
Brick production is quarterly and strongly seasonal, so the seasonal naive method is the most appropriate benchmark.
bricks <- aus_production %>% select(Quarter,Bricks) %>% filter(!is.na(Bricks))
bricks %>% model(SNAIVE(Bricks)) %>% forecast(h=15) %>% autoplot(bricks)
Lamb slaughter counts are monthly and seasonal, so the seasonal naive method is again the most appropriate benchmark.
nsw_lambs <- aus_livestock %>% filter(Animal=='Lambs',State=='New South Wales') %>% select(Count)
nsw_lambs %>% model(SNAIVE(Count)) %>% forecast(h=15) %>% autoplot(nsw_lambs)
Household wealth is annual (hence non-seasonal) and trending, so the drift method is appropriate. Note that the plotting variable must be given explicitly, since hh_budget contains several measured variables and autoplot() would otherwise pick the first one.
house_w <- hh_budget
house_w_mod <- house_w %>% model(RW(Wealth ~ drift())) %>% forecast(h=15)
autoplot(house_w, Wealth) + autolayer(house_w_mod)
Takeaway food turnover is monthly with clear seasonality (aus_retail is keyed by State, so this keeps one series per state), so the seasonal naive method is used here.
aus_take <- aus_retail %>% filter(Industry=='Takeaway food services') %>% select(Turnover)
aus_take_mod <- aus_take %>% model(SNAIVE(Turnover)) %>% forecast(h=15)
autoplot(aus_take, Turnover) + autolayer(aus_take_mod)
The Facebook series in gafa_stock is observed on trading days only, so it is re-indexed by trading day to obtain a regular series before forecasting.
fb <- gafa_stock %>% filter(Symbol=='FB') %>% mutate(Day = row_number()) %>% update_tsibble(index=Day, regular=TRUE)
fb %>% autoplot(Open)
fb_mod <- fb %>% model(RW(Open ~ drift())) %>% forecast(h=60)
autoplot(fb, Open) + autolayer(fb_mod)
# The drift forecasts lie on the line joining the first and last observations, extended forward
autoplot(fb, Open) + autolayer(fb_mod) +
  geom_segment(x=min(fb$Day), y=fb$Open[which.min(fb$Day)],
               xend=max(fb$Day), yend=fb$Open[which.max(fb$Day)])
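The drift point forecast is y_T + h*(y_T - y_1)/(T - 1), i.e. the line through the first and last observations extended h steps ahead. A quick numeric check (a sketch; slope and manual_fc are helper names introduced here, not part of the original code):
# Drift slope = (last observation - first observation) / (T - 1)
slope <- (last(fb$Open) - first(fb$Open)) / (nrow(fb) - 1)
# Hand-computed drift forecasts for h = 1..60
manual_fc <- last(fb$Open) + (1:60) * slope
# Should match the model's point forecasts
all.equal(manual_fc, as.numeric(fb_mod$.mean))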
fb_mod <- fb %>% model(Mean = MEAN(Open),
                       Naive = NAIVE(Open),
                       SNaive = SNAIVE(Open), # fails: the series has no seasonal period
                       Drift = RW(Open ~ drift())) %>% forecast(h=60)
autoplot(fb, Open) + autolayer(fb_mod)
Of the benchmark methods, drift is the best choice. The seasonal naive forecast does not appear because the data are not seasonal, so SNAIVE(Open) cannot be estimated. Of the remaining benchmarks, drift is the only one that follows the upward trend in the price.
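To check this impression numerically, the benchmarks can be compared on a holdout set. A minimal sketch, assuming the final 60 trading days are held out (the split and the fb_train name are illustrative, not part of the original answer):
# Hold out the final 60 trading days as a test set
fb_train <- fb %>% filter(Day <= max(Day) - 60)
fb_train %>% model(Mean = MEAN(Open),
                   Naive = NAIVE(Open),
                   Drift = RW(Open ~ drift())) %>%
  forecast(h=60) %>%
  accuracy(fb) # test-set error measures for each benchmark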
# Extract data of interest
recent_production <- aus_production |>
filter(year(Quarter) >= 1992)
# Define and estimate a model
fit <- recent_production |> model(SNAIVE(Beer))
# Look at the residuals
fit |> gg_tsresiduals()
# Look at some forecasts
fit |> forecast() |> autoplot(recent_production)
The residuals do not look like white noise. The histogram shows a bimodal distribution, so the residuals are unlikely to be normally distributed, and the ACF plot shows autocorrelations extending beyond the significance bounds. We can conclude that the residuals are neither uncorrelated nor normally distributed.
augment(fit) |> features(.innov, box_pierce, lag = 10)
## # A tibble: 1 x 3
## .model bp_stat bp_pvalue
## <chr> <dbl> <dbl>
## 1 SNAIVE(Beer) 34.4 0.000160
augment(fit) |> features(.innov, ljung_box, lag = 10)
## # A tibble: 1 x 3
## .model lb_stat lb_pvalue
## <chr> <dbl> <dbl>
## 1 SNAIVE(Beer) 37.8 0.0000412
Both tests, Box-Pierce and Ljung-Box, give p-values well below 0.05, so we conclude that the residuals are distinguishable from white noise and that the model leaves systematic variation in the data unexplained.
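The choice lag = 10 follows the rule of thumb for non-seasonal data; for seasonal data fpp3 suggests lag = 2m, which for quarterly beer production (m = 4) gives lag = 8. The same test with that choice (only the lag value changes):
# Ljung-Box with lag = 2m = 8 for quarterly data (m = 4)
augment(fit) |> features(.innov, ljung_box, lag = 2 * 4)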
aus_bricks <- recent_production %>% filter(!is.na(Bricks))
aus_bricks_mod <- aus_bricks %>% model(NAIVE(Bricks))
aus_bricks_mod %>% gg_tsresiduals()
The residuals do not look like white noise. The histogram shows a bimodal distribution, so the residuals most likely do not have a mean of zero and are not normally distributed. The ACF plot shows many autocorrelations passing the significance bounds, with a clearly seasonal pattern. We can conclude that the residuals are neither normally distributed nor uncorrelated.
aus_bricks_mod %>% forecast() %>% autoplot(bricks)
augment(aus_bricks_mod) |> features(.innov, box_pierce, lag = 10)
## # A tibble: 1 x 3
## .model bp_stat bp_pvalue
## <chr> <dbl> <dbl>
## 1 NAIVE(Bricks) 70.7 3.22e-11
augment(aus_bricks_mod) |> features(.innov, ljung_box, lag = 10)
## # A tibble: 1 x 3
## .model lb_stat lb_pvalue
## <chr> <dbl> <dbl>
## 1 NAIVE(Bricks) 82.8 1.42e-13
Again, both tests, Box-Pierce and Ljung-Box, give p-values far below 0.05, so the residuals are clearly distinguishable from white noise; a naive model does not capture the structure, in particular the seasonality, in the bricks data.
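Given the seasonality left in the residuals, a seasonal naive model is the natural next candidate for this series. A minimal sketch of the refit (this alternative model is a suggestion, not part of the original analysis):
# Refit with the seasonal naive method and re-examine the residuals
aus_bricks %>% model(SNAIVE(Bricks)) %>% gg_tsresiduals()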
# A fixed retail series is used so that the results below are reproducible
myseries <- aus_retail %>%
filter(`Series ID` == 'A3349849A')
myseries_train <- myseries %>%
filter(year(Month) < 2011)
autoplot(myseries, Turnover) +
autolayer(myseries_train, Turnover, colour = "red")
fit <- myseries_train %>%
model(SNAIVE(Turnover))
fit %>% gg_tsresiduals()
fc <- fit %>%
forecast(new_data = anti_join(myseries, myseries_train))
fc %>% autoplot(myseries)
fit %>% accuracy()
## # A tibble: 1 x 12
## State Industry .model .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Austral~ Cafes, ~ SNAIV~ Trai~ 0.985 3.37 2.53 5.05 16.1 1 1 0.826
fc %>% accuracy(myseries)
## # A tibble: 1 x 12
## .model State Industry .type ME RMSE MAE MPE MAPE MASE RMSSE ACF1
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 SNAIVE(T~ Aust~ Cafes, ~ Test 5.86 8.42 7.38 13.4 19.0 2.92 2.50 0.847
The errors on the training data are considerably lower than on the test data (e.g. RMSE 3.37 versus 8.42, MASE 1 versus 2.92), as expected, since the test observations were not used to fit the model.
The accuracy measures are sensitive to the amount of training data used: with more training data the training accuracy measures improve, but in this case the test accuracy measures get worse.
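One way to quantify this sensitivity is time series cross-validation, refitting the model on successively longer training windows. A minimal sketch (the 36-month initial window and 12-month step are illustrative choices, and the final windows will trigger a warning about incomplete out-of-sample data):
# Each .id is a training set 12 months longer than the previous one
myseries %>%
  stretch_tsibble(.init = 36, .step = 12) %>%
  model(SNAIVE(Turnover)) %>%
  forecast(h=12) %>%
  accuracy(myseries, by = c(".id", ".model")) # one row per training-window size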