When you run the code, hide the package startup output and warnings below for a cleaner presentation (a sketch of how appears after the library calls). Of course, read the warnings first and decide whether any action is needed.
library(fpp3)
## Registered S3 method overwritten by 'tsibble':
## method from
## as_tibble.grouped_df dplyr
## ── Attaching packages ──────────────────────────────────────────── fpp3 1.0.2 ──
## ✔ tibble 3.3.1 ✔ tsibble 1.1.6
## ✔ dplyr 1.1.4 ✔ tsibbledata 0.4.1
## ✔ tidyr 1.3.2 ✔ feasts 0.4.2
## ✔ lubridate 1.9.4 ✔ fable 0.5.0
## ✔ ggplot2 4.0.1
## ── Conflicts ───────────────────────────────────────────────── fpp3_conflicts ──
## ✖ lubridate::date() masks base::date()
## ✖ dplyr::filter() masks stats::filter()
## ✖ tsibble::intersect() masks base::intersect()
## ✖ tsibble::interval() masks lubridate::interval()
## ✖ dplyr::lag() masks stats::lag()
## ✖ tsibble::setdiff() masks base::setdiff()
## ✖ tsibble::union() masks base::union()
library(fredr)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats 1.0.1 ✔ readr 2.1.6
## ✔ purrr 1.2.1 ✔ stringr 1.6.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ tsibble::interval() masks lubridate::interval()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(patchwork)
library(knitr)
library(kableExtra)
##
## Attaching package: 'kableExtra'
##
## The following object is masked from 'package:dplyr':
##
## group_rows
library(writexl)
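One way to get that cleaner presentation, as mentioned above (a minimal sketch, assuming an R Markdown/Quarto document; message and warning are standard knitr chunk options):
# In the chunk header: {r, message = FALSE, warning = FALSE}
# Or suppress the startup banner for a single call in plain R:
suppressPackageStartupMessages(library(fpp3))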
Get your own FRED API key: https://fred.stlouisfed.org/docs/api/api_key.html
The fredr vignette: https://cran.r-project.org/web/packages/fredr/vignettes/fredr.html
remove(list = ls()) # clear the workspace
fredr_set_key("your_fred_api_key_here") # replace with your own key
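A safer pattern than hardcoding the key is to keep it out of the script entirely (a sketch; fredr reads the FRED_API_KEY environment variable, which you can set once in ~/.Renviron):
# In ~/.Renviron (one line): FRED_API_KEY=your_key_here
fredr_set_key(Sys.getenv("FRED_API_KEY")) # read the key at run time instead of pasting it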
# Fetch the monthly unemployment rate
df_unrate <- fredr(series_id = "UNRATE",
                   observation_start = as.Date("1990-01-01")) |>
  mutate(Month = yearmonth(date)) |> # year-month index for a monthly tsibble
  select(Month, value) |>
  as_tsibble(index = Month)
has_gaps(df_unrate)
## # A tibble: 1 × 1
## .gaps
## <lgl>
## 1 FALSE
# Hold out the last 24 months as the test set
test_length <- 24
train <- df_unrate |>
  slice(1:(n() - test_length))
test <- df_unrate |>
  slice((n() - test_length + 1):n())
# Verify split
paste("Training End:", max(train$Month))
## [1] "Training End: 2023 Dec"
paste("Test Start:", min(test$Month))
## [1] "Test Start: 2024 Jan"
fit <- train |>
  model(
    MEAN   = MEAN(value),            # mean of the training data
    NAIVE  = NAIVE(value),           # last observed value carried forward
    SNAIVE = SNAIVE(value),          # same month last year
    TREND  = TSLM(value ~ trend())   # linear trend regression
  )
# Generate forecasts
fc <- fit |> forecast(h = test_length)
# Plot forecasts with 95% prediction intervals
fc |>
  autoplot(train |> tail(60), level = 95) + # show the last 5 years of history
  autolayer(test, value, color = "black") + # overlay actuals
  labs(title = "US Unemployment Rate Forecasts (UNRATE)",
       subtitle = "Benchmark Models vs Actuals (Black Line)",
       y = "Percent (%)") +
  theme_minimal()
accuracy_table <- accuracy(fc, test) |>
  select(.model, ME, MPE, RMSE, MAE, MAPE) |>
  arrange(RMSE) # ordered by best (lowest) RMSE
# Display the table
kable(accuracy_table, digits = 3, caption = "Forecast Accuracy Metrics (Test Set)") |>
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
| .model | ME | MPE | RMSE | MAE | MAPE |
|---|---|---|---|---|---|
| NAIVE | 0.339 | 8.000 | 0.388 | 0.348 | 8.235 |
| SNAIVE | 0.526 | 12.580 | 0.557 | 0.526 | 12.580 |
| TREND | -1.293 | -31.532 | 1.308 | 1.293 | 31.532 |
| MEAN | -1.625 | -39.549 | 1.636 | 1.625 | 39.549 |
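accuracy() above was given only the test set, which is enough for the metrics selected here. If you also want scaled metrics such as MASE, pass the complete series so the training data is available for scaling (a sketch):
accuracy(fc, df_unrate) |>
  select(.model, MASE) # MASE scales each error by the in-sample naive MAE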
What each metric tells you:
- ME tells you whether your model is generally guessing too high or too low.
- MAE treats every mistake equally: if you are off by 5 points, it counts as 5 points. It simply averages the size of the mistakes.
- RMSE hates big mistakes: because it squares the errors, one huge miss counts much more than many small misses.
- MPE is just like ME, but in percentages: it tells you whether you are over- or under-forecasting relative to the size of the data.
- MAPE turns the error into a simple percentage.
In Excel:
- ME (Bias): average of the Error column.
- MAE (Avg Error): average of the Abs Error column.
- RMSE (Big Error Penalty): square root of the average of the Sq Error column.
- MPE (Bias %): average of the Pct Error column.
- MAPE (Avg % Error): average of the Abs Pct Error column.
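The same spreadsheet arithmetic in R (a minimal sketch; actual and fcst are hypothetical stand-ins for the worksheet's Actual and Forecast columns):
actual <- c(3.7, 3.9, 3.8, 3.9) # Actual column
fcst   <- c(3.5, 3.6, 3.7, 3.7) # Forecast column
e      <- actual - fcst         # Error column
pe     <- 100 * e / actual      # Pct Error column
mean(e)          # ME (bias)
mean(abs(e))     # MAE (average size of mistakes)
sqrt(mean(e^2))  # RMSE (big-error penalty)
mean(pe)         # MPE (bias in %)
mean(abs(pe))    # MAPE (average % error)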
# Additive decomposition (STL)
dcmp_add <- train |>
  model(STL(value ~ season(window = "periodic"))) |>
  components()
# Multiplicative decomposition (STL on the log scale)
dcmp_mult <- train |>
  model(STL(log(value) ~ season(window = "periodic"))) |>
  components()
# Plot both for comparison (patchwork stacks p1 over p2)
p1 <- dcmp_add |> autoplot() + labs(title = "Additive Decomposition")
p2 <- dcmp_mult |> autoplot() + labs(title = "Multiplicative (Log) Decomposition")
p1 / p2
Multiplicative decomposition is the better choice for the US unemployment rate because it accounts for how volatility scales with the level of the series, producing consistent residuals (random noise) rather than the error spikes the additive model shows during recessions. It also helps forecasting: it explicitly separates seasonal hiring cycles from the broader economic trend, so predictable annual patterns do not confuse the model's detection of real economic shifts.
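To see why the log version is "multiplicative", back-transform the components: sums on the log scale become products on the original scale (a sketch, assuming the usual feasts component names trend, season_year, and remainder):
# exp(trend + season_year + remainder) = exp(trend) * exp(season_year) * exp(remainder)
dcmp_mult |>
  as_tibble() |>
  mutate(reconstructed = exp(trend + season_year + remainder)) |>
  select(Month, reconstructed) # reproduces the original unemployment rate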