# Week 2 — Accuracy Metrics + Decomposition (FRED Monthly: PAYEMS/PAYNSA)

```r
# Packages
library(fpp3)     # tsibble + fable + feasts + ggplot helpers
library(quantmod) # FRED download (no API key needed)
library(zoo)      # as.yearmon helper (occasionally useful alongside quantmod)
```
## 1. Overview
This report follows the tidyverts/fpp3 workflow to:
- Pull a monthly FRED time series,
- Fit baseline forecasting models (MEAN, NAIVE, SNAIVE, DRIFT),
- Compute accuracy metrics (ME, MPE, RMSE, MAE, MAPE) on a holdout test set,
- Generate forecast plots with prediction intervals,
- Conduct additive and multiplicative-style decomposition (STL on level vs STL on log-scale),
- Show how to recreate the same accuracy metrics in Excel.
For the forecasting + accuracy metrics, I use PAYEMS (Total Nonfarm Payrolls, seasonally adjusted). For decomposition, I use PAYNSA (Total Nonfarm Payrolls, not seasonally adjusted) so that the seasonal component is visible; PAYEMS is already seasonally adjusted and will typically show weaker seasonality.
Important implementation detail: FRED dates often arrive as end-of-month Dates. For the fpp3/tidyverts seasonal tools, we convert them to a true monthly index with yearmonth() and enforce regular spacing with fill_gaps().
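As a quick illustration (a minimal sketch with made-up dates, not FRED data), yearmonth() collapses any within-month Date to the same monthly period:

```r
library(tsibble) # provides yearmonth(); already attached via fpp3

# End-of-month Dates from two different months map to clean monthly periods
yearmonth(as.Date(c("2024-01-31", "2024-02-29")))
# returns the monthly index: 2024 Jan, 2024 Feb
```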
## 2. Part I — Accuracy metrics + forecasts (PAYEMS)
### 2.1 Pull monthly data from FRED
```r
fred_code <- "PAYEMS" # Total Nonfarm Payrolls (Seasonally Adjusted)
raw_xts <- getSymbols(fred_code, src = "FRED", auto.assign = FALSE)

y <- tibble(
  date  = as.Date(index(raw_xts)),
  value = as.numeric(raw_xts[, 1])
) |>
  mutate(month = yearmonth(date)) |>
  select(month, value) |>
  as_tsibble(index = month) |>
  filter(!is.na(value)) |>
  fill_gaps()

# quick sanity check (should show a regular monthly interval)
y |> interval()
```

```
<interval[1]>
[1] 1M
```
```r
autoplot(y, value) +
  labs(
    title = paste("FRED series:", fred_code),
    x = NULL,
    y = "Thousands of persons"
  )
```

### 2.2 Train/test split
I evaluate forecasts using a simple holdout: the last 12 months are the test set.
```r
h <- 12
train <- y |> slice(1:(n() - h))
test  <- y |> slice((n() - h + 1):n())

train |> tail(3)
```

```
# A tsibble: 3 x 2 [1M]
     month  value
     <mth>  <dbl>
1 2024 Oct 158358
2 2024 Nov 158619
3 2024 Dec 158942
```

```r
test |> head(3)
```

```
# A tsibble: 3 x 2 [1M]
     month  value
     <mth>  <dbl>
1 2025 Jan 159053
2 2025 Feb 159155
3 2025 Mar 159275
```
### 2.3 Fit baseline models
Models included:

- MEAN: mean forecast
- NAIVE: random walk / last observation carried forward
- SNAIVE: seasonal naive (repeats the same month from the previous year)
- DRIFT: random walk with drift
```r
fit <- train |>
  model(
    MEAN   = MEAN(value),
    NAIVE  = NAIVE(value),
    SNAIVE = SNAIVE(value),
    DRIFT  = RW(value ~ drift())
  )
fit
```

```
# A mable: 1 x 4
     MEAN   NAIVE  SNAIVE         DRIFT
  <model> <model> <model>       <model>
1  <MEAN> <NAIVE> <SNAIVE> <RW w/ drift>
```
### 2.4 Forecasts with prediction intervals
```r
fc <- fit |> forecast(h = h)

# Forecast plot (screenshot: forecasts + prediction intervals + test overlay)
autoplot(fc, train) +
  autolayer(test, value, alpha = 0.7) +
  labs(
    title = paste("Forecasts with prediction intervals:", fred_code),
    x = NULL,
    y = "Thousands of persons"
  )
```

### 2.5 Accuracy metrics table
Accuracy metrics computed on the holdout test set:

- ME (Mean Error): average signed error (bias in original units)
- MPE (Mean Percentage Error): average signed percent error (bias in percent)
- RMSE (Root Mean Squared Error): penalizes large errors more (squared error)
- MAE (Mean Absolute Error): average absolute error (linear penalty)
- MAPE (Mean Absolute Percentage Error): average absolute percent error
```r
acc <- accuracy(fc, y) |>
  select(.model, .type, ME, MPE, RMSE, MAE, MAPE) |>
  arrange(.type, .model)
acc
```

```
# A tibble: 4 × 7
  .model .type     ME    MPE   RMSE    MAE  MAPE
  <chr>  <chr>  <dbl>  <dbl>  <dbl>  <dbl> <dbl>
1 DRIFT  Test   -354. -0.222   480.   354. 0.222
2 MEAN   Test  66760. 41.9   66760. 66760. 41.9
3 NAIVE  Test    460.  0.288   484.   460. 0.288
4 SNAIVE Test   1441.  0.904  1499.  1441. 0.904
```

```r
# Nicely formatted table (screenshot this)
knitr::kable(
  acc,
  digits = 3,
  caption = "Accuracy metrics for baseline models on PAYEMS (test set shown under .type = 'Test')"
)
```

| .model | .type | ME | MPE | RMSE | MAE | MAPE |
|---|---|---|---|---|---|---|
| DRIFT | Test | -353.908 | -0.222 | 479.706 | 353.908 | 0.222 |
| MEAN | Test | 66760.219 | 41.882 | 66760.394 | 66760.219 | 41.882 |
| NAIVE | Test | 459.500 | 0.288 | 484.241 | 459.500 | 0.288 |
| SNAIVE | Test | 1441.083 | 0.904 | 1498.693 | 1441.083 | 0.904 |
### 2.6 How the measures differ (short explanation)
All metrics are functions of the forecast errors \(e_t = y_t - \hat{y}_t\).

- ME is the average of \(e_t\). It captures directional bias in the original units.
- MPE is the average of \(100\, e_t / y_t\). It captures bias as a percent of the actual.
- MAE is the average of \(|e_t|\). It summarizes typical error magnitude in original units.
- RMSE is \(\sqrt{\operatorname{mean}(e_t^2)}\). It penalizes large errors more than MAE.
- MAPE is the average of \(100\, |e_t| / y_t\). It is scale-free but can be problematic if \(y_t\) is near zero.
For model comparison, smaller values of RMSE/MAE/MAPE generally indicate better predictive accuracy, while ME/MPE indicate systematic over- or under-forecasting.
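As a concrete check on these definitions, here is a minimal sketch computing all five metrics by hand; the actual/forecast numbers are made up for illustration, not PAYEMS values:

```r
# Toy actuals and forecasts (illustrative values only)
actual <- c(100, 102, 98, 105)
fcst   <- c(101, 100, 99, 103)

e <- actual - fcst  # e_t = y_t - yhat_t

c(
  ME   = mean(e),                      # signed bias, original units
  MPE  = mean(100 * e / actual),       # signed bias, percent
  RMSE = sqrt(mean(e^2)),              # squared penalty on large errors
  MAE  = mean(abs(e)),                 # linear penalty
  MAPE = mean(100 * abs(e) / actual)   # absolute percent error
)
```

A toy vector like this is also a handy way to validate an Excel sheet (next section) before pointing it at real data.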
### 2.7 Recreating the metrics in Excel
To recreate the exact metrics in Excel for the test period:
Create columns:

- Actual \(y_t\)
- Forecast \(\hat{y}_t\)
- Error \(e_t = y_t - \hat{y}_t\)
- Absolute error \(|e_t|\)
- Squared error \(e_t^2\)
- Percent error \(e_t / y_t\)
- Absolute percent error \(|e_t| / y_t\)

Excel formulas (assuming Actual in column B, Forecast in C, Error in D):

- Error (D2): `=B2 - C2`
- Absolute error (E2): `=ABS(D2)`
- Squared error (F2): `=D2^2`
- Percent error (G2): `=D2/B2`
- Absolute percent error (H2): `=ABS(G2)`

Then aggregate over the test rows:

- ME: `=AVERAGE(D:D)`
- MAE: `=AVERAGE(E:E)`
- RMSE: `=SQRT(AVERAGE(F:F))`
- MPE: `=100*AVERAGE(G:G)`
- MAPE: `=100*AVERAGE(H:H)`
To match R’s printed table, make sure you use the same test rows and apply the same rounding.
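To avoid retyping values, one option is to export the test-period actuals and point forecasts from R and paste them into the sheet. A minimal sketch, assuming the fc and test objects from above (the column names and file name here are my own choices):

```r
# Export actual vs. point forecast per model for the test period
fc |>
  as_tibble() |>
  select(.model, month, forecast = .mean) |>
  left_join(as_tibble(test), by = "month") |>
  rename(actual = value) |>
  write.csv("payems_test_forecasts.csv", row.names = FALSE)
```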
## 3. Part II — Decomposition (PAYNSA)
PAYEMS is already seasonally adjusted, which often reduces visible seasonality. To demonstrate decomposition more clearly, I use the non-seasonally adjusted companion series PAYNSA.
### 3.1 Pull PAYNSA
```r
fred_code_decomp <- "PAYNSA" # Total Nonfarm Payrolls (Not Seasonally Adjusted)
raw_xts2 <- getSymbols(fred_code_decomp, src = "FRED", auto.assign = FALSE)

y2 <- tibble(
  date  = as.Date(index(raw_xts2)),
  value = as.numeric(raw_xts2[, 1])
) |>
  mutate(month = yearmonth(date)) |>
  select(month, value) |>
  as_tsibble(index = month) |>
  filter(!is.na(value)) |>
  fill_gaps()

# sanity check
y2 |> interval()
```

```
<interval[1]>
[1] 1M
```
```r
autoplot(y2, value) +
  labs(
    title = paste("FRED series:", fred_code_decomp),
    x = NULL,
    y = "Thousands of persons"
  )
```

### 3.2 Additive STL decomposition
```r
stl_add <- y2 |>
  model(STL(value ~ season(window = "periodic")))

# Components plot (screenshot this)
components(stl_add) |>
  autoplot() +
  labs(title = "STL decomposition (additive) — PAYNSA")
```

### 3.3 Multiplicative-style decomposition (log transform)
A common way to approximate multiplicative seasonality is to apply a log transform and then decompose additively on the log scale.
```r
# PAYNSA should be strictly positive; filter defensively
y2_pos <- y2 |> filter(value > 0)

stl_log <- y2_pos |>
  mutate(log_value = log(value)) |>
  model(STL(log_value ~ season(window = "periodic")))

# Components plot (screenshot this)
components(stl_log) |>
  autoplot() +
  labs(title = "STL decomposition on log(value) (multiplicative-style) — PAYNSA")
```

### 3.4 Which worked better? How can you tell?
Two practical criteria:
- Stability of seasonal amplitude
  - If seasonal swings are roughly constant in absolute size over time, additive decomposition is appropriate.
  - If seasonal swings grow or shrink proportionally with the level, the log/multiplicative-style decomposition is more appropriate.
- Remainder (residual) behavior
  - A "better" decomposition typically produces a remainder that looks more like noise (no obvious trend or seasonality left).
```r
# Remainder diagnostics (additive)
rem_add <- components(stl_add) |>
  as_tibble() |>
  select(month, remainder) |>
  as_tsibble(index = month)

autoplot(rem_add, remainder) +
  labs(title = "Remainder (additive STL) — PAYNSA", x = NULL, y = "remainder")
```

```r
# Remainder diagnostics (log-scale)
rem_log <- components(stl_log) |>
  as_tibble() |>
  select(month, remainder) |>
  as_tsibble(index = month)

autoplot(rem_log, remainder) +
  labs(title = "Remainder (log-STL) — PAYNSA", x = NULL, y = "remainder")
```

In practice, I would choose the decomposition whose remainder appears more stable (more homoscedastic) and whose seasonal component is more consistent with the series' behavior (constant vs. proportional seasonality).
### 3.5 How might decomposition be used in forecasting?
Decomposition can be used to (see the sketch below):

- Remove seasonality (deseasonalize),
- Forecast the adjusted series (trend/remainder),
- Add back seasonality (reseasonalize).
However, because many forecasting models (e.g., seasonal naive, ETS, ARIMA variants, or regression models with seasonal terms) directly model seasonality, decomposition is often most useful for interpretation, diagnostics, and variance stabilization rather than as a required forecasting step.
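That said, fable does package the deseasonalize/forecast/reseasonalize workflow in decomposition_model(). A minimal sketch on PAYNSA; the choice of NAIVE for the seasonally adjusted component is illustrative, not required:

```r
# STL-decompose, forecast the seasonally adjusted series with NAIVE,
# and let the seasonal component default to SNAIVE when re-assembling
y2 |>
  model(
    stl_naive = decomposition_model(
      STL(value ~ season(window = "periodic")),
      NAIVE(season_adjust)
    )
  ) |>
  forecast(h = 12) |>
  autoplot(y2)
```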
## 4. Submission checklist (images to include)
For the submission requirements, include images of:
- The accuracy table (nicely formatted)
- The forecast plot (point forecasts + prediction intervals + test overlay)
- The additive STL components plot (PAYNSA)
- The log/multiplicative-style STL components plot (PAYNSA)
- The Excel sheet reproducing ME/MPE/RMSE/MAE/MAPE for the test period, showing values match R