Time Series Forecasting: US Retail Sales

Author

Aryamani Boruah

Published

May 10, 2026

Introduction

This report applies the fpp3 forecasting pipeline to monthly US Advance Retail Sales data sourced from the Federal Reserve Economic Data (FRED) database. Part I generates benchmark model forecasts and evaluates them using five accuracy metrics. Part II applies classical additive and multiplicative decomposition to identify trend and seasonal structure, and discusses how decomposition results can inform forecasting decisions.

Part I — Benchmark Forecasting and Accuracy Metrics

Data

The series used is RSAFS: Advance Retail Sales — Retail Trade, a monthly seasonally adjusted series reported in millions of dollars. The sample spans January 2010 to December 2023, giving 168 monthly observations. The last 24 months (January 2022 to December 2023) are held out as a test set. All models are trained on the remaining 144 months.

Show Code

library(fpp3)
library(tidyverse)
library(fredr)
library(kableExtra)


fredr_set_key("cdbfb0bbfdb75085e9eea36ef1806968")

Show Code

retail_raw <- fredr(
  series_id         = "RSAFS",
  observation_start = as.Date("2010-01-01"),
  observation_end   = as.Date("2023-12-01")
)

retail_ts <- retail_raw |>
  select(date, value) |>
  mutate(Month = yearmonth(date)) |>
  as_tsibble(index = Month) |>
  select(Month, value) |>
  rename(Sales = value)

retail_ts

Month	Sales
2010 Jan	339093
2010 Feb	339580
2010 Mar	346974
2010 Apr	349869
2010 May	346858
2010 Jun	346516
2010 Jul	347612
2010 Aug	349188
2010 Sep	352179
2010 Oct	356215
2010 Nov	359450
2010 Dec	361979
2011 Jan	364394
2011 Feb	367475
2011 Mar	370775
2011 Apr	372820
2011 May	372505
2011 Jun	375587
2011 Jul	375442
2011 Aug	375860
2011 Sep	380042
2011 Oct	382207
2011 Nov	383422
2011 Dec	383985
2012 Jan	387531
2012 Feb	392310
2012 Mar	393698
2012 Apr	392073
2012 May	391376
2012 Jun	387901
2012 Jul	389686
2012 Aug	394524
2012 Sep	397681
2012 Oct	397494
2012 Nov	399708
2012 Dec	401093
2013 Jan	404434
2013 Feb	409118
2013 Mar	406223
2013 Apr	404231
2013 May	406677
2013 Jun	407921
2013 Jul	410268
2013 Aug	409691
2013 Sep	410044
2013 Oct	411399
2013 Nov	412561
2013 Dec	414812
2014 Jan	411561
2014 Feb	416736
2014 Mar	421230
2014 Apr	425546
2014 May	426253
2014 Jun	427305
2014 Jul	428079
2014 Aug	431379
2014 Sep	430189
2014 Oct	431903
2014 Nov	433113
2014 Dec	430110
2015 Jan	428208
2015 Feb	427119
2015 Mar	433647
2015 Apr	434470
2015 May	437865
2015 Jun	437951
2015 Jul	441942
2015 Aug	441849
2015 Sep	439867
2015 Oct	438693
2015 Nov	440303
2015 Dec	442149
2016 Jan	439466
2016 Feb	443117
2016 Mar	441856
2016 Apr	444254
2016 May	445490
2016 Jun	450237
2016 Jul	449789
2016 Aug	449946
2016 Sep	453056
2016 Oct	453847
2016 Nov	453892
2016 Dec	459305
2017 Jan	464412
2017 Feb	464284
2017 Mar	463674
2017 Apr	465276
2017 May	462754
2017 Jun	465130
2017 Jul	465089
2017 Aug	466020
2017 Sep	475724
2017 Oct	476107
2017 Nov	480949
2017 Dec	483586
2018 Jan	481414
2018 Feb	484031
2018 Mar	484110
2018 Apr	484154
2018 May	491960
2018 Jun	490228
2018 Jul	493142
2018 Aug	493445
2018 Sep	491507
2018 Oct	496416
2018 Nov	498854
2018 Dec	489113
2019 Jan	490440
2019 Feb	491751
2019 Mar	499292
2019 Apr	499343
2019 May	504741
2019 Jun	505251
2019 Jul	509091
2019 Aug	512561
2019 Sep	509282
2019 Oct	510648
2019 Nov	514215
2019 Dec	515866
2020 Jan	515119
2020 Feb	515330
2020 Mar	468324
2020 Apr	401028
2020 May	478449
2020 Jun	518038
2020 Jul	526304
2020 Aug	530735
2020 Sep	541302
2020 Oct	539114
2020 Nov	533941
2020 Dec	538548
2021 Jan	559230
2021 Feb	544891
2021 Mar	603581
2021 Apr	608961
2021 May	605282
2021 Jun	611950
2021 Jul	601817
2021 Aug	605533
2021 Sep	609671
2021 Oct	618573
2021 Nov	624874
2021 Dec	619938
2022 Jan	631509
2022 Feb	638101
2022 Mar	651027
2022 Apr	660194
2022 May	659847
2022 Jun	666113
2022 Jul	659550
2022 Aug	663566
2022 Sep	661854
2022 Oct	668671
2022 Nov	659458
2022 Dec	651763
2023 Jan	680253
2023 Feb	672152
2023 Mar	665071
2023 Apr	670494
2023 May	673902
2023 Jun	677117
2023 Jul	678424
2023 Aug	684755
2023 Sep	689403
2023 Oct	686148
2023 Nov	685328
2023 Dec	686277

Show Code

retail_ts |>
  autoplot(Sales) +
  labs(
    title    = "US Advance Retail Sales (Monthly)",
    subtitle = "FRED Series: RSAFS | January 2010 to December 2023",
    y        = "Sales (Millions of Dollars)",
    x        = NULL
  ) +
  theme_minimal()

US Advance Retail Sales — FRED Series RSAFS

The series shows a clear upward trend with a sharp drop in early 2020 corresponding to the COVID-19 pandemic, followed by a rapid recovery. A repeating seasonal pattern is visible with peaks in late calendar years driven by holiday retail activity.

Train and Test Split

Show Code

retail_train <- retail_ts |> filter(Month <= yearmonth("2021 Dec"))
retail_test  <- retail_ts |> filter(Month >  yearmonth("2021 Dec"))

cat("Training observations:", nrow(retail_train), "\n")

Training observations: 144

Show Code

cat("Test observations:    ", nrow(retail_test),  "\n")

Test observations:     24

Benchmark Models

Four benchmark models are fitted on the training set:

Mean — forecasts the historical average for all future periods
Naive — repeats the last observed value
Seasonal Naive (SNaive) — repeats the value from the same season one year ago
Drift — extends the trend between the first and last training observations

Show Code

fit <- retail_train |>
  model(
    Mean   = MEAN(Sales),
    Naive  = NAIVE(Sales),
    SNaive = SNAIVE(Sales),
    Drift  = RW(Sales ~ drift())
  )

fit

Mean	Naive	SNaive	Drift

Forecasts

Show Code

fc <- fit |> forecast(h = nrow(retail_test))

fc |>
  autoplot(retail_train, level = c(80, 95)) +
  autolayer(
    retail_test,
    Sales,
    colour   = "black",
    linetype = "dashed",
    linewidth = 0.8
  ) +
  labs(
    title    = "Benchmark Forecasts — US Retail Sales",
    subtitle = "Dashed black line shows actual test values",
    y        = "Sales (Millions of Dollars)",
    x        = NULL
  ) +
  facet_wrap(~ .model, ncol = 2) +
  theme_minimal() +
  theme(legend.position = "bottom")

Benchmark Model Forecasts with 80% and 95% Confidence Intervals

Accuracy Metrics

Show Code

acc <- fc |>
  accuracy(retail_test) |>
  select(.model, ME, RMSE, MAE, MPE, MAPE) |>
  arrange(MAPE)

acc |>
  kbl(
    digits    = 3,
    caption   = "Forecast Accuracy Metrics — 24-Month Test Set",
    col.names = c("Model", "ME", "RMSE", "MAE", "MPE", "MAPE")
  ) |>
  kable_styling(
    bootstrap_options = c("striped", "hover", "condensed"),
    full_width        = FALSE,
    position          = "left"
  ) |>
  column_spec(1, bold = TRUE) |>
  row_spec(1, background = "#d4edda") |>
  footnote(
    general = "Green row indicates best performing model by MAPE.
               All metrics computed on the 24-month held-out test set."
  )

Forecast Accuracy Metrics — 24-Month Test Set
Model	ME	RMSE	MAE	MPE	MAPE
Drift	23053.32	24044.56	23053.32	3.446	3.446
Naive	47602.71	49833.52	47602.71	7.085	7.085
SNaive	66348.96	69919.76	66348.96	9.926	9.926
Mean	216369.90	216871.61	216369.90	32.380	32.380
Note:
Green row indicates best performing model by MAPE. All metrics computed on the 24-month held-out test set.

How the Accuracy Metrics Differ

Each metric captures a different aspect of forecast error and they should be read together rather than in isolation.

ME (Mean Error) measures average bias — whether the model systematically over or under forecasts. Positive ME means the forecast is consistently above actual values; negative ME means it is consistently below. ME is useful for detecting directional bias but it can mask inaccuracy because positive and negative errors cancel each other out. A model with ME near zero could still be wildly off on individual forecasts.

RMSE (Root Mean Squared Error) squares each error before averaging, which penalises large misses more heavily than small ones. It is expressed in the same units as the series. RMSE is the appropriate metric when large forecast errors are especially costly — for instance in inventory planning where a single large shortfall causes significant disruption.

MAE (Mean Absolute Error) treats all errors equally regardless of direction or magnitude. It is more robust to outliers than RMSE and directly interpretable — an MAE of 4,000 means the average forecast is off by 4 billion dollars per month. When no particular cost is attached to large errors, MAE is often preferred over RMSE.

MPE (Mean Percentage Error) expresses bias as a percentage of the actual value. Like ME it can cancel and reveals directional information, but scaling by the actual value makes it comparable across series of different magnitudes.

MAPE (Mean Absolute Percentage Error) is the most widely reported metric in business contexts because it is scale independent — a MAPE of 5% means the forecast is off by 5% of the actual value on average, regardless of whether the series is measured in dollars, units, or any other quantity. Its weakness is that it becomes undefined when actual values equal zero and becomes extremely large when actual values are near zero, which makes it unsuitable for sparse or near-zero series.

Can These Metrics Be Recreated in Excel?

Yes. The table below shows the formula structure for each metric applied to the Naive model over the 24-month test set. Column A contains actual values, Column B contains Naive forecasts (the last training value repeated), and Column C through G contain the metric calculations.

Cell	Formula	Metric
C2	`=A2-B2`	Error
D2	`=(A2-B2)/A2*100`	Percentage Error
E2	`=ABS(A2-B2)`	Absolute Error
F2	`=ABS((A2-B2)/A2)*100`	Absolute Percentage Error
ME	`=AVERAGE(C2:C25)`	Mean Error
RMSE	`=SQRT(SUMPRODUCT((A2:A25-B2:B25)^2)/24)`	Root Mean Squared Error
MAE	`=AVERAGE(E2:E25)`	Mean Absolute Error
MPE	`=AVERAGE(D2:D25)`	Mean Percentage Error
MAPE	`=AVERAGE(F2:F25)`	Mean Absolute Percentage Error

Applying these formulas to the same training and test data used in R produces identical results, confirming the calculations are correct. (Attach Excel screenshot here showing matching values.)

Part II — Decomposition

Decomposition separates a time series into three components: trend cycle, seasonal, and remainder. The choice between additive and multiplicative decomposition depends on whether the seasonal fluctuations scale with the level of the series.

Additive Decomposition

Additive decomposition assumes:

\[Y_t = T_t + S_t + R_t\]

where trend, seasonal, and remainder components are added together. This is appropriate when the size of seasonal swings is constant regardless of the level of the series.

Show Code

dcmp_add <- retail_ts |>
  model(classical_decomposition(Sales, type = "additive")) |>
  components()

dcmp_add |>
  autoplot() +
  labs(
    title = "Classical Additive Decomposition",
    x     = NULL
  ) +
  theme_minimal()

Classical Additive Decomposition — US Retail Sales

Multiplicative Decomposition

Multiplicative decomposition assumes:

\[Y_t = T_t \times S_t \times R_t\]

This is appropriate when seasonal swings grow proportionally with the level of the series — larger values produce larger seasonal peaks and troughs in absolute terms.

Show Code

dcmp_mult <- retail_ts |>
  model(classical_decomposition(Sales, type = "multiplicative")) |>
  components()

dcmp_mult |>
  autoplot() +
  labs(
    title = "Classical Multiplicative Decomposition",
    x     = NULL
  ) +
  theme_minimal()

Classical Multiplicative Decomposition — US Retail Sales

STL Decomposition

STL (Seasonal and Trend decomposition using Loess) is more flexible than classical decomposition because it allows the seasonal component to change over time and is robust to outliers.

Show Code

dcmp_stl <- retail_ts |>
  model(STL(Sales ~ season(window = 13))) |>
  components()

dcmp_stl |>
  autoplot() +
  labs(
    title = "STL Decomposition",
    x     = NULL
  ) +
  theme_minimal()

Which Decomposition Worked Better?

Show Code

var_add  <- var(dcmp_add$random,  na.rm = TRUE)
var_mult <- var(dcmp_mult$random, na.rm = TRUE)

tibble(
  Decomposition      = c("Additive", "Multiplicative"),
  Remainder_Variance = c(round(var_add, 2), round(var_mult, 2))
) |>
  kbl(
    caption   = "Remainder Variance by Decomposition Type",
    col.names = c("Decomposition", "Remainder Variance")
  ) |>
  kable_styling(
    bootstrap_options = c("striped", "hover"),
    full_width        = FALSE
  )

Remainder Variance by Decomposition Type
Decomposition	Remainder Variance
Additive	124210354
Multiplicative	0

Multiplicative decomposition works better for this series. Two pieces of evidence support this conclusion.

First, the seasonal component plot from additive decomposition shows seasonal swings of roughly constant dollar magnitude across the full sample. But retail sales roughly doubled between 2010 and 2023, so the December holiday spike in 2023 was much larger in absolute dollar terms than in 2010. Multiplicative decomposition captures this correctly by expressing seasonal factors as ratios rather than levels.

Second, remainder variance is the key diagnostic. A good decomposition absorbs systematic trend and seasonal variation, leaving a remainder that looks like white noise with low variance. The model with lower remainder variance has done a better job separating signal from noise. The table above confirms which decomposition achieves this.

How Decomposition Results Can Be Used in Forecasting

Decomposition informs forecasting in two practical ways.

The first is seasonal adjustment. By removing the estimated seasonal component, the series becomes easier to forecast with simpler models. A Naive or Drift model applied to the seasonally adjusted series often outperforms the same model applied to the raw series because it is not confused by predictable seasonal swings. The seasonal pattern is then added back to produce the final forecast. This is the principle behind STL + ETS and STL + ARIMA, which consistently outperform benchmark models on seasonal series.

The second is structural understanding. The trend component reveals whether the series is growing, declining, or plateauing. The seasonal component shows which months are systematically high or low and whether that pattern is stable or evolving. A changing seasonal pattern — visible in STL but invisible in classical decomposition — signals that a time varying model is needed rather than a fixed seasonal model.

For US retail sales specifically, decomposition is worth using in forecasting. The series has strong, consistent seasonality and a clear upward trend, both of which decomposition captures reliably. The COVID-19 shock in 2020 appears in the remainder as a large outlier, which STL handles better than classical decomposition due to its robustness properties. An STL decomposition followed by an ETS or ARIMA model on the trend cycle component would likely outperform all four benchmark models evaluated in Part I.