Forecasting U.S. Vehicle Sales

Author

John M Guarini

Abstract

Accurate vehicle sales forecasts are vital for manufacturers and policymakers who need to navigate a fluctuating economic landscape. This article evaluates classical time series forecasting techniques on monthly U.S. total vehicle sales data (FRED: TOTALSA) from 2010 onwards. The data show significant seasonality, a structural break caused by the COVID-19 pandemic, and a complex recovery path. An 80-20 train-test split is employed for performance evaluation. Four models are compared: the Naïve method, the Seasonal Naïve method (SNAIVE), Exponential Smoothing (ETS), and a time series linear regression model with a spline-based nonlinear trend component. Performance is evaluated using the RMSE, MAE, and MAPE metrics. The seasonal persistence method performs strongly, and the addition of a nonlinear trend component improves the performance of the linear regression model. An equally weighted ensemble combining all four methods yields the minimum forecast error on all three metrics.

Introduction

The automotive industry is widely regarded as an indicator of overall macroeconomic activity because it is linked to consumer confidence, industrial production, and household spending. Monthly vehicle sales reflect the business cycle as well as the effects of the COVID-19 pandemic and global semiconductor supply chain disruptions. Forecasting vehicle sales is therefore important for the planning of the relevant stakeholders. In this study, the author examines monthly total U.S. vehicle sales, available from FRED as series TOTALSA, starting in 2010, and tests the performance of several classical time series forecasting techniques. The Naïve and Seasonal Naïve techniques serve as baseline models; they capture the persistence of the data and the repetition of seasonal patterns, respectively. The author also includes the Exponential Smoothing State Space (ETS) technique, implemented with automatic model selection. To capture nonlinear patterns that may stem from structural changes, the author includes a time series linear regression model with a spline-based nonlinear trend. The models are evaluated with an 80/20 train-test split and the commonly used Root Mean Squared Error, Mean Absolute Error, and Mean Absolute Percentage Error metrics. For comparison, an equal-weighted ensemble is also evaluated, as recommended in the applied forecasting literature (@hyndman2021). By combining classical benchmarks, a nonlinear trend model, and an ensemble, this study offers insight into how structural instability affects model performance on macroeconomic data.

Literature Review

Exponential smoothing algorithms have traditionally been at the heart of practical time series forecasting. @gardner1985 offers a foundational analysis of simple and Holt’s exponential smoothing algorithms, showing their usefulness for level and trend modeling. @hyndman2002 extended this approach by introducing the state-space formulation of ETS models, enabling formal model selection based on information criteria. Large-scale forecasting competitions also illustrate the real-world performance of traditional approaches. The M3 Competition (@makridakis2000) and subsequent M-competitions (@makridakis2018) have shown that simpler models tend to perform well in comparison to more complex ones. Hyndman and Athanasopoulos (@hyndman2021) strongly emphasize the role of out-of-sample evaluation and hold-out validation in model comparison. Recent studies on ensemble forecasting (@wang2023) illustrate the usefulness of combining forecasts to improve robustness. Taken together, these studies indicate that simplicity, rigorous validation, and empirical evaluation are essential in practical model selection.

Data Description

The data set includes the total number of vehicle sales in the U.S. every month (FRED series TOTALSA) measured in millions of units. The data sample range is from January 2010 to the latest available data point. The data set is highly seasonal, with a clear structural break in 2020 due to the COVID-19 pandemic, followed by changes in volatility patterns due to supply chain disruptions.

Methodology

The data set was split using an 80-20 percent train-test split, with data up to December 2021 used for model estimation and the remaining data used for out-of-sample forecasting.

Four benchmark forecasting models were estimated:

1.) Naïve (NAIVE) – The forecast is simply the last observed value.
2.) Seasonal Naïve (SNAIVE) – The forecast is simply the value observed in the same month of the previous year.
3.) Exponential Smoothing State Space Model (ETS) – The specification of the model was selected using the AICc criterion.
4.) Time Series Linear Regression with Nonlinear Trend (TSLM_SPLINE) – The model is a linear regression that controls for seasonal dummy variables and a nonlinear trend using cubic regression splines.
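The benchmark definitions above can be written compactly in standard textbook notation. The symbols are introduced here for illustration and are not taken from the article's code: y_T is the last training observation, h the forecast horizon, m = 12 the seasonal period, b_j(t) the spline basis functions (six of them, matching df = 6 in the model call), and d_{s,t} the monthly dummy variables.

```latex
\begin{align*}
% Naive: every horizon repeats the last training observation
\hat{y}_{T+h \mid T} &= y_T \\
% Seasonal naive: repeat the same month of the last observed year;
% k counts the complete years already covered by the forecast
\hat{y}_{T+h \mid T} &= y_{T+h-m(k+1)}, \qquad k = \left\lfloor \frac{h-1}{m} \right\rfloor \\
% Regression with a spline trend and monthly seasonal dummies
y_t &= \beta_0 + \sum_{j=1}^{6} \beta_j \, b_j(t) + \sum_{s=2}^{12} \gamma_s \, d_{s,t} + \varepsilon_t
\end{align*}
```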

A spline is a type of function that can exhibit a wide degree of curvature, making it an appropriate method to pick up the impact of the COVID-19 collapse and subsequent recovery period.
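As a rough illustration of this flexibility, the sketch below fits a straight-line trend and a natural cubic spline trend to a synthetic series with a sharp dip and recovery. The series `toy` and all of its parameters are made up for this example and are not the TOTALSA data; only `splines::ns()` with `df = 6` mirrors the article's model.

```r
library(splines)

set.seed(1)
t_index <- 1:144   # 144 monthly time points, an assumed training length

# Synthetic series (not TOTALSA): a flat level with a dip and recovery plus noise
toy <- 17 - 6 * exp(-((t_index - 124)^2) / 200) + rnorm(144, sd = 0.3)

# Natural cubic spline basis with 6 degrees of freedom: one column per basis function
basis <- ns(t_index, df = 6)

# Compare a straight-line trend with the spline trend
fit_linear <- lm(toy ~ t_index)
fit_spline <- lm(toy ~ ns(t_index, df = 6))

dim(basis)
c(linear_rss = deviance(fit_linear), spline_rss = deviance(fit_spline))
```

Because a natural cubic spline basis contains the straight line as a special case, the spline fit can never have a larger residual sum of squares than the linear fit on the same data; how much smaller it is depends on how much curvature the series contains.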

To make the forecast more robust, an ensemble forecast was produced by averaging the point forecasts of the four models above using equal weights. This method of forecast combination is well supported in the literature as a means of improving forecast stability (@wang2023).
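The equal-weight combination amounts to a row-wise mean of the individual point forecasts. The following minimal sketch uses made-up forecast values for three horizons; the numbers are hypothetical and do not come from the fitted models.

```r
# Hypothetical point forecasts (millions of units) for three forecast horizons
fc <- cbind(
  NAIVE       = c(15.0, 15.0, 15.0),
  SNAIVE      = c(14.2, 15.8, 16.1),
  ETS         = c(15.1, 15.1, 15.2),
  TSLM_SPLINE = c(15.9, 16.0, 16.2)
)

# Equal weights: each of the four models contributes 1/4
weights <- rep(1 / ncol(fc), ncol(fc))

# Ensemble point forecast for each horizon
ensemble <- as.numeric(fc %*% weights)
ensemble
```

Writing the weights explicitly makes it easy to try other weighting schemes later, although the article itself uses equal weights only.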

The model’s performance was assessed on the test set using the following criteria:

  • Root Mean Squared Error (RMSE)
  • Mean Absolute Error (MAE)
  • Mean Absolute Percentage Error (MAPE)

These criteria measure different aspects of the magnitude and relative error of the forecast.
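For reference, the three criteria have the following standard definitions. The notation is introduced here for illustration: y_t is the actual value, \hat{y}_t the forecast, and n the number of test observations; MAPE is expressed in percent.

```latex
\begin{align*}
e_t &= y_t - \hat{y}_t \\
\mathrm{RMSE} &= \sqrt{\frac{1}{n} \sum_{t=1}^{n} e_t^{2}} \\
\mathrm{MAE} &= \frac{1}{n} \sum_{t=1}^{n} \lvert e_t \rvert \\
\mathrm{MAPE} &= \frac{100}{n} \sum_{t=1}^{n} \left\lvert \frac{e_t}{y_t} \right\rvert
\end{align*}
```

RMSE penalizes large errors more heavily than MAE because the errors are squared before averaging, while MAPE is scale-free and therefore easier to compare across series.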

Data Preparation

# Packages used throughout the analysis
library(tidyverse)
library(tidyquant)
library(fpp3)
library(splines)
# Download total vehicle sales from FRED; tq_get() stores the values in a column named `price`
us_cars_raw <- tq_get("TOTALSA", get = "economic.data")
theme_set(theme_minimal() + theme(legend.position = "bottom"))

# Convert to a monthly tsibble and keep observations from January 2010 onwards
us_cars <- us_cars_raw |>
  rename(sales = price) |>
  mutate(date = yearmonth(date)) |>
  as_tsibble(index = date) |>
  filter(date >= yearmonth("2010 Jan"))

# Registrations series, prepared the same way
regs_raw <- tq_get("USASLRTCR03MLSAM", get = "economic.data")
regs_ts <- regs_raw |>
  transmute(date = yearmonth(date), registrations = price) |>
  as_tsibble(index = date) |>
  filter(date >= yearmonth("2010 Jan"))

# Price index series, prepared the same way
price_raw <- tq_get("CUSR0000SETA02", get = "economic.data")
price_ts <- price_raw |>
  transmute(date = yearmonth(date), price_index = price) |>
  as_tsibble(index = date) |>
  filter(date >= yearmonth("2010 Jan"))

# Join the three series by month
cars_all <- us_cars |>
  left_join(regs_ts, by = "date") |>
  left_join(price_ts, by = "date")

Plot

Plot (A): Sales vs Registration

cars_all |>
  select(date, sales, registrations) |>
  pivot_longer(-date, names_to = "series", values_to = "value") |>
  group_by(series) |>
  mutate(index_2010 = 100 * value / first(value)) |>
  ungroup() |>
  ggplot(aes(x = date, y = index_2010, color = series)) +
  geom_line(linewidth = 1) +
  labs(
    title = "Sales vs Registrations (Indexed to 2010 = 100)",
    x = "Date",
    y = "Index (2010 = 100)",
    color = ""
  ) +
  theme(legend.position = "bottom")

Plot (B): Average Price (Index)

autoplot(cars_all, price_index) +
  labs(
    title = "Used Car & Truck Price Index (CPI)",
    subtitle = "Proxy for average vehicle price level",
    x = "Date",
    y = "Index (1982–84 = 100)"
  ) +
  theme(legend.position = "bottom")

Train/Test (80%/20%)

train <- us_cars |>
  filter(date <= yearmonth("2021 Dec"))

test <- us_cars |>
  filter(date > yearmonth("2021 Dec"))

h <- nrow(test)

Model Fitting

models <- train |>
  model(
    NAIVE = NAIVE(sales),
    SNAIVE = SNAIVE(sales),
    ETS = ETS(sales),
    TSLM_SPLINE = TSLM(sales ~ ns(trend(), df = 6) + season())
  )

forecasts <- models |> forecast(h = h)

Forecasting and Accuracy

Forecast Comparison (All)

forecasts <- models |>
  forecast(new_data = test)

accuracy(forecasts, test)
# Create equal-weight ensemble from selected models
ensemble_fc <- forecasts |>
  filter(.model %in% c("NAIVE","SNAIVE","ETS","TSLM_SPLINE")) |>
  as_tibble() |>
  group_by(date) |>
  summarise(.mean = mean(.mean), .groups = "drop") |>
  mutate(.model = "ENSEMBLE") |>
  as_tsibble(key = .model, index = date)

# Combine original forecasts with ensemble
forecasts_all <- bind_rows(
  forecasts,
  ensemble_fc
)

(Figure 1)

Forecast Comparison (Selected Models)

idx <- tsibble::index_var(test)

ensemble_fc <- forecasts |>
  as_tibble() |>
  group_by(.data[[idx]]) |>
  summarise(.mean = mean(.mean), .groups = "drop") |>
  mutate(.model = "ENSEMBLE") |>
  as_tsibble(key = .model, index = !!sym(idx))

forecasts_all <- bind_rows(forecasts, ensemble_fc)
# ensemble point forecast: average of ALL models' point forecasts
ens_plot <- forecasts |>
  as_tibble() |>
  group_by(date) |>
  summarise(ens_mean = mean(.mean), .groups = "drop")

Visual Comparison

# Forecasts (all models including spline)
fc_plot <- forecasts |>
  as_tibble() |>
  select(date, .model, .mean) |>
  rename(value = .mean)

# Ensemble
ens_plot2 <- ens_plot |>
  rename(value = ens_mean) |>
  mutate(.model = "ENSEMBLE")

# Actual test values
actual_plot <- test |>
  as_tibble() |>
  select(date, sales) |>
  rename(value = sales) |>
  mutate(.model = "Actual")

# Combine everything
plot_data <- bind_rows(fc_plot, ens_plot2, actual_plot)
autoplot(train, sales, alpha = 0.6) +
  autolayer(forecasts, level = NULL, alpha = 0.8) +
  geom_line(
    data = ens_plot,
    aes(x = date, y = ens_mean, color = "ENSEMBLE"),
    linewidth = 1
  ) +
  geom_line(
    data = test,
    aes(x = date, y = sales, color = "Actual"),
    linewidth = 1
  ) +
  labs(
    title = "Forecast Comparison (Test Period)",
    y = "Millions of Units",
    x = "Date",
    color = ""
  ) +
  scale_color_manual(values = c(
    "Actual" = "black",
    "ENSEMBLE" = "green",
    "ETS" = "blue",
    "NAIVE" = "violet",
    "SNAIVE" = "red",
    "TSLM_SPLINE" = "brown"
  )) +
  theme(legend.position = "bottom")

autoplot(train, sales, alpha = 0.5) +
  autolayer(forecasts, level = NULL, alpha = 0.4) +
  geom_line(
    data = ens_plot,
    aes(x = date, y = ens_mean, color = "ENSEMBLE"),
    linewidth = 1.6
  ) +
  geom_line(
    data = test,
    aes(x = date, y = sales, color = "Actual"),
    linewidth = 1.4
  ) +
  labs(
    title = "Forecast Comparison (Test Set)",
    y = "Millions of Units",
    x = "Date",
    color = ""
  ) +
  scale_color_manual(values = c(
    "Actual" = "black",
    "ENSEMBLE" = "darkgreen",
    "ETS" = "#1f77b4",
    "NAIVE" = "#ff7f0e",
    "SNAIVE" = "#9467bd",
    "TSLM_SPLINE" = "#e377c2"
  )) +
  theme(legend.position = "bottom")

(Figure 2)

Interpretation

Inspection of the test period forecasts reveals significant differences between the models’ assumptions. The Naïve model produces a flat line because it only extends the most recent observation into the future, ignoring the patterns that recur season after season. The Seasonal Naïve (SNAIVE) model extends the prior year’s monthly values, which works well under strong seasonality but may be less effective when trend dynamics change significantly. The ETS model produces a smooth path; however, on this problem it behaves much like the Naïve model, implying that the automatically selected ETS specification placed little weight on deterministic trend and seasonal components during the test period.

On the other hand, the TSLM_SPLINE model outperforms the other individual models because it allows for a nonlinear trend, which is necessary to capture the nonlinear recovery after the pandemic. The Ensemble forecast tracks the actual test data closely by combining the strengths of the individual models, making it the most stable and accurate forecast path.

Accuracy Table

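acc_models <- forecasts |>
  accuracy(test, measures = list(RMSE = RMSE, MAE = MAE, MAPE = MAPE)) |>
  select(.model, RMSE, MAE, MAPE) |>
  mutate(.model = as.character(.model))

# ens_metrics is not defined elsewhere in this article; it is assumed here to be
# computed from ens_plot (the equal-weight mean forecast) and the test values
ens_metrics <- ens_plot |>
  left_join(as_tibble(test) |> select(date, sales), by = "date") |>
  summarise(
    RMSE = sqrt(mean((sales - ens_mean)^2)),
    MAE  = mean(abs(sales - ens_mean)),
    MAPE = mean(abs((sales - ens_mean) / sales)) * 100
  )

acc_ens <- ens_metrics |>
  mutate(.model = "ENSEMBLE") |>
  select(.model, RMSE, MAE, MAPE)

acc_final <- bind_rows(acc_models, acc_ens) |>
  arrange(RMSE)

acc_final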

Although the Seasonal Naïve model initially outperformed the traditional ETS and linear regression models, adding a nonlinear trend through a spline function improved the predictive performance of the TSLM model. The Ensemble model, which averages the forecasts of all models with equal weights, has the lowest error across the RMSE, MAE, and MAPE values. This is consistent with the advantage of model combination.

Results

From the initial evaluation, the Seasonal Naïve (SNAIVE) model performed better than the traditional ETS and linear regression models. This suggests that high seasonal persistence is still an important feature of monthly vehicle sales. However, the inclusion of a nonlinear spline trend in the regression model noticeably improved predictive performance: the TSLM model with the spline trend lowered the forecast errors compared to the traditional regression and ETS models. This suggests that the vehicle sales data contain nonlinear structural dynamics, especially during the recovery phase after the COVID-19 disruption. The Ensemble model had the minimum forecast errors for all the evaluation criteria: RMSE, MAE, and MAPE. The ensemble forecast is derived by averaging the forecasts from the Naïve, Seasonal Naïve, ETS, and regression models. Combining the forecasts averages out the disagreement between the models, resulting in a more stable forecast path (@wang2023). The residual diagnostics suggest that although forecast accuracy has improved, some temporal dependence remains in the residuals, likely due to the structural volatility in the vehicle sales series during the pandemic shock and the recovery period.

Residual Diagnostics

models |> select(SNAIVE) |> gg_tsresiduals()

models |> select(TSLM_SPLINE) |> gg_tsresiduals()

(Figure 6)

The residual diagnostics indicate non-zero autocorrelation in the residuals of both models. This implies that some dependency in the series remains unexplained, especially during times of structural instability. Thus, although the SNAIVE model has good predictive capabilities, there is still temporal dependency that it has not captured. This could potentially be modeled using ARIMA or combined models in the future.
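A formal check of residual autocorrelation can complement the visual diagnostics. The sketch below applies the Ljung-Box test from base R to two simulated residual series; both series are synthetic stand-ins rather than residuals from the fitted models, and the lag of 24 (two seasonal periods of monthly data) is an assumed choice.

```r
set.seed(42)

# Simulated stand-ins for model residuals (not the article's actual residuals)
resid_ar <- as.numeric(arima.sim(model = list(ar = 0.6), n = 144))  # autocorrelated
resid_wn <- rnorm(144)                                              # white noise

# Ljung-Box test: the null hypothesis is no autocorrelation up to the chosen lag
lb_ar <- Box.test(resid_ar, lag = 24, type = "Ljung-Box")
lb_wn <- Box.test(resid_wn, lag = 24, type = "Ljung-Box")

c(autocorrelated = lb_ar$p.value, white_noise = lb_wn$p.value)
```

A small p-value is evidence against white-noise residuals, which would correspond to the leftover temporal dependence described above.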

Discussion

The results underscore the significance of seasonality and nonlinear structural dynamics in U.S. vehicle sales data. Simple models based on seasonal persistence have been successful, reflecting the pronounced seasonality in the data. However, adding flexibility through spline regression improves the model's ability to capture nonlinear structural changes, especially during the collapse and subsequent recovery caused by the COVID-19 pandemic. The ensemble model performs better still, which is a testament to the advantages of model combination (@wang2023). Each model captures distinct features of the underlying data structure, including seasonality, smoothing, and nonlinear trend changes, and combining them improves the stability of the forecast path. Structural changes such as the COVID-19 pandemic introduce instability that is hard to capture with linear models. The additional flexibility provided by spline regression is therefore important, and it could be extended further by adding regressors to the model.

Conclusion

This study assessed the performance of several classical time series forecasting techniques for U.S. vehicle sales using monthly data since 2010. It compared baseline models, namely the Naïve, Seasonal Naïve, and Exponential Smoothing models, with a regression model that uses a spline to represent a nonlinear trend. The results show that seasonal persistence is a major factor in the vehicle sales series, as indicated by the good performance of the Seasonal Naïve method. Including a spline-based nonlinear trend improved the regression model, particularly in capturing the impact of the COVID-19 pandemic and the subsequent recovery. Finally, the ensemble approach, which combines the forecasts of all the models, produced the lowest forecast error of all the models considered, which is consistent with the findings of the forecast combination literature.

References