Series: U.S. Total Vehicle Sales
Source: U.S. Bureau of Economic Analysis, Total Vehicle Sales [TOTALNSA], retrieved from FRED, Federal Reserve Bank of St. Louis; https://fred.stlouisfed.org/series/TOTALNSA, December 9, 2020.
library(fpp2)
library(ggplot2)
library(ggthemes)
library(tidyverse)
library(kableExtra)
library(vars)
# Load the TOTALNSA csv downloaded from FRED and convert to a monthly ts (Jan 1990 - Oct 2020)
TOTALNSA <- read.csv("F:/MSAE/2020fall_PredictiveAnalytics/TOTALNSA.csv")
VS <- ts(TOTALNSA[,2], start=c(1990,1), end=c(2020,10), frequency=12)
autoplot(VS, series = "Vehicle Sales") + ylab("Number of Units (Thousands)") + ggtitle("US: Total Vehicle Sales, NSA, Thousands of units")
A quick glance at the data shows a clear seasonal pattern. There is an upward trend through the early 2000s, a sharp drop in 2008-2009 during the Great Financial Crisis, and then a steady recovery until 2020, when vehicle sales declined because of the COVID-19 pandemic.
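To make the within-year pattern explicit, a seasonal plot can also be drawn; a minimal sketch using ggseasonplot() from the forecast package (loaded via fpp2 above), not part of the original output:
# Seasonal plot: one line per year, highlighting the recurring within-year pattern
ggseasonplot(VS, year.labels = TRUE, year.labels.left = TRUE) +
  ylab("Number of Units (Thousands)") + ggtitle("Seasonal Plot: US Total Vehicle Sales")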
The dataset will be split into a training set (January 1990 - December 2018) and a validation set (2019). Given the severe shock of COVID-19, 2020 is left out of the sample.
# Split into training (Jan 1990 - Dec 2018) and validation (2019) sets
VStrain <- window(VS, end=c(2018,12))
VStest <- window(VS, start=c(2019,1), end=c(2019,12))
nnet1 <- nnetar(VStrain , lambda = "auto", size = 1)
fcast1 <- forecast(nnet1, PI=TRUE, h=12)
autoplot(fcast1) + autolayer(VStest, series = "Actual") + autolayer(fcast1$fitted, series = "Fitted Values")
kable(round(accuracy(fcast1,VStest),2), caption = "NNAR(25,1,1)") %>%
kable_classic(full_width = F)
|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 | Theil’s U |
|---|---|---|---|---|---|---|---|---|
| Training set | 0.86 | 85.67 | 63.89 | -0.40 | 5.04 | 0.65 | -0.03 | NA |
| Test set | 7.63 | 75.46 | 54.06 | 0.17 | 3.71 | 0.55 | -0.37 | 0.37 |
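Before comparing network sizes, the in-sample residuals can also be inspected; a quick sketch using checkresiduals() from the forecast package (not part of the original output):
# Time plot, ACF and histogram of the NNAR residuals, plus a Ljung-Box test
checkresiduals(nnet1)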
nnet2 <- nnetar(VStrain , lambda = "auto", size = 2)
fcast2 <- forecast(nnet2, PI=TRUE, h=12)
autoplot(fcast2) + autolayer(VStest, series = "Actual") + autolayer(fcast2$fitted, series = "Fitted Values")
kable(round(accuracy(fcast2,VStest),2), caption = "NNAR(25,1,2)") %>%
kable_classic(full_width = F)
|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 | Theil’s U |
|---|---|---|---|---|---|---|---|---|
| Training set | 1.23 | 67.86 | 50.19 | -0.29 | 3.92 | 0.51 | -0.01 | NA |
| Test set | -2.43 | 69.29 | 48.19 | -0.52 | 3.31 | 0.49 | -0.36 | 0.33 |
nnet3 <- nnetar(VStrain , lambda = "auto", size = 3)
fcast3 <- forecast(nnet3, PI=TRUE, h=12)
autoplot(fcast3) + autolayer(VStest, series = "Actual") + autolayer(fcast3$fitted, series = "Fitted Values")
kable(round(accuracy(fcast3,VStest),2), caption = "NNAR(25,1,3)") %>%
kable_classic(full_width = F)
|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 | Theil’s U |
|---|---|---|---|---|---|---|---|---|
| Training set | 0.51 | 57.56 | 41.96 | -0.25 | 3.26 | 0.42 | -0.03 | NA |
| Test set | -8.00 | 68.51 | 46.71 | -0.86 | 3.27 | 0.47 | -0.45 | 0.33 |
nnet4 <- nnetar(VStrain , lambda = "auto", size = 4)
fcast4 <- forecast(nnet4, PI=TRUE, h=12)
autoplot(fcast4) + autolayer(VStest, series = "Actual") + autolayer(fcast4$fitted, series = "Fitted Values")
kable(round(accuracy(fcast4,VStest),2), caption = "NNAR(25,1,4)") %>%
kable_classic(full_width = F)
|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 | Theil’s U |
|---|---|---|---|---|---|---|---|---|
| Training set | 0.47 | 44.60 | 31.44 | -0.19 | 2.45 | 0.32 | -0.02 | NA |
| Test set | -19.32 | 69.98 | 47.39 | -1.63 | 3.34 | 0.48 | -0.27 | 0.33 |
While the forecasts are close to each other, the best-performing neural model is the size = 2 network: its test RMSE (69.29) is essentially on par with sizes 3 (68.51) and 4 (69.98), while the gap between its training and test RMSE is the smallest. Sizes 3 and 4 fit the training set noticeably better, but the widening train-test gap suggests overfitting.
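To see this at a glance, the training and test RMSE of the four networks can be tabulated together; a sketch (the object names nn_fcasts and rmse_tab are my own):
# Collect training and test RMSE for the four network sizes
nn_fcasts <- list(`size 1` = fcast1, `size 2` = fcast2, `size 3` = fcast3, `size 4` = fcast4)
rmse_tab <- t(sapply(nn_fcasts, function(f) accuracy(f, VStest)[, "RMSE"]))
colnames(rmse_tab) <- c("Train RMSE", "Test RMSE")
kable(round(rmse_tab, 2), caption = "NNAR training vs. test RMSE") %>%
  kable_classic(full_width = F)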
hwf <- hw(VStrain, seasonal = "additive", h=12)
autoplot(hwf, series = "HW Forecast") + autolayer(VStest, series = "Actual Observations") + autolayer(hwf$fitted, series = "Fitted Values") +
ggtitle("Holt-Winter Projections and Actual") + ylab("Value")
kable(round(accuracy(hwf,VStest),2), caption = "Holt-Winters (Additive)") %>%
kable_classic(full_width = F)
|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 | Theil’s U |
|---|---|---|---|---|---|---|---|---|
| Training set | -0.03 | 87.21 | 67.58 | -0.36 | 5.34 | 0.68 | 0.05 | NA |
| Test set | -51.81 | 82.25 | 65.32 | -3.83 | 4.66 | 0.66 | -0.59 | 0.37 |
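As a robustness check, a multiplicative seasonal specification could also be fitted and scored the same way; a quick sketch, not part of the original analysis (hwm is my own name):
# Holt-Winters with multiplicative seasonality, for comparison
hwm <- hw(VStrain, seasonal = "multiplicative", h = 12)
kable(round(accuracy(hwm, VStest), 2), caption = "Holt-Winters (Multiplicative)") %>%
  kable_classic(full_width = F)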
aam <- auto.arima(VStrain)
aamf <- forecast(aam, h=12)
autoplot(aamf, series = "ARIMA(2,0,1)(0,1,2)[12]") + autolayer(VStest, series = "Actual Observations") + autolayer(aamf$fitted, series = "Fitted Values") +
ggtitle("ARIMA(2,0,1)(0,1,2)[12] Model Projections and Actual") + ylab("Value")
kable(round(accuracy(aamf,VStest),2), caption = "ARIMA(2,0,1)(0,1,2)[12]") %>%
kable_classic(full_width = F)
|  | ME | RMSE | MAE | MPE | MAPE | MASE | ACF1 | Theil’s U |
|---|---|---|---|---|---|---|---|---|
| Training set | 3.08 | 82.83 | 60.77 | -0.11 | 4.78 | 0.62 | -0.01 | NA |
| Test set | -40.08 | 74.14 | 64.07 | -3.03 | 4.52 | 0.65 | -0.44 | 0.34 |
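The ARIMA residuals can be checked in the same way; a quick sketch:
# Ljung-Box test and residual plots for the auto.arima fit
checkresiduals(aam)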
On the 2019 validation set, the neural network forecasts perform slightly better than the ARIMA and Holt-Winters forecasts, with lower RMSE, MAE, and MAPE.
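As a final visual check, the 2019 point forecasts from the three approaches can be overlaid on the actual observations; a sketch:
# Overlay the 2019 point forecasts from the three models on the actuals
autoplot(VStest, series = "Actual") +
  autolayer(fcast2$mean, series = "NNAR (size = 2)") +
  autolayer(hwf$mean, series = "Holt-Winters") +
  autolayer(aamf$mean, series = "ARIMA") +
  ylab("Number of Units (Thousands)") + ggtitle("2019 Forecast Comparison")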