(a) Plot ALTSALVES
E %>%
filter(!is.na(ALTSALES)) %>%
ggplot(aes(x = observation_date, y = ALTSALES)) +
geom_line(color = "steelblue", linewidth = 1) +
labs(title = "Automobile and Light Truck Sales",
x = "Observation Date",
y = "Sales (Thousands of Units)") +
theme_minimal()

(b) Convert all the data into quarterly observations by using the
average of the 3 months of data for ALTSALES, DSPIC96 and UNRATE.
# Convert to Date
E$observation_date <- as.Date(E$observation_date)
# Create a 'quarter' column
E <- E %>%
mutate(quarter = paste0(year(observation_date), " Q", quarter(observation_date)))
# Group by quarter and calculate averages
E_quarterly <- E %>%
group_by(quarter) %>%
summarise(
ALTSALES = mean(ALTSALES, na.rm = TRUE),
DSPIC96 = mean(DSPIC96, na.rm = TRUE),
UNRATE = mean(UNRATE, na.rm = TRUE),
TERMCBPER24NS = mean(TERMCBPER24NS, na.rm = TRUE)
) %>%
ungroup()
(c) Create a trend variable in your work file that starts with 0 in
the first quarter of 1976 and increases by one unit.
# Creating the trend variable
E_quarterly$trend <- 0:(nrow(E_quarterly) - 1)
(d) Create a one-quarter and two-quarter lag of each of the
independent variables we use (DSPIC96, TERMCBPER24NS, and UNRATE.)
# Ensure your data is ordered by 'quarter' and lag the variables within E_quarterly
E_quarterly <- E_quarterly %>%
arrange(quarter) %>%
mutate(
# Create one-quarter and two-quarter lags of the variables
DSPIC96_lag1 = lag(DSPIC96, 1),
DSPIC96_lag2 = lag(DSPIC96, 2),
TERMCBPER24NS_lag1 = lag(TERMCBPER24NS, 1),
TERMCBPER24NS_lag2 = lag(TERMCBPER24NS, 2),
UNRATE_lag1 = lag(UNRATE, 1),
UNRATE_lag2 = lag(UNRATE, 2)
)
(e) Choosing the preferred predictor selection criterion, identify
the best model available using a training period for your model of
1976q1 to 2019q4. Explain your choice and describe how the model
fits.
# Subset training data
train_data <- E_quarterly %>%
filter(quarter >= "1976Q1" & quarter <= "2019Q4")
# Example model candidates
model1 <- lm(ALTSALES ~ DSPIC96_lag1 + UNRATE_lag1 + TERMCBPER24NS_lag1, data = train_data)
model2 <- lm(ALTSALES ~ DSPIC96_lag1 + DSPIC96_lag2 + UNRATE_lag1 + UNRATE_lag2 + TERMCBPER24NS_lag1 + TERMCBPER24NS_lag2, data = train_data)
model3 <- lm(ALTSALES ~ DSPIC96_lag1 + UNRATE_lag1, data = train_data)
# Compare using BIC
BIC(model1, model2, model3)
## df BIC
## model1 5 627.9405
## model2 8 587.5154
## model3 4 633.2665
Model 2 includes two lags of each predictor and has the best
trade-off between fit and complexity according to BIC. Although it’s
more complex, the improvement in fit justifies the additional
predictors. Therefore, Model 2 is selected as the best-performing
model.
(f) Predict the period of 2020q1 through 2023q4. Obviously COVID
mattered! Describe how well your model behaved.
# Define test period data
test_data <- E_quarterly %>%
filter(quarter >= "2020Q1" & quarter <= "2023Q4")
# Predict using the preferred model (model2, lowest BIC)
test_data$predicted_ALTSALES <- predict(model2, newdata = test_data)
# Convert 'quarter' from "YYYYQX" to Date for plotting (optional)
test_data <- test_data %>%
mutate(quarter_date = as.Date(as.yearqtr(quarter, format = "%YQ%q")))
rmse_value <- rmse(test_data$ALTSALES, test_data$predicted_ALTSALES)
rmse_value
## [1] 3.558335
Despite the disruption of COVID-19, Model2 held up reasonably well
in predicting auto and light truck sales. While the pandemic introduced
volatility the model could not fully anticipate, the RMSE was modest,
suggesting that the underlying structure still captured meaningful
relationships. Overall, the model maintained predictive value even
during an unusually turbulent period.
(g) Update your forecasting model using data through 2023. Generate
a forecast for 2024 and describe its fit.
# Filter data through 2023
train_updated <- E_quarterly %>%
filter(quarter <= "2023Q4")
# Re-estimate model2 with updated training data
model2_updated <- lm(ALTSALES ~ UNRATE + UNRATE_lag1 + UNRATE_lag2 +
DSPIC96 + DSPIC96_lag1 + DSPIC96_lag2,
data = train_updated)
# Create 2024 forecast input data
forecast_2024 <- E_quarterly %>%
filter(quarter >= "2024 Q1" & quarter <= "2024 Q4")
# Predict 2024 values using the updated model
forecast_2024$predicted_ALTSALES <- predict(model2_updated, newdata = forecast_2024)
# Recalculate RMSE for 2024 forecast
rmse_2024 <- rmse(forecast_2024$ALTSALES, forecast_2024$predicted_ALTSALES)
rmse_2024
## [1] 1.306964
The updated model continues to perform well using data through 2023.
The RMSE of 1.31 indicates a strong fit, and the predictions for 2024
closely track the actual ALTSALES values. While some quarter-to-quarter
variation exists, the model captures the overall level and trend
accurately, reinforcing its reliability for short-term forecasting.
(h) Based on the Wall Street Journal Economic Forecasting Survey,
you have forecasts of unemployment at 4.25% through June 2025, real
disposable income growth of 2.1% for the first half of 2025 and that the
rate of interest on a 24 month personal loan at commercial banks will
average 12% through June. Use your preferred model to forecast auto and
light truck vehicle sales for the first half of 2025. Place a 95%
confidence interval around your forecast.
# Build forecast data
forecast_2025 <- data.frame(
quarter = c("2025Q1", "2025Q2"),
UNRATE = c(4.25, 4.25),
DSPIC96 = c(2.1, 2.1),
UNRATE_lag1 = tail(E_quarterly$UNRATE, 1), # Use most recent actual lags
UNRATE_lag2 = tail(E_quarterly$UNRATE, 2)[1],
DSPIC96_lag1 = tail(E_quarterly$DSPIC96, 1),
DSPIC96_lag2 = tail(E_quarterly$DSPIC96, 2)[1],
TERMCBPER24NS_lag1 = tail(E_quarterly$TERMCBPER24NS, 1),
TERMCBPER24NS_lag2 = tail(E_quarterly$TERMCBPER24NS, 2)[1]
)
# Predict with 95% confidence intervals
forecast_with_ci <- predict(model2, newdata = forecast_2025, interval = "confidence", level = 0.95)
# Combine predictions with forecast_2025
forecast_2025_results <- cbind(forecast_2025, forecast_with_ci)
# Print forecast results
print(forecast_2025_results)
## quarter UNRATE DSPIC96 UNRATE_lag1 UNRATE_lag2 DSPIC96_lag1 DSPIC96_lag2
## 1 2025Q1 4.25 2.1 4.133333 4.166667 17671.15 17544.87
## 2 2025Q2 4.25 2.1 4.133333 4.166667 17671.15 17544.87
## TERMCBPER24NS_lag1 TERMCBPER24NS_lag2 fit lwr upr
## 1 12.32 12.33 17.46802 16.60269 18.33334
## 2 12.32 12.33 17.46802 16.60269 18.33334
Based on forecasts of macroeconomic indicators from the Wall Street
Journal Economic Forecasting Survey, the model predicts stable auto and
light truck vehicle sales for the first half of 2025, with a point
estimate of approximately 17.47 million units per quarter. The 95%
confidence interval ranges from 16.60 to 18.33 million units, reflecting
moderate uncertainty. Given that the model includes key economic drivers
and was selected based on its strong fit and low BIC, this forecast is
reasonable and consistent with recent trends.