(a) Plot ALTSALVES

E %>%
  filter(!is.na(ALTSALES)) %>%
  ggplot(aes(x = observation_date, y = ALTSALES)) +
  geom_line(color = "steelblue", linewidth = 1) +
  labs(title = "Automobile and Light Truck Sales",
       x = "Observation Date",
       y = "Sales (Thousands of Units)") +
  theme_minimal()

(b) Convert all the data into quarterly observations by using the average of the 3 months of data for ALTSALES, DSPIC96 and UNRATE.

# Convert to Date
E$observation_date <- as.Date(E$observation_date)

# Create a 'quarter' column
E <- E %>%
  mutate(quarter = paste0(year(observation_date), " Q", quarter(observation_date)))

# Group by quarter and calculate averages
E_quarterly <- E %>%
  group_by(quarter) %>%
  summarise(
    ALTSALES = mean(ALTSALES, na.rm = TRUE),
    DSPIC96 = mean(DSPIC96, na.rm = TRUE),
    UNRATE = mean(UNRATE, na.rm = TRUE),
    TERMCBPER24NS = mean(TERMCBPER24NS, na.rm = TRUE)
  ) %>%
  ungroup()

(c) Create a trend variable in your work file that starts with 0 in the first quarter of 1976 and increases by one unit.

# Creating the trend variable
E_quarterly$trend <- 0:(nrow(E_quarterly) - 1)

(d) Create a one-quarter and two-quarter lag of each of the independent variables we use (DSPIC96, TERMCBPER24NS, and UNRATE.)

# Ensure your data is ordered by 'quarter' and lag the variables within E_quarterly
E_quarterly <- E_quarterly %>%
  arrange(quarter) %>%
  mutate(
    # Create one-quarter and two-quarter lags of the variables
    DSPIC96_lag1 = lag(DSPIC96, 1),
    DSPIC96_lag2 = lag(DSPIC96, 2),
    
    TERMCBPER24NS_lag1 = lag(TERMCBPER24NS, 1),
    TERMCBPER24NS_lag2 = lag(TERMCBPER24NS, 2),
    
    UNRATE_lag1 = lag(UNRATE, 1),
    UNRATE_lag2 = lag(UNRATE, 2)
  )

(e) Choosing the preferred predictor selection criterion, identify the best model available using a training period for your model of 1976q1 to 2019q4. Explain your choice and describe how the model fits.

# Subset training data
train_data <- E_quarterly %>%
  filter(quarter >= "1976Q1" & quarter <= "2019Q4")

# Example model candidates
model1 <- lm(ALTSALES ~ DSPIC96_lag1 + UNRATE_lag1 + TERMCBPER24NS_lag1, data = train_data)
model2 <- lm(ALTSALES ~ DSPIC96_lag1 + DSPIC96_lag2 + UNRATE_lag1 + UNRATE_lag2 + TERMCBPER24NS_lag1 + TERMCBPER24NS_lag2, data = train_data)
model3 <- lm(ALTSALES ~ DSPIC96_lag1 + UNRATE_lag1, data = train_data)

# Compare using BIC
BIC(model1, model2, model3)

##        df      BIC
## model1  5 627.9405
## model2  8 587.5154
## model3  4 633.2665

Model 2 includes two lags of each predictor and has the best trade-off between fit and complexity according to BIC. Although it’s more complex, the improvement in fit justifies the additional predictors. Therefore, Model 2 is selected as the best-performing model.

(f) Predict the period of 2020q1 through 2023q4. Obviously COVID mattered! Describe how well your model behaved.

# Define test period data
test_data <- E_quarterly %>%
  filter(quarter >= "2020Q1" & quarter <= "2023Q4")

# Predict using the preferred model (model2, lowest BIC)
test_data$predicted_ALTSALES <- predict(model2, newdata = test_data)

# Convert 'quarter' from "YYYYQX" to Date for plotting (optional)
test_data <- test_data %>%
  mutate(quarter_date = as.Date(as.yearqtr(quarter, format = "%YQ%q")))

rmse_value <- rmse(test_data$ALTSALES, test_data$predicted_ALTSALES)
rmse_value

## [1] 3.558335

Despite the disruption of COVID-19, Model2 held up reasonably well in predicting auto and light truck sales. While the pandemic introduced volatility the model could not fully anticipate, the RMSE was modest, suggesting that the underlying structure still captured meaningful relationships. Overall, the model maintained predictive value even during an unusually turbulent period.

(g) Update your forecasting model using data through 2023. Generate a forecast for 2024 and describe its fit.

# Filter data through 2023
train_updated <- E_quarterly %>%
  filter(quarter <= "2023Q4")

# Re-estimate model2 with updated training data
model2_updated <- lm(ALTSALES ~ UNRATE + UNRATE_lag1 + UNRATE_lag2 +
                       DSPIC96 + DSPIC96_lag1 + DSPIC96_lag2,
                     data = train_updated)

# Create 2024 forecast input data
forecast_2024 <- E_quarterly %>%
  filter(quarter >= "2024 Q1" & quarter <= "2024 Q4")

# Predict 2024 values using the updated model
forecast_2024$predicted_ALTSALES <- predict(model2_updated, newdata = forecast_2024)

# Recalculate RMSE for 2024 forecast
rmse_2024 <- rmse(forecast_2024$ALTSALES, forecast_2024$predicted_ALTSALES)
rmse_2024

## [1] 1.306964

The updated model continues to perform well using data through 2023. The RMSE of 1.31 indicates a strong fit, and the predictions for 2024 closely track the actual ALTSALES values. While some quarter-to-quarter variation exists, the model captures the overall level and trend accurately, reinforcing its reliability for short-term forecasting.

(h) Based on the Wall Street Journal Economic Forecasting Survey, you have forecasts of unemployment at 4.25% through June 2025, real disposable income growth of 2.1% for the first half of 2025 and that the rate of interest on a 24 month personal loan at commercial banks will average 12% through June. Use your preferred model to forecast auto and light truck vehicle sales for the first half of 2025. Place a 95% confidence interval around your forecast.

# Build forecast data
forecast_2025 <- data.frame(
  quarter = c("2025Q1", "2025Q2"),
  UNRATE = c(4.25, 4.25),
  DSPIC96 = c(2.1, 2.1),
  UNRATE_lag1 = tail(E_quarterly$UNRATE, 1),  # Use most recent actual lags
  UNRATE_lag2 = tail(E_quarterly$UNRATE, 2)[1],
  DSPIC96_lag1 = tail(E_quarterly$DSPIC96, 1),
  DSPIC96_lag2 = tail(E_quarterly$DSPIC96, 2)[1],
  TERMCBPER24NS_lag1 = tail(E_quarterly$TERMCBPER24NS, 1),
  TERMCBPER24NS_lag2 = tail(E_quarterly$TERMCBPER24NS, 2)[1]
)

# Predict with 95% confidence intervals
forecast_with_ci <- predict(model2, newdata = forecast_2025, interval = "confidence", level = 0.95)

# Combine predictions with forecast_2025
forecast_2025_results <- cbind(forecast_2025, forecast_with_ci)

# Print forecast results
print(forecast_2025_results)

##   quarter UNRATE DSPIC96 UNRATE_lag1 UNRATE_lag2 DSPIC96_lag1 DSPIC96_lag2
## 1  2025Q1   4.25     2.1    4.133333    4.166667     17671.15     17544.87
## 2  2025Q2   4.25     2.1    4.133333    4.166667     17671.15     17544.87
##   TERMCBPER24NS_lag1 TERMCBPER24NS_lag2      fit      lwr      upr
## 1              12.32              12.33 17.46802 16.60269 18.33334
## 2              12.32              12.33 17.46802 16.60269 18.33334

Based on forecasts of macroeconomic indicators from the Wall Street Journal Economic Forecasting Survey, the model predicts stable auto and light truck vehicle sales for the first half of 2025, with a point estimate of approximately 17.47 million units per quarter. The 95% confidence interval ranges from 16.60 to 18.33 million units, reflecting moderate uncertainty. Given that the model includes key economic drivers and was selected based on its strong fit and low BIC, this forecast is reasonable and consistent with recent trends.

Assignment 3 - ECON 670

Marshall Larson

2025-04-21

(a) Plot ALTSALVES

(b) Convert all the data into quarterly observations by using the average of the 3 months of data for ALTSALES, DSPIC96 and UNRATE.

(c) Create a trend variable in your work file that starts with 0 in the first quarter of 1976 and increases by one unit.

(d) Create a one-quarter and two-quarter lag of each of the independent variables we use (DSPIC96, TERMCBPER24NS, and UNRATE.)

(e) Choosing the preferred predictor selection criterion, identify the best model available using a training period for your model of 1976q1 to 2019q4. Explain your choice and describe how the model fits.

Model 2 includes two lags of each predictor and has the best trade-off between fit and complexity according to BIC. Although it’s more complex, the improvement in fit justifies the additional predictors. Therefore, Model 2 is selected as the best-performing model.

(f) Predict the period of 2020q1 through 2023q4. Obviously COVID mattered! Describe how well your model behaved.

(g) Update your forecasting model using data through 2023. Generate a forecast for 2024 and describe its fit.