Introduction

The Baregg Tunnel dataset contains daily traffic counts (the number of vehicles passing through the tunnel each day) from November 2003 through November 2005.

The goal is to forecast daily traffic volume and determine which modeling approach gives the most accurate predictions for future periods.

Naïve Model vs. Linear Regression

Validation period: Jul 2005 – Nov 2005.

Data Exploration

Observations

  • Clear weekly seasonality

  • Mild upward trend

  • No extreme outliers

Data Partitioning

Training: Nov 1, 2003 – Jun 30, 2005

Validation: Jul 1, 2005 – Nov 30, 2005
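The date-based split above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the `series` list and its counts are hypothetical placeholders, not the Baregg Tunnel data, and it assumes one observation per calendar day with no gaps.

```python
from datetime import date, timedelta

# Hypothetical stand-in for the real series: one (day, count) pair per calendar day.
# The counts are placeholders, not Baregg Tunnel data.
start, end = date(2003, 11, 1), date(2005, 11, 30)
n_days = (end - start).days + 1
series = [(start + timedelta(days=i), 100_000 + i) for i in range(n_days)]

TRAIN_END = date(2005, 6, 30)  # last training day, per the partition above

# Everything up to and including TRAIN_END is training; the rest is validation.
train = [(d, y) for d, y in series if d <= TRAIN_END]
valid = [(d, y) for d, y in series if d > TRAIN_END]

print(len(train), len(valid))
```

With a complete calendar and no missing days, this split yields 608 training days and 153 validation days. Splitting by date rather than at random keeps the validation period strictly after the training period, which is what a forecasting evaluation needs.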

Model Fitting

Naïve Model Accuracy Metrics (Validation Performance)

.model   ME           RMSE        MAE         MAPE (%)
Naive    -12733.511   16820.620   13605.525   13.147

Linear Regression Model Accuracy Metrics (Validation Performance)

.model   ME           RMSE        MAE         MAPE (%)
LinReg   -1603.491    5869.156    3899.738    3.703
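For reference, the four columns in these tables can be computed as in the sketch below. It assumes errors are defined as actual minus forecast (so a negative ME means the forecasts run too high); the `accuracy` helper and the toy numbers are hypothetical and are not the tunnel data.

```python
from math import sqrt

def accuracy(actual, forecast):
    """ME, RMSE, MAE and MAPE (%), with errors defined as actual - forecast (an assumption)."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    return {
        "ME": sum(errors) / n,                            # mean error (signed bias)
        "RMSE": sqrt(sum(e * e for e in errors) / n),     # penalizes large misses
        "MAE": sum(abs(e) for e in errors) / n,           # average miss in vehicles
        "MAPE": 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }

# Toy values for illustration only.
actual = [100.0, 120.0, 80.0]
forecast = [105.0, 105.0, 105.0]
metrics = accuracy(actual, forecast)
print(metrics)
```

ME, RMSE, and MAE are in the units of the series (vehicles per day), while MAPE is a percentage and is only defined when no actual value is zero.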

Forecast with Both Models

Interpretation

  • Naïve forecast is flat

  • Linear Regression captures weekly ups and downs

  • Regression tracks actual traffic more closely
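The contrast between the two forecasts can be sketched on synthetic data. The report's regression is a TSLM fit; the block below is only a pure-Python stand-in that fits a linear trend plus day-of-week effects by ordinary least squares. The series, the `weekly` effects, and the helper names are all hypothetical.

```python
# Synthetic daily series (placeholder values, not the tunnel data):
# a mild upward trend plus a fixed weekly pattern.
weekly = [5000, 6000, 6500, 7000, 8000, -12000, -20500]  # hypothetical day-of-week effects
y = [100_000 + 10 * t + weekly[t % 7] for t in range(140)]
train, h = y[:112], 28  # 16 training weeks, 4 validation weeks

# Naive: every future day gets the last training value, so the forecast is flat.
naive_fc = [train[-1]] * h

# Regression: y_t = b * t + a[day-of-week]; 8 coefficients, no separate intercept.
def design_row(t):
    return [float(t)] + [1.0 if t % 7 == d else 0.0 for d in range(7)]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

X = [design_row(t) for t in range(len(train))]
# Normal equations: (X'X) beta = X'y
XtX = [[sum(r[i] * r[j] for r in X) for j in range(8)] for i in range(8)]
Xty = [sum(r[i] * yt for r, yt in zip(X, train)) for i in range(8)]
beta = solve(XtX, Xty)

# Forecast the validation days by extending the trend and reusing the weekday effects.
reg_fc = [sum(b * x for b, x in zip(beta, design_row(t)))
          for t in range(len(train), len(train) + h)]
```

Because the naïve forecast repeats one value, it cannot follow the weekday/weekend swing, whereas the regression reuses the estimated day-of-week effects for every future week. On this noise-free synthetic series the regression reproduces the held-out values almost exactly; real data would leave residual error.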

Accuracy Metrics

Model Comparison – Validation Accuracy

.model   ME           RMSE        MAE         MAPE (%)
LinReg   -1603.491    5869.156    3899.738    3.703
Naive    -12733.511   16820.620   13605.525   13.147

MASE Naïve: 1.444
MASE Linear Regression: 0.414

Interpretation

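  • MASE < 1 → validation errors are smaller than the in-sample naïve errors used for scaling

  • Linear Regression MASE (0.414) < Naïve MASE (1.444) → better performance

  • ME (in magnitude), RMSE, MAE, and MAPE are also lower for regression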
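A quick arithmetic check on the reported MASE values: assuming the common definition (validation MAE divided by the in-sample MAE of a one-step naïve forecast on the training data), dividing each model's MAE by its MASE should recover the same scaling value. The `mase` helper below is a hypothetical sketch of that definition, not the code used to produce the report.

```python
def mase(actual, forecast, train):
    """Validation MAE divided by the in-sample MAE of a one-step naive forecast."""
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    scale = sum(abs(train[i] - train[i - 1]) for i in range(1, len(train))) / (len(train) - 1)
    return mae / scale

# Reported validation MAE and MASE, copied from the results above.
reported = {
    "Naive": {"MAE": 13605.525, "MASE": 1.444},
    "LinReg": {"MAE": 3899.738, "MASE": 0.414},
}

# If both MASE values share one scaling MAE, MAE / MASE should agree across models.
implied_scale = {name: m["MAE"] / m["MASE"] for name, m in reported.items()}
for name, scale in implied_scale.items():
    print(f"{name}: implied scale = {scale:.0f}")
```

Both ratios come out at roughly 9,420 vehicles per day, so the two MASE values are consistent with a single shared scale. This also explains why the naïve model's own MASE is above 1: its flat multi-day validation forecast is being compared with one-step-ahead naïve errors on the training data.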

Residual Diagnostics – Observations

  • Residuals fluctuate randomly - Mostly white noise

  • Histogram centered at 0 - No bias

  • ACF shows minimal spikes - Most autocorrelation captured
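The ACF check can be sketched as follows. The `acf` helper is a hypothetical stdlib-only implementation of the sample autocorrelation, and the residuals are made up to show what a leftover weekly pattern would look like; they are not the model's residuals. The ±1.96/√n band is the usual approximate 95% bound for white noise.

```python
def acf(residuals, max_lag):
    """Sample autocorrelation of the residuals at lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]

# Hypothetical residuals with a leftover weekly pattern (illustration only).
resid = [3.0, 1.0, -2.0, 0.5, 2.0, -1.5, -3.0] * 12
rho = acf(resid, 14)

bound = 1.96 / len(resid) ** 0.5  # approximate 95% white-noise bound
flagged = [k for k, r in enumerate(rho, start=1) if abs(r) > bound]
print(flagged)
```

Spikes at lags 7 and 14, as in this made-up series, would mean the model has missed part of the weekly seasonality. Residuals close to white noise would leave most lags inside the band.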

Conclusion

  • Linear Regression outperforms Naïve in all metrics

  • Trend + weekly seasonality improves accuracy

  • Residuals show no systematic pattern → reliable forecasts

Recommendation: Use Linear Regression (TSLM) for daily traffic forecasts at Baregg Tunnel.

Daily traffic exhibits strong weekly patterns. Linear Regression with trend + weekly seasonality provides accurate forecasts and should be used over the Naïve benchmark.