Severity of Red-Light Running (RLR) Crashes using Artificial Intelligence

Group 4
Nazmuz Sadat, Sawgat Ahmed Shuvo

Introduction

  • Red-light running (RLR) is a persistent and dangerous cause of crashes at signalized intersections in the United States.
  • RLR crash severity is often tied to right-angle (“T-bone”) collisions, which generate strong lateral impact forces and frequently cause severe or fatal injuries.
  • Evidence shows that reducing RLR crashes can significantly lower the number of fatal and injury crashes at signalized intersections.
  • Reducing RLR is therefore essential to achieving the zero-fatality goal of the Safe System Approach.

Problem Statement

  • Each year, roughly one-quarter of traffic fatalities and about one-half of all traffic injuries in the United States are attributed to intersections.
  • RLR contributes to 16%–20% of all signalized intersection crashes nationwide.
  • RLR accounts for nearly 9,000 fatalities over the past decade and an estimated 165,000 injuries annually.
  • Despite enforcement and awareness efforts, RLR remains a major urban safety issue due to its high frequency, severity, and continued prevalence.

Literature Review

Gaps in the Literature

  • The studies discussed previously give little attention to data preprocessing steps such as missing-data handling and feature selection, which are crucial for improving model performance.
  • Although SHAP is mentioned in one study, understanding of the key factors driving predictions remains limited.
  • Additionally, the severity of RLR crashes was not addressed, which could offer a new perspective for model analysis.

Research Questions

  • How can Artificial Intelligence (AI) be utilized to identify and predict RLR crash severity?
  • What are the key factors influencing the severity of RLR crashes?
  • How effective are AI-based models (e.g., machine learning, deep learning) in predicting the severity of RLR crashes?

Modeling Workflow

Data Description

  • Source of Data: Center for Advanced Public Safety (CAPS) at the University of Alabama
  • While extracting the data, we selected “Ran Red-light” as the primary contributing factor.
  • Number of Variables: CARE datasets for a “crash record” include more than 200 variables
  • Main Variable Categories:
      • Identification & Temporal (e.g., location)
      • Roadway & Environment (e.g., urban/rural, lighting condition)
      • Crash & Impact Characteristics (e.g., main cause, crash severity)
      • Driver / Vehicle Characteristics (Causal Unit) (e.g., driver age, license validity)

Resampling

Technique Used: SMOTE-ENN
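
As a rough illustration of what SMOTE-ENN does, the sketch below first oversamples the minority class by interpolating between neighbouring minority points (SMOTE), then removes points whose label disagrees with most of their nearest neighbours (ENN). It is a simplified, standard-library-only version for a binary toy problem, not the pipeline used in this study. The data, the function names, and the choice of k = 3 are all assumptions made for illustration. In practice a library implementation such as `SMOTEENN` from imbalanced-learn would normally be used.

```python
import math
import random
from collections import Counter


def nearest(idx, points, candidates, k):
    """Indices of the k candidates closest to points[idx], excluding idx itself."""
    others = [j for j in candidates if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def smote(X, y, minority, n_new, k=3, rng=None):
    """Add n_new synthetic minority points interpolated between minority neighbours."""
    rng = rng or random.Random(0)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    synthetic = []
    for _ in range(n_new):
        i = rng.choice(minority_idx)
        j = rng.choice(nearest(i, X, minority_idx, k))
        gap = rng.random()  # position along the segment between the two points
        synthetic.append([a + gap * (b - a) for a, b in zip(X[i], X[j])])
    return X + synthetic, y + [minority] * n_new


def edited_nearest_neighbours(X, y, k=3):
    """Keep only points whose label matches the majority of their k neighbours."""
    all_idx = range(len(X))
    keep = []
    for i in all_idx:
        votes = Counter(y[j] for j in nearest(i, X, all_idx, k))
        if votes.most_common(1)[0][0] == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_enn(X, y, k=3, seed=0):
    """Balance a binary dataset with SMOTE, then clean it with ENN."""
    counts = Counter(y)
    minority, n_minority = min(counts.items(), key=lambda kv: kv[1])
    n_new = max(counts.values()) - n_minority
    X_over, y_over = smote(X, y, minority, n_new, k, random.Random(seed))
    return edited_nearest_neighbours(X_over, y_over, k)


# Hypothetical imbalanced toy data: 60 majority points and 10 minority points.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(60)]
X += [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(10)]
y = [0] * 60 + [1] * 10

X_res, y_res = smote_enn(X, y)
print("before:", dict(Counter(y)))
print("after: ", dict(Counter(y_res)))
```

Crash severity usually has more than two classes; a multi-class version would repeat the oversampling step for each under-represented class before cleaning.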

Hyperparameter Tuning

  • Models Used: Random Forest and XGBoost
  • Methods Used: Grid Search and 5-fold Cross Validation
  • Best Parameters Identified for Random Forest:
      • max_depth = None
      • min_samples_split = 2
      • n_estimators = 200
      • Best Score: 0.9074
  • Best Parameters Identified for XGBoost:
      • learning_rate = 0.1
      • max_depth = 5
      • n_estimators = 100
      • Best Score: 0.8103
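
The tuning procedure above can be sketched with scikit-learn's `GridSearchCV`, shown here for the Random Forest only. This is a minimal sketch under stated assumptions: the data are synthetic, the grid is hypothetical (it contains the reported best values, but the full grid searched in the study is not stated), and accuracy is assumed as the scoring metric. The scores it prints are unrelated to the results reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the crash data (assumption: the real features are not shown here).
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=5, random_state=0
)

# Hypothetical search grid that includes the reported best values.
param_grid = {
    "n_estimators": [100, 200],
    "max_depth": [None, 10],
    "min_samples_split": [2, 5],
}

# Every parameter combination is scored with 5-fold cross-validation.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,
    scoring="accuracy",  # assumption: the metric behind "Best Score" is not stated
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best mean CV score:", round(search.best_score_, 4))
```

Because XGBoost's `XGBClassifier` follows the scikit-learn estimator interface, the same search can be run for it by swapping the estimator and using a grid over `learning_rate`, `max_depth`, and `n_estimators`.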

Performance Comparison

Results & Discussion

SHAP Waterfall Plot
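
A waterfall plot starts at a base value and adds one feature contribution at a time until it reaches the model output for a single record. The sketch below shows that additive property by computing exact Shapley values for a tiny toy model. Everything in it is an assumption made for illustration: the model, the feature names, and the use of a single all-zero baseline in place of the background-data expectation that SHAP uses. None of it reflects the findings of this study.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values; features outside a coalition take their baseline value."""
    n = len(x)

    def value(coalition):
        return model([x[i] if i in coalition else baseline[i] for i in range(n)])

    phi = []
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(rest, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        phi.append(total)
    return phi


# Hypothetical severity score with one interaction term (not the study's model).
def model(v):
    return 0.2 * v[0] + 0.1 * v[1] + 0.3 * v[0] * v[2]


names = ["feature_a", "feature_b", "feature_c"]  # hypothetical feature names
x = [1, 1, 1]         # the record being explained
baseline = [0, 0, 0]  # reference record that defines the base value

base = model(baseline)
phi = shapley_values(model, x, baseline)

# Text version of a waterfall: largest contributions first, with a running total.
running = base
print(f"base value: {running:.3f}")
for name, contribution in sorted(zip(names, phi), key=lambda p: -abs(p[1])):
    running += contribution
    print(f"{name:>10}: {contribution:+.3f} -> {running:.3f}")
print(f"model output: {model(x):.3f}")
```

The running total ends at the model output, which is the property the waterfall plot visualizes. With the `shap` library, the corresponding figure is typically drawn with `shap.plots.waterfall` for one explained record.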

Future Scope

  • Optimizing the model to further improve predictive performance and enhance model generalizability.
  • Expanding the dataset to include multiple states or national records.
  • Recommending safety countermeasures based on the findings of the model.

References

  • Akter, R., Susilawati, S., Zubair, H., Chor, W.T., 2025. Analyzing feature importance for older pedestrian crash severity: A comparative study of DNN models, emphasizing road and vehicle types with SHAP interpretation. Multimodal Transp. 4, 100203.
  • Alanazi, F., Umar, I.K., Yosri, A.M., Okail, M.A., 2025. Comparative evaluation of deep learning and traditional models for predicting traffic accident severity in Saudi Arabia. Sci. Rep. 15, 32568.
  • Chen, F., Liu, X.Q., Yang, J.J., Liu, X.K., Ma, J.H., Chen, J., Xiao, H.Y., 2025. Traffic accident severity prediction based on an enhanced MSCPO-XGBoost hybrid model. Sci. Rep. 15, 25729.
  • Khan, M.N., Das, S., 2024. Advancing traffic safety through the safe system approach: A systematic review. Accid. Anal. Prev. 199, 107518.

  • Jahangiri, A., Rakha, H., Dingus, T.A., Transportation Research Board, 2015. Predicting Red-light Running Violations at Signalized Intersections Using Machine Learning Techniques. p. 13p.
  • Liu, J., 2021. Severity Analysis of Large Truck Crashes- Comparison Between the Regression Modeling Methods with Machine Learning Methods (Thesis). Texas Southern University.
  • Roudnitski, A., 2024. Evaluating road crash severity prediction with balanced ensemble models. Findings.
  • FHWA, 2017. Safety Evaluation of Red-Light Indicator Lights (RLILs) (No. FHWA-HRT-17-078). U.S. Department of Transportation, Federal Highway Administration, McLean, VA.