Final Project

Author

Jhonathan Urquilla

library(tidyverse)
library(tidyr)
library(leaflet)

setwd("C:/Users/ubjho/Downloads")
Crashes <- read_csv("Crash_Reporting.csv")

This project investigates how weather conditions affect the frequency and severity of traffic crashes. The analysis uses the Crash_Reporting dataset, which contains detailed records of reported traffic crashes in Maryland. The data were collected via the Automated Crash Reporting System of the Maryland State Police, and the crashes were reported by the Montgomery County Police, Gaithersburg Police, Rockville Police, or the Maryland-National Capital Park Police. Each record includes variables such as the weather condition at the time of the crash, the severity of any injuries sustained, the date and time of the crash, and additional information about location, road conditions, lighting, and vehicles involved. This study focuses on two key variables: Weather, which categorizes the weather conditions (e.g., Clear, Rain, Snow), and Injury Severity, which describes the level of harm caused by the crash (e.g., No Apparent Injury, Possible Injury, Fatal Injury). By examining the relationship between weather and injury severity, this study aims to identify patterns that could help improve public safety and awareness, particularly in adverse weather conditions. The dataset was obtained from the course-provided data folder.

 Crashes
# A tibble: 200,810 × 39
   `Report Number` `Local Case Number` `Agency Name`          `ACRS Report Type`
   <chr>                         <dbl> <chr>                  <chr>             
 1 DM8479000T                210020119 Takoma Park Police De… Property Damage C…
 2 MCP2970000R                15045937 MONTGOMERY             Property Damage C…
 3 MCP20160036               180040948 Montgomery County Pol… Property Damage C…
 4 EJ7879003C                230048975 Gaithersburg Police D… Injury Crash      
 5 MCP2967004Y               230070277 Montgomery County Pol… Property Damage C…
 6 MCP3348000Z               230051804 Montgomery County Pol… Injury Crash      
 7 MCP302600BD               230046425 Montgomery County Pol… Property Damage C…
 8 MCP2583003S               230074198 Montgomery County Pol… Injury Crash      
 9 MCP3372001V               230065250 Montgomery County Pol… Property Damage C…
10 MCP3005007M               230060937 Montgomery County Pol… Property Damage C…
# ℹ 200,800 more rows
# ℹ 35 more variables: `Crash Date/Time` <chr>, `Route Type` <chr>,
#   `Road Name` <chr>, `Cross-Street Name` <chr>, `Off-Road Description` <chr>,
#   Municipality <chr>, `Related Non-Motorist` <chr>, `Collision Type` <chr>,
#   Weather <chr>, `Surface Condition` <chr>, Light <chr>,
#   `Traffic Control` <chr>, `Driver Substance Abuse` <chr>,
#   `Non-Motorist Substance Abuse` <chr>, `Person ID` <chr>, …
unique(Crashes$Weather)
 [1] "CLEAR"                             "CLOUDY"                           
 [3] "RAINING"                           "N/A"                              
 [5] "SNOW"                              "FOGGY"                            
 [7] "OTHER"                             "UNKNOWN"                          
 [9] "WINTRY MIX"                        "SEVERE WINDS"                     
[11] "SLEET"                             "BLOWING SNOW"                     
[13] "BLOWING SAND, SOIL, DIRT"          "Clear"                            
[15] "Rain"                              "Fog, Smog, Smoke"                 
[17] "Unknown"                           "Cloudy"                           
[19] "Severe Crosswinds"                 "Snow"                             
[21] "Freezing Rain Or Freezing Drizzle" "Blowing Snow"                     
[23] "Sleet Or Hail"                    
unique(Crashes$`Injury Severity`)
 [1] "NO APPARENT INJURY"       "SUSPECTED MINOR INJURY"  
 [3] "POSSIBLE INJURY"          "SUSPECTED SERIOUS INJURY"
 [5] "FATAL INJURY"             NA                        
 [7] "No Apparent Injury"       "Possible Injury"         
 [9] "Suspected Minor Injury"   "Suspected Serious Injury"
[11] "Fatal Injury"            
Crashes <- select(Crashes, Weather, `Injury Severity`)

Crashes <- filter(Crashes, !is.na(Weather), Weather != "")
# Collapse the raw weather labels into five groups; labels not listed here
# (e.g. "OTHER", "UNKNOWN", "N/A") pass through unchanged and are filtered out next.
Crashes$Weather <- case_when(
  Crashes$Weather %in% c("RAINING", "Rain", "Freezing Rain Or Freezing Drizzle") ~ "Rain",
  Crashes$Weather %in% c("SNOW", "Snow", "BLOWING SNOW", "Blowing Snow", "SLEET", "Sleet Or Hail", "WINTRY MIX") ~ "Snow",
  Crashes$Weather %in% c("SEVERE WINDS", "Severe Crosswinds") ~ "Windy",
  Crashes$Weather %in% c("CLOUDY", "Cloudy", "FOGGY", "Fog, Smog, Smoke", "BLOWING SAND, SOIL, DIRT") ~ "Cloudy",
  Crashes$Weather %in% c("CLEAR", "Clear") ~ "Clear",
  TRUE ~ Crashes$Weather)

Crashes <- filter(Crashes, 
                  !is.na(Weather), Weather != "", 
                  !Weather %in% c("Other", "Unknown", "UNKNOWN", "OTHER", "N/A"))
Crashes$`Injury Severity` <- tolower(Crashes$`Injury Severity`)
Crashes$`Injury Severity` <- case_when(
  Crashes$`Injury Severity` == "fatal injury" ~ "Fatal Injury",
  Crashes$`Injury Severity` == "no apparent injury" ~ "No Apparent Injury",
  Crashes$`Injury Severity` == "possible injury" ~ "Possible Injury",
  Crashes$`Injury Severity` == "suspected minor injury" ~ "Suspected Minor Injury",
  Crashes$`Injury Severity` == "suspected serious injury" ~ "Suspected Serious Injury")

Crashes <- filter(Crashes, !is.na(`Injury Severity`))
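
The weather recoding above could also be written as a lookup table, which keeps the mapping in one place and makes case variants easier to audit. The following is a minimal sketch on toy data, not the method used in this project: the names `weather_lookup`, `toy`, and `toy_clean` are hypothetical, and it assumes that any label missing from the table (such as "UNKNOWN" or "N/A") should be dropped.

```r
library(dplyr)

# Hypothetical lookup table: upper-cased raw label -> analysis group.
weather_lookup <- c(
  "CLEAR" = "Clear",
  "CLOUDY" = "Cloudy",
  "FOGGY" = "Cloudy",
  "FOG, SMOG, SMOKE" = "Cloudy",
  "BLOWING SAND, SOIL, DIRT" = "Cloudy",
  "RAINING" = "Rain",
  "RAIN" = "Rain",
  "FREEZING RAIN OR FREEZING DRIZZLE" = "Rain",
  "SNOW" = "Snow",
  "BLOWING SNOW" = "Snow",
  "SLEET" = "Snow",
  "SLEET OR HAIL" = "Snow",
  "WINTRY MIX" = "Snow",
  "SEVERE WINDS" = "Windy",
  "SEVERE CROSSWINDS" = "Windy"
)

# Toy data standing in for the Weather column of Crashes.
toy <- tibble(Weather = c("CLEAR", "Clear", "RAINING", "Rain",
                          "Blowing Snow", "UNKNOWN", "N/A"))

toy_clean <- toy %>%
  # Upper-case first so "Clear" and "CLEAR" hit the same entry;
  # labels absent from the table become NA.
  mutate(Weather = unname(weather_lookup[toupper(Weather)])) %>%
  filter(!is.na(Weather))
```

Running `count(toy_clean, Weather)` afterwards is a quick way to confirm that only the intended groups remain.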

The dataset was first cleaned by removing rows with missing or unclear weather information and standardizing the weather categories, combining similar entries such as “RAINING” and “Freezing Rain Or Freezing Drizzle” into “Rain.” Injury severity labels, which appear in both upper case and title case, were standardized in the same way, and rows with missing severity were dropped. The cleaned data were then used to examine crash frequency by weather condition. A bar chart showed that clear weather had the highest total number of crashes, which may be partly due to higher road usage in good conditions. However, when the data were broken down by injury severity, a different trend emerged: adverse conditions such as rain and snow had higher proportions of severe injuries relative to their total crash counts. This suggests that although crashes are more common in clear weather, crashes in poor weather tend to have more dangerous outcomes.

severity_counts <- group_by(Crashes, Weather, `Injury Severity`)
severity_counts <- summarise(severity_counts, count = n())
`summarise()` has grouped output by 'Weather'. You can override using the
`.groups` argument.
ggplot(severity_counts, aes(x = Weather, y = count, fill = `Injury Severity`)) +
  geom_col(position = "dodge") +
  labs(title = "Crash Injury Severity by Weather Condition",
       x = "Weather Condition",
       y = "Number of Crashes",
       fill = "Injury Severity") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
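
Because clear weather dominates the raw counts, the comparison of severity proportions is easier to see when each count is divided by the total for its weather group. The sketch below shows one way to compute those within-group shares. It runs on made-up toy counts rather than the real `severity_counts`, and the names `toy_counts`, `severity_share`, and `share` are hypothetical.

```r
library(dplyr)

# Toy counts standing in for severity_counts (values are invented for illustration).
toy_counts <- tibble(
  Weather = c("Clear", "Clear", "Rain", "Rain"),
  `Injury Severity` = c("No Apparent Injury", "Fatal Injury",
                        "No Apparent Injury", "Fatal Injury"),
  count = c(90, 10, 40, 10)
)

# Share of each severity level within its weather group.
severity_share <- toy_counts %>%
  group_by(Weather) %>%
  mutate(share = count / sum(count)) %>%
  ungroup()
```

With the real data, the same shares can be shown graphically by switching the bar position from "dodge" to "fill", which scales every weather group to the same height.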

The analysis of the Crash_Reporting dataset suggests that weather plays an important role in both the frequency and severity of traffic crashes. While clear conditions see the most crashes overall, likely due to greater traffic volume, crashes in rainy or snowy weather are more likely to lead to severe injuries. These results highlight the need for drivers to exercise additional caution during adverse weather and for safety agencies to promote targeted awareness campaigns during such conditions. Future research could expand on this work by including time-of-day and seasonal data to identify when weather effects are most pronounced, examining road surface conditions to better understand combined risk factors, and incorporating geographic analysis to determine whether certain areas are more vulnerable during specific weather events. By broadening the scope of the analysis, policymakers and transportation officials could design more effective interventions to reduce crash severity in poor weather, in poorly lit areas, or even on clear days, when most crashes occur.
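
One way to check the association between weather and injury severity more formally would be a chi-squared test of independence on the weather-by-severity table. This test was not part of the analysis above. The sketch below uses a made-up 2 x 2 table, and the names `toy_table` and `result` are hypothetical.

```r
# Toy contingency table (invented counts): rows are weather groups,
# columns are injury severity levels.
toy_table <- matrix(
  c(90, 10,
    40, 10),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    Weather = c("Clear", "Rain"),
    Severity = c("No Apparent Injury", "Fatal Injury")
  )
)

# Chi-squared test of independence between the row and column variables.
result <- chisq.test(toy_table)

# With the cleaned data, the table could be built directly, for example:
# chisq.test(table(Crashes$Weather, Crashes$`Injury Severity`))
```

A small p-value in `result$p.value` would indicate that the severity distribution differs across weather groups. It would not by itself show that weather causes the difference, since traffic volume and driver behavior also vary with weather.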