library(tidyverse)
library(tidyr)
library(leaflet)
setwd("C:/Users/ubjho/Downloads")
Crashes <- read_csv("Crash_Reporting.csv")
Final Project
This project investigates how weather conditions affect the frequency and severity of traffic crashes. The analysis uses the Crash_Reporting dataset, which contains detailed records of reported traffic crashes in Maryland. The data were collected via the Automated Crash Reporting System of the Maryland State Police, and the crashes were reported by the Montgomery County Police, Gaithersburg Police, Rockville Police, or the Maryland-National Capital Park Police. Each record includes variables such as the weather condition at the time of the crash, the severity of any injuries sustained, the date and time of the crash, and additional information about location, road conditions, lighting, and vehicles involved.
For the purpose of this study, the focus is on two key variables: Weather, which categorizes the weather conditions (e.g., Clear, Rain, Snow), and Injury Severity, which describes the level of harm caused by the crash (e.g., No Apparent Injury, Possible Injury, Fatal Injury). By examining the relationship between weather and injury severity, this study aims to identify patterns that could help improve public safety and awareness, particularly in adverse weather conditions. The dataset was obtained from the course-provided data folder.
Crashes
# A tibble: 200,810 × 39
`Report Number` `Local Case Number` `Agency Name` `ACRS Report Type`
<chr> <dbl> <chr> <chr>
1 DM8479000T 210020119 Takoma Park Police De… Property Damage C…
2 MCP2970000R 15045937 MONTGOMERY Property Damage C…
3 MCP20160036 180040948 Montgomery County Pol… Property Damage C…
4 EJ7879003C 230048975 Gaithersburg Police D… Injury Crash
5 MCP2967004Y 230070277 Montgomery County Pol… Property Damage C…
6 MCP3348000Z 230051804 Montgomery County Pol… Injury Crash
7 MCP302600BD 230046425 Montgomery County Pol… Property Damage C…
8 MCP2583003S 230074198 Montgomery County Pol… Injury Crash
9 MCP3372001V 230065250 Montgomery County Pol… Property Damage C…
10 MCP3005007M 230060937 Montgomery County Pol… Property Damage C…
# ℹ 200,800 more rows
# ℹ 35 more variables: `Crash Date/Time` <chr>, `Route Type` <chr>,
# `Road Name` <chr>, `Cross-Street Name` <chr>, `Off-Road Description` <chr>,
# Municipality <chr>, `Related Non-Motorist` <chr>, `Collision Type` <chr>,
# Weather <chr>, `Surface Condition` <chr>, Light <chr>,
# `Traffic Control` <chr>, `Driver Substance Abuse` <chr>,
# `Non-Motorist Substance Abuse` <chr>, `Person ID` <chr>, …
unique(Crashes$Weather)
[1] "CLEAR" "CLOUDY"
[3] "RAINING" "N/A"
[5] "SNOW" "FOGGY"
[7] "OTHER" "UNKNOWN"
[9] "WINTRY MIX" "SEVERE WINDS"
[11] "SLEET" "BLOWING SNOW"
[13] "BLOWING SAND, SOIL, DIRT" "Clear"
[15] "Rain" "Fog, Smog, Smoke"
[17] "Unknown" "Cloudy"
[19] "Severe Crosswinds" "Snow"
[21] "Freezing Rain Or Freezing Drizzle" "Blowing Snow"
[23] "Sleet Or Hail"
unique(Crashes$'Injury Severity')
[1] "NO APPARENT INJURY" "SUSPECTED MINOR INJURY"
[3] "POSSIBLE INJURY" "SUSPECTED SERIOUS INJURY"
[5] "FATAL INJURY" NA
[7] "No Apparent Injury" "Possible Injury"
[9] "Suspected Minor Injury" "Suspected Serious Injury"
[11] "Fatal Injury"
Crashes <- select(Crashes, Weather, `Injury Severity`)
Crashes <- filter(Crashes, !is.na(Weather), Weather != "")
# Collapse the raw weather labels (the data mixes upper-case and title-case spellings) into a few groups.
# Labels not listed in any group are left unchanged.
Crashes$Weather <- ifelse(Crashes$Weather %in% c("RAINING", "Rain", "Freezing Rain Or Freezing Drizzle"), "Rain",
  ifelse(Crashes$Weather %in% c("Snow", "SNOW", "Blowing Snow", "SLEET", "WINTRY MIX", "BLOWING SNOW", "Sleet Or Hail"), "Snow",
  ifelse(Crashes$Weather %in% c("Severe Crosswinds", "SEVERE WINDS"), "Windy",
  ifelse(Crashes$Weather %in% c("BLOWING SAND, SOIL, DIRT", "Cloudy", "Fog, Smog, Smoke", "FOGGY", "CLOUDY"), "Cloudy",
  ifelse(Crashes$Weather %in% c("Clear", "CLEAR"), "Clear", Crashes$Weather)))))
Crashes <- filter(Crashes,
  !is.na(Weather), Weather != "",
  !Weather %in% c("Other", "Unknown", "UNKNOWN", "OTHER", "N/A"))
Crashes$`Injury Severity` <- tolower(Crashes$`Injury Severity`)
Crashes$`Injury Severity` <- case_when(
Crashes$`Injury Severity` == "fatal injury" ~ "Fatal Injury",
Crashes$`Injury Severity` == "no apparent injury" ~ "No Apparent Injury",
Crashes$`Injury Severity` == "possible injury" ~ "Possible Injury",
Crashes$`Injury Severity` == "suspected minor injury" ~ "Suspected Minor Injury",
Crashes$`Injury Severity` == "suspected serious injury" ~ "Suspected Serious Injury")
Crashes <- filter(Crashes, !is.na(`Injury Severity`))
The dataset was first cleaned by removing rows with missing or unclear weather information and by standardizing the weather categories so that similar entries, such as “RAINING” and “Freezing Rain Or Freezing Drizzle,” were combined into “Rain.” Injury severity labels were standardized in the same way, and rows with a missing severity were dropped. The cleaned data was then used to examine crash frequency by weather condition. A bar chart showed that clear weather had the highest total number of crashes, which may be partly due to higher road usage in good conditions. However, when the data was broken down by injury severity, a different trend emerged: adverse conditions such as rain and snow had higher proportions of severe injuries relative to their total crash counts. This suggests that although crashes are more common in clear weather, poor weather conditions tend to result in more dangerous outcomes.
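The comparison of proportions described above can be checked by computing each severity level's share within its weather group. The following is a minimal sketch on a small made-up tibble, not the author's code: `toy_crashes`, `toy_shares`, and `severity_share()` are hypothetical names introduced for illustration, and the rows are invented rather than taken from the Crash_Reporting data.

```r
library(dplyr)

# Hypothetical helper: share of each injury severity within each weather group.
severity_share <- function(data) {
  data %>%
    count(Weather, `Injury Severity`, name = "count") %>%
    group_by(Weather) %>%
    mutate(share = count / sum(count)) %>%  # shares sum to 1 within each weather group
    ungroup()
}

# Made-up rows for illustration only (not the Crash_Reporting data).
toy_crashes <- tibble(
  Weather = c("Clear", "Clear", "Clear", "Clear", "Rain", "Rain"),
  `Injury Severity` = c("No Apparent Injury", "No Apparent Injury",
                        "No Apparent Injury", "Possible Injury",
                        "No Apparent Injury", "Possible Injury")
)

toy_shares <- severity_share(toy_crashes)
toy_shares
```

Assuming the cleaned `Crashes` tibble keeps the `Weather` and `Injury Severity` columns, `severity_share(Crashes)` should give the within-weather shares that this comparison relies on.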
severity_counts <- group_by(Crashes, Weather, `Injury Severity`)
severity_counts <- summarise(severity_counts, count = n())
`summarise()` has grouped output by 'Weather'. You can override using the
`.groups` argument.
ggplot(severity_counts, aes(x = Weather, y = count, fill = `Injury Severity`)) +
geom_bar(stat = "identity", position = "dodge") +
labs(title = "Crash Injury Severity by Weather Condition",
x = "Weather Condition",
y = "Number of Crashes",
fill = "Injury Severity") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
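A dodged chart of raw counts tends to be dominated by the weather condition with the most crashes, which makes severity proportions hard to compare across conditions. One possible alternative is a 100% stacked bar chart, sketched below. The `toy_counts` data frame, its `Severity` column, and the `share_plot` object are hypothetical and use made-up counts, so the figure it produces says nothing about the real data.

```r
library(ggplot2)

# Made-up counts for illustration only (not the Crash_Reporting data).
toy_counts <- data.frame(
  Weather  = c("Clear", "Clear", "Rain", "Rain"),
  Severity = c("No Apparent Injury", "Possible Injury",
               "No Apparent Injury", "Possible Injury"),
  count    = c(30, 10, 5, 5)
)

# position = "fill" rescales every bar to height 1, so the segments show
# each severity's share within a weather condition rather than raw counts.
share_plot <- ggplot(toy_counts, aes(x = Weather, y = count, fill = Severity)) +
  geom_col(position = "fill") +
  scale_y_continuous(labels = function(x) paste0(100 * x, "%")) +
  labs(title = "Share of Injury Severity by Weather Condition",
       x = "Weather Condition",
       y = "Share of Crashes",
       fill = "Injury Severity") +
  theme_minimal()

share_plot
```

The same layers should work with `severity_counts` by mapping `fill` to `Injury Severity` instead of the toy `Severity` column.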