TAN NGUYEN
2024-05-07
The analysis of traffic collision data is crucial for understanding road safety and for implementing effective measures to reduce accidents and their impacts. In this report, I use the “Crash Report Drivers Data of Montgomery County, Maryland” data set. The report aims to reveal the key factors driving these accidents and to propose actionable recommendations to mitigate their occurrence and impact.
The “Crash Report Drivers Data of Montgomery County, Maryland” data set provides a detailed overview of traffic collisions involving motor vehicle operators throughout the county. The data is collected through the Automated Crash Reporting System (ACRS), managed by the Maryland State Police, and reported by multiple law enforcement agencies. It is available for download at https://catalog.data.gov/dataset/crash-reporting-drivers-data
# Load tidyverse and dplyr to manipulate data
# Load lubridate to manipulate dates
# Load ggplot2 for graphing
library(dplyr, warn.conflicts = FALSE)
library(tidyverse, warn.conflicts = FALSE)
library(lubridate, warn.conflicts = FALSE)
library(ggplot2, warn.conflicts = FALSE)
# Read the crash data, keeping text columns as character vectors
data <- read.csv("Crash_Reporting.csv", stringsAsFactors = FALSE)
glimpse(data)
## Rows: 172,105
## Columns: 43
## $ Report.Number <chr> "MCP3170003V", "MCP3254003K", "EJ788700…
## $ Local.Case.Number <chr> "240000438", "230072050", "230074270", …
## $ Agency.Name <chr> "Montgomery County Police", "Montgomery…
## $ ACRS.Report.Type <chr> "Property Damage Crash", "Injury Crash"…
## $ Crash.Date.Time <chr> "01/03/2024 02:55:00 PM", "12/16/2023 1…
## $ Route.Type <chr> "", "Maryland (State)", "Maryland (Stat…
## $ Road.Name <chr> "", "GERMANTOWN RD", "GREAT SENECA HWY"…
## $ Cross.Street.Type <chr> "", "County", "Municipality", "County",…
## $ Cross.Street.Name <chr> "", "MIDDLEBROOK RD", "KENTLANDS BLVD",…
## $ Off.Road.Description <chr> "IN FRONT OF 18900 BIRDSEYE DR", "", ""…
## $ Municipality <chr> "", "N/A", "GAITHERSBURG", "N/A", "N/A"…
## $ Related.Non.Motorist <chr> "", "BICYCLIST", "", "", "", "", "PEDES…
## $ Collision.Type <chr> "OPPOSITE DIRECTION SIDESWIPE", "STRAIG…
## $ Weather <chr> "CLOUDY", "CLEAR", "CLEAR", "CLEAR", "R…
## $ Surface.Condition <chr> "", "DRY", "DRY", "DRY", "WET", "DRY", …
## $ Light <chr> "DAYLIGHT", "DAYLIGHT", "DAYLIGHT", "DA…
## $ Traffic.Control <chr> "NO CONTROLS", "TRAFFIC SIGNAL", "TRAFF…
## $ Driver.Substance.Abuse <chr> "NONE DETECTED", "NONE DETECTED", "NONE…
## $ Non.Motorist.Substance.Abuse <chr> "", "NONE DETECTED", "", "", "", "", "N…
## $ Person.ID <chr> "ACC015E9-08A4-4856-866E-0004005F986C",…
## $ Driver.At.Fault <chr> "Yes", "No", "No", "No", "Yes", "Yes", …
## $ Injury.Severity <chr> "NO APPARENT INJURY", "NO APPARENT INJU…
## $ Circumstance <chr> "N/A", "N/A", "N/A", "ANIMAL, N/A", "RA…
## $ Driver.Distracted.By <chr> "LOOKED BUT DID NOT SEE", "NOT DISTRACT…
## $ Drivers.License.State <chr> "MD", "MD", "MD", "MD", "MD", "MD", "MD…
## $ Vehicle.ID <chr> "4E492574-893B-4EB1-ADCA-53FDD633D6C4",…
## $ Vehicle.Damage.Extent <chr> "FUNCTIONAL", "FUNCTIONAL", "FUNCTIONAL…
## $ Vehicle.First.Impact.Location <chr> "SEVEN OCLOCK", "ELEVEN OCLOCK", "SIX O…
## $ Vehicle.Second.Impact.Location <chr> "SEVEN OCLOCK", "ELEVEN OCLOCK", "SIX O…
## $ Vehicle.Body.Type <chr> "PASSENGER CAR", "PASSENGER CAR", "(SPO…
## $ Vehicle.Movement <chr> "MOVING CONSTANT SPEED", "MOVING CONSTA…
## $ Vehicle.Continuing.Dir <chr> "South", "North", "South", "South", "No…
## $ Vehicle.Going.Dir <chr> "South", "West", "South", "South", "Nor…
## $ Speed.Limit <int> 0, 35, 35, 40, 20, 35, 35, 10, 35, 0, 2…
## $ Driverless.Vehicle <chr> "No", "No", "No", "No", "No", "No", "No…
## $ Parked.Vehicle <chr> "No", "No", "No", "No", "No", "No", "No…
## $ Vehicle.Year <int> 2017, 2010, 2021, 2019, 2014, 1991, 201…
## $ Vehicle.Make <chr> "LEXUS", "TOYT", "SUBARU", "DODGE", "NI…
## $ Vehicle.Model <chr> "SUV", "PRIUS", "FORRESTER", "CHARGER",…
## $ Equipment.Problems <chr> "NO MISUSE", "NO MISUSE", "NO MISUSE", …
## $ Latitude <dbl> 39.16500, 39.17878, 39.12357, 39.21174,…
## $ Longitude <dbl> -77.24931, -77.26719, -77.23177, -77.17…
## $ Location <chr> "(39.16500483, -77.24931)", "(39.178775…
This data set contains a lot of information. By examining it, we can uncover trends and relationships among the various factors that contribute to collisions. This exploration can provide crucial insights for improving road safety within Montgomery County.
Some features have many categories, and some contain many N/A values. In addition, these collision reports are based on preliminary information, so the data may include both verified and unverified collision records and may reflect mechanical or human error.
Several features have many categories, and some of those categories share the same meaning. Some values are entered as free text by law enforcement agencies, so the same meaning can appear under different names. These values need to be consolidated.
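As a minimal sketch of how the missing information could be quantified, the example below treats empty strings and the literal "N/A" as missing and reports the share of missing values per column. It runs on a small toy data frame (`toy`, with illustrative values modeled on the glimpse output), not on the real data; the object names `toy`, `toy_clean`, and `missing_share` are assumptions made for this illustration.

```r
library(dplyr)

# Toy stand-in for a few columns of the crash data (illustrative values only)
toy <- data.frame(
  Route.Type   = c("", "Maryland (State)", "County"),
  Municipality = c("", "N/A", "GAITHERSBURG"),
  Speed.Limit  = c(0L, 35L, 35L),
  stringsAsFactors = FALSE
)

# Treat empty strings and the literal "N/A" as missing in character columns
toy_clean <- toy %>%
  mutate(across(where(is.character), ~ na_if(na_if(.x, ""), "N/A")))

# Share of missing values per column
missing_share <- colMeans(is.na(toy_clean))
print(missing_share)
```

Exact matching like this would not catch combined values such as "ANIMAL, N/A" in `Circumstance`; those would need separate handling.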
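One possible way to consolidate such values is sketched below on `Vehicle.Make`, where the glimpse output shows the abbreviation "TOYT". The lookup table `make_lookup` and every entry other than "TOYT" are hypothetical; a real mapping would have to be built by inspecting the actual category counts.

```r
library(dplyr)

# Toy examples of inconsistent free-text entries (illustrative values only)
toy <- data.frame(
  Vehicle.Make = c("TOYT", "TOYOTA", " toyota", "HOND", "HONDA", "SUBARU"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: variant spelling -> canonical name
make_lookup <- c(TOYT = "TOYOTA", HOND = "HONDA")

toy_clean <- toy %>%
  mutate(
    # Normalize case and surrounding whitespace first
    Make.Std = toupper(trimws(Vehicle.Make)),
    # Then map known variants to their canonical name, leaving others unchanged
    Make.Std = ifelse(Make.Std %in% names(make_lookup),
                      unname(make_lookup[Make.Std]),
                      Make.Std)
  )

# Category counts after consolidation
print(count(toy_clean, Make.Std, sort = TRUE))
```

Keeping the original column and writing the result to a new one (`Make.Std` here) makes it easy to check the mapping against the raw values.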
# Count the distinct calendar days with at least one reported crash in each year
tmp_data <- data %>%
  mutate(Crash.Date.Time = mdy_hms(Crash.Date.Time)) %>%
  group_by(Year = year(Crash.Date.Time)) %>%
  summarise(Count = n_distinct(as.Date(Crash.Date.Time)))
# A year counts as fully filled when every day (365, or 366 in a leap year) appears
fully_filled_years <- tmp_data$Year[tmp_data$Count == 365 | tmp_data$Count == 366]
partially_filled_years <- tmp_data$Year[!(tmp_data$Year %in% fully_filled_years)]
print(paste("Fully filled years:", paste(fully_filled_years, collapse = ", ")))
## [1] "Fully filled years: 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023"
print(paste("Partially filled years:", paste(partially_filled_years, collapse = ", ")))
## [1] "Partially filled years: 2024"
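Because 2024 is only partially covered, year-over-year comparisons may be misleading if it is included. The sketch below shows one way the data could be restricted to the fully filled years; it is an assumption about a possible next step, not a step taken in this report. It uses a toy data frame and hard-codes the years from the output above, and `full_year_data` is a hypothetical name.

```r
library(dplyr)
library(lubridate)

# Toy rows in the same date format as Crash.Date.Time (illustrative values only)
toy <- data.frame(
  Report.Number   = c("A1", "A2", "A3"),
  Crash.Date.Time = c("12/16/2023 01:00:00 PM",
                      "01/03/2024 02:55:00 PM",
                      "06/01/2019 09:30:00 AM"),
  stringsAsFactors = FALSE
)

# Years reported as fully filled in the output above
fully_filled_years <- 2015:2023

# Keep only crashes that fall in a fully filled year
full_year_data <- toy %>%
  mutate(Crash.Date.Time = mdy_hms(Crash.Date.Time)) %>%
  filter(year(Crash.Date.Time) %in% fully_filled_years)

print(full_year_data$Report.Number)
```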