Data 101: Project 6

Author

Aminata Diatta

Introduction:

How many accidents happen per day? We do not know the exact answer, but we all want to know the reasons behind them: why do these accidents happen? In order to answer these questions, we will explore the crash reporting dataset.

Let’s load the libraries.

library(tidyverse)
Warning: package 'tidyverse' was built under R version 4.3.3
Warning: package 'ggplot2' was built under R version 4.3.3
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.0     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
# Point R at the folder that holds the CSV file
setwd("C:/Users/satad/Desktop/data101 celia")
# Read the crash data into a tibble
crash_reported <- read_csv("Crash_Reporting_-_Drivers_Data_20240407.csv")
Warning: One or more parsing issues, call `problems()` on your data frame for details,
e.g.:
  dat <- vroom(...)
  problems(dat)
Rows: 172105 Columns: 43
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (38): Report Number, Agency Name, ACRS Report Type, Crash Date/Time, Rou...
dbl  (5): Local Case Number, Speed Limit, Vehicle Year, Latitude, Longitude

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
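
The parsing warning above suggests calling `problems()`. Here is a minimal sketch of what that check returns, using a small made-up CSV instead of the real file. The `toy_csv` text and its column names are hypothetical and only serve the illustration.

```r
library(readr)

# Hypothetical toy CSV: the second data row has text where a number is expected.
toy_csv <- I("id,speed_limit\n1,35\n2,fast\n3,40\n")

# Declaring the column types turns the bad value into a recorded parsing
# issue (and an NA) instead of reading the whole column as character.
toy <- suppressWarnings(
  read_csv(
    toy_csv,
    col_types = cols(id = col_integer(), speed_limit = col_double())
  )
)

# problems() returns one row per value that could not be parsed.
parse_issues <- problems(toy)
print(parse_issues)
```

On the real data, `problems(crash_reported)` should list the rows and columns that triggered the warning, which helps decide whether they matter for the analysis.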

Let’s have a look at the dataset.

view(crash_reported)
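
The introduction asks how many accidents happen per day. One possible way to count them is sketched below on a few made-up rows. Two assumptions are made here: that `Crash Date/Time` is stored as month/day/year text with a 12-hour clock, and that one crash can appear on several rows (one per driver), which is why distinct report numbers are counted. The `toy_crashes` values are hypothetical.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample rows; the date format is an assumption about the file.
toy_crashes <- tibble(
  `Report Number` = c("A1", "A1", "B2", "C3"),
  `Crash Date/Time` = c(
    "04/01/2024 08:15:00 AM",
    "04/01/2024 08:15:00 AM",
    "04/01/2024 05:40:00 PM",
    "04/02/2024 11:05:00 PM"
  )
)

crashes_per_day <- toy_crashes |>
  # Parse the text into a date-time, then keep only the calendar day
  mutate(crash_day = as_date(mdy_hms(`Crash Date/Time`))) |>
  group_by(crash_day) |>
  # Count each report once, even if it has several driver rows
  summarise(crashes = n_distinct(`Report Number`))

print(crashes_per_day)
```

If the assumptions hold, the same pipeline could be applied to `crash_reported` in place of `toy_crashes`.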

Let’s filter.

In this dataset, we will focus on Montgomery County Police only, since Montgomery is my county, and the road will be Georgia Avenue. I chose Georgia Avenue because it is one of the biggest routes in Maryland; wherever you are going, you have to drive through it. We will also look at only one vehicle make that I really like, which is Ford. The filter below also keeps only injury crashes at county cross streets, on a dry surface, in cloudy weather, with no driver substance abuse detected.

# Keep only the rows that match every condition (conditions separated by
# commas in filter() are combined with AND)
montgomery_county <- crash_reported |>
  filter(
    `Agency Name` == "Montgomery County Police",
    `ACRS Report Type` == "Injury Crash",
    `Route Type` == "Maryland (State)",
    `Road Name` == "GEORGIA AVE",
    `Cross-Street Type` == "County",
    `Surface Condition` == "DRY",
    Weather == "CLOUDY",
    `Driver Substance Abuse` == "NONE DETECTED",
    `Vehicle Make` == "FORD"
  )
view(montgomery_county)
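
A filter on exact text such as `"FORD"` silently drops rows that are spelled differently. A small sketch of how one might check the spellings first is shown here; the `toy_makes` values, including the variants "Ford" and "FRD", are hypothetical and not taken from the crash file.

```r
library(dplyr)
library(stringr)

# Hypothetical values showing how the same make could be spelled several ways.
toy_makes <- tibble(
  `Vehicle Make` = c("FORD", "Ford", "FORD", "TOYOTA", "FRD")
)

# List every spelling and how often it appears, most frequent first
make_counts <- toy_makes |>
  count(`Vehicle Make`, sort = TRUE)
print(make_counts)

# An exact match keeps only the rows written exactly as "FORD"
exact_match <- toy_makes |>
  filter(`Vehicle Make` == "FORD")

# Upper-casing first also keeps "Ford", but still misses the typo "FRD"
any_case <- toy_makes |>
  filter(str_to_upper(`Vehicle Make`) == "FORD")
```

Running `count()` on each filtered column of `crash_reported` before filtering would show whether any spellings are being left out.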

Now that we have a clean dataset, let’s remove the columns that we do not need.

# Keep only the columns used in the rest of the analysis
montgomery_county2 <- montgomery_county |>
  select(
    `Report Number`, `Local Case Number`, `Agency Name`, `ACRS Report Type`,
    `Route Type`, `Road Name`, `Cross-Street Type`, `Cross-Street Name`,
    `Collision Type`, Weather, `Surface Condition`, `Traffic Control`,
    `Driver At Fault`, `Injury Severity`, `Driver Distracted By`,
    `Vehicle Damage Extent`, `Vehicle Body Type`, `Vehicle Movement`,
    `Vehicle Continuing Dir`, `Vehicle Going Dir`, `Speed Limit`,
    `Vehicle Year`, `Vehicle Model`, Latitude, Longitude, Location
  )
view(montgomery_county2)

Let’s look at the numbers.

library(ggplot2)
# Scatter plot: one point per vehicle, model year against posted speed limit
first_vis <- montgomery_county2 |>
  ggplot(aes(x = `Vehicle Year`, y = `Speed Limit`)) +
  geom_point(size = 6, color = "green") +
  labs(
    title = "Speed limit at the crash site by vehicle model year",
    x = "Vehicle year",
    y = "Speed limit where the accident happened"
  ) +
  theme_dark()
first_vis

Comments:

When we look at the graph, we can see that on Georgia Avenue there are a lot of accidents whatever the speed limit is. This happens because sometimes a careful driver can get hit by another one.
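
With large points, several vehicles that share the same year and speed limit are drawn on top of each other, so the plot can hide how many accidents there are. One possible alternative, sketched on made-up data, is `geom_count()`, which sizes each point by the number of overlapping rows. The `toy_ford` values are hypothetical.

```r
library(ggplot2)
library(dplyr)

# Hypothetical rows; two vehicles share the same year and speed limit.
toy_ford <- tibble(
  `Vehicle Year` = c(2005, 2012, 2012, 2018, 2020),
  `Speed Limit` = c(35, 35, 35, 40, 45)
)

# geom_count() draws one point per (year, limit) pair and scales its size
# by how many rows fall on that pair
overlap_vis <- toy_ford |>
  ggplot(aes(x = `Vehicle Year`, y = `Speed Limit`)) +
  geom_count(color = "green") +
  labs(x = "Vehicle year", y = "Speed limit", size = "Vehicles") +
  theme_dark()
overlap_vis
```

Swapping `toy_ford` for `montgomery_county2` would show whether some speed limits really have more accidents than others.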

Second visualisation:

library(ggplot2)
library(plotly)

Attaching package: 'plotly'
The following object is masked from 'package:ggplot2':

    last_plot
The following object is masked from 'package:stats':

    filter
The following object is masked from 'package:graphics':

    layout
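# Heat map: number of vehicles for each continuing direction and cross street
second_vis <- montgomery_county2 |>
  ggplot(aes(x = `Vehicle Continuing Dir`, y = `Cross-Street Name`)) +
  geom_bin2d() +
  labs(
    title = "Vehicle continuing direction by cross street",
    x = "Continuing direction",
    y = "Cross-street name"
  ) +
  # theme_update() changes the session default instead of styling one plot,
  # so the default theme is applied explicitly here
  theme_gray()
second_vis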

Comments:

When we look at the visualisation, we can notice that White Oak Dr is the only cross street where the vehicle’s continuing direction is east.
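
A reading like this can be double-checked with a small table instead of the plot alone. The sketch below uses made-up rows: only "WHITE OAK DR" comes from the comment above, and the other street names and the direction spellings are hypothetical.

```r
library(dplyr)

# Hypothetical rows; the direction values are assumed to be spelled this way.
toy_dirs <- tibble(
  `Cross-Street Name` = c("WHITE OAK DR", "EXAMPLE RD A", "EXAMPLE RD A", "EXAMPLE RD B"),
  `Vehicle Continuing Dir` = c("East", "North", "South", "North")
)

# Keep the eastbound vehicles and list each cross street once
east_streets <- toy_dirs |>
  filter(`Vehicle Continuing Dir` == "East") |>
  distinct(`Cross-Street Name`)

print(east_streets)
```

If the same pipeline on `montgomery_county2` returns a single row, the observation above is confirmed.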