Data101FinalProject

Author

Nathaniel Nguyen

In this project I dig into the crash reports data set provided by Professor Evans. The data set is large (172,105 rows and 43 columns) and could support many different analyses, but I focus on three questions. First, does the driver's substance abuse affect whether the driver is at fault? Second, does the weather change the collision type? Finally, how many of the reported crashes involved parked cars?

getwd()
[1] "/Users/natty/Downloads"
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.0     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
# dplyr and ggplot2 are already attached by library(tidyverse) above
library(RColorBrewer)
setwd("/Users/natty/Downloads")
crashes <- read_csv("CrashReport.csv")
Warning: One or more parsing issues, call `problems()` on your data frame for details,
e.g.:
  dat <- vroom(...)
  problems(dat)
Rows: 172105 Columns: 43
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (38): Report Number, Agency Name, ACRS Report Type, Crash Date/Time, Rou...
dbl  (5): Local Case Number, Speed Limit, Vehicle Year, Latitude, Longitude

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
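The warning above says that some values could not be parsed, and it points to `problems()` for details. As a minimal sketch of what that call returns, the example below reads a made-up two-column CSV (not the crash data) in which one value cannot be parsed as an integer. The inline text and the `col_types = "ii"` specification are assumptions chosen only to trigger a parsing issue.

```r
library(readr)

# Hypothetical inline CSV: the "x" in column b cannot be parsed as an integer.
toy <- read_csv(I("a,b\n1,2\n3,x\n"), col_types = "ii")

# problems() lists every value that failed to parse, one row per issue.
parse_issues <- problems(toy)
parse_issues

# The value that failed to parse is stored as NA.
toy$b
```

For the real data the equivalent call would be `problems(crashes)`, which should show which rows and columns triggered the warning before any plotting is done.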
graph1 <- crashes |>
  ggplot() +
  # Count the records in each fault category, split by distraction type.
  # geom_bar() counts rows by default, so no y aesthetic is needed.
  geom_bar(aes(x = `Driver At Fault`, fill = `Driver Distracted By`),
           position = "dodge") +
  labs(x = "Driver At Fault",
       y = "Number of Records",
       fill = "What the Driver Was Distracted By",
       title = "Does the Driver Being Distracted Affect Whether They Are At Fault?"
       ) +
  theme_minimal(base_size = 8)
graph1
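The introduction asks about substance abuse, while the plot above uses `Driver Distracted By`. A bar chart of raw counts can also be hard to read when one category dominates, so a table of within-group shares may answer the question more directly. The sketch below uses a few made-up rows that only mimic the `Driver Substance Abuse` and `Driver At Fault` columns. The category labels are hypothetical and may not match the values in the real file.

```r
library(dplyr)

# Made-up rows; the labels are illustrative, not taken from the data set.
toy <- tibble(
  `Driver Substance Abuse` = c("NONE DETECTED", "NONE DETECTED", "NONE DETECTED",
                               "ALCOHOL PRESENT", "ALCOHOL PRESENT"),
  `Driver At Fault` = c("Yes", "No", "No", "Yes", "Yes")
)

fault_share <- toy |>
  # One row per (substance, fault) combination with its count n.
  count(`Driver Substance Abuse`, `Driver At Fault`) |>
  # Share of each fault outcome within a substance category.
  group_by(`Driver Substance Abuse`) |>
  mutate(share = n / sum(n)) |>
  ungroup()

fault_share
```

Replacing `toy` with `crashes` should give the same table for the full data set. Comparing the "Yes" share across substance categories is then a simple way to see whether the at-fault rate differs.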

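graph2 <- crashes |>
  ggplot() +
  # Count the records for each surface condition, split by collision type.
  # geom_bar() counts rows by default, so no y aesthetic is needed.
  geom_bar(aes(x = `Surface Condition`, fill = `Collision Type`),
           position = "dodge") +
  labs(x = "Surface Condition",
       y = "Number of Records",
       fill = "Collision Type",
       title = "Does the Surface Condition Affect the Collision Type?"
       ) +
  theme_minimal(base_size = 8)
graph2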
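The question in the introduction is about weather, and the data set has a separate `Weather` column alongside `Surface Condition`. One possible way to compare collision types across weather categories is a row-normalised cross-tabulation in base R. The sketch below uses made-up rows with hypothetical labels, and the column names `weather` and `collision` are stand-ins for `Weather` and `Collision Type`.

```r
# Made-up rows standing in for the Weather and Collision Type columns.
toy <- data.frame(
  weather   = c("CLEAR", "CLEAR", "CLEAR", "RAINING", "RAINING"),
  collision = c("SAME DIR REAR END", "SAME DIR REAR END", "SINGLE VEHICLE",
                "SINGLE VEHICLE", "SAME DIR REAR END")
)

# Rows are weather categories, columns are collision types.
counts <- table(toy$weather, toy$collision)

# margin = 1 rescales each row (weather category) so that it sums to 1.
row_shares <- prop.table(counts, margin = 1)
round(row_shares, 2)
```

On the real data the first step would be `table(crashes$Weather, crashes$`Collision Type`)`. If the rows of the share table look alike, weather probably has little effect on the collision type.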

clean <- crashes |> filter(`Parked Vehicle` == "Yes")
summary(clean)
 Report Number      Local Case Number   Agency Name        ACRS Report Type  
 Length:2689        Min.   : 11032675   Length:2689        Length:2689       
 Class :character   1st Qu.: 16066729   Class :character   Class :character  
 Mode  :character   Median :190011494   Mode  :character   Mode  :character  
                    Mean   :146285443                                        
                    3rd Qu.:210042059                                        
                    Max.   :230074331                                        
 Crash Date/Time     Route Type         Road Name         Cross-Street Type 
 Length:2689        Length:2689        Length:2689        Length:2689       
 Class :character   Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character   Mode  :character  
                                                                            
                                                                            
                                                                            
 Cross-Street Name  Off-Road Description Municipality      
 Length:2689        Length:2689          Length:2689       
 Class :character   Class :character     Class :character  
 Mode  :character   Mode  :character     Mode  :character  
                                                           
                                                           
                                                           
 Related Non-Motorist Collision Type       Weather          Surface Condition 
 Length:2689          Length:2689        Length:2689        Length:2689       
 Class :character     Class :character   Class :character   Class :character  
 Mode  :character     Mode  :character   Mode  :character   Mode  :character  
                                                                              
                                                                              
                                                                              
    Light           Traffic Control    Driver Substance Abuse
 Length:2689        Length:2689        Length:2689           
 Class :character   Class :character   Class :character      
 Mode  :character   Mode  :character   Mode  :character      
                                                             
                                                             
                                                             
 Non-Motorist Substance Abuse  Person ID         Driver At Fault   
 Length:2689                  Length:2689        Length:2689       
 Class :character             Class :character   Class :character  
 Mode  :character             Mode  :character   Mode  :character  
                                                                   
                                                                   
                                                                   
 Injury Severity    Circumstance       Driver Distracted By
 Length:2689        Length:2689        Length:2689         
 Class :character   Class :character   Class :character    
 Mode  :character   Mode  :character   Mode  :character    
                                                           
                                                           
                                                           
 Drivers License State  Vehicle ID        Vehicle Damage Extent
 Length:2689           Length:2689        Length:2689          
 Class :character      Class :character   Class :character     
 Mode  :character      Mode  :character   Mode  :character     
                                                               
                                                               
                                                               
 Vehicle First Impact Location Vehicle Second Impact Location
 Length:2689                   Length:2689                   
 Class :character              Class :character              
 Mode  :character              Mode  :character              
                                                             
                                                             
                                                             
 Vehicle Body Type  Vehicle Movement   Vehicle Continuing Dir
 Length:2689        Length:2689        Length:2689           
 Class :character   Class :character   Class :character      
 Mode  :character   Mode  :character   Mode  :character      
                                                             
                                                             
                                                             
 Vehicle Going Dir   Speed Limit    Driverless Vehicle Parked Vehicle    
 Length:2689        Min.   : 0.00   Length:2689        Length:2689       
 Class :character   1st Qu.: 0.00   Class :character   Class :character  
 Mode  :character   Median :15.00   Mode  :character   Mode  :character  
                    Mean   :14.54                                        
                    3rd Qu.:25.00                                        
                    Max.   :60.00                                        
  Vehicle Year  Vehicle Make       Vehicle Model      Equipment Problems
 Min.   :   0   Length:2689        Length:2689        Length:2689       
 1st Qu.:2007   Class :character   Class :character   Class :character  
 Median :2012   Mode  :character   Mode  :character   Mode  :character  
 Mean   :1986                                                           
 3rd Qu.:2016                                                           
 Max.   :2024                                                           
    Latitude       Longitude        Location        
 Min.   :38.89   Min.   :-77.49   Length:2689       
 1st Qu.:39.01   1st Qu.:-77.17   Class :character  
 Median :39.06   Median :-77.09   Mode  :character  
 Mean   :39.07   Mean   :-77.10                     
 3rd Qu.:39.12   3rd Qu.:-77.02                     
 Max.   :39.30   Max.   :-75.98
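The `Length:2689` entries in the summary show that 2,689 of the 172,105 rows have `Parked Vehicle == "Yes"`, which is about 1.6% of the data. The summary prints far more than is needed to read off that number, so a shorter sketch is given below. It runs on a made-up tibble, and it assumes (this is not confirmed by the source) that one report can span several rows, in which case counting distinct `Report Number` values gives the number of crashes rather than the number of records.

```r
library(dplyr)

# Made-up rows; report numbers and flags are illustrative only.
toy <- tibble(
  `Report Number`  = c("R1", "R1", "R2", "R3", "R3"),
  `Parked Vehicle` = c("No", "Yes", "No", "Yes", "Yes")
)

parked <- toy |> filter(`Parked Vehicle` == "Yes")

n_parked_rows    <- nrow(parked)                        # records flagged as parked
share_parked     <- n_parked_rows / nrow(toy)           # share of all records
n_parked_reports <- n_distinct(parked$`Report Number`)  # distinct reports with a parked vehicle

c(rows = n_parked_rows, share = share_parked, reports = n_parked_reports)
```

Swapping `toy` for `crashes` should reproduce the 2,689 figure in `n_parked_rows` and show whether the number of distinct reports is smaller.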