From FiveThirtyEight's data repository, I found a dataset for figuring out which state has the "worst" drivers: https://github.com/fivethirtyeight/data/blob/master/bad-drivers/bad-drivers.csv. It caught my attention because my relatives from other states all laugh at me and say I am a bad driver because I am from NY (I don't think I am a bad driver at all). I am here to prove them wrong (hopefully). The accompanying article is here: https://fivethirtyeight.com/features/which-state-has-the-worst-drivers/. I haven't read the article or looked at its graphs closely, because I want to find the results myself.

# load data
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6     ✔ purrr   0.3.4
## ✔ tibble  3.1.8     ✔ dplyr   1.0.9
## ✔ tidyr   1.2.0     ✔ stringr 1.4.1
## ✔ readr   2.1.2     ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
url <- read.csv('https://raw.githubusercontent.com/fivethirtyeight/data/master/bad-drivers/bad-drivers.csv')
head(url)
##        State Number.of.drivers.involved.in.fatal.collisions.per.billion.miles
## 1    Alabama                                                             18.8
## 2     Alaska                                                             18.1
## 3    Arizona                                                             18.6
## 4   Arkansas                                                             22.4
## 5 California                                                             12.0
## 6   Colorado                                                             13.6
##   Percentage.Of.Drivers.Involved.In.Fatal.Collisions.Who.Were.Speeding
## 1                                                                   39
## 2                                                                   41
## 3                                                                   35
## 4                                                                   18
## 5                                                                   35
## 6                                                                   37
##   Percentage.Of.Drivers.Involved.In.Fatal.Collisions.Who.Were.Alcohol.Impaired
## 1                                                                           30
## 2                                                                           25
## 3                                                                           28
## 4                                                                           26
## 5                                                                           28
## 6                                                                           28
##   Percentage.Of.Drivers.Involved.In.Fatal.Collisions.Who.Were.Not.Distracted
## 1                                                                         96
## 2                                                                         90
## 3                                                                         84
## 4                                                                         94
## 5                                                                         91
## 6                                                                         79
##   Percentage.Of.Drivers.Involved.In.Fatal.Collisions.Who.Had.Not.Been.Involved.In.Any.Previous.Accidents
## 1                                                                                                     80
## 2                                                                                                     94
## 3                                                                                                     96
## 4                                                                                                     95
## 5                                                                                                     89
## 6                                                                                                     95
##   Car.Insurance.Premiums....
## 1                     784.55
## 2                    1053.48
## 3                     899.47
## 4                     827.34
## 5                     878.41
## 6                     835.50
##   Losses.incurred.by.insurance.companies.for.collisions.per.insured.driver....
## 1                                                                       145.08
## 2                                                                       133.93
## 3                                                                       110.35
## 4                                                                       142.39
## 5                                                                       165.63
## 6                                                                       139.91
class(url)
## [1] "data.frame"
colnames(url)
## [1] "State"                                                                                                 
## [2] "Number.of.drivers.involved.in.fatal.collisions.per.billion.miles"                                      
## [3] "Percentage.Of.Drivers.Involved.In.Fatal.Collisions.Who.Were.Speeding"                                  
## [4] "Percentage.Of.Drivers.Involved.In.Fatal.Collisions.Who.Were.Alcohol.Impaired"                          
## [5] "Percentage.Of.Drivers.Involved.In.Fatal.Collisions.Who.Were.Not.Distracted"                            
## [6] "Percentage.Of.Drivers.Involved.In.Fatal.Collisions.Who.Had.Not.Been.Involved.In.Any.Previous.Accidents"
## [7] "Car.Insurance.Premiums...."                                                                            
## [8] "Losses.incurred.by.insurance.companies.for.collisions.per.insured.driver...."
mean(url$Percentage.Of.Drivers.Involved.In.Fatal.Collisions.Who.Were.Alcohol.Impaired)
## [1] 30.68627

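Averaged across the rows of the dataset, about 31% of drivers involved in fatal collisions were alcohol-impaired. This is an unweighted mean of the state-level percentages, so every state counts equally regardless of how many drivers it has.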
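To put that average in context, here is a minimal sketch that compares individual states to the mean. It is self-contained and uses only the six rows printed by head(url) above, so its mean differs from the full-data value of 30.69. The short column names (state, pct_alcohol, diff_from_mean) are my own shorthand for illustration, not names from the dataset.

```r
# First six rows as printed by head(url) above; the full file has more rows.
# Column names here are hypothetical shorthand, not the dataset's own names.
sample_states <- data.frame(
  state = c("Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado"),
  pct_alcohol = c(30, 25, 28, 26, 28, 28),
  stringsAsFactors = FALSE
)

# Unweighted mean of the state-level percentages (each state counts equally).
sample_mean <- mean(sample_states$pct_alcohol)

# Difference of each state from that mean, in percentage points.
sample_states$diff_from_mean <- sample_states$pct_alcohol - sample_mean

# Show states from furthest above the mean to furthest below it.
sample_states[order(-sample_states$diff_from_mean), ]
```

Running the same steps on the full data frame would show where NY falls relative to the average.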

Research question

You should phrase your research question in a way that matches up with the scope of inference your dataset allows for. Is New York one of the three states with the worst drivers? If not, which three states are?
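One way this question could be approached is to rank the states on a single measure and check whether NY lands in the top three. The sketch below assumes "worst" means the most drivers involved in fatal collisions per billion miles, which is only one possible definition. It uses the six rows printed by head(url) above, and top_n_worst and the short column names are hypothetical names introduced for illustration.

```r
# First six rows as printed by head(url) above (hypothetical short column names).
sample_states <- data.frame(
  state = c("Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado"),
  fatal_per_billion_miles = c(18.8, 18.1, 18.6, 22.4, 12.0, 13.6),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the n rows with the highest value of `metric`.
top_n_worst <- function(df, metric, n = 3) {
  ranked <- df[order(df[[metric]], decreasing = TRUE), , drop = FALSE]
  head(ranked, n)
}

top3 <- top_n_worst(sample_states, "fatal_per_billion_miles")
top3$state

# Is a given state in the top 3? New York is not among the six sample rows,
# so this is FALSE here; the full data is needed for a real answer.
"New York" %in% top3$state
```

Choosing a different column, or combining several, could change the ranking, so the definition of "worst" should be stated alongside any result.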

Cases

What are the cases, and how many are there? There are 51 cases: one for each of the 50 states, plus the District of Columbia.

Data collection

Describe the method of data collection. I did not collect the data myself. I found the dataset in FiveThirtyEight's GitHub repository, so I only need to import the raw CSV file.

Type of study

What type of study is this (observational/experiment)? This is an observational study: the data summarize collisions that actually occurred, and nothing was assigned or controlled by a researcher.

Data Source

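If you collected the data, state self-collected. If not, provide a citation/link. FiveThirtyEight article: https://fivethirtyeight.com/features/which-state-has-the-worst-drivers/. Dataset: https://github.com/fivethirtyeight/data/blob/master/bad-drivers/bad-drivers.csv.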