In this assignment, we practice collaborating as a class on a code project hosted on GitHub.
Using the bad-drivers dataset from fivethirtyeight.com, I create a programming sample "vignette" that demonstrates the capabilities of three tidyverse packages: readr, dplyr, and ggplot2.
library(tidyverse)
## -- Attaching packages ----------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.1.0 v purrr 0.2.5
## v tibble 2.1.1 v dplyr 0.8.0.1
## v tidyr 0.8.2 v stringr 1.3.1
## v readr 1.3.1 v forcats 0.3.0
## -- Conflicts -------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
drivers <- read_csv("https://raw.githubusercontent.com/fivethirtyeight/data/master/bad-drivers/bad-drivers.csv")
## Parsed with column specification:
## cols(
## State = col_character(),
## `Number of drivers involved in fatal collisions per billion miles` = col_double(),
## `Percentage Of Drivers Involved In Fatal Collisions Who Were Speeding` = col_double(),
## `Percentage Of Drivers Involved In Fatal Collisions Who Were Alcohol-Impaired` = col_double(),
## `Percentage Of Drivers Involved In Fatal Collisions Who Were Not Distracted` = col_double(),
## `Percentage Of Drivers Involved In Fatal Collisions Who Had Not Been Involved In Any Previous Accidents` = col_double(),
## `Car Insurance Premiums ($)` = col_double(),
## `Losses incurred by insurance companies for collisions per insured driver ($)` = col_double()
## )
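The "Parsed with column specification" message above appears because readr guessed the column types. As a minimal sketch, the same file could be read with the types stated up front, which silences the message and makes the read fail loudly if a column stops being numeric. The names `data_url` and `drivers_typed` are illustrative and not part of the original vignette.

```r
library(readr)

data_url <- "https://raw.githubusercontent.com/fivethirtyeight/data/master/bad-drivers/bad-drivers.csv"

# State is text; every other column in this file was parsed as a double above.
# `drivers_typed` is an illustrative name, chosen so `drivers` is left untouched.
drivers_typed <- read_csv(
  data_url,
  col_types = cols(
    State = col_character(),
    .default = col_double()
  )
)
```

Using `.default` avoids retyping the seven long column names; listing each column explicitly would be stricter but much more verbose.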
head(drivers)
## # A tibble: 6 x 8
## State `Number of driv~ `Percentage Of ~ `Percentage Of ~ `Percentage Of ~
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Alab~ 18.8 39 30 96
## 2 Alas~ 18.1 41 25 90
## 3 Ariz~ 18.6 35 28 84
## 4 Arka~ 22.4 18 26 94
## 5 Cali~ 12 35 28 91
## 6 Colo~ 13.6 37 28 79
## # ... with 3 more variables: `Percentage Of Drivers Involved In Fatal
## # Collisions Who Had Not Been Involved In Any Previous Accidents` <dbl>,
## # `Car Insurance Premiums ($)` <dbl>, `Losses incurred by insurance
## # companies for collisions per insured driver ($)` <dbl>
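The long column names make the printed tibble hard to read. One possible way to use dplyr here, sketched under the assumption that only the insurance columns are of interest, is to select and rename them in one step and then sort. The short names `premium` and `losses` and the object `top_premiums` are illustrative choices, not part of the dataset.

```r
library(dplyr)
library(readr)

# Re-read the data so this chunk runs on its own; skip this if `drivers` is already loaded.
drivers <- read_csv("https://raw.githubusercontent.com/fivethirtyeight/data/master/bad-drivers/bad-drivers.csv")

# select() can rename while it selects: new_name = `old name`.
top_premiums <- drivers %>%
  select(
    State,
    premium = `Car Insurance Premiums ($)`,
    losses  = `Losses incurred by insurance companies for collisions per insured driver ($)`
  ) %>%
  arrange(desc(premium)) %>%  # highest premiums first
  head(5)                     # keep the top five rows

top_premiums
```

The backticks are required because the original names contain spaces and symbols; after renaming, the columns can be referenced as plain identifiers.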
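# Bar chart of car insurance premiums by state, ordered from highest to lowest
drivers %>%
  ggplot(aes(x = reorder(State, -`Car Insurance Premiums ($)`), y = `Car Insurance Premiums ($)`, fill = State)) +
  geom_col() +                 # equivalent to geom_bar(stat = "identity")
  guides(fill = "none") +      # hide the legend; fill = FALSE is deprecated in newer ggplot2
  labs(x = "State", y = "Car Insurance Premiums ($)") +
  theme(axis.text.x = element_text(angle = 60, hjust = 1))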