library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.8 ✔ dplyr 1.0.10
## ✔ tidyr 1.2.1 ✔ stringr 1.4.1
## ✔ readr 2.1.2 ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(nycflights13)
flights %>%
  filter(dest == "SFO") %>%
  group_by(dest) %>%
  summarise(
    avg_air_time = mean(air_time, na.rm = TRUE),
    min_dist = min(distance, na.rm = TRUE)
  )
flights %>%
  filter(dest == "MDW") %>%
  group_by(dest) %>%
  summarise(num_na = sum(is.na(air_time)))
flights2 <- flights %>%
  filter(month == 9, dest %in% c("DEN", "DFW"))
flights2 %>% filter(dep_delay <= -5) %>% count()
flights2 %>%
  filter(dep_delay <= -5) %>%
  summarise(avg_air_time = mean(air_time, na.rm = TRUE))
ggplot(flights2 %>% filter(dep_delay <= -5)) +
geom_histogram(aes(dep_time))
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
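The message above asks for an explicit bin width. A minimal sketch of one reasonable choice, assuming hourly bins are what we want: `dep_time` in `nycflights13` is stored as an integer in HHMM format, so `binwidth = 100` with `boundary = 0` puts each clock hour in its own bin. The name `early` is only introduced here for illustration.

```r
library(dplyr)
library(ggplot2)
library(nycflights13)

# September flights to DEN or DFW that left at least 5 minutes early
# (same subset as flights2 filtered on dep_delay above).
early <- flights %>%
  filter(month == 9, dest %in% c("DEN", "DFW"), dep_delay <= -5)

# dep_time is HHMM, so a bin width of 100 starting at 0 gives one bin per hour.
ggplot(early) +
  geom_histogram(aes(dep_time), binwidth = 100, boundary = 0)
```

Because HHMM values never fall between 60 and 99 within an hour, hourly bins also avoid the uneven gaps that narrower bins would show.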
flights %>%
  group_by(dest) %>%
  summarise(
    avg_arr_delay = mean(arr_delay, na.rm = TRUE),
    med_arr_delay = median(arr_delay, na.rm = TRUE)
  )
flights3 <- flights %>%
  group_by(carrier) %>%
  summarise(avg_flight_time = mean(air_time, na.rm = TRUE)) %>%
  arrange(desc(avg_flight_time))
airlines %>% left_join(flights3)
## Joining, by = "carrier"
Read about the car speeding data set.
“In a study into the effect that warning signs have on speeding patterns, Cambridgeshire County Council considered 14 pairs of locations. The locations were paired to account for factors such as traffic volume and type of road. One site in each pair had a sign erected warning of the dangers of speeding and asking drivers to slow down. No action was taken at the second site. Three sets of measurements were taken at each site. Each set of measurements was nominally of the speeds of 100 cars but not all sites have exactly 100 measurements. These speed measurements were taken before the erection of the sign, shortly after the erection of the sign, and again after the sign had been in place for some time.”
speeding <- read_csv("http://vincentarelbundock.github.io/Rdatasets/csv/boot/amis.csv",
col_types = cols(col_double(), col_double(), col_factor(), col_factor(), col_factor())) %>%
rename(row = `...1`)
## New names:
## • `` -> `...1`
head(speeding)
Answer in words. Produce graphs. Explain how graphs support your answer.
Danger: If you have not carefully read the problem description paragraph, your answer is not likely to make sense!
speeding %>%
  group_by(period, warning, pair) %>%
  summarise(avg_speed = mean(speed, na.rm = TRUE)) %>%
  ggplot() +
  geom_point(aes(x = period, y = avg_speed, color = pair, shape = warning))
## `summarise()` has grouped output by 'period', 'warning'. You can override using
## the `.groups` argument.
speeding %>%
  group_by(period, warning, pair) %>%
  summarise(avg_speed = mean(speed, na.rm = TRUE)) %>%
  ggplot() +
  geom_point(aes(x = period, y = avg_speed, color = warning)) +
  facet_wrap(~pair)
## `summarise()` has grouped output by 'period', 'warning'. You can override using
## the `.groups` argument.
speeding %>%
  group_by(warning, period) %>%
  summarise(avg_speed = mean(speed, na.rm = TRUE)) %>%
  ggplot() +
  geom_point(aes(x = period, y = avg_speed, color = warning))
## `summarise()` has grouped output by 'warning'. You can override using the
## `.groups` argument.
I would conclude that the warning signs are effective initially, but that their effect falls off once drivers get used to them.
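One way to check this conclusion numerically is to compare the two sites within each pair, period by period. The sketch below is only a rough check, not a formal test. It assumes the coding given on the `boot::amis` help page (the data set this CSV was exported from): `period` 1 = before the sign, 2 = shortly after, 3 = after some time; `warning` 1 = sign erected, 2 = no sign. The names `amis`, `pair_diffs`, `with_sign`, and `without_sign` are introduced here for illustration.

```r
library(dplyr)
library(tidyr)

# Label the numeric codes; the labels follow the assumed coding in ?boot::amis.
amis <- boot::amis %>%
  mutate(
    period  = factor(period, levels = 1:3,
                     labels = c("before", "shortly_after", "later")),
    warning = factor(warning, levels = 1:2,
                     labels = c("with_sign", "without_sign"))
  )

# Average speed per pair, period, and site, then the within-pair difference.
pair_diffs <- amis %>%
  group_by(pair, period, warning) %>%
  summarise(avg_speed = mean(speed, na.rm = TRUE), .groups = "drop") %>%
  pivot_wider(names_from = warning, values_from = avg_speed) %>%
  mutate(diff = with_sign - without_sign)

# Mean within-pair difference for each period.
pair_diffs %>%
  group_by(period) %>%
  summarise(mean_diff = mean(diff, na.rm = TRUE), n_pairs = n())
```

If the signs work as described, `mean_diff` should drop from "before" to "shortly_after" and move back toward its starting value in "later". Comparing within pairs matters because the paired sites were chosen to match on traffic volume and road type.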