Week 5

Author

B.Liu

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.3     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.3     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(nycflights13)
#data(package = "nycflights13")
windflight <- flights |>
  select(year, month, day, sched_dep_time, dep_delay, dep_time, carrier) |>
  left_join(select(weather, year, month, day, hour, wind_speed), by = c("year", "month", "day"))
Warning in left_join(select(flights, year, month, day, sched_dep_time, dep_delay, : Detected an unexpected many-to-many relationship between `x` and `y`.
ℹ Row 1 of `x` matches multiple rows in `y`.
ℹ Row 1 of `y` matches multiple rows in `x`.
ℹ If a many-to-many relationship is expected, set `relationship =
  "many-to-many"` to silence this warning.
JanAA <- windflight |> filter(month == 1, carrier == "AA")
avedaywind <- JanAA |> group_by(year, month, day) |>
  summarise(avewind = mean(wind_speed, na.rm = TRUE))
`summarise()` has grouped output by 'year', 'month'. You can override using the
`.groups` argument.
avedepdelay <- JanAA |> group_by(year, month, day) |>
  summarise(avedepdelay = mean(dep_delay, na.rm = TRUE))
`summarise()` has grouped output by 'year', 'month'. You can override using the
`.groups` argument.
avewinddelay <- avedepdelay |> left_join(avedaywind)
Joining with `by = join_by(year, month, day)`
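The join on year, month and day pairs every flight with every hourly weather reading for that day, which is why dplyr reports a many-to-many relationship. A minimal sketch of an alternative, assuming one pooled wind average per day across all three origin airports is what is wanted: summarise each table to one row per day first, then join. The names `daily_wind`, `daily_delay` and `avewinddelay_alt` are illustrative and not part of the original analysis.

```r
library(dplyr)
library(nycflights13)

# One row per day: mean of the hourly wind readings (all origins pooled).
daily_wind <- weather |>
  filter(month == 1) |>
  group_by(year, month, day) |>
  summarise(avewind = mean(wind_speed, na.rm = TRUE), .groups = "drop")

# One row per day: mean departure delay of American Airlines flights in January.
daily_delay <- flights |>
  filter(month == 1, carrier == "AA") |>
  group_by(year, month, day) |>
  summarise(avedepdelay = mean(dep_delay, na.rm = TRUE), .groups = "drop")

# Both sides now have at most one row per day, so the join is one-to-one.
avewinddelay_alt <- daily_delay |>
  left_join(daily_wind, by = c("year", "month", "day"))
```

Because each side is reduced to daily rows before the join, no flight rows are duplicated and no relationship warning should be raised; `.groups = "drop"` also avoids the grouped-output message from `summarise()`.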
ggplot(data = avewinddelay, aes(x = day)) +
    geom_line(aes(y = avedepdelay, color = "Departure Delay"), linewidth = 2) +
    geom_line(aes(y = avewind, color = "Wind Speed"), linewidth = 1) +
    labs(x = "Day of Month", y = "Delay & Wind Speed", title = "Delayed Departures: American Airlines - January") +
    scale_color_manual(values = c("Departure Delay" = "tomato3", "Wind Speed" = "steelblue1"), 
                       labels = c("Departure Delay (Min)", "Wind Speed (MPH)"),
                       name = "Average") +
    scale_x_continuous(breaks = c(seq(1, 31, by = 4), 31))

Essay

We have a graph showing the average daily departure delay for American Airlines flights in January, plotted alongside the average daily wind speed. The objective was to investigate whether a meaningful correlation exists between departure delays and daily wind speed. On initial examination, there appears to be a discernible connection between the two variables, although the plot alone does not measure how strong it is. I would also like to point out the different line thicknesses: the thicker departure-delay line is easier to tell apart from the wind-speed line and stands out more. This emphasis underscores the unfortunate nature of flight delays.
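The visual impression can be checked with a number. The sketch below is one possible way to do that, not part of the original analysis: it rebuilds the daily averages and computes the Pearson correlation between them. The object names `daily`, `r` and `ct` are illustrative, and with only 31 daily points the result should be read cautiously.

```r
library(dplyr)
library(nycflights13)

# Daily mean wind speed for January (hourly readings, all origins pooled).
daily_wind <- weather |>
  filter(month == 1) |>
  group_by(year, month, day) |>
  summarise(avewind = mean(wind_speed, na.rm = TRUE), .groups = "drop")

# Daily mean departure delay for American Airlines in January.
daily_delay <- flights |>
  filter(month == 1, carrier == "AA") |>
  group_by(year, month, day) |>
  summarise(avedepdelay = mean(dep_delay, na.rm = TRUE), .groups = "drop")

# Keep only days present in both tables.
daily <- inner_join(daily_delay, daily_wind, by = c("year", "month", "day"))

# Pearson correlation between daily delay and daily wind speed.
r <- cor(daily$avedepdelay, daily$avewind, use = "complete.obs")

# Test of the null hypothesis that the true correlation is zero.
ct <- cor.test(daily$avedepdelay, daily$avewind)

r
ct$p.value
```

A value of `r` near zero would suggest that the apparent connection in the line plot is weak, while a value closer to 1 or -1 would support it. Correlation alone does not show that wind causes the delays.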