NYC Flights Homework

Author

NCowan

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(nycflights23)
data(flights)
# Keep late arrivals only, bin each delay into a duration group, then count flights per airport and group
delayed_flights <- flights %>%
  filter(arr_delay > 0) %>%
  mutate(delay_group = cut(arr_delay, breaks = c(0, 30, 60, 120, 180, Inf),
    labels = c("0-30 min", "31-60 min", "61-120 min", "121-180 min", "180+ min"), right = TRUE)) %>%
  group_by(origin, delay_group) %>%
  summarize(total_delays = n(), .groups = "drop") %>%
  mutate(origin = recode(origin, "JFK" = "John F. Kennedy International",
    "LGA" = "LaGuardia Airport", "EWR" = "Newark Liberty International"))
# Stacked bars: one bar per airport, one colored section per delay group
ggplot(delayed_flights, aes(x = origin, y = total_delays, fill = delay_group)) +
  geom_col() +
  scale_fill_brewer(palette = "Purples") +
  scale_x_discrete(labels = function(x) stringr::str_wrap(x, width = 12)) +
  labs(
    x = "Origin Airport", y = "Number of Delayed Arrivals",
    title = "Delayed Arrivals by Airport and Delay Duration (NYC, 2023)",
    caption = "Source: nycflights23 dataset", fill = "Amount Delayed"
  ) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 0))

I made a stacked bar graph showing how many flights from each of the three NYC airports arrived late, and by how much. The three airports are on the x-axis and the number of delayed arrivals is on the y-axis, with colored sections inside each bar showing how long the delay was. I think a nice thing about this graph is that you can easily see which airport has the most delayed arrivals and roughly estimate what percentage of its delays fall into each duration group. For this project, I used Google to find out how to connect all of the steps in the first chunk. I typed “how to connect arguments when filtering data in r studio” into Google to see how to put the first part of my code together. I also broke the code onto separate lines so that it is not one big line.
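
The percentages mentioned above can only be eyeballed from the graph, so here is a rough sketch of one way to calculate them. It uses a small made-up data frame instead of the real flights data, and the names toy_flights, delay_shares, and share are only for illustration; they are not part of the homework code.

```r
library(dplyr)

# Hypothetical toy data: a few arrival delays (in minutes) for two airports.
toy_flights <- data.frame(
  origin    = c("JFK", "JFK", "JFK", "JFK", "LGA", "LGA"),
  arr_delay = c(5, 30, 45, 200, 10, 90)
)

delay_shares <- toy_flights %>%
  filter(arr_delay > 0) %>%
  # Same bins as the homework chunk; right = TRUE puts a 30-minute delay in "0-30 min".
  mutate(delay_group = cut(
    arr_delay,
    breaks = c(0, 30, 60, 120, 180, Inf),
    labels = c("0-30 min", "31-60 min", "61-120 min", "121-180 min", "180+ min"),
    right = TRUE
  )) %>%
  # Count flights per airport and delay group.
  count(origin, delay_group, name = "total_delays") %>%
  # Divide each count by the airport's total so the shares add up to 1 per airport.
  group_by(origin) %>%
  mutate(share = total_delays / sum(total_delays)) %>%
  ungroup()

print(delay_shares)
```

If the bars themselves should show percentages instead of counts, one option in ggplot2 is geom_col(position = "fill"), which stretches every bar to the same height so the colored sections can be compared directly.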