NYC Flights

Author

Hannah Le

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
# install.packages("devtools")
devtools::install_github("r-lib/conflicted")
Skipping install of 'conflicted' from a github remote, the SHA1 (4d759ac6) has not changed since last install.
  Use `force = TRUE` to force installation
#install.packages("nycflights23")
library(nycflights23)
library(RColorBrewer)
data(flights)

Calculate the average arrival delay per carrier and month

heatmap_data <- flights %>%
  group_by(carrier, month) %>%
  # na.rm = TRUE skips flights with no recorded arrival delay
  summarize(avg_arr_delay = mean(arr_delay, na.rm = TRUE), .groups = "drop") %>%
  arrange(desc(avg_arr_delay))
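Because `na.rm = TRUE` silently drops flights with no recorded arrival delay, it can help to see how many flights sit behind each average. The sketch below is one way to do that. The helper `summarise_delays()`, the columns `n_flights` and `n_missing`, and the toy data frame are illustrative assumptions, not part of the original analysis.

```r
library(dplyr)

# Hypothetical helper: per carrier and month, report the mean arrival delay,
# the number of flights in the group, and how many of those flights have a
# missing arr_delay value.
summarise_delays <- function(df) {
  df %>%
    group_by(carrier, month) %>%
    summarise(
      avg_arr_delay = mean(arr_delay, na.rm = TRUE),
      n_flights     = n(),
      n_missing     = sum(is.na(arr_delay)),
      .groups       = "drop"
    )
}

# Toy data for illustration only (made-up values, not taken from nycflights23).
toy_flights <- data.frame(
  carrier   = c("AA", "AA", "AA", "UA"),
  month     = c(1L, 1L, 2L, 1L),
  arr_delay = c(10, NA, -5, 20)
)

delay_summary <- summarise_delays(toy_flights)
print(delay_summary)
```

With the real data, the same helper could be called as `summarise_delays(flights)`. Cells based on very few flights, or with many missing values, are worth reading with more caution in the heatmap.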

Average Arrival Delay by Airline Carrier and Month (2023)

ggplot(heatmap_data, aes(x = factor(month), y = carrier, fill = avg_arr_delay)) +
  geom_tile(color = "purple") +
  labs(
    title = "Average Arrival Delay by Airline Carrier and Month (2023)",
    x = "Month",
    y = "Airline Carrier",
    fill = "Avg Arrival Delay (mins)",
    caption = "Data Source: nycflights23"
  ) +
  scale_fill_gradient(low = "lightblue", high = "red") +
  theme_minimal()

One notable feature of the plot is that average delays vary by both carrier and month rather than rising uniformly across the year. Darker red cells mark carrier-month combinations with higher average arrival delays. If July and August show deep red for some carriers, for example, that would point to delays from peak summer travel or severe weather. The heatmap makes such seasonal patterns easy to spot and highlights the periods when carriers face more demand or need to improve their on-time performance.
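To back up a visual reading of the heatmap, the carrier-month cells with the highest averages can be listed directly. The sketch below assumes a table shaped like `heatmap_data` (columns `carrier`, `month`, `avg_arr_delay`). The helper `top_delay_cells()` and the toy values are hypothetical and are not results from nycflights23.

```r
library(dplyr)

# Hypothetical helper: return the n carrier-month cells with the highest
# average arrival delay, sorted from highest to lowest.
top_delay_cells <- function(delay_table, n = 5) {
  delay_table %>%
    ungroup() %>%
    slice_max(avg_arr_delay, n = n, with_ties = FALSE)
}

# Toy table for illustration only (made-up values).
toy_heatmap <- data.frame(
  carrier       = c("AA", "AA", "UA", "UA"),
  month         = c(1L, 7L, 1L, 7L),
  avg_arr_delay = c(2.5, 30.1, -1.0, 18.4)
)

top_cells <- top_delay_cells(toy_heatmap, n = 2)
print(top_cells)
```

With the real summary, `top_delay_cells(heatmap_data, n = 10)` would show whether the reddest cells actually cluster in the summer months.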