Heatmaps Homework

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.3.0
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.1.0     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(nycflights23)

data(flights)
flights_nona <- flights |>
  filter(!is.na(distance) & !is.na(arr_delay) & !is.na(dep_delay))
by_dest <- flights_nona |>
  group_by(carrier) |>
  summarise(count = n(),
            avg_dist = mean(distance),
            avg_arr_delay = mean(arr_delay),
            avg_dep_delay = mean(dep_delay),
            .groups = "drop") |>
  arrange(avg_arr_delay) |>
  filter(avg_dist < 2000)
head(by_dest)
# A tibble: 6 × 5
  carrier count avg_dist avg_arr_delay avg_dep_delay
  <chr>   <int>    <dbl>         <dbl>         <dbl>
1 G4        667     723.        -5.88           3.98
2 YX      85431     485.        -4.64           4.11
3 9E      52204     487.        -2.23           7.38
4 MQ        354     725.         0.119         10.5 
5 DL      60364    1278.         1.64          15.0 
6 AA      39750    1156.         5.27          14.0 
by_dest <- merge(by_dest, airlines, by = "carrier")
ggplot(by_dest, aes(x = reorder(name, avg_arr_delay),
                    y = avg_arr_delay,
                    fill = avg_dep_delay)) +
  geom_col() +
  coord_flip() +
  scale_fill_gradient(low = "green", high = "purple") +
  labs(title = "Average Arrival Delay by Airline (NYC Flights, 2023)",
       x = "Airline",
       y = "Average Arrival Delay (minutes)",
       fill = "Avg Departure Delay",
       caption = "Source: nycflights23 dataset") +
  theme_minimal()

The visualization above is a horizontal bar chart showing the average arrival delay by airline for flights departing New York City in 2023, restricted to airlines whose average flight distance is under 2,000 miles. Each bar represents one airline and is colored by its average departure delay: the larger the average departure delay, the more purple the bar; the smaller, the greener. The bars are ordered from the smallest average arrival delay to the largest, with Allegiant Air having the smallest and Frontier Airlines the largest. It surprised me that Allegiant Air, Republic Airline, and Endeavor Air all had an average arrival delay below zero, meaning their flights arrived early on average. This makes me wonder whether they simply had fewer flights or were luckier with weather conditions.
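One rough way to probe the "fewer flights" question is to compare flight counts for the carriers with the lowest arrival delays. The sketch below is only illustrative: it rebuilds a small data frame by hand from the six rows printed by head(by_dest) above rather than using the full summary, and the names top6, early, and rho are made up for this example.

```r
# Values copied from the head(by_dest) output above (first six carriers only).
top6 <- data.frame(
  carrier       = c("G4", "YX", "9E", "MQ", "DL", "AA"),
  count         = c(667, 85431, 52204, 354, 60364, 39750),
  avg_arr_delay = c(-5.88, -4.64, -2.23, 0.119, 1.64, 5.27),
  avg_dep_delay = c(3.98, 4.11, 7.38, 10.5, 15.0, 14.0)
)

# Carriers whose flights arrived early on average (negative arrival delay).
early <- top6[top6$avg_arr_delay < 0, ]
print(early[, c("carrier", "count", "avg_arr_delay")])

# Spearman rank correlation between flight volume and average arrival delay.
# With only six rows this is a rough check, not a conclusive test.
rho <- cor(top6$count, top6$avg_arr_delay, method = "spearman")
print(rho)
```

In these six rows, Allegiant (G4) has only 667 flights, but Republic (YX) and Endeavor (9E) each have tens of thousands, so low flight volume alone does not seem to explain the early arrivals. Running the same comparison on the full by_dest table would give a better picture.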