Heatmaps, Treemaps, and Alluvials - Data 110

Author

Leah Marshall

library(tidyverse)
library(nycflights23)
data(flights)
flights_clean <- flights |>
  filter(!is.na(arr_delay) & !is.na(hour))
delays_by_hour <- flights_clean |>
  group_by(hour) |>
  summarize(
    avg_arr_delay = mean(arr_delay),
    count = n(),
    .groups = "drop"
  )
ggplot(delays_by_hour, aes(x = hour, y = avg_arr_delay)) +
  # linewidth replaces the size argument for lines, deprecated in ggplot2 3.4.0
  geom_line(color = "steelblue", linewidth = 1.2) +
  geom_point(aes(color = avg_arr_delay), size = 3) +
  scale_color_gradient(low = "green", high = "red") +
  labs(
    title = "Average Arrival Delay by Hour of Day",
    x = "Hour of Day",
    y = "Average Arrival Delay (minutes)",
    color = "Avg Delay (min)",
    caption = "Data source: Bureau of Transportation Statistics, via nycflights23"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 14),
    axis.title = element_text(face = "bold")
  )

## Essay

My visualization shows how average arrival delays change over the course of the day, using the flights table from the nycflights23 data set. I used a line chart of the average arrival delay for each scheduled departure hour, with the points colored by the size of the delay: green for low delays, yellow for moderate delays, and red for high delays. The x-axis shows the hour of the day, and the y-axis shows the average arrival delay in minutes. What stands out is that delays grow as the day goes on. Morning flights tend to be closer to on time, with most points in green, while afternoon and evening flights have higher average delays, shown in yellow and red. I think this happens because small delays early in the day accumulate and push back later flights, although the chart alone cannot confirm the cause. I chose the color scale so that readers can quickly see which times of day are best for on-time flights. Overall, the visualization makes it easy to see when in the day average delays are largest.
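
The chart plots average delays, so it does not directly show how likely a flight is to arrive late at a given hour. One way to check that part of the essay is to compute, for each hour, the share of flights that arrive late. The sketch below is only an illustration and was not part of the original analysis: the helper name `delay_share_by_hour` is hypothetical, and the 15-minute cutoff for counting a flight as "delayed" is an assumption.

```r
library(dplyr)

# Hypothetical helper (not part of the original analysis): for each scheduled
# departure hour, compute the mean arrival delay and the share of flights
# arriving more than `threshold` minutes late.
# The default 15-minute cutoff is an assumption.
delay_share_by_hour <- function(df, threshold = 15) {
  df |>
    filter(!is.na(arr_delay), !is.na(hour)) |>
    group_by(hour) |>
    summarize(
      avg_arr_delay = mean(arr_delay),
      share_delayed = mean(arr_delay > threshold),
      count = n(),
      .groups = "drop"
    ) |>
    arrange(hour)
}

# Apply it to the same flights data, if the package is installed
if (requireNamespace("nycflights23", quietly = TRUE)) {
  data("flights", package = "nycflights23")
  print(delay_share_by_hour(flights))
}
```

If `share_delayed` rises through the day in the same way as `avg_arr_delay`, that would support reading the chart as "later flights are more likely to be delayed". The `count` column also shows which hours have few flights, where the averages are less reliable.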