NYC Flights HW

Author

Thejitha Rajapakshe

NYC Flights Homework

Setup

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(nycflights13)
library(RColorBrewer)
data(flights)

Heat Map

# Average departure delay for each carrier on each day of the month.
# Note: `day` in flights is the day of the month (1-31), not the weekday.
flights |>
  filter(!is.na(dep_delay), !is.na(day)) |>
  group_by(carrier, day) |>
  summarise(avg_delay = mean(dep_delay)) |>
  ggplot(aes(x = day, y = carrier, fill = avg_delay)) +
  geom_tile() +
  scale_fill_gradient(low = "pink", high = "red") +
  labs(title = "Average Departure Delay by Carrier and Day of Month",
       x = "Day of Month",
       y = "Carrier",
       fill = "Average Delay (in minutes)",
       caption = "Source: NYC Flights Dataset")
`summarise()` has grouped output by 'carrier'. You can override using the
`.groups` argument.

This heatmap shows the average departure delay for each carrier on each day of the month. Each cell gives the mean departure delay (in minutes) for one carrier on one calendar day (1 to 31), pooled across all months in the dataset, because the day column in flights holds the day of the month rather than the day of the week. The color gradient runs from pink to red, with deeper red indicating longer average delays.
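The values behind individual cells can also be inspected as a table. The following is a minimal sketch that repeats the aggregation used for the plot; the names delay_by_carrier_day, n_flights, and top_cells are introduced here for illustration and are not part of the original analysis. The flight count is included because a cell built from only a few flights can show an extreme average.

```r
library(dplyr)
library(nycflights13)

# Same aggregation as the heat map, with a flight count added per cell
delay_by_carrier_day <- flights |>
  filter(!is.na(dep_delay)) |>
  group_by(carrier, day) |>
  summarise(avg_delay = mean(dep_delay), n_flights = n(), .groups = "drop")

# The five carrier/day cells with the highest average delay
top_cells <- delay_by_carrier_day |>
  slice_max(avg_delay, n = 5, with_ties = FALSE)

print(top_cells)
```

Setting .groups = "drop" returns an ungrouped tibble, so slice_max() ranks all cells together rather than within each carrier.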

One aspect of the plot worth highlighting is how much the average delay varies across carriers and days. This variation can be used to identify patterns, such as which carriers tend to experience longer delays on particular days of the month.
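The flights data has no weekday column, so comparing carriers by day of the week requires deriving one. A possible sketch, assuming lubridate's wday() applied to the time_hour column (the scheduled departure date and hour); the names delay_by_weekday, weekday, and weekday_plot are hypothetical and chosen for this example only.

```r
library(dplyr)
library(ggplot2)
library(lubridate)
library(nycflights13)

# Derive the weekday from the scheduled departure time, then average per carrier
delay_by_weekday <- flights |>
  filter(!is.na(dep_delay)) |>
  mutate(weekday = wday(time_hour, label = TRUE)) |>
  group_by(carrier, weekday) |>
  summarise(avg_delay = mean(dep_delay), .groups = "drop")

# Same tile layout as the original plot, with weekdays on the x axis
weekday_plot <- ggplot(delay_by_weekday,
                       aes(x = weekday, y = carrier, fill = avg_delay)) +
  geom_tile() +
  scale_fill_gradient(low = "pink", high = "red") +
  labs(title = "Average Departure Delay by Carrier and Day of Week",
       x = "Day of Week",
       y = "Carrier",
       fill = "Average Delay (in minutes)",
       caption = "Source: NYC Flights Dataset")

print(weekday_plot)
```

With label = TRUE, wday() returns an ordered factor, so the weekdays appear in calendar order on the axis instead of alphabetical order.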