NYC_Flights_Homework

Author

Arthur Krause Nunes De Almeida

NYC Flights Homework

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(nycflights23)
data("flights")

April

# Creating a new data set without NA arrival delays
flights1 <- flights %>%
  arrange(month) %>%
  filter(!is.na(arr_delay)) %>%
  group_by(month)
# Mean arrival delay per carrier and month, computed from the NA-filtered data
heatmapdf <- flights1 %>%
  group_by(carrier, month) %>%
  summarize(arr_delay = mean(arr_delay, na.rm = TRUE))
`summarise()` has grouped output by 'carrier'. You can override using the
`.groups` argument.
p1 <- heatmapdf |>
  ggplot(aes(x = factor(month), y = carrier, fill = arr_delay)) +
  geom_tile() + # Creating the geometry for a heat map
  scale_fill_distiller(palette = "Spectral") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 0)) +
  xlab(label = "Month") +
  ylab(label = "Carrier") +
  labs(title = "Arrival Delay per carrier each month",
       subtitle = "Plot of carrier by month",
       caption = "Data source: Flights23(FAA Aircraft registry)")
p1

This visualization is a heat map of the mean arrival delay (in minutes) for each carrier in each month. It shows which carriers tend to arrive with the least delay and which months tend to have shorter delays than the rest. We first filtered the original data to remove all NA arrival delays so that they do not influence the averages. We then built the plot with ggplot2, using geom_tile() to draw one tile per carrier and month, and the graded "Spectral" color palette to give the colors a more harmonious feel. The colors show some clear differences: for example, delays in June and July are higher than in the other months, especially for carrier F9. Another example is carrier G4, which has a much smaller arrival delay, so I would recommend it when a connection has to be made.
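Reading exact values from tile colors is hard, so it may help to check the visual impression numerically. The sketch below repeats the filter, group, and summarize steps described above and then sorts the carrier-month cells by mean delay. It assumes a small made-up tibble, `toy_flights`, whose values are hypothetical and stand in for the real `flights` data; only the steps matter, not the numbers.

```r
library(dplyr)

# Toy stand-in for `flights`: hypothetical values, for illustration only
toy_flights <- tibble(
  carrier   = c("AA", "AA", "AA", "B6", "B6", "B6"),
  month     = c(1, 1, 2, 1, 2, 2),
  arr_delay = c(10, NA, -5, 30, 20, NA)
)

# Drop missing arrival delays, then average per carrier and month
toy_heatmap <- toy_flights %>%
  filter(!is.na(arr_delay)) %>%
  group_by(carrier, month) %>%
  summarize(arr_delay = mean(arr_delay), .groups = "drop")

# Sort the carrier-month cells from most delayed to least delayed
toy_ranked <- toy_heatmap %>%
  arrange(desc(arr_delay))

toy_ranked
```

Replacing `toy_flights` with the real `flights` data should give a ranked table that can be compared against the darkest and lightest tiles in the heat map.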