NYC Flights Homework

Author

Aashka Navale

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(nycflights13)
data(flights)
anyNA(flights$dep_delay)
[1] TRUE
anyNA(flights$arr_delay)
[1] TRUE
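# Drop flights with a missing departure or arrival delay
flights_nona <- flights |>
  filter(!is.na(dep_delay), !is.na(arr_delay))
# Keep only the summer months (June through August)
summerflights <- flights_nona |>
  filter(month >= 6, month <= 8)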
ggplot(summerflights, aes(dep_delay, arr_delay, color = origin)) +  
  geom_point() +
  scale_color_manual(values = c("paleturquoise4", "plum3", "honeydew3")) +
  labs(title = "NYC Summer Flights\nDeparture Delays & Arrival Delays",
       x = "Departure Delays (in minutes)",
       y = "Arrival Delays (in minutes)", 
       color = "Origin",
       caption = "Source: RITA, Bureau of Transportation Statistics")

This plot shows the relationship between departure delays (x-axis) and arrival delays (y-axis), both in minutes, for flights leaving Newark Liberty International Airport (EWR), John F. Kennedy International Airport (JFK), and LaGuardia Airport (LGA) during the summer months (June, July, and August) of 2013. As the departure delay increases, the arrival delay also increases, so the graph shows a positive correlation. I observed that JFK seemed to have the most arrival and departure delays, which might be the result of more flights from that airport. The flight with the longest delay departed 1137 minutes late and arrived 1127 minutes late (18.95 hours and 18.78 hours, respectively).
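The observations above come from reading the scatterplot, so a quick numerical check may be useful. The following is a minimal sketch, not part of the original analysis: the object names `summer`, `delay_cor`, `by_origin`, and `worst` are hypothetical, and counting a flight as "delayed" when `dep_delay > 0` is an assumption. It computes the correlation between the two delays, summarises flights per origin airport, and pulls out the flight with the longest departure delay.

```r
library(dplyr)
library(nycflights13)

# Same filtering as in the analysis: drop missing delays, keep June through August
summer <- flights |>
  filter(!is.na(dep_delay), !is.na(arr_delay), month %in% 6:8)

# Pearson correlation between departure and arrival delay
delay_cor <- cor(summer$dep_delay, summer$arr_delay)

# Per-airport summary; "late" here means dep_delay > 0 (an assumption)
by_origin <- summer |>
  group_by(origin) |>
  summarise(
    n_flights      = n(),
    n_late_dep     = sum(dep_delay > 0),
    mean_dep_delay = mean(dep_delay),
    mean_arr_delay = mean(arr_delay)
  )

# Flight with the longest departure delay in the summer subset
worst <- summer |>
  slice_max(dep_delay, n = 1, with_ties = FALSE) |>
  select(month, day, carrier, flight, origin, dest, dep_delay, arr_delay)

delay_cor
by_origin
worst
```

Comparing `n_flights` with `n_late_dep` in `by_origin` would indicate whether an airport simply has more flights or a larger share of delayed ones, which the scatterplot alone cannot separate because of overplotting.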