NYC Flights Assignment

Author

A Porambo

NYC Flights in 2023

Load the libraries

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(nycflights23)
library(alluvial) 
library(ggalluvial)
library(streamgraph)
data("flights")

Check for unique carriers

unique(flights$carrier)
 [1] "UA" "DL" "B6" "AA" "NK" "WN" "AS" "YX" "9E" "HA" "G4" "MQ" "OO" "F9"

Filter flights down to those operated by American Airlines, Delta Air Lines, and United Airlines with arrival delays greater than 0 minutes. Arrange the data first by month, followed by origin airport, carrier, and destination.

Delayed_Arrival_Flights <- flights %>% 
  # Keep the three carriers of interest; compare arr_delay to the number 0, not the string "0"
  filter(carrier %in% c("AA", "UA", "DL"), arr_delay > 0) %>%
  arrange(month, origin, carrier, dest)
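As a quick sanity check on a filter like this, it can help to run the same conditions on a tiny table where the expected result is obvious. The sketch below uses a made-up `toy_flights` tibble (hypothetical data, not the real `flights` table) and assumes only that dplyr is installed.

```r
library(dplyr)

# Hypothetical toy data standing in for `flights`
toy_flights <- tibble(
  carrier   = c("AA", "UA", "DL", "B6", "AA", "UA"),
  arr_delay = c(12, -5, 0, 30, NA, 47)
)

# Same conditions as the filter step: three carriers, numeric delay above 0
toy_delayed <- toy_flights |>
  filter(carrier %in% c("AA", "UA", "DL"), arr_delay > 0)

# Every remaining row should be one of the three carriers with a positive delay
stopifnot(all(toy_delayed$carrier %in% c("AA", "UA", "DL")))
stopifnot(all(toy_delayed$arr_delay > 0))

toy_delayed
```

Note that `filter()` drops rows where the condition evaluates to `NA`, so flights with a missing `arr_delay` are excluded along with early and on-time arrivals.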

Create a new variable, quarter, to indicate which quarter of the calendar year the flight occurred in.

Delayed_Arrival_Flights2 <- Delayed_Arrival_Flights |> 
  # Refer to the month column directly inside mutate() rather than through `$`
  mutate(quarter = case_when(month %in% 1:3 ~ "Q1",
                             month %in% 4:6 ~ "Q2",
                             month %in% 7:9 ~ "Q3",
                             month %in% 10:12 ~ "Q4"))
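The month-to-quarter mapping can also be written arithmetically, which is a handy way to double-check the labels. This is a minimal base R sketch; the `months` and `quarter_labels` names are only for illustration.

```r
# Integer division groups months 1-3, 4-6, 7-9, and 10-12 into quarters 1 to 4
months <- 1:12
quarter_labels <- paste0("Q", (months - 1) %/% 3 + 1)

# Show each month next to its quarter label
data.frame(month = months, quarter = quarter_labels)
```

Either approach should give the same labels; `case_when()` is more verbose but easier to adapt if the quarters ever need to follow a fiscal calendar instead.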

Plot faceted, stacked bar plots.

ggplot(Delayed_Arrival_Flights2, aes(x = origin, fill = carrier)) +
  geom_bar() +
  facet_wrap(~quarter) +
  scale_fill_brewer(palette = "Dark2") +
  labs(title = "Flights with Delayed Arrivals from NYC Airports by Carrier, 2023",
       caption = "Source: RITA, Bureau of Transportation Statistics")

I’ve created a series of four faceted stacked bar plots to see the number of flights with delayed arrivals that originated from each of the three New York City-area airports by carrier and quarter. The origin airports are listed on the x-axes of these four charts, and the count of delayed arrivals from these airports is indicated on the y-axes. Each bar is in turn divided by color to show how many of these delays occurred on each of the three carriers: American (AA), Delta (DL), and United (UA). Stacked together in a single bar, the segments represent the quarterly total of delayed arrivals for all three airlines at the origin airport.
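Because `geom_bar()` simply counts rows, the height of each colored segment can be reproduced as a table with `count()`. The sketch below uses a made-up `toy_delays` tibble (hypothetical rows, not the real data), and the `delayed_arrivals` column name is my own choice.

```r
library(dplyr)

# Hypothetical toy data: one row per delayed arrival
toy_delays <- tibble(
  quarter = c("Q1", "Q1", "Q1", "Q4", "Q4"),
  origin  = c("EWR", "EWR", "JFK", "EWR", "LGA"),
  carrier = c("UA", "UA", "DL", "UA", "AA")
)

# One row per bar segment: the count geom_bar() would draw for that combination
toy_counts <- toy_delays |>
  count(quarter, origin, carrier, name = "delayed_arrivals")

toy_counts
```

Running the same `count()` call on the real data frame would give the exact numbers behind the plot, which makes comparisons such as "almost double" easier to verify than reading bar heights by eye.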

The most immediately striking aspect of the stacked bar plots is that Newark Liberty has by far the most delayed arrivals in each of the four quarters of 2023. Even at its lowest count, in Q4, it has almost double the number of delays recorded for the airport with the second-highest count, JFK. Among the delays from Newark Liberty, United accounts for by far the most, and this holds in every quarter of 2023. At JFK and LaGuardia, however, most of the delays occurred on American and Delta flights. Another odd aspect I noticed was the relatively low number of delays in the fourth quarter of the year. This is the quarter that includes Thanksgiving and Christmas, two of the most travel-heavy holidays of the year; if anything, I thought it would be the quarter with the most delays.
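One caveat with raw counts is that a quarter with fewer flights overall will also tend to show fewer delays. A possible follow-up, sketched here on made-up data, is to compute the share of flights that arrived late in each quarter. The `toy_flights` tibble and the `share_delayed` column are hypothetical and only illustrate the calculation.

```r
library(dplyr)

# Hypothetical toy data: one row per flight, delayed or not
toy_flights <- tibble(
  quarter   = c("Q1", "Q1", "Q1", "Q1", "Q4", "Q4"),
  arr_delay = c(15, -3, 22, NA, 8, -10)
)

toy_rates <- toy_flights |>
  filter(!is.na(arr_delay)) |>           # drop flights with no recorded arrival delay
  group_by(quarter) |>
  summarise(
    flights       = n(),                 # flights with a known arrival delay
    delayed       = sum(arr_delay > 0),  # flights that arrived late
    share_delayed = delayed / flights
  )

toy_rates
```

Applied to the unfiltered flights for the three carriers, a table like this would help separate "fewer flights" from "fewer delays per flight" when interpreting the low Q4 counts.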

These patterns bring up far more questions than answers. What causes so many planes departing from Newark to arrive late at their destinations? Does this airport experience a greater volume of travelers than either JFK or LaGuardia? Is there a greater shortage of labor at Newark than at the other two airports? Why are so many of the delayed flights from United Airlines? Is the New York City area, and Newark Liberty in particular, a major hub for the airline? Is it United Airlines, rather than Newark Liberty, that is experiencing an acute labor shortage? Are people increasingly driving to their destinations for Thanksgiving and Christmas precisely in order to avoid airline delays? Did something occur near the end of 2023 that lowered the demand for air travel? I can only guess what inquiries a different form of analysis would bring up!