Can Pilots Make Up Time?

Introduction

Pilots are often under pressure to maintain a schedule. While they can’t control traffic on the ground, they have a fair bit of control over routing and aircraft performance in the air. It is therefore reasonable to ask whether pilots can realistically be expected to make up for delays while airborne by adjusting the cost index of the flight (the balance between speed and efficiency).

Data Processing

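library(tidyverse)  # provides read_csv(), the dplyr verbs, and ggplot2

# Load the on-time report, keeping only the delay and air-time columns (minutes)
flights <- read_csv("T_ONTIME_REPORTING.csv") |>
  select(
    arr_delay = ARR_DELAY,
    dep_delay = DEP_DELAY,
    air_time = AIR_TIME
  )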
## Rows: 537902 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (10): ORIGIN_AIRPORT_ID, ORIGIN_AIRPORT_SEQ_ID, ORIGIN_CITY_MARKET_ID, D...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Number of Rows and NA count by column:

print(flights |> nrow())
## [1] 537902
print(
  flights |> summarise(
    dep_delay_na = sum(is.na(dep_delay)),
    arr_delay_na = sum(is.na(arr_delay)),
    air_time_na = sum(is.na(air_time)),
    )
)
## # A tibble: 1 × 3
##   dep_delay_na arr_delay_na air_time_na
##          <int>        <int>       <int>
## 1        32914        34373       34373

Filtering and Computation

To assess whether pilots can make up time, a derived measure, Time Made Up, was computed as the departure delay minus the arrival delay. A positive value means the flight arrived less late than it departed; a negative value means it lost further time. Only flights that departed more than 30 minutes late were kept.

flights <- flights |>
  filter(dep_delay > 30) |>
  mutate(made_up_time = dep_delay - arr_delay)
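
To make the sign convention concrete, here is a minimal sketch on a small toy table. The four rows are illustrative values chosen for this example, not rows from the dataset, and `toy` and `toy_late` are hypothetical names. It also shows that a row with a missing arrival delay survives the filter and ends up with an NA for the derived measure.

```r
library(dplyr)

# Toy data: illustrative values only, not taken from T_ONTIME_REPORTING.csv
toy <- data.frame(
  dep_delay = c(45, 60, 20, 90),
  arr_delay = c(30, 75, 10, NA)
)

toy_late <- toy |>
  filter(dep_delay > 30) |>                     # keep departures more than 30 minutes late
  mutate(made_up_time = dep_delay - arr_delay)  # positive = time recovered, negative = time lost

print(toy_late)
```

The 20-minute departure is dropped by the filter. The remaining flights recover 15 minutes, lose 15 minutes, and have an unknown result, in that order.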

Analysis

ggplot(flights, aes(x = air_time, y = made_up_time)) +
  geom_point(size = 0.05) +
  labs(title = "Flight Time vs Time Made Up In Flight",
       x = "Flight Time (minutes)", y = "Time Made Up (minutes)") +
  theme_minimal()
## Warning: Removed 422 rows containing missing values or values outside the scale range
## (`geom_point()`).

cor_matrix <- cor(
  flights |>
    select(made_up_time, dep_delay, arr_delay), use = "complete.obs")
print(cor_matrix)
##              made_up_time   dep_delay  arr_delay
## made_up_time   1.00000000 -0.02408648 -0.1901903
## dep_delay     -0.02408648  1.00000000  0.9860434
## arr_delay     -0.19019028  0.98604344  1.0000000

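With a Pearson correlation of r = -0.02, there is essentially no linear relationship between departure delay and time made up. In other words, a longer delay is not associated with more time recovered, which suggests that pilots are typically unable to get back on schedule by altering the route or performance of the aircraft. This is consistent with the strong correlation between arrival delay and departure delay (r = 0.99): arrival delay closely tracks departure delay. There is also a weak negative correlation between arrival delay and time made up (r = -0.19). At least part of this is expected by construction, because time made up is defined as departure delay minus arrival delay, so arrival delay enters the measure with a negative sign. Separating that built-in effect from any real operational pattern requires more investigation.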
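
One way to probe the arrival-delay correlation is to check how much of it follows from the definition alone. The sketch below uses simulated data, not the flight dataset: the distributions and their parameters are assumptions picked purely for illustration. Time made up is drawn independently of departure delay, and arrival delay is then derived from the same identity used above.

```r
set.seed(42)  # reproducible simulated data; these are not the real flights

n <- 10000
dep_delay    <- 30 + rexp(n, rate = 1 / 60)   # hypothetical departure delays, all above 30 minutes
made_up_time <- rnorm(n, mean = 0, sd = 10)   # drawn independently of dep_delay (assumption)
arr_delay    <- dep_delay - made_up_time      # rearranged definition of made_up_time

# Same three-variable correlation matrix as in the analysis, on the simulated data
sim_cor <- cor(cbind(made_up_time, dep_delay, arr_delay))
print(round(sim_cor, 2))
```

If the simulated matrix shows a negative correlation between `made_up_time` and `arr_delay` even though no operational effect was built in, then a negative value in the real data cannot be read, on its own, as evidence of one.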