Exploring Jet Stream Effects on Domestic Flight Speeds

Jaclyn Janis

MPH 676, University of Southern Maine, Fall 2018

Purpose

Looking at flight data made me curious about the effects of jet streams and whether air speeds differ depending on the flight path. I reviewed information on jet streams from the North Carolina Climate Office and found that in winter, the polar jet stream affects a greater portion of the continental U.S. Though jet streams do not follow an exact west-east pattern (see Jet Stream Diagram), I chose flight paths that run more or less directly eastbound or westbound along lines of latitude for simplicity. The questions I chose to address in this assignment were:

  1. Do eastbound flights average higher air speeds than westbound flights?
  2. Do eastbound flights with a northernmost flight path reach faster airspeeds than eastbound flights with a southernmost flight path?
  3. Do westbound flights experience more arrival delays greater than 15 minutes than eastbound flights?

Jet Stream Diagram:

Preparing the Data

I downloaded the dataset, domestic_flights_jan_2016.csv, from the class website. I began by creating the calculated fields that were demonstrated in the Unit 6 lecture.

library(dplyr)
library(knitr)
usflights <- read.csv("domestic_flights_jan_2016.csv")

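Next, I converted the date and time variables to proper date and date-time objects. sprintf() pads each clock time to four digits, and as.POSIXct() parses the padded time together with the flight date.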

usflights$FlightDate <- as.Date(usflights$FlightDate, format = "%m/%d/%Y")

usflights <- usflights %>%
  mutate(new_CRSDepTime = paste(FlightDate, sprintf("%04d", CRSDepTime)))
usflights$new_CRSDepTime <- as.POSIXct(usflights$new_CRSDepTime, format = "%Y-%m-%d %H%M")

usflights <- usflights %>% 
  mutate(new_CRSArrTime = paste(FlightDate, sprintf("%04d", CRSArrTime)))
usflights$new_CRSArrTime <- as.POSIXct(usflights$new_CRSArrTime, format="%Y-%m-%d %H%M")

# Keep only flights that were not cancelled, then build date-time strings for the actual times
usflights <- usflights %>% filter(Cancelled == 0) %>%
  mutate(new_DepTime = paste(FlightDate, sprintf("%04d", DepTime)),
         new_WheelsOff = paste(FlightDate, sprintf("%04d", WheelsOff)),
         new_WheelsOn = paste(FlightDate, sprintf("%04d", WheelsOn)),
         new_ArrTime = paste(FlightDate, sprintf("%04d", ArrTime)))

usflights$new_DepTime <- as.POSIXct(usflights$new_DepTime, format="%Y-%m-%d %H%M")
usflights$new_WheelsOff <- as.POSIXct(usflights$new_WheelsOff, format="%Y-%m-%d %H%M")
usflights$new_WheelsOn <- as.POSIXct(usflights$new_WheelsOn, format="%Y-%m-%d %H%M")
usflights$new_ArrTime <- as.POSIXct(usflights$new_ArrTime, format="%Y-%m-%d %H%M")
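The padding-and-parsing step used above can be illustrated with a small helper. This is only a sketch: the function name hhmm_to_posixct and the fixed "UTC" time zone are assumptions made for the example, not part of the original analysis, which used the default time zone.

```r
# Hypothetical helper: combine a Date and an integer HHMM clock time into POSIXct.
# The tz argument is an assumption for reproducibility; the report's code omits it.
hhmm_to_posixct <- function(date, hhmm, tz = "UTC") {
  # sprintf("%04d", 5) gives "0005", so early-morning times keep four digits
  stamp <- paste(format(date, "%Y-%m-%d"), sprintf("%04d", as.integer(hhmm)))
  as.POSIXct(stamp, format = "%Y-%m-%d %H%M", tz = tz)
}

d   <- as.Date("01/15/2016", format = "%m/%d/%Y")
dep <- hhmm_to_posixct(d, 5)     # 00:05
arr <- hhmm_to_posixct(d, 1342)  # 13:42

# Minutes between the two clock times on the same date
elapsed <- as.integer(difftime(arr, dep, units = "mins"))
print(elapsed)
```

Each time difference taken later in this analysis (taxi out, taxi in, arrival delay) compares two clock times at the same airport, so no time-zone conversion is involved.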

To answer my questions, I didn’t need all of the flight data, so I selected just the flights I’d be using. For the northern flight path, I chose Boston <-> Seattle. For the southern flight path, I chose Atlanta <-> Los Angeles. I then created the variables needed to answer my questions: AirSpeed, ArrDel15, Fdirection (eastbound or westbound), and Path (northern or southern). I glanced through the data to ensure that Fdirection and Path were assigned correctly.

ewusflights <- usflights %>%
  mutate(TaxiOut = as.integer(difftime(new_WheelsOff, new_DepTime, units = "mins")),
         TaxiIn = as.integer(difftime(new_ArrTime, new_WheelsOn, units = "mins")),
         ArrDelay = as.integer(difftime(new_ArrTime, new_CRSArrTime, units = "mins")),
         ArrDelayMinutes = ifelse(ArrDelay < 0, 0, ArrDelay),  # early arrivals count as 0 minutes of delay
         ArrDel15 = ifelse(ArrDelay >= 15, 1, 0),              # 1 if the flight arrived 15 or more minutes late
         AirTime = ActualElapsedTime - TaxiOut - TaxiIn,
         AirSpeed = Distance / (AirTime / 60)) %>%
  filter((OriginCityName == "Boston, MA" & DestCityName == "Seattle, WA") |
         (OriginCityName == "Seattle, WA" & DestCityName == "Boston, MA") |
         (OriginCityName == "Atlanta, GA" & DestCityName == "Los Angeles, CA") |
         (OriginCityName == "Los Angeles, CA" & DestCityName == "Atlanta, GA")) %>%
  mutate(Fdirection = ifelse((OriginCityName == "Boston, MA" & DestCityName == "Seattle, WA") |
                             (OriginCityName == "Atlanta, GA" & DestCityName == "Los Angeles, CA"),
                             "Westbound", "Eastbound"),
         Path = ifelse((OriginCityName == "Boston, MA" & DestCityName == "Seattle, WA") |
                       (OriginCityName == "Seattle, WA" & DestCityName == "Boston, MA"),
                       "Northern Path", "Southern Path")) %>%
  select(OriginCityName, DestCityName, new_CRSDepTime, AirSpeed, ArrDel15, Fdirection, Path)

head(ewusflights)
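The AirSpeed calculation above divides distance by time in the air, converted from minutes to hours. A minimal worked example follows; the function name air_speed_mph and the input numbers are made up for illustration and do not come from the flight data.

```r
# Hypothetical helper mirroring the AirTime and AirSpeed formulas in the mutate() call
air_speed_mph <- function(distance_miles, elapsed_min, taxi_out_min, taxi_in_min) {
  # Time in the air = gate-to-gate time minus time spent taxiing at both ends
  air_time_min <- elapsed_min - taxi_out_min - taxi_in_min
  # Miles divided by hours in the air gives miles per hour
  distance_miles / (air_time_min / 60)
}

# Illustrative numbers: 2500 miles, 330 minutes gate to gate, 20 + 10 minutes taxiing
# Air time is 300 minutes (5 hours), so the speed is 2500 / 5 = 500 mph
speed <- air_speed_mph(2500, 330, 20, 10)
print(speed)
```

Because the result is distance over time in the air, a tailwind raises it and a headwind lowers it, which is why it is a useful proxy for jet stream effects here.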

Exploring the Data

  1. Do eastbound flights average higher air speeds than westbound flights?
  • I began by exploring some descriptive statistics of AirSpeed using the summary() function. I wanted to have the median and range of air speeds in mind. I then determined the mean air speeds for eastbound and westbound flights: 506 mph for eastbound flights, and 442 mph for westbound flights.
summary(ewusflights$AirSpeed)
library(ggvis)
avg_speed <- na.omit(ewusflights) %>% group_by(Fdirection) %>% summarize(Average_Speed = mean(AirSpeed))
avg_speed %>% ggvis(~Fdirection, ~Average_Speed) %>% layer_bars(fill := "#6699CC")
  2. Do eastbound flights with a northernmost flight path reach faster airspeeds than eastbound flights with a southernmost flight path?
  • I filtered the ewusflights dataframe for just eastbound flights, then plotted air speeds for the Seattle -> Boston flights (northern path) and the Los Angeles -> Atlanta flights (southern path). It looks like there is not much of a difference in air speeds.
eastbound <- na.omit(ewusflights) %>% filter(Fdirection == "Eastbound") %>% ggvis(~Path, ~AirSpeed) %>%
  layer_boxplots(fill := "#6699CC")
eastbound
  • Just out of curiosity, I plotted the air speeds for the westbound flights by northern and southern flight paths. Wow, that looks like absolutely no difference.
westbound <- na.omit(ewusflights) %>% filter(Fdirection == "Westbound") %>% ggvis(~Path, ~AirSpeed) %>%
  layer_boxplots(fill := "#6699CC")
westbound
  3. Do westbound flights experience more arrival delays greater than 15 minutes than eastbound flights?
  • To answer this question, I used the ArrDel15 variable previously created, which flags flights that arrived 15 or more minutes late. I calculated the rate of such arrival delays for westbound and eastbound flights. Interestingly, the delay rate was higher for eastbound flights (21.9%) than for westbound flights (14.9%).
directiondelay <- ewusflights %>% group_by(Fdirection) %>% summarize(Percent_Delayed = sum(ArrDel15)/n()*100)
directiondelay %>% ggvis(~Fdirection, ~Percent_Delayed) %>% layer_bars(fill := "#6699CC")
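The delay rate above is the share of flights with ArrDel15 equal to 1. The sketch below uses made-up toy data (not the flight data) and base R to show the same calculation, along with how a missing arrival time would affect it; whether the selected routes contain any such missing values is not something this example assumes either way.

```r
# Toy data (made up for illustration): 1 = arrived 15+ minutes late, NA = arrival time missing
toy <- data.frame(
  Fdirection = rep(c("Eastbound", "Westbound"), each = 4),
  ArrDel15   = c(1, 1, 0, 0,   1, 0, 0, NA)
)

# sum() / count propagates the NA, so the Westbound rate comes out as NA
naive <- tapply(toy$ArrDel15, toy$Fdirection, function(x) sum(x) / length(x) * 100)

# mean(..., na.rm = TRUE) drops the missing flight from numerator and denominator
rates <- tapply(toy$ArrDel15, toy$Fdirection, function(x) mean(x, na.rm = TRUE) * 100)
print(round(rates, 1))
```

Taking the mean of a 0/1 indicator is equivalent to sum(ArrDel15)/n() when no values are missing, and it makes the treatment of missing values explicit when some are.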

Discussion

Without formally testing for significance, it appears that there is a difference in average air speed between eastbound and westbound flights, and the direction of this difference matches the direction of the jet stream. Average eastbound flight speed was 506 mph, while average westbound flight speed was 442 mph. If these numbers had been closer, I might have imagined that pilots successfully navigated around jet streams by adjusting the altitude at which they flew. I also thought it possible that average speeds would be similar if westbound flights burned more fuel to maintain speed against the opposing wind. From my first question, I might conclude that eastbound flights gain an advantage from the jet stream.
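A significance test was not part of this exploration, but one way it might look is a Welch two-sample t-test on AirSpeed by Fdirection. The sketch below runs on simulated values, not the flight data: the two means are borrowed from the averages reported above, while the standard deviation, sample sizes, and seed are assumptions chosen only so the example runs.

```r
set.seed(676)  # arbitrary seed so the simulated numbers are repeatable

# Simulated air speeds (NOT the flight data); sd = 30 and n = 100 per group are assumptions
sim <- data.frame(
  Fdirection = rep(c("Eastbound", "Westbound"), each = 100),
  AirSpeed   = c(rnorm(100, mean = 506, sd = 30), rnorm(100, mean = 442, sd = 30))
)

# Welch two-sample t-test: does mean AirSpeed differ by direction?
speed_test <- t.test(AirSpeed ~ Fdirection, data = sim)

print(speed_test$estimate)  # group means
print(speed_test$conf.int)  # 95% confidence interval for the difference in means
print(speed_test$p.value)
```

On the real data, the same call could be run with data = na.omit(ewusflights), and the result would depend on the actual spread of air speeds within each direction.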

My second question showed no difference in air speeds between northern and southern paths. This lack of difference makes me wonder whether it arises because I am using winter data without a summer comparison. In winter, the more extensive polar jet stream may equalize the magnitude of jet stream effects in the northern and southern regions of the country. I would be curious to test this theory by using summer data for comparison.

But the most interesting finding is the difference in the percentage of delayed flights between the eastbound and westbound directions. A greater percentage of eastbound flights than westbound flights arrived 15 or more minutes late. Even though eastbound flights averaged higher speeds, more of them were delayed. Perhaps eastbound flights traveled at higher speeds because they were more often running behind schedule. Maybe they don’t have the jet stream to thank at all! To tease out these questions, I would look at seasonal differences north to south, departure delay differences, and fuel use differences. Based on my exploration, I can’t say for sure that the jet stream is responsible for the differences between eastbound and westbound flights.