## Comparing the change in recorded U.S. domestic flight load factor from 1998 to 2009, using data published by the US Census
##### How did the average load factor of U.S. domestic flights evolve between 1998 and 2009?
##### This analysis follows a dataset containing detailed information on all recorded U.S. domestic flights from 1998-2009 published by the US Census.
##### The load factor is the number of passengers on a flight divided by the number of available seats, usually expressed as a percentage. A higher load factor means more seats are filled, while a lower load factor means many seats are empty, suggesting underutilized flights. The change in load factor is a useful indicator of the efficiency and profitability of US airport systems over time.
In this analysis, the load factor shows how well U.S. domestic flights utilized their seating capacity from year to year.
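As a small illustration of the definition, the sketch below computes the load factor for three flights. The passenger and seat counts are made up for illustration and do not come from the dataset.

```r
# Hypothetical passenger and seat counts for three flights (not real data)
passengers <- c(120, 45, 150)
seats      <- c(150, 100, 150)

# Load factor = passengers carried / seats available
load_factor <- passengers / seats

print(round(load_factor, 2))
```

A value near 1 means the flight was close to full; a value near 0 means most seats were empty.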
```r
### First we need to set the directory to our data set
# (this path is specific to the author's machine)
setwd("~/Library/Mobile Documents/comappleCloudDocs/Data 101/flightlogs")
getwd()
```

```r
# Install the packages (only needed once per machine)
install.packages("tidyverse")
install.packages("lubridate")
install.packages("janitor")

# Load the packages for this session
library(tidyverse)
library(lubridate)
library(janitor)
```
```r
# Read the raw flight records and standardise the column names
flights <- read.csv("flight_edges.csv")
flights <- flights %>% clean_names()
head(flights)
colnames(flights)

# Replace the column names with descriptive ones
colnames(flights) <- c("Origin", "Destination", "OriginCity", "DestinationCity",
                       "Passengers", "Seats", "Flights", "Distance", "FlyDate",
                       "OriginPopulation", "DestinationPopulation")
head(flights)

library(lubridate)

# Append "01" as the day so that ymd() can parse FlyDate as a full date
flights <- flights %>%
  mutate(
    FlyDate = as.character(FlyDate),
    FlyDate = paste0(FlyDate, "01"),
    FlyDate = ymd(FlyDate)
  )

head(flights$FlyDate)
```
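The date conversion above can be checked on a few sample values. This is a minimal sketch, assuming `FlyDate` arrives as a six-digit year-month number (which is what appending "01" before parsing implies). The sample values are made up, and base R's `as.Date()` stands in for `ymd()` so the snippet runs without extra packages.

```r
# Hypothetical FlyDate values in year-month form (illustrative only)
fly_date_raw <- c(199801, 200512, 200907)

# Append "01" so each value is a full year-month-day string, then parse it
fly_date <- as.Date(paste0(as.character(fly_date_raw), "01"), format = "%Y%m%d")
print(fly_date)

# Pull out the calendar year, as year(FlyDate) does later in the analysis
fly_year <- as.integer(format(fly_date, "%Y"))
print(fly_year)
```

Every record is pinned to the first day of its month, so the day component carries no information; only the month and year are meaningful.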
```r
# Keep only rows with known, positive passenger and seat counts
flights_clean <- flights %>%
  filter(!is.na(Passengers), Passengers > 0,
         !is.na(Seats), Seats > 0)
print(flights_clean)

# Monthly load factor: total passengers divided by total seats
monthly_load <- flights_clean %>%
  group_by(FlyDate) %>%
  summarise(
    TotalPassengers = sum(Passengers, na.rm = TRUE),
    TotalSeats = sum(Seats, na.rm = TRUE)
  ) %>%
  mutate(LoadFactor = TotalPassengers / TotalSeats) %>%
  arrange(FlyDate)

head(monthly_load)
print(monthly_load)

# Yearly load factor, computed the same way after extracting the year
yearly_load <- flights_clean %>%
  mutate(Year = year(FlyDate)) %>%
  group_by(Year) %>%
  summarise(
    TotalPassengers = sum(Passengers, na.rm = TRUE),
    TotalSeats = sum(Seats, na.rm = TRUE)
  ) %>%
  mutate(LoadFactor = TotalPassengers / TotalSeats) %>%
  arrange(Year)

head(yearly_load)
print(yearly_load)
```
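The aggregation above divides total passengers by total seats within each group, rather than averaging each row's own load factor. The sketch below uses made-up numbers to show that the two approaches can give different answers, because the ratio of totals weights each row by its seat count. The data frame `toy` is hypothetical, and base R's `aggregate()` is used in place of the dplyr pipeline so the snippet runs standalone.

```r
# Hypothetical records: one small and one large route per year (not real data)
toy <- data.frame(
  Year       = c(1998, 1998, 1999, 1999),
  Passengers = c(50, 900, 80, 700),
  Seats      = c(100, 1000, 100, 1000)
)

# Ratio of totals per year (the approach used in the analysis)
totals <- aggregate(cbind(Passengers, Seats) ~ Year, data = toy, FUN = sum)
totals$LoadFactor <- totals$Passengers / totals$Seats

# Unweighted mean of per-row load factors, for comparison
toy$RowLoadFactor <- toy$Passengers / toy$Seats
row_means <- aggregate(RowLoadFactor ~ Year, data = toy, FUN = mean)

print(totals)
print(row_means)
```

In this toy data the ratio of totals falls from 1998 to 1999 while the unweighted mean rises, so the choice of method can change the apparent direction of a trend.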
# Summary of Key Findings
##### The load factor rose from 0.66 in 1998 to 0.76 in 2009, a gain of roughly 10 percentage points (about 15% in relative terms) in average seat utilization over the period.
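The size of this change can be expressed in two ways. Using only the two rounded figures quoted above (0.66 and 0.76), this short sketch computes the absolute change in percentage points and the relative change in percent. The variable names are illustrative.

```r
# Rounded yearly load factors quoted in the summary
lf_1998 <- 0.66
lf_2009 <- 0.76

# Absolute change, in percentage points
abs_change_pp <- (lf_2009 - lf_1998) * 100

# Relative change, as a percent of the 1998 value
rel_change_pct <- (lf_2009 - lf_1998) / lf_1998 * 100

print(round(abs_change_pp, 1))
print(round(rel_change_pct, 1))
```

Because the inputs are rounded to two decimals, both results are approximate.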
##### Future research can build on this exploration by examining how different social and economic climates influence airline load factors.
##### By continuing to monitor changes over time, researchers can identify how events such as economic recessions or pandemics affect flight occupancy rates. This type of analysis could be expanded by adding more variables, such as ticket prices, route distances, and airline type, to study consumer habits in various social climates.
###### Citation: Perkins, Jacob. 3.5 Million+ US Domestic Flights from 1990 to 2009. Infochimps, 2010, http://infochimps.org/datasets/d35-million-us-domestic-flights-from-1990-to-2009.