Load tidyr and dplyr
library(tidyr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
Load the data after placing the CSV file in your working directory
airlinedata = read.csv('Airline Data.csv')
This section drops every row that contains an NA value, keeping only complete cases. The columns are then renamed so the data frame is easier to tidy.
airlinedata = airlinedata[complete.cases(airlinedata),]
colnames(airlinedata) = c("State","Flight Status","Flight LA","Flight PH","Flight SD","Flight SF","Flight SEA")
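The row filter and rename above can be illustrated on a tiny stand-in table. This is only a sketch: the frame `toy` and its placeholder headers `X` and `X.1` are hypothetical and are not the real contents of the CSV.

```r
# Hypothetical toy frame with one blank separator row (numeric cell is NA)
toy <- data.frame(
  X   = c("ALASKA", "", "AM WEST"),
  X.1 = c(62, NA, 117),
  stringsAsFactors = FALSE
)

# complete.cases() is TRUE only for rows with no NA in any column
keep <- complete.cases(toy)
toy <- toy[keep, ]

# Replace the placeholder headers with descriptive names
colnames(toy) <- c("State", "Flight LA")
print(toy)
```

Note that an empty string is not NA, so a row is dropped here only because its numeric cell is missing.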
This section converts empty State values to NA and fills them downward so that every row carries its state value. This is necessary for tidying the data frame later in the process.
airlinedata$State[as.character(airlinedata$State) == ""] = NA
airlinedata = airlinedata %>% fill(State)
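As a small illustration of the blank-to-NA conversion and down fill, here is a sketch on a made-up frame. The frame `toy` and its `Status` values are assumptions for demonstration only; `fill()` is the tidyr function used above, and it fills downward by default.

```r
library(tidyr)

# Hypothetical toy frame: the name appears only on the first row of each pair
toy <- data.frame(
  State  = c("ALASKA", "", "AM WEST", ""),
  Status = c("on time", "delayed", "on time", "delayed"),
  stringsAsFactors = FALSE
)

# Blank strings are not treated as missing, so convert them to NA first
toy$State[toy$State == ""] <- NA

# fill() carries the last non-missing value downward
toy_filled <- fill(toy, State)
print(toy_filled$State)
```

Skipping the NA conversion would leave the blanks untouched, because `fill()` only replaces missing values.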
The first two columns are removed to eliminate redundancy and create a concise data frame.
airlinedata = airlinedata[,-1:-2]
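The code that builds `delaytrans`, which the next step relies on, is not shown in this write-up. The following is a minimal sketch of one way a table with that shape could be produced, assuming a wide frame that still has the `State` and `Flight Status` columns. The names `wide`, `Value`, and `delaytrans_sketch` are hypothetical, only the delayed rows are included, and their values are copied from the printed results further down. It is not necessarily the method used originally.

```r
library(tidyr)
library(dplyr)

# Delayed rows only; values copied from the printed results in this write-up
wide <- data.frame(
  State = c("ALASKA", "AM WEST"),
  `Flight Status` = c("delayed", "delayed"),
  `Flight LA`  = c(62, 117),
  `Flight PH`  = c(12, 415),
  `Flight SD`  = c(20, 65),
  `Flight SF`  = c(102, 129),
  `Flight SEA` = c(305, 61),
  check.names = FALSE
)

delaytrans_sketch <- wide %>%
  # Long form: one row per State, status and destination
  pivot_longer(cols = -c(State, `Flight Status`),
               names_to = "Destination", values_to = "Value") %>%
  # Wide form: one column per State/status pair, joined with "_"
  pivot_wider(names_from = c(State, `Flight Status`), values_from = Value) %>%
  arrange(Destination)

print(delaytrans_sketch)
```

The destination columns are selected by exclusion because `starts_with("Flight ")` would also match `Flight Status`.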
I calculate the absolute difference between the two delay columns and create a flag indicating which airline has the shorter delay at each destination. Read together, the two new columns show which airline has the advantage at a destination and how large that advantage is.
delaytrans$Delay_Difference = abs(delaytrans$ALASKA_delayed - delaytrans$`AM WEST_delayed`)
delaytrans$Speed_Status = if_else(delaytrans$ALASKA_delayed>delaytrans$`AM WEST_delayed`,"AM is Faster","Alaska is Faster")
Print Results
print(delaytrans)
## Destination ALASKA_delayed AM WEST_delayed Delay_Difference
## 1 Flight LA 62 117 55
## 2 Flight PH 12 415 403
## 3 Flight SD 20 65 45
## 4 Flight SEA 305 61 244
## 5 Flight SF 102 129 27
## Speed_Status
## 1 Alaska is Faster
## 2 Alaska is Faster
## 3 Alaska is Faster
## 4 AM is Faster
## 5 Alaska is Faster
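The same two columns can also be derived in a single `mutate()` call. The sketch below rebuilds the table from the printed values so that it runs on its own. The name `delaytrans_sketch` and the "Tie" label are assumptions added for illustration; the original code labels a tie as "Alaska is Faster".

```r
library(dplyr)

# Values copied from the printed results above
delaytrans_sketch <- data.frame(
  Destination = c("Flight LA", "Flight PH", "Flight SD", "Flight SEA", "Flight SF"),
  ALASKA_delayed = c(62, 12, 20, 305, 102),
  `AM WEST_delayed` = c(117, 415, 65, 61, 129),
  check.names = FALSE
)

delaytrans_sketch <- delaytrans_sketch %>%
  mutate(
    # Size of the gap, regardless of direction
    Delay_Difference = abs(ALASKA_delayed - `AM WEST_delayed`),
    # Direction of the gap; the "Tie" branch is a hypothetical addition
    Speed_Status = case_when(
      ALASKA_delayed > `AM WEST_delayed` ~ "AM is Faster",
      ALASKA_delayed < `AM WEST_delayed` ~ "Alaska is Faster",
      TRUE ~ "Tie"
    )
  )

print(delaytrans_sketch)
```

No ties occur in this data, so the flags match the printed results.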
Conclusion: Alaska has a shorter delay time than AM West at four of the five destinations (80%); Flight SEA is the exception. On average, the delay time for Alaska is 100.2 minutes, whereas the average for AM West is 157.4 minutes.
# mean() gives the same result as sum()/length() for these columns
averagedelayAlaska = mean(delaytrans$ALASKA_delayed)
averagedelayAMWest = mean(delaytrans$`AM WEST_delayed`)
print(paste("The average delay time for Alaska is:", averagedelayAlaska, "minutes"))
## [1] "The average delay time for Alaska is: 100.2 minutes"
print(paste("The average delay time for AM West is:", averagedelayAMWest, "minutes"))
## [1] "The average delay time for AM West is: 157.4 minutes"
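The figures quoted in the conclusion can be re-derived directly from the printed table. This is a small check in base R; the vector names `alaska` and `am_west` are hypothetical, and the values are copied from the results above in the same destination order.

```r
# Values copied from the printed results (LA, PH, SD, SEA, SF)
alaska  <- c(62, 12, 20, 305, 102)
am_west <- c(117, 415, 65, 61, 129)

# Share of destinations where Alaska has the lower value
share_alaska_lower <- mean(alaska < am_west)

# Per-column averages
avg_alaska  <- mean(alaska)
avg_am_west <- mean(am_west)

print(c(share = share_alaska_lower, alaska = avg_alaska, am_west = avg_am_west))
```

Taking the mean of a logical vector gives the proportion of `TRUE` values, which is where the 80% figure comes from.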