First we load the csv data into R from Github. We also load the required packages.

library(stringr)
library(RCurl)
## Loading required package: bitops
library(dplyr)
## Warning: package 'dplyr' was built under R version 3.2.2
## 
## Attaching package: 'dplyr'
## 
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## 
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(tidyr)
## Warning: package 'tidyr' was built under R version 3.2.3
## 
## Attaching package: 'tidyr'
## 
## The following object is masked from 'package:RCurl':
## 
##     complete
url = "https://raw.githubusercontent.com/cyadusha/arrival-delays/master/arrival%20delays.csv"
x = getURL(url)
arrivaldelays = read.csv(file = textConnection(x), header = TRUE)

Then, we remove the row that has null values using the command below.

arrivaldelays = arrivaldelays[complete.cases(arrivaldelays), ]

We also replace the blank below Alaska with Alaska and the blank below AM West with AM West.

arrivaldelays[2,1] = "Alaska"
arrivaldelays[4,1] = "AM West"

We can also rename two of the columns in the original dataset as follows.

colnames(arrivaldelays)[1] = "Airline"
colnames(arrivaldelays)[2] = "Status"

We now use the gather command from the tidyr package to collect the names of the last 5 columns into one column.

arrivaldelays = arrivaldelays %>%
  gather(Destination, Number_of_Arrivals, Los.Angeles:Seattle, na.rm = T)

Because a punctuation mark exists in some of the destinations, we replace each punctuation mark with a space.

arrivaldelays$Destination = str_replace_all(arrivaldelays$Destination, "[[:punct:]]", " ")

Because each of the arrival statuses will soon be the names of the two columns in each of our tables, to avoid spaces, we replace “on time” with one word that would suffice. That word is “punctual”.

arrivaldelays$Status = str_replace_all(arrivaldelays$Status, "on time", "punctual")

Now we use the spread command in the tidyr package to organize the number of arrivals into the two new columns created which are “delayed” and “punctual”.

arrivaldelays = arrivaldelays %>%
  spread(Status, Number_of_Arrivals)
arrivaldelays
##    Airline   Destination delayed punctual
## 1   Alaska   Los Angeles      62      497
## 2   Alaska       Phoenix      12      221
## 3   Alaska     San Diego      20      212
## 4   Alaska San Francisco     102      503
## 5   Alaska       Seattle     305     1841
## 6  AM West   Los Angeles     117      694
## 7  AM West       Phoenix     415     4840
## 8  AM West     San Diego      65      383
## 9  AM West San Francisco     129      320
## 10 AM West       Seattle      61      201

To facilitate the comparison of the arrival delays for each airline, we add a new column which computes the percentage of punctual flights in each city for each airline.

arrivaldelays$percentage = (arrivaldelays$punctual/(arrivaldelays$delayed + arrivaldelays$punctual))*100
arrivaldelays
##    Airline   Destination delayed punctual percentage
## 1   Alaska   Los Angeles      62      497   88.90877
## 2   Alaska       Phoenix      12      221   94.84979
## 3   Alaska     San Diego      20      212   91.37931
## 4   Alaska San Francisco     102      503   83.14050
## 5   Alaska       Seattle     305     1841   85.78751
## 6  AM West   Los Angeles     117      694   85.57337
## 7  AM West       Phoenix     415     4840   92.10276
## 8  AM West     San Diego      65      383   85.49107
## 9  AM West San Francisco     129      320   71.26949
## 10 AM West       Seattle      61      201   76.71756

Now we use the group_by command to compute the mean percentage of punctual arrivals for each flight.

arrivalcomparison = arrivaldelays %>%
  group_by(Airline) %>% summarize(percentage = round(mean(percentage), 0))
arrivalcomparison
## Source: local data frame [2 x 2]
## 
##   Airline percentage
##    (fctr)      (dbl)
## 1  Alaska         89
## 2 AM West         82

We notice that for each destination, Alaska Airlines has a higher percentage of punctual arrivals as compared to AM West Airlines. But yet, the percentage of punctual arrivals for each city is much greater than 50.

It turns out that each airline has an average of at least 80% of punctual arrivals because the average percentage of punctual arrivals for each airline falls between 80 and 90. However, Alaska Airlines has a better performance because 89% of its flights arrive punctually. On the other hand, for AM West Airlines only 82% of its flights arrive punctually.