Task: Perform tasks with tidyr and dplyr to clean and tidy a data frame. Next, analyze arrival delays for the two airlines.

Our data frame, in its original form looks like:

##         X     X.1 Los.Angeles Phoenix San.Diego San.Francisco Seattle
## 1  ALASKA on time         497     221       212           503    1841
## 2         delayed          62      12        20           102     305
## 3                          NA      NA        NA            NA      NA
## 4 AM WEST on time         694    4840       383           320     201
## 5         delayed         117     415        65           129      61

Fill in the Airline name in the two missing places. Then eliminate the row full of NAs. Then, gather it with X and X.1 as keys. (They will be renamed airline and category.)

airline.chart[2,1]<-airline.chart[1,1]
airline.chart[5,1]<-airline.chart[4,1]
airline.chart<-airline.chart[-3,]
airline.chart<-gather(airline.chart,key='X','X.1')
airline.chart
##          X     X.1             X  X.1
## 1   ALASKA on time   Los.Angeles  497
## 2   ALASKA delayed   Los.Angeles   62
## 3  AM WEST on time   Los.Angeles  694
## 4  AM WEST delayed   Los.Angeles  117
## 5   ALASKA on time       Phoenix  221
## 6   ALASKA delayed       Phoenix   12
## 7  AM WEST on time       Phoenix 4840
## 8  AM WEST delayed       Phoenix  415
## 9   ALASKA on time     San.Diego  212
## 10  ALASKA delayed     San.Diego   20
## 11 AM WEST on time     San.Diego  383
## 12 AM WEST delayed     San.Diego   65
## 13  ALASKA on time San.Francisco  503
## 14  ALASKA delayed San.Francisco  102
## 15 AM WEST on time San.Francisco  320
## 16 AM WEST delayed San.Francisco  129
## 17  ALASKA on time       Seattle 1841
## 18  ALASKA delayed       Seattle  305
## 19 AM WEST on time       Seattle  201
## 20 AM WEST delayed       Seattle   61

Attach column names. Arrange by airline. Use spread to move on time status (category) to the columns.

colnames(airline.chart)<-c('airline','category','city','number')
airline.chart<-arrange(airline.chart,airline.chart[,1])
airline.chart<-spread(airline.chart,category,number)
airline.chart
##    airline          city delayed on time
## 1   ALASKA   Los.Angeles      62     497
## 2   ALASKA       Phoenix      12     221
## 3   ALASKA     San.Diego      20     212
## 4   ALASKA San.Francisco     102     503
## 5   ALASKA       Seattle     305    1841
## 6  AM WEST   Los.Angeles     117     694
## 7  AM WEST       Phoenix     415    4840
## 8  AM WEST     San.Diego      65     383
## 9  AM WEST San.Francisco     129     320
## 10 AM WEST       Seattle      61     201

Create a new column for on-time percentage.

##    airline          city delayed on time on_time_percentage
## 1   ALASKA   Los.Angeles      62     497          0.8890877
## 2   ALASKA       Phoenix      12     221          0.9484979
## 3   ALASKA     San.Diego      20     212          0.9137931
## 4   ALASKA San.Francisco     102     503          0.8314050
## 5   ALASKA       Seattle     305    1841          0.8578751
## 6  AM WEST   Los.Angeles     117     694          0.8557337
## 7  AM WEST       Phoenix     415    4840          0.9210276
## 8  AM WEST     San.Diego      65     383          0.8549107
## 9  AM WEST San.Francisco     129     320          0.7126949
## 10 AM WEST       Seattle      61     201          0.7671756

AM West has an online percentage that is lower in each city, which is clear from the above graph.

In San Francisco, Alaska Airlines has the greatest advantage, whereas Pheonix has the smallest difference

For average overall performance, the result is the opposite. This means that, in the city where on-time percentage is the lowest, it matters most which airline you choose.


———————————————————————————————————————–

The last 2 charts were aided by the following tables:

##            city    ALASKA   AM WEST Alaska_advantage
## 1   Los.Angeles 0.8890877 0.8557337       0.03335399
## 2       Phoenix 0.9484979 0.9210276       0.02747026
## 3     San.Diego 0.9137931 0.8549107       0.05888239
## 4 San.Francisco 0.8314050 0.7126949       0.11871008
## 5       Seattle 0.8578751 0.7671756       0.09069954
##            city    ALASKA   AM WEST On_time_average
## 1   Los.Angeles 0.8890877 0.8557337       0.8724107
## 2       Phoenix 0.9484979 0.9210276       0.9347627
## 3     San.Diego 0.9137931 0.8549107       0.8843519
## 4 San.Francisco 0.8314050 0.7126949       0.7720499
## 5       Seattle 0.8578751 0.7671756       0.8125253