Introduction

In this assignment we are to create a table commonly seen in excel and convert it into a format that allows R to easily manipulate in order to do data analysis on.

Reading CSV file

This dataset has been uploaded to Github and is a csv file converted from excel. Making sure in the parameters to fill all the blanks with NA. We are going to need to fix the NA in the next step

air = read.csv("https://raw.githubusercontent.com/xvicxpx/Data607/main/Homework%205/Airline.csv", header = TRUE, na.strings = "")
incomplete final line found by readTableHeader on 'https://raw.githubusercontent.com/xvicxpx/Data607/main/Homework%205/Airline.csv'

Fixing the NA

Since we know that the Airline NA are always going to be on the even rows and will be using row on top of it to be filling in the data. This is easily solved with a loop and manipulation of the dataframe

for(x in seq(from=2, to=nrow(air), by=2))
{
  air[x, 1] = air[x-1, 1]
}

Using tidyr gather function

Next we are going to be breaking down the dataset in order to maker it easier for analysis and manipulation. In SQL this is called a unpivot and in the tidyr this is called the gather function

library(tidyr)

air_unpivot = air %>%
  gather(key="City", value="Count", c("Los.Angeles","Phoenix", "San.Diego", "San.Francisco", "Seattle"))

Splitting On Time and Delays

We are going to split out the On Time and Delays into their own columns so that it will be easier to calculate a on time performance by taking the total amount of flights and dividing by how many flights are on time

air_unpivot_2 = air_unpivot_2 %>%
  mutate(otp = on_time/(on_time + delay))
Error: Problem with `mutate()` input `otp`.
x object 'on_time' not found
ℹ Input `otp` is `on_time/(on_time + delay)`.
Run `rlang::last_error()` to see where the error occurred.

Data Analysis

Airline OTP by City

We can see that Alaska Airline has to Phoenix has the highest on time performance, but that is not that impressive given the amount of flights Alaska Airline has to Phoenix. I would give the best Air On Time Performance to AM WEST to phoenix as even with 5255 flights it is still able to have a 92% on time performance rating

air_unpivot_2 = air_unpivot_2 %>%
  mutate(otp = on_time/(on_time + delay)) %>%
  arrange(desc(otp))

air_unpivot_2

Overall on Time Performance

Overall we can see that AM WEST has a better on time performance compared to ALASKA, and both are very close in on time performance. If one was to look at only overall on time performance they one would easily recommend AM WEST but if one looks at the data by city one will see that AM WEST underperforms significantly against ALASKA against all routes. This shows that looking closer at the data is very important and that AM WEST on time performance numbers are inflated because they have a lot of flights to Phoenix with a high on time performance therefore it increases their overall on time performance. We can see that without the Phoenix route AM WEST on time performance is now lower compared to ALASKA

air_unpivot_2 %>%
  select(Airline, delay, on_time) %>%
  group_by(Airline) %>%
  summarise(delay = sum(delay), on_time = sum(on_time), otp = sum(on_time) / (sum(on_time) + sum(delay))) %>%
  arrange(desc(otp))
air_unpivot_2 %>%
  filter(City != "Phoenix") %>%
  select(Airline, delay, on_time) %>%
  group_by(Airline) %>%
  summarise(delay = sum(delay), on_time = sum(on_time), otp = sum(on_time) / (sum(on_time) + sum(delay))) %>%
  arrange(desc(otp))
NA

Overall vs Per City OTP

The reason for the discrepancy between the per-city and the overall on time performance is because of AM WEST Phoenix route which had a high weight and on time performance when calculating the overall on time performance. But when one looks closely at the data they will see that AM WEST in consistently behind ALASKA in every route, and when taking Phoenix route out of the equation ALASKA had a higher on time performance

Conclusion

Based on the data provided I would overall I would recommend that everyone fly with ALASKA as they are consistently have better on time performance in every route compared to AM WEST. But I would recommend that anyone flying to Phoenix fly with AM WEST as the on time performance is on par with ALASKA and they have a lot more flights to Phoenix compared to ALASKA

LS0tCnRpdGxlOiAiRGF0YSA2MDcgV2VlayA1IEFzc2lnbm1lbnQiCm91dHB1dDogaHRtbF9ub3RlYm9vawotLS0KCiMgSW50cm9kdWN0aW9uCkluIHRoaXMgYXNzaWdubWVudCB3ZSBhcmUgdG8gY3JlYXRlIGEgdGFibGUgY29tbW9ubHkgc2VlbiBpbiBleGNlbCBhbmQgY29udmVydCBpdCBpbnRvIGEgZm9ybWF0IHRoYXQgYWxsb3dzIFIgdG8gZWFzaWx5IG1hbmlwdWxhdGUgaW4gb3JkZXIgdG8gZG8gZGF0YSBhbmFseXNpcyBvbi4KCiMgUmVhZGluZyBDU1YgZmlsZQpUaGlzIGRhdGFzZXQgaGFzIGJlZW4gdXBsb2FkZWQgdG8gR2l0aHViIGFuZCBpcyBhIGNzdiBmaWxlIGNvbnZlcnRlZCBmcm9tIGV4Y2VsLiBNYWtpbmcgc3VyZSBpbiB0aGUgcGFyYW1ldGVycyB0byBmaWxsIGFsbCB0aGUgYmxhbmtzIHdpdGggTkEuIFdlIGFyZSBnb2luZyB0byBuZWVkIHRvIGZpeCB0aGUgTkEgaW4gdGhlIG5leHQgc3RlcAoKYGBge3J9CmFpciA9IHJlYWQuY3N2KCJodHRwczovL3Jhdy5naXRodWJ1c2VyY29udGVudC5jb20veHZpY3hweC9EYXRhNjA3L21haW4vSG9tZXdvcmslMjA1L0FpcmxpbmUuY3N2IiwgaGVhZGVyID0gVFJVRSwgbmEuc3RyaW5ncyA9ICIiKQpgYGAKCiMgRml4aW5nIHRoZSBOQQpTaW5jZSB3ZSBrbm93IHRoYXQgdGhlIEFpcmxpbmUgTkEgYXJlIGFsd2F5cyBnb2luZyB0byBiZSBvbiB0aGUgZXZlbiByb3dzIGFuZCB3aWxsIGJlIHVzaW5nIHJvdyBvbiB0b3Agb2YgaXQgdG8gYmUgZmlsbGluZyBpbiB0aGUgZGF0YS4gVGhpcyBpcyBlYXNpbHkgc29sdmVkIHdpdGggYSBsb29wIGFuZCBtYW5pcHVsYXRpb24gb2YgdGhlIGRhdGFmcmFtZQoKYGBge3J9CmZvcih4IGluIHNlcShmcm9tPTIsIHRvPW5yb3coYWlyKSwgYnk9MikpCnsKICBhaXJbeCwgMV0gPSBhaXJbeC0xLCAxXQp9CmBgYAoKIyBVc2luZyB0aWR5ciBnYXRoZXIgZnVuY3Rpb24KTmV4dCB3ZSBhcmUgZ29pbmcgdG8gYmUgYnJlYWtpbmcgZG93biB0aGUgZGF0YXNldCBpbiBvcmRlciB0byBtYWtlciBpdCBlYXNpZXIgZm9yIGFuYWx5c2lzIGFuZCBtYW5pcHVsYXRpb24uIEluIFNRTCB0aGlzIGlzIGNhbGxlZCBhIHVucGl2b3QgYW5kIGluIHRoZSB0aWR5ciB0aGlzIGlzIGNhbGxlZCB0aGUgZ2F0aGVyIGZ1bmN0aW9uCgpgYGB7cn0KbGlicmFyeSh0aWR5cikKCmFpcl91bnBpdm90ID0gYWlyICU+JQogIGdhdGhlcihrZXk9IkNpdHkiLCB2YWx1ZT0iQ291bnQiLCBjKCJMb3MuQW5nZWxlcyIsIlBob2VuaXgiLCAiU2FuLkRpZWdvIiwgIlNhbi5GcmFuY2lzY28iLCAiU2VhdHRsZSIpKQoKYGBgCgojIFNwbGl0dGluZyBPbiBUaW1lIGFuZCBEZWxheXMKV2UgYXJlIGdvaW5nIHRvIHNwbGl0IG91dCB0aGUgT24gVGltZSBhbmQgRGVsYXlzIGludG8gdGhlaXIgb3duIGNvbHVtbnMgc28gdGhhdCBpdCB3aWxsIGJlIGVhc2llciB0byBjYWxjdWxhdGUgYSBvbiB0aW1lIHBlcmZvcm1hbmNlIGJ5IHRha2luZyB0aGUgdG90YWwgYW1vdW50IG9mIGZsaWdodHMgYW5kIGRpdmlkaW5nIGJ5IGhvdyBtYW55IGZsaWdodHMgYXJlIG9uIHRpbWUKYGBge3J9CmxpYnJhcnkoZHBseXIpCgphaXJfdW5waXZvdF8yID0gYWlyX3VucGl2b3QgJT4lCiAgc3ByZWFkKGtleT0iU3RhdHVzIiwgdmFsdWU9IkNvdW50IikKCmFpcl91bnBpdm90XzIgPSBhaXJfdW5waXZvdF8yICU+JQogIHJlbmFtZShvbl90aW1lID0gYG9uIHRpbWVgKQoKYWlyX3VucGl2b3RfMiA9IGFpcl91bnBpdm90XzIgJT4lCiAgbXV0YXRlKG90cCA9IG9uX3RpbWUvKG9uX3RpbWUgKyBkZWxheSkpCmBgYAoKIyBEYXRhIEFuYWx5c2lzCgojIyBBaXJsaW5lIE9UUCBieSBDaXR5CldlIGNhbiBzZWUgdGhhdCBBbGFza2EgQWlybGluZSBoYXMgdG8gUGhvZW5peCBoYXMgdGhlIGhpZ2hlc3Qgb24gdGltZSBwZXJmb3JtYW5jZSwgYnV0IHRoYXQgaXMgbm90IHRoYXQgaW1wcmVzc2l2ZSBnaXZlbiB0aGUgYW1vdW50IG9mIGZsaWdodHMgQWxhc2thIEFpcmxpbmUgaGFzIHRvIFBob2VuaXguIEkgd291bGQgZ2l2ZSB0aGUgYmVzdCBBaXIgT24gVGltZSBQZXJmb3JtYW5jZSB0byBBTSBXRVNUIHRvIHBob2VuaXggYXMgZXZlbiB3aXRoIDUyNTUgZmxpZ2h0cyBpdCBpcyBzdGlsbCBhYmxlIHRvIGhhdmUgYSA5MiUgb24gdGltZSBwZXJmb3JtYW5jZSByYXRpbmcKCmBgYHtyfQphaXJfdW5waXZvdF8yID0gYWlyX3VucGl2b3RfMiAlPiUKICBtdXRhdGUob3RwID0gb25fdGltZS8ob25fdGltZSArIGRlbGF5KSkgJT4lCiAgYXJyYW5nZShkZXNjKG90cCkpCgphaXJfdW5waXZvdF8yCmBgYAoKIyMgT3ZlcmFsbCBvbiBUaW1lIFBlcmZvcm1hbmNlCk92ZXJhbGwgd2UgY2FuIHNlZSB0aGF0IEFNIFdFU1QgaGFzIGEgYmV0dGVyIG9uIHRpbWUgcGVyZm9ybWFuY2UgY29tcGFyZWQgdG8gQUxBU0tBLCBhbmQgYm90aCBhcmUgdmVyeSBjbG9zZSBpbiBvbiB0aW1lIHBlcmZvcm1hbmNlLiBJZiBvbmUgd2FzIHRvIGxvb2sgYXQgb25seSBvdmVyYWxsIG9uIHRpbWUgcGVyZm9ybWFuY2UgdGhleSBvbmUgd291bGQgZWFzaWx5IHJlY29tbWVuZCBBTSBXRVNUIGJ1dCBpZiBvbmUgbG9va3MgYXQgdGhlIGRhdGEgYnkgY2l0eSBvbmUgd2lsbCBzZWUgdGhhdCBBTSBXRVNUIHVuZGVycGVyZm9ybXMgc2lnbmlmaWNhbnRseSBhZ2FpbnN0IEFMQVNLQSBhZ2FpbnN0IGFsbCByb3V0ZXMuIFRoaXMgc2hvd3MgdGhhdCBsb29raW5nIGNsb3NlciBhdCB0aGUgZGF0YSBpcyB2ZXJ5IGltcG9ydGFudCBhbmQgdGhhdCBBTSBXRVNUIG9uIHRpbWUgcGVyZm9ybWFuY2UgbnVtYmVycyBhcmUgaW5mbGF0ZWQgYmVjYXVzZSB0aGV5IGhhdmUgYSBsb3Qgb2YgZmxpZ2h0cyB0byBQaG9lbml4IHdpdGggYSBoaWdoIG9uIHRpbWUgcGVyZm9ybWFuY2UgdGhlcmVmb3JlIGl0IGluY3JlYXNlcyB0aGVpciBvdmVyYWxsIG9uIHRpbWUgcGVyZm9ybWFuY2UuIFdlIGNhbiBzZWUgdGhhdCB3aXRob3V0IHRoZSBQaG9lbml4IHJvdXRlIEFNIFdFU1Qgb24gdGltZSBwZXJmb3JtYW5jZSBpcyBub3cgbG93ZXIgY29tcGFyZWQgdG8gQUxBU0tBCgpgYGB7cn0KYWlyX3VucGl2b3RfMiAlPiUKICBzZWxlY3QoQWlybGluZSwgZGVsYXksIG9uX3RpbWUpICU+JQogIGdyb3VwX2J5KEFpcmxpbmUpICU+JQogIHN1bW1hcmlzZShkZWxheSA9IHN1bShkZWxheSksIG9uX3RpbWUgPSBzdW0ob25fdGltZSksIG90cCA9IHN1bShvbl90aW1lKSAvIChzdW0ob25fdGltZSkgKyBzdW0oZGVsYXkpKSkgJT4lCiAgYXJyYW5nZShkZXNjKG90cCkpCmBgYAoKYGBge3IgcmVtb3ZpbmcgcGhvZW5peH0KYWlyX3VucGl2b3RfMiAlPiUKICBmaWx0ZXIoQ2l0eSAhPSAiUGhvZW5peCIpICU+JQogIHNlbGVjdChBaXJsaW5lLCBkZWxheSwgb25fdGltZSkgJT4lCiAgZ3JvdXBfYnkoQWlybGluZSkgJT4lCiAgc3VtbWFyaXNlKGRlbGF5ID0gc3VtKGRlbGF5KSwgb25fdGltZSA9IHN1bShvbl90aW1lKSwgb3RwID0gc3VtKG9uX3RpbWUpIC8gKHN1bShvbl90aW1lKSArIHN1bShkZWxheSkpKSAlPiUKICBhcnJhbmdlKGRlc2Mob3RwKSkKYGBgCgoKIyMgT3ZlcmFsbCB2cyBQZXIgQ2l0eSBPVFAKVGhlIHJlYXNvbiBmb3IgdGhlIGRpc2NyZXBhbmN5IGJldHdlZW4gdGhlIHBlci1jaXR5IGFuZCB0aGUgb3ZlcmFsbCBvbiB0aW1lIHBlcmZvcm1hbmNlIGlzIGJlY2F1c2Ugb2YgQU0gV0VTVCBQaG9lbml4IHJvdXRlIHdoaWNoIGhhZCBhIGhpZ2ggd2VpZ2h0IGFuZCBvbiB0aW1lIHBlcmZvcm1hbmNlIHdoZW4gY2FsY3VsYXRpbmcgdGhlIG92ZXJhbGwgb24gdGltZSBwZXJmb3JtYW5jZS4gQnV0IHdoZW4gb25lIGxvb2tzIGNsb3NlbHkgYXQgdGhlIGRhdGEgdGhleSB3aWxsIHNlZSB0aGF0IEFNIFdFU1QgaW4gY29uc2lzdGVudGx5IGJlaGluZCBBTEFTS0EgaW4gZXZlcnkgcm91dGUsIGFuZCB3aGVuIHRha2luZyBQaG9lbml4IHJvdXRlIG91dCBvZiB0aGUgZXF1YXRpb24gQUxBU0tBIGhhZCBhIGhpZ2hlciBvbiB0aW1lIHBlcmZvcm1hbmNlCgojIENvbmNsdXNpb24KQmFzZWQgb24gdGhlIGRhdGEgcHJvdmlkZWQgSSB3b3VsZCBvdmVyYWxsIEkgd291bGQgcmVjb21tZW5kIHRoYXQgZXZlcnlvbmUgZmx5IHdpdGggQUxBU0tBIGFzIHRoZXkgYXJlIGNvbnNpc3RlbnRseSBoYXZlIGJldHRlciBvbiB0aW1lIHBlcmZvcm1hbmNlIGluIGV2ZXJ5IHJvdXRlIGNvbXBhcmVkIHRvIEFNIFdFU1QuIEJ1dCBJIHdvdWxkIHJlY29tbWVuZCB0aGF0IGFueW9uZSBmbHlpbmcgdG8gUGhvZW5peCBmbHkgd2l0aCBBTSBXRVNUIGFzIHRoZSBvbiB0aW1lIHBlcmZvcm1hbmNlIGlzIG9uIHBhciB3aXRoIEFMQVNLQSBhbmQgdGhleSBoYXZlIGEgbG90IG1vcmUgZmxpZ2h0cyB0byBQaG9lbml4IGNvbXBhcmVkIHRvIEFMQVNLQQoK