Load libraries and import .csv

library(tidyr)
library(dplyr)
dat <- read.csv("https://raw.githubusercontent.com/TheWerefriend/data607/master/week5/numbersense.csv")

Reshape the data

Separate the two airlines, and create another column of totals. Create a tibble with all the data, and preserve the city names as a column.

alaska <- t(dat[1:2, 3:7])
am_west <- t(dat[4:5, 3:7])
totals <- alaska + am_west

flight_info <- cbind(alaska, am_west, totals)
colnames(flight_info) <- c("AL_OT", "AL_D", "AM_OT", "AM_D", "TOT_OT", "TOT_D")
fi <- tibble::rownames_to_column(as.data.frame(flight_info), "cities") %>%
  tibble()

Summarize the data

fi %>%
  summarize(total_percentage = (TOT_D / (TOT_OT + TOT_D)),
            AL_percentage = (AL_D / (AL_OT + AL_D)),
            AM_percentage = (AM_D / (AM_OT + AM_D)))

## # A tibble: 5 x 3
##   total_percentage AL_percentage AM_percentage
##              <dbl>         <dbl>         <dbl>
## 1           0.131         0.111         0.144 
## 2           0.0778        0.0515        0.0790
## 3           0.125         0.0862        0.145 
## 4           0.219         0.169         0.287 
## 5           0.152         0.142         0.233

Conclusions

Phoenix has the lowest percentage of delayed flights in comparison with the other cities. Northern cities have a higher percentage of delays for both airlines. Alaska has lower delay percentages than AM West for every single city.

tidyingAndTransforming

Sam Reeves

3/5/2021

Load libraries and import .csv

Reshape the data

Summarize the data

Conclusions