Libraries
library(tidyverse)
library(kableExtra)
## -- Attaching packages -------------------------------------------------------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.3.2 v purrr 0.3.4
## v tibble 3.0.3 v dplyr 1.0.2
## v tidyr 1.1.2 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.5.0
## -- Conflicts ----------------------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
##
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
##
## group_rows
Load data
airlines <- read.csv("https://raw.githubusercontent.com/jnataky/DATA-607/master/A5_Data_transformation/airlines_dest.csv")
Data Analysis
Plotting airline performance
On-time comparison of airlines per city
# On-time share for each destination/carrier pair
on_time1 <- airlines_df %>%
  group_by(dest, carrier) %>%
  summarise(ontime_percent = ontime_percent)
## `summarise()` regrouping output by 'dest' (override with `.groups` argument)
on_time1 %>%
  kbl(caption = "On time performance per city", align = 'c') %>%
  kable_material(c("striped", "hover")) %>%
  row_spec(0, color = "indigo")
On time performance per city
| dest | carrier | ontime_percent |
|---|---|---|
| los_angeles | ALASKA | 0.889 |
| los_angeles | AM WEST | 0.856 |
| phoenix | ALASKA | 0.948 |
| phoenix | AM WEST | 0.921 |
| san_diego | ALASKA | 0.914 |
| san_diego | AM WEST | 0.855 |
| san_francisco | ALASKA | 0.831 |
| san_francisco | AM WEST | 0.713 |
| seattle | ALASKA | 0.858 |
| seattle | AM WEST | 0.767 |
# Plotting on-time performance
ggplot(data = on_time1, aes(x = dest, y = ontime_percent, fill = carrier)) +
  geom_bar(stat = "identity", position = "dodge") +
  xlab("City") + ylab("On time %") + ggtitle("Carriers on time performance per city")

A note on discrepancy
Overall, the on-time performance of the two carriers differs by about 6%. Per city, however, the gap ranges from about 3% (Phoenix) up to about 12% (San Francisco). The difference between the carriers is therefore not uniform: it varies considerably from one destination to another.
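The per-city gap can be checked directly from the on-time table. The sketch below is only an illustration: it retypes the table values by hand, and the names `on_time_tbl`, `gap_per_city`, and `gap` are made up for this example and are not part of the original analysis.

```r
library(dplyr)
library(tidyr)

# On-time shares copied by hand from the "On time performance per city" table
on_time_tbl <- tibble::tribble(
  ~dest,           ~carrier,  ~ontime_percent,
  "los_angeles",   "ALASKA",  0.889,
  "los_angeles",   "AM WEST", 0.856,
  "phoenix",       "ALASKA",  0.948,
  "phoenix",       "AM WEST", 0.921,
  "san_diego",     "ALASKA",  0.914,
  "san_diego",     "AM WEST", 0.855,
  "san_francisco", "ALASKA",  0.831,
  "san_francisco", "AM WEST", 0.713,
  "seattle",       "ALASKA",  0.858,
  "seattle",       "AM WEST", 0.767
)

# One row per city, one column per carrier, then the difference between them
gap_per_city <- on_time_tbl %>%
  pivot_wider(names_from = carrier, values_from = ontime_percent) %>%
  mutate(gap = ALASKA - `AM WEST`) %>%
  arrange(desc(gap))

gap_per_city
range(gap_per_city$gap)  # smallest and largest per-city gap
```

Sorting by `gap` puts the city with the widest difference first, which makes the spread between destinations easy to read.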
Takeaway
The analysis shows that Alaska has better on-time performance than AM West. This may be because AM West operates more flights in major cities than Alaska, which could cause the delays. Is that to be expected?
Before drawing a conclusion, let's look at the delays and at the total number of flights per city, and see how they compare.
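As a rough guide to how the delay share relates to the on-time share, here is a minimal sketch. It assumes that every flight is counted as either on time or delayed, so that the two shares sum to 1; the original data may define `delayed_percent` differently. The helper `add_delayed_share()` is hypothetical, and the two Seattle rows are copied from the on-time table.

```r
library(dplyr)

# Assumption: each flight is either on time or delayed, so shares sum to 1.
# add_delayed_share() is a hypothetical helper, not part of the original code.
add_delayed_share <- function(df) {
  df %>% mutate(delayed_percent = 1 - ontime_percent)
}

# Seattle rows copied from the "On time performance per city" table
seattle <- tibble::tibble(
  dest = "seattle",
  carrier = c("ALASKA", "AM WEST"),
  ontime_percent = c(0.858, 0.767)
)

seattle_delays <- add_delayed_share(seattle)
seattle_delays
```

Under that assumption, a higher on-time share always means a lower delay share, so the delay chart should mirror the on-time chart.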
Graphs and insights
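# Delay share for each destination/carrier pair
# (the plot below uses delayed_percent, so that is the column to keep)
on_time3 <- airlines_df %>%
  group_by(dest, carrier) %>%
  summarise(delayed_percent = delayed_percent)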
## `summarise()` regrouping output by 'dest' (override with `.groups` argument)
# Plotting the share of delayed flights
ggplot(data = on_time3, aes(x = dest, y = delayed_percent, fill = carrier)) +
  geom_bar(stat = "identity", position = "dodge") +
  xlab("City") + ylab("Delayed %") + ggtitle("Carriers delay per city")

# Total number of flights for each destination/carrier pair
on_time4 <- airlines_df %>%
  group_by(dest, carrier) %>%
  summarise(n_total = n_total)
## `summarise()` regrouping output by 'dest' (override with `.groups` argument)
# Plotting the number of flights
ggplot(data = on_time4, aes(x = dest, y = n_total, fill = carrier)) +
  geom_bar(stat = "identity", position = "dodge") +
  xlab("City") + ylab("Number of flights") + ggtitle("Number of flights per city")

Conclusion
The two graphs above show something different. That is not what might have been expected!
Looking at the graphs above, we can see that Alaska operates more flights in Seattle and San Francisco. Yet even in Seattle, where Alaska has far more flights than AM West, AM West still has a higher share of delays than Alaska. The on-time performance per city graph confirms it: Alaska flights are on time more often than AM West flights. The same holds in the cities where AM West operates more flights (Los Angeles, Phoenix, and San Diego): those cities see more delayed flights with AM West than with Alaska.
Alaska performs better than AM West in every city, so the number of flights operated in major cities is not what causes AM West's delays. I would therefore recommend that AM West work on its reservation system. On the other hand, for travelers choosing between these two airlines for a trip to one of the five cities, I would suggest Alaska, to avoid the higher risk of delay with AM West.
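The claim that Alaska comes out ahead in every city can be verified against the on-time table. The following is a small sketch with the table values retyped by hand; `on_time_tbl`, `best_per_city`, and `alaska_always_best` are illustrative names, not objects from the original analysis.

```r
library(dplyr)

cities <- c("los_angeles", "phoenix", "san_diego", "san_francisco", "seattle")

# On-time shares copied by hand from the "On time performance per city" table,
# ordered as ALASKA then AM WEST within each city
on_time_tbl <- tibble::tibble(
  dest = rep(cities, each = 2),
  carrier = rep(c("ALASKA", "AM WEST"), times = 5),
  ontime_percent = c(0.889, 0.856, 0.948, 0.921, 0.914, 0.855,
                     0.831, 0.713, 0.858, 0.767)
)

# Keep, for each city, the carrier with the highest on-time share
best_per_city <- on_time_tbl %>%
  group_by(dest) %>%
  slice_max(ontime_percent, n = 1) %>%
  ungroup()

# TRUE only if ALASKA is the best carrier in every city
alaska_always_best <- all(best_per_city$carrier == "ALASKA")
alaska_always_best
```

This only compares on-time shares per city. It says nothing about why the delays occur, so the recommendation about the reservation system remains the author's interpretation.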
