First, we group the data by carrier and compute each carrier's average arrival delay. With that summary sorted by the average, we can plot it to compare the carriers.
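The grouping and summarizing step could look roughly like the sketch below. It assumes the `flights` table from the nycflights23 package with `carrier` and `arr_delay` columns, and it assumes missing delays should be excluded from the mean. The helper name `summarize_arrival_delay` is hypothetical; the result and column names (`avg_arrival_delay`, `avg_arr_delay`) are taken from the plotting code.

```r
library(dplyr)

# Hypothetical helper: returns one row per carrier with its mean arrival delay,
# sorted from the largest average delay to the smallest.
summarize_arrival_delay <- function(flights_df) {
  flights_df %>%
    group_by(carrier) %>%
    # Assumption: rows with a missing arr_delay are left out of the average
    summarize(avg_arr_delay = mean(arr_delay, na.rm = TRUE)) %>%
    arrange(desc(avg_arr_delay))
}

# Build the summary used by the plot, if the data package is installed
if (requireNamespace("nycflights23", quietly = TRUE)) {
  avg_arrival_delay <- summarize_arrival_delay(nycflights23::flights)
}
```

Because the plot reorders the carriers itself with `reorder()`, the sort here mainly makes the summary table easier to read on its own.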
library(ggplot2)

# Horizontal bar chart: one bar per carrier, ordered by average arrival delay
ggplot(avg_arrival_delay,
       aes(y = reorder(carrier, avg_arr_delay), x = avg_arr_delay, fill = avg_arr_delay > 0)) +
  geom_bar(stat = "identity") +
  # FALSE (delay <= 0) maps to green, TRUE (delay > 0) maps to red
  scale_fill_manual(values = alpha(c("green", "red"), 0.7),
                    labels = c("On Time", "Delayed"),
                    name = "Arrival Status") +
  labs(title = "Average Arrival Delay by Carrier",
       x = "Average Arrival Delay (minutes)",
       y = "Carrier",
       caption = "Data source: nycflights23") +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 15, face = "bold", hjust = 0.5),  # hjust centers the text
    axis.title = element_text(size = 13, face = "bold"),
    axis.text = element_text(size = 10)  # Slightly larger axis text for better readability
  )
Summary
This bar plot shows the average arrival delay for each carrier's flights from New York City. Carriers are listed on the y axis, and the average delay in minutes is shown on the x axis. Carriers with a positive average delay are colored red, indicating that their flights generally arrive late. Carriers with a zero or negative average delay are colored green, indicating that their flights arrive on time or early, and the bar length shows by how much. This layout allows a quick comparison across carriers. Notably, G4 stands out from the other carriers with a sizable average delay. Because the value is an average over all of the carrier's flights, a result that far from zero suggests that a significant share of G4's flights are not on time. This highlights the need for G4 to examine the factors contributing to its delays, and it is useful information for addressing them.