Untitled

Author

Eyong

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(nycflights23)

For the bar graph

library(dplyr)
library(ggplot2)
library(viridis)
Loading required package: viridisLite
library(nycflights13)

Attaching package: 'nycflights13'
The following objects are masked from 'package:nycflights23':

    airlines, airports, flights, planes, weather
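# Average departure delay per carrier, longest first, with full airline names attached
nycflights13_delays <- flights %>%
  # every flight in nycflights13 departs from one of these airports, so this filter is a safeguard
  filter(origin %in% c("JFK", "LGA", "EWR")) %>%
  group_by(carrier) %>%
  # na.rm = TRUE skips flights with no recorded departure delay (such as cancelled flights)
  summarize(avg_dep_delay = mean(dep_delay, na.rm = TRUE)) %>%
  arrange(desc(avg_dep_delay)) %>%
  # look up the airline name for each two-character carrier code
  left_join(airlines, by = "carrier")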

ggplot(nycflights13_delays,
       aes(x = reorder(name, -avg_dep_delay), y = avg_dep_delay, fill = name)) +
  geom_col() +
  scale_fill_viridis_d() +
  labs(
    x = "Airline Carrier",
    y = "Average Departure Delay (minutes)",
    title = "Average Departure Delays by Airline at NYC Airports",
    caption = "Data source: nycflights13 package",
    fill = "Airline Carrier"
  ) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

print(nycflights13_delays)
# A tibble: 16 × 3
   carrier avg_dep_delay name                       
   <chr>           <dbl> <chr>                      
 1 F9              20.2  Frontier Airlines Inc.     
 2 EV              20.0  ExpressJet Airlines Inc.   
 3 YV              19.0  Mesa Airlines Inc.         
 4 FL              18.7  AirTran Airways Corporation
 5 WN              17.7  Southwest Airlines Co.     
 6 9E              16.7  Endeavor Air Inc.          
 7 B6              13.0  JetBlue Airways            
 8 VX              12.9  Virgin America             
 9 OO              12.6  SkyWest Airlines Inc.      
10 UA              12.1  United Air Lines Inc.      
11 MQ              10.6  Envoy Air                  
12 DL               9.26 Delta Air Lines Inc.       
13 AA               8.59 American Airlines Inc.     
14 AS               5.80 Alaska Airlines Inc.       
15 HA               4.90 Hawaiian Airlines Inc.     
16 US               3.78 US Airways Inc.            

A bar graph of average departure delays

This bar graph displays the average departure delay for each airline carrier operating at the three New York City airports: John F. Kennedy International Airport, LaGuardia Airport, and Newark Liberty International Airport. Each bar represents one airline, and its height gives the average delay in minutes. The bars are arranged from the longest average delay to the shortest, which makes the airlines easy to compare.
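The ordering of the bars comes from reorder(name, -avg_dep_delay) in the aes() call. As a minimal sketch of what that call does, the base R snippet below applies it to four rows copied from the printed table. The object names delays_sample and ordered_names are made up for this illustration.

```r
# Four rows copied from the printed summary table (illustrative subset only)
delays_sample <- data.frame(
  name = c("Delta Air Lines Inc.", "Frontier Airlines Inc.",
           "US Airways Inc.", "JetBlue Airways"),
  avg_dep_delay = c(9.26, 20.2, 3.78, 13.0)
)

# reorder() sorts the factor levels by ascending score, so negating the
# delay puts the airline with the longest delay first
ordered_names <- reorder(delays_sample$name, -delays_sample$avg_dep_delay)

# ggplot2 draws a discrete axis in level order, so this is the bar order
levels(ordered_names)
```

Dropping the minus sign would reverse the order and put the shortest delay first.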

One notable aspect of this plot is the viridis color palette, which is designed to remain readable for individuals with color vision deficiencies. The x-axis labels are rotated 90 degrees so that the full airline names do not overlap. The graph shows that Frontier Airlines had the longest average departure delay (20.2 minutes) and US Airways the shortest (3.78 minutes), which makes it a practical reference for travelers comparing airlines. The data comes from the nycflights13 package, which records flights that departed the three NYC airports in 2013, so the ranking describes that year. Overall, this visualization communicates how airlines compared on departure punctuality.

For the treemap

library(dplyr)
library(treemap)
library(nycflights13)
nycflights13_delays <- flights %>%
  filter(origin %in% c("JFK", "LGA", "EWR")) %>%
  group_by(carrier) %>%
  summarize(avg_dep_delay = mean(dep_delay, na.rm = TRUE)) %>%
  arrange(desc(avg_dep_delay)) %>%
  left_join(airlines, by = "carrier") 
treemap(nycflights13_delays,
        index = "name",  
        vSize = "avg_dep_delay",
        vColor = "avg_dep_delay",
        type = "manual",
        palette = "RdYlBu", 
        draw = TRUE,
        title = "Average Departure Delays by Airline at NYC Airports",
        title.legend = "Average Delay (minutes)",
        bg.labels = "transparent",
        fontfamily.labels = "serif",
        border.col = "white"
)

print(nycflights13_delays)
# A tibble: 16 × 3
   carrier avg_dep_delay name                       
   <chr>           <dbl> <chr>                      
 1 F9              20.2  Frontier Airlines Inc.     
 2 EV              20.0  ExpressJet Airlines Inc.   
 3 YV              19.0  Mesa Airlines Inc.         
 4 FL              18.7  AirTran Airways Corporation
 5 WN              17.7  Southwest Airlines Co.     
 6 9E              16.7  Endeavor Air Inc.          
 7 B6              13.0  JetBlue Airways            
 8 VX              12.9  Virgin America             
 9 OO              12.6  SkyWest Airlines Inc.      
10 UA              12.1  United Air Lines Inc.      
11 MQ              10.6  Envoy Air                  
12 DL               9.26 Delta Air Lines Inc.       
13 AA               8.59 American Airlines Inc.     
14 AS               5.80 Alaska Airlines Inc.       
15 HA               4.90 Hawaiian Airlines Inc.     
16 US               3.78 US Airways Inc.            

A treemap that shows average departure delays for different airlines

The visualization I created is a treemap of the average departure delays of airlines at New York City airports: John F. Kennedy International Airport, LaGuardia Airport, and Newark Liberty International Airport. Using R with the dplyr and treemap packages, I summarized the flight data by airline and drew each airline as a rectangle whose area is proportional to its average delay in minutes. Color encodes the same value along the RdYlBu palette, which runs from red through yellow to blue. With type = "manual", treemap by default maps the smallest value to the first palette color and the largest to the last, so red marks the shortest average delays and blue the longest.
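The mapping from delay to color can be pictured as a linear rescale onto a color ramp. The base R sketch below illustrates that idea only and is not treemap's internal code: the three named colors stand in for the RdYlBu palette, and the names delays_sample, scaled, ramp, and tile_colours are invented for the example.

```r
# Three airlines copied from the printed table (illustrative subset only)
delays_sample <- c("Frontier Airlines Inc." = 20.2,
                   "JetBlue Airways"        = 13.0,
                   "US Airways Inc."        = 3.78)

# Rescale to [0, 1]: the smallest delay becomes 0, the largest becomes 1
scaled <- (delays_sample - min(delays_sample)) /
  (max(delays_sample) - min(delays_sample))

# Stand-in ramp from red (low) through yellow to blue (high);
# an assumption for illustration, not the exact RdYlBu colors
ramp <- colorRamp(c("red", "yellow", "blue"))

# colorRamp() returns one row of R, G, B values (0-255) per input value
tile_colours <- rgb(ramp(scaled), maxColorValue = 255)
names(tile_colours) <- names(delays_sample)

tile_colours
```

Under this mapping the airline with the shortest delay gets the pure red end of the ramp and the airline with the longest delay gets the pure blue end, with the others falling in between.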

One important point to emphasize is the noticeable variation in delays among airlines. Frontier Airlines (20.2 minutes) and ExpressJet Airlines (20.0 minutes) occupy the largest rectangles, while US Airways (3.78 minutes) occupies the smallest, a difference of more than five times between the longest and shortest averages. This discrepancy raises questions about what might be contributing to the delays, whether it’s operational issues, air traffic, or weather conditions. The treemap format lets viewers quickly see how airlines compare, which makes it a practical tool both for passengers making travel choices and for industry professionals seeking to improve service quality. Overall, this visualization conveys the data and also encourages deeper inquiry into the underlying causes of airline delays.
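The size of that variation can be checked directly from the printed table. The following is a minimal sketch that works from the rounded values as displayed, so the results are approximate. The variable names worst, best, spread_minutes, and delay_ratio are chosen for this example.

```r
# Average departure delays in minutes, copied from the printed table
# (values are rounded as displayed by the tibble)
avg_dep_delay <- c(F9 = 20.2, EV = 20.0, YV = 19.0, FL = 18.7,
                   WN = 17.7, "9E" = 16.7, B6 = 13.0, VX = 12.9,
                   OO = 12.6, UA = 12.1, MQ = 10.6, DL = 9.26,
                   AA = 8.59, AS = 5.80, HA = 4.90, US = 3.78)

# Carrier codes with the longest and shortest average delay
worst <- names(which.max(avg_dep_delay))
best  <- names(which.min(avg_dep_delay))

# Absolute and relative gap between those two carriers
spread_minutes <- max(avg_dep_delay) - min(avg_dep_delay)
delay_ratio    <- max(avg_dep_delay) / min(avg_dep_delay)

cat("Longest average delay: ", worst, "\n")
cat("Shortest average delay:", best, "\n")
cat("Spread (minutes):      ", round(spread_minutes, 2), "\n")
cat("Ratio longest/shortest:", round(delay_ratio, 2), "\n")
```

Because each value is an average, a carrier with few flights can move a long way on a handful of bad days, so the flight count per carrier is worth checking before reading too much into the extremes.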