NYC Flights

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.2     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.3     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(nycflights13)
nycflights13::flights
# A tibble: 336,776 × 19
    year month   day dep_time sched_dep_time dep_delay arr_time sched_arr_time
   <int> <int> <int>    <int>          <int>     <dbl>    <int>          <int>
 1  2013     1     1      517            515         2      830            819
 2  2013     1     1      533            529         4      850            830
 3  2013     1     1      542            540         2      923            850
 4  2013     1     1      544            545        -1     1004           1022
 5  2013     1     1      554            600        -6      812            837
 6  2013     1     1      554            558        -4      740            728
 7  2013     1     1      555            600        -5      913            854
 8  2013     1     1      557            600        -3      709            723
 9  2013     1     1      557            600        -3      838            846
10  2013     1     1      558            600        -2      753            745
# ℹ 336,766 more rows
# ℹ 11 more variables: arr_delay <dbl>, carrier <chr>, flight <int>,
#   tailnum <chr>, origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>,
#   hour <dbl>, minute <dbl>, time_hour <dttm>
flights <- nycflights13::flights %>%
  filter(!is.na(dep_delay), !is.na(air_time), !is.na(carrier))

ggplot(flights, aes(x = dep_delay, y = air_time, color = carrier)) +
  geom_point(alpha = 0.7, size = 3) +
  labs(
    title = "Departure Delay vs. Airtime by Carrier",
    x = "Departure Delay (minutes)",
    y = "Airtime (minutes)"
  ) +
  scale_color_brewer(palette = "Set1") +
  theme_minimal()
Warning in RColorBrewer::brewer.pal(n, pal): n too large, allowed maximum for palette Set1 is 9
Returning the palette you asked for with that many colors
Warning: Removed 120383 rows containing missing values (`geom_point()`).
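
The two warnings above appear to be connected. `Set1` can supply at most nine colors, so when the data contains more carriers than that, ggplot2 leaves the extra carriers without a color and `geom_point()` drops their rows. The sketch below is one way to check this, assuming the same filtered `flights` tibble as above. It counts the carriers, looks up the palette limit in `RColorBrewer::brewer.pal.info`, and tallies the rows belonging to carriers past the ninth (ggplot2 orders a character column alphabetically). The names `n_carriers`, `max_set1`, `uncolored`, and `n_dropped` are illustrative only, and `scale_color_viridis_d()` is offered as one possible replacement palette, not as part of the original analysis.

```r
library(dplyr)
library(ggplot2)

# Same filtering as in the analysis above
flights <- nycflights13::flights %>%
  filter(!is.na(dep_delay), !is.na(air_time), !is.na(carrier))

# Number of distinct carriers vs. the most colors Set1 can supply
n_carriers <- n_distinct(flights$carrier)
max_set1   <- RColorBrewer::brewer.pal.info["Set1", "maxcolors"]

# Carriers beyond the palette limit (character values are ordered alphabetically)
uncolored <- sort(unique(flights$carrier))[-seq_len(max_set1)]
n_dropped <- sum(flights$carrier %in% uncolored)

# One possible fix (an assumption, not the original code):
# a discrete palette that scales to any number of levels
p <- ggplot(flights, aes(x = dep_delay, y = air_time, color = carrier)) +
  geom_point(alpha = 0.7, size = 3) +
  scale_color_viridis_d() +
  theme_minimal()
```

If this reasoning holds, `n_dropped` should match the row count reported in the second warning, and printing `p` should draw every carrier without either warning.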

This scatter plot shows departure delay against airtime, with each flight colored by its carrier. It lets you check whether delay and airtime are related and whether that relationship differs between carriers. Note the warnings above: `Set1` supplies only nine colors, so carriers beyond the ninth receive no color and their flights are missing from the plot.

When reading the plot, look at how each carrier's points are spread along the x-axis (departure delay) and the y-axis (airtime). Airtime mostly reflects route length, so a carrier whose points sit in a narrow horizontal band flies routes of similar duration, while a wide spread along the x-axis points to more variable departure delays. Comparing these patterns across carriers suggests whether an airline's longer flights tend to depart later than its shorter ones.
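
With several hundred thousand overlapping points, per-carrier patterns can be hard to judge by eye. A numeric summary is one way to back up the visual impression. The sketch below, assuming the same filtered data as above, computes the flight count, median departure delay, median airtime, and the correlation between delay and airtime for each carrier. The column names (`n_flights`, `median_dep_delay`, `median_air_time`, `cor_delay_air`) are made up for this example.

```r
library(dplyr)

# Same filtering as in the analysis above
flights <- nycflights13::flights %>%
  filter(!is.na(dep_delay), !is.na(air_time), !is.na(carrier))

# One row per carrier, largest carriers first
carrier_summary <- flights %>%
  group_by(carrier) %>%
  summarise(
    n_flights        = n(),
    median_dep_delay = median(dep_delay),          # minutes
    median_air_time  = median(air_time),           # minutes
    cor_delay_air    = cor(dep_delay, air_time),   # Pearson correlation
    .groups = "drop"
  ) %>%
  arrange(desc(n_flights))

carrier_summary
```

A correlation near zero for a carrier would suggest that its departure delays have little to do with flight duration. Carriers with very few flights give unstable estimates and are best read with caution.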