NYC Flights Assignments

Author

Dormowa Sherman

Published

October 1, 2023

NYC Flights Assignment

library(tidyverse)
library(nycflights13)
# Keep flights delayed by at least 60 minutes on both departure and arrival
delayed_over_sixty <- flights |>
  filter(arr_delay >= 60 & dep_delay >= 60)

delayed_over_sixty
# A tibble: 23,065 × 19
    year month   day dep_time sched_dep_time dep_delay arr_time sched_arr_time
   <int> <int> <int>    <int>          <int>     <dbl>    <int>          <int>
 1  2013     1     1      811            630       101     1047            830
 2  2013     1     1      848           1835       853     1001           1950
 3  2013     1     1      957            733       144     1056            853
 4  2013     1     1     1114            900       134     1447           1222
 5  2013     1     1     1120            944        96     1331           1213
 6  2013     1     1     1301           1150        71     1518           1345
 7  2013     1     1     1337           1220        77     1649           1531
 8  2013     1     1     1400           1250        70     1645           1502
 9  2013     1     1     1505           1310       115     1638           1431
10  2013     1     1     1525           1340       105     1831           1626
# ℹ 23,055 more rows
# ℹ 11 more variables: arr_delay <dbl>, carrier <chr>, flight <int>,
#   tailnum <chr>, origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>,
#   hour <dbl>, minute <dbl>, time_hour <dttm>
ggplot(data = delayed_over_sixty, aes(x = dep_delay, y = arr_delay, color = carrier)) +
  geom_point(alpha = .6) +
  labs(x = "Departure delay (minutes)",
       y = "Arrival delay (minutes)",
       title = "NYC Significant Departure and Arrival Delays in 2013",
       caption = "*includes only flights with departure and arrival delays of 60 minutes or more")

# Keep flights that left ahead of schedule (negative departure delay)
early_dep <- flights |>
  filter(dep_delay < 0)

early_dep
# A tibble: 183,575 × 19
    year month   day dep_time sched_dep_time dep_delay arr_time sched_arr_time
   <int> <int> <int>    <int>          <int>     <dbl>    <int>          <int>
 1  2013     1     1      544            545        -1     1004           1022
 2  2013     1     1      554            600        -6      812            837
 3  2013     1     1      554            558        -4      740            728
 4  2013     1     1      555            600        -5      913            854
 5  2013     1     1      557            600        -3      709            723
 6  2013     1     1      557            600        -3      838            846
 7  2013     1     1      558            600        -2      753            745
 8  2013     1     1      558            600        -2      849            851
 9  2013     1     1      558            600        -2      853            856
10  2013     1     1      558            600        -2      924            917
# ℹ 183,565 more rows
# ℹ 11 more variables: arr_delay <dbl>, carrier <chr>, flight <int>,
#   tailnum <chr>, origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>,
#   hour <dbl>, minute <dbl>, time_hour <dttm>
ggplot(data = early_dep, aes(x = dep_delay, y = arr_delay, color = carrier)) +
  geom_point(alpha = .6) +
  labs(x = "Departure delay (minutes; negative = early)",
       y = "Arrival delay (minutes; negative = early)",
       title = "NYC Early Flight Departures and Their Arrival Delays in 2013")

About the Plots

The first plot is a scatterplot of flights departing NYC in 2013 that had significant delays, defined for this assignment as 60 minutes or more. First, I created a dataframe called “delayed_over_sixty”, which keeps only flights whose departure delay and arrival delay were both at least sixty minutes and drops every other flight. Then, I used geom_point to plot arrival delay against departure delay, with points colored by carrier. Within this subset, the plot suggests a strong positive relationship between departure delays and arrival delays: flights that leave later tend to arrive later by a similar amount.
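The relationship seen in the scatterplot can also be checked numerically. The following is a minimal sketch, not part of the original assignment, that computes the Pearson correlation between the two delay columns for the same filtered flights; the name `delay_cor` is introduced here for illustration only.

```r
library(dplyr)
library(nycflights13)

# Same filter as in the assignment: both delays at least 60 minutes.
# filter() also drops rows where either delay is NA.
delayed_over_sixty <- flights |>
  filter(arr_delay >= 60 & dep_delay >= 60)

# Pearson correlation between departure and arrival delay (hypothetical helper name)
delay_cor <- delayed_over_sixty |>
  summarise(
    n_flights = n(),
    correlation = cor(dep_delay, arr_delay)
  )

delay_cor
```

A correlation close to 1 would support the visual impression. Because the data are restricted to heavily delayed flights, the value describes only that subset and not all 2013 flights.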

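I made the second plot out of curiosity, to see how flights that departed early fared on arrival across all carriers. Some of these flights still arrived late, so the vertical axis includes positive arrival delays as well as early arrivals.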