NYC flights project

Author

Aminata Diatta

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

#Let’s look at our dataset

library(nycflights13)
data("flights")
view(flights)

#We are interested in the UA carrier, the arrival delay (arr_delay), and the distance

flights_sata <- flights[, c("carrier", "arr_delay", "distance")]
flights_sata
# A tibble: 336,776 × 3
   carrier arr_delay distance
   <chr>       <dbl>    <dbl>
 1 UA             11     1400
 2 UA             20     1416
 3 AA             33     1089
 4 B6            -18     1576
 5 DL            -25      762
 6 UA             12      719
 7 B6             19     1065
 8 EV            -14      229
 9 B6             -8      944
10 AA              8      733
# ℹ 336,766 more rows

#Let’s remove the NAs

flights_sata2 <- flights |>
  filter(!is.na(distance) & !is.na(arr_delay) & !is.na(carrier))
view(flights_sata2)
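The filter above drops every row that has a missing value in any of the three columns of interest. As a hedged alternative, the same rows can be obtained with `tidyr::drop_na()`. This is only a sketch: the name `flights_complete` is introduced here for illustration and is not part of the original analysis.

```r
library(dplyr)
library(tidyr)
library(nycflights13)

# Hypothetical name: flights_complete is not used elsewhere in this project.
# drop_na() removes rows with an NA in any of the listed columns.
flights_complete <- flights |>
  drop_na(carrier, arr_delay, distance)

# Number of rows removed because of missing values
nrow(flights) - nrow(flights_complete)
```

Comparing the row counts before and after is a quick way to see how much data the NA removal discards.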
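# Keep only United Airlines flights (carrier code "UA")
United_airlines <- flights_sata2 |>
  filter(carrier == "UA")
# arr_delay is in minutes: keep UA flights more than 400 minutes late over more than 1000 miles
United_airlines2 <- United_airlines |>
  filter(arr_delay > 400 & distance > 1000)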
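In `nycflights13`, `arr_delay` is recorded in minutes and `distance` in miles, so a cutoff of 400 corresponds to 6 hours 40 minutes, while exactly 6 hours would be 360 minutes. The sketch below makes the thresholds explicit and applies them to every carrier rather than only UA. That choice is an assumption about the intent, and the names `delay_cutoff_min`, `min_distance_mi` and `long_delays` are hypothetical.

```r
library(dplyr)
library(nycflights13)

# Assumed thresholds, named for illustration only
delay_cutoff_min <- 6 * 60   # 6 hours expressed in minutes
min_distance_mi  <- 1000     # distance is recorded in miles

# Keep flights from every carrier that exceed both thresholds
long_delays <- flights |>
  filter(!is.na(arr_delay),
         arr_delay > delay_cutoff_min,
         distance > min_distance_mi)

# Number of such flights per carrier, most affected first
long_delays |> count(carrier, sort = TRUE)
```

Keeping the cutoffs in named variables makes it easy to switch between the 360-minute and 400-minute definitions without touching the filter itself.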
library(treemap)
library(RColorBrewer)
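# Plot only the long-delay, long-distance flights so the data matches the title
treemap(flights_sata2 |> filter(arr_delay > 400 & distance > 1000),
        index = "carrier", vSize = "distance", vColor = "arr_delay", type = "manual",
        title = "NYC flights delayed over 400 minutes on routes over 1000 miles", palette = "OrRd")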

#Our treemap visualization shows flights that arrived more than 400 minutes (over 6 hours) late on routes longer than 1,000 miles. The rectangles are grouped by airline carrier: because the treemap is indexed by carrier, each rectangle represents one carrier rather than a single flight, with its size indicating the distance flown and its color the arrival delay. Delta, a major airline operating globally, is one carrier to focus on: according to the Wall Street Journal, a significant computer failure in 2013 caused widespread disruptions across various airlines, and passengers with connecting flights, for instance, may have missed them as a result. I made this visualization to find out whether flight delays are influenced by distance or depend primarily on the airline.
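The question of distance versus airline can also be probed numerically. The following is a minimal sketch, not part of the original analysis: it computes the correlation between distance and arrival delay across all flights, then summarises the average delay and distance per carrier. The names `arrivals` and `carrier_summary` are hypothetical.

```r
library(dplyr)
library(nycflights13)

# Flights with a recorded arrival delay and distance
arrivals <- flights |>
  filter(!is.na(arr_delay), !is.na(distance))

# Linear association between distance and arrival delay across all flights
cor(arrivals$distance, arrivals$arr_delay)

# Average delay and average distance for each carrier
carrier_summary <- arrivals |>
  group_by(carrier) |>
  summarise(
    n_flights     = n(),
    mean_delay    = mean(arr_delay),
    mean_distance = mean(distance)
  ) |>
  arrange(desc(mean_delay))

carrier_summary
```

A correlation close to zero together with large differences in `mean_delay` between carriers would point toward the airline rather than the distance, although a simple summary like this does not establish a cause.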