Untitled

remotes::install_github("moderndive/nycflights23")
Skipping install of 'nycflights23' from a github remote, the SHA1 (54a296ac) has not changed since last install.
  Use `force = TRUE` to force installation
library(nycflights23)
data("flights")
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(conflicted)
library(dplyr)
library(RColorBrewer)
head(flights)
# A tibble: 6 × 19
   year month   day dep_time sched_dep_time dep_delay arr_time sched_arr_time
  <int> <int> <int>    <int>          <int>     <dbl>    <int>          <int>
1  2023     1     1        1           2038       203      328              3
2  2023     1     1       18           2300        78      228            135
3  2023     1     1       31           2344        47      500            426
4  2023     1     1       33           2140       173      238           2352
5  2023     1     1       36           2048       228      223           2252
6  2023     1     1      503            500         3      808            815
# ℹ 11 more variables: arr_delay <dbl>, carrier <chr>, flight <int>,
#   tailnum <chr>, origin <chr>, dest <chr>, air_time <dbl>, distance <dbl>,
#   hour <dbl>, minute <dbl>, time_hour <dttm>
summary(flights$origin)
   Length     Class      Mode 
   435352 character character 
columns = count
# dplyr, tidyr, and ggplot2 are already attached by library(tidyverse) above
library(alluvial)
library(ggalluvial)  # provides geom_alluvium() and geom_stratum()
# Count flights per year and carrier; `year` is already a column in flights,
# so it does not need to be rebuilt from time_hour
data_processed <- flights %>%
  count(year, carrier, name = "flights")

# Create the alluvial plot
ggplot(data_processed, 
       aes(axis1 = year, axis2 = carrier, y = flights)) +
  geom_alluvium(aes(fill = carrier), width = 0.2) +
  geom_stratum(width = 0.2) +
  geom_text(stat = "stratum", aes(label = after_stat(stratum))) +
  scale_x_discrete(limits = c("Year", "Carrier"), expand = c(0.15, 0.05)) +
  theme_minimal() +
  labs(title = "Alluvial Diagram of Flights by Carrier and Year",
       x = "",
       y = "Number of Flights")

My visualization is an alluvial diagram that shows the number of flights in 2023 by carrier. One thing I would like to highlight about this visualization is actually a weakness: some of the carriers have so few flights that their labels cannot fit inside their strata.
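
One possible way to reduce the label crowding would be to pool the smallest carriers into a single "Other" group before plotting. The sketch below is only an illustration under that assumption: the helper `lump_small_carriers()` and the `toy_counts` values are hypothetical and not part of the analysis above, while `fct_lump_n()` is an existing forcats function.

```r
library(dplyr)
library(forcats)

# Hypothetical helper (not part of the original analysis): keep the n_keep
# carriers with the most flights and pool the remaining ones into "Other".
lump_small_carriers <- function(counts, n_keep = 8) {
  counts %>%
    mutate(carrier = fct_lump_n(carrier, n = n_keep, w = flights,
                                other_level = "Other")) %>%
    # Re-aggregate so the pooled carriers collapse into a single row per year
    count(year, carrier, wt = flights, name = "flights")
}

# Made-up counts purely to demonstrate the helper; these are not real figures.
toy_counts <- tibble(
  year    = 2023L,
  carrier = c("AA", "BB", "CC", "DD"),
  flights = c(500, 300, 20, 10)
)

lumped <- lump_small_carriers(toy_counts, n_keep = 2)
print(lumped)
```

On the real data this would be called as `lump_small_carriers(data_processed, n_keep = 8)`, and the result passed to the same `ggplot()` call in place of `data_processed`. The cutoff of 8 is an arbitrary choice and would need adjusting until every remaining stratum is tall enough for its label.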