library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.1 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(skimr)
library(gt)
flights <- read_csv("https://jsuleiman.com/datasets/domestic_flights_jan_2016.csv")
Rows: 445827 Columns: 21
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (9): FlightDate, Carrier, TailNum, Origin, OriginCityName, OriginState,...
dbl (12): FlightNum, CRSDepTime, DepTime, WheelsOff, WheelsOn, CRSArrTime, A...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
glimpse(flights)
Rows: 445,827
Columns: 21
$ FlightDate <chr> "1/6/2016", "1/7/2016", "1/8/2016", "1/9/2016", "1/1…
$ Carrier <chr> "AA", "AA", "AA", "AA", "AA", "AA", "AA", "AA", "AA"…
$ TailNum <chr> "N4YBAA", "N434AA", "N541AA", "N489AA", "N439AA", "N…
$ FlightNum <dbl> 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, …
$ Origin <chr> "DFW", "DFW", "DFW", "DFW", "DFW", "DFW", "DFW", "DF…
$ OriginCityName <chr> "Dallas/Fort Worth, TX", "Dallas/Fort Worth, TX", "D…
$ OriginState <chr> "TX", "TX", "TX", "TX", "TX", "TX", "TX", "TX", "TX"…
$ Dest <chr> "DTW", "DTW", "DTW", "DTW", "DTW", "DTW", "DTW", "DT…
$ DestCityName <chr> "Detroit, MI", "Detroit, MI", "Detroit, MI", "Detroi…
$ DestState <chr> "MI", "MI", "MI", "MI", "MI", "MI", "MI", "MI", "MI"…
$ CRSDepTime <dbl> 1100, 1100, 1100, 1100, 1100, 1100, 1100, 1100, 1100…
$ DepTime <dbl> 1057, 1056, 1055, 1102, 1240, 1107, 1059, 1055, 1058…
$ WheelsOff <dbl> 1112, 1110, 1116, 1115, 1300, 1118, 1113, 1107, 1110…
$ WheelsOn <dbl> 1424, 1416, 1431, 1424, 1617, 1426, 1429, 1419, 1420…
$ CRSArrTime <dbl> 1438, 1438, 1438, 1438, 1438, 1438, 1438, 1438, 1438…
$ ArrTime <dbl> 1432, 1426, 1445, 1433, 1631, 1435, 1438, 1431, 1428…
$ Cancelled <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ Diverted <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ CRSElapsedTime <dbl> 158, 158, 158, 158, 158, 158, 158, 158, 158, 158, 15…
$ ActualElapsedTime <dbl> 155, 150, 170, 151, 171, 148, 159, 156, 150, 158, 14…
$ Distance <dbl> 986, 986, 986, 986, 986, 986, 986, 986, 986, 986, 98…
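The `glimpse()` output above shows `FlightDate` stored as character in month/day/year form and `Cancelled` stored as a 0/1 double. As a minimal sketch of one possible cleanup, assuming those columns should become a `Date` and a logical, the conversion could look like this. `flights_sample` and `flights_typed` are hypothetical names, and the tiny tibble only mirrors the first few values printed above so the snippet runs without downloading the file.

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-in built from the first values shown by glimpse();
# it is not the full dataset.
flights_sample <- tibble(
  FlightDate = c("1/6/2016", "1/7/2016", "1/8/2016"),
  Cancelled  = c(0, 0, 0)
)

flights_typed <- flights_sample |>
  mutate(
    FlightDate = mdy(FlightDate),       # parse month/day/year text into a Date
    Cancelled  = as.logical(Cancelled)  # 0/1 becomes FALSE/TRUE
  )

glimpse(flights_typed)
```

The same `mutate()` call should apply unchanged to the full `flights` tibble, provided every `FlightDate` value follows the month/day/year pattern seen in the preview.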
flights |>
  skim()

| | |
|---|---|
| Name | flights |
| Number of rows | 445827 |
| Number of columns | 21 |
| _______________________ | |
| Column type frequency: | |
| character | 9 |
| numeric | 12 |
| ________________________ | |
| Group variables | None |

Variable type: character

| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| FlightDate | 0 | 1.00 | 8 | 9 | 0 | 31 | 0 |
| Carrier | 0 | 1.00 | 2 | 2 | 0 | 12 | 0 |
| TailNum | 4244 | 0.99 | 5 | 6 | 0 | 4238 | 0 |
| Origin | 0 | 1.00 | 3 | 3 | 0 | 294 | 0 |
| OriginCityName | 0 | 1.00 | 8 | 34 | 0 | 290 | 0 |
| OriginState | 0 | 1.00 | 2 | 2 | 0 | 52 | 0 |
| Dest | 0 | 1.00 | 3 | 3 | 0 | 294 | 0 |
| DestCityName | 0 | 1.00 | 8 | 34 | 0 | 290 | 0 |
| DestState | 0 | 1.00 | 2 | 2 | 0 | 52 | 0 |

Variable type: numeric

| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| FlightNum | 0 | 1.00 | 2078.86 | 1757.27 | 1 | 702 | 1594 | 2763 | 7438 | ▇▅▁▂▁ |
| CRSDepTime | 0 | 1.00 | 1330.38 | 482.81 | 1 | 920 | 1325 | 1730 | 2359 | ▁▇▇▇▅ |
| DepTime | 11473 | 0.97 | 1334.24 | 492.96 | 1 | 924 | 1331 | 1737 | 2400 | ▁▇▇▇▅ |
| WheelsOff | 11600 | 0.97 | 1356.55 | 493.87 | 1 | 939 | 1344 | 1750 | 2400 | ▁▇▇▇▅ |
| WheelsOn | 11907 | 0.97 | 1483.30 | 514.35 | 1 | 1104 | 1519 | 1914 | 2400 | ▁▅▇▇▇ |
| CRSArrTime | 0 | 1.00 | 1502.95 | 505.24 | 1 | 1118 | 1527 | 1920 | 2359 | ▁▃▇▇▇ |
| ArrTime | 11907 | 0.97 | 1488.10 | 518.68 | 1 | 1108 | 1522 | 1919 | 2400 | ▁▅▇▇▇ |
| Cancelled | 0 | 1.00 | 0.03 | 0.16 | 0 | 0 | 0 | 0 | 1 | ▇▁▁▁▁ |
| Diverted | 0 | 1.00 | 0.00 | 0.04 | 0 | 0 | 0 | 0 | 1 | ▇▁▁▁▁ |
| CRSElapsedTime | 0 | 1.00 | 146.50 | 76.61 | 21 | 90 | 128 | 180 | 705 | ▇▃▁▁▁ |
| ActualElapsedTime | 12529 | 0.97 | 140.14 | 74.75 | 15 | 85 | 122 | 173 | 721 | ▇▃▁▁▁ |
| Distance | 0 | 1.00 | 844.23 | 610.35 | 31 | 391 | 679 | 1086 | 4983 | ▇▂▁▁▁ |
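The `complete_rate` column in the tables above can be reproduced from `n_missing` and the row count. The sketch below assumes `complete_rate` is simply one minus the share of missing values, rounded to two decimals for display. The numbers are copied from the skim output, and the variable names are illustrative.

```r
# Row count and selected n_missing values, copied from the skim() output above.
n_rows <- 445827
n_missing <- c(TailNum = 4244, DepTime = 11473, ActualElapsedTime = 12529)

# Assumed relationship: complete_rate = 1 - n_missing / n_rows
complete_rate <- round(1 - n_missing / n_rows, 2)
complete_rate
```

Under that assumption the results match the table: 0.99 for `TailNum` and 0.97 for `DepTime` and `ActualElapsedTime`.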