ModernDiveCh4.Rmd

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(ggplot2)
library(readr)
library(tidyr)
library(nycflights23)
library(fivethirtyeight)

## Some larger datasets need to be installed separately, like senators and
## house_district_forecast. To install these, we recommend you install the
## fivethirtyeightdata package by running:
## install.packages('fivethirtyeightdata', repos =
## 'https://fivethirtyeightdata.github.io/drat/', type = 'source')

4.1 Tidy data frames share three characteristics: each variable forms a column, each observation forms a row, and each type of observational unit forms a table.

4.2 Tidy data frames are useful for organizing data because every variable has its own column, which makes it easier to plot or visualize the data by selected variables. It is also easier to sort the data by a specific variable, since each variable has its own column to sort by.
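
As a rough illustration of the point in 4.2, the sketch below sorts and plots a small tidy data frame. The data frame `toy_tidy`, its airline names, and its values are all made up for this example; they are not taken from `airline_safety`.

```r
library(dplyr)
library(ggplot2)

# Hypothetical tidy data: one row per airline-period pair (values are made up)
toy_tidy <- tibble(
  airline = c("Airline A", "Airline A", "Airline B", "Airline B"),
  period  = c("85_99", "00_14", "85_99", "00_14"),
  count   = c(10, 3, 0, 5)
)

# Sorting only needs the name of the variable's own column
toy_sorted <- toy_tidy |>
  arrange(desc(count))

# Plotting maps each column to one aesthetic
toy_plot <- ggplot(toy_tidy, aes(x = airline, y = count, fill = period)) +
  geom_col(position = "dodge")
```

Because `period` is its own column, it can be passed straight to `fill`. In a wide layout with one column per period, there would be no single column to map or sort by.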

4.3

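# Keep only the airline name and the two fatalities columns
airline_safety_smaller <- airline_safety |> 
  select(airline, starts_with("fatalities"))

# Reshape to tidy (long) format: one row per airline and time period
airline_safety_smaller |>
  pivot_longer(names_to = "fatalities_years",
               values_to = "count",
               cols = -airline)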

## # A tibble: 112 × 3
##    airline               fatalities_years count
##    <chr>                 <chr>            <int>
##  1 Aer Lingus            fatalities_85_99     0
##  2 Aer Lingus            fatalities_00_14     0
##  3 Aeroflot              fatalities_85_99   128
##  4 Aeroflot              fatalities_00_14    88
##  5 Aerolineas Argentinas fatalities_85_99     0
##  6 Aerolineas Argentinas fatalities_00_14     0
##  7 Aeromexico            fatalities_85_99    64
##  8 Aeromexico            fatalities_00_14     0
##  9 Air Canada            fatalities_85_99     0
## 10 Air Canada            fatalities_00_14     0
## # ℹ 102 more rows
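
One possible refinement, shown here only as a sketch: the `names_prefix` argument of `pivot_longer()` can strip the repeated `fatalities_` text so the new column holds just the year range. The data frame `toy_wide` and its values are hypothetical stand-ins with the same column layout as `airline_safety_smaller`.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide data mirroring the layout of airline_safety_smaller
# (airline names and values are made up)
toy_wide <- tibble(
  airline = c("Airline A", "Airline B"),
  fatalities_85_99 = c(10, 0),
  fatalities_00_14 = c(3, 5)
)

# names_prefix removes the matching text from the start of each column name
toy_long <- toy_wide |>
  pivot_longer(cols = -airline,
               names_to = "years",
               values_to = "count",
               names_prefix = "fatalities_")
```

The same argument could be added to the call on `airline_safety_smaller`. Whether the shorter labels are preferable depends on how the column will be used later.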

Eden Long

2026-02-24