Import two related datasets from the TidyTuesday project.
library(tidyverse) # provides read_csv(), %>%, and the dplyr join functions used below
survivalists <- read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2023/2023-01-24/survivalists.csv")
## Rows: 94 Columns: 16
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (10): name, gender, city, state, country, reason_tapped_out, reason_cate...
## dbl (5): season, age, result, days_lasted, day_linked_up
## lgl (1): medically_evacuated
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
loadouts <- read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2023/2023-01-24/loadouts.csv")
## Rows: 940 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): version, name, item_detailed, item
## dbl (2): season, item_number
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
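Describe the two datasets:
Data 1: survivalists has 94 rows and 16 columns, including season, name, age, result, and days_lasted.
Data 2: loadouts has 940 rows and 6 columns: version, season, name, item_number, item_detailed, and item.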
survivalists_small <- survivalists %>% select(season, name, age) %>% sample_n(10)
loadouts_small <- loadouts %>% select(season, name, item_detailed) %>% sample_n(10)
survivalists_small
## # A tibble: 10 × 3
## season name age
## <dbl> <chr> <dbl>
## 1 3 Zachary Fowler 36
## 2 8 Rose Anna Moore 43
## 3 1 Chris Weatherman 41
## 4 4 Jesse Bosdell 31
## 5 5 Brad Richardson 24
## 6 7 Callie Russell 31
## 7 4 Brooke Whipple 45
## 8 9 Benki Hill 46
## 9 6 Ray Livingston 43
## 10 7 Kielyn Marrone 33
loadouts_small
## # A tibble: 10 × 3
## season name item_detailed
## <dbl> <chr> <chr>
## 1 9 Teimojin Tan Trapping wire
## 2 1 Brant McGee Axe
## 3 2 Mike Lowe Emergency rations
## 4 6 Nathan Donnelly Axe
## 5 7 Joe Nicholas Knife
## 6 7 Roland Welker Bow and arrows
## 7 5 Britt Ahart Ferro Rod
## 8 4 Brad Richardson Pot – 2 quarts, stainless steel
## 9 3 Dan Wowak Paracord
## 10 8 Clay Hayes Axe
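Because `sample_n()` draws each sample independently and without a fixed seed, the two samples may share few or no (season, name) pairs, and the printed rows change on every run. The following is a minimal sketch of one way to get reproducible, overlapping samples. The seed value, the toy tibbles, and the names `surv_toy`, `gear_toy`, `surv_toy_small`, and `gear_toy_small` are illustrative assumptions, not part of the original exercise.

```r
library(dplyr)

set.seed(2023)  # arbitrary seed (assumption) so the samples repeat across runs

# Hypothetical stand-ins for survivalists and loadouts
surv_toy <- tibble(
  season = rep(1:2, each = 3),
  name   = paste("Person", 1:6),
  age    = 30:35
)
gear_toy <- tibble(
  season        = rep(surv_toy$season, each = 2),
  name          = rep(surv_toy$name, each = 2),
  item_detailed = rep(c("Axe", "Knife"), times = 6)
)

# Sample the survivalists first
surv_toy_small <- surv_toy %>% slice_sample(n = 3)

# Keep only gear rows for the sampled survivalists, then sample from those
gear_toy_small <- gear_toy %>%
  semi_join(surv_toy_small, by = c("season", "name")) %>%
  slice_sample(n = 4)

# Every sampled gear row now has a matching survivalist row
inner_join(surv_toy_small, gear_toy_small, by = c("season", "name"))
```

Sampling the second table only from rows that match the first guarantees a non-empty inner join, at the cost of the second sample no longer being a simple random sample of the full table.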
Describe the resulting data:
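How is it different from the original two datasets? 0 rows, compared to 10 in each original sample, because no (season, name) pair appears in both samples. The result has 4 columns: the two key columns plus age and item_detailed.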
survivalists_small %>% inner_join(loadouts_small)
## Joining with `by = join_by(season, name)`
## # A tibble: 0 × 4
## # ℹ 4 variables: season <dbl>, name <chr>, age <dbl>, item_detailed <chr>
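The empty result follows from the join key: with no `by` argument, the join uses every shared column, here both season and name. A small sketch using rows copied from the printed samples shows the effect; the tibble names `surv_rows` and `gear_rows` are assumptions made for illustration.

```r
library(dplyr)

# Rows copied from the printed samples above
surv_rows <- tibble(
  season = c(5, 7),
  name   = c("Brad Richardson", "Callie Russell"),
  age    = c(24, 31)
)
gear_rows <- tibble(
  season        = c(4, 7),
  name          = c("Brad Richardson", "Roland Welker"),
  item_detailed = c("Pot – 2 quarts, stainless steel", "Bow and arrows")
)

# Same key as the default: both season and name must match
by_both <- inner_join(surv_rows, gear_rows, by = c("season", "name"))

# Matching on name alone pairs a season 5 row with a season 4 loadout
by_name <- inner_join(surv_rows, gear_rows, by = "name")

nrow(by_both)
nrow(by_name)
names(by_name)
```

Brad Richardson appears in both samples but in different seasons, so he matches only when season is dropped from the key. In that case dplyr keeps both season columns as `season.x` and `season.y`, which is a sign that the two rows describe different appearances.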
Describe the resulting data:
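How is it different from the original two datasets? 10 rows, one for each row of survivalists_small, plus a new item_detailed column. Every item_detailed value is NA because no row has a match in loadouts_small.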
survivalists_small %>% left_join(loadouts_small)
## Joining with `by = join_by(season, name)`
## # A tibble: 10 × 4
## season name age item_detailed
## <dbl> <chr> <dbl> <chr>
## 1 3 Zachary Fowler 36 <NA>
## 2 8 Rose Anna Moore 43 <NA>
## 3 1 Chris Weatherman 41 <NA>
## 4 4 Jesse Bosdell 31 <NA>
## 5 5 Brad Richardson 24 <NA>
## 6 7 Callie Russell 31 <NA>
## 7 4 Brooke Whipple 45 <NA>
## 8 9 Benki Hill 46 <NA>
## 9 6 Ray Livingston 43 <NA>
## 10 7 Kielyn Marrone 33 <NA>
Describe the resulting data:
How is it different from the original two datasets? 10 rows, one for each row of loadouts_small, plus the age column. Every age is NA because no row has a match in survivalists_small.
survivalists_small %>% right_join(loadouts_small)
## Joining with `by = join_by(season, name)`
## # A tibble: 10 × 4
## season name age item_detailed
## <dbl> <chr> <dbl> <chr>
## 1 9 Teimojin Tan NA Trapping wire
## 2 1 Brant McGee NA Axe
## 3 2 Mike Lowe NA Emergency rations
## 4 6 Nathan Donnelly NA Axe
## 5 7 Joe Nicholas NA Knife
## 6 7 Roland Welker NA Bow and arrows
## 7 5 Britt Ahart NA Ferro Rod
## 8 4 Brad Richardson NA Pot – 2 quarts, stainless steel
## 9 3 Dan Wowak NA Paracord
## 10 8 Clay Hayes NA Axe
Describe the resulting data:
How is it different from the original two datasets? 20 rows instead of the original 10, because every row from both samples is kept. Since nothing matched, each row has an NA in either age or item_detailed.
survivalists_small %>% full_join(loadouts_small)
## Joining with `by = join_by(season, name)`
## # A tibble: 20 × 4
## season name age item_detailed
## <dbl> <chr> <dbl> <chr>
## 1 3 Zachary Fowler 36 <NA>
## 2 8 Rose Anna Moore 43 <NA>
## 3 1 Chris Weatherman 41 <NA>
## 4 4 Jesse Bosdell 31 <NA>
## 5 5 Brad Richardson 24 <NA>
## 6 7 Callie Russell 31 <NA>
## 7 4 Brooke Whipple 45 <NA>
## 8 9 Benki Hill 46 <NA>
## 9 6 Ray Livingston 43 <NA>
## 10 7 Kielyn Marrone 33 <NA>
## 11 9 Teimojin Tan NA Trapping wire
## 12 1 Brant McGee NA Axe
## 13 2 Mike Lowe NA Emergency rations
## 14 6 Nathan Donnelly NA Axe
## 15 7 Joe Nicholas NA Knife
## 16 7 Roland Welker NA Bow and arrows
## 17 5 Britt Ahart NA Ferro Rod
## 18 4 Brad Richardson NA Pot – 2 quarts, stainless steel
## 19 3 Dan Wowak NA Paracord
## 20 8 Clay Hayes NA Axe
Describe the resulting data:
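How is it different from the original two datasets? 0 rows compared to the original 10, because no row of survivalists_small has a match in loadouts_small. Only the 3 columns of survivalists_small (season, name, age) are kept; a semi join never adds columns from the second table.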
survivalists_small %>% semi_join(loadouts_small)
## Joining with `by = join_by(season, name)`
## # A tibble: 0 × 3
## # ℹ 3 variables: season <dbl>, name <chr>, age <dbl>
Describe the resulting data:
How is it different from the original two datasets? 10 rows, identical to survivalists_small, because none of its rows has a match in loadouts_small. Only 3 columns are kept: season, name, and age.
survivalists_small %>% anti_join(loadouts_small)
## Joining with `by = join_by(season, name)`
## # A tibble: 10 × 3
## season name age
## <dbl> <chr> <dbl>
## 1 3 Zachary Fowler 36
## 2 8 Rose Anna Moore 43
## 3 1 Chris Weatherman 41
## 4 4 Jesse Bosdell 31
## 5 5 Brad Richardson 24
## 6 7 Callie Russell 31
## 7 4 Brooke Whipple 45
## 8 9 Benki Hill 46
## 9 6 Ray Livingston 43
## 10 7 Kielyn Marrone 33
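The semi join and anti join are complements: together they split the first table into rows with a match and rows without one. A minimal sketch on hypothetical toy data (`a_toy` and `b_toy` are illustrative names, not from the exercise):

```r
library(dplyr)

# Hypothetical toy tables: (1, "A") and (2, "C") appear in both
a_toy <- tibble(season = c(1, 1, 2), name = c("A", "B", "C"), age = c(30, 40, 50))
b_toy <- tibble(
  season        = c(1, 1, 2),
  name          = c("A", "A", "C"),
  item_detailed = c("Axe", "Knife", "Saw")
)
keys <- c("season", "name")

matched   <- semi_join(a_toy, b_toy, by = keys)  # rows of a_toy with a match
unmatched <- anti_join(a_toy, b_toy, by = keys)  # rows of a_toy without one

# Each row of a_toy lands in exactly one of the two results
nrow(matched) + nrow(unmatched) == nrow(a_toy)
```

Unlike an inner join, the semi join returns "A" once even though it matches two rows of `b_toy`, because filtering joins never duplicate rows or add columns.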