Import two related datasets from the TidyTuesday project.
survivalists <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-01-24/survivalists.csv')
## Rows: 94 Columns: 16
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (10): name, gender, city, state, country, reason_tapped_out, reason_cate...
## dbl (5): season, age, result, days_lasted, day_linked_up
## lgl (1): medically_evacuated
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
loadouts <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-01-24/loadouts.csv')
## Rows: 940 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): version, name, item_detailed, item
## dbl (2): season, item_number
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
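Describe the two datasets:
Data 1: survivalists has 94 rows and 16 columns (10 character, 5 numeric, 1 logical), including season, name, age, result, and days_lasted.
Data 2: loadouts has 940 rows and 6 columns (version, name, item_detailed, item, season, item_number).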
library(dplyr)
# Fix the seed so the two 5-row samples are reproducible
set.seed(5819)
survivalist_s <- survivalists %>% select(season, name, days_lasted) %>% sample_n(5)
loadouts_s <- loadouts %>% select(season, name, item) %>% sample_n(5)
survivalist_s
## # A tibble: 5 × 3
## season name days_lasted
## <dbl> <chr> <dbl>
## 1 4 Josh Richardson 1
## 2 8 Colter Barnes 67
## 3 6 Nikki van Schyndel 52
## 4 2 Mike Lowe 21
## 5 3 Greg Ovens 51
loadouts_s
## # A tibble: 5 × 3
## season name item
## <dbl> <chr> <chr>
## 1 2 Mary Kate Green Axe
## 2 4 Pete Brockdorff Trapping wire
## 3 6 Tim Backus Pot
## 4 1 Wayne Russell Fishing gear
## 5 4 Josh Richardson Bivy bag
Describe the resulting data:
How is it different from the original two datasets?
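The inner join keeps only the rows whose season and name appear in both samples. Only Josh Richardson (season 4) matches, so the result is 1 row with 4 columns: the three from survivalist_s plus item from loadouts_s.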
survivalist_s %>% inner_join(loadouts_s, by = c("season", "name"))
## # A tibble: 1 × 4
## season name days_lasted item
## <dbl> <chr> <dbl> <chr>
## 1 4 Josh Richardson 1 Bivy bag
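To see why the join uses both season and name as keys, here is a minimal sketch that re-enters the two printed samples by hand (so it runs without downloading anything) and compares the join on both keys with a join on season alone. The object names by_both and by_season are made up for this illustration.

```r
library(dplyr)

# The two 5-row samples, copied from the printed output above
survivalist_s <- tibble::tribble(
  ~season, ~name,                ~days_lasted,
  4,       "Josh Richardson",    1,
  8,       "Colter Barnes",      67,
  6,       "Nikki van Schyndel", 52,
  2,       "Mike Lowe",          21,
  3,       "Greg Ovens",         51
)
loadouts_s <- tibble::tribble(
  ~season, ~name,             ~item,
  2,       "Mary Kate Green", "Axe",
  4,       "Pete Brockdorff", "Trapping wire",
  6,       "Tim Backus",      "Pot",
  1,       "Wayne Russell",   "Fishing gear",
  4,       "Josh Richardson", "Bivy bag"
)

# Both keys: a row matches only if season AND name agree
by_both <- inner_join(survivalist_s, loadouts_s, by = c("season", "name"))

# Season only: people from the same season are paired up even if they are
# different contestants; the unmatched name columns become name.x and name.y
by_season <- inner_join(survivalist_s, loadouts_s, by = "season")

nrow(by_both)    # 1
nrow(by_season)  # 4
```

Joining on season alone pairs Josh Richardson's days_lasted with Pete Brockdorff's item, which is why name has to be part of the key here.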
Describe the resulting data:
How is it different from the original two datasets?
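The left join keeps all 5 rows of survivalist_s and adds the item column. Only Josh Richardson has a match in loadouts_s, so the other four rows have NA for item.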
survivalist_s %>% left_join(loadouts_s, by = c("season", "name"))
## # A tibble: 5 × 4
## season name days_lasted item
## <dbl> <chr> <dbl> <chr>
## 1 4 Josh Richardson 1 Bivy bag
## 2 8 Colter Barnes 67 <NA>
## 3 6 Nikki van Schyndel 52 <NA>
## 4 2 Mike Lowe 21 <NA>
## 5 3 Greg Ovens 51 <NA>
Describe the resulting data:
How is it different from the original two datasets?
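The right join keeps all 5 rows of loadouts_s and adds the days_lasted column. Only Josh Richardson has a match in survivalist_s, so the other four rows have NA for days_lasted.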
survivalist_s %>% right_join(loadouts_s, by = c("season", "name"))
## # A tibble: 5 × 4
## season name days_lasted item
## <dbl> <chr> <dbl> <chr>
## 1 4 Josh Richardson 1 Bivy bag
## 2 2 Mary Kate Green NA Axe
## 3 4 Pete Brockdorff NA Trapping wire
## 4 6 Tim Backus NA Pot
## 5 1 Wayne Russell NA Fishing gear
Describe the resulting data:
How is it different from the original two datasets?
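The full join keeps every row from both samples, giving 9 rows (5 + 5, minus the 1 shared match). Rows found only in survivalist_s have NA for item, and rows found only in loadouts_s have NA for days_lasted.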
survivalist_s %>% full_join(loadouts_s, by = c("season", "name"))
## # A tibble: 9 × 4
## season name days_lasted item
## <dbl> <chr> <dbl> <chr>
## 1 4 Josh Richardson 1 Bivy bag
## 2 8 Colter Barnes 67 <NA>
## 3 6 Nikki van Schyndel 52 <NA>
## 4 2 Mike Lowe 21 <NA>
## 5 3 Greg Ovens 51 <NA>
## 6 2 Mary Kate Green NA Axe
## 7 4 Pete Brockdorff NA Trapping wire
## 8 6 Tim Backus NA Pot
## 9 1 Wayne Russell NA Fishing gear
Describe the resulting data:
How is it different from the original two datasets?
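The semi join keeps only the rows of survivalist_s that have a match in loadouts_s and adds no columns from loadouts_s. The result is 1 row (Josh Richardson) with the original 3 columns, so there is no item column.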
survivalist_s %>% semi_join(loadouts_s, by = c("season", "name"))
## # A tibble: 1 × 3
## season name days_lasted
## <dbl> <chr> <dbl>
## 1 4 Josh Richardson 1
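With this sample the semi join and the inner join return the same single row, so the difference is easy to miss. The sketch below shows one case where they diverge, assuming a contestant has two rows in the loadout table. The second item and the names loadouts_h, inner_res, and semi_res are hypothetical and not taken from the data.

```r
library(dplyr)

survivalist_s <- tibble::tribble(
  ~season, ~name,             ~days_lasted,
  4,       "Josh Richardson", 1,
  8,       "Colter Barnes",   67
)

# Hypothetical loadout sample with two rows for the same contestant
loadouts_h <- tibble::tribble(
  ~season, ~name,             ~item,
  4,       "Josh Richardson", "Bivy bag",
  4,       "Josh Richardson", "Hypothetical second item"
)

# inner_join returns one row per matching pair and adds the item column
inner_res <- inner_join(survivalist_s, loadouts_h, by = c("season", "name"))

# semi_join only filters survivalist_s: no duplicated rows, no new columns
semi_res <- semi_join(survivalist_s, loadouts_h, by = c("season", "name"))

dim(inner_res)  # 2 rows, 4 columns
dim(semi_res)   # 1 row, 3 columns
```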
Describe the resulting data:
How is it different from the original two datasets?
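The anti join keeps only the rows of survivalist_s that have no match in loadouts_s. The result has 4 rows (everyone except Josh Richardson) and the original 3 columns. No item column is added, so there are no NA values.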
survivalist_s %>% anti_join(loadouts_s, by = c("season", "name"))
## # A tibble: 4 × 3
## season name days_lasted
## <dbl> <chr> <dbl>
## 1 8 Colter Barnes 67
## 2 6 Nikki van Schyndel 52
## 3 2 Mike Lowe 21
## 4 3 Greg Ovens 51
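The row counts of the six joins are related, and a quick check can confirm the results above. This is a sketch that re-enters the printed samples by hand; the short names x, y, keys, and the n_ counters are made up for the illustration.

```r
library(dplyr)

# The two 5-row samples, copied from the printed output above
x <- tibble::tribble(
  ~season, ~name,                ~days_lasted,
  4,       "Josh Richardson",    1,
  8,       "Colter Barnes",      67,
  6,       "Nikki van Schyndel", 52,
  2,       "Mike Lowe",          21,
  3,       "Greg Ovens",         51
)
y <- tibble::tribble(
  ~season, ~name,             ~item,
  2,       "Mary Kate Green", "Axe",
  4,       "Pete Brockdorff", "Trapping wire",
  6,       "Tim Backus",      "Pot",
  1,       "Wayne Russell",   "Fishing gear",
  4,       "Josh Richardson", "Bivy bag"
)
keys <- c("season", "name")

n_matched <- nrow(inner_join(x, y, by = keys))  # rows matched in both
n_only_x  <- nrow(anti_join(x, y, by = keys))   # rows only in x
n_only_y  <- nrow(anti_join(y, x, by = keys))   # rows only in y
n_full    <- nrow(full_join(x, y, by = keys))

# full join = matched rows + rows only in x + rows only in y
n_full == n_matched + n_only_x + n_only_y                # TRUE (9 == 1 + 4 + 4)

# semi join and anti join split x into matched and unmatched rows
nrow(semi_join(x, y, by = keys)) + n_only_x == nrow(x)   # TRUE (1 + 4 == 5)
```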