1. Import your data

Import two related datasets from the TidyTuesday project.

survivalists <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-01-24/survivalists.csv') 
## Rows: 94 Columns: 16
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (10): name, gender, city, state, country, reason_tapped_out, reason_cate...
## dbl  (5): season, age, result, days_lasted, day_linked_up
## lgl  (1): medically_evacuated
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
loadouts <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-01-24/loadouts.csv') 
## Rows: 940 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): version, name, item_detailed, item
## dbl (2): season, item_number
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

2. Make data small

Describe the two datasets:

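Data 1: survivalists has 94 rows and 16 columns, including season, name, age, result, days_lasted, and reason_tapped_out.

Data 2: loadouts has 940 rows and 6 columns: version, season, name, item_number, item_detailed, and item. Both datasets contain season and name, which are used as the join keys below.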

library(dplyr) # provides %>%, select(), sample_n(), and the join functions
set.seed(5819)
survivalist_s <- survivalists %>% select(season, name, days_lasted) %>% sample_n(5)
loadouts_s <- loadouts %>% select(season, name, item) %>% sample_n(5)

survivalist_s
## # A tibble: 5 × 3
##   season name               days_lasted
##    <dbl> <chr>                    <dbl>
## 1      4 Josh Richardson              1
## 2      8 Colter Barnes               67
## 3      6 Nikki van Schyndel          52
## 4      2 Mike Lowe                   21
## 5      3 Greg Ovens                  51
loadouts_s 
## # A tibble: 5 × 3
##   season name            item         
##    <dbl> <chr>           <chr>        
## 1      2 Mary Kate Green Axe          
## 2      4 Pete Brockdorff Trapping wire
## 3      6 Tim Backus      Pot          
## 4      1 Wayne Russell   Fishing gear 
## 5      4 Josh Richardson Bivy bag

3. inner_join

Describe the resulting data:

How is it different from the original two datasets?

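The result keeps only the rows whose season and name appear in both samples. Here that is a single row (Josh Richardson, season 4), which carries days_lasted from survivalist_s and item from loadouts_s, so it has four columns instead of three.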

survivalist_s %>% inner_join(loadouts_s, by = c("season", "name"))
## # A tibble: 1 × 4
##   season name            days_lasted item    
##    <dbl> <chr>                 <dbl> <chr>   
## 1      4 Josh Richardson           1 Bivy bag
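One way to check the single-row result is to compare it with the set of key pairs the two samples share. The sketch below assumes hand-typed copies of the two 5-row samples printed in section 2, so it runs without the download; shared_keys and joined are names made up for this illustration.

```r
library(dplyr)

# Hand-typed copies of the two samples printed in section 2 (assumption:
# typed in by hand so the example runs without downloading the data)
survivalist_s <- tibble::tibble(
  season      = c(4, 8, 6, 2, 3),
  name        = c("Josh Richardson", "Colter Barnes", "Nikki van Schyndel",
                  "Mike Lowe", "Greg Ovens"),
  days_lasted = c(1, 67, 52, 21, 51)
)
loadouts_s <- tibble::tibble(
  season = c(2, 4, 6, 1, 4),
  name   = c("Mary Kate Green", "Pete Brockdorff", "Tim Backus",
             "Wayne Russell", "Josh Richardson"),
  item   = c("Axe", "Trapping wire", "Pot", "Fishing gear", "Bivy bag")
)

# (season, name) pairs that occur in both samples
shared_keys <- dplyr::intersect(
  select(survivalist_s, season, name),
  select(loadouts_s, season, name)
)

joined <- inner_join(survivalist_s, loadouts_s, by = c("season", "name"))

nrow(shared_keys) # 1
nrow(joined)      # 1
```

Season alone would not work as a key here: seasons 2, 4, and 6 appear in both samples, but mostly with different names.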

4. left_join

Describe the resulting data:

How is it different from the original two datasets?

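All five rows of survivalist_s are kept and an item column is added. Only Josh Richardson has a match in loadouts_s, so the other four rows have NA for item.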

survivalist_s %>% left_join(loadouts_s, by = c("season", "name"))
## # A tibble: 5 × 4
##   season name               days_lasted item    
##    <dbl> <chr>                    <dbl> <chr>   
## 1      4 Josh Richardson              1 Bivy bag
## 2      8 Colter Barnes               67 <NA>    
## 3      6 Nikki van Schyndel          52 <NA>    
## 4      2 Mike Lowe                   21 <NA>    
## 5      3 Greg Ovens                  51 <NA>
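The row count stays at five here because each key appears at most once in loadouts_s. The full loadouts table has 940 rows against 94 in survivalists, which suggests keys repeat there, and a left join then returns one row per match. A minimal sketch with hypothetical toy tables (people, items, and all their values are invented for illustration, not rows from the data):

```r
library(dplyr)

# Hypothetical toy tables, not rows from the real data
people <- tibble::tibble(
  season      = c(1, 1),
  name        = c("Person A", "Person B"),
  days_lasted = c(10, 20)
)
items <- tibble::tibble(
  season = c(1, 1),
  name   = c("Person A", "Person A"), # the same key appears twice
  item   = c("item 1", "item 2")
)

# Person A matches two item rows and is repeated; Person B has no match
# and is kept once with NA in item
left <- left_join(people, items, by = c("season", "name"))

nrow(people) # 2
nrow(left)   # 3
```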

5. right_join

Describe the resulting data:

How is it different from the original two datasets?

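All five rows of loadouts_s are kept and a days_lasted column is added. Only Josh Richardson has a match in survivalist_s, so the other four rows have NA for days_lasted.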

survivalist_s %>% right_join(loadouts_s, by = c("season", "name"))
## # A tibble: 5 × 4
##   season name            days_lasted item         
##    <dbl> <chr>                 <dbl> <chr>        
## 1      4 Josh Richardson           1 Bivy bag     
## 2      2 Mary Kate Green          NA Axe          
## 3      4 Pete Brockdorff          NA Trapping wire
## 4      6 Tim Backus               NA Pot          
## 5      1 Wayne Russell            NA Fishing gear
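A right join can also be read as a left join with the two tables swapped; only the row and column order differ. The sketch below checks this on hand-typed copies of the two samples from section 2 (an assumption so the block runs on its own); right, swapped, and same are illustrative names.

```r
library(dplyr)

# Hand-typed copies of the two samples printed in section 2
survivalist_s <- tibble::tibble(
  season      = c(4, 8, 6, 2, 3),
  name        = c("Josh Richardson", "Colter Barnes", "Nikki van Schyndel",
                  "Mike Lowe", "Greg Ovens"),
  days_lasted = c(1, 67, 52, 21, 51)
)
loadouts_s <- tibble::tibble(
  season = c(2, 4, 6, 1, 4),
  name   = c("Mary Kate Green", "Pete Brockdorff", "Tim Backus",
             "Wayne Russell", "Josh Richardson"),
  item   = c("Axe", "Trapping wire", "Pot", "Fishing gear", "Bivy bag")
)

right   <- right_join(survivalist_s, loadouts_s, by = c("season", "name"))
swapped <- left_join(loadouts_s, survivalist_s, by = c("season", "name"))

# Align column order and row order before comparing
swapped_aligned <- swapped %>% select(all_of(names(right))) %>% arrange(name)
right_aligned   <- right %>% arrange(name)

same <- isTRUE(all.equal(as.data.frame(right_aligned),
                         as.data.frame(swapped_aligned)))
same # TRUE
```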

6. full_join

Describe the resulting data:

How is it different from the original two datasets?

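Every row from both samples is kept, giving 9 rows: the one matched row (Josh Richardson) is complete, the four unmatched survivalist_s rows have NA for item, and the four unmatched loadouts_s rows have NA for days_lasted.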

survivalist_s %>% full_join(loadouts_s, by = c("season", "name"))
## # A tibble: 9 × 4
##   season name               days_lasted item         
##    <dbl> <chr>                    <dbl> <chr>        
## 1      4 Josh Richardson              1 Bivy bag     
## 2      8 Colter Barnes               67 <NA>         
## 3      6 Nikki van Schyndel          52 <NA>         
## 4      2 Mike Lowe                   21 <NA>         
## 5      3 Greg Ovens                  51 <NA>         
## 6      2 Mary Kate Green             NA Axe          
## 7      4 Pete Brockdorff             NA Trapping wire
## 8      6 Tim Backus                  NA Pot          
## 9      1 Wayne Russell               NA Fishing gear
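The 9 rows can be accounted for as 1 matched row plus 4 rows found only in survivalist_s plus 4 found only in loadouts_s. The sketch below checks that arithmetic on hand-typed copies of the section 2 samples (an assumption so the block is self-contained); both, only_x, only_y, and full are illustrative names. The identity relies on each key appearing at most once per table, as it does in these samples.

```r
library(dplyr)

# Hand-typed copies of the two samples printed in section 2
survivalist_s <- tibble::tibble(
  season      = c(4, 8, 6, 2, 3),
  name        = c("Josh Richardson", "Colter Barnes", "Nikki van Schyndel",
                  "Mike Lowe", "Greg Ovens"),
  days_lasted = c(1, 67, 52, 21, 51)
)
loadouts_s <- tibble::tibble(
  season = c(2, 4, 6, 1, 4),
  name   = c("Mary Kate Green", "Pete Brockdorff", "Tim Backus",
             "Wayne Russell", "Josh Richardson"),
  item   = c("Axe", "Trapping wire", "Pot", "Fishing gear", "Bivy bag")
)

keys <- c("season", "name")

both   <- inner_join(survivalist_s, loadouts_s, by = keys) # matched rows
only_x <- anti_join(survivalist_s, loadouts_s, by = keys)  # only in survivalist_s
only_y <- anti_join(loadouts_s, survivalist_s, by = keys)  # only in loadouts_s
full   <- full_join(survivalist_s, loadouts_s, by = keys)

nrow(both) + nrow(only_x) + nrow(only_y) # 1 + 4 + 4 = 9
nrow(full)                               # 9
```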

7. semi_join

Describe the resulting data:

How is it different from the original two datasets?

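Only the rows of survivalist_s that have a match in loadouts_s are kept, which is one row here. No columns from loadouts_s are added, so the result has the same three columns as survivalist_s and no item column.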

survivalist_s %>% semi_join(loadouts_s, by = c("season", "name"))
## # A tibble: 1 × 3
##   season name            days_lasted
##    <dbl> <chr>                 <dbl>
## 1      4 Josh Richardson           1
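Because semi_join only filters, it never repeats a row of the first table, even when the second table holds several matches for the same key. An inner join does repeat it. A minimal sketch with hypothetical toy tables (people, items, and their values are invented for illustration):

```r
library(dplyr)

# Hypothetical toy tables, not rows from the real data
people <- tibble::tibble(
  season      = c(1, 1),
  name        = c("Person A", "Person B"),
  days_lasted = c(10, 20)
)
items <- tibble::tibble(
  season = c(1, 1),
  name   = c("Person A", "Person A"), # two matches for the same key
  item   = c("item 1", "item 2")
)

semi  <- semi_join(people, items, by = c("season", "name"))
inner <- inner_join(people, items, by = c("season", "name"))

nrow(semi)  # 1: Person A is kept once, with no item column
nrow(inner) # 2: Person A appears once per matching item
```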

8. anti_join

Describe the resulting data:

How is it different from the original two datasets?

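Only the four rows of survivalist_s that have no match in loadouts_s are kept. No columns from loadouts_s are added, so there is no item column and there are no NA values.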

survivalist_s %>% anti_join(loadouts_s, by = c("season", "name"))
## # A tibble: 4 × 3
##   season name               days_lasted
##    <dbl> <chr>                    <dbl>
## 1      8 Colter Barnes               67
## 2      6 Nikki van Schyndel          52
## 3      2 Mike Lowe                   21
## 4      3 Greg Ovens                  51
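Taken together, semi_join and anti_join split survivalist_s into matched and unmatched rows, so stacking the two results should give back the original sample. The sketch below checks this on a hand-typed copy of the section 2 samples (an assumption so the block runs on its own); semi, anti, rebuilt, and same are illustrative names.

```r
library(dplyr)

# Hand-typed copies of the two samples printed in section 2
survivalist_s <- tibble::tibble(
  season      = c(4, 8, 6, 2, 3),
  name        = c("Josh Richardson", "Colter Barnes", "Nikki van Schyndel",
                  "Mike Lowe", "Greg Ovens"),
  days_lasted = c(1, 67, 52, 21, 51)
)
loadouts_s <- tibble::tibble(
  season = c(2, 4, 6, 1, 4),
  name   = c("Mary Kate Green", "Pete Brockdorff", "Tim Backus",
             "Wayne Russell", "Josh Richardson"),
  item   = c("Axe", "Trapping wire", "Pot", "Fishing gear", "Bivy bag")
)

semi <- semi_join(survivalist_s, loadouts_s, by = c("season", "name"))
anti <- anti_join(survivalist_s, loadouts_s, by = c("season", "name"))

# Stack the matched and unmatched rows, then compare with the original
rebuilt <- bind_rows(semi, anti)
same <- isTRUE(all.equal(
  as.data.frame(arrange(rebuilt, name)),
  as.data.frame(arrange(survivalist_s, name))
))

nrow(semi) + nrow(anti) # 1 + 4 = 5
same                    # TRUE
```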