1. Import your data

Import two related datasets from the TidyTuesday project.

survivalists <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-01-24/survivalists.csv')
## Rows: 94 Columns: 16
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (10): name, gender, city, state, country, reason_tapped_out, reason_cate...
## dbl  (5): season, age, result, days_lasted, day_linked_up
## lgl  (1): medically_evacuated
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
loadouts <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-01-24/loadouts.csv')
## Rows: 940 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): version, name, item_detailed, item
## dbl (2): season, item_number
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

2. Make data small

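Describe the two datasets:

Data 1: survivalists has 94 rows and 16 columns. Each row describes a survivalist in a given season, with columns such as season, name, age, result, and days_lasted.

Data 2: loadouts has 940 rows and 6 columns. Each row describes one item carried by a survivalist in a given season, with columns such as season, name, item_number, item_detailed, and item.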

library(dplyr) # provides %>%, select(), sample_n(), and the join functions
set.seed(2718)
survivalists_small <- survivalists %>% select(season, name, age) %>% sample_n(10)
loadouts_small <- loadouts %>% select(season, name, item_detailed) %>% sample_n(10)

survivalists_small
## # A tibble: 10 × 3
##    season name              age
##     <dbl> <chr>           <dbl>
##  1      5 Sam Larson         24
##  2      9 Jessie Krebs       49
##  3      5 Britt Ahart        41
##  4      3 Dave Nessia        49
##  5      9 Tom Garstang       35
##  6      8 Tim Madsen         48
##  7      7 Joe Nicholas       31
##  8      4 Josh Richardson    19
##  9      3 Callie North       27
## 10      8 Nate Weber         47
loadouts_small
## # A tibble: 10 × 3
##    season name                item_detailed         
##     <dbl> <chr>               <chr>                 
##  1      9 Benki Hill          Trapping wire         
##  2      6 Nikki van Schyndel  Trapping wire         
##  3      6 Barry Karcher       Sleeping bag          
##  4      5 Brad Richardson     Sleeping bag          
##  5      4 Pete Brockdorff     Gillnet – 12′ x 4′    
##  6      7 Amos Rodriguez      Fishing line and hooks
##  7      3 Megan Hanacek       Gillnet               
##  8      4 Dave Whipple        Tarp – 12′ x 12′      
##  9      9 Juan Pablo Quinonez Axe                   
## 10      4 Jesse Bosdell       Rations
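
The two samples above are drawn independently, so they do not share any season and name pair, and every join below finds no matches. A minimal sketch of one way to draw samples that are guaranteed to overlap is shown here. The tables survivalists_toy and loadouts_toy and the names ending in _demo are hypothetical stand-ins, not the TidyTuesday data, and semi_join() is the filtering join covered in section 7.

```r
library(dplyr)

set.seed(2718)

# Hypothetical stand-in tables (made-up values, not the real data):
# 10 people, each with 3 items.
survivalists_toy <- tibble(
  season = rep(1:2, each = 5),
  name   = paste("Person", 1:10),
  age    = 21:30
)
loadouts_toy <- tibble(
  season        = rep(rep(1:2, each = 5), each = 3),
  name          = rep(paste("Person", 1:10), each = 3),
  item_detailed = rep(c("Axe", "Tarp", "Saw"), times = 10)
)

# Sample the survivalists first ...
survivalists_demo <- survivalists_toy %>%
  select(season, name, age) %>%
  slice_sample(n = 4)

# ... then sample only loadout rows whose key is in that sample.
loadouts_demo <- loadouts_toy %>%
  select(season, name, item_detailed) %>%
  semi_join(survivalists_demo, by = c("season", "name")) %>%
  slice_sample(n = 6)

# Every sampled loadout row now has a matching survivalist.
overlap_check <- inner_join(loadouts_demo, survivalists_demo,
                            by = c("season", "name"))
nrow(overlap_check)
```

With samples built this way, each join type returns a visibly different result instead of all-NA columns or empty tibbles.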

3. inner_join

Describe the resulting data:

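How is it different from the original two datasets? - It has 0 rows, because no season and name pair appears in both samples. It has 4 columns: the two key columns (season, name) plus item_detailed from loadouts_small and age from survivalists_small.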

inner_joined_data <- inner_join(loadouts_small, survivalists_small)
## Joining with `by = join_by(season, name)`
inner_joined_data
## # A tibble: 0 × 4
## # ℹ 4 variables: season <dbl>, name <chr>, item_detailed <chr>, age <dbl>
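
Because the empty result above does not show what inner_join() keeps when keys do match, here is a minimal sketch on hypothetical toy tables. The names loadouts_toy and survivalists_toy and all values are made up for illustration.

```r
library(dplyr)

# Hypothetical toy tables with one shared key: (1, "Ann").
loadouts_toy <- tibble(
  season = c(1, 1, 2),
  name   = c("Ann", "Ann", "Bob"),
  item   = c("Axe", "Tarp", "Saw")
)
survivalists_toy <- tibble(
  season = c(1, 3),
  name   = c("Ann", "Cal"),
  age    = c(30, 40)
)

# Only rows whose key appears in both tables survive:
# Ann's two items are kept, Bob and Cal are dropped.
inner_toy <- inner_join(loadouts_toy, survivalists_toy,
                        by = c("season", "name"))
inner_toy
```

Passing by explicitly also avoids the "Joining with" message and makes the key columns visible in the code.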

4. left_join

Describe the resulting data:

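How is it different from the original two datasets? - It keeps all 10 rows of loadouts_small and adds the age column from survivalists_small. No rows are added, and age is NA in every row because no key matched.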

left_joined_data <- left_join(loadouts_small, survivalists_small)
## Joining with `by = join_by(season, name)`
left_joined_data
## # A tibble: 10 × 4
##    season name                item_detailed            age
##     <dbl> <chr>               <chr>                  <dbl>
##  1      9 Benki Hill          Trapping wire             NA
##  2      6 Nikki van Schyndel  Trapping wire             NA
##  3      6 Barry Karcher       Sleeping bag              NA
##  4      5 Brad Richardson     Sleeping bag              NA
##  5      4 Pete Brockdorff     Gillnet – 12′ x 4′        NA
##  6      7 Amos Rodriguez      Fishing line and hooks    NA
##  7      3 Megan Hanacek       Gillnet                   NA
##  8      4 Dave Whipple        Tarp – 12′ x 12′          NA
##  9      9 Juan Pablo Quinonez Axe                       NA
## 10      4 Jesse Bosdell       Rations                   NA
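
In the output above every age is NA, so a small sketch with a partial match may make the behavior of left_join() easier to see. The toy tables and their values are hypothetical.

```r
library(dplyr)

# Hypothetical toy tables with one shared key: (1, "Ann").
loadouts_toy <- tibble(
  season = c(1, 1, 2),
  name   = c("Ann", "Ann", "Bob"),
  item   = c("Axe", "Tarp", "Saw")
)
survivalists_toy <- tibble(
  season = c(1, 3),
  name   = c("Ann", "Cal"),
  age    = c(30, 40)
)

# All rows of the left table are kept. Ann's rows pick up her age;
# Bob has no match, so his age is NA. Cal is not in the left table
# and therefore does not appear.
left_toy <- left_join(loadouts_toy, survivalists_toy,
                      by = c("season", "name"))
left_toy
```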

5. right_join

Describe the resulting data:

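How is it different from the original two datasets? - It keeps all 10 rows of survivalists_small and adds the item_detailed column from loadouts_small. item_detailed is NA in every row because no key matched.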

right_joined_data <- right_join(loadouts_small, survivalists_small)
## Joining with `by = join_by(season, name)`
right_joined_data
## # A tibble: 10 × 4
##    season name            item_detailed   age
##     <dbl> <chr>           <chr>         <dbl>
##  1      5 Sam Larson      <NA>             24
##  2      9 Jessie Krebs    <NA>             49
##  3      5 Britt Ahart     <NA>             41
##  4      3 Dave Nessia     <NA>             49
##  5      9 Tom Garstang    <NA>             35
##  6      8 Tim Madsen      <NA>             48
##  7      7 Joe Nicholas    <NA>             31
##  8      4 Josh Richardson <NA>             19
##  9      3 Callie North    <NA>             27
## 10      8 Nate Weber      <NA>             47
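
A right join is a left join with the roles of the two tables swapped. The sketch below, on hypothetical toy tables, compares right_join(x, y) with left_join(y, x): both keep every row of the survivalists table, and only the column order differs.

```r
library(dplyr)

# Hypothetical toy tables with one shared key: (1, "Ann").
loadouts_toy <- tibble(
  season = c(1, 1, 2),
  name   = c("Ann", "Ann", "Bob"),
  item   = c("Axe", "Tarp", "Saw")
)
survivalists_toy <- tibble(
  season = c(1, 3),
  name   = c("Ann", "Cal"),
  age    = c(30, 40)
)

# Keep every row of the right table (survivalists_toy).
right_toy <- right_join(loadouts_toy, survivalists_toy,
                        by = c("season", "name"))

# Same rows, obtained by swapping the arguments of left_join().
left_swapped_toy <- left_join(survivalists_toy, loadouts_toy,
                              by = c("season", "name"))

names(right_toy)        # item comes before age
names(left_swapped_toy) # age comes before item
```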

6. full_join

Describe the resulting data:

How is it different from the original two datasets? - It includes all rows from both datasets: 20 rows, because none of the keys matched. When a row appears in only one dataset, the columns that come from the other dataset are filled with NA.

full_joined_data <- full_join(loadouts_small, survivalists_small)
## Joining with `by = join_by(season, name)`
full_joined_data
## # A tibble: 20 × 4
##    season name                item_detailed            age
##     <dbl> <chr>               <chr>                  <dbl>
##  1      9 Benki Hill          Trapping wire             NA
##  2      6 Nikki van Schyndel  Trapping wire             NA
##  3      6 Barry Karcher       Sleeping bag              NA
##  4      5 Brad Richardson     Sleeping bag              NA
##  5      4 Pete Brockdorff     Gillnet – 12′ x 4′        NA
##  6      7 Amos Rodriguez      Fishing line and hooks    NA
##  7      3 Megan Hanacek       Gillnet                   NA
##  8      4 Dave Whipple        Tarp – 12′ x 12′          NA
##  9      9 Juan Pablo Quinonez Axe                       NA
## 10      4 Jesse Bosdell       Rations                   NA
## 11      5 Sam Larson          <NA>                      24
## 12      9 Jessie Krebs        <NA>                      49
## 13      5 Britt Ahart         <NA>                      41
## 14      3 Dave Nessia         <NA>                      49
## 15      9 Tom Garstang        <NA>                      35
## 16      8 Tim Madsen          <NA>                      48
## 17      7 Joe Nicholas        <NA>                      31
## 18      4 Josh Richardson     <NA>                      19
## 19      3 Callie North        <NA>                      27
## 20      8 Nate Weber          <NA>                      47
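
The 20 rows above are simply the two samples stacked, since nothing matched. A small sketch on hypothetical toy tables shows that matched keys are combined into single rows rather than appearing twice.

```r
library(dplyr)

# Hypothetical toy tables with one shared key: (1, "Ann").
loadouts_toy <- tibble(
  season = c(1, 1, 2),
  name   = c("Ann", "Ann", "Bob"),
  item   = c("Axe", "Tarp", "Saw")
)
survivalists_toy <- tibble(
  season = c(1, 3),
  name   = c("Ann", "Cal"),
  age    = c(30, 40)
)

# 3 + 2 input rows give 4 output rows, not 5: Ann's two loadout
# rows are merged with her survivalist row, Bob gets age = NA,
# and Cal gets item = NA.
full_toy <- full_join(loadouts_toy, survivalists_toy,
                      by = c("season", "name"))
full_toy
```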

7. semi_join

Describe the resulting data:

How is it different from the original two datasets? - It keeps only the columns of loadouts_small. It returns the rows of loadouts_small that have a match in survivalists_small; there are none here, so the result has 0 rows.

semi_joined_data <- semi_join(loadouts_small, survivalists_small)
## Joining with `by = join_by(season, name)`
semi_joined_data
## # A tibble: 0 × 3
## # ℹ 3 variables: season <dbl>, name <chr>, item_detailed <chr>
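
Unlike inner_join(), semi_join() only filters the left table and never adds columns. A minimal sketch on hypothetical toy tables:

```r
library(dplyr)

# Hypothetical toy tables with one shared key: (1, "Ann").
loadouts_toy <- tibble(
  season = c(1, 1, 2),
  name   = c("Ann", "Ann", "Bob"),
  item   = c("Axe", "Tarp", "Saw")
)
survivalists_toy <- tibble(
  season = c(1, 3),
  name   = c("Ann", "Cal"),
  age    = c(30, 40)
)

# Keep loadout rows that have a matching survivalist.
# The age column from survivalists_toy is not added.
semi_toy <- semi_join(loadouts_toy, survivalists_toy,
                      by = c("season", "name"))
semi_toy
```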

8. anti_join

Describe the resulting data:

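How is it different from the original two datasets? - It keeps only the columns of loadouts_small and returns the rows that have no match in survivalists_small. Because nothing matched, all 10 rows of loadouts_small are returned.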

anti_joined_data <- anti_join(loadouts_small, survivalists_small)
## Joining with `by = join_by(season, name)`
anti_joined_data
## # A tibble: 10 × 3
##    season name                item_detailed         
##     <dbl> <chr>               <chr>                 
##  1      9 Benki Hill          Trapping wire         
##  2      6 Nikki van Schyndel  Trapping wire         
##  3      6 Barry Karcher       Sleeping bag          
##  4      5 Brad Richardson     Sleeping bag          
##  5      4 Pete Brockdorff     Gillnet – 12′ x 4′    
##  6      7 Amos Rodriguez      Fishing line and hooks
##  7      3 Megan Hanacek       Gillnet               
##  8      4 Dave Whipple        Tarp – 12′ x 12′      
##  9      9 Juan Pablo Quinonez Axe                   
## 10      4 Jesse Bosdell       Rations
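
anti_join() is the complement of semi_join(): together they split the left table into matched and unmatched rows. The sketch below checks this on hypothetical toy tables.

```r
library(dplyr)

# Hypothetical toy tables with one shared key: (1, "Ann").
loadouts_toy <- tibble(
  season = c(1, 1, 2),
  name   = c("Ann", "Ann", "Bob"),
  item   = c("Axe", "Tarp", "Saw")
)
survivalists_toy <- tibble(
  season = c(1, 3),
  name   = c("Ann", "Cal"),
  age    = c(30, 40)
)

# Loadout rows with no matching survivalist: only Bob.
anti_toy <- anti_join(loadouts_toy, survivalists_toy,
                      by = c("season", "name"))

# Loadout rows with a matching survivalist: Ann's two rows.
semi_toy <- semi_join(loadouts_toy, survivalists_toy,
                      by = c("season", "name"))

# The two pieces account for every row of the left table.
nrow(anti_toy) + nrow(semi_toy) == nrow(loadouts_toy)
```

An anti join like this is a quick way to find keys that will come back as NA in a left join.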