Week 9: Apply it to your data 8

1. Import your data

Import two related datasets from the TidyTuesday project.

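# Load the tidyverse: provides %>%, select(), sample_n(), tribble() and the join verbs used below
library(tidyverse)
survivalists <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-01-24/survivalists.csv')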

## Rows: 94 Columns: 16
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (10): name, gender, city, state, country, reason_tapped_out, reason_cate...
## dbl  (5): season, age, result, days_lasted, day_linked_up
## lgl  (1): medically_evacuated
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

loadouts <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2023/2023-01-24/loadouts.csv')

## Rows: 940 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): version, name, item_detailed, item
## dbl (2): season, item_number
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

2. Make data small

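Describe the two datasets:

Data 1: survivalists

Columns: 16 (3 kept in survivalists_small: season, name, age)
Rows: 94 (10 sampled into survivalists_small)

Data 2: loadouts

Columns: 6 (3 kept in loadouts_small: season, name, item_detailed)
Rows: 940 (10 sampled into loadouts_small)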

set.seed(2718)
survivalists_small <- survivalists %>% select(season, name, age) %>% sample_n(10)
loadouts_small <- loadouts %>% select(season, name, item_detailed) %>% sample_n(10)

survivalists_small

## # A tibble: 10 × 3
##    season name              age
##     <dbl> <chr>           <dbl>
##  1      5 Sam Larson         24
##  2      9 Jessie Krebs       49
##  3      5 Britt Ahart        41
##  4      3 Dave Nessia        49
##  5      9 Tom Garstang       35
##  6      8 Tim Madsen         48
##  7      7 Joe Nicholas       31
##  8      4 Josh Richardson    19
##  9      3 Callie North       27
## 10      8 Nate Weber         47

loadouts_small

## # A tibble: 10 × 3
##    season name                item_detailed         
##     <dbl> <chr>               <chr>                 
##  1      9 Benki Hill          Trapping wire         
##  2      6 Nikki van Schyndel  Trapping wire         
##  3      6 Barry Karcher       Sleeping bag          
##  4      5 Brad Richardson     Sleeping bag          
##  5      4 Pete Brockdorff     Gillnet – 12′ x 4′    
##  6      7 Amos Rodriguez      Fishing line and hooks
##  7      3 Megan Hanacek       Gillnet               
##  8      4 Dave Whipple        Tarp – 12′ x 12′      
##  9      9 Juan Pablo Quinonez Axe                   
## 10      4 Jesse Bosdell       Rations

3. inner_join

Describe the resulting data:

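Columns: 3 (key, val_x, val_y)
Rows: 2

How is it different from the original two datasets?

It keeps only the keys found in both x and y (1 and 2) and puts val_x and val_y side by side. Key 3 from x and key 4 from y are dropped.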

x <- tribble(
  ~key, ~val_x,
     1, "x1",
     2, "x2",
     3, "x3"
)
y <- tribble(
  ~key, ~val_y,
     1, "y1",
     2, "y2",
     4, "y3"
)

inner_join(x, y)

## Joining with `by = join_by(key)`

## # A tibble: 2 × 3
##     key val_x val_y
##   <dbl> <chr> <chr>
## 1     1 x1    y1   
## 2     2 x2    y2
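The call above lets dplyr guess the key from the shared column name, which is why it prints a "Joining with" message. As a minimal sketch, assuming the same toy tables x and y, the key can be named explicitly with by = "key" so the match rule is visible in the code:

```r
library(dplyr)
library(tibble)

# Same toy tables as in the example above
x <- tribble(
  ~key, ~val_x,
     1, "x1",
     2, "x2",
     3, "x3"
)
y <- tribble(
  ~key, ~val_y,
     1, "y1",
     2, "y2",
     4, "y3"
)

# Name the join key explicitly instead of relying on the default guess
joined <- inner_join(x, y, by = "key")

# Only keys present in both tables survive the inner join
joined$key
dim(joined)
```

Naming the key matters more on real data, where two tables may share column names that are not meant to be matched.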

4. left_join

Describe the resulting data:

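Columns: 3 (key, val_x, val_y)
Rows: 3

How is it different from the original two datasets?

All three rows of x are kept. Key 3 has no match in y, so its val_y is NA. Key 4 from y is dropped.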

left_join(x, y)

## Joining with `by = join_by(key)`

## # A tibble: 3 × 3
##     key val_x val_y
##   <dbl> <chr> <chr>
## 1     1 x1    y1   
## 2     2 x2    y2   
## 3     3 x3    <NA>
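One practical use of the NA values a left join produces is finding which rows of x had no partner in y. The sketch below assumes the same toy tables; the names left and unmatched are only illustrative:

```r
library(dplyr)
library(tibble)

x <- tribble(
  ~key, ~val_x,
     1, "x1",
     2, "x2",
     3, "x3"
)
y <- tribble(
  ~key, ~val_y,
     1, "y1",
     2, "y2",
     4, "y3"
)

# A left join keeps every row of x
left <- left_join(x, y, by = "key")

# Rows of x with no partner in y carry NA in the column that came from y
unmatched <- filter(left, is.na(val_y))
unmatched
```

This check is only reliable when val_y has no missing values of its own in y, which holds for the toy table here.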

5. right_join

Describe the resulting data:

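Columns: 3 (key, val_x, val_y)
Rows: 3

How is it different from the original two datasets?

All three rows of y are kept. Key 4 has no match in x, so its val_x is NA. Key 3 from x is dropped.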

right_join(x, y)

## Joining with `by = join_by(key)`

## # A tibble: 3 × 3
##     key val_x val_y
##   <dbl> <chr> <chr>
## 1     1 x1    y1   
## 2     2 x2    y2   
## 3     4 <NA>  y3
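A right join can be read as a left join with the two tables swapped. The following sketch, again on the toy tables, builds the result both ways and compares them after putting the columns in the same order; from_right and from_left are illustrative names:

```r
library(dplyr)
library(tibble)

x <- tribble(
  ~key, ~val_x,
     1, "x1",
     2, "x2",
     3, "x3"
)
y <- tribble(
  ~key, ~val_y,
     1, "y1",
     2, "y2",
     4, "y3"
)

# Keep every row of y, matching x where possible
from_right <- right_join(x, y, by = "key")

# Same idea with the tables swapped, then columns reordered to match
from_left <- left_join(y, x, by = "key") %>%
  select(key, val_x, val_y)

# Compare the two results column by column
identical(from_right$key, from_left$key)
identical(from_right$val_x, from_left$val_x)
identical(from_right$val_y, from_left$val_y)
```

Because of this equivalence, many analyses can stick to left_join and simply choose which table goes first.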

6. full_join

Describe the resulting data:

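Columns: 3 (key, val_x, val_y)
Rows: 4

How is it different from the original two datasets?

Every key from either table is kept (1, 2, 3 and 4), with NA wherever one side has no match.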

full_join(x, y)

## Joining with `by = join_by(key)`

## # A tibble: 4 × 3
##     key val_x val_y
##   <dbl> <chr> <chr>
## 1     1 x1    y1   
## 2     2 x2    y2   
## 3     3 x3    <NA> 
## 4     4 <NA>  y3
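The four rows of the full join can be accounted for as keys in both tables plus keys found only in x plus keys found only in y. The sketch below checks that bookkeeping on the toy tables. It assumes keys are unique within each table, as they are here; with duplicate keys the counts would not add up this simply.

```r
library(dplyr)
library(tibble)

x <- tribble(
  ~key, ~val_x,
     1, "x1",
     2, "x2",
     3, "x3"
)
y <- tribble(
  ~key, ~val_y,
     1, "y1",
     2, "y2",
     4, "y3"
)

full <- full_join(x, y, by = "key")

n_both   <- nrow(inner_join(x, y, by = "key"))  # keys in both tables
n_x_only <- nrow(anti_join(x, y, by = "key"))   # keys only in x
n_y_only <- nrow(anti_join(y, x, by = "key"))   # keys only in y

# With unique keys, the full join has one row per distinct key
nrow(full) == n_both + n_x_only + n_y_only
```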

7. semi_join

Describe the resulting data:

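Columns: 2 (key, val_x)
Rows: 2

How is it different from the original two datasets?

It returns only the rows of x whose key also appears in y (1 and 2) and adds no columns from y.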

semi_join(x, y)

## Joining with `by = join_by(key)`

## # A tibble: 2 × 2
##     key val_x
##   <dbl> <chr>
## 1     1 x1   
## 2     2 x2
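A semi join acts as a filter on x rather than a merge. As a rough illustration on the toy tables, the same rows can be obtained with filter() and %in%; semi and same_by_filter are illustrative names:

```r
library(dplyr)
library(tibble)

x <- tribble(
  ~key, ~val_x,
     1, "x1",
     2, "x2",
     3, "x3"
)
y <- tribble(
  ~key, ~val_y,
     1, "y1",
     2, "y2",
     4, "y3"
)

# Keep rows of x that have a match in y; no columns from y are added
semi <- semi_join(x, y, by = "key")

# Equivalent filter for a single key column
same_by_filter <- filter(x, key %in% y$key)

identical(semi$key, same_by_filter$key)
identical(semi$val_x, same_by_filter$val_x)
```

The filter() form only covers a single key column; semi_join also handles keys made of several columns.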

8. anti_join

Describe the resulting data:

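Columns: 2 (key, val_x)
Rows: 1

How is it different from the original two datasets?

It returns only the row of x whose key (3) has no match in y and adds no columns from y.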

anti_join(x, y)

## Joining with `by = join_by(key)`

## # A tibble: 1 × 2
##     key val_x
##   <dbl> <chr>
## 1     3 x3
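An anti join is also a quick way to check for key mismatches before joining real data. The sketch below applies it to the season and name values copied from the two printed samples in section 2. The table names survivalists_keys and loadouts_keys are made up for this illustration, and the choice of season plus name as the key is an assumption based on the columns the two samples share.

```r
library(dplyr)
library(tibble)

# Keys copied from the printed survivalists_small sample
survivalists_keys <- tribble(
  ~season, ~name,
        5, "Sam Larson",
        9, "Jessie Krebs",
        5, "Britt Ahart",
        3, "Dave Nessia",
        9, "Tom Garstang",
        8, "Tim Madsen",
        7, "Joe Nicholas",
        4, "Josh Richardson",
        3, "Callie North",
        8, "Nate Weber"
)

# Keys copied from the printed loadouts_small sample
loadouts_keys <- tribble(
  ~season, ~name,
        9, "Benki Hill",
        6, "Nikki van Schyndel",
        6, "Barry Karcher",
        5, "Brad Richardson",
        4, "Pete Brockdorff",
        7, "Amos Rodriguez",
        3, "Megan Hanacek",
        4, "Dave Whipple",
        9, "Juan Pablo Quinonez",
        4, "Jesse Bosdell"
)

# Survivalists in the sample with no loadout row in the other sample
unmatched <- anti_join(survivalists_keys, loadouts_keys, by = c("season", "name"))

# Rows an inner join on the same key would return
matched <- inner_join(survivalists_keys, loadouts_keys, by = c("season", "name"))

nrow(unmatched)
nrow(matched)
```

Because the two samples were drawn independently, every sampled survivalist is unmatched and the inner join is empty. One possible way around this, not tried here, would be to sample survivalists first and then keep only their loadout rows, for example with semi_join(loadouts, survivalists_small, by = c("season", "name")).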

Cam Paquette

2022-10-23