1. Import your data

Import two related datasets from the TidyTuesday project.

sevens <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2022/2022-05-24/sevens.csv')
## Rows: 7966 Columns: 16
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (10): team_1, score_1, score_2, team_2, venue, tournament, stage, winne...
## dbl   (5): row_id, t1_game_no, t2_game_no, series, margin
## date  (1): date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
fifteens <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2022/2022-05-24/fifteens.csv')
## Rows: 1468 Columns: 15
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (7): team_1, team_2, venue, tournament, home_away_win, winner, loser
## dbl  (7): test_no, score_1, score_2, home_test_no, away_test_no, series_no, ...
## date (1): date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
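
The message above suggests supplying column types explicitly. The following is a minimal sketch of that idea, not the worksheet's own import: `csv_text` is a hypothetical inline CSV standing in for the remote file, and its values (including the "?" score) are made up for illustration. It assumes readr 2.0 or later, where `I()` marks a string as literal data.

```r
library(readr)

# Hypothetical inline CSV; the real files are read from the URLs above
csv_text <- I("row_id,team_1,score_1,date\n1,France,12,2022-05-24\n2,Spain,?,2022-05-25\n")

# Declaring col_types fixes the parsing and suppresses the
# "Column specification" message
matches <- read_csv(
  csv_text,
  col_types = cols(
    row_id  = col_double(),
    team_1  = col_character(),
    score_1 = col_character(),  # kept as text, as it was guessed for sevens
    date    = col_date()
  )
)

matches
```

Passing `show_col_types = FALSE` instead keeps the guessed types and only silences the message.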

2. Make data small

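Describe the two datasets:

Data 1: fifteens. 1,468 rows and 15 columns: 7 character columns (including team_1, team_2, venue, tournament, and winner), 7 numeric columns (including test_no, score_1, and score_2), and 1 date column.

Data 2: sevens. 7,966 rows and 16 columns: 10 character columns (including team_1, team_2, score_1, score_2, and winner), 5 numeric columns (row_id, t1_game_no, t2_game_no, series, margin), and 1 date column. Note that score_1 and score_2 were read as character here but as numeric in fifteens.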

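library(dplyr)  # provides %>%, select(), sample_n(), and the join functions

# Fix the random seed so the two samples are reproducible
set.seed(1234)
fifteens_small <- fifteens %>% select(test_no, team_1, winner) %>% sample_n(10)
sevens_small <- sevens %>% select(margin, team_1, winner) %>% sample_n(10)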

fifteens_small
## # A tibble: 10 × 3
##    test_no team_1         winner        
##      <dbl> <chr>          <chr>         
##  1    1308 Scotland       Wales         
##  2    1018 Kazakhstan     Kazakhstan    
##  3    1125 Switzerland    Switzerland   
##  4    1004 New Zealand    New Zealand   
##  5     623 Italy          Ireland       
##  6     905 England        England       
##  7     645 France         France        
##  8     934 Ireland        Ireland       
##  9     400 Germany        Netherlands   
## 10     900 Cayman Islands Cayman Islands
sevens_small
## # A tibble: 10 × 3
##    margin team_1      winner     
##     <dbl> <chr>       <chr>      
##  1     38 China       China      
##  2     19 Australia   Australia  
##  3      3 France      France     
##  4      5 Australia   New Zealand
##  5     12 Taiwan      Taiwan     
##  6     29 Australia   Australia  
##  7      5 Ireland     Ireland    
##  8      5 Spain       Spain      
##  9     30 New Zealand New Zealand
## 10     21 China       China
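
The sampling step relies on `set.seed()` to make the random draw repeatable. As a small sketch on a made-up tibble (`toy` is hypothetical, not part of the rugby data), resetting the seed before each draw returns the same rows:

```r
library(dplyr)

# Hypothetical data: 100 rows with a single id column
toy <- tibble(id = 1:100)

set.seed(1234)
first_draw <- toy %>% sample_n(10)

set.seed(1234)
second_draw <- toy %>% sample_n(10)

# TRUE when both draws picked the same rows in the same order
identical(first_draw, second_draw)
```

In the worksheet the seed is set once and the two samples are drawn in sequence, so the rows in sevens_small depend on the fifteens sample having been drawn first.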

3. inner_join

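Describe the resulting data: 3 rows and 4 columns (test_no, team_1, winner, margin). Only the team_1 and winner combinations that appear in both samples are kept. A match here means the same pair of values occurs in both tables, not that the rows describe the same game.

How is it different from the original two datasets? Each sample has 10 rows and 3 columns. The join drops the 7 unmatched rows from each side and places margin from sevens_small next to test_no from fifteens_small.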

fifteens_small %>% inner_join(sevens_small, by = c("team_1", "winner"))
## # A tibble: 3 × 4
##   test_no team_1      winner      margin
##     <dbl> <chr>       <chr>        <dbl>
## 1    1004 New Zealand New Zealand     30
## 2     645 France      France           3
## 3     934 Ireland     Ireland          5
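
The `by` argument decides which columns must agree for rows to match. The sketch below uses two tiny tibbles, `x` and `y`, which are hypothetical stand-ins: most values are borrowed from the printed samples, but the Italy row in `y` is invented for illustration.

```r
library(dplyr)

x <- tibble(
  team_1  = c("France", "Italy"),
  winner  = c("France", "Ireland"),
  test_no = c(645, 623)
)
y <- tibble(
  team_1 = c("France", "Italy"),
  winner = c("France", "Italy"),   # the Italy row is made up
  margin = c(3, 7)
)

# Both keys must agree: only the France row survives
both_keys <- x %>% inner_join(y, by = c("team_1", "winner"))

# Only team_1 must agree: both rows survive, and the non-key
# winner columns are disambiguated as winner.x and winner.y
team_only <- x %>% inner_join(y, by = "team_1")

both_keys
team_only
```

Joining on fewer keys loosens the match, so it can only keep the same number of rows or more.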

4. left_join

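Describe the resulting data: 10 rows and 4 columns. Every row of fifteens_small is kept in its original order, and margin is filled in for the 3 matched rows and NA for the other 7.

How is it different from the original two datasets? It has the same rows as fifteens_small plus one extra column, margin. The 7 rows of sevens_small that have no match do not appear.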

fifteens_small %>% left_join(sevens_small, by = c("team_1", "winner"))
## # A tibble: 10 × 4
##    test_no team_1         winner         margin
##      <dbl> <chr>          <chr>           <dbl>
##  1    1308 Scotland       Wales              NA
##  2    1018 Kazakhstan     Kazakhstan         NA
##  3    1125 Switzerland    Switzerland        NA
##  4    1004 New Zealand    New Zealand        30
##  5     623 Italy          Ireland            NA
##  6     905 England        England            NA
##  7     645 France         France              3
##  8     934 Ireland        Ireland             5
##  9     400 Germany        Netherlands        NA
## 10     900 Cayman Islands Cayman Islands     NA
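
One way to count how many left-hand rows went unmatched is to count the NA values in a column that comes from the right-hand table. This is a sketch on hypothetical tibbles `x` and `y` whose values are borrowed from the printed samples; it assumes the right-hand column has no missing values of its own.

```r
library(dplyr)

x <- tibble(
  team_1  = c("France", "England", "Ireland"),
  winner  = c("France", "England", "Ireland"),
  test_no = c(645, 905, 934)
)
y <- tibble(
  team_1 = c("France", "Ireland"),
  winner = c("France", "Ireland"),
  margin = c(3, 5)
)

joined <- x %>% left_join(y, by = c("team_1", "winner"))

# Rows of x with no partner in y carry NA in margin
n_unmatched <- sum(is.na(joined$margin))
n_unmatched
```

If `margin` could be NA in `y` itself, an `anti_join()` would be a safer way to find the unmatched rows.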

5. right_join

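Describe the resulting data: 10 rows and 4 columns. Every row of sevens_small is kept, and test_no is filled in for the 3 matched rows and NA for the other 7. The matched rows are listed first.

How is it different from the original two datasets? It has the same rows as sevens_small plus one extra column, test_no. The 7 rows of fifteens_small that have no match do not appear.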

fifteens_small %>% right_join(sevens_small, by = c("team_1", "winner"))
## # A tibble: 10 × 4
##    test_no team_1      winner      margin
##      <dbl> <chr>       <chr>        <dbl>
##  1    1004 New Zealand New Zealand     30
##  2     645 France      France           3
##  3     934 Ireland     Ireland          5
##  4      NA China       China           38
##  5      NA Australia   Australia       19
##  6      NA Australia   New Zealand      5
##  7      NA Taiwan      Taiwan          12
##  8      NA Australia   Australia       29
##  9      NA Spain       Spain            5
## 10      NA China       China           21
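
A right join can be read as a left join with the tables swapped. The sketch below checks this on hypothetical tibbles `x` and `y` (values borrowed from the printed samples); the two results hold the same rows and differ only in row and column order, so both are sorted and reordered before comparing.

```r
library(dplyr)

x <- tibble(
  team_1  = c("France", "Ireland", "England"),
  winner  = c("France", "Ireland", "England"),
  test_no = c(645, 934, 905)
)
y <- tibble(
  team_1 = c("France", "Ireland", "Spain"),
  winner = c("France", "Ireland", "Spain"),
  margin = c(3, 5, 5)
)
keys <- c("team_1", "winner")

via_right <- x %>% right_join(y, by = keys)
via_left  <- y %>% left_join(x, by = keys)

# Put rows and columns in a common order before comparing
tidy <- function(df) {
  df %>% arrange(team_1) %>% select(team_1, winner, test_no, margin)
}
same_rows <- isTRUE(all.equal(as.data.frame(tidy(via_right)),
                              as.data.frame(tidy(via_left))))
same_rows
```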

6. full_join

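Describe the resulting data: 17 rows and 4 columns. All rows from both samples are kept: the 10 rows of fifteens_small (3 of them matched) followed by the 7 unmatched rows of sevens_small. Missing values are filled with NA.

How is it different from the original two datasets? It is the only join here that loses no rows from either side. It has more rows than either sample and contains NA in margin or test_no wherever a row had no match.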

fifteens_small %>% full_join(sevens_small, by = c("team_1", "winner"))
## # A tibble: 17 × 4
##    test_no team_1         winner         margin
##      <dbl> <chr>          <chr>           <dbl>
##  1    1308 Scotland       Wales              NA
##  2    1018 Kazakhstan     Kazakhstan         NA
##  3    1125 Switzerland    Switzerland        NA
##  4    1004 New Zealand    New Zealand        30
##  5     623 Italy          Ireland            NA
##  6     905 England        England            NA
##  7     645 France         France              3
##  8     934 Ireland        Ireland             5
##  9     400 Germany        Netherlands        NA
## 10     900 Cayman Islands Cayman Islands     NA
## 11      NA China          China              38
## 12      NA Australia      Australia          19
## 13      NA Australia      New Zealand         5
## 14      NA Taiwan         Taiwan             12
## 15      NA Australia      Australia          29
## 16      NA Spain          Spain               5
## 17      NA China          China              21
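
The 17 rows can be accounted for as matched rows plus the unmatched rows from each side. The sketch below rebuilds the two samples by hand from the printed output above (so it does not need the download) and checks that arithmetic; `keys` and the `n_*` names are introduced only for this illustration.

```r
library(dplyr)

# Values copied from the printed fifteens_small and sevens_small
fifteens_small <- tribble(
  ~test_no, ~team_1,          ~winner,
  1308,     "Scotland",       "Wales",
  1018,     "Kazakhstan",     "Kazakhstan",
  1125,     "Switzerland",    "Switzerland",
  1004,     "New Zealand",    "New Zealand",
  623,      "Italy",          "Ireland",
  905,      "England",        "England",
  645,      "France",         "France",
  934,      "Ireland",        "Ireland",
  400,      "Germany",        "Netherlands",
  900,      "Cayman Islands", "Cayman Islands"
)
sevens_small <- tribble(
  ~margin, ~team_1,       ~winner,
  38,      "China",       "China",
  19,      "Australia",   "Australia",
  3,       "France",      "France",
  5,       "Australia",   "New Zealand",
  12,      "Taiwan",      "Taiwan",
  29,      "Australia",   "Australia",
  5,       "Ireland",     "Ireland",
  5,       "Spain",       "Spain",
  30,      "New Zealand", "New Zealand",
  21,      "China",       "China"
)
keys <- c("team_1", "winner")

n_full   <- nrow(full_join(fifteens_small, sevens_small, by = keys))
n_inner  <- nrow(inner_join(fifteens_small, sevens_small, by = keys))
n_only_x <- nrow(anti_join(fifteens_small, sevens_small, by = keys))
n_only_y <- nrow(anti_join(sevens_small, fifteens_small, by = keys))

# Matched rows + rows only in x + rows only in y
c(n_full, n_inner + n_only_x + n_only_y)
```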

7. semi_join

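Describe the resulting data: 3 rows and 3 columns (test_no, team_1, winner). These are the rows of fifteens_small that have a match in sevens_small.

How is it different from the original two datasets? It is a filtered copy of fifteens_small. Unlike inner_join, no column from sevens_small is added, so margin does not appear.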

fifteens_small %>% semi_join(sevens_small, by = c("team_1", "winner"))
## # A tibble: 3 × 3
##   test_no team_1      winner     
##     <dbl> <chr>       <chr>      
## 1    1004 New Zealand New Zealand
## 2     645 France      France     
## 3     934 Ireland     Ireland
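
A semi join and an inner join can differ in row count when the right-hand table repeats a key. In this sketch the tibbles `x` and `y` are hypothetical, and the second New Zealand row in `y` is invented to create the duplicate.

```r
library(dplyr)

x <- tibble(test_no = 1004, team_1 = "New Zealand", winner = "New Zealand")
y <- tibble(
  team_1 = c("New Zealand", "New Zealand"),
  winner = c("New Zealand", "New Zealand"),
  margin = c(30, 12)   # the second row is made up
)
keys <- c("team_1", "winner")

# inner_join returns one row per matching pair and adds margin
inner_rows <- x %>% inner_join(y, by = keys)

# semi_join returns each matching row of x once, with x's columns only
semi_rows <- x %>% semi_join(y, by = keys)

nrow(inner_rows)
nrow(semi_rows)
```

This is why a semi join is usually described as a filter on the left-hand table rather than a way of combining columns.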

8. anti_join

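Describe the resulting data: 7 rows and 3 columns (test_no, team_1, winner). These are the rows of fifteens_small that have no match in sevens_small.

How is it different from the original two datasets? It is the complement of the semi_join result: together the two results make up all 10 rows of fifteens_small. No column from sevens_small is added.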

fifteens_small %>% anti_join(sevens_small, by = c("team_1", "winner"))
## # A tibble: 7 × 3
##   test_no team_1         winner        
##     <dbl> <chr>          <chr>         
## 1    1308 Scotland       Wales         
## 2    1018 Kazakhstan     Kazakhstan    
## 3    1125 Switzerland    Switzerland   
## 4     623 Italy          Ireland       
## 5     905 England        England       
## 6     400 Germany        Netherlands   
## 7     900 Cayman Islands Cayman Islands
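
Because a semi join keeps the matched rows and an anti join keeps the unmatched ones, the two should split the left-hand table without overlap. A small sketch on hypothetical tibbles `x` and `y` (values borrowed from the printed samples):

```r
library(dplyr)

x <- tibble(
  test_no = c(645, 934, 905, 1308),
  team_1  = c("France", "Ireland", "England", "Scotland"),
  winner  = c("France", "Ireland", "England", "Wales")
)
y <- tibble(
  team_1 = c("France", "Ireland"),
  winner = c("France", "Ireland"),
  margin = c(3, 5)
)
keys <- c("team_1", "winner")

matched   <- x %>% semi_join(y, by = keys)
unmatched <- x %>% anti_join(y, by = keys)

# Every row of x lands in exactly one of the two results
nrow(matched) + nrow(unmatched) == nrow(x)
```

In the worksheet's output the same split is 3 + 7 = 10 rows.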