1. Import your data

Import two related datasets from the TidyTuesday project.

wcmatches <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-11-29/wcmatches.csv')
## Rows: 900 Columns: 15
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (11): country, city, stage, home_team, away_team, outcome, win_conditio...
## dbl   (3): year, home_score, away_score
## date  (1): date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
worldcups <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-11-29/worldcups.csv')
## Rows: 21 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (5): host, winner, second, third, fourth
## dbl (5): year, goals_scored, teams, games, attendance
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

2. Make data small

Describe the two datasets:

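Data 1: wcmatches has one row per World Cup match (900 rows, 15 columns), including the year, the two teams, and the score.

Data 2: worldcups has one row per tournament (21 rows, 10 columns), including the host, the top four teams, and totals such as goals scored and attendance.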

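library(dplyr)  # provides %>%, select(), sample_n() and the join functions

set.seed(1234)  # fix the random seed so the samples below are reproducible
wcmatches_small <- wcmatches %>% select(year, winning_team, home_score, away_score) %>% sample_n(10)
worldcups_small <- worldcups %>% select(year, winner, second) %>% sample_n(10)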

wcmatches_small
## # A tibble: 10 × 4
##     year winning_team home_score away_score
##    <dbl> <chr>             <dbl>      <dbl>
##  1  1978 <NA>                  0          0
##  2  2018 Sweden                1          0
##  3  1954 West Germany          3          2
##  4  2002 <NA>                  1          1
##  5  2006 Germany               4          2
##  6  1986 Brazil                4          0
##  7  1954 West Germany          1          6
##  8  1958 Brazil                0          3
##  9  2010 Argentina             4          1
## 10  2002 Spain                 3          1
worldcups_small
## # A tibble: 10 × 3
##     year winner       second      
##    <dbl> <chr>        <chr>       
##  1  1958 Brazil       Sweden      
##  2  1994 Brazil       Italy       
##  3  1990 West Germany Argentina   
##  4  2010 Spain        Netherlands 
##  5  1950 Uruguay      Brazil      
##  6  2002 Brazil       Germany     
##  7  1954 West Germany Hungary     
##  8  1966 England      West Germany
##  9  1998 France       Brazil      
## 10  2006 Italy        France
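
The set.seed(1234) call is what makes the two samples above repeatable. A minimal sketch of that behavior, using a hypothetical toy table (the name toy and its id column are made up for illustration and are not part of the TidyTuesday data):

```r
library(dplyr)

# Hypothetical toy table standing in for wcmatches
toy <- tibble(id = 1:20)

set.seed(1234)
first_draw <- toy %>% sample_n(5)

set.seed(1234)  # resetting the seed replays the same random draw
second_draw <- toy %>% sample_n(5)

# TRUE when both draws picked the same rows in the same order
identical(first_draw$id, second_draw$id)
```

Without the second set.seed() call the two draws would generally differ.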

3. inner_join

Describe the resulting data: The result combines the columns of both datasets, but only for rows whose year appears in both. It has 7 rows and 6 columns.

How is it different from the original two datasets?

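Unlike either original, it keeps only the matching years. The matches from 1978, 2018, and 1986 are dropped because those World Cups are not in worldcups_small, and the World Cups of 1994, 1990, 1950, 1966, and 1998 are dropped because no sampled match was played in those years. The winner and second values are repeated for every match in the same year (1954 and 2002 each appear twice).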

wcmatches_small %>% inner_join(worldcups_small)
## Joining with `by = join_by(year)`
## # A tibble: 7 × 6
##    year winning_team home_score away_score winner       second     
##   <dbl> <chr>             <dbl>      <dbl> <chr>        <chr>      
## 1  1954 West Germany          3          2 West Germany Hungary    
## 2  2002 <NA>                  1          1 Brazil       Germany    
## 3  2006 Germany               4          2 Italy        France     
## 4  1954 West Germany          1          6 West Germany Hungary    
## 5  1958 Brazil                0          3 Brazil       Sweden     
## 6  2010 Argentina             4          1 Spain        Netherlands
## 7  2002 Spain                 3          1 Brazil       Germany
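
The join above relies on dplyr guessing the key, which is why it prints the "Joining with" message. As a small sketch, the key can be named explicitly with by = "year". The tables matches and cups below are hypothetical toy data, not the sampled datasets:

```r
library(dplyr)

# Hypothetical toy tables, not the TidyTuesday data
matches <- tibble(
  year = c(1954, 1954, 1978, 2002),
  home_score = c(3, 1, 0, 1)
)
cups <- tibble(
  year = c(1954, 2002, 1994),
  winner = c("West Germany", "Brazil", "Brazil")
)

# Naming the key explicitly avoids the "Joining with `by = ...`" message
joined <- matches %>% inner_join(cups, by = "year")

# Only years present in both tables survive: 1954 (twice) and 2002
joined
```

Here 1978 is dropped because it has no row in cups, and 1994 is dropped because it has no row in matches.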

4. left_join

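Describe the resulting data: All 10 rows of wcmatches_small are kept, and the winner and second columns are added from worldcups_small wherever the year matches. The result has 10 rows and 6 columns.

How is it different from the original two datasets?

It has the same rows as wcmatches_small plus two extra columns. The matches from 1978, 2018, and 1986 have NA for winner and second because those years are not in worldcups_small, and World Cups with no sampled match do not appear at all.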

wcmatches_small %>% left_join(worldcups_small)
## Joining with `by = join_by(year)`
## # A tibble: 10 × 6
##     year winning_team home_score away_score winner       second     
##    <dbl> <chr>             <dbl>      <dbl> <chr>        <chr>      
##  1  1978 <NA>                  0          0 <NA>         <NA>       
##  2  2018 Sweden                1          0 <NA>         <NA>       
##  3  1954 West Germany          3          2 West Germany Hungary    
##  4  2002 <NA>                  1          1 Brazil       Germany    
##  5  2006 Germany               4          2 Italy        France     
##  6  1986 Brazil                4          0 <NA>         <NA>       
##  7  1954 West Germany          1          6 West Germany Hungary    
##  8  1958 Brazil                0          3 Brazil       Sweden     
##  9  2010 Argentina             4          1 Spain        Netherlands
## 10  2002 Spain                 3          1 Brazil       Germany

5. right_join

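Describe the resulting data: Every World Cup in worldcups_small is kept, and the match columns are added wherever the year matches. The result has 12 rows and 6 columns.

How is it different from the original two datasets?

It has more rows than worldcups_small because 1954 and 2002 each match two sampled games. The five World Cups with no sampled match (1994, 1990, 1950, 1966, 1998) have NA for winning_team, home_score, and away_score, and the matches from 1978, 2018, and 1986 are dropped.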

wcmatches_small %>% right_join(worldcups_small)
## Joining with `by = join_by(year)`
## # A tibble: 12 × 6
##     year winning_team home_score away_score winner       second      
##    <dbl> <chr>             <dbl>      <dbl> <chr>        <chr>       
##  1  1954 West Germany          3          2 West Germany Hungary     
##  2  2002 <NA>                  1          1 Brazil       Germany     
##  3  2006 Germany               4          2 Italy        France      
##  4  1954 West Germany          1          6 West Germany Hungary     
##  5  1958 Brazil                0          3 Brazil       Sweden      
##  6  2010 Argentina             4          1 Spain        Netherlands 
##  7  2002 Spain                 3          1 Brazil       Germany     
##  8  1994 <NA>                 NA         NA Brazil       Italy       
##  9  1990 <NA>                 NA         NA West Germany Argentina   
## 10  1950 <NA>                 NA         NA Uruguay      Brazil      
## 11  1966 <NA>                 NA         NA England      West Germany
## 12  1998 <NA>                 NA         NA France       Brazil

6. full_join

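Describe the resulting data: Every row from both datasets is kept, whether or not its year has a match in the other dataset. The result has 15 rows and 6 columns.

How is it different from the original two datasets?

It contains all 10 sampled matches plus the five World Cups that had no sampled match. Wherever one side has no matching year, that side's columns are filled with NA.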

wcmatches_small %>% full_join(worldcups_small)
## Joining with `by = join_by(year)`
## # A tibble: 15 × 6
##     year winning_team home_score away_score winner       second      
##    <dbl> <chr>             <dbl>      <dbl> <chr>        <chr>       
##  1  1978 <NA>                  0          0 <NA>         <NA>        
##  2  2018 Sweden                1          0 <NA>         <NA>        
##  3  1954 West Germany          3          2 West Germany Hungary     
##  4  2002 <NA>                  1          1 Brazil       Germany     
##  5  2006 Germany               4          2 Italy        France      
##  6  1986 Brazil                4          0 <NA>         <NA>        
##  7  1954 West Germany          1          6 West Germany Hungary     
##  8  1958 Brazil                0          3 Brazil       Sweden      
##  9  2010 Argentina             4          1 Spain        Netherlands 
## 10  2002 Spain                 3          1 Brazil       Germany     
## 11  1994 <NA>                 NA         NA Brazil       Italy       
## 12  1990 <NA>                 NA         NA West Germany Argentina   
## 13  1950 <NA>                 NA         NA Uruguay      Brazil      
## 14  1966 <NA>                 NA         NA England      West Germany
## 15  1998 <NA>                 NA         NA France       Brazil

7. semi_join

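Describe the resulting data: Only the matches whose year appears in worldcups_small are kept. The result has 7 rows and 4 columns.

How is it different from the original two datasets?

It has only the columns of wcmatches_small. No columns from worldcups_small are added, so worldcups_small acts purely as a filter on the matches.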

wcmatches_small %>% semi_join(worldcups_small)
## Joining with `by = join_by(year)`
## # A tibble: 7 × 4
##    year winning_team home_score away_score
##   <dbl> <chr>             <dbl>      <dbl>
## 1  1954 West Germany          3          2
## 2  2002 <NA>                  1          1
## 3  2006 Germany               4          2
## 4  1954 West Germany          1          6
## 5  1958 Brazil                0          3
## 6  2010 Argentina             4          1
## 7  2002 Spain                 3          1

8. anti_join

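Describe the resulting data: The first call keeps the 3 matches whose year is not in worldcups_small. The second call reverses the roles and keeps the 5 World Cups that have no sampled match.

How is it different from the original two datasets?

Each result has only the columns of its left-hand dataset and only the rows that have no match in the other dataset, which is the opposite of semi_join.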

wcmatches_small %>% anti_join(worldcups_small)
## Joining with `by = join_by(year)`
## # A tibble: 3 × 4
##    year winning_team home_score away_score
##   <dbl> <chr>             <dbl>      <dbl>
## 1  1978 <NA>                  0          0
## 2  2018 Sweden                1          0
## 3  1986 Brazil                4          0
worldcups_small %>% anti_join(wcmatches_small)
## Joining with `by = join_by(year)`
## # A tibble: 5 × 3
##    year winner       second      
##   <dbl> <chr>        <chr>       
## 1  1994 Brazil       Italy       
## 2  1990 West Germany Argentina   
## 3  1950 Uruguay      Brazil      
## 4  1966 England      West Germany
## 5  1998 France       Brazil
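
semi_join and anti_join are filtering joins: they never add columns, they only decide which rows of the left-hand table to keep. One way to see this is to compare them with filter() and %in%. The sketch below uses hypothetical toy tables (matches and cups are made up for illustration) and assumes the key column contains no missing values:

```r
library(dplyr)

# Hypothetical toy tables, not the TidyTuesday data
matches <- tibble(
  year = c(1954, 1954, 1978, 2002),
  home_score = c(3, 1, 0, 1)
)
cups <- tibble(
  year = c(1954, 2002, 1994),
  winner = c("West Germany", "Brazil", "Brazil")
)

# Keep matches whose year has a partner in cups
semi <- matches %>% semi_join(cups, by = "year")
semi_by_filter <- matches %>% filter(year %in% cups$year)

# Keep matches whose year has no partner in cups
anti <- matches %>% anti_join(cups, by = "year")
anti_by_filter <- matches %>% filter(!year %in% cups$year)

# Both comparisons are TRUE: the joins pick the same rows as the filters
identical(semi$year, semi_by_filter$year)
identical(anti$year, anti_by_filter$year)
```

Together, semi and anti split matches into two parts with no overlap, just as the 7 and 3 rows above add up to the 10 rows of wcmatches_small.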