1. Import your data

Import two related datasets from the TidyTuesday project.

game_goals <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-03-03/game_goals.csv')
## Rows: 49384 Columns: 25
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr   (7): player, age, team, at, opp, location, outcome
## dbl  (17): season, rank, game_num, goals, assists, points, plus_minus, penal...
## date  (1): date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
season_goals <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-03-03/season_goals.csv')
## Rows: 4810 Columns: 23
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (9): position, hand, player, years, status, season, team, league, headshot
## dbl (14): rank, total_goals, yr_start, age, season_games, goals, assists, po...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

2. Make data small

Describe the two datasets:

Data 1 ## Game-level records for each player; the sample keeps player, goals, and assists
* Columns: 3
* Rows: 20

Data 2 ## Season-level records for each player; the sample keeps the same three columns
* Columns: 3
* Rows: 20

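library(dplyr)  # provides %>%, sample_n(), select(), and the join functions

# Keep a random sample of 20 rows and three columns from each dataset
Games <- game_goals %>%
    sample_n(20) %>%
    select(player, goals, assists)
Season <- season_goals %>%
    sample_n(20) %>%
    select(player, goals, assists)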
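
Because `sample_n()` draws rows at random, the two samples (and every join result that follows) change on each run. The sketch below shows one way to make a draw repeatable. The toy table, the seed value, and the object names are assumptions made for illustration; `slice_sample()` is the newer dplyr verb that supersedes `sample_n()`.

```r
library(dplyr)

# Hypothetical toy table standing in for game_goals
toy <- tibble(player = paste("Player", 1:10), goals = 0:9)

set.seed(123)  # arbitrary seed, chosen only for illustration
first_draw <- toy %>% slice_sample(n = 3)

set.seed(123)  # resetting the same seed reproduces the same draw
second_draw <- toy %>% slice_sample(n = 3)

# TRUE when both draws picked the same rows in the same order
identical(first_draw, second_draw)
```

Setting a seed before each `sample_n()` call in the exercise would have the same effect, although the sampled rows would then differ from the output shown in this document.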

3. inner_join

Describe the resulting data:
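## No player is matched to their own record, because the join key is goals rather than player. Each of the 6 game rows with goals = 1 is paired with the single season row that also has goals = 1 (John MacLean), giving 6 rows and 5 columns.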

How is it different from the original two datasets?

inner_join(Games, Season, by = "goals")
## # A tibble: 6 × 5
##   player.x        goals assists.x player.y     assists.y
##   <chr>           <dbl>     <dbl> <chr>            <dbl>
## 1 Zach Parise         1         0 John MacLean         0
## 2 Joe Sakic           1         1 John MacLean         0
## 3 Steven Stamkos      1         0 John MacLean         0
## 4 Wayne Gretzky       1         1 John MacLean         0
## 5 Patrick Marleau     1         0 John MacLean         0
## 6 Sidney Crosby       1         2 John MacLean         0
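
The choice of key decides what counts as a match. The following sketch, which uses small hypothetical tables rather than the real samples, contrasts joining on `goals` with joining on `player`. Non-key columns that share a name receive the `.x` and `.y` suffixes seen in the output above.

```r
library(dplyr)

# Hypothetical toy tables; names and values are illustrative only
games  <- tibble(player = c("A", "B", "C"), goals = c(1, 0, 1))
season <- tibble(player = c("C", "D"), goals = c(30, 1))

# Key = goals: rows pair up whenever the goal counts are equal,
# so unrelated players (A with D, C with D) land on the same row
by_goals <- inner_join(games, season, by = "goals")
names(by_goals)

# Key = player: only player C appears in both tables
by_player <- inner_join(games, season, by = "player")
names(by_player)
```

Joining on `player` is usually the more meaningful choice here, since a goal count identifies a value, not a person.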

4. left_join

Describe the resulting data:

How is it different from the original two datasets?

left_join(Games,Season, by = "player")
## # A tibble: 20 × 5
##    player          goals.x assists.x goals.y assists.y
##    <chr>             <dbl>     <dbl>   <dbl>     <dbl>
##  1 Joe Thornton          0         0      NA        NA
##  2 Steve Yzerman         0         0      NA        NA
##  3 Max Pacioretty        0         0      NA        NA
##  4 Dino Ciccarelli       0         0      NA        NA
##  5 Mario Lemieux         0         1      NA        NA
##  6 Ryan Getzlaf          0         0      24        58
##  7 Evgeni Malkin         0         1      NA        NA
##  8 Zach Parise           1         0      NA        NA
##  9 Jeff Carter           0         1      NA        NA
## 10 Patrick Kane          0         0      NA        NA
## 11 Joe Sakic             1         1      NA        NA
## 12 Dave Andreychuk       0         0      NA        NA
## 13 James Neal            0         0      NA        NA
## 14 Eric Staal            0         0      NA        NA
## 15 Dustin Brown          0         0      NA        NA
## 16 Steven Stamkos        1         0      NA        NA
## 17 Wayne Gretzky         1         1      NA        NA
## 18 Patrick Marleau       1         0      44        39
## 19 Sidney Crosby         1         2      NA        NA
## 20 Ryan Getzlaf          0         0      24        58
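
In the output above, every row of Games is kept, and `goals.y` and `assists.y` are `NA` wherever Season has no row for that player. A minimal sketch on hypothetical toy tables shows the same behavior and one way to count the matched rows:

```r
library(dplyr)

# Hypothetical toy tables; names and values are illustrative only
games  <- tibble(player = c("A", "B", "C"), goals = c(1, 0, 1))
season <- tibble(player = c("C", "D"), goals = c(30, 12))

joined <- left_join(games, season, by = "player")

# All three rows of `games` survive; goals.y is NA for A and B
joined

# Number of game rows that found a season match
n_matched <- sum(!is.na(joined$goals.y))
n_matched
```

Counting non-missing values this way assumes the joined column has no `NA` values of its own in the right-hand table.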

5. right_join

Describe the resulting data:

How is it different from the original two datasets?

right_join(Games, Season, by = "player")
## With the two samples above, this should return 21 rows: the 3 matched rows plus the 18 Season rows whose player is not in Games (goals.x and assists.x are NA for those rows).
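
A right join mirrors a left join: it keeps every row of the second table and fills the first table's columns with `NA` where there is no match. The sketch below uses hypothetical toy tables to show this; the object names are assumptions.

```r
library(dplyr)

# Hypothetical toy tables; names and values are illustrative only
games  <- tibble(player = c("A", "B", "C"), goals = c(1, 0, 1))
season <- tibble(player = c("C", "D"), goals = c(30, 12))

# Keeps both rows of `season`; players A and B are dropped
right <- right_join(games, season, by = "player")
right

# Swapping the arguments of left_join() keeps the same players,
# but the .x and .y suffixes trade places
swapped <- left_join(season, games, by = "player")
swapped
```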

6. full_join

Describe the resulting data:

How is it different from the original two datasets?

full_join(Games,Season, by = "player")
## # A tibble: 38 × 5
##    player          goals.x assists.x goals.y assists.y
##    <chr>             <dbl>     <dbl>   <dbl>     <dbl>
##  1 Joe Thornton          0         0      NA        NA
##  2 Steve Yzerman         0         0      NA        NA
##  3 Max Pacioretty        0         0      NA        NA
##  4 Dino Ciccarelli       0         0      NA        NA
##  5 Mario Lemieux         0         1      NA        NA
##  6 Ryan Getzlaf          0         0      24        58
##  7 Evgeni Malkin         0         1      NA        NA
##  8 Zach Parise           1         0      NA        NA
##  9 Jeff Carter           0         1      NA        NA
## 10 Patrick Kane          0         0      NA        NA
## # ℹ 28 more rows
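
The 38 rows above are consistent with the 20 rows of Games plus 18 Season rows whose player does not appear in Games. The sketch below checks the same arithmetic on hypothetical toy tables, under the assumption that each row of the first table matches at most one row of the second.

```r
library(dplyr)

# Hypothetical toy tables; names and values are illustrative only
games  <- tibble(player = c("A", "B", "C"), goals = c(1, 0, 1))
season <- tibble(player = c("C", "D"), goals = c(30, 12))

full <- full_join(games, season, by = "player")

# Season rows with no partner in `games`
only_in_season <- anti_join(season, games, by = "player")

# Rows of the full join = all game rows + unmatched season rows
nrow(full) == nrow(games) + nrow(only_in_season)
```

If a player had several rows in the second table, the matched rows would multiply and this simple count would no longer hold.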

7. semi_join

Describe the resulting data:

How is it different from the original two datasets?

semi_join(Games,Season, by = "player")
## # A tibble: 3 × 3
##   player          goals assists
##   <chr>           <dbl>   <dbl>
## 1 Ryan Getzlaf        0       0
## 2 Patrick Marleau     1       0
## 3 Ryan Getzlaf        0       0

8. anti_join

Describe the resulting data:

How is it different from the original two datasets?

anti_join(Games,Season, by = "player")
## # A tibble: 17 × 3
##    player          goals assists
##    <chr>           <dbl>   <dbl>
##  1 Joe Thornton        0       0
##  2 Steve Yzerman       0       0
##  3 Max Pacioretty      0       0
##  4 Dino Ciccarelli     0       0
##  5 Mario Lemieux       0       1
##  6 Evgeni Malkin       0       1
##  7 Zach Parise         1       0
##  8 Jeff Carter         0       1
##  9 Patrick Kane        0       0
## 10 Joe Sakic           1       1
## 11 Dave Andreychuk     0       0
## 12 James Neal          0       0
## 13 Eric Staal          0       0
## 14 Dustin Brown        0       0
## 15 Steven Stamkos      1       0
## 16 Wayne Gretzky       1       1
## 17 Sidney Crosby       1       2
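
`semi_join()` and `anti_join()` are filtering joins: they keep or drop rows of the first table and never add columns from the second. Together they split Games into two parts, which is why the 3 rows from `semi_join()` and the 17 rows from `anti_join()` add up to the 20 sampled rows. A minimal sketch on hypothetical toy tables:

```r
library(dplyr)

# Hypothetical toy tables; names and values are illustrative only
games  <- tibble(player = c("A", "B", "C"), goals = c(1, 0, 1))
season <- tibble(player = c("C", "D"), goals = c(30, 12))

matched   <- semi_join(games, season, by = "player")  # players also in `season`
unmatched <- anti_join(games, season, by = "player")  # players missing from `season`

# The two results partition the rows of `games`
nrow(matched) + nrow(unmatched) == nrow(games)

# Only the columns of `games` are returned, with no .x or .y suffixes
names(matched)
```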