Import two related datasets from TidyTuesday Project.
nhl_rosters <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2024/2024-01-09/nhl_rosters.csv')
## Rows: 54883 Columns: 18
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (10): team_code, position_type, headshot, first_name, last_name, positi...
## dbl (7): season, player_id, sweater_number, height_in_inches, weight_in_po...
## date (1): birth_date
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
nhl_teams <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2024/2024-01-09/nhl_teams.csv')
## Rows: 59 Columns: 2
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): team_code, full_name
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Describe the two datasets:
Data1: nhl_rosters
nhl_rosters_small <- nhl_rosters %>% select(team_code, last_name, sweater_number, position_type) %>% sample_n(10)
nhl_rosters_small
## # A tibble: 10 × 4
## team_code last_name sweater_number position_type
## <chr> <chr> <dbl> <chr>
## 1 ATL Valabik 5 defensemen
## 2 PIT Cunneyworth 19 forwards
## 3 NYR Bathgate 9 forwards
## 4 PHX Hudson 15 forwards
## 5 BUF Bogosian 4 defensemen
## 6 VAN Mazur 33 forwards
## 7 TBL Angelidis 10 forwards
## 8 VAN Benning 3 defensemen
## 9 BOS Millar 1 goalies
## 10 CHI Amonte 10 forwards
Data 2: nhl_teams
nhl_teams_small <- nhl_teams %>% select(team_code, full_name) %>% sample_n(10)
nhl_teams_small
## # A tibble: 10 × 2
## team_code full_name
## <chr> <chr>
## 1 MNS Minnesota North Stars
## 2 QUA Philadelphia Quakers
## 3 SJS San Jose Sharks
## 4 DET Detroit Red Wings
## 5 LAK Los Angeles Kings
## 6 CGS California Golden Seals
## 7 NYI New York Islanders
## 8 TAN Toronto Arenas
## 9 CHI Chicago Blackhawks
## 10 PHX Phoenix Coyotes
Describe the resulting data:
How is it different from the original two datasets? - 2 rows in output - all columns from the 2 datasets
nhl_rosters_small %>% inner_join(nhl_teams_small, by = c("team_code"))
## # A tibble: 2 × 5
## team_code last_name sweater_number position_type full_name
## <chr> <chr> <dbl> <chr> <chr>
## 1 PHX Hudson 15 forwards Phoenix Coyotes
## 2 CHI Amonte 10 forwards Chicago Blackhawks
Describe the resulting data:
How is it different from the original two datasets? - all columns from the 2 datasets - Shows the players as rows; full team name of the team the player plays for IF the team code shows in both samples of 10
nhl_rosters_small %>% left_join(nhl_teams_small, by = c("team_code"))
## # A tibble: 10 × 5
## team_code last_name sweater_number position_type full_name
## <chr> <chr> <dbl> <chr> <chr>
## 1 ATL Valabik 5 defensemen <NA>
## 2 PIT Cunneyworth 19 forwards <NA>
## 3 NYR Bathgate 9 forwards <NA>
## 4 PHX Hudson 15 forwards Phoenix Coyotes
## 5 BUF Bogosian 4 defensemen <NA>
## 6 VAN Mazur 33 forwards <NA>
## 7 TBL Angelidis 10 forwards <NA>
## 8 VAN Benning 3 defensemen <NA>
## 9 BOS Millar 1 goalies <NA>
## 10 CHI Amonte 10 forwards Chicago Blackhawks
Describe the resulting data:
How is it different from the original two datasets? - all columns from the 2 datasets - Shows the teams as rows, includes all information on player if team code shows for that player in both samples of 10
nhl_rosters_small %>% right_join(nhl_teams_small, by = c("team_code"))
## # A tibble: 10 × 5
## team_code last_name sweater_number position_type full_name
## <chr> <chr> <dbl> <chr> <chr>
## 1 PHX Hudson 15 forwards Phoenix Coyotes
## 2 CHI Amonte 10 forwards Chicago Blackhawks
## 3 MNS <NA> NA <NA> Minnesota North Stars
## 4 QUA <NA> NA <NA> Philadelphia Quakers
## 5 SJS <NA> NA <NA> San Jose Sharks
## 6 DET <NA> NA <NA> Detroit Red Wings
## 7 LAK <NA> NA <NA> Los Angeles Kings
## 8 CGS <NA> NA <NA> California Golden Seals
## 9 NYI <NA> NA <NA> New York Islanders
## 10 TAN <NA> NA <NA> Toronto Arenas
Describe the resulting data:
How is it different from the original two datasets? - inludes all columns from both tables. If there are no relations with regard to the team_code, the teams without associated players are listed at the end
nhl_rosters_small %>% full_join(nhl_teams_small, by = c("team_code"))
## # A tibble: 18 × 5
## team_code last_name sweater_number position_type full_name
## <chr> <chr> <dbl> <chr> <chr>
## 1 ATL Valabik 5 defensemen <NA>
## 2 PIT Cunneyworth 19 forwards <NA>
## 3 NYR Bathgate 9 forwards <NA>
## 4 PHX Hudson 15 forwards Phoenix Coyotes
## 5 BUF Bogosian 4 defensemen <NA>
## 6 VAN Mazur 33 forwards <NA>
## 7 TBL Angelidis 10 forwards <NA>
## 8 VAN Benning 3 defensemen <NA>
## 9 BOS Millar 1 goalies <NA>
## 10 CHI Amonte 10 forwards Chicago Blackhawks
## 11 MNS <NA> NA <NA> Minnesota North Stars
## 12 QUA <NA> NA <NA> Philadelphia Quakers
## 13 SJS <NA> NA <NA> San Jose Sharks
## 14 DET <NA> NA <NA> Detroit Red Wings
## 15 LAK <NA> NA <NA> Los Angeles Kings
## 16 CGS <NA> NA <NA> California Golden Seals
## 17 NYI <NA> NA <NA> New York Islanders
## 18 TAN <NA> NA <NA> Toronto Arenas
Describe the resulting data:
How is it different from the original two datasets? - Displays only the columns from nhl_rosters_small for rows(players) that are related to nhl_teams_small by team_code
nhl_rosters_small %>% semi_join(nhl_teams_small, by = c("team_code"))
## # A tibble: 2 × 4
## team_code last_name sweater_number position_type
## <chr> <chr> <dbl> <chr>
## 1 PHX Hudson 15 forwards
## 2 CHI Amonte 10 forwards
Describe the resulting data:
How is it different from the original two datasets? - opposite to semi_join, shows the columns from nhl_rosters_small for rows(players) that are not related to nhl_teams_small by team_code
nhl_rosters_small %>% anti_join(nhl_teams_small, by = c("team_code"))
## # A tibble: 8 × 4
## team_code last_name sweater_number position_type
## <chr> <chr> <dbl> <chr>
## 1 ATL Valabik 5 defensemen
## 2 PIT Cunneyworth 19 forwards
## 3 NYR Bathgate 9 forwards
## 4 BUF Bogosian 4 defensemen
## 5 VAN Mazur 33 forwards
## 6 TBL Angelidis 10 forwards
## 7 VAN Benning 3 defensemen
## 8 BOS Millar 1 goalies