Import two related datasets from the TidyTuesday project.
sevens <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2022/2022-05-24/sevens.csv')
## Rows: 7966 Columns: 16
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (10): team_1, score_1, score_2, team_2, venue, tournament, stage, winne...
## dbl (5): row_id, t1_game_no, t2_game_no, series, margin
## date (1): date
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
fifteens <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2022/2022-05-24/fifteens.csv')
## Rows: 1468 Columns: 15
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (7): team_1, team_2, venue, tournament, home_away_win, winner, loser
## dbl (7): test_no, score_1, score_2, home_test_no, away_test_no, series_no, ...
## date (1): date
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
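The column-specification message printed after each import can be silenced, as the message itself suggests. A minimal sketch, assuming readr 2.0 or later (needed for `I()` literal input); the inline CSV text and the names `csv_text` and `demo` are made up for illustration and stand in for the TidyTuesday URLs:

```r
library(readr)

# Hypothetical inline CSV standing in for the remote files;
# I() tells read_csv() to treat the string as literal data, not a path.
csv_text <- I("team_1,score_1,date\nFrance,21,2022-05-24\nSpain,14,2022-05-25\n")

# show_col_types = FALSE suppresses the "Column specification" message.
demo <- read_csv(csv_text, show_col_types = FALSE)

# The guessed column types can still be retrieved on request.
spec(demo)
```

The same `show_col_types = FALSE` argument can be added to the two `readr::read_csv()` calls above without changing the imported data.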
Describe the two datasets:
Data 1: fifteens
Data 2: sevens
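library(dplyr)  # provides %>%, select(), sample_n() and the join verbs
set.seed(1234)  # make the random samples reproducible
fifteens_small <- fifteens %>% select(test_no, team_1, winner) %>% sample_n(10)
sevens_small <- sevens %>% select(margin, team_1, winner) %>% sample_n(10)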
fifteens_small
## # A tibble: 10 × 3
##    test_no team_1         winner
##      <dbl> <chr>          <chr>
##  1    1308 Scotland       Wales
##  2    1018 Kazakhstan     Kazakhstan
##  3    1125 Switzerland    Switzerland
##  4    1004 New Zealand    New Zealand
##  5     623 Italy          Ireland
##  6     905 England        England
##  7     645 France         France
##  8     934 Ireland        Ireland
##  9     400 Germany        Netherlands
## 10     900 Cayman Islands Cayman Islands
sevens_small
## # A tibble: 10 × 3
##    margin team_1      winner
##     <dbl> <chr>       <chr>
##  1     38 China       China
##  2     19 Australia   Australia
##  3      3 France      France
##  4      5 Australia   New Zealand
##  5     12 Taiwan      Taiwan
##  6     29 Australia   Australia
##  7      5 Ireland     Ireland
##  8      5 Spain       Spain
##  9     30 New Zealand New Zealand
## 10     21 China       China
Describe the resulting data:
How is it different from the original two datasets?
fifteens_small %>% inner_join(sevens_small, by = c("team_1", "winner"))
## # A tibble: 3 × 4
##   test_no team_1      winner      margin
##     <dbl> <chr>       <chr>        <dbl>
## 1    1004 New Zealand New Zealand     30
## 2     645 France      France           3
## 3     934 Ireland     Ireland          5
Describe the resulting data:
How is it different from the original two datasets?
fifteens_small %>% left_join(sevens_small, by = c("team_1", "winner"))
## # A tibble: 10 × 4
##    test_no team_1         winner         margin
##      <dbl> <chr>          <chr>           <dbl>
##  1    1308 Scotland       Wales              NA
##  2    1018 Kazakhstan     Kazakhstan         NA
##  3    1125 Switzerland    Switzerland        NA
##  4    1004 New Zealand    New Zealand        30
##  5     623 Italy          Ireland            NA
##  6     905 England        England            NA
##  7     645 France         France              3
##  8     934 Ireland        Ireland             5
##  9     400 Germany        Netherlands        NA
## 10     900 Cayman Islands Cayman Islands     NA
Describe the resulting data:
How is it different from the original two datasets?
fifteens_small %>% right_join(sevens_small, by = c("team_1", "winner"))
## # A tibble: 10 × 4
##    test_no team_1      winner      margin
##      <dbl> <chr>       <chr>        <dbl>
##  1    1004 New Zealand New Zealand     30
##  2     645 France      France           3
##  3     934 Ireland     Ireland          5
##  4      NA China       China           38
##  5      NA Australia   Australia       19
##  6      NA Australia   New Zealand      5
##  7      NA Taiwan      Taiwan          12
##  8      NA Australia   Australia       29
##  9      NA Spain       Spain            5
## 10      NA China       China           21
Describe the resulting data:
How is it different from the original two datasets?
fifteens_small %>% full_join(sevens_small, by = c("team_1", "winner"))
## # A tibble: 17 × 4
##    test_no team_1         winner         margin
##      <dbl> <chr>          <chr>           <dbl>
##  1    1308 Scotland       Wales              NA
##  2    1018 Kazakhstan     Kazakhstan         NA
##  3    1125 Switzerland    Switzerland        NA
##  4    1004 New Zealand    New Zealand        30
##  5     623 Italy          Ireland            NA
##  6     905 England        England            NA
##  7     645 France         France              3
##  8     934 Ireland        Ireland             5
##  9     400 Germany        Netherlands        NA
## 10     900 Cayman Islands Cayman Islands     NA
## 11      NA China          China              38
## 12      NA Australia      Australia          19
## 13      NA Australia      New Zealand         5
## 14      NA Taiwan         Taiwan             12
## 15      NA Australia      Australia          29
## 16      NA Spain          Spain               5
## 17      NA China          China              21
Describe the resulting data:
How is it different from the original two datasets?
fifteens_small %>% semi_join(sevens_small, by = c("team_1", "winner"))
## # A tibble: 3 × 3
##   test_no team_1      winner
##     <dbl> <chr>       <chr>
## 1    1004 New Zealand New Zealand
## 2     645 France      France
## 3     934 Ireland     Ireland
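Because `sevens_small` contains repeated keys (China and Australia each appear twice with the same `team_1` and `winner`), it may help to see how a filtering join differs from a mutating join when keys repeat. A small sketch with made-up tibbles `x` and `y` (hypothetical values chosen for illustration, not rows from the datasets):

```r
library(dplyr)

# Hypothetical left table: one row per team.
x <- tibble(team_1 = c("China", "Spain"), test_no = c(1, 2))
# Hypothetical right table: the key "China" appears twice.
y <- tibble(team_1 = c("China", "China"), margin = c(38, 21))

# inner_join() returns one row per matching pair and adds y's columns,
# so the single China row of x is repeated for each match in y.
joined <- inner_join(x, y, by = "team_1")

# semi_join() only filters x: a matched x row is kept once,
# and no columns from y are added.
filtered <- semi_join(x, y, by = "team_1")

nrow(joined)     # 2
nrow(filtered)   # 1
names(filtered)  # "team_1" "test_no"
```

In the samples above the three matching keys are unique on both sides, which is why `inner_join()` and `semi_join()` both return three rows there.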
Describe the resulting data:
How is it different from the original two datasets?
fifteens_small %>% anti_join(sevens_small, by = c("team_1", "winner"))
## # A tibble: 7 × 3
##   test_no team_1         winner
##     <dbl> <chr>          <chr>
## 1    1308 Scotland       Wales
## 2    1018 Kazakhstan     Kazakhstan
## 3    1125 Switzerland    Switzerland
## 4     623 Italy          Ireland
## 5     905 England        England
## 6     400 Germany        Netherlands
## 7     900 Cayman Islands Cayman Islands
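The six joins can be checked against each other without downloading anything. The sketch below retypes the two samples from the printed output using `tibble::tribble()` and compares row counts. The `keys` and `n_*` names are introduced here for illustration only, and the arithmetic check on `full_join()` assumes, as is the case for these samples, that no key matches more than once.

```r
library(dplyr)

# The two samples, retyped from the printed tibbles.
fifteens_small <- tibble::tribble(
  ~test_no, ~team_1,          ~winner,
  1308,     "Scotland",       "Wales",
  1018,     "Kazakhstan",     "Kazakhstan",
  1125,     "Switzerland",    "Switzerland",
  1004,     "New Zealand",    "New Zealand",
  623,      "Italy",          "Ireland",
  905,      "England",        "England",
  645,      "France",         "France",
  934,      "Ireland",        "Ireland",
  400,      "Germany",        "Netherlands",
  900,      "Cayman Islands", "Cayman Islands"
)
sevens_small <- tibble::tribble(
  ~margin, ~team_1,       ~winner,
  38,      "China",       "China",
  19,      "Australia",   "Australia",
  3,       "France",      "France",
  5,       "Australia",   "New Zealand",
  12,      "Taiwan",      "Taiwan",
  29,      "Australia",   "Australia",
  5,       "Ireland",     "Ireland",
  5,       "Spain",       "Spain",
  30,      "New Zealand", "New Zealand",
  21,      "China",       "China"
)

keys <- c("team_1", "winner")
n_inner    <- nrow(inner_join(fifteens_small, sevens_small, by = keys))
n_left     <- nrow(left_join(fifteens_small, sevens_small, by = keys))
n_right    <- nrow(right_join(fifteens_small, sevens_small, by = keys))
n_full     <- nrow(full_join(fifteens_small, sevens_small, by = keys))
n_semi     <- nrow(semi_join(fifteens_small, sevens_small, by = keys))
n_anti     <- nrow(anti_join(fifteens_small, sevens_small, by = keys))
n_anti_rev <- nrow(anti_join(sevens_small, fifteens_small, by = keys))

# semi_join() and anti_join() split the left table into matched and unmatched rows.
stopifnot(n_semi + n_anti == nrow(fifteens_small))
# full_join() = matched rows + left-only rows + right-only rows.
stopifnot(n_full == n_inner + n_anti + n_anti_rev)

c(inner = n_inner, left = n_left, right = n_right,
  full = n_full, semi = n_semi, anti = n_anti)
```

The counts agree with the tibble sizes printed above: 3, 10, 10, 17, 3 and 7 rows.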