Import two related datasets from the TidyTuesday project.
polls <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-04-14/polls.csv')
## Rows: 535 Columns: 9
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (7): title, artist, gender, critic_name, critic_rols, critic_country, cr...
## dbl (2): rank, year
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
rankings <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-04-14/rankings.csv')
## Rows: 311 Columns: 12
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): title, artist, gender
## dbl (9): ID, year, points, n, n1, n2, n3, n4, n5
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
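Describe the two datasets:
Data 1: polls
* 535 rows and 9 columns
* 7 character columns (including title, artist, and gender) and 2 numeric columns (rank, year)
Data 2: rankings
* 311 rows and 12 columns
* 3 character columns (title, artist, gender) and 9 numeric columns (ID, year, points, n, n1, n2, n3, n4, n5)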
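library(dplyr) # provides %>%, select(), sample_n(), and the join functions
set.seed(1234) # make the random samples reproducible
polls_small <- polls %>%
  select(title, artist, year) %>%
  sample_n(10)
rankings_small <- rankings %>%
  select(title, artist, gender) %>%
  sample_n(10)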
polls_small
## # A tibble: 10 × 3
## title artist year
## <chr> <chr> <dbl>
## 1 I've Seen Footage Death Grips 2012
## 2 Walk This Way Run DMC 1986
## 3 Juicy The Notorious B.I.G. 1994
## 4 Ready Or Not The Fugees 1996
## 5 N.Y. State Of Mind Nas 1994
## 6 93 ’Til Infinity Souls of Mischief 1993
## 7 Black Steel In The Hour Of Chaos Public Enemy 1988
## 8 Déjà Vu (Uptown Baby) Lord Tariq & Peter Gunz 1998
## 9 Jonylah Forever Lupe Fiasco 2013
## 10 Nuthin’ But A ‘G’ Thang Dr Dre ft. Snoop Doggy Dogg 1992
rankings_small
## # A tibble: 10 × 3
## title artist gender
## <chr> <chr> <chr>
## 1 Ms Jackson OutKast male
## 2 The Message Grandmaster Flash & The Furious… male
## 3 Ultralight Beam Kanye West male
## 4 California Love 2Pac ft. Dr Dre male
## 5 Boyz-n-the-Hood Eazy-E male
## 6 Swimming Pools (Drank) Kendrick Lamar male
## 7 Straight Outta Compton (Extended Mix) NWA male
## 8 Freaky Tales Too $hort male
## 9 I Got 5 On It Luniz male
## 10 Learned from Texas BIG K.R.I.T male
Describe the resulting data:
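How is it different from the original two datasets?
* 0 rows, compared with 10 rows in each sample, because no title and artist pair appears in both polls_small and rankings_small
* 4 columns: the two key columns (title, artist) plus year from polls_small and gender from rankings_small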
polls_small %>%
inner_join(rankings_small, by = c("title", "artist"))
## # A tibble: 0 × 4
## # ℹ 4 variables: title <chr>, artist <chr>, year <dbl>, gender <chr>
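Because the two random samples share no songs, the result above is empty. A minimal sketch with two made-up tables (songs_a and songs_b are hypothetical and not part of the TidyTuesday data) shows what inner_join returns when one title and artist pair does match:

```r
library(dplyr)

# Hypothetical toy tables, invented for illustration only.
songs_a <- tibble::tibble(
  title  = c("Song A", "Song B", "Song C"),
  artist = c("Artist 1", "Artist 2", "Artist 3"),
  year   = c(1990, 1995, 2000)
)
songs_b <- tibble::tibble(
  title  = c("Song B", "Song D"),
  artist = c("Artist 2", "Artist 4"),
  gender = c("female", "male")
)

# Only "Song B" by "Artist 2" appears in both tables, so only that row is kept.
toy_inner <- inner_join(songs_a, songs_b, by = c("title", "artist"))
toy_inner
```

The single row that comes back has both year and gender filled in, which is what the real samples would show for any song they had in common.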
Describe the resulting data:
How is it different from the original two datasets?
* All 10 rows of polls_small are kept
* A 4th column, gender, is added to the original 3, and it is NA in every row because no row has a match in rankings_small
polls_small %>%
left_join(rankings_small, by = c("title", "artist"))
## # A tibble: 10 × 4
## title artist year gender
## <chr> <chr> <dbl> <chr>
## 1 I've Seen Footage Death Grips 2012 <NA>
## 2 Walk This Way Run DMC 1986 <NA>
## 3 Juicy The Notorious B.I.G. 1994 <NA>
## 4 Ready Or Not The Fugees 1996 <NA>
## 5 N.Y. State Of Mind Nas 1994 <NA>
## 6 93 ’Til Infinity Souls of Mischief 1993 <NA>
## 7 Black Steel In The Hour Of Chaos Public Enemy 1988 <NA>
## 8 Déjà Vu (Uptown Baby) Lord Tariq & Peter Gunz 1998 <NA>
## 9 Jonylah Forever Lupe Fiasco 2013 <NA>
## 10 Nuthin’ But A ‘G’ Thang Dr Dre ft. Snoop Doggy Dogg 1992 <NA>
Describe the resulting data:
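How is it different from the original two datasets?
* All 10 rows of rankings_small are kept instead of the rows of polls_small
* The year column from polls_small is inserted before gender, and it is NA in every row because no row has a match in polls_small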
polls_small %>%
right_join(rankings_small, by = c("title", "artist"))
## # A tibble: 10 × 4
## title artist year gender
## <chr> <chr> <dbl> <chr>
## 1 Ms Jackson OutKast NA male
## 2 The Message Grandmaster Flash & The F… NA male
## 3 Ultralight Beam Kanye West NA male
## 4 California Love 2Pac ft. Dr Dre NA male
## 5 Boyz-n-the-Hood Eazy-E NA male
## 6 Swimming Pools (Drank) Kendrick Lamar NA male
## 7 Straight Outta Compton (Extended Mix) NWA NA male
## 8 Freaky Tales Too $hort NA male
## 9 I Got 5 On It Luniz NA male
## 10 Learned from Texas BIG K.R.I.T NA male
Describe the resulting data:
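How is it different from the original two datasets?
* 20 rows: all 10 rows of polls_small followed by all 10 rows of rankings_small, so it looks like the left_join and right_join results stacked into one table
* Because nothing matches, gender is NA for the polls_small rows and year is NA for the rankings_small rows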
polls_small %>% full_join(rankings_small, by = c("title", "artist"))
## # A tibble: 20 × 4
## title artist year gender
## <chr> <chr> <dbl> <chr>
## 1 I've Seen Footage Death Grips 2012 <NA>
## 2 Walk This Way Run DMC 1986 <NA>
## 3 Juicy The Notorious B.I.G. 1994 <NA>
## 4 Ready Or Not The Fugees 1996 <NA>
## 5 N.Y. State Of Mind Nas 1994 <NA>
## 6 93 ’Til Infinity Souls of Mischief 1993 <NA>
## 7 Black Steel In The Hour Of Chaos Public Enemy 1988 <NA>
## 8 Déjà Vu (Uptown Baby) Lord Tariq & Peter Gunz 1998 <NA>
## 9 Jonylah Forever Lupe Fiasco 2013 <NA>
## 10 Nuthin’ But A ‘G’ Thang Dr Dre ft. Snoop Doggy Do… 1992 <NA>
## 11 Ms Jackson OutKast NA male
## 12 The Message Grandmaster Flash & The F… NA male
## 13 Ultralight Beam Kanye West NA male
## 14 California Love 2Pac ft. Dr Dre NA male
## 15 Boyz-n-the-Hood Eazy-E NA male
## 16 Swimming Pools (Drank) Kendrick Lamar NA male
## 17 Straight Outta Compton (Extended Mix) NWA NA male
## 18 Freaky Tales Too $hort NA male
## 19 I Got 5 On It Luniz NA male
## 20 Learned from Texas BIG K.R.I.T NA male
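The 20 rows above are 10 + 10 because no keys match. As a rough check of how the row counts relate when there is a match, here is a sketch on two made-up tables (left_tbl and right_tbl are hypothetical, and the arithmetic assumes each title and artist pair is unique within a table):

```r
library(dplyr)

# Hypothetical toy tables, invented for illustration only.
left_tbl <- tibble::tibble(
  title  = c("Song A", "Song B", "Song C"),
  artist = c("Artist 1", "Artist 2", "Artist 3"),
  year   = c(1990, 1995, 2000)
)
right_tbl <- tibble::tibble(
  title  = c("Song B", "Song D"),
  artist = c("Artist 2", "Artist 4"),
  gender = c("female", "male")
)
keys <- c("title", "artist")

n_left  <- nrow(left_join(left_tbl, right_tbl, by = keys))   # every row of left_tbl
n_right <- nrow(right_join(left_tbl, right_tbl, by = keys))  # every row of right_tbl
n_inner <- nrow(inner_join(left_tbl, right_tbl, by = keys))  # only the shared song
n_full  <- nrow(full_join(left_tbl, right_tbl, by = keys))   # shared song counted once

# With unique keys, the full join has left + right - matched rows.
n_full == n_left + n_right - n_inner
```

For the samples in this document the same relationship gives 10 + 10 - 0 = 20 rows.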
Describe the resulting data:
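How is it different from the original two datasets?
* 0 rows and only 3 columns, because no row of polls_small has a match in rankings_small
* gender is left out: semi_join only filters polls_small and never adds columns from rankings_small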
polls_small %>%
semi_join(rankings_small, by = c("title", "artist"))
## # A tibble: 0 × 3
## # ℹ 3 variables: title <chr>, artist <chr>, year <dbl>
Describe the resulting data:
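How is it different from the original two datasets? This is the same data as polls_small: anti_join keeps the rows of polls_small that have no match in rankings_small, and here that is all 10 rows.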
polls_small %>% anti_join(rankings_small, by = c("title", "artist"))
## # A tibble: 10 × 3
## title artist year
## <chr> <chr> <dbl>
## 1 I've Seen Footage Death Grips 2012
## 2 Walk This Way Run DMC 1986
## 3 Juicy The Notorious B.I.G. 1994
## 4 Ready Or Not The Fugees 1996
## 5 N.Y. State Of Mind Nas 1994
## 6 93 ’Til Infinity Souls of Mischief 1993
## 7 Black Steel In The Hour Of Chaos Public Enemy 1988
## 8 Déjà Vu (Uptown Baby) Lord Tariq & Peter Gunz 1998
## 9 Jonylah Forever Lupe Fiasco 2013
## 10 Nuthin’ But A ‘G’ Thang Dr Dre ft. Snoop Doggy Dogg 1992
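semi_join and anti_join are filtering joins: between them they split the rows of the first table into those with a match and those without. A small sketch on made-up tables (x_tbl and y_tbl are hypothetical, not the TidyTuesday data) makes the split visible when one song does match:

```r
library(dplyr)

# Hypothetical toy tables, invented for illustration only.
x_tbl <- tibble::tibble(
  title  = c("Song A", "Song B", "Song C"),
  artist = c("Artist 1", "Artist 2", "Artist 3"),
  year   = c(1990, 1995, 2000)
)
y_tbl <- tibble::tibble(
  title  = c("Song B", "Song D"),
  artist = c("Artist 2", "Artist 4"),
  gender = c("female", "male")
)

# Rows of x_tbl that have a match in y_tbl (columns of y_tbl are not added).
toy_semi <- semi_join(x_tbl, y_tbl, by = c("title", "artist"))
# Rows of x_tbl that have no match in y_tbl.
toy_anti <- anti_join(x_tbl, y_tbl, by = c("title", "artist"))

# Together the two results account for every row of x_tbl.
nrow(toy_semi) + nrow(toy_anti) == nrow(x_tbl)
```

In this document the split is 0 matched rows and 10 unmatched rows, which is why semi_join is empty and anti_join returns all of polls_small.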