Import two related datasets from the TidyTuesday project.
polls <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-04-14/polls.csv')
## Rows: 535 Columns: 9
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (7): title, artist, gender, critic_name, critic_rols, critic_country, cr...
## dbl (2): rank, year
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
rankings <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-04-14/rankings.csv')
## Rows: 311 Columns: 12
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): title, artist, gender
## dbl (9): ID, year, points, n, n1, n2, n3, n4, n5
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
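Describe the two datasets: The best hip-hop songs of all time were put into a poll of critics and ranked.
Data 1 (polls): 535 rows and 9 columns, one row per critic vote, giving the rank the critic assigned plus the song's title, artist, gender, and year, and details about the critic.
Data 2 (rankings): 311 rows and 12 columns, one row per song, giving an ID, the title, artist, year, and gender, the total points, and the count columns n and n1 to n5.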
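library(dplyr)  # provides %>%, select(), sample_n(), and the join functions
set.seed(1234)  # make the random samples reproducible
Polls_Smalll <- polls %>% select(title, artist, year) %>% sample_n(10)
Rankings_Small <- rankings %>% select(title, artist, points) %>% sample_n(10)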
Polls_Smalll
## # A tibble: 10 × 3
## title artist year
## <chr> <chr> <dbl>
## 1 I've Seen Footage Death Grips 2012
## 2 Walk This Way Run DMC 1986
## 3 Juicy The Notorious B.I.G. 1994
## 4 Ready Or Not The Fugees 1996
## 5 N.Y. State Of Mind Nas 1994
## 6 93 ’Til Infinity Souls of Mischief 1993
## 7 Black Steel In The Hour Of Chaos Public Enemy 1988
## 8 Déjà Vu (Uptown Baby) Lord Tariq & Peter Gunz 1998
## 9 Jonylah Forever Lupe Fiasco 2013
## 10 Nuthin’ But A ‘G’ Thang Dr Dre ft. Snoop Doggy Dogg 1992
Rankings_Small
## # A tibble: 10 × 3
## title artist points
## <chr> <chr> <dbl>
## 1 Ms Jackson OutKast 10
## 2 The Message Grandmaster Flash & The Furious… 90
## 3 Ultralight Beam Kanye West 8
## 4 California Love 2Pac ft. Dr Dre 16
## 5 Boyz-n-the-Hood Eazy-E 4
## 6 Swimming Pools (Drank) Kendrick Lamar 6
## 7 Straight Outta Compton (Extended Mix) NWA 10
## 8 Freaky Tales Too $hort 8
## 9 I Got 5 On It Luniz 10
## 10 Learned from Texas BIG K.R.I.T 6
Describe the resulting data:
How is it different from the original two datasets?
Each small dataset is a random sample of 10 songs: Polls_Smalll lists them with their year, and Rankings_Small lists them with their points. The common variables are title and artist, so those are the columns the joins match on.
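The sampling step can also be written with `slice_sample()`, which dplyr documents as the successor to the superseded `sample_n()`. The sketch below uses a made-up `songs` table (an assumption, not the TidyTuesday data) so that it runs without a download. With the same seed, the two functions are not guaranteed to pick the same rows.

```r
library(dplyr)

# Hypothetical stand-in for a table of songs; not the real polls data.
songs <- tibble::tibble(
  title  = paste("Song", 1:50),
  artist = paste("Artist", 1:50),
  year   = 1971:2020
)

set.seed(1234)  # make the random draw reproducible
songs_small <- songs %>%
  select(title, year) %>%   # keep only the columns of interest
  slice_sample(n = 10)      # draw 10 rows at random

dim(songs_small)
```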
left_join(Polls_Smalll, Rankings_Small)
## Joining with `by = join_by(title, artist)`
## # A tibble: 10 × 4
## title artist year points
## <chr> <chr> <dbl> <dbl>
## 1 I've Seen Footage Death Grips 2012 NA
## 2 Walk This Way Run DMC 1986 NA
## 3 Juicy The Notorious B.I.G. 1994 NA
## 4 Ready Or Not The Fugees 1996 NA
## 5 N.Y. State Of Mind Nas 1994 NA
## 6 93 ’Til Infinity Souls of Mischief 1993 NA
## 7 Black Steel In The Hour Of Chaos Public Enemy 1988 NA
## 8 Déjà Vu (Uptown Baby) Lord Tariq & Peter Gunz 1998 NA
## 9 Jonylah Forever Lupe Fiasco 2013 NA
## 10 Nuthin’ But A ‘G’ Thang Dr Dre ft. Snoop Doggy Dogg 1992 NA
left_join(Rankings_Small, Polls_Smalll)
## Joining with `by = join_by(title, artist)`
## # A tibble: 10 × 4
## title artist points year
## <chr> <chr> <dbl> <dbl>
## 1 Ms Jackson OutKast 10 NA
## 2 The Message Grandmaster Flash & The F… 90 NA
## 3 Ultralight Beam Kanye West 8 NA
## 4 California Love 2Pac ft. Dr Dre 16 NA
## 5 Boyz-n-the-Hood Eazy-E 4 NA
## 6 Swimming Pools (Drank) Kendrick Lamar 6 NA
## 7 Straight Outta Compton (Extended Mix) NWA 10 NA
## 8 Freaky Tales Too $hort 8 NA
## 9 I Got 5 On It Luniz 10 NA
## 10 Learned from Texas BIG K.R.I.T 6 NA
Describe the resulting data:
How is it different from the original two datasets?
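In both left joins above, every row of the left table is kept and the column brought in from the other table is `NA` throughout, because no title/artist pair occurs in both 10-song samples. A minimal sketch with hypothetical toy tables (`polls_toy` and `rankings_toy`, built from values printed elsewhere in this document) shows what a left join looks like when one song does match:

```r
library(dplyr)

# Hypothetical toy tables: "Juicy" is in both, the other songs in only one.
polls_toy <- tibble::tibble(
  title  = c("Juicy", "Walk This Way"),
  artist = c("The Notorious B.I.G.", "Run DMC"),
  year   = c(1994, 1986)
)
rankings_toy <- tibble::tibble(
  title  = c("Juicy", "Ms Jackson"),
  artist = c("The Notorious B.I.G.", "OutKast"),
  points = c(140, 10)
)

# Keep every row of polls_toy and add points where title and artist match.
left_toy <- left_join(polls_toy, rankings_toy, by = c("title", "artist"))
left_toy
```

Here `Juicy` picks up its points, while `Walk This Way` has no partner in `rankings_toy` and gets `NA`.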
right_join(Polls_Smalll, Rankings_Small)
## Joining with `by = join_by(title, artist)`
## # A tibble: 10 × 4
## title artist year points
## <chr> <chr> <dbl> <dbl>
## 1 Ms Jackson OutKast NA 10
## 2 The Message Grandmaster Flash & The F… NA 90
## 3 Ultralight Beam Kanye West NA 8
## 4 California Love 2Pac ft. Dr Dre NA 16
## 5 Boyz-n-the-Hood Eazy-E NA 4
## 6 Swimming Pools (Drank) Kendrick Lamar NA 6
## 7 Straight Outta Compton (Extended Mix) NWA NA 10
## 8 Freaky Tales Too $hort NA 8
## 9 I Got 5 On It Luniz NA 10
## 10 Learned from Texas BIG K.R.I.T NA 6
right_join(Rankings_Small, Polls_Smalll)
## Joining with `by = join_by(title, artist)`
## # A tibble: 10 × 4
## title artist points year
## <chr> <chr> <dbl> <dbl>
## 1 I've Seen Footage Death Grips NA 2012
## 2 Walk This Way Run DMC NA 1986
## 3 Juicy The Notorious B.I.G. NA 1994
## 4 Ready Or Not The Fugees NA 1996
## 5 N.Y. State Of Mind Nas NA 1994
## 6 93 ’Til Infinity Souls of Mischief NA 1993
## 7 Black Steel In The Hour Of Chaos Public Enemy NA 1988
## 8 Déjà Vu (Uptown Baby) Lord Tariq & Peter Gunz NA 1998
## 9 Jonylah Forever Lupe Fiasco NA 2013
## 10 Nuthin’ But A ‘G’ Thang Dr Dre ft. Snoop Doggy Dogg NA 1992
Describe the resulting data:
How is it different from the original two datasets?
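`right_join(x, y)` keeps every row of `y`, so the two outputs above mirror the left joins with the tables swapped; only the column order differs. The sketch below checks this on hypothetical toy tables (the names `polls_toy` and `rankings_toy` are illustrative, not part of the exercise).

```r
library(dplyr)

# Hypothetical toy tables with exactly one song in common.
polls_toy <- tibble::tibble(
  title  = c("Juicy", "Walk This Way"),
  artist = c("The Notorious B.I.G.", "Run DMC"),
  year   = c(1994, 1986)
)
rankings_toy <- tibble::tibble(
  title  = c("Juicy", "Ms Jackson"),
  artist = c("The Notorious B.I.G.", "OutKast"),
  points = c(140, 10)
)

right_toy    <- right_join(polls_toy, rankings_toy, by = c("title", "artist"))
left_swapped <- left_join(rankings_toy, polls_toy, by = c("title", "artist"))

# Same songs either way; only the order of the year and points columns changes.
names(right_toy)
names(left_swapped)
```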
full_join(Polls_Smalll, Rankings_Small)
## Joining with `by = join_by(title, artist)`
## # A tibble: 20 × 4
## title artist year points
## <chr> <chr> <dbl> <dbl>
## 1 I've Seen Footage Death Grips 2012 NA
## 2 Walk This Way Run DMC 1986 NA
## 3 Juicy The Notorious B.I.G. 1994 NA
## 4 Ready Or Not The Fugees 1996 NA
## 5 N.Y. State Of Mind Nas 1994 NA
## 6 93 ’Til Infinity Souls of Mischief 1993 NA
## 7 Black Steel In The Hour Of Chaos Public Enemy 1988 NA
## 8 Déjà Vu (Uptown Baby) Lord Tariq & Peter Gunz 1998 NA
## 9 Jonylah Forever Lupe Fiasco 2013 NA
## 10 Nuthin’ But A ‘G’ Thang Dr Dre ft. Snoop Doggy Do… 1992 NA
## 11 Ms Jackson OutKast NA 10
## 12 The Message Grandmaster Flash & The F… NA 90
## 13 Ultralight Beam Kanye West NA 8
## 14 California Love 2Pac ft. Dr Dre NA 16
## 15 Boyz-n-the-Hood Eazy-E NA 4
## 16 Swimming Pools (Drank) Kendrick Lamar NA 6
## 17 Straight Outta Compton (Extended Mix) NWA NA 10
## 18 Freaky Tales Too $hort NA 8
## 19 I Got 5 On It Luniz NA 10
## 20 Learned from Texas BIG K.R.I.T NA 6
full_join(Rankings_Small, Polls_Smalll)
## Joining with `by = join_by(title, artist)`
## # A tibble: 20 × 4
## title artist points year
## <chr> <chr> <dbl> <dbl>
## 1 Ms Jackson OutKast 10 NA
## 2 The Message Grandmaster Flash & The F… 90 NA
## 3 Ultralight Beam Kanye West 8 NA
## 4 California Love 2Pac ft. Dr Dre 16 NA
## 5 Boyz-n-the-Hood Eazy-E 4 NA
## 6 Swimming Pools (Drank) Kendrick Lamar 6 NA
## 7 Straight Outta Compton (Extended Mix) NWA 10 NA
## 8 Freaky Tales Too $hort 8 NA
## 9 I Got 5 On It Luniz 10 NA
## 10 Learned from Texas BIG K.R.I.T 6 NA
## 11 I've Seen Footage Death Grips NA 2012
## 12 Walk This Way Run DMC NA 1986
## 13 Juicy The Notorious B.I.G. NA 1994
## 14 Ready Or Not The Fugees NA 1996
## 15 N.Y. State Of Mind Nas NA 1994
## 16 93 ’Til Infinity Souls of Mischief NA 1993
## 17 Black Steel In The Hour Of Chaos Public Enemy NA 1988
## 18 Déjà Vu (Uptown Baby) Lord Tariq & Peter Gunz NA 1998
## 19 Jonylah Forever Lupe Fiasco NA 2013
## 20 Nuthin’ But A ‘G’ Thang Dr Dre ft. Snoop Doggy Do… NA 1992
Describe the resulting data:
How is it different from the original two datasets?
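A full join keeps the rows of both tables. The outputs above have 20 rows because the two 10-row samples share no songs, so nothing is merged. In the hypothetical toy tables below (illustrative names and a hand-picked overlap), one song matches, so two 2-row tables give 3 rows rather than 4.

```r
library(dplyr)

# Hypothetical toy tables with exactly one song in common.
polls_toy <- tibble::tibble(
  title  = c("Juicy", "Walk This Way"),
  artist = c("The Notorious B.I.G.", "Run DMC"),
  year   = c(1994, 1986)
)
rankings_toy <- tibble::tibble(
  title  = c("Juicy", "Ms Jackson"),
  artist = c("The Notorious B.I.G.", "OutKast"),
  points = c(140, 10)
)

# Keep all rows from both tables; unmatched cells are filled with NA.
full_toy <- full_join(polls_toy, rankings_toy, by = c("title", "artist"))
nrow(full_toy)
```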
semi_join(polls, rankings)
## Joining with `by = join_by(title, artist, gender, year)`
## # A tibble: 535 × 9
## rank title artist gender year critic_name critic_rols critic_country
## <dbl> <chr> <chr> <chr> <dbl> <chr> <chr> <chr>
## 1 1 Terminator … Publi… male 1998 Joseph Aba… Fat Beats US
## 2 2 4th Chamber Gza f… male 1995 Joseph Aba… Fat Beats US
## 3 3 Peter Piper Run D… male 1986 Joseph Aba… Fat Beats US
## 4 4 Play That B… GLOBE… male 2001 Joseph Aba… Fat Beats US
## 5 5 Time’s Up O.C. male 1994 Joseph Aba… Fat Beats US
## 6 1 Players Slum … male 1997 Biba Adams Critic US
## 7 2 Self Destru… Stop … mixed 1989 Biba Adams Critic US
## 8 3 Push It Salt-… female 1986 Biba Adams Critic US
## 9 4 Ambitionz A… 2Pac male 1996 Biba Adams Critic US
## 10 5 Big Pimpin' JAY-Z… male 1999 Biba Adams Critic US
## # ℹ 525 more rows
## # ℹ 1 more variable: critic_country2 <chr>
semi_join(rankings, polls)
## Joining with `by = join_by(title, artist, year, gender)`
## # A tibble: 311 × 12
## ID title artist year gender points n n1 n2 n3 n4 n5
## <dbl> <chr> <chr> <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 Juicy The N… 1994 male 140 18 9 3 3 1 2
## 2 2 Fight T… Publi… 1989 male 100 11 7 3 1 0 0
## 3 3 Shook O… Mobb … 1995 male 94 13 4 5 1 1 2
## 4 4 The Mes… Grand… 1982 male 90 14 5 3 1 0 5
## 5 5 Nuthin’… Dr Dr… 1992 male 84 14 2 4 2 4 2
## 6 6 C.R.E.A… Wu-Ta… 1993 male 62 10 3 1 1 4 1
## 7 7 93 ’Til… Souls… 1993 male 50 7 2 2 2 0 1
## 8 8 Passin’… The P… 1992 male 48 6 3 2 0 0 1
## 9 9 N.Y. St… Nas 1994 male 46 7 1 3 1 1 1
## 10 10 Dear Ma… 2Pac 1995 male 42 6 2 1 1 2 0
## # ℹ 301 more rows
Describe the resulting data:
How is it different from the original two datasets?
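Note that the semi joins above use the full `polls` and `rankings` tables rather than the 10-row samples, and that dplyr matched on all four shared columns (title, artist, gender, year). Every row of each table found a match, so the results have the same 535 and 311 rows as the inputs. A semi join only filters the left table and never adds columns from the right one. The sketch below shows this on hypothetical toy tables, with `by` spelled out (an optional choice here) rather than relying on the automatically detected keys.

```r
library(dplyr)

# Hypothetical toy tables with exactly one song in common.
polls_toy <- tibble::tibble(
  title  = c("Juicy", "Walk This Way"),
  artist = c("The Notorious B.I.G.", "Run DMC"),
  year   = c(1994, 1986)
)
rankings_toy <- tibble::tibble(
  title  = c("Juicy", "Ms Jackson"),
  artist = c("The Notorious B.I.G.", "OutKast"),
  points = c(140, 10)
)

# Keep only the rows of polls_toy that have a match in rankings_toy.
semi_toy <- semi_join(polls_toy, rankings_toy, by = c("title", "artist"))
semi_toy
```

The result has the columns of `polls_toy` only; `points` is not carried over.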
anti_join(Polls_Smalll, Rankings_Small)
## Joining with `by = join_by(title, artist)`
## # A tibble: 10 × 3
## title artist year
## <chr> <chr> <dbl>
## 1 I've Seen Footage Death Grips 2012
## 2 Walk This Way Run DMC 1986
## 3 Juicy The Notorious B.I.G. 1994
## 4 Ready Or Not The Fugees 1996
## 5 N.Y. State Of Mind Nas 1994
## 6 93 ’Til Infinity Souls of Mischief 1993
## 7 Black Steel In The Hour Of Chaos Public Enemy 1988
## 8 Déjà Vu (Uptown Baby) Lord Tariq & Peter Gunz 1998
## 9 Jonylah Forever Lupe Fiasco 2013
## 10 Nuthin’ But A ‘G’ Thang Dr Dre ft. Snoop Doggy Dogg 1992
anti_join(Rankings_Small, Polls_Smalll)
## Joining with `by = join_by(title, artist)`
## # A tibble: 10 × 3
## title artist points
## <chr> <chr> <dbl>
## 1 Ms Jackson OutKast 10
## 2 The Message Grandmaster Flash & The Furious… 90
## 3 Ultralight Beam Kanye West 8
## 4 California Love 2Pac ft. Dr Dre 16
## 5 Boyz-n-the-Hood Eazy-E 4
## 6 Swimming Pools (Drank) Kendrick Lamar 6
## 7 Straight Outta Compton (Extended Mix) NWA 10
## 8 Freaky Tales Too $hort 8
## 9 I Got 5 On It Luniz 10
## 10 Learned from Texas BIG K.R.I.T 6
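An anti join returns the rows of the left table that have no match in the right one. Both outputs above return all 10 rows, which confirms that the two samples have no song in common. In the sketch below (hypothetical toy tables with one shared song), the matching song is dropped, and the semi-join and anti-join row counts add up to the size of the left table.

```r
library(dplyr)

# Hypothetical toy tables with exactly one song in common.
polls_toy <- tibble::tibble(
  title  = c("Juicy", "Walk This Way"),
  artist = c("The Notorious B.I.G.", "Run DMC"),
  year   = c(1994, 1986)
)
rankings_toy <- tibble::tibble(
  title  = c("Juicy", "Ms Jackson"),
  artist = c("The Notorious B.I.G.", "OutKast"),
  points = c(140, 10)
)

# Rows of polls_toy with no match in rankings_toy.
anti_toy <- anti_join(polls_toy, rankings_toy, by = c("title", "artist"))
# Rows of polls_toy with a match, for comparison.
semi_toy <- semi_join(polls_toy, rankings_toy, by = c("title", "artist"))

# Every row of the left table lands in exactly one of the two results.
nrow(anti_toy) + nrow(semi_toy) == nrow(polls_toy)
```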