Song shuffling
##EQ - How likely is each genre to play?
##How does adding more songs of a genre make that genre more likely to play?
###As a class, we created a playlist of 144 songs from six genres and wanted to figure out how often each genre would play.
#tidyverse helps us organize the data
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.0 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
df <- read_csv("Playlist.csv")
## Rows: 144 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): genre, artist, title
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(df)
## # A tibble: 6 × 3
## genre artist title
## <chr> <chr> <chr>
## 1 Genre 1 - hiphop/rap Alesso When I'm Gone
## 2 Genre 1 - hiphop/rap Aliyah's Interlude It Girl
## 3 Genre 1 - hiphop/rap Big Sean Beware
## 4 Genre 1 - hiphop/rap Clipse Ma, I don't love her
## 5 Genre 1 - hiphop/rap Common Be
## 6 Genre 1 - hiphop/rap Drake Nice for what
#The list of songs gets rearranged and listed according to genre
df %>% arrange(genre)
## # A tibble: 144 × 3
## genre artist title
## <chr> <chr> <chr>
## 1 Genre 1 - hiphop/rap Alesso When I'm Gone
## 2 Genre 1 - hiphop/rap Aliyah's Interlude It Girl
## 3 Genre 1 - hiphop/rap Big Sean Beware
## 4 Genre 1 - hiphop/rap Clipse Ma, I don't love her
## 5 Genre 1 - hiphop/rap Common Be
## 6 Genre 1 - hiphop/rap Drake Nice for what
## 7 Genre 1 - hiphop/rap Isaiah Rashad Headshots
## 8 Genre 1 - hiphop/rap j cole change
## 9 Genre 1 - hiphop/rap Jcole Crooked smile
## 10 Genre 1 - hiphop/rap Jcole Immortal
## # ℹ 134 more rows
#this table shows how many songs are in each genre
df %>% count(genre)
## # A tibble: 6 × 2
## genre n
## <chr> <int>
## 1 Genre 1 - hiphop/rap 26
## 2 Genre 2 - pop/kpop/Latin 22
## 3 Genre 3 - house/ska 5
## 4 Genre 4 - rnb/soul 43
## 5 Genre 5 - alt/indie/folk 34
## 6 Genre 6 - country/rock 14
genre_counts <- df %>% count(genre)
#This code gives us the theoretical probability of hearing each genre: the number of songs in the genre divided by all 144 songs
genre_counts <- genre_counts %>% mutate(probability = n / 144)
genre_counts
## # A tibble: 6 × 3
## genre n probability
## <chr> <int> <dbl>
## 1 Genre 1 - hiphop/rap 26 0.181
## 2 Genre 2 - pop/kpop/Latin 22 0.153
## 3 Genre 3 - house/ska 5 0.0347
## 4 Genre 4 - rnb/soul 43 0.299
## 5 Genre 5 - alt/indie/folk 34 0.236
## 6 Genre 6 - country/rock 14 0.0972
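The theoretical probability of a genre is its song count divided by the total number of songs. As a small sketch, the counts from the table above can be typed in by hand (instead of reading `Playlist.csv`) and the total computed with `sum(n)`, so the divisor does not have to be typed in. The name `genre_check` and the shortened genre labels are made up for this example.

```r
library(dplyr)

# Song counts copied from the count(genre) table above
genre_check <- tibble(
  genre = c("hiphop/rap", "pop/kpop/Latin", "house/ska",
            "rnb/soul", "alt/indie/folk", "country/rock"),
  n = c(26, 22, 5, 43, 34, 14)
)

# Dividing by sum(n) keeps the code correct if songs are added later
genre_check <- genre_check %>% mutate(probability = n / sum(n))

sum(genre_check$n)            # total number of songs
sum(genre_check$probability)  # the probabilities should add up to 1
```

Checking that the probabilities add up to 1 is a quick way to catch a wrong total or a genre that was left out.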
#A random sample of 10 songs from the whole playlist, counted by genre, with the experimental probability of each genre
sample_count <- sample_n(df, 10) %>% count(genre)
sample_count <- sample_count %>% mutate(probability = n / 10)
sample_count
## # A tibble: 4 × 3
## genre n probability
## <chr> <int> <dbl>
## 1 Genre 1 - hiphop/rap 2 0.2
## 2 Genre 2 - pop/kpop/Latin 1 0.1
## 3 Genre 4 - rnb/soul 2 0.2
## 4 Genre 5 - alt/indie/folk 5 0.5
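Because the songs are drawn at random, the table above will come out differently every time the code runs. One way to get the same sample back, assuming a fixed seed is acceptable for the experiment, is to call `set.seed()` before sampling. The sketch also uses `slice_sample()`, which the dplyr documentation recommends in place of the superseded `sample_n()`. The seed value 2024 and the stand-in playlist `demo_df` are made up for illustration; only the genre counts come from the data above.

```r
library(dplyr)

# Stand-in for the real playlist: one row per song, genre counts as in the table above
demo_df <- tibble(
  genre = rep(c("hiphop/rap", "pop/kpop/Latin", "house/ska",
                "rnb/soul", "alt/indie/folk", "country/rock"),
              times = c(26, 22, 5, 43, 34, 14))
)

# Same seed, same sample: both draws below pick the same 10 rows
set.seed(2024)
first_draw <- demo_df %>% slice_sample(n = 10) %>% count(genre)

set.seed(2024)
second_draw <- demo_df %>% slice_sample(n = 10) %>% count(genre)

identical(first_draw, second_draw)  # checks that the two draws match
```

Leaving the seed out is also a valid choice when the point is to show that every shuffle is different.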
#This is our 2nd sample. We increased the sample size by 4 songs (to 14) to find another experimental probability
sample_count2 <- sample_n(df, 14) %>% count(genre)
sample_count2
## # A tibble: 5 × 2
## genre n
## <chr> <int>
## 1 Genre 1 - hiphop/rap 2
## 2 Genre 2 - pop/kpop/Latin 2
## 3 Genre 4 - rnb/soul 6
## 4 Genre 5 - alt/indie/folk 2
## 5 Genre 6 - country/rock 2
#This gives us the experimental probability when we increase the sample by 4 songs
sample_count2 <- sample_count2 %>% mutate(probability = n / 14)
sample_count2
## # A tibble: 5 × 3
## genre n probability
## <chr> <int> <dbl>
## 1 Genre 1 - hiphop/rap 2 0.143
## 2 Genre 2 - pop/kpop/Latin 2 0.143
## 3 Genre 4 - rnb/soul 6 0.429
## 4 Genre 5 - alt/indie/folk 2 0.143
## 5 Genre 6 - country/rock 2 0.143
#This is our 3rd sample, with far fewer songs than the last one, to see how the probabilities change and whether certain genres show up more or less often
sample_count3 <- sample_n(df, 5) %>% count(genre)
sample_count3
## # A tibble: 3 × 2
## genre n
## <chr> <int>
## 1 Genre 2 - pop/kpop/Latin 1
## 2 Genre 4 - rnb/soul 2
## 3 Genre 5 - alt/indie/folk 2
#Here is the experimental probability for our 3rd sample
sample_count3 <- sample_count3 %>% mutate(probability = n / 5)
sample_count3
## # A tibble: 3 × 3
## genre n probability
## <chr> <int> <dbl>
## 1 Genre 2 - pop/kpop/Latin 1 0.2
## 2 Genre 4 - rnb/soul 2 0.4
## 3 Genre 5 - alt/indie/folk 2 0.4
#This is our 4th sample, with 24 songs
sample_count4 <- sample_n(df, 24) %>% count(genre)
sample_count4
## # A tibble: 5 × 2
## genre n
## <chr> <int>
## 1 Genre 1 - hiphop/rap 5
## 2 Genre 2 - pop/kpop/Latin 3
## 3 Genre 4 - rnb/soul 6
## 4 Genre 5 - alt/indie/folk 7
## 5 Genre 6 - country/rock 3
#Lastly, this gives us the experimental probability of each genre in our 4th sample
sample_count4 <- sample_count4 %>% mutate(probability = n / 24)
sample_count4
## # A tibble: 5 × 3
## genre n probability
## <chr> <int> <dbl>
## 1 Genre 1 - hiphop/rap 5 0.208
## 2 Genre 2 - pop/kpop/Latin 3 0.125
## 3 Genre 4 - rnb/soul 6 0.25
## 4 Genre 5 - alt/indie/folk 7 0.292
## 5 Genre 6 - country/rock 3 0.125
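To compare a sample with the theoretical probabilities, it can help to put both in one table. The sketch below does this for the 24-song sample, with the counts typed in from the tables above. A genre that never played is missing from the sample table, so it is filled in as 0 after the join. The names `theory`, `round4`, and `comparison`, and the shortened genre labels, are made up for this example.

```r
library(dplyr)

# Theoretical probabilities, from the counts of all 144 songs
theory <- tibble(
  genre = c("hiphop/rap", "pop/kpop/Latin", "house/ska",
            "rnb/soul", "alt/indie/folk", "country/rock"),
  n_all = c(26, 22, 5, 43, 34, 14)
) %>%
  mutate(theoretical = n_all / sum(n_all))

# Counts from the 24-song sample above (house/ska did not appear)
round4 <- tibble(
  genre = c("hiphop/rap", "pop/kpop/Latin", "rnb/soul",
            "alt/indie/folk", "country/rock"),
  n_sample = c(5, 3, 6, 7, 3)
)

comparison <- theory %>%
  left_join(round4, by = "genre") %>%
  # a genre missing from the sample was played 0 times
  mutate(n_sample = coalesce(n_sample, 0),
         experimental = n_sample / sum(n_sample),
         difference = experimental - theoretical)

comparison
```

A positive `difference` means the genre played more often in the sample than the playlist counts predict, and a negative one means it played less often.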
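#The experimental probability of hearing each genre changes from sample to sample and with the number of songs sampled. In the 5-song sample only 3 of the 6 genres played, while in the 24-song sample 5 of the 6 genres played and the probabilities were closer to the theoretical ones (for example, rnb/soul was 0.25 compared with 0.299). House/ska, with only 5 of the 144 songs, never played in any of our samples. Genres with more songs in the playlist are more likely to play, and larger samples give a better picture of those chances.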
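To see how sample size affects the experimental probabilities, a rough simulation can repeat the sampling many times and measure how far each sample lands from the theoretical values on average. This is only a sketch and was not part of the class experiment. It uses base R and the genre counts from above, and the helper `sample_error()`, the seed, the 500 repetitions, and the extra sample size of 144 are all choices made for this example.

```r
# Genre counts from the playlist above
counts <- c("hiphop/rap" = 26, "pop/kpop/Latin" = 22, "house/ska" = 5,
            "rnb/soul" = 43, "alt/indie/folk" = 34, "country/rock" = 14)
theoretical <- counts / sum(counts)

# One entry per song, like the genre column of the playlist
songs <- rep(names(counts), times = counts)

# Average gap between experimental and theoretical probability
# for one random sample of `size` songs (drawn without repeats)
sample_error <- function(size) {
  picked <- factor(sample(songs, size), levels = names(counts))
  experimental <- table(picked) / size
  mean(abs(as.numeric(experimental) - theoretical))
}

set.seed(1)  # arbitrary seed so the run can be repeated
sizes <- c(5, 10, 14, 24, 144)
avg_error <- sapply(sizes, function(s) mean(replicate(500, sample_error(s))))
data.frame(sample_size = sizes, avg_error = round(avg_error, 3))
```

The average gap should shrink as the sample size grows, and a sample of all 144 songs matches the theoretical probabilities exactly because every song is included.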