Song shuffling

##EQ - How likely is each genre to play?
##How does adding more songs of a genre increase the likelihood of that genre playing?

###As a class, we created a playlist of songs from several different genres and wanted to figure out how often each genre would play.

#tidyverse helps us organize the data

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.0     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
df <- read_csv("Playlist.csv")
## Rows: 144 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): genre, artist, title
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(df)
## # A tibble: 6 × 3
##   genre                artist             title               
##   <chr>                <chr>              <chr>               
## 1 Genre 1 - hiphop/rap Alesso             When I'm Gone       
## 2 Genre 1 - hiphop/rap Aliyah's Interlude It Girl             
## 3 Genre 1 - hiphop/rap Big Sean           Beware              
## 4 Genre 1 - hiphop/rap Clipse             Ma, I don't love her
## 5 Genre 1 - hiphop/rap Common             Be                  
## 6 Genre 1 - hiphop/rap Drake              Nice for what

#the list of songs gets rearranged and sorted according to genre

df %>% arrange(genre)
## # A tibble: 144 × 3
##    genre                artist             title               
##    <chr>                <chr>              <chr>               
##  1 Genre 1 - hiphop/rap Alesso             When I'm Gone       
##  2 Genre 1 - hiphop/rap Aliyah's Interlude It Girl             
##  3 Genre 1 - hiphop/rap Big Sean           Beware              
##  4 Genre 1 - hiphop/rap Clipse             Ma, I don't love her
##  5 Genre 1 - hiphop/rap Common             Be                  
##  6 Genre 1 - hiphop/rap Drake              Nice for what       
##  7 Genre 1 - hiphop/rap Isaiah Rashad      Headshots           
##  8 Genre 1 - hiphop/rap j cole             change              
##  9 Genre 1 - hiphop/rap Jcole              Crooked smile       
## 10 Genre 1 - hiphop/rap Jcole              Immortal            
## # ℹ 134 more rows

#this table shows how many songs are in each genre

df %>% count(genre)
## # A tibble: 6 × 2
##   genre                        n
##   <chr>                    <int>
## 1 Genre 1 - hiphop/rap        26
## 2 Genre 2 - pop/kpop/Latin    22
## 3 Genre 3 -  house/ska         5
## 4 Genre 4 -  rnb/soul         43
## 5 Genre 5 - alt/indie/folk    34
## 6 Genre 6 - country/rock      14
genre_counts <- df %>% count(genre)

#this code gives us the theoretical probability of listening to each genre (songs in the genre divided by the 144 songs in the playlist)

genre_counts <- genre_counts %>% mutate(probability = n/144)
genre_counts
## # A tibble: 6 × 3
##   genre                        n probability
##   <chr>                    <int>       <dbl>
## 1 Genre 1 - hiphop/rap        26      0.181 
## 2 Genre 2 - pop/kpop/Latin    22      0.153 
## 3 Genre 3 -  house/ska         5      0.0347
## 4 Genre 4 -  rnb/soul         43      0.299 
## 5 Genre 5 - alt/indie/folk    34      0.236 
## 6 Genre 6 - country/rock      14      0.0972
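
The theoretical probability of a genre is its song count divided by the total number of songs. As a quick check, the sketch below retypes the counts from the table above by hand (so `counts` is a stand-in, not the class file, and the genre labels are shortened) and divides by `sum(n)` rather than a typed-in total, so the calculation would still hold if songs were added to the playlist.

```r
library(tidyverse)

# Genre counts copied by hand from the count(genre) table (labels shortened)
counts <- tibble(
  genre = c("hiphop/rap", "pop/kpop/Latin", "house/ska",
            "rnb/soul", "alt/indie/folk", "country/rock"),
  n = c(26, 22, 5, 43, 34, 14)
)

# Divide by the total so the playlist size does not have to be typed in
counts <- counts %>% mutate(probability = n / sum(n))
counts

# The probabilities of all six genres should add up to 1
sum(counts$probability)
```

Since every song belongs to exactly one genre, a total other than 1 would point to a counting mistake.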

#a random sample of 10 songs from the whole playlist, counted by genre, gives our first experimental probability

sample_count <- sample_n(df, 10) %>% count(genre)
sample_count <- sample_count %>% mutate(probability = n/10)
sample_count
## # A tibble: 4 × 3
##   genre                        n probability
##   <chr>                    <int>       <dbl>
## 1 Genre 1 - hiphop/rap         2         0.2
## 2 Genre 2 - pop/kpop/Latin     1         0.1
## 3 Genre 4 -  rnb/soul          2         0.2
## 4 Genre 5 - alt/indie/folk     5         0.5
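
Because the 10 songs are drawn at random, rerunning the code gives a different table each time. One way to make a draw repeatable, sketched here under some assumptions, is to call `set.seed()` before sampling. The `playlist` table is a stand-in rebuilt from the genre counts (artist and title are left out), the seed value 1 is arbitrary, and `slice_sample()` is the newer dplyr function that supersedes `sample_n()`.

```r
library(tidyverse)

# Stand-in playlist: one row per song, built from the genre counts
playlist <- tibble(
  genre = rep(c("hiphop/rap", "pop/kpop/Latin", "house/ska",
                "rnb/soul", "alt/indie/folk", "country/rock"),
              times = c(26, 22, 5, 43, 34, 14))
)

# Setting the same seed before each draw repeats the same random sample
set.seed(1)
first_draw <- slice_sample(playlist, n = 10) %>% count(genre)

set.seed(1)
second_draw <- slice_sample(playlist, n = 10) %>% count(genre)

# Check whether the two draws gave the same genre counts
identical(first_draw, second_draw)
```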

#this is our 2nd sample, drawn to find another experimental probability after increasing the sample size by 4 songs (from 10 to 14)

sample_count2 <-sample_n(df, 14) %>% count(genre)
sample_count2
## # A tibble: 5 × 2
##   genre                        n
##   <chr>                    <int>
## 1 Genre 1 - hiphop/rap         2
## 2 Genre 2 - pop/kpop/Latin     2
## 3 Genre 4 -  rnb/soul          6
## 4 Genre 5 - alt/indie/folk     2
## 5 Genre 6 - country/rock       2

#This gives us the experimental probability for the 14-song sample

sample_count2 <- sample_count2 %>% mutate(probability = n/14)
sample_count2
## # A tibble: 5 × 3
##   genre                        n probability
##   <chr>                    <int>       <dbl>
## 1 Genre 1 - hiphop/rap         2       0.143
## 2 Genre 2 - pop/kpop/Latin     2       0.143
## 3 Genre 4 -  rnb/soul          6       0.429
## 4 Genre 5 - alt/indie/folk     2       0.143
## 5 Genre 6 - country/rock       2       0.143

#This is our 3rd sample, with far fewer songs (5) than the last run, to see how the experimental probabilities change and whether certain genres show up more or less often

sample_count3 <-sample_n(df, 5) %>% count(genre)
sample_count3
## # A tibble: 3 × 2
##   genre                        n
##   <chr>                    <int>
## 1 Genre 2 - pop/kpop/Latin     1
## 2 Genre 4 -  rnb/soul          2
## 3 Genre 5 - alt/indie/folk     2

#Here is the experimental probability for our 3rd sample

sample_count3 <- sample_count3 %>% mutate(probability = n/5)
sample_count3
## # A tibble: 3 × 3
##   genre                        n probability
##   <chr>                    <int>       <dbl>
## 1 Genre 2 - pop/kpop/Latin     1         0.2
## 2 Genre 4 -  rnb/soul          2         0.4
## 3 Genre 5 - alt/indie/folk     2         0.4
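
The 5-song table lists only three genres because `count()` leaves out genres that did not appear in the sample. One possible way to show those genres with a count of 0, sketched below, is to store `genre` as a factor with all six levels and pass `.drop = FALSE`. The `sample5` table is retyped by hand from the counts above with shortened genre labels, so it is an illustration rather than the original draw.

```r
library(tidyverse)

genres <- c("hiphop/rap", "pop/kpop/Latin", "house/ska",
            "rnb/soul", "alt/indie/folk", "country/rock")

# The 5 sampled songs by genre, retyped from the 3rd sample's counts
sample5 <- tibble(
  genre = factor(c("pop/kpop/Latin", "rnb/soul", "rnb/soul",
                   "alt/indie/folk", "alt/indie/folk"),
                 levels = genres)
)

# .drop = FALSE keeps factor levels with no songs, so they appear with n = 0
sample5_counts <- sample5 %>%
  count(genre, .drop = FALSE) %>%
  mutate(probability = n / sum(n))
sample5_counts
```

Showing the zeros makes it easier to line a sample up against the six-row theoretical table.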

#this is our 4th sample, with 24 songs

sample_count4 <-sample_n(df, 24) %>% count(genre)
sample_count4
## # A tibble: 5 × 2
##   genre                        n
##   <chr>                    <int>
## 1 Genre 1 - hiphop/rap         5
## 2 Genre 2 - pop/kpop/Latin     3
## 3 Genre 4 -  rnb/soul          6
## 4 Genre 5 - alt/indie/folk     7
## 5 Genre 6 - country/rock       3

#lastly, this gives us the experimental probability of each genre in round 4

sample_count4 <- sample_count4 %>% mutate(probability = n/24)
sample_count4
## # A tibble: 5 × 3
##   genre                        n probability
##   <chr>                    <int>       <dbl>
## 1 Genre 1 - hiphop/rap         5       0.208
## 2 Genre 2 - pop/kpop/Latin     3       0.125
## 3 Genre 4 -  rnb/soul          6       0.25 
## 4 Genre 5 - alt/indie/folk     7       0.292
## 5 Genre 6 - country/rock       3       0.125
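
The four rounds repeat the same three steps: draw a sample, count it by genre, and divide by the sample size. As a sketch, those steps could be wrapped in a small helper. The function name `genre_probs()` and the stand-in `playlist` (rebuilt from the genre counts, with shortened labels) are hypothetical and not part of the class code.

```r
library(tidyverse)

# Stand-in playlist: one row per song, built from the genre counts
playlist <- tibble(
  genre = rep(c("hiphop/rap", "pop/kpop/Latin", "house/ska",
                "rnb/soul", "alt/indie/folk", "country/rock"),
              times = c(26, 22, 5, 43, 34, 14))
)

# Hypothetical helper: draw `size` songs, count by genre, convert to proportions
genre_probs <- function(data, size) {
  data %>%
    slice_sample(n = size) %>%
    count(genre) %>%
    mutate(probability = n / size)
}

# Run the same sample sizes used in the four rounds
results <- map(c(10, 14, 5, 24), ~ genre_probs(playlist, .x))
results[[4]]
```

Each run draws new random songs, so the tables will not match the ones recorded in the four rounds.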

#The theoretical probability of hearing a genre depends on its share of the playlist, so adding more songs of a genre makes that genre more likely to play. The experimental probabilities change from sample to sample: with only 5 songs, three of the six genres never played, while the 24-song sample came closer to the theoretical values (for example, rnb/soul was 0.25 in the sample versus 0.299 in theory).
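
To put numbers on that comparison, the sketch below joins the 24-song sample to the theoretical probabilities and computes the gap for each genre. Both tables are retyped by hand from the outputs above with shortened genre labels, and the column names (`n_playlist`, `n_sample`, `difference`) are made up for this illustration.

```r
library(tidyverse)

# Theoretical probabilities from the full playlist counts
theory <- tibble(
  genre = c("hiphop/rap", "pop/kpop/Latin", "house/ska",
            "rnb/soul", "alt/indie/folk", "country/rock"),
  n_playlist = c(26, 22, 5, 43, 34, 14)
) %>%
  mutate(theoretical = n_playlist / sum(n_playlist))

# Counts from the 24-song sample (house/ska did not appear)
sample24 <- tibble(
  genre = c("hiphop/rap", "pop/kpop/Latin", "rnb/soul",
            "alt/indie/folk", "country/rock"),
  n_sample = c(5, 3, 6, 7, 3)
)

# Join the two tables, treat missing genres as 0 songs, and compare
comparison <- theory %>%
  left_join(sample24, by = "genre") %>%
  mutate(n_sample = replace_na(n_sample, 0),
         experimental = n_sample / sum(n_sample),
         difference = experimental - theoretical)
comparison
```

A positive `difference` means the genre played more often in the sample than its share of the playlist would predict, and a negative one means it played less often.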