Main Variables and Ways to Define Groups A & B for my Null Hypotheses
Null Hypothesis 1: Neyman-Pearson Framework
Research Question: Do songs released in 2010 or later spend more weeks at #1 on the Billboard Hot 100 than songs released before 2010?
Null and Alternative Hypotheses:
Null: Mean weeks at number one is equal for pre-2010 and post-2010 songs
Alternative: Mean weeks at number one differs between eras (expected to be greater for post-2010 songs; the test itself is run two-sided)
Main Variable (continuous):
- weeks_at_number_one
Way to Define Groups A & B:
Group A: songs released before 2010
Group B: songs released in 2010 or later
Establishing the Eras (pre and post 2010)
library(dplyr)
# Label each song by era; songs from 2010 itself fall in "post_2010"
billboard <- billboard |>
  mutate(era = if_else(year < 2010, "pre_2010", "post_2010"))
Neyman-Pearson Design Choices and Explanation
alpha <- 0.05
power <- 0.80
effect_size <- 0.3
library(pwr)
pwr.t.test(d = effect_size, sig.level = alpha, power = power, type = "two.sample")
##
## Two-sample t test power calculation
##
## n = 175.3847
## d = 0.3
## sig.level = 0.05
## power = 0.8
## alternative = two.sided
##
## NOTE: n is number in *each* group
billboard |> count(era)
## # A tibble: 2 × 2
##   era           n
##   <chr>     <int>
## 1 post_2010   198
## 2 pre_2010    979
Justification for Alpha, Power, Effect Size, and Why the Study is Sufficient to Continue the Hypothesis Test
An alpha of 0.05 was chosen to balance the risk of a false positive against the goal of identifying meaningful differences in chart longevity. Incorrectly concluding that the two eras (pre and post 2010) differ in time at number one would be undesirable but would not have severe consequences, so the conventional 0.05 level is appropriate. Power was set to 0.80 so that there is a high probability of detecting a true difference if one exists. Given the cultural and industry relevance of chart performance, failing to detect a real difference between eras (a Type II error) would limit how useful this analysis is, so adequate power was prioritized. For the effect size, I used Cohen’s d = 0.3 so that even fairly small differences in weeks at number one can be detected; in the music industry, small changes in exposure can have significant cumulative effects. Finally, the power analysis above indicates that each group needs at least 176 observations (175.38 rounded up) to achieve the desired power. The counts above show 198 post_2010 songs and 979 pre_2010 songs, so both groups meet the requirement and the study is sufficiently powered to carry out the planned hypothesis test.
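As a rough cross-check of the sample-size reasoning above, the sketch below uses only base R. It reruns the per-group calculation with `power.t.test` (base R's counterpart to `pwr.t.test`, so the result may differ in the last decimals) and then approximates the power available with the unequal group sizes 198 and 979. The second step assumes a pooled-variance t-test with a noncentral t distribution, which is a simplification of the Welch test used later; the variable names are illustrative.

```r
# Design values taken from the analysis above
alpha <- 0.05
target_power <- 0.80
d <- 0.3

# Required n per group for equal-sized groups (sd = 1, so delta equals Cohen's d)
req <- power.t.test(delta = d, sd = 1, sig.level = alpha, power = target_power,
                    type = "two.sample", alternative = "two.sided")
n_required <- ceiling(req$n)

# Observed group sizes from count(era)
n_post <- 198
n_pre <- 979

# Approximate achieved power with unequal groups (assumes pooled variance)
df <- n_post + n_pre - 2
ncp <- d * sqrt(n_post * n_pre / (n_post + n_pre))
t_crit <- qt(1 - alpha / 2, df)
achieved_power <- pt(t_crit, df, ncp = ncp, lower.tail = FALSE) +
  pt(-t_crit, df, ncp = ncp)

n_required
achieved_power
```

Under these assumptions the achieved power comes out well above the 0.80 target, because the large pre_2010 group compensates for the smaller post_2010 group.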
T-test for ‘weeks_at_number_one’ by ‘era’ (post_2010 vs. pre_2010)
t.test(weeks_at_number_one ~ era, data = billboard)
##
## Welch Two Sample t-test
##
## data: weeks_at_number_one by era
## t = 4.2554, df = 227.29, p-value = 3.052e-05
## alternative hypothesis: true difference in means between group post_2010 and group pre_2010 is not equal to 0
## 95 percent confidence interval:
## 0.6366849 1.7347825
## sample estimates:
## mean in group post_2010 mean in group pre_2010
## 3.924242 2.738509
Conclusion for Neyman-Pearson Null Hypothesis
Using a Welch two-sample t-test with an alpha of 0.05, we reject the null hypothesis that songs released before 2010 and songs released in 2010 or later spend the same average number of weeks at number one on the Billboard Hot 100. The p-value of 3.052 x 10^-5 is far below the chosen significance level. On average, post-2010 songs spent more weeks at number one (3.92) than pre-2010 songs (2.74). The estimated difference in means is about 1.19 weeks, with a 95% confidence interval of 0.64 to 1.73 weeks, so the data are consistent with a true mean difference in roughly that range. Because the design had 0.80 power to detect an effect as small as d = 0.3 and both groups exceed the required sample size, we are confident that this result reflects a real difference in time at number one between eras rather than sampling variability.
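Since the stated alternative is directional (more weeks for post-2010 songs) while `t.test` was run with its default two-sided alternative, it may help to see how the reported statistics map onto both versions. The sketch below works only from the numbers printed in the output above (t, df, and the two group means) rather than the raw data, so small rounding differences are expected.

```r
# Values copied from the Welch t-test output above
t_stat <- 4.2554
df <- 227.29
mean_post <- 3.924242
mean_pre <- 2.738509

# Two-sided p-value (what t.test reported) and the one-sided version
p_two_sided <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)
p_one_sided <- pt(t_stat, df, lower.tail = FALSE)

# Recover the standard error and the 95% confidence interval
diff_means <- mean_post - mean_pre
se <- diff_means / t_stat
ci <- diff_means + c(-1, 1) * qt(0.975, df) * se

p_two_sided
p_one_sided
round(ci, 2)
```

With the data available, the directional test could be run directly with `t.test(weeks_at_number_one ~ era, data = billboard, alternative = "greater")`, since `post_2010` is the first group in the output above.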
Null Hypothesis 2: Fisher’s Significance Testing Framework
Research Question: Are #1 songs released in 2010 or later more likely to involve featured artists (collaborations) than #1 songs released before 2010?
Null and Alternative Hypotheses:
Null: The presence of featured artists is independent of era (pre vs. post 2010)
Alternative: The presence of featured artists differs by era
Main Variable (binary):
- collab: ‘Has Feature’ or ‘No Feature’ (created from ‘featured_artists’: NA means no featured artist, any other value means the song has one)
Way to Define Groups A & B:
Group A: songs released before 2010
Group B: songs released in or after 2010
billboard <- billboard |>
  mutate(
    collab = if_else(is.na(featured_artists), "No Feature", "Has Feature")
  )
tab <- table(billboard$era, billboard$collab)
tab
##
##             Has Feature No Feature
##   post_2010          68        130
##   pre_2010          121        858
fisher.test(tab)
##
## Fisher's Exact Test for Count Data
##
## data: tab
## p-value = 1.708e-12
## alternative hypothesis: true odds ratio is not equal to 1
## 95 percent confidence interval:
## 2.566209 5.327473
## sample estimates:
## odds ratio
## 3.703794
Conclusion for Fisher’s Significance Testing Hypothesis
To determine whether collaboration among Billboard Hot 100 #1 songs differs by era, I conducted Fisher’s Exact Test comparing songs released before 2010 with those released in 2010 or later. The null hypothesis stated that collaboration status (the presence of a featured artist) is independent of era. The resulting p-value of 1.708 x 10^-12 provides very strong evidence against the null hypothesis, so we reject it and conclude that collaboration rates differ between eras. In the table, 68 of 198 post-2010 songs (about 34%) have a featured artist, compared with 121 of 979 pre-2010 songs (about 12%). The estimated odds ratio of roughly 3.7 means the odds that a #1 song includes a featured artist are about 3.7 times higher for post-2010 songs than for pre-2010 songs. The 95% confidence interval for the odds ratio, 2.57 to 5.33, does not include 1, which further supports this conclusion. Because the analysis covers a large number of songs across multiple decades and Fisher’s Exact Test does not rely on large-sample approximations, we can be confident that this difference reflects a real shift in collaboration patterns rather than chance.
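To make the distinction between an odds ratio and a ratio of proportions concrete, the sketch below rebuilds the 2 x 2 table from the counts printed above and computes both by hand. The object names are illustrative, and the hand-computed (sample) odds ratio is expected to differ slightly from the conditional estimate that `fisher.test` reports.

```r
# Rebuild the contingency table from the counts shown above
tab <- matrix(c(68, 121, 130, 858), nrow = 2,
              dimnames = list(era = c("post_2010", "pre_2010"),
                              collab = c("Has Feature", "No Feature")))

# Share of songs with a featured artist in each era
prop_feature <- tab[, "Has Feature"] / rowSums(tab)

# Sample odds ratio: odds of a feature post-2010 relative to pre-2010
odds <- tab[, "Has Feature"] / tab[, "No Feature"]
odds_ratio <- unname(odds["post_2010"] / odds["pre_2010"])

# Ratio of proportions (how many times as common features are post-2010)
risk_ratio <- unname(prop_feature["post_2010"] / prop_feature["pre_2010"])

# Fisher's exact test on the rebuilt table
ft <- fisher.test(tab)

round(prop_feature, 3)
odds_ratio
risk_ratio
ft$estimate
```

Here the proportions are roughly 34% and 12%, so featured artists are a little under three times as common among post-2010 #1 songs, while the odds are about 3.7 times higher.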