Introduction:
This report examines whether age group is related to preferred streaming platform (Netflix, Hulu, Disney+, Amazon, or Other) in a sample of 300 people (N = 300). We ran a chi-square test of independence, inspected the residuals, and broke down each cell's contribution to the χ² statistic with a heatmap. Finally, we report Cramér's V as the effect size. Our goal is to pinpoint which age groups favor which platforms and to use that insight to inform marketing and content planning.
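# Packages used in this analysis
library(readxl)     # read_xlsx()
library(skimr)      # skim()
library(ggplot2)    # ggplot()
library(ggthemes)   # theme_solarized_2()
library(pheatmap)   # pheatmap()
library(rcompanion) # cramerV() (assuming the rcompanion implementation)
streaming_data<-read_xlsx("Streaming Services and Age.xlsx")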
View(streaming_data)
str(streaming_data)
## tibble [300 × 2] (S3: tbl_df/tbl/data.frame)
## $ AgeCat : chr [1:300] "18–25" "18–25" "18–25" "18–25" ...
## $ Platform: chr [1:300] "Other" "Hulu" "Netflix" "Netflix" ...
skim(streaming_data)
| Name | streaming_data |
| Number of rows | 300 |
| Number of columns | 2 |
| _______________________ | |
| Column type frequency: | |
| character | 2 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| AgeCat | 0 | 1 | 3 | 5 | 0 | 3 | 0 |
| Platform | 0 | 1 | 4 | 7 | 0 | 5 | 0 |
count_age_cat<-table(streaming_data$AgeCat)
count_age_cat
##
## 18–25 26–40 41+
## 100 100 100
count_platform<-table(streaming_data$Platform)
count_platform
##
## Amazon Disney+ Hulu Netflix Other
## 54 61 46 111 28
contingency_table<-table(streaming_data$AgeCat,streaming_data$Platform)
contingency_table
##
## Amazon Disney+ Hulu Netflix Other
## 18–25 4 22 23 47 4
## 26–40 11 25 16 41 7
## 41+ 39 14 7 23 17
stacked_bar<- ggplot(streaming_data, aes(x = Platform, fill = AgeCat)) +
geom_bar(position = "fill") +
labs(
title = "Streaming Platform Preference Proportion",
y = "Proportion of respondents",
x = "Streaming Platform",
fill = "Age group"
) +
theme_solarized_2()
stacked_bar
clustered_bar<- ggplot(streaming_data, aes(Platform, fill = AgeCat)) +
geom_bar(position = "dodge") +
geom_text(
stat = "count",
aes(label=after_stat(count)),
position = position_dodge(width = 0.8),
vjust=-0.2,
size = 3) +
labs(
title = "Streaming Platform Preference by Age",
x="Streaming Platform",
y="Number of People"
)+
theme_solarized_2()
clustered_bar
chi_square_test<-chisq.test(contingency_table)
chi_square_test
##
## Pearson's Chi-squared test
##
## data: contingency_table
## X-squared = 68.044, df = 8, p-value = 1.203e-11
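The test is statistically significant (p < .001), so we reject the null hypothesis that age group and streaming platform preference are independent.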
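As a cross-check on the test output, the statistic can be recomputed by hand from the contingency table. The sketch below re-enters the observed counts shown above as a matrix (so it runs without the Excel file) and assumes the usual rule of thumb that every expected count should be at least 5; the variable names are illustrative only.

```r
# Observed counts copied from the contingency table above.
observed <- matrix(
  c( 4, 22, 23, 47,  4,
    11, 25, 16, 41,  7,
    39, 14,  7, 23, 17),
  nrow = 3, byrow = TRUE,
  dimnames = list(AgeCat   = c("18–25", "26–40", "41+"),
                  Platform = c("Amazon", "Disney+", "Hulu", "Netflix", "Other"))
)

# Expected count for each cell = row total * column total / N.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Rule-of-thumb assumption check: every expected count is at least 5.
all_expected_ok <- all(expected >= 5)

# Pearson chi-square statistic, degrees of freedom, and upper-tail p-value.
chi_sq  <- sum((observed - expected)^2 / expected)
dof     <- (nrow(observed) - 1) * (ncol(observed) - 1)
p_value <- pchisq(chi_sq, df = dof, lower.tail = FALSE)

print(c(chi_sq = chi_sq, dof = dof, p_value = p_value))
```

The printed values should agree with the X-squared, df, and p-value reported by `chisq.test()` above.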
chi_square_test$observed
##
## Amazon Disney+ Hulu Netflix Other
## 18–25 4 22 23 47 4
## 26–40 11 25 16 41 7
## 41+ 39 14 7 23 17
chi_square_test$expected
##
## Amazon Disney+ Hulu Netflix Other
## 18–25 18 20.33333 15.33333 37 9.333333
## 26–40 18 20.33333 15.33333 37 9.333333
## 41+ 18 20.33333 15.33333 37 9.333333
chi_square_test$residuals
##
## Amazon Disney+ Hulu Netflix Other
## 18–25 -3.2998316 0.3696106 1.9578900 1.6439899 -1.7457431
## 26–40 -1.6499158 1.0349098 0.1702513 0.6575959 -0.7637626
## 41+ 4.9497475 -1.4045204 -2.1281413 -2.3015858 2.5095057
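The Pearson residuals show where the observed counts depart most from the counts expected under independence. Amazon drives the pattern: the 41+ group chose it far more often than expected (residual 4.95), the 18–25 group far less often (-3.30), and the 26–40 group somewhat less often (-1.65). Netflix and Hulu run the other way: both were chosen more often than expected by 18–25-year-olds (1.64 and 1.96) and less often by the 41+ group (-2.30 and -2.13), with the 26–40 group close to expectation. The 41+ group was also over-represented in the Other category (2.51). Disney+ was slightly more popular among the two younger groups and slightly less popular among the 41+ group, but all of its residuals are within ±1.5, so these differences are small.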
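One way to make the residual reading more systematic is to look at the standardized (adjusted) residuals, which `chisq.test()` returns as `stdres`, and flag cells beyond a cutoff. The sketch below assumes a cutoff of |z| > 2, which is a common rule of thumb rather than a formal correction for multiple comparisons, and re-enters the observed counts so it runs on its own.

```r
# Observed counts copied from the contingency table above.
observed <- matrix(
  c( 4, 22, 23, 47,  4,
    11, 25, 16, 41,  7,
    39, 14,  7, 23, 17),
  nrow = 3, byrow = TRUE,
  dimnames = list(AgeCat   = c("18–25", "26–40", "41+"),
                  Platform = c("Amazon", "Disney+", "Hulu", "Netflix", "Other"))
)

test <- chisq.test(observed)

# Standardized residuals adjust each Pearson residual for its row and
# column proportions, so they are roughly comparable to z-scores.
adj_resid <- test$stdres

# Assumed cutoff: flag cells whose standardized residual exceeds 2 in size.
flagged <- abs(adj_resid) > 2

print(round(adj_resid, 2))
print(flagged)
```

Cells marked `TRUE` are the ones that depart most clearly from independence; the sign of the residual gives the direction.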
cell_contributions<-((chi_square_test$observed-chi_square_test$expected)^2)/chi_square_test$expected
cell_contributions
##
## Amazon Disney+ Hulu Netflix Other
## 18–25 10.88888889 0.13661202 3.83333333 2.70270270 3.04761905
## 26–40 2.72222222 1.07103825 0.02898551 0.43243243 0.58333333
## 41+ 24.50000000 1.97267760 4.52898551 5.29729730 6.29761905
percent_contributions<- cell_contributions / chi_square_test$statistic *100
percent_contributions
##
## Amazon Disney+ Hulu Netflix Other
## 18–25 16.00277665 0.20077087 5.63363056 3.97200744 4.47891125
## 26–40 4.00069416 1.57404361 0.04259834 0.63552119 0.85729161
## 41+ 36.00624747 2.89913133 6.65599073 7.78513459 9.25525020
pheatmap(percent_contributions,
display_numbers = TRUE,
cluster_rows = FALSE,
cluster_cols = FALSE,
main = "% Contribution to Chi-Square Statistic")
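This heatmap shows how much each cell contributes to the χ² statistic, not the direction of the difference. The 41+ × Amazon cell contributes the most (about 36%), followed by the 18–25 × Amazon cell (about 16%); the Amazon column as a whole accounts for roughly 56% of the statistic. Read together with the residuals, this means the association is driven mainly by the 41+ group choosing Amazon more often than expected and the 18–25 group choosing it less often. The remaining cells each contribute less than 10%, and the 26–40 group contributes the least (about 7% in total).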
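To summarize the heatmap numerically, the percent contributions can be totaled by platform and by age group. This is a minimal sketch that rebuilds the contributions from the observed counts; the names `by_platform` and `by_age` are illustrative.

```r
# Observed counts copied from the contingency table above.
observed <- matrix(
  c( 4, 22, 23, 47,  4,
    11, 25, 16, 41,  7,
    39, 14,  7, 23, 17),
  nrow = 3, byrow = TRUE,
  dimnames = list(AgeCat   = c("18–25", "26–40", "41+"),
                  Platform = c("Amazon", "Disney+", "Hulu", "Netflix", "Other"))
)

expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Each cell's share of the chi-square statistic, in percent.
cell_contrib <- (observed - expected)^2 / expected
percent      <- cell_contrib / sum(cell_contrib) * 100

# Totals by platform (columns) and by age group (rows).
by_platform <- colSums(percent)
by_age      <- rowSums(percent)

print(round(sort(by_platform, decreasing = TRUE), 1))
print(round(sort(by_age, decreasing = TRUE), 1))
```

Sorting the totals makes it easy to see which platform and which age group account for most of the statistic.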
cramerV(contingency_table)
## Cramer V
## 0.3368
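For reference, Cramér's V can also be computed directly from the chi-square statistic as V = sqrt(χ² / (N × min(r − 1, c − 1))), where r and c are the numbers of rows and columns. The sketch below applies that formula to the observed counts re-entered from the table above; it is a hand calculation for checking, not a replacement for `cramerV()`.

```r
# Observed counts copied from the contingency table above.
observed <- matrix(
  c( 4, 22, 23, 47,  4,
    11, 25, 16, 41,  7,
    39, 14,  7, 23, 17),
  nrow = 3, byrow = TRUE,
  dimnames = list(AgeCat   = c("18–25", "26–40", "41+"),
                  Platform = c("Amazon", "Disney+", "Hulu", "Netflix", "Other"))
)

chi_sq <- unname(chisq.test(observed)$statistic)
n      <- sum(observed)

# Smaller of (rows - 1) and (columns - 1); here min(2, 4) = 2.
k <- min(nrow(observed) - 1, ncol(observed) - 1)

cramers_v <- sqrt(chi_sq / (n * k))
print(round(cramers_v, 4))
```

The result should match the 0.3368 reported above.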
Interpretation:
A chi-square test of independence showed a statistically significant association between age group and streaming platform preference, χ²(8, N = 300) = 68.04, p < .001. Cramér's V = 0.34 indicates a moderate association. The largest contributor was the 41+ group's preference for Amazon. Overall, older adults tended to prefer Amazon, while younger adults (18–25 and 26–40) favored Netflix; Hulu and Disney+ also leaned toward the younger groups, although the Disney+ differences were small.
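Building the reporting line directly from the test object helps avoid transcription slips when copying numbers into the write-up. The sketch below uses a hypothetical helper, `format_p()`, that follows the common convention of reporting very small p-values as "p < .001"; the observed counts are re-entered from the table above.

```r
# Observed counts copied from the contingency table above.
observed <- matrix(
  c( 4, 22, 23, 47,  4,
    11, 25, 16, 41,  7,
    39, 14,  7, 23, 17),
  nrow = 3, byrow = TRUE
)

# Hypothetical helper: format a p-value without the leading zero,
# and report anything below .001 as "p < .001".
format_p <- function(p) {
  if (p < .001) {
    "p < .001"
  } else {
    paste0("p = ", sub("^0", "", sprintf("%.3f", p)))
  }
}

test <- chisq.test(observed)

# \u03c7\u00b2 is the chi-squared symbol, written as escapes for portability.
report <- sprintf(
  "\u03c7\u00b2(%d, N = %d) = %.2f, %s",
  as.integer(test$parameter),
  as.integer(sum(observed)),
  unname(test$statistic),
  format_p(test$p.value)
)

print(report)
```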