Group A: Coffees roasted in Taiwan
Group B: Coffees roasted in all other countries
Main Variable: ‘rating’ (continuous)
Main Variable: ‘high_rating’ (binary)
Group A vs Group B: ‘high_price’ vs ‘low_price’ based on median split of ‘100g_USD’
coffee_clean <- coffee_clean %>%
  mutate(
    # Hypothesis 1 grouping: Taiwan vs Non-Taiwan
    country_group = if_else(loc_country == "Taiwan", "Taiwan", "Non-Taiwan"),
    # Hypothesis 2: define high/low price and high/low rating
    high_price = if_else(`100g_USD` > median(`100g_USD`, na.rm = TRUE),
                         "High price", "Low price"),
    high_rating = if_else(rating >= 93, 1, 0) # binary success
  )
Median rating = 93, found in last week's data dive; this is the cutoff used for ‘high_rating’.
Main variable: ‘rating’
Group A: Taiwan
Group B: Non-Taiwan
Null Hypothesis (H0): The mean rating of Taiwan roasted coffees equals the mean rating of non-Taiwan roasted coffees.
\[ \mu_{\text{Taiwan}} = \mu_{\text{Non-Taiwan}} \]
Alternative Hypothesis (H1): The mean rating of Taiwan roasted coffees differs from that of non-Taiwan roasted coffees.
\[ \mu_{\text{Taiwan}} \neq \mu_{\text{Non-Taiwan}} \]
Interpretation: Do coffees roasted in Taiwan differ in mean rating from non-Taiwan roasted coffees?
\(\alpha\) = 0.05 (5% false positive risk)
Power = 0.80 (20% false negative risk)
Minimum practically meaningful difference = 1.00 rating point
power_calc <- power.t.test(
  delta = 1,          # minimum difference in means we care about
  sd = 2,             # assumed SD of rating
  sig.level = 0.05,   # alpha
  power = 0.80,
  type = "two.sample",
  alternative = "two.sided"
)
power_calc
##
## Two-sample t test power calculation
##
## n = 63.76576
## delta = 1
## sd = 2
## sig.level = 0.05
## power = 0.8
## alternative = two.sided
##
## NOTE: n is number in *each* group
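As a rough cross-check on this result, the required sample size per group can be approximated by hand with the normal-approximation formula. This is only a sketch under the same inputs as the call above (\(\delta\) = 1, \(\sigma\) = 2, \(\alpha\) = 0.05, power = 0.80); because it uses normal quantiles instead of the t distribution, it lands slightly below the 63.77 that power.t.test reports.

\[
n \approx \frac{2\,(z_{1-\alpha/2} + z_{1-\beta})^2\,\sigma^2}{\delta^2}
= \frac{2\,(1.960 + 0.842)^2 \cdot 2^2}{1^2}
\approx 62.8
\]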
coffee_clean %>% count(country_group)
## # A tibble: 2 × 2
## country_group n
## <chr> <int>
## 1 Non-Taiwan 1531
## 2 Taiwan 549
Sample Size Conclusion: The power calculation gives n = 63.77 per group, which means that I need about 64 Taiwan and 64 non-Taiwan roasted coffee observations to reliably detect a 1 rating point difference with 80% power. My data contains 549 Taiwan and 1,531 non-Taiwan roasted coffees, far more than the required minimum, so I have enough data to test based on my design and effect size.
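To see how much headroom the actual sample gives, the same power.t.test function can be run in the other direction. The sketch below is an illustration, not part of the original analysis: it keeps the assumed SD of 2 from the power calculation and, because power.t.test expects equal group sizes, conservatively uses the smaller group (549 Taiwan coffees) as n for both groups.

```r
# Assumptions: sd = 2 (same assumed SD as the original power calculation);
# n = 549 is the smaller group, used for both groups to stay conservative.
n_small <- 549

# Power to detect a 1-point difference with 549 coffees per group
achieved <- power.t.test(
  n = n_small, delta = 1, sd = 2, sig.level = 0.05,
  type = "two.sample", alternative = "two.sided"
)

# Smallest difference detectable with 80% power at this sample size
# (delta is left unset, so power.t.test solves for it)
detectable <- power.t.test(
  n = n_small, sd = 2, sig.level = 0.05, power = 0.80,
  type = "two.sample", alternative = "two.sided"
)

achieved$power
detectable$delta
```

Under these assumptions the design is heavily overpowered for a 1 point difference, so differences well below 1 point can still come out statistically significant.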
t_test_res <- t.test(
  rating ~ country_group,
  data = coffee_clean,
  var.equal = FALSE # Welch t-test
)
t_test_res
##
## Welch Two Sample t-test
##
## data: rating by country_group
## t = -6.7582, df = 1183.2, p-value = 2.189e-11
## alternative hypothesis: true difference in means between group Non-Taiwan and group Taiwan is not equal to 0
## 95 percent confidence interval:
## -0.6099728 -0.3354930
## sample estimates:
## mean in group Non-Taiwan mean in group Taiwan
## 92.98628 93.45902
Test Summary Insights: The mean rating of coffees roasted in Taiwan (93.46) is 0.47 rating points higher than that of non-Taiwan roasted coffees (92.99). R reports the interval for Non-Taiwan minus Taiwan, so after flipping it to read as Taiwan minus Non-Taiwan, my 95% confidence interval is [0.335, 0.610]. The interval is quite narrow, meaning the estimate has high precision.
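The flip works because swapping the order of subtraction negates the difference, so each endpoint changes sign and the two endpoints trade places. Using the values printed in the Welch output:

\[
\bar{x}_{\text{Taiwan}} - \bar{x}_{\text{Non-Taiwan}} = 93.45902 - 92.98628 = 0.47274
\]
\[
[-0.6099728,\ -0.3354930] \;\longrightarrow\; [0.3354930,\ 0.6099728] \approx [0.335,\ 0.610]
\]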
ggplot(coffee_clean, aes(x = country_group, y = rating, fill = country_group)) +
  geom_boxplot(alpha = 0.7) +
  geom_jitter(width = 0.15, alpha = 0.4) +
  labs(
    title = "Coffee Ratings: Taiwan vs Non-Taiwan",
    x = "Country group",
    y = "Rating"
  ) +
  theme_minimal() +
  theme(legend.position = "none")
Reject H0: With a p-value of 2.19e-11, far below \(\alpha\) = 0.05, there is extremely strong evidence that Taiwan roasted coffees have higher mean ratings than non-Taiwan roasted coffees. The effect is small at 0.47 rating points, and the entire confidence interval sits below the 1.00 rating point difference I set as practically meaningful, but it is consistent, precise, and statistically convincing because of the large sample size. Overall, the effect is small but real, and detectable thanks to the large dataset. The visualization supports this evidence, depicting visibly higher ratings for coffees roasted in Taiwan, with very few coffees receiving below a 90/100 rating, whereas non-Taiwan roasted coffees show a much wider spread with many sub-90/100 ratings.
Main Variable: ‘high_rating’ (binary: 1 if rating \(\geq\) 93; 0 if rating < 93)
Group A/B: ‘high_price’ & ‘low_price’ (“High price” vs “Low price”)
High price = above median price
Low price = at or below median price
Null Hypothesis (H0): Proportion of high ratings is the same in both price groups
\[ P_{\text{High price}} = P_{\text{Low price}} \]
Alternative Hypothesis (H1): The proportions differ
\[ P_{\text{High price}} \neq P_{\text{Low price}} \]
Interpretation: Are high price coffees more likely to receive high ratings?
tab_price_rating <- table(coffee_clean$high_price, coffee_clean$high_rating)
tab_price_rating
##
## 0 1
## High price 220 815
## Low price 393 652
High price group: \[ P_1 = \frac{815}{1035} = 0.787 \]
Low price group: \[ P_2 = \frac{652}{1045} = 0.624 \]
Difference in proportions: \[ P_1 - P_2 = 0.787 - 0.624 = 0.163 \]
There is a 16.3 percentage point difference.
prop_res <- prop.test(
  x = c(
    sum(coffee_clean$high_rating[coffee_clean$high_price == "High price"], na.rm = TRUE),
    sum(coffee_clean$high_rating[coffee_clean$high_price == "Low price"], na.rm = TRUE)
  ),
  n = c(
    sum(coffee_clean$high_price == "High price", na.rm = TRUE),
    sum(coffee_clean$high_price == "Low price", na.rm = TRUE)
  ),
  alternative = "two.sided",
  correct = TRUE
)
prop_res
##
## 2-sample test for equality of proportions with continuity correction
##
## data: c(sum(coffee_clean$high_rating[coffee_clean$high_price == "High price"], na.rm = TRUE), sum(coffee_clean$high_rating[coffee_clean$high_price == "Low price"], na.rm = TRUE)) out of c(sum(coffee_clean$high_price == "High price", na.rm = TRUE), sum(coffee_clean$high_price == "Low price", na.rm = TRUE))
## X-squared = 66.104, df = 1, p-value = 4.277e-16
## alternative hypothesis: two.sided
## 95 percent confidence interval:
## 0.1240346 0.2029977
## sample estimates:
## prop 1 prop 2
## 0.7874396 0.6239234
Fisher / Two-proportion Test Insights: The 95% confidence interval for the difference in proportions is [0.124, 0.203], which tells us that the share of high price coffees receiving a rating \(\geq\) 93 is 12.4 to 20.3 percentage points higher than the share of low price coffees. Additionally, the p-value (4.28e-16) is far below 0.05, so we reject \(H_0\): there is strong evidence that the probability of receiving a high rating differs between coffees priced above the median and coffees priced at or below the median.
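The same test can be reproduced from just the four counts in the printed table, which makes the interval easy to check by hand. The sketch below is illustrative: the counts are copied from the table output, the variable names are my own, and the manual interval uses the normal approximation with a continuity correction of 0.5 times the sum of 1/n, which should agree with what prop.test reports for two groups.

```r
# Counts copied from the printed table: high ratings and totals per price group
high_rated <- c(high = 815, low = 652)
totals     <- c(high = 815 + 220, low = 652 + 393)

res <- prop.test(x = high_rated, n = totals,
                 alternative = "two.sided", correct = TRUE)

# Manual 95% CI for the difference in proportions
p      <- high_rated / totals
p_diff <- unname(p["high"] - p["low"])
se     <- sqrt(sum(p * (1 - p) / totals))
# 0.5 * sum(1 / n) is the continuity correction
width  <- qnorm(0.975) * se + 0.5 * sum(1 / totals)
manual_ci <- c(p_diff - width, p_diff + width)

res$estimate
res$conf.int
manual_ci
```

Working from the counts also keeps the `data:` line of the printed output short, since prop.test echoes whatever expressions are passed as x and n.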
prop_df <- coffee_clean %>%
  group_by(high_price) %>%
  summarise(
    prop_high = mean(high_rating, na.rm = TRUE),
    n = n()
  )
ggplot(prop_df, aes(x = high_price, y = prop_high, fill = high_price)) +
  geom_col(alpha = 0.8) +
  scale_fill_manual(values = c("High price" = "green",
                               "Low price" = "lightgreen")) +
  geom_text(aes(label = round(prop_high, 2)), vjust = -0.5) +
  ylim(0, 1) +
  labs(
    title = "Proportion of High Ratings (≥ 93) by Price Group",
    x = "Price group",
    y = "Proportion high rating"
  ) +
  theme_minimal()
Reject H0: The Fisher / two-proportion test strongly supports that higher priced coffees have a much higher chance of receiving a high rating than lower priced coffees (78.7% vs 62.4%). The visual clearly depicts this statistically significant result: the proportion of high ratings is not the same in the two price groups. The bigger picture insight is that the data lends support to higher priced coffees, as they live up to their price more often, receiving high ratings more frequently than lower priced coffees. It is important for consumers to know that they are receiving premium coffee when they buy higher priced coffees. Ultimately, this supports the notion that premium pricing aligns with higher rated coffee more often than not.