Use a two-sample t-test to check whether tip amounts differ by sex. Data source: the tips dataset in the reshape2 package. Statistical procedure: two-sample t-test.
data(tips, package = "reshape2")
shapiro.test(tips$tip)
##
## Shapiro-Wilk normality test
##
## data: tips$tip
## W = 0.89781, p-value = 8.2e-12
shapiro.test(tips$tip[tips$sex=="Female"])
##
## Shapiro-Wilk normality test
##
## data: tips$tip[tips$sex == "Female"]
## W = 0.95678, p-value = 0.005448
shapiro.test(tips$tip[tips$sex=="Male"])
##
## Shapiro-Wilk normality test
##
## data: tips$tip[tips$sex == "Male"]
## W = 0.87587, p-value = 3.708e-10
All three p-values are below 0.05, so normality is rejected: tip is not normally distributed, either overall or within each sex.
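The two per-group tests above can also be run in a single call. This is only a minimal sketch: the name `p.by.sex` is made up for illustration, and it assumes the reshape2 package is installed.

```r
# Sketch: Shapiro-Wilk p-value for each sex in one call.
data(tips, package = "reshape2")

# tapply() splits tips$tip by tips$sex and applies the function to each piece.
# p.by.sex is an illustrative name, not part of the original analysis.
p.by.sex <- tapply(tips$tip, tips$sex, function(x) shapiro.test(x)$p.value)
p.by.sex

# TRUE where normality is rejected at the 5% level
p.by.sex < 0.05
```

The result is a named vector with one entry per level of `sex`, which is handy when there are more than two groups.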
require(ggplot2)
## Loading required package: ggplot2
ggplot(data = tips) + geom_density(aes(x = tip)) + facet_grid(. ~ sex)
ansari.test(tip ~ sex, data = tips)
##
## Ansari-Bradley test
##
## data: tip by sex
## AB = 5582.5, p-value = 0.376
## alternative hypothesis: true ratio of scales is not equal to 1
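The p-value is 0.376, which is greater than 0.05, so we fail to reject the null hypothesis: there is no evidence that the variability of tips differs between male and female bill payers.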
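As a descriptive check alongside the Ansari-Bradley test, the sample variance of tip within each sex can be inspected directly. This is a sketch only: `tip.var` is an illustrative name, and the sample variances describe the data without being a test themselves.

```r
# Sketch: sample variance of tip within each sex.
data(tips, package = "reshape2")

# aggregate() applies var() to tip separately for each level of sex.
# tip.var is an illustrative name, not part of the original analysis.
tip.var <- aggregate(tip ~ sex, data = tips, FUN = var)
tip.var
```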
Use the two-sample t-test with the equal-variance assumption (var.equal = TRUE).
t.test(tip~sex, data = tips, var.equal=TRUE)
##
## Two Sample t-test
##
## data: tip by sex
## t = -1.3879, df = 242, p-value = 0.1665
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.6197558 0.1074167
## sample estimates:
## mean in group Female mean in group Male
## 2.833448 3.089618
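To show where the t statistic and the 242 degrees of freedom come from, the pooled-variance calculation can be reproduced by hand and compared with `t.test()`. This is a minimal sketch; the names `female`, `male`, `sp2`, `t.manual`, `df.manual`, `p.manual`, and `fit` are illustrative.

```r
# Sketch: pooled-variance (equal-variance) two-sample t statistic by hand.
data(tips, package = "reshape2")

female <- tips$tip[tips$sex == "Female"]
male   <- tips$tip[tips$sex == "Male"]
n1 <- length(female)
n2 <- length(male)

# Pooled variance: per-group variances weighted by their degrees of freedom
sp2 <- ((n1 - 1) * var(female) + (n2 - 1) * var(male)) / (n1 + n2 - 2)

# t statistic (Female minus Male, matching the factor level order),
# degrees of freedom, and two-sided p-value
t.manual  <- (mean(female) - mean(male)) / sqrt(sp2 * (1 / n1 + 1 / n2))
df.manual <- n1 + n2 - 2
p.manual  <- 2 * pt(-abs(t.manual), df = df.manual)

# Compare with the built-in test
fit <- t.test(tip ~ sex, data = tips, var.equal = TRUE)
all.equal(t.manual, unname(fit$statistic))
all.equal(p.manual, fit$p.value)
```

Both `all.equal()` calls should return TRUE if the manual calculation matches the built-in one.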
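Since the samples are not normally distributed, the Mann-Whitney test (called the Wilcoxon rank-sum test in R) can be used as a nonparametric alternative.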
wilcox.test(tip ~sex, data = tips)
##
## Wilcoxon rank sum test with continuity correction
##
## data: tip by sex
## W = 6369.5, p-value = 0.3834
## alternative hypothesis: true location shift is not equal to 0
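The W statistic reported above is based on ranks rather than on the raw tip values. The following sketch recomputes it from the rank sum of the first group (Female); the names `r`, `n1`, `W.manual`, and `fit.w` are illustrative. Because the data contain ties, `wilcox.test()` may warn that an exact p-value cannot be computed.

```r
# Sketch: Wilcoxon rank-sum (Mann-Whitney) W statistic by hand.
data(tips, package = "reshape2")

# Rank all tips together; tied values receive the average of their ranks
r  <- rank(tips$tip)
n1 <- sum(tips$sex == "Female")

# W = rank sum of the first group minus its smallest possible value
W.manual <- sum(r[tips$sex == "Female"]) - n1 * (n1 + 1) / 2

# Compare with the built-in test
fit.w <- wilcox.test(tip ~ sex, data = tips)
all.equal(W.manual, unname(fit.w$statistic))
```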
Plot the mean tip for each sex together with an interval of two standard errors on either side of the mean.
require(plyr)
## Loading required package: plyr
tip.summary <- ddply(tips, "sex", summarise,
                     tip.mean = mean(tip),
                     tip.sd = sd(tip),
                     # interval bounds: mean -/+ 2 standard errors (sd / sqrt(n))
                     lower = tip.mean - 2 * tip.sd / sqrt(NROW(tip)),
                     upper = tip.mean + 2 * tip.sd / sqrt(NROW(tip)))
ggplot(tip.summary, aes(x = tip.mean, y = sex)) +
  geom_errorbarh(aes(xmin = lower, xmax = upper), height = .2, color = "red") +
  geom_point() +
  geom_vline(xintercept = tip.summary$tip.mean, linetype = "dashed")
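The two intervals overlap, so the plot gives no evidence that the mean tips differ, in line with the t-test result (p-value = 0.1665).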
[1] Jared P. Lander (author), 鍾振蔚 (translator), Section 15-3-2 "Two-Sample t-Test", R 軟體資料分析基礎與應用, 旗標出版社, 2015