Consider the following data:
group1 <- c(2.9736,0.9448,1.6394,0.0389,1.2958)
group2 <- c(0.7681,0.8027,0.2156,0.0740,1.5076)
group3 <- c(4.8249,2.2516,1.5609,2.0452,1.0959)
d=data.frame("data"=c(group1,group2, group3), "group"= c(rep("group 1",5),rep("group 2",5),rep("group 3",5)))
boxplot(d$data~d$group)
Perform the usual ANOVA F-test
aov= aov(data~group, data = d)
summary(aov)
## Df Sum Sq Mean Sq F value Pr(>F)
## group 2 7.135 3.568 2.991 0.0883 .
## Residuals 12 14.315 1.193
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#I couldn't extract the F stat or p-value with this method
Fun fact, if you do it this way, you can run the Tukey post hoc test
TukeyHSD(aov)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = data ~ group, data = d)
##
## $group
## diff lwr upr p adj
## group 2-group 1 -0.7049 -2.547807 1.138007 0.5786492
## group 3-group 1 0.9772 -0.865707 2.820107 0.3645720
## group 3-group 2 1.6821 -0.160807 3.525007 0.0749628
This method is what you did…
lm = lm(data~group, data = d)
anova = anova(lm)
Fobs = anova[1,4]
pval = anova[1,5];pval
## [1] 0.08833844
Perform the permutation F-test
set.seed(1)
n = length(d$data)
nsim=1000
F_stat_Vec_Null = rep(NA,nsim)
for (i in 1:nsim){
d_permute = d
d_permute$data = d$data[sample(1:n,n)]
lm_out = lm(data~group, data = d_permute)
anova_out=anova(lm_out)
F_stat_Vec_Null[i]=anova_out[1,4]
}
pval_Perm=sum(Fobs<F_stat_Vec_Null)/nsim
#Permutation p-value
pval_Perm
## [1] 0.073
#pvalue from anova F test
pval
## [1] 0.08833844
Perform the Kruskal-Wallis test
kruskal.test(list(group1,group2,group3))
##
## Kruskal-Wallis rank sum test
##
## data: list(group1, group2, group3)
## Kruskal-Wallis chi-squared = 5.78, df = 2, p-value = 0.05558
Discuss the results of these three tests. (i.e. What are the p-values? What is the conclusion? Are the conclusions different? How similar are the p-values? Are any assumptions violated? etc.)
The p-value in permutation test (0.073) is very close to the Kruskal-Wallis test (.056) and the p-from the anova F test is slightly larger (.088). At the \(\alpha = 0.05\) level the conclusions would be the same (fail-to-reject). The assumption of normality is violated in the anova F test and the sample size it small (5), however the outliers don’t appear to have that large of an effect on p-values.
Download this data set: https://bit.ly/2FKGu43
d2=read.csv("NP_HW2.csv")
It contains 50 observations divided equally between 5 groups. First perform the Kruskal-Wallis test and report your results. Then, if necessary, perform multiple comparisons using the Wilcoxon rank-sum test using the Bonferroni correction to control the family-wise (or experiment-wise) error rate at α=0.05. Report your results and any significant differences that you find.
kruskal.test(d2$score~d2$group)
##
## Kruskal-Wallis rank sum test
##
## data: d2$score by d2$group
## Kruskal-Wallis chi-squared = 26.033, df = 4, p-value = 3.116e-05
pairwise.wilcox.test(d2$score, d2$group, p.adj="bonf")
##
## Pairwise comparisons using Wilcoxon rank sum test
##
## data: d2$score and d2$group
##
## 1 2 3 4
## 2 1.0000 - - -
## 3 0.0021 0.0209 - -
## 4 0.0389 0.7526 0.0049 -
## 5 0.0209 0.1469 1.0000 0.0049
##
## P value adjustment method: bonferroni
Since the Kruskal-Wallis Rank Sum test was significant we know that at least one of the medians is different so we look at Bonferroni adjusted pair-wise wilcoxon Rank Sum Tests to find which pairs are significantly different and the following pairs are signficant at the \(\alpha = 0.05\) level: 1-3, 1-4, 1-5, 2-3, 3-4, and 4-5.
Perform the Kruskal-Wallis test on the data to determine if there is a difference between the median lengths for the YOY gizzard shad populations in the four Kokosing Lake sites. Make sure to stat the null and alternative hypotheses, the p-value, and your conclusion.
\[H_0:\theta_1=\theta_2=\theta_3=\theta_4\] \(H_a:\) at least one \(\theta\) is not the same.
Where \(\theta\) are the median lengths.
d3=data.frame("lengths"=c(46,28,46,37,32,41,42,45,38,44,42,60,32,42,45,58,27,51,42,52,38,33,26,25,28,28,26,27,27,27,31,30,27,29,30,25,25,24,27,30), "site"=c(rep("site1", 10),rep("site2", 10),rep("site3", 10),rep("site4", 10)))
k =kruskal.test(d3$lengths~d3$site)
pairwise.wilcox.test(d3$lengths, d3$site, p.adj="bonf")
## Warning in wilcox.test.default(xi, xj, paired = paired, ...): cannot
## compute exact p-value with ties
## Warning in wilcox.test.default(xi, xj, paired = paired, ...): cannot
## compute exact p-value with ties
## Warning in wilcox.test.default(xi, xj, paired = paired, ...): cannot
## compute exact p-value with ties
## Warning in wilcox.test.default(xi, xj, paired = paired, ...): cannot
## compute exact p-value with ties
## Warning in wilcox.test.default(xi, xj, paired = paired, ...): cannot
## compute exact p-value with ties
## Warning in wilcox.test.default(xi, xj, paired = paired, ...): cannot
## compute exact p-value with ties
##
## Pairwise comparisons using Wilcoxon rank sum test
##
## data: d3$lengths and d3$site
##
## site1 site2 site3
## site2 1.0000 - -
## site3 0.0066 0.0084 -
## site4 0.0045 0.0057 1.0000
##
## P value adjustment method: bonferroni
Since the p-value 4.334658810^{-5}< \(\alpha = 0.05\), we reject the null hypothesis and conclude at least one of the medians is different. Using the bonferroni adjusted p-values of the pair-wise Wilcoxon Rank Sum Test we see that site1-site3, site1-site4, site2-site3, and site2-site4 are signfificantly different.
(Grad students only) Suppose k=4, n1=n2=n3=1 and n4=2. Obtain the form of the exact null (H0) distribution of H for the case of no tied data.
combinationsn4=combn(5,2)
ranks = c(1,2,3,4,5)
H = rep(NA,nsim)
for (i in 1:ncol(combinationsn4)){
R=c(setdiff(ranks,combinationsn4[,i]),combinationsn4[,i])
H[i]=(12/30)*(1*(R[1]-(3))^2 +1*(R[2]-(3))^2 +1*(R[3]-(3))^2 +2*((R[4]+R[5])/2-(3))^2)
print(H[i])
}
## [1] 3.8
## [1] 3.2
## [1] 2.2
## [1] 0.8
## [1] 3.8
## [1] 3.2
## [1] 2.2
## [1] 3.8
## [1] 3.2
## [1] 3.8
hist(H)
(Grad students only) The Bonferroni correction is known to be conservative. This means that we are guaranteed to control family-wise error rate at less than or equal to α, but often the true family-wise error rate is much less than this. Perform a simulation study to compute the true family-wise error rate when performing multiple comparisons under the null hypothesis that the means of all groups are the same. Perform the simulation study using the following parameters: • α = 0.2 • k=6 • nj =20 for j =1,2,3,4,5,6, • Data in each group is generated from a normal distribution with variance 1 • Use the Wilcoxon rank-sum test for multiple comparisons.
d5=data.frame("samples" = c(rnorm(120)), "j"=c(rep(1,20),rep(2,20),rep(3,20),rep(4,20),rep(5,20),rep(6,20)))
set.seed(1)
nsim5=1000
alpha=.2
countvec=rep(0,nsim5)
for (k in 1:nsim5){
dp = d5
dp$samples = d5$samples[sample(1:120,120)]
wk = pairwise.wilcox.test(dp$samples,dp$j,p.adj="none")
for(i in 1:5){
for(j in 1:i){
if (wk$p.value[i,j]<alpha) {countvec[k]=countvec[k]+1}
}
}
}
error = (nsim5-sum(countvec==0))/nsim5; error
## [1] 0.791
I got a family error rate of 0.791 in my simulation.
I counted the number of times I got at least 1 significant p-value in my pair-wise Wilcoxon Rank-Sum tests (unadjusted).
d5=data.frame("samples" = c(rnorm(120)), "j"=c(rep(1,20),rep(2,20),rep(3,20),rep(4,20),rep(5,20),rep(6,20)))
set.seed(1)
nsim5=1000
alpha=.2
countvec=rep(0,nsim5)
for (k in 1:nsim5){
dp = d5
dp$samples = d5$samples[sample(1:120,120)]
wk = pairwise.wilcox.test(dp$samples,dp$j,p.adj="bonf")
for(i in 1:5){
for(j in 1:i){
if (wk$p.value[i,j]<alpha) {countvec[k]=countvec[k]+1}
}
}
}
errorb = (nsim5-sum(countvec==0))/nsim5; errorb
## [1] 0.124
The Bonferroni adjustment guarantees the family alpha level of .2 (I actually got an error rate of 0.124).