Joel Correa da Rosa
May 31st, 2017
The table below lists the nonparametric counterparts of common parametric tests.
Parametric | Nonparametric |
---|---|
two-sample t-test | Wilcoxon-Mann-Whitney |
paired t-test | Wilcoxon Signed Ranks |
one-way ANOVA | Kruskal-Wallis |
Whenever the two-sample t-test may be used, the Wilcoxon statistic may also be used. However, for variables that follow the normal distribution, the Wilcoxon test is slightly less efficient than the t-test: the asymptotic relative efficiency of the Wilcoxon-Mann-Whitney test with respect to the t-test is \( 3/\pi \approx 95.5\% \).
The Wilcoxon-Mann-Whitney test statistic is computed in the following steps:
1) Combine the samples from the two populations.
2) Sort the values in ascending order.
3) Replace each value by its rank (assign 1 to the smallest value and \( m+n \), the total number of observations, to the largest value).
4) Compute the sum of ranks within each sample.
The test statistic \( S \) is the sum of the ranks in the smaller group. If the groups are balanced, \( S \) can be arbitrarily chosen between the two groups.
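The steps above can be sketched in a few lines of base R. The vectors `x` and `y` are hypothetical toy samples, not part of the nutrition dataset, and serve only to illustrate the procedure.

```r
# Hypothetical toy samples (illustration only, not from the nutrition dataset)
x <- c(1.2, 3.4, 2.2)        # smaller sample, m = 3
y <- c(0.5, 4.1, 2.9, 3.8)   # larger sample,  n = 4

# Step 1: combine the two samples
pooled <- c(x, y)
group  <- rep(c("x", "y"), times = c(length(x), length(y)))

# Steps 2-3: rank() sorts implicitly and returns the rank of each value
r <- rank(pooled)

# Step 4: sum the ranks within each sample
rank.sums <- tapply(r, group, sum)

# The statistic S is the rank sum of the smaller group (here x)
S <- rank.sums[["x"]]
S
```

The two rank sums always add up to \( (m+n)(m+n+1)/2 \), which gives a quick consistency check.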
Denote by \( S \) the statistic that was previously described. Under the null hypothesis \( H_0 \):
Expected value \( E(S) = \frac{m(m+n+1)}{2} \)
Variance \( Var(S) = \frac{mn(m+n+1)}{12} \)
\( m: \) smaller sample size
\( n: \) larger sample size
For large samples, the distribution of the \( S \) statistic under \( H_0 \) is approximately normal with mean \( E(S) \) and variance \( Var(S) \).
The rationale for the Wilcoxon-Mann-Whitney test is to reject the null hypothesis \( H_0 \) if the observed \( S \) statistic deviates substantially from its expected value \( E(S) \) under \( H_0 \).
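As a quick numerical illustration, assume two balanced groups with \( m = n = 6 \) (sizes chosen only for this example). Under \( H_0 \):
\( E(S) = \frac{6(6+6+1)}{2} = 39 \)
\( Var(S) = \frac{6 \cdot 6 \cdot (6+6+1)}{12} = 39 \)
An observed rank sum would therefore be judged against a normal distribution with mean 39 and standard deviation \( \sqrt{39} \approx 6.24 \).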
We will apply the Wilcoxon-Mann-Whitney test to the nutrition dataset to assess whether cholesterol levels differ significantly between the Male and Female groups. Compared to the previous version, this dataset records cholesterol with two decimal places. The extra precision reduces the number of ties when ranks are assigned to the cholesterol levels.
cholesterol | treatment | gender |
---|---|---|
310.29 | control | M |
320.78 | control | M |
370.41 | control | F |
240.88 | control | F |
280.93 | diet | M |
250.05 | diet | M |
270.53 | diet | F |
280.88 | diet | F |
240.55 | exercise | M |
230.46 | exercise | M |
180.95 | exercise | F |
270.45 | exercise | F |
library(knitr)  # provides kable()
# rank the cholesterol values across the pooled sample
d$rank.chol<-rank(d$cholesterol)
kable(d)
cholesterol | treatment | gender | rank.chol |
---|---|---|---|
310.29 | control | M | 10 |
320.78 | control | M | 11 |
370.41 | control | F | 12 |
240.88 | control | F | 4 |
280.93 | diet | M | 9 |
250.05 | diet | M | 5 |
270.53 | diet | F | 7 |
280.88 | diet | F | 8 |
240.55 | exercise | M | 3 |
230.46 | exercise | M | 2 |
180.95 | exercise | F | 1 |
270.45 | exercise | F | 6 |
We will use the rank transformation to test the hypothesis that cholesterol levels are associated with gender.
library(plyr)  # provides ddply()
# use ddply to sum the ranks within each gender
SumOfRanks<-ddply(d,.(gender),summarise,sum.ranks=sum(rank.chol))
SumOfRanks
gender sum.ranks
1 F 38
2 M 40
# S: the groups are balanced, so either rank sum may be used; we take Females
S <- SumOfRanks$sum.ranks[SumOfRanks$gender=='F']
S
[1] 38
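The same rank sums can be obtained without the plyr package. The following is a minimal base-R sketch that re-enters the cholesterol values and genders from the table above; the object names `chol`, `gender` and `rank.sums` are introduced here only for illustration.

```r
# Cholesterol values and genders, re-entered from the table above
chol <- c(310.29, 320.78, 370.41, 240.88,
          280.93, 250.05, 270.53, 280.88,
          240.55, 230.46, 180.95, 270.45)
gender <- rep(c("M", "M", "F", "F"), times = 3)

d <- data.frame(cholesterol = chol, gender = gender)

# Rank the pooled values, then sum the ranks within each gender
d$rank.chol <- rank(d$cholesterol)
rank.sums <- tapply(d$rank.chol, d$gender, sum)
rank.sums

# Rank sum for Females, used as the statistic S
S <- rank.sums[["F"]]
```

`tapply` returns a named vector, so each group's rank sum can be extracted by name instead of by row position.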
The following sequence of functions in R will generate the distribution of the \( S \) statistic under the null hypothesis \( H_0 \).
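# preallocate the vector of simulated statistics
n.sim<-50000
s<-numeric(n.sim)
for (i in 1:n.sim){
# generate a random permutation of the ranks
temp.d<-sample(d$rank.chol,nrow(d),replace=FALSE)
# sum the permuted ranks assigned to Females
s[i]<-sum(temp.d[d$gender=='F'])
}
plot(density(s))
rug(s)
# mark the observed statistic on the horizontal axis
abline(v=S,col='red')
text(S,0,'Sobs',col='red',pos=4)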
To obtain the p-value from the generated distribution of \( S \), we tabulate how often a simulated value fell below the observed rank sum for Females and double that proportion to obtain a two-sided p-value.
p.value = 2*prop.table(table(s<SumOfRanks$sum.ranks[ which(SumOfRanks$gender=='F') ] ) )['TRUE']
p.value
TRUE
0.81748
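Because the sample is small, the null distribution of \( S \) can also be enumerated exactly instead of simulated. The sketch below assumes there are no ties, so the pooled ranks are simply 1 to 12, and that the observed rank sum for Females is 38, as in the output above. The names `all.sums`, `p.strict` and `p.incl` are introduced only for illustration.

```r
ranks <- 1:12   # pooled ranks when there are no ties
S.obs <- 38     # observed rank sum for Females

# Rank sum of every possible set of 6 ranks out of 12: choose(12, 6) = 924
all.sums <- combn(ranks, 6, FUN = sum)

# Same convention as the simulation: strictly below the observed value
p.strict <- 2 * mean(all.sums < S.obs)

# More common convention: count outcomes at least as extreme as observed
p.incl <- min(1, 2 * mean(all.sums <= S.obs))

c(strict = p.strict, inclusive = p.incl)
```

The strict version should agree with the simulated p-value up to Monte Carlo error. The inclusive version is larger because it also counts outcomes equal to the observed value.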
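The wilcox.test function in R reports the following statistic, where \( S \) is the sum of ranks in the first group (here Females, the first level of gender) and \( m \) is the size of that group:
\( W = S-\frac{m(m+1)}{2} \)
For \( S = 38 \) and \( m = 6 \), this gives \( W = 38 - 21 = 17 \).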
wilcox.test(d$cholesterol~d$gender, paired=FALSE , exact=FALSE , correct=FALSE, alternative='two.sided')
Wilcoxon rank sum test
data: d$cholesterol by d$gender
W = 17, p-value = 0.8728
alternative hypothesis: true location shift is not equal to 0
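The arguments 'paired', 'exact', 'correct' and 'alternative' were set to match the setup of the generated \( S \) distribution: an unpaired, two-sided comparison. Setting exact=FALSE and correct=FALSE requests the normal approximation without continuity correction.
Note that the two p-values are similar (0.817 from the generated distribution, 0.873 from the normal approximation), although not identical.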
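The p-value reported by wilcox.test can be reproduced by hand from the normal approximation. The sketch below assumes \( m = n = 6 \) and the observed Female rank sum \( S = 38 \). It also computes W as the rank sum of the first group minus \( m(m+1)/2 \), which is how the R documentation describes the reported statistic.

```r
m <- 6    # size of the first group (Females)
n <- 6    # size of the second group (Males)
S <- 38   # observed rank sum for Females

# Null mean and variance of S
E.S   <- m * (m + n + 1) / 2
Var.S <- m * n * (m + n + 1) / 12

# z-score without continuity correction, and two-sided p-value
z <- (S - E.S) / sqrt(Var.S)
p.normal <- 2 * pnorm(-abs(z))
round(p.normal, 4)   # 0.8728

# Statistic as reported by wilcox.test
W <- S - m * (m + 1) / 2
W                    # 17
```

Shifting \( S \) by a constant does not change the z-score, so testing with \( W \) or with \( S \) gives the same p-value.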
The parametric paired t-test for comparing two paired population means relies on a set of assumptions and may be highly affected by extreme observations.
The Wilcoxon signed-rank test is the nonparametric counterpart of the paired t-test. Instead of means, the hypotheses of this test concern medians, more specifically the median of the paired differences.
1) Compute the difference within each pair.
2) Sort the absolute differences in ascending order.
3) Replace the sorted absolute differences by their ranks (1 for the smallest value, \( n \) for the largest).
4) Multiply each rank by the sign of the corresponding difference.
5) Compute two sums: the sum of the ranks with a positive sign and the sum of the ranks with a negative sign.
\( V \) is the sum of ranks that were assigned the positive sign.
For the Wilcoxon signed-rank test, the statistic \( V \) is standardized into a z-score, which is asymptotically normally distributed under \( H_0 \).
\( z = \frac{V-\mu_V}{\sigma_V} \)
\( \mu_V = \frac{n(n+1)}{4} \)
\( \sigma_V = \sqrt{\frac{n(n+1)(2n+1)}{24}} \)
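For illustration, take \( n = 5 \) pairs and an observed \( V = 2 \), the values that arise in the BMI example that follows:
\( \mu_V = \frac{5 \cdot 6}{4} = 7.5 \)
\( \sigma_V = \sqrt{\frac{5 \cdot 6 \cdot 11}{24}} = \sqrt{13.75} \approx 3.71 \)
\( z = \frac{2-7.5}{3.71} \approx -1.48 \)
With so few pairs the normal approximation is rough, so an exact calculation is preferable.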
A small group of 5 patients was given a very low calorie diet whose effectiveness was measured by the decrease in the Body Mass Index (BMI). The following data show the BMI measured in the same individuals at two time points: baseline and 6 months after the beginning of the diet.
# Evaluate post-pre differences
bmi.diff<-bmi.post-bmi.pre
# rank and obtain signs
signed.ranks<-sign(bmi.diff)*rank(abs(bmi.diff))
signed.ranks
[1] -4 -5 -1 2 -3
V=sum(signed.ranks[signed.ranks>0])
V
[1] 2
wilcox.test(bmi.diff)
Wilcoxon signed rank test
data: bmi.diff
V = 2, p-value = 0.1875
alternative hypothesis: true location is not equal to 0
Alternatively, the two vectors can be passed directly with paired=TRUE:
wilcox.test(bmi.post,bmi.pre,paired=TRUE)
Wilcoxon signed rank test
data: bmi.post and bmi.pre
V = 2, p-value = 0.1875
alternative hypothesis: true location shift is not equal to 0
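The p-value of 0.1875 can be checked by enumerating the exact null distribution of \( V \). The sketch below assumes five non-zero differences without ties, so the ranks are 1 to 5 and each rank is equally likely to carry a positive or a negative sign under \( H_0 \). The names `signs`, `V.all` and `p.exact` are introduced only for illustration.

```r
n <- 5       # number of pairs
V.obs <- 2   # observed sum of positive ranks

# All 2^5 = 32 ways to mark each rank as positive (1) or negative (0)
signs <- expand.grid(rep(list(c(0, 1)), n))

# V for each sign pattern: sum of the ranks marked positive
V.all <- as.matrix(signs) %*% (1:n)

# Two-sided exact p-value: double the lower-tail probability
p.exact <- 2 * mean(V.all <= V.obs)
p.exact   # 0.1875
```

Only three sign patterns give \( V \le 2 \) (no positive ranks, only rank 1 positive, only rank 2 positive), so the p-value is \( 2 \cdot 3/32 = 0.1875 \).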