Lecture 12: Nonparametric Tests

Joel Correa da Rosa
May 31st, 2017

Equivalence Parametric vs. Nonparametric tests

The table below lists the nonparametric counterpart of each common parametric test.

Parametric           Nonparametric
two-sample t-test    Wilcoxon-Mann-Whitney
paired t-test        Wilcoxon Signed Rank
one-way ANOVA        Kruskal-Wallis

Two-sample (Independent Populations)

Wilcoxon Mann-Whitney

Whenever the two-sample t-test may be used, the Wilcoxon-Mann-Whitney test may also be used. However, for variables that follow the normal distribution, the Wilcoxon-Mann-Whitney test is less efficient than the t-test: its asymptotic relative efficiency with respect to the t-test is \( 3/\pi \approx 95.5\% \).

Wilcoxon Mann-Whitney test statistic

The Wilcoxon-Mann-Whitney test statistic is computed in the following steps.

  1. Combine samples from the two populations.

  2. Sort the values in ascending order.

  3. Replace each value by its rank (e.g. assign 1 to the smallest value and the combined sample size to the largest value).

  4. Compute the sum of ranks within each sample.

  5. The test statistic \( S \) is the sum of the ranks in the smaller group. If the two groups have the same size, either group may be chosen.
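The five steps above can be sketched in R as follows. This is only an illustration, not the lecture's own code: the function name rank.sum.statistic is hypothetical, and the two vectors hold the Female and Male cholesterol values from the nutrition data used later in this lecture.

```r
# Hypothetical helper: rank sum of the smaller sample.
# Note: rank() gives tied values their average rank.
rank.sum.statistic <- function(x, y) {
  pooled <- c(x, y)                          # step 1: combine both samples
  r <- rank(pooled)                          # steps 2-3: ranks in ascending order
  sum.x <- sum(r[seq_along(x)])              # step 4: rank sum of the first sample
  sum.y <- sum(r[length(x) + seq_along(y)])  #         rank sum of the second sample
  # step 5: rank sum of the smaller sample; x is used when the sizes are equal
  if (length(x) <= length(y)) sum.x else sum.y
}

female <- c(370.41, 240.88, 270.53, 280.88, 180.95, 270.45)
male   <- c(310.29, 320.78, 280.93, 250.05, 240.55, 230.46)

S <- rank.sum.statistic(female, male)
S
```

Because the two groups are balanced here, passing the Male values first would return the other rank sum instead; the choice is arbitrary, as stated in step 5.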

Distribution of Wilcoxon-Mann-Whitney test statistic

Denote by \( S \) the statistic that was previously described. Under the null hypothesis \( H_0 \):

Expected value \( E(S) = \frac{m(m+n+1)}{2} \)

Variance \( Var(S) = \frac{mn(m+n+1)}{12} \)

\( m: \) smaller sample size

\( n: \) larger sample size
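As a sanity check, the two formulas can be compared with a brute-force enumeration. The sketch below assumes \( m = n = 6 \) (the group sizes of the cholesterol example in this lecture) and no ties; the variable names are illustrative.

```r
m <- 6   # smaller sample size
n <- 6   # larger sample size

# Moments of S under H0 from the formulas
E.S   <- m * (m + n + 1) / 2
Var.S <- m * n * (m + n + 1) / 12

# Under H0 every choice of m ranks out of 1..(m+n) is equally likely,
# so enumerate all of them and compute the rank sum of each choice.
all.sums <- colSums(combn(m + n, m))

c(formula.mean = E.S,   enumerated.mean = mean(all.sums))
c(formula.var  = Var.S, enumerated.var  = mean((all.sums - E.S)^2))
```

Both the mean and the variance equal 39 for these sample sizes, whether obtained from the formulas or from the enumeration.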

Distribution of Wilcoxon-Mann-Whitney test statistic

For large samples, the distribution of the \( S \) statistic is approximately normal with mean \( E(S) \) and variance \( Var(S) \).

The rationale for the Wilcoxon-Mann-Whitney test is to reject the null hypothesis \( H_0 \) if the observed \( S \) statistic deviates substantially from its expected value \( E(S) \) under \( H_0 \).
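A minimal sketch of this rationale, using the normal approximation without a continuity correction. The values \( S = 38 \) and \( m = n = 6 \) are taken from the cholesterol example later in this lecture; the variable names are illustrative.

```r
S <- 38   # observed rank sum (cholesterol example)
m <- 6    # smaller sample size
n <- 6    # larger sample size

E.S   <- m * (m + n + 1) / 2
Var.S <- m * n * (m + n + 1) / 12

# Standardized distance between the observed S and its expectation under H0
z <- (S - E.S) / sqrt(Var.S)

# Two-sided p-value from the standard normal distribution
p.value <- 2 * pnorm(-abs(z))
c(z = z, p.value = p.value)
```

With only 6 observations per group the normal approximation is rough, so the result should be read as an approximation rather than an exact p-value.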

Application of the Wilcoxon-Mann-Whitney test

We will apply the Wilcoxon-Mann-Whitney test to the nutrition dataset to assess the significance of the difference between the cholesterol levels in the Male and Female groups. Compared with the previous version, this dataset has been modified to include two decimal places. Increasing the precision of the measurements avoids ties when assigning ranks to the cholesterol levels.

Cholesterol Levels in Male vs. Female Groups

cholesterol  treatment  gender
310.29       control    M
320.78       control    M
370.41       control    F
240.88       control    F
280.93       diet       M
250.05       diet       M
270.53       diet       F
280.88       diet       F
240.55       exercise   M
230.46       exercise   M
180.95       exercise   F
270.45       exercise   F

Using R for Assigning Ranks

library(knitr)   # provides kable()
# rank() assigns 1 to the smallest cholesterol value
d$rank.chol<-rank(d$cholesterol)
kable(d)
cholesterol  treatment  gender  rank.chol
310.29       control    M       10
320.78       control    M       11
370.41       control    F       12
240.88       control    F       4
280.93       diet       M       9
250.05       diet       M       5
270.53       diet       F       7
280.88       diet       F       8
240.55       exercise   M       3
230.46       exercise   M       2
180.95       exercise   F       1
270.45       exercise   F       6

Calculation of the S statistic for groups comparison

We will use the rank transformation to test the hypothesis that cholesterol levels are associated with gender.

# using the function ddply (package plyr) to summarize by groups
library(plyr)
SumOfRanks<-ddply(d,.(gender),summarise,sum.ranks=sum(rank.chol))
SumOfRanks
  gender sum.ranks
1      F        38
2      M        40
# S: the groups are balanced, so we take the smaller of the two rank sums
S = min(SumOfRanks$sum.ranks)
S
[1] 38

Generate the distribution of S under the null hypothesis

The following R code generates the distribution of the \( S \) statistic under the null hypothesis \( H_0 \) by repeatedly permuting the ranks at random across the observations.

n.sim<-50000
s<-numeric(n.sim)   # preallocate instead of growing the vector
for (i in 1:n.sim){

# generate a random permutation of the ranks
temp.d<-sample(d$rank.chol)

# sum the permuted ranks that fall in the Female positions
s[i]<-sum(temp.d[d$gender=='F'])
}

Simulated Distribution of S

plot(density(s))
rug(s)
# mark the observed statistic on the horizontal axis
abline(v=S,col='red')
text(S,0,'Sobs',col='red',pos=4)

[Figure: density of the simulated S values, with the observed value Sobs marked in red]

P-value

To obtain the p-value from the generated distribution of \( S \), we compute the proportion of simulated values that fall below the observed value (the rank sum for Females) and double it to obtain a two-sided p-value.

p.value = 2*prop.table(table(s<SumOfRanks$sum.ranks[ which(SumOfRanks$gender=='F') ] ) )['TRUE']
p.value
   TRUE 
0.81748 
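With only 12 observations, the permutation distribution can also be enumerated completely instead of simulated. The sketch below is an illustrative alternative, not the lecture's method: it assumes no ties and counts rank sums at least as extreme as the observed one (using <= and >=), so its result will not coincide exactly with the simulated value above, which counts strictly smaller sums.

```r
# Ranks of the Female group, taken from the rank table above
female.ranks <- c(12, 4, 7, 8, 1, 6)
S.obs <- sum(female.ranks)

# All equally likely ways of assigning 6 of the 12 ranks to the Female group
all.sums <- colSums(combn(12, 6))

# Exact tail probabilities of the rank sum under H0
p.lower <- mean(all.sums <= S.obs)
p.upper <- mean(all.sums >= S.obs)

# Two-sided exact p-value: twice the smaller tail, capped at 1
p.exact <- min(1, 2 * min(p.lower, p.upper))
p.exact
```

Enumeration is feasible here because there are only choose(12, 6) = 924 assignments; for larger samples the simulation or the normal approximation is the practical choice.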

Using built-in function in R for the Wilcoxon-Mann-Whitney Test

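The wilcox.test function in R reports the statistic \( W \), which is the rank sum \( S \) of the first group (here F, the first level of gender) minus its smallest possible value:

\( W=S-\frac{m(m+1)}{2} \)

where \( m \) is the size of that group. With \( S = 38 \) and \( m = 6 \), this gives \( W = 17 \).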

wilcox.test(d$cholesterol~d$gender, paired=FALSE , exact=FALSE , correct=FALSE, alternative='two.sided')

    Wilcoxon rank sum test

data:  d$cholesterol by d$gender
W = 17, p-value = 0.8728
alternative hypothesis: true location shift is not equal to 0

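The arguments 'paired', 'exact', 'correct' and 'alternative' were set so that the test treats the samples as independent, is two-sided, and uses the normal approximation without continuity correction, which is the setting closest to the generated S distribution in the previous slides.

Note that the two p-values (0.817 from the simulation and 0.873 from the normal approximation) are close and lead to the same conclusion: there is no evidence of a difference in cholesterol levels between the Male and Female groups.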

One-Sample (or Paired Samples)

Wilcoxon's Signed Rank

The parametric paired t-test for comparing two paired population means relies on a set of assumptions and may be highly affected by extreme observations.

Wilcoxon's signed rank test is the nonparametric counterpart of the paired t-test. Instead of means, the hypotheses of this test concern medians, more specifically the median of the paired differences.

Rank-based Statistic

1) Evaluate the differences in each pair.

2) Sort the absolute differences in ascending order.

3) Replace the sorted absolute differences by their ranks (1 for the smallest value, n for the largest).

4) Multiply each rank by the sign of the difference.

5) Compute two sums: the sum of the ranks with positive signs and the sum of the ranks with negative signs.

\( V \) is the sum of ranks that were assigned the positive sign.

Wilcoxon's Test Statistic

For large \( n \) (the number of pairs), the statistic \( V \) of Wilcoxon's signed rank test is approximately normally distributed, so it can be standardized into a z-score:

\( z = \frac{V-\mu_V}{\sigma_V} \)

\( \mu_V = \frac{n(n+1)}{4} \)

\( \sigma_V = \sqrt{\frac{n(n+1)(2n+1)}{24}} \)
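A short sketch of the standardization above. The values \( V = 2 \) and \( n = 5 \) are taken from the BMI example that follows; the variable names are illustrative, and no continuity correction is applied.

```r
V <- 2   # sum of the positive signed ranks (BMI example)
n <- 5   # number of pairs

mu.V    <- n * (n + 1) / 4                       # mean of V under H0
sigma.V <- sqrt(n * (n + 1) * (2 * n + 1) / 24)  # standard deviation of V under H0

z <- (V - mu.V) / sigma.V

# Two-sided p-value from the normal approximation
p.approx <- 2 * pnorm(-abs(z))
c(mu.V = mu.V, sigma.V = sigma.V, z = z, p.approx = p.approx)
```

With only 5 pairs the approximation is crude, so this p-value can differ noticeably from an exact one.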

Example Obese Weight Loss

A small group of 5 patients was given a very low calorie diet whose effectiveness was measured by the decrease in the Body Mass Index (BMI). The following data show the BMI measured in each individual at two time points: baseline and 6 months after the beginning of the diet.

Signed Ranks in R

# Evaluate post-pre differences
bmi.diff<-bmi.post-bmi.pre

# rank and obtain signs
signed.ranks<-sign(bmi.diff)*rank(abs(bmi.diff))
signed.ranks
[1] -4 -5 -1  2 -3
V=sum(signed.ranks[signed.ranks>0])
V
[1] 2

Using R for calculation

wilcox.test(bmi.diff)

    Wilcoxon signed rank test

data:  bmi.diff
V = 2, p-value = 0.1875
alternative hypothesis: true location is not equal to 0
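The p-value reported above can be reproduced by hand, because with 5 pairs there are only \( 2^5 = 32 \) equally likely sign assignments under the null hypothesis. The sketch below assumes no ties and no zero differences; the variable names are illustrative.

```r
n     <- 5   # number of pairs
V.obs <- 2   # observed sum of positive ranks

# Every possible assignment of signs to the ranks 1..n (1 = positive, 0 = negative)
signs <- as.matrix(expand.grid(rep(list(0:1), n)))

# V for each assignment: sum of the ranks that received a positive sign
V.all <- as.vector(signs %*% seq_len(n))

# Two-sided exact p-value: twice the smaller tail, capped at 1
p.exact <- min(1, 2 * min(mean(V.all <= V.obs), mean(V.all >= V.obs)))
p.exact
```

Only 3 of the 32 assignments give \( V \le 2 \) (no positive rank, only rank 1 positive, only rank 2 positive), so the two-sided p-value is \( 2 \times 3/32 = 0.1875 \).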

Using R for calculation

Alternatively, the two paired samples can be passed directly:

wilcox.test(bmi.post,bmi.pre,paired=TRUE)

    Wilcoxon signed rank test

data:  bmi.post and bmi.pre
V = 2, p-value = 0.1875
alternative hypothesis: true location shift is not equal to 0