Wilcoxon Signed Rank Test

Suppose 10 subjects each translate two English passages of equal length into French, and we record the number of errors each subject makes on each passage. We wish to test, at the 5% level, whether the two sets of scores differ significantly. Because the same subjects produce both sets of scores, the observations are paired. Since we are not predicting the direction of any difference, a non-directional (two-tailed) test is appropriate.

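library(dplyr) # provides tbl_df(), mutate(), filter(), summarize() and %>%

a <- c(8, 7, 4, 2, 4, 10, 17, 3, 2, 11)  # errors on the first passage
b <- c(10, 6, 4, 5, 7, 11, 15, 6, 3, 14) # errors on the second passage
data <- data.frame(a, b)
data <- tbl_df(data) # deprecated in recent dplyr releases; as_tibble() is the replacement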
data
## Source: local data frame [10 x 2]
## 
##     a  b
## 1   8 10
## 2   7  6
## 3   4  4
## 4   2  5
## 5   4  7
## 6  10 11
## 7  17 15
## 8   3  6
## 9   2  3
## 10 11 14

Step 1:

Find the difference between each pair of scores.

data <- mutate(data, diffs = a - b) # Add a column of differences 'a' - 'b'
data
## Source: local data frame [10 x 3]
## 
##     a  b diffs
## 1   8 10    -2
## 2   7  6     1
## 3   4  4     0
## 4   2  5    -3
## 5   4  7    -3
## 6  10 11    -1
## 7  17 15     2
## 8   3  6    -3
## 9   2  3    -1
## 10 11 14    -3

Step 2:

Remove all pairs whose difference is 0.

data <- filter(data, a-b != 0)
data
## Source: local data frame [9 x 3]
## 
##    a  b diffs
## 1  8 10    -2
## 2  7  6     1
## 3  2  5    -3
## 4  4  7    -3
## 5 10 11    -1
## 6 17 15     2
## 7  3  6    -3
## 8  2  3    -1
## 9 11 14    -3

Step 3:

Rank the differences by their absolute value, ignoring sign. Tied values share the average of the ranks they would otherwise occupy.

data <- mutate(data, ranks = rank(abs(diffs)))
data
## Source: local data frame [9 x 4]
## 
##    a  b diffs ranks
## 1  8 10    -2   4.5
## 2  7  6     1   2.0
## 3  2  5    -3   7.5
## 4  4  7    -3   7.5
## 5 10 11    -1   2.0
## 6 17 15     2   4.5
## 7  3  6    -3   7.5
## 8  2  3    -1   2.0
## 9 11 14    -3   7.5
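
The fractional ranks in the table come from how rank() treats ties. As a minimal sketch, assuming only base R and the nine non-zero differences from Step 2 (the variable names abs_diffs and ranks are illustrative), the averaging can be checked by hand:

```r
# Absolute values of the nine non-zero differences from Step 2
abs_diffs <- abs(c(-2, 1, -3, -3, -1, 2, -3, -1, -3))

# rank() defaults to ties.method = "average": tied values share the mean
# of the positions they would occupy in sorted order
ranks <- rank(abs_diffs)
print(ranks)

# The three 1s occupy positions 1-3, the two 2s positions 4-5,
# and the four 3s positions 6-9, so the shared ranks are these means
print(c(mean(1:3), mean(4:5), mean(6:9)))
```

The second line of output gives the three shared ranks (2, 4.5 and 7.5) that appear in the ranks column.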

Step 4:

Give each rank the sign of its difference.

data$ranks <- ifelse(data$diffs < 0, data$ranks*-1, data$ranks)
data
## Source: local data frame [9 x 4]
## 
##    a  b diffs ranks
## 1  8 10    -2  -4.5
## 2  7  6     1   2.0
## 3  2  5    -3  -7.5
## 4  4  7    -3  -7.5
## 5 10 11    -1  -2.0
## 6 17 15     2   4.5
## 7  3  6    -3  -7.5
## 8  2  3    -1  -2.0
## 9 11 14    -3  -7.5

Step 5:

The sums of the positive ranks and of the negative ranks are calculated separately.

sumPos <- filter(data, ranks > 0) %>% summarize(Sum_of_Positives = sum(ranks))
sumNeg <- filter(data, ranks < 0) %>% summarize(Sum_of_Negatives = sum(ranks))
data.frame(sumPos, sumNeg)
##   Sum_of_Positives Sum_of_Negatives
## 1              6.5            -38.5
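
A quick way to catch arithmetic slips is to confirm that the two rank sums, taken in absolute value, add up to n(n + 1)/2, the sum of the ranks 1 to n. The sketch below redoes Steps 1 to 5 in base R as a cross-check; the names d, signed_ranks, w_pos and w_neg are illustrative and are not part of the dplyr pipeline above.

```r
a <- c(8, 7, 4, 2, 4, 10, 17, 3, 2, 11)
b <- c(10, 6, 4, 5, 7, 11, 15, 6, 3, 14)

d <- a - b                              # Step 1: paired differences
d <- d[d != 0]                          # Step 2: drop zero differences
signed_ranks <- sign(d) * rank(abs(d))  # Steps 3 and 4 in one expression

w_pos <- sum(signed_ranks[signed_ranks > 0])  # Step 5: positive rank sum
w_neg <- sum(signed_ranks[signed_ranks < 0])  # Step 5: negative rank sum

n <- length(d)
# The absolute rank sums must add up to 1 + 2 + ... + n
stopifnot(w_pos + abs(w_neg) == n * (n + 1) / 2)
print(c(w_pos = w_pos, w_neg = w_neg, total = n * (n + 1) / 2))
```

If the stopifnot() call raises an error, a rank or a sign has been assigned incorrectly somewhere in the earlier steps.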

Step 6:

The smaller of the two sums in absolute value, 6.5, is taken as the test statistic W.

Step 7:

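Compare W to the critical value. For the n = 9 non-zero differences and a two-tailed test at the 5% level, the critical value is 5, and the null hypothesis is rejected only if W is less than or equal to it. Since 6.5 > 5, do not reject the null hypothesis.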
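
The critical value is normally read from a printed table, but it can also be recovered from the exact null distribution that base R exposes through psignrank(). The following is a sketch under the assumption that there are no ties, which is what that distribution (and the printed tables) presuppose; the data here do contain ties, so the comparison is approximate. The names w and crit are illustrative.

```r
n <- 9          # pairs remaining after the zero difference is removed
alpha <- 0.05   # two-tailed significance level

# Candidate values of W: 0 up to the largest possible rank sum
w <- 0:(n * (n + 1) / 2)

# Critical value: the largest w whose lower-tail probability
# P(W <= w) does not exceed alpha / 2 under the null hypothesis
crit <- max(w[psignrank(w, n) <= alpha / 2])
print(crit)

W <- 6.5
print(W <= crit)   # TRUE would mean "reject the null hypothesis"
```

Taking the largest w with lower-tail probability at most alpha / 2 keeps the two-tailed error rate at or below the nominal 5%.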


How to perform the test quickly in R

wilcox.test(a, b, paired = TRUE)
## Warning in wilcox.test.default(a, b, paired = TRUE): cannot compute exact
## p-value with ties
## Warning in wilcox.test.default(a, b, paired = TRUE): cannot compute exact
## p-value with zeroes
## 
##  Wilcoxon signed rank test with continuity correction
## 
## data:  a and b
## V = 6.5, p-value = 0.06275
## alternative hypothesis: true location shift is not equal to 0
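
The reported V is the sum of the positive ranks, which matches the 6.5 found by hand, and the p-value of 0.06275 exceeds 0.05, so the conclusion agrees with Step 7. Because of the ties and the zero difference, R falls back on a normal approximation with a continuity correction. The sketch below reconstructs that p-value by hand, assuming the usual tie-corrected variance formula; the names mu, sigma, z and p are illustrative.

```r
a <- c(8, 7, 4, 2, 4, 10, 17, 3, 2, 11)
b <- c(10, 6, 4, 5, 7, 11, 15, 6, 3, 14)

d <- a - b
d <- d[d != 0]            # zero differences are dropped
n <- length(d)
r <- rank(abs(d))         # average ranks for ties
V <- sum(r[d > 0])        # sum of the positive ranks

# Mean and tie-corrected standard deviation of V under the null hypothesis
tie_sizes <- table(r)
mu <- n * (n + 1) / 4
sigma <- sqrt(n * (n + 1) * (2 * n + 1) / 24 -
              sum(tie_sizes^3 - tie_sizes) / 48)

# Continuity correction: move V half a unit towards its null mean
z <- (V - mu - sign(V - mu) * 0.5) / sigma

# Two-sided p-value from the standard normal distribution
p <- 2 * min(pnorm(z), pnorm(z, lower.tail = FALSE))
print(round(p, 5))
```

The result should agree with the p-value that wilcox.test() prints, up to rounding.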