The Wilcoxon Signed-Rank Test is a non-parametric alternative to the paired t-test.
Unlike the Sign Test, which only considers signs (+/-), the Wilcoxon Signed-Rank Test also incorporates the magnitude of differences.
The One-Sample Wilcoxon Signed-Rank Test is a nonparametric alternative to the one-sample t-test. It is used when:
- The data are not normally distributed.
- The sample size is small.
- The goal is to test whether the median of a single population is equal to a specified value.
The test was introduced by Frank Wilcoxon (1945) as an extension of the Sign Test, incorporating both signs and magnitudes of differences.
Motivation: The one-sample t-test assumes normality, but many real-world datasets do not satisfy this assumption. The Wilcoxon Signed-Rank Test provides a robust, distribution-free alternative.
Given a sample:
\[ \{X_1, X_2, \dots, X_n\} \]
we test whether the median (\(m\)) is equal to a hypothesized value \(m_0\).
For each observation \(X_i\), compute the difference:
\[ D_i = X_i - m_0 \]
Observations where \(D_i = 0\) are discarded.
Compute the absolute differences:
\[ |D_i| = |X_i - m_0| \]
Assign ranks \(R_i\) from smallest to largest. If there are ties, assign the average rank.
Each rank retains the sign of \(D_i\):
\[ R_i^+ = R_i \quad \text{if } D_i > 0, \quad R_i^- = R_i \quad \text{if } D_i < 0 \]
Define:
- \(W^+ =\) Sum of positive signed ranks.
- \(W^- =\) Sum of negative signed ranks.
The Wilcoxon Signed-Rank Test statistic is:
\[ W = \min(W^+, W^-) \]
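For large \(n\), \(W\) is approximately normally distributed under \(H_0\), and the standardized statistic is:
\[ Z = \frac{W - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}} \]
where \(n\) is the number of nonzero differences and \(Z\) approximately follows the standard normal distribution \(N(0,1)\).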
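The steps above can be sketched in R. This is a minimal illustration rather than a replacement for `wilcox.test`: the helper name `signed_rank_stats` and the five-value sample are hypothetical, and \(Z\) is computed exactly as written above, with no continuity or tie correction.

```r
# Hypothetical helper: computes the signed-rank quantities step by step.
signed_rank_stats <- function(x, m0) {
  d <- x - m0                  # differences D_i = X_i - m0
  d <- d[d != 0]               # discard zero differences
  n <- length(d)               # n = number of nonzero differences
  r <- rank(abs(d))            # ranks of |D_i|; ties receive the average rank
  w_plus  <- sum(r[d > 0])     # W+: ranks attached to positive differences
  w_minus <- sum(r[d < 0])     # W-: ranks attached to negative differences
  w <- min(w_plus, w_minus)    # test statistic W
  z <- (w - n * (n + 1) / 4) / sqrt(n * (n + 1) * (2 * n + 1) / 24)
  list(n = n, w_plus = w_plus, w_minus = w_minus, w = w, z = z)
}

# Hypothetical sample, tested against m0 = 13
stats <- signed_rank_stats(c(12, 15, 9, 20, 14), m0 = 13)
stats$w_plus   # 9.5
stats$w_minus  # 5.5
```

With a sample this small the normal approximation is not reliable, so the `z` value here is for illustration only. As a quick sanity check, \(W^+ + W^-\) should always equal \(n(n+1)/2\), which is 15 in this case.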
A nutritionist believes that the median daily caloric intake of a certain population is 2000 kcal. To test this claim, they collect a random sample of 10 individuals.
The observed caloric intake values (in kcal) are:
\[ \begin{array}{|c|c|} \hline \textbf{Participant} & \textbf{Caloric Intake} \\ \hline 1 & 1950 \\ 2 & 2020 \\ 3 & 2100 \\ 4 & 1980 \\ 5 & 2050 \\ 6 & 1990 \\ 7 & 2070 \\ 8 & 2000 \\ 9 & 1955 \\ 10 & 2080 \\ \hline \end{array} \]
# Sample Data: Daily Caloric Intake
caloric_intake <- c(1950, 2020, 2100, 1980, 2050, 1990, 2070, 2000, 1955, 2080)
# Perform Wilcoxon Signed-Rank Test in R
wilcox.test(caloric_intake, mu = 2000, alternative = "two.sided")
##
## Wilcoxon signed rank test with continuity correction
##
## data: caloric_intake
## V = 32, p-value = 0.2855
## alternative hypothesis: true location is not equal to 2000
The Wilcoxon Signed-Rank Test with continuity correction was conducted to assess whether the median caloric intake significantly differs from 2000 kcal.
The p-value (0.2855) is greater than 0.05, meaning we fail to reject the null hypothesis \(H_0\).
Since \(p > 0.05\), there is insufficient evidence to conclude that the median caloric intake differs from 2000 kcal. In other words, differences of the observed size could plausibly have occurred by chance.
Note that the statistic R reports, \(V = 32\), is \(W^+\) (the sum of the ranks of the positive differences), not \(\min(W^+, W^-)\). Here \(W^- = 13\), and the two-sided p-value is the same whichever of the two is used.
We have a sample \(\{X_1, X_2, \dots, X_n\}\) and we want to test whether the true median \(m\) equals a hypothesized value \(m_0\):
\[ H_0: m = m_0 \quad \text{vs.} \quad H_1: m \neq m_0 \]
The test statistic is:
\[ W = \min(W^+, W^-) \]
where \(W^+\) is the sum of ranks for positive differences and \(W^-\) is the sum of ranks for negative differences.
Under \(H_0\), \(W\) follows the Wilcoxon signed-rank distribution, which is approximately normal for large \(n\).
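As a check on the large-sample approximation, the sketch below reproduces the p-value that R reported for the caloric-intake example. It assumes the two adjustments that `wilcox.test` applies by default when ties or zeros rule out the exact test: a continuity correction of 0.5 and a tie correction to the variance. The variable names are illustrative.

```r
caloric_intake <- c(1950, 2020, 2100, 1980, 2050, 1990, 2070, 2000, 1955, 2080)

d <- caloric_intake - 2000
d <- d[d != 0]                     # drop the zero difference (participant 8)
n <- length(d)                     # 9 nonzero differences
r <- rank(abs(d))                  # average ranks for tied |D_i|

w_plus  <- sum(r[d > 0])           # 32, the value R prints as V
w_minus <- sum(r[d < 0])           # 13

mu_w <- n * (n + 1) / 4            # mean of W+ under H0
ties <- table(r)                   # sizes of the tied groups
sigma_w <- sqrt(n * (n + 1) * (2 * n + 1) / 24 - sum(ties^3 - ties) / 48)

# Two-sided p-value with a 0.5 continuity correction toward the mean
z <- (w_plus - mu_w - sign(w_plus - mu_w) * 0.5) / sigma_w
p_value <- 2 * pnorm(-abs(z))
round(p_value, 4)                  # 0.2855
```

The tie correction matters little here (it lowers the variance from 71.25 to 71), but without the continuity correction the p-value would not match the `wilcox.test` output.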
A company wants to test whether the median employee satisfaction score is equal to 75.
# Employee satisfaction scores (fixed values, so no random seed is needed)
satisfaction_scores <- c(78, 74, 80, 72, 76, 79, 75, 81, 77, 74)
# Hypothesized median satisfaction score
hypothesized_median <- 75
# Perform Wilcoxon Signed-Rank Test
wilcox_one_sample <- wilcox.test(satisfaction_scores, mu = hypothesized_median, paired = FALSE)
# Display test results
wilcox_one_sample
##
## Wilcoxon signed rank test with continuity correction
##
## data: satisfaction_scores
## V = 35.5, p-value = 0.1369
## alternative hypothesis: true location is not equal to 75
Decision rule at the 5% significance level:
- If the p-value is less than 0.05, we reject \(H_0\) and conclude that the median satisfaction score differs significantly from 75.
- If the p-value is greater than or equal to 0.05, we fail to reject \(H_0\): there is no statistically significant difference between the median satisfaction score and 75.
Here \(p = 0.1369 \geq 0.05\), so we fail to reject \(H_0\).
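The decision rule can also be applied programmatically by reading the p-value from the object that `wilcox.test` returns. The sketch below is one way to do this; the names `alpha` and `decision` are illustrative, and `exact = FALSE` is set only to request the normal approximation explicitly (R falls back to it anyway for these data because of the ties and the zero difference).

```r
satisfaction_scores <- c(78, 74, 80, 72, 76, 79, 75, 81, 77, 74)
alpha <- 0.05  # significance level

result <- wilcox.test(satisfaction_scores, mu = 75, exact = FALSE)

# Compare the p-value with alpha and report the outcome
decision <- if (result$p.value < alpha) "reject H0" else "fail to reject H0"
cat(sprintf("V = %.1f, p = %.4f: %s\n",
            result$statistic, result$p.value, decision))
```

For these scores the rule returns "fail to reject H0", in line with the reported p-value of 0.1369.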
library(ggplot2)
# Histogram of the scores with the hypothesized median marked in red
df_satisfaction <- data.frame(Scores = satisfaction_scores)
ggplot(df_satisfaction, aes(x = Scores)) +
  geom_histogram(bins = 6, fill = "steelblue", alpha = 0.8, color = "black") +
  # linewidth requires ggplot2 >= 3.4.0; use size = 1.2 on older versions
  geom_vline(xintercept = hypothesized_median, color = "red", linewidth = 1.2) +
  labs(title = "Histogram of Employee Satisfaction Scores", x = "Satisfaction Score", y = "Frequency") +
  theme_minimal()
# A company wants to test whether the median employee satisfaction score is equal to 75.
set.seed(123) # For reproducibility
satisfaction_scores <- rnorm(20, 70, 10) # Simulate scores with mean 70 and SD 10
# Add uniform noise on [-5, 5] so the scores are no longer exactly normal (the noise is symmetric, so it adds no skew)
satisfaction_scores <- satisfaction_scores + runif(20, -5, 5)
# Ensure no scores below zero (satisfaction can't be negative)
satisfaction_scores[satisfaction_scores < 0] <- 0
satisfaction_scores <- round(satisfaction_scores)
# Create a data frame (optional but good practice)
df <- data.frame(score = satisfaction_scores)
print(df)
## score
## 1 61
## 2 67
## 3 85
## 4 69
## 5 68
## 6 84
## 7 72
## 8 57
## 9 61
## 10 69
## 11 78
## 12 73
## 13 77
## 14 67
## 15 65
## 16 85
## 17 71
## 18 53
## 19 81
## 20 64
wilcox.test(df$score, mu = 75, alternative = "two.sided")
##
## Wilcoxon signed rank test with continuity correction
##
## data: df$score
## V = 52, p-value = 0.04976
## alternative hypothesis: true location is not equal to 75
If the p-value is less than 0.05, we reject \(H_0\) and conclude that the median satisfaction score differs significantly from 75; otherwise we fail to reject \(H_0\).
Here \(p = 0.04976 < 0.05\), so we reject \(H_0\). The p-value is only marginally below the threshold, so the result is borderline.
The Wilcoxon Signed-Rank Test is a nonparametric statistical test used to compare the median of a single sample with a hypothesized value, or to test the median difference between paired observations. It is particularly useful when:
- The data are not normally distributed.
- The sample size is small, making the t-test unreliable.
- The data are paired (dependent observations).
This test was introduced by Frank Wilcoxon (1945) and is an extension of the Sign Test, which only considers the signs of the differences but not their magnitude.
Motivation: The parametric paired t-test assumes that the differences follow a normal distribution. However, real-world data often violate this assumption (e.g., skewed data or ordinal measurements). The Wilcoxon Signed-Rank Test provides a distribution-free alternative.
Given paired observations:
\[ \{(X_1, Y_1), (X_2, Y_2), \dots, (X_n, Y_n)\} \]
we define the differences:
\[ D_i = X_i - Y_i \]
If \(D_i = 0\), discard the observation as it does not contribute to ranking.
Compute the absolute differences:
\[ |D_i| = |X_i - Y_i| \]
Assign ranks \(R_i\) to the nonzero absolute differences, from smallest to largest. If there are ties, assign the average rank.
Each rank \(R_i\) retains the sign of \(D_i\):
\[ R_i^+ = R_i \quad \text{if } D_i > 0, \quad R_i^- = R_i \quad \text{if } D_i < 0 \]
Define:
- \(W^+ =\) Sum of positive signed ranks.
- \(W^- =\) Sum of negative signed ranks.
The Wilcoxon Signed-Rank Test statistic is:
\[ W = \min(W^+, W^-) \]
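For large \(n\), \(W\) is approximately normally distributed under \(H_0\), and the standardized statistic is:
\[ Z = \frac{W - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}} \]
where \(n\) is the number of nonzero differences and \(Z\) approximately follows the standard normal distribution \(N(0,1)\).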
A researcher wants to determine if a new training program improves student test scores. The scores before and after the program are:
\[ \begin{array}{|c|c|c|} \hline \textbf{Student} & \textbf{Before} & \textbf{After} \\ \hline 1 & 65 & 70 \\ 2 & 78 & 80 \\ 3 & 75 & 78 \\ 4 & 60 & 65 \\ 5 & 80 & 82 \\ 6 & 70 & 74 \\ 7 & 72 & 76 \\ 8 & 68 & 72 \\ \hline \end{array} \]
# Sample Data: Before and After Scores
before <- c(65, 78, 75, 60, 80, 70, 72, 68)
after <- c(70, 80, 78, 65, 82, 74, 76, 72)
# Perform Wilcoxon Signed-Rank Test in R
wilcox.test(before, after, paired = TRUE, alternative = "two.sided")
##
## Wilcoxon signed rank test with continuity correction
##
## data: before and after
## V = 0, p-value = 0.01356
## alternative hypothesis: true location shift is not equal to 0
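To see where \(V = 0\) comes from, the signed ranks for this example can be computed by hand. The sketch below follows the ranking procedure described earlier; the variable names are illustrative.

```r
before <- c(65, 78, 75, 60, 80, 70, 72, 68)
after  <- c(70, 80, 78, 65, 82, 74, 76, 72)

keep <- (before - after) != 0      # no zero differences here, so all 8 pairs are kept
d <- (before - after)[keep]        # D_i = X_i - Y_i; every value is negative
r <- rank(abs(d))                  # average ranks for tied |D_i|

w_plus  <- sum(r[d > 0])           # 0: there are no positive differences
w_minus <- sum(r[d < 0])           # 36 = n(n+1)/2 with n = 8

# Per-student view of the differences and their signed ranks
ranked <- data.frame(before = before[keep], after = after[keep],
                     d = d, rank = r, signed_rank = sign(d) * r)
ranked
```

Because the differences are defined as Before - After, \(W^+ = 0\) means every student scored higher after the program, which is the most extreme outcome possible in that direction.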
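For paired data \(\{(X_1, Y_1), (X_2, Y_2), \dots, (X_n, Y_n)\}\), define \(D_i = X_i - Y_i\) and let \(m_D\) denote the median of the differences. We test:
\[ H_0: m_D = 0 \quad \text{vs.} \quad H_1: m_D \neq 0 \]
The test follows the same procedure as the single-sample case, applied to the paired before-after differences \(D_i\) with \(m_0 = 0\).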
A fitness trainer wants to evaluate whether a new workout program significantly changes participants’ weight.
library(knitr) # provides kable()
set.seed(2023)
# Simulate paired data (Before & After weights)
before_weight <- c(82, 78, 85, 79, 77, 80, 83, 75, 81, 78)
after_weight <- before_weight - rnorm(10, mean = 1.5, sd = 0.5) # Weight reduction of about 1.5 kg on average
# Display paired data
df_weights <- data.frame(Participant = 1:10, Before = before_weight, After = after_weight)
kable(df_weights, caption = "Before and After Weight (kg)")
Participant | Before | After |
---|---|---|
1 | 82 | 80.54189 |
2 | 78 | 76.99147 |
3 | 85 | 84.43753 |
4 | 79 | 77.59307 |
5 | 77 | 75.81674 |
6 | 80 | 77.95460 |
7 | 83 | 81.95686 |
8 | 75 | 72.99918 |
9 | 81 | 79.69963 |
10 | 78 | 76.73406 |
# Perform Wilcoxon Signed-Rank Test for Paired Data
wilcox_paired <- wilcox.test(before_weight, after_weight, paired = TRUE)
# Print test result
wilcox_paired
##
## Wilcoxon signed rank exact test
##
## data: before_weight and after_weight
## V = 55, p-value = 0.001953
## alternative hypothesis: true location shift is not equal to 0
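The exact p-value above can be reproduced from the null distribution of the signed-rank statistic. With \(n = 10\) nonzero, untied differences that are all positive, \(V\) takes its maximum value \(n(n+1)/2 = 55\), and only one of the \(2^{10}\) equally likely sign patterns produces it. A short sketch using base R's `psignrank` (variable names are illustrative):

```r
n <- 10
v <- n * (n + 1) / 2                                  # 55, the largest possible V

# P(V >= 55) under H0: a single sign pattern out of 2^10
p_upper <- psignrank(v - 1, n, lower.tail = FALSE)

# Two-sided p-value: double the one-sided tail, capped at 1
p_two_sided <- min(1, 2 * p_upper)
p_two_sided                                           # 0.001953125
```

This equals \(2 / 2^{10}\), which is also the smallest two-sided p-value the exact test can return for \(n = 10\).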
library(ggplot2)
# Bar chart of the per-participant differences (positive = weight lost)
df_weights$Difference <- df_weights$Before - df_weights$After
ggplot(df_weights, aes(x = Participant, y = Difference, fill = Difference > 0)) +
  geom_bar(stat = "identity") +
  # linewidth requires ggplot2 >= 3.4.0; use size = 1 on older versions
  geom_hline(yintercept = 0, color = "red", linewidth = 1) +
  labs(title = "Weight Differences: Before - After", y = "Weight Change (kg)") +
  scale_fill_manual(values = c("TRUE" = "steelblue", "FALSE" = "tomato")) +
  theme_minimal()
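Decision rule at the 5% significance level:
- If the p-value is less than 0.05, we reject \(H_0\) and conclude that weight changed significantly under the new workout program.
- If the p-value is greater than or equal to 0.05, we fail to reject \(H_0\): there is no statistically significant change in weight due to the workout program.
Here \(p = 0.001953 < 0.05\), so we reject \(H_0\). Since every Before - After difference is positive (\(V = 55 = n(n+1)/2\)), the change is a reduction in weight.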
A professor records students’ exam scores before and after introducing a new teaching method.
Student | Before | After |
---|---|---|
A | 78 | 82 |
B | 75 | 77 |
C | 80 | 84 |
D | 72 | 74 |
E | 77 | 79 |
F | 83 | 85 |
G | 79 | 81 |
H | 76 | 80 |
before_scores <- c(78, 75, 80, 72, 77, 83, 79, 76)
after_scores <- c(82, 77, 84, 74, 79, 85, 81, 80)
wilcox.test(before_scores, after_scores, paired = TRUE, alternative = "less") # "less" because we are testing for an *increase*.
##
## Wilcoxon signed rank test with continuity correction
##
## data: before_scores and after_scores
## V = 0, p-value = 0.00577
## alternative hypothesis: true location shift is less than 0
# Equivalently, run a one-sample test on the paired differences
differences <- before_scores - after_scores
wilcox.test(differences, mu=0, alternative="less")
##
## Wilcoxon signed rank test with continuity correction
##
## data: differences
## V = 0, p-value = 0.00577
## alternative hypothesis: true location is less than 0
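The direction of `alternative` depends on the order of the arguments, because R forms the differences as first argument minus second. The sketch below illustrates this: testing Before - After with `"less"` and After - Before with `"greater"` express the same hypothesis of an increase and should give the same p-value. The names `p_less` and `p_greater` are illustrative, and `exact = FALSE` is set because the ties rule out the exact test.

```r
before_scores <- c(78, 75, 80, 72, 77, 83, 79, 76)
after_scores  <- c(82, 77, 84, 74, 79, 85, 81, 80)

# Differences are before - after: an increase makes them negative
p_less <- wilcox.test(before_scores, after_scores, paired = TRUE,
                      alternative = "less", exact = FALSE)$p.value

# Differences are after - before: an increase makes them positive
p_greater <- wilcox.test(after_scores, before_scores, paired = TRUE,
                         alternative = "greater", exact = FALSE)$p.value

all.equal(p_less, p_greater)   # TRUE
```

Swapping the arguments without also swapping the alternative would instead test for a decrease in scores.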
before_scores <- c(78, 75, 80, 72, 77, 83, 79, 76)
after_scores <- c(82, 77, 84, 74, 79, 85, 81, 80)
# Wilcoxon Test
wilcox_exercise <- wilcox.test(before_scores, after_scores, paired = TRUE)
# Print result
wilcox_exercise
##
## Wilcoxon signed rank test with continuity correction
##
## data: before_scores and after_scores
## V = 0, p-value = 0.01154
## alternative hypothesis: true location shift is not equal to 0
The one-sided test addresses the research question directly, with the null hypothesis:
\(H_0\): The new teaching method has not increased the median exam score (the median of After - Before is zero or negative).
If the p-value from the wilcox.test output is less than 0.05, we reject \(H_0\) and conclude that the new teaching method has significantly increased the median exam score.
If the p-value is greater than or equal to 0.05, we fail to reject \(H_0\) and conclude that there is not enough evidence to say that the new teaching method has significantly increased the median exam score.
Here the one-sided test gives \(p = 0.00577 < 0.05\), so we reject \(H_0\). The two-sided test (\(p = 0.01154\)) leads to the same decision.