The Mann-Whitney U Test (also called the Wilcoxon Rank-Sum Test) is a nonparametric test that compares two independent samples to determine whether one tends to have larger values than the other. It is an alternative to the independent two-sample t-test when:

- The data are not normally distributed.
- The sample size is small.
- The data are ordinal (ranked) rather than continuous.

The test was developed by Henry Mann and Donald Whitney (1947) as an extension of Wilcoxon’s rank-sum test (1945), which had considered only samples of equal size.
The independent two-sample t-test assumes:

1. Normality: The two groups come from normally distributed populations.
2. Equal Variances: The variance in both groups is the same.
3. Interval Data: The data are measured on a numerical scale.

However, in many real-world cases:

- The data are skewed or non-normally distributed.
- The sample size is too small to check normality.
- The data are ordinal (e.g., Likert scale responses: 1-5).

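The Mann-Whitney U Test is a distribution-free alternative: it does not require normality. It assumes only that the observations are independent and, if the result is to be read as a shift in location (for example, a difference in medians), that the two distributions have a similar shape.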
Suppose we have two independent groups:
\[ X_1, X_2, \dots, X_m \quad \text{(Sample 1, size \( m \))} \] \[ Y_1, Y_2, \dots, Y_n \quad \text{(Sample 2, size \( n \))} \]
The test is rank-based: all \(m + n\) observations are pooled and ranked from smallest to largest, with tied values receiving the average of the ranks they span. It then evaluates whether one group tends to have larger values than the other.
Let:

- \(R_X\) = sum of ranks for Sample 1.
- \(R_Y\) = sum of ranks for Sample 2.

The Mann-Whitney U statistic is calculated as:
\[ U_X = R_X - \frac{m(m+1)}{2} \]
\[ U_Y = R_Y - \frac{n(n+1)}{2} \]
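where \(U_X\) is the number of pairs \((X_i, Y_j)\) in which \(X_i > Y_j\), and \(U_Y\) is the number of pairs in which \(Y_j > X_i\) (a tied pair counts as one half toward each). Because there are \(mn\) pairs in total, \(U_X + U_Y = mn\).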
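As a minimal sketch of the formulas above, the rank sums and U statistics can be computed by hand in base R. The vectors `x` and `y` are made-up illustration values, not data from this tutorial, and the pair count at the end is only a cross-check on the rank-sum formula.

```r
# Hypothetical illustration data (not from the examples in this tutorial)
x <- c(3.1, 4.7, 2.8, 5.0)        # Sample 1, size m
y <- c(4.9, 6.2, 5.5, 3.9, 7.1)   # Sample 2, size n
m <- length(x)
n <- length(y)

# Rank the pooled observations; ties would receive average ranks
pooled_ranks <- rank(c(x, y))
R_X <- sum(pooled_ranks[seq_len(m)])       # sum of ranks for Sample 1
R_Y <- sum(pooled_ranks[m + seq_len(n)])   # sum of ranks for Sample 2

# U statistics from the rank sums
U_X <- R_X - m * (m + 1) / 2
U_Y <- R_Y - n * (n + 1) / 2

# Cross-check: count pairs directly (a tied pair counts one half)
U_X_pairs <- sum(outer(x, y, ">")) + 0.5 * sum(outer(x, y, "=="))

c(U_X = U_X, U_Y = U_Y, U_X_pairs = U_X_pairs, mn = m * n)
```

The two routes to \(U_X\) agree, and \(U_X + U_Y\) equals \(mn\). The `W` value printed by R's `wilcox.test(x, y)` is the statistic for the first sample, i.e. \(U_X\) in this notation.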
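The test statistic is:
\[ U = \min(U_X, U_Y) \]
For small samples, \(U\) is compared with its exact distribution under the null hypothesis. For larger samples, \(U\) is standardized using its null mean \(mn/2\) and null variance \(mn(m+n+1)/12\):
\[ Z = \frac{U - \frac{mn}{2}}{\sqrt{\frac{mn(m+n+1)}{12}}} \]
where \(Z\) approximately follows the standard normal distribution \(N(0,1)\) under the null hypothesis, provided there are no ties.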
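The standardization can be sketched in a few lines of base R. The helper name `mw_z` and the data are hypothetical, chosen only for illustration; the sketch applies the formula exactly as written, with no tie or continuity correction.

```r
# Normal approximation to the U statistic (no tie or continuity correction).
# `mw_z` is a hypothetical helper name used only for this illustration.
mw_z <- function(x, y) {
  m <- length(x)
  n <- length(y)
  r <- rank(c(x, y))
  U_X <- sum(r[seq_len(m)]) - m * (m + 1) / 2
  U_Y <- m * n - U_X
  U <- min(U_X, U_Y)
  z <- (U - m * n / 2) / sqrt(m * n * (m + n + 1) / 12)
  p <- 2 * pnorm(z)   # z <= 0 because U is the smaller of the two statistics
  list(U = U, z = z, p_value = p)
}

# Hypothetical data without ties
x <- c(12, 15, 11, 18, 14, 16, 13, 17)
y <- c(19, 22, 20, 25, 21, 24, 10, 23)
res <- mw_z(x, y)
res

# R's own normal approximation with the continuity correction switched off
p_builtin <- wilcox.test(x, y, exact = FALSE, correct = FALSE)$p.value
p_builtin
```

With no ties in the data, the hand-computed two-sided p-value should match `wilcox.test()` when `exact = FALSE` and `correct = FALSE`. With R's defaults the result differs slightly, because small samples get an exact p-value and the approximation otherwise includes a continuity correction.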
A psychologist wants to compare the stress levels of two groups of students:

1. Group A: Students who meditate before exams.
2. Group B: Students who do not meditate.

The measured stress levels (on a scale of 1 to 100) are:
\[ \begin{array}{|c|c|} \hline \textbf{Meditation Group} & \textbf{No Meditation Group} \\ \hline 42 & 65 \\ 50 & 70 \\ 48 & 72 \\ 39 & 80 \\ 45 & 78 \\ \hline \end{array} \]
# Sample Data: Stress Levels
meditation <- c(42, 50, 48, 39, 45)
no_meditation <- c(65, 70, 72, 80, 78)
# Perform Mann-Whitney U Test in R
wilcox.test(meditation, no_meditation, alternative = "two.sided")
##
## Wilcoxon rank sum exact test
##
## data: meditation and no_meditation
## W = 0, p-value = 0.007937
## alternative hypothesis: true location shift is not equal to 0
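This result can be checked by hand, because every meditation score is lower than every no-meditation score. The sketch below recomputes the statistic from the pooled ranks and derives the exact two-sided p-value by counting rank assignments; the variable names `U_med` and `p_exact` are introduced here for illustration only.

```r
meditation    <- c(42, 50, 48, 39, 45)
no_meditation <- c(65, 70, 72, 80, 78)
m <- length(meditation)
n <- length(no_meditation)

# Pooled ranks: the meditation group occupies ranks 1 to 5
r <- rank(c(meditation, no_meditation))
U_med <- sum(r[seq_len(m)]) - m * (m + 1) / 2   # 15 - 15 = 0, matching W = 0

# Under H0 all choose(10, 5) = 252 ways of assigning ranks to the first
# group are equally likely; only the two most extreme assignments
# (U = 0 or U = mn) are at least as extreme as the one observed
p_exact <- 2 / choose(m + n, m)

c(U = U_med, p_exact = p_exact)
```

The value 2/252 ≈ 0.007937 agrees with the p-value reported by `wilcox.test()`, so at the 0.05 level the two groups differ significantly in stress level.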
A scientist wants to compare reaction times between two groups:
# Simulated reaction time data (in seconds)
set.seed(42)
groupA <- rnorm(15, mean = 5.2, sd = 0.8) # Caffeine group
groupB <- rnorm(15, mean = 6.0, sd = 0.9) # Non-Caffeine group
# Create a dataframe
df_mannwhitney <- data.frame(
  Group = rep(c("Caffeine", "No Caffeine"), each = 15),
  ReactionTime = c(groupA, groupB)
)
# Display first few rows (kable() comes from the knitr package)
library(knitr)
kable(head(df_mannwhitney), caption = "First Few Rows of Reaction Time Data")
Group | ReactionTime |
---|---|
Caffeine | 6.296767 |
Caffeine | 4.748241 |
Caffeine | 5.490503 |
Caffeine | 5.706290 |
Caffeine | 5.523415 |
Caffeine | 5.115100 |
### Performing the Mann-Whitney U Test
# Perform Mann-Whitney U Test
mannwhitney_test <- wilcox.test(groupA, groupB, alternative = "two.sided")
# Print test results
mannwhitney_test
##
## Wilcoxon rank sum exact test
##
## data: groupA and groupB
## W = 96, p-value = 0.5125
## alternative hypothesis: true location shift is not equal to 0
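Here H0 states that the two groups have the same distribution of reaction times.
If the p-value is less than 0.05, we reject H0 and conclude that the two groups have significantly different reaction times.
If the p-value is greater than or equal to 0.05, we fail to reject H0: there is no statistically significant difference in reaction times between the two groups.
Here the p-value is 0.5125, so we fail to reject H0.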
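The decision rule can also be applied programmatically, since `wilcox.test()` returns an object whose `p.value` and `statistic` components can be read directly. This is a minimal sketch that regenerates the simulated data so it runs on its own; the names `alpha` and `decision` are introduced here for illustration.

```r
# Recreate the simulated reaction time data
set.seed(42)
groupA <- rnorm(15, mean = 5.2, sd = 0.8)   # Caffeine group
groupB <- rnorm(15, mean = 6.0, sd = 0.9)   # Non-Caffeine group
mannwhitney_test <- wilcox.test(groupA, groupB, alternative = "two.sided")

alpha <- 0.05                          # significance level used in this tutorial
p_value <- mannwhitney_test$p.value    # p-value stored in the test object

decision <- if (p_value < alpha) {
  "Reject H0: reaction times differ between the groups"
} else {
  "Fail to reject H0: no statistically significant difference"
}

cat(sprintf("W = %g, p = %.4f -> %s\n",
            mannwhitney_test$statistic, p_value, decision))
```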
# Boxplot of reaction times by group (ggplot() comes from the ggplot2 package)
library(ggplot2)
ggplot(df_mannwhitney, aes(x = Group, y = ReactionTime, fill = Group)) +
  geom_boxplot(alpha = 0.7) +
  labs(title = "Reaction Times: Caffeine vs. No Caffeine",
       x = "Group",
       y = "Reaction Time (seconds)") +
  theme_minimal()
### `PlantGrowth` in R
The built-in `PlantGrowth` dataset in R records the `weight` of plants under three conditions:

- `ctrl`: Control group
- `trt1`: Treatment 1
- `trt2`: Treatment 2

We will compare the control group (`ctrl`) vs. Treatment 1 (`trt1`) using the Mann-Whitney U test (also known as the Wilcoxon rank-sum test).
# Load dataset
data("PlantGrowth")
# Filter for Control and Treatment 1 (%>% and filter() come from the dplyr package)
library(dplyr)
plant_data <- PlantGrowth %>% filter(group != "trt2")
# Perform Mann-Whitney U Test
mannwhitney_real <- wilcox.test(weight ~ group, data = plant_data)
# Display first few rows
kable(head(plant_data), caption = "First Few Rows of Plant Growth Data")
weight | group |
---|---|
4.17 | ctrl |
5.58 | ctrl |
5.18 | ctrl |
6.11 | ctrl |
4.50 | ctrl |
4.61 | ctrl |
# Print test result
mannwhitney_real
##
## Wilcoxon rank sum test with continuity correction
##
## data: weight by group
## W = 67.5, p-value = 0.1986
## alternative hypothesis: true location shift is not equal to 0
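The header reads "with continuity correction" rather than "exact test" because the pooled weights contain tied values. By default, `wilcox.test()` computes an exact p-value only when both samples have fewer than 50 observations and there are no ties; otherwise it falls back to a normal approximation. The sketch below checks for ties and requests the approximation explicitly. It uses base R's `subset()` and `droplevels()` in place of the dplyr filter so that it runs on its own, and the names `any_ties` and `res_approx` are introduced here for illustration.

```r
data("PlantGrowth")

# Keep Control and Treatment 1 only, and drop the unused "trt2" level
plant_data <- droplevels(subset(PlantGrowth, group != "trt2"))

# Any value occurring more than once in the pooled sample is a tie
any_ties <- anyDuplicated(plant_data$weight) > 0
any_ties

# Requesting the normal approximation explicitly gives the same result
# as the default call, without the warning about ties
res_approx <- wilcox.test(weight ~ group, data = plant_data, exact = FALSE)
res_approx
```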
If the p-value is less than 0.05, we reject H0 and conclude that there is a statistically significant difference in plant weight between the Control group and Treatment 1.
If the p-value is greater than or equal to 0.05, we fail to reject H0: there is no statistically significant difference in plant weight between the Control group and Treatment 1.
Here the p-value is 0.1986, so we fail to reject H0.
ggplot(plant_data, aes(x = group, y = weight, fill = group)) +
  geom_boxplot(alpha = 0.7) +
  labs(title = "Plant Growth: Control vs. Treatment 1",
       x = "Group",
       y = "Plant Weight") +
  theme_minimal()
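A teacher records exam scores for two different teaching methods. The Mann-Whitney U test applies only if the Traditional and Modern scores come from two different, independent groups of students; if each row were the same student measured under both methods, a paired test such as the Wilcoxon signed-rank test would be needed instead.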
Student | Traditional | Modern |
---|---|---|
A | 78 | 85 |
B | 75 | 80 |
C | 80 | 87 |
D | 72 | 78 |
E | 77 | 83 |
F | 83 | 89 |
G | 79 | 84 |
H | 76 | 81 |
# Exam scores for each teaching method
traditional <- c(78, 75, 80, 72, 77, 83, 79, 76)
modern <- c(85, 80, 87, 78, 83, 89, 84, 81)
# Perform Mann-Whitney U Test
mannwhitney_exercise <- wilcox.test(traditional, modern, alternative = "two.sided")
# Print test result
mannwhitney_exercise
##
## Wilcoxon rank sum test with continuity correction
##
## data: traditional and modern
## W = 6.5, p-value = 0.008505
## alternative hypothesis: true location shift is not equal to 0
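The half-integer value W = 6.5 comes from tied scores across the two groups. As a minimal sketch, W can be reproduced by comparing every traditional score with every modern score and counting a tie as one half; the names `greater`, `tied`, and `W_by_hand` are introduced here for illustration.

```r
traditional <- c(78, 75, 80, 72, 77, 83, 79, 76)
modern      <- c(85, 80, 87, 78, 83, 89, 84, 81)

# Compare every traditional score with every modern score (8 x 8 = 64 pairs)
greater <- sum(outer(traditional, modern, ">"))    # traditional score is higher
tied    <- sum(outer(traditional, modern, "=="))   # the two scores are equal

# Each tied pair contributes one half
W_by_hand <- greater + 0.5 * tied
c(greater = greater, tied = tied, W = W_by_hand)
```

Only 5 of the 64 pairs have the traditional score higher and 3 pairs are tied (78, 80, and 83 appear in both groups), which gives W = 5 + 1.5 = 6.5. With p = 0.008505 below 0.05, we reject H0: scores differ significantly between the two teaching methods.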
df_exercise <- data.frame(
  Method = rep(c("Traditional", "Modern"), each = 8),
  Score = c(traditional, modern)
)
ggplot(df_exercise, aes(x = Method, y = Score, fill = Method)) +
  geom_boxplot(alpha = 0.7) +
  labs(title = "Exam Scores: Traditional vs. Modern Teaching",
       x = "Teaching Method",
       y = "Exam Score") +
  theme_minimal()
## Conclusion
The Mann-Whitney U test (also known as the Wilcoxon rank-sum test) is a robust non-parametric alternative to the independent samples t-test, especially when:

- The data are not normally distributed.
- The sample size is small.
- The data are ordinal (ranked) rather than continuous.

It is widely used in fields like medicine, psychology, and education, where the assumption of normality is often not met.