Median Test

Author

Ruba Fikri Abdallatif 20224685

Published

August 27, 2024

Motivation and Background

The median test is a non-parametric test of whether two or more independent samples come from populations with the same median. It is useful when the data do not meet the normality assumption required by parametric tests.

Hypotheses and Assumptions

Null Hypothesis (H0): The medians of all groups are equal.

Alternative Hypothesis (H1): At least one of the medians is different from the others.

Assumptions:

  • Independence of samples

  • Data is ordinal or continuous

  • Random sampling
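
The logic behind these hypotheses can be sketched in base R. This is only an illustration of the idea, not the coin implementation: each observation is classified as above or not above the pooled (grand) median, and a chi-squared test checks whether that classification is independent of group. The variable names (grand_median, above, counts, hand_result) and the choice to count ties as "not above" are assumptions made for this sketch.

```r
# Sketch of the median-test logic in base R (not the coin implementation).
data(iris)

# 1. Pool all observations and find the grand median.
grand_median <- median(iris$Sepal.Length)

# 2. Flag each observation as above the grand median or not.
#    Treating ties as "not above" is an assumption made for this sketch.
above <- iris$Sepal.Length > grand_median

# 3. Cross-tabulate group membership against the flag.
counts <- table(Species = iris$Species, AboveMedian = above)
print(counts)

# 4. Chi-squared test of independence on the 3 x 2 table.
hand_result <- chisq.test(counts)
print(hand_result)
```

With three groups the table has (3 - 1) x (2 - 1) = 2 degrees of freedom. The statistic from this sketch need not match the value reported by a dedicated median-test function exactly, since implementations may handle ties and the variance of the statistic differently.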

Summary Statistics

We first draw a boxplot of sepal length by species with ggplot2, then use the coin package to perform the median test on the Iris dataset. The following R code draws the plot and applies the test:

suppressPackageStartupMessages({
  library(ggplot2)
})
ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() +
  labs(title = "Boxplot of Sepal Length by Iris Species",
       x = "Species",
       y = "Sepal Length") +
  theme_minimal()
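
The boxplot can be complemented with numeric summaries. The following is a minimal sketch using base R's tapply() and table(); the names group_medians and group_sizes are illustrative and not part of the original analysis.

```r
# Per-species medians and sample sizes for Sepal.Length (base R only).
data(iris)

# Median sepal length within each species.
group_medians <- tapply(iris$Sepal.Length, iris$Species, median)

# Number of observations in each species.
group_sizes <- table(iris$Species)

print(group_medians)
print(group_sizes)
```

Comparing these medians gives a first impression of the differences that the median test then assesses formally.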

suppressPackageStartupMessages({
  library(coin)
})

data(iris)


# Perform the median test using median_test() from the coin package
median_test_result <- median_test(Sepal.Length ~ Species, data = iris)

# Extract the p-value
p_value <- pvalue(median_test_result)

# Create a boxplot and annotate with the p-value
ggplot(iris, aes(x = Species, y = Sepal.Length, fill = Species)) +
  geom_boxplot() +
  labs(title = "Boxplot of Sepal Length by Species",
       x = "Species",
       y = "Sepal Length") +
  annotate("text", x = 1.5, y = max(iris$Sepal.Length), 
           label = paste("p-value =", format(p_value, digits = 3)), 
           size = 5, hjust = 0.5, vjust = 1.5) +
  theme_minimal()

# Print the result
print(median_test_result)

    Asymptotic K-Sample Brown-Mood Median Test

data:  Sepal.Length by Species (setosa, versicolor, virginica)
chi-squared = 78.119, df = 2, p-value < 2.2e-16

Results

The asymptotic Brown-Mood median test gives chi-squared = 78.119 with 2 degrees of freedom and a p-value below 2.2e-16. At any conventional significance level we therefore reject the null hypothesis that the median sepal lengths of the three Iris species are equal.

In the context of the Iris dataset, this means sepal length varies significantly among species: at least one species has a median that differs from the others. The test itself does not indicate which species differ.