Parametric tests are designed to estimate or test hypotheses about population parameters (e.g., the mean). For these tests to be valid, certain assumptions about the distribution of the population from which the samples were drawn must be satisfied.
When the assumptions of the parametric tests are violated, we can use alternative tests. These are either not concerned with population parameters (nonparametric) or do not make assumptions about the distribution of the population being sampled (distribution-free).
The following tests serve as nonparametric alternatives for one-sample or paired two-sample inference when the assumptions of the parametric t-test (i.e., normality) are violated.
The sign test is used for hypothesis testing:
About a population median using a sample drawn from that population.
To determine whether the median difference of two populations is zero or another prespecified constant, using two paired samples.
The population under investigation can be either discrete or continuous, and its distribution can assume any form (it need not be symmetric, as the Wilcoxon signed-rank test assumes).
The test can be applied to ratio, interval, or ordinal data.
Consider a random variable \(X\) from which we have drawn a random sample \((x_1, x_2, ...,x_n)\). If we are interested in testing the hypothesis that the median \((M)\) of the population equals a prespecified value \((M_0)\), then the null and alternative hypotheses can be formulated as follows:
\(H_0: M=M_0\)
\(H_1: M \ne M_0\) (two-sided) or \(H_1: M \gt M_0\) or \(H_1: M \lt M_0\) (one-sided)
We begin by calculating \(X_i-M_0\) and marking the sign of the difference (\(+\) or \(-\)). If any difference is \(0\), it is discarded and the sample size is reduced by \(1\).
Assuming \(H_0\) is true, we expect the number of positive differences \((N^+)\) to be approximately equal to the number of negative differences \((N^-)\).
If \(M \gt M_0\), we expect the number of positive differences to be greater than the number of negative differences.
If \(M \lt M_0\), we expect the number of positive differences to be less than the number of negative differences.
Therefore, under \(H_0\), the distribution of either \(N^+\) or \(N^-\) is binomial with parameters \(n\) (i.e., the sample size after discarding zero differences) and \(p=0.5\) (i.e., the probability of success equals the probability of failure).
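The counting step described above can be sketched in R as follows. The sample `x` and the hypothesized median `m0` are hypothetical values chosen only for illustration, and the variable names are assumptions of this sketch.

```r
# Hypothetical sample and hypothesized median (illustration only)
x  <- c(12, 15, 9, 20, 10, 14, 10, 18)
m0 <- 10

# Differences from the hypothesized median; zero differences are discarded
d <- x - m0
d <- d[d != 0]

# Count the positive and negative signs
n_pos <- sum(d > 0)
n_neg <- sum(d < 0)
n     <- n_pos + n_neg   # reduced sample size

# Null distribution of N+ under H0: Binomial(n, 0.5)
null_probs <- dbinom(0:n, size = n, prob = 0.5)
names(null_probs) <- 0:n
```

Because \(p=0.5\), `null_probs` is symmetric around \(n/2\), which is why either \(N^+\) or \(N^-\) can serve as the test statistic.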
Test statistic:
We use \(N^+\) as the test statistic (i.e., the number of successes). The \(p\)-value is then calculated using the binomial distribution:
If testing \(H_1:M \gt M_0\), the \(p\)-value is the probability of observing a test statistic greater than or equal to the observed \(N^+\) if \(H_0\) is true (i.e., right-tail probability).
If testing \(H_1:M \lt M_0\), the \(p\)-value is the probability of observing a test statistic less than or equal to the observed \(N^+\) if \(H_0\) is true (i.e., left-tail probability).
If testing \(H_1:M \ne M_0\), the \(p\)-value is twice the smaller of the two tail probabilities above, not exceeding \(1\) (i.e., two-sided probability).
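A minimal sketch of these three \(p\)-value calculations, using base R's `pbinom()`. The helper name `sign_test_p` is hypothetical, and the counts (\(N^+=8\) out of \(n=9\) nonzero differences) are used only as an illustration.

```r
# Hypothetical helper: sign-test p-value from the number of positive signs
sign_test_p <- function(n_pos, n,
                        alternative = c("two.sided", "greater", "less")) {
  alternative <- match.arg(alternative)
  # Right tail: P(N >= n_pos) under Binomial(n, 0.5)
  p_right <- pbinom(n_pos - 1, size = n, prob = 0.5, lower.tail = FALSE)
  # Left tail: P(N <= n_pos) under Binomial(n, 0.5)
  p_left <- pbinom(n_pos, size = n, prob = 0.5)
  switch(alternative,
         greater   = p_right,
         less      = p_left,
         two.sided = min(1, 2 * min(p_left, p_right)))
}

p_greater <- sign_test_p(8, 9, "greater")
p_less    <- sign_test_p(8, 9, "less")
p_two     <- sign_test_p(8, 9, "two.sided")
```

Since the null distribution is symmetric when \(p=0.5\), doubling the smaller tail gives the same two-sided result as the exact binomial test.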
The same rationale applies when testing the median difference of two paired samples.
Given the following sample of scores \((3, 5, 9, 7, 7, 8, 4, 9, 10, 5)\), can we conclude that the median score of the population from which the sample was drawn differs from \(4\) (the hypothesized median, \(M_0\))?
\(H_0: M=4\) and \(H_1: M \ne 4\)
| \(X_i\) | \(X_i-4\) | Sign | \(X_i\) | \(X_i-4\) | Sign |
|---|---|---|---|---|---|
| \(8\) | \(+4\) | \(+\) | \(3\) | \(-1\) | \(-\) |
| \(5\) | \(+1\) | \(+\) | \(4\) | \(0\) | Excluded |
| \(9\) | \(+5\) | \(+\) | \(9\) | \(+5\) | \(+\) |
| \(7\) | \(+3\) | \(+\) | \(10\) | \(+6\) | \(+\) |
| \(7\) | \(+3\) | \(+\) | \(5\) | \(+1\) | \(+\) |
\(N^+ = 8,\ N^-=1,\ N^0=1,\ N^{Total}=9\)
A. The test can be performed using the \(binom.test()\) function as follows:
scores <- c(3, 5, 9, 7, 7, 8, 4, 9, 10, 5)
# Calculate number of successes (scores greater than 4)
successes <- sum(scores > 4)
# Calculate total number of trials (excluding scores equal to 4)
n_total <- sum(scores != 4)
# Perform the exact binomial test
binom.test(x = successes, n = n_total, p = 0.5, alternative = "two.sided", conf.level = 0.95)
##
## Exact binomial test
##
## data: successes and n_total
## number of successes = 8, number of trials = 9, p-value = 0.03906
## alternative hypothesis: true probability of success is not equal to 0.5
## 95 percent confidence interval:
## 0.5175035 0.9971909
## sample estimates:
## probability of success
## 0.8888889
The \(p\)-value \((0.039)\) is below \(0.05\), so we reject \(H_0\) at the \(5\%\) significance level and conclude that the population median differs from \(4\).
B. The sign test can be performed directly using the \(SIGN.test()\) function from the \(BSDA\) package.
# Load the BSDA package, which provides SIGN.test()
library(BSDA)
SIGN.test(x = scores, md = 4, alternative = "two.sided", conf.level = 0.95)
##
## One-sample Sign-Test
##
## data: scores
## s = 8, p-value = 0.03906
## alternative hypothesis: true median is not equal to 4
## 95 percent confidence interval:
## 4.324444 9.000000
## sample estimates:
## median of x
## 7
##
## Achieved and Interpolated Confidence Intervals:
##
## Conf.Level L.E.pt U.E.pt
## Lower Achieved CI 0.8906 5.0000 9
## Interpolated CI 0.9500 4.3244 9
## Upper Achieved CI 0.9785 4.0000 9
The output shows the same \(p\)-value and also reports the sample estimate of the median \((7)\) and the interpolated \(95\%\ CI\) for the median \((4.32\) to \(9)\). Fig. 1 shows the two-sided \(p\)-value, i.e., the probability of observing \(\ge 8\) or \(\le 1\) successes given that the null hypothesis is true.
Fig. 1 Two-sided p-value of Sign Test
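The two achieved confidence intervals in the \(SIGN.test()\) output can be reproduced from the order statistics of the sample: the interval from the \(k\)-th smallest to the \(k\)-th largest observation covers the median with probability \(1-2P(B \le k-1)\), where \(B\) is binomial with parameters \(n\) and \(0.5\). The sketch below assumes this standard construction; the helper name `achieved_ci` is hypothetical, and the interpolation step used for the \(95\%\) interval is not reproduced here.

```r
scores   <- c(3, 5, 9, 7, 7, 8, 4, 9, 10, 5)
x_sorted <- sort(scores)
n        <- length(x_sorted)

# Hypothetical helper: order-statistic CI for the median and its achieved level
achieved_ci <- function(k) {
  c(conf_level = 1 - 2 * pbinom(k - 1, size = n, prob = 0.5),
    lower      = x_sorted[k],
    upper      = x_sorted[n - k + 1])
}

ci_lower_achieved <- achieved_ci(3)  # 3rd smallest to 3rd largest value
ci_upper_achieved <- achieved_ci(2)  # 2nd smallest to 2nd largest value
```

With \(k=3\) and \(k=2\), the confidence levels and endpoints match the "Lower Achieved CI" and "Upper Achieved CI" rows of the output.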
In this example, the \(Oxytocin\) dataset from the \(BSDA\) package is used to assess whether the median of the differences in arterial blood pressure (BP) of \(11\) subjects before and after receiving oxytocin is zero.
\(H_0: \text {the median of differences} =0\), \(H_1: \text {the median of differences} \ne0\)
The median BP is 97 mm Hg before oxytocin administration, while it is 49 mm Hg following oxytocin.
| Variable | N | Mean | Std. Dev. | Min | Pctl. 25 | Pctl. 50 | Pctl. 75 | Max |
|---|---|---|---|---|---|---|---|---|
| before | 11 | 100 | 26 | 72 | 88 | 97 | 100 | 173 |
| after | 11 | 55 | 21 | 23 | 46 | 49 | 57 | 92 |
SIGN.test(x = Oxytocin$before, y = Oxytocin$after)
##
## Dependent-samples Sign-Test
##
## data: Oxytocin$before and Oxytocin$after
## S = 11, p-value = 0.0009766
## alternative hypothesis: true median difference is not equal to 0
## 95 percent confidence interval:
## 34.71273 55.14909
## sample estimates:
## median of x-y
## 46
##
## Achieved and Interpolated Confidence Intervals:
##
## Conf.Level L.E.pt U.E.pt
## Lower Achieved CI 0.9346 35.0000 54.0000
## Interpolated CI 0.9500 34.7127 55.1491
## Upper Achieved CI 0.9883 34.0000 58.0000
Interpretation:
The data provide evidence that arterial BP is significantly reduced after the administration of oxytocin, with a median reduction of 46 mm Hg \((S=11,\ p\lt0.001)\), as depicted in Fig. 2. The interpolated \(95\%\ CI\) for the median difference in arterial BP (before minus after) ranges from \(34.7\) to \(55.1\) mm Hg.
Fig. 2 Effect of oxytocin administration on blood pressure
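Because the paired test is a one-sample sign test on the differences, the reported \(p\)-value can be checked from the counts alone. The sketch below assumes that none of the \(11\) differences is zero, which is consistent with \(S=11\) positive signs among \(11\) subjects.

```r
s <- 11   # number of positive differences reported by SIGN.test()
n <- 11   # number of nonzero differences (assumed: no zero differences)

# Two-sided p-value: twice the right-tail probability P(N >= s) under H0
p_two_sided <- min(1, 2 * pbinom(s - 1, size = n, prob = 0.5, lower.tail = FALSE))

# Cross-check with the exact binomial test
p_check <- binom.test(x = s, n = n, p = 0.5)$p.value
```

Here the right tail contains only the outcome \(N^+=11\), so the two-sided \(p\)-value is \(2 \times 0.5^{11}\).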
The binomial distribution may be approximated by the normal distribution.
This can be applied when \(np\) and \(n(1-p)\) are both \(\gt 5\), where \(n\) is the sample size and \(p\) is the probability of success, which equals \(0.5\) under \(H_0\).
The test statistic \((z)\) is then calculated as follows:
\[z=\frac{(N^+\pm0.5)-0.5n}{0.5 \sqrt{n}}\] where:
\(N^+\) is the number of positive signs.
\(n=N^++N^-\)
In the quantity \((N^+ \pm0.5)\), the continuity correction of \(0.5\) has been added (because the discrete binomial distribution is approximated by the continuous normal distribution).
Use \((N^+ -0.5)\) when \(N^+ \ge n/2\)
Use \((N^+ +0.5)\) when \(N^+ \lt n/2\)
This approach is implemented in \(SPSS\) when \(n \gt25\).
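The formula above can be sketched in R as follows. The helper name `sign_test_z` and the counts (\(N^+=21\) out of \(n=30\)) are hypothetical; they were chosen so that \(np=15\) satisfies the condition for the approximation.

```r
# Hypothetical helper: z statistic with continuity correction
sign_test_z <- function(n_pos, n) {
  # Subtract 0.5 when N+ >= n/2, add 0.5 when N+ < n/2
  correction <- if (n_pos >= n / 2) -0.5 else 0.5
  (n_pos + correction - 0.5 * n) / (0.5 * sqrt(n))
}

n_pos <- 21
n     <- 30

z        <- sign_test_z(n_pos, n)
p_normal <- 2 * pnorm(-abs(z))                      # two-sided, normal approximation
p_exact  <- binom.test(n_pos, n, p = 0.5)$p.value   # two-sided, exact binomial
```

Comparing `p_normal` with `p_exact` gives a quick sense of how close the approximation is for a given \(n\).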
The effect size is \(Cohen's\ g=P(N^+) -0.5\), where \(P(N^+)\) is the proportion of positive differences among the nonzero differences.
Effect size conventions:
small \(g = 0.05\)
medium \(g = 0.15\)
large \(g = 0.25\)
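As an illustration, the effect size for the scores example (\(N^+=8\), \(n=9\)) can be computed and labeled as below. The helper name `cohens_g` is hypothetical, and treating the conventions as lower cut-offs applied to the magnitude of \(g\) is an assumption of this sketch.

```r
# Hypothetical helper: Cohen's g from the number of positive signs
cohens_g <- function(n_pos, n) n_pos / n - 0.5

g <- cohens_g(8, 9)

# Label the magnitude using the conventions as lower cut-offs (assumption)
g_label <- cut(abs(g),
               breaks = c(0, 0.05, 0.15, 0.25, Inf),
               labels = c("below small", "small", "medium", "large"),
               right  = FALSE)
```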
The steps for sample size calculation are depicted in Fig. 4.
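One possible way to carry out such a calculation, not necessarily the procedure shown in the figure, is to compute the exact power of the two-sided sign test for a given Cohen's \(g\) and search for the smallest \(n\) that reaches the target power. The helper name `sign_test_power`, the target power of \(0.80\), and the search range are assumptions of this sketch.

```r
# Hypothetical helper: exact power of the two-sided sign test
sign_test_power <- function(n, p1, alpha = 0.05) {
  # Largest k with P(N <= k | p = 0.5) <= alpha / 2; reject if N <= k or N >= n - k
  k_vals <- 0:n
  ok     <- pbinom(k_vals, size = n, prob = 0.5) <= alpha / 2
  k      <- if (any(ok)) max(k_vals[ok]) else -1
  # Probability of falling in the rejection region when P(positive sign) = p1
  pbinom(k, size = n, prob = p1) +
    pbinom(n - k - 1, size = n, prob = p1, lower.tail = FALSE)
}

g            <- 0.25   # large effect by the conventions above
target_power <- 0.80   # assumed target
candidate_n  <- 5:200  # assumed search range

power_by_n <- sapply(candidate_n, sign_test_power, p1 = 0.5 + g)
n_required <- candidate_n[which(power_by_n >= target_power)[1]]
```

Because the binomial distribution is discrete, power does not increase smoothly with \(n\), so it is worth checking a few sample sizes above `n_required` as well.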