Agresti, A. (1990). Categorical data analysis. Wiley. pp 350–354
Siegel, S. and Castellan Jr., N. J. (1988) Nonparametric statistics for the behavioral sciences, 2nd ed., McGraw-Hill. Ch4 pp 38-44 for the Binomial Exact test and Ch5 pp 75-80 for the McNemr Change Test.
The McNemar test for the significance of changes is applicable to ‘before and after’ designs in which each subject is used as its own control and on which the measurements are made on a binary nominal or ordinal scale. Hence, for each subject, the outcome of each measurement is recorded as ‘1’ or ‘0’. Subjects which gave altered results would be those for which the first measurement result was 1 and the second 0, or vice-versa.
The null hypothesis \(H_0\) is that the subjects are as likely to change from 1 to 0 as from 0 to 1.
The alternate hypothesis \(H_1\) is that there is a preferred direction of change, from 1 to 0 or from 0 to 1
In this test, all we need are the numbers of changes, from 0 to 1 - let’s call this \(n_{01}\) and from 1 to 0, let’s call this \(n_{10}\).
The total number of changes is thus \(n_{01}+n_{10}\) and since under the null hypothesis changes in each direction are equally likely, we can say that the expected number of changes in each direction is \((n_{01}+n_{10})/2\).
Recall that for count data such as this, we can calculate the test statistic \(X^2\) as follows
\[X^2=\sum^k_{i=1}\frac{(O_i-E_i)^2}{E_i}\] where
\(O_i\) = the observed number of cases in the \(i\)th category.
\(E_i\) = the expected number of cases in the \(i\)th category when \(H_0\) is true.
\(k\) = the number of categories, which is two, in the case we are considering.
If the number of data in each category is large enough (more than 5, as a rule of thumb) then to a good approximation \(X^2\) is distributed as a \(\chi^2\) with one degree of freedom.
We have \(n_{01}\) is the observed number of changes from 0 to 1, \(n_{10}\) is the observed number of changes from 1 to 0 and \((n_{01}+n_{10})/2\) is the expected number of changes in each direction.
Hence,
\[ \begin{align*} X^2&=\sum^k_{i=1}\frac{(O_i-E_i)^2}{E_i}\\ &=\frac{\left[n_{01}-(n_{01}+n_{10})/2\right]^2}{(n_{01}+n_{10})}+\frac{\left[n_{10}-(n_{01}+n_{10})/2\right]^2}{(n_{01}+n_{10})}\\ &=\frac{\left(n_{01}-n_{10}\right)^2}{n_{01}+n_{10}}\\ &\approx \chi^2_1\qquad \text{(a chi-square with 1 degree of freedom, if there are enough data)} \end{align*} \] Hence, if this applies, all we need to know are the numbers of changes in each direction and we can calculate a chi-squared statistic with one degree of freedom, and from that a \(p\)-value, which will tell us the likelihood of getting a difference this big or bigger in the preferred direction of change. We can then compare that \(p\)-value to our significance level and either reject or fail to reject the null hypothesis.
When does this apply? This is only a good approximation to the truth if the number of changes is large enough. A rule of thumb is that they should be 10 or more altogether. If there are fewer than 10 we need to use the exact Binomial test.
A detail, but the chi-square distribution strictly is a continuous distribution and our test statistic is a discrete variable. The use of the chi-squre distribution becomes more exact if a small correction, the so-called continuity correction, is made to the formula above. We just subtraxct one from the numerator before squaring it, so that, finally, our test statistic is:
\[ \begin{align*} X^2&=\frac{\left(|n_{01}-n_{10}|-1\right)^2}{n_{01}+n_{10}}\\ &\approx \chi^2_1\qquad \text{(a chi-square with 1 degree of freedom, if there are enough data)} \end{align*} \] ## Practical
You need a directory (folder) called Rstuff (or similar). Make that your working directory. In that folder you should have two other folders:
‘scripts’ - save the script that you will write in there.
‘data’ - save the data in there.
If saving your own data, it needs to be in two columns of paired data written as 1s and 0s.
rm(list=ls())
library(tidyverse)
If you have your own data, you need to change the filename here
filename<- '../data/mcnemar_change_data.csv' # change this to your filename if you want
data<-read.csv(filename)
glimpse(data)
## Rows: 100
## Columns: 2
## $ before <int> 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0,…
## $ after <int> 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0,…
contingency_table<-table(data)
contingency_table
## after
## before 0 1
## 0 53 2
## 1 13 32
n00<-contingency_table[1,1]
n10<-contingency_table[2,1]
n01<-contingency_table[1,2]
n11<-contingency_table[0,0]
If the number of changes 0 to 1 or 1 to 0 is greater than 10, it is OK to use the McNemar Test. If not, we need to use the exact binomial test.
print(paste('n01 + n10 = ',n01+n10))
## [1] "n01 + n10 = 15"
Here we explicitly calculate the McNemar change test statistic. Providing the number of data are large enough this roughly follows a Chi-square distribution.
test_statistic<-(abs(n10-n01)-1)^2/(n10+n01)
p_value<-pchisq(test_statistic,df=1,lower.tail=FALSE)
## [1] "McNemar's Chi-squared test with continuity correction"
## [1] "McNemar's chi-squared = 6.67 , df = 1, p-value = 0.00982327"
Now you see what the McNemar test is actually doing, we can just use the built in function from R to carry out the test. As its argument it requires the data to be in a contingency table, which we easily form from our data using the table() command
contingency_table<-table(data)
mcnemar.test(contingency_table)
##
## McNemar's Chi-squared test with continuity correction
##
## data: contingency_table
## McNemar's chi-squared = 6.6667, df = 1, p-value = 0.009823
Given a significance level of, say 0.05, would you reject or fail to reject the null hypothesis for this data?
In this test, we consider all the tests that resulted in a change of result. Call this number \(N\). We imagine throwing a dice, and with probability \(p\) each test would be one that flipped from 1 to 0, while with probability \(1-p\) each test would be one that flipped from 0 to 1. Under the null hypothesis a flip in either direction is equally likely, so \(p=0.5\).
Say there were \(x\) tests that flipped from 0 to 1. The binomial test gives us the probability that we would observe this many or fewer such tests among \(N\) trials that flipped one way or another. Double that and we get the probability that we would get results as far or further from the null prediction as we actually got, in either direction.
Mathematically, the probability of seeing \(x\) 0 to 1 flips among N flips is given by:
\[ {N\choose x} p^x(1-p)^{N-x}\] but since, under the null, p=0.5, this is equal to \[ {N\choose x} 0.5^N\] where \({N\choose x}\) is the binomial factor \(\frac{N!}{x!(N-x)!}\).
Thus, under the null hypothesis the chance of seeing 2 or fewer flips, or 13 or more among 15 flips is
\[2\times\left[{15\choose 0}+{15\choose 1}+{15\choose 2}\right]\times 0.5^{15}\] This is just a \(p\)-value for a two-sided hypothesis test!
Luckily, there is a function that calculates this for us in R:
n10<-contingency_table[2,1]
n01<-contingency_table[1,2]
N<-n01+n10
x<-n01
p_binom<-2*pbinom(x,N,0.5)
print(paste(N,' total changes, p-value under the null hypothesis = ',round(p_binom,8)))
## [1] "15 total changes, p-value under the null hypothesis = 0.00738525"
For large \(N\), the \(p\) value using the binomial test should agree with that found using the McNemar test. For small (\(N<10\)) values, the binomial test is the more reliable.
Write a script to run change tests on three data files you have been given. You need to decide in each case whether to use the McNemar test or the binomial exact test.
In each case you will need to create a contingency table, and to calculate n01 and n10.
Use the code chunks above as examples that you can adapt.
In each case, decide whether to reject or fail to reject the null hypothesis that there is no preferred direction of change, at the 5% significance level.