Objective:

To compare the conclusions that result from three types of 2x2 table analysis.

  1. A traditional randomization model for independence on a two-way table using one of the cell counts as the test statistic.
  2. An alternative model for independence where you don’t condition on the table margin.
  3. A third model that neither conditions on the table margin nor assumes perfect knowledge of the parameter.

Model 1: Traditional Randomization Test

This example is based on the MythBusters segment that investigates whether yawning is contagious. A sample of 50 people is randomly divided into two groups. In both groups the subjects are shown into a room to wait and are secretly videoed. In the treatment group, the person who shows them to the room seeds them with a yawn; subjects in the control group are exposed to no such yawn. Researchers then note which of the subjects yawn.

We generate an “observed” data set using the assumed probabilities \(P(yawn | seeded) = 0.32\) and \(P(yawn | unseeded) = 0.30\).

set.seed(145)
n <- 50
treatment <- rep(c("seeded", "unseeded"), c(n/2, n/2))
p <- c(0.32, 0.30)

# generate a single data set
outcomeS <- sample(c("yawn", "no_yawn"), size = n/2, replace = TRUE,
                   prob = c(p[1], 1 - p[1])) # yawners in the seeded group
outcomeU <- sample(c("yawn", "no_yawn"), size = n/2, replace = TRUE,
                   prob = c(p[2], 1 - p[2])) # yawners in the unseeded group
outcome <- c(outcomeS, outcomeU)
tab <- table(treatment, outcome)
obs_stat <- tab[1, 2]  # observed number of yawners in the seeded group
nYawn <- sum(tab[, 2]) # total number of yawners
tab
##           outcome
## treatment  no_yawn yawn
##   seeded        12   13
##   unseeded      18    7

We perform a traditional randomization test for independence (an approximation to Fisher’s Exact Test) by conditioning on the table margins and simply shuffling the outcome vector. The test statistic is the number of subjects in the seeded group who yawn.

it <- 50000
stat1 <- rep(NA, it)
for(i in 1:it) {
  outcome_shuffled <- sample(outcome) # shuffle outcomes; table margins stay fixed
  stat1[i] <- table(treatment, outcome_shuffled)[1, 2] # yawners in the seeded group
}
barplot(table(stat1))

[Figure: barplot of the model 1 null distribution of the seeded-yawner count]

The result is a sampling distribution of counts under this model, which we can then use to put the observed count of 13 in context. If we compute a two-tailed p-value here, we get 0.147.
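
The post doesn’t show how the p-value was computed; here is a minimal sketch, assuming the common convention of doubling the smaller tail of the null distribution:

# two-tailed p-value by doubling the smaller tail (convention assumed)
p_lower <- mean(stat1 <= obs_stat)
p_upper <- mean(stat1 >= obs_stat)
min(1, 2 * min(p_lower, p_upper))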

This model is fully specified by the following five characteristics:

  1. \(P(yawn | unseeded) = P(yawn | seeded) = P(yawn)\)
  2. \(P(yawn) = \hat{P}(yawn) = 20/50\)
  3. Generates samples of size \(n\).
  4. Total number of yawners is 20.
  5. The total number of subjects in each treatment group is 25.

Model 2: Less Constrained Randomization

Now we consider a slightly different model based on a reconsideration of the five characteristics above. Characteristics 3 and 5 are important to retain, as they are part of the experimental design. Characteristic 1 is important because it articulates the theory of interest. Characteristic 2 is a bit of a leap: why would we think this particular \(\hat{P}\) is the right value for \(P\)? This could be addressed by considering a distribution on \(P\), but we’ll leave it as is for now.

Let’s look instead at the fourth characteristic, which seems like a vestige of computational convenience without any good justification. The following routine drops that characteristic and generates the resulting sampling distribution of the same cell count.

stat2 <- rep(NA, it)
pYawn <- nYawn/n # characteristic 2
for(i in 1:it) {
  outcome <- sample(c("yawn", "no_yawn"), size = n, replace = TRUE,
                   prob = c(pYawn, 1 - pYawn)) # characteristics 1 and 3
  stat2[i] <- sum(outcome[1:(n/2)] == "yawn") # test statistic
}
barplot(table(stat2))

[Figure: barplot of the model 2 sampling distribution of the seeded-yawner count]

For ease of comparison, let’s look at those two sampling distributions on top of one another as smooth density curves.

[Figure: the model 1 and model 2 sampling distributions overlaid as smooth density curves]
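
The overlay comes from an unshown chunk; a minimal sketch of one way to draw it with base graphics (the colors and styling here are assumptions, apart from model 2 being shown in gold):

# kernel density estimates of the two null distributions, overlaid
d1 <- density(stat1)
d2 <- density(stat2)
plot(d1, lwd = 2, main = "Null distributions of the seeded-yawner count",
     xlab = "number of yawners in the seeded group")
lines(d2, col = "goldenrod", lwd = 2) # model 2, shown in gold as described below
abline(v = obs_stat, lty = 2)         # observed count of 13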

It’s clear that eliminating characteristic 4 of model 1 lets more variation work its way into the sampling distribution. For comparison’s sake, the two-tailed p-value under this less constrained model (in gold) is 0.3087, roughly twice that of model 1. One assumes that if characteristic 2 were changed to reflect the uncertainty in the parameter \(P(yawn)\), yet more variation would find its way into the sampling distribution.
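
Under the same doubled-tail convention assumed earlier, that figure would come from something like:

# two-tailed p-value under model 2 (same assumed convention)
min(1, 2 * min(mean(stat2 <= obs_stat), mean(stat2 >= obs_stat)))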

This leaves me with the question: why have I been teaching model 1 when model 2 seems like a more reasonable approach?


Model 3: Allow \(P(yawn)\) to vary

We consider a third model in which \(P(yawn)\) is drawn from a probability distribution that reflects our uncertainty in the true parameter. The conditions of the model can be written as:

  1. \(P(yawn | unseeded) = P(yawn | seeded) = P(yawn)\)
  2. \(P(yawn) \sim f\)
  3. Generates samples of size \(n\).
  4. The total number of subjects in each treatment group is 25.

We’ll look at three distributions for \(f\), each one a beta distribution with the mean set to \(\hat{P}(yawn) = 0.4\) but with different variances.

[Figure: three beta densities for \(P(yawn)\), each with mean 0.4 but different variances]
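
The shape parameters behind these curves aren’t shown; the values below are an assumption, chosen so that each beta has mean \(\alpha/(\alpha + \beta) = 0.4\) with decreasing variance, and they are needed for the simulation that follows.

# hypothetical beta shape parameters: each pair has mean 0.4,
# with variance decreasing from the first pair to the third
alphas <- c(2, 8, 32)
betas  <- c(3, 12, 48)
curve(dbeta(x, alphas[1], betas[1]), from = 0, to = 1,
      xlab = "P(yawn)", ylab = "density")
curve(dbeta(x, alphas[2], betas[2]), add = TRUE, lty = 2)
curve(dbeta(x, alphas[3], betas[3]), add = TRUE, lty = 3)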

stat3 <- matrix(rep(NA, it * 3), ncol = 3)
for(j in 1:3) {
  for(i in 1:it) {
    pYawn <- rbeta(1, alphas[j], betas[j]) # characteristic 2
    outcome <- sample(c("yawn", "no_yawn"), size = n, replace = TRUE,
                   prob = c(pYawn, 1 - pYawn)) # characteristics 1 and 3
    stat3[i, j] <- sum(outcome[1:(n/2)] == "yawn") # test statistic
  }
}

Below we add the distributions of the number of yawners in the seeded group resulting from these three choices (in greens) to the previous two sampling distributions.

[Figure: model 3 sampling distributions (greens) overlaid on the model 1 and model 2 distributions]
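
That overlay is also produced by a hidden chunk; assuming the base-graphics plot from the earlier sketch is still active, the green curves could be added like this (the particular shades are assumptions):

# add density estimates for the three model-3 distributions in greens
greens <- c("palegreen3", "green3", "darkgreen")
for (j in 1:3) {
  lines(density(stat3[, j]), col = greens[j], lwd = 2)
}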

It’s evident that if we allow our uncertainty in \(P\) to be reflected in the model, the model can reasonably explain most results that could possibly emerge from this experiment.


4. The Likelihood Approach (Under Construction)
