Types of Resampling:

Monte Carlo Simulation:

Permutation Test:

Bootstrapping:

Jackknife:

Permutation Test:

Permutation: given a set of objects, how many different orderings (arrangements) of those objects can you create?
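As a quick illustration, n distinct objects can be ordered in n! ways. A minimal base R sketch; the object labels are made up for illustration:

```r
# Hypothetical set of objects (labels are illustrative only)
objects <- c("A", "B", "C")

# The number of distinct orderings of n distinct objects is n!
n_orderings <- factorial(length(objects))

# One random ordering: sample() without replacement shuffles the vector
set.seed(1)
one_shuffle <- sample(objects)
```

With three objects, `n_orderings` is 6; the count grows very quickly, which is why permutation tests on larger data sets usually rely on a random subset of shuffles.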

Test a Null Hypothesis:

Permutation:

  • A permutation test is a simple way to compute the sampling distribution for any test statistic under the strong null hypothesis that a set of variants has ABSOLUTELY NO EFFECT on the outcome

  • Permutation is only valid when the null hypothesis is one of NO ASSOCIATION

  • if the null is true, changing the exposure would have no effect on the outcome

  • if the null is true, the shuffled data sets should look like the real data; otherwise they should look different from the real data

  • Permutation tests are viable when we assume there is no difference between the treated and the untreated. In other words, the null = 0 in a permutation test because the difference in means should not be statistically significantly different from zero if the treatment has no effect

  • Permutations are just simulated data, and since we assume that the treatment has no effect, it doesn’t matter which results we assign to which people

  • the ranking of the real test statistic among the shuffled test statistics gives a p-value
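The ranking idea in the last bullet can be sketched as a small helper. The function name `perm_p_value` and the numbers are made up for illustration, and the `+ 1` correction (counting the observed arrangement itself) is one common convention, not the only one:

```r
# Hypothetical helper: two-sided permutation p-value from the rank of the
# observed statistic among the shuffled statistics
perm_p_value <- function(observed, shuffled) {
  # Count shuffled statistics at least as extreme as the observed one;
  # the +1 counts the observed arrangement itself, so p is never exactly 0
  (sum(abs(shuffled) >= abs(observed)) + 1) / (length(shuffled) + 1)
}

# Made-up numbers for illustration only
shuffled_stats <- c(-2.1, 0.4, 1.3, -0.7, 0.2, 2.5, -1.1, 0.9, -0.3)
p <- perm_p_value(observed = 2.4, shuffled = shuffled_stats)
```

Here only one of the nine shuffled statistics is as extreme as the observed value, so `p` is (1 + 1) / (9 + 1) = 0.2.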

Procedures for Permutation Tests:

  1. Analyze the Problem:
    • What is the hypothesis and the alternative?
    • What distribution is the data drawn from?
    • What losses are associated with bad decisions?
  2. Choose a Test Statistic: one that will distinguish the hypothesis from the alternative

  3. Rearrange the Observations (i.e. Permutations):

    • Compute the test statistic for all possible permutations of the observations and generate a distribution of values of the statistic of interest under the null hypothesis of no difference between the two populations
  4. Make a Decision:

    • Compare the observed statistic to this empirical sampling distribution to see how unlikely our observed statistic is if the two distributions are the same (as in a t-test)

    • If the value of the test statistic for the original data is an extreme value in the permutation distribution of the statistic, reject the null hypothesis

    • If it is NOT an extreme value, fail to reject the null (the data do not support the alternative)
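For very small samples, the four steps can be carried out exactly by enumerating every relabelling. The sketch below uses made-up data, the difference in means as the test statistic, and an assumed 5% significance level:

```r
# Made-up small samples (illustration only, not data from these notes)
treatment <- c(12, 15, 14)
control   <- c(10, 9, 11)

pooled <- c(treatment, control)
n_t <- length(treatment)

# Step 2: test statistic is the difference in group means
observed <- mean(treatment) - mean(control)

# Step 3: every way of choosing which pooled observations are "treatment"
idx <- combn(length(pooled), n_t)
perm_stats <- apply(idx, 2, function(i) mean(pooled[i]) - mean(pooled[-i]))

# Step 4: two-sided p-value, the share of relabellings at least as extreme
# as the observed statistic (small tolerance guards against rounding)
p_exact <- mean(abs(perm_stats) >= abs(observed) - 1e-12)

# Decision at an assumed 5% level
reject_null <- p_exact < 0.05
```

There are choose(6, 3) = 20 relabellings, and only the observed split and its mirror image are this extreme, so `p_exact` is 0.1 and the null is not rejected. With three observations per group, 0.1 is the smallest two-sided p-value this test can produce.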

Permutation:

  1. collect data from control and treatment

  2. merge samples to form a pseudo population

  3. sample w/o replacement from the pseudo population to simulate control and treatment groups

  4. compute target statistic for each resample
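The four steps above can be sketched in base R as follows. The data, the number of resamples, and the choice of the difference in means as the target statistic are all assumptions made for illustration:

```r
set.seed(42)  # for reproducibility

# Step 1: made-up control and treatment samples (illustration only)
control   <- c(68, 72, 75, 70, 66, 74, 71, 69)
treatment <- c(78, 74, 80, 77, 73, 79, 82, 76)

# Step 2: merge both samples into one pseudo population
pseudo_population <- c(control, treatment)
n_c <- length(control)

# Steps 3 and 4: reshuffle without replacement, split into simulated
# groups, and compute the target statistic for each resample
n_resamples <- 2000  # assumed number of resamples
resample_stats <- replicate(n_resamples, {
  shuffled <- sample(pseudo_population)  # sampling w/o replacement
  sim_control   <- shuffled[seq_len(n_c)]
  sim_treatment <- shuffled[-seq_len(n_c)]
  mean(sim_treatment) - mean(sim_control)
})

# Statistic on the real, unshuffled groups, for comparison
observed_stat <- mean(treatment) - mean(control)
```

`resample_stats` approximates the distribution of the statistic under the null, centred near zero; `observed_stat` can then be compared against it.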

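\[T =\frac{\bar{X} - \bar{Y}}{s /\sqrt{n}} \]

  • where s is the standard deviation
  • n is the sample size (the test uses n - 1 degrees of freedom)

When using a two sample t test with equal variances, where S^2 is the sample variance of each group and S_p is the pooled standard deviation:

\[T =\frac{\bar{X} - \bar{Y}}{S_p\sqrt{\frac{1}{n_x} + \frac{1}{n_y}}}, \qquad S_p = \sqrt{\frac{(n_x - 1)S^2_x + (n_y - 1)S^2_y}{(n_x - 1) + (n_y - 1)}}\]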

t.test(permutations$treatment_mean, permutations$control_mean, var.equal = TRUE, paired = FALSE)

    Two Sample t-test

data:  permutations$treatment_mean and permutations$control_mean
t = 1.0824, df = 38, p-value = 0.2859
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -6.208601 20.475267
sample estimates:
mean of x mean of y 
 77.55000  70.41667 
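To see what `t.test()` computes when `var.equal = TRUE`, the pooled (equal-variance) two-sample t statistic can be reproduced by hand. This sketch uses made-up samples, not the `permutations` data behind the output above:

```r
# Made-up samples (illustration only)
x <- c(78, 74, 80, 77, 73, 79, 82, 76)
y <- c(68, 72, 75, 70, 66, 74, 71, 69)
n_x <- length(x)
n_y <- length(y)

# Pooled standard deviation from the two sample variances
s_p <- sqrt(((n_x - 1) * var(x) + (n_y - 1) * var(y)) / (n_x + n_y - 2))

# Pooled two-sample t statistic and its degrees of freedom
t_manual <- (mean(x) - mean(y)) / (s_p * sqrt(1 / n_x + 1 / n_y))
df_manual <- n_x + n_y - 2

# Same test via t.test(); the statistic and df should agree
fit <- t.test(x, y, var.equal = TRUE, paired = FALSE)
```

`fit$statistic` matches `t_manual` and `fit$parameter` matches `df_manual`. In the output above, df = 38 corresponds to n_x + n_y - 2.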

Computing Sample Size:

\[n = \left(\frac{Z\sigma}{E}\right)^2\]

| Symbol | Definition | Example |
|--------|------------|---------|
| Z | the value from the standard normal distribution reflecting the confidence level that will be used | Z = 1.96 for 95% (get value from Z table) |
| sigma | standard deviation of the outcome variable | example |
| E | desired margin of error | example |
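A small worked example of the sample size formula in R. The values of `sigma` and `E` are assumptions chosen for illustration, and `qnorm()` stands in for the Z table lookup:

```r
# Assumed inputs for illustration (not values from these notes)
conf_level <- 0.95  # confidence level
sigma <- 15         # assumed standard deviation of the outcome
E <- 3              # assumed desired margin of error

# Z value for a two-sided interval; about 1.96 for 95%
Z <- qnorm(1 - (1 - conf_level) / 2)

# n = (Z * sigma / E)^2, rounded up to a whole number of observations
n <- ceiling((Z * sigma / E)^2)
```

With these inputs, (1.96 × 15 / 3)^2 is about 96.04, so `n` rounds up to 97.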
