My assumptions are violated!!

M. Drew LaMar
October 19, 2020

Class Announcements

  • Reading Assignment for Wednesday (NO QUIZ)
    • Whitlock & Schluter, Chapter 13: Handling violations of assumptions

Before we go there: Comparing variances

Question: Do populations differ in the variability of measurements?

Remember, it isn't always about inferring central tendency!

There are two main tests:

  • \( F \)-test (Warning: Highly sensitive to departures from normality assumption)
  • Levene's test (More robust to departures from normality, but at a cost - loss of power!)

Comparing variances

Example 12.4

The brook trout is a species native to eastern North America that has been introduced into streams in the West for sport fishing. Biologists followed the survivorship of a native species, chinook salmon, in a series of 12 streams that either had brook trout introduced or did not (Levin et al. 2002). Their goal was to determine whether the presence of brook trout affected the survivorship of the salmon. In each stream, they released a number of tagged juvenile chinook and then recorded whether or not each chinook survived over one year.

Comparing variances

Load data and sneak-a-peek:

'data.frame':   12 obs. of  4 variables:
 $ troutTreatment    : chr  "present" "absent" "present" "present" ...
 $ nReleased         : int  820 467 960 700 959 545 1029 769 27 998 ...
 $ nSurvivors        : int  166 180 136 153 178 103 326 173 7 120 ...
 $ proportionSurvived: num  0.202 0.385 0.142 0.219 0.186 0.189 0.317 0.225 0.259 0.12 ...

Comparing variances

Compute variances in both groups:

library(dplyr)  # provides %>%, group_by(), summarize()
chinook %>%
  group_by(troutTreatment) %>%
  summarize(variance = var(proportionSurvived))
# A tibble: 2 x 2
  troutTreatment variance
  <chr>             <dbl>
1 absent         0.0107  
2 present        0.000883

Comparing variances - F-test

var.test(proportionSurvived ~ troutTreatment, 
         data = chinook)

    F test to compare two variances

data:  proportionSurvived by troutTreatment
F = 12.165, num df = 5, denom df = 5, p-value = 0.01589
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
  1.702272 86.936360
sample estimates:
ratio of variances 
          12.16509 
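
As a rough check on where these numbers come from, the sketch below recomputes the two-sided P-value and the confidence interval from the F statistic and degrees of freedom printed above. It uses only the standard F-distribution functions pf() and qf(); the variable names are illustrative and not part of the original analysis.

```r
# Values copied from the var.test() output above
F_ratio <- 12.16509   # ratio of the two sample variances
df_num  <- 5          # numerator degrees of freedom (n1 - 1)
df_den  <- 5          # denominator degrees of freedom (n2 - 1)

# Two-sided P-value: twice the smaller tail area of the F distribution
upper_tail <- pf(F_ratio, df_num, df_den, lower.tail = FALSE)
p_value <- 2 * min(upper_tail, 1 - upper_tail)

# 95% confidence interval for the true ratio of variances
ci <- c(F_ratio / qf(0.975, df_num, df_den),
        F_ratio / qf(0.025, df_num, df_den))

print(p_value)
print(ci)
```

The results should agree with the printed output up to rounding of the F statistic.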

Comparing variances - Levene's test

library(car)
leveneTest(chinook$proportionSurvived, 
           group = chinook$troutTreatment, 
           center = mean)
Levene's Test for Homogeneity of Variance (center = mean)
      Df F value   Pr(>F)   
group  1  10.315 0.009306 **
      10                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
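
To see what Levene's test is doing under the hood: with center = mean, it amounts to a one-way ANOVA on the absolute deviations of each observation from its own group mean. The sketch below does this by hand in base R. The data are simulated and purely hypothetical (they are not the chinook data), and the variable names are illustrative.

```r
set.seed(1)  # simulated, hypothetical data (NOT the chinook data)
y <- c(rnorm(6, mean = 0.20, sd = 0.03),   # low-variance group
       rnorm(6, mean = 0.25, sd = 0.10))   # high-variance group
g <- factor(rep(c("present", "absent"), each = 6))

# Absolute deviation of each observation from its own group mean
# (ave() returns the group mean for every observation by default)
abs_dev <- abs(y - ave(y, g))

# One-way ANOVA on the absolute deviations
levene_by_hand <- anova(lm(abs_dev ~ g))
print(levene_by_hand)
```

Using center = median instead replaces the group means with group medians, which is more robust to skewed data.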

Visualizing significance between groups

How to compare between two groups with only confidence intervals?

The fallacy of indirect comparison

Example 12.5: Mommy's baby, Daddy's maybe

Question: Do babies look more like their fathers or their mothers?

Christenfeld and Hill (1995) predicted that babies resemble their fathers more than their mothers, hypothesizing that this resemblance confers an evolutionary advantage through increased paternal care. They tested this by obtaining pictures of a series of babies and their mothers and fathers. Participants were shown a picture of a child and either three possible mothers or three possible fathers (one of whom was the true parent).

The fallacy of indirect comparison

Conclusion: Because the result for fathers was statistically significant and the result for mothers was not, the authors concluded that babies resemble their fathers more than their mothers.

Discuss: What’s the mistake here?

Mistake: Misinterpretation of statistical significance

The fallacy of indirect comparison

Fallacy: If one test in Group 1 shows with statistical significance that \( \mu_{1} > \mu_{0} \), and the same test in Group 2 does not show \( \mu_{2} > \mu_{0} \), then this shows with statistical significance that \( \mu_{1} > \mu_{2} \).

The fallacy of direct comparison

Fallacy: If \( \bar{Y}_{1} > \bar{Y}_{2} \), then \( \mu_{1} > \mu_{2} \).

Mistake: Relying on point estimates rather than interval estimates

The fallacy of indirect comparison

Conclusion: Comparisons between two groups should always be made directly using the appropriate statistical test, not indirectly by comparing both to the same null hypothesized value.
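
A small simulation can make the fallacy concrete. In the sketch below, both groups are drawn from the same population, so the true means are identical. Even so, it is common for exactly one of the two groups to differ significantly from the null value, while the direct two-sample comparison is significant only at about the nominal error rate. The sample size, effect size, and variable names are arbitrary assumptions chosen for illustration.

```r
set.seed(42)
n_sims <- 2000
mu0 <- 0                       # null hypothesized value for both groups
one_sig_only <- logical(n_sims)
direct_sig   <- logical(n_sims)

for (i in seq_len(n_sims)) {
  # Both groups come from the SAME population (true mean 0.5, sd 1)
  g1 <- rnorm(15, mean = 0.5, sd = 1)
  g2 <- rnorm(15, mean = 0.5, sd = 1)

  # Indirect comparison: test each group against mu0 separately
  p1 <- t.test(g1, mu = mu0)$p.value
  p2 <- t.test(g2, mu = mu0)$p.value
  one_sig_only[i] <- xor(p1 < 0.05, p2 < 0.05)

  # Direct comparison: two-sample t-test between the groups
  direct_sig[i] <- t.test(g1, g2)$p.value < 0.05
}

# Fraction of simulations where exactly one group was "significant"
print(mean(one_sig_only))
# Fraction of simulations where the direct test found a difference
print(mean(direct_sig))
```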

Handling violations of assumptions

Four options for handling violations of assumptions:

  • Ignore the violations of assumptions
  • Transform the data
  • Use a nonparametric method
  • Use a permutation test (computer-intensive methods)

Need to detect deviations first

To check for normality, first (as always) look at your data. Histograms work best here.

Detecting deviations from normality

The following data come from a normal distribution:

They don't look normal, but they:

  • …don't have outliers
  • …aren't skewed

Detecting deviations from normality

Examples of data from non-normal distributions:

Normal quantile plot

Definition: The normal quantile plot compares each observation in the sample with its quantile expected from the standard normal distribution. Points should fall roughly along a straight line if the data come from a normal distribution.

Normal quantile plot - R Example

  1. Sort measurements (\( x \))
  2. Compute percentiles of \( x \) (cumulative probabilities, \( p \))
  3. Compute standard normal quantiles from percentiles (\( q \))
  4. Plot measurements against computed quantiles (\( q \) vs \( x \))

x <- sort(rnorm(20))  # (1)
p <- (1:20)/21  # (2)
q <- qnorm(p, lower.tail = TRUE)  # (3)
plot(q ~ x, xlab="Measurements", ylab="Normal quantiles")  # (4)

Normal quantile plot - R Example

Fast way (note: by default, qqnorm() puts the data on the y-axis; datax = TRUE puts the data on the x-axis, matching the manual plot)

qqnorm(x, datax = TRUE)
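
One detail worth knowing when comparing the manual plot with qqnorm(): the two use slightly different cumulative probabilities. The sketch below, for a sample of 20 as in the example, compares the manual choice (1:20)/21 with ppoints(), the base R helper that qqnorm() uses. The variable names are illustrative.

```r
n <- 20
p_manual <- (1:n) / (n + 1)     # probabilities used in the manual steps
p_qqnorm <- ppoints(n)          # probabilities used internally by qqnorm()

# For n > 10, ppoints(n) is (i - 1/2) / n
stopifnot(isTRUE(all.equal(p_qqnorm, ((1:n) - 0.5) / n)))

# The resulting normal quantiles differ mostly in the tails
q_manual <- qnorm(p_manual)
q_qqnorm <- qnorm(p_qqnorm)
print(round(cbind(q_manual, q_qqnorm), 3))
```

Either choice gives a plot that is read the same way: points close to a straight line suggest approximate normality.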

Marine reserve example

Question: Are marine reserves effective in preserving marine wildlife?

Experimental design

Halpern (2003) matched 32 marine reserves to a control location, which was either the site of the reserve before it became protected or a similar unprotected site nearby. They then evaluated the “biomass ratio,” which is the ratio of total masses of all marine plants and animals per unit area of reserve in the protected and matched unprotected areas.

Discuss: Observational or experimental? Paired or unpaired? Interpret response measure in terms of effect of protection.

Answer: Observational. Paired (matching). Biomass ratio = 1 (no effect); > 1 (beneficial effect); < 1 (detrimental effect).

How to interpret normal quantile plots

Practice Problem #4: Interpret the following normal quantile plots.

Statistical test for normality??

Definition: A Shapiro-Wilk test evaluates the goodness of fit of a normal distribution to a set of data randomly sampled from a population.

\( H_{0} \): The data are sampled from a population having a normal distribution.
\( H_{A} \): The data are sampled from a population not having a normal distribution.

Cautions:

  • Small sample size might not have enough power.
  • Large sample size can have too much power (reject even when deviation from normality is very slight)
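
The two cautions can be illustrated with a quick simulation. The sketch below repeatedly samples from a mildly heavy-tailed distribution (a t distribution with 10 degrees of freedom, chosen here only as an example of a modest departure from normality) and records how often the Shapiro-Wilk test rejects at the 5% level for a small and a large sample. The function name reject_rate and all settings are assumptions for illustration.

```r
set.seed(13)

# Fraction of simulated samples of size n for which Shapiro-Wilk rejects
# normality at the 5% level; data come from a t distribution with 10 df
reject_rate <- function(n, n_sims = 300) {
  p_values <- replicate(n_sims, shapiro.test(rt(n, df = 10))$p.value)
  mean(p_values < 0.05)
}

rate_small <- reject_rate(10)     # small sample: departure rarely detected
rate_large <- reject_rate(2000)   # large sample: same departure often detected

print(c(n10 = rate_small, n2000 = rate_large))
```

The population is the same in both cases; only the sample size changes how often the test flags it as non-normal.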

Shapiro-Wilk Test - R Example

marine <- read.csv("/Users/mdlama/Dropbox/Work/Teaching/College of William and Mary/Fall 2018/Datasets/chapter13/chap13e1MarineReserve.csv")
hist(marine$biomassRatio)

Shapiro-Wilk Test - R Example

marine <- read.csv("/Users/mdlama/Dropbox/Work/Teaching/College of William and Mary/Fall 2018/Datasets/chapter13/chap13e1MarineReserve.csv")
shapiro.test(marine$biomassRatio)

    Shapiro-Wilk normality test

data:  marine$biomassRatio
W = 0.81751, p-value = 8.851e-05

Conclusion: Judge normality using a combination of graphical methods, formal tests, and common sense.

When to ignore violation of assumptions

  • Ignore the violations of assumptions
  • Transform the data
  • Use a nonparametric method
  • Use a permutation test (computer-intensive methods)

Definition: A statistical procedure is robust if the answer it gives is not sensitive to violations of assumptions of the method.

Main takeaway point: Whether a violation can be ignored must be judged case by case; it depends on the statistical test and the data (see book for discussion).

Data transformations

  • Ignore the violations of assumptions
  • Transform the data
  • Use a nonparametric method
  • Use a permutation test (computer-intensive methods)

Definition: A data transformation changes each measurement by the same mathematical formula.

Data transformations

Common transformations:

  • Log transformation (data skewed right) \[ Y^{\prime} = \ln[Y] \]
  • Arcsine transformation (data are proportions) \[ p^{\prime} = \arcsin[\sqrt{p}] \]
  • Square-root transformation (data are counts) \[ Y^{\prime} = \sqrt{Y + 1/2} \]
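
In R, each of these transformations is a one-liner. The sketch below applies them to small made-up vectors (hypothetical values, used only to show the syntax).

```r
# Hypothetical right-skewed measurements: log transformation
skewed <- c(1.2, 3.5, 10.1, 48.0, 230.0)
log_y <- log(skewed)               # natural log, Y' = ln[Y]

# Hypothetical proportions: arcsine (square-root) transformation
props <- c(0.05, 0.25, 0.50, 0.90)
asin_p <- asin(sqrt(props))        # p' = arcsin[sqrt(p)], in radians

# Hypothetical counts: square-root transformation
counts <- c(0, 3, 7, 12)
sqrt_y <- sqrt(counts + 1/2)       # Y' = sqrt(Y + 1/2)

print(log_y)
print(asin_p)
print(sqrt_y)
```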

Data transformations

Other transformations:

  • Square transformation (data skewed left) \[ Y^{\prime} = Y^2 \]
  • Antilog transformation (data skewed left) \[ Y^{\prime} = e^{Y} \]
  • Reciprocal transformation (data skewed right) \[ Y^{\prime} = \frac{1}{Y} \]
  • Box-Cox transformation (skew) \[ Y^{\prime}_{\lambda} = \frac{Y^{\lambda} - 1}{\lambda} \]
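
The Box-Cox formula is undefined at \( \lambda = 0 \); by convention that case is taken to be the log transformation, which is also the limit of the formula as \( \lambda \to 0 \). A minimal sketch follows, with box_cox as a hypothetical helper name (not a function from any package) and made-up data.

```r
# Hypothetical helper: Box-Cox transformation for positive data
box_cox <- function(y, lambda) {
  stopifnot(all(y > 0))            # only defined for positive values
  if (lambda == 0) {
    log(y)                         # conventional definition at lambda = 0
  } else {
    (y^lambda - 1) / lambda
  }
}

y <- c(0.5, 1, 2, 10)              # made-up positive measurements

print(box_cox(y, 1))               # lambda = 1: just shifts the data by -1
print(box_cox(y, 0.5))             # lambda = 1/2: square-root-like
print(box_cox(y, 0))               # lambda = 0: log transformation
print(box_cox(y, 1e-6))            # tiny lambda: very close to log(y)
```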

Log transformations - When to use

  • Measurements are ratios or products
  • Frequency distribution is skewed right
  • Group having larger mean also has larger standard deviation
  • Data span several orders of magnitude

Log transformations - How to use

Hypothesis testing

marine <- read.csv("/Users/mdlama/Dropbox/Work/Teaching/College of William and Mary/Fall 2018/Datasets/chapter13/chap13e1MarineReserve.csv")
shapiro.test(log(marine$biomassRatio))

    Shapiro-Wilk normality test

data:  log(marine$biomassRatio)
W = 0.93795, p-value = 0.06551

Log transformations - How to use

Hypothesis testing

hist(log(marine$biomassRatio))

Log transformations - How to use

Original statistical hypotheses:

\( H_{0} \): The mean of the biomass ratio of marine reserves is one (\( \mu = 1 \))
\( H_{A} \): The mean of the biomass ratio of marine reserves is not one (\( \mu \neq 1 \))

Transformed statistical hypotheses:

\( H_{0} \): The mean of the log biomass ratio of marine reserves is zero (\( \mu^{\prime} = 0 \))
\( H_{A} \): The mean of the log biomass ratio of marine reserves is not zero (\( \mu^{\prime} \neq 0 \))

Log transformations - How to use

t.test(log(marine$biomassRatio), mu=0)

    One Sample t-test

data:  log(marine$biomassRatio)
t = 7.3968, df = 31, p-value = 2.494e-08
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 0.3470180 0.6112365
sample estimates:
mean of x 
0.4791272 

Log transformations - How to use

Estimation

The 95% confidence interval for the log transformed data is

\[ 0.347 < \mu^{\prime} < 0.611. \]

Back-transforming the endpoints with the exponential function gives a 95% confidence interval for the geometric mean on the original scale:

\[ e^{0.347} < \mathrm{geometric \ mean} < e^{0.611}, \]

or

\[ 1.41 < \mathrm{geometric \ mean} < 1.84. \]
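
The back-transformation is a single call to exp(). The sketch below uses the confidence limits and sample mean printed in the t.test() output above; the variable names are illustrative.

```r
# Values copied from the t.test() output on the log scale
log_ci   <- c(0.3470180, 0.6112365)   # 95% CI for the mean of log(biomassRatio)
log_mean <- 0.4791272                 # sample mean of log(biomassRatio)

# Back-transform to the original scale
geo_ci   <- exp(log_ci)     # 95% CI for the geometric mean
geo_mean <- exp(log_mean)   # estimated geometric mean

print(round(geo_ci, 2))     # approximately 1.41 and 1.84
print(round(geo_mean, 2))
```

Note that the back-transformed quantity is the geometric mean of the biomass ratio, not the arithmetic mean.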

Discuss: Conclusion?

Data transformations - Caveats

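  • Be careful with the sign of your data! Log and square-root transformations are undefined for negative values, and the log is also undefined for zero.
  • Avoid multiple testing with transformations (i.e., do not try every transformation and then report the one that gives a significant result).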

Use a nonparametric method

  • Ignore the violations of assumptions
  • Transform the data
  • Use a nonparametric method
  • Use a permutation test (computer-intensive methods)

Definition: A nonparametric method makes fewer assumptions than standard parametric methods do about the distribution of the variables.

Property: Nonparametric methods are usually based on the ranks of the data points (medians, quartiles, etc.)

Property: Nonparametric tests are typically less powerful than parametric tests.

Use a nonparametric method

  • A nonparametric alternative to the one-sample \( t \)-test is the sign test.

Definition: The sign test compares the median of a sample to a constant specified in the null hypothesis. It makes no assumptions about the distribution of the measurements in the population.

  • A nonparametric alternative to the two-sample \( t \)-test is the Mann-Whitney \( U \)-test.

Definition: The Mann-Whitney \( U \)-test compares the distributions of two groups. It does not require as many assumptions as the two-sample \( t \)-test.
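
In base R, the Mann-Whitney \( U \)-test is run with wilcox.test() (it is also known as the Wilcoxon rank-sum test). The sketch below applies it to simulated right-skewed data; the data frame d and its column names are hypothetical.

```r
set.seed(7)

# Simulated, hypothetical data: two groups with right-skewed measurements
d <- data.frame(
  y     = c(rexp(12, rate = 1), rexp(12, rate = 0.4)),
  group = factor(rep(c("A", "B"), each = 12))
)

# Mann-Whitney U-test (Wilcoxon rank-sum test) comparing the two groups
mw <- wilcox.test(y ~ group, data = d)
print(mw)
```

The statistic reported as W is computed from the ranks of the pooled data, so the result does not change under any monotonic transformation of y.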

Sign test: Binomial test in disguise

Algorithm:

  • First, state a null hypothesized median.
  • Label all measurements larger than this median with a “\( + \)”, and all measurements smaller than this median with a “\( - \)”.
  • Throw out any measurements exactly equal to the median (the sample size is reduced by the number discarded).
  • Run a binomial test with the number of “\( + \)” values (or “\( - \)” values) as the test statistic and null proportion \( p_{0}=0.5 \).

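The sign test has very little power. If \( n \leq 5 \), the sign test cannot be used: even when all measurements fall on the same side of the null median, the two-sided \( P \)-value is at least \( 2 \times 0.5^{5} = 0.0625 \), so it can never fall below 0.05.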
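
The algorithm can be sketched in a few lines of base R using binom.test(). The measurements below are made up for illustration, and the null median of 1 is an arbitrary assumption.

```r
# Hypothetical measurements and null hypothesized median
y  <- c(1.3, 0.8, 2.1, 1.0, 1.7, 1.5, 0.9, 3.2, 1.4, 1.1)
m0 <- 1

# Drop measurements exactly equal to the null median
y_kept <- y[y != m0]
n      <- length(y_kept)      # reduced sample size

# Count the "+" values (measurements above the null median)
n_plus <- sum(y_kept > m0)

# Sign test = binomial test of n_plus successes out of n with p0 = 0.5
sign_test <- binom.test(n_plus, n = n, p = 0.5)
print(sign_test)

# Smallest possible two-sided P-value when n = 5 (all five on one side)
min_p_n5 <- binom.test(5, n = 5, p = 0.5)$p.value
print(min_p_n5)               # 0.0625, which is above 0.05
```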