Comparing two means

Alban Guillaumet, Troy University

Objectives

  • Comparing the means of a numerical variable between two treatments or groups:

    • Paired design & \( t \)-test
    • Two-sample design & \( t \)-test
    • Two-sample design with unequal variance
  • Comparing two variances

Comparing two means

  • Frequent question in Biology. Examples?

  • Do patients treated with drug A live longer than those treated with drug B?

  • Are wolves larger than coyotes?

It all starts with experimental design

Definition: In the paired design, both treatments are applied to every sampled unit. In the two-sample design, each treatment group is composed of an independent, random sample of units.

Experimental design

Salamander abundance vs. clear-cutting/no clear-cutting of forest

  • Two-sample design:

    • random sample of forest plots
    • randomly assign one treatment to each plot
  • Mean comparison using a two-sample t-test

Experimental design

Salamander abundance vs. clear-cutting/no clear-cutting of forest

  • Paired design:

    • random sample of forest plots
    • randomly assign one of the treatments to each half of the plot
  • Mean comparison using a paired t-test

Paired design

Remember standard error: \[ \mathrm{SE}_{\bar{Y}} = \frac{s}{\sqrt{n}} \]

We can increase the precision of our estimates and statistical power of our analyses by decreasing SE:

  • Increasing sample size
  • Decreasing variability (paired design)
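
As a rough illustration of the second point, the sketch below simulates a hypothetical paired experiment. All numbers (20 plots, a plot-to-plot standard deviation of 5, a treatment effect of +2) are made up for illustration and do not come from any study. Because each plot has its own baseline, the two measurements on a plot are strongly correlated, and the standard error of the mean difference should come out much smaller than the standard error obtained by treating the two columns as independent samples.

```r
# Simulated (hypothetical) data: 20 plots, each measured under both treatments.
set.seed(1)
n        <- 20
baseline <- rnorm(n, mean = 50, sd = 5)      # plot-to-plot variability
control  <- baseline + rnorm(n, sd = 1)      # measurement under control
treated  <- baseline + 2 + rnorm(n, sd = 1)  # assumed treatment effect of +2

# Paired analysis: SE of the mean of the within-plot differences
d         <- treated - control
se_paired <- sd(d) / sqrt(n)

# Unpaired analysis of the same numbers: SE of the difference between two means
se_unpaired <- sqrt(var(treated) / n + var(control) / n)

c(paired = se_paired, unpaired = se_unpaired)
```

Taking differences removes the shared plot baseline, which is why the paired standard error is the smaller of the two here.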

Experimental design

[Figure: unpaired design vs. paired design]

Paired design examples

  • Testing effects of sunscreen applied to one arm of each subject compared with a placebo applied to the other arm

  • Comparing fish species diversity in lakes before and after eutrophication

Paired design: What is our resulting variable?

Definition: Paired measurements are converted to a single measurement by taking the difference between them.


\[ d = Y_{T}-Y_{C}, \]

where \( Y_{T} \) and \( Y_{C} \) denote the variable in the treatment and control groups, respectively.

Paired design: Estimation

If the pairs are a random sample and the difference between paired measurements (\( d = Y_{T}-Y_{C} \)) is normally distributed, then the confidence interval for the mean difference is:

\[ \bar{d} - t_{\alpha(2),df}\mathrm{SE}_{\bar{d}} < \mu_{d} < \bar{d} + t_{\alpha(2),df}\mathrm{SE}_{\bar{d}} \]

\[ \mathrm{SE}_{\bar{d}} = \frac{s_{d}}{\sqrt{n}} \]

\[ df=n-1 \]

Paired design: Hypothesis testing

Paired \( t \)-test: a one-sample \( t \)-test on the differences \( d \), testing whether the mean difference in the population equals a null hypothesized value.

\[ H_{0}: \mu_{d} = \mu_{d0} \] \[ H_{A}: \mu_{d} \neq \mu_{d0} \]

Test statistic:

\[ t = \frac{\bar{d} - \mu_{d0}}{\mathrm{SE}_{\bar{d}}} \]

Distribution under \( H_{0} \): the \( t \)-distribution with \( n-1 \) degrees of freedom, where \( n \) is the number of pairs.

Paired design: Hypothesis testing

Assumptions: the same as for the one-sample \( t \)-test (and for the confidence interval for the mean difference of paired data):

  • Pairs are a random sample
  • The difference between paired measurements (\( d = Y_{T}-Y_{C} \)) is normally distributed

Paired design: Practice Problem #1

Question: Can the death rate be influenced by tax incentives?

Kopczuk and Slemrod (2003) investigated this possibility using data on deaths in the United States in years in which the government announced it was changing (usually raising) the tax rate on inheritance (the estate tax). The authors calculated the death rate during the 14 days before, and the 14 days after, the changes in the estate tax rates took effect.

Paired design: Practice Problem #1

   yearOfChange HigherTaxDeaths lowerTaxDeaths
1          1917           22.21          24.93
2          1917           18.86          20.00
3          1919           28.21          29.93
4          1924           31.64          30.64
5          1926           18.43          20.86
6          1932            9.50          10.14
7          1934           24.29          28.00
8          1935           26.64          25.29
9          1940           35.07          35.00
10         1941           38.86          37.57
11         1942           28.50          34.79
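
The code on the following slides refers to a data frame named deathRate, but the step that creates it is not shown. One way to build it, assuming only the values printed in the table above, is to type them in directly. The column d is an illustrative name added here for the per-row difference; it is not part of the original data set.

```r
# Death-rate data typed in from the table above (Kopczuk and Slemrod 2003).
deathRate <- data.frame(
  yearOfChange    = c(1917, 1917, 1919, 1924, 1926, 1932,
                      1934, 1935, 1940, 1941, 1942),
  HigherTaxDeaths = c(22.21, 18.86, 28.21, 31.64, 18.43, 9.50,
                      24.29, 26.64, 35.07, 38.86, 28.50),
  lowerTaxDeaths  = c(24.93, 20.00, 29.93, 30.64, 20.86, 10.14,
                      28.00, 25.29, 35.00, 37.57, 34.79)
)

# One difference per row: higher-tax death rate minus lower-tax death rate.
# (The column name `d` is illustrative, not part of the original data set.)
deathRate$d <- deathRate$HigherTaxDeaths - deathRate$lowerTaxDeaths

nrow(deathRate)    # number of pairs
mean(deathRate$d)  # mean of the differences
```

The paired analysis works on the single column d, so the sample size is the number of pairs (11), not the number of individual measurements (22).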

Paired design: Practice Problem #1

[Figure: strip chart of the death-rate data]

Paired design: Practice Problem #1

Question: What are the null and alternative hypotheses?

Answer:
\[ \begin{align} H_{0}: & \mathrm{Mean \ change \ in \ death \ rate \ is \ zero}\\ H_{A}: & \mathrm{Mean \ change \ in \ death \ rate \ is \ not \ zero} \end{align} \]

Answer:
\[ H_{0}: \mu_{d} = 0 \] \[ H_{A}: \mu_{d} \neq 0 \]

Paired design: Practice Problem #1

Let's do a paired \( t \)-test

t.test(x = deathRate$HigherTaxDeaths, 
       y = deathRate$lowerTaxDeaths, 
       mu = 0, 
       paired = TRUE)


    Paired t-test

data:  deathRate$HigherTaxDeaths and deathRate$lowerTaxDeaths
t = -1.9121, df = 10, p-value = 0.08491
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -2.9408501  0.2244865
sample estimates:
mean of the differences 
              -1.358182 
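
To connect this output to the formulas for the paired design, the following sketch recomputes the test statistic, P-value, and confidence interval by hand. It assumes only the death-rate values printed in the data table; the variable names (higher, lower, se_d, and so on) are illustrative.

```r
# Death rates typed in from the data table (higher-tax and lower-tax periods).
higher <- c(22.21, 18.86, 28.21, 31.64, 18.43, 9.50,
            24.29, 26.64, 35.07, 38.86, 28.50)
lower  <- c(24.93, 20.00, 29.93, 30.64, 20.86, 10.14,
            28.00, 25.29, 35.00, 37.57, 34.79)

d     <- higher - lower       # paired differences
n     <- length(d)            # number of pairs
d_bar <- mean(d)              # sample mean difference
se_d  <- sd(d) / sqrt(n)      # SE of the mean difference: s_d / sqrt(n)
df    <- n - 1

# Test statistic under H0: mu_d = 0, and two-sided P-value
t_stat  <- (d_bar - 0) / se_d
p_value <- 2 * pt(-abs(t_stat), df)

# 95% confidence interval: d_bar -/+ t_{0.05(2), df} * SE
t_crit <- qt(0.975, df)
ci     <- d_bar + c(-1, 1) * t_crit * se_d

round(c(t = t_stat, df = df, p = p_value), 4)
round(ci, 4)
```

These values should agree with the t.test output above (t = -1.9121, df = 10, P = 0.08491).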

Accessing attributes of a t.test

t.t <- with(deathRate, t.test(HigherTaxDeaths, lowerTaxDeaths, mu = 0, paired = TRUE))
attributes(t.t)
$names
 [1] "statistic"   "parameter"   "p.value"     "conf.int"    "estimate"   
 [6] "null.value"  "stderr"      "alternative" "method"      "data.name"  

$class
[1] "htest"

Accessing attributes of a t.test

data.frame(statistic = t.t$statistic, df = t.t$parameter, P.value = t.t$p.value)
  statistic df    P.value
t -1.912098 10 0.08491016

Two-sample design

If both samples are random samples and the numerical variable is normally distributed within both populations, then the sampling distribution of the statistic

\[ t = \frac{\left(\bar{Y}_{1} - \bar{Y}_{2}\right) - \left(\mu_{1}-\mu_{2}\right)}{\mathrm{SE}_{\bar{Y}_{1} - \bar{Y}_{2}}} \]

has a Student's \( t \)-distribution with total degrees of freedom given by

\[ df = df_{1} + df_{2} = n_{1} + n_{2} - 2. \]

Two-sample design

Definition: The standard error of the difference between the two sample means is given by \[ \mathrm{SE}_{\bar{Y}_{1}-\bar{Y}_{2}} = \sqrt{s_{p}^2\left(\frac{1}{n_{1}} + \frac{1}{n_{2}}\right)} \] where the pooled sample variance \( s_{p}^{2} \) is given by

\[ s_{p}^2 = \frac{df_{1}s_{1}^2 + df_{2}s_{2}^2}{df_{1}+df_{2}}, \]

with \( df_{1} = n_{1}-1 \) and \( df_{2} = n_{2}-1 \).

Paired vs. unpaired

Paired designs

  • The sample size in each group is the same.
  • We estimate the mean of the differences.

Unpaired designs

  • The sample size in each group may not be the same.
  • We estimate the difference of the means.

Two-sample design: Example

Practice Problem #16

A study in West Africa (Lefèvre et al. 2010), working with the mosquito species that carry malaria, tested whether drinking the local beer influenced attractiveness to mosquitoes. The researchers opened a container holding 50 mosquitoes next to each of 25 alcohol-free participants and measured the proportion of mosquitoes that left the container and flew toward the participants. They repeated this procedure 15 minutes after each of the same participants had consumed a liter of beer and recorded the change in proportion (treatment group).

This procedure was also carried out on another 18 human participants who were given water instead of beer (control group).

Two-sample design: Example

mydata <- read.csv("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter12/chap12q16BeerAndMosquitoes.csv")
str(mydata)
'data.frame':   43 obs. of  4 variables:
 $ drink      : Factor w/ 2 levels "beer","water": 1 1 1 1 1 1 1 1 1 1 ...
 $ beforeDrink: num  0.13 0.13 0.21 0.25 0.25 0.32 0.43 0.44 0.46 0.5 ...
 $ afterDrink : num  0.49 0.59 0.27 0.43 0.5 0.5 0.37 0.3 0.58 0.89 ...
 $ change     : num  0.36 0.46 0.06 0.18 0.25 0.18 -0.06 -0.14 0.12 0.39 ...

Two-sample design: Example

Short way, again using t.test (note the var.equal=TRUE):

t.test(change ~ drink, 
       mu=0, 
       var.equal=TRUE,
       data=mydata)


    Two Sample t-test

data:  change by drink
t = 3.1913, df = 41, p-value = 0.002717
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.05383517 0.23940928
sample estimates:
 mean in group beer mean in group water 
        0.154400000         0.007777778 
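
The pooled-variance formulas can be checked against this output using summary statistics alone. The sketch below assumes the group sizes stated in the problem (25 beer, 18 water), the group means printed in the output above, and the group standard deviations that tapply(mydata$change, mydata$drink, sd) returns for this data set; the variable names are illustrative.

```r
# Summary statistics for `change` (values reported in the slides)
n1 <- 25; n2 <- 18                   # beer, water
m1 <- 0.1544;    m2 <- 0.007777778   # group means
s1 <- 0.1622519; s2 <- 0.1269347     # group standard deviations

df1 <- n1 - 1
df2 <- n2 - 1
df  <- df1 + df2                     # n1 + n2 - 2

# Pooled sample variance and SE of the difference between means
sp2 <- (df1 * s1^2 + df2 * s2^2) / (df1 + df2)
se  <- sqrt(sp2 * (1 / n1 + 1 / n2))

# Test statistic under H0: mu1 - mu2 = 0, two-sided P-value, 95% CI
t_stat  <- ((m1 - m2) - 0) / se
p_value <- 2 * pt(-abs(t_stat), df)
ci      <- (m1 - m2) + c(-1, 1) * qt(0.975, df) * se

round(c(t = t_stat, df = df, p = p_value), 4)
round(ci, 4)
```

Up to rounding of the inputs, this should agree with the t.test output (t = 3.1913, df = 41).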

Two-sample design: Assumptions

Two-sample assumptions:

  • Each of the two samples is a random sample from its population
  • The numerical variable is normally distributed in each population
  • Variances are equal in both populations

Two-sample test is robust to minor deviations from normality. Test also works well if variances are unequal, as long as we have:

  • Moderate sample sizes (\( n_{1}, n_{2} > 30 \))
  • Balance: \( n_{1} \approx n_{2} \)
  • No more than a 3 \( \times \) difference in standard deviation

Two-sample design: Testing Example

(s <- tapply(mydata$change, mydata$drink, sd))
     beer     water 
0.1622519 0.1269347 
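
A quick way to compare these values with the rule of thumb for unequal variances is sketched below. It assumes the two standard deviations printed above and the group sizes stated in the problem (25 and 18); the variable names are illustrative.

```r
# Group standard deviations (from the output above) and sample sizes
s <- c(beer = 0.1622519, water = 0.1269347)
n <- c(beer = 25, water = 18)

sd_ratio <- max(s) / min(s)   # larger SD divided by smaller SD
sd_ratio                      # about 1.28

sd_ratio <= 3                 # within the threefold limit?
all(n > 30)                   # both samples larger than 30?
```

The standard deviations differ by far less than a factor of 3, although both samples are smaller than 30 and not quite balanced.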

Two-sample design: Welch's t-test

Definition: Welch’s t-test compares the means of two groups and can be used even when the variances of the two groups are not equal.

Standard error and degrees of freedom are calculated differently than in the two-sample \( t \)-test, but the rest is similar (i.e. a \( t \)-distribution is used).

Two-sample design: Welch's t-test

Same as two-sample \( t \)-test in R, except var.equal=FALSE (default).

t.test(change ~ drink, 
       mu=0, 
       var.equal=FALSE, 
       data=mydata)


    Welch Two Sample t-test

data:  change by drink
t = 3.3219, df = 40.663, p-value = 0.001897
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.05746134 0.23578311
sample estimates:
 mean in group beer mean in group water 
        0.154400000         0.007777778 
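
The slides do not spell out how the standard error and degrees of freedom differ for Welch's test. The sketch below uses the standard Welch standard error (each sample variance divided by its own sample size) and the Welch-Satterthwaite approximation for the degrees of freedom, applied to the summary statistics reported for this data set. The variable names are illustrative.

```r
# Summary statistics for `change` (values reported in the slides)
n1 <- 25; n2 <- 18                   # beer, water
m1 <- 0.1544;    m2 <- 0.007777778   # group means
s1 <- 0.1622519; s2 <- 0.1269347     # group standard deviations

# Per-group variance of the sample mean (no pooling)
v1 <- s1^2 / n1
v2 <- s2^2 / n2

# Welch standard error and Welch-Satterthwaite degrees of freedom
se <- sqrt(v1 + v2)
df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

# Test statistic under H0: mu1 - mu2 = 0, two-sided P-value, 95% CI
t_stat  <- ((m1 - m2) - 0) / se
p_value <- 2 * pt(-abs(t_stat), df)
ci      <- (m1 - m2) + c(-1, 1) * qt(0.975, df) * se

round(c(t = t_stat, df = df, p = p_value), 4)
round(ci, 4)
```

Up to rounding of the inputs, this should agree with the Welch output (t = 3.3219, df = 40.663). Note that the degrees of freedom are generally not an integer.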

Speaking of variance: Comparing variances

Question: Do populations differ in the variability of measurements?

Remember, it isn't always about inferring central tendency!

Two examples of tests:

  • \( F \)-test (Warning: highly sensitive to departures from the normality assumption)
  • Levene's test (more robust to departures from normality, but at the cost of some power)

Comparing variances

The brook trout is a species native to eastern North America that has been introduced into streams in the West for sport fishing. Biologists followed the survivorship of a native species, chinook salmon, in a series of 12 streams that either had brook trout introduced or did not (Levin et al. 2002). Their goal was to determine whether the presence of brook trout affected the survivorship of the salmon. In each stream, they released a number of tagged juvenile chinook and then recorded whether or not each chinook survived over one year.

Comparing variances

Load data and have a peek:

'data.frame':   12 obs. of  4 variables:
 $ troutTreatment    : Factor w/ 2 levels "absent","present": 2 1 2 2 1 2 1 2 1 1 ...
 $ nReleased         : int  820 467 960 700 959 545 1029 769 27 998 ...
 $ nSurvivors        : int  166 180 136 153 178 103 326 173 7 120 ...
 $ proportionSurvived: num  0.202 0.385 0.142 0.219 0.186 0.189 0.317 0.225 0.259 0.12 ...

Comparing variances

Compute variances in both groups:

(variances <- tapply(chinook$proportionSurvived, 
                chinook$troutTreatment, 
                var))
      absent      present 
0.0107413667 0.0008829667 

Comparing variances - F-test

var.test(proportionSurvived ~ troutTreatment, 
         data = chinook)

    F test to compare two variances

data:  proportionSurvived by troutTreatment
F = 12.165, num df = 5, denom df = 5, p-value = 0.01589
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
  1.702272 86.936360
sample estimates:
ratio of variances 
          12.16509 
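
The F statistic is simply the ratio of the two sample variances, so this output can be reproduced from the variances computed earlier. The sketch below assumes those two variances and 6 streams per group (implied by num df = denom df = 5); the variable names are illustrative.

```r
# Sample variances of proportionSurvived (from the tapply output)
v   <- c(absent = 0.0107413667, present = 0.0008829667)
df1 <- 6 - 1   # numerator df (trout absent)
df2 <- 6 - 1   # denominator df (trout present)

# F statistic: ratio of the two sample variances
F_stat <- unname(v["absent"] / v["present"])

# Two-sided P-value: double the smaller tail probability
p_lower <- pf(F_stat, df1, df2)
p_value <- 2 * min(p_lower, 1 - p_lower)

# 95% confidence interval for the ratio of population variances
ci <- c(F_stat / qf(0.975, df1, df2),
        F_stat / qf(0.025, df1, df2))

round(c(F = F_stat, p = p_value), 4)
round(ci, 3)
```

This should match the var.test output (F = 12.165, P = 0.01589). The interval is very wide because each variance is estimated from only 6 streams.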

Comparing variances - Levene's test

library(car)
leveneTest(proportionSurvived ~ troutTreatment, 
         data = chinook)
Levene's Test for Homogeneity of Variance (center = median)
      Df F value  Pr(>F)  
group  1  9.5354 0.01148 *
      10                  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1