Comparing two means: R code for Chapter 12 examples

Note: This document was converted to R-Markdown from this page by M. Drew LaMar. You can download the R-Markdown here.

Download the R code on this page as a single file here

New methods

Hover over a function argument for a short description of its meaning. The variable names are plucked from the examples further below.

Paired $t$-test and 95% confidence interval of a mean difference:

t.test(blackbird$logAfterImplant, blackbird$logBeforeImplant, paired = TRUE)

Two-sample $t$-test and 95% confidence interval for a difference between two means:

t.test(squamosalHornLength ~ Survival, data = lizard, var.equal = TRUE)

Other new methods:

The $F$ test to compare two variances.
The 95% confidence interval for the variance ratio.
Levene’s test to compare variances between two (or more) groups.

Example 12.2. Red-winged blackbirds

Confidence interval for mean difference and the paired t-test, comparing immunocompetence of red-winged blackbirds before and after testosterone implants.

Read the data into a data frame. The data are in “wide” format (one row per individual).

blackbird <- read.csv(url("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter12/chap12e2BlackbirdTestosterone.csv"))
blackbird

##    blackbird beforeImplant afterImplant logBeforeImplant logAfterImplant
## 1          1           105           85             4.65            4.44
## 2          2            50           74             3.91            4.30
## 3          3           136          145             4.91            4.98
## 4          4            90           86             4.50            4.45
## 5          5           122          148             4.80            5.00
## 6          6           132          148             4.88            5.00
## 7          7           131          150             4.88            5.01
## 8          8           119          142             4.78            4.96
## 9          9           145          151             4.98            5.02
## 10        10           130          113             4.87            4.73
## 11        11           116          118             4.75            4.77
## 12        12           110           99             4.70            4.60
## 13        13           138          150             4.93            5.01

Calculate and plot differences (Figure 12.2-2). We add a variable called d to the data frame with the After minus Before difference.

blackbird$d <- blackbird$logAfterImplant - blackbird$logBeforeImplant
head(blackbird)

##   blackbird beforeImplant afterImplant logBeforeImplant logAfterImplant
## 1         1           105           85             4.65            4.44
## 2         2            50           74             3.91            4.30
## 3         3           136          145             4.91            4.98
## 4         4            90           86             4.50            4.45
## 5         5           122          148             4.80            5.00
## 6         6           132          148             4.88            5.00
##       d
## 1 -0.21
## 2  0.39
## 3  0.07
## 4 -0.05
## 5  0.20
## 6  0.12

hist(blackbird$d, right = FALSE, col = "firebrick")

To see how to produce the strip chart with lines (Figure 12.2-1):

# It helps to obtain a version of the data in "long" format.
blackbird2 = reshape(blackbird, varying = 4:5, direction = "long", idvar = "blackbird", v.names = "logAntibody", times = factor(c("before","after"), levels = c("before","after")))
head(blackbird2)

##          blackbird beforeImplant afterImplant     d   time logAntibody
## 1.before         1           105           85 -0.21 before        4.65
## 2.before         2            50           74  0.39 before        3.91
## 3.before         3           136          145  0.07 before        4.91
## 4.before         4            90           86 -0.05 before        4.50
## 5.before         5           122          148  0.20 before        4.80
## 6.before         6           132          148  0.12 before        4.88

par(bty="l")
stripchart(logAntibody ~ time, data = blackbird2, vertical = TRUE, xlab = "Implant treatment", ylab="Antibody production rate (ln[mOD/min])", xlim = c(0.6,2.4), las = 1, pch = 16, col = "firebrick")
segments(1, blackbird$logBeforeImplant, 2, blackbird$logAfterImplant)

95% confidence interval for the mean difference. 95% confidence intervals are part of the output of the t.test function, viewed in isolation by adding $conf.int to the command.

t.test(blackbird$d)$conf.int

## [1] -0.04007695  0.15238464
## attr(,"conf.level")
## [1] 0.95

or using the paired = TRUE argument of t.test and specifying the paired variables.

t.test(blackbird$logAfterImplant, blackbird$logBeforeImplant, paired = TRUE)$conf.int

## [1] -0.04007695  0.15238464
## attr(,"conf.level")
## [1] 0.95

Paired t-test. A paired $t$-test can be done either on differences you have already calculated (d here) or by using the paired=TRUE argument with the measurements from the two groups.

t.test(blackbird$d)

## 
##  One Sample t-test
## 
## data:  blackbird$d
## t = 1.2714, df = 12, p-value = 0.2277
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -0.04007695  0.15238464
## sample estimates:
##  mean of x 
## 0.05615385

t.test(blackbird$logAfterImplant, blackbird$logBeforeImplant, paired = TRUE)

## 
##  Paired t-test
## 
## data:  blackbird$logAfterImplant and blackbird$logBeforeImplant
## t = 1.2714, df = 12, p-value = 0.2277
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.04007695  0.15238464
## sample estimates:
## mean of the differences 
##              0.05615385

Example 12.3. Horned lizards

Confidence interval for the difference between two means, and the two-sample $t$-test, comparing horn length of live and dead (spiked) horned lizards.

lizard <- read.csv(url("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter12/chap12e3HornedLizards.csv"))
lizard

##     squamosalHornLength Survival
## 1                  25.2   living
## 2                  26.9   living
## 3                  26.6   living
## 4                  25.6   living
## 5                  25.7   living
## 6                  25.9   living
## 7                  27.3   living
## 8                  25.1   living
## 9                  30.3   living
## 10                 25.6   living
## 11                 26.0   living
## 12                 24.6   living
## 13                 25.6   living
## 14                 25.3   living
## 15                 23.5   living
## 16                 24.5   living
## 17                 23.3   living
## 18                 26.0   living
## 19                 23.9   living
## 20                 27.3   living
## 21                 25.4   living
## 22                 25.5   living
## 23                 21.4   living
## 24                 23.8   living
## 25                 25.5   living
## 26                 19.2   living
## 27                 20.7   living
## 28                 19.2   living
## 29                 25.5   living
## 30                 20.5   living
## 31                 20.6   living
## 32                 24.9   living
## 33                 23.7   living
## 34                 23.4   living
## 35                 25.6   living
## 36                 26.6   living
## 37                 27.7   living
## 38                 25.7   living
## 39                 27.0   living
## 40                 26.5   living
## 41                 25.0   living
## 42                 19.7   living
## 43                 27.1   living
## 44                 23.0   living
## 45                 22.7   living
## 46                 25.8   living
## 47                 28.8   living
## 48                 23.5   living
## 49                 23.2   living
## 50                 25.3   living
## 51                 23.8   living
## 52                 25.1   living
## 53                 26.7   living
## 54                 27.1   living
## 55                 22.5   living
## 56                 25.5   living
## 57                 24.4   living
## 58                 25.6   living
## 59                 20.6   living
## 60                 25.5   living
## 61                 24.5   living
## 62                 23.0   living
## 63                 27.6   living
## 64                 27.3   living
## 65                 20.7   living
## 66                 26.0   living
## 67                 25.1   living
## 68                 19.9   living
## 69                 23.5   living
## 70                 24.8   living
## 71                 22.4   living
## 72                 24.7   living
## 73                 27.5   living
## 74                 21.5   living
## 75                 24.0   living
## 76                 22.4   living
## 77                 21.3   living
## 78                 25.0   living
## 79                 24.3   living
## 80                 24.5   living
## 81                 23.6   living
## 82                 21.1   living
## 83                 25.6   living
## 84                 26.1   living
## 85                 23.7   living
## 86                 24.2   living
## 87                 23.2   living
## 88                 26.5   living
## 89                 28.1   living
## 90                 24.1   living
## 91                 29.5   living
## 92                 24.0   living
## 93                 26.8   living
## 94                 25.8   living
## 95                 23.3   living
## 96                 22.4   living
## 97                 24.2   living
## 98                 26.1   living
## 99                 23.2   living
## 100                21.7   living
## 101                24.7   living
## 102                21.4   living
## 103                18.5   living
## 104                23.2   living
## 105                  NA   living
## 106                21.5   living
## 107                29.1   living
## 108                21.7   living
## 109                23.5   living
## 110                23.0   living
## 111                20.0   living
## 112                21.9   living
## 113                27.4   living
## 114                27.1   living
## 115                23.1   living
## 116                23.8   living
## 117                26.5   living
## 118                27.0   living
## 119                26.4   living
## 120                26.3   living
## 121                20.6   living
## 122                24.5   living
## 123                22.8   living
## 124                25.5   living
## 125                25.3   living
## 126                28.0   living
## 127                20.8   living
## 128                25.9   living
## 129                24.0   living
## 130                22.5   living
## 131                26.3   living
## 132                23.3   living
## 133                24.5   living
## 134                21.6   living
## 135                23.6   living
## 136                23.0   living
## 137                22.4   living
## 138                27.4   living
## 139                25.0   living
## 140                23.9   living
## 141                28.2   living
## 142                27.2   living
## 143                13.1   living
## 144                22.9   living
## 145                26.6   living
## 146                25.8   living
## 147                26.3   living
## 148                20.9   living
## 149                25.6   living
## 150                24.8   living
## 151                19.2   living
## 152                27.4   living
## 153                24.2   living
## 154                15.7   living
## 155                17.7   living
## 156                21.4   killed
## 157                23.9   killed
## 158                23.2   killed
## 159                22.6   killed
## 160                22.5   killed
## 161                19.3   killed
## 162                23.5   killed
## 163                23.4   killed
## 164                19.0   killed
## 165                21.7   killed
## 166                20.2   killed
## 167                26.7   killed
## 168                21.7   killed
## 169                21.0   killed
## 170                23.9   killed
## 171                24.6   killed
## 172                21.6   killed
## 173                25.3   killed
## 174                25.0   killed
## 175                25.2   killed
## 176                15.2   killed
## 177                22.9   killed
## 178                21.4   killed
## 179                23.9   killed
## 180                17.2   killed
## 181                15.5   killed
## 182                22.0   killed
## 183                22.0   killed
## 184                23.1   killed
## 185                20.7   killed

Note that there is one missing value for the variable squamosalHornLength. Everything is easier if we eliminate the row with missing data.

lizard2 <- na.omit(lizard)
lizard2

##     squamosalHornLength Survival
## 1                  25.2   living
## 2                  26.9   living
## 3                  26.6   living
## 4                  25.6   living
## 5                  25.7   living
## 6                  25.9   living
## 7                  27.3   living
## 8                  25.1   living
## 9                  30.3   living
## 10                 25.6   living
## 11                 26.0   living
## 12                 24.6   living
## 13                 25.6   living
## 14                 25.3   living
## 15                 23.5   living
## 16                 24.5   living
## 17                 23.3   living
## 18                 26.0   living
## 19                 23.9   living
## 20                 27.3   living
## 21                 25.4   living
## 22                 25.5   living
## 23                 21.4   living
## 24                 23.8   living
## 25                 25.5   living
## 26                 19.2   living
## 27                 20.7   living
## 28                 19.2   living
## 29                 25.5   living
## 30                 20.5   living
## 31                 20.6   living
## 32                 24.9   living
## 33                 23.7   living
## 34                 23.4   living
## 35                 25.6   living
## 36                 26.6   living
## 37                 27.7   living
## 38                 25.7   living
## 39                 27.0   living
## 40                 26.5   living
## 41                 25.0   living
## 42                 19.7   living
## 43                 27.1   living
## 44                 23.0   living
## 45                 22.7   living
## 46                 25.8   living
## 47                 28.8   living
## 48                 23.5   living
## 49                 23.2   living
## 50                 25.3   living
## 51                 23.8   living
## 52                 25.1   living
## 53                 26.7   living
## 54                 27.1   living
## 55                 22.5   living
## 56                 25.5   living
## 57                 24.4   living
## 58                 25.6   living
## 59                 20.6   living
## 60                 25.5   living
## 61                 24.5   living
## 62                 23.0   living
## 63                 27.6   living
## 64                 27.3   living
## 65                 20.7   living
## 66                 26.0   living
## 67                 25.1   living
## 68                 19.9   living
## 69                 23.5   living
## 70                 24.8   living
## 71                 22.4   living
## 72                 24.7   living
## 73                 27.5   living
## 74                 21.5   living
## 75                 24.0   living
## 76                 22.4   living
## 77                 21.3   living
## 78                 25.0   living
## 79                 24.3   living
## 80                 24.5   living
## 81                 23.6   living
## 82                 21.1   living
## 83                 25.6   living
## 84                 26.1   living
## 85                 23.7   living
## 86                 24.2   living
## 87                 23.2   living
## 88                 26.5   living
## 89                 28.1   living
## 90                 24.1   living
## 91                 29.5   living
## 92                 24.0   living
## 93                 26.8   living
## 94                 25.8   living
## 95                 23.3   living
## 96                 22.4   living
## 97                 24.2   living
## 98                 26.1   living
## 99                 23.2   living
## 100                21.7   living
## 101                24.7   living
## 102                21.4   living
## 103                18.5   living
## 104                23.2   living
## 106                21.5   living
## 107                29.1   living
## 108                21.7   living
## 109                23.5   living
## 110                23.0   living
## 111                20.0   living
## 112                21.9   living
## 113                27.4   living
## 114                27.1   living
## 115                23.1   living
## 116                23.8   living
## 117                26.5   living
## 118                27.0   living
## 119                26.4   living
## 120                26.3   living
## 121                20.6   living
## 122                24.5   living
## 123                22.8   living
## 124                25.5   living
## 125                25.3   living
## 126                28.0   living
## 127                20.8   living
## 128                25.9   living
## 129                24.0   living
## 130                22.5   living
## 131                26.3   living
## 132                23.3   living
## 133                24.5   living
## 134                21.6   living
## 135                23.6   living
## 136                23.0   living
## 137                22.4   living
## 138                27.4   living
## 139                25.0   living
## 140                23.9   living
## 141                28.2   living
## 142                27.2   living
## 143                13.1   living
## 144                22.9   living
## 145                26.6   living
## 146                25.8   living
## 147                26.3   living
## 148                20.9   living
## 149                25.6   living
## 150                24.8   living
## 151                19.2   living
## 152                27.4   living
## 153                24.2   living
## 154                15.7   living
## 155                17.7   living
## 156                21.4   killed
## 157                23.9   killed
## 158                23.2   killed
## 159                22.6   killed
## 160                22.5   killed
## 161                19.3   killed
## 162                23.5   killed
## 163                23.4   killed
## 164                19.0   killed
## 165                21.7   killed
## 166                20.2   killed
## 167                26.7   killed
## 168                21.7   killed
## 169                21.0   killed
## 170                23.9   killed
## 171                24.6   killed
## 172                21.6   killed
## 173                25.3   killed
## 174                25.0   killed
## 175                25.2   killed
## 176                15.2   killed
## 177                22.9   killed
## 178                21.4   killed
## 179                23.9   killed
## 180                17.2   killed
## 181                15.5   killed
## 182                22.0   killed
## 183                22.0   killed
## 184                23.1   killed
## 185                20.7   killed

Multiple histograms using the lattice package.

library(lattice)
histogram( ~ squamosalHornLength | Survival, data = lizard2, layout = c(1,2), col = "firebrick", breaks = seq(12, 32, by = 2), xlab = "Horn length (mm)")

Here is code to make multiple histograms using hist in base R instead.

oldpar = par(no.readonly = TRUE) # make backup of default graph settings
par(mfrow = c(2,1), oma = c(4, 6, 2, 6), mar = c(3, 4, 3, 2))
hist(lizard2$squamosalHornLength[lizard2$Survival == "living"], breaks = seq(12,32,by=2), col = "firebrick", las = 1, main = "living", ylab = "Frequency")
hist(lizard2$squamosalHornLength[lizard2$Survival == "killed"], breaks = seq(12,32,by=2), col = "firebrick", las = 1, main = "killed", ylab = "Frequency")
mtext("Horn length (mm)", side = 1, outer = TRUE, padj = 0)

par(oldpar) # revert to default graph settings

95% confidence interval for the difference between means

The output of t.test includes the 95% confidence interval for the difference between means. Add $confint after calling the function to get R to report only the confidence interval. The formula in the following command tells R to compare squamosalHornLength between the two groups indicated by Survival.

t.test(squamosalHornLength ~ Survival, data = lizard, var.equal = TRUE)$conf.int

## [1] -3.335402 -1.253602
## attr(,"conf.level")
## [1] 0.95

A two-sample $t$-test of the difference between two means can be carried out with t.test by using a formula, asking if squamosalHornLength is predicted by Survival, and specifying that the variables are in the data frame lizard.

t.test(squamosalHornLength ~ Survival, data = lizard, var.equal = TRUE)

## 
##  Two Sample t-test
## 
## data:  squamosalHornLength by Survival
## t = -4.3494, df = 182, p-value = 2.27e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.335402 -1.253602
## sample estimates:
## mean in group killed mean in group living 
##             21.98667             24.28117

Example 12.4. Salmon survival with brook trout

Welch’s two-sample t-test for unequal variances, comparing chinook salmon survival in the presents and absence of brook trout. Below, we use this same example to demonstrate the 95% confidence interval for the ratio of two variances, and the F-test of equal variances.

Read the data.

chinook <- read.csv(url("http://whitlockschluter.zoology.ubc.ca/wp-content/data/chapter12/chap12e4ChinookWithBrookTrout.csv"))

Set the preferred order of categories in tables and graphs

chinook$troutTreatment <- factor(chinook$troutTreatment, levels = c("present", "absent"))

Strip chart of the data (bare bones version of Figure 12.4-1)

stripchart(proportionSurvived ~ troutTreatment, data = chinook, method = "jitter", vertical = TRUE)

Adding the means and confidence intervals to the plot is trickier. Code for a fancier plot is as follows.

# Calculate means and confidence intervals for the means.
meanProportion = tapply(chinook$proportionSurvived, chinook$troutTreatment, mean)
ciPresence = t.test(chinook$proportionSurvived[chinook$troutTreatment == "present"])$conf.int
ciAbsence = t.test(chinook$proportionSurvived[chinook$troutTreatment == "absent"])$conf.int
lower = c(ciPresence[1], ciAbsence[1])
upper = c(ciPresence[2], ciAbsence[2])

# Stripchart with options
adjustAmount = 0.2
par(bty = "l") 
stripchart(proportionSurvived ~ troutTreatment, data = chinook, vertical = TRUE,method = "jitter", jitter = 0.1, pch = 1, col = "firebrick", cex = 1.5, las = 1, ylim = c(0, max(chinook$proportionSurvived)), lwd = 1.5, xlab = "Trout treatment", ylab = "Proportion surviving")
segments( c(1,2) + adjustAmount, lower, c(1,2) + adjustAmount, upper)
points(meanProportion ~ c( c(1,2) + adjustAmount ), pch = 16, cex = 1.2)

Calculating summary statistics by group (as found in Table 12.4-3)

Use tapply to calculate statistics by group. It has three required arguments. The first is the numeric variable of interest (a vector). The second argument is a categorical variable (a vector of the same length) identifying the groups that individuals belong to. The third argument is the name of the R function that you want to apply to the variable by group.

meanProportion <- tapply(chinook$proportionSurvived, chinook$troutTreatment, mean)
sdProportion <-   tapply(chinook$proportionSurvived, chinook$troutTreatment, sd)
nProportion <-    tapply(chinook$proportionSurvived, chinook$troutTreatment, length)
data.frame(mean = meanProportion, std.dev = sdProportion, n = nProportion)

##              mean    std.dev n
## present 0.1941667 0.02971476 6
## absent  0.2351667 0.10364056 6

Welch’s two-sample $t$-test of means for unequal variances can also be done with t.test, when the var.equal argument is set to FALSE (as it is by default):

t.test(proportionSurvived ~ troutTreatment, data = chinook, var.equal = FALSE)

## 
##  Welch Two Sample t-test
## 
## data:  proportionSurvived by troutTreatment
## t = -0.93148, df = 5.8165, p-value = 0.3886
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.14953173  0.06753173
## sample estimates:
## mean in group present  mean in group absent 
##             0.1941667             0.2351667

Here, we demonstrate the 95% confidence interval for the ratio of two variances, and F-test of equal variances, using the chinook salmon experiment.

95% confidence interval for variance ratio. (Warning: remember that the method is not robust to departures from assumption of normality.)

var.test(proportionSurvived ~ troutTreatment, data = chinook)$conf.int

## [1] 0.01150267 0.58745010
## attr(,"conf.level")
## [1] 0.95

$F$-test of equal variances (Warning: Remember that the $F$-test is not robust to departures from assumption of normality.)

var.test(proportionSurvived ~ troutTreatment, data = chinook)

## 
##  F test to compare two variances
## 
## data:  proportionSurvived by troutTreatment
## F = 0.082202, num df = 5, denom df = 5, p-value = 0.01589
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.01150267 0.58745010
## sample estimates:
## ratio of variances 
##         0.08220245

Levene’s test of equal variances. This function is in the car package, which must first be installed and loaded with the library function before use.

if (!require("car")) {install.packages("car", dependencies = TRUE, repos="http://cran.rstudio.com/")}
library(car)
leveneTest(chinook$proportionSurvived, group = chinook$troutTreatment, center = mean)

## Levene's Test for Homogeneity of Variance (center = mean)
##       Df F value   Pr(>F)   
## group  1  10.315 0.009306 **
##       10                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Comparing two means: R code for Chapter 12 examples

Michael Whitlock and Dolph Schluter

New methods

Example 12.2. Red-winged blackbirds

Example 12.3. Horned lizards

Example 12.4. Salmon survival with brook trout