Smelly stoats, cautious voles and careless mice

1. Trap sampling & responses to scent marking

The first hurdle we need to get over is to pull the data into R, using an appropriate format. To make best use of the examples in the course, we should make sure the data are in ‘long’ format—we have always started each analysis using data in this format. This means that rather than organising the data into a table (as presented in the work sheet), we should construct a data set with three columns: 1) a column of species names, 2) a column of scent treatments, and 3) a column of counts. If we call these Species, TrapScent and Count, after reading the data into R the resulting data frame should have the following form:

trap.counts
##       Species  TrapScent Count
## 1    Apodemus      Clean    49
## 2    Apodemus      Stoat    61
## 3    Apodemus Guinea-pig    50
## 4 M. agrestis      Clean    55
## 5 M. agrestis      Stoat    25
## 6 M. agrestis Guinea-pig    65

Both Species and TrapScent are categorical variables, and Count is obviously a numeric (ratio scale) variable.

We could make a bar plot to visualise these data, but they are sufficiently simple that there probably isn’t much to be gained by doing this—inspecting the raw numbers is good enough.

(1a) Is there any difference in the way that the two species respond to the differently marked traps?

The data are counts/frequencies in different categories. This question is asking whether there is a difference in the way the species respond to the traps. Another way of stating this question is to ask whether the frequencies of each species differ among traps. These observations should have lead you to a \(\chi^{2}\) contingency table test.

We can’t combine the data from the two experiments into a single analysis—they involve different treatments at different sites—so the data from the two sites have to be analysed separately.

We’ll start with the data from site 1, following the strategy given in the analysis of frequency data chapter. In order to carry out a \(\chi^{2}\) contingency table test in R we need to first convert the data into a table using the xtabs function:

trap.counts.tab <- xtabs(Count ~ Species + TrapScent, data = trap.counts)
trap.counts.tab
##              TrapScent
## Species       Clean Guinea-pig Stoat
##   Apodemus       49         50    61
##   M. agrestis    55         65    25

Since we already had the total counts in each combination of categories, this doesn’t change the actual data in any way—it just converts it from a data frame to a table for us. The next step uses the chisq.test function to carry out the \(\chi^{2}\) contingency table test using the resulting table:

chisq.test(trap.counts.tab)
## 
##  Pearson's Chi-squared test
## 
## data:  trap.counts.tab
## X-squared = 16.675, df = 2, p-value = 0.0002394

This indicates that there is a significant difference in the way in which wood mice and common voles respond to scent-marked traps.

This answers the main question at site 1. What else might we check using the site 1 data? We could evaluate whether there is evidence of avoidance of guinea-pig scent, compared to the other control. Since the \(\chi^{2}\) contingency table test was significant, we should do this for each species separately—i.e. they respond to traps differently so we can’t aggregate the data across the two species.

We need to assess whether, for each species, there is a difference in the frequency with which they are caught in the two control traps. If there is no difference, we should expect the numbers in the Clean and Guinea-pig categories to be the same. Therefore, since we have a prior expectation of the counts, we need to carry out \(\chi^{2}\) goodness of fit test on pairs of Clean and Guinea-pig counts for each species. We can do this as follows…

# Apodemus
chisq.test(c(49, 50))
## 
##  Chi-squared test for given probabilities
## 
## data:  c(49, 50)
## X-squared = 0.010101, df = 1, p-value = 0.9199
# M. agrestis
chisq.test(c(55, 65))
## 
##  Chi-squared test for given probabilities
## 
## data:  c(55, 65)
## X-squared = 0.83333, df = 1, p-value = 0.3613

Remember, if we don’t supply the expected proportions in each category the chisq.test function assumes equal numbers are expected. These tests indicates that the data are consistent with the prediction of equal counts (neither p-value was significant).

(1b) What is the purpose of having the guinea-pig marked treatment?

The guinea pig scent is a control for the effect of any novel scent-–-to see whether the response of the voles is just one of avoiding any unknown scent, rather than stoat scent in particular.

(1c) Does M. arvalis respond to Stoat scent?

What about the Site 2 experiment? The best we can do here is to carry out another \(\chi^{2}\) goodness of fit test on the data, using an expectation of equal counts in each trap:

chisq.test(c(53, 28))
## 
##  Chi-squared test for given probabilities
## 
## data:  c(53, 28)
## X-squared = 7.716, df = 1, p-value = 0.005473

This indicates that the counts are significantly different from the 50:50 expectation, with M. arvalis preferring clean traps over stoat marked traps. This is despite being found on an island where stoats are absent.

Example conclusions…

Overall, the results of these two experiments suggest that M. arvalis and M. agrestis respond to stoat scent in a similar way.

There is a significant difference in the way in which wood mice and common voles respond to scent-marked traps at site 1 (contingency table: \(\chi^{2}\) = 16.7, d.f.= 2, p<0.001), with voles showing a tendency to avoid of stoat-scented traps, and mice showing no such pattern, in fact being caught rather more often in stoat-scented traps. There was no evidence of avoidance of guinea-pig scent. There is a significant difference in the proportion of voles captured in stoat-scented traps and control traps at site 2 (\(\chi^{2}\) = 7.7, d.f.= 1, p<0.01), with voles showing a tendency to avoid of stoat-scented traps.

2. Scent avoidance

Once again, we needed to get the data into R, using an appropriate format. The data should be in ‘long’ format and need to be stored in a data frame. Since there is only one variable to worry about—which we will call Time—the resulting data frame is very simple:

arena
##    Time
## 1   4.1
## 2   0.1
## 3   0.3
## 4   5.7
## 5   1.3
## 6   0.6
## 7   2.1
## 8   0.4
## 9   1.4
## 10  0.8

(We called the data frame arena). It may seem like overkill to make a data frame to store just one variable, but remember, dplyr and ggplot2 are designed to work with data frames. We need to use both of these packages in this example.

The Time variable is a numeric, ratio scale variable. It is best to visualise the distribution of these kinds of data before picking an analysis. We’ll use a dot plot:

ggplot(arena, aes(x = Time)) + geom_dotplot(binwidth = 1)

It seems like the voles prefer not to spend much time on the stoat-marked side of the arena. Although there isn’t a lot of data here, it also looks like the data look like they may not be normally distributed—many of the observations are clustered between 0 and 2, with a few ‘extreme’ values (the data are right-skew).

We are not in a position to think about how to analyse the data….

(2a) Do voles avoid the stoat marked areas?

In order to answer this question we need to think about what would have happened if there were no effect of stoat scent (i.e. what is the appropriate null hypothesis?). In the absence of an effect the two sides of the arena are identical, and so we expect that the voles will spend 5 minutes in each half. This suggest that we should compare the mean or median of the data to an expectation of 5 minutes. We therefore are two options, depending on the distribution of the data…

Option 1: one sample t-test

If the data are normally distributed we can use a one-sample t-test to compare the mean of the sample to the prediction of 5 minutes. Since we suspect that the are not normal, we have to transform them first before using this approach. The log transformation can sometimes be used when data exhibit right-skew and there are no zeros or negative values. Let’s first look at the distribution after log-transformation…

ggplot(arena, aes(x = log10(Time))) + geom_dotplot(binwidth = .4)

This looks much better than the raw data, in the sense that the right-skew is no longer apparent. Notice that we just did the transformation inside ggplot. If we want to actually use the transformation we need to create a new variable with the mutate function…

arena <- mutate(arena, LogTime = log10(Time))
arena
##    Time     LogTime
## 1   4.1  0.61278386
## 2   0.1 -1.00000000
## 3   0.3 -0.52287875
## 4   5.7  0.75587486
## 5   1.3  0.11394335
## 6   0.6 -0.22184875
## 7   2.1  0.32221929
## 8   0.4 -0.39794001
## 9   1.4  0.14612804
## 10  0.8 -0.09691001

…remembering to give the new variable a sensible name and to overwrite the original data frame.

How do we compare the mean of the transformed sample to a value of 5? There are two things to keep in mind. First, since we transformed the data we have to compare the mean to the log of 5. It does not make sense to compare things on different scales. Second, by default, the t.test function (which we use to carry out t-tests) compares means to an expected value of 0. This means that we should subtract a value of log10(5) from the transformed data if we plan to use this default. Again, we can use mutate to do this for us…

arena <- mutate(arena, LogTimeDiff = LogTime - log10(5))
arena
##    Time     LogTime LogTimeDiff
## 1   4.1  0.61278386 -0.08618615
## 2   0.1 -1.00000000 -1.69897000
## 3   0.3 -0.52287875 -1.22184875
## 4   5.7  0.75587486  0.05690485
## 5   1.3  0.11394335 -0.58502665
## 6   0.6 -0.22184875 -0.92081875
## 7   2.1  0.32221929 -0.37675071
## 8   0.4 -0.39794001 -1.09691001
## 9   1.4  0.14612804 -0.55284197
## 10  0.8 -0.09691001 -0.79588002

…remembering to create a new variable.

Carrying out the one-sample t-test is then just a matter of pulling out the variable from the data frame (using $) and passing this to the t.test function:

t.test(arena$LogTimeDiff)
## 
##  One Sample t-test
## 
## data:  arena$LogTimeDiff
## t = -4.3096, df = 9, p-value = 0.001963
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -1.1098814 -0.3457842
## sample estimates:
##  mean of x 
## -0.7278328

This indicates that common voles spent significantly less time in the stoat scent marked side of the arena than in the unmarked side.

There is another way to carry out the t-test without first subtracting the value of log10(5). If you had looked at the help for t.test, you would have seen an argument called mu. We can just set this to the required value…

t.test(arena$LogTime, mu = log10(5))
## 
##  One Sample t-test
## 
## data:  arena$LogTime
## t = -4.3096, df = 9, p-value = 0.001963
## alternative hypothesis: true mean is not equal to 0.69897
## 95 percent confidence interval:
##  -0.4109114  0.3531858
## sample estimates:
##   mean of x 
## -0.02886281

…remembering to use the correct variable (log transformed, but without the subtraction of log10(5)). The result of this is the same as above.

Example conclusions…

Common voles spent significantly less time in the stoat scent marked side of the arena than in the unmarked side (one sample t-test (log10-transformed data): t=4.31, n=10, p=0.002). [you could have d.f. instead of n]

Solution 2

Since the data are not normally distributed, we could have just skipped all the transformations and used a non-parametric test: the Wilcoxon one-sample test. We introduced this in the course in the context of a paired sample test (remember, a paired-sample test is really just a one-sample test). Here is how to do this in R:

wilcox.test(arena$Time, mu = 5)
## 
##  Wilcoxon signed rank test
## 
## data:  arena$Time
## V = 1, p-value = 0.003906
## alternative hypothesis: true location is not equal to 5

Again, this indicates that common voles spent significantly less time in the stoat scent marked side of the arena than in the unmarked side.

Example conclusions…

Common voles spent significantly less time in the stoat scent marked side of the arena than in the unmarked side (Wilcoxon test: W=1, n=10, p=0.004). [Depending which way round we coded the differences we may also get W=54, but everything else is the same.]

Final note…

We used two-tailed tests in the above analyses. However, we could actually make a justifiable a priori prediction about the direction of the effect—voles spend less time in the stoat-scented side—which means that it would have been valid to use a one-tailed test. You would not lose marks for doing this.

3. Behavioural responses to scent marking

To begin, we needed to read the data into an R data frame in ‘long’ format:

behaviour
##       Species      Scent Activity
## 1  M.agrestis      Clean       84
## 2  M.agrestis      Clean      100
## 3  M.agrestis      Clean       91
## 4  M.agrestis      Clean      112
## 5  M.agrestis      Clean       96
## 6  M.agrestis      Stoat       61
## 7  M.agrestis      Stoat       30
## 8  M.agrestis      Stoat       43
## 9  M.agrestis      Stoat       24
## 10 M.agrestis      Stoat       66
## 11 M.agrestis Guinea-pig       82
## 12 M.agrestis Guinea-pig       71
## 13 M.agrestis Guinea-pig       87
## 14 M.agrestis Guinea-pig       55
## 15 M.agrestis Guinea-pig       94
## 16 M. arvalis      Clean       62
## 17 M. arvalis      Clean       77
## 18 M. arvalis      Clean      100
## 19 M. arvalis      Clean       90
## 20 M. arvalis      Clean       59
## 21 M. arvalis      Stoat       54
## 22 M. arvalis      Stoat       60
## 23 M. arvalis      Stoat       36
## 24 M. arvalis      Stoat       46
## 25 M. arvalis      Stoat       32
## 26 M. arvalis Guinea-pig       78
## 27 M. arvalis Guinea-pig       62
## 28 M. arvalis Guinea-pig       66
## 29 M. arvalis Guinea-pig       72
## 30 M. arvalis Guinea-pig       48

We called the data frame behaviour. This has three columns: 1) a column of species names, 2) a column of scent treatments, and 3) a column of activity counts, called Species, Scent and Activity, respectively.

The Activity variable is the dependent variable here. This is a ratio scale numeric variable. The two other variables—Species and Scent—are the independent variables (treatments). These are categorical. We should make a quick plot to visualise the distribution of Activity in each treatment combination:

ggplot(behaviour, aes(x = Activity)) + 
  geom_dotplot(binwidth = 10) + facet_grid(Species ~ Scent)

This plot suggests that the vole activity is somewhat lower in the presence of stoat scent, and M agrestis is perhaps a little more active, though there is considerable overlap among the distributions. There is nothing to indicate that the data are non-normal, though it is difficult to assess this with so few observations in each treatment combination.

(3a) Is there any evidence that the amount of activity of the voles is influenced by the scent treatment in the enclosure, do the species respond differently, if so?

In order to answer this question we need to understand whether the mean activity is different among the three treatments, and whether this difference depends on the species identity. Since we want to compare means among more than two samples, classified by two factors, we need to work with a two-way ANOVA. The question about whether the amount of activity of the voles is influenced by the scent treatment relates to the main effect of Scent. The question about whether difference depends on the species identity relates the the interaction between Scent and Species.

The analysis is a two step process. In the first step we use the lm function to fit the two-way ANOVA…

activity.model <- lm(Activity ~ Scent + Species + Scent:Species, data = behaviour)

…remembering to include the interaction. Next, we use the anova function to print out the ANOVA table containing the statistical tests of main effects and interactions:

anova(activity.model)
## Analysis of Variance Table
## 
## Response: Activity
##               Df Sum Sq Mean Sq F value    Pr(>F)    
## Scent          2 8968.9  4484.4  21.295 4.804e-06 ***
## Species        1  790.5   790.5   3.754   0.06453 .  
## Scent:Species  2  510.5   255.2   1.212   0.31517    
## Residuals     24 5054.0   210.6                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

This indicates that there was no significant difference in overall activity levels between the two vole species, but activity levels for both species differed significantly between treatments. The interaction between species and scent treatment is not significant.

Since the effect of Scent is significant, but the interaction between Scent and Species is not, we can use a multiple comparisons test of just differences between just the Scent means. First we need to load and attach the agricolae package…

library(agricolae)

…then we use the HSD.test function to carry out the required multiple comparisons test:

HSD.test(activity.model, "Scent", console=TRUE)
## 
## Study: activity.model ~ "Scent"
## 
## HSD Test for Activity 
## 
## Mean Square Error:  210.5833 
## 
## Scent,  means
## 
##            Activity      std  r Min Max
## Clean          87.1 16.95386 10  59 112
## Guinea-pig     71.5 14.31588 10  48  94
## Stoat          45.2 14.61962 10  24  66
## 
## Alpha: 0.05 ; DF Error: 24 
## Critical Value of Studentized Range: 3.531697 
## 
## Minimun Significant Difference: 16.20673 
## 
## Treatments with the same letter are not significantly different.
## 
##            Activity groups
## Clean          87.1      a
## Guinea-pig     71.5      a
## Stoat          45.2      b

This indicates that the two control means (‘Clean’ and ‘Guinea-pig’) are not significantly different from one another, but that the stoat scented treatment is significantly lower than the two controls, as we would expect.

Finally, it is a good idea to use regression diagnostics to check the model. The only one of these that might cause concern is the normal probability plot:

plot(activity.model, which = 2)

This evaluates the normality assumption. The larger positive values tend to be less positive than they should be, and the largest (in magnitude) negative values tend to be less negative than they should be. This indicates that there is some departure form normality, such that the tails of the distribution are ‘squashed’ toward the middle. This non-normality isn’t too bad, but it isn’t ideal either, and in a real analysis you might want to comment on it.

Example conclusions…

There was no significant difference in overall activity levels between the two vole species (ANOVA F=3.75, d.f.=1,24, p=0.065), or interaction between species and scent treatment (F=1.21, d.f.=2,24, p=0.31), but activity levels for both species differed significantly between treatments (F=21.3, d.f.=2,24, p<0.001), being significantly lower in stoat scented treatments than either clean, or guinea-pig scented treatments, which did not differ from each other (Tukey multiple comparison test, p<0.05).

(3b) What do these results, along with those of section 1, suggest about the biological basis of the two vole species’ responses to stoat scent?

This result indicates that although there are no stoats on Orkney, the Orkney voles still show avoidance of predator (stoat) scent, suggesting that the stoat scent avoidance is an evolved rather than a learned response.

4. Time-dependent scent effects

We began by reading the data into an R data frame in ‘long’ format:

time.effects
##    ScentAge Activity
## 1         0       17
## 2         0       45
## 3         1       41
## 4         1       70
## 5         3       22
## 6         3       60
## 7         7       89
## 8         7       60
## 9        10       62
## 10       10       56
## 11       14      114
## 12       14       78

We called the data frame time.effects. This has two columns: 1) a column of containing the ‘age’ of the scent marked enclosures and 2) a column of activity counts, called ScentAge and Activity, respectively.

The Activity variable is the dependent variable, and the ScentAge variable is the independent variable (i.e. we can reasonably consider Activity to depended upon ScentAge). Both are a ratio scale numeric variables. We should make a quick scatter plot to visualise the relationship between the two variables, remembering to place Activity on the y axis:

ggplot(time.effects, aes(x = ScentAge, y = Activity)) + 
  geom_point() 

This plot clearly suggests that the vole activity is higher lower when the stoat scent is older. The relationship looks to be approximately linear.

(4a) Is there any effect of scent age on the responses of the voles?

Since we are interested in whether or not there is a relationship between the two variables, and we also need to predict the activity (next question), we require a linear regression analysis. This is a two step process. In the first step we use the lm function to fit the regression model…

activity.reg <- lm(Activity ~ ScentAge, data = time.effects)

…remembering to include the interaction. Next, we use the anova function to print out the ANOVA table containing the statistical test of the effect of ScentAge on Activity:

anova(activity.reg)
## Analysis of Variance Table
## 
## Response: Activity
##           Df Sum Sq Mean Sq F value  Pr(>F)  
## ScentAge   1   4033  4033.0  9.9726 0.01019 *
## Residuals 10   4044   404.4                  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

This indicates that there is significant effect of ScentAge on Activity, i.e., the slope of the relationship is significantly different from 0.

The regression diagnostics are good. There doesn’t seem to be any non-linearity on the relationship:

plot(activity.reg, which = 1)

…the normality assumption seems to be met:

plot(activity.reg, which = 2)

…and the scale-location plot indicates that the the variance is constant:

plot(activity.reg, which = 3)

The last thing we might want to do is extract the coefficients of the model using the summary function:

summary(activity.reg)
## 
## Call:
## lm(formula = Activity ~ ScentAge, data = time.effects)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -27.140 -14.235  -2.297  14.305  28.172 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept)   38.171      8.906   4.286   0.0016 **
## ScentAge       3.656      1.158   3.158   0.0102 * 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 20.11 on 10 degrees of freedom
## Multiple R-squared:  0.4993, Adjusted R-squared:  0.4492 
## F-statistic: 9.973 on 1 and 10 DF,  p-value: 0.01019

This tells us that the intercept of the best fit line is 38.2, and the slope is 3.66. These can be used to make predictions.

Example conclusions…

There was a significant positive relationship between the age of the scent mark and the activity level of voles (activity = 38.2 + 3.66 age, F=10.0, d.f.=1,10, p=0.01).

(4b) What activity (grid lines crossed) would you predict to occur in an arena with scent marks 3 weeks old? Is this reasonable?

After three weeks (21 days) the relationship predicts that vole activity will be about 115 gridlines crossed in 20 minutes. This is a little larger than the mean activity level shown by voles (M. agrestis) in the clean traps in the previous experiment (96 lines in 20 min), so it seems likely that the effect of scent has more or less worn off by this point and this extrapolation beyond the data is probably rather questionable.