LME code

Version: 2012-12-17 09:01:30

This is a little bit on the stream-of-consciousness side; it's an attempt to figure out why some of the values in an ANOVA table are exactly 0/1. I'm leaving it a bit messy so it's easier to see the thought process of the diagnosis. The bottom line: it turns out that there was a glitch in the averaging process somewhere, so that the average standard length for males and females is identical within each site-year combination. Thus the sum of squares, \( F \) statistic, etc. are exactly zero for any term involving sex (obvious in hindsight!) I have some versions of the full (disaggregated) data, but I'm not sure I have the latest version, so I'll leave it to Rachel to re-aggregate the data.

Read and process data: remove NA values, convert year to cyear (i.e. starting at zero: see Schielzeth 2010 Methods in Ecology and Evolution for context).

SLtabLMS <- read.csv("SLtabLMS.csv")
SLtabLMS2 <- na.omit(SLtabLMS)
SLtabLMS2$cyear <- SLtabLMS2$year - min(SLtabLMS2$year)
library(nlme)
library(car)  ## for contr.Sum() -- same as contr.sum() but with better names

Specify sum-to-zero contrasts below for both sex and sitetype although we don't really need them since much we will be 'stripping down' the model and not looking much at main effect terms in the presence of interactions …

m1 <- lme(SL.mean ~ sex * cyear * sitetype, random = ~year | site, data = SLtabLMS2, 
    contrasts = list(sex = contr.Sum, sitetype = contr.Sum))

Default ANOVA (anova.lme): why do we get \( F \) values of exactly zero ???

anova(m1, test = "F")
##                    numDF denDF F-value p-value
## (Intercept)            1    98  2542.1  <.0001
## sex                    1    98     0.0  1.0000
## cyear                  1    98    48.8  <.0001
## sitetype               1     4     8.6  0.0424
## sex:cyear              1    98     0.0  1.0000
## sex:sitetype           1    98     0.0  1.0000
## cyear:sitetype         1    98     2.2  0.1395
## sex:cyear:sitetype     1    98     0.0  1.0000

Using drop1? (Same weirdness.)

m1B <- update(m1, method = "ML")
drop1(m1B, . ~ ., test = "Chisq")
## Single term deletions
## 
## Model:
## SL.mean ~ sex * cyear * sitetype
##                    Df AIC       LRT Pr(>Chi)
## <none>                252                   
## sex                 1 250 -1.53e-10        1
## cyear               0 252  1.26e-09         
## sitetype            0 252 -1.70e-10         
## sex:cyear           0 252  7.21e-10         
## sex:sitetype        1 250 -1.69e-10        1
## cyear:sitetype      0 252 -1.61e-10         
## sex:cyear:sitetype  1 250 -1.11e-10        1

Is it the random effects? Try a regular linear model.

m1C <- lm(SL.mean ~ sex * cyear * sitetype, data = SLtabLMS2, contrasts = list(sex = contr.Sum, 
    sitetype = contr.Sum))
anova(m1C)
## Analysis of Variance Table
## 
## Response: SL.mean
##                     Df Sum Sq Mean Sq F value  Pr(>F)    
## sex                  1    0.0     0.0     0.0    1.00    
## cyear                1   31.8    31.8    60.1 7.1e-12 ***
## sitetype             1   14.1    14.1    26.7 1.2e-06 ***
## sex:cyear            1    0.0     0.0     0.0    1.00    
## sex:sitetype         1    0.0     0.0     0.0    1.00    
## cyear:sitetype       1    1.1     1.1     2.0    0.16    
## sex:cyear:sitetype   1    0.0     0.0     0.0    1.00    
## Residuals          102   53.9     0.5                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Looking at the data – nothing seems to be too weird. The data are unbalanced (2 contaminated vs 4 clean sites, contaminated sites only sampled in 6/11 years), but they aren't completely unbalanced …

(ctab1 <- with(SLtabLMS2, table(cyear, sitetype, sex)))
## , , sex = Female
## 
##      sitetype
## cyear clean contam
##    0      4      0
##    1      3      0
##    2      4      0
##    3      4      0
##    4      4      2
##    5      4      2
##    6      4      2
##    7      4      0
##    8      4      2
##    9      4      2
##    10     4      2
## 
## , , sex = Male
## 
##      sitetype
## cyear clean contam
##    0      4      0
##    1      3      0
##    2      4      0
##    3      4      0
##    4      4      2
##    5      4      2
##    6      4      2
##    7      4      0
##    8      4      2
##    9      4      2
##    10     4      2
## plot(ctab1) ## ugly

Draw a picture: use position_dodge to separate the male and female points horizontally (it was hard to see when they were overplotted on the same x-y location).

library(ggplot2)
library(grid)
zmargin <- theme(panel.margin = unit(0, "lines"))
theme_set(theme_bw())
ggplot(SLtabLMS2, aes(x = cyear, y = SL.mean, colour = sex)) + geom_point(aes(shape = sex), 
    position = position_dodge(width = 0.5)) + facet_grid(. ~ sitetype) + zmargin

plot of chunk plot3

Further stuff

Not worth proceeding much further at this point.

m2 <- update(m1, . ~ . - cyear:sex:sitetype)
anova(m2)
##                numDF denDF F-value p-value
## (Intercept)        1    99  2541.5  <.0001
## sex                1    99     0.0  1.0000
## cyear              1    99    49.2  <.0001
## sitetype           1     4     8.7  0.0423
## sex:cyear          1    99     0.0  1.0000
## sex:sitetype       1    99     0.0  1.0000
## cyear:sitetype     1    99     2.2  0.1375
## no evidence of interactions ... (although weak year:sitetype)
m3 <- update(m1, . ~ sex + cyear + sitetype)
anova(m3, type = "marginal")
##             numDF denDF F-value p-value
## (Intercept)     1   102  1445.4  <.0001
## sex             1   102     0.0  1.0000
## cyear           1   102    43.1  <.0001
## sitetype        1     4     8.6  0.0425
plot(m3)

plot of chunk unnamed-chunk-3

## some tendency of increased variance
plot(m3, form = sqrt(abs(resid(.))) ~ fitted(.), type = c("p", "smooth"))

plot of chunk unnamed-chunk-3


## consider log transform, although that might be too strong (box-cox
## transformation?)


m4 <- lme(log10(SL.mean) ~ sex * cyear * sitetype, random = ~year | site, data = SLtabLMS2, 
    contrasts = list(sex = contr.Sum, sitetype = contr.Sum))
anova(m4, type = "marginal")
##                    numDF denDF F-value p-value
## (Intercept)            1    98  2581.5  <.0001
## sex                    1    98     0.0  1.0000
## cyear                  1    98     9.8  0.0023
## sitetype               1     4     7.1  0.0557
## sex:cyear              1    98     0.0  1.0000
## sex:sitetype           1    98     0.0  1.0000
## cyear:sitetype         1    98     1.4  0.2394
## sex:cyear:sitetype     1    98     0.0  1.0000
## hmm. main-effect p-value is non-sig here (just >0.05), so we are
## snooping a *little* bit if we decide to strip it down a bit as follows
## ...

m5 <- lme(log(SL.mean) ~ sex + cyear + sitetype, random = ~year | site, data = SLtabLMS2)
plot(m5, form = sqrt(abs(resid(.))) ~ fitted(.), type = c("p", "smooth"))

plot of chunk unnamed-chunk-3

anova(m5, type = "marginal")
##             numDF denDF F-value p-value
## (Intercept)     1   102    4747  <.0001
## sex             1   102       0  1.0000
## cyear           1   102      40  <.0001
## sitetype        1     4       9  0.0412