Version: 2012-12-17 09:01:30
This is a little bit on the stream-of-consciousness side; it's an attempt to figure out why some of the values in an ANOVA table are exactly 0/1. I'm leaving it a bit messy so it's easier to see the thought process of the diagnosis. The bottom line: it turns out that there was a glitch in the averaging process somewhere, so that the average standard length for males and females is identical within each site-year combination. Thus the sum of squares, \( F \) statistic, etc. are exactly zero for any term involving sex (obvious in hindsight!) I have some versions of the full (disaggregated) data, but I'm not sure I have the latest version, so I'll leave it to Rachel to re-aggregate the data.
Read and process data: remove NA values, convert year to
cyear (i.e. starting at zero: see Schielzeth 2010 Methods in Ecology and Evolution for context).
SLtabLMS <- read.csv("SLtabLMS.csv")
SLtabLMS2 <- na.omit(SLtabLMS)
SLtabLMS2$cyear <- SLtabLMS2$year - min(SLtabLMS2$year)
library(nlme)
library(car) ## for contr.Sum() -- same as contr.sum() but with better names
Specify sum-to-zero contrasts below for both sex and
sitetype although we don't really
need them since much we will be 'stripping down' the model and
not looking much at main effect terms in the presence
of interactions …
m1 <- lme(SL.mean ~ sex * cyear * sitetype, random = ~year | site, data = SLtabLMS2,
contrasts = list(sex = contr.Sum, sitetype = contr.Sum))
Default ANOVA (anova.lme): why do we get \( F \) values of
exactly zero ???
anova(m1, test = "F")
## numDF denDF F-value p-value
## (Intercept) 1 98 2542.1 <.0001
## sex 1 98 0.0 1.0000
## cyear 1 98 48.8 <.0001
## sitetype 1 4 8.6 0.0424
## sex:cyear 1 98 0.0 1.0000
## sex:sitetype 1 98 0.0 1.0000
## cyear:sitetype 1 98 2.2 0.1395
## sex:cyear:sitetype 1 98 0.0 1.0000
Using drop1? (Same weirdness.)
m1B <- update(m1, method = "ML")
drop1(m1B, . ~ ., test = "Chisq")
## Single term deletions
##
## Model:
## SL.mean ~ sex * cyear * sitetype
## Df AIC LRT Pr(>Chi)
## <none> 252
## sex 1 250 -1.53e-10 1
## cyear 0 252 1.26e-09
## sitetype 0 252 -1.70e-10
## sex:cyear 0 252 7.21e-10
## sex:sitetype 1 250 -1.69e-10 1
## cyear:sitetype 0 252 -1.61e-10
## sex:cyear:sitetype 1 250 -1.11e-10 1
Is it the random effects? Try a regular linear model.
m1C <- lm(SL.mean ~ sex * cyear * sitetype, data = SLtabLMS2, contrasts = list(sex = contr.Sum,
sitetype = contr.Sum))
anova(m1C)
## Analysis of Variance Table
##
## Response: SL.mean
## Df Sum Sq Mean Sq F value Pr(>F)
## sex 1 0.0 0.0 0.0 1.00
## cyear 1 31.8 31.8 60.1 7.1e-12 ***
## sitetype 1 14.1 14.1 26.7 1.2e-06 ***
## sex:cyear 1 0.0 0.0 0.0 1.00
## sex:sitetype 1 0.0 0.0 0.0 1.00
## cyear:sitetype 1 1.1 1.1 2.0 0.16
## sex:cyear:sitetype 1 0.0 0.0 0.0 1.00
## Residuals 102 53.9 0.5
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Looking at the data – nothing seems to be too weird. The data are unbalanced (2 contaminated vs 4 clean sites, contaminated sites only sampled in 6/11 years), but they aren't completely unbalanced …
(ctab1 <- with(SLtabLMS2, table(cyear, sitetype, sex)))
## , , sex = Female
##
## sitetype
## cyear clean contam
## 0 4 0
## 1 3 0
## 2 4 0
## 3 4 0
## 4 4 2
## 5 4 2
## 6 4 2
## 7 4 0
## 8 4 2
## 9 4 2
## 10 4 2
##
## , , sex = Male
##
## sitetype
## cyear clean contam
## 0 4 0
## 1 3 0
## 2 4 0
## 3 4 0
## 4 4 2
## 5 4 2
## 6 4 2
## 7 4 0
## 8 4 2
## 9 4 2
## 10 4 2
## plot(ctab1) ## ugly
Draw a picture: use position_dodge to separate the male and female points horizontally (it was hard to see when they were overplotted on the same x-y location).
library(ggplot2)
library(grid)
zmargin <- theme(panel.margin = unit(0, "lines"))
theme_set(theme_bw())
ggplot(SLtabLMS2, aes(x = cyear, y = SL.mean, colour = sex)) + geom_point(aes(shape = sex),
position = position_dodge(width = 0.5)) + facet_grid(. ~ sitetype) + zmargin
Not worth proceeding much further at this point.
m2 <- update(m1, . ~ . - cyear:sex:sitetype)
anova(m2)
## numDF denDF F-value p-value
## (Intercept) 1 99 2541.5 <.0001
## sex 1 99 0.0 1.0000
## cyear 1 99 49.2 <.0001
## sitetype 1 4 8.7 0.0423
## sex:cyear 1 99 0.0 1.0000
## sex:sitetype 1 99 0.0 1.0000
## cyear:sitetype 1 99 2.2 0.1375
## no evidence of interactions ... (although weak year:sitetype)
m3 <- update(m1, . ~ sex + cyear + sitetype)
anova(m3, type = "marginal")
## numDF denDF F-value p-value
## (Intercept) 1 102 1445.4 <.0001
## sex 1 102 0.0 1.0000
## cyear 1 102 43.1 <.0001
## sitetype 1 4 8.6 0.0425
plot(m3)
## some tendency of increased variance
plot(m3, form = sqrt(abs(resid(.))) ~ fitted(.), type = c("p", "smooth"))
## consider log transform, although that might be too strong (box-cox
## transformation?)
m4 <- lme(log10(SL.mean) ~ sex * cyear * sitetype, random = ~year | site, data = SLtabLMS2,
contrasts = list(sex = contr.Sum, sitetype = contr.Sum))
anova(m4, type = "marginal")
## numDF denDF F-value p-value
## (Intercept) 1 98 2581.5 <.0001
## sex 1 98 0.0 1.0000
## cyear 1 98 9.8 0.0023
## sitetype 1 4 7.1 0.0557
## sex:cyear 1 98 0.0 1.0000
## sex:sitetype 1 98 0.0 1.0000
## cyear:sitetype 1 98 1.4 0.2394
## sex:cyear:sitetype 1 98 0.0 1.0000
## hmm. main-effect p-value is non-sig here (just >0.05), so we are
## snooping a *little* bit if we decide to strip it down a bit as follows
## ...
m5 <- lme(log(SL.mean) ~ sex + cyear + sitetype, random = ~year | site, data = SLtabLMS2)
plot(m5, form = sqrt(abs(resid(.))) ~ fitted(.), type = c("p", "smooth"))
anova(m5, type = "marginal")
## numDF denDF F-value p-value
## (Intercept) 1 102 4747 <.0001
## sex 1 102 0 1.0000
## cyear 1 102 40 <.0001
## sitetype 1 4 9 0.0412