Goals

The ideal model would allow for interactions of dose and sex with the growth curve (= smooth effect of week), and for random variation among pups (and mothers) in the growth curve. Presumably-interesting biological questions are:

Preliminaries

Load packages:

## built-in packages (no need to install)
library(mgcv)
## use install.packages() (or RStudio menu) for these
##  fitting
## data manipulation
library(plyr)
library(dplyr)
## graphics, diagnostics, etc.
library(ggplot2); theme_set(theme_bw())
library(cowplot) ## plot_grid
library(bbmle) ## for AICtab
library(broom)
library(dotwhisker)
library(emmeans)
library(pander)

Read in data & make preliminary plot:

dd <- read.csv("DEPRE.weights.vertical.csv")
## re-order dose properly
dd <- transform(dd,
                dose=factor(dose,
                     levels=c("Sham",paste0(c(50,300,1000),"mGy"))),
                ## factors and interactions
                mother=factor(mother),
                dose_sex=interaction(dose,sex),
                mother_id=interaction(mother,id,drop=TRUE))

dd <- subset(dd,select=c(mother,id,mother_id,week,dose,sex,dose_sex,weight))
pups_per_mum <- with(unique(subset(dd,select=c(mother,id))),rowSums(table(mother,id)))
summary(subset(dd,select=-c(mother_id,dose_sex)))
##      mother          id           week         dose         sex     
##  108    : 52   A184-1 : 13   Min.   : 4   Sham   :182   female:416  
##  124    : 52   A198-1 : 13   1st Qu.: 7   50mGy  :221   male  :403  
##  140    : 52   A199-1 : 13   Median :10   300mGy :221               
##  145    : 52   A199-2 : 13   Mean   :10   1000mGy:195               
##  178    : 52   A222-1 : 13   3rd Qu.:13                             
##  188    : 52   A222-2 : 13   Max.   :16                             
##  (Other):507   (Other):741                                          
##      weight     
##  Min.   :12.00  
##  1st Qu.:18.10  
##  Median :20.20  
##  Mean   :21.09  
##  3rd Qu.:24.50  
##  Max.   :31.50  
## 

There are 23 mothers and 63 pups (from 1 to 4 pups per mother). Mothers are “nested” within dose (each mother, and all of her pups, have a single dose treatment). mother_id and dose_sex (not shown) are auxiliary variables that encode interactions. Doses are measured in milliGrays (except for “Sham”, i.e. sham treatment with no radiation).

Each line represents one pup:

gg0 <- ggplot(dd,aes(week,weight,colour=mother))+
    geom_point()+
    facet_wrap(~dose,labeller=label_both)
print(gg0+geom_line(aes(linetype=sex,group=id)))

Or (dropping information about mothers from the plot, plotting mean values for sex/dose/week combinations):

gg1 <- ggplot(dd,aes(week,weight,colour=dose,shape=dose))+
    geom_point()+
    facet_wrap(~sex,scale="free")+
    scale_colour_brewer(palette="Dark2")
print(gg1+
      stat_summary(fun.y=mean,geom="line",lwd=2,alpha=0.8)+
      geom_line(aes(group=id),alpha=0.2))

Since it seems hard/finicky to come up with and fit a nonlinear mixed-effects model (e.g. coming up with a simple, sensible nonlinear function that starts out decelerating but transitions to linear growth rather than saturating to a constant value), we’ll proceed with generalized additive mixed models.

model possibilities

Many growth curve fits: data file with fits available here.

L <- load("growthcurves_batch.RData")

All models are fitted with mgcv::gam() using method="REML", with sex using sum-to-zero contrasts, e.g.

gam(weight~sex*dose+s(week,by=dose_sex)+
      s(week,mother_id,bs='fs'),
    data=dd,
    method="REML")

(this is the most complex model fitted).

Using gamm4 would have been faster, but it didn’t seem worth the additional trouble for these purposes (our slowest model took 4.1 minutes on an 8-year-old desktop machine).

We include sex as a predictor in all models; this pattern (males are heavier than females and have a different growth curve) is biologically obvious/not interesting to us. In other words, “growth curves differ by sex but dose doesn’t have any clear effects” is our null model/hypothesis.

We have three options for the vertical shift of the curves (aka changes in intercept):

Three options for changes in the growth curve by group:

And three options for random effects:

For simplicity, I only fitted pup-within-mother random effects; I didn’t try to decompose the random effects into (mother + pup within mother). Preliminary analyses suggested that this wouldn’t make much difference/ we wouldn’t be able to detect a non-trivial level of variance among mothers with these data. Furthermore, when I tried to include both effects in the same model (sex+s(week,by=sex)+s(week,mother,bs='fs')+s(week,mother_id,bs='fs')) I got:

Error in gam(forms[[i]], data = dd, method = "REML") :
Model has more coefficients than data<br
Calls: system.time -> gam
In addition: Warning message:
In gam.side(sm, X, tol = .Machine$double.eps^0.5) :
  model has repeated 1-d smooths of same variable.

model examples

For illustration, let’s look at the predictions of some of these models.

plotfun <- function(model,exclude=NULL) {
    ## generate data for prediction
    pframe <- dd
    ## exclusion: see https://stats.stackexchange.com/questions/131106/predicting-with-random-effects-in-mgcv-gam
    pframe$weight <- predict(model,newdata=pframe,exclude=exclude)
    gg2 <- (ggplot(pframe,aes(week,weight,colour=dose,shape=dose))
        +         geom_point()
        + scale_colour_brewer(palette="Dark2")
        + geom_line(aes(group=id,linetype=sex))
    )
    return(gg2)
}

Simplest model: different curves by sex, all doses identical, no random effects.

plotfun(models[["sex+s(week,by=sex)+1"]])

Interactive effects of dose and sex on shift; curve shape varies only by sex: no random effects.

plotfun(models[["dose*sex+s(week,by=sex)+1"]])

Interactive effects of dose and sex on shift and curve shape: no random effects.

plotfun(models[["dose*sex+s(week,by=dose_sex)+1"]])

No effects of dose: random effects (shifts) by pup

plotfun(models[["sex+s(week,by=sex)+s(mother_id,bs='re')"]])

No effects of dose: random effects (curves) by pup

plotfun(models[["sex+s(week,by=sex)+s(week,mother_id,bs='fs')"]])

Most complex model (dose*sex + sex(week,by=dose_sex)+s(week,mother_id,bs='fs')):

plotfun(models[["dose*sex+s(week,by=dose_sex)+s(week,mother_id,bs='fs')"]])

Not sure what this means:

no_legend <- theme(legend.position="none")
ysc <- scale_y_continuous(limits=c(12,30))
small_sub <-
  theme(plot.subtitle=element_text(size=10))
dwid <- c(40,50) ## deparse width
noREmodel_ind <- which(names(models)=="dose*sex+s(week,by=dose_sex)+1")
fullmodel_ind <- which(names(models)=="dose*sex+s(week,by=dose_sex)+s(week,mother_id,bs='fs')")
gg3 <- plot_grid(
    plotfun(models[[fullmodel_ind]],
            exclude="s(week,mother_id)")
    +no_legend+ ysc + small_sub
    + ggtitle("full model, RE zeroed out",
         subtitle=paste(c(deparse(forms[[fullmodel_ind]],width=dwid[1]),
                          "predict(.,exclude='s(week,mother_id)')"),
                          collapse="\n")),
    plotfun(models[[noREmodel_ind]])
    + no_legend+ysc + small_sub
    + ggtitle("model with no RE",
            subtitle=paste(c(deparse(forms[[noREmodel_ind]],width=dwid[2]),
                "predict(.)"),
                           collapse="\n")),
    nrow=1)
print(gg3)

ggsave(gg3,file="modelcomp.png",width=8)

inference (1):

AICtab <- (ldply(models,glance,.id="model")
    %>% mutate(dAIC=AIC-min(AIC))  ## compute delta-AIC
    %>% select(model,dAIC,df)
    ## compute AIC 'weights'
    %>% mutate(weight=exp(-dAIC/2),
           weight=weight/sum(weight))
    %>% arrange(dAIC)
)

AIC results:

model dAIC df weight
sex+s(week,by=dose_sex)+s(week,mother_id,bs=‘fs’) 0 210.7 0.611
dose*sex+s(week,by=dose_sex)+s(week,mother_id,bs=‘fs’) 2 209.4 0.223
dose+sex+s(week,by=dose_sex)+s(week,mother_id,bs=‘fs’) 2.6 208.9 0.166
sex+s(week,by=dose)+s(week,by=sex)+s(week,mother_id,bs=‘fs’) 34.3 183.2 0
dose+sex+s(week,by=dose)+s(week,by=sex)+s(week,mother_id,bs=‘fs’) 37.1 181.5 0
dose*sex+s(week,by=dose)+s(week,by=sex)+s(week,mother_id,bs=‘fs’) 39.9 182.3 0

This suggests that the only models we really need to think about (ΔAIC<10) are those with the full set of random effects (curves varying by pup) and with curve shapes varying at the population level by dose and sex.

Classical null-hypothesis testing for the full model:

full_model <-  models[["dose*sex+s(week,by=dose_sex)+s(week,mother_id,bs='fs')"]]
anova(full_model)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## weight ~ dose * sex + s(week, by = dose_sex) + s(week, mother_id, 
##     bs = "fs")
## 
## Parametric Terms:
##          df      F  p-value
## dose      3  5.947 0.000532
## sex       1 75.632  < 2e-16
## dose:sex  3  0.120 0.948038
## 
## Approximate significance of smooth terms:
##                                    edf  Ref.df     F p-value
## s(week):dose_sexSham.female      6.790   7.893 20.16  <2e-16
## s(week):dose_sex50mGy.female     7.239   8.255 25.38  <2e-16
## s(week):dose_sex300mGy.female    7.907   8.685 18.07  <2e-16
## s(week):dose_sex1000mGy.female   6.716   7.829 23.75  <2e-16
## s(week):dose_sexSham.male        7.688   8.561 86.44  <2e-16
## s(week):dose_sex50mGy.male       7.562   8.481 94.83  <2e-16
## s(week):dose_sex300mGy.male      6.048   7.201 68.03  <2e-16
## s(week):dose_sex1000mGy.male     7.981   8.724 86.17  <2e-16
## s(week,mother_id)              143.436 614.000  8.46  <2e-16

This shows that the “shift” effect in the dose is clear (very clear effect of sex, effect of interaction is unclear). Not a useful comparison of curve effects (all we get is a list saying that all of the curves are clearly different from a flat line).

Model comparison: full model (interaction between dose and sex in curve shape) vs. model with additive effects of dose and sex on the shape of the growth curve

anova(full_model,
      models[["dose*sex+s(week,by=dose)+s(week,by=sex)+s(week,mother_id,bs='fs')"]],test="F")
## Analysis of Deviance Table
## 
## Model 1: weight ~ dose * sex + s(week, by = dose_sex) + s(week, mother_id, 
##     bs = "fs")
## Model 2: weight ~ dose * sex + s(week, by = dose) + s(week, by = sex) + 
##     s(week, mother_id, bs = "fs")
##   Resid. Df Resid. Dev      Df Deviance      F   Pr(>F)    
## 1    569.05     148.23                                     
## 2    600.52     166.24 -31.472  -18.011 2.3537 6.16e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Dot-whisker plot of shift-effects only: since sex is coded with sum-to-zero contrasts, the dose parameters represent the average across males and females of the difference between Sham and (50 mGy, 300 mGy, 1000 mGy). sex1 means the difference between females (level 1) and the average of males and females.

dwplot(tidy(full_model,parametric=TRUE))+
    geom_vline(linetype=2,xintercept=0)

Differences in curves are hard to summarize in terms of parameters.

To do/questions for experts

References