Modified from hybridInflo Biomass.Rmd (April 2017)

Guides

Bolker et al. 2009. Generalized linear mixed models: a practical guide for ecology and evolution. Trends in Ecology & Evolution.
Bolker. GLMM FAQ
Bolker et al. 2011. GLMMs in action: gene-by-environment interaction in total fruit production of wild populations of Arabidopsis thaliana. Part 1, Part 2
Bolker. In: Ecological Statistics: Contemporary Theory and Application
- GLMM:worked examples
- Reporting GLMM results
Bolker et al. 2013. Strategies for fitting nonlinear ecological models in R, AD Model Builder, and BUGS
- Bolker et al. 2012. Owls example: a zero-inflated, generalized linear mixed model for count data
- other examples from Non-Linear Model Working Group
A Practical Guide to Mixed Models in R
Schielzeth and Nakagawa 2012. Nested by design: model fitting and interpretation in a mixed model era. Methods in Ecology and Evolution.
Stroup. Rethinking the Analysis of Non-Normal Data in Plant and Soil Science
Elston et al. 2001. Analysis of aggregation, a worked example: Numbers of ticks on red grouse chicks
Annika Nelson. Ant GLMM Example
R package pscl vignette: Regression Models for Count Data in R

Purpose

Identify reproductive barriers between two sympatric moth-pollinated plant species, Schiedea kaalae and S. hookeri by fitting a generalized linear mixed model (GLMM).

In the experimental design, the following crosstypes were made:

within species, between population (may show outbreeding depression or heterosis)
within species, within populations (may show inbreeding depression)
hybrids between species (indicates species barrier from pollination to seed production)

In this analysis the response variable is the biomass of the offspring produced by each cross. Other barriers (hybrid survival, flowering) could be analyzed in a similar framework, with appropriate changes to the underlying distribution.

Fixed effects:

crosstype - hybrids, within population, between populations
species - species of the maternal plant that produced the Inflo Biomass

Potential random effects:

mompop - maternal plant population
mompid - maternal plant, specified by its population and ID
dadpop - paternal plant population

Data Import

ib <- read.table("inflobiomass.csv", header=T, sep="\t", 
                 colClasses=c(collect.date="Date", weigh.date="Date"))
crosses <- read.table("hybrids.csv", header=T, sep="\t", colClasses=c(mompop="factor", dadpop="factor"))

crosscol <- c("green","blue","orange","red")

#treat populations as factors
ib$mompop <- crosses$mompop[match(ib$crossid, crosses$crossid)]
ib$momid <- crosses$momid[match(ib$crossid, crosses$crossid)]
ib$species <- crosses$momsp[match(ib$crossid, crosses$crossid)]
ib$dadpop <- crosses$dadpop[match(ib$crossid, crosses$crossid)]
ib$dadid <- crosses$dadid[match(ib$crossid, crosses$crossid)]
ib$dadsp <- crosses$dadsp[match(ib$crossid, crosses$crossid)]
ib$crosstype <- crosses$crosstype[match(ib$crossid, crosses$crossid)]
ib$cross <- crosses$cross[match(ib$crossid, crosses$crossid)]

#rename crosstype codes
ib$crosstype <- factor(ib$crosstype, levels=c("between", "within", "hybrid"))
#made "between" the first reference level to facilitate comparison between outcrossing populations and hybridizing species 

ib$mompop <- sapply(ib$mompop, mapvalues, from = c("794","866","899","879","892","904","3587"), to = c("WK","WK","WK","879WKG","892WKG","904WPG","3587WP"))
ib$dadpop <- sapply(ib$dadpop, mapvalues, from = c("794","866","899","879","892","904","3587"), to = c("WK","WK","WK","879WKG","892WKG","904WPG","3587WP"))

#define interactions
ib <- within(ib, sxc <- interaction(species,crosstype))
ib <- within(ib, sxcxm <- interaction(species,crosstype,mompop,momid))
ib <- within(ib, mompid <- as.factor(paste(mompop,momid,sep=".")))
ib <- within(ib, dadpid <- as.factor(paste(dadpop,dadid,sep=".")))
ib <- within(ib, smompop <- as.factor(paste(species,mompop,sep="")))

ib$collect.date <- round(difftime(ib$collect.date, "2016-03-10"))
#check final structure
str(ib)

   'data.frame':    1528 obs. of  28 variables:
    $ crossid     : int  1 1 1 1 1 1 1 1 1 1 ...
    $ plantid     : Factor w/ 25 levels "1","10","11",..: 1 15 20 23 24 24 2 2 3 4 ...
    $ cross       : Factor w/ 4 levels "HH","HK","KH",..: 1 1 1 1 1 1 1 1 1 1 ...
    $ block1      : int  12 12 11 12 12 12 12 12 12 12 ...
    $ blockAB     : Factor w/ 3 levels "","A","B": 1 1 1 1 1 1 1 1 1 1 ...
    $ collect.date:Class 'difftime'  atomic [1:1528] 341 344 338 345 316 316 335 335 346 323 ...
     .. ..- attr(*, "units")= chr "days"
    $ regression  : Factor w/ 5 levels "","backup R",..: 4 4 4 4 3 5 3 5 4 3 ...
    $ flrs        : int  NA NA NA NA NA 41 NA 34 NA NA ...
    $ inflo.e     : int  1 4 10 7 31 1 3 1 6 10 ...
    $ inflo       : int  1 4 10 7 32 NA 4 NA 6 11 ...
    $ weigh.date  : Date, format: "17-07-14" "17-05-18" ...
    $ mass        : num  0.053 0.075 1.113 0.44 2.14 ...
    $ initials    : Factor w/ 4 levels "AK","AK ","SS",..: 3 3 3 3 3 3 3 3 3 3 ...
    $ check       : Factor w/ 22 levels "","AK 8/15/17",..: 2 2 2 2 2 2 2 2 2 2 ...
    $ comments    : Factor w/ 66 levels "","11 open fls",..: 1 1 1 1 1 1 21 1 1 1 ...
    $ linenum     : num  1 2 3 4 5 6 7 8 9 10 ...
    $ mompop      : Factor w/ 5 levels "3587WP","WK",..: 3 3 3 3 3 3 3 3 3 3 ...
    $ momid       : Factor w/ 17 levels "1","10","10-1",..: 7 7 7 7 7 7 7 7 7 7 ...
    $ species     : Factor w/ 2 levels "hook","kaal": 1 1 1 1 1 1 1 1 1 1 ...
    $ dadpop      : Factor w/ 5 levels "3587WP","WK",..: 3 3 3 3 3 3 3 3 3 3 ...
    $ dadid       : Factor w/ 23 levels "1","10","10-1",..: 19 19 19 19 19 19 19 19 19 19 ...
    $ dadsp       : Factor w/ 2 levels "hook","kaal": 1 1 1 1 1 1 1 1 1 1 ...
    $ crosstype   : Factor w/ 3 levels "between","within",..: 2 2 2 2 2 2 2 2 2 2 ...
    $ sxc         : Factor w/ 6 levels "hook.between",..: 3 3 3 3 3 3 3 3 3 3 ...
    $ sxcxm       : Factor w/ 510 levels "hook.between.3587WP.1",..: 195 195 195 195 195 195 195 195 195 195 ...
    $ mompid      : Factor w/ 21 levels "3587WP.10","3587WP.14",..: 8 8 8 8 8 8 8 8 8 8 ...
    $ dadpid      : Factor w/ 23 levels "3587WP.10","3587WP.14",..: 9 9 9 9 9 9 9 9 9 9 ...
    $ smompop     : Factor w/ 5 levels "hook879WKG","hookWK",..: 1 1 1 1 1 1 1 1 1 1 ...

Data Inspection

Regression

library(ggplot2)
qplot(log10(mass), log10(flrs), col=cross, data=ib, weight=mass) + geom_smooth(method="lm", se=T) + scale_color_manual(values=crosscol) + ggtitle("cross")

qplot(log10(mass), log10(flrs), col=cross, group=paste(mompop,dadpop), data=ib) +  geom_smooth(method="lm", se=F) + scale_color_manual(values=crosscol) + ggtitle("mompop*dadpop")

#qplot(log10(mass), log10(flrs), col=cross, data=ib, weight=mass) + geom_quantile(quantiles = 0.9) + scale_color_manual(values=crosscol)

#plot(log10(flrs)~log10(mass), data=ib, type="n")
#text(log10(ib$mass), log10(ib$flrs), 1:length(ib$plantid), col=crosscol[ib$cross])

#library(quantreg,log10(flrs)~log10(mass)*cross, data=ib[-exclude,], weights=mass)
#qr <- rq(log10(flrs)~log10(mass)*cross, data=ib[-exclude,], tau = 0.9)
#summary(qr)

mf <- lm(log10(flrs)~log10(mass)*cross, data=ib, weights=mass)
plot(mf)

mass.flrs <- summary(mf)
mass.flrs

   
   Call:
   lm(formula = log10(flrs) ~ log10(mass) * cross, data = ib, weights = mass)
   
   Weighted Residuals:
        Min       1Q   Median       3Q      Max 
   -0.32549 -0.04318  0.00825  0.05360  0.32532 
   
   Coefficients:
                       Estimate Std. Error t value Pr(>|t|)    
   (Intercept)          2.54429    0.07685  33.108  < 2e-16 ***
   log10(mass)          0.91156    0.08689  10.491  < 2e-16 ***
   crossHK              0.22495    0.07812   2.879 0.004158 ** 
   crossKH              0.25375    0.08232   3.082 0.002169 ** 
   crossKK             -0.05870    0.07723  -0.760 0.447550    
   log10(mass):crossHK  0.24313    0.09610   2.530 0.011715 *  
   log10(mass):crossKH  0.22703    0.11299   2.009 0.045047 *  
   log10(mass):crossKK  0.30545    0.09000   3.394 0.000744 ***
   ---
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
   
   Residual standard error: 0.08916 on 495 degrees of freedom
     (1025 observations deleted due to missingness)
   Multiple R-squared:  0.9156, Adjusted R-squared:  0.9144 
   F-statistic: 766.8 on 7 and 495 DF,  p-value: < 2.2e-16

slopes <- mass.flrs$coefficients[c(2,6,7,8),"Estimate"]
slopes[c(2,3,4)] <- slopes[c(2,3,4)]+slopes[1]
names(slopes) <- levels(ib$cross)
slopes

         HH       HK       KH       KK 
   0.911560 1.154691 1.138592 1.217012

intercepts <- mass.flrs$coefficients[c(1,3,4,5),"Estimate"]
intercepts[c(2,3,4)] <- intercepts[c(2,3,4)]+intercepts[1]
names(intercepts) <- levels(ib$cross)
intercepts

         HH       HK       KH       KK 
   2.544293 2.769240 2.798040 2.485588

#with(ib, plot(mass,flrs, col=crosscol[cross], xlim=c(0,0.5), ylim=c(0,240)))
#for(i in 1:4) { curve(10^intercepts[i]*x^slopes[i], from=0, to=6, col=crosscol[i], add=T) }
#with(ib, plot(log10(mass),log10(flrs), col=crosscol[cross]))
#for(i in 1:4) { curve(intercepts[i]+x*slopes[i], from=-2, to=1, col=crosscol[i], add=T) }

#Add together envelopes for regression and the rest
ibold <- ib
ib <- aggregate(mass ~ crossid+plantid+cross+mompop+momid+species+dadpop+dadid+dadsp+crosstype+sxc+sxcxm+mompid+dadpid+smompop, ib, sum, na.action=na.pass)
ib$collect.date <- aggregate(collect.date ~ crossid+plantid+cross+mompop+momid+species+dadpop+dadid+dadsp+crosstype+sxc+sxcxm+mompid+dadpid+smompop, ibold, max, na.action=na.pass)$collect.date

library(ggplot2)
ggplot(ib, aes(x=log10(mass), fill=cross, color=cross)) + geom_density(alpha=0.1) + scale_fill_manual(values=crosscol) +scale_color_manual(values=crosscol)

Replication

The sample sizes are unbalanced at all levels, including maternal population:

reptab <- with(ib, table(smompop,crosstype))
mosaic(reptab, pop=F)
labeling_cells(text = reptab, margin = 0)(reptab)

Replication is low for some within-population crosses. The replication is even lower for each maternal plant, so we need to be wary of estimates when subsetting at this level:

with(ib, kable(table(mompid,crosstype)))

	between	within	hybrid
3587WP.10	0	1	0
3587WP.14	15	0	6
3587WP.15	4	0	0
3587WP.7	19	14	8
3587WP.A	5	0	0
3587WP.C	12	1	0
879WKG.10-1	48	1	35
879WKG.2-2	31	35	89
879WKG.G-2	15	5	26
879WKG.H-2	0	0	7
879WKG.N-5	8	6	22
892WKG.1	23	0	6
892WKG.3	1	0	3
892WKG.4	1	0	1
892WKG.5	8	5	6
904WPG.2	13	9	7
904WPG.3	16	27	3
904WPG.5	77	32	17
WK.2	137	18	91
WK.2E- 1	0	0	19
WK.4	55	6	25

Overall data distribution

To identify the best-fitting distribution, we make quantile-quantile plots of the raw data against various distributions. The more points within the confidence interval envelopes, the better the fit. Later, we present quantile-quantile plots of the model residuals to assess model fit.

#QQ plots against various distributions
set.seed(1)
par(mfrow=c(1,3))
normal <- fitdistr(log10(ib$mass+1), "normal")
qqp(log10(ib$mass+1), "norm", main="Normal")
lognormal <- fitdistr(ib$mass+1, "lognormal")
qqp(ib$mass+1, "lnorm", main="Log Normal")
#pois <- fitdistr(ib$mass+1, "Poisson")
#qqp(ib$mass, "pois", pois$estimate, main="Poisson")
#nbinom <- fitdistr(ib$mass+1, "Negative Binomial") 
#qqp(ib$mass+1, "nbinom", size = nbinom$estimate[[1]], mu=nbinom$estimate[[2]], main="Negative Binomial")
gamma <- fitdistr(ib$mass+1, "gamma")
qqp(ib$mass+1, "gamma", shape = gamma$estimate[[1]], rate = gamma$estimate[[2]], main="Gamma")

Distributions by fixed factors

ggplot(ib, aes(x = mass, fill=species)) +
  geom_histogram(data=subset(ib,species == "hook"), aes(y=-..density..),binwidth=1)+
  geom_histogram(data=subset(ib,species == "kaal"), aes(y= ..density..),binwidth=1)+
  coord_flip() + facet_grid(~crosstype) + labs(y="Histogram", x="Inflo Biomass")

Distributions by random factors

ggplot(aes(y=mass, x=mompid, color=crosstype), data=ib) + geom_count(alpha=0.8) + coord_flip() + labs(x="Maternal plant", y="Mass")

Homogeneity of variances across subsets

Our mixed model uses one parameter to capture random effect variance, which is assumed to be homogeneous. Plotting on a log scale should uncouple variances from means to assess this visually. Subsets are species * crosstype * maternal plant.

Subset variances are not homogeneous:

ggplot(aes(y=log10(mass+1), x=sxcxm, color=crosstype), data=ib) + geom_boxplot() + coord_flip() + labs(y="ln(Inflo Biomass + 1)",x="Subsets")

Subset mean-variance relationship

Various distributions make different assumptions about the mean-variance (µ-Var) ratio.

grpVars <- with(ib, tapply(mass, list(sxcxm), var))
grpMeans <- with(ib, tapply(mass, list(sxcxm), mean))
grpCounts <- with(ib, tapply(mass, list(sxcxm), length))
#set weight=grpCounts to weight loess by sample sizes
ggplot(na.omit(data.frame(grpMeans,grpVars,grpCounts)),
       aes(x=grpMeans,y=grpVars, weight=1))+geom_point(aes(size=grpCounts))+
  guides(colour=guide_legend(title="Fit"),size=guide_legend(title="Sample size")) + labs(x="Subset Mean", y="Subset Variance") + labs(subtitle="Subset: species*crosstype*mompid")

Fixed effects

Effects and interactions in these plots are simply given by the mean, which may be unduly influenced by high values.

intplot <- ggplot(ib,aes(x=crosstype,y=mass))+
  geom_count(aes(size = ..prop.., group=sxc),alpha=0.5)+
  stat_summary(aes(x=as.numeric(crosstype)),fun.y=mean,geom="line")+ facet_grid(~species)
intplot + aes(group=species, color=species)

Random effects

Maternal population

intplot + aes(group=mompop, color=mompop)

Maternal plant

intplot + aes(group=mompid, color=mompop)

Paternal population

intplot + aes(group=dadpop, color=dadpop)

Run models on subsets

Run many generalized linear models on subsets of the data defined by crosstype | mompid to see if effects estimates are consistent within maternal plants.

Most maternal plant subsets agree, but some are problematic outliers. These plants can be picked out visually from the random effects interaction plot above, the estimated parameters of each subset model, and the QQ plot of the estimated parameters:

 #had to get rid of species or mompid since mompid is nested inside species. dadpop also works
glm.lis <- lmList(log10(mass)~crosstype|mompid,data=ib, family="gaussian")
plot.lmList(glm.lis,scale=list(x=list(relation="free")))

   Loading required package: reshape

   
   Attaching package: 'reshape'

   The following objects are masked from 'package:plyr':
   
       rename, round_any

   The following object is masked from 'package:Matrix':
   
       expand

   Using grp as id variables

qqmath.lmList(glm.lis)#

   Using  as id variables

Models

We constructed the following models with the package glmmADMB. They all have the same fixed effects, species x crosstype, and response variable, log10(mass)

X = standard GL(M)M

Distribution, Random Effects:	None	Maternal plant	Maternal population
normal (norm)	X	X	X

#Normal (Gaussian) distribution, identity link
ib <- ib[!is.na(ib$collect.date),]
sc.norm.l          <- lm(log10(mass)~species*crosstype+collect.date, data=ib)
sc.mix.mompid.l           <- lmer(log10(mass)~species*crosstype+collect.date + (1|mompid), data=ib)
sc.nd.mix.momdadpid.l           <- lmer(log10(mass)~species*crosstype + (1|mompid)+ (1|dadpid), data=ib)
sc.mix.mompop.l    <- lmer(log10(mass)~species*crosstype+collect.date + (1|mompop), data=ib)
sc.mix.momdadpid.l           <- lmer(log10(mass)~species*crosstype+collect.date + (1|mompid) + (1|dadpid), data=ib)
sc.mix.momdadpop.l    <- lmer(log10(mass)~species*crosstype+collect.date + (1|mompop) + (1|dadpop), data=ib)

Model comparison

AIC

We will use the Aikake Information Criterion to pick the model the best fits the data, penalized by the number of parameters. Differences of 2 units are significant.

sc.names <- c("sc.norm.l","sc.mix.mompid.l","sc.mix.mompop.l","sc.mix.momdadpid.l","sc.mix.momdadpop.l")
sc.list <- sapply(sc.names, get, USE.NAMES=T)
sc.AIC <- ICtab(sc.list,mnames=sc.names,type="AIC", base=T, delta=F) # for AICc, nobs=nobs(sc.list[[1]])
class(sc.AIC)<-"data.frame"
all.names <- c(sc.names)
all.list <- sapply(all.names, get, USE.NAMES=T)
all.AIC <- dfun(rbind(sc.AIC))
all.AIC <- all.AIC[order(all.AIC$dAIC),]
kable(all.AIC, format.arg=list(digits=3))

	dAIC	df
sc.norm.l	0.0	8
sc.mix.mompid.l	36.0	9
sc.mix.momdadpid.l	36.8	10
sc.mix.momdadpop.l	39.8	10
sc.mix.mompop.l	41.0	9

The best-fiting model is a mixed model with the following components:

- response: log10(mass)
- fixed effects: species, crosstype, species x crosstype
- random effect: mompid

Overdispersion

Looking at the normal, fixed effects model, we see that the residuals are not normal:

shapiro.test(sc.norm.l$residuals)#raw residuals!

   
    Shapiro-Wilk normality test
   
   data:  sc.norm.l$residuals
   W = 0.93245, p-value < 2.2e-16

Coefficients

The coefficients estimated for each model agree qualitatively.

sc.log.names <- sc.names
sc.log <- sapply(sc.log.names, get, USE.NAMES=T)

coefplot2(sc.log, legend.x="topright",legend=T,legend.args=list(cex=0.8, xpd=T, inset=c(-0.1,0)), col.pts=sample(gg_color_hue(length(sc.log.names))), spacing=0.05, lwd.2=2, lwd.1=4, intercept=F)

Inference

We chose the model with nearly the best (lowest) AIC, to carry out inference tests and parameter estimation.

Description

mod <- sc.mix.momdadpid.l
print(mod)

   Linear mixed model fit by REML ['lmerMod']
   Formula: 
   log10(mass) ~ species * crosstype + collect.date + (1 | mompid) +  
       (1 | dadpid)
      Data: ib
   REML criterion at convergence: 1400.953
   Random effects:
    Groups   Name        Std.Dev.
    dadpid   (Intercept) 0.04366 
    mompid   (Intercept) 0.07815 
    Residual             0.47644 
   Number of obs: 997, groups:  dadpid, 23; mompid, 21
   Fixed Effects:
                   (Intercept)                  specieskaal  
                      0.646671                     0.562378  
               crosstypewithin              crosstypehybrid  
                     -0.053021                     0.410157  
                  collect.date  specieskaal:crosstypewithin  
                     -0.002629                     0.001903  
   specieskaal:crosstypehybrid  
                     -0.711009

Test significance of random effects

Using a likelihood ratio test, with a null hypothesis of zero variance, the random effect (maternal plant) is significant for both model parts:

anova(sc.norm.l, sc.mix.mompid.l) #double this p-value. or simulate null by permuting data.

Test significance of interaction

By dropping it from the model and performing a likelihood-ratio test, we see that the species x crosstype interaction is significant for the count model but not the binary model:

sxc.chisq <- drop1(mod, test="Chisq") #load from file
dfun(sxc.chisq)

   Single term deletions
   
   Model:
   log10(mass) ~ species * crosstype + collect.date + (1 | mompid) + 
       (1 | dadpid)
                     Df   dAIC    LRT   Pr(Chi)    
   <none>                0.000                     
   collect.date       1 15.930 17.930 2.291e-05 ***
   species:crosstype  2 24.795 28.795 5.588e-07 ***
   ---
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Model summary

The model estimated the following parameters, with individual parameter significance determined by the Wald z-test, and fixed effect significance determined by analysis of deviance Wald test.

summary(mod)

   Linear mixed model fit by REML ['lmerMod']
   Formula: 
   log10(mass) ~ species * crosstype + collect.date + (1 | mompid) +  
       (1 | dadpid)
      Data: ib
   
   REML criterion at convergence: 1401
   
   Scaled residuals: 
       Min      1Q  Median      3Q     Max 
   -4.4254 -0.5161  0.2024  0.7027  1.7542 
   
   Random effects:
    Groups   Name        Variance Std.Dev.
    dadpid   (Intercept) 0.001906 0.04366 
    mompid   (Intercept) 0.006107 0.07815 
    Residual             0.226996 0.47644 
   Number of obs: 997, groups:  dadpid, 23; mompid, 21
   
   Fixed effects:
                                 Estimate Std. Error t value
   (Intercept)                  0.6466713  0.2014437   3.210
   specieskaal                  0.5623783  0.0673054   8.356
   crosstypewithin             -0.0530208  0.0661296  -0.802
   crosstypehybrid              0.4101567  0.0473467   8.663
   collect.date                -0.0026286  0.0006194  -4.244
   specieskaal:crosstypewithin  0.0019032  0.0938162   0.020
   specieskaal:crosstypehybrid -0.7110086  0.0956978  -7.430
   
   Correlation of Fixed Effects:
                   (Intr) spcskl crsstypw crsstyph cllct. spcskl:crsstypw
   specieskaal     -0.075                                                
   crsstypwthn     -0.072  0.219                                         
   crsstyphybr     -0.014  0.475  0.323                                  
   collect.dat     -0.972 -0.094 -0.002   -0.124                         
   spcskl:crsstypw  0.037 -0.351 -0.708   -0.240    0.016                
   spcskl:crsstyph  0.018 -0.502 -0.156   -0.591    0.062  0.254

Anova(mod, type=3)

   Analysis of Deviance Table (Type III Wald chisquare tests)
   
   Response: log10(mass)
                      Chisq Df Pr(>Chisq)    
   (Intercept)       10.305  1   0.001327 ** 
   species           69.816  1  < 2.2e-16 ***
   crosstype         89.502  2  < 2.2e-16 ***
   collect.date      18.011  1  2.196e-05 ***
   species:crosstype 59.096  2  1.470e-13 ***
   ---
   Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Predicted random effects

These are box and QQ (to check normality) plots of the estimated random effect of each maternal plant.

predre <- setNames(data.frame(RE=ranef(mod)$mompid,SD=ranef(mod, sd=T)$`1`),c("RE","SD"))
ggplot(predre, aes(x = rownames(predre),y=RE)) +
  geom_point(size = 2) + coord_flip()+
  geom_errorbar(aes(ymin = RE-SD, ymax = RE+SD)) + labs(x="Maternal plants", y="Predicted random effects")

#Count
reStack <- ldply(ranef(mod))
print( qqmath( ~`(Intercept)`|.id, data=reStack, scales=list(relation="free"),
                 prepanel = prepanel.qqmathline,
                 panel = function(x, ...) {
                   panel.qqmathline(x, ...)
                   panel.qqmath(x, ...)
                 },
                 layout=c(1,1)))

Least square means

The least square means procedure can generate predictor estimates of each type, and give their significance groupings with a post-hoc Tukey test. S. hookeri-produced hybrids produce less Inflo Biomass than either crosses between or within S. hookeri populations. The other differences are not significant, but remember that the fixed effect of hybrid (vs. between) was significant (model summary).

#Count
rg <- ref.grid(sc.mix.momdadpid.l)

   Loading required namespace: lmerTest

#summary(rg)
sxc.lsm <- lsmeans(rg, ~ crosstype*species)
plot(sxc.lsm)

options(digits=4)
cld.mod <- cld(sxc.lsm, Letters=letters) #tukey letterings
cld.mod$response <- 10 ^ cld.mod$lsmean
cld.mod$uSE <- 10 ^ (cld.mod$lsmean+cld.mod$SE)
cld.mod$lSE <- 10 ^ (cld.mod$lsmean-cld.mod$SE)
cld.mod[rev(order(cld.mod$species, cld.mod$crosstype)),]

    crosstype species   lsmean      SE    df lower.CL upper.CL .group
    hybrid    kaal     0.06001 0.07310 99.85 -0.08501  0.20504   b   
    within    kaal     0.30975 0.06488 46.92  0.17922  0.44028   bc  
    between   kaal     0.36086 0.04734 25.59  0.26347  0.45826    c  
    hybrid    hook     0.20864 0.04362 12.64  0.11413  0.30315   bc  
    within    hook    -0.25453 0.06847 60.03 -0.39149 -0.11758  a    
    between   hook    -0.20151 0.04763 13.75 -0.30385 -0.09918  a    
    response    uSE    lSE
      1.1482 1.3587 0.9703
      2.0405 2.3693 1.7574
      2.2954 2.5598 2.0584
      1.6167 1.7876 1.4623
      0.5565 0.6515 0.4753
      0.6288 0.7016 0.5634
   
   Degrees-of-freedom method: satterthwaite 
   Results are given on the log10 (not the response) scale. 
   Confidence level used: 0.95 
   P value adjustment: tukey method for comparing a family of 6 estimates 
   significance level used: alpha = 0.05

H.wb <-  with(cld.mod[cld.mod$species=="hook",], response[crosstype=="within"]/response[crosstype=="between"] - 1)
K.wb <- with(cld.mod[cld.mod$species=="kaal",], response[crosstype=="within"]/response[crosstype=="between"] - 1)
K.H <- with(cld.mod[cld.mod$crosstype=="between",], response[species=="kaal"]/response[species=="hook"] - 1)
maxsp <- ifelse(K.H>0, "kaal","hook")
minsp <- ifelse(K.H<0, "kaal","hook")
maxresp <- with(cld.mod, response[species==maxsp & crosstype=="between"])
minresp <- with(cld.mod, response[species==minsp & crosstype=="between"])
HK.resp <-  with(cld.mod, response[species=="hook" & crosstype=="hybrid"])
KH.resp <-  with(cld.mod, response[species=="kaal" & crosstype=="hybrid"])
HK.int <-   with(cld.mod, ifelse(HK.resp > minresp & HK.resp < maxresp, (HK.resp-minresp)/(maxresp-minresp), 
                                 ifelse(HK.resp < minresp, HK.resp/minresp-1, HK.resp/maxresp-1)))
KH.int <-   with(cld.mod, ifelse(KH.resp > minresp & KH.resp < maxresp, (KH.resp-minresp)/(maxresp-minresp), 
                                 ifelse(KH.resp < minresp, KH.resp/minresp-1, KH.resp/maxresp-1)))

intermed <- (minresp + maxresp) / 2
with(ib, wilcox.test(ib[species=="kaal" & crosstype=="hybrid","mass"], mu=intermed))

   
    Wilcoxon signed rank test with continuity correction
   
   data:  ib[species == "kaal" & crosstype == "hybrid", "mass"]
   V = 1000, p-value = 0.09
   alternative hypothesis: true location is not equal to 1.462

with(ib, wilcox.test(ib[species=="hook" & crosstype=="hybrid","mass"], mu=intermed))

   
    Wilcoxon signed rank test with continuity correction
   
   data:  ib[species == "hook" & crosstype == "hybrid", "mass"]
   V = 36000, p-value = 6e-16
   alternative hypothesis: true location is not equal to 1.462

round(c(H.wb,K.wb,K.H,HK.int,KH.int),2)

   [1] -0.11 -0.11  2.65  0.59  0.31

ggplot(as.data.frame(cld.mod), aes(y=response, x=relevel(crosstype, "within"), fill=species)) +
  geom_col(position=position_dodge2()) +
  geom_linerange(aes(ymin=lSE, ymax=uSE), position=position_dodge(0.9)) +
  labs(x="", y="Inflorescence biomass (g)",fill="Maternal species") +
  scale_fill_manual(labels = c("S. hookeri  ", "S. kaalae  "), values=brewer.pal(name="Set1", n=3)[c(3,2)]) +
  scale_x_discrete(labels = c("Intrapopulation", "Interpopulation", "Hybrid")) +
  geom_text(aes(label=.group), position=position_dodge(0.9), hjust=0, vjust=-1) +
  scale_y_continuous(expand = expand_scale(add=c(0,0)), breaks = scales::pretty_breaks(n = 5)) +
  theme_classic() + theme(legend.text=element_text(face="italic", size=rel(1)), legend.position="bottom", axis.text = element_text(colour="black", size=rel(1)), text=element_text(size=14), axis.ticks.x = element_blank()) + geom_segment(aes(x=2.5, y=intermed, xend=3.5, yend=intermed))

Hybrids Inflo Biomass GLMM

John Powers

October 15, 2017