1 Are growth curve analyses very anti-conservative?

This is based on work by Yujing Jiang and Jesse Snedeker, the follow-up analyses are done by Hugh Rabgaliati

These analyses are a follow-up to Jesse Snedeker’s recent comment on social media, concerning the false positive rate in growth curve analyses. Jesse described simulations carried out by Yujing Huang, assessing the probability of generating false positive results from mixed effects growth curve analyses. To assess this, Yujing had taken a pre-existing dataset that she had analyzed with a growth curve model, and then she had shuffled the condition labels in that dataset, and reanalyzed with a growth curve model. She did this about 1000 times. The key finding was that on about 50% of her simulations, she obtained significant p values for time terms, or for terms that interacted with time. This is surprising, because significant results should really be about the 5% level when condition labels are randomly shuffled. In this document, I will confirm Yujing’s analysis and assess why these false positives occurred.

1.1 The original dataset.

Yujing provided me with her original dataset and the R scripts that she had used to produce her simulations. In this experiment, participants listen to sentences and look at pictures, and we measure gaze to a Target picture during two ROIs in the sentence. We’ll only analyze the first ROI (the “verb” ROI). The two graphs below show:

  1. Gaze to the Target across the four conditions of the experiment.
  2. The actual analyzed data. To process the data for analysis, we average over trials within subjects and conditions, and then create the Dependent Variable: gaze to the Target during condition Uamis is subtracted from gaze to the Target during condition Uam (that’s condition 0), and is compared to the subtraction of Uemis from Uem (that’s condition 1).
load("ua_for_growth_curve.Rda")
ua <- rename(ua ,c('ItemID'='ItemId')) # rename the colnames for function "shuffle". This is specific to my data
ua$Condition2 <- ua$Condition
ua_prepd_data <- plot_plot_return_prepd_data(ua)

## Warning: Removed 48 rows containing non-finite values (stat_smooth).
## Warning: Removed 48 rows containing missing values (geom_point).

1.2 Analyzing that dataset

Looking at the graphs, it seems very clear that there is only minimal difference between the conditions. Despite this, when we run the planned growth curve analysis, we see some quite striking results.

We fit a growth curve with the following structure. lmer(TargetLooks ~ (ot1+ot2)*Condition2 + (ot1+ot2| Subject), data= ua_prepd_data, REML=F)

In this model, we predict the subtraction of looks to the Target between Conditions 0 and 1 (that’s Uam - Uamis and Uem - Uemis) according to:

  1. A fixed effect intercept
  2. A fixed fixed linear time term
  3. A fixex effect quadratic time term (time terms are orthogonal polynomials)
  4. A fixed effect of Condition
  5. A fixed effect interaction of linear time and Condition
  6. A fixed effect interaction of quadratic time and Condition
  7. A random effect intercept for subjects
  8. A random effect linear time term for subjects
  9. A random effect quadratic time term for subjects
## Linear mixed model fit by maximum likelihood t-tests use Satterthwaite
##   approximations to degrees of freedom [lmerMod]
## Formula: TargetLooks ~ (ot1 + ot2) * Condition2 + (ot1 + ot2 | Subject)
##    Data: ua_prepd_data
## 
##      AIC      BIC   logLik deviance df.resid 
##  -3209.6  -3116.4   1617.8  -3235.6     9539 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.8051 -0.6656 -0.0024  0.6461  4.0327 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr       
##  Subject  (Intercept) 0.02517  0.1586              
##           ot1         0.35874  0.5990   -0.07      
##           ot2         0.19412  0.4406   -0.06  0.11
##  Residual             0.03920  0.1980              
## Number of obs: 9552, groups:  Subject, 60
## 
## Fixed effects:
##                  Estimate Std. Error         df t value Pr(>|t|)    
## (Intercept)     1.086e-01  2.068e-02  6.100e+01   5.250 2.02e-06 ***
## ot1             3.263e-03  8.150e-02  6.700e+01   0.040    0.968    
## ot2             1.668e-02  6.244e-02  7.200e+01   0.267    0.790    
## Condition2     -5.355e-03  4.056e-03  9.373e+03  -1.320    0.187    
## ot1:Condition2  2.702e-02  3.628e-02  9.375e+03   0.745    0.456    
## ot2:Condition2 -1.593e-01  3.629e-02  9.378e+03  -4.391 1.14e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) ot1    ot2    Cndtn2 ot1:C2
## ot1         -0.068                            
## ot2         -0.050  0.094                     
## Condition2  -0.099 -0.001  0.000              
## ot1:Condtn2  0.000 -0.224 -0.001  0.002       
## ot2:Condtn2  0.000 -0.001 -0.293  0.000  0.002

As you can see, despite the fact that there is minimal difference between the curves in the second figure, there is also a (highly) significant interaction between the quadratic time term and condition.

1.3  Permuting the dataset.

Yujing shuffled this dataset thousands of times, and ran lmer analyses on each of those datasets. We’re not going to repeat that here, because it would take far too long. But her results can be easily demonstrated. Below, we shuffle the data, plot the results, and print the GCA output, then repeat.

1.3.0.1  Shuffle the data once and run a GCA analysis

## Warning: Removed 35 rows containing non-finite values (stat_smooth).
## Warning: Removed 32 rows containing missing values (geom_point).

## Linear mixed model fit by maximum likelihood t-tests use Satterthwaite
##   approximations to degrees of freedom [lmerMod]
## Formula: TargetLooks ~ (ot1 + ot2) * Condition2 + (ot1 + ot2 | Subject)
##    Data: shuffle_ua_prepd_data
## 
##      AIC      BIC   logLik deviance df.resid 
##  -2760.9  -2667.7   1393.4  -2786.9     9552 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.0959 -0.6474  0.0266  0.6554  4.6238 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr       
##  Subject  (Intercept) 0.02830  0.1682              
##           ot1         0.43524  0.6597   -0.10      
##           ot2         0.22535  0.4747   -0.06  0.33
##  Residual             0.04105  0.2026              
## Number of obs: 9565, groups:  Subject, 60
## 
## Fixed effects:
##                  Estimate Std. Error         df t value Pr(>|t|)    
## (Intercept)     8.797e-02  2.192e-02  6.100e+01   4.014 0.000166 ***
## ot1             3.325e-02  8.915e-02  6.600e+01   0.373 0.710389    
## ot2            -6.110e-02  6.670e-02  7.000e+01  -0.916 0.362719    
## Condition2      4.192e-02  4.146e-03  9.386e+03  10.110  < 2e-16 ***
## ot1:Condition2 -8.891e-02  3.712e-02  9.389e+03  -2.395 0.016648 *  
## ot2:Condition2 -2.660e-02  3.711e-02  9.388e+03  -0.717 0.473413    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) ot1    ot2    Cndtn2 ot1:C2
## ot1         -0.098                            
## ot2         -0.050  0.288                     
## Condition2  -0.095 -0.002 -0.001              
## ot1:Condtn2 -0.001 -0.210 -0.002  0.005       
## ot2:Condtn2  0.000 -0.002 -0.280  0.002  0.004

1.3.0.2  Shuffle the data a second time and run a GCA analysis

## Warning: Removed 18 rows containing non-finite values (stat_smooth).
## Warning: Removed 18 rows containing missing values (geom_point).

## Linear mixed model fit by maximum likelihood t-tests use Satterthwaite
##   approximations to degrees of freedom [lmerMod]
## Formula: TargetLooks ~ (ot1 + ot2) * Condition2 + (ot1 + ot2 | Subject)
##    Data: shuffle_ua_prepd_data
## 
##      AIC      BIC   logLik deviance df.resid 
##  -2968.0  -2874.8   1497.0  -2994.0     9569 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.8199 -0.6429  0.0032  0.6533  3.9955 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr       
##  Subject  (Intercept) 0.02638  0.1624              
##           ot1         0.37130  0.6093   -0.05      
##           ot2         0.17014  0.4125   -0.09  0.09
##  Residual             0.04028  0.2007              
## Number of obs: 9582, groups:  Subject, 60
## 
## Fixed effects:
##                  Estimate Std. Error         df t value Pr(>|t|)    
## (Intercept)     1.142e-01  2.117e-02  6.100e+01   5.395 1.17e-06 ***
## ot1             3.819e-02  8.282e-02  6.600e+01   0.461 0.646201    
## ot2            -1.512e-01  5.922e-02  7.300e+01  -2.554 0.012732 *  
## Condition2     -1.396e-02  4.103e-03  9.404e+03  -3.402 0.000673 ***
## ot1:Condition2 -9.227e-02  3.675e-02  9.407e+03  -2.511 0.012065 *  
## ot2:Condition2  2.111e-01  3.672e-02  9.406e+03   5.750 9.18e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) ot1    ot2    Cndtn2 ot1:C2
## ot1         -0.046                            
## ot2         -0.081  0.074                     
## Condition2  -0.097  0.000  0.000              
## ot1:Condtn2  0.000 -0.221  0.000  0.004       
## ot2:Condtn2  0.000  0.000 -0.309  0.003  0.005

You can see that there are multiple significant results, particular in the Condition and interaction terms! That is not what we would expect given that the shuffled data should show no effects in these terms.

2 Why do these GCA analyses show such a larger number of false positives?

In her original message about this, Jesse argued that GCA analyses show a high false positive rate because they do not respect the form of the dependent variable: We are conducting linear models on proportions, which are necessarily bounded between 0 and 1. When the data from our experiments are close to 0 or 1, this can result in distortions. The results also ignore item effects.

2.1 Might a binomial growth curve work better?

It is hard to test Jesse’s claim on Yujing’s original dataset, because that data can’t really be analyzed using a binomial model: the dependent variable is a subtraction of proportions. To test this idea, I turned to one of my own visual world datasets. In this data, participants from two groups (SC and AC) listened to sentences whose prosody either did, or did not, encourage them to look towards a Target Instrument. We’ll analyze the averaged data using a linear model first, then move on to a binomial model (fit using a Bayesian approach).

2.1.1 A linear model

We will analyze this data similarly to Yujing’s analysis (we do it differently in our actual paper).

We fit a growth curve with the following structure. lmer(Inst ~1+(t1+t2)*Cond*Pop+(1+t1+t2|Name), data=Pros.GCurve)

In this model, we predict the proportion of looks to the Target over time according to:

  1. A fixed effect intercept
  2. A fixed fixed linear time term
  3. A fixex effect quadratic time term (time terms are orthogonal polynomials)
  4. A fixed effect of Condition
  5. A fixed effect of Population
  6. A fixed effect interaction of linear time and Condition
  7. A fixed effect interaction of quadratic time and Condition
  8. A fixed effect interaction of linear time and Population
  9. A fixed effect interaction of quadratic time and Population
  10. A fixed effect interaction of linear time, Condition and Population
  11. A fixed effect interaction of quadratic time, Condition and Population
  12. A random effect intercept for subjects We had to remove this term to aide convergence
  13. A random effect linear time term for subjects
  14. A random effect quadratic time term for subjects

This is not a large dataset, and so we should not have sufficient data such that we can very precise estimates for higher order terms.

## Linear mixed model fit by REML t-tests use Satterthwaite approximations
##   to degrees of freedom [lmerMod]
## Formula: Inst ~ 1 + (t1 + t2) * Pop * Cond + ((0 + t1 | Name) + (0 + t2 |  
##     Name) + (0 + Cond | Name))
##    Data: Pros.GCurve
## 
## REML criterion at convergence: -1265
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.6179 -0.5252 -0.0401  0.5158  3.3111 
## 
## Random effects:
##  Groups   Name     Variance Std.Dev. Corr
##  Name     t1       0.057658 0.24012      
##  Name.1   t2       0.021426 0.14637      
##  Name.2   CondInst 0.023518 0.15336      
##           CondMod  0.006824 0.08261  0.14
##  Residual          0.014391 0.11996      
## Number of obs: 1242, groups:  Name, 48
## 
## Fixed effects:
##                    Estimate Std. Error         df t value Pr(>|t|)    
## (Intercept)       1.649e-01  1.371e-02  4.590e+01  12.031 8.88e-16 ***
## t1                3.629e-01  3.678e-02  4.480e+01   9.866 8.39e-13 ***
## t2               -2.960e-02  2.444e-02  4.580e+01  -1.211 0.232032    
## PopSC            -4.849e-02  1.371e-02  4.590e+01  -3.538 0.000937 ***
## CondMod          -8.165e-02  1.231e-02  4.610e+01  -6.633 3.22e-08 ***
## t1:PopSC         -8.558e-02  3.678e-02  4.480e+01  -2.327 0.024571 *  
## t2:PopSC          3.639e-02  2.444e-02  4.580e+01   1.489 0.143329    
## t1:CondMod       -1.214e-01  1.232e-02  1.048e+03  -9.849  < 2e-16 ***
## t2:CondMod        8.277e-02  1.228e-02  1.044e+03   6.738 2.65e-11 ***
## PopSC:CondMod     2.681e-02  1.231e-02  4.610e+01   2.178 0.034576 *  
## t1:PopSC:CondMod -9.744e-03  1.232e-02  1.048e+03  -0.791 0.429404    
## t2:PopSC:CondMod -4.331e-02  1.228e-02  1.044e+03  -3.525 0.000441 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) t1     t2     PopSC  CondMd t1:PpSC t2:PpSC t1:CnM
## t1           0.001                                                   
## t2           0.000  0.001                                            
## PopSC       -0.001 -0.001  0.000                                     
## CondMod     -0.514  0.001  0.000 -0.001                              
## t1:PopSC    -0.001 -0.001 -0.001  0.001 -0.001                       
## t2:PopSC     0.000 -0.001 -0.002  0.000  0.000  0.001                
## t1:CondMod   0.003  0.004  0.003 -0.003  0.003 -0.004  -0.003        
## t2:CondMod   0.001  0.002  0.003 -0.001  0.001 -0.002  -0.003   0.005
## PopSC:CndMd -0.001 -0.001  0.000 -0.514 -0.001  0.001   0.000  -0.003
## t1:PpSC:CnM -0.003 -0.004 -0.003  0.003 -0.003  0.004   0.003  -0.013
## t2:PpSC:CnM -0.001 -0.002 -0.003  0.001 -0.001  0.002   0.003  -0.005
##             t2:CnM PSC:CM t1:PSC:
## t1                               
## t2                               
## PopSC                            
## CondMod                          
## t1:PopSC                         
## t2:PopSC                         
## t1:CondMod                       
## t2:CondMod                       
## PopSC:CndMd -0.001               
## t1:PpSC:CnM -0.005  0.003        
## t2:PpSC:CnM -0.007  0.001  0.005

As you can see, there are many significant terms in this regression, with very small standard errors. In addition, the graph below shows that the model makes some horrible predictions about how proportions could lie below 0, which justifies Jesse’s concerns about the linear assumption.

Pros.GCurve$fitted <-  fitted(pros.gc)
ggplot(Pros.GCurve,aes(TimeFrame,Inst,linetype = Cond)) + facet_wrap(~Pop) + stat_summary(aes(y=fitted), fun.y=mean, geom="line")+coord_cartesian(ylim=c(0,0.6))+ theme_minimal()+ theme(legend.title=element_blank(),legend.position="bottom", legend.background = element_rect(colour = "white"), strip.background = element_rect(colour = "white"),  text = element_text(size = 16, colour = "black", angle = 0), strip.text.y = element_text(size = 12, colour = "black", angle = 0))+labs(x = "Time (ms)",y = "Proportion of Trials with Looks to Target Instrument")+geom_point(data = Pros.GCurve.Points, cex = 1, colour = "black", aes(shape = Cond))

2.1.2 A (Bayesian) logistic model

The linear model thus provides a bad fit, and seems to over-estimate the significance of the parameters. We’ll try a binomial model instead. Binomial growth curves don’t tend to work well when estimated with lme4, but I’ve been able to get them to work using Bayesian modeling (via the BRMS link to STAN) and by setting reasonable priors on the coefficients (student_t). We use 3 chains, 1000 iterations each, with 600 for warmup.

brm(Inst ~1+(t1+t2)*Pop*Cond+(0+t1+t2||Name) + (1+t1+t2||ItemNo), data=Pros.GCurve, family = "bernoulli", chains = 3, iter = 1000, warmup = 600,prior=c(set_prior ("student_t (4,0, 1)")))

The estimated coefficients look pretty different, which is unsurprising because we are modeling something very different. If you aren’t familier with BRMS output, then the key fixed effect terms are down in the final table. The first column labels the coefficient, the second gives the Beta estimate, the third the estimated standard error, then lower and upper bound on the credible interval for the estimate, followed by effective sample size and RHat stats, which are essentially measures of whether sampling was done efficiently and whether the chains converged.

##  Family: bernoulli(logit) 
## Formula: Inst ~ 1 + (t1 + t2) * Pop * Cond + (0 + t1 + t2 || Name) + (1 + t1 + t2 || ItemNo) 
##    Data: Pros.GCurve (Number of observations: 4779) 
## Samples: 3 chains, each with iter = 1000; warmup = 600; thin = 1; 
##          total post-warmup samples = 1200
##    WAIC: Not computed
##  
## Group-Level Effects: 
## ~ItemNo (Number of levels: 8) 
##               Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept)     0.82      0.29     0.44     1.60        362 1.00
## sd(t1)            2.03      1.03     0.87     4.83        179 1.03
## sd(t2)            0.44      0.36     0.02     1.28        427 1.00
## 
## ~Name (Number of levels: 48) 
##        Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(t1)     2.65      0.37     2.03     3.48        352    1
## sd(t2)     1.64      0.30     1.11     2.27        445    1
## 
## Population-Level Effects: 
##                  Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept           -2.46      0.32    -3.13    -1.85        322 1.01
## t1                   3.87      1.09     1.13     5.60        237 1.04
## t2                  -1.63      0.41    -2.36    -0.72        455 1.00
## PopSC               -0.32      0.07    -0.47    -0.17        757 1.00
## CondMod             -0.81      0.07    -0.95    -0.68        697 1.00
## t1:PopSC            -0.50      0.43    -1.35     0.32        457 1.00
## t2:PopSC             0.29      0.30    -0.32     0.87        546 1.00
## t1:CondMod           0.23      0.27    -0.29     0.78        743 1.00
## t2:CondMod           0.32      0.22    -0.09     0.72        816 1.00
## PopSC:CondMod        0.29      0.07     0.15     0.43        506 1.00
## t1:PopSC:CondMod    -1.05      0.29    -1.61    -0.51        530 1.00
## t2:PopSC:CondMod     0.43      0.22     0.03     0.85        648 1.01
## 
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
## is a crude measure of effective sample size, and Rhat is the potential 
## scale reduction factor on split chains (at convergence, Rhat = 1).

In addition, the model clearly provides a better fit to the data, without any worries about impossible values:

Pros.GCurve$fitted <-  fitted(pros.gc.brm)
ggplot(Pros.GCurve,aes(TimeFrame,Inst,linetype = Cond)) + facet_wrap(~Pop) + stat_summary(aes(y=fitted[,1]), fun.y=mean, geom="line")+coord_cartesian(ylim=c(0,0.6))+ theme_minimal()+ theme(legend.title=element_blank(),legend.position="bottom", legend.background = element_rect(colour = "white"), strip.background = element_rect(colour = "white"),  text = element_text(size = 16, colour = "black", angle = 0), strip.text.y = element_text(size = 12, colour = "black", angle = 0))+labs(x = "Time (ms)",y = "Proportion of Trials with Looks to Target Instrument")+geom_point(data = Pros.GCurve.Points, cex = 1, colour = "black", aes(shape = Cond))

This evidence so far is consistent with Jesse being right about the importance of the form of the model. But something else may be going on, as many of the credible intervals around the coefficients are quite tight, and the results of this regression don’t match up with prior time window analyses that we did.

2.2 Autocorrelations

A second worry about growth curve analyses of eye movements lies in the high autocorrelation in the dependent variable. If you are gazing towards the target at x milliseconds, then you will probably also gaze at the target at x+20 milliseconds. Unless this correlation is accounted for, growth curve regression estimates can have two problems. First (and less worrying) coefficients may be inefficiently estimated. Second (and much, much more worrying), the standard errors for each coeffficient can be greatly underestimated, and so the resulting t statistics will be greatly overestimated. See here for more details.

Standard OLS regressions assume that there is no correlation in a regression’s residuals. However, if we look back at the model that started all of this, and assess the autocorrelation of the residuals from Yujing’s original growth curve analysis, we can see that there is significant correlation in the residuals. In the plot below, correlations above the blue line are significant.

acf(resid(model_ua))

The lme4 package doesn’t have an easy way of modeling these correlations, but they can be somewhat reduced by using a fuller random effects structure.

lmer(TargetLooks ~ (ot1+ot2)*Condition2 + ((ot1+ot2)*Condition2 | Subject), data= ua_prepd_data, REML=F)

Not only is the autocorrelation (somewhat) reduced, but the standard errors and t value are now far more sensible.

full_model_ua <- lmer(TargetLooks ~ (ot1+ot2)*Condition2 + ((ot1+ot2)*Condition2 | Subject), data= ua_prepd_data, REML=F)
acf(resid(full_model_ua))

print(summary(full_model_ua))
## Linear mixed model fit by maximum likelihood t-tests use Satterthwaite
##   approximations to degrees of freedom [lmerMod]
## Formula: 
## TargetLooks ~ (ot1 + ot2) * Condition2 + ((ot1 + ot2) * Condition2 |  
##     Subject)
##    Data: ua_prepd_data
## 
##      AIC      BIC   logLik deviance df.resid 
##  -8004.3  -7803.7   4030.2  -8060.3     9524 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -6.1219 -0.6120 -0.0046  0.6073  5.8488 
## 
## Random effects:
##  Groups   Name           Variance Std.Dev. Corr                         
##  Subject  (Intercept)    0.04135  0.2033                                
##           ot1            0.78911  0.8883    0.05                        
##           ot2            0.36746  0.6062    0.17 -0.09                  
##           Condition2     0.04493  0.2120   -0.63 -0.16 -0.21            
##           ot1:Condition2 1.14640  1.0707   -0.07 -0.75  0.21  0.12      
##           ot2:Condition2 0.74974  0.8659   -0.19  0.18 -0.67  0.13 -0.24
##  Residual                0.02192  0.1481                                
## Number of obs: 9552, groups:  Subject, 60
## 
## Fixed effects:
##                 Estimate Std. Error        df t value Pr(>|t|)    
## (Intercept)     0.111338   0.026340 59.990000   4.227  8.2e-05 ***
## ot1             0.016229   0.116308 60.120000   0.140    0.889    
## ot2             0.020806   0.080624 59.840000   0.258    0.797    
## Condition2     -0.008102   0.027535 59.960000  -0.294    0.770    
## ot1:Condition2  0.014054   0.140882 60.070000   0.100    0.921    
## ot2:Condition2 -0.163448   0.115052 60.070000  -1.421    0.161    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) ot1    ot2    Cndtn2 ot1:C2
## ot1          0.048                            
## ot2          0.167 -0.084                     
## Condition2  -0.634 -0.157 -0.200              
## ot1:Condtn2 -0.068 -0.746  0.196  0.114       
## ot2:Condtn2 -0.187  0.170 -0.669  0.127 -0.230

2.3 Combining a full random effects structure and a binomial model

These considerations suggest that the best approach would be to use a binomial model that has a maximal random effects structure.

brm(Inst ~1+(t1+t2)*Pop*Cond+(0+(t1+t2)*Cond||Name) + (1+(t1+t2)*Pop*Cond||ItemNo), data=Pros.GCurve, family = "bernoulli", chains = 3, iter = 1000, warmup = 600,prior=c(set_prior ("student_t (4,0, 1)")))

Note that this model takes quite a long time to fit. It also lack correlations amongst the random effects components, to ease convergence.

summary(pros.gc.brm.full)
##  Family: bernoulli(logit) 
## Formula: Inst ~ 1 + (t1 + t2) * Pop * Cond + (0 + (t1 + t2) * Cond || Name) + (1 + (t1 + t2) * Pop * Cond || ItemNo) 
##    Data: Pros.GCurve (Number of observations: 4779) 
## Samples: 3 chains, each with iter = 1000; warmup = 600; thin = 1; 
##          total post-warmup samples = 1200
##    WAIC: Not computed
##  
## Group-Level Effects: 
## ~ItemNo (Number of levels: 8) 
##                      Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(Intercept)            1.16      0.42     0.62     2.25        464 1.00
## sd(t1)                   3.22      1.77     1.23     8.16        241 1.00
## sd(t2)                   0.84      0.60     0.04     2.24        327 1.01
## sd(PopSC)                0.15      0.12     0.01     0.47        527 1.00
## sd(CondMod)              0.88      0.35     0.43     1.71        541 1.00
## sd(t1:PopSC)             1.40      0.68     0.37     3.10        557 1.00
## sd(t2:PopSC)             0.35      0.29     0.02     1.09        888 1.00
## sd(t1:CondMod)           1.77      0.74     0.67     3.41        576 1.02
## sd(t2:CondMod)           1.37      0.65     0.47     3.06        427 1.01
## sd(PopSC:CondMod)        0.65      0.27     0.31     1.26        621 1.00
## sd(t1:PopSC:CondMod)     1.51      0.74     0.51     3.35        816 1.00
## sd(t2:PopSC:CondMod)     0.39      0.32     0.01     1.15        588 1.01
## 
## ~Name (Number of levels: 48) 
##                Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## sd(t1)             2.97      0.56     1.93     4.20        330 1.00
## sd(t2)             1.66      0.39     0.94     2.48        479 1.00
## sd(CondInst)       1.69      0.24     1.26     2.21        293 1.01
## sd(CondMod)        2.16      0.39     1.51     3.05        387 1.01
## sd(t1:CondMod)     2.49      0.58     1.50     3.78        434 1.01
## sd(t2:CondMod)     0.61      0.38     0.03     1.36        270 1.01
## 
## Population-Level Effects: 
##                  Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
## Intercept           -3.65      0.50    -4.63    -2.66        324 1.00
## t1                   4.55      1.91     0.14     7.48        189 1.01
## t2                  -2.26      0.59    -3.42    -1.11        709 1.00
## PopSC               -0.58      0.24    -1.06    -0.12        324 1.00
## CondMod             -1.32      0.40    -2.10    -0.55        445 1.00
## t1:PopSC            -0.95      0.69    -2.33     0.35        669 1.00
## t2:PopSC             0.43      0.39    -0.33     1.26       1001 1.00
## t1:CondMod           0.27      0.66    -0.91     1.70        838 1.00
## t2:CondMod           0.05      0.54    -1.03     1.10        566 1.00
## PopSC:CondMod        0.27      0.32    -0.37     0.92        424 1.00
## t1:PopSC:CondMod    -0.84      0.67    -2.19     0.44        810 1.00
## t2:PopSC:CondMod     0.60      0.35    -0.08     1.28       1200 1.00
## 
## Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample 
## is a crude measure of effective sample size, and Rhat is the potential 
## scale reduction factor on split chains (at convergence, Rhat = 1).
Pros.GCurve$fitted <-  fitted(pros.gc.brm.full)
ggplot(Pros.GCurve,aes(TimeFrame,Inst,linetype = Cond)) + facet_wrap(~Pop) + stat_summary(aes(y=fitted[,1]), fun.y=mean, geom="line")+coord_cartesian(ylim=c(0,0.6))+ theme_minimal()+ theme(legend.title=element_blank(),legend.position="bottom", legend.background = element_rect(colour = "white"), strip.background = element_rect(colour = "white"),  text = element_text(size = 16, colour = "black", angle = 0), strip.text.y = element_text(size = 12, colour = "black", angle = 0))+labs(x = "Time (ms)",y = "Proportion of Trials with Looks to Target Instrument")+geom_point(data = Pros.GCurve.Points, cex = 1, colour = "black", aes(shape = Cond))

This model fits well, and appears to be very sensible! Many fewer parameters are now credibly different from 0, particularly parameters that interact with time. In fact, this analysis is now much more consistent with a prior logistic regression analysis that we had done on this data, using early and late time windows rather than a full growth curve model. While it may seem odd at first blush that time and condition do not interact, given that the curves drift further apart over time, this actually does make sense in log odds world, where differences should be magnified when close to 0 or 1.