This is problem set #3, in which we want you to integrate your knowledge of data wrangling with some basic simulation skills and some linear modeling.
For ease of reading, please separate your answers from our text by marking our text with the > character (indicating quotes).
library(dplyr)
library(ggplot2)
library(lme4)
Let’s start by convincing ourselves that t-tests have the appropriate false positive rate. Run 10,000 t-tests with standard, normally-distributed data from a made up 30-person, single-measurement experiment (the command for sampling from a normal distribution is rnorm). What’s the mean number of “significant” results?
First do this using a for loop.
significant <- 0
for (i in 1:10000) {
d <- rnorm(30)
t <- t.test(d)
if (t$p.value < 0.05) {
significant <- significant + 1
}
}
significant / 10000
## [1] 0.0513
Next, do this using the replicate function:
mean(
replicate(10000,
if (t.test(rnorm(30))$p.value < 0.05) { 1 } else { 0 }))
## [1] 0.0496
Ok, that was a bit boring. Let’s try something moderately more interesting - let’s implement a p-value sniffing simulation, in the style of Simons, Nelson, & Simonsohn (2011).
Consider this scenario: you have done an experiment, again with 30 participants (one observation each, just for simplicity). The question is whether their performance is above chance. You aren’t going to check the p-value every trial, but let’s say you run 30 - then if the p-value is within the range p < .25 and p > .05, you optionally run 30 more and add those data, then test again. But if the original p value is < .05, you call it a day, and if the original is > .25, you also stop.
First, write a function that implements this sampling regime.
double.sample <- function () {
d <- rnorm(30)
p <- t.test(d)$p.value
if (p < 0.25 && p > 0.05) {
d <- c(d, rnorm(30))
p <- t.test(d)$p.value
return(p)
}
return(p)
}
Now call this function 10k times and find out what happens.
mean(
replicate(10000,
if (double.sample() < 0.05) { 1 } else { 0 }))
## [1] 0.0739
Is there an inflation of false positives? How bad is it?
Yes, there is an inflation of false positives by ~ 0.02.
Now modify this code so that you can investigate this “double the sample” rule in a bit more depth. Let’s see what happens when you double the sample ANY time p > .05 (not just when p < .25), or when you do it only if p < .5 or < .75. How do these choices affect the false positive rate?
HINT: Try to do this by making the function double.sample take the upper p value as an argument, so that you can pass this through dplyr.
HINT 2: You may need more samples. Find out by looking at how the results change from run to run.
double.sample <- function(n, p_max) {
d <- rnorm(n)
if (t.test(d)$p.value < p_max) {
return(1)
} else {
d <- c(d, rnorm(n))
if (t.test(d)$p.value < p_max) {
return(1)
}
}
return (0)
}
# mean.sample <- function(x) {
# return(mean(replicate(10000, double.sample(x$n, x$p))))
# }
# expand.grid(n=c(30, 100), p=c(0.05, 0.5, 0.75)) %>%
# apply(1, sample=mean.sample)
mean(replicate(10000, double.sample(30, 0.05)))
## [1] 0.0852
mean(replicate(10000, double.sample(30, 0.5)))
## [1] 0.6889
mean(replicate(10000, double.sample(30, 0.75)))
## [1] 0.9114
mean(replicate(10000, double.sample(100, 0.05)))
## [1] 0.0818
mean(replicate(10000, double.sample(100, 0.5)))
## [1] 0.687
mean(replicate(10000, double.sample(100, 0.75)))
## [1] 0.9136
What do you conclude on the basis of this simulation? How bad is this kind of data-dependent policy?
There are even more false positives with this scheme, though it gets masked as the size of the sample increases.
Let’s use the ToothGrowth dataset, on guineapig teeth based on orange juice and vitamin C. This is super simple. (Try ?ToothGrowth).
First plot the data, we’ll use qplot to understand how len (tooth length) depends on dose (amount of Vitamin C) and supp (delivery method).
ggplot(ToothGrowth, aes(y=len, x=dose, color=supp)) +
geom_point(stat = "identity")
So now you see what’s going on.
Next, always make a histogram of the DV before making a linear model! This reveals the distribution and can be helpful in choosing your model type.
ggplot(ToothGrowth, aes(x=len)) +
geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Now make a linear model of tooth lengths using lm. Try making one with main effects and interactions and another with just main effects. Make sure to assign them to variables so that you can get them later.
with_interactions <- lm(len ~ dose * supp, ToothGrowth)
main_effects <- lm(len ~ dose + supp, ToothGrowth)
summary(with_interactions)
##
## Call:
## lm(formula = len ~ dose * supp, data = ToothGrowth)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.2264 -2.8462 0.0504 2.2893 7.9386
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 11.550 1.581 7.304 1.09e-09 ***
## dose 7.811 1.195 6.534 2.03e-08 ***
## suppVC -8.255 2.236 -3.691 0.000507 ***
## dose:suppVC 3.904 1.691 2.309 0.024631 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.083 on 56 degrees of freedom
## Multiple R-squared: 0.7296, Adjusted R-squared: 0.7151
## F-statistic: 50.36 on 3 and 56 DF, p-value: 6.521e-16
summary(main_effects)
##
## Call:
## lm(formula = len ~ dose + supp, data = ToothGrowth)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.600 -3.700 0.373 2.116 8.800
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.2725 1.2824 7.231 1.31e-09 ***
## dose 9.7636 0.8768 11.135 6.31e-16 ***
## suppVC -3.7000 1.0936 -3.383 0.0013 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.236 on 57 degrees of freedom
## Multiple R-squared: 0.7038, Adjusted R-squared: 0.6934
## F-statistic: 67.72 on 2 and 57 DF, p-value: 8.716e-16
Now try taking out the intercept, using a -1 term in the formula. what does this do?
summary(lm(len ~ dose + supp - 1, ToothGrowth))
##
## Call:
## lm(formula = len ~ dose + supp - 1, data = ToothGrowth)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.600 -3.700 0.373 2.116 8.800
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## dose 9.7636 0.8768 11.135 6.31e-16 ***
## suppOJ 9.2725 1.2824 7.231 1.31e-09 ***
## suppVC 5.5725 1.2824 4.345 5.79e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.236 on 57 degrees of freedom
## Multiple R-squared: 0.9586, Adjusted R-squared: 0.9564
## F-statistic: 439.7 on 3 and 57 DF, p-value: < 2.2e-16
Thought question: Take a moment to interpret the coefficients of the model. Q1 - What are the units?
Length might be measured in millimeters since they’re guinea pigs. Dose is measured in mg/day.
Q2 - How does the interaction relate to the plot?
The model coded the Vit C condition as 1, and the interaction term means that the effect of dosage is different depending on the treatment.
Q3 - Should there be an interaction in the model? What does it mean? How important is it?
Yes it should be included. It would make sense for there to be varying dose levels required for different supplements to achieve the same effect.
Now make predictions from the model you like the best. What should happen with doses of 0, 1.5, 2.5, and 10 under both supplements?
HINT: use the predict function …
HINT 2: you will have to make a dataframe to do the prediction with, so use something like data.frame(dose=...).
predict(with_interactions, data.frame(dose=c(0, 1.5, 2.5, 10, 0, 1.5, 2.5, 10), supp=c(rep('VC', 4), rep('OJ', 4))))
## 1 2 3 4 5 6 7
## 3.29500 20.86857 32.58429 120.45214 11.55000 23.26714 31.07857
## 8
## 89.66429
Now plot the residuals from the original model. How do they look? HINT: ?resid
ggplot(with_interactions, aes(.fitted, .resid)) +
geom_point()
The residual plot looks pretty-random, suggesting a good model.
BONUS: test them for normality of distribution using a quantile-quantile plot.
HINT: ?qqplot and ?qqnorm
What the heck is going on? Load data from Frank, Vul, Saxe (2011, Infancy), a study in which we measured infants’ looking to hands in moving scenes. There were infants from 3 months all the way to about two years, and there were two movie conditions (Faces_Medium, in which kids played on a white background, and Faces_Plus, in which the backgrounds were more complex and the people in the videos were both kids and adults). Forgive our bad naming conventions.
Try to figure out what the most reasonable linear model of the data is.
d <- read.csv("http://langcog.stanford.edu/FVS2011-hands.csv")
ggplot(d, aes(y=hand.look, x=age, color=condition)) +
geom_point(stat = "identity")
ggplot(d, aes(x=hand.look)) +
geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
model <- lm(hand.look ~ age * condition -1, d)
summary(model)
##
## Call:
## lm(formula = hand.look ~ age * condition - 1, data = d)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.190121 -0.036508 -0.001239 0.032091 0.239474
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## age 0.003143 0.001176 2.673 0.00807 **
## conditionFaces_Medium 0.022394 0.015417 1.453 0.14772
## conditionFaces_Plus -0.006046 0.015242 -0.397 0.69198
## age:conditionFaces_Plus 0.003950 0.001653 2.389 0.01770 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.05456 on 228 degrees of freedom
## Multiple R-squared: 0.6641, Adjusted R-squared: 0.6583
## F-statistic: 112.7 on 4 and 228 DF, p-value: < 2.2e-16
ggplot(model, aes(.fitted, .resid)) +
geom_point()
Plot that model on the same plot as the data.
HINT: you can do this either using predict or (if you are feeling confident of your understanding of the models) using the built-in linear models in ggplot’s geom_smooth.
d.pred = data.frame(age=c(0, 15, 30, 0, 15, 30), condition=c(rep('Faces_Medium', 3), rep('Faces_Plus', 3)))
d.pred$hand.look <- predict(model, d.pred)
ggplot(d, aes(y=hand.look, x=age, color=condition)) +
geom_point(stat = "identity") +
geom_line(data=d.pred)
What do you conclude from this pattern of data?
There is a steeper slope on the amount of time children in the Faces_Plus condition followed hands compared to the children in Faces_Medium condition. We could summise that the more complex scenes grabbed the attention of children over 10 months more than the simple scene with the white background.
The goal here is to learn to use LMEMs using lme4 and to compare them to standard by subject, by item LMs, as well as the standard (no repeated measures) fixed effects GLM.
The dataset here is from Stiller, Goodman, & Frank (2014), a paper on children’s pragmatic inferences. We saw the paradigm in the counterbalancing lecture: it’s three faces: a smiley, a smiley with glasses, and a smiley with a hat and glasses. When told “my friend has glasses” do kids pick the one with the glasses and no hat? age.group is the kids’ age group, condition is either “label,” described above, or “no label,” which was a control condition in which kids picked without hearing the term “glasses” at all.
d <- read.csv("http://langcog.stanford.edu/scales_data.csv")
d$age.group <- factor(d$age.group)
Always begin with a histogram!
ggplot(d, aes(x=age, fill=condition)) +
geom_histogram() +
facet_wrap(~correct)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Start out by setting up a function for a 95% CI using the normal approximation.
sem <- function(x) {sd(x, na.rm=TRUE) / sqrt(length(x))}
ci95.norm <- function(x) {
c(mean(x) - (sem(x) * 1.96),
mean(x) + (sem(x) * 1.96))
#sem(x) * 1.96
}
But the number of participants in a group is likely to be < 30, so let’s also compute this with a t distribution.
ci95.t <- function(x) {
t.test(x)$conf.int[1:2]
}
On the other hand, maybe we should use bootstrap CIs because these are actually proportions, and the normal/t approximations don’t know that they are 0/1 bounded.
library(boot)
library(bootstrap)
Take a look at ?boot and ?bootci. Note that the syntax for the boot library is terrible, so we’re going to use it to check some code that I use:
theta <- function(x,xdata,na.rm=T) {mean(xdata[x],na.rm=na.rm)}
ci.low <- function(x,na.rm=T) {
mean(x,na.rm=na.rm) -
quantile(bootstrap(1:length(x),
10000,theta,x,na.rm=na.rm)$thetastar,.025,na.rm=na.rm)}
ci.high <- function(x,na.rm=T) {
quantile(bootstrap(1:length(x),
10000,theta,x,na.rm=na.rm)$thetastar,.975,na.rm=na.rm) -
mean(x,na.rm=na.rm)}
Now make 95% confidence intervals across participants using all the methods above:
boot.cicis <- data.frame()
norm <- ci95.norm(d$correct)
cis <- bind_rows(cis, data.frame(
type = "ci95.norm",
low = norm[1],
high = norm[2]))
t <- ci95.t(d$correct)
cis <- bind_rows(cis, data.frame(
type = "ci95.t",
low = t[1],
high = t[2]))
## Warning in rbind_all(x, .id): Unequal factor levels: coercing to character
boot.mean <- function(x, i) { mean(x[i]) }
b <- boot.ci(
boot(d$correct, boot.mean, R=10000))
## Warning in boot.ci(boot(d$correct, boot.mean, R = 10000)): bootstrap
## variances needed for studentized intervals
cis <- bind_rows(cis, data.frame(
type = "boot.ci",
low = b$normal[2],
high = b$normal[3]))
cis <- bind_rows(cis, data.frame(
type = "ci.mfc",
low = ci.low(d$correct),
high = ci.high(d$correct)))
Now plot the data by age and condition using dplyr and ggplot2. Plot some CIs on here - extra credit if you plot all of them and compare visually (you’ll need position = position_dodge() or some other way to offset them).
ggplot(cis) +
geom_hline(yintercept=mean(d$correct)) +
geom_errorbar(aes(ymax=high, ymin=low, x=type))
What do you conclude about confidence interval computation?
There are multiple ways to compute confidence intervals (or approximate it). Bootstraping syntax is confusing and the confidence interval of boostraped ci’s gives a difference from the mean instead.
OK, now do a basic GLM over the entire data frame, including age.group and condition and their interaction. (If we were focusing on developmental issues, I would ask you to think about how to model age here, but let’s treat it as three discrete groups for now).
NOTE: this model is not appropriate, because it assumes that each subject’s observations are independent from one another. It’s still fine to do the analysis, though: it can tell you a lot about the data and is easy and fast to fit, as long as you know that you can’t trust the p-values!
model <- glm(correct ~ age.group * condition * trial, data=d)
summary(model)
##
## Call:
## glm(formula = correct ~ age.group * condition * trial, data = d)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.92308 -0.25000 -0.08333 0.33333 0.91667
##
## Coefficients:
## Estimate Std. Error t value
## (Intercept) 0.600000 0.085413 7.025
## age.group3 0.323077 0.119625 2.701
## age.group4 0.275000 0.122044 2.253
## conditionNo Label -0.308333 0.122044 -2.526
## trialfaces 0.040000 0.120792 0.331
## trialhouses -0.120000 0.120792 -0.993
## trialpasta -0.040000 0.120792 -0.331
## age.group3:conditionNo Label -0.489744 0.171782 -2.851
## age.group4:conditionNo Label -0.358333 0.173475 -2.066
## age.group3:trialfaces -0.347692 0.169176 -2.055
## age.group4:trialfaces -0.206667 0.172596 -1.197
## age.group3:trialhouses -0.303077 0.169176 -1.791
## age.group4:trialhouses -0.088333 0.172596 -0.512
## age.group3:trialpasta -0.075385 0.169176 -0.446
## age.group4:trialpasta -0.001667 0.172596 -0.010
## conditionNo Label:trialfaces -0.081667 0.172596 -0.473
## conditionNo Label:trialhouses -0.005000 0.172596 -0.029
## conditionNo Label:trialpasta -0.001667 0.172596 -0.010
## age.group3:conditionNo Label:trialfaces 0.472692 0.242936 1.946
## age.group4:conditionNo Label:trialfaces 0.123333 0.245330 0.503
## age.group3:conditionNo Label:trialhouses 0.636410 0.242936 2.620
## age.group4:conditionNo Label:trialhouses 0.005000 0.245330 0.020
## age.group3:conditionNo Label:trialpasta 0.283718 0.242936 1.168
## age.group4:conditionNo Label:trialpasta 0.043333 0.245330 0.177
## Pr(>|t|)
## (Intercept) 6.2e-12 ***
## age.group3 0.00713 **
## age.group4 0.02462 *
## conditionNo Label 0.01180 *
## trialfaces 0.74066
## trialhouses 0.32092
## trialpasta 0.74066
## age.group3:conditionNo Label 0.00452 **
## age.group4:conditionNo Label 0.03932 *
## age.group3:trialfaces 0.04032 *
## age.group4:trialfaces 0.23165
## age.group3:trialhouses 0.07375 .
## age.group4:trialhouses 0.60900
## age.group3:trialpasta 0.65606
## age.group4:trialpasta 0.99230
## conditionNo Label:trialfaces 0.63628
## conditionNo Label:trialhouses 0.97690
## conditionNo Label:trialpasta 0.99230
## age.group3:conditionNo Label:trialfaces 0.05218 .
## age.group4:conditionNo Label:trialfaces 0.61536
## age.group3:conditionNo Label:trialhouses 0.00904 **
## age.group4:conditionNo Label:trialhouses 0.98375
## age.group3:conditionNo Label:trialpasta 0.24335
## age.group4:conditionNo Label:trialpasta 0.85986
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for gaussian family taken to be 0.182385)
##
## Null deviance: 145.37 on 587 degrees of freedom
## Residual deviance: 102.87 on 564 degrees of freedom
## AIC: 693.61
##
## Number of Fisher Scoring iterations: 2
Let’s now use dplyr to get data frames for by-items and by-subjects analyses. These should contain means for age and condition across items.
d.items <- d %>%
group_by(age.group, trial, condition) %>%
summarise(correct = mean(correct, na.rm=TRUE))
d.subjects <- d %>%
group_by(subid, age.group, condition) %>%
summarise(correct = mean(correct, na.rm=TRUE))
Now do standard linear models on each of these.
NOTE: These are not strictly correct either because of the normal approximation on percent correct (model doesn’t know it’s 0 - 1 bounded and could give you standard error that goes above 1). Again, useful to do and see what happens.
# by-item
ggplot(d.items, aes(x=age.group, y=correct, color=condition)) +
geom_point() +
facet_wrap(~trial)
model.items <- lm(correct ~ age.group * condition + trial,
d.items)
summary(model.items)
##
## Call:
## lm(formula = correct ~ age.group * condition + trial, data = d.items)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.17567 -0.03765 -0.00554 0.03459 0.17873
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.63109 0.05758 10.960 1.48e-08 ***
## age.group3 0.14154 0.06649 2.129 0.050256 .
## age.group4 0.20083 0.06649 3.021 0.008605 **
## conditionNo Label -0.33042 0.06649 -4.969 0.000168 ***
## trialfaces -0.08628 0.05429 -1.589 0.132837
## trialhouses -0.14607 0.05429 -2.691 0.016772 *
## trialpasta -0.01201 0.05429 -0.221 0.827920
## age.group3:conditionNo Label -0.14154 0.09403 -1.505 0.153025
## age.group4:conditionNo Label -0.31542 0.09403 -3.354 0.004346 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.09403 on 15 degrees of freedom
## Multiple R-squared: 0.9236, Adjusted R-squared: 0.8828
## F-statistic: 22.65 on 8 and 15 DF, p-value: 4.833e-07
# by-subject
ggplot(d.subjects, aes(x=age.group, y=correct, color=condition)) +
geom_jitter()
model.subjects <- lm(correct ~ age.group * condition,
d.subjects)
summary(model.subjects)
##
## Call:
## lm(formula = correct ~ age.group * condition, data = d.subjects)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.57000 -0.21154 0.01042 0.18000 0.51042
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.57000 0.04420 12.897 < 2e-16 ***
## age.group3 0.14154 0.06190 2.287 0.023714 *
## age.group4 0.20083 0.06315 3.180 0.001810 **
## conditionNo Label -0.33042 0.06315 -5.232 5.95e-07 ***
## age.group3:conditionNo Label -0.14154 0.08889 -1.592 0.113560
## age.group4:conditionNo Label -0.31542 0.08977 -3.514 0.000594 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.221 on 141 degrees of freedom
## Multiple R-squared: 0.5737, Adjusted R-squared: 0.5586
## F-statistic: 37.96 on 5 and 141 DF, p-value: < 2.2e-16
Do ANOVA on these. Note that ANOVA doesn’t let you figure out what is going on with respect to age.
anova(model.items)
## Analysis of Variance Table
##
## Response: correct
## Df Sum Sq Mean Sq F value Pr(>F)
## age.group 2 0.02035 0.01018 1.1510 0.34275
## condition 1 1.39820 1.39820 158.1380 2.273e-09 ***
## trial 3 0.08398 0.02799 3.1661 0.05536 .
## age.group:condition 2 0.09984 0.04992 5.6458 0.01486 *
## Residuals 15 0.13262 0.00884
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(model.subjects)
## Analysis of Variance Table
##
## Response: correct
## Df Sum Sq Mean Sq F value Pr(>F)
## age.group 2 0.1461 0.0731 1.4962 0.227519
## condition 1 8.5172 8.5172 174.4023 < 2.2e-16 ***
## age.group:condition 2 0.6047 0.3024 6.1912 0.002647 **
## Residuals 141 6.8859 0.0488
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
On to linear mixed effect models. Create the maximal random effects model a la Barr et al. (2013). Does it converge? If not, what will you do to make it converge? (The internet can be your friend here).
HINT: make sure that you consider which random effects are appropriate. Consider which observations are within/between subjects. E.g. having a random coefficient for age by subject doesn’t make sense, because each subject has only one age.
# This model doesn't converge:
model.max <- lmer(correct ~ age.group * condition + (1 | subid) + (1 + age.group * condition | trial), d)
summary(model.max)
## Linear mixed model fit by REML ['lmerMod']
## Formula: correct ~ age.group * condition + (1 | subid) + (1 + age.group *
## condition | trial)
## Data: d
##
## REML criterion at convergence: 683.1
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.0974 -0.6082 -0.1870 0.7663 2.1468
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## subid (Intercept) 0.0054717 0.07397
## trial (Intercept) 0.0012682 0.03561
## age.group3 0.0207459 0.14403 0.95
## age.group4 0.0033931 0.05825 0.91 0.99
## conditionNo Label 0.0001531 0.01237 0.78 0.93
## age.group3:conditionNo Label 0.0641521 0.25328 -0.97 -1.00
## age.group4:conditionNo Label 0.0001336 0.01156 -0.72 -0.90
## Residual 0.1734581 0.41648
##
##
##
##
##
## 0.97
## -0.98 -0.90
## -0.95 -1.00 0.86
##
## Number of obs: 588, groups: subid, 147; trial, 4
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 0.57000 0.04765 11.962
## age.group3 0.14154 0.09496 1.490
## age.group4 0.20083 0.06955 2.888
## conditionNo Label -0.33042 0.06346 -5.207
## age.group3:conditionNo Label -0.14154 0.15472 -0.915
## age.group4:conditionNo Label -0.31542 0.08995 -3.506
##
## Correlation of Fixed Effects:
## (Intr) ag.gr3 ag.gr4 cndtNL a.3:NL
## age.group3 -0.162
## age.group4 -0.447 0.611
## conditnNLbl -0.618 0.393 0.482
## ag.grp3:cNL -0.033 -0.879 -0.517 -0.478
## ag.grp4:cNL 0.438 -0.273 -0.663 -0.705 0.332
# This does and removes subject as a random effect, but keeps age * condition per trial.
model.max <- lmer(correct ~ age.group * condition + (1 + age.group * condition | trial), d)
## Warning in optwrap(optimizer, devfun, getStart(start, rho$lower, rho$pp), :
## convergence code 1 from bobyqa: bobyqa -- maximum number of function
## evaluations exceeded
summary(model.max)
## Linear mixed model fit by REML ['lmerMod']
## Formula: correct ~ age.group * condition + (1 + age.group * condition |
## trial)
## Data: d
##
## REML criterion at convergence: 683.8
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.1204 -0.6150 -0.2189 0.7858 2.1460
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## trial (Intercept) 1.195e-03 0.034568
## age.group3 2.055e-02 0.143350 1.00
## age.group4 3.250e-03 0.057009 1.00 1.00
## conditionNo Label 1.245e-04 0.011158 1.00 1.00
## age.group3:conditionNo Label 6.391e-02 0.252800 -1.00 -1.00
## age.group4:conditionNo Label 9.943e-05 0.009971 -1.00 -1.00
## Residual 1.788e-01 0.422862
##
##
##
##
## 1.00
## -1.00 -1.00
## -1.00 -1.00 1.00
##
## Number of obs: 588, groups: trial, 4
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 0.57000 0.04568 12.478
## age.group3 0.14154 0.09298 1.522
## age.group4 0.20083 0.06681 3.006
## conditionNo Label -0.33042 0.06068 -5.445
## age.group3:conditionNo Label -0.14154 0.15235 -0.929
## age.group4:conditionNo Label -0.31542 0.08603 -3.666
##
## Correlation of Fixed Effects:
## (Intr) ag.gr3 ag.gr4 cndtNL a.3:NL
## age.group3 -0.129
## age.group4 -0.424 0.617
## conditnNLbl -0.610 0.388 0.480
## ag.grp3:cNL -0.057 -0.887 -0.530 -0.471
## ag.grp4:cNL 0.433 -0.268 -0.660 -0.705 0.327
## convergence code: 1
How do these coefficients compare with the independent coefficients linear model? What do you conclude?
The coefficient estimates are completely identical. Aside from the random effects, the fixed effects might be computed using the same linear model.
Which random effects make the most difference? Find out using ranef. Plot the random effects for subject and item.
r <- ranef(model.max, condVar=T)
lattice::dotplot(r)
## $trial
Make the minimal random effects model with just a subject intecept. How does this compare?
model.min <- lmer(correct ~ (1 | subid) + 0, d)
summary(model.min)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: correct ~ (1 | subid) + 0
## Data: d
##
## AIC BIC logLik deviance df.resid
## 955.6 964.3 -475.8 951.6 586
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.4938 -0.4980 0.0000 0.8431 1.8390
##
## Random effects:
## Groups Name Variance Std.Dev.
## subid (Intercept) 0.2642 0.5140
## Residual 0.1831 0.4279
## Number of obs: 588, groups: subid, 147
lattice::dotplot(ranef(model.min, condVar=T))
## $subid
There is much more variability when there are no modelled effects. The subid has to account for all the random effects in the data.
Get significance values for the coefficient on the age*condition interaction using the anova, likelihood-ratio test approach, comparing between your semi-maximal model and one without an intercept.
anova(model.max, model.min)
## refitting model(s) with ML (instead of REML)
## Data: d
## Models:
## model.min: correct ~ (1 | subid) + 0
## model.max: correct ~ age.group * condition + (1 + age.group * condition |
## model.max: trial)
## Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## model.min 2 955.59 964.34 -475.79 951.59
## model.max 28 715.04 837.59 -329.52 659.04 292.55 26 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1