hist(freeResponseOnly$mean_StudyTime)
hist(freeResponseOnly$mean_MU)
hist(freeVtotal$mean_MU)
If kids have to check a shopping list that tells them what items to select from a virtual store, and there is an annoying 4-second delay every time they want to check their list, children will be incentivized to make fewer trips. This can be accomplished by selecting more correct items per trip. To accomplish this, children will have to attempt to encode more items in memory, by spending more time encoding list items. We predict, then, that the time children spend studying the list will be longer in the Long-delay conditions (when there is a 4-second delay to access the list) and shorter in the No-delay conditions (when there is no delay to access the list).
Relatedly, we predict that children will succeed at encoding more items when they attempt to do so, leading to more correct responses as costs increase. We predict, then, that Memory Usage (MU, calculated as a difference score, with corrections for guessing) will be larger in the Long-delay conditions and smaller in the No-delay conditions.
When allowed to choose how many attempts to make, we expect that children will tend to be conservative and therefore underperform. Given this, if children are ‘forced’ to respond despite their risk aversion, memory usage will increase. We predict then that MU in total-response conditions will be higher than MU in free-response conditions.
The Shopping Game is a tablet-based, naturalistic, child-friendly memory paradigm. In this game, 74 5.5-7.0 year-old children were asked to pick items from a store based on a shopping list of 10 items. The store contained 20 items (10 of them from the list), arranged in two-alternative forced choice (2AFC) pairs on shelves. Importantly, the store and the list were not visible on the screen simultaneously; instead, participants could toggle between them by tapping an icon. The key manipulation was the accessibility of the list.
The experiment had four conditions: Two Delay Cost conditions (Long-delay (4 s) versus No-delay (0 s)) and two Response Type conditions (Free-response versus Total-response). No-delay and Long-delay condition trials are identical except for the imposition of a 4 s waiting period between a child touching the icon to make a trip to their shopping list and the list actually appearing. In the Free-response condition trials, children are free to attempt as many of the 2AFC shelves as they like on each visit to the store and may make as many trips back to the list as they like. Total-response condition trials are always the last (4th) trial in their respective block and will come as a surprise to the children. When a Total-response trial begins, it doesn’t appear any different from the previous three Free-response trials: after the child makes their first trip to the list and returns to the store, they will make their selections as usual. However, when the child taps the icon to revisit the list, a “closing” sign appears on the left of the screen, and the experimenter will explain to the child that the store is closing now, and they must continue to choose items from each of the 2AFC pairs (based on whatever memory they have from their first (and only) trip). The Total-response conditions will therefore allow us to assess whether children are using the entire contents of their memory when they make a trip between the list and the shelf. This is a way to estimate how children respond based on memories that they may be less confident in.
There will be two blocks of trials, blocked by condition. A session will begin with either a No-delay block or a Long-delay block (counterbalanced between children. Each block will consist of one practice trial, three Free-response trials, and a fourth, final Total-response trial where children are given only one opportunity to sample the list and, critically, must make a selection for all of the 10 2AFC shelves. Since there are only two Total-response trials and they occur after three Free-response trials in both No-delay and Long-delay conditions, we do not expect that children will anticipate their appearance.
All children will complete both the blocked No-delay and Long-delay conditions; order will be counterbalanced between children. The Total-response conditions will always be the last trial in each of the two blocks.
We will recruit typically developing children between the ages of 5.5 and 7.0 years. We will recruit and test children from local children’s museums, daycares, and libraries. Children will receive a small toy and some stickers as a gift for participating.
Based on power analyses calculated using G*Power 3.1 for multiple linear regression with two predictors, we determined the minimum sample size to be 68 children based on a medium effect size of f2 = 0.15, alpha = 0.05, and 80% power.
We will stop data collection when we have collected data from 78 children, to allow for unexpected exclusions after data collection has been completed. We will include all children who met the inclusion criteria. The final sample size after exclusions was 74 children.
Outliers and Exclusions:
We will conduct an outlier analysis on the log of Study Time (the amount of time the child chooses to spend on the shopping list before making a visit to the store). Log-transformed Study times more than 3.0 MAD from the median (Leys et al., 2013) will be excluded from the analysis. If such an exclusion results in no trips for a given trial, the trial will be excluded.
Leys, C., Ley, C., Klein, O., Bernard, P., & Licata, L. (2013). Detecting outliers: Do not use standard deviation around the mean, use absolute deviation around the median.
Hypotheses will be tested using Generalized Linear Mixed-Effect Models (Lo & Andrews, 2015) with Condition as a fixed effect and child (participant) as a random intercept, and with appropriate linking functions. The model for Study Time will have an inverse Gaussian linking function because of its similar distribution to reaction times (Lo & Andrews, 2015). The model for Memory Usage will have an identity linking function because based on the results of our previous studies, we expect it to follow a normal distribution. If there are substantive violations of the assumptions of these analyses, we will corroborate results with analogous non-parametric tests.
We will be using an alpha of 0.05 to determine the significance of factors in our generalized linear mixed-effects models.
Lo, S., & Andrews, S. (2015). To transform or not to transform: using generalized linear mixed models to analyze reaction time data. Frontiers in Psychology, 6, 1171.
Technically there is a preprocessing step that comes before this, but its kind of long and I don’t think its necessary to include in the project (its just about condensing individual participant csv files into one master file, and in the meantime it also calculates correct responses + means across trials for the variables of interest).
Here we read in the data after its initial preprocessing step, but we have to combine it with the csv that contains children’s ages and properly name one of the conditions, and then we get an average study time for each condition for each child, since we preregistered modeling it that way. Although a glm could also model multiple rows per condition per trial, I am not aware that doing it that way would be substantially “better”, so I will stick to the plan.
We should check the histograms at least, to make sure its distribution does indeed look like an inverse-gaussian (response time distribution) for study time, and normal distributions for Memory usage.
hist(freeResponseOnly$mean_StudyTime)
hist(freeResponseOnly$mean_MU)
hist(freeVtotal$mean_MU)
The memory usage histograms are not as normal as expected. We may have some issues in our linear models.
Hypotheses will be tested using Generalized Linear Mixed-Effect Models (Lo & Andrews, 2015) with Condition as a fixed effect and child (participant) as a random intercept, and with appropriate linking functions. The model for Study Time will have an inverse Gaussian linking function because of its similar distribution to reaction times (Lo & Andrews, 2015).
Lo, S., & Andrews, S. (2015). To transform or not to transform: using generalized linear mixed models to analyze reaction time data. Frontiers in Psychology, 6, 1171.
<- glmer(mean_StudyTime ~ condsFile + (1|participant),
studyTimeGlmm data = freeResponseOnly,
family = inverse.gaussian(link = "identity"))
Before we look at the summary, lets do some model checks to make sure this worked OK.
I like using this plot! It shows the actual values as a smoothed histogram and then the model predicted values in red. We want the lines to somewhat match up, but we won’t get a perfect match - as long as it’s in the same ballpark.
plot(density(freeResponseOnly$mean_StudyTime),
xlim=c(0, 160), ylim=c(0, 1), main='M_1 y_hat')
lines(density(predict(studyTimeGlmm, type='response')), col='red')
That looks great!
We are looking for random, even scatter here.
plot(studyTimeGlmm)
I wouldn’t say this scatter is super random. The errors are getting larger as the fitted values increase.
In the QQ plot, we want the dots to generally follow the diagonal line.
<- residuals(studyTimeGlmm)
residuals qqnorm(residuals)
qqline(residuals, col = "red")
That looks great.
Now let’s run check_model from the Performance package.
library(performance)
check_model(studyTimeGlmm)
Loading required namespace: statmod
Simulations from your fitted model produce infinite values. Consider if this is sensible
Cannot simulate residuals for models of class `glmerMod`. Please try
`check_model(..., residual_type = "normal")` instead.
There is a bit of wiggliness here, but overall things look OK.
Now we will compare our model with one that has no fixed effects. We want our full model to fit better than the null.
<- glmer(mean_StudyTime ~ 1 + (1|participant),
mixConTime_GLMM_null data = freeResponseOnly,
family = inverse.gaussian(link = "identity"))
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
Model failed to converge with max|grad| = 0.0721721 (tol = 0.002, component 1)
Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, : Model is nearly unidentifiable: very large eigenvalue
- Rescale variables?
anova(mixConTime_GLMM_null,studyTimeGlmm)
Data: freeResponseOnly
Models:
mixConTime_GLMM_null: mean_StudyTime ~ 1 + (1 | participant)
studyTimeGlmm: mean_StudyTime ~ condsFile + (1 | participant)
npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
mixConTime_GLMM_null 3 625.04 634.05 -309.52 619.04
studyTimeGlmm 4 618.61 630.62 -305.30 610.61 8.4319 1 0.003687
mixConTime_GLMM_null
studyTimeGlmm **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The full model does fit better! The AIC in the full model is lower, and its significantly a better fit according to anova().
There was a little bit of patterning in the residuals in the residuals vs. fitted plot, but the qqplot, KDE plot, and check_model results looks overall AOK. We also know that our fixed effect of condition is doing a good job as removing it significantly reduces the fit of the model. I don’t think doing a transformation is warranted.
summary(studyTimeGlmm)
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: inverse.gaussian ( identity )
Formula: mean_StudyTime ~ condsFile + (1 | participant)
Data: freeResponseOnly
AIC BIC logLik deviance df.resid
618.6 630.6 -305.3 610.6 145
Scaled residuals:
Min 1Q Median 3Q Max
-1.74842 -0.54638 -0.08409 0.47357 2.28652
Random effects:
Groups Name Variance Std.Dev.
participant (Intercept) 3.08496 1.7564
Residual 0.02398 0.1548
Number of obs: 149, groups: participant, 75
Fixed effects:
Estimate Std. Error t value Pr(>|z|)
(Intercept) 6.5354 0.3921 16.667 < 0.0000000000000002 ***
condsFilenoDelay -0.6350 0.2174 -2.921 0.00348 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr)
condsFlnDly -0.276
# get overall p for condition
<- cbind(car::Anova(studyTimeGlmm),
mixTimeConAov r.squaredGLMM(studyTimeGlmm,mixConTime_GLMM_null))
Warning: 'r.squaredGLMM' now calculates a revised statistic. See the help page.
mixTimeConAov
Chisq Df Pr(>Chisq) R2m R2c
condsFile 8.535085 1 0.00348365 0.03037668 0.9536573
From Ben Bolker’s glm FAQ (quoting Henrik Singmann):
“The fact that calculating a global measure of model fit (such as R2) is already riddled with complications and that no simple single number can be found, should be a hint that doing so for a subset of the model parameters (i.e., main-effects or interactions) is even more difficult. Given this, I would not recommend to try finding a measure of standardized effect sizes for mixed models.”
[1] "Standard deviation for study time in No-Delay conditon:"
[1] 2.481185
[1] "Standard deviation for study time in Long-Delay conditon:"
[1] 3.627854
Study time showed a significant effect of condition (X2 = 8.54, p = 0.003), with children studying the list longer as a function of increasing access costs across conditions. Study time was longest in the Long-delay condition (M = 6.43, SD = 3.63), and it was significantly longer than study time in the No-delay condition (M = 5.31, SD = 2.48).
<- lmer(mean_MU ~ condsFile + (1|participant),
MU_GLMM data = freeResponseOnly)
plot(density(freeResponseOnly$mean_MU), xlim=c(-50, 50), ylim=c(0, 0.6),
main='M_1 y_hat')
lines(density(predict(MU_GLMM, type='response') ), col='red')
This isn’t so bad - the predicted line is in the ballpark, but it does shoot up higher than the actual data.
plot(MU_GLMM)
This is pretty clear heteroskedasticity.
<- residuals(MU_GLMM)
residuals qqnorm(residuals)
qqline(residuals, col = "red")
The tail at the end wiggles up.
library(performance)
check_model(MU_GLMM)
Our model is not fitting perfectly. The data has negative values though, so we can’t just use a gamma or negative binom distribution in a glm without doing some transformation to deal with the negatives. Let’s try adding a constant to MU and then using a gamma distribution.
$MU_add_1 <- freeResponseOnly$mean_MU + 1
freeResponseOnlyhist(freeResponseOnly$MU_add_1)
<- glmer(MU_add_1 ~ condsFile + (1|participant),
MU_add1_GLM data = freeResponseOnly, family = Gamma(link = "log"))
# KDE
plot(density(freeResponseOnly$MU_add_1), xlim=c(-50, 50), ylim=c(0, 0.6),
main='M_1 y_hat')
lines(density(predict(MU_add1_GLM, type='response') ), col='red')
<- residuals(MU_add1_GLM)
residuals qqnorm(residuals)
qqline(residuals, col = "red")
plot(MU_add1_GLM)
# Eh- not much better
check_model(MU_add1_GLM)
I feel like that is not much better.
<- lmer(mean_MU ~ 1 + (1|participant),
MUfree_null data = freeResponseOnly)
<- glmer(MU_add_1 ~ 1 + (1|participant),
MU_add1_null data = freeResponseOnly, family = Gamma(link = "log"))
print("Below is for non-transformed data, Gaussian")
[1] "Below is for non-transformed data, Gaussian"
anova(MUfree_null,MU_GLMM)
refitting model(s) with ML (instead of REML)
Data: freeResponseOnly
Models:
MUfree_null: mean_MU ~ 1 + (1 | participant)
MU_GLMM: mean_MU ~ condsFile + (1 | participant)
npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
MUfree_null 3 441.31 450.32 -217.65 435.31
MU_GLMM 4 440.48 452.50 -216.24 432.48 2.8228 1 0.09293 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
print("Below is for transformed, Gamma")
[1] "Below is for transformed, Gamma"
anova(MU_add1_null,MU_add1_GLM)
Data: freeResponseOnly
Models:
MU_add1_null: MU_add_1 ~ 1 + (1 | participant)
MU_add1_GLM: MU_add_1 ~ condsFile + (1 | participant)
npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
MU_add1_null 3 366.65 375.66 -180.32 360.65
MU_add1_GLM 4 364.78 376.80 -178.39 356.78 3.8656 1 0.04928 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Even though the transformed data did not seem much better, it fit better than the null model while for the non-transformed data, the null model fit better than the full. I wonder if this means that the transformed model indeed is providing a better fit to the data.
summary(MU_add1_GLM)
Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) [glmerMod]
Family: Gamma ( log )
Formula: MU_add_1 ~ condsFile + (1 | participant)
Data: freeResponseOnly
AIC BIC logLik deviance df.resid
364.8 376.8 -178.4 356.8 145
Scaled residuals:
Min 1Q Median 3Q Max
-2.4888 -0.5178 -0.0931 0.4913 3.3170
Random effects:
Groups Name Variance Std.Dev.
participant (Intercept) 0.05277 0.2297
Residual 0.08150 0.2855
Number of obs: 149, groups: participant, 75
Fixed effects:
Estimate Std. Error t value Pr(>|z|)
(Intercept) 0.98810 0.04967 19.892 <0.0000000000000002 ***
condsFileNo-Delay -0.08231 0.04156 -1.981 0.0476 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr)
cndsFlN-Dly -0.424
<- cbind(car::Anova(MU_add1_GLM),
MUfreeAov r.squaredGLMM(MU_add1_GLM,MU_add1_null))
Warning in data.frame(..., check.names = FALSE): row names were found from a
short variable and have been discarded
MUfreeAov
Chisq Df Pr(>Chisq) R2m R2c
1 3.922957 1 0.04763103 0.01254072 0.4006480
2 3.922957 1 0.04763103 0.01283819 0.4101518
3 3.922957 1 0.04763103 0.01223383 0.3908438
# Calculate means and standard errors
<- tapply(freeResponseOnly$mean_MU, freeResponseOnly$condsFile, mean)
means
# Calculate standard deviations and sample sizes for each condition
<- tapply(freeResponseOnly$mean_MU, freeResponseOnly$condsFile, sd)
sds <- tapply(freeResponseOnly$mean_MU, freeResponseOnly$condsFile, length)
n_per_condition
# Calculate standard errors
<- sds / sqrt(n_per_condition)
se
# Define the x-axis levels (to match the ggplot2 ordering)
<- c("No-Delay", "Long-Delay")
conds_levels <- 1:length(conds_levels)
x_positions
# Create an empty plot
plot(
x_positions, means[conds_levels], type = "n", xaxt = "n", ylim = c(-1, 7.0), xlim = c(0.5, 2.5),
xlab = "", ylab = "Mean Memory Usage (items)",
main = "",
cex.lab = 1.5,
cex.axis = 1.5,
cex.main = 1.5
)
<- unique(freeResponseOnly$participant)
participants
for (p in participants) {
<- freeResponseOnly[freeResponseOnly$participant == p, ]
this_participant if (nrow(this_participant) == 2) {
# Get the x positions for this participant’s conditions
<- match(this_participant$condsFile, conds_levels)
x_vals <- this_participant$mean_MU
y_vals lines(x_vals, y_vals, col = adjustcolor("#009E73", alpha.f = 0.3), lwd = 1)
}
}
# Add x-axis labels normally
axis(1, at = x_positions, labels = conds_levels, cex.axis = 1.5)
# Add points for each participant
for (i in 1:length(conds_levels)) {
<- freeResponseOnly[freeResponseOnly$condsFile == conds_levels[i], ]
condition_data points(rep(x_positions[i], nrow(condition_data)), condition_data$mean_MU,
pch = 1, col = "#0072B2")
}
# Add points for the means
points(x_positions, means[conds_levels], pch = 10, col = "black")
# Add a solid line connecting the means
lines(x_positions, means[conds_levels], col = "black", lwd = 2)
# Add horizontal segments at the means (optional, similar to ggplot2's stat_summary)
segments(
- 0.1, means[conds_levels],
x_positions + 0.1, means[conds_levels],
x_positions col = "black", lwd = 3.0
)
# Add mean labels for each condition
text(x_positions, means[conds_levels] + 2,
labels = round(means[conds_levels], 2), cex = 2.5, col = "black")
# Add error bars using arrows() with specified lower and upper bounds
# Function to compute standard error
<- function(x) sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))
se
# Get standard errors for each condition
<- tapply(freeResponseOnly$mean_MU, freeResponseOnly$condsFile, se)
ses
# Get means again, just to be sure
<- tapply(freeResponseOnly$mean_MU, freeResponseOnly$condsFile, mean)
means
# Define bounds for error bars
<- means[conds_levels] - 2*ses[conds_levels]
lower_bounds <- means[conds_levels] + 2*ses[conds_levels]
upper_bounds
# Add vertical error bars using arrows()
for (i in 1:length(x_positions)) {
arrows(
x0 = x_positions[i], y0 = lower_bounds[i],
x1 = x_positions[i], y1 = upper_bounds[i],
length = 0.1, angle = 90, code = 3, col = "#D55E00", lwd = 3.5
) }
print("Standard deviation for MU in No-Delay conditon:")
[1] "Standard deviation for MU in No-Delay conditon:"
sd(freeResponseOnly$mean_MU[freeResponseOnly$condsFile == "No-Delay"], na.rm = T)
[1] 0.9352832
print("Standard deviation for MU in Long-Delay conditon:")
[1] "Standard deviation for MU in Long-Delay conditon:"
sd(freeResponseOnly$mean_MU[freeResponseOnly$condsFile == "Long-Delay"], na.rm = T)
[1] 1.174552
Memory Usage showed a marginally significant effect of condition (X2 = 3.92, p = 0.048), with children using more memory as a function of increasing access costs across conditions. Memory Usage was larger in the Long-delay condition (M = 1.85, SD = 1.74) than Memory Usage in the No-delay condition (M = 1.60, SD = 0.94).
<- lmer(mean_MU ~ condType + (1|participant),
freeVtotal_glmm data = freeVtotal)
plot(density(freeVtotal$mean_MU), xlim=c(-50, 50), ylim=c(0, 0.6),
main='M_1 y_hat')
lines(density(predict(freeVtotal_glmm, type='response') ), col='red')
We might have the same problem here as with the other model for MU in the other conditions.
plot(freeVtotal_glmm)
<- residuals(freeVtotal_glmm)
residuals qqnorm(residuals)
qqline(residuals, col = "red")
check_model(freeVtotal_glmm)
The problem here is that yes the residuals have an insane pattern, but the data doesn’t obviously look like a gamma. It DOES look normal, but it just has a tall peak in the middle and tiny tails. Also there is negative values.
library(glmmTMB)
<- glmmTMB(mean_MU ~ condType + (1|participant),
MU_TMB data = freeVtotal, ziformula=~1,
family = gaussian())
plot(density(freeVtotal$mean_MU), xlim=c(-50, 50), ylim=c(0, 0.6),
main='M_1 y_hat')
lines(density(predict(MU_TMB, type='response') ), col='red')
<- residuals(MU_TMB)
residuals qqnorm(residuals)
qqline(residuals, col = "red")
# Have to do this by scratch...
<- residuals(MU_TMB, type = "pearson")
resid_vals <- fitted(MU_TMB)
fitted_vals
plot(fitted_vals, resid_vals,
xlab = "Fitted values", ylab = "Pearson residuals",
main = "Residuals vs Fitted (glmmTMB)")
abline(h = 0, lty = 2, col = "red")
I do not think that worked better.
<- lmer(mean_MU ~ 1 + (1|participant),
MUfree_null data = freeVtotal)
<- glmmTMB(mean_MU ~ 1 + (1|participant),
MUTMB_null data = freeVtotal, ziformula=~1,
family = gaussian())
anova(MUfree_null,freeVtotal_glmm)
refitting model(s) with ML (instead of REML)
Data: freeVtotal
Models:
MUfree_null: mean_MU ~ 1 + (1 | participant)
freeVtotal_glmm: mean_MU ~ condType + (1 | participant)
npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
MUfree_null 3 638.84 647.88 -316.42 632.84
freeVtotal_glmm 4 639.97 652.01 -315.98 631.97 0.8786 1 0.3486
anova(MUTMB_null,MU_TMB)
Data: freeVtotal
Models:
MUTMB_null: mean_MU ~ 1 + (1 | participant), zi=~1, disp=~1
MU_TMB: mean_MU ~ condType + (1 | participant), zi=~1, disp=~1
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
MUTMB_null 4 640.84 652.89 -316.42 632.84
MU_TMB 5 641.97 657.02 -315.98 631.97 0.8786 1 0.3486
Using a regular lmer and using glmTMB with the zero-inflation add-on did not make vastly different results, so I will go with the regular lmer because its simpler, but we should keep in mind when interpreting the results that the model is not an ideal fit.
summary(freeVtotal_glmm)
Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: mean_MU ~ condType + (1 | participant)
Data: freeVtotal
REML criterion at convergence: 634.2
Scaled residuals:
Min 1Q Median 3Q Max
-3.12767 -0.40792 0.03309 0.43621 2.86641
Random effects:
Groups Name Variance Std.Dev.
participant (Intercept) 0.9624 0.981
Residual 3.1610 1.778
Number of obs: 150, groups: participant, 75
Fixed effects:
Estimate Std. Error df t value Pr(>|t|)
(Intercept) 1.7156 0.2345 140.3543 7.317 0.0000000000179 ***
condTypeTotal 0.2711 0.2903 74.0000 0.934 0.353
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Correlation of Fixed Effects:
(Intr)
condTypeTtl -0.619
<- cbind(car::Anova(freeVtotal_glmm),
MUfreeAov r.squaredGLMM(freeVtotal_glmm,MUfree_null))
MUfreeAov
Chisq Df Pr(>Chisq) R2m R2c
condType 0.8719833 1 0.3504068 0.004466303 0.2368211
# Calculate means and standard errors
<- tapply(freeVtotal$mean_MU, freeVtotal$condType, mean)
means
# Calculate standard deviations and sample sizes for each condition
<- tapply(freeVtotal$mean_MU, freeVtotal$condType, sd)
sds <- tapply(freeVtotal$mean_MU, freeVtotal$condType, length)
n_per_condition
# Calculate standard errors
<- sds / sqrt(n_per_condition)
se
# Define the x-axis levels (to match the ggplot2 ordering)
<- c("Free", "Total")
conds_levels <- 1:length(conds_levels)
x_positions
# Create an empty plot
plot(
x_positions, means[conds_levels], type = "n", xaxt = "n", ylim = c(-5, 10.0), xlim = c(0.5, 2.5),
xlab = "", ylab = "Mean Memory Usage (items)",
main = "",
cex.lab = 1.5,
cex.axis = 1.5,
cex.main = 1.5
)
<- unique(freeVtotal$participant)
participants
for (p in participants) {
<- freeVtotal[freeVtotal$participant == p, ]
this_participant if (nrow(this_participant) == 2) {
# Get the x positions for this participant’s conditions
<- match(this_participant$condType, conds_levels)
x_vals <- this_participant$mean_MU
y_vals lines(x_vals, y_vals, col = adjustcolor("#009E73", alpha.f = 0.3), lwd = 1)
}
}
# Add x-axis labels normally
axis(1, at = x_positions, labels = conds_levels, cex.axis = 1.6)
# Add points for each participant
for (i in 1:length(conds_levels)) {
<- freeVtotal[freeVtotal$condType == conds_levels[i], ]
condition_data points(rep(x_positions[i], nrow(condition_data)), condition_data$mean_MU,
pch = 1, col = "#0072B2")
}
# Add points for the means
points(x_positions, means[conds_levels], pch = 10, col = "black")
# Add a solid line connecting the means
lines(x_positions, means[conds_levels], col = "black", lwd = 2)
# Add horizontal segments at the means (optional, similar to ggplot2's stat_summary)
segments(
- 0.1, means[conds_levels],
x_positions + 0.1, means[conds_levels],
x_positions col = "black", lwd = 3.0
)
# Add mean labels for each condition
text(x_positions, means[conds_levels] + 2,
labels = round(means[conds_levels], 2), cex = 2.5, col = "black")
# Add error bars using arrows() with specified lower and upper bounds
# Function to compute standard error
<- function(x) sd(x, na.rm = TRUE) / sqrt(sum(!is.na(x)))
se
# Get standard errors for each condition
<- tapply(freeVtotal$mean_MU, freeVtotal$condType, se)
ses
# Get means again, just to be sure
<- tapply(freeVtotal$mean_MU, freeVtotal$condType, mean)
means
# Define bounds for error bars
<- means[conds_levels] - 2*ses[conds_levels]
lower_bounds <- means[conds_levels] + 2*ses[conds_levels]
upper_bounds
# Add vertical error bars using arrows()
for (i in 1:length(x_positions)) {
arrows(
x0 = x_positions[i], y0 = lower_bounds[i],
x1 = x_positions[i], y1 = upper_bounds[i],
length = 0.1, angle = 90, code = 3, col = "#D55E00", lwd = 3.5
) }
print("Standard deviation for MU in Free-Response conditon:")
[1] "Standard deviation for MU in Free-Response conditon:"
sd(freeVtotal$mean_MU[freeVtotal$condType == "Free"], na.rm = T)
[1] 0.8484102
print("Standard deviation for MU in Total-Response conditon:")
[1] "Standard deviation for MU in Total-Response conditon:"
sd(freeVtotal$mean_MU[freeVtotal$condType == "Total"], na.rm = T)
[1] 2.74351
Children used the entire contents of their memory when they made a trip between the list and the shelf, as Memory Usage in the Total-response condition (M = 1.99, SD = 2.74) was larger than Memory Usage in the Free-response condition (M = 1.72, SD = 0.85), but this was non-significant (X2 = 0.87, p = 0.35).