This document reports code for my Technical Comment on Berkowitz et al. (2015). For a fuller analysis see this repository.
d <- read.csv("aac7427-Accessory-Data-File-S1.csv") %>%
mutate(condition = ifelse(cond.dum == 1, "Math App","Reading App"))
We need to exclude:
twin
)year1dropout
)We also see that we need to include 34 children due to experimenter errors (24 fall, 16 math and 8 reading; 10 in spring, 9 math and 1 reading). This is in the filter
variables (e.g., wjappliedfa13_filter
). These are binary variables that presumably could be conveying whether the children’s scores should be included because they get basal and ceiling items. We do this after excluding twins and dropouts.
sum(d$wjappliedfa13_filter[d$twin == 0 & d$year1dropout == 0] == 0)
## [1] 24
sum(d$WJ.applied.Sp14.filter[d$twin == 0 & d$year1dropout == 0] == 0)
## [1] 11
These numbers are very close to the paper’s numbers, though we do get 11 in the spring (not 10). Let’s do the same for reading, where there were 25 exclusions (7 in fall, 4 math and 3 reading; 18 in spring, 12 math and 6 reading).
sum(d$WJletterwordFa13_filter[d$twin == 0 & d$year1dropout == 0] == 0)
## [1] 7
sum(d$WJ.letterword.Sp14.filter[d$twin == 0 & d$year1dropout == 0] == 0)
## [1] 19
Again we’re one off (18 vs. 19 in spring), but quite close. Note there is a separate issue in exclusions for analyses that use math anxiety: where math anxiety questionnaire data is missing (~20% of cases), you have to exclude data. I’m assuming the authors just dropped non-responding families for these analyses.
dt.excl <- d %>%
mutate(applied.w.fa13 = ifelse(wjappliedfa13_filter,
WJ.Applied.Problems.W.Score.Fa13,
NA),
applied.w.sp14 = ifelse(WJ.applied.Sp14.filter,
WJ.Applied.Problems_W.Score_Sp14,
NA),
reading.w.fa13 = ifelse(WJletterwordFa13_filter,
WJ.Letter.Word.ID.W.Score.Fa13,
NA),
reading.w.sp14 = ifelse(WJ.letterword.Sp14.filter,
WJ.LetterWord.Wscore.Sp14,
NA)) %>%
filter(!twin, !year1dropout) %>%
gather(measure, score,
applied.w.fa13, applied.w.sp14,
reading.w.fa13, reading.w.sp14) %>%
separate(measure, c("measure","time"), sep="w\\.") %>%
mutate(year = as.factor(str_sub(time, 4, 5)),
measure = ifelse(grepl( "applied", measure), "Math", "Reading"))
Let’s detour briefly to check the means now that we are doing exclusions. Table S1 of the original paper gives means, albeit using a median split on math anxiety (e.g. not grand means).
Low anxious:
High anxious: 
dt.excl %>%
filter(measure == "Math") %>%
filter(!is.na(parentMA_mediansplit)) %>%
group_by(parentMA_mediansplit, condition, year) %>%
summarise(mean = mean(score, na.rm=TRUE),
sd = sd(score, na.rm=TRUE))
## Source: local data frame [8 x 5]
## Groups: parentMA_mediansplit, condition [?]
##
## parentMA_mediansplit condition year mean sd
## (dbl) (chr) (fctr) (dbl) (dbl)
## 1 1 Math App 3 462.7468 18.71507
## 2 1 Math App 4 478.4348 21.11036
## 3 1 Reading App 3 458.8571 18.08212
## 4 1 Reading App 4 475.3134 19.25602
## 5 2 Math App 3 459.0562 15.98633
## 6 2 Math App 4 474.3836 18.67978
## 7 2 Reading App 3 457.4800 13.56352
## 8 2 Reading App 4 469.2157 18.68402
We get only minor numerical differences between the data I’m using and the data reported by the authors (e.g., <= 1 W score unit).
Difference scores.
diffs.excl <- dt.excl %>%
select(ChildID, condition, year, measure, score) %>%
mutate(measure_year = interaction(measure, year)) %>%
select(-year, -measure) %>%
spread(measure_year, score) %>%
mutate(math_gain = Math.4 - Math.3,
reading_gain = Reading.4 - Reading.3) %>%
select(-Math.3, -Math.4, -Reading.3, -Reading.4) %>%
gather(measure, score, math_gain, reading_gain)
Do the whole thing with grade equivalents.
dt.excl.ge <- d %>%
mutate(applied.w.fa13 = ifelse(wjappliedfa13_filter,
WJ.Applied.Problems.GE.Fa13,
NA),
applied.w.sp14 = ifelse(WJ.applied.Sp14.filter,
WJ.Applied.Problem_GE_Sp14,
NA),
reading.w.fa13 = ifelse(WJletterwordFa13_filter,
WJ.Letter.Word.ID.GE.Fa13,
NA),
reading.w.sp14 = ifelse(WJ.letterword.Sp14.filter,
WJ.LetterWord_GE_Sp14,
NA)) %>%
filter(!twin, !year1dropout) %>%
gather(measure, score,
applied.w.fa13, applied.w.sp14,
reading.w.fa13, reading.w.sp14) %>%
separate(measure, c("measure","time"), sep="w\\.") %>%
mutate(year = as.factor(str_sub(time, 4, 5)),
measure = ifelse(grepl( "applied", measure), "Math", "Reading"))
diffs.excl.ge <- dt.excl.ge %>%
select(ChildID, condition, year, measure, score) %>%
mutate(measure_year = interaction(measure, year)) %>%
select(-year, -measure) %>%
spread(measure_year, score) %>%
mutate(math_gain = Math.4 - Math.3,
reading_gain = Reading.4 - Reading.3) %>%
select(-Math.3, -Math.4, -Reading.3, -Reading.4) %>%
gather(measure, score, math_gain, reading_gain)
And join back in other measures.
diffs.full <- left_join(diffs.excl, d %>%
select(ChildID, ParentMAaverage.Fa13,
parentMA_mediansplit,
avg.use, use.012.groups)) %>%
left_join(diffs.excl.ge %>% rename(score.ge = score))
And compute means for plotting.
ms.diffs.excl <- diffs.excl %>%
group_by(condition, measure) %>%
multi_boot_standard(column = "score", na.rm=TRUE)
ms.diffs.excl.ge <- diffs.excl.ge %>%
group_by(condition, measure) %>%
multi_boot_standard(column = "score", na.rm=TRUE) %>%
ungroup %>%
mutate(measure = factor(measure,
levels = c("math_gain","reading_gain"),
labels = c("WJ Applied Problems\n(Math)",
"WJ Letter-Word\n(Reading)")),
Condition = condition)
This section will now reproduce the analyses in the Technical Comment.
Figure 1.
ggplot(ms.diffs.excl.ge, aes(x = measure, y = mean,
ymin = ci_lower, ymax = ci_upper,
fill = Condition)) +
geom_bar(stat="identity", position = "dodge") +
geom_linerange(position = position_dodge(width = .9)) +
scale_fill_solarized() +
ylab("Improvement (Grade Equivalents)") +
xlab("Measure") + ylim(c(0,1))
\(t\)-tests for Figure 1.
kable(tidy(with(diffs.excl.ge,
t.test(score[condition == "Math App" &
measure == "math_gain"],
score[condition == "Reading App" &
measure == "math_gain"], var.equal = TRUE))))
estimate1 | estimate2 | statistic | p.value | parameter | conf.low | conf.high |
---|---|---|---|---|---|---|
0.852973 | 0.77 | 1.061383 | 0.2890304 | 498 | -0.0706193 | 0.2365653 |
kable(tidy(with(diffs.excl.ge,
t.test(score[condition == "Math App" &
measure == "reading_gain"],
score[condition == "Reading App" &
measure == "reading_gain"], var.equal = TRUE))))
estimate1 | estimate2 | statistic | p.value | parameter | conf.low | conf.high |
---|---|---|---|---|---|---|
0.8337731 | 0.8192308 | 0.2728889 | 0.7850497 | 507 | -0.0901546 | 0.1192392 |
Main longitudinal models (for paragraph beginning “first”). GE scores and then W scores.
kable(summary(lmer(score ~ condition * time
+ (1 | ChildID)
+ (time | TeachID.Year1),
data = filter(dt.excl.ge,
measure == "Math")))$coefficients, digits = 3)
Estimate | Std. Error | t value | |
---|---|---|---|
(Intercept) | 1.985 | 0.088 | 22.656 |
conditionReading App | -0.075 | 0.169 | -0.444 |
timesp14 | 0.842 | 0.051 | 16.650 |
conditionReading App:timesp14 | -0.072 | 0.098 | -0.733 |
kable(summary(lmer(score ~ condition * time
+ (1 | ChildID)
+ (time | TeachID.Year1),
data = filter(dt.excl,
measure == "Math")))$coefficients, digits = 3)
Estimate | Std. Error | t value | |
---|---|---|---|
(Intercept) | 458.914 | 1.735 | 264.453 |
conditionReading App | -1.518 | 3.356 | -0.452 |
timesp14 | 15.368 | 0.897 | 17.141 |
conditionReading App:timesp14 | -1.003 | 1.742 | -0.576 |
Three-way interaction models (for paragraph beginning “second”). GE and then W scores, as above.
kable(summary(lmer(score ~ condition * time * avg.use
+ (1 | ChildID)
+ (time | TeachID.Year1),
data = filter(dt.excl.ge,
measure == "Math")))$coefficients, digits = 3)
Estimate | Std. Error | t value | |
---|---|---|---|
(Intercept) | 1.770 | 0.103 | 17.184 |
conditionReading App | 0.068 | 0.199 | 0.340 |
timesp14 | 0.650 | 0.070 | 9.324 |
avg.use | 0.174 | 0.046 | 3.770 |
conditionReading App:timesp14 | 0.088 | 0.134 | 0.658 |
conditionReading App:avg.use | -0.128 | 0.073 | -1.756 |
timesp14:avg.use | 0.157 | 0.040 | 3.968 |
conditionReading App:timesp14:avg.use | -0.137 | 0.063 | -2.190 |
kable(summary(lmer(score ~ condition * time * avg.use
+ (1 | ChildID)
+ (time | TeachID.Year1),
data = filter(dt.excl,
measure == "Math")))$coefficients, digits = 3)
Estimate | Std. Error | t value | |
---|---|---|---|
(Intercept) | 454.559 | 2.002 | 227.015 |
conditionReading App | 0.964 | 3.875 | 0.249 |
timesp14 | 12.897 | 1.237 | 10.429 |
avg.use | 3.546 | 0.858 | 4.132 |
conditionReading App:timesp14 | 1.226 | 2.379 | 0.515 |
conditionReading App:avg.use | -2.381 | 1.361 | -1.749 |
timesp14:avg.use | 2.021 | 0.703 | 2.876 |
conditionReading App:timesp14:avg.use | -1.867 | 1.109 | -1.684 |
Figure 2.
ggplot(filter(diffs.full,
!is.na(parentMA_mediansplit),
condition == "Math App"),
aes(x = avg.use, y = score.ge,
col = factor(parentMA_mediansplit))) +
geom_point() +
geom_smooth(se=TRUE, method = "lm") +
scale_colour_solarized() +
xlab("Average Weekly App Usage") +
ylab("Improvement (Grade Equivalents)")
Math anxiety three- and four-way interactions (for paragraph beginning “third”).
kable(summary(lmer(score ~ time * condition * ParentMAaverage.Fa13
+ (1 | ChildID)
+ (time | TeachID.Year1),
data = filter(dt.excl.ge,
measure == "Math")))$coefficients, digits = 3)
Estimate | Std. Error | t value | |
---|---|---|---|
(Intercept) | 2.281 | 0.163 | 14.010 |
timesp14 | 1.036 | 0.126 | 8.201 |
conditionReading App | -0.038 | 0.324 | -0.117 |
ParentMAaverage.Fa13 | -0.112 | 0.062 | -1.802 |
timesp14:conditionReading App | 0.167 | 0.249 | 0.670 |
timesp14:ParentMAaverage.Fa13 | -0.085 | 0.053 | -1.611 |
conditionReading App:ParentMAaverage.Fa13 | -0.020 | 0.127 | -0.157 |
timesp14:conditionReading App:ParentMAaverage.Fa13 | -0.119 | 0.107 | -1.117 |
kable(summary(lmer(score ~ time * condition * ParentMAaverage.Fa13
+ (1 | ChildID)
+ (time | TeachID.Year1),
data = filter(dt.excl,
measure == "Math")))$coefficients, digits = 3)
Estimate | Std. Error | t value | |
---|---|---|---|
(Intercept) | 464.190 | 3.073 | 151.051 |
timesp14 | 17.718 | 2.211 | 8.014 |
conditionReading App | -0.948 | 6.120 | -0.155 |
ParentMAaverage.Fa13 | -1.937 | 1.152 | -1.682 |
timesp14:conditionReading App | 3.740 | 4.374 | 0.855 |
timesp14:ParentMAaverage.Fa13 | -1.082 | 0.915 | -1.183 |
conditionReading App:ParentMAaverage.Fa13 | -0.280 | 2.355 | -0.119 |
timesp14:conditionReading App:ParentMAaverage.Fa13 | -2.370 | 1.860 | -1.275 |
kable(summary(lmer(score ~ time * condition * ParentMAaverage.Fa13 * avg.use +
+ (1 | ChildID)
+ (time | TeachID.Year1),
data = filter(dt.excl,
measure == "Math")))$coefficients, digits = 3)
Estimate | Std. Error | t value | |
---|---|---|---|
(Intercept) | 459.443 | 4.621 | 99.432 |
timesp14 | 12.277 | 3.598 | 3.412 |
conditionReading App | -4.837 | 8.722 | -0.555 |
ParentMAaverage.Fa13 | -1.589 | 1.832 | -0.867 |
avg.use | 3.049 | 2.876 | 1.061 |
timesp14:conditionReading App | 10.657 | 6.722 | 1.585 |
timesp14:ParentMAaverage.Fa13 | 0.000 | 1.488 | 0.000 |
conditionReading App:ParentMAaverage.Fa13 | 2.647 | 3.472 | 0.762 |
timesp14:avg.use | 3.899 | 2.339 | 1.667 |
conditionReading App:avg.use | 3.203 | 4.750 | 0.674 |
ParentMAaverage.Fa13:avg.use | -0.071 | 1.252 | -0.056 |
timesp14:conditionReading App:ParentMAaverage.Fa13 | -3.990 | 2.816 | -1.417 |
timesp14:conditionReading App:avg.use | -4.899 | 3.857 | -1.270 |
timesp14:ParentMAaverage.Fa13:avg.use | -0.759 | 1.018 | -0.746 |
conditionReading App:ParentMAaverage.Fa13:avg.use | -2.413 | 2.037 | -1.184 |
timesp14:conditionReading App:ParentMAaverage.Fa13:avg.use | 1.137 | 1.655 | 0.687 |
kable(summary(lmer(score ~ time * condition * ParentMAaverage.Fa13 * avg.use +
+ (1 | ChildID)
+ (time | TeachID.Year1),
data = filter(dt.excl.ge,
measure == "Math")))$coefficients, digits = 3)
Estimate | Std. Error | t value | |
---|---|---|---|
(Intercept) | 2.038 | 0.247 | 8.249 |
timesp14 | 0.596 | 0.205 | 2.910 |
conditionReading App | -0.142 | 0.466 | -0.305 |
ParentMAaverage.Fa13 | -0.089 | 0.099 | -0.904 |
avg.use | 0.156 | 0.155 | 1.004 |
timesp14:conditionReading App | 0.629 | 0.382 | 1.645 |
timesp14:ParentMAaverage.Fa13 | 0.013 | 0.085 | 0.150 |
conditionReading App:ParentMAaverage.Fa13 | 0.091 | 0.187 | 0.488 |
timesp14:avg.use | 0.321 | 0.134 | 2.405 |
conditionReading App:avg.use | 0.096 | 0.256 | 0.374 |
ParentMAaverage.Fa13:avg.use | -0.008 | 0.068 | -0.117 |
timesp14:conditionReading App:ParentMAaverage.Fa13 | -0.226 | 0.161 | -1.410 |
timesp14:conditionReading App:avg.use | -0.333 | 0.220 | -1.513 |
timesp14:ParentMAaverage.Fa13:avg.use | -0.072 | 0.058 | -1.233 |
conditionReading App:ParentMAaverage.Fa13:avg.use | -0.093 | 0.110 | -0.848 |
timesp14:conditionReading App:ParentMAaverage.Fa13:avg.use | 0.077 | 0.095 | 0.818 |
Thanks very much to the Berkowitz et al. team for posting their raw data and providing feedback on a draft of this reanalysis. Thanks also to Johannes Haushofer as well as the members of the Language and Cognition lab at Stanford for valuable feedback.