1 Introduction

This document contains the code for my Technical Comment on Berkowitz et al. (2015). For a fuller analysis, see this repository.
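The code below assumes the following packages are loaded; multi_boot_standard() is assumed to come from the langcog helper package.

library(dplyr)
library(tidyr)
library(stringr)
library(ggplot2)
library(ggthemes)  # scale_fill_solarized / scale_colour_solarized
library(lme4)
library(broom)
library(knitr)
library(langcog)   # assumed source of multi_boot_standard()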

# read the raw accessory data and label the two conditions
d <- read.csv("aac7427-Accessory-Data-File-S1.csv") %>%
  mutate(condition = ifelse(cond.dum == 1, "Math App", "Reading App"))
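As a quick sanity check (my addition, not in the original write-up), the number of children in each condition:

table(d$condition)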

2 Exclusions and Data Preparation

We need to exclude twins and children who dropped out during year 1 (the twin and year1dropout variables).

We also need to exclude the scores of 34 children due to experimenter error (24 in the fall: 16 math and 8 reading; 10 in the spring: 9 math and 1 reading). This information is encoded in the filter variables (e.g., wjappliedfa13_filter): binary variables that presumably indicate whether a child's score should be included (e.g., because basal and ceiling items were administered properly). We apply these filters after excluding twins and dropouts.

sum(d$wjappliedfa13_filter[d$twin == 0 & d$year1dropout == 0] == 0)
## [1] 24
sum(d$WJ.applied.Sp14.filter[d$twin == 0 & d$year1dropout == 0] == 0)
## [1] 11

These numbers are very close to the paper's, though we get 11 in the spring rather than 10. Let's do the same for reading, where there were 25 exclusions (7 in the fall: 4 math and 3 reading; 18 in the spring: 12 math and 6 reading).

sum(d$WJletterwordFa13_filter[d$twin == 0 & d$year1dropout == 0] == 0)
## [1] 7
sum(d$WJ.letterword.Sp14.filter[d$twin == 0 & d$year1dropout == 0] == 0)
## [1] 19

Again we're one off (we get 19 in the spring where the paper reports 18), but quite close. Note that there is a separate exclusion issue for analyses involving math anxiety: where the math anxiety questionnaire data are missing (~20% of cases), those children must be excluded. I assume the authors simply dropped non-responding families for these analyses.
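A quick way to check that missingness rate (my addition), assuming ParentMAaverage.Fa13 (used below) is the relevant questionnaire variable:

# proportion of included (non-twin, non-dropout) children with no
# parent math anxiety data
mean(is.na(d$ParentMAaverage.Fa13[d$twin == 0 & d$year1dropout == 0]))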

# Apply the score filters, drop twins and year-1 dropouts, and reshape
# to long format (one row per child x measure x time point).
dt.excl <- d %>%
  mutate(applied.w.fa13 = ifelse(wjappliedfa13_filter,
                                 WJ.Applied.Problems.W.Score.Fa13,
                                 NA),
         applied.w.sp14 = ifelse(WJ.applied.Sp14.filter,
                                 WJ.Applied.Problems_W.Score_Sp14,
                                 NA),
         reading.w.fa13 = ifelse(WJletterwordFa13_filter,
                                 WJ.Letter.Word.ID.W.Score.Fa13,
                                 NA),
         reading.w.sp14 = ifelse(WJ.letterword.Sp14.filter,
                                 WJ.LetterWord.Wscore.Sp14,
                                 NA)) %>%
  filter(!twin, !year1dropout) %>%
  gather(measure, score,
         applied.w.fa13, applied.w.sp14,
         reading.w.fa13, reading.w.sp14) %>%
  # split e.g. "applied.w.fa13" into "applied." and "fa13"
  separate(measure, c("measure", "time"), sep = "w\\.") %>%
  mutate(year = as.factor(str_sub(time, 4, 5)),
         measure = ifelse(grepl("applied", measure), "Math", "Reading"))
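As a sanity check on the reshaping (my addition), every included child should now contribute exactly four rows, one per measure x time point:

# any child with a row count other than 4 indicates a reshaping problem
dt.excl %>% count(ChildID) %>% filter(n != 4)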

Let's detour briefly to check the means now that we are applying the exclusions. Table S1 of the original paper gives means separately for low- and high-anxious parents (a median split on math anxiety, i.e., not grand means), so we group by the median-split variable as well.

dt.excl %>%
  filter(measure == "Math") %>%
  filter(!is.na(parentMA_mediansplit)) %>%
  group_by(parentMA_mediansplit, condition, year) %>%
  summarise(mean = mean(score, na.rm=TRUE), 
            sd = sd(score, na.rm=TRUE))
## Source: local data frame [8 x 5]
## Groups: parentMA_mediansplit, condition [?]
## 
##   parentMA_mediansplit   condition   year     mean       sd
##                  (dbl)       (chr) (fctr)    (dbl)    (dbl)
## 1                    1    Math App      3 462.7468 18.71507
## 2                    1    Math App      4 478.4348 21.11036
## 3                    1 Reading App      3 458.8571 18.08212
## 4                    1 Reading App      4 475.3134 19.25602
## 5                    2    Math App      3 459.0562 15.98633
## 6                    2    Math App      4 474.3836 18.67978
## 7                    2 Reading App      3 457.4800 13.56352
## 8                    2 Reading App      4 469.2157 18.68402

We get only minor numerical differences between the data I'm using and the means reported by the authors (at most 1 W-score unit).
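For comparison with the median-split cells above, the grand means by condition and year (not reported in Table S1) can be computed the same way:

dt.excl %>%
  filter(measure == "Math") %>%
  group_by(condition, year) %>%
  summarise(mean = mean(score, na.rm = TRUE),
            sd = sd(score, na.rm = TRUE))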

Difference (gain) scores, spring minus fall.

# spread to one row per child, compute gains, and re-gather
diffs.excl <- dt.excl %>%
  select(ChildID, condition, year, measure, score) %>%
  mutate(measure_year = interaction(measure, year)) %>%
  select(-year, -measure) %>%
  spread(measure_year, score) %>%
  mutate(math_gain = Math.4 - Math.3,
         reading_gain = Reading.4 - Reading.3) %>%
  select(-Math.3, -Math.4, -Reading.3, -Reading.4) %>%
  gather(measure, score, math_gain, reading_gain)
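Gains are NA whenever either time point was filtered out; a quick count of missing gain scores by measure (my addition):

diffs.excl %>%
  group_by(measure) %>%
  summarise(n_missing = sum(is.na(score)), n_total = n())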

Now do the whole thing again with grade equivalents (GE).

# Same pipeline using grade-equivalent (GE) scores; the ".w." in the
# temporary column names is kept so the same separate() step applies.
dt.excl.ge <- d %>%
  mutate(applied.w.fa13 = ifelse(wjappliedfa13_filter,
                                 WJ.Applied.Problems.GE.Fa13,
                                 NA),
         applied.w.sp14 = ifelse(WJ.applied.Sp14.filter,
                                 WJ.Applied.Problem_GE_Sp14,
                                 NA),
         reading.w.fa13 = ifelse(WJletterwordFa13_filter,
                                 WJ.Letter.Word.ID.GE.Fa13,
                                 NA),
         reading.w.sp14 = ifelse(WJ.letterword.Sp14.filter,
                                 WJ.LetterWord_GE_Sp14,
                                 NA)) %>%
  filter(!twin, !year1dropout) %>%
  gather(measure, score,
         applied.w.fa13, applied.w.sp14,
         reading.w.fa13, reading.w.sp14) %>%
  separate(measure, c("measure", "time"), sep = "w\\.") %>%
  mutate(year = as.factor(str_sub(time, 4, 5)),
         measure = ifelse(grepl("applied", measure), "Math", "Reading"))

diffs.excl.ge <- dt.excl.ge %>%
  select(ChildID, condition, year, measure, score) %>%
  mutate(measure_year = interaction(measure, year)) %>%
  select(-year, -measure) %>%
  spread(measure_year, score) %>%
  mutate(math_gain = Math.4 - Math.3,
         reading_gain = Reading.4 - Reading.3) %>%
  select(-Math.3, -Math.4, -Reading.3, -Reading.4) %>%
  gather(measure, score, math_gain, reading_gain) 

And join the other measures (parent math anxiety and app usage) back in.

diffs.full <- left_join(diffs.excl, d %>% 
                          select(ChildID, ParentMAaverage.Fa13,
                                 parentMA_mediansplit, 
                                 avg.use, use.012.groups)) %>%
  left_join(diffs.excl.ge %>% rename(score.ge = score))
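These joins rely on dplyr matching on all shared columns; an equivalent version with the keys made explicit (assuming the column sets described above) is:

diffs.full <- left_join(diffs.excl,
                        d %>% select(ChildID, ParentMAaverage.Fa13,
                                     parentMA_mediansplit,
                                     avg.use, use.012.groups),
                        by = "ChildID") %>%
  left_join(diffs.excl.ge %>% rename(score.ge = score),
            by = c("ChildID", "condition", "measure"))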

And compute means for plotting.

# multi_boot_standard() returns the mean and a bootstrapped 95% CI
# of `score` within each group
ms.diffs.excl <- diffs.excl %>%
  group_by(condition, measure) %>%
  multi_boot_standard(column = "score", na.rm = TRUE)

ms.diffs.excl.ge <- diffs.excl.ge %>%
  group_by(condition, measure) %>%
  multi_boot_standard(column = "score", na.rm=TRUE) %>%
  ungroup %>%
  mutate(measure = factor(measure, 
                            levels = c("math_gain","reading_gain"), 
                            labels = c("WJ Applied Problems\n(Math)", 
                                       "WJ Letter-Word\n(Reading)")), 
         Condition = condition)
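For readers without the langcog package, a minimal stand-in for multi_boot_standard() would look like the sketch below (a percentile bootstrap with 1000 resamples; the resampling details are my assumption, not langcog's documented defaults):

# bootstrapped mean and 95% percentile CI of `score` within each group
boot_ci <- function(x, n_boot = 1000) {
  x <- x[!is.na(x)]
  boots <- replicate(n_boot, mean(sample(x, replace = TRUE)))
  data.frame(mean = mean(x),
             ci_lower = unname(quantile(boots, .025)),
             ci_upper = unname(quantile(boots, .975)))
}

diffs.excl %>%
  group_by(condition, measure) %>%
  do(boot_ci(.$score))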

3 Technical Comment Analyses

This section reproduces the analyses from the Technical Comment.

Figure 1.

ggplot(ms.diffs.excl.ge, aes(x = measure, y = mean, 
               ymin = ci_lower, ymax = ci_upper, 
               fill = Condition)) +
  geom_bar(stat="identity", position = "dodge") +
  geom_linerange(position = position_dodge(width = .9)) + 
  scale_fill_solarized() + 
  ylab("Improvement (Grade Equivalents)") + 
  xlab("Measure") + ylim(c(0,1)) 

t-tests for Figure 1.

kable(tidy(with(diffs.excl.ge, 
                t.test(score[condition == "Math App" & 
                               measure == "math_gain"],
                       score[condition == "Reading App" & 
                               measure == "math_gain"], var.equal = TRUE))))
| estimate1 | estimate2 | statistic |   p.value | parameter |   conf.low | conf.high |
|----------:|----------:|----------:|----------:|----------:|-----------:|----------:|
|  0.852973 |      0.77 |  1.061383 | 0.2890304 |       498 | -0.0706193 | 0.2365653 |
kable(tidy(with(diffs.excl.ge, 
                t.test(score[condition == "Math App" & 
                               measure == "reading_gain"],
                       score[condition == "Reading App" & 
                               measure == "reading_gain"], var.equal = TRUE))))
| estimate1 | estimate2 | statistic |   p.value | parameter |   conf.low | conf.high |
|----------:|----------:|----------:|----------:|----------:|-----------:|----------:|
| 0.8337731 | 0.8192308 | 0.2728889 | 0.7850497 |       507 | -0.0901546 | 0.1192392 |

Main longitudinal models (for the paragraph beginning "first"). GE scores first, then W scores.

kable(summary(lmer(score ~ condition * time 
                   + (1 | ChildID)
                   + (time | TeachID.Year1),
     data = filter(dt.excl.ge, 
                   measure == "Math")))$coefficients, digits = 3)
|                               | Estimate | Std. Error | t value |
|:------------------------------|---------:|-----------:|--------:|
| (Intercept)                   |    1.985 |      0.088 |  22.656 |
| conditionReading App          |   -0.075 |      0.169 |  -0.444 |
| timesp14                      |    0.842 |      0.051 |  16.650 |
| conditionReading App:timesp14 |   -0.072 |      0.098 |  -0.733 |
kable(summary(lmer(score ~ condition * time 
                   + (1 | ChildID)
                   + (time | TeachID.Year1), 
     data = filter(dt.excl, 
                   measure == "Math")))$coefficients, digits = 3)
|                               | Estimate | Std. Error | t value |
|:------------------------------|---------:|-----------:|--------:|
| (Intercept)                   |  458.914 |      1.735 | 264.453 |
| conditionReading App          |   -1.518 |      3.356 |  -0.452 |
| timesp14                      |   15.368 |      0.897 |  17.141 |
| conditionReading App:timesp14 |   -1.003 |      1.742 |  -0.576 |
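Note that lmer() does not report p-values; for reference (my addition, not from the Comment), a rough two-sided normal approximation for the condition-by-time interaction in the GE model above (t = -0.733) is:

# two-sided p-value treating the t statistic as approximately normal
2 * pnorm(-abs(-0.733))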

Three-way interaction models with app usage (for the paragraph beginning "second"). GE and then W scores, as above.

kable(summary(lmer(score ~ condition * time * avg.use 
                   + (1 | ChildID)
                   + (time | TeachID.Year1), 
     data = filter(dt.excl.ge, 
                   measure == "Math")))$coefficients, digits = 3)
|                                       | Estimate | Std. Error | t value |
|:--------------------------------------|---------:|-----------:|--------:|
| (Intercept)                           |    1.770 |      0.103 |  17.184 |
| conditionReading App                  |    0.068 |      0.199 |   0.340 |
| timesp14                              |    0.650 |      0.070 |   9.324 |
| avg.use                               |    0.174 |      0.046 |   3.770 |
| conditionReading App:timesp14         |    0.088 |      0.134 |   0.658 |
| conditionReading App:avg.use          |   -0.128 |      0.073 |  -1.756 |
| timesp14:avg.use                      |    0.157 |      0.040 |   3.968 |
| conditionReading App:timesp14:avg.use |   -0.137 |      0.063 |  -2.190 |
kable(summary(lmer(score ~ condition * time * avg.use 
                   + (1 | ChildID)
                   + (time | TeachID.Year1), 
     data = filter(dt.excl, 
                   measure == "Math")))$coefficients, digits = 3)
|                                       | Estimate | Std. Error | t value |
|:--------------------------------------|---------:|-----------:|--------:|
| (Intercept)                           |  454.559 |      2.002 | 227.015 |
| conditionReading App                  |    0.964 |      3.875 |   0.249 |
| timesp14                              |   12.897 |      1.237 |  10.429 |
| avg.use                               |    3.546 |      0.858 |   4.132 |
| conditionReading App:timesp14         |    1.226 |      2.379 |   0.515 |
| conditionReading App:avg.use          |   -2.381 |      1.361 |  -1.749 |
| timesp14:avg.use                      |    2.021 |      0.703 |   2.876 |
| conditionReading App:timesp14:avg.use |   -1.867 |      1.109 |  -1.684 |

Figure 2.

ggplot(filter(diffs.full, 
              !is.na(parentMA_mediansplit), 
              condition == "Math App"), 
              aes(x = avg.use, y = score.ge, 
                  col = factor(parentMA_mediansplit))) + 
  geom_point() + 
  geom_smooth(se=TRUE, method = "lm") + 
  scale_colour_solarized() + 
  xlab("Average Weekly App Usage") + 
  ylab("Improvement (Grade Equivalents)")

Math anxiety three- and four-way interaction models (for the paragraph beginning "third").

kable(summary(lmer(score ~ time * condition * ParentMAaverage.Fa13 
                   + (1 | ChildID)
                   + (time | TeachID.Year1),
     data = filter(dt.excl.ge, 
                   measure == "Math")))$coefficients, digits = 3)
|                                                    | Estimate | Std. Error | t value |
|:---------------------------------------------------|---------:|-----------:|--------:|
| (Intercept)                                        |    2.281 |      0.163 |  14.010 |
| timesp14                                           |    1.036 |      0.126 |   8.201 |
| conditionReading App                               |   -0.038 |      0.324 |  -0.117 |
| ParentMAaverage.Fa13                               |   -0.112 |      0.062 |  -1.802 |
| timesp14:conditionReading App                      |    0.167 |      0.249 |   0.670 |
| timesp14:ParentMAaverage.Fa13                      |   -0.085 |      0.053 |  -1.611 |
| conditionReading App:ParentMAaverage.Fa13          |   -0.020 |      0.127 |  -0.157 |
| timesp14:conditionReading App:ParentMAaverage.Fa13 |   -0.119 |      0.107 |  -1.117 |
kable(summary(lmer(score ~ time * condition * ParentMAaverage.Fa13 
                   + (1 | ChildID)
                   + (time | TeachID.Year1),
     data = filter(dt.excl, 
                   measure == "Math")))$coefficients, digits = 3)
|                                                    | Estimate | Std. Error | t value |
|:---------------------------------------------------|---------:|-----------:|--------:|
| (Intercept)                                        |  464.190 |      3.073 | 151.051 |
| timesp14                                           |   17.718 |      2.211 |   8.014 |
| conditionReading App                               |   -0.948 |      6.120 |  -0.155 |
| ParentMAaverage.Fa13                               |   -1.937 |      1.152 |  -1.682 |
| timesp14:conditionReading App                      |    3.740 |      4.374 |   0.855 |
| timesp14:ParentMAaverage.Fa13                      |   -1.082 |      0.915 |  -1.183 |
| conditionReading App:ParentMAaverage.Fa13          |   -0.280 |      2.355 |  -0.119 |
| timesp14:conditionReading App:ParentMAaverage.Fa13 |   -2.370 |      1.860 |  -1.275 |
kable(summary(lmer(score ~ time * condition * ParentMAaverage.Fa13 * avg.use
                   + (1 | ChildID)
                   + (time | TeachID.Year1),
     data = filter(dt.excl, 
                   measure == "Math")))$coefficients, digits = 3)
|                                                            | Estimate | Std. Error | t value |
|:-----------------------------------------------------------|---------:|-----------:|--------:|
| (Intercept)                                                |  459.443 |      4.621 |  99.432 |
| timesp14                                                   |   12.277 |      3.598 |   3.412 |
| conditionReading App                                       |   -4.837 |      8.722 |  -0.555 |
| ParentMAaverage.Fa13                                       |   -1.589 |      1.832 |  -0.867 |
| avg.use                                                    |    3.049 |      2.876 |   1.061 |
| timesp14:conditionReading App                              |   10.657 |      6.722 |   1.585 |
| timesp14:ParentMAaverage.Fa13                              |    0.000 |      1.488 |   0.000 |
| conditionReading App:ParentMAaverage.Fa13                  |    2.647 |      3.472 |   0.762 |
| timesp14:avg.use                                           |    3.899 |      2.339 |   1.667 |
| conditionReading App:avg.use                               |    3.203 |      4.750 |   0.674 |
| ParentMAaverage.Fa13:avg.use                               |   -0.071 |      1.252 |  -0.056 |
| timesp14:conditionReading App:ParentMAaverage.Fa13         |   -3.990 |      2.816 |  -1.417 |
| timesp14:conditionReading App:avg.use                      |   -4.899 |      3.857 |  -1.270 |
| timesp14:ParentMAaverage.Fa13:avg.use                      |   -0.759 |      1.018 |  -0.746 |
| conditionReading App:ParentMAaverage.Fa13:avg.use          |   -2.413 |      2.037 |  -1.184 |
| timesp14:conditionReading App:ParentMAaverage.Fa13:avg.use |    1.137 |      1.655 |   0.687 |
kable(summary(lmer(score ~ time * condition * ParentMAaverage.Fa13 * avg.use
                   + (1 | ChildID)
                   + (time | TeachID.Year1),
     data = filter(dt.excl.ge, 
                   measure == "Math")))$coefficients, digits = 3)
|                                                            | Estimate | Std. Error | t value |
|:-----------------------------------------------------------|---------:|-----------:|--------:|
| (Intercept)                                                |    2.038 |      0.247 |   8.249 |
| timesp14                                                   |    0.596 |      0.205 |   2.910 |
| conditionReading App                                       |   -0.142 |      0.466 |  -0.305 |
| ParentMAaverage.Fa13                                       |   -0.089 |      0.099 |  -0.904 |
| avg.use                                                    |    0.156 |      0.155 |   1.004 |
| timesp14:conditionReading App                              |    0.629 |      0.382 |   1.645 |
| timesp14:ParentMAaverage.Fa13                              |    0.013 |      0.085 |   0.150 |
| conditionReading App:ParentMAaverage.Fa13                  |    0.091 |      0.187 |   0.488 |
| timesp14:avg.use                                           |    0.321 |      0.134 |   2.405 |
| conditionReading App:avg.use                               |    0.096 |      0.256 |   0.374 |
| ParentMAaverage.Fa13:avg.use                               |   -0.008 |      0.068 |  -0.117 |
| timesp14:conditionReading App:ParentMAaverage.Fa13         |   -0.226 |      0.161 |  -1.410 |
| timesp14:conditionReading App:avg.use                      |   -0.333 |      0.220 |  -1.513 |
| timesp14:ParentMAaverage.Fa13:avg.use                      |   -0.072 |      0.058 |  -1.233 |
| conditionReading App:ParentMAaverage.Fa13:avg.use          |   -0.093 |      0.110 |  -0.848 |
| timesp14:conditionReading App:ParentMAaverage.Fa13:avg.use |    0.077 |      0.095 |   0.818 |
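One could also assess the four-way interaction by model comparison; a hypothetical likelihood-ratio test (my addition, not part of the original analyses) would look like:

# refit with ML (REML = FALSE) so the likelihood-ratio test is valid
m_full <- lmer(score ~ time * condition * ParentMAaverage.Fa13 * avg.use +
                 (1 | ChildID) + (time | TeachID.Year1),
               data = filter(dt.excl, measure == "Math"), REML = FALSE)
m_red <- update(m_full, . ~ . - time:condition:ParentMAaverage.Fa13:avg.use)
anova(m_red, m_full)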

4 Acknowledgements

Thanks very much to the Berkowitz et al. team for posting their raw data and for providing feedback on a draft of this reanalysis. Thanks also to Johannes Haushofer and to the members of the Language and Cognition Lab at Stanford for valuable feedback.