This script analyzes data from two active/passive learning experiments.

Descriptives

| condition | order  | category_type            | mean_exp_length | sd_exp_length | count | participants_needed |
|:----------|:-------|:-------------------------|----------------:|--------------:|------:|--------------------:|
| AA        | order1 | information-integration  |        9.955996 |     3.9981471 |    23 |                  -3 |
| AA        | order1 | rule-based               |        9.145264 |     3.7683355 |    71 |                 -51 |
| AA        | order2 | information-integration  |       10.223666 |     5.5412207 |    17 |                   3 |
| AA        | order2 | rule-based               |        9.956931 |     2.7548557 |    15 |                   5 |
| RR        | order1 | information-integration  |        3.416021 |     1.2014588 |    22 |                  -2 |
| RR        | order1 | rule-based               |        6.104508 |    10.4616051 |    33 |                 -13 |
| RR        | order2 | information-integration  |        3.196833 |     0.6546468 |    16 |                   4 |
| RR        | order2 | rule-based               |        4.669485 |     2.5250647 |    22 |                  -2 |
| RA        | order1 | information-integration  |        7.557564 |     3.5104577 |    26 |                  -6 |
| RA        | order1 | rule-based               |        6.060428 |     2.0234695 |    18 |                   2 |
| RA        | order2 | information-integration  |        6.041291 |     1.8691983 |    25 |                  -5 |
| RA        | order2 | rule-based               |        5.684708 |     2.0012646 |    20 |                   0 |
| AR        | order1 | information-integration  |        6.327930 |     2.4083405 |    19 |                   1 |
| AR        | order1 | rule-based               |        7.680843 |     5.1584244 |    14 |                   6 |
| AR        | order2 | information-integration  |        6.617099 |     1.8137440 |    22 |                  -2 |
| AR        | order2 | rule-based               |        7.238033 |     4.2210413 |    32 |                 -12 |
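
Note that participants_needed works out to 20 minus the cell count, i.e., how many more participants each cell needs to reach a target of 20 (negative values mean the cell is over target). A minimal sketch of how this summary could be computed, assuming a participant-level frame subj_df with an exp_length column (both names are placeholders, not necessarily the script's):

library(dplyr)

subj_df %>%
  group_by(condition, order, category_type) %>%
  summarise(mean_exp_length = mean(exp_length),
            sd_exp_length = sd(exp_length),
            count = n(),
            participants_needed = 20 - count,  # assumes a target of 20 per cell
            .groups = "drop")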

Histogram of experiment length, split by condition
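
A sketch of such a histogram with ggplot2, using the same hypothetical subj_df and exp_length names as above:

library(ggplot2)

ggplot(subj_df, aes(x = exp_length)) +
  geom_histogram(bins = 30) +  # bin count is arbitrary
  facet_wrap(~ condition)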

Overall accuracy analysis

Get mean accuracy for each condition and category type

Plot.
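
A sketch of both steps, using the trial-level frame df and the columns the models below rely on (trial_type, correct, condition, category_type):

library(dplyr)
library(ggplot2)

acc_by_cond <- df %>%
  filter(trial_type == "test") %>%
  group_by(condition, category_type) %>%
  summarise(mean_acc = mean(correct), .groups = "drop")

ggplot(acc_by_cond, aes(x = condition, y = mean_acc, fill = category_type)) +
  geom_col(position = "dodge")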

We see an overall advantage for active learning over passive learning across both category types.

Accuracy by block analysis

Next, we analyze accuracy across the two blocks.
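
A minimal sketch of the block summary, reusing the df columns from the model code below (block is the raw block variable):

df %>%
  filter(trial_type == "test") %>%
  group_by(condition, block) %>%
  summarise(mean_acc = mean(correct), .groups = "drop")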

The block analysis shows an effect of learning order (active first vs. receptive first) on active learning.

Receptive-Active learners are more accurate after their block of active learning (block 2) than Active-Receptive learners are after theirs (block 1).

Accuracy by block and order

Order here refers to the counterbalancing of which dimension (size or angle) defined the category, not the order of the learning blocks.

Rename the order labels so they are interpretable.

Plot accuracy over blocks
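
A sketch covering both steps. The order1 = size, order2 = angle mapping is an assumption for illustration; swap it if the counterbalancing went the other way:

library(dplyr)
library(ggplot2)

df <- df %>%
  mutate(order = recode(order, order1 = "size", order2 = "angle"))  # assumed mapping

df %>%
  filter(trial_type == "test") %>%
  group_by(condition, order, block) %>%
  summarise(mean_acc = mean(correct), .groups = "drop") %>%
  ggplot(aes(x = factor(block), y = mean_acc,
             colour = condition, group = condition)) +
  geom_line() +
  geom_point() +
  facet_wrap(~ order)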

Rule-Based category structure

For the category that depends on size, AA and RA end up on top of each other, whereas AR and RR do not. I’m not sure what’s going on with the “angle” category – perhaps it is just easier to learn overall, so we are not seeing any condition differences?

Also, there seems to be some between-subjects variation here – could this explain why the RR learners are the best in the angle category? Should we try to replicate this order difference?

Information Integration category structure

Evidence selection analysis (active learning)

Analyze the average distance of participants’ samples from the optimal decision boundary.

Rotate the sample coordinates so that orientation and radius lie along the same dimension.
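
A sketch of the rotation, assuming a frame samples with the sample coordinates in columns x (orientation) and y (radius); theta = pi/4 is an assumed angle for a diagonal boundary:

library(dplyr)

theta <- pi / 4  # assumed rotation angle
samples <- samples %>%
  mutate(x_rot = x * cos(theta) - y * sin(theta),
         y_rot = x * sin(theta) + y * cos(theta))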

Plot sampling behavior at the group level.

Plot sampling behavior for individual participants.

Get the distance from the optimal decision boundary for each sample.

Now get the average distance across subjects.
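
Continuing the hypothetical samples frame from the rotation sketch: with the boundary aligned to the rotated axis, the distance is just the absolute offset from a constant (a placeholder for wherever the optimal boundary sits):

samples <- samples %>%
  mutate(dist = abs(x_rot - boundary))  # boundary is a hypothetical constant

# average distance per subject, then across subjects within condition
subj_dist <- samples %>%
  group_by(subids, condition) %>%
  summarise(mean_dist = mean(dist), .groups = "drop")

subj_dist %>%
  group_by(condition) %>%
  summarise(mean_dist = mean(mean_dist), .groups = "drop")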

Plot.

Active sampling is better after a block of receptive learning trials, but not better than after two blocks of active learning trials.

Relationship between sampling and test

Get the mean sample distance and accuracy for each participant.

Plot.
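
A sketch of both steps, joining per-participant test accuracy with the hypothetical subj_dist summary from above:

perf <- df %>%
  filter(trial_type == "test") %>%
  group_by(subids, condition) %>%
  summarise(mean_acc = mean(correct), .groups = "drop") %>%
  left_join(subj_dist, by = c("subids", "condition"))

ggplot(perf, aes(x = mean_dist, y = mean_acc, colour = condition)) +
  geom_point() +
  geom_smooth(method = "lm")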

Individual accuracy across blocks: consistency analysis

Plot.
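
A sketch of a consistency (“spaghetti”) plot, one line per subject across blocks:

df %>%
  filter(trial_type == "test") %>%
  group_by(subids, condition, block) %>%
  summarise(mean_acc = mean(correct), .groups = "drop") %>%
  ggplot(aes(x = factor(block), y = mean_acc, group = subids)) +
  geom_line(alpha = 0.3) +
  facet_wrap(~ condition)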

The overall pattern of accuracy across blocks differs by condition: Receptive-first learners show larger gains from block 1 to block 2 than Active-first learners.

Models

Accuracy at the trial level based on condition and block

Do condition and block predict accuracy on test trials?
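
A sketch of such a model, mirroring the m3 specification below but without category type (m_block is a placeholder name):

library(lme4)

m_block <- glmer(correct ~ condition * block_factor + (1 | subids),
                 data = filter(df, trial_type == "test"),
                 family = binomial,
                 control = glmerControl(optimizer = "bobyqa"))
summary(m_block)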

Reliable interaction between condition and block. Receptive-first learners perform better on the second block of test trials than Active-first learners.

But overall, collapsing across blocks, the two groups do not differ from one another. How should we interpret this?

Accuracy based on sampling behavior and condition

Does mean accuracy depend on sampling behavior and condition?
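
A sketch using the hypothetical participant-level perf frame built above; since mean accuracy is one value per participant, a simple linear model suffices:

m_sample <- lm(mean_acc ~ mean_dist * condition, data = perf)
summary(m_sample)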

Reliable interaction between mean sample distance and condition. For Receptive-first learners, better sampling predicts better test accuracy; for Active-first learners, it does not.

Sampling behavior based on condition

Which condition is “better” at sampling?
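
A sketch, again on the hypothetical subj_dist summary (note that only conditions with an active block produce samples):

m_cond <- lm(mean_dist ~ condition, data = subj_dist)
summary(m_cond)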

Receptive-first participants are better at sampling than Active-first participants.

Effect coding (condition, category type, and block) to test main effects.

Effect-code the predictors (choose contrasts based on how you want to interpret the model output).

library(dplyr)
library(magrittr)

# Order the condition levels explicitly (AA, RR, RA, AR) so they line up
# with the contrast weights below; factor()'s alphabetical default would
# silently misalign them.
df %<>% mutate(category_type = factor(category_type),
               condition = factor(condition, levels = c("AA", "RR", "RA", "AR")),
               block_factor = factor(block))

# information-integration (+1) vs. rule-based (-1)
contrasts(df$category_type) <- cbind("base=rb" = c(1, -1))

# active-containing conditions vs. RR; two active blocks vs. one; RA vs. AR
contrasts(df$condition) <- cbind("active_vs_passive"  = c(1, -3, 1, 1),
                                 "active2_vs_active1" = c(2, 0, -1, -1),
                                 "ra_vs_ar"           = c(0, 0, 1, -1))

# block 1 (-1) vs. block 2 (+1)
contrasts(df$block_factor) <- cbind("base=block1" = c(-1, 1))

Model with effect coding.

library(lme4)

# Mixed logistic regression: trial-level accuracy predicted by condition,
# category type, and block, with a random intercept per subject.
# nAGQ = 0 trades likelihood accuracy for fitting speed.
m3 <- glmer(correct ~ condition * category_type * block_factor + (1 | subids),
            data = filter(df, trial_type == "test"),
            nAGQ = 0,
            control = glmerControl(optimizer = "bobyqa"),
            family = binomial)

knitr::kable(summary(m3)$coefficients)
| term | Estimate | Std. Error | z value | Pr(>\|z\|) |
|:---|---:|---:|---:|---:|
| (Intercept) | 1.1064081 | 0.0471463 | 23.4675510 | 0.0000000 |
| conditionactive_vs_passive | 0.0694780 | 0.0249274 | 2.7872088 | 0.0053164 |
| conditionactive2_vs_active1 | 0.0629751 | 0.0356798 | 1.7650084 | 0.0775624 |
| conditionra_vs_ar | 0.0880576 | 0.0670867 | 1.3125953 | 0.1893194 |
| category_typebase=rb | -0.4575866 | 0.0461559 | -9.9139457 | 0.0000000 |
| block_factorbase=block1 | 0.1781442 | 0.0154227 | 11.5507540 | 0.0000000 |
| conditionactive_vs_passive:category_typebase=rb | -0.0100738 | 0.0255378 | -0.3944644 | 0.6932382 |
| conditionactive2_vs_active1:category_typebase=rb | 0.0111979 | 0.0356798 | 0.3138442 | 0.7536394 |
| conditionra_vs_ar:category_typebase=rb | -0.0386074 | 0.0670867 | -0.5754852 | 0.5649632 |
| conditionactive_vs_passive:block_factorbase=block1 | -0.0212999 | 0.0086836 | -2.4528826 | 0.0141717 |
| conditionactive2_vs_active1:block_factorbase=block1 | 0.0091157 | 0.0123090 | 0.7405669 | 0.4589561 |
| conditionra_vs_ar:block_factorbase=block1 | 0.0562423 | 0.0228069 | 2.4660235 | 0.0136622 |
| category_typebase=rb:block_factorbase=block1 | -0.0755553 | 0.0154227 | -4.8989654 | 0.0000010 |
| conditionactive_vs_passive:category_typebase=rb:block_factorbase=block1 | -0.0020441 | 0.0086836 | -0.2353939 | 0.8139030 |
| conditionactive2_vs_active1:category_typebase=rb:block_factorbase=block1 | -0.0094441 | 0.0123090 | -0.7672484 | 0.4429338 |
| conditionra_vs_ar:category_typebase=rb:block_factorbase=block1 | -0.0392145 | 0.0228069 | -1.7194165 | 0.0855386 |

With effect coding, the intercept is the unweighted mean of the cell means; because these data are unbalanced, that is not the same as the grand mean of all observations. The contrasts show that active conditions are more accurate than passive, and that the information-integration category is harder than the rule-based one.