Prediction Updating and Previous PE

Previous grade prediction error (PE) predicts expectation updating. This is the basic updating finding from the last markdown. In the sections below, I fit a few additional models to see whether updating differs following positive vs. negative PEs.

## delta_pred_2 ~ pe_2_lag1 + (1 | cohort/id)
               Estimate  Std. Error           df    t value  Pr(>|t|)
(Intercept)   0.6301874   1.2654662     2.063108  0.4979883  0.666525
pe_2_lag1     0.3014926   0.0316644  1523.995268  9.5214976  0.000000
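
For reference, this model can be fit with lmerTest, which supplies the Satterthwaite degrees of freedom shown in these tables. A minimal sketch, where df stands in for the analysis data frame:

library(lmerTest)  # lmer() with Satterthwaite-approximated dfs

# baseline updating model: lagged PE predicting the change in predictions,
# with random intercepts for subjects nested within cohorts
updating.lmer <- lmer(delta_pred_2 ~ pe_2_lag1 + (1 | cohort/id), data = df)
summary(updating.lmer)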



Does expectation updating differ following positive vs. negative PEs?

Nonlinear fits seem to show that updating is more strongly predicted by positive PEs.

## delta_pred_2 ~ s(pe_2_lag1) + (1 | id)
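
One way to fit a smooth like this alongside a random intercept is gamm4, which combines mgcv smooths with lmer-style random effects. A sketch under that assumption (the original fit may have used a different smoothing approach):

library(gamm4)

# smooth of lagged PE with a random intercept per subject
updating.gamm <- gamm4(delta_pred_2 ~ s(pe_2_lag1), random = ~ (1 | id), data = df)
plot(updating.gamm$gam)  # plots the estimated smooth of updating over lagged PE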
[Figure: nonlinear (GAM) fit of expectation updating over lagged PE]

Below, the data are split by PE direction and two separate updating models are fit.
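
Concretely, the two subset fits might look like the following (the subsetting calls and the shared random-effects structure are assumptions based on the output below):

# hypothetical split-sample fits by PE direction
updating.pos.lmer <- lmer(delta_pred_2 ~ pe_2_lag1 + (1 | cohort/id),
                          data = subset(df, pe_2_lag1 > 0))
updating.neg.lmer <- lmer(delta_pred_2 ~ pe_2_lag1 + (1 | cohort/id),
                          data = subset(df, pe_2_lag1 < 0))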




Positive PEs predict linear increases in grade expectation

The following model is fit only to data with positive PEs on exam t-1.

              Estimate  Std. Error   df    t value   Pr(>|t|)
(Intercept)  -2.360726   0.8979133  886  -2.629125  0.0087088
pe_2_lag1     0.510907   0.0713421  886   7.161372  0.0000000



Negative PEs do not predict expectation updating

Here, the data are subset to observations with negative PEs on exam t-1, and an identical model is fit. (The singular-fit warning below indicates that one of the random-effect variances was estimated at zero.)

               Estimate  Std. Error          df     t value   Pr(>|t|)
(Intercept)  -1.5053055   1.7641859    3.209626  -0.8532579  0.4524736
pe_2_lag1     0.0558684   0.0798899  611.189935   0.6993179  0.4846194
## boundary (singular) fit: see ?isSingular



Modeling PE as an interaction (Unsigned PE * PE direction)

A BIC comparison favors the interactive model: the negative difference below indicates that the interactive model has the lower (better) BIC.

BIC(updating.PEint.lmer) - BIC(updating.lmer)
## [1] -176.5658

Again, updating is occurring primarily after positive PEs.

## delta_pred_2 ~ unsigned_lag_pe_2 * pe_2_lag_sign_f + (1 | id)
                                      Estimate  Std. Error    df     t value   Pr(>|t|)
(Intercept)                         -2.1267669   1.0018378  1498  -2.1228655  0.0339287
unsigned_lag_pe_2                   -0.0378307   0.0823194  1498  -0.4595595  0.6458991
pe_2_lag_sign_f1                    -0.2339594   1.3365493  1498  -0.1750473  0.8610661
unsigned_lag_pe_2:pe_2_lag_sign_f1   0.5487377   0.1082471  1498   5.0693080  0.0000004
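
The unsigned-PE and sign-factor predictors could be derived along these lines (a sketch; the exact derivations and factor coding are assumptions):

# absolute value of the lagged PE, plus a factor coding its direction
df$unsigned_lag_pe_2 <- abs(df$pe_2_lag1)
df$pe_2_lag_sign_f <- factor(ifelse(df$pe_2_lag1 >= 0, "positive", "negative"))

updating.PEint.lmer <- lmer(delta_pred_2 ~ unsigned_lag_pe_2 * pe_2_lag_sign_f + (1 | id),
                            data = df)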



Prediction Adjustments

\[\text{Prediction}_2 - \text{Prediction}_1\]

For each exam, participants report their expected grades at two separate time points: immediately following each exam, and immediately before we release their grades (usually 2-3 days later). Using these two predictions for each exam, we can see whether participants “adjust” their predictions in the period between the completion of an exam and the reveal of the outcome.
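
In code, the adjustment is simply the difference between the two reports for the same exam. A sketch, where prediction_1 and prediction_2 are hypothetical column names for the post-exam and pre-reveal predictions:

# pre-reveal prediction minus post-exam prediction (hypothetical column names)
df$pred_adj <- df$prediction_2 - df$prediction_1
psych::describe(df$pred_adj)  # descriptive statistics, as summarized below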


Average prediction adjustment is negative

On average, participants reduce their grade expectations by approximately 2.5 points in the lead-up to the grade reveal. In other words, participants’ expectations become more pessimistic as the release of grades comes nigh.

                           n   mean    sd  median  trimmed   mad  min  max  range  skew  kurtosis    se
Prediction Adjustment   2565  -2.47  7.02       0    -2.23  2.97  -40   45     85  -0.3      6.34  0.14



Prediction adjustments become less negative over time

Linear fits indicate that prediction adjustments are largest and most negative at the start of the semester. Over the course of the semester, average prediction adjustments become closer to zero, but are still negative at the final exam.

## pred_adj ~ exam + (1 | cohort/id)
               Estimate  Std. Error        df     t value   Pr(>|t|)
(Intercept)  -3.2010392   0.3193852  2544.455  -10.022504  0.0000000
exam          0.2665492   0.1110968  2119.081    2.399253  0.0165145



Negative prediction adjustments predict more positive PEs

If more negative prediction adjustments lead to more positive PEs, we might think of this as adaptive pessimism that increases the probability of a positive surprise.

To see whether this is the case, PEs are regressed onto prediction adjustment in the model below.

## pe_2 ~ pred_adj + (1 | cohort/id)
               Estimate  Std. Error           df     t value   Pr(>|t|)
(Intercept)   0.9041512   0.9820188     1.992745   0.9207067  0.4547039
pred_adj     -0.3044954   0.0328590  2521.177526  -9.2667356  0.0000000



Prediction adjustment predicts expectation updating on the next exam

In a model that also includes the lagged PE terms, prediction adjustment on exam t-1 negatively predicts expectation updating. Taken with the preceding results, this seems to indicate that participants who adjust their predictions downward tend to experience more positive PEs, which in turn produce larger upward revisions of expectations for the next exam.

## delta_pred_2 ~ pred_adj_lag1 + unsigned_lag_pe_2 * pe_2_lag_sign_f + 
##     (1 | cohort/id)
                                      Estimate  Std. Error           df     t value   Pr(>|t|)
(Intercept)                         -2.2003836   1.5310323     5.005739  -1.4371895  0.2101036
pred_adj_lag1                       -0.2534873   0.0563857  1496.575577  -4.4955974  0.0000075
unsigned_lag_pe_2                   -0.0277109   0.0814392  1496.017388  -0.3402645  0.7337051
pe_2_lag_sign_f1                    -0.0923724   1.3201490  1495.402315  -0.0699712  0.9442260
unsigned_lag_pe_2:pe_2_lag_sign_f1   0.5240409   0.1080784  1496.715098   4.8487088  0.0000014
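
The lagged predictors used here (e.g., pred_adj_lag1) can be built by shifting each subject's values forward one exam. A dplyr sketch, assuming one row per subject-exam:

library(dplyr)

df <- df %>%
  group_by(id) %>%
  arrange(exam, .by_group = TRUE) %>%
  mutate(pred_adj_lag1 = lag(pred_adj),   # adjustment on the previous exam
         pe_2_lag1     = lag(pe_2)) %>%   # PE on the previous exam
  ungroup()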



New predictor: information content of exam grade

Information content represents how surprising an exam grade was, given a subject’s previous grades and uncertainty around those grades (i.e., prediction errors). The goal here is to build probability densities that represent a participant’s expectation for the next exam grade.

For each exam, I set up a normal probability distribution (likelihood) with the mean centered around a subject’s grade, and variance proportional to the subject’s unsigned PE for that exam. Larger unsigned PEs lead to wider probability distributions around an exam grade.
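
In symbols, with \(g_t\) the grade on exam \(t\), \(|PE_t|\) the unsigned prediction error, and \(k\) the scaling coefficient noted below:

\[\mathcal{L}_t(g) = \mathcal{N}\!\left(g \mid g_t,\, \sigma_t^2\right), \qquad \sigma_t^2 = k\,\lvert PE_t \rvert\]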

With each new exam and grade PE, a new likelihood is constructed to update prior grade expectations. The plots below show how priors (shown in blue) are cumulatively updated by the likelihoods (shown in green) for each exam that follows. Using these cumulatively updated grade probability distributions, I computed information content as -log(prior probability) for the grade on the following exam.
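
Formally, each prior is updated multiplicatively and renormalized, and the information content of the next grade is its negative log probability under the current prior:

\[p_t(g) \propto p_{t-1}(g)\,\mathcal{L}_t(g), \qquad IC_{t+1} = -\log p_t\!\left(g_{t+1}\right)\]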

A few notes:

  • Information content was only computed for subjects for whom we have at least 4 successive exam grades and PEs.

  • Variance of grade likelihood is proportional to PE. I used a scaling coefficient of 2 to set the variance parameter. This was not an empirical decision, so we might try optimizing this coefficient.
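
Here is a rough sketch of this computation on a discretized grade axis (the grid resolution and helper-function names are assumptions; k = 2 matches the scaling coefficient above):

grades <- seq(0, 100, by = 0.5)  # discretized support for grade distributions

# normal likelihood centered on the observed grade, variance proportional to |PE|
grade_likelihood <- function(grade, unsigned_pe, k = 2) {
  l <- dnorm(grades, mean = grade, sd = sqrt(k * unsigned_pe))
  l / sum(l)  # normalize over the grid
}

# multiply the prior by the likelihood and renormalize
update_prior <- function(prior, grade, unsigned_pe) {
  posterior <- prior * grade_likelihood(grade, unsigned_pe)
  posterior / sum(posterior)
}

# information content of the next grade under the current prior
info_content <- function(prior, next_grade) {
  -log(prior[which.min(abs(grades - next_grade))])
}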

Here’s how this looks for one example subject:

[Figure: priors (blue) cumulatively updated by exam-grade likelihoods (green) across successive exams for one example subject]


Information content and expectation updating

For exams where participants reported negative prediction errors, updating is strongest when information content (i.e., surprisal) is low. For instances where participants reported positive prediction errors, expectation updating does not appear to be modulated by information content.

## delta_pred_2 ~ log_info_content_lag1 * pe_2_lag_sign_f + (1 | id)
                                          Estimate  Std. Error   df    t value   Pr(>|t|)
(Intercept)                             -13.210089    2.445931  578  -5.400842  0.0000001
log_info_content_lag1                     5.517285    1.132822  578   4.870389  0.0000014
pe_2_lag_sign_f1                         14.660977    3.265144  578   4.490146  0.0000086
log_info_content_lag1:pe_2_lag_sign_f1   -5.130500    1.539668  578  -3.332212  0.0009166



Integrating information content, PE, and prediction adjustment

In all likelihood this model is woefully misspecified, but this is a first attempt at including the aforementioned predictors (prediction adjustment, previous PE, and information content) in the same model. Below is a summary of this model, as well as plots of the predicted values of expectation updating as a function of PE, information content, and prediction adjustment.

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: delta_pred_2 ~ pred_adj_lag1 * pe_2_lag1 + log_info_content_lag1 *  
##     pe_2_lag1 + exam + (1 | cohort/id)
##    Data: df.new
## 
## REML criterion at convergence: 4752.7
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.6133 -0.5480 -0.0011  0.4929  3.8682 
## 
## Random effects:
##  Groups    Name        Variance Std.Dev.
##  id:cohort (Intercept)   0.000   0.000  
##  cohort    (Intercept)   9.252   3.042  
##  Residual              161.862  12.723  
## Number of obs: 598, groups:  id:cohort, 248; cohort, 3
## 
## Fixed effects:
##                                   Estimate Std. Error         df t value
## (Intercept)                     -15.610585   3.566961  27.117746  -4.376
## pred_adj_lag1                    -0.257503   0.102088 589.958947  -2.522
## pe_2_lag1                         0.667676   0.131735 590.057793   5.068
## log_info_content_lag1             2.747073   0.734320 589.267068   3.741
## exam                              2.733966   0.766972 582.878441   3.565
## pred_adj_lag1:pe_2_lag1          -0.015670   0.007946 589.562546  -1.972
## pe_2_lag1:log_info_content_lag1  -0.204493   0.051744 590.618335  -3.952
##                                 Pr(>|t|)    
## (Intercept)                     0.000161 ***
## pred_adj_lag1                   0.011919 *  
## pe_2_lag1                       5.38e-07 ***
## log_info_content_lag1           0.000201 ***
## exam                            0.000394 ***
## pred_adj_lag1:pe_2_lag1         0.049076 *  
## pe_2_lag1:log_info_content_lag1 8.69e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) prd__1 p_2_l1 lg___1 exam   p__1:_
## pred_dj_lg1  0.110                                   
## pe_2_lag1    0.004  0.007                            
## lg_nf_cnt_1 -0.336 -0.091  0.037                     
## exam        -0.750 -0.042 -0.047 -0.098              
## prd__1:_2_1 -0.007 -0.258  0.263  0.046 -0.013       
## p_2_l1:___1 -0.006  0.007 -0.915 -0.003  0.020 -0.144
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see ?isSingular
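
The predicted-values plots referenced above could be generated with, e.g., ggeffects (a sketch; updating.full.lmer is a placeholder name for the model summarized above):

library(ggeffects)

# model-implied updating across lagged PE at representative moderator levels
plot(ggpredict(updating.full.lmer, terms = c("pe_2_lag1", "log_info_content_lag1")))
plot(ggpredict(updating.full.lmer, terms = c("pe_2_lag1", "pred_adj_lag1")))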
[Figure: predicted expectation updating as a function of lagged PE and information content]

Regardless of PE direction, updating slopes are larger when information content is low. However, the interaction between PE and information content is significant: when information content is high, positive PEs still predict updating while negative PEs do not. This suggests that participants are more likely to discount surprising negative information and to integrate surprising positive information into future expectations. Perhaps this is evidence of an optimism bias.

[Figure: predicted expectation updating as a function of lagged PE and prediction adjustment]

Expectation updating is also predicted by the interaction between prediction adjustment and PE: negative adjustments to grade predictions yield more positive prediction errors, which in turn drive greater updating.




Taken together, these results seem indicative of optimism biases in the way participants consider their performance and subsequently update their expectations:

  • Updating is asymmetric across positive vs. negative PEs, occurring mostly after positive PEs

  • Following negative PEs, updating primarily occurs when information content is low. When grades are surprising but better than expected, participants seem to incorporate them into their future expectations. However, surprising grades that are worse than expected are discounted, and expectations go unchanged.

  • Given the asymmetry in updating over positive vs. negative PEs, and that negative prediction adjustments are associated with more positive PEs, negative adjustments to grade predictions predict more positive expectation updating.