Prediction Updating and Previous PE

Previous grade prediction error (PE) predicts expectation updating. This is the basic updating finding from the last markdown. In the sections below, I fit a few additional models to see whether updating differs following positive vs. negative PEs.

## delta_pred_2 ~ pe_2_lag1 + (1 | cohort/id)
               Estimate  Std. Error           df    t value  Pr(>|t|)
(Intercept)   0.6301874   1.2654662     2.063108  0.4979883  0.666525
pe_2_lag1     0.3014926   0.0316644  1523.995268  9.5214976  0.000000
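
For reference, this model can be fit with lmerTest, which supplies the Satterthwaite degrees of freedom shown in these tables. A minimal sketch, where df stands in for the analysis data frame:

library(lmerTest)  # lmer() with Satterthwaite-approximated dfs

# baseline updating model: lagged PE predicting the change in predictions,
# with random intercepts for subjects nested within cohorts
updating.lmer <- lmer(delta_pred_2 ~ pe_2_lag1 + (1 | cohort/id), data = df)
summary(updating.lmer)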



Does expectation updating differ following positive vs. negative PEs?

Nonlinear fits seem to show that updating is more strongly predicted by positive PEs.

## delta_pred_2 ~ s(pe_2_lag1) + (1 | id)
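
One way to fit a smooth like this alongside a random intercept is gamm4, which combines mgcv smooths with lmer-style random effects. A sketch under that assumption (the original fit may have used a different smoothing approach):

library(gamm4)

# smooth of lagged PE with a random intercept per subject
updating.gamm <- gamm4(delta_pred_2 ~ s(pe_2_lag1), random = ~ (1 | id), data = df)
plot(updating.gamm$gam)  # plots the estimated smooth of updating over lagged PE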
[Figure: nonlinear (GAM) fit of expectation updating over lagged PE]

Below, the data are split by PE direction and two separate updating models are fit.
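
Concretely, the two subset fits might look like the following (the subsetting calls and the shared random-effects structure are assumptions based on the output below):

# hypothetical split-sample fits by PE direction
updating.pos.lmer <- lmer(delta_pred_2 ~ pe_2_lag1 + (1 | cohort/id),
                          data = subset(df, pe_2_lag1 > 0))
updating.neg.lmer <- lmer(delta_pred_2 ~ pe_2_lag1 + (1 | cohort/id),
                          data = subset(df, pe_2_lag1 < 0))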




Positive PEs predict linear increases in grade expectation

The following model is fit only to data with positive PEs on exam t-1.

              Estimate  Std. Error   df    t value   Pr(>|t|)
(Intercept)  -2.360726   0.8979133  886  -2.629125  0.0087088
pe_2_lag1     0.510907   0.0713421  886   7.161372  0.0000000



Negative PEs do not predict expectation updating

Here, the data are subset to observations with negative PEs on exam t-1, and an identical model is fit. (The singular-fit warning below indicates that one of the random-effect variances was estimated at zero.)

               Estimate  Std. Error          df     t value   Pr(>|t|)
(Intercept)  -1.5053055   1.7641859    3.209626  -0.8532579  0.4524736
pe_2_lag1     0.0558684   0.0798899  611.189935   0.6993179  0.4846194
## boundary (singular) fit: see ?isSingular



Modeling PE as an interaction (Unsigned PE * PE direction)

A BIC comparison favors the interactive model: the negative difference below indicates that the interactive model has the lower (better) BIC.

BIC(updating.PEint.lmer) - BIC(updating.lmer)
## [1] -176.5658

Again, updating is occurring primarily after positive PEs.

## delta_pred_2 ~ unsigned_lag_pe_2 * pe_2_lag_sign_f + (1 | id)
                                      Estimate  Std. Error    df     t value   Pr(>|t|)
(Intercept)                         -2.1267669   1.0018378  1498  -2.1228655  0.0339287
unsigned_lag_pe_2                   -0.0378307   0.0823194  1498  -0.4595595  0.6458991
pe_2_lag_sign_f1                    -0.2339594   1.3365493  1498  -0.1750473  0.8610661
unsigned_lag_pe_2:pe_2_lag_sign_f1   0.5487377   0.1082471  1498   5.0693080  0.0000004
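
The unsigned-PE and sign-factor predictors could be derived along these lines (a sketch; the exact derivations and factor coding are assumptions):

# absolute value of the lagged PE, plus a factor coding its direction
df$unsigned_lag_pe_2 <- abs(df$pe_2_lag1)
df$pe_2_lag_sign_f <- factor(ifelse(df$pe_2_lag1 >= 0, "positive", "negative"))

updating.PEint.lmer <- lmer(delta_pred_2 ~ unsigned_lag_pe_2 * pe_2_lag_sign_f + (1 | id),
                            data = df)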



Prediction Adjustments

\[\text{Prediction}_2 - \text{Prediction}_1\]

For each exam, participants report their expected grades at two separate time points: immediately following each exam, and immediately before we release their grades (usually 2-3 days later). Using these two predictions for each exam, we can see whether participants “adjust” their predictions in the period between the completion of an exam and the reveal of the outcome.
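
In code, the adjustment is simply the difference between the two reports for the same exam. A sketch, where prediction_1 and prediction_2 are hypothetical column names for the post-exam and pre-reveal predictions:

# pre-reveal prediction minus post-exam prediction (hypothetical column names)
df$pred_adj <- df$prediction_2 - df$prediction_1
psych::describe(df$pred_adj)  # descriptive statistics, as summarized below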


Average prediction adjustment is negative

On average, participants reduce their grade expectations by approximately 2.5 points in the lead-up to the grade reveal. In other words, participants’ expectations become more pessimistic as the release of grades comes nigh.

                           n   mean    sd  median  trimmed   mad  min  max  range  skew  kurtosis    se
Prediction Adjustment   2565  -2.47  7.02       0    -2.23  2.97  -40   45     85  -0.3      6.34  0.14



Prediction adjustments become less negative over time

Linear fits indicate that prediction adjustments are largest and most negative at the start of the semester. Over the course of the semester, average prediction adjustments become closer to zero, but are still negative at the final exam.

## pred_adj ~ exam + (1 | cohort/id)
               Estimate  Std. Error        df     t value   Pr(>|t|)
(Intercept)  -3.2010392   0.3193852  2544.455  -10.022504  0.0000000
exam          0.2665492   0.1110968  2119.081    2.399253  0.0165145



Negative prediction adjustments predict more positive PEs

If more negative prediction adjustments lead to more positive PEs, we might think of this as adaptive pessimism that increases the probability of a positive surprise.

To see whether this is the case, PEs are regressed onto prediction adjustment in the model below.

## pe_2 ~ pred_adj + (1 | cohort/id)
               Estimate  Std. Error           df     t value   Pr(>|t|)
(Intercept)   0.9041512   0.9820188     1.992745   0.9207067  0.4547039
pred_adj     -0.3044954   0.0328590  2521.177526  -9.2667356  0.0000000



Prediction adjustment predicts expectation updating on the next exam

In a model that also includes the lagged PE terms, prediction adjustment on exam t-1 negatively predicts expectation updating. Taken with the preceding results, this seems to indicate that participants who adjust their predictions downward tend to experience more positive PEs, which in turn produce larger upward revisions of expectations for the next exam.

## delta_pred_2 ~ pred_adj_lag1 + unsigned_lag_pe_2 * pe_2_lag_sign_f + 
##     (1 | cohort/id)
                                      Estimate  Std. Error           df     t value   Pr(>|t|)
(Intercept)                         -2.2003836   1.5310323     5.005739  -1.4371895  0.2101036
pred_adj_lag1                       -0.2534873   0.0563857  1496.575577  -4.4955974  0.0000075
unsigned_lag_pe_2                   -0.0277109   0.0814392  1496.017388  -0.3402645  0.7337051
pe_2_lag_sign_f1                    -0.0923724   1.3201490  1495.402315  -0.0699712  0.9442260
unsigned_lag_pe_2:pe_2_lag_sign_f1   0.5240409   0.1080784  1496.715098   4.8487088  0.0000014
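
The lagged predictors used here (e.g., pred_adj_lag1) can be built by shifting each subject's values forward one exam. A dplyr sketch, assuming one row per subject-exam:

library(dplyr)

df <- df %>%
  group_by(id) %>%
  arrange(exam, .by_group = TRUE) %>%
  mutate(pred_adj_lag1 = lag(pred_adj),   # adjustment on the previous exam
         pe_2_lag1     = lag(pe_2)) %>%   # PE on the previous exam
  ungroup()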



New predictor: information content of exam grade

Information content represents how surprising an exam grade was, given a subject’s previous grades and uncertainty around those grades (i.e., prediction errors). The goal here is to build probability densities that represent a participant’s expectation for the next exam grade.

For each exam, I set up a normal probability distribution (likelihood) with the mean centered around a subject’s grade, and variance proportional to the subject’s unsigned PE for that exam. Larger unsigned PEs lead to wider probability distributions around an exam grade.
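
In symbols, with \(g_t\) the grade on exam \(t\), \(|PE_t|\) the unsigned prediction error, and \(k\) the scaling coefficient noted below:

\[\mathcal{L}_t(g) = \mathcal{N}\!\left(g \mid g_t,\, \sigma_t^2\right), \qquad \sigma_t^2 = k\,\lvert PE_t \rvert\]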

With each new exam and grade PE, a new likelihood is constructed to update prior grade expectations. The plots below show how priors (shown in blue) are cumulatively updated by the likelihoods (shown in green) for each exam that follows. Using these cumulatively updated grade probability distributions, I computed information content as -log(prior probability) for the grade on the following exam.
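
Formally, each prior is updated multiplicatively and renormalized, and the information content of the next grade is its negative log probability under the current prior:

\[p_t(g) \propto p_{t-1}(g)\,\mathcal{L}_t(g), \qquad IC_{t+1} = -\log p_t\!\left(g_{t+1}\right)\]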

A few notes:

  • Information content was only computed for subjects for whom we have at least 4 successive exam grades and PEs.

  • Variance of grade likelihood is proportional to PE. I used a scaling coefficient of 2 to set the variance parameter. This was not an empirical decision, so we might try optimizing this coefficient.
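
Here is a rough sketch of this computation on a discretized grade axis (the grid resolution and helper-function names are assumptions; k = 2 matches the scaling coefficient above):

grades <- seq(0, 100, by = 0.5)  # discretized support for grade distributions

# normal likelihood centered on the observed grade, variance proportional to |PE|
grade_likelihood <- function(grade, unsigned_pe, k = 2) {
  l <- dnorm(grades, mean = grade, sd = sqrt(k * unsigned_pe))
  l / sum(l)  # normalize over the grid
}

# multiply the prior by the likelihood and renormalize
update_prior <- function(prior, grade, unsigned_pe) {
  posterior <- prior * grade_likelihood(grade, unsigned_pe)
  posterior / sum(posterior)
}

# information content of the next grade under the current prior
info_content <- function(prior, next_grade) {
  -log(prior[which.min(abs(grades - next_grade))])
}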

Here’s how this looks for one example subject:

[Figure: priors (blue) cumulatively updated by exam-grade likelihoods (green) across successive exams for one example subject]


Information content and expectation updating

For exams where participants reported negative prediction errors, updating is strongest when information content (i.e., surprisal) is low. For instances where participants reported positive prediction errors, expectation updating does not appear to be modulated by information content.

## delta_pred_2 ~ log_info_content_lag1 * pe_2_lag_sign_f + (1 | id)
                                          Estimate  Std. Error   df    t value   Pr(>|t|)
(Intercept)                             -13.210089    2.445931  578  -5.400842  0.0000001
log_info_content_lag1                     5.517285    1.132822  578   4.870389  0.0000014
pe_2_lag_sign_f1                         14.660977    3.265144  578   4.490146  0.0000086
log_info_content_lag1:pe_2_lag_sign_f1   -5.130500    1.539668  578  -3.332212  0.0009166



Integrating information content, PE, and prediction adjustment

In all likelihood this model is woefully misspecified, but this is a first attempt at including the aforementioned predictors (prediction adjustment, previous PE, and information content) in the same model. Below is a summary of this model, as well as plots of the predicted values of expectation updating as a function of PE, information content, and prediction adjustment.

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: delta_pred_2 ~ pred_adj_lag1 * pe_2_lag1 + log_info_content_lag1 *  
##     pe_2_lag1 + exam + (1 | cohort/id)
##    Data: df.new
## 
## REML criterion at convergence: 4752.7
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.6133 -0.5480 -0.0011  0.4929  3.8682 
## 
## Random effects:
##  Groups    Name        Variance Std.Dev.
##  id:cohort (Intercept)   0.000   0.000  
##  cohort    (Intercept)   9.252   3.042  
##  Residual              161.862  12.723  
## Number of obs: 598, groups:  id:cohort, 248; cohort, 3
## 
## Fixed effects:
##                                   Estimate Std. Error         df t value
## (Intercept)                     -15.610585   3.566961  27.117746  -4.376
## pred_adj_lag1                    -0.257503   0.102088 589.958947  -2.522
## pe_2_lag1                         0.667676   0.131735 590.057793   5.068
## log_info_content_lag1             2.747073   0.734320 589.267068   3.741
## exam                              2.733966   0.766972 582.878441   3.565
## pred_adj_lag1:pe_2_lag1          -0.015670   0.007946 589.562546  -1.972
## pe_2_lag1:log_info_content_lag1  -0.204493   0.051744 590.618335  -3.952
##                                 Pr(>|t|)    
## (Intercept)                     0.000161 ***
## pred_adj_lag1                   0.011919 *  
## pe_2_lag1                       5.38e-07 ***
## log_info_content_lag1           0.000201 ***
## exam                            0.000394 ***
## pred_adj_lag1:pe_2_lag1         0.049076 *  
## pe_2_lag1:log_info_content_lag1 8.69e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) prd__1 p_2_l1 lg___1 exam   p__1:_
## pred_dj_lg1  0.110                                   
## pe_2_lag1    0.004  0.007                            
## lg_nf_cnt_1 -0.336 -0.091  0.037                     
## exam        -0.750 -0.042 -0.047 -0.098              
## prd__1:_2_1 -0.007 -0.258  0.263  0.046 -0.013       
## p_2_l1:___1 -0.006  0.007 -0.915 -0.003  0.020 -0.144
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see ?isSingular
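
The predicted-values plots referenced above could be generated with, e.g., ggeffects (a sketch; updating.full.lmer is a placeholder name for the model summarized above):

library(ggeffects)

# model-implied updating across lagged PE at representative moderator levels
plot(ggpredict(updating.full.lmer, terms = c("pe_2_lag1", "log_info_content_lag1")))
plot(ggpredict(updating.full.lmer, terms = c("pe_2_lag1", "pred_adj_lag1")))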
[Figure: predicted expectation updating as a function of lagged PE and information content]

Regardless of PE direction, updating slopes are larger when information content is low. However, the interaction between PE and information content is significant: when information content is high, positive PEs still predict updating while negative PEs do not. This suggests that participants are more likely to discount surprising negative information and to integrate surprising positive information into future expectations. Perhaps this is evidence of an optimism bias.

[Figure: predicted expectation updating as a function of lagged PE and prediction adjustment]

Expectation updating is also predicted by the interaction between prediction adjustment and PE: negative adjustments to grade predictions yield more positive prediction errors, which in turn drive greater updating.




Taken together, these results seem indicative of optimism biases in the way participants consider their performance and subsequently update their expectations:

  • Updating is asymmetric across positive vs. negative PEs, occurring mostly after positive PEs

  • Following negative PEs, updating primarily occurs when information content is low. When grades are surprising but better than expected, participants seem to incorporate them into their future expectations. However, surprising grades that are worse than expected are discounted, and expectations go unchanged.

  • Given the asymmetry in updating over positive vs. negative PEs, and that negative prediction adjustments are associated with more positive PEs, negative adjustments to grade predictions predict more positive expectation updating.