For each exam, we have been asking participants to report their expected grades at two separate time points: immediately after completing each exam, and immediately before we release their grades (usually 2-3 days later). Using these two predictions for each exam, we can test whether participants "adjust" their predictions during the window between completing an exam and learning the outcome.
Below I've included some descriptive statistics and initial results on these prediction adjustments, and I've also replicated the original updating finding from the last markdown (https://rpubs.com/wvillano/652855) using the predictions and prediction errors (PEs) from both time points.
Below is the distribution of these so-called "adjustments," where an adjustment is defined as:

\[ \text{Adjustment} = \text{Prediction}_2 - \text{Prediction}_1 \]
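A minimal sketch of how these adjustments could be computed and summarized, assuming long-format data (one row per participant-exam) with hypothetical `prediction_1` and `prediction_2` columns; `pred_change` and `grades.nomiss` are the names used in the models below:

```r
library(psych)

# Adjustment = second prediction minus first prediction
# (the prediction_1 / prediction_2 column names are assumed)
grades.nomiss$pred_change <- grades.nomiss$prediction_2 - grades.nomiss$prediction_1

# Descriptive statistics for the adjustment scores (cf. the table below)
describe(grades.nomiss$pred_change)
```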
| | n | mean | sd | median | trimmed | mad | min | max | range | skew | kurtosis | se |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Prediction Adjustments | 1652 | -2.73 | 7.44 | 0 | -2.39 | 4.45 | -47 | 50 | 97 | -0.37 | 8.17 | 0.18 |
On average, participants' expectations become more pessimistic as the grade reveal approaches.
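The intercept-only multilevel model below tests whether this average adjustment differs from zero. A minimal sketch of the corresponding lmerTest call (the formula and data frame name are taken from the output):

```r
library(lmerTest)

# Intercept-only multilevel model: is the average adjustment reliably non-zero?
m_adjust <- lmer(pred_change ~ 1 + (1 | id), data = grades.nomiss)
summary(m_adjust)
```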
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: pred_change ~ 1 + (1 | id)
## Data: grades.nomiss
##
## REML criterion at convergence: 11314.1
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -5.7936 -0.3382 0.2852 0.3799 6.9435
##
## Random effects:
## Groups Name Variance Std.Dev.
## id (Intercept) 3.534 1.880
## Residual 51.897 7.204
## Number of obs: 1652, groups: id, 528
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) -2.7470 0.1964 473.5662 -13.99 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Across the four exams in the semester, the raw data suggest that these adjustments shrink at the group level as the semester progresses (see the group-level summary sketched below):
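This is a minimal sketch of that group-level summary, assuming the same `Exam` factor used in the multilevel model that follows:

```r
library(dplyr)

# Mean prediction adjustment for each of the four exams
grades.nomiss %>%
  group_by(Exam) %>%
  summarise(mean_adjustment = mean(pred_change, na.rm = TRUE),
            sd_adjustment = sd(pred_change, na.rm = TRUE),
            n = n())
```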
Indeed, the multilevel model below shows that these adjustments become significantly smaller in magnitude over the course of the semester, though they remain negative (i.e., expectations are still revised downward):
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: `Prediction Adjustment` ~ 1 + Exam + (1 | id)
## Data: temp_df
##
## REML criterion at convergence: 11287.4
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -5.8643 -0.3946 0.2161 0.4133 6.9256
##
## Random effects:
## Groups Name Variance Std.Dev.
## id (Intercept) 3.792 1.947
## Residual 50.919 7.136
## Number of obs: 1652, groups: id, 528
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) -4.1623 0.3439 1636.1858 -12.103 < 2e-16 ***
## Exam2 1.7218 0.4720 1212.8322 3.648 0.000275 ***
## Exam3 2.2091 0.4781 1224.9208 4.620 4.24e-06 ***
## Exam4 2.0054 0.5392 1341.0165 3.720 0.000208 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) Exam2 Exam3
## Exam2 -0.685
## Exam3 -0.676 0.494
## Exam4 -0.600 0.439 0.433
Since we have two predictions for each exam, we also have two PEs for each exam. Here, I calculated PEs from both the first and second predictions for each exam and overlaid the distributions.
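A sketch of how these two PEs could be computed and overlaid, assuming a hypothetical `grade` column and the sign convention PE = received grade minus predicted grade:

```r
library(tidyr)
library(ggplot2)

# One PE per prediction time point (column names and sign convention assumed)
grades.nomiss$PE_1 <- grades.nomiss$grade - grades.nomiss$prediction_1
grades.nomiss$PE_2 <- grades.nomiss$grade - grades.nomiss$prediction_2

# Overlay the two PE distributions
pe_long <- pivot_longer(grades.nomiss, c(PE_1, PE_2),
                        names_to = "PE_type", values_to = "PE")
ggplot(pe_long, aes(x = PE, fill = PE_type)) +
  geom_density(alpha = 0.5)
```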
Regardless of which PE is used, estimates appear to become more accurate over the course of the semester.

To confirm this, I regressed unsigned PEs (the absolute value of each PE) onto exam number to test whether unsigned PEs become systematically smaller over the semester; a minimal sketch of this model is shown below. Taking the PEs that arise from the first prediction, estimation error decreases over the course of the semester:
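The formula and data frame name are taken from the output that follows; the only added assumption is computing the unsigned PE with `abs()`:

```r
library(lmerTest)

# Unsigned PE = absolute value of the prediction error (here, the first-prediction PE)
grades.nomiss.mod$`Unsigned PE` <- abs(grades.nomiss.mod$PE_1)  # PE_1 assumed as sketched above

# Do unsigned PEs shrink across exams?
m_upe <- lmer(`Unsigned PE` ~ Exam + (1 | id), data = grades.nomiss.mod)
summary(m_upe)
```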
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: `Unsigned PE` ~ Exam + (1 | id)
## Data: grades.nomiss.mod
##
## REML criterion at convergence: 12115.1
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.7667 -0.6679 -0.1867 0.4293 8.6608
##
## Random effects:
## Groups Name Variance Std.Dev.
## id (Intercept) 13.67 3.697
## Residual 77.59 8.809
## Number of obs: 1654, groups: id, 528
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 13.5464 0.4428 1583.7176 30.592 < 2e-16 ***
## Exam2 -1.3644 0.5843 1143.2367 -2.335 0.0197 *
## Exam3 -3.2270 0.5927 1155.1796 -5.445 6.34e-08 ***
## Exam4 -5.0844 0.6719 1258.2696 -7.568 7.32e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) Exam2 Exam3
## Exam2 -0.658
## Exam3 -0.648 0.494
## Exam4 -0.574 0.438 0.432
A similar trend is present in the PEs that arise from the second prediction. A key difference, however, is that the unsigned PEs for exams 1 and 2 are much more similar when using the second prediction (the Exam 2 contrast is no longer significant). Recall also that the degree of prediction adjustment was greatest for the first exam.
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: `Unsigned PE` ~ Exam + (1 | id)
## Data: grades.nomiss.mod
##
## REML criterion at convergence: 12176.2
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.6656 -0.6625 -0.2284 0.4404 8.4768
##
## Random effects:
## Groups Name Variance Std.Dev.
## id (Intercept) 12.25 3.499
## Residual 81.93 9.051
## Number of obs: 1654, groups: id, 528
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 12.0811 0.4501 1601.5750 26.842 < 2e-16 ***
## Exam2 -0.1273 0.5999 1158.2391 -0.212 0.8320
## Exam3 -1.3995 0.6084 1170.4602 -2.300 0.0216 *
## Exam4 -3.4790 0.6888 1277.2316 -5.051 5.04e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr) Exam2 Exam3
## Exam2 -0.665
## Exam3 -0.655 0.494
## Exam4 -0.580 0.438 0.432
Importantly, the updating results replicate regardless of whether the first or second predictions/PEs are used.
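For reference, here is a sketch of how the updating model's variables might be constructed. The model call (`pred_delta ~ PE_lag1 + (1 | id)`) and data frame name come from the output below; the derivation of `pred_delta` (change in prediction from one exam to the next) and `PE_lag1` (the previous exam's PE) is my assumption:

```r
library(dplyr)
library(lmerTest)

# Within each participant, order by exam and lag across exams:
# pred_delta = current prediction minus previous prediction (assumed),
# PE_lag1    = prediction error from the previous exam (assumed).
df.2 <- grades.nomiss %>%
  arrange(id, Exam) %>%
  group_by(id) %>%
  mutate(pred_delta = prediction_2 - lag(prediction_2),
         PE_lag1 = lag(PE_2)) %>%
  ungroup()

# Updating model: are expectations revised in the direction of the last PE?
m_update <- lmer(pred_delta ~ PE_lag1 + (1 | id), data = df.2)
summary(m_update)
```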
Below are the original results from the model using second predictions/PEs:
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: pred_delta ~ PE_lag1 + (1 | id)
## Data: df.2
##
## REML criterion at convergence: 11495.4
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -5.2301 -0.4976 -0.0121 0.5138 5.1139
##
## Random effects:
## Groups Name Variance Std.Dev.
## id (Intercept) 0.0 0.00
## Residual 206.4 14.37
## Number of obs: 1407, groups: id, 519
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 1.535e+00 3.846e-01 1.405e+03 3.991 6.92e-05 ***
## PE_lag1 3.922e-01 2.488e-02 1.405e+03 15.762 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## PE_lag1 -0.091
## convergence code: 0
## boundary (singular) fit: see ?isSingular
And here are the replicated results using the first predictions/PEs. (Note that in both models the random intercept variance is estimated at zero, which is why lmer reports a singular fit.)
## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: pred_delta ~ PE_lag1 + (1 | id)
## Data: df.1
##
## REML criterion at convergence: 8893.9
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -4.6619 -0.4891 -0.0047 0.5453 4.6918
##
## Random effects:
## Groups Name Variance Std.Dev.
## id (Intercept) 0.0 0.00
## Residual 188.1 13.72
## Number of obs: 1101, groups: id, 475
##
## Fixed effects:
## Estimate Std. Error df t value Pr(>|t|)
## (Intercept) 1.6243 0.4167 1099.0000 3.898 0.000103 ***
## PE_lag1 0.3093 0.0273 1099.0000 11.328 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## PE_lag1 0.126
## convergence code: 0
## boundary (singular) fit: see ?isSingular
We also asked participants to report their confidence in their predictions at both time points. Below are the distributions and descriptive statistics for both measures, followed by simple regressions of confidence at each time point on the corresponding error term (SSE_1 and SSE_2); in neither case does confidence significantly relate to prediction accuracy.
| | vars | n | mean | sd | median | trimmed | mad | min | max | range | skew | kurtosis | se |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| X1 | 1 | 1684 | 61.18 | 23.95 | 65 | 62.84 | 22.24 | 0 | 100 | 100 | -0.56 | -0.23 | 0.58 |
| | vars | n | mean | sd | median | trimmed | mad | min | max | range | skew | kurtosis | se |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| X1 | 1 | 1131 | 55.29 | 24.69 | 54 | 56.31 | 25.2 | 0 | 100 | 100 | -0.32 | -0.55 | 0.73 |
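A sketch of the two confidence-accuracy regressions summarized below. The formulas and data frame names come from the output; how `SSE_1` and `SSE_2` were computed (presumably a per-participant squared-error summary) is not shown here and is assumed:

```r
# Does confidence at each time point track prediction (in)accuracy?
# SSE_1 / SSE_2 are assumed to be per-participant squared-error terms.
summary(lm(conf_1 ~ SSE_1, data = df.1))
summary(lm(conf_2 ~ SSE_2, data = df.2))
```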
##
## Call:
## lm(formula = conf_1 ~ SSE_1, data = df.1)
##
## Residuals:
## Min 1Q Median 3Q Max
## -61.796 -13.385 4.708 16.778 46.499
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 66.5046 3.6605 18.168 <2e-16 ***
## SSE_1 -0.1944 0.1455 -1.336 0.183
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 23.79 on 217 degrees of freedom
## Multiple R-squared: 0.008163, Adjusted R-squared: 0.003592
## F-statistic: 1.786 on 1 and 217 DF, p-value: 0.1828
##
## Call:
## lm(formula = conf_2 ~ SSE_2, data = df.2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -58.124 -14.257 -0.093 20.545 46.535
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 61.5987 3.9473 15.605 <2e-16 ***
## SSE_2 -0.1779 0.1291 -1.378 0.17
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 25.18 on 228 degrees of freedom
## Multiple R-squared: 0.008254, Adjusted R-squared: 0.003905
## F-statistic: 1.898 on 1 and 228 DF, p-value: 0.1697