The goal of these simulations is to create a ‘bare-bones’ example of how confidence can be related to accuracy within one mind. Here is the basic framework: Simulated agents are presented with two cues, each represented by a Normal distribution with some mean and variance. Cue A is defined as a Normal with mu = 0 and sd = 1. Cue B is defined as a Normal with a positive bias (mu > 0), and can take on several different standard deviations (e.g., low = .5, medium = 1, high = 2). In Phase 1, the agent chooses a cue at random and draws N (in this case 4) random samples. It then generates a best estimate and a confidence interval (using one of two definitions for each). In Phase 2, it is assigned to either a control or a dialectical condition. In the control condition, it selects the same cue again; in the dialectical condition, it selects the other cue. It then draws N (again, 4) new random samples from the selected cue.
In each estimation phase, agents select a probe cue, then draw \(wm_i\) samples from the corresponding cue distribution, where \(wm_i\) is the working memory capacity of the agent. To keep things simple, I’ll set \(wm_i\) to 4. These samples define the Subjective Sample Distribution (SSD). Based on the SSD, the agent gives responses as follows:
Best Estimates
Confidence Intervals
(hidden but included in markdown)
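The two-phase procedure described above can be sketched roughly as follows. This is a minimal illustration, not the original simulation code: the function names, the cue B parameters (mu = 0.5, sd = 2), and the use of Python are all assumptions made for the example. Only the SSD range is shown as the imprecision measure, since the alternative (Juslin) definition is not spelled out in the text.

```python
import random
import statistics

# (mu, sd) per cue. Cue A follows the text; cue B's values are illustrative.
CUES = {"A": (0.0, 1.0), "B": (0.5, 2.0)}

def draw_ssd(cue, wm=4, rng=random):
    """Draw wm samples from the chosen cue to form the SSD."""
    mu, sd = CUES[cue]
    return [rng.gauss(mu, sd) for _ in range(wm)]

def respond(ssd, estimate="mean"):
    """Return (best estimate, imprecision) for one SSD."""
    if estimate == "mean":
        best = statistics.mean(ssd)
    else:
        best = statistics.median(ssd)
    # Imprecision as the SSD range: a wider range means lower confidence.
    imprecision = max(ssd) - min(ssd)
    return best, imprecision

def run_agent(condition, rng=random):
    """Simulate one agent through Phase 1 and Phase 2."""
    first_cue = rng.choice(["A", "B"])
    if condition == "control":
        second_cue = first_cue  # same cue again
    else:
        second_cue = "B" if first_cue == "A" else "A"  # dialectical: other cue
    return {
        "cues": (first_cue, second_cue),
        "phase1": respond(draw_ssd(first_cue, rng=rng)),
        "phase2": respond(draw_ssd(second_cue, rng=rng)),
    }

print(run_agent("dialectical", random.Random(1)))
```

Passing a seeded `random.Random` instance keeps a run reproducible; the default falls back to the module-level generator.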
Does confidence correlate with accuracy in first estimates? I’ll test this in three ways.
Notes
The operational definitions of best estimates (mean vs. median) and imprecision (SSD range vs. Juslin) do not affect any of our main conclusions (all four plots look virtually identical).
Within a cue, imprecision is virtually uncorrelated with estimate error (!). You can see this by the very small r values within each cue. This is true for both the unbiased cue and the biased cue. In other words, if you use a single cue, then confidence should be unrelated to accuracy of point estimates.
Across cues, imprecision is positively correlated with estimate error. You can see this by the positive r values for the orange regression lines. In other words, the more confident and more precise you are, the more accurate your best estimate is.
Positive resolution in repeated judgments is due to differential cue use, and not to repeated estimation from the same cue.
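The within-cue versus across-cue contrast in these notes can be checked with a small simulation. The sketch below is an illustration under stated assumptions rather than the original analysis: the true value is taken to be 0 (so cue B's positive mean acts as bias), cue B is given mu = 0.5 and sd = 2, best estimates are SSD means, imprecision is the SSD range, and all function names are hypothetical.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate_cue(mu, sd, n_agents, rng, wm=4, truth=0.0):
    """Return per-agent absolute errors and SSD ranges for one cue."""
    errors, ranges = [], []
    for _ in range(n_agents):
        ssd = [rng.gauss(mu, sd) for _ in range(wm)]
        errors.append(abs(sum(ssd) / wm - truth))  # error of the mean estimate
        ranges.append(max(ssd) - min(ssd))         # imprecision
    return errors, ranges

rng = random.Random(2024)
err_a, rng_a = simulate_cue(0.0, 1.0, 5000, rng)  # cue A: unbiased
err_b, rng_b = simulate_cue(0.5, 2.0, 5000, rng)  # cue B: assumed parameters

r_within_a = pearson(rng_a, err_a)
r_within_b = pearson(rng_b, err_b)
r_across = pearson(rng_a + rng_b, err_a + err_b)  # pooled over both cues

print(f"within A: {r_within_a:.3f}")
print(f"within B: {r_within_b:.3f}")
print(f"across cues: {r_across:.3f}")
```

For Normal samples, the sample mean and the sample range are statistically independent, which is consistent with near-zero correlations within a cue. The pooled correlation turns positive in this setup only because the assumed cue B has both wider ranges and larger errors on average.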
Next I explore the accuracy of different aggregation strategies: First (take the first best estimate), Blend (average the two best estimates), and High Confidence (take the high confidence estimate).
First - Blend: Mean difference in absolute deviation between first estimates and blended estimates. Higher values indicate higher accuracy for blend.
First - HConf: Mean difference in absolute deviation between first estimates and High confidence estimates. Higher values indicate higher accuracy for High confidence estimates.
Blend - HConf: Mean difference in absolute deviation between blended estimates and High confidence estimates. Higher values indicate higher accuracy for High confidence estimates.
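The three strategies and the three difference scores defined above can be written out for a single agent as follows. This is a sketch under assumptions: each phase's response is represented as a hypothetical `(best_estimate, imprecision)` pair, the true value is taken to be 0, the high-confidence estimate is the one with the smaller imprecision, and ties are resolved in favor of the first estimate (a choice the text does not specify).

```python
def aggregate(first, second):
    """Combine two (best_estimate, imprecision) pairs under each strategy."""
    (e1, w1), (e2, w2) = first, second
    return {
        "First": e1,                        # keep the first estimate
        "Blend": (e1 + e2) / 2,             # average the two estimates
        "HConf": e1 if w1 <= w2 else e2,    # narrower interval wins (ties: first)
    }

def abs_dev_differences(first, second, truth=0.0):
    """Differences in absolute deviation; positive favors the second-named strategy."""
    estimates = aggregate(first, second)
    err = {name: abs(value - truth) for name, value in estimates.items()}
    return {
        "First - Blend": err["First"] - err["Blend"],
        "First - HConf": err["First"] - err["HConf"],
        "Blend - HConf": err["Blend"] - err["HConf"],
    }

# Example: a wide first estimate of 1.0 and a narrow second estimate of -0.5.
print(abs_dev_differences((1.0, 3.0), (-0.5, 1.0)))
```

Averaging these per-agent differences over many simulated agents would give the mean differences reported for each condition.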
No systematic difference between using mean or median as best estimate.
No systematic difference between using SSD range or Juslin’s approach.
Dialectical instructions tend to increase accuracy in all conditions EXCEPT when cue B is biased (mean > 0) and has a smaller variance than cue A (i.e., sd less than 1).
High confidence choosing outperforms averaging only when cue B is both biased AND has a higher variance than cue A (bottom right plot).
High confidence choosing will benefit you (relative to first estimates) in all cases EXCEPT when one (evil) cue has a smaller variance AND a higher bias than the other cue. However, high-confidence choosing beats averaging only when one cue has a higher variance and a higher bias, and provided that you use both cues (i.e., dialectical bootstrapping).