Overall, it seems the model is learning to predict the arrest outcome very well. Remember that the outcome it was trained on is the baseline CNN's prediction. This plot indicates that we are losing almost no signal in this passthrough.
[Plot: p-hat-cnn and its prediction]

Well-Groomed signal? … Kind of

[Plot]

Skin-tone signal? … No

[Plot]
Note: Since we matched on p-hat-cnn quantiles, I've compiled a plot for each. If this matching had worked, I think we should observe horizontal lines for the skin-tone predictions in each of these plots, since within a p-hat-cnn quantile there should be no remaining skin-tone variation. This is clearly not the case! However, this plot also makes clear that the CNN is being fed well-matched labels (in red).
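For reference, here is a minimal sketch of the kind of per-quantile check these plots show (not the exact plot above; `df`, `p_hat_cnn`, and `skin_tone_pred` are placeholder names):

```r
# Minimal sketch, assuming a data frame `df` with placeholder columns
# `p_hat_cnn` (baseline CNN prediction) and `skin_tone_pred` (the model's
# skin-tone prediction).
library(dplyr)
library(ggplot2)

df %>%
  mutate(p_hat_bin = ntile(p_hat_cnn, 4)) %>%   # quantile bins of p-hat-cnn
  group_by(p_hat_bin) %>%
  summarise(mean_skin_tone = mean(skin_tone_pred, na.rm = TRUE)) %>%
  ggplot(aes(x = p_hat_bin, y = mean_skin_tone)) +
  geom_point() +
  geom_line() +
  labs(x = "p-hat-cnn quantile bin",
       y = "Mean skin-tone prediction",
       title = "Flat line = matching removed the skin-tone variation")
```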
Skin-tone and well-groomed on residualized CNN prediction

I am regressing the residualized CNN prediction on skin-tone and well-groomed; the residualized prediction is supposed to contain no signal from either of these.
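For concreteness, a minimal sketch of these regressions (assuming a data frame `df`; `resid_baseline_pred` is my placeholder name for the residualized baseline prediction, while `skin_tone_cont` and `well_groomed` match the table below):

```r
library(stargazer)

# Placeholder column names; `resid_baseline_pred` stands in for however the
# residualized baseline prediction is actually stored upstream.
m_skin  <- lm(resid_baseline_pred ~ skin_tone_cont, data = df)
m_groom <- lm(resid_baseline_pred ~ well_groomed, data = df)
m_both  <- lm(resid_baseline_pred ~ skin_tone_cont + well_groomed, data = df)

stargazer(m_skin, m_groom, m_both,
          type = "text",
          dep.var.labels = "Residualized Baseline Prediction",
          column.labels = c("Skin-tone", "Well-groomed", "Both"))
```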
Here we can see that:
- skin-tone is significant and captures a fairly large amount of variation in the algorithm's predictions
- well-groomed is also significant, but has a super tiny R-squared, which is making me quite hopeful that it's working better
Dependent variable: Residualized Baseline Prediction

| | Skin-tone (1) | Well-groomed (2) | Both (3) |
| --- | --- | --- | --- |
| skin_tone_cont | 0.061*** | | 0.062*** |
| | (0.004) | | (0.004) |
| well_groomed | | 0.008*** | 0.009*** |
| | | (0.001) | (0.001) |
| Constant | 0.690*** | 0.679*** | 0.646*** |
| | (0.002) | (0.006) | (0.007) |
| Observations | 5,966 | 5,961 | 5,961 |
| Adjusted R2 | 0.034 | 0.005 | 0.041 |

Note: *p<0.1; **p<0.05; ***p<0.01