In this markdown I repeat our baseline arrest regression tables with the new Side-By-Side MTurk labels included. For the MTurk Features model, adjusted R-squared increases from 0.0002 to 0.00124. We still observe problems with the bootstrap intervals for this row: the point estimate falls below the lower bound of its 95% C.I. We also now include a new variable, likely_released (it appears as release_likely in the regression tables), which captures answers to the following question:
“If each these individuals were arrested and presented to a magistrate, who then had to decide whether to grant them release prior to their trial, who do you think is more likely to be granted release?”
- Demographics LM: model predicting final-arrest-outcome using sex and age_arrest.
- Charge Feature LM: model predicting final-arrest-outcome using felony_flag, gun_crime_flag, drug_crime_flag, violent_crime_flag, property_crime_flag and arrest_year.
- XgBoost Risk: a boosted-tree model that uses our historical and time-varying arrest-history data to predict re-arrest. We use its output as a proxy for predicted risk. Note that we have always used an XgBoost model; I am only now being more explicit in my naming scheme.
- MTurk Features: model using the mean values of attractiveness, competence, dominance, trustworthiness and likely_released. This model also includes skin_tone.
- CNN Predicted Probability: the predicted probabilities from our baseline CNN.
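The two fit measures reported for each of these models (adjusted R squared and ROC AUC) can be sketched for a single-regressor linear model with a binary outcome. This is a minimal illustration in plain Python, not the code used to produce the tables; the function names and the toy data are hypothetical.

```python
def fit_simple_ols(x, y):
    """Least-squares intercept and slope for a single regressor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def adjusted_r2(y, fitted, n_regressors):
    """Adjusted R^2 for a model with an intercept and n_regressors predictors."""
    n = len(y)
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_regressors - 1)


def roc_auc(y, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for yi, s in zip(y, scores) if yi == 1]
    neg = [s for yi, s in zip(y, scores) if yi == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: one feature and a binary outcome.
x = [1, 2, 3, 4, 5, 6]
y = [0, 0, 1, 0, 1, 1]

intercept, slope = fit_simple_ols(x, y)
fitted = [intercept + slope * xi for xi in x]
adj_r2 = adjusted_r2(y, fitted, n_regressors=1)
auc = roc_auc(y, fitted)
print(f"adjusted R^2 = {adj_r2:.4f}, ROC AUC = {auc:.4f}")
```

The fitted values of the linear model double as the ranking score for the AUC, which is why one model yields both measures.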
This is our baseline arrest table; it includes both single-variable and combined-variable models:
NOTE For Demographic LM + Charge Feature LM, we include all charge feature and demographic variables in the same model (this is more parsimonious).
| Table 01 - Version 01 - Arrest Regressions | | |
|---|---|---|
| Fit measured in adjusted R squared and AUC | | |
| Model Configuration | Male-Female Combined | |
| | Adjusted R Squared | ROC AUC |
| Single Variable Model | | |
| Demographic LM | 0.0103 | 0.5556 |
| Lower 95% C.I. | 0.0075 | 0.5415 |
| Upper 95% C.I. | 0.0139 | 0.5697 |
| Charge Feature LM | 0.0910 | 0.6909 |
| Lower 95% C.I. | 0.0803 | 0.6775 |
| Upper 95% C.I. | 0.1019 | 0.7043 |
| XgBoost Risk | 0.0333 | 0.6099 |
| Lower 95% C.I. | 0.0266 | 0.5962 |
| Upper 95% C.I. | 0.0404 | 0.6237 |
| MTurk Features | 0.0012 | 0.5417 |
| Lower 95% C.I. | 0.0013 | 0.5273 |
| Upper 95% C.I. | 0.0066 | 0.5562 |
| P_hat_cnn | 0.0328 | 0.6231 |
| Lower 95% C.I. | 0.0270 | 0.6095 |
| Upper 95% C.I. | 0.0399 | 0.6367 |
| Combined Variable Model | | |
| Demographics + Charge Feature | 0.0979 | 0.7013 |
| Lower 95% C.I. | 0.0872 | 0.6882 |
| Upper 95% C.I. | 0.1087 | 0.7144 |
| Demographics + Charge Feature + Risk | 0.1129 | 0.7181 |
| Lower 95% C.I. | 0.1009 | 0.7053 |
| Upper 95% C.I. | 0.1251 | 0.7308 |
| Demographics + Charge Feature + Risk + MTurk | 0.1135 | 0.7200 |
| Lower 95% C.I. | 0.1042 | 0.7073 |
| Upper 95% C.I. | 0.1287 | 0.7328 |
| Demographics + Charge Feature + Risk + CNN | 0.1224 | 0.7292 |
| Lower 95% C.I. | 0.1107 | 0.7168 |
| Upper 95% C.I. | 0.1357 | 0.7416 |
| Combined Model | 0.1231 | 0.7316 |
| Lower 95% C.I. | 0.1130 | 0.7192 |
| Upper 95% C.I. | 0.1382 | 0.7440 |
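The intervals in Table 01 are bootstrap intervals, but the source does not say which variant was used. The sketch below assumes a simple percentile bootstrap over observations and adds a basic sanity check that a point estimate lies inside its own interval. The helper names and the synthetic data are hypothetical; the `rows` values are the single-variable adjusted R squared entries copied from the table.

```python
import random


def roc_auc(y, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for yi, s in zip(y, scores) if yi == 1]
    neg = [s for yi, s in zip(y, scores) if yi == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def percentile_bootstrap(y, scores, stat, n_boot=200, alpha=0.05, seed=0):
    """Percentile interval for stat(y, scores), resampling observations with replacement."""
    rng = random.Random(seed)
    n = len(y)
    draws = []
    while len(draws) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y_b = [y[i] for i in idx]
        if len(set(y_b)) < 2:  # AUC is undefined unless both classes are present
            continue
        draws.append(stat(y_b, [scores[i] for i in idx]))
    draws.sort()
    lower = draws[int((alpha / 2) * (n_boot - 1))]
    upper = draws[int((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper


def estimate_inside(point, lower, upper):
    """True when the point estimate lies within its reported interval."""
    return lower <= point <= upper


# Synthetic data, for illustration only.
rng = random.Random(1)
y_toy = [rng.randint(0, 1) for _ in range(100)]
scores_toy = [0.3 * yi + rng.random() for yi in y_toy]
lo, hi = percentile_bootstrap(y_toy, scores_toy, roc_auc)

# (estimate, lower, upper) for adjusted R squared, copied from Table 01.
rows = {
    "Demographic LM": (0.0103, 0.0075, 0.0139),
    "Charge Feature LM": (0.0910, 0.0803, 0.1019),
    "XgBoost Risk": (0.0333, 0.0266, 0.0404),
    "MTurk Features": (0.0012, 0.0013, 0.0066),
    "P_hat_cnn": (0.0328, 0.0270, 0.0399),
}
flagged = [name for name, row in rows.items() if not estimate_inside(*row)]
print(flagged)
```

Only the MTurk Features row fails the check, since its estimate of 0.0012 sits just below the lower bound of 0.0013.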
Note: The p_hat_demo_charge variable corresponds to the combined Demographics + Charge Feature model in Table 01 above.
| | Dependent variable: Release Outcome (male-female combined) | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) |
| p_hat_demo | 1.234*** | | | | | | | | | |
| | (1.017, 1.450) | | | | | | | | | |
| p_hat_charge | | 1.189*** | | | | | | | | |
| | | (1.122, 1.257) | | | | | | | | |
| risk_pred_prob | | | -1.107*** | | | | -0.760*** | -0.773*** | -0.687*** | -0.707*** |
| | | | (-1.214, -1.000) | | | | (-0.865, -0.655) | (-0.879, -0.668) | (-0.792, -0.582) | (-0.813, -0.601) |
| attractiveness | | | | -0.018 | | | | -0.022 | | -0.028 |
| | | | | (-0.053, 0.017) | | | | (-0.054, 0.011) | | (-0.060, 0.005) |
| competence | | | | 0.007 | | | | 0.008 | | 0.007 |
| | | | | (-0.028, 0.041) | | | | (-0.025, 0.041) | | (-0.026, 0.039) |
| dominance | | | | -0.021 | | | | 0.002 | | 0.011 |
| | | | | (-0.049, 0.006) | | | | (-0.025, 0.028) | | (-0.015, 0.037) |
| trustworthiness | | | | 0.045** | | | | 0.034* | | 0.030 |
| | | | | (0.009, 0.080) | | | | (0.0004, 0.067) | | (-0.003, 0.064) |
| release_likely | | | | 0.020 | | | | 0.010 | | 0.006 |
| | | | | (-0.015, 0.055) | | | | (-0.023, 0.043) | | (-0.027, 0.038) |
| skin_tonenumber_f7ddc4 | | | | 0.019 | | | | -0.030 | | -0.043** |
| | | | | (-0.017, 0.055) | | | | (-0.064, 0.004) | | (-0.077, -0.009) |
| p_hat_cnn | | | | | 0.693*** | | | | 0.387*** | 0.394*** |
| | | | | | (0.626, 0.761) | | | | (0.320, 0.454) | (0.327, 0.461) |
| p_hat_demo_charge | | | | | | 1.174*** | 1.081*** | 1.083*** | 1.001*** | 1.006*** |
| | | | | | | (1.110, 1.238) | (1.016, 1.146) | (1.018, 1.148) | (0.936, 1.067) | (0.940, 1.072) |
| Constant | -0.208** | -0.174*** | 1.103*** | 0.721*** | 0.245*** | -0.163*** | 0.145*** | 0.132*** | -0.103** | -0.113** |
| | (-0.378, -0.038) | (-0.227, -0.120) | (1.069, 1.137) | (0.689, 0.753) | (0.194, 0.296) | (-0.213, -0.112) | (0.079, 0.211) | (0.060, 0.204) | (-0.181, -0.025) | (-0.196, -0.030) |
| Observations | 8,370 | 8,370 | 8,370 | 8,370 | 8,370 | 8,370 | 8,370 | 8,370 | 8,370 | 8,370 |
| Adjusted R2 | 0.010 | 0.091 | 0.033 | 0.001 | 0.033 | 0.098 | 0.113 | 0.113 | 0.122 | 0.123 |
| F Statistic | 88.034*** (df = 1; 8368) | 838.428*** (df = 1; 8368) | 289.131*** (df = 1; 8368) | 1.473* (df = 22; 8347) | 285.091*** (df = 1; 8368) | 909.727*** (df = 1; 8368) | 533.812*** (df = 2; 8367) | 45.629*** (df = 24; 8345) | 390.151*** (df = 3; 8366) | 47.995*** (df = 25; 8344) |
| Note: | *p<0.1; **p<0.05; ***p<0.01 | | | | | | | | | |
| NOTE We collapse the 18 skin-tone labels. | | | | | | | | | | |
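The Adjusted R2 row can be cross-checked against the F Statistic row. For a model with an intercept, k regressors and d residual degrees of freedom, the overall F statistic implies adjusted R squared = k(F - 1) / (kF + d). The sketch below applies this identity to three columns of the combined-sample table. The function name is hypothetical, and the snippet is a consistency check rather than the code that produced the table.

```python
def adjusted_r2_from_f(f_stat, df_model, df_resid):
    """Adjusted R^2 implied by the overall F statistic of a model with an intercept.

    Uses R^2 = k*F / (k*F + d) and adj R^2 = 1 - (1 - R^2) * (k + d) / d,
    which simplifies to k * (F - 1) / (k * F + d).
    """
    return df_model * (f_stat - 1.0) / (df_model * f_stat + df_resid)


# (F statistic, model df, residual df) copied from the combined-sample table.
columns = {
    "(1) p_hat_demo": (88.034, 1, 8368),
    "(2) p_hat_charge": (838.428, 1, 8368),
    "(4) MTurk features": (1.473, 22, 8347),
}

implied = {name: adjusted_r2_from_f(*args) for name, args in columns.items()}
for name, value in implied.items():
    print(f"{name}: implied adjusted R^2 = {value:.5f}")
```

With the printed values, column (1) comes out at about 0.0103 and column (4) at about 0.00124, in line with the Demographic LM and MTurk Features figures reported for the combined sample.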
This repeats the baseline table above, split by gender. Our validation set contains 1,808 females and 6,562 males. The model specifications are exactly as above.
| Table 01 - Version 01 - Arrest Regressions - Split By Gender | | | | | | |
|---|---|---|---|---|---|---|
| Fit measured in adjusted R squared and AUC | | | | | | |
| Model Configuration | Male-Female Combined | | Male Subsample | | Female Subsample | |
| | Adjusted R Squared | ROC AUC | Adjusted R Squared | ROC AUC | Adjusted R Squared | ROC AUC |
| Single Variable Model | | | | | | |
| Demographic LM | 0.0103 | 0.5556 | 0.0004 | 0.5120 | 0.0003 | 0.5336 |
| Lower 95% C.I. | 0.0072 | 0.5415 | −0.0001 | 0.4961 | −0.0005 | 0.4973 |
| Upper 95% C.I. | 0.0137 | 0.5697 | 0.0017 | 0.5279 | 0.0041 | 0.5699 |
| Charge Feature LM | 0.0910 | 0.6909 | 0.0916 | 0.6892 | 0.0687 | 0.6825 |
| Lower 95% C.I. | 0.0806 | 0.6775 | 0.0800 | 0.6746 | 0.0471 | 0.6475 |
| Upper 95% C.I. | 0.1013 | 0.7043 | 0.1035 | 0.7038 | 0.0950 | 0.7175 |
| XgBoost Risk | 0.0333 | 0.6099 | 0.0304 | 0.6051 | 0.0166 | 0.5782 |
| Lower 95% C.I. | 0.0267 | 0.5962 | 0.0233 | 0.5899 | 0.0070 | 0.5452 |
| Upper 95% C.I. | 0.0404 | 0.6237 | 0.0384 | 0.6203 | 0.0315 | 0.6111 |
| MTurk Features | 0.0012 | 0.5417 | 0.0001 | 0.5375 | 0.0045 | 0.6044 |
| Lower 95% C.I. | 0.0015 | 0.5273 | 0.0007 | 0.5216 | 0.0061 | 0.5687 |
| Upper 95% C.I. | 0.0067 | 0.5562 | 0.0065 | 0.5533 | 0.0267 | 0.6400 |
| P_hat_cnn | 0.0328 | 0.6231 | 0.0221 | 0.5980 | 0.0300 | 0.6588 |
| Lower 95% C.I. | 0.0270 | 0.6095 | 0.0166 | 0.5826 | 0.0183 | 0.6270 |
| Upper 95% C.I. | 0.0392 | 0.6367 | 0.0283 | 0.6135 | 0.0448 | 0.6907 |
| Combined Variable Model | | | | | | |
| Demographics + Charge Feature | 0.0979 | 0.7013 | 0.0920 | 0.6902 | 0.0703 | 0.6865 |
| Lower 95% C.I. | 0.0867 | 0.6882 | 0.0803 | 0.6756 | 0.0483 | 0.6519 |
| Upper 95% C.I. | 0.1095 | 0.7144 | 0.1041 | 0.7048 | 0.0944 | 0.7211 |
| Demographics + Charge Feature + Risk | 0.1129 | 0.7181 | 0.1085 | 0.7098 | 0.0789 | 0.6977 |
| Lower 95% C.I. | 0.1012 | 0.7053 | 0.0959 | 0.6957 | 0.0558 | 0.6634 |
| Upper 95% C.I. | 0.1248 | 0.7308 | 0.1220 | 0.7239 | 0.1048 | 0.7321 |
| Demographics + Charge Feature + Risk + MTurk | 0.1135 | 0.7200 | 0.1080 | 0.7110 | 0.0825 | 0.7213 |
| Lower 95% C.I. | 0.1046 | 0.7073 | 0.0986 | 0.6969 | 0.0690 | 0.6888 |
| Upper 95% C.I. | 0.1274 | 0.7328 | 0.1248 | 0.7252 | 0.1203 | 0.7537 |
| Demographics + Charge Feature + Risk + CNN | 0.1224 | 0.7292 | 0.1194 | 0.7218 | 0.1024 | 0.7343 |
| Lower 95% C.I. | 0.1110 | 0.7168 | 0.1069 | 0.7081 | 0.0802 | 0.7034 |
| Upper 95% C.I. | 0.1346 | 0.7416 | 0.1331 | 0.7355 | 0.1316 | 0.7652 |
| Combined Model | 0.1231 | 0.7316 | 0.1192 | 0.7239 | 0.0998 | 0.7440 |
| Lower 95% C.I. | 0.1138 | 0.7192 | 0.1089 | 0.7102 | 0.0858 | 0.7137 |
| Upper 95% C.I. | 0.1369 | 0.7440 | 0.1354 | 0.7376 | 0.1378 | 0.7743 |
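When reading the split table, keep in mind that sex is constant within a gender subsample, so the Demographic LM can only draw on age_arrest there. A score that mainly separates the two groups can rank reasonably well in the pooled sample and still carry no ranking information inside each group. The sketch below illustrates this with hypothetical toy records; the data and the `auc_by_group` helper are assumptions for illustration, not taken from the analysis.

```python
from collections import defaultdict


def roc_auc(y, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count 1/2."""
    pos = [s for yi, s in zip(y, scores) if yi == 1]
    neg = [s for yi, s in zip(y, scores) if yi == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def auc_by_group(records):
    """ROC AUC computed separately within each group label."""
    groups = defaultdict(list)
    for group, outcome, score in records:
        groups[group].append((outcome, score))
    return {
        group: roc_auc([o for o, _ in rows], [s for _, s in rows])
        for group, rows in groups.items()
    }


# Hypothetical records (group, outcome, score); the score depends on group only.
records = [
    ("male", 1, 0.3), ("male", 0, 0.3), ("male", 0, 0.3), ("male", 0, 0.3),
    ("female", 1, 0.7), ("female", 1, 0.7), ("female", 1, 0.7), ("female", 0, 0.7),
]

pooled_auc = roc_auc([o for _, o, _ in records], [s for _, _, s in records])
split_auc = auc_by_group(records)
print(pooled_auc, split_auc)
```

In this toy case the pooled AUC is above 0.5 only because the score tracks group membership; within each group every pair is tied, so the AUC falls back to 0.5.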
| | Dependent variable: Release Outcome (male subsample) | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) |
| p_hat_demo | 4.221* | | | | | | | | | |
| | (0.453, 7.989) | | | | | | | | | |
| p_hat_charge | | 1.210*** | | | | | | | | |
| | | (1.133, 1.287) | | | | | | | | |
| risk_pred_prob | | | -1.048*** | | | | -0.784*** | -0.793*** | -0.734*** | -0.750*** |
| | | | (-1.168, -0.928) | | | | (-0.900, -0.668) | (-0.910, -0.675) | (-0.850, -0.618) | (-0.867, -0.633) |
| attractiveness | | | | -0.037 | | | | -0.025 | | -0.028 |
| | | | | (-0.077, 0.003) | | | | (-0.063, 0.013) | | (-0.066, 0.010) |
| competence | | | | 0.007 | | | | 0.001 | | -0.003 |
| | | | | (-0.034, 0.048) | | | | (-0.038, 0.040) | | (-0.042, 0.035) |
| dominance | | | | 0.005 | | | | 0.002 | | 0.007 |
| | | | | (-0.028, 0.037) | | | | (-0.029, 0.033) | | (-0.024, 0.038) |
| trustworthiness | | | | 0.045* | | | | 0.035 | | 0.031 |
| | | | | (0.004, 0.087) | | | | (-0.004, 0.075) | | (-0.008, 0.071) |
| release_likely | | | | 0.009 | | | | 0.014 | | 0.015 |
| | | | | (-0.032, 0.050) | | | | (-0.025, 0.053) | | (-0.023, 0.054) |
| skin_tonenumber_f7ddc4 | | | | 0.020 | | | | -0.015 | | -0.033 |
| | | | | (-0.023, 0.062) | | | | (-0.055, 0.025) | | (-0.073, 0.007) |
| p_hat_cnn | | | | | 0.632*** | | | | 0.448*** | 0.458*** |
| | | | | | (0.547, 0.716) | | | | (0.367, 0.530) | (0.376, 0.540) |
| p_hat_demo_charge | | | | | | 1.169*** | 1.091*** | 1.089*** | 1.048*** | 1.047*** |
| | | | | | | (1.094, 1.244) | (1.016, 1.165) | (1.014, 1.164) | (0.973, 1.123) | (0.972, 1.122) |
| Constant | -2.502 | -0.207*** | 1.069*** | 0.703*** | 0.285*** | -0.159*** | 0.148*** | 0.133*** | -0.157*** | -0.170*** |
| | (-5.395, 0.392) | (-0.268, -0.146) | (1.030, 1.108) | (0.667, 0.739) | (0.223, 0.346) | (-0.216, -0.101) | (0.075, 0.222) | (0.053, 0.214) | (-0.249, -0.066) | (-0.267, -0.073) |
| Observations | 6,562 | 6,562 | 6,562 | 6,562 | 6,562 | 6,562 | 6,562 | 6,562 | 6,562 | 6,562 |
| Adjusted R2 | 0.0004 | 0.092 | 0.030 | 0.0001 | 0.022 | 0.092 | 0.109 | 0.108 | 0.119 | 0.119 |
| F Statistic | 3.395* (df = 1; 6560) | 662.714*** (df = 1; 6560) | 207.047*** (df = 1; 6560) | 1.017 (df = 22; 6539) | 149.488*** (df = 1; 6560) | 665.427*** (df = 1; 6560) | 400.423*** (df = 2; 6559) | 34.116*** (df = 24; 6537) | 297.559*** (df = 3; 6558) | 36.518*** (df = 25; 6536) |
| Note: | *p<0.1; **p<0.05; ***p<0.01 | | | | | | | | | |
| NOTE We collapse the 18 skin-tone labels. | | | | | | | | | | |
| | Dependent variable: Release Outcome (female subsample) | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) |
| p_hat_demo | -7.355 | | | | | | | | | |
| | (-16.902, 2.192) | | | | | | | | | |
| p_hat_charge | | 0.961*** | | | | | | | | |
| | | (0.825, 1.097) | | | | | | | | |
| risk_pred_prob | | | -0.916*** | | | | -0.672*** | -0.636*** | -0.576*** | -0.564*** |
| | | | (-1.185, -0.648) | | | | (-0.935, -0.410) | (-0.901, -0.372) | (-0.836, -0.316) | (-0.827, -0.301) |
| attractiveness | | | | 0.013 | | | | 0.001 | | -0.003 |
| | | | | (-0.053, 0.079) | | | | (-0.062, 0.065) | | (-0.066, 0.060) |
| competence | | | | 0.033 | | | | 0.024 | | 0.020 |
| | | | | (-0.030, 0.096) | | | | (-0.036, 0.085) | | (-0.039, 0.080) |
| dominance | | | | -0.031 | | | | -0.008 | | -0.010 |
| | | | | (-0.080, 0.019) | | | | (-0.055, 0.040) | | (-0.057, 0.037) |
| trustworthiness | | | | 0.033 | | | | 0.029 | | 0.025 |
| | | | | (-0.030, 0.095) | | | | (-0.032, 0.089) | | (-0.035, 0.084) |
| release_likely | | | | -0.0002 | | | | 0.002 | | -0.003 |
| | | | | (-0.063, 0.063) | | | | (-0.058, 0.063) | | (-0.063, 0.057) |
| skin_tonenumber_f7ddc4 | | | | -0.081* | | | | -0.080* | | -0.047 |
| | | | | (-0.155, -0.006) | | | | (-0.152, -0.008) | | (-0.119, 0.024) |
| p_hat_cnn | | | | | 0.739*** | | | | 0.658*** | 0.590*** |
| | | | | | (0.578, 0.900) | | | | (0.503, 0.814) | (0.426, 0.753) |
| p_hat_demo_charge | | | | | | 1.209*** | 1.149*** | 1.151*** | 1.125*** | 1.126*** |
| | | | | | | (1.040, 1.378) | (0.978, 1.319) | (0.981, 1.322) | (0.957, 1.293) | (0.957, 1.295) |
| Constant | 7.120 | 0.072 | 1.105*** | 0.852*** | 0.224*** | -0.192** | 0.051 | 0.052 | -0.508*** | -0.447*** |
| | (-1.026, 15.267) | (-0.039, 0.182) | (1.027, 1.183) | (0.780, 0.925) | (0.088, 0.360) | (-0.338, -0.046) | (-0.122, 0.225) | (-0.135, 0.238) | (-0.724, -0.292) | (-0.677, -0.216) |
| Observations | 1,808 | 1,808 | 1,808 | 1,808 | 1,808 | 1,808 | 1,808 | 1,808 | 1,808 | 1,808 |
| Adjusted R2 | 0.0003 | 0.069 | 0.017 | 0.005 | 0.030 | 0.070 | 0.079 | 0.082 | 0.102 | 0.100 |
| F Statistic | 1.606 (df = 1; 1806) | 134.311*** (df = 1; 1806) | 31.513*** (df = 1; 1806) | 1.374 (df = 22; 1785) | 56.956*** (df = 1; 1806) | 137.682*** (df = 1; 1806) | 78.361*** (df = 2; 1805) | 7.768*** (df = 24; 1783) | 69.732*** (df = 3; 1804) | 9.011*** (df = 25; 1782) |
| Note: | *p<0.1; **p<0.05; ***p<0.01 | | | | | | | | | |
| NOTE We collapse the 18 skin-tone labels. | | | | | | | | | | |
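The regression tables do not state the confidence level of the parenthesized intervals. For the single-regressor columns the overall F statistic equals the squared t statistic of the slope, so the standard error can be backed out as |coefficient| / sqrt(F) and compared with the interval half-width. The sketch below does this for four printed columns. It is an inference from rounded table values, not something the source states, and the function name is hypothetical.

```python
import math


def implied_multiplier(coef, lower, upper, f_stat):
    """Interval half-width divided by the standard error implied by F = t^2.

    Only valid for models with a single regressor plus an intercept.
    """
    std_err = abs(coef) / math.sqrt(f_stat)
    return (upper - lower) / 2.0 / std_err


# (coefficient, lower, upper, F statistic) copied from the regression tables.
single_regressor_columns = {
    "combined (1) p_hat_demo": (1.234, 1.017, 1.450, 88.034),
    "combined (2) p_hat_charge": (1.189, 1.122, 1.257, 838.428),
    "male (1) p_hat_demo": (4.221, 0.453, 7.989, 3.395),
    "female (1) p_hat_demo": (-7.355, -16.902, 2.192, 1.606),
}

multipliers = {
    name: implied_multiplier(*values)
    for name, values in single_regressor_columns.items()
}
for name, z in multipliers.items():
    print(f"{name}: implied multiplier = {z:.3f}")
```

All four multipliers land close to 1.645, which would be consistent with 90% intervals rather than 95% ones. It would also explain why a one-star coefficient such as the male-subsample p_hat_demo can have an interval that excludes zero.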