Summary

In this markdown I repeat our baseline arrest regression tables with the new Side-By-Side MTurk labels included. The adjusted R-squared of the MTurk Features model increases from 0.0002 to 0.00124. We still observe problems with the bootstrap intervals for this row: the point estimate (0.0012) falls below the lower bound of its 95% interval (0.0013). We also include a new variable, likely_released (shown as release_likely in the regression tables), which captures answers to the following question:

“If each of these individuals were arrested and presented to a magistrate, who then had to decide whether to grant them release prior to their trial, who do you think is more likely to be granted release?”

Definitions

  1. Demographics LM: model predicting final-arrest-outcome using sex and age_arrest

  2. Charge Feature LM: model predicting final-arrest-outcome using felony_flag, gun_crime_flag, drug_crime_flag, violent_crime_flag, property_crime_flag, arrest_year

  3. XgBoost risk: a boosted-tree model that uses our historical and time-varying arrest-history data to predict re-arrest. We use this as a proxy for predicted risk. Note that we have always used an XgBoost model; I am only making the naming scheme more explicit.

  4. MTurk features: the mean values of attractiveness, competence, dominance, trustworthiness and likely_released. This model also includes skin_tone.

  5. CNN Predicted Probability: These are the predicted-probabilities from our baseline CNN.
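
The regression tables further down use fitted values such as p_hat_demo and p_hat_charge as regressors. A minimal sketch of that two-stage setup follows, assuming the first-stage model is fit on a separate training split and scored on the validation set; the source does not state this, and the helper names (`solve_linear`, `fit_ols`, `predict`) and toy data are hypothetical.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if the system is singular.
    """
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients [intercept, b1, ...] via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear(xtx, xty)


def predict(coefs, rows):
    """Fitted values for rows of regressors (without the intercept column)."""
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], r)) for r in rows]


if __name__ == "__main__":
    # Toy data only: (sex, age_arrest) rows and a 0/1 outcome.
    train_x = [(0, 20), (1, 25), (0, 30), (1, 35), (0, 40), (1, 45)]
    train_y = [1, 0, 1, 1, 0, 0]
    valid_x = [(1, 22), (0, 28), (1, 33), (0, 44)]
    valid_y = [0, 1, 1, 0]

    # Stage 1 (assumed): Demographics LM fit on the training split.
    stage1 = fit_ols(train_x, train_y)
    # Stage 2: regress the validation outcome on the stage-1 prediction.
    p_hat_demo = predict(stage1, valid_x)
    intercept, slope = fit_ols([(p,) for p in p_hat_demo], valid_y)
    print(intercept, slope)
```

If both stages used the same rows, the second-stage slope would be exactly 1 and the constant exactly 0. The reported slopes differ from 1 (for example 1.234 for p_hat_demo), which is consistent with, but does not prove, predictions produced on other data.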



Baseline Table: Table 01 - Config 01

This is our baseline arrest table. It includes both single-variable and combined-variable models:

  1. Single-Variable models:
  2. Combined-Variable models:

NOTE For Demographic LM + Charge Feature LM we include all charge feature and demographic variables in the same model (this is more parsimonious).

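Table 01 - Version 01 - Arrest Regressions
Fit measured in adjusted R squared and AUC (Male-Female Combined)

| Model Configuration | Adjusted R Squared | ROC AUC |
| --- | --- | --- |
| Single Variable Model | | |
| Demographic LM | 0.0103 | 0.5556 |
| Lower 95% C.I. | 0.0075 | 0.5415 |
| Upper 95% C.I. | 0.0139 | 0.5697 |
| Charge Feature LM | 0.0910 | 0.6909 |
| Lower 95% C.I. | 0.0803 | 0.6775 |
| Upper 95% C.I. | 0.1019 | 0.7043 |
| XgBoost Risk | 0.0333 | 0.6099 |
| Lower 95% C.I. | 0.0266 | 0.5962 |
| Upper 95% C.I. | 0.0404 | 0.6237 |
| MTurk Features | 0.0012 | 0.5417 |
| Lower 95% C.I. | 0.0013 | 0.5273 |
| Upper 95% C.I. | 0.0066 | 0.5562 |
| P_hat_cnn | 0.0328 | 0.6231 |
| Lower 95% C.I. | 0.0270 | 0.6095 |
| Upper 95% C.I. | 0.0399 | 0.6367 |
| Combined Variable Model | | |
| Demographics + Charge Feature | 0.0979 | 0.7013 |
| Lower 95% C.I. | 0.0872 | 0.6882 |
| Upper 95% C.I. | 0.1087 | 0.7144 |
| Demographics + Charge Feature + Risk | 0.1129 | 0.7181 |
| Lower 95% C.I. | 0.1009 | 0.7053 |
| Upper 95% C.I. | 0.1251 | 0.7308 |
| Demographics + Charge Feature + Risk + MTurk | 0.1135 | 0.7200 |
| Lower 95% C.I. | 0.1042 | 0.7073 |
| Upper 95% C.I. | 0.1287 | 0.7328 |
| Demographics + Charge Feature + Risk + CNN | 0.1224 | 0.7292 |
| Lower 95% C.I. | 0.1107 | 0.7168 |
| Upper 95% C.I. | 0.1357 | 0.7416 |
| Combined Model | 0.1231 | 0.7316 |
| Lower 95% C.I. | 0.1130 | 0.7192 |
| Upper 95% C.I. | 0.1382 | 0.7440 |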
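
In the MTurk Features row above, the adjusted R-squared estimate (0.0012) sits below its own lower bound (0.0013). The sketch below shows one way the two fit measures and a percentile bootstrap interval could be computed, plus a check that flags such rows. It assumes a single-regressor model and a plain percentile bootstrap over rows; the source does not describe how its intervals were built, and all function names and the synthetic data are hypothetical. It does not diagnose the cause of the problem.

```python
import random


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def adjusted_r2(x, y, n_predictors=1):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1) for y ~ x."""
    a, b = fit_simple_ols(x, y)
    n = len(y)
    y_bar = sum(y) / n
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


def roc_auc(score, y):
    """P(random positive outranks random negative); ties count one half."""
    order = sorted(range(len(score)), key=lambda i: score[i])
    ranks = [0.0] * len(score)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and score[order[j + 1]] == score[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank within a tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y)
    n_neg = len(y) - n_pos
    rank_sum_pos = sum(r for r, yi in zip(ranks, y) if yi == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def percentile_bootstrap(x, y, stat, n_boot=500, seed=0):
    """Approximate 95% percentile interval of stat(x, y) over resampled rows."""
    rng = random.Random(seed)
    n = len(y)
    draws = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        draws.append(stat([x[i] for i in idx], [y[i] for i in idx]))
    draws.sort()
    return draws[int(0.025 * (n_boot - 1))], draws[int(0.975 * (n_boot - 1))]


def estimate_inside(estimate, lower, upper):
    """True when the point estimate lies within its interval."""
    return lower <= estimate <= upper


if __name__ == "__main__":
    # Synthetic data only: a weak score and a 0/1 outcome.
    rng = random.Random(1)
    x = [rng.random() for _ in range(400)]
    y = [1 if rng.random() < 0.3 + 0.4 * xi else 0 for xi in x]
    est = adjusted_r2(x, y)
    lo, hi = percentile_bootstrap(x, y, adjusted_r2)
    print(est, lo, hi, estimate_inside(est, lo, hi))
    print(roc_auc(x, y))
```

Applied to the table values, `estimate_inside(0.0012, 0.0013, 0.0066)` returns `False`, while the Demographic LM row passes the same check.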

Table 01 - Regressions

Note The p_hat_demo_charge variable is the predicted value from the combined Demographics + Charge Feature model in Table 01.

Multihead(ResNet50)
Dependent variable: Release Outcome

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| p_hat_demo | 1.234*** | | | | | | | | | |
| | (1.017, 1.450) | | | | | | | | | |
| p_hat_charge | | 1.189*** | | | | | | | | |
| | | (1.122, 1.257) | | | | | | | | |
| risk_pred_prob | | | -1.107*** | | | | -0.760*** | -0.773*** | -0.687*** | -0.707*** |
| | | | (-1.214, -1.000) | | | | (-0.865, -0.655) | (-0.879, -0.668) | (-0.792, -0.582) | (-0.813, -0.601) |
| attractiveness | | | | -0.018 | | | | -0.022 | | -0.028 |
| | | | | (-0.053, 0.017) | | | | (-0.054, 0.011) | | (-0.060, 0.005) |
| competence | | | | 0.007 | | | | 0.008 | | 0.007 |
| | | | | (-0.028, 0.041) | | | | (-0.025, 0.041) | | (-0.026, 0.039) |
| dominance | | | | -0.021 | | | | 0.002 | | 0.011 |
| | | | | (-0.049, 0.006) | | | | (-0.025, 0.028) | | (-0.015, 0.037) |
| trustworthiness | | | | 0.045** | | | | 0.034* | | 0.030 |
| | | | | (0.009, 0.080) | | | | (0.0004, 0.067) | | (-0.003, 0.064) |
| release_likely | | | | 0.020 | | | | 0.010 | | 0.006 |
| | | | | (-0.015, 0.055) | | | | (-0.023, 0.043) | | (-0.027, 0.038) |
| skin_tonenumber_f7ddc4 | | | | 0.019 | | | | -0.030 | | -0.043** |
| | | | | (-0.017, 0.055) | | | | (-0.064, 0.004) | | (-0.077, -0.009) |
| p_hat_cnn | | | | | 0.693*** | | | | 0.387*** | 0.394*** |
| | | | | | (0.626, 0.761) | | | | (0.320, 0.454) | (0.327, 0.461) |
| p_hat_demo_charge | | | | | | 1.174*** | 1.081*** | 1.083*** | 1.001*** | 1.006*** |
| | | | | | | (1.110, 1.238) | (1.016, 1.146) | (1.018, 1.148) | (0.936, 1.067) | (0.940, 1.072) |
| Constant | -0.208** | -0.174*** | 1.103*** | 0.721*** | 0.245*** | -0.163*** | 0.145*** | 0.132*** | -0.103** | -0.113** |
| | (-0.378, -0.038) | (-0.227, -0.120) | (1.069, 1.137) | (0.689, 0.753) | (0.194, 0.296) | (-0.213, -0.112) | (0.079, 0.211) | (0.060, 0.204) | (-0.181, -0.025) | (-0.196, -0.030) |
| Observations | 8,370 | 8,370 | 8,370 | 8,370 | 8,370 | 8,370 | 8,370 | 8,370 | 8,370 | 8,370 |
| Adjusted R2 | 0.010 | 0.091 | 0.033 | 0.001 | 0.033 | 0.098 | 0.113 | 0.113 | 0.122 | 0.123 |
| F Statistic | 88.034*** (df = 1; 8368) | 838.428*** (df = 1; 8368) | 289.131*** (df = 1; 8368) | 1.473* (df = 22; 8347) | 285.091*** (df = 1; 8368) | 909.727*** (df = 1; 8368) | 533.812*** (df = 2; 8367) | 45.629*** (df = 24; 8345) | 390.151*** (df = 3; 8366) | 47.995*** (df = 25; 8344) |

Note: *p<0.1; **p<0.05; ***p<0.01
NOTE We collapse the 18 skin-tone labels.
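
As a consistency check on the table above, the overall F statistic of a linear model can be recovered from its adjusted R-squared, the number of observations, and the number of regressors. The sketch below uses the standard OLS identities; the function names are hypothetical. The regressor count of 22 for the MTurk Features model is read off the reported degrees of freedom (df = 22; 8347).

```python
def r2_from_adjusted(adj_r2, n_obs, n_predictors):
    """Invert adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - adj_r2) * (n_obs - n_predictors - 1) / (n_obs - 1)


def overall_f(adj_r2, n_obs, n_predictors):
    """Overall F = (R^2 / p) / ((1 - R^2) / (n - p - 1))."""
    r2 = r2_from_adjusted(adj_r2, n_obs, n_predictors)
    return (r2 / n_predictors) / ((1.0 - r2) / (n_obs - n_predictors - 1))


if __name__ == "__main__":
    # MTurk Features model: adjusted R^2 of 0.00124 (from the Summary),
    # 8,370 observations, 22 regressors.
    print(overall_f(0.00124, 8370, 22))
    # Demographic LM: adjusted R^2 of 0.0103, one regressor (p_hat_demo).
    print(overall_f(0.0103, 8370, 1))
```

For the MTurk Features model this gives roughly 1.47, in line with the reported 1.473. Small gaps for other columns are expected because the adjusted R-squared inputs are rounded.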

Splitting by Gender: Table 01 - Config 01

This repeats the above baseline table, split by gender. Our validation set contains 1,808 females and 6,562 males. The model specifications are exactly as above.

Table 01 - Version 01 - Arrest Regressions - Split By Gender
Fit measured in adjusted R squared and AUC

| Model Configuration | Male-Female Combined: Adjusted R Squared | Male-Female Combined: ROC AUC | Male Subsample: Adjusted R Squared | Male Subsample: ROC AUC | Female Subsample: Adjusted R Squared | Female Subsample: ROC AUC |
| --- | --- | --- | --- | --- | --- | --- |
| Single Variable Model | | | | | | |
| Demographic LM | 0.0103 | 0.5556 | 0.0004 | 0.5120 | 0.0003 | 0.5336 |
| Lower 95% C.I. | 0.0072 | 0.5415 | −0.0001 | 0.4961 | −0.0005 | 0.4973 |
| Upper 95% C.I. | 0.0137 | 0.5697 | 0.0017 | 0.5279 | 0.0041 | 0.5699 |
| Charge Feature LM | 0.0910 | 0.6909 | 0.0916 | 0.6892 | 0.0687 | 0.6825 |
| Lower 95% C.I. | 0.0806 | 0.6775 | 0.0800 | 0.6746 | 0.0471 | 0.6475 |
| Upper 95% C.I. | 0.1013 | 0.7043 | 0.1035 | 0.7038 | 0.0950 | 0.7175 |
| XgBoost Risk | 0.0333 | 0.6099 | 0.0304 | 0.6051 | 0.0166 | 0.5782 |
| Lower 95% C.I. | 0.0267 | 0.5962 | 0.0233 | 0.5899 | 0.0070 | 0.5452 |
| Upper 95% C.I. | 0.0404 | 0.6237 | 0.0384 | 0.6203 | 0.0315 | 0.6111 |
| MTurk Features | 0.0012 | 0.5417 | 0.0001 | 0.5375 | 0.0045 | 0.6044 |
| Lower 95% C.I. | 0.0015 | 0.5273 | 0.0007 | 0.5216 | 0.0061 | 0.5687 |
| Upper 95% C.I. | 0.0067 | 0.5562 | 0.0065 | 0.5533 | 0.0267 | 0.6400 |
| P_hat_cnn | 0.0328 | 0.6231 | 0.0221 | 0.5980 | 0.0300 | 0.6588 |
| Lower 95% C.I. | 0.0270 | 0.6095 | 0.0166 | 0.5826 | 0.0183 | 0.6270 |
| Upper 95% C.I. | 0.0392 | 0.6367 | 0.0283 | 0.6135 | 0.0448 | 0.6907 |
| Combined Variable Model | | | | | | |
| Demographics + Charge Feature | 0.0979 | 0.7013 | 0.0920 | 0.6902 | 0.0703 | 0.6865 |
| Lower 95% C.I. | 0.0867 | 0.6882 | 0.0803 | 0.6756 | 0.0483 | 0.6519 |
| Upper 95% C.I. | 0.1095 | 0.7144 | 0.1041 | 0.7048 | 0.0944 | 0.7211 |
| Demographics + Charge Feature + Risk | 0.1129 | 0.7181 | 0.1085 | 0.7098 | 0.0789 | 0.6977 |
| Lower 95% C.I. | 0.1012 | 0.7053 | 0.0959 | 0.6957 | 0.0558 | 0.6634 |
| Upper 95% C.I. | 0.1248 | 0.7308 | 0.1220 | 0.7239 | 0.1048 | 0.7321 |
| Demographics + Charge Feature + Risk + MTurk | 0.1135 | 0.7200 | 0.1080 | 0.7110 | 0.0825 | 0.7213 |
| Lower 95% C.I. | 0.1046 | 0.7073 | 0.0986 | 0.6969 | 0.0690 | 0.6888 |
| Upper 95% C.I. | 0.1274 | 0.7328 | 0.1248 | 0.7252 | 0.1203 | 0.7537 |
| Demographics + Charge Feature + Risk + CNN | 0.1224 | 0.7292 | 0.1194 | 0.7218 | 0.1024 | 0.7343 |
| Lower 95% C.I. | 0.1110 | 0.7168 | 0.1069 | 0.7081 | 0.0802 | 0.7034 |
| Upper 95% C.I. | 0.1346 | 0.7416 | 0.1331 | 0.7355 | 0.1316 | 0.7652 |
| Combined Model | 0.1231 | 0.7316 | 0.1192 | 0.7239 | 0.0998 | 0.7440 |
| Lower 95% C.I. | 0.1138 | 0.7192 | 0.1089 | 0.7102 | 0.0858 | 0.7137 |
| Upper 95% C.I. | 0.1369 | 0.7440 | 0.1354 | 0.7376 | 0.1378 | 0.7743 |

Table 01 - Regressions - Male Subsample

Multihead(ResNet50)
Dependent variable: Release Outcome

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| p_hat_demo | 4.221* | | | | | | | | | |
| | (0.453, 7.989) | | | | | | | | | |
| p_hat_charge | | 1.210*** | | | | | | | | |
| | | (1.133, 1.287) | | | | | | | | |
| risk_pred_prob | | | -1.048*** | | | | -0.784*** | -0.793*** | -0.734*** | -0.750*** |
| | | | (-1.168, -0.928) | | | | (-0.900, -0.668) | (-0.910, -0.675) | (-0.850, -0.618) | (-0.867, -0.633) |
| attractiveness | | | | -0.037 | | | | -0.025 | | -0.028 |
| | | | | (-0.077, 0.003) | | | | (-0.063, 0.013) | | (-0.066, 0.010) |
| competence | | | | 0.007 | | | | 0.001 | | -0.003 |
| | | | | (-0.034, 0.048) | | | | (-0.038, 0.040) | | (-0.042, 0.035) |
| dominance | | | | 0.005 | | | | 0.002 | | 0.007 |
| | | | | (-0.028, 0.037) | | | | (-0.029, 0.033) | | (-0.024, 0.038) |
| trustworthiness | | | | 0.045* | | | | 0.035 | | 0.031 |
| | | | | (0.004, 0.087) | | | | (-0.004, 0.075) | | (-0.008, 0.071) |
| release_likely | | | | 0.009 | | | | 0.014 | | 0.015 |
| | | | | (-0.032, 0.050) | | | | (-0.025, 0.053) | | (-0.023, 0.054) |
| skin_tonenumber_f7ddc4 | | | | 0.020 | | | | -0.015 | | -0.033 |
| | | | | (-0.023, 0.062) | | | | (-0.055, 0.025) | | (-0.073, 0.007) |
| p_hat_cnn | | | | | 0.632*** | | | | 0.448*** | 0.458*** |
| | | | | | (0.547, 0.716) | | | | (0.367, 0.530) | (0.376, 0.540) |
| p_hat_demo_charge | | | | | | 1.169*** | 1.091*** | 1.089*** | 1.048*** | 1.047*** |
| | | | | | | (1.094, 1.244) | (1.016, 1.165) | (1.014, 1.164) | (0.973, 1.123) | (0.972, 1.122) |
| Constant | -2.502 | -0.207*** | 1.069*** | 0.703*** | 0.285*** | -0.159*** | 0.148*** | 0.133*** | -0.157*** | -0.170*** |
| | (-5.395, 0.392) | (-0.268, -0.146) | (1.030, 1.108) | (0.667, 0.739) | (0.223, 0.346) | (-0.216, -0.101) | (0.075, 0.222) | (0.053, 0.214) | (-0.249, -0.066) | (-0.267, -0.073) |
| Observations | 6,562 | 6,562 | 6,562 | 6,562 | 6,562 | 6,562 | 6,562 | 6,562 | 6,562 | 6,562 |
| Adjusted R2 | 0.0004 | 0.092 | 0.030 | 0.0001 | 0.022 | 0.092 | 0.109 | 0.108 | 0.119 | 0.119 |
| F Statistic | 3.395* (df = 1; 6560) | 662.714*** (df = 1; 6560) | 207.047*** (df = 1; 6560) | 1.017 (df = 22; 6539) | 149.488*** (df = 1; 6560) | 665.427*** (df = 1; 6560) | 400.423*** (df = 2; 6559) | 34.116*** (df = 24; 6537) | 297.559*** (df = 3; 6558) | 36.518*** (df = 25; 6536) |

Note: *p<0.1; **p<0.05; ***p<0.01
NOTE We collapse the 18 skin-tone labels.

Table 01 - Regressions - Female Subsample

Multihead(ResNet50)
Dependent variable: Release Outcome

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| p_hat_demo | -7.355 | | | | | | | | | |
| | (-16.902, 2.192) | | | | | | | | | |
| p_hat_charge | | 0.961*** | | | | | | | | |
| | | (0.825, 1.097) | | | | | | | | |
| risk_pred_prob | | | -0.916*** | | | | -0.672*** | -0.636*** | -0.576*** | -0.564*** |
| | | | (-1.185, -0.648) | | | | (-0.935, -0.410) | (-0.901, -0.372) | (-0.836, -0.316) | (-0.827, -0.301) |
| attractiveness | | | | 0.013 | | | | 0.001 | | -0.003 |
| | | | | (-0.053, 0.079) | | | | (-0.062, 0.065) | | (-0.066, 0.060) |
| competence | | | | 0.033 | | | | 0.024 | | 0.020 |
| | | | | (-0.030, 0.096) | | | | (-0.036, 0.085) | | (-0.039, 0.080) |
| dominance | | | | -0.031 | | | | -0.008 | | -0.010 |
| | | | | (-0.080, 0.019) | | | | (-0.055, 0.040) | | (-0.057, 0.037) |
| trustworthiness | | | | 0.033 | | | | 0.029 | | 0.025 |
| | | | | (-0.030, 0.095) | | | | (-0.032, 0.089) | | (-0.035, 0.084) |
| release_likely | | | | -0.0002 | | | | 0.002 | | -0.003 |
| | | | | (-0.063, 0.063) | | | | (-0.058, 0.063) | | (-0.063, 0.057) |
| skin_tonenumber_f7ddc4 | | | | -0.081* | | | | -0.080* | | -0.047 |
| | | | | (-0.155, -0.006) | | | | (-0.152, -0.008) | | (-0.119, 0.024) |
| p_hat_cnn | | | | | 0.739*** | | | | 0.658*** | 0.590*** |
| | | | | | (0.578, 0.900) | | | | (0.503, 0.814) | (0.426, 0.753) |
| p_hat_demo_charge | | | | | | 1.209*** | 1.149*** | 1.151*** | 1.125*** | 1.126*** |
| | | | | | | (1.040, 1.378) | (0.978, 1.319) | (0.981, 1.322) | (0.957, 1.293) | (0.957, 1.295) |
| Constant | 7.120 | 0.072 | 1.105*** | 0.852*** | 0.224*** | -0.192** | 0.051 | 0.052 | -0.508*** | -0.447*** |
| | (-1.026, 15.267) | (-0.039, 0.182) | (1.027, 1.183) | (0.780, 0.925) | (0.088, 0.360) | (-0.338, -0.046) | (-0.122, 0.225) | (-0.135, 0.238) | (-0.724, -0.292) | (-0.677, -0.216) |
| Observations | 1,808 | 1,808 | 1,808 | 1,808 | 1,808 | 1,808 | 1,808 | 1,808 | 1,808 | 1,808 |
| Adjusted R2 | 0.0003 | 0.069 | 0.017 | 0.005 | 0.030 | 0.070 | 0.079 | 0.082 | 0.102 | 0.100 |
| F Statistic | 1.606 (df = 1; 1806) | 134.311*** (df = 1; 1806) | 31.513*** (df = 1; 1806) | 1.374 (df = 22; 1785) | 56.956*** (df = 1; 1806) | 137.682*** (df = 1; 1806) | 78.361*** (df = 2; 1805) | 7.768*** (df = 24; 1783) | 69.732*** (df = 3; 1804) | 9.011*** (df = 25; 1782) |

Note: *p<0.1; **p<0.05; ***p<0.01
NOTE We collapse the 18 skin-tone labels.