I repeat our projector regressions with the full arrest validation set. Using the projected images instead of their real counterparts reduces the adjusted R-squared of p_hat_cnn only from 0.0330 to 0.0298, which is very promising.
In this markdown I summarize the results of our updated arrest CNN prediction test. The goal is to see how much the model's release predictions differ between real images and their respective projections through the GAN. As a reminder, a projected image is the closest approximation to a given real mugshot that the GAN can find in its own latent space.
Here I take our entire validation set and produce projection-target image pairs, which I then pass through the CNN to get predictions. I then produce two summary plots together with our baseline table 01:
Plot 1: Comparing their respective p_hat_cnn distributions. Ideally these would align as closely as possible.
Plot 2: Checking the correlation between p_hat_cnn_projection, the CNN prediction for the projected image, and p_hat_cnn_target, the CNN prediction for its real image counterpart. Here we want to see as close a relationship as possible.
Table 01, Configuration 1: I repeat the regressions for our baseline table 01 with the p_hat_cnn values for projections and targets. This means I have two p_hat_cnn values, one from the real mugshot and one from its associated projection. I compare their predictive power in two separate columns.
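The two plot checks above can also be summarized numerically. Below is a minimal sketch, assuming the paired predictions are available as two equal-length lists; the helper names `pearson_r` and `mean_abs_gap` and the example values are hypothetical and are not taken from our validation set.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def mean_abs_gap(xs, ys):
    """Average absolute difference between paired predictions."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)

# Made-up illustrative values, NOT results from the validation set.
p_hat_cnn_target = [0.62, 0.71, 0.55, 0.80, 0.67]
p_hat_cnn_projection = [0.60, 0.73, 0.58, 0.77, 0.66]

r = pearson_r(p_hat_cnn_target, p_hat_cnn_projection)
gap = mean_abs_gap(p_hat_cnn_target, p_hat_cnn_projection)
print(f"pearson r = {r:.3f}, mean |target - projection| = {gap:.3f}")
```

A correlation close to 1 together with a small mean gap would correspond to the "as close as possible" behaviour we hope to see in Plot 1 and Plot 2.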
There are two different types of images here:
Targets: These are the real mugshots from our validation set.
Projections: These are the GAN-generated images that approximate the real targets as closely as possible.
An ideal projector will produce generated images which are indistinguishable from their target counterpart.
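To make the idea of projection concrete, here is a toy sketch of searching a generator's latent space for the point whose output is closest to a target. Everything in it is an assumption for illustration: the generator is a hypothetical linear map from 2 latent dimensions to 3 "pixels", the loss is plain squared error, and the names `toy_generator` and `project` are made up. Our actual projector works on a deep GAN and may use a different loss and optimizer.

```python
# Toy projector: gradient descent on latent z to minimise ||G(z) - target||^2.
# G is a hypothetical 2-D -> 3-D linear map, NOT our actual GAN generator.
W = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]

def toy_generator(z):
    """Map a latent vector z to a 3-value 'image'."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in W]

def squared_error(image, target):
    return sum((a - b) ** 2 for a, b in zip(image, target))

def project(target, steps=200, lr=0.1):
    """Return the latent vector whose generated image best matches target."""
    z = [0.0, 0.0]
    for _ in range(steps):
        residual = [g - t for g, t in zip(toy_generator(z), target)]
        # Gradient of the squared error w.r.t. z is 2 * W^T * residual.
        grad = [2.0 * sum(W[i][j] * residual[i] for i in range(len(W)))
                for j in range(len(z))]
        z = [zj - lr * gj for zj, gj in zip(z, grad)]
    return z

# A target that the toy generator can reproduce exactly.
target = toy_generator([0.5, -0.25])
z_hat = project(target)
projection = toy_generator(z_hat)
print(z_hat, squared_error(projection, target))
```

In this toy case the target lies inside the generator's range, so the projection matches it almost exactly. Real mugshots generally do not, which is why projections and targets can differ and why the comparison in this markdown is needed.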
Note: The definitions for the terms used in table 01 are also given above that table.
Note: I am leaving out the MTurk variables here since they add nothing and blow up the table. Column 8 adds p_hat_cnn_target to column 7, while column 9 adds p_hat_cnn_proj (the projection prediction) to column 7 instead, which isolates the projection signal.
Note: The p_hat_demo_charge variable is the combined Demo + Charge model in table 01 below.
Dependent variable: Release Outcome

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) |
|---|---|---|---|---|---|---|---|---|---|---|
| p_hat_demo | 1.271*** | | | | | | | | | |
| | (1.029, 1.513) | | | | | | | | | |
| p_hat_charge | | 1.222*** | | | | | | | | |
| | | (1.146, 1.297) | | | | | | | | |
| p_hat_demo_charge | | | 1.206*** | | | | 1.105*** | 1.029*** | 1.038*** | 1.025*** |
| | | | (1.135, 1.278) | | | | (1.033, 1.177) | (0.955, 1.102) | (0.965, 1.112) | (0.952, 1.099) |
| risk_pred_prob | | | | -1.192*** | | | -0.850*** | -0.771*** | -0.776*** | -0.765*** |
| | | | | (-1.312, -1.073) | | | (-0.967, -0.734) | (-0.888, -0.654) | (-0.893, -0.659) | (-0.882, -0.648) |
| p_hat_cnn_target | | | | | 0.569*** | | | 0.292*** | | 0.164*** |
| | | | | | (0.507, 0.630) | | | (0.232, 0.353) | | (0.065, 0.263) |
| p_hat_cnn_proj | | | | | | 0.592*** | | | 0.317*** | 0.176*** |
| | | | | | | (0.525, 0.659) | | | (0.251, 0.383) | (0.068, 0.284) |
| Constant | -0.243** | -0.204*** | -0.193*** | 1.123*** | 0.326*** | 0.290*** | 0.149*** | -0.037 | -0.072 | -0.078 |
| | (-0.434, -0.053) | (-0.264, -0.144) | (-0.250, -0.136) | (1.086, 1.161) | (0.279, 0.373) | (0.236, 0.343) | (0.076, 0.222) | (-0.119, 0.046) | (-0.158, 0.014) | (-0.164, 0.008) |
| Observations | 6,782 | 6,782 | 6,782 | 6,782 | 6,782 | 6,782 | 6,782 | 6,782 | 6,782 | 6,782 |
| Adjusted R2 | 0.011 | 0.095 | 0.102 | 0.038 | 0.033 | 0.030 | 0.120 | 0.128 | 0.128 | 0.129 |
| F Statistic | 74.608*** (df = 1; 6780) | 709.641*** (df = 1; 6780) | 768.396*** (df = 1; 6780) | 269.594*** (df = 1; 6780) | 232.596*** (df = 1; 6780) | 209.470*** (df = 1; 6780) | 464.509*** (df = 2; 6779) | 333.500*** (df = 3; 6778) | 333.418*** (df = 3; 6778) | 252.150*** (df = 4; 6777) |

Note: \*p<0.1; \*\*p<0.05; \*\*\*p<0.01
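As a consistency check on the regression table, the adjusted R-squared of an OLS model with an intercept can be recovered from its overall F statistic and degrees of freedom. The sketch below applies this standard identity to the F statistics reported for columns (5) and (6), which I read as the single-regressor p_hat_cnn_target and p_hat_cnn_proj models; the function names are hypothetical.

```python
def r2_from_f(f_stat, df_model, df_resid):
    """R^2 implied by the overall F statistic of an OLS fit with intercept.

    Uses F = (R^2 / df_model) / ((1 - R^2) / df_resid), solved for R^2.
    """
    return (f_stat * df_model) / (f_stat * df_model + df_resid)

def adjusted_r2(r2, n_obs, df_model):
    """Adjusted R^2 for n_obs observations and df_model regressors."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - df_model - 1)

N_OBS = 6782  # observations reported in the table

# (F statistic, model df, residual df) as printed in the F Statistic row.
columns = {
    "(5) p_hat_cnn_target": (232.596, 1, 6780),
    "(6) p_hat_cnn_proj": (209.470, 1, 6780),
}

implied = {}
for name, (f_stat, df_model, df_resid) in columns.items():
    r2 = r2_from_f(f_stat, df_model, df_resid)
    implied[name] = adjusted_r2(r2, N_OBS, df_model)
    print(f"{name}: implied adjusted R2 = {implied[name]:.4f}")
```

The implied values round to 0.0330 and 0.0298, the same adjusted R-squared figures quoted for the target and projection predictions elsewhere in this markdown.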
I repeat our baseline table 01 and compare the predictive power of the p_hat_cnn values from our projections and targets. As a reminder, the regression terms are defined below (these are the same definitions as in previous iterations). I split them into definitions for our models and our features:
Model Definitions
Demographic LM: This includes sex and age_arrest to predict our arrest-outcome
Charge Feature LM: This includes felony_flag, gun_crime_flag, drug_crime_flag, violent_crime_flag, property_crime_flag, arrest_year
XgBoost risk: As the name would suggest, this is our XgBoost risk predictor using time-varying arrest history features
Feature Definitions
MTurk Features: These are our high-detail (minimum of 6 workers per image) MTurk features together with their median
Kitchen Sink: The final row of the table includes all previous rows; it is our fully stacked model with all covariates.
NOTE: The column titled P_hat_Target uses the p_hat_cnn of the real mugshot targets. The column titled P_hat_Projection uses the p_hat_cnn of the corresponding GAN projections. The column titled P_hat_Target + P_hat_Projection includes both.
Table 01 - Version 01 - Model Comparison. Fit measured in adjusted R squared and ROC AUC.

| Model Configuration | P_hat_Target: Adjusted R Squared | P_hat_Target: ROC AUC | P_hat_Projection: Adjusted R Squared | P_hat_Projection: ROC AUC | P_hat_Target + P_hat_Projection: Adjusted R Squared | P_hat_Target + P_hat_Projection: ROC AUC |
|---|---|---|---|---|---|---|
| Single Variable Model | | | | | | |
| Demographic LM | 0.0107 | 0.5571 | 0.0107 | 0.5571 | 0.0107 | 0.5571 |
| Lower 95% C.I. | 0.0075 | 0.5416 | 0.0073 | 0.5416 | 0.0072 | 0.5416 |
| Upper 95% C.I. | 0.0146 | 0.5726 | 0.0147 | 0.5726 | 0.0147 | 0.5726 |
| Charge Feature LM | 0.0946 | 0.6949 | 0.0946 | 0.6949 | 0.0946 | 0.6949 |
| Lower 95% C.I. | 0.0826 | 0.6802 | 0.0823 | 0.6802 | 0.0832 | 0.6802 |
| Upper 95% C.I. | 0.1070 | 0.7095 | 0.1076 | 0.7095 | 0.1065 | 0.7095 |
| XgBoost Risk | 0.0381 | 0.6167 | 0.0381 | 0.6167 | 0.0381 | 0.6167 |
| Lower 95% C.I. | 0.0301 | 0.6016 | 0.0305 | 0.6016 | 0.0307 | 0.6016 |
| Upper 95% C.I. | 0.0472 | 0.6319 | 0.0474 | 0.6319 | 0.0474 | 0.6319 |
| MTurk Features (Mean + Median) | 0.0002 | 0.5382 | 0.0002 | 0.5382 | 0.0002 | 0.5382 |
| Lower 95% C.I. | 0.0008 | 0.5223 | 0.0008 | 0.5223 | 0.0007 | 0.5223 |
| Upper 95% C.I. | 0.0063 | 0.5540 | 0.0065 | 0.5540 | 0.0063 | 0.5540 |
| P_hat_cnn | 0.0330 | 0.6240 | 0.0298 | 0.6147 | 0.0349 | 0.6258 |
| Lower 95% C.I. | 0.0262 | 0.6091 | 0.0231 | 0.5995 | 0.0284 | 0.6108 |
| Upper 95% C.I. | 0.0401 | 0.6390 | 0.0363 | 0.6299 | 0.0426 | 0.6408 |
| Combined Variable Model | | | | | | |
| Demographics + Charge Feature | 0.1017 | 0.7053 | 0.1017 | 0.7053 | 0.1017 | 0.7053 |
| Lower 95% C.I. | 0.0900 | 0.6910 | 0.0889 | 0.6910 | 0.0899 | 0.6910 |
| Upper 95% C.I. | 0.1148 | 0.7196 | 0.1148 | 0.7196 | 0.1145 | 0.7196 |
| Demographics + Charge Feature + Risk | 0.1203 | 0.7232 | 0.1203 | 0.7232 | 0.1203 | 0.7232 |
| Lower 95% C.I. | 0.1075 | 0.7092 | 0.1074 | 0.7092 | 0.1071 | 0.7092 |
| Upper 95% C.I. | 0.1344 | 0.7371 | 0.1333 | 0.7371 | 0.1342 | 0.7371 |
| Demographics + Charge Feature + Risk + MTurk (Mean + Median) | 0.1203 | 0.7245 | 0.1203 | 0.7245 | 0.1203 | 0.7245 |
| Lower 95% C.I. | 0.1093 | 0.7106 | 0.1089 | 0.7106 | 0.1099 | 0.7106 |
| Upper 95% C.I. | 0.1360 | 0.7385 | 0.1363 | 0.7385 | 0.1367 | 0.7385 |
| Demographics + Charge Feature + Risk + CNN | 0.1282 | 0.7325 | 0.1282 | 0.7326 | 0.1290 | 0.7336 |
| Lower 95% C.I. | 0.1162 | 0.7188 | 0.1153 | 0.7189 | 0.1162 | 0.7199 |
| Upper 95% C.I. | 0.1420 | 0.7461 | 0.1431 | 0.7462 | 0.1428 | 0.7472 |
| Kitchen Sink (all RHS variables included) | 0.1299 | 0.7357 | 0.1294 | 0.7354 | 0.1307 | 0.7369 |
| Lower 95% C.I. | 0.1200 | 0.7221 | 0.1189 | 0.7217 | 0.1210 | 0.7233 |
| Upper 95% C.I. | 0.1468 | 0.7493 | 0.1453 | 0.7490 | 0.1481 | 0.7504 |
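For readers who want to reproduce the style of the ROC AUC columns, here is a minimal sketch of an AUC point estimate with a 95% interval. The interval method is an assumption: this markdown does not state how the C.I. rows were computed, and a percentile bootstrap is only one common choice. The function names and the example labels and scores are hypothetical, not our data.

```python
import random

def roc_auc(labels, scores):
    """AUC as the chance a random positive outscores a random negative (ties count 1/2)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(roc_auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[round((alpha / 2) * n_boot)]
    upper = stats[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Made-up release outcomes and predictions, for illustration only.
labels = [1, 0, 1, 1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.7, 0.6, 0.65, 0.8, 0.3, 0.5]

auc = roc_auc(labels, scores)
lower, upper = bootstrap_auc_ci(labels, scores)
print(f"AUC = {auc:.4f}, 95% interval = ({lower:.4f}, {upper:.4f})")
```

The pairwise loop is quadratic in the number of observations, which is fine for a sketch but would be slow for the full validation set; a rank-based implementation would be the usual replacement.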