Summary

This document reports staggered OLS regressions on the arrest-release outcome. The regressions include the following variables:

  1. XGBoost risk predictor (risk_pred_prob)
  2. Arrest and demographic covariates p-hat (p_hat_covariates)
  3. Mugshot CNN p-hat (p_hat_cnn)
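
As a rough illustration of what "staggered" means here, the sketch below fits nested OLS models, adding one regressor per step and reporting the adjusted R2 each time. It is not the code behind the tables: the helper names `ols` and `staggered` and the toy numbers are assumptions made for this example, and confidence intervals are left out.

```python
def ols(columns, y):
    """Fit y on an intercept plus the given columns; return (coefs, adjusted R^2)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for p in range(k):
        pivot = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[pivot] = A[pivot], A[p]
        for r in range(k):
            if r != p:
                f = A[r][p] / A[p][p]
                A[r] = [a - f * b for a, b in zip(A[r], A[p])]
    beta = [A[r][k] / A[r][r] for r in range(k)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    y_bar = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    adj_r2 = 1 - (ss_res / (n - k)) / (ss_tot / (n - 1))
    return beta, adj_r2


def staggered(regressors, y):
    """Fit nested models, adding one named regressor per step (dict order)."""
    results = []
    names = list(regressors)
    for step in range(1, len(names) + 1):
        used = names[:step]
        beta, adj_r2 = ols([regressors[name] for name in used], y)
        results.append((used, beta, adj_r2))
    return results


# Toy data (made up): release = 1 - risk + covariates + 0.5 * cnn, no noise.
regressors = {
    "risk_pred_prob": [0.1, 0.4, 0.2, 0.8, 0.5, 0.9, 0.3, 0.7],
    "p_hat_covariates": [0.9, 0.5, 0.7, 0.3, 0.6, 0.2, 0.8, 0.1],
    "p_hat_cnn": [0.2, 0.9, 0.4, 0.1, 0.7, 0.3, 0.6, 0.5],
}
release = [1 - r + c + 0.5 * m for r, c, m in zip(*regressors.values())]

results = staggered(regressors, release)
for used, beta, adj_r2 in results:
    print(used, [round(b, 3) for b in beta], round(adj_r2, 3))
```

Each printed row corresponds to one column of a staggered table: the same outcome, with one more regressor than the row before.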

Note that the new XGBoost risk predictor is based on time-varying historical arrest data. Including these data raises the AUC from 0.601 to 0.63.
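
For readers less familiar with AUC, the sketch below computes it directly from its pairwise definition: the share of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. The function name `auc` and the four example scores are assumptions for illustration; they are unrelated to the 0.601 and 0.63 figures above.

```python
def auc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up scores: three of the four positive/negative pairs are ordered correctly.
example_auc = auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(example_auc)  # 0.75
```

This pairwise form is quadratic in the sample size, so it is meant for explanation rather than for data of the size used in this document.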

My main takeaway from these regressions is:

Multihead 0.3 - ResNet50 - (not) Overfit

Multihead(ResNet50)
Dependent variable: Release Outcome
                  (1)                         (2)                         (3)
risk_pred_prob    -1.112***                   -0.787***                   -0.713***
                  (-1.216, -1.008)            (-0.889, -0.686)            (-0.815, -0.612)
p_hat_covariates                              1.084***                    1.010***
                                              (1.023, 1.146)              (0.947, 1.073)
p_hat_cnn                                                                 0.378***
                                                                          (0.313, 0.443)
Constant          1.105***                    0.151***                    -0.095**
                  (1.072, 1.138)              (0.088, 0.214)              (-0.171, -0.020)
Observations      8,835                       8,835                       8,835
Adjusted R2       0.034                       0.116                       0.125
F Statistic       307.362*** (df = 1; 8833)   583.222*** (df = 2; 8832)   423.407*** (df = 3; 8831)
Note: *p<0.1; **p<0.05; ***p<0.01

Decile Plots

Here I provide two types of plots for each of p_hat_cnn, p_hat_covariates, and risk_pred_prob:

  1. Decile Plot A - The max value in a decile vs. the mean arrest outcome in that decile
  2. Decile Plot B - The mean arrest outcome at each decile index
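
The two plot types use the same per-decile summary and differ only in the x-axis. The sketch below is one way such a summary could be built; it assumes rank-based deciles of near-equal size, and the function name `decile_summary` and the toy inputs are made up for this example.

```python
def decile_summary(scores, outcomes):
    """Sort by score, cut into 10 near-equal bins, and summarise each bin."""
    pairs = sorted(zip(scores, outcomes))
    n = len(pairs)
    rows = []
    for d in range(10):
        chunk = pairs[d * n // 10:(d + 1) * n // 10]
        if not chunk:
            continue  # fewer than 10 observations: some bins are empty
        rows.append({
            "decile": d + 1,                  # x-axis of Decile Plot B
            "max_score": chunk[-1][0],        # x-axis of Decile Plot A
            "mean_outcome": sum(o for _, o in chunk) / len(chunk),  # y-axis of both
        })
    return rows


# Toy inputs: 20 evenly spaced scores, outcome 0 in the lower half and 1 in the upper half.
scores = [i / 20 for i in range(20)]
outcomes = [0] * 10 + [1] * 10
rows = decile_summary(scores, outcomes)
for row in rows:
    print(row)
```

Plot A shows how unevenly the predicted values are spread across deciles, while Plot B places the deciles at equal spacing.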

Decile Plot 1 - p_hat_cnn

Decile Plot 2 - p_hat_covariates

Decile Plot 3 - risk_pred_prob


MTurk Features

We now include MTurk results in our covariates. These were collected for a subset of the validation set, covering 7,318 arrest_ids. The included features are:

  1. Attractiveness
  2. Competence
  3. Dominance
  4. Trustworthiness
  5. Age
  6. Race (Black, White, Hispanic, Asian, Indian, Unsure/Other)
  7. Skin-color (18 variants)

Multihead 0.3 - ResNet50 - (not) Overfit

Table _01 - Model 03

  • The p_hat_features model includes the 18 skin-tone variants (not grouped into super-categories as in Table _02)
  • p_hat_cnn is significant throughout!
  • These effects are robust to including or excluding race in the covariate model in addition to the skin_color levels
  • I am quite confident that, in this sense, we are picking up signal on top of the information gained from knowing race/skin_color!

Multihead(ResNet50)
Dependent variable: Release Outcome
                  (1)                         (2)                         (3)                         (4)
risk_pred_prob    -1.074***                   -1.053***                   -0.716***                   -0.654***
                  (-1.188, -0.960)            (-1.167, -0.938)            (-0.828, -0.604)            (-0.766, -0.543)
p_hat_features                                0.773***                    0.543***                    0.362*
                                              (0.447, 1.099)              (0.230, 0.856)              (0.048, 0.675)
p_hat_covariates                                                          1.070***                    0.989***
                                                                          (1.000, 1.139)              (0.919, 1.060)
p_hat_cnn                                                                                             0.387***
                                                                                                      (0.316, 0.458)
Constant          1.095***                    0.498***                    -0.273*                     -0.379**
                  (1.059, 1.131)              (0.243, 0.752)              (-0.522, -0.023)            (-0.627, -0.130)
Observations      7,318                       7,318                       7,318                       7,318
Adjusted R2       0.032                       0.033                       0.111                       0.121
F Statistic       239.132*** (df = 1; 7316)   127.396*** (df = 2; 7315)   307.049*** (df = 3; 7314)   252.805*** (df = 4; 7313)
Note: *p<0.1; **p<0.05; ***p<0.01

Table _02 - Model 03

  • covariates_lm excludes race (so that skin_tone can account for all of the race signal in this test)
  • The results (p_hat_cnn being significant) are robust to the inclusion of race, though skin_tone_cat_light_skin becomes insignificant
  • skin_tone_(category) is a factor variable that groups the 18 raw hexadecimal color variants (included in Table _01) into three categories of 6 variants each: light, medium, and dark skin.

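
The sketch below shows one way a hex palette could be split into equal-sized categories. It assumes the variants are ordered by brightness, which this document does not state, so read it as an illustration rather than the grouping actually used. Apart from f7ddc4, the hex codes are hypothetical, and the example uses 6 codes in groups of 2 instead of 18 in groups of 6.

```python
def luminance(hex_code):
    """Brightness of an 'rrggbb' string using Rec. 601 luma weights."""
    r, g, b = (int(hex_code[i:i + 2], 16) for i in (0, 2, 4))
    return 0.299 * r + 0.587 * g + 0.114 * b


def bin_skin_tones(hex_codes, labels=("light", "medium", "dark")):
    """Sort variants from brightest to darkest and split into equal-sized groups.

    Assumption: brightness ordering defines the categories.
    """
    ordered = sorted(hex_codes, key=luminance, reverse=True)
    size = len(ordered) // len(labels)
    return {code: labels[min(i // size, len(labels) - 1)]
            for i, code in enumerate(ordered)}


# Hypothetical palette (only f7ddc4 appears in this document).
palette = ["70432a", "f7ddc4", "a0673d", "e0b899", "3b2219", "c68c5e"]
categories = bin_skin_tones(palette)
for code, label in categories.items():
    print(code, label)
```

With the full set of 18 variants the same call would produce three groups of 6.
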
Multihead(ResNet50)
Dependent variable:
Release Outcome
(1) (2) (3) (4)
risk_pred_prob -1.074*** -1.068*** -0.733*** -0.672***
(-1.188, -0.960) (-1.183, -0.953) (-0.846, -0.621) (-0.784, -0.559)
skin_tone_cat_light_skin -0.004 -0.015 -0.019*
(-0.022, 0.015) (-0.033, 0.002) (-0.037, -0.002)
skin_tone_cat_medium_skin 0.004 -0.004 -0.008
(-0.018, 0.026) (-0.025, 0.017) (-0.029, 0.014)
age -0.0005 0.0002 0.001
(-0.002, 0.001) (-0.001, 0.001) (-0.0001, 0.002)
attractiveness -0.003 0.00001 0.0004
(-0.013, 0.008) (-0.010, 0.010) (-0.010, 0.011)
competence 0.003 -0.002 -0.002
(-0.010, 0.015) (-0.014, 0.010) (-0.014, 0.009)
dominance -0.002 0.003 0.005
(-0.011, 0.007) (-0.005, 0.012) (-0.003, 0.014)
trustworthiness 0.004 0.004 0.001
(-0.007, 0.015) (-0.007, 0.014) (-0.009, 0.012)
p_hat_covariates 1.080*** 0.997***
(1.010, 1.149) (0.926, 1.067)
p_hat_cnn 0.409***
(0.338, 0.481)
Constant 1.095*** 1.100*** 0.111** -0.168***
(1.059, 1.131) (1.030, 1.170) (0.019, 0.204) (-0.272, -0.064)
Observations 7,318 7,318 7,318 7,318
Adjusted R2 0.032 0.031 0.110 0.121
F Statistic 239.132*** (df = 1; 7316) 30.083*** (df = 8; 7309) 101.669*** (df = 9; 7308) 101.451*** (df = 10; 7307)
Note: *p<0.1; **p<0.05; ***p<0.01

Table _03 - Model 03

Here we include the 18 raw skin-tone levels.

Multihead(ResNet50)
Dependent variable:
Release Outcome
(1) (2) (3) (4) (5)
risk_pred_prob -1.074*** -1.077*** -1.072*** -0.742*** -0.682***
(-1.188, -0.960) (-1.192, -0.962) (-1.187, -0.956) (-0.855, -0.629) (-0.794, -0.569)
skin_tonenumber_f7ddc4 0.006 0.006 -0.021 -0.037
(-0.033, 0.045) (-0.034, 0.045) (-0.059, 0.017) (-0.075, 0.001)
age -0.0004 0.0003 0.001*
(-0.002, 0.001) (-0.001, 0.001) (0.00002, 0.002)
attractiveness -0.002 0.002 0.002
(-0.013, 0.009) (-0.009, 0.012) (-0.008, 0.013)
competence 0.002 -0.002 -0.003
(-0.010, 0.015) (-0.014, 0.010) (-0.015, 0.009)
dominance -0.002 0.003 0.005
(-0.011, 0.007) (-0.006, 0.011) (-0.004, 0.013)
trustworthiness 0.004 0.004 0.001
(-0.007, 0.015) (-0.007, 0.014) (-0.010, 0.012)
p_hat_covariates 1.085*** 1.003***
(1.015, 1.155) (0.932, 1.074)
p_hat_cnn 0.415***
(0.343, 0.487)
Constant 1.095*** 1.091*** 1.095*** 0.109* -0.171***
(1.059, 1.131) (1.043, 1.139) (1.018, 1.171) (0.012, 0.205) (-0.279, -0.063)
Observations 7,318 7,318 7,318 7,318 7,318
Adjusted R2 0.032 0.032 0.031 0.111 0.122
F Statistic 239.132*** (df = 1; 7316) 14.271*** (df = 18; 7299) 11.215*** (df = 23; 7294) 39.038*** (df = 24; 7293) 41.532*** (df = 25; 7292)
Note: *p<0.1; **p<0.05; ***p<0.01

Table _04 - Model 03 - Male vs. Female

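We now estimate the regression separately for male and female arrestees.

  • The p_hat_cnn coefficient is significant in both subsamples and larger than in the combined model (0.483 for males and 0.553 for females vs. 0.415 combined)
  • The female p_hat_cnn coefficient is surprisingly large
  • The dominance feature becomes significant for the female population (coefficients of 0.016 to 0.018, each marked with one star), which is fascinating!
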
Table _04 - Male
Multihead(ResNet50)
Dependent variable:
Release Outcome
(1) (2) (3) (4) (5)
risk_pred_prob -0.997*** -1.002*** -1.003*** -0.754*** -0.718***
(-1.125, -0.869) (-1.132, -0.873) (-1.133, -0.874) (-0.879, -0.628) (-0.843, -0.593)
skin_tonenumber_f7ddc4 -0.004 -0.003 -0.021 -0.046*
(-0.049, 0.042) (-0.049, 0.043) (-0.066, 0.023) (-0.090, -0.002)
age 0.0005 0.001 0.002**
(-0.001, 0.002) (-0.0002, 0.002) (0.0005, 0.003)
attractiveness -0.005 0.0004 0.002
(-0.018, 0.008) (-0.012, 0.013) (-0.010, 0.014)
competence 0.004 0.0003 -0.00000
(-0.011, 0.019) (-0.014, 0.014) (-0.014, 0.014)
dominance -0.001 -0.003 -0.004
(-0.012, 0.009) (-0.013, 0.007) (-0.014, 0.006)
trustworthiness 0.001 0.003 0.001
(-0.013, 0.014) (-0.010, 0.015) (-0.011, 0.014)
p_hat_covariates 1.071*** 1.024***
(0.991, 1.151) (0.944, 1.104)
p_hat_cnn 0.483***
(0.395, 0.571)
Constant 1.055*** 1.064*** 1.053*** 0.133** -0.199***
(1.014, 1.096) (1.010, 1.118) (0.964, 1.142) (0.024, 0.243) (-0.324, -0.075)
Observations 5,725 5,725 5,725 5,725 5,725
Adjusted R2 0.028 0.028 0.027 0.104 0.116
F Statistic 164.322*** (df = 1; 5723) 10.197*** (df = 18; 5706) 8.033*** (df = 23; 5701) 28.633*** (df = 24; 5700) 31.147*** (df = 25; 5699)
Note: *p<0.1; **p<0.05; ***p<0.01

Table _04 - Female
Multihead(ResNet50)
Dependent variable:
Release Outcome
(1) (2) (3) (4) (5)
risk_pred_prob -0.996*** -0.987*** -0.963*** -0.705*** -0.644***
(-1.280, -0.712) (-1.272, -0.701) (-1.250, -0.677) (-0.985, -0.425) (-0.922, -0.366)
skin_tonenumber_f7ddc4 0.018 0.017 0.008 0.038
(-0.065, 0.100) (-0.066, 0.100) (-0.073, 0.088) (-0.042, 0.118)
age -0.002* -0.001 -0.0004
(-0.005, -0.00004) (-0.003, 0.001) (-0.003, 0.002)
attractiveness 0.007 0.008 0.007
(-0.013, 0.027) (-0.011, 0.027) (-0.012, 0.025)
competence -0.009 -0.011 -0.013
(-0.032, 0.014) (-0.033, 0.011) (-0.035, 0.009)
dominance 0.018* 0.016* 0.016*
(0.002, 0.035) (0.0002, 0.032) (0.001, 0.032)
trustworthiness 0.004 0.004 0.002
(-0.016, 0.025) (-0.015, 0.024) (-0.018, 0.021)
p_hat_covariates 0.892*** 0.880***
(0.753, 1.031) (0.742, 1.018)
p_hat_cnn 0.553***
(0.381, 0.724)
Constant 1.131*** 1.084*** 1.059*** 0.261** -0.211
(1.048, 1.213) (0.977, 1.190) (0.906, 1.211) (0.068, 0.454) (-0.452, 0.030)
Observations 1,593 1,593 1,593 1,593 1,593
Adjusted R2 0.020 0.022 0.026 0.089 0.105
F Statistic 33.263*** (df = 1; 1591) 3.024*** (df = 18; 1574) 2.819*** (df = 23; 1569) 7.518*** (df = 24; 1568) 8.465*** (df = 25; 1567)
Note: *p<0.1; **p<0.05; ***p<0.01

Table _05 - Including p_hat_cnn first
Multihead(ResNet50)
Dependent variable:
Release Outcome
(1) (2) (3) (4) (5)
risk_pred_prob -1.074*** -0.929*** -0.942*** -0.944*** -0.682***
(-1.188, -0.960) (-1.043, -0.815) (-1.056, -0.827) (-1.059, -0.829) (-0.794, -0.569)
p_hat_cnn 0.608*** 0.613*** 0.621*** 0.415***
(0.536, 0.679) (0.541, 0.685) (0.548, 0.694) (0.343, 0.487)
skin_tonenumber_f7ddc4 -0.021 -0.022 -0.037
(-0.060, 0.018) (-0.061, 0.018) (-0.075, 0.001)
age 0.001 0.001*
(-0.0003, 0.002) (0.00002, 0.002)
attractiveness -0.001 0.002
(-0.011, 0.010) (-0.008, 0.013)
competence 0.0005 -0.003
(-0.012, 0.013) (-0.015, 0.009)
dominance 0.001 0.005
(-0.007, 0.010) (-0.004, 0.013)
trustworthiness 0.0002 0.001
(-0.011, 0.011) (-0.010, 0.012)
p_hat_covariates 1.003***
(0.932, 1.074)
Constant 1.095*** 0.596*** 0.605*** 0.564*** -0.171***
(1.059, 1.131) (0.528, 0.665) (0.530, 0.679) (0.467, 0.662) (-0.279, -0.063)
Observations 7,318 7,318 7,318 7,318 7,318
Adjusted R2 0.032 0.057 0.057 0.056 0.122
F Statistic 239.132*** (df = 1; 7316) 220.280*** (df = 2; 7315) 24.137*** (df = 19; 7298) 19.172*** (df = 24; 7293) 41.532*** (df = 25; 7292)
Note: *p<0.1; **p<0.05; ***p<0.01

Per-Decile Regressions

Notes:

  1. Significant changes in R squared at the extremes of risk_pred_prob
  2. p_hat_cnn adds significant information in all three decile groups
  3. The higher the base risk_pred_prob, the less important p_hat_cnn becomes. This indicates that:
    • When judges are quite certain of the re-arrest risk based on the past record, they base less of their decision on the information from the face
    • However, even at the highest risk levels the effect is not zero!
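
To make the grouping concrete, the sketch below assigns observations to the decile groups 1-3, 4-6, and 7-10 of risk_pred_prob and fits a regression within each group. It is a simplified stand-in: deciles are assumed to be rank-based, the per-group model has a single regressor instead of the full specification in the tables, and the helper names and toy data are made up.

```python
def decile_groups(risk, groups=((1, 3), (4, 6), (7, 10))):
    """Map each decile group label, e.g. '1-3', to the observation indices in it."""
    order = sorted(range(len(risk)), key=lambda i: risk[i])
    n = len(order)
    out = {}
    for rank, i in enumerate(order):
        decile = rank * 10 // n + 1  # rank-based decile index, 1..10
        for lo, hi in groups:
            if lo <= decile <= hi:
                out.setdefault(f"{lo}-{hi}", []).append(i)
    return out


def slope(x, y):
    """OLS slope of y on a single regressor x (intercept included)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Toy data (made up): the outcome depends on p_hat_cnn with slope 0.5 everywhere.
risk = [i / 20 for i in range(20)]
p_hat_cnn = [(i * 7 % 20) / 20 for i in range(20)]
release = [0.5 * c for c in p_hat_cnn]

groups = decile_groups(risk)
slopes = {name: slope([p_hat_cnn[i] for i in idx], [release[i] for i in idx])
          for name, idx in groups.items()}
print({name: len(idx) for name, idx in groups.items()})
print(slopes)
```

Comparing the per-group coefficients side by side is what the three tables in this section do for the full model.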

Deciles 1 - 3

Multihead(ResNet50)
Dependent variable:
Release Outcome
(1) (2) (3) (4)
risk_pred_prob 2.107** 1.867* 1.796* 1.690*
(0.528, 3.686) (0.260, 3.474) (0.269, 3.322) (0.168, 3.211)
skin_tonenumber_f7ddc4 0.024 0.007 -0.005
(-0.048, 0.096) (-0.061, 0.076) (-0.073, 0.064)
age -0.001 -0.001 -0.0004
(-0.003, 0.001) (-0.003, 0.001) (-0.002, 0.001)
attractiveness 0.003 0.003 0.003
(-0.015, 0.021) (-0.014, 0.020) (-0.014, 0.020)
competence 0.012 0.008 0.008
(-0.009, 0.034) (-0.013, 0.028) (-0.012, 0.028)
dominance 0.003 0.007 0.009
(-0.012, 0.018) (-0.007, 0.022) (-0.006, 0.023)
trustworthiness -0.004 -0.005 -0.007
(-0.023, 0.014) (-0.022, 0.013) (-0.025, 0.010)
p_hat_covariates 1.163*** 1.088***
(1.038, 1.288) (0.959, 1.216)
p_hat_cnn 0.317***
(0.192, 0.442)
Constant 0.265 0.296 -0.621** -0.782***
(-0.143, 0.674) (-0.147, 0.738) (-1.053, -0.189) (-1.217, -0.347)
Observations 2,196 2,196 2,196 2,196
Adjusted R2 0.002 0.006 0.103 0.109
F Statistic 4.819** (df = 1; 2194) 1.617** (df = 23; 2172) 11.474*** (df = 24; 2171) 11.795*** (df = 25; 2170)
Note: *p<0.1; **p<0.05; ***p<0.01

Deciles 4 - 6

Multihead(ResNet50)
Dependent variable:
Release Outcome
(1) (2) (3) (4)
risk_pred_prob -3.844** -3.943** -1.701 -1.517
(-6.843, -0.846) (-7.007, -0.879) (-4.640, 1.238) (-4.441, 1.406)
skin_tonenumber_f7ddc4 0.028 0.004 -0.002
(-0.040, 0.095) (-0.060, 0.069) (-0.066, 0.062)
age -0.0004 -0.0004 0.00003
(-0.002, 0.001) (-0.002, 0.001) (-0.002, 0.002)
attractiveness -0.012 -0.010 -0.010
(-0.030, 0.005) (-0.027, 0.007) (-0.026, 0.007)
competence 0.003 -0.007 -0.008
(-0.018, 0.024) (-0.027, 0.013) (-0.028, 0.012)
dominance -0.004 0.003 0.005
(-0.019, 0.010) (-0.011, 0.017) (-0.009, 0.019)
trustworthiness 0.010 0.014 0.011
(-0.009, 0.029) (-0.004, 0.031) (-0.007, 0.029)
p_hat_covariates 1.029*** 0.951***
(0.912, 1.147) (0.831, 1.070)
p_hat_cnn 0.367***
(0.248, 0.485)
Constant 1.835*** 1.884*** 0.465 0.195
(1.049, 2.621) (1.082, 2.685) (-0.318, 1.248) (-0.588, 0.978)
Observations 2,196 2,196 2,196 2,196
Adjusted R2 0.002 -0.001 0.086 0.096
F Statistic 4.447** (df = 1; 2194) 0.902 (df = 23; 2172) 9.582*** (df = 24; 2171) 10.341*** (df = 25; 2170)
Note: *p<0.1; **p<0.05; ***p<0.01

Deciles 7 - 10

Multihead(ResNet50)
Dependent variable:
Release Outcome
(1) (2) (3) (4)
risk_pred_prob -0.953*** -0.919*** -0.569*** -0.504***
(-1.182, -0.723) (-1.153, -0.685) (-0.797, -0.340) (-0.731, -0.276)
skin_tonenumber_f7ddc4 -0.032 -0.066* -0.094**
(-0.099, 0.035) (-0.131, -0.001) (-0.159, -0.029)
age 0.001 0.002** 0.004***
(-0.001, 0.003) (0.0005, 0.004) (0.002, 0.005)
attractiveness -0.001 0.007 0.010
(-0.021, 0.018) (-0.012, 0.026) (-0.009, 0.029)
competence -0.004 -0.005 -0.007
(-0.026, 0.018) (-0.026, 0.016) (-0.028, 0.014)
dominance -0.001 0.002 0.003
(-0.016, 0.015) (-0.014, 0.017) (-0.012, 0.018)
trustworthiness 0.006 0.002 0.001
(-0.014, 0.027) (-0.017, 0.022) (-0.018, 0.020)
p_hat_covariates 1.069*** 0.989***
(0.952, 1.186) (0.871, 1.107)
p_hat_cnn 0.514***
(0.388, 0.640)
Constant 1.044*** 0.995*** -0.019 -0.388***
(0.956, 1.132) (0.841, 1.149) (-0.204, 0.167) (-0.593, -0.183)
Observations 2,926 2,926 2,926 2,926
Adjusted R2 0.015 0.016 0.087 0.100
F Statistic 46.531*** (df = 1; 2924) 3.076*** (df = 23; 2902) 12.556*** (df = 24; 2901) 14.034*** (df = 25; 2900)
Note: *p<0.1; **p<0.05; ***p<0.01

Non-Linearity in p_hat_cnn

We consider three approaches to deal with the non-linearity in p_hat_cnn:

  1. Replace p_hat_cnn with the average value of its decile
  2. Collapse the bottom three deciles into one average
  3. Use the decile index (1-10) as a simple float instead of the decile averages
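
The three transformations can be sketched as follows. This is an interpretation of the list above, not the code used for the tables: deciles are assumed to be rank-based, "collapsing" is read as giving deciles 1-3 one shared mean while the other deciles keep their own, and all names and inputs are made up.

```python
def decile_index(values):
    """Rank-based decile index (1-10) of each value."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    idx = [0] * len(values)
    for rank, i in enumerate(order):
        idx[i] = rank * 10 // len(values) + 1
    return idx


def group_average(values, keys):
    """Replace each value with the mean of all values sharing its key."""
    totals = {}
    for v, k in zip(values, keys):
        s, c = totals.get(k, (0.0, 0))
        totals[k] = (s + v, c + 1)
    return [totals[k][0] / totals[k][1] for k in keys]


# Toy predictions: the values 0/20 .. 19/20 in scrambled order.
p_hat_cnn = [(i * 7 % 20) / 20 for i in range(20)]
deciles = decile_index(p_hat_cnn)

# Approach 1: average value within each decile.
decile_avg = group_average(p_hat_cnn, deciles)
# Approach 2: deciles 1-3 share a single average; deciles 4-10 keep their own.
collapsed = group_average(p_hat_cnn, [max(d, 3) for d in deciles])
# Approach 3: the decile index itself, as a float.
decile_float = [float(d) for d in deciles]

print(decile_avg[0], collapsed[0], decile_float[0])
```

Any of the three resulting columns can then replace p_hat_cnn as the last regressor in the final model.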

Average decile value for p_hat_cnn

Multihead(ResNet50)
Dependent variable:
Release Outcome
(1) (2) (3) (4)
risk_pred_prob -1.074*** -1.072*** -0.742*** -0.699***
(-1.188, -0.960) (-1.187, -0.956) (-0.855, -0.629) (-0.922, -0.477)
skin_tonenumber_f7ddc4 0.006 -0.021 -0.021
(-0.034, 0.045) (-0.059, 0.017) (-0.059, 0.017)
age -0.0004 0.0003 0.0003
(-0.002, 0.001) (-0.001, 0.001) (-0.001, 0.001)
attractiveness -0.002 0.002 0.001
(-0.013, 0.009) (-0.009, 0.012) (-0.009, 0.012)
competence 0.002 -0.002 -0.002
(-0.010, 0.015) (-0.014, 0.010) (-0.014, 0.010)
dominance -0.002 0.003 0.003
(-0.011, 0.007) (-0.006, 0.011) (-0.006, 0.011)
trustworthiness 0.004 0.004 0.004
(-0.007, 0.015) (-0.007, 0.014) (-0.007, 0.014)
p_hat_covariates 1.085*** 1.085***
(1.015, 1.155) (1.015, 1.155)
p_hat_cnn_decile_avr 0.191
(-0.675, 1.057)
Constant 1.095*** 1.095*** 0.109* -0.048
(1.059, 1.131) (1.018, 1.171) (0.012, 0.205) (-0.765, 0.668)
Observations 7,318 7,318 7,318 7,318
Adjusted R2 0.032 0.031 0.111 0.111
F Statistic 239.132*** (df = 1; 7316) 11.215*** (df = 23; 7294) 39.038*** (df = 24; 7293) 37.477*** (df = 25; 7292)
Note: *p<0.1; **p<0.05; ***p<0.01

Collapsing bottom three deciles

Multihead(ResNet50)
Dependent variable:
Release Outcome
(1) (2) (3) (4)
risk_pred_prob -1.074*** -1.072*** -0.742*** -0.698***
(-1.188, -0.960) (-1.187, -0.956) (-0.855, -0.629) (-0.939, -0.457)
skin_tonenumber_f7ddc4 0.006 -0.021 -0.021
(-0.034, 0.045) (-0.059, 0.017) (-0.059, 0.017)
age -0.0004 0.0003 0.0003
(-0.002, 0.001) (-0.001, 0.001) (-0.001, 0.001)
attractiveness -0.002 0.002 0.001
(-0.013, 0.009) (-0.009, 0.012) (-0.009, 0.012)
competence 0.002 -0.002 -0.002
(-0.010, 0.015) (-0.014, 0.010) (-0.014, 0.010)
dominance -0.002 0.003 0.003
(-0.011, 0.007) (-0.006, 0.011) (-0.006, 0.011)
trustworthiness 0.004 0.004 0.004
(-0.007, 0.015) (-0.007, 0.014) (-0.007, 0.014)
p_hat_covariates 1.085*** 1.085***
(1.015, 1.155) (1.015, 1.155)
p_hat_cnn_decile_avr 0.195
(-0.760, 1.150)
Constant 1.095*** 1.095*** 0.109* -0.051
(1.059, 1.131) (1.018, 1.171) (0.012, 0.205) (-0.840, 0.737)
Observations 7,318 7,318 7,318 7,318
Adjusted R2 0.032 0.031 0.111 0.111
F Statistic 239.132*** (df = 1; 7316) 11.215*** (df = 23; 7294) 39.038*** (df = 24; 7293) 37.476*** (df = 25; 7292)
Note: *p<0.1; **p<0.05; ***p<0.01

Brute force 1-10 as float

Multihead(ResNet50)
Dependent variable:
Release Outcome
(1) (2) (3) (4)
risk_pred_prob -1.074*** -1.072*** -0.742*** -0.856***
(-1.188, -0.960) (-1.187, -0.956) (-0.855, -0.629) (-1.065, -0.648)
skin_tonenumber_f7ddc4 0.006 -0.021 -0.020
(-0.034, 0.045) (-0.059, 0.017) (-0.058, 0.018)
age -0.0004 0.0003 0.0003
(-0.002, 0.001) (-0.001, 0.001) (-0.001, 0.001)
attractiveness -0.002 0.002 0.002
(-0.013, 0.009) (-0.009, 0.012) (-0.009, 0.012)
competence 0.002 -0.002 -0.002
(-0.010, 0.015) (-0.014, 0.010) (-0.014, 0.010)
dominance -0.002 0.003 0.003
(-0.011, 0.007) (-0.006, 0.011) (-0.006, 0.011)
trustworthiness 0.004 0.004 0.004
(-0.007, 0.015) (-0.007, 0.014) (-0.007, 0.014)
p_hat_covariates 1.085*** 1.085***
(1.015, 1.155) (1.016, 1.155)
decile 0.003
(-0.002, 0.008)
Constant 1.095*** 1.095*** 0.109* 0.126**
(1.059, 1.131) (1.018, 1.171) (0.012, 0.205) (0.025, 0.226)
Observations 7,318 7,318 7,318 7,318
Adjusted R2 0.032 0.031 0.111 0.111
F Statistic 239.132*** (df = 1; 7316) 11.215*** (df = 23; 7294) 39.038*** (df = 24; 7293) 37.523*** (df = 25; 7292)
Note: *p<0.1; **p<0.05; ***p<0.01

Repeating Main Regression on subset of detailed labels

Here I include only those images whose MTurk labels came from more than 3 workers. The number of workers per image now ranges from 6 to 9, and we are left with 558 validation observations.

Table _03 - SUB SAMPLE - Combined Male & Female
Multihead(ResNet50)
Dependent variable:
Release Outcome
(1) (2) (3) (4) (5)
risk_pred_prob -1.319*** -1.369*** -1.304*** -1.009*** -0.934***
(-1.784, -0.854) (-1.846, -0.891) (-1.791, -0.816) (-1.483, -0.534) (-1.414, -0.454)
skin_tonenumber_f7ddc4 0.057 0.072 0.090 0.072
(-0.110, 0.224) (-0.098, 0.242) (-0.073, 0.253) (-0.092, 0.236)
age 0.0004 0.001 0.002
(-0.005, 0.006) (-0.004, 0.006) (-0.003, 0.007)
attractiveness -0.015 -0.021 -0.017
(-0.071, 0.041) (-0.075, 0.033) (-0.070, 0.037)
competence 0.013 0.024 0.020
(-0.059, 0.086) (-0.046, 0.094) (-0.049, 0.090)
dominance 0.003 0.021 0.021
(-0.044, 0.049) (-0.024, 0.066) (-0.023, 0.066)
trustworthiness 0.034 0.028 0.027
(-0.034, 0.102) (-0.038, 0.093) (-0.039, 0.092)
p_hat_covariates 1.091*** 1.036***
(0.796, 1.386) (0.736, 1.336)
p_hat_cnn 0.308
(-0.005, 0.621)
Constant 1.167*** 1.105*** 0.900*** -0.151 -0.393
(1.019, 1.315) (0.906, 1.303) (0.524, 1.276) (-0.610, 0.308) (-0.914, 0.127)
Observations 449 449 449 449 449
Adjusted R2 0.044 0.040 0.033 0.109 0.112
F Statistic 21.745*** (df = 1; 447) 2.028*** (df = 18; 430) 1.672** (df = 23; 425) 3.282*** (df = 24; 424) 3.268*** (df = 25; 423)
Note: *p<0.1; **p<0.05; ***p<0.01

Table _04 - SUB SAMPLE - Male
Multihead(ResNet50)
Dependent variable:
Release Outcome
(1) (2) (3) (4) (5)
risk_pred_prob -1.104*** -1.185*** -1.140*** -0.936*** -0.885***
(-1.625, -0.583) (-1.723, -0.647) (-1.687, -0.593) (-1.467, -0.405) (-1.418, -0.353)
skin_tonenumber_f7ddc4 -0.014 0.016 0.045 0.019
(-0.210, 0.183) (-0.185, 0.217) (-0.149, 0.238) (-0.176, 0.214)
age -0.0002 0.0003 0.002
(-0.006, 0.006) (-0.006, 0.006) (-0.004, 0.008)
attractiveness -0.053 -0.058 -0.053
(-0.122, 0.016) (-0.124, 0.009) (-0.120, 0.014)
competence 0.005 0.024 0.020
(-0.080, 0.091) (-0.059, 0.106) (-0.063, 0.102)
dominance 0.014 0.028 0.027
(-0.045, 0.073) (-0.029, 0.085) (-0.030, 0.084)
trustworthiness 0.068 0.060 0.064
(-0.014, 0.150) (-0.019, 0.139) (-0.015, 0.143)
p_hat_covariates 1.054*** 1.006***
(0.714, 1.395) (0.662, 1.350)
p_hat_cnn 0.382
(-0.024, 0.788)
Constant 1.087*** 1.048*** 0.869*** -0.130 -0.441
(0.917, 1.257) (0.825, 1.271) (0.401, 1.336) (-0.684, 0.425) (-1.085, 0.204)
Observations 348 348 348 348 348
Adjusted R2 0.031 0.022 0.016 0.086 0.090
F Statistic 12.146*** (df = 1; 346) 1.436 (df = 18; 329) 1.245 (df = 23; 324) 2.363*** (df = 24; 323) 2.374*** (df = 25; 322)
Note: *p<0.1; **p<0.05; ***p<0.01

Table _04 - SUB SAMPLE - Female
Multihead(ResNet50)
Dependent variable:
Release Outcome
(1) (2) (3) (4) (5)
risk_pred_prob -2.440*** -2.448*** -2.526*** -2.006** -1.621*
(-3.641, -1.239) (-3.758, -1.138) (-3.837, -1.216) (-3.331, -0.681) (-2.987, -0.255)
skin_tonenumber_f7ddc4 0.223 0.109 0.200 0.283
(-0.161, 0.607) (-0.277, 0.495) (-0.181, 0.580) (-0.103, 0.668)
age 0.008 0.007 0.006
(-0.004, 0.020) (-0.005, 0.018) (-0.005, 0.017)
attractiveness 0.132** 0.119** 0.124**
(0.033, 0.230) (0.022, 0.215) (0.029, 0.219)
competence 0.104 0.070 0.074
(-0.049, 0.258) (-0.081, 0.221) (-0.075, 0.223)
dominance 0.005 0.021 0.013
(-0.081, 0.092) (-0.064, 0.106) (-0.072, 0.097)
trustworthiness -0.183** -0.157* -0.179**
(-0.320, -0.046) (-0.292, -0.022) (-0.314, -0.044)
p_hat_covariates 0.833** 0.869**
(0.250, 1.415) (0.292, 1.447)
p_hat_cnn 0.739
(-0.002, 1.479)
Constant 1.530*** 1.377*** 0.935** 0.137 -0.557
(1.182, 1.879) (0.863, 1.891) (0.238, 1.632) (-0.740, 1.015) (-1.670, 0.556)
Observations 101 101 101 101 101
Adjusted R2 0.092 0.062 0.106 0.155 0.173
F Statistic 11.171*** (df = 1; 99) 1.389 (df = 17; 83) 1.539* (df = 22; 78) 1.798** (df = 23; 77) 1.873** (df = 24; 76)
Note: *p<0.1; **p<0.05; ***p<0.01

Release-rate and skin-tone

We now present plots of how skin tone relates to the mean arrest rate and to a set of MTurk labels. The plotted quantities are:

  1. Mean arrest rate
  2. Attractiveness
  3. Competence
  4. Dominance
  5. Trustworthiness