I plot the difference in distribution between the old model, trained without the minority-class sampler, and the new model, trained with an updated data loader.
Repeating the decile plots for the new p_hat_cnn, we no longer see the high degree of non-linearity.
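For readers who want to reproduce this kind of decile plot, here is a minimal sketch of the binning step. It assumes the deciles are equal-size groups of observations sorted by p_hat_cnn, and that the plotted quantity is the mean release outcome per group. The text does not spell out either choice. The function name `decile_summary`, the variable `released`, and the synthetic data are hypothetical and are not taken from the analysis.

```python
from statistics import mean


def decile_summary(preds, outcomes, n_bins=10):
    """Sort observations by prediction, split them into n_bins near-equal-size
    groups, and return (bin number, mean prediction, mean outcome) per group."""
    if len(preds) != len(outcomes):
        raise ValueError("preds and outcomes must have the same length")
    if len(preds) < n_bins:
        raise ValueError("need at least one observation per bin")

    # Sort by predicted probability only; outcomes travel with their prediction.
    pairs = sorted(zip(preds, outcomes), key=lambda pair: pair[0])
    n = len(pairs)

    rows = []
    for b in range(n_bins):
        # Integer arithmetic keeps bin sizes within one observation of each other.
        lo, hi = b * n // n_bins, (b + 1) * n // n_bins
        chunk = pairs[lo:hi]
        rows.append((
            b + 1,
            mean([p for p, _ in chunk]),
            mean([y for _, y in chunk]),
        ))
    return rows


# Synthetic, made-up data purely for illustration (NOT the study's data).
p_hat_cnn = [i / 100 for i in range(100)]
released = [1 if p >= 0.5 else 0 for p in p_hat_cnn]

for decile, mean_pred, release_rate in decile_summary(p_hat_cnn, released):
    print(f"decile {decile:2d}: mean p_hat_cnn = {mean_pred:.3f}, "
          f"release rate = {release_rate:.2f}")
```

Plotting the mean outcome against the mean prediction for each bin gives the decile plot. A roughly straight line through the ten points is what "no high degree of non-linearity" would look like.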
We repeat our baseline regressions with the new CNN model.
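As a reading aid, the richest specification, column (5), can be written out as follows. This is a sketch that assumes a linear model estimated by least squares, which the reported adjusted R² and F statistics suggest but the text does not state. The symbol $X_i$ is an assumption: it stands for controls that are not displayed, since the F-statistic degrees of freedom (18 in column (2)) exceed the two coefficients shown for that column. `skin_tone` abbreviates the displayed `skin_tonenumber_f7ddc4` term.

```latex
\begin{aligned}
\text{Release}_i ={}& \beta_0 + \beta_1\,\text{risk\_pred\_prob}_i + \beta_2\,\text{skin\_tone}_i + \gamma^{\top} X_i \\
&+ \beta_3\,\text{age}_i + \beta_4\,\text{attractiveness}_i + \beta_5\,\text{competence}_i
 + \beta_6\,\text{dominance}_i + \beta_7\,\text{trustworthiness}_i \\
&+ \delta_1\,\hat{p}^{\text{covariates}}_i + \delta_2\,\hat{p}^{\text{cnn}}_i + \varepsilon_i
\end{aligned}
```

Under this reading, column (1) keeps only $\beta_0$ and $\beta_1$; column (2) adds the first line's remaining terms; column (3) adds the second line; column (4) adds $\hat{p}^{\text{covariates}}$; and column (5) adds $\hat{p}^{\text{cnn}}$.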

Dependent variable: Release Outcome

| | (1) | (2) | (3) | (4) | (5) |
|---|---|---|---|---|---|
| risk_pred_prob | -1.105*** | -1.107*** | -1.106*** | -0.778*** | -0.696*** |
| | (-1.211, -0.998) | (-1.215, -1.000) | (-1.214, -0.999) | (-0.883, -0.673) | (-0.801, -0.591) |
| skin_tonenumber_f7ddc4 | | 0.002 | 0.002 | -0.027 | -0.025 |
| | | (-0.033, 0.037) | (-0.033, 0.038) | (-0.061, 0.007) | (-0.059, 0.009) |
| age | | | -0.0003 | 0.0003 | 0.001 |
| | | | (-0.001, 0.001) | (-0.001, 0.001) | (-0.0002, 0.002) |
| attractiveness | | | -0.002 | 0.001 | 0.003 |
| | | | (-0.012, 0.008) | (-0.009, 0.010) | (-0.007, 0.013) |
| competence | | | 0.003 | -0.001 | -0.004 |
| | | | (-0.009, 0.015) | (-0.013, 0.010) | (-0.015, 0.007) |
| dominance | | | 0.001 | 0.005 | 0.006 |
| | | | (-0.008, 0.009) | (-0.003, 0.012) | (-0.002, 0.014) |
| trustworthiness | | | 0.001 | 0.001 | -0.0002 |
| | | | (-0.010, 0.011) | (-0.009, 0.011) | (-0.010, 0.010) |
| p_hat_covariates | | | | 1.083*** | 1.006*** |
| | | | | (1.018, 1.148) | (0.940, 1.071) |
| p_hat_cnn | | | | | 0.380*** |
| | | | | | (0.319, 0.441) |
| Constant | 1.102*** | 1.088*** | 1.088*** | 0.114** | -0.110* |
| | (1.069, 1.136) | (1.045, 1.131) | (1.017, 1.158) | (0.024, 0.203) | (-0.206, -0.015) |
| Observations | 8,479 | 8,479 | 8,479 | 8,479 | 8,479 |
| Adjusted R² | 0.033 | 0.033 | 0.033 | 0.112 | 0.123 |
| F Statistic | 291.841*** (df = 1; 8477) | 17.216*** (df = 18; 8460) | 13.490*** (df = 23; 8455) | 45.641*** (df = 24; 8454) | 48.550*** (df = 25; 8453) |
| Note: | * p<0.1; ** p<0.05; *** p<0.01 | | | | |

Dependent variable: Release Outcome

| | (1) | (2) | (3) | (4) | (5) |
|---|---|---|---|---|---|
| risk_pred_prob | -0.966*** | -0.946*** | -0.916*** | -0.679*** | -0.569*** |
| | (-1.232, -0.700) | (-1.214, -0.678) | (-1.185, -0.647) | (-0.942, -0.416) | (-0.830, -0.308) |
| skin_tonenumber_f7ddc4 | | -0.071 | -0.077* | -0.085* | -0.045 |
| | | (-0.144, 0.003) | (-0.150, -0.003) | (-0.157, -0.013) | (-0.116, 0.027) |
| age | | | -0.002 | -0.001 | 0.00004 |
| | | | (-0.004, 0.0002) | (-0.003, 0.001) | (-0.002, 0.002) |
| attractiveness | | | 0.012 | 0.012 | 0.013 |
| | | | (-0.006, 0.031) | (-0.006, 0.030) | (-0.005, 0.031) |
| competence | | | -0.012 | -0.014 | -0.019 |
| | | | (-0.034, 0.009) | (-0.035, 0.007) | (-0.040, 0.002) |
| dominance | | | 0.019** | 0.016* | 0.017* |
| | | | (0.003, 0.034) | (0.001, 0.031) | (0.002, 0.032) |
| trustworthiness | | | 0.004 | 0.005 | 0.004 |
| | | | (-0.015, 0.024) | (-0.014, 0.024) | (-0.015, 0.022) |
| p_hat_covariates | | | | 1.140*** | 1.092*** |
| | | | | (0.969, 1.311) | (0.923, 1.261) |
| p_hat_cnn | | | | | 0.509*** |
| | | | | | (0.391, 0.627) |
| Constant | 1.118*** | 1.145*** | 1.095*** | 0.043 | -0.344** |
| | (1.041, 1.195) | (1.046, 1.244) | (0.951, 1.239) | (-0.168, 0.254) | (-0.570, -0.117) |
| Observations | 1,833 | 1,833 | 1,833 | 1,833 | 1,833 |
| Adjusted R² | 0.019 | 0.022 | 0.026 | 0.086 | 0.110 |
| F Statistic | 35.610*** (df = 1; 1831) | 3.239*** (df = 18; 1814) | 3.113*** (df = 23; 1809) | 8.192*** (df = 24; 1808) | 10.083*** (df = 25; 1807) |
| Note: | * p<0.1; ** p<0.05; *** p<0.01 | | | | |

Dependent variable: Release Outcome

| | (1) | (2) | (3) | (4) | (5) |
|---|---|---|---|---|---|
| risk_pred_prob | -1.039*** | -1.043*** | -1.048*** | -0.791*** | -0.732*** |
| | (-1.158, -0.920) | (-1.163, -0.923) | (-1.168, -0.928) | (-0.908, -0.674) | (-0.848, -0.615) |
| skin_tonenumber_f7ddc4 | | 0.001 | 0.003 | -0.013 | -0.013 |
| | | (-0.040, 0.043) | (-0.039, 0.045) | (-0.053, 0.027) | (-0.053, 0.027) |
| age | | | 0.001 | 0.001 | 0.001 |
| | | | (-0.001, 0.002) | (-0.001, 0.002) | (-0.0003, 0.002) |
| attractiveness | | | -0.006 | -0.002 | 0.001 |
| | | | (-0.018, 0.005) | (-0.013, 0.010) | (-0.010, 0.012) |
| competence | | | 0.005 | 0.001 | -0.001 |
| | | | (-0.008, 0.019) | (-0.012, 0.015) | (-0.014, 0.013) |
| dominance | | | 0.002 | 0.0004 | -0.001 |
| | | | (-0.008, 0.012) | (-0.009, 0.010) | (-0.010, 0.009) |
| trustworthiness | | | -0.003 | -0.0005 | -0.001 |
| | | | (-0.015, 0.010) | (-0.012, 0.011) | (-0.013, 0.011) |
| p_hat_covariates | | | | 1.086*** | 1.040*** |
| | | | | (1.011, 1.160) | (0.966, 1.115) |
| p_hat_cnn | | | | | 0.396*** |
| | | | | | (0.321, 0.471) |
| Constant | 1.066*** | 1.053*** | 1.041*** | 0.130** | -0.107 |
| | (1.028, 1.105) | (1.005, 1.101) | (0.959, 1.124) | (0.029, 0.231) | (-0.216, 0.003) |
| Observations | 6,646 | 6,646 | 6,646 | 6,646 | 6,646 |
| Adjusted R² | 0.030 | 0.030 | 0.029 | 0.107 | 0.117 |
| F Statistic | 206.526*** (df = 1; 6644) | 12.312*** (df = 18; 6627) | 9.751*** (df = 23; 6622) | 34.077*** (df = 24; 6621) | 36.110*** (df = 25; 6620) |
| Note: | * p<0.1; ** p<0.05; *** p<0.01 | | | | |