XGBoost Risk Predictor - Kicking the Tires

This document compiles a few different plots meant to “kick the tires” on the risk-predictor. The intention is for us to decide which model to use for the first draft. The document is outlined as follows:

  1. Prediction Density Plots

  2. Prediction Density Plots by Demographic Groups

  3. ROC Curves

  4. Model ECDF

  5. Transition counts between old & new models

A note on definition of terms.

Figure 1 - Prediction Density Plots

Main Takeaway We seem to generate a “wider” distribution with the new charge types and descriptions. This makes sense as the optimal model has more trees & more categories than before.

Figure 2 - Demographic Distributions

Main Takeaway By and large the same trend as in Figure 1 holds across genders and races. Notable is the difference for females and white defendants in the description-model predictions.

Figure 3 - ROC Curves

Main Takeaway This is the AUC on the Re-Arrest-Outcome the model is trained on. We can see that giving the model the raw charge descriptions increases the AUC markedly, and the increase is statistically significant. There is also a 1-percentage-point increase in AUC for the new-charge-type model over the old-charge-type model.

Looking at the AUC on the Detain-Outcome in the final regression we see that the old-charge-type predictions reach 0.624 AUC, the new-charge-type predictions reach 0.658, and the raw-description-model predictions reach 0.696.
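As a reminder of what the AUC figures above measure, here is a minimal sketch of the rank-based definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `auc` and the toy labels and scores are illustrative assumptions, not the data or code behind the reported numbers.

```python
def auc(labels, scores):
    """AUC as P(score of random positive > score of random negative).

    Ties between a positive and a negative count as half a win.
    This is the O(n_pos * n_neg) textbook form, fine for small checks.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical data): 3 of the 4 positive/negative pairs
# are ordered correctly, so the AUC is 0.75.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(labels, scores)
print(toy_auc)
```

An AUC of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the 0.624, 0.658 and 0.696 figures should be read.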

Figure 4 - Empirical Cumulative Distribution Function

Main Takeaway The ECDF shows the same results as the histograms in Figure 1. The description-model is able to cover a wider support than either of the other models, with the new-charge-type model doing marginally better than the current version. This plot is intended to give a ‘bin-width-invariant’ version of Figure 1.
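To illustrate why the ECDF does not depend on a bin width, here is a small sketch: the ECDF at x is simply the share of predictions less than or equal to x, so no binning choice enters. The helper name `ecdf` and the two toy prediction lists are assumptions made for illustration only.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return F where F(x) is the share of the sample that is <= x."""
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must not be empty")

    def F(x):
        # bisect_right counts how many sorted values are <= x
        return bisect_right(xs, x) / n

    return F


# Hypothetical predictions: a narrow model and a more spread-out model.
narrow = [0.40, 0.45, 0.50, 0.55, 0.60]
wide = [0.10, 0.30, 0.50, 0.70, 0.90]

F_narrow = ecdf(narrow)
F_wide = ecdf(wide)

# The wider model already has mass at low risk scores; the narrow one does not.
print(F_narrow(0.3), F_wide(0.3))
```

A model that covers a wider support shows up as an ECDF that starts rising earlier and reaches 1 later, which is the pattern described for the description-model.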

Figure 5 - Transition Counts

Here I plot the number of people in each of the 4 risk quartiles under the old-charge-type model, and where they have transitioned to under the new-charge-type and description models. Ideally we would want to see the largest counts on the diagonal, as this would indicate our models are assigning most people approximately the same risk-probability.

Main Takeaway As expected (given the distribution above), the description-model gives a more diffuse prediction distribution and moves some people up/down a quartile in risk. The new-charge-type model is much more heavily concentrated on the diagonal, putting most people in the same prediction quartile as the old-charge-type model.
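The transition counts can be sketched as a 4x4 cross-tabulation of quartile membership. This sketch assumes each model's quartiles are computed from that model's own prediction distribution (the document does not say whether fixed cutoffs were used instead), and the function names and toy scores are hypothetical.

```python
def quartile_labels(values):
    """Assign each value a quartile 0..3 based on its rank in the sample.

    Ties are broken by position in the input, an arbitrary choice.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * 4 // n
    return labels


def transition_counts(old_scores, new_scores):
    """table[o][n] = people in old quartile o who land in new quartile n."""
    old_q = quartile_labels(old_scores)
    new_q = quartile_labels(new_scores)
    table = [[0] * 4 for _ in range(4)]
    for o, n in zip(old_q, new_q):
        table[o][n] += 1
    return table


def diagonal_share(table):
    """Share of people who stay in the same quartile under both models."""
    total = sum(sum(row) for row in table)
    return sum(table[k][k] for k in range(4)) / total


# Hypothetical scores for 8 people; only the third person's score changes,
# which swaps that person with a neighbour across the quartile boundary.
old = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
new = [0.1, 0.2, 0.55, 0.4, 0.5, 0.6, 0.7, 0.8]

table = transition_counts(old, new)
for row in table:
    print(row)
print(diagonal_share(table))
```

A diagonal share close to 1 corresponds to the pattern described for the new-charge-type model, while a more diffuse table corresponds to the description-model.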