This report presents an audit of the Random Forest credit risk classification model. The objective is to assess the model’s ability to distinguish between good and bad credit applicants and to identify the main variables influencing its predictions.
The Receiver Operating Characteristic (ROC) curve below illustrates the trade-off between sensitivity and specificity as the classification threshold varies.
Interpretation: The bow of the green line toward the top-left corner indicates that the model performs clearly better than random guessing, which would be represented by the diagonal reference line. A steep initial rise means the curve reaches high sensitivity while the false positive rate is still low, which suggests that the model ranks many high-risk applicants near the top of its score distribution.
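For reference, the construction of a ROC curve and its area under the curve (AUC) can be sketched as follows. This is a minimal illustration in plain Python, not the code used to produce the chart in this audit. The function names `roc_points` and `auc_trapezoid`, the toy labels and scores, and the convention that 1 marks a bad (high-risk) applicant are all assumptions made for the example.

```python
def roc_points(labels, scores):
    """Return ROC points as (false positive rate, true positive rate).

    labels: 1 = bad applicant (positive class, an assumption), 0 = good.
    scores: predicted risk; a higher score means more likely bad.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    # Sweep the threshold from the highest score downward.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Applicants with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data for illustration only, not taken from the audited model.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
toy_auc = auc_trapezoid(roc_points(toy_labels, toy_scores))
print(f"toy AUC = {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to the diagonal reference line, and values closer to 1 correspond to a curve that bows further toward the top-left corner.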
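This chart ranks the variables by the Mean Decrease Gini metric. For each feature, the metric sums the reduction in Gini node impurity produced by every split on that feature and averages the total across all trees in the forest. It therefore reflects how heavily a feature is used to create purer nodes during training, rather than directly measuring its contribution to predictive performance on new data.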
Key Finding: Status and Amount are the two most important predictors. This indicates that the model relies heavily on account status and the requested loan amount when assessing credit risk, which is consistent with standard lending practice.
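To make the Mean Decrease Gini metric concrete, the sketch below computes it for a hand-built toy forest. It is an illustration under stated assumptions, not the audited model's implementation: the helper names (`gini`, `gini_decrease`, `mean_decrease_gini`), the representation of a tree as a list of splits, and the toy labels are all hypothetical, and the exact scaling of the impurity decrease differs between random forest libraries.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a node: 1 minus the sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Size-weighted drop in Gini impurity produced by one split.

    Assumption: decreases are weighted by node size; libraries may scale
    this quantity differently.
    """
    return (len(parent) * gini(parent)
            - len(left) * gini(left)
            - len(right) * gini(right))


def mean_decrease_gini(forest):
    """Sum each feature's impurity decreases, then average over the trees.

    forest: list of trees; each tree is a list of
            (feature_name, parent_labels, left_labels, right_labels).
    """
    totals = {}
    for tree in forest:
        for feature, parent, left, right in tree:
            totals[feature] = totals.get(feature, 0.0) + gini_decrease(parent, left, right)
    return {feature: total / len(forest) for feature, total in totals.items()}


# Toy forest for illustration only (1 = bad applicant, 0 = good).
toy_forest = [
    [  # tree 1: split on Status, then on Amount within the left child
        ("Status", [1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 1, 0], [0, 0, 0, 0]),
        ("Amount", [1, 1, 1, 0], [1, 1, 1], [0]),
    ],
    [  # tree 2: a single split on Status
        ("Status", [1, 1, 0, 0], [1, 1], [0, 0]),
    ],
]
toy_importance = mean_decrease_gini(toy_forest)
for name, value in sorted(toy_importance.items(), key=lambda kv: -kv[1]):
    print(name, round(value, 3))
```

In this toy forest, Status ranks above Amount because it is split on in both trees and each of those splits removes more impurity. A feature that is never used for a split receives no entry at all.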
Overall, the model demonstrates useful discriminatory power and produces feature-importance patterns that are consistent with credit risk intuition. The ROC curve suggests that the model can separate good and bad applicants effectively, while the importance plot shows that Status and Amount are the dominant drivers of its decisions.