Cost Curves as an ROC Alternative

Discussion Week 6 DATA621

Bonnie Cooper

Sensitivity & Specificity are measures of a model’s accuracy. However, accuracy only describes how often the model’s predictions are correct. We may be interested in more nuanced measures, such as metrics that quantify the consequences of correct and incorrect predictions. This discussion post presents Cost Curves for a two-way classifier. Cost curves plot Normalized Expected Cost ~ Probability Cost Function. This may sound like further abstraction, but the goal of this post is to demonstrate that cost curves are arguably more intuitive visualizations than ROC Curves.

Some Definitions
Probability Cost Function (PCF)
\[PCF = \frac{P \cdot C(-|+)}{P\cdot C(-|+)+(1-P)\cdot C(+|-)}\] in words: the probability cost function equals the prior probability of the positive class times the cost of a false negative, divided by the sum of the prior probability times the cost of a false negative and one minus the prior probability times the cost of a false positive. Here \(C(-|+)\) denotes the cost of predicting negative when the true class is positive (a false negative), and \(C(+|-)\) the cost of a false positive.
Basically, the PCF is the proportion of the total misclassification cost that is attributable to the positive class (i.e., to false negatives).

Normalized Expected Cost (NEC)
\[NEC = PCF\cdot (1-TP)+(1-PCF)\cdot FP\] in words: the normalized expected cost equals the probability cost function times the false negative rate (\(1-TP\)) plus one minus the probability cost function times the false positive rate.
Basically, the NEC takes the prevalence of an event, the model’s performance, and the misclassification costs into consideration, and scales the resulting cost between 0 and 1.
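To make the two formulas concrete, here is a minimal sketch in R; the function and argument names (pcf, nec, prior, cost_fn, cost_fp, fnr, fpr) are my own, not from any library.

```r
# Proportion of the maximum possible misclassification cost attributable
# to the positive class: P * C(-|+), normalized by the total.
pcf <- function(prior, cost_fn, cost_fp) {
  (prior * cost_fn) / (prior * cost_fn + (1 - prior) * cost_fp)
}

# Normalized expected cost at a given operating point: a weighted sum
# of the false negative rate and the false positive rate.
nec <- function(pcf, fnr, fpr) {
  pcf * fnr + (1 - pcf) * fpr
}

# Example: positives make up 20% of cases and a false negative costs
# five times as much as a false positive.
p <- pcf(prior = 0.2, cost_fn = 5, cost_fp = 1)  # ~0.556
nec(p, fnr = 0.10, fpr = 0.25)                   # ~0.167
```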

The figures below visualize the ROC and Cost Curves for 3 datasets from the ROCR library.

A single instance. Dataset: ROCR.simple
This is a ‘simple’ set of simulated prediction and class label data. Interpretations:

  • ROC Curve: an ideal classifier would produce no false positive (or false negative) classifications, so its ROC curve would follow a step function: straight up from (0, 0) to (0, 1.0), then across to (1.0, 1.0). Basically, it would hug the upper left corner of the plot. A classifier performing at chance, by contrast, would follow the diagonal from the origin to (1.0, 1.0). From the profile of the data’s ROC we can see that the classifier performs better than chance but not as well as an ideal classifier. The Area Under the ROC Curve (AUC) quantifies the accuracy of the model’s classification; AUC values closer to 1 approach the behavior of an ideal classifier.
  • Cost Curve: the cost curve represents a different aspect of the data than the ROC. It visualizes the performance of a classifier over the full range of class distributions and misclassification costs. The cost curve is built by plotting numerous lines, one per classification threshold, each giving that threshold’s expected cost across all possible priors and misclassification costs; the lower envelope of all these cost lines is taken as the cost curve (for details see Drummond & Holte 2005). With Cost Curves, the area under the curve represents the expected cost of the classifier. A code sketch for generating both figures follows this list.
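The sketch below shows one way such figures can be produced with the ROCR package; "ecost" is ROCR's expected-cost measure, which plots the lower-envelope cost curve described above.

```r
library(ROCR)
data(ROCR.simple)

pred <- prediction(ROCR.simple$predictions, ROCR.simple$labels)

# ROC curve: true positive rate vs. false positive rate,
# with the chance diagonal for reference
plot(performance(pred, "tpr", "fpr"), main = "ROC Curve")
abline(0, 1, lty = 2)

# AUC as a single-number summary of classification accuracy
performance(pred, "auc")@y.values[[1]]

# Cost curve: normalized expected cost vs. probability cost function
plot(performance(pred, "ecost"), main = "Cost Curve")
```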

Multiple instances. Dataset: ROCR.xval
This dataset is a simulated set of predictions & labels as might be returned from a 10-fold cross-validation. Visualizing the ROC and Cost Curves for the 10 folds gives us an idea of the variance of the model’s predictive accuracy and cost, respectively.
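A sketch of how these per-fold curves can be drawn: ROCR's prediction() accepts lists of per-fold predictions and labels directly, so each fold contributes its own curve.

```r
library(ROCR)
data(ROCR.xval)

pred <- prediction(ROCR.xval$predictions, ROCR.xval$labels)

# One ROC curve per fold, plus a vertically averaged curve on top
plot(performance(pred, "tpr", "fpr"), col = "grey", main = "ROC, 10 folds")
plot(performance(pred, "tpr", "fpr"), avg = "vertical", add = TRUE, lwd = 2)

# One cost curve (lower envelope) per fold
plot(performance(pred, "ecost"), main = "Cost Curves, 10 folds")
```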

Comparing Model Performance. Dataset: ROCR.hiv
This is an interesting dataset. Two different classifiers, a linear support vector machine (in blue) and a neural net classifier (in red), were applied to an HIV dataset to predict coreceptor usage from sequence data. The set includes 10 cross-validation runs for each model.
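The comparison figures can be sketched as follows; ROCR.hiv stores the 10 cross-validation runs for each classifier under hiv.svm and hiv.nn.

```r
library(ROCR)
data(ROCR.hiv)

pred.svm <- prediction(ROCR.hiv$hiv.svm$predictions, ROCR.hiv$hiv.svm$labels)
pred.nn  <- prediction(ROCR.hiv$hiv.nn$predictions,  ROCR.hiv$hiv.nn$labels)

# ROC curves: SVM in blue, neural net in red
plot(performance(pred.svm, "tpr", "fpr"), col = "blue", main = "SVM vs. NN")
plot(performance(pred.nn,  "tpr", "fpr"), col = "red", add = TRUE)

# Cost curves for the same two classifiers
plot(performance(pred.svm, "ecost"), col = "blue", main = "SVM vs. NN, cost")
plot(performance(pred.nn,  "ecost"), col = "red", add = TRUE)
```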

The visualizations above facilitate model comparison. Let’s interpret the figures:

  • ROC curve: the curves for the Neural Net (red) have a higher area under the curve and therefore higher accuracy than the support vector machine (blue); the per-fold AUCs are computed in the sketch after this list
  • Cost Curves: the curves for the Neural Net (red) have a lower area under the curve than the support vector machine (blue) and therefore a lower expected cost.
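The AUC comparison in the first bullet can be checked numerically; this sketch reuses pred.svm and pred.nn from the previous chunk and averages the per-fold AUC values.

```r
auc.svm <- unlist(performance(pred.svm, "auc")@y.values)
auc.nn  <- unlist(performance(pred.nn,  "auc")@y.values)
mean(auc.svm)  # mean AUC across the 10 SVM folds
mean(auc.nn)   # mean AUC across the 10 neural net folds
```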

Summary:
ROC curves are conventionally used to assess the accuracy of a two-way classifier. However, accuracy is not the only factor that contributes to the performance of a model. Relying on accuracy alone ignores the fact that in many systems there are costs for misclassifications. Cost Curves are a powerful visualization that builds the costs and benefits of decisions into the performance assessment. Furthermore, cost curves are normalized such that the prevalence of each class is accounted for.