ROC class notes

Le Kang

2024-04-01

ROC curves for more than two classes

  • Standard ROC curves represent the trade-off, in the two-class case, between the proportion of class P points correctly classified and the proportion of class N points misclassified, as the classification threshold varies.

  • The two-class classification situation is the most common, since two classes fit many naturally occurring problems: yes/no, right/wrong, sick/healthy, etc.

  • It is not uncommon to have two well-defined classes, along with a third class of in-between objects whose membership is uncertain.

    • Credit scoring in the retail financial services sector: bad, good, indeterminate
    • Tumor triage: benign, malignant, intermediate
  • When more than two classes are involved, we need to display the relationships among more than two variables simultaneously.

The conventional ROC curve

  • True positive rate on the vertical axis and false positive rate on the horizontal axis.

  • Alternatively: true positive rate on the vertical axis and true negative rate (1 − false positive rate) on the horizontal axis. This form permits immediate generalization to multiple classes.
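The two displays above can be compared with a small simulation. This is only a sketch: the scores are simulated from two normal distributions chosen for illustration (they are not from these notes), and a point is classified as P when its score exceeds the threshold.

```r
# Simulated scores (hypothetical data): class N ~ N(0, 1), class P ~ N(1, 1).
set.seed(1)
score_N <- rnorm(200, mean = 0)
score_P <- rnorm(200, mean = 1)

# Classify a point as P when its score exceeds the threshold.
thresholds <- seq(-4, 5, by = 0.05)
tpr <- sapply(thresholds, function(t) mean(score_P > t))  # true positive rate
fpr <- sapply(thresholds, function(t) mean(score_N > t))  # false positive rate
tnr <- 1 - fpr                                            # true negative rate

# Conventional display: TPR against FPR.
plot(fpr, tpr, type = "l",
     xlab = "False positive rate", ylab = "True positive rate")

# Alternative display: TPR against TNR (the same curve mirrored left to right).
plot(tnr, tpr, type = "l",
     xlab = "True negative rate", ylab = "True positive rate")
```

In the second plot both axes are "proportion of a class correctly classified", which is the form that extends to more than two classes.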

The ROC hypersurface

  • We can define a ROC hypersurface for \(c\) classes in terms of \(c\) axes, with the \(i^{th}\) axis giving the proportion of the \(i^{th}\) class correctly classified into the \(i^{th}\) class.

  • In the two-class case, the true negative rate is the proportion of class N correctly classified.

  • A point on the surface simply tells us that there is a configuration of the multiclass classifier which will yield the indicated correct classification rates for each of the classes.

  • Note that with \(c\) classes, there are in fact \(c(c − 1)\) different possible types of misclassification.

  • In some situations one may be uneasy about analyzing the multiclass case purely in terms of correct/incorrect classifications, and may also want to consider the nature of the incorrect classifications. For example, a mistaken diagnosis can be of greater or lesser severity, according to the disease to which the case is incorrectly assigned, and the treatment, or lack of treatment, which follows.
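As a small illustration of the points above, the sketch below takes a made-up 3-class confusion matrix (the counts are hypothetical, chosen only for the example) and computes the per-class correct classification rates, which are the coordinates of one point on the ROC hypersurface, together with the number of misclassification types.

```r
# Hypothetical 3-class confusion matrix: rows = true class, columns = predicted class.
conf <- matrix(c(50,  8,  2,
                 10, 30, 10,
                  3,  7, 40),
               nrow = 3, byrow = TRUE,
               dimnames = list(true = c("C1", "C2", "C3"),
                               predicted = c("C1", "C2", "C3")))

# Coordinate i of the hypersurface point: proportion of class i classified as class i.
correct_rates <- diag(conf) / rowSums(conf)

# Number of distinct misclassification types = number of off-diagonal cells = c(c - 1).
n_classes <- nrow(conf)
n_error_types <- sum(row(conf) != col(conf))

# Row-wise misclassification rates, for examining the nature of the errors.
error_rates <- conf / rowSums(conf)
diag(error_rates) <- NA

print(correct_rates)
print(n_error_types)
print(error_rates)
```

The hypersurface uses only `correct_rates`; the off-diagonal entries of `error_rates` are what one would need when different kinds of error carry different costs.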

The scenario when c = 3

  • We have to display just three axes, which can be done with contour plots or with projections of the 3-dimensional surface into 2-dimensional space.

  • Alzheimer’s disease (AD): The Clinical Dementia Rating (CDR) was developed at the Washington University School of Medicine and is considered the gold standard for the detection of dementia.

  • Approximately 2 weeks after the clinical evaluation, participants also complete several neuropsychological tests in order to confirm a clinical diagnosis of possible AD.

CDR   FACTOR1   ktemp     kfront
0      1.5346    7.8802    2.3447
0      0.1927    0.8972    2.0296
1     -4.5326   -7.1151    0.3407
0.5   -1.2856   -0.7636   -0.1484
1     -2.1839   -3.4536    0.2379
0.5   -1.4965    1.5216   -0.4493
0      0.3369    3.2081    2.6884
1     -1.9130   -0.6499   -1.3039
\(\vdots\) \(\vdots\) \(\vdots\) \(\vdots\)

Let \(Y_{1}\), \(Y_{2}\) and \(Y_{3}\) denote the observational random variables resulting from a single diagnostic test for non-diseased, intermediate and diseased subjects, respectively, and let \(F_{1}\), \(F_{2}\) and \(F_{3}\) be the cumulative distribution functions conditional on the corresponding category.


Assume the results of a diagnostic test are measured on a continuous scale and higher values indicate greater severity of the disease.

Given a pair of threshold values \(c_{1}\) and \(c_{3}\) \((c_{1}<c_{3})\), let \(\delta_{1}=P(Y_1<c_1)=F_{1}(c_{1})\) and \(\delta_{3}=P(Y_3>c_3)=1-F_{3}(c_{3})\) be the true classification rates for the non-diseased and diseased categories, respectively.

The probability that a randomly selected subject from intermediate category has a score between \(c_{1}\) and \(c_{3}\) is \[ \delta_{2}=F_{2}(c_{3})-F_{2}(c_{1})=F_{2}\left[ F_{3}^{-1}(1-\delta _{3})\right] -F_{2}\left[ F_{1}^{-1}(\delta_{1})\right] . \]

The triplet \((\delta_{1}, \delta_{2}, \delta_{3})\), where \(\delta_{2}=\delta_{2}(\delta_{1}, \delta_{3})\) is a function of \((\delta_{1}, \delta_{3})\), traces out an ROC surface in three-dimensional space as \((c_{1}, c_{3})\) ranges over all pairs in \(\mathbb{R}^{2}\) with \(c_{1}<c_{3}\).
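For a concrete pair of thresholds, the three true classification rates can be estimated by simple proportions. The sketch below uses simulated test results; the normal means (-0.5, 0, 0.2) are borrowed from the plotting code in these notes, while the sample sizes, the thresholds, and the helper name `tcr` are assumptions made for the example.

```r
# Simulated test results (hypothetical): higher values indicate greater severity.
set.seed(2)
y1 <- rnorm(300, mean = -0.5)  # non-diseased
y2 <- rnorm(300, mean =  0.0)  # intermediate
y3 <- rnorm(300, mean =  0.2)  # diseased

# Empirical true classification rates for a pair of thresholds c1 < c3.
tcr <- function(c1, c3) {
  stopifnot(c1 < c3)
  c(delta1 = mean(y1 < c1),              # non-diseased scoring below c1
    delta2 = mean(y2 >= c1 & y2 <= c3),  # intermediate scoring between c1 and c3
    delta3 = mean(y3 > c3))              # diseased scoring above c3
}

print(tcr(c1 = -0.3, c3 = 0.3))
```

Each call to `tcr` gives one point on the empirical ROC surface; sweeping \((c_{1}, c_{3})\) over all pairs with \(c_{1}<c_{3}\) traces out the whole surface.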

\[ \mathrm{VUS} =\int_{0}^{1}\int_{0}^{1-F_{3}\left[ F_{1}^{-1}(\delta_{1})\right] } \left\{ F_{2}\left[ F_{3}^{-1}(1-\delta_{3})\right] -F_{2}\left[ F_{1}^{-1}(\delta_{1})\right] \right\} \, d\delta_{3}\, d\delta_{1} \]

The volume under the ROC surface (VUS) has been proposed as a summary of the overall diagnostic accuracy of a diagnostic test with three ordinal diagnostic categories.
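As a rough numerical check of the VUS integral, the sketch below evaluates it with nested calls to `integrate()` under an assumed normal model, \(Y_{1}\sim N(-0.5,1)\), \(Y_{2}\sim N(0,1)\), \(Y_{3}\sim N(0.2,1)\) (the means used in the plotting code in these notes). Substituting \(\delta_{1}=F_{1}(c_{1})\) and \(\delta_{3}=1-F_{3}(c_{3})\) shows that, for independent results, the integral equals \(P(Y_{1}<Y_{2}<Y_{3})\), so a Monte Carlo estimate of that probability is included for comparison. The helper name `inner` and the simulation size are choices made for the example.

```r
# Assumed normal model: Y1 ~ N(m1, 1), Y2 ~ N(m2, 1), Y3 ~ N(m3, 1).
m1 <- -0.5; m2 <- 0; m3 <- 0.2

# Inner integral over delta3 for a fixed delta1.
inner <- function(d1) {
  c1 <- qnorm(d1, mean = m1)          # threshold c1 = F1^{-1}(delta1)
  upper <- 1 - pnorm(c1, mean = m3)   # upper limit 1 - F3(c1), i.e. c3 > c1
  integrand <- function(d3) {
    pnorm(qnorm(1 - d3, mean = m3), mean = m2) - pnorm(c1, mean = m2)
  }
  integrate(integrand, lower = 0, upper = upper, rel.tol = 1e-8)$value
}

# Outer integral over delta1 (integrate() needs a vectorized function).
vus <- integrate(Vectorize(inner), lower = 0, upper = 1)$value

# Monte Carlo check: proportion of simulated triples in the correct order.
set.seed(3)
n <- 2e5
y1 <- rnorm(n, mean = m1)
y2 <- rnorm(n, mean = m2)
y3 <- rnorm(n, mean = m3)
vus_mc <- mean(y1 < y2 & y2 < y3)

print(c(integral = vus, monte_carlo = vus_mc))
```

For reference, a test whose results have the same distribution in all three categories orders a random triple correctly with probability 1/6, so values of VUS above 1/6 indicate some discriminating ability.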

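# Grids of true classification rates for the non-diseased (delta1) and diseased (delta3) categories
delta1 <- seq(0, 1, by = 0.01)
delta3 <- seq(0, 1, by = 0.01)

# delta2 as a function of (delta1, delta3), with Y1 ~ N(-0.5, 1), Y2 ~ N(0, 1), Y3 ~ N(0.2, 1)
f <- function(delta1, delta3) {
  r <- pnorm(qnorm(1 - delta3, mean = 0.2), mean = 0) -
    pnorm(qnorm(delta1, mean = -0.5), mean = 0)
  pmax(r, 0)  # the surface is defined only for c1 < c3; delta2 is set to 0 elsewhere
}

z <- outer(delta1, delta3, f)
op <- par(bg = "white")

# First view of the ROC surface
persp(delta1, delta3, z, theta = 30, phi = 30, expand = 0.5,
      col = "blue", xlab = "delta_1", ylab = "delta_3", zlab = "delta_2")

# Second view: the same surface seen almost side-on
persp(delta1, delta3, z, theta = -60, phi = -3, expand = 0.5,
      col = "blue", xlab = "delta_1", ylab = "delta_3", zlab = "delta_2")

par(op)  # restore the previous graphics settings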
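The same surface can also be shown as a contour plot, one of the display options mentioned for the case \(c = 3\). The sketch below recomputes the surface under the same assumed normal model so that it runs on its own; the function name `surface` and the contour levels are choices made for the example.

```r
# Grid of (delta1, delta3) values.
delta1 <- seq(0, 1, by = 0.01)
delta3 <- seq(0, 1, by = 0.01)

# delta2 under the assumed model Y1 ~ N(-0.5, 1), Y2 ~ N(0, 1), Y3 ~ N(0.2, 1);
# values outside the region c1 < c3 are set to 0.
surface <- function(d1, d3) {
  pmax(pnorm(qnorm(1 - d3, mean = 0.2), mean = 0) -
         pnorm(qnorm(d1, mean = -0.5), mean = 0), 0)
}
z <- outer(delta1, delta3, surface)

# Contour lines of constant delta2 in the (delta1, delta3) plane.
contour(delta1, delta3, z, levels = seq(0.1, 0.9, by = 0.1),
        xlab = "delta_1", ylab = "delta_3")
```

Each contour line shows the combinations of \(\delta_{1}\) and \(\delta_{3}\) that are attainable for a fixed \(\delta_{2}\), which makes the trade-off among the three rates easier to read than in a perspective plot.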