ROC class notes

Le Kang

2024-02-19

Tests of separation of two population distributions

The ROC curve provides a description of the separation between the distributions of the classification score \(S\) in the two populations.

The chance diagonal represents the uninformative ROC curve for which the probability of allocating an individual to P is the same whether that individual has come from P or N.

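library(pROC)
set.seed(1)                          # reproducible draws
y <- rep(0:1, each = 5000)           # 0 = population N, 1 = population P
# Equal means, but P has the larger spread: with direction fixed so that high
# scores indicate P, the curve is above the diagonal at low FPR and below it at high FPR
x <- c(rnorm(5000), rnorm(5000, sd = 2))
plot(roc(y, x, direction = "<"), asp = 1)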

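library(pROC)
set.seed(1)                          # reproducible draws
y <- rep(0:1, each = 5000)           # 0 = population N, 1 = population P
# Roles reversed: N now has the larger spread, so the curve is below the
# diagonal at low FPR and above it at high FPR
x <- c(rnorm(5000, sd = 2), rnorm(5000))
plot(roc(y, x, direction = "<"), asp = 1)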

Kolmogorov-Smirnov test

The maximum vertical distance (MVD) between the chance diagonal and the ROC curve measures how far the curve deviates from “randomness”: \[MVD=\max_t|y(t)-x(t)|=\max_t|P(S>t|P)-P(S>t|N)|\]

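Writing \(F\) and \(G\) for the distribution functions of \(S\) in N and P, so that \(x(t)=1-F(t)\) and \(y(t)=1-G(t)\), we can rewrite \[\begin{aligned} MVD &= \max_x|1-G[F^{-1}(1-x)]-x| \\ &= \max_x|1-x-G[F^{-1}(1-x)]|. \end{aligned}\] Since \(t=F^{-1}(1-x)\), we have \(1-x=F(t)\), and thus \[MVD=\max_t|F(t)-G(t)|=\sup_{t\in (-\infty,\infty)}|F(t)-G(t)|.\]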

\[\widehat{MVD}=\max_t|\hat{F}(t)-\hat{G}(t)|=\sup_{t\in (-\infty,\infty)}|\hat{F}(t)-\hat{G}(t)|\]

This is the well-known Kolmogorov-Smirnov statistic for nonparametrically testing the equality of the two population distribution functions \(F\) and \(G\).
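The identity between \(\widehat{MVD}\) and the Kolmogorov-Smirnov statistic can be checked numerically. The following is a minimal sketch in base R; the simulated normal scores and the sample sizes are illustrative assumptions, not part of the notes.

```r
# Simulated scores; the distributions and sample sizes are illustrative assumptions.
set.seed(1)
s_N <- rnorm(500)              # scores from population N
s_P <- rnorm(500, mean = 1)    # scores from population P

# Both empirical distribution functions are step functions that only change
# at observed scores, so the supremum is attained on the pooled sample.
t_grid <- sort(c(s_N, s_P))
F_hat <- ecdf(s_N)(t_grid)     # estimate of F(t) = P(S <= t | N)
G_hat <- ecdf(s_P)(t_grid)     # estimate of G(t) = P(S <= t | P)
mvd_hat <- max(abs(F_hat - G_hat))

# ks.test() reports the same supremum as its two-sample statistic D.
ks <- ks.test(s_N, s_P)
c(MVD = mvd_hat, D = unname(ks$statistic))
```

The p-value in `ks$p.value` then gives the nonparametric test of \(F=G\).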

The maximum vertical distance statistic MVD measures the distance between the ROC curve and the chance diagonal at their point of greatest separation, but pays no attention to their difference at other points.

Testing \(AUC=0.5\)

\(H_0: AUC=0.5\)

\(H_1: AUC>0.5\)

We could use the asymptotic \(z\) test.

More generally, to test \(H_0: AUC=\theta_0\) against \(H_1: AUC>\theta_0\), we reject \(H_0\) at level \(\alpha\) if \[\dfrac{\widehat{AUC}-\theta_0}{\sqrt{\widehat{var}(\widehat{AUC}|\theta_0)}}>z_{1-\alpha}\]
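A minimal sketch of this one-sided \(z\) test in base R follows. The notes do not say which variance estimate to plug in; this sketch assumes the Hanley and McNeil variance formula evaluated at \(\theta_0\), and the helper name `hm_var` and the simulated data are hypothetical.

```r
# Hanley-McNeil variance of the AUC estimate, evaluated at a given AUC value,
# with the approximations Q1 = AUC/(2 - AUC) and Q2 = 2*AUC^2/(1 + AUC).
# (hm_var is a hypothetical helper name used only for this sketch.)
hm_var <- function(auc, n_N, n_P) {
  q1 <- auc / (2 - auc)
  q2 <- 2 * auc^2 / (1 + auc)
  (auc * (1 - auc) + (n_P - 1) * (q1 - auc^2) +
     (n_N - 1) * (q2 - auc^2)) / (n_N * n_P)
}

# Simulated scores (illustrative assumption)
set.seed(1)
s_N <- rnorm(100)
s_P <- rnorm(100, mean = 0.5)

# Mann-Whitney estimate of AUC = P(S_P > S_N); ties have probability zero here
auc_hat <- mean(outer(s_P, s_N, ">"))

theta0 <- 0.5
alpha <- 0.05
z <- (auc_hat - theta0) / sqrt(hm_var(theta0, length(s_N), length(s_P)))
reject <- z > qnorm(1 - alpha)   # TRUE means H0 is rejected at level alpha
c(auc_hat = auc_hat, z = z)
```

At \(\theta_0=0.5\) the two approximations both equal \(1/3\), and the variance reduces to \((n_N+n_P+1)/(12\,n_N n_P)\), the usual null variance of the Mann-Whitney statistic.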

What about power?

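To achieve power \(1-\beta\) against the alternative \(AUC=\theta_1>\theta_0\), the sample sizes must satisfy \[z_{1-\alpha}\sqrt{\widehat{var}(\widehat{AUC}|\theta_0)}+z_{1-\beta}\sqrt{\widehat{var}(\widehat{AUC}|\theta_1)}=\theta_1-\theta_0.\]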

To determine sample sizes for the experiment, the researcher must first specify target values for the significance level \(\alpha\), the type II error rate \(\beta\), and the difference to be detected, \(\Delta=\theta_1-\theta_0\).
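As a sketch of where the power relation comes from, write \(V_0\) and \(V_1\) as shorthand (introduced here, not part of the original notation) for \(\widehat{var}(\widehat{AUC}|\theta_0)\) and \(\widehat{var}(\widehat{AUC}|\theta_1)\), and assume \(\widehat{AUC}\) is approximately normal with mean \(\theta_1\) and variance \(V_1\) under the alternative. Then \[\begin{aligned} \text{power} &= P\left(\widehat{AUC}>\theta_0+z_{1-\alpha}\sqrt{V_0}\mid AUC=\theta_1\right) \\ &= \Phi\left(\dfrac{\Delta-z_{1-\alpha}\sqrt{V_0}}{\sqrt{V_1}}\right), \end{aligned}\] and setting this equal to \(1-\beta\) gives \(\Delta=z_{1-\alpha}\sqrt{V_0}+z_{1-\beta}\sqrt{V_1}\).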

Sample size calculations

How large must the samples be to obtain an estimate of AUC that is sufficiently precise, or to test a hypothesis about AUC with at least a specified power?

\[var(\widehat{AUC})=\dfrac{1}{n_N n_P}\left\{AUC(1-AUC)+[n_P-1][Q_1-AUC^2]+[n_N-1][Q_2-AUC^2]\right\}\]

where \(Q_1\) is the probability that the classification scores of two randomly chosen individuals from P both exceed the score of a randomly chosen individual from N, and \(Q_2\) is the converse probability that the classification score of a randomly chosen individual from P exceeds both scores of two randomly chosen individuals from N.

Hanley and McNeil¹ suggested approximating \(Q_1\) by \(AUC/(2-AUC)\) and \(Q_2\) by \(2AUC^2/(1+AUC)\).
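As a worked illustration with hypothetical values \(AUC=0.8\) and \(n_N=n_P=50\), the approximations give \[Q_1=\dfrac{0.8}{1.2}\approx 0.6667,\qquad Q_2=\dfrac{2(0.8)^2}{1.8}\approx 0.7111,\] so that \[var(\widehat{AUC})\approx\dfrac{0.16+49(0.0267)+49(0.0711)}{2500}\approx 0.00198,\] which corresponds to a standard error of about \(0.0445\).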

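Given a target precision or power, \(n_N\) and \(n_P\) can then be found by trial and error, that is, by increasing the sample sizes until the requirement is met.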
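The trial-and-error search can be automated. The sketch below assumes equal group sizes \(n_N=n_P=n\), the Hanley and McNeil approximations for \(Q_1\) and \(Q_2\), and hypothetical targets \(\theta_0=0.5\), \(\theta_1=0.7\), \(\alpha=0.05\), \(\beta=0.20\); the helper names `hm_var` and `power_auc` are made up for this illustration.

```r
# Hanley-McNeil variance at a given AUC value (hypothetical helper name)
hm_var <- function(auc, n_N, n_P) {
  q1 <- auc / (2 - auc)
  q2 <- 2 * auc^2 / (1 + auc)
  (auc * (1 - auc) + (n_P - 1) * (q1 - auc^2) +
     (n_N - 1) * (q2 - auc^2)) / (n_N * n_P)
}

# Approximate power of the one-sided z test with n individuals per group
power_auc <- function(n, theta0, theta1, alpha) {
  se0 <- sqrt(hm_var(theta0, n, n))   # standard error under H0
  se1 <- sqrt(hm_var(theta1, n, n))   # standard error under H1
  pnorm((theta1 - theta0 - qnorm(1 - alpha) * se0) / se1)
}

# Hypothetical design targets
theta0 <- 0.5
theta1 <- 0.7
alpha <- 0.05
beta <- 0.20

# Increase n one step at a time until the target power is reached
n <- 2
while (power_auc(n, theta0, theta1, alpha) < 1 - beta) n <- n + 1
c(n_per_group = n, power = power_auc(n, theta0, theta1, alpha))
```

Unequal allocation could be handled the same way by fixing the ratio \(n_P/n_N\) and searching over \(n_N\) only.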

Measurement errors

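Suppose the scores are observed with additive measurement errors, \(S^{\prime}_{N_i}=S_{N_i}+\varepsilon_i\) and \(S^{\prime}_{P_j}=S_{P_j}+\eta_j\), where primes denote the observed scores.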

\(\widehat{AUC}^{\prime}\), the estimate computed from the observed scores, is still an unbiased estimator of \(AUC^{\prime}=P(S_P^{\prime}>S_N^{\prime})\).

But it is no longer unbiased for \(AUC\).
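A small simulation makes the bias visible. This sketch assumes normal scores with unit variance in both populations, a mean difference of 1, and independent normal errors with standard deviation 1; all of these values are illustrative. Under that model \(P(S_P>S_N)=\Phi(1/\sqrt{2})\) for the error-free scores and \(\Phi(1/\sqrt{4})\) for the observed ones.

```r
set.seed(1)
n <- 2000
s_N <- rnorm(n, mean = 0)       # error-free scores, population N
s_P <- rnorm(n, mean = 1)       # error-free scores, population P
sigma_E <- 1                    # assumed measurement-error standard deviation

# Observed scores = error-free scores + independent measurement error
s_N_obs <- s_N + rnorm(n, sd = sigma_E)
s_P_obs <- s_P + rnorm(n, sd = sigma_E)

# Mann-Whitney estimate of P(first sample > second sample)
auc_mw <- function(p, q) mean(outer(p, q, ">"))
auc_error_free <- auc_mw(s_P, s_N)
auc_noisy <- auc_mw(s_P_obs, s_N_obs)

c(error_free = auc_error_free,
  with_error = auc_noisy,
  AUC        = pnorm(1 / sqrt(2)),                   # population value, no error
  AUC_prime  = pnorm(1 / sqrt(2 * (1 + sigma_E^2)))) # population value, with error
```

The estimate from the noisy scores should track \(AUC^{\prime}\) rather than \(AUC\), that is, it is pulled toward 0.5.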

Bias correction

Assume the scores and the measurement errors are normally distributed, with \(\sigma_P^2=\sigma_N^2=\sigma^2\) and \(\sigma_{\varepsilon}^2=\sigma_{\eta}^2=\sigma_E^2\). Then

\[AUC^{\prime}=\Phi\left(\dfrac{\mu_P-\mu_N}{\sqrt{2\sigma^2(1+\theta^2)}}\right),\] where \(\theta=\sigma_E/\sigma\).
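Under the same normal, equal-variance assumptions, \(AUC=\Phi\left((\mu_P-\mu_N)/\sqrt{2\sigma^2}\right)\), so the relation can be inverted as \(AUC=\Phi\left(\Phi^{-1}(AUC^{\prime})\sqrt{1+\theta^2}\right)\). The sketch below applies this inversion; the function name `correct_auc` is hypothetical, and \(\theta\) is treated as known, whereas in practice it would have to be estimated, for example from replicate measurements.

```r
# Map an AUC computed from error-prone scores back to the error-free AUC,
# assuming normal scores and errors with equal variances and known
# theta = sigma_E / sigma. (correct_auc is a hypothetical helper name.)
correct_auc <- function(auc_obs, theta) {
  pnorm(qnorm(auc_obs) * sqrt(1 + theta^2))
}

# Consistency check with illustrative population values
delta <- 1        # mu_P - mu_N
sigma <- 1        # common score standard deviation
sigma_E <- 0.5    # common measurement-error standard deviation
theta <- sigma_E / sigma

auc_prime <- pnorm(delta / sqrt(2 * sigma^2 * (1 + theta^2)))  # AUC with error
auc <- pnorm(delta / sqrt(2 * sigma^2))                        # error-free AUC
all.equal(correct_auc(auc_prime, theta), auc)
```

Applying `correct_auc` to an estimate \(\widehat{AUC}^{\prime}\) gives a plug-in correction; it is a consequence of the model assumptions rather than a distribution-free result.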