We assume that \(\mu_P>\mu_N\), in accord with the convention that large values of \(S\) indicate population P and small values indicate population N.
The binormal model will be appropriate for any ROC curve pertaining to populations that can be transformed to normality by some monotone transformation.
ROC Estimation
Empirical counting process (jagged)
Parametric (binormal, bigamma)
Nonparametric
Semi-parametric
ROC curve in mathematical form
\[y=1-G[F^{-1}(1-x)], ~~0\leq x \leq 1,\]
where \(f\) and \(F\) denote the pdf and cdf of \(S\) in population N, and \(g\) and \(G\) denote the pdf and cdf of \(S\) in population P.
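As a rough illustration of the formula above, the sketch below evaluates \(y=1-G[F^{-1}(1-x)]\) for two normal populations and compares it with the standard binormal closed form \(y=\Phi(a+b\,\Phi^{-1}(x))\). The parameter values, the helper names `roc_from_cdfs` and `binormal_roc`, and the definitions \(a=(\mu_P-\mu_N)/\sigma_P\), \(b=\sigma_N/\sigma_P\) are assumptions made for the example.

```python
from statistics import NormalDist

def roc_from_cdfs(x, F_inv, G):
    """Return y = 1 - G(F^{-1}(1 - x)) for a false positive rate 0 < x < 1."""
    threshold = F_inv(1.0 - x)   # threshold t at which fp = 1 - F(t) equals x
    return 1.0 - G(threshold)    # tp = 1 - G(t) at that threshold

def binormal_roc(x, a, b):
    """Binormal closed form y = Phi(a + b * Phi^{-1}(x))."""
    std = NormalDist()
    return std.cdf(a + b * std.inv_cdf(x))

# Illustrative (assumed) populations: N ~ Normal(0, 1), P ~ Normal(1.5, 1.2)
pop_N = NormalDist(mu=0.0, sigma=1.0)
pop_P = NormalDist(mu=1.5, sigma=1.2)

# Assumed binormal parameters: a = (mu_P - mu_N) / sigma_P, b = sigma_N / sigma_P
a = (pop_P.mean - pop_N.mean) / pop_P.stdev
b = pop_N.stdev / pop_P.stdev

grid = [0.05, 0.1, 0.25, 0.5, 0.75, 0.9]
roc_points = [(x, roc_from_cdfs(x, pop_N.inv_cdf, pop_P.cdf)) for x in grid]
```

With these inputs the two routes should agree up to floating-point error, which is one way to check that the general formula reduces to the binormal curve when both populations are normal.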
Empirical: \[y=1-\hat{G}[\hat{F}^{-1}(1-x)], ~~0\leq x \leq 1,\]
The empirical CDFs \(\hat{F}\) and \(\hat{G}\) are step functions, so the empirical ROC curve is built from a finite set of points.
Although technically all possible values of \(t\) need to be considered, in practice \(\hat{fp}\) will only change when \(t\) crosses the score values of the \(n_N\) individuals from N, and \(\hat{tp}\) will only change when \(t\) crosses the score values of the \(n_P\) individuals from P, so there will be at most \(n_N+n_P+1\) discrete points on the plot.
The line segments connecting consecutive points are horizontal or vertical if just one of \((\hat{fp}, \hat{tp})\) changes at that value of \(t\), and they are sloped if both estimates change, which happens when a score value is tied across the two samples.
Empirical ROC curves depend only on the ranks of the combined set of test scores.
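The threshold sweep described above can be sketched as follows. This is a minimal illustration, assuming an individual is classified as P when its score is at or above the cut-off; the function name `empirical_roc` and the scores are made up for the example.

```python
import math

def empirical_roc(scores_N, scores_P):
    """Return the empirical ROC points (fp, tp), from (0, 0) to (1, 1)."""
    n_N, n_P = len(scores_N), len(scores_P)
    # fp and tp can only change when the threshold crosses an observed score
    thresholds = sorted(set(scores_N) | set(scores_P), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every score: nothing classified as P
    for t in thresholds:
        fp = sum(s >= t for s in scores_N) / n_N  # fraction of N called positive
        tp = sum(s >= t for s in scores_P) / n_P  # fraction of P called positive
        points.append((fp, tp))
    return points

# Made-up scores; the value 0.4 is tied across the two samples
scores_N = [0.1, 0.4, 0.35, 0.8]
scores_P = [0.9, 0.7, 0.4, 0.6]
points = empirical_roc(scores_N, scores_P)

# A strictly increasing transformation preserves ranks, so the points are unchanged
points_exp = empirical_roc([math.exp(s) for s in scores_N],
                           [math.exp(s) for s in scores_P])
```

The tied value 0.4 makes both estimates change at the same threshold, which produces the sloped segment mentioned above.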
Sometimes the irregular appearance of the empirical ROC curve is not deemed adequate as an estimate of the underlying “true” smooth curve.
Ex: The binormal model - estimating \(a\) and \(b\)
The Dorfman and Alf method
With ordered categorical data, Dorfman and Alf\(^{1}\) proposed a maximum likelihood method.
Assume the score \(S\) can take only one of a finite set of ranked categories \(C_1, C_2, \ldots, C_k\). Suppose there is a latent random variable \(W\) and a set of unknown thresholds \(-\infty=w_0<w_1<w_2<\cdots<w_k=\infty\) such that \(S\) falls in category \(C_i\) if and only if \(w_{i-1}<W\leq w_i\). We can then define \(p_{i|N}=P(w_{i-1}<W\leq w_i \mid N)\) and \(p_{i|P}=P(w_{i-1}<W\leq w_i \mid P)\), the probabilities that an individual from population N or P falls in category \(C_i\).
The log-likelihood function is \[\mathcal{L}=\sum_{i=1}^k(n_{i|N}\log p_{i|N}+n_{i|P}\log p_{i|P}),\] where \(n_{i|N}\) and \(n_{i|P}\) are the observed numbers of individuals from populations N and P, respectively, falling in category \(C_i\).
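A minimal sketch of evaluating this log-likelihood is given below. It assumes a binormal latent variable, parametrized so that \(W\sim N(0,1)\) in population N and \(P(W\leq w \mid P)=\Phi(bw-a)\); this parametrization, the function names, the counts, and the thresholds are all assumptions for illustration. The maximization over \(a\), \(b\), and the thresholds would need a numerical optimizer and is not shown.

```python
from math import log
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF

def category_probs(thresholds, a, b):
    """Return (p_N, p_P), the category probabilities for C_1, ..., C_k.

    thresholds holds the interior cut-points w_1 < ... < w_{k-1};
    w_0 = -inf and w_k = +inf are handled by fixing the CDF at 0 and 1.
    """
    cdf_N = [0.0] + [PHI(w) for w in thresholds] + [1.0]
    cdf_P = [0.0] + [PHI(b * w - a) for w in thresholds] + [1.0]
    p_N = [hi - lo for lo, hi in zip(cdf_N, cdf_N[1:])]
    p_P = [hi - lo for lo, hi in zip(cdf_P, cdf_P[1:])]
    return p_N, p_P

def log_likelihood(counts_N, counts_P, thresholds, a, b):
    """Sum of n_{i|N} log p_{i|N} + n_{i|P} log p_{i|P} over the k categories."""
    p_N, p_P = category_probs(thresholds, a, b)
    ll_N = sum(n * log(p) for n, p in zip(counts_N, p_N) if n > 0)
    ll_P = sum(n * log(p) for n, p in zip(counts_P, p_P) if n > 0)
    return ll_N + ll_P

# Made-up counts for k = 5 rating categories, and made-up cut-points
counts_N = [30, 19, 8, 2, 1]
counts_P = [3, 7, 12, 18, 20]
thresholds = [-0.5, 0.3, 1.0, 1.8]

ll_good = log_likelihood(counts_N, counts_P, thresholds, a=1.2, b=1.0)
ll_bad = log_likelihood(counts_N, counts_P, thresholds, a=-1.2, b=1.0)
```

Because the made-up P counts concentrate in the higher categories, a positive value of \(a\) should give a larger log-likelihood than a negative one.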
The Metz method
With continuous data, Metz et al.\(^{1}\) used truth-state runs in the rank-ordered data as a natural categorization of continuously distributed test results, so that maximum likelihood (ML) estimation of the ROC curve can be applied to the resulting categories.
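To make the idea of truth-state runs concrete, the sketch below rank-orders the pooled scores and groups consecutive cases that share the same true state. It only identifies the runs; how the runs are then turned into category boundaries follows Metz et al. and is not reproduced here. The function name `truth_state_runs` and the data are assumptions, and tied scores across the two samples get no special treatment.

```python
from itertools import groupby

def truth_state_runs(scores_N, scores_P):
    """Return a list of (state, scores) runs from the rank-ordered pooled data."""
    labelled = [(s, "N") for s in scores_N] + [(s, "P") for s in scores_P]
    labelled.sort()  # rank-order by score (ties fall back to the label order)
    runs = []
    # groupby collects consecutive cases that share the same truth state
    for state, group in groupby(labelled, key=lambda pair: pair[1]):
        runs.append((state, [score for score, _ in group]))
    return runs

# Made-up continuous scores without ties
scores_N = [0.1, 0.35, 0.45, 0.8]
scores_P = [0.4, 0.6, 0.7, 0.9]
runs = truth_state_runs(scores_N, scores_P)
```

For these scores the pooled ordering alternates between N and P six times, so eight continuous observations reduce to six runs.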
Here \(k(\cdot)\) is the kernel function and \(h_N, h_P\) are the bandwidths used for populations N and P respectively.
Choosing between the many available kernel functions is relatively unimportant as all give comparable results, but more care needs to be taken over the selection of bandwidth.
Since \(F\) and \(G\) are estimated separately, the final ROC curve estimator is not invariant under a monotone transformation of the data.
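The sketch below illustrates a kernel-smoothed ROC estimate and its lack of invariance. It assumes a Gaussian kernel, so each smoothed CDF is an average of normal CDFs centred at the observed scores, and it uses the normal-reference rule \(h=1.06\,\hat{\sigma}\,n^{-1/5}\) as a simple bandwidth choice. Both choices, the function names, and the data are assumptions rather than the method prescribed here.

```python
from math import exp
from statistics import NormalDist, stdev

PHI = NormalDist().cdf  # integral of the (assumed) Gaussian kernel

def normal_reference_bandwidth(scores):
    """Rule-of-thumb bandwidth h = 1.06 * sd * n^(-1/5) (an assumed choice)."""
    return 1.06 * stdev(scores) * len(scores) ** (-0.2)

def smoothed_cdf(t, scores, h):
    """Gaussian-kernel CDF estimate: the average of Phi((t - s_i) / h)."""
    return sum(PHI((t - s) / h) for s in scores) / len(scores)

def smoothed_roc(scores_N, scores_P, thresholds):
    """Return (fp, tp) pairs, with F and G smoothed separately."""
    h_N = normal_reference_bandwidth(scores_N)
    h_P = normal_reference_bandwidth(scores_P)
    return [(1.0 - smoothed_cdf(t, scores_N, h_N),
             1.0 - smoothed_cdf(t, scores_P, h_P)) for t in thresholds]

# Made-up scores and an arbitrary grid of thresholds
scores_N = [0.1, 0.35, 0.45, 0.8]
scores_P = [0.4, 0.6, 0.7, 0.9]
curve = smoothed_roc(scores_N, scores_P, [-1.0, 0.0, 0.25, 0.5, 0.75, 1.0, 2.0])

# Same threshold before and after the monotone map s -> exp(s):
# the smoothed fp generally differs, unlike the empirical estimate.
t = 0.5
fp_raw = smoothed_roc(scores_N, scores_P, [t])[0][0]
fp_exp = smoothed_roc([exp(s) for s in scores_N],
                      [exp(s) for s in scores_P], [exp(t)])[0][0]
```

Comparing `fp_raw` with `fp_exp` shows the point made above: the bandwidths are chosen on whatever scale the scores are recorded in, so a monotone transformation changes the smoothed curve.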
Spline smoothing is also popular in density estimation.