Diagnostic test

Basic concept

Diagnostic test 2x2 contingency table

  • A. True positives: those sick individuals with a positive test result.
  • B. False positives: those healthy individuals with a positive test result.
  • C. False negatives: those sick individuals with a negative test result.
  • D. True negatives: those healthy individuals with a negative test result.
  • Sensitivity is the probability that a sick individual has a positive test result.
  • Specificity is the probability that a healthy individual has a negative test result.
  • The positive predictive value is the probability that an individual with a positive result is actually sick.
  • The negative predictive value is the probability that an individual with a negative result is actually healthy. (Predictive values are conditioned by the a priori probability of the condition under study. When the a priori probability is low, the negative predictive value will be high and the positive predictive value will be low.)
  • The positive likelihood ratio is the probability of a positive result in sick individuals divided by the probability of a positive result in healthy individuals, i.e. sensitivity / (1 - specificity).
  • The negative likelihood ratio is the probability of a negative result in sick individuals divided by the probability of a negative result in healthy individuals, i.e. (1 - sensitivity) / specificity. (These measures are computed by hand in the short R sketch after this list.)
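
A minimal R sketch computing these measures from the four cell counts of the 2x2 table; the counts below are made up purely for illustration:

A <- 90; B <- 30; C <- 10; D <- 170        # hypothetical TP, FP, FN, TN counts
sensitivity <- A / (A + C)                 # P(test + | sick)
specificity <- D / (B + D)                 # P(test - | healthy)
ppv <- A / (A + B)                         # P(sick | test +)
npv <- D / (C + D)                         # P(healthy | test -)
lr_pos <- sensitivity / (1 - specificity)  # positive likelihood ratio
lr_neg <- (1 - sensitivity) / specificity  # negative likelihood ratio
c(sensitivity = sensitivity, specificity = specificity, PPV = ppv, NPV = npv,
  LR_pos = lr_pos, LR_neg = lr_neg)
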
Posterior probability
  • Probability describes the chance of an event occurring, while likelihood refers to the strength of evidence for a hypothesis given observed data.
  • The odds of disease are the ratio of the probability of being sick to the probability of not being sick.
  • The posterior probability of hypothesis H given data D is proportional to the likelihood of observing the data D given hypothesis H times the prior probability of hypothesis H (Bayes' theorem).
  • Post-test odds = Pre-test odds * Likelihood ratio (a worked example follows this list).
  • Post-test probability is the updated probability of having a disease after a test result, while predictive values are a measure of the test’s accuracy in correctly identifying the presence or absence of the disease within a population.
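
A small worked example of updating a pre-test probability with a likelihood ratio; the pre-test probability of 0.10 and LR+ of 9 are hypothetical numbers chosen for illustration:

pretest_prob  <- 0.10
lr_pos        <- 9
pretest_odds  <- pretest_prob / (1 - pretest_prob)    # 0.111
posttest_odds <- pretest_odds * lr_pos                # post-test odds = pre-test odds * LR
posttest_prob <- posttest_odds / (1 + posttest_odds)  # back to a probability: 0.5
posttest_prob
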
ROC
  • The ROC curve plots the True Positive Rate (TPR) against the False Positive Rate (FPR) at various threshold settings of a classifier.
  • AUC (the area under the ROC curve) provides a single, concise summary of a model's performance. (A toy example tracing an ROC curve by hand follows this list.)
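
The toy sketch below traces out an ROC curve by hand: a threshold is swept over a continuous score and the TPR and FPR are recomputed at each cut-off; the simulated data are illustrative only:

set.seed(1)
status <- rbinom(50, 1, 0.3)                           # 1 = sick, 0 = healthy
score  <- rnorm(50, mean = ifelse(status == 1, 1, 0))  # higher scores for the sick
thresholds <- sort(unique(score))
tpr <- sapply(thresholds, function(t) mean(score[status == 1] >= t))  # sensitivity
fpr <- sapply(thresholds, function(t) mean(score[status == 0] >= t))  # 1 - specificity
plot(fpr, tpr, type = "s", xlab = "False positive rate", ylab = "True positive rate")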

Measures of a diagnostic test

library(mltools)
library(caret)
## Loading required package: ggplot2
## Warning: package 'ggplot2' was built under R version 4.4.3
## Loading required package: lattice
library(pROC)
## Type 'citation("pROC")' for a citation.
## 
## Attaching package: 'pROC'
## The following objects are masked from 'package:stats':
## 
##     cov, smooth, var
library(epiR)
## Warning: package 'epiR' was built under R version 4.4.3
## Loading required package: survival
## 
## Attaching package: 'survival'
## The following object is masked from 'package:caret':
## 
##     cluster
## Package epiR 2.0.84 is loaded
## Type help(epi.about) for summary information
## Type browseVignettes(package = 'epiR') to learn how to use epiR for applied epidemiological analyses
## 
set.seed(123)
# 2x2 contingency table, two methods / packages
binomial_case <- data.frame(labels      = sample(c(2, 0), 100, replace = TRUE),
                            predictions = sample(c(2, 0), 100, replace = TRUE))
# Pay close attention to the reference level ("2" is the positive class here) and
# to the order of the two arguments: predictions first, then labels.
cm <- confusionMatrix(relevel(as.factor(binomial_case$predictions), ref = "2"),
                      relevel(as.factor(binomial_case$labels), ref = "2"))
cm
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction  2  0
##          2 26 20
##          0 31 23
##                                          
##                Accuracy : 0.49           
##                  95% CI : (0.3886, 0.592)
##     No Information Rate : 0.57           
##     P-Value [Acc > NIR] : 0.9564         
##                                          
##                   Kappa : -0.0087        
##                                          
##  Mcnemar's Test P-Value : 0.1614         
##                                          
##             Sensitivity : 0.4561         
##             Specificity : 0.5349         
##          Pos Pred Value : 0.5652         
##          Neg Pred Value : 0.4259         
##              Prevalence : 0.5700         
##          Detection Rate : 0.2600         
##    Detection Prevalence : 0.4600         
##       Balanced Accuracy : 0.4955         
##                                          
##        'Positive' Class : 2              
## 
## Generate a two-by-two table:
tab <- table(binomial_case$predictions, binomial_case$labels)[2:1, 2:1]  # put the positive level "2" first (rows = test, columns = outcome)
rval <- epi.tests(tab, conf.level = 0.95)
rval
##           Outcome +    Outcome -      Total
## Test +           26           20         46
## Test -           31           23         54
## Total            57           43        100
## 
## Point estimates and 95% CIs:
## --------------------------------------------------------------
## Apparent prevalence *                  0.46 (0.36, 0.56)
## True prevalence *                      0.57 (0.47, 0.67)
## Sensitivity *                          0.46 (0.32, 0.59)
## Specificity *                          0.53 (0.38, 0.69)
## Positive predictive value *            0.57 (0.41, 0.71)
## Negative predictive value *            0.43 (0.29, 0.57)
## Positive likelihood ratio              0.98 (0.64, 1.50)
## Negative likelihood ratio              1.02 (0.70, 1.47)
## False T+ proportion for true D- *      0.47 (0.31, 0.62)
## False T- proportion for true D+ *      0.54 (0.41, 0.68)
## False T+ proportion for T+ *           0.43 (0.29, 0.59)
## False T- proportion for T- *           0.57 (0.43, 0.71)
## Correctly classified proportion *      0.49 (0.39, 0.59)
## --------------------------------------------------------------
## * Exact CIs
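
As a cross-check, the same point estimates can be computed directly from tab (rows are test results, columns are true outcomes, with the positive level "2" placed first by the [2:1, 2:1] reordering above):

TP <- tab[1, 1]; FP <- tab[1, 2]; FN <- tab[2, 1]; TN <- tab[2, 2]
c(sensitivity = TP / (TP + FN),   # 26/57 = 0.46
  specificity = TN / (FP + TN),   # 23/43 = 0.53
  PPV         = TP / (TP + FP),   # 26/46 = 0.57
  NPV         = TN / (FN + TN))   # 23/54 = 0.43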

One binary AUC comparison

############### one-class auc comparison
#generate df with random numbers
df <- data.frame(disease_status = rbinom(n=100, size=1, prob=0.20),
                 test1 = rnorm(100, mean=15, sd=4),
                 test2 = rnorm(100, mean=30, sd=2))

# Basic example with 2 roc objects
roc1 <- roc(df$disease_status, df$test1, plot = TRUE, smooth = FALSE)
## Setting levels: control = 0, case = 1
## Setting direction: controls < cases

roc2 <- roc(df$disease_status, df$test2, plot = TRUE, smooth = FALSE)
## Setting levels: control = 0, case = 1
## Setting direction: controls > cases

result <- roc.test(roc1, roc2)
result
## 
##  Bootstrap test for two correlated ROC curves
## 
## data:  roc1 and roc2
## D = -0.26962, boot.n = 2000, boot.stratified = 1, p-value = 0.7875
## alternative hypothesis: true difference in AUC is not equal to 0
## sample estimates:
## AUC of roc1 AUC of roc2 
##   0.4723847   0.4983756
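# Confidence intervals for each AUC can be obtained with pROC's ci.auc(), and the
# same comparison can be run with DeLong's method instead of the bootstrap
# (a minimal sketch; output not shown):
ci.auc(roc1)
ci.auc(roc2)
roc.test(roc1, roc2, method = "delong")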
# Comparison of partial AUC:
roc3 <- roc(df$disease_status, df$test1, partial.auc=c(1, 0.8), partial.auc.focus="se")
## Setting levels: control = 0, case = 1
## Setting direction: controls < cases
roc4 <- roc(df$disease_status, df$test2, partial.auc=c(1, 0.8), partial.auc.focus="se")
## Setting levels: control = 0, case = 1
## Setting direction: controls > cases
roc.test(roc3, roc4)
## 
##  Bootstrap test for two correlated ROC curves
## 
## data:  roc3 and roc4
## D = 0.60291, boot.n = 2000, boot.stratified = 1, p-value = 0.5466
## alternative hypothesis: true difference in AUC is not equal to 0
## sample estimates:
## pAUC (1-0.8 sensitivity) of roc1 pAUC (1-0.8 sensitivity) of roc2 
##                       0.03404808                       0.01910331
# Unpaired (unrelated) tests
roc7 <- roc1
# artificially create an roc8 unpaired with roc7
roc8 <- roc(df$disease_status[1:70], df$test1[1:70])
## Setting levels: control = 0, case = 1
## Setting direction: controls < cases
roc.test(roc7, roc8, paired=FALSE, method="bootstrap")
## 
##  Bootstrap test for two ROC curves
## 
## data:  roc7 and roc8
## D = -0.021121, boot.n = 2000, boot.stratified = 1, p-value = 0.9831
## alternative hypothesis: true difference in AUC is not equal to 0
## sample estimates:
## AUC of roc1 AUC of roc2 
##   0.4723847   0.4744898
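
For a genuinely unpaired comparison, the two ROC curves can be built on disjoint subsets of the data; a minimal sketch reusing df (the split into observations 1-50 and 51-100 is arbitrary and purely illustrative; output not shown):

roc_a <- roc(df$disease_status[1:50],   df$test1[1:50])
roc_b <- roc(df$disease_status[51:100], df$test2[51:100])
roc.test(roc_a, roc_b, paired = FALSE, method = "bootstrap")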

Comparison at a fixed specificity or sensitivity

# related ROC
# Comparison on specificity and sensitivity
roc.test(roc1, roc2, method="specificity", specificity=0.9)  #true difference in sensitivity at 0.9 specificity is not equal to 0
## 
##  Specificity test for two correlated ROC curves
## 
## data:  roc1 and roc2
## D = -1.0584, boot.n = 2000, boot.stratified = 1, p-value = 0.2899
## alternative hypothesis: true difference in sensitivity at 0.9 specificity is not equal to 0
## sample estimates:
## AUC of roc1 AUC of roc2 
##   0.4723847   0.4983756
roc.test(roc1, roc2, method="sensitivity", sensitivity=0.9)  #true difference in specificity at 0.9 sensitivity is not equal to 0
## 
##  Sensitivity test for two correlated ROC curves
## 
## data:  roc1 and roc2
## D = 0.16563, boot.n = 2000, boot.stratified = 1, p-value = 0.8685
## alternative hypothesis: true difference in specificity at 0.9 sensitivity is not equal to 0
## sample estimates:
## AUC of roc1 AUC of roc2 
##   0.4723847   0.4983756
# unrelated ROC
# Comparison on specificity and sensitivity
roc.test(roc1, roc2, paired=FALSE,method="specificity", specificity=0.9)  #true difference in sensitivity at 0.9 specificity is not equal to 0
## Warning in roc.test.roc(roc1, roc2, paired = FALSE, method = "specificity", :
## The ROC curves seem to be paired. Consider performing a paired roc.test.
## 
##  Specificity test for two ROC curves
## 
## data:  roc1 and roc2
## D = -0.60266, boot.n = 2000, boot.stratified = 1, p-value = 0.5467
## alternative hypothesis: true difference in sensitivity at 0.9 specificity is not equal to 0
## sample estimates:
## AUC of roc1 AUC of roc2 
##   0.4723847   0.4983756
roc.test(roc1, roc2, paired=FALSE,method="sensitivity", sensitivity=0.9)  #true difference in specificity at 0.9 sensitivity is not equal to 0
## Warning in roc.test.roc(roc1, roc2, paired = FALSE, method = "sensitivity", :
## The ROC curves seem to be paired. Consider performing a paired roc.test.
## 
##  Sensitivity test for two ROC curves
## 
## data:  roc1 and roc2
## D = 0.17241, boot.n = 2000, boot.stratified = 1, p-value = 0.8631
## alternative hypothesis: true difference in specificity at 0.9 sensitivity is not equal to 0
## sample estimates:
## AUC of roc1 AUC of roc2 
##   0.4723847   0.4983756
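
pROC can also attach confidence intervals to sensitivity at a fixed specificity (and vice versa) via ci.se() and ci.sp(), which complements the point-wise tests above; a minimal sketch on roc1 (output not shown):

ci.se(roc1, specificities = 0.9)   # CI for sensitivity at 0.9 specificity
ci.sp(roc1, sensitivities = 0.9)   # CI for specificity at 0.9 sensitivity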

Two-class AUC comparison

############### Two-class AUC comparison
set.seed(123)
library(metrica)
## Warning: package 'metrica' was built under R version 4.4.3
## 
## Attaching package: 'metrica'
## The following objects are masked from 'package:caret':
## 
##     MAE, precision, R2, recall, RMSE, sensitivity, specificity
## The following object is masked from 'package:lattice':
## 
##     barley
## The following object is masked from 'package:mltools':
## 
##     mcc
library(mltools)
binomial_case <- data.frame(labels = sample(c(2,0), 100, 
                                            replace = TRUE), predictions = sample(c(2,0), 100, replace = TRUE))
# Get AUC estimate for two-class case
# three packages/methods generate very similar results
AUC_roc(data = binomial_case, obs = labels, pred = predictions, tidy = TRUE)  # pay attention to the order of the two variables (obs vs pred)
##     AUC_roc
## 1 0.5692277
auc_roc(binomial_case$predictions, binomial_case$labels)  # mltools: predictions first, then labels
## [1] 0.5137845
roc(binomial_case$labels, binomial_case$predictions, plot = TRUE, smooth = FALSE)  # pROC: response first, then predictor
## Setting levels: control = 0, case = 2
## Setting direction: controls < cases

## 
## Call:
## roc.default(response = binomial_case$labels, predictor = binomial_case$predictions,     smooth = FALSE, plot = TRUE)
## 
## Data: binomial_case$predictions in 43 controls (binomial_case$labels 0) < 57 cases (binomial_case$labels 2).
## Area under the curve: 0.4955
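
Because the predictor here is itself binary, the ROC curve has a single interior point and its AUC equals (sensitivity + specificity) / 2, i.e. the balanced accuracy; a quick check against the values reported by confusionMatrix above:

(0.4561 + 0.5349) / 2   # 0.4955, matching the area under the curve above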

Multi-class AUC comparison

# Multi-class
multinomial_case <- data.frame(labels = sample(c(2,1,0), 100, 
                                               replace = TRUE), predictions = sample(c(2,1,0), 100, replace = TRUE) )
# Get AUC estimate for multi-class case
AUC_roc(data = multinomial_case, obs = labels, pred = predictions, tidy = TRUE)
##     AUC_roc
## 1 0.4582371
# https://r-packages.io/packages/metrica/AUC_roc
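
For a multi-class alternative, pROC provides multiclass.roc(), which computes a multi-class AUC following the Hand and Till (2001) approach; a minimal sketch on the same multinomial_case (output not shown):

multiclass.roc(multinomial_case$labels, multinomial_case$predictions)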