Post-test probabilities

Relationship between sensitivity, specificity, pre-test probability, and post-test probability

plot of chunk unnamed-chunk-7

Define P(Disease+|Test+) function (Red line)

dis.pos.given.test.pos <- function(pre.prob, sens, spec) {
    TP <- pre.prob * sens
    FN <- pre.prob - TP
    TN <- (1 - pre.prob) * spec
    FP <- (1 - pre.prob) - TN

    TP / (TP + FP)
}

Define P(Disease+|Test-) function (Green line)

dis.pos.given.test.neg <- function(pre.prob, sens, spec) {
    TP <- pre.prob * sens
    FN <- pre.prob - TP
    TN <- (1 - pre.prob) * spec
    FP <- (1 - pre.prob) - TN

    FN / (FN + TN)
}

Create combinations of pre-test probablities, sensitivities, and specificites

dat <- expand.grid(pre.prob = seq(0, 1, 0.01),
                   sens     = seq(0, 1, 0.1),
                   spec     = seq(0, 1, 0.1)
                   )
head(dat)

  pre.prob sens spec
1     0.00    0    0
2     0.01    0    0
3     0.02    0    0
4     0.03    0    0
5     0.04    0    0
6     0.05    0    0

Add post-test probabilities, convert sensitivities and specificites to categories

dat <- within(dat, {
    dis.pos.given.test.pos <- dis.pos.given.test.pos(pre.prob, sens, spec)
    dis.pos.given.test.neg <- dis.pos.given.test.neg(pre.prob, sens, spec)

    spec <- factor(spec, seq(1, 0, -0.1))
    sens <- factor(sens)
})
head(dat)

  pre.prob sens spec dis.pos.given.test.neg dis.pos.given.test.pos
1     0.00    0    0                    NaN                      0
2     0.01    0    0                      1                      0
3     0.02    0    0                      1                      0
4     0.03    0    0                      1                      0
5     0.04    0    0                      1                      0
6     0.05    0    0                      1                      0

Reshape data for ggplot2

library(reshape2)
dat2 <- melt(dat, id.vars  = c("pre.prob", "sens", "spec"),
             measure.vars  = c("dis.pos.given.test.pos","dis.pos.given.test.neg"),
             variable.name = "test.result", value.name = "post.prob")
head(dat2)

  pre.prob sens spec            test.result post.prob
1     0.00    0    0 dis.pos.given.test.pos         0
2     0.01    0    0 dis.pos.given.test.pos         0
3     0.02    0    0 dis.pos.given.test.pos         0
4     0.03    0    0 dis.pos.given.test.pos         0
5     0.04    0    0 dis.pos.given.test.pos         0
6     0.05    0    0 dis.pos.given.test.pos         0

Plot with ggplot2: Red line P(Disease+|Test+) and Green line P(Disease+|Test-)

When the sensitivity is 0, the test will be positive in none of the diseased. Positive = all false positive.
When the sensitivity is 1, the test will be positive in all of the diseased. Negative = all true negative.
When the specificity is 0, the test will positive in all of the non-diseased. Negative = all false negative.
When the specificity is 1, the test will positive in none of the non-diseased. Positive = all true positive.
When sensitivity + specificity = 1, the post-test probability = pre-test probability regardless of the test result. Thus, the test is as good as a coin toss.
When sensitivity + specificity > 1, the post-test probability > pre-test probability if the test is positive. The test is informative.
When sensitivity + specificity < 1, the post-test probability < pre-test probability if the test is positive. Thus, the test is misleading (disease more likely if the test is negative.). It can be informative if the test result is interpreted the other way around.

library(ggplot2)
library(grid)

p <-
    ggplot(dat2, aes(x = pre.prob, y = post.prob, color = test.result)) +
    geom_line() +
    scale_x_continuous(breaks = NULL, name = "Pre-test probability") +          # X-axis
    scale_y_continuous(breaks = NULL, name = "Post-test probability") +         # Y-axis
    facet_grid(spec ~ sens) +                                                   # row ~ column
    guides(color = FALSE)                                                       # No legend

VP <- viewport(width = 0.975, height = 0.975, x = 0.0, y = 0.0, just = c(0,0))
InPlot <- p + labs(title = "Sensitivity")                        # top label
print(InPlot, vp = VP)
grid.text("Specificity", x = 0.975, y = 0.5, rot = 270)          # right label

plot of chunk unnamed-chunk-7

Each panel is a pre-test probability (X-axis) vs post-test probability (Y-axis) plot at different values of sensitivities (along X-axis) and specificities (along Y-axis).

Red line is P(Disease+|Test+) and Green line is P(Disease+|Test-).

Very accurate test scenario: Sensitivity 99.9% Specificity 99.9%

If the pre-test probability is less than 0.1%, even a very accurate test with 99.9% sensitivity and specificity produces many false positives. Among test positives, less than 50% are true positives (red line).

dat3 <- expand.grid(pre.prob = seq(0, 0.01, 0.0001),
                   sens     = 0.999,
                   spec     = 0.999
                   )

dat3 <- within(dat3, {
    dis.pos.given.test.pos <- dis.pos.given.test.pos(pre.prob, sens, spec)
    dis.pos.given.test.neg <- dis.pos.given.test.neg(pre.prob, sens, spec)
})

dat3 <- melt(dat3, id.vars = c("pre.prob", "sens", "spec"),
             measure.vars  = c("dis.pos.given.test.pos","dis.pos.given.test.neg"),
             variable.name = "test.result", value.name = "post.prob")

ggplot(dat3, aes(x = pre.prob, y = post.prob, color = test.result)) +
    geom_line() +
    scale_x_continuous(limit = c(0, 0.001), breaks = c(0, 0.0001, 0.0005, 0.001), name = "Pre-test probability") +
    scale_y_continuous(limit = c(0, 1),    name = "Post-test probability")

plot of chunk unnamed-chunk-8