Test validity is defined as the ability of a screening test to accurately identify diseased and non-diseased individuals. Validity has two components: sensitivity and specificity. The sensitivity of the test is defined as the ability of the test to identify correctly those who have the disease. The specificity of the test is defined as the ability of the test to identify correctly those who do not have the disease.
Example: You have a population of 1000 patients, of whom 100 are diabetic and 900 are non-diabetic. You have developed a new non-invasive screening test for diabetes and want to use it to distinguish people who have diabetes from those who do not. Using the information below, calculate sensitivity and specificity and provide your interpretation.
diabets_matrix <- matrix(data =c(75,50,25,850),nrow=2,ncol=2,byrow = TRUE) #create contingency table/matrix
colnames(diabets_matrix) <- c('Diabetes (+)','Diabetes (-)') #assign columns names
rownames(diabets_matrix) <- c('Test (+)','Test (-)') #assign rows names
diabets_matrix
##          Diabetes (+) Diabetes (-)
## Test (+)           75           50
## Test (-)           25          850
Let’s calculate sensitivity and specificity:
diabets_matrix[1,1]/(diabets_matrix[1,1]+diabets_matrix[2,1]) #sensitivity (TP/(TP+FN))
## [1] 0.75
diabets_matrix[2,2]/(diabets_matrix[2,2]+diabets_matrix[1,2]) #specificity (TN/(TN+FP))
## [1] 0.9444444
Interpretation: The test’s sensitivity is 75%, meaning that if a patient is diabetic, there is a 75% probability that the screening test will be positive. The test’s specificity is 94%, meaning that if a patient is not diabetic, there is a 94% probability that the screening test will be negative.
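The calculation above can be wrapped in a small reusable function. The following is a minimal sketch, assuming the table is laid out as in the example (test results in rows, true disease status in columns); the name `sens_spec` is a hypothetical helper, not part of any package.

```r
# Hypothetical helper: sensitivity and specificity from a 2x2 table whose
# rows are test results (+, -) and whose columns are disease status (+, -).
sens_spec <- function(tab) {
  tp <- tab[1, 1]  # test positive, disease present
  fp <- tab[1, 2]  # test positive, disease absent
  fn <- tab[2, 1]  # test negative, disease present
  tn <- tab[2, 2]  # test negative, disease absent
  c(sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp))
}

# Same counts as the diabetes example
example_tab <- matrix(c(75, 50, 25, 850), nrow = 2, byrow = TRUE)
result <- sens_spec(example_tab)
round(result, 3)
```

Both measures are computed within a column of the table: sensitivity uses only the diseased column and specificity uses only the non-diseased column.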
When developing a diagnostic test, it is essential to know which patients truly have the disease and which do not. This gold standard, or source of truth, can be derived from another test that is already in use or has been validated previously. Thus, to quantitatively assess the sensitivity and specificity of a test, we must have another source of truth with which to compare the test results.
When developing a screening test, it is important to evaluate the effect of false-positive and false-negative cases. Patients screened positive (true positives and false positives) will need to undergo further advanced/expensive testing. False-positive cases are a major source of cost for patients and the healthcare system.
In patients who have the disease but are erroneously classified as negative (false negatives), the missed diagnosis can cause serious harm and may even lead to death. The importance of false-negative results depends on the nature and severity of the disease being screened for, the effectiveness of available intervention measures, and whether the effectiveness is greater if the intervention is administered early in the natural history of the disease.
Sensitivity and specificity tell us how good the test is at identifying people with and without the disease. This information is particularly important when using a diagnostic test to screen a population. Essentially, we are asking, “If we screen a population, what proportion of people will be diagnosed correctly?” This is certainly an important public health and health policy question.
Clinicians and health practitioners may ask a different question: if a patient tests positive, what is the probability that the patient actually has the disease? This is the definition of positive predictive value (PPV). Similarly, if a patient tests negative, what is the probability that the patient does not have the disease? This is called the negative predictive value (NPV).
PPV is calculated by dividing the number of true positives by all who tested positive (true positives + false positives). NPV is calculated by dividing the number of true negatives by all those who tested negative (true negatives + false negatives).
Example: You have a population of 1000 patients, of whom 100 are diabetic and 900 are non-diabetic. You have developed a new non-invasive screening test for diabetes and want to use it to distinguish people who have diabetes from those who do not. Using the information below, calculate PPV and NPV and provide your interpretation.
diabets_matrix <- matrix(data =c(75,50,25,850),nrow=2,ncol=2,byrow = TRUE) #create contingency table/matrix
colnames(diabets_matrix) <- c('Diabetes (+)','Diabetes (-)') #assign columns names
rownames(diabets_matrix) <- c('Test (+)','Test (-)') #assign rows names
diabets_matrix
##          Diabetes (+) Diabetes (-)
## Test (+)           75           50
## Test (-)           25          850
Let’s calculate PPV and NPV:
diabets_matrix[1,1]/(diabets_matrix[1,1]+diabets_matrix[1,2]) # PPV (TP/(TP+FP))
## [1] 0.6
diabets_matrix[2,2]/(diabets_matrix[2,2]+diabets_matrix[2,1]) # NPV (TN/(TN+FN))
## [1] 0.9714286
Interpretation: The PPV of the test is 60%, meaning if a patient tests positive, there is a 60% probability that the patient is diabetic. The NPV of the test is 97%, meaning if the patient tests negative, there is a 97% probability that the patient is not diabetic.
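Whereas sensitivity and specificity are read down the columns of the table, PPV and NPV are read along its rows. A minimal sketch of that row-wise calculation follows, assuming the same layout as the example; `predictive_values` is a hypothetical helper name introduced only for illustration.

```r
# Hypothetical helper: PPV and NPV from a 2x2 table whose rows are
# test results (+, -) and whose columns are disease status (+, -).
predictive_values <- function(tab) {
  c(ppv = tab[1, 1] / sum(tab[1, ]),  # TP / all who tested positive
    npv = tab[2, 2] / sum(tab[2, ]))  # TN / all who tested negative
}

# Same counts as the diabetes example
example_tab <- matrix(c(75, 50, 25, 850), nrow = 2, byrow = TRUE)
pv <- predictive_values(example_tab)
round(pv, 3)

# Share of diseased patients in this sample: 100 of 1000
sample_prevalence <- sum(example_tab[, 1]) / sum(example_tab)
sample_prevalence
```

Because each row mixes diseased and non-diseased patients, the row totals depend on how many patients of each kind are in the sample, which here corresponds to a prevalence of 10%.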
Sensitivity and specificity are characteristics of the test itself. PPV and NPV, however, are also influenced by the prevalence of the disease in the population tested.
Prevalence is the proportion of individuals in a population who have a disease or characteristic. Prevalence can be viewed as the pre-test probability, that is, the probability that a person in the specified population has the disease before any testing. Thus, if the prevalence of a disease is 1% of the population, then we would expect approximately 1 in 100 people to have the disease before any testing. As the prevalence increases, the PPV also increases but the NPV decreases. Similarly, as the prevalence decreases, the PPV decreases while the NPV increases. We can prove this relationship mathematically.
\[PPV=\frac{sensitivity \times prevalence}{(sensitivity \times prevalence)+((1-specificity) \times (1-prevalence))} \]
If we hold all values constant except for the prevalence, then as prevalence increases the numerator of the PPV also increases. As prevalence approaches 100% (a value of one), the term “1 – prevalence” approaches zero, which drives the second part of the denominator, “(1 – specificity) x (1 – prevalence)”, to zero. Thus, at a very high prevalence, the PPV equation reduces to 1.
\[PPV=\frac{sensitivity \times prevalence}{(sensitivity \times prevalence)+((1-specificity) \times (0))} \]
\[PPV=\frac{sensitivity \times prevalence}{(sensitivity \times prevalence)+0} \]
\[PPV=\frac{sensitivity \times prevalence}{sensitivity \times prevalence}=1 \]
For the NPV, as the prevalence increases (goes towards one), the term “1 – prevalence” becomes smaller, making the numerator smaller. The denominator has the same first term as the numerator, “specificity x (1 – prevalence)”, which also becomes smaller as the prevalence increases. The second term in the denominator, “(1 – sensitivity) x prevalence”, increases as the prevalence increases. Starting from the NPV formula and setting the prevalence to one shows that the NPV reduces to zero.
\[NPV=\frac{specificity \times (1-prevalence)}{(specificity \times (1-prevalence))+((1-sensitivity) \times prevalence)} \]
\[NPV=\frac{specificity \times (1-1)}{(specificity \times (1-1))+((1-sensitivity) \times 1)} \]
\[NPV=\frac{specificity \times 0}{(specificity \times 0)+((1-sensitivity) \times 1)} \]
\[NPV=\frac{0}{0+((1-sensitivity) \times 1)} = 0\]
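The two formulas can also be checked numerically. The sketch below is one possible translation into R; the function names `ppv_from_prevalence` and `npv_from_prevalence` are hypothetical, and the inputs are the sensitivity (75/100) and specificity (850/900) from the diabetes example.

```r
# PPV as a function of sensitivity, specificity and prevalence
ppv_from_prevalence <- function(sensitivity, specificity, prevalence) {
  true_pos  <- sensitivity * prevalence
  false_pos <- (1 - specificity) * (1 - prevalence)
  true_pos / (true_pos + false_pos)
}

# NPV as a function of sensitivity, specificity and prevalence
npv_from_prevalence <- function(sensitivity, specificity, prevalence) {
  true_neg  <- specificity * (1 - prevalence)
  false_neg <- (1 - sensitivity) * prevalence
  true_neg / (true_neg + false_neg)
}

sens <- 75 / 100
spec <- 850 / 900

# At the 10% prevalence of the diabetes example, the formulas should
# agree with the values computed directly from the table
ppv_check <- ppv_from_prevalence(sens, spec, 0.10)
npv_check <- npv_from_prevalence(sens, spec, 0.10)

# Limiting case derived above: prevalence equal to one
ppv_limit <- ppv_from_prevalence(sens, spec, 1)
npv_limit <- npv_from_prevalence(sens, spec, 1)

c(ppv_check = ppv_check, npv_check = npv_check,
  ppv_limit = ppv_limit, npv_limit = npv_limit)
```

At a prevalence of 10% the functions reproduce the PPV and NPV obtained from the contingency table, and at a prevalence of one they return 1 and 0, matching the derivation.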
To further illustrate the relationship between PPV and prevalence, consider the example of a test with a sensitivity of 99% and a specificity of 95% in a population of 10,000 people in which the disease prevalence is 1%. The PPV of the test is 17%, meaning that 83% of the patients testing positive are false positives.
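The counts behind this figure can be reconstructed step by step. The sketch below uses only the numbers stated above; the variable names are chosen for illustration.

```r
n           <- 10000
prevalence  <- 0.01
sensitivity <- 0.99
specificity <- 0.95

diseased     <- n * prevalence   # 100 people have the disease
non_diseased <- n - diseased     # 9900 people do not

true_pos  <- sensitivity * diseased            # about 99 correctly test positive
false_pos <- (1 - specificity) * non_diseased  # about 495 wrongly test positive

ppv <- true_pos / (true_pos + false_pos)       # 99 / 594
round(ppv, 3)
```

Even with a highly sensitive and specific test, the false positives drawn from the large non-diseased group outnumber the true positives by five to one, which is why the PPV is low.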
If the prevalence of the disease increases from 1% to 5%, the PPV increases from 17% to 51%. Therefore, as the prevalence of the disease increases, the PPV of the test increases.
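The effect of prevalence can be explored by recomputing the PPV over a range of values. This is a minimal sketch using the PPV formula given earlier with the same test (sensitivity 99%, specificity 95%); the grid of prevalence values is an assumption chosen for illustration.

```r
sensitivity <- 0.99
specificity <- 0.95

# Illustrative grid of prevalence values (an assumption, not from the text)
prevalence <- c(0.01, 0.02, 0.05, 0.10)

# Vectorised PPV formula: one PPV per prevalence value
ppv <- (sensitivity * prevalence) /
  (sensitivity * prevalence + (1 - specificity) * (1 - prevalence))

ppv_table <- data.frame(prevalence = prevalence, ppv = round(ppv, 3))
ppv_table
```

With sensitivity and specificity held fixed, the PPV rises steadily as prevalence increases across the grid.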
When developing a public health policy to screen for a disease, it is important to target high-risk populations in order to maximize the efficiency of screening and reduce the probability of false positives. Screening an entire population for an infrequent disease without accounting for risk can be very wasteful of resources and may yield few previously undetected cases relative to the amount of effort involved.