2024-02-01

Calculating reference intervals for analytes used in assessing liver health

Normal values of analytes can vary depending on several factors.\(^1\)

  • The nature of the population being sampled (sex, age, treatment status)
  • Regional variations (temperatures, pressures, altitudes)
  • The method used to measure the analyte

To help identify disease states, CLSI recommends a reference interval that covers the central 95% of healthy individuals in a population, with a 90% confidence interval around each limit.

Selecting Populations

Hepatitis C Data

- contains analyte data from 615 anonymized patients
- 533 disease negative
- 7 unknown status
- 75 in disease state

Reference ranges should be calculated separately by sex and age. For this presentation, we will be establishing a reference range for the AST marker in healthy adult men and women.
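
As a rough illustration of this selection step, the sketch below filters a small made-up data frame down to healthy adults and splits the AST values by sex. The column names (Category, Age, Sex, AST), the category labels, and the age cutoff of 18 are assumptions for illustration and should be checked against the actual Kaggle file. All values are invented.

```r
# Toy data standing in for the HCV file; all values are invented
hcv <- data.frame(
  Category = c("0=Blood Donor", "0=Blood Donor", "1=Hepatitis",
               "0=Blood Donor", "0s=suspect Blood Donor", "0=Blood Donor"),
  Age = c(32, 45, 50, 17, 40, 61),
  Sex = c("m", "f", "m", "f", "m", "f"),
  AST = c(22.1, 18.4, 95.0, 20.3, 30.2, 24.7)
)

# Keep disease-negative adults only (label and age cutoff are assumptions)
healthy_adults <- hcv[hcv$Category == "0=Blood Donor" & hcv$Age >= 18, ]

# One AST vector per sex, so each group gets its own reference interval
ast_by_sex <- split(healthy_adults$AST, healthy_adults$Sex)
print(ast_by_sex)
```

Base R is used here only to keep the sketch free of package dependencies; the same filter could be written with dplyr.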

HCV sample data provided by ZAHRA AMINI on Kaggle.com https://www.kaggle.com/datasets/aminizahra/hcv-data

Population histogram for AST among healthy adult males and females

Scatter plot of AST values among healthy and diseased populations

Basic Statistics calculations

First we calculate the mean:
\[ Mean = \bar{x} = \frac{\sum_{i=1}^{n} x_{i}}{n} \]
Then find the standard deviation:
\[ SD = \sqrt{\frac{\sum_{i=1}^{n}(x_{i}-\bar{x})^2}{n-1}} \]
Then calculate the variance, which is the square of the standard deviation:
\[ s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_{i}-\bar{x})^2 \]
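
The three definitions above can be checked by hand in R. This is a minimal sketch using invented values (not taken from the HCV data). It computes each statistic from its formula and confirms that the built-in mean(), var(), and sd() functions agree.

```r
# Invented AST values, for illustration only
x <- c(18, 22, 25, 30, 35)
n <- length(x)

mean_x <- sum(x) / n                      # sample mean
var_x  <- sum((x - mean_x)^2) / (n - 1)   # sample variance (n - 1 denominator)
sd_x   <- sqrt(var_x)                     # standard deviation

# The built-ins use the same n - 1 definitions
stopifnot(isTRUE(all.equal(mean_x, mean(x))),
          isTRUE(all.equal(var_x, var(x))),
          isTRUE(all.equal(sd_x, sd(x))))
```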

Calculations to find a reference interval with 90% confidence limits

To find the upper and lower limits of the analyte for a specific population, assuming the values are approximately normally distributed, we can use the formula\(^2\):

\[ Limits = Mean \pm 1.96 \times Standard Deviation \]

Then calculate a 90% confidence interval for each limit using the following formula\(^2\):

\[ Limit\: \pm\: 1.64 \: \times\: \sqrt{Variance \times(\frac{1}{n} + \frac{2}{n-1})} \]
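
As a worked example, take invented summary statistics (not from the HCV data): a mean of 25, a standard deviation of 5 (variance 25), and n = 101. The limits are:

\[ Limits = 25 \pm 1.96 \times 5 = 25 \pm 9.8 \quad\Rightarrow\quad 15.2 \text{ and } 34.8 \]

The half-width of the 90% confidence interval around each limit is:

\[ 1.64 \times \sqrt{25 \times \left(\frac{1}{101} + \frac{2}{100}\right)} \approx 1.64 \times 0.865 \approx 1.42 \]

So the lower limit of 15.2 has a 90% confidence interval of roughly 13.78 to 16.62, and the upper limit of 34.8 has one of roughly 33.38 to 36.22.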

Sample R code for finding reference interval

library(dplyr)  # provides %>% and select()

# `test` is the name of the analyte column, given as a string
confidence_int = function(dataset, test) {
  testdata = dataset %>%
    select(all_of(test)) %>%
    na.omit()
  n = nrow(testdata)          # sample size after dropping missing values
  values = testdata[[test]]   # extract the column as a numeric vector
  mean_pop = mean(values)     # find the mean
  sd_pop = sd(values)         # find the standard deviation
  var_pop = var(values)       # find the variance

  # reference limits: mean +/- 1.96 standard deviations
  upper_limit = mean_pop + (1.96 * sd_pop)
  lower_limit = mean_pop - (1.96 * sd_pop)

  # half-width of the 90% confidence interval around each limit
  margin = 1.64 * sqrt(var_pop * (1/n + 2/(n-1)))

  # this function reports the upper 90% confidence bound of each limit
  low = lower_limit + margin
  high = upper_limit + margin

  if (low < 0) low = 0  # an analyte level cannot be less than zero
  return(c(low, high))
}
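
The function above reports a single bound for each limit, while the formula gives a two-sided interval (Limit ± margin). The sketch below is a hypothetical variant, not the code used for the results in this presentation, that returns both bounds for each limit. The name reference_interval_ci and the sample values are invented for illustration, and base R is used to avoid package dependencies.

```r
# Hypothetical variant: returns both 90% confidence bounds for each limit
reference_interval_ci <- function(values) {
  values <- values[!is.na(values)]   # drop missing measurements
  n <- length(values)
  m <- mean(values)
  s <- sd(values)

  # reference limits: mean +/- 1.96 standard deviations
  limits <- c(lower = m - 1.96 * s, upper = m + 1.96 * s)

  # half-width of the 90% confidence interval around each limit
  margin <- 1.64 * sqrt(var(values) * (1 / n + 2 / (n - 1)))

  data.frame(
    limit = names(limits),
    estimate = unname(limits),
    ci_low = pmax(unname(limits) - margin, 0),  # levels cannot be negative
    ci_high = unname(limits) + margin
  )
}

# Invented values, for illustration only
ast <- c(18, 22, 25, 30, 35, NA, 27, 21)
result <- reference_interval_ci(ast)
print(result)
```

Returning a data frame with one row per limit keeps the estimate and its two confidence bounds together, which makes it easier to see how wide the uncertainty is when the sample is small.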

The range for AST for Healthy Adult Males is 14.47, 43.9
The range for AST for Healthy Adult Females is 9.679, 40.75

A selection of reference intervals for common liver analytes

References