Computing relative viral levels from COVID-19 test kits using R

Author

John Anderson

Background

A while ago, I saw a (formerly known as) Twitter post where someone had computed COVID levels from rapid-test results. Unfortunately, I can’t find the post, but I recall the person had to convert images of the test results to grayscale, and then compute averages for how dark the band for the control and test areas were.

A few weeks ago, I got COVID, and this seemed like as good a project to try out as any. At first, I tried using imageJ to compute the average values for regions of interest, but this is tedious, and likely could be automated.

Here’s an approach in R that automatically reads in cropped test results from png files in a directory and produces a ratio of the Test (minus background) to Control (minus background) results.

In plain English, higher values of this ratio indicate relatively darker lines for the Test (COVID positive) region than the Control region.

What you need!

A few images of COVID tests - preferably, some will be positive to varying degrees. Here’s an example of a raw image:

positive rapid test for COVID-19 (oriented vertically & in colour)

Raw photo of COVID test

… and here’s how you’ll need to reformat that image for this script to work.

reformatted photo (C is the right hand bar in ALL cases). The image should be grayscale and cropped tight around the test “well”

If you have at least one of these in a directory called “COVID_PHOTOS”, you can run the script to get values.

Overview of the steps:

  1. The script will attempt to resize all the photos to have the same width & then allow for differences in intial cropping by saving the dimensions of the new image.

  2. The “process image” function then reads in each file, converts it to grayscale, and applies three masks, one for the Test region, one for the Control region, and one for the Background (BG).

  3. Mean values are computed within the masks (images are scaled between 0 and 1) and backgrounds are subtracted from the T and the C values. What this yields, are values for how much darker the test and control regions are than the background regions. In the figure below, the green region corresponds to the mask for control, the yellow for the background, and the pink to the test.

Masked regions for image intensity calculations
  1. Finally, a ratio of T/C is computed, and higher values indicate how much darker (relatively) T is than C.
library(EBImage)
library(magick)
Linking to ImageMagick 6.9.12.3
Enabled features: cairo, fontconfig, freetype, heic, lcms, pango, raw, rsvg, webp
Disabled features: fftw, ghostscript, x11
resize_images_magick <- function(image_paths, width = 486) {
  resized_paths <- character(length(image_paths))
  dimensions <- matrix(0, nrow = length(image_paths), ncol = 2)
  for (i in seq_along(image_paths)) {
    image <- image_read(image_paths[i])
    resized_image <- image_scale(image, paste0(width, "x"))
    resized_path <- paste0("resized_", basename(image_paths[i]))
    image_write(resized_image, resized_path)
    resized_paths[i] <- resized_path
    dimensions[i,] <- c(width, image_info(resized_image)$height) # Get the new height
  }
  return(list(paths = resized_paths, dimensions = dimensions))
}

process_image <- function(image_path, mask_C, mask_T, mask_BG) {
  # Load the image
  image <- image_read(image_path)
  
  # Convert to grayscale
  image_gray <- image_convert(image, colorspace = 'Gray')
  
  # Get the pixel values
  pixels <- as.numeric(image_data(image_gray))
  height <- nrow(mask_C)
  width <- ncol(mask_C)
  pixels <- matrix(pixels, nrow = height, ncol = width) # Reshape pixels according to the dimensions
  
  # Apply the masks to extract the regions of interest
  roi_C <- pixels * mask_C
  roi_T <- pixels * mask_T
  roi_BG <- pixels * mask_BG
  
  # Calculate mean intensity
  mean_T <- mean(roi_T[roi_T > 0])
  mean_C <- mean(roi_C[roi_C > 0])
  mean_BG <- mean(roi_BG[roi_BG > 0])
  
  # Subtract background
  corrected_T <- mean_T - mean_BG
  corrected_C <- mean_C - mean_BG
  
  # Calculate ratio
  ratio <- corrected_T / corrected_C
  
  return(ratio)
}
# Example usage for multiple images
original_image_paths <- list.files("/Users/johnanderson/Downloads/COVID_PHOTOS/", pattern = "\\.(jpg|jpeg|png)$", full.names = TRUE) # Adjust path and extensions as needed
resized_images <- resize_images_magick(original_image_paths)
resized_image_paths <- resized_images$paths
dimensions <- resized_images$dimensions

ratios <- numeric(length(resized_image_paths))

# Initialize a data frame to hold the results
results <- data.frame(filename = character(), ratio = numeric())

# Iterate through the images and apply masks based on their individual dimensions
for (i in seq_along(resized_image_paths)) {
  width <- dimensions[i, 1]
  height <- dimensions[i, 2]
  mask_C <- matrix(0, nrow = height, ncol = width)
  mask_T <- matrix(0, nrow = height, ncol = width)
  mask_BG <- matrix(0, nrow = height, ncol = width)

  # Define the specific regions for each mask here (update these as needed)
  mask_C[80:120, 310:350] <- 1
  mask_T[80:120, 155:195] <- 1
  mask_BG[80:120, 195:310] <- 1

      # Call the display_masks function to visualize the masks
  
  ratio <- process_image(resized_image_paths[i], mask_C, mask_T, mask_BG)
  
  # Extract the filename without the extension
  filename <- tools::file_path_sans_ext(basename(resized_image_paths[i]))
  
  # Add the filename and ratio to the results data frame
  results <- rbind(results, data.frame(filename = filename, ratio = ratio))

}

# You can now use the results data frame for further analysis
print(results)
        filename       ratio
1  resized_Fifth  0.07293435
2  resized_First  0.58333371
3 resized_Fourth  0.19758516
4 resized_Second  1.07200008
5  resized_Sixth -0.03469754
6  resized_Third  0.31892355

If you want to plot these values, you might also want to include days from symptom onset. Here’s an example image for three people. Note that there was a slight increase in viral load around day 10 (this was also when I really started getting interested in the results and testing much more often).

library(readr)
library(ggplot2)
library(ggprism)
COVID <- read_csv("~/Downloads/COVID.csv")
COVID <- COVID[complete.cases(COVID$Person), ]

ggplot(COVID, aes(x = `Days from symptom onset`, y = New_T_C, colour = Person, shape = Person))+geom_point(size = 4)+theme_prism()+ scale_color_manual(values=c("#999999", "#E69F00", "#56B4E9"))+geom_smooth(se=FALSE)+geom_hline(yintercept = 0, linetype=2, size = 1)+ylab("T/C \n How much darker Test is than Control")+facet_wrap(~Person)