Malaria Detector Project

Project Overview

This project explores the detection of Plasmodium falciparum infections using two diagnostic methods: microscopy and PCR. Adoption of molecular techniques (PCR) has revealed many low-density, transmissible infections that are often missed by microscopy (submicroscopic infections). The analysis aims to compare detection rates, compute prevalence ratios, and visualize how submicroscopic infections vary across global malaria regions.

Tasks

  • Visualize PCR % vs. Microscopy %
  • Add a 1:1 reference line to compare both techniques
  • Compute the Prevalence Ratio (Microscopy Positives/PCR Positives)
  • Generate boxplots of prevalence ratios across global regions
  • Interpret results to determine which region has the highest proportion of submicroscopic infections

First, make sure the tidyverse package is installed. It includes the readr package, a faster and more consistent way to read tabular data than the base R functions.
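The installation step above can be sketched as a small check that installs only what is missing. This is a minimal sketch, assuming packages come from CRAN; the helper name `missing_packages` is hypothetical and not part of the original notebook.

```r
# Hypothetical helper: return the packages in `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("tidyverse"))

# Install only what is missing (uncomment to run; needs internet access).
# if (length(to_install) > 0) install.packages(to_install)
```

The install call is left commented out so that running the chunk never changes the local library by accident.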

Loading libraries

library(tidyverse)
library(readr)

Loading Dataset

# Read the tab-separated dataset directly from GitHub
dataset <- read.table(file = "https://raw.githubusercontent.com/HackBio-Internship/public_datasets/main/R/lancet_malaria.txt", header = TRUE, sep = "\t")
malaria_data <- dataset
head(malaria_data)

Renaming of the column names

colnames(malaria_data) <- c("Review Found", "Author", "Title", "Year", "Region","Country","Location", "PCR_N_Tested", "PCR_N_Positive", "PCR_Percent","Microscopy_N_Tested", "Microscopy_N_Positive", "Microscopy_Percent", "Historical_Transmission", "Current_Transmission", "Setting_20", "Setting_15", "Setting_10", "Setting_5", "PCR_Method", "Microscopy_Fields", "Sampling_Season", "Notes")
head(malaria_data)
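Assigning a full vector of names fails with an error if its length differs from the number of columns, and it mislabels columns silently if the order is wrong. A minimal sketch of a guarded rename is shown here on a toy data frame; the helper name `rename_columns` and the toy columns are hypothetical.

```r
# Hypothetical helper: rename columns only when the counts match.
rename_columns <- function(df, new_names) {
  stopifnot(length(new_names) == ncol(df))
  colnames(df) <- new_names
  df
}

# Toy two-column data frame standing in for the real dataset.
toy <- data.frame(a = 1:2, b = 3:4)
toy <- rename_columns(toy, c("PCR_Percent", "Microscopy_Percent"))
colnames(toy)
```

The check catches a length mismatch early; the column order still needs to be confirmed by eye against the output of head().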

Visualization of PCR % against microscopy %

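# Microscopy % goes on the x-axis and PCR % on the y-axis, matching the axis labels
plot(malaria_data$Microscopy_Percent, malaria_data$PCR_Percent,
     xlab = "Microscopy %", ylab = "PCR %",
     main = "PCR vs Microscopy Prevalence",
     col = "blue", pch = 19)
# 1:1 reference line: points above it are surveys where PCR found more infections
abline(0, 1, lty = 2, col = "red")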

Prevalence Ratio

malaria_data$Prevalence_Ratio <- malaria_data$Microscopy_N_Positive / malaria_data$PCR_N_Positive
head(malaria_data)
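If a survey has zero PCR positives, the division above returns NaN or Inf, which can distort later summaries. A minimal sketch of one way to guard against this is shown on toy counts (made-up values, not taken from the dataset). The `Submicroscopic_Fraction` column is an illustrative addition, and it assumes every microscopy-positive sample is also PCR-positive.

```r
# Toy counts (made-up values for illustration, not taken from the dataset).
toy <- data.frame(
  Microscopy_N_Positive = c(10, 0, 5),
  PCR_N_Positive        = c(20, 0, 5)
)

# Set the ratio to NA where PCR found no positives, instead of NaN or Inf.
toy$Prevalence_Ratio <- ifelse(
  toy$PCR_N_Positive > 0,
  toy$Microscopy_N_Positive / toy$PCR_N_Positive,
  NA_real_
)

# Share of PCR-detected infections that microscopy missed
# (assumes microscopy positives are a subset of PCR positives).
toy$Submicroscopic_Fraction <- 1 - toy$Prevalence_Ratio
toy
```

A ratio near 1 means microscopy found almost everything PCR did. A ratio near 0 means most infections were submicroscopic.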

PCR% vs Microscopy% by Region

ggplot(malaria_data, aes(x = Microscopy_Percent, y = PCR_Percent, color = Region)) +
  geom_point() +
  geom_abline(intercept = 0, slope = 1, linetype = "dotted") +
  facet_wrap(~Region) +
  labs(title = "PCR% vs Microscopy% by Region",
       x = "Microscopy %", y = "PCR %")

Prevalence Ratio by Region

boxplot(Prevalence_Ratio ~ Region, data = malaria_data,
        main = "Prevalence Ratio by Region",
        xlab = "Global Region", ylab = "Prevalence Ratio",
        col = c("lightblue","lightgreen","lightpink","lightyellow"),
        las = 2, notch = TRUE)
abline(h = 1, col = "red", lty = 2) 

According to the boxplot above, West Africa has the highest median prevalence ratio. This suggests that microscopy detects a larger share of PCR-positive infections there than in the other regions.
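The ranking read off the boxplot can be cross-checked numerically by computing the median ratio per region. A minimal sketch using base R's aggregate() is shown on toy data; the values are made up for illustration and are not results from the dataset.

```r
# Toy data (made-up values for illustration, not taken from the dataset).
toy <- data.frame(
  Region = c("West Africa", "West Africa", "South America", "South America"),
  Prevalence_Ratio = c(0.8, 0.6, 0.2, 0.4)
)

# Median prevalence ratio per region; rows with NA ratios are dropped
# by the default na.action of aggregate()'s formula method.
region_medians <- aggregate(
  Prevalence_Ratio ~ Region,
  data = toy,
  FUN = median
)

# Sort so the region with the highest median comes first.
region_medians <- region_medians[order(-region_medians$Prevalence_Ratio), ]
region_medians
```

Running the same call with data = malaria_data would give the per-region medians behind the boxplot, assuming the Prevalence_Ratio column has been created as above.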

Prevalence Ratio by Region Using ggplot

ggplot(malaria_data, aes(x = Region, y = Prevalence_Ratio, fill = Region)) +
  geom_boxplot(alpha = 0.7) +
  labs(title = "Prevalence Ratio by Region",
       x = "Region", y = "Prevalence Ratio")

Interpretation of Results

The boxplot of prevalence ratios across global regions highlights notable differences in the burden of submicroscopic Plasmodium falciparum infections. South America exhibits the lowest median prevalence ratio, indicating that microscopy detects relatively few infections compared to PCR, and suggesting a high prevalence of submicroscopic infections.

In contrast, Asia & Oceania and East Africa show intermediate prevalence ratios, consistent with a moderate burden of submicroscopic infections. West Africa demonstrates the highest prevalence ratio, implying that microscopy performs comparatively well in this region and that submicroscopic infections are less common relative to other regions.

Recommendations

These findings underscore the critical role of molecular diagnostics, such as PCR, in accurately assessing the true malaria burden, particularly in regions like South America where submicroscopic infections are highly prevalent.