September 2024

Chronic Kidney Disease (CKD)

  • Defined by abnormalities in kidney structure or function that persist for more than three months

  • Common Symptoms:

    • Gross Hematuria
    • Foamy Urine
    • Nocturia
    • Flank Pain
    • Decreased Urine Output
  • Advanced Symptoms: Fatigue, Poor Appetite, Nausea, Vomiting, Weight Loss, Pruritus, and Dyspnea

(Chen, Knicely, & Grams, 2019)

CKD Diagnostic Criteria

  • Glomerular Filtration Rate: Less than 60 mL/min/1.73 m²
  • Albuminuria:
    • Urine albumin ≥30 mg per 24 hours
    • Urine albumin-to-creatinine ratio ≥30 mg/g
  • Kidney Damage Indicators:
    • Abnormalities in urine sediment, histology, or imaging
  • Other Considerations:
    • Renal tubular disorders
    • History of kidney transplantation

(Chen, Knicely, & Grams, 2019)

Stages

  • Glomerular Filtration Rate Categories:
    • G1 (GFR ≥90 mL/min/1.73 m²)
    • G2 (GFR 60–89 mL/min/1.73 m²)
    • G3a (GFR 45–59 mL/min/1.73 m²)
    • G3b (GFR 30–44 mL/min/1.73 m²)
    • G4 (GFR 15–29 mL/min/1.73 m²)
    • G5 (GFR <15 mL/min/1.73 m²)
  • Albumin-to-Creatinine Ratio Categories:
    • A1 (urine ACR <30 mg/g)
    • A2 (30–300 mg/g)
    • A3 (>300 mg/g)

(Chen, Knicely, & Grams, 2019)
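
The category cut-offs above translate directly into a lookup. The sketch below is illustrative only: the function names `gfr_category` and `acr_category` are hypothetical, and the input values are made up for demonstration rather than taken from the dataset.

```r
# Hypothetical helpers: map GFR (mL/min/1.73 m²) and urine ACR (mg/g)
# to the G and A categories listed above.
gfr_category <- function(gfr) {
  # Intervals are closed on the left: [0,15), [15,30), ..., [90, Inf)
  as.character(cut(
    gfr,
    breaks = c(0, 15, 30, 45, 60, 90, Inf),
    labels = c("G5", "G4", "G3b", "G3a", "G2", "G1"),
    right = FALSE
  ))
}

acr_category <- function(acr) {
  # A1: <30, A2: 30 to 300, A3: >300
  ifelse(acr < 30, "A1", ifelse(acr <= 300, "A2", "A3"))
}

# Made-up example values
gfr_category(c(95, 72, 50, 38, 20, 10))
acr_category(c(12, 30, 300, 450))
```

Using `cut()` with `right = FALSE` keeps each boundary value (for example, a GFR of exactly 60) in the higher-function category, which matches the ranges as written on the slide.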

Importance of Early Detection

  • Affects 8–16% of the population worldwide

  • 16th leading cause of years of life lost worldwide

  • CKD can progress silently, often leading to severe health consequences if not detected early

(Chen, Knicely, & Grams, 2019)

Research Question

  • Which biomarkers are the most reliable indicators for early detection of CKD?

Dataset

  • Title: Risk Factor Prediction of Chronic Kidney Disease

  • Source: UC Irvine Machine Learning Repository

  • Purpose: To analyze and predict the risk factors associated with CKD using machine learning algorithms

  • Characteristics:

    • Dataset Type: Multivariate
    • Instances: 200 patients
    • Variables: 28
    • Variable Types: Real-valued, categorical, and binary

(Islam et al., 2020)

Data Conversion

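# Convert a binned lab value (stored as text) to a single number.
convert <- function(value) {
  value <- as.character(value)  # strsplit() requires character input
  if (grepl("-", value, fixed = TRUE)) {
    # Range such as "a - b": use the midpoint of the two bounds
    nums <- as.numeric(unlist(strsplit(value, " - ", fixed = TRUE)))
    return(mean(nums, na.rm = TRUE))
  } else if (grepl("<", value, fixed = TRUE)) {
    # Open-ended bin such as "< a": no midpoint can be computed
    return(NA_real_)
  } else {
    # Plain number; non-numeric text becomes NA with a warning
    return(as.numeric(value))
  }
}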
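
To see what the conversion rules do in practice, here is a small self-contained sketch that applies the same three rules (range to midpoint, "<" bin to NA, plain number parsed as is) to a whole vector at once. The name `convert_vec` and the sample strings are assumptions made for illustration; the actual bin labels in the dataset may differ.

```r
# Hypothetical vectorised variant of the conversion rules above.
convert_vec <- function(values) {
  values <- trimws(as.character(values))
  out <- rep(NA_real_, length(values))

  is_range <- grepl("-", values, fixed = TRUE)
  is_upper <- grepl("<", values, fixed = TRUE)
  is_plain <- !is_range & !is_upper & !is.na(values)

  # Range "a - b": take the midpoint of the two bounds
  bounds <- strsplit(values[is_range], " - ", fixed = TRUE)
  out[is_range] <- vapply(
    bounds,
    function(b) mean(suppressWarnings(as.numeric(b)), na.rm = TRUE),
    numeric(1)
  )

  # Plain numbers are parsed directly; "<" entries and NAs stay NA
  out[is_plain] <- suppressWarnings(as.numeric(values[is_plain]))
  out
}

# Made-up sample strings
convert_vec(c("112 - 154", "< 112", "3.5", NA))
```

Note that replacing a bin with its midpoint discards the within-bin spread, so any averages or correlations computed afterwards are approximations of the underlying measurements.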

Investigating Average Lab Test Values for CKD vs Non-CKD

library(dplyr)
library(ggplot2)
library(tidyr)

ckd_data <- read.csv("C:/Users/Skj23/Documents/ckd-dataset-v2.csv")

ckd_data <- ckd_data %>%
  mutate(across(c(sc, bu, hemo, bgr, pcv, rbcc), ~ sapply(., convert)))
ckd_data <- ckd_data[ckd_data$class %in% c("ckd", "notckd"), ]
ckd_data_long <- ckd_data %>%
  pivot_longer(cols = c(sc, bu, hemo, bgr, pcv, rbcc), names_to = "lab", values_to = "value")
avg_lab_tests <- ckd_data_long %>%
  group_by(class, lab) %>%
  summarise(avg = mean(value, na.rm = TRUE), .groups = "drop")
avg_lab_tests_plot <- ggplot(avg_lab_tests, aes(x = lab, y = avg, color = class)) +
  geom_point(size = 3) +
  geom_line(aes(group = class)) +
  labs(x = "Lab Tests", y = "Average Value")

Average Lab Test Values for CKD vs Non-CKD

Investigating Glucose, Blood Urea Nitrogen, and Hematocrit as Risk Factors for CKD

library(plotly)

ckd_only_data <- ckd_data %>%
  filter(class == "ckd")

bgr_bu_pcv <- plot_ly(
  ckd_only_data,
  x = ~bgr, y = ~bu, z = ~pcv,
  color = ~class, colors = c("red", "grey"),
  type = "scatter3d", mode = "markers", marker = list(size = 5, opacity = 0.7)
) %>%
  layout(scene = list(
    xaxis = list(title = "Glucose"),
    yaxis = list(title = "Blood Urea Nitrogen"),
    zaxis = list(title = "Hematocrit")
  ))

Glucose, Blood Urea Nitrogen, and Hematocrit as Risk Factors for CKD

Investigating Correlation Among CKD Biomarkers

library(corrplot)

# Named cor_matrix to avoid masking base::matrix()
cor_matrix <- cor(
  ckd_only_data %>% select(sc, bu, hemo, bgr, pcv, rbcc),
  use = "pairwise.complete.obs"
)
matrix_plot <- corrplot(
  cor_matrix,
  method = "color",
  addCoef.col = "black",
  tl.col = "black",
  tl.srt = 45
)

Correlation Among CKD Biomarkers
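
The `use = "pairwise.complete.obs"` option matters here because the conversion step turns some entries into NA. The sketch below, built on a small made-up data frame (not the study data), contrasts it with `use = "complete.obs"`: the pairwise option computes each coefficient from every row where both variables are present, whereas the complete-case option drops any row with a missing value in any column.

```r
# Made-up values for illustration only
toy <- data.frame(
  sc   = c(1.0, 2.0, 3.0, 4.0, 5.0, 6.0),
  bu   = c(20, 41, 58, 83, NA, 118),
  hemo = c(14.8, NA, 12.9, 11.2, 10.1, 9.0)
)

# Each pair uses all rows where both columns are non-missing
pairwise <- cor(toy, use = "pairwise.complete.obs")

# Only rows with no missing value in any column are used
complete <- cor(toy, use = "complete.obs")

# Number of rows contributing to each pairwise coefficient
present <- ifelse(is.na(as.matrix(toy)), 0, 1)
n_used <- crossprod(present)

round(pairwise, 2)
round(complete, 2)
n_used
```

Because each pairwise coefficient can rest on a different subset of patients, it is worth checking the counts in `n_used` before comparing cells of the heat map against each other.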

Investigating BUN and Serum Creatinine Levels for CKD

bu_sc_plot <- ggplot(ckd_only_data, aes(x = sc, y = bu, color = class)) +
  geom_point(alpha = 0.7) +
  geom_smooth(method = "lm", se = FALSE) +
  labs(x = "Serum Creatinine (mg/dL)", y = "Blood Urea Nitrogen (mg/dL)")

BUN and Serum Creatinine Levels for CKD
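
`geom_smooth(method = "lm")` overlays an ordinary least-squares line. To read off the slope and the strength of the linear relationship rather than judging them by eye, the same fit can be run directly with `lm()` and `cor()`. The sketch below uses a small made-up data frame in place of `ckd_only_data`, so the numbers it produces say nothing about the study data.

```r
# Made-up values standing in for ckd_only_data (illustration only)
toy <- data.frame(
  sc = c(1.2, 2.0, 3.1, 4.5, 6.0, 7.4),  # serum creatinine, mg/dL
  bu = c(30, 48, 70, 95, 130, 150)       # blood urea nitrogen, mg/dL
)

# Same model form that geom_smooth(method = "lm") draws: bu ~ sc
fit <- lm(bu ~ sc, data = toy)

slope <- coef(fit)[["sc"]]       # change in bu per 1 mg/dL increase in sc
r     <- cor(toy$sc, toy$bu)     # Pearson correlation
r2    <- summary(fit)$r.squared  # equals r^2 for a single predictor

c(slope = slope, r = r, r_squared = r2)
```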

Results

  • On average, CKD patients had higher blood urea nitrogen and glucose levels than patients without CKD

  • The 3D scatter plot showed that higher blood urea nitrogen levels tend to accompany lower hematocrit

  • The heat map indicated a positive correlation between blood urea nitrogen and serum creatinine, and a slightly negative correlation between blood urea nitrogen and hematocrit

  • Blood urea nitrogen and serum creatinine levels were positively correlated among individuals diagnosed with CKD

Conclusion

  • Elevated levels of blood urea nitrogen and serum creatinine are significant biomarkers for CKD

  • Identifying risk factors early can lead to better management strategies, reducing the risk of severe health consequences associated with advanced CKD

  • Machine learning algorithms applied to CKD risk factor prediction highlight the importance of data analytics in understanding and managing chronic diseases

References

  • Chen, T. K., Knicely, D. H., & Grams, M. E. (2019). Chronic Kidney Disease Diagnosis and Management: A Review. JAMA, 322(13), 1294–1304. https://doi.org/10.1001/jama.2019.14745

  • Islam, M. A., Akter, S., Hossen, M. S., Keya, S. A., Tisha, S. A., & Hossain, S. (2020). Risk Factor Prediction of Chronic Kidney Disease based on Machine Learning Algorithms. 2020 3rd International Conference on Intelligent Sustainable Systems (ICISS), Thoothukudi, India, 952–957. https://doi.org/10.1109/ICISS49785.2020.9315878