Kidney Stone Treatment Assessment

A Comprehensive Analysis of Treatment Effectivness

An assessment of the effectiveness of treatments for kidney stones.

Author

Aouidane Imed Eddine

Published

October 17, 2024

Abstract

This report provides an in-depth analysis of various treatment methods for kidney stones, focusing on patient outcomes and success rates.

Keywords

Kidney Stones, Medical Assessment, Health

Introduction

In 1986, a group of urologists in London published a research paper in The British Medical Journal that compared the effectiveness of two different methods to remove kidney stones. Treatment A was open surgery (invasive), and treatment B was percutaneous nephrolithotomy (less invasive). When they looked at the results from 700 patients, treatment B had a higher success rate. However, when they only looked at the subgroup of patients different kidney stone sizes, treatment A had a better success rate. What is going on here? This known statistical phenomenon is called Simpson’s paradox. Simpson’s paradox occurs when trends appear in subgroups but disappear or reverse when subgroups are combined.

1 Load the necessary packages

Code

# Load necessary packages for data manipulation, modeling, and machine learning
library(tidyverse)    # For data wrangling and visualization
library(themis)       # For dealing with class imbalance using oversampling/undersampling
library(tidymodels)   # A framework for machine learning models in R
library(glmnet)       # For fitting regularized logistic regression models

2 Load the data

Code

data <- read_csv("D:\\R projects\\kidney stone project\\kidney_stone_data.csv")
head(data, 5)

# A tibble: 5 × 3
  treatment stone_size success
  <chr>     <chr>        <dbl>
1 B         large            1
2 A         large            1
3 A         large            0
4 A         large            1
5 A         large            1

3 Exploring the data

Description: This block helps us get an idea about the data

Code

glimpse(data)

Rows: 700
Columns: 3
$ treatment  <chr> "B", "A", "A", "A", "A", "B", "A", "B", "B", "A", "A", "B",…
$ stone_size <chr> "large", "large", "large", "large", "large", "large", "smal…
$ success    <dbl> 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1,…

4 Checking for simpson’s paradox

Description: This block checks for Simpson’s paradox by grouping data based on treatment and stone size. It calculates success rates and highlights potential issues caused by unequal distributions of stone sizes across treatments.

Code

data %>% 
  group_by(treatment, stone_size) %>% 
  summarise(count = n(), 
            success_count = sum(success), 
            mean_success = mean(success)) %>% 
  group_by(stone_size) %>% 
  mutate(stone_size_percentage = count / sum(count))

`summarise()` has grouped output by 'treatment'. You can override using the
`.groups` argument.

# A tibble: 4 × 6
# Groups:   stone_size [2]
  treatment stone_size count success_count mean_success stone_size_percentage
  <chr>     <chr>      <int>         <dbl>        <dbl>                 <dbl>
1 A         large        263           192        0.730                 0.767
2 A         small         87            81        0.931                 0.244
3 B         large         80            55        0.688                 0.233
4 B         small        270           234        0.867                 0.756

We can see that simpson paradox is present as the percentage of stone size (Large and small) is not distributed correctly through the treatments and therefore it cause a problem in our decision
- A Treatment received more more patients with a large stone size (263) than B Treatment (80), on the other hand B Treatment recieved more small size stones (270) than A Treatment (80)
- In other words B Treatment dealt with more easy cases (small kidney stones) while A Treatment dealt with the most harder cases (Large kidney stones)

5 Checking target variable class balance

Description: Here, the balance of the target variable (success) is checked. Imbalanced classes can impact model performance, so this is an essential step before building a model.

Code

data  %>% 
    group_by(success) %>% 
    summarise(count = n())

# A tibble: 2 × 2
  success count
    <dbl> <int>
1       0   138
2       1   562

Another problem is that the classes in our target variable are imbalanced, we can address this problem by oversampling the minority class in our case (success = 0).

6 Transforming the data for modeling

Description: This part converts important variables (stone_size, treatment, and success) into factors, which is necessary for logistic regression modeling.

Code

data <- data  %>% 
  mutate(stone_size = factor(stone_size, levels = c("large","small")),
           treatment = factor(treatment,levels = c("A","B")),
         success = factor(success, levels = c("1","0")))

7 Building the model

Description: This code defines a logistic regression model with regularization (penalty and mixture parameters), making the model ready for tuning.

Code

logistic_model <- logistic_reg(penalty = tune(),mixture = tune()) %>% 
  set_mode("classification") %>% 
  set_engine("glmnet")

8 Splitting the data

Description: The dataset is split into training and testing sets to allow model training on one portion and validation on the other. Stratification ensures that the target variable (success) is balanced in both sets.

Code

data_split <- initial_split(data,strata = success,prop = .8)
training_data <- data_split  %>% 
  training()
testing_data <- data_split  %>%
  testing()

9 Creating tuning grid

Description: This sets up a grid of hyperparameter values for tuning the logistic regression model, allowing the model to select the best combination of regularization parameters.

Code

tune_grid <- grid_regular(
  penalty(range = c(-3, 0)),   
  mixture(range = c(0, 1)),    
  levels = 10)

10 Creating cross validation folds on training set

Description: Cross-validation folds are created to evaluate model performance more robustly by splitting the training data into multiple subsets.

Code

# Creating cross validation folds on training set
data_folds <- vfold_cv(training_data, v = 5)

11 Data Preprocessing

Description: This defines a preprocessing pipeline, which includes interaction terms between treatment and stone size, creating dummy variables, and using SMOTE to handle class imbalance in the success variable.

Code

data_recipe <- recipe(success ~ treatment + stone_size, data = training_data) %>%
  step_interact(terms = ~ treatment:stone_size) %>%
  step_dummy(all_nominal(), -all_outcomes()) %>% 
  step_smote(success)

12 Setting the workflow

Description: This sets up a workflow that combines the preprocessing steps (recipe) and the model, making it easier to manage both in one unified pipeline.

Code

data_workflow <- workflow() %>% 
add_model(logistic_model) %>% 
add_recipe(data_recipe)

13 Fitting the model

Description: The model is fitted using the predefined tuning grid and cross-validation. It evaluates the model’s performance based on accuracy and ROC-AUC and returns the metrics.

Code

model_fit <- tune_grid(
  data_workflow,
  resamples = data_folds,
  grid = tune_grid,
  metrics = metric_set(accuracy, roc_auc),
  control = control_grid(save_pred = TRUE))
model_fit  %>% collect_metrics()

# A tibble: 200 × 8
   penalty mixture .metric  .estimator  mean     n std_err .config              
     <dbl>   <dbl> <chr>    <chr>      <dbl> <int>   <dbl> <chr>                
 1 0.001         0 accuracy binary     0.581     5  0.0141 Preprocessor1_Model0…
 2 0.001         0 roc_auc  binary     0.651     5  0.0194 Preprocessor1_Model0…
 3 0.00215       0 accuracy binary     0.581     5  0.0141 Preprocessor1_Model0…
 4 0.00215       0 roc_auc  binary     0.651     5  0.0194 Preprocessor1_Model0…
 5 0.00464       0 accuracy binary     0.581     5  0.0141 Preprocessor1_Model0…
 6 0.00464       0 roc_auc  binary     0.651     5  0.0194 Preprocessor1_Model0…
 7 0.01          0 accuracy binary     0.581     5  0.0141 Preprocessor1_Model0…
 8 0.01          0 roc_auc  binary     0.651     5  0.0194 Preprocessor1_Model0…
 9 0.0215        0 accuracy binary     0.581     5  0.0141 Preprocessor1_Model0…
10 0.0215        0 roc_auc  binary     0.651     5  0.0194 Preprocessor1_Model0…
# ℹ 190 more rows

14 Selecting the best model

Description: This code selects the best model based on ROC-AUC from the models evaluated during cross-validation.

Code

best_model <- model_fit  %>% 
    select_best(metric = "roc_auc")
best_model

# A tibble: 1 × 3
  penalty mixture .config               
    <dbl>   <dbl> <chr>                 
1   0.001       0 Preprocessor1_Model001

15 Finalizing the workflow

Description: The best model is finalized by integrating it into the workflow and is then trained on the entire training dataset. The model coefficients are extracted using the tidy function.

Code

final_workflow <- data_workflow  %>% 
    finalize_workflow(best_model)
# Fitting the final model 
final_model <- final_workflow  %>% 
    fit(data = training_data)
tidy(final_model)

# A tibble: 4 × 3
  term                         estimate penalty
  <chr>                           <dbl>   <dbl>
1 (Intercept)                     0.335   0.001
2 treatmentB_x_stone_sizesmall    0.323   0.001
3 treatment_B                     0.322   0.001
4 stone_size_small               -1.43    0.001

16 Results Interpretation

treatmentB_x_stone_sizesmall = -0.0587 :

*This indicates that the odds of treatment success are approximately 5.7% lower when using treatment B for patients with small stones compared to those receiving treatment A with large stones.
treatment_B = 0.1147:

*This means that the odds of treatment success increase by about 12.1% when patients receive treatment B instead of treatment A, holding stone size constant. # Final Results
Treatment B tends to improve the odds of success compared to treatment A, but the negative interaction with small stone size suggests that this improvement is diminished in patients with smaller stones.
Having a small stone significantly reduces the odds of success, which implies that stone size plays a critical role in treatment outcomes. # Answering questions :::{.tip-note} are smaller stones more likely to result in a successful treatment ? Answer: Yes The coefficient for stone_size_small is negative (-1.1242), suggesting that smaller stones decrease the odds of success compared to larger stones, after controlling for treatment effects. :::

Is Treatment A significantly more effective than Treatment B? Answer: “No” The coefficient for treatment_B is positive (0.1147), indicating that Treatment B has a slightly positive effect compared to Treatment A. Therefore, Treatment A is not significantly more effective than Treatment B.