A new study came out claiming stents increase mortality, and it set the Indian Twitterverse ablaze. You can read the study here.
The authors ought to be congratulated for carrying out a study examining an Indian stent database, considering the lack of a dedicated registry and appropriate use criteria here, as well as for open-sourcing the dataset.
While stents have limited benefit in chronic stable angina (modest freedom from angina in stable single-vessel disease, no mortality benefit), they have a definite mortality benefit in unstable patients: acute coronary syndromes (STEMI and high-risk NSTEMI) and two-vessel stable angina.
Given that 50% of the cohort had acute coronary syndrome, I was surprised at these findings.
Many others on Twitter and I were intrigued, thinking that a higher number of stents might indeed be a proxy for more severe disease: although the thrombogenic potential of stents is well known, 15-year [long-term RCT data of stents in stable angina](https://www.nejm.org/doi/full/10.1056/nejmoa1505532) has failed to show higher mortality (nor has it reduced mortality), while showing symptomatic benefit.
Also, there are clear-cut benefits of stents in acute MI.
I was also surprised that, in one of the figures, the total number of stents was claimed as a predictor of mortality even though it did not attain statistical significance in the multivariable model adjusting for confounders.
library(knitr)
# Figure from the paper: the model claimed to identify predictors of mortality
include_graphics('stenter.png')
I communicated with the authors, who were generous with their comments. They explained that the explanatory model was a secondary objective of the study (the primary being a comparison of DES and BMS), that they used a random forest model, which does not generate traditional effect sizes, and that on univariate analysis stent number was indeed associated with mortality (though it was not after adjusting for confounders).
However, I remain doubtful, because the explanatory figure indicates that a traditional statistical model was indeed used (chi-square and p-values), that it failed to show statistical significance for the number of stents in the adjusted multivariable model, and that the authors therefore fell back on the black-box random forest variable importance plot for their assertion.
The authors said they used a random forest model because of a) non-linear effects and b) class imbalance. However, in my opinion this is a weak justification: in medicine we need explanatory models as well, and imbalance and non-linearity can be handled by penalised logistic regression and higher-order splines for the explanatory variables, as demonstrated below.
While I was a big fan of AI/ML methods, and have used them to develop models for predicting mortality in hemorrhagic stroke, I have changed my mind on the subject: in the spirit of parsimony, I believe a simple logistic regression with appropriate modifications can do the job just as well, with the additional advantage of giving effect sizes.
You can read Frank Harrell's takedown of ML models here.
Also of note, in MI such databases often do not capture truly granular details like vitals well and often have coding errors; so, IMO, while relying on them for stent utilisation is reasonable, relying on them to predict outcomes is surely hit and miss.
Let's read the data:
library(tidyverse)

# Read the open-sourced dataset from the paper's supporting information
stents <- read_csv('https://doi.org/10.1371/journal.pone.0196830.s006')

# Recode death status and keep the variables used in the analysis
stents11 <- stents %>%
  mutate(status_death = ifelse(status_death == 0, "Alive", "Dead")) %>%
  select(CAD_History, status_death, Age_tmnt, Sex_tmnt, total_length,
         total_number_of_stents, Site_ID, year_ptca, smoke, Diabetes,
         Hypertension, Employed_status, min_min_width, Education_level) %>%
  drop_na()
We see mortality by number of stents:
knitr::kable(
  stents11 %>%
    mutate(status_death = as.factor(status_death),
           stents = as.factor(total_number_of_stents)) %>%
    group_by(status_death, stents) %>%
    summarise(n = n()) %>%
    ungroup() %>%
    group_by(stents) %>%
    mutate(per = paste0(round(n / sum(n) * 100, 2), "%")) %>%
    arrange(stents)
)
| status_death | stents | n | per |
|---|---|---|---|
| Alive | 1 | 2324 | 96.23% |
| Dead | 1 | 91 | 3.77% |
| Alive | 2 | 1439 | 94.73% |
| Dead | 2 | 80 | 5.27% |
| Alive | 3 | 298 | 93.42% |
| Dead | 3 | 21 | 6.58% |
| Alive | 4 | 45 | 91.84% |
| Dead | 4 | 4 | 8.16% |
| Alive | 5 | 5 | 83.33% |
| Dead | 5 | 1 | 16.67% |
So mortality does indeed rise with the number of stents placed, but as discussed above this may simply reflect sicker patients, and the true effect can be discerned only after adjusting for confounders.
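To put a rough number on that unadjusted trend, here is a minimal sketch (my own addition, not part of the paper's analysis) of a univariate logistic regression of death on stent count:

```r
# Unadjusted odds of death per additional stent (illustration only)
uni <- glm(I(status_death == "Dead") ~ total_number_of_stents,
           data = stents11, family = binomial)
exp(cbind(OR = coef(uni), confint.default(uni)))  # odds ratio with Wald 95% CI
```

Any such unadjusted odds ratio carries all the confounding discussed above, which is exactly why the adjusted models below matter.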
Let us analyse mortality by site:
stents11 %>%
  mutate(status_death = as.factor(status_death),
         site = as.factor(Site_ID)) %>%
  group_by(status_death, site) %>%
  count() %>%
  ungroup() %>%
  ggplot(aes(x = status_death, y = n, fill = site)) +
  geom_bar(position = "fill", stat = "identity") +
  coord_flip()
We see there is no difference in the proportions of alive and dead depending upon site.
Let us analyse mortality by CAD history:
knitr::kable(
  stents11 %>%
    mutate(status_death = as.factor(status_death),
           history = as.factor(CAD_History)) %>%
    group_by(status_death, history) %>%
    summarise(n = n()) %>%
    mutate(freq = n / sum(n))
)
| status_death | history | n | freq |
|---|---|---|---|
| Alive | No | 3517 | 0.8555096 |
| Alive | Yes | 594 | 0.1444904 |
| Dead | No | 182 | 0.9238579 |
| Dead | Yes | 15 | 0.0761421 |
We see a very odd finding: those with a history of CAD are less likely to die in this database.
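As a quick sanity check (my own addition), a chi-square test of this paradoxical association:

```r
# Cross-tabulate CAD history against death status and test for association
chisq.test(table(stents11$CAD_History, stents11$status_death))
```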
Let's plot it:
stents11 %>%
  mutate(status_death = as.factor(status_death),
         history = as.factor(CAD_History)) %>%
  group_by(status_death, history) %>%
  count() %>%
  ungroup() %>%
  ggplot(aes(x = status_death, y = n, fill = history)) +
  geom_bar(position = "fill", stat = "identity") +
  coord_flip()
Let's visualise mortality by year:
stents11 %>%
  mutate(status_death = as.factor(status_death)) %>%
  group_by(status_death, year_ptca) %>%
  count() %>%
  group_by(year_ptca) %>%
  mutate(freq = n / sum(n)) %>%
  arrange(year_ptca)
## # A tibble: 10 x 4
## # Groups: year_ptca [5]
## status_death year_ptca n freq
## <fct> <int> <int> <dbl>
## 1 Alive 2012 645 0.949
## 2 Dead 2012 35 0.0515
## 3 Alive 2013 1117 0.956
## 4 Dead 2013 52 0.0445
## 5 Alive 2014 607 0.951
## 6 Dead 2014 31 0.0486
## 7 Alive 2015 1009 0.957
## 8 Dead 2015 45 0.0427
## 9 Alive 2016 733 0.956
## 10 Dead 2016 34 0.0443
We see an average one-year mortality of 4.4-5% after PCI, while it is around 1.7-2% in the USA. But mortality depends on many factors, and in my opinion it cannot be attributed to PCI alone.
stents11 %>%
  mutate(status_death = as.factor(status_death)) %>%
  group_by(status_death, year_ptca) %>%
  count() %>%
  ungroup() %>%
  ggplot(aes(x = year_ptca, y = n, fill = status_death)) +
  geom_bar(position = "fill", stat = "identity") +
  coord_flip()
Let's plot utilisation:
knitr::include_graphics('stentra.png')
We see that while total stent utilisation has remained constant, DES have overtaken BMS due to a fall in cost.
OK, now let's fit and compare some models.
Let's assess the performance of a traditional logistic regression model with the explanatory variables used by the paper: CAD_History, Age_tmnt, Sex_tmnt, total_length, total_number_of_stents, Site_ID, year_ptca, smoke, Diabetes, Hypertension, Employed_status, min_min_width, and Education_level.
library(randomForest)
library(caret)

# 5-fold cross-validation, repeated 5 times
fitControl <- trainControl(method = "repeatedcv", number = 5, repeats = 5,
                           ## Estimate class probabilities
                           classProbs = TRUE,
                           ## Evaluate performance using ROC, sensitivity and specificity
                           summaryFunction = twoClassSummary)

modelLR <- train(status_death ~ .,    # death status as a function of all other variables
                 data = stents11,
                 method = "glm",      # logistic regression
                 family = "binomial",
                 metric = "ROC",
                 trControl = fitControl)
modelLR
## Generalized Linear Model
##
## 4308 samples
## 13 predictors
## 2 classes: 'Alive', 'Dead'
##
## No pre-processing
## Resampling: Cross-Validated (5 fold, repeated 5 times)
## Summary of sample sizes: 3447, 3446, 3447, 3446, 3446, 3446, ...
## Resampling results:
##
## ROC Sens Spec
## 0.6175299 1 0
We see the traditional logistic regression model has an AUROC of about 0.62 (0.5 is as good as tossing a coin), which is not good performance; we would expect a good model to have at least 0.8. Note also that Sens = 1 and Spec = 0: at the default 0.5 cutoff, the imbalanced data lead the model to classify everyone as alive.
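That degenerate Sens/Spec pair is a cutoff artefact, not a property of the model itself. A minimal sketch (my addition, using the pROC package, which is not otherwise used in this post) of picking a better cutoff from the ROC curve:

```r
library(pROC)

# Predicted probabilities of death on the training data (illustration only)
probs <- predict(modelLR, newdata = stents11, type = "prob")$Dead

# ROC curve and the Youden-optimal cutoff, which trades Sens against Spec
roc_obj <- roc(stents11$status_death, probs, levels = c("Alive", "Dead"))
coords(roc_obj, "best", best.method = "youden")
```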
Let's see a summary of the model:
summary(modelLR)
##
## Call:
## NULL
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -0.6473 -0.3432 -0.2782 -0.2223 2.8348
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 111.129747 119.377654 0.931 0.352
## CAD_HistoryYes -0.571956 0.275627 -2.075 0.038 *
## Age_tmnt 0.040256 0.008036 5.010 5.45e-07 ***
## Sex_tmntMale 0.074827 0.185619 0.403 0.687
## total_length 0.003753 0.005720 0.656 0.512
## total_number_of_stents 0.216882 0.160762 1.349 0.177
## `Site_IDOut of Mumbai` 0.003958 0.170019 0.023 0.981
## year_ptca -0.058032 0.059308 -0.978 0.328
## `smokeNon Smoker` -0.127865 0.215976 -0.592 0.554
## `smokePast Smoker` 0.032775 0.272804 0.120 0.904
## DiabetesYes 0.144985 0.157838 0.919 0.358
## HypertensionYes -0.318571 0.157391 -2.024 0.043 *
## Employed_statusYes -0.181647 0.179349 -1.013 0.311
## min_min_width 0.031934 0.184056 0.173 0.862
## Education_levelSome_Higher -0.048303 0.156358 -0.309 0.757
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1600.4 on 4307 degrees of freedom
## Residual deviance: 1539.2 on 4293 degrees of freedom
## AIC: 1569.2
##
## Number of Fisher Scoring iterations: 6
We see that while age, CAD history, and hypertension are related to mortality, the total number of stents is not.
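Since the fitted object is an ordinary glm, effect sizes are one line away. A sketch (my addition) of extracting odds ratios with Wald confidence intervals:

```r
# Odds ratios with Wald 95% CIs from the underlying glm fit
exp(cbind(OR = coef(modelLR$finalModel), confint.default(modelLR$finalModel)))
```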
Let's see if we can get a better model by allowing a non-linear effect of age, which is an important variable.
library(rms)

# Recode the outcome to 0/1 for lrm() and register the datadist for rms
stents12 <- stents11 %>% mutate(status_death = ifelse(status_death == "Dead", 1, 0))
ddd <- datadist(stents12)
options(datadist = "ddd")

# Restricted cubic splines: 4 knots on age, 3 knots on total stent length
f4 <- lrm(status_death ~ rcs(Age_tmnt, 4) + rcs(total_length, 3) +
            total_number_of_stents + CAD_History + smoke + Diabetes +
            Hypertension + Employed_status + min_min_width +
            Education_level + Site_ID,
          data = stents12)
print(anova(f4))
## Wald Statistics Response: status_death
##
## Factor Chi-Square d.f. P
## Age_tmnt 27.50 3 <.0001
## Nonlinear 1.36 2 0.5062
## total_length 2.09 2 0.3511
## Nonlinear 1.68 1 0.1944
## total_number_of_stents 1.66 1 0.1977
## CAD_History 4.58 1 0.0324
## smoke 0.77 2 0.6819
## Diabetes 0.80 1 0.3705
## Hypertension 4.28 1 0.0385
## Employed_status 0.96 1 0.3262
## min_min_width 0.05 1 0.8166
## Education_level 0.04 1 0.8366
## Site_ID 0.12 1 0.7271
## TOTAL NONLINEAR 2.99 3 0.3937
## TOTAL 59.78 15 <.0001
plot(anova(f4))
We see that age, CAD history, and hypertension remain the important variables after adding splines (the non-linear terms themselves are not significant).
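The spline fit can also be inspected directly; a sketch (my addition) of plotting the fitted effect of age on the log-odds of death using rms::Predict:

```r
# Partial effect of age (with spline terms) on the log-odds of death
plot(Predict(f4, Age_tmnt))
```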
Let's assess performance with bootstrap validation:
f5 <- update(f4, x = TRUE, y = TRUE)
## Bootstrap-validate with 200 resamples
kk_rcs <- validate(f5, B = 200, digits = 2)

## Convert Somers' Dxy from validate() output to AUC (AUC = Dxy/2 + 0.5)
CalculateAucFromDxy <- function(validate) {
  ## Test if the object is correct
  stopifnot(class(validate) == "validate")
  ## Calculate AUCs from Dxy's
  aucs <- (validate["Dxy", c("index.orig", "training", "test", "optimism", "index.corrected")]) / 2 + 0.5
  ## Get n
  n <- validate["Dxy", c("n")]
  ## Combine as result
  res <- rbind(validate, AUC = c(aucs, n))
  ## Fix optimism (the transformation adds a spurious 0.5)
  res["AUC", "optimism"] <- res["AUC", "optimism"] - 0.5
  ## Return results
  res
}
knitr::kable(CalculateAucFromDxy(kk_rcs))
| index.orig | training | test | optimism | index.corrected | n | |
|---|---|---|---|---|---|---|
| Dxy | 0.3121648 | 0.3451806 | 0.2828722 | 0.0623084 | 0.2498565 | 200 |
| R2 | 0.0468070 | 0.0578431 | 0.0380714 | 0.0197716 | 0.0270354 | 200 |
| Intercept | 0.0000000 | 0.0000000 | -0.5451435 | 0.5451435 | -0.5451435 | 200 |
| Slope | 1.0000000 | 1.0000000 | 0.8111273 | 0.1888727 | 0.8111273 | 200 |
| Emax | 0.0000000 | 0.0000000 | 0.1650013 | 0.1650013 | 0.1650013 | 200 |
| D | 0.0143981 | 0.0179074 | 0.0116519 | 0.0062555 | 0.0081425 | 200 |
| U | -0.0004643 | -0.0004643 | 0.0007468 | -0.0012110 | 0.0007468 | 200 |
| Q | 0.0148624 | 0.0183717 | 0.0109051 | 0.0074666 | 0.0073958 | 200 |
| B | 0.0429553 | 0.0428236 | 0.0431470 | -0.0003234 | 0.0432788 | 200 |
| g | 0.6796081 | 0.7708733 | 0.6169520 | 0.1539213 | 0.5256868 | 200 |
| gp | 0.0278185 | 0.0305516 | 0.0250552 | 0.0054964 | 0.0223221 | 200 |
| AUC | 0.6560824 | 0.6725903 | 0.6414361 | 0.0311542 | 0.6249282 | 200 |
We see the AUC rises slightly, to about 0.62 after optimism correction.
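Since f5 was refit with x = TRUE and y = TRUE, the same bootstrap machinery can also produce an overfitting-corrected calibration curve; a sketch (my addition):

```r
# Bootstrap overfitting-corrected calibration curve for the spline model
cal <- calibrate(f5, B = 200)
plot(cal)
```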
Let's add penalisation to account for the imbalance:
options(scipen = 999)
# Search penalties 0-30 for the best penalised fit, then re-validate
penalty <- pentrace(f5, penalty = 0:30)
mod1_pen <- update(f5, penalty = penalty$penalty)
kk1 <- validate(mod1_pen, B = 200, digits = 2)
knitr::kable(CalculateAucFromDxy(kk1))
| index.orig | training | test | optimism | index.corrected | n | |
|---|---|---|---|---|---|---|
| Dxy | 0.3110610 | 0.3415548 | 0.2930288 | 0.0485260 | 0.2625349 | 200 |
| R2 | 0.0394375 | 0.0465180 | 0.0391848 | 0.0073332 | 0.0321043 | 200 |
| Intercept | 0.0000000 | 0.0000000 | 0.0262773 | -0.0262773 | 0.0262773 | 200 |
| Slope | 1.0000000 | 1.0000000 | 1.0050002 | -0.0050002 | 1.0050002 | 200 |
| Emax | 0.0000000 | 0.0000000 | 0.0067797 | 0.0067797 | 0.0067797 | 200 |
| D | 0.0133709 | 0.0142400 | 0.0120014 | 0.0022386 | 0.0111322 | 200 |
| U | -0.0004643 | -0.0004643 | -0.0000470 | -0.0004173 | -0.0000470 | 200 |
| Q | 0.0138351 | 0.0147043 | 0.0120483 | 0.0026559 | 0.0111792 | 200 |
| B | 0.0429987 | 0.0425399 | 0.0430957 | -0.0005559 | 0.0435546 | 200 |
| g | 0.5426337 | 0.5815942 | 0.5774444 | 0.0041498 | 0.5384839 | 200 |
| gp | 0.0240655 | 0.0253774 | 0.0253127 | 0.0000647 | 0.0240007 | 200 |
| AUC | 0.6555305 | 0.6707774 | 0.6465144 | 0.0242630 | 0.6312675 | 200 |
We see the maximum corrected AUROC achieved is about 0.63.
Let's now see the performance of the superstar AI/ML algorithms. Random forest first:
modelRF <- train(status_death ~ .,   # death status as a function of all other variables
                 data = stents11,
                 method = "rf",      # random forest
                 metric = "ROC",
                 trControl = fitControl)
library(knitr)
include_graphics('rf.png')
We see the AUROC is 0.60, below that of logistic regression.
Let's look at the variable importance:
varImpPlot(modelRF$finalModel)
library(knitr)
include_graphics('rf1.png')
While we get variable importance, we do not get effect sizes or statistical significance.
Let's try naive Bayes now:
modelNB <- train(status_death ~ .,   # death status as a function of all other variables
                 data = stents11,
                 method = "nb",      # naive Bayes
                 metric = "ROC",
                 trControl = fitControl)
library(knitr)
include_graphics('nb.png')
Here also the AUROC is about the same, with similar important predictor variables.
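Since all three caret models were trained with the same trainControl, their cross-validated ROC values can be compared side by side. A sketch (my addition; for a strictly paired comparison the resampling indices should be fixed with set.seed or the index argument):

```r
# Collect cross-validated ROC from the three fits and compare them
results <- resamples(list(LR = modelLR, RF = modelRF, NB = modelNB))
summary(results)
bwplot(results, metric = "ROC")
```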
Thus logistic regression, when performed with suitable optimisation accounting for non-linear effects and with penalisation, outperforms random forest and naive Bayes, and also gives us effect sizes.
1. The model of reality (the cognitive model) must take priority over method and machine algorithms.
2. Logistic regression is fast and can perform as well as, or better than, ML algorithms in small datasets of up to 5,000-10,000 samples (as long as there are not many complex interactions).
3. Logistic regression is an explanatory model, so it gives us effect sizes and helps us avoid mistakes like asserting that stent number is associated with mortality based only on a variable importance plot.
4. EHR databases, while great for measuring utilisation and for audits (the purposes they were made for), are poor for outcome prediction, as they do not capture key clinical variables (e.g. BP, Killip class, pulse), and so should be taken with a pinch of salt.
5. While there have been genuine advances in deep learning, prediction algorithms are overhyped, and a good model with good data trumps everything.
6. We should welcome the call for audits and stenting registries in India, and the use of appropriate use criteria to benchmark hospitals.