EDA TAVI

Sample characteristics, split by setting. I split based on the ID (not sure this makes sense): A = PD, B = FIN, C = none.

                                    Stratified by centre
                                     Overall           a                 b                 c
  n                                   562               141                37               384
  PAPS_11nov (median [IQR])          36.0 [30.0, 42.0] 36.0 [30.0, 41.0] 36.0 [30.0, 41.0] 36.0 [30.0, 42.0]
  gender_1_men = 1 (%)                233 (41.5)         67 (47.5)         14 (37.8)        152 (39.6)
  ageatprocedure (median [IQR])      83.0 [80.0, 87.0] 83.0 [80.0, 86.0] 84.0 [82.0, 87.0] 83.0 [80.0, 87.0]
  NHYA_baseline (%)
     1                                 14 ( 2.5)         13 ( 9.2)          0 ( 0.0)          1 ( 0.3)
     2                                265 (47.2)         57 (40.4)         21 (56.8)        187 (48.7)
     3                                270 (48.0)         63 (44.7)         14 (37.8)        193 (50.3)
     4                                 13 ( 2.3)          8 ( 5.7)          2 ( 5.4)          3 ( 0.8)
  Test_cognitivo (%)
     0                                145 (25.8)         39 (27.7)         31 (83.8)         75 (19.5)
     1                                112 (19.9)         48 (34.0)          6 (16.2)         58 (15.1)
     2                                131 (23.3)         40 (28.4)          0 ( 0.0)         91 (23.7)
     3                                 62 (11.0)          5 ( 3.5)          0 ( 0.0)         57 (14.8)
     4                                 40 ( 7.1)          6 ( 4.3)          0 ( 0.0)         34 ( 8.9)
     5                                 39 ( 6.9)          2 ( 1.4)          0 ( 0.0)         37 ( 9.6)
     6                                 31 ( 5.5)          0 ( 0.0)          0 ( 0.0)         31 ( 8.1)
     7                                  2 ( 0.4)          1 ( 0.7)          0 ( 0.0)          1 ( 0.3)
  BADL (%)
     1                                 36 ( 6.4)          4 ( 2.8)          1 ( 2.9)         31 ( 8.1)
     2                                  5 ( 0.9)          3 ( 2.1)          0 ( 0.0)          2 ( 0.5)
     3                                 24 ( 4.3)          2 ( 1.4)          1 ( 2.9)         21 ( 5.5)
     4                                 68 (12.1)          6 ( 4.3)          2 ( 5.7)         60 (15.6)
     5                                144 (25.7)         33 (23.4)         15 (42.9)         96 (25.0)
     6                                283 (50.5)         93 (66.0)         16 (45.7)        174 (45.3)
  IADL (median [IQR])                 6.0 [5.0, 8.0]    7.0 [5.0, 8.0]     NA [NA, NA]      6.0 [5.0, 7.0]
  MNA_sh (median [IQR])              11.0 [10.0, 12.0] 11.0 [10.0, 12.0] 12.0 [11.0, 12.0] 11.0 [10.0, 13.0]
  LVEF (median [IQR])                56.0 [48.0, 62.0] 59.0 [54.0, 64.0] 60.0 [55.0, 62.0] 55.0 [46.0, 60.0]
  Grad_picco (median [IQR])          73.0 [58.7, 88.0] 76.0 [67.0, 87.0] 83.5 [70.0, 95.5] 71.0 [56.0, 88.0]
  Grad_medio (median [IQR])          45.0 [33.0, 59.0] 47.0 [39.0, 55.0] 52.0 [45.0, 62.0] 44.0 [29.0, 61.0]
  AVAplan (median [IQR])              0.5 [0.4, 0.6]    0.4 [0.4, 0.5]     NA [NA, NA]      0.5 [0.4, 0.6]
  Crea_pre_op_feb2024 (median [IQR])  0.9 [0.7, 1.2]    0.8 [0.7, 1.0]     NA [NA, NA]      0.9 [0.8, 1.3]
  CKDEPI_Syn (median [IQR])          65.5 [47.8, 79.8] 75.6 [60.6, 83.8] 75.0 [55.0, 88.0] 61.0 [43.2, 77.6]
  RHYTHM_1FA2PM3RS (%)
     1                                 74 (13.2)         34 (24.1)          4 (10.8)         36 ( 9.4)
     2                                 31 ( 5.5)         12 ( 8.5)          1 ( 2.7)         18 ( 4.7)
     3                                457 (81.3)         95 (67.4)         32 (86.5)        330 (85.9)
  RHYTHM_FA = 1 (%)                   105 (18.7)         46 (32.6)          5 (13.5)         54 (14.1)
  BAV_1 (%)
     0                                 56 (10.0)          0 ( 0.0)         30 (81.1)         26 ( 6.8)
     1                                477 (84.9)        126 (89.4)          7 (18.9)        344 (89.6)
     2                                 29 ( 5.2)         15 (10.6)          0 ( 0.0)         14 ( 3.6)
  LBBB_10 (%)
     0                                 60 (10.7)          0 ( 0.0)         33 (89.2)         27 ( 7.0)
     1                                488 (86.8)        127 (90.1)          4 (10.8)        357 (93.0)
     2                                 14 ( 2.5)         14 ( 9.9)          0 ( 0.0)          0 ( 0.0)
  Composite_rechek = 1 (%)             78 (13.9)         10 ( 7.1)          6 (16.2)         62 (16.1)
  Decesso_10 = 1 (%)                   75 (13.3)         10 ( 7.1)          6 (16.2)         59 (15.4)
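
The cells in the table above follow two patterns: "median [IQR]" for continuous variables and "n (%)" for categorical levels, each computed within a stratum. A minimal Python sketch of that logic, assuming missing values are coded as None and dropped before summarising; the helper names (median_iqr, count_pct, stratify) and the toy rows are hypothetical and are not taken from the original analysis.

```python
from collections import defaultdict
from statistics import quantiles


def median_iqr(values):
    """Format 'median [Q1, Q3]' over non-missing values (None = missing)."""
    clean = [v for v in values if v is not None]
    if len(clean) < 2:
        return "NA [NA, NA]"
    # method="inclusive" uses the same interpolation as R's default quantile (type 7)
    q1, med, q3 = quantiles(clean, n=4, method="inclusive")
    return f"{med:.1f} [{q1:.1f}, {q3:.1f}]"


def count_pct(values, level):
    """Format 'n (%)' for one level; the denominator excludes missing values."""
    clean = [v for v in values if v is not None]
    k = sum(1 for v in clean if v == level)
    return f"{k} ({100 * k / len(clean):.1f})"


def stratify(rows, by):
    """Group a list of dict rows by the value of column `by`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[by]].append(row)
    return dict(groups)


# Hypothetical toy rows for illustration only, not patient data
rows = [
    {"centre": "a", "LVEF": 59, "BADL": 6},
    {"centre": "a", "LVEF": 54, "BADL": 5},
    {"centre": "a", "LVEF": 64, "BADL": 6},
    {"centre": "b", "LVEF": 60, "BADL": None},
    {"centre": "b", "LVEF": 55, "BADL": 5},
    {"centre": "b", "LVEF": 62, "BADL": 6},
]
for centre, members in stratify(rows, "centre").items():
    lvef = median_iqr([r["LVEF"] for r in members])
    badl6 = count_pct([r["BADL"] for r in members], 6)
    print(centre, "LVEF", lvef, "| BADL = 6:", badl6)
```

The choice of denominator matters when reading the table: in group b the BADL counts sum to 35 rather than 37, so the percentages are taken over the 35 non-missing values (1/35 = 2.9%).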

I use A+B as the derivation cohort; C will be my validation cohort. On the derivation cohort I run a factor analysis and train four models: a penalized logistic regression (LASSO), an ensemble model (gradient boosting, akin to an optimized random forest), a support vector machine, and a logistic regression fitted according to my own intuition (myMod, "my model").
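
To make the LASSO step concrete, here is a minimal sketch of L1-penalized logistic regression fitted by proximal gradient descent (ISTA) in plain Python. It is not the implementation used for the results in these notes, which is not stated; the function names, learning rate, penalty values and toy data are all assumptions chosen for illustration. The point it shows is that the soft-thresholding step sets weak coefficients to exactly zero, which is what makes LASSO a variable selector.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Shrink toward zero; anything inside [-threshold, threshold] becomes 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(X, y, lam, lr=0.1, n_iter=2000):
    """Minimise mean log-loss + lam * sum(|w|); the intercept is not penalised."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        # Residuals: predicted probability minus observed outcome
        resid = [
            sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - yi
            for row, yi in zip(X, y)
        ]
        grad_b = sum(resid) / n
        grad_w = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        b -= lr * grad_b
        # Gradient step followed by the L1 proximal (soft-thresholding) step
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


# Hypothetical toy data: column 0 is informative, column 1 is noise
X = [[-2.0, 0.5], [-1.0, -0.5], [-0.5, 0.3], [0.5, -0.2],
     [1.0, 0.4], [2.0, -0.6], [-0.3, 0.1], [0.3, -0.1]]
y = [0, 0, 0, 1, 1, 1, 1, 0]

w_weak, _ = fit_l1_logistic(X, y, lam=0.01)   # light penalty
w_strong, _ = fit_l1_logistic(X, y, lam=5.0)  # heavy penalty
print("lam=0.01:", w_weak)
print("lam=5.0 :", w_strong)
```

With a penalty larger than any possible gradient, every coefficient stays at exactly zero; with a light penalty the informative column keeps a positive weight. In practice the penalty would be tuned by cross-validation on the derivation cohort only.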

I decide to retain 3 factors (elbow method on the plot below). I identify the variable loadings on the 3 factors (table below) and build the scores.


Loadings:
               ML1    ML2    ML3   
PAPS_11nov            -0.393       
gender_1_men           0.354  0.447
ageatprocedure                     
NHYA_baseline                      
BADL                   0.575       
MNA_sh                 0.645       
LVEF                         -0.560
CKDEPI_Syn                         
RHYTHM_FA                          
BAV_1           1.005              
LBBB_10         0.569         0.301

                 ML1   ML2   ML3
SS loadings    1.373 1.109 0.675
Proportion Var 0.125 0.101 0.061
Cumulative Var 0.125 0.226 0.287
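
One simple way to turn loadings into a per-patient score is a weighted sum of standardised variables. The sketch below does this for ML2 using the four loadings displayed above; whether the original scores were built this way or with regression-type factor scores is not stated, so treat the weighting scheme, the function names and the toy patients as assumptions. The last lines check that "Proportion Var" equals "SS loadings" divided by the 11 variables entered.

```python
from statistics import mean, stdev

# Loadings displayed for ML2 in the table above (smaller ones are not printed)
ML2_LOADINGS = {
    "PAPS_11nov": -0.393,
    "gender_1_men": 0.354,
    "BADL": 0.575,
    "MNA_sh": 0.645,
}


def standardize(column):
    """Convert a list of values to z-scores."""
    m, s = mean(column), stdev(column)
    return [(v - m) / s for v in column]


def factor_score(data, loadings):
    """Weighted sum of z-scored columns; `data` maps variable name -> values."""
    z = {name: standardize(data[name]) for name in loadings}
    n = len(next(iter(z.values())))
    return [sum(loadings[name] * z[name][i] for name in loadings) for i in range(n)]


# Hypothetical toy patients, not real data
toy = {
    "PAPS_11nov": [30, 36, 42, 50],
    "gender_1_men": [1, 0, 1, 0],
    "BADL": [6, 5, 4, 2],
    "MNA_sh": [13, 11, 10, 8],
}
scores = factor_score(toy, ML2_LOADINGS)
print([round(s, 2) for s in scores])

# Proportion Var = SS loadings / number of variables (11 in the loadings table)
ss_loadings = {"ML1": 1.373, "ML2": 1.109, "ML3": 0.675}
proportion_var = {k: round(v / 11, 3) for k, v in ss_loadings.items()}
print(proportion_var)
```

In the toy data the first patient (low PAPS, high BADL and MNA) gets the highest ML2 score and the last patient the lowest, which matches the signs of the loadings.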
Warning in ci.auc.roc(roc, ...): ci.auc() of a ROC curve with AUC == 1 is
always 1-1 and can be misleading.

model_name  aucTRAIN   X95CI_LB   X95CI_UB
XGB         1.0000000  1.0000000  1.0000000
SVM         0.9722222  0.9073854  1.0000000
LASSO       0.9043210  0.8333223  0.9620811
ML2         0.8417108  0.7407297  0.9224096
myMod       0.7843915  0.6551808  0.9023589
ML3         0.5515873  0.3776235  0.7096561
ML1         0.4129189  0.3116953  0.5092593
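
For reference, an AUC and a 95% interval of the kind reported above can be sketched in a few lines. The version below uses the Mann-Whitney definition of the AUC and a stratified percentile bootstrap; the interval method actually used for the table is not stated, so the bootstrap, the function names and the toy scores are assumptions. It also reproduces why the warning appears for XGB: when cases and controls are perfectly separated, every resample has AUC = 1 and the interval collapses to 1-1.

```python
import random


def auc(scores, labels):
    """Mann-Whitney AUC: P(case score > control score) + 0.5 * P(tie)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI, resampling cases and controls separately."""
    rng = random.Random(seed)
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    stats = []
    for _ in range(n_boot):
        bp = rng.choices(pos, k=len(pos))
        bn = rng.choices(neg, k=len(neg))
        stats.append(auc(bp + bn, [1] * len(bp) + [0] * len(bn)))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical toy predictions
labels = [0, 0, 0, 0, 1, 1, 1, 1]
imperfect = [0.1, 0.3, 0.6, 0.2, 0.4, 0.7, 0.8, 0.5]
perfect = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

print("imperfect:", auc(imperfect, labels), bootstrap_ci(imperfect, labels))
print("perfect  :", auc(perfect, labels), bootstrap_ci(perfect, labels))
```

A training AUC of exactly 1 with a degenerate interval says more about how closely the model fits the derivation cohort than about how well it will discriminate in new patients.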

I assess the discriminative ability of the various scores in the test set.

model_name  aucTEST    X95CI_LB   X95CI_UB
LASSO       0.8988179  0.8519297  0.9403013
ML2         0.8700661  0.8101564  0.9245173
XGB         0.8199760  0.7668766  0.8692772
SVM         0.8194751  0.7712282  0.8649081
myMod       0.7783510  0.7161384  0.8320032
ML3         0.6073432  0.5300967  0.6772315
ML1         0.4578241  0.4106291  0.5016035
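
Putting the two AUC tables side by side gives a rough idea of how much each model loses when moving from training to test. The sketch below only subtracts the point estimates reported above; the variable names are mine, and the gap is a descriptive quantity, not a formal test.

```python
# Point estimates copied from the aucTRAIN and aucTEST tables above
auc_train = {"XGB": 1.0000000, "SVM": 0.9722222, "LASSO": 0.9043210,
             "ML2": 0.8417108, "myMod": 0.7843915, "ML3": 0.5515873,
             "ML1": 0.4129189}
auc_test = {"LASSO": 0.8988179, "ML2": 0.8700661, "XGB": 0.8199760,
            "SVM": 0.8194751, "myMod": 0.7783510, "ML3": 0.6073432,
            "ML1": 0.4578241}

# Positive gap = the model discriminates worse in test than in training
gap = {name: auc_train[name] - auc_test[name] for name in auc_train}

for name in sorted(gap, key=gap.get, reverse=True):
    print(f"{name:<6} train={auc_train[name]:.3f} "
          f"test={auc_test[name]:.3f} gap={gap[name]:+.3f}")
```

XGB and SVM show the largest drops (about 0.18 and 0.15), while LASSO and myMod lose less than 0.01, which is consistent with the flexible models overfitting the small derivation cohort.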