Rapport de Stage - SESSTIM

Méthodes

Analyse de la survie nette

L’estimation de la survie nette se fait grâce à l’estimateur de Pohar-Perme, un estimateur consistant de la survie nette. La formule mathématique de calcul s’écrit :

\(\hat{\Lambda}_e(t) = \int_0^t \left( \frac{dN^w(u)}{Y^w(u)} \right) - \int_0^t \left( \frac{\sum_{i=1}^n Y_i^w(u) \lambda_{p,i}(u)}{Y^w(u)} \right) du\)

Où :

Dans notre étude, deux tables de mortalité (française et anglaise) seront utilisées.

Modèle de régression classique pour la mortalité en excès (Modèle d’Estève)

Le modèle additif de mortalité en excès est formulé ainsi :

\(\lambda_0(t|z) = \lambda_E(t|z) + \lambda_P(a + t | z_P)\)

\(a\) représente l’âge au moment du diagnostic, \(t\) le temps écoulé depuis ce diagnostic, et \(z\) un vecteur de covariables d’intérêt, incluant les variables populationnelles \(z_P\), telles que l’âge, le sexe et l’indice de déprivation. Dans ce modèle, \(λ_0\) représente la mortalité observée, \(λ_E\) la mortalité en excès (due au cancer) et \(λ_P\) la mortalité attendue, qui provient des tables de mortalité et n’est pas estimée directement à partir des données.

Estève et al. (1990) proposent :

\(\lambda_E(t|z) = \sum_{k=1}^K \exp(\tau_k) I_k(t) \exp(\beta^T z)\)

Où :

Modèle à risques populationnels proportionnels (Modèle de Touraine)

Le modèle Estève n’est pas adapté lorsque des covariables susceptibles d’influencer la mortalité en excès sont absentes de la table de mortalité de la population générale, ce qui induit des biais dans l’estimation des effets sur la mortalité en excès (Dickman et al., 1998 ; Grafféo et al., 2012). Pour répondre à ce problème, Touraine et al. (2020) ont proposé un modèle de régression multivarié à risques proportionnels pour la mortalité populationnelle, visant à estimer la mortalité en excès lorsque la mortalité attendue dans la population étudiée diffère de celle de la population générale. En plus des composantes définies dans le modèle Estève, ce modèle de régression s’écrit :

\(\lambda_0(t|z) = \lambda_E(t|z) + \sum_{r=1}^R \alpha_r I(x = r) \lambda_P(a + t | z_P)\)

\(\alpha_r\) ajuste la mortalité attendue pour la variable catégorielle \(x \notin z_P\).

Modèle à risques dépendant de l’âge (Modèle de Mba)

La principale limite du modèle Touraine et al. réside dans son hypothèse de constance du risque populationnel \(α_r\) dans le temps (ou en fonction de l’âge atteint), une hypothèse qui n’est pas toujours réaliste. Pour répondre à ce problème, Mba RD et al. (2021) propose un modèle qui se distingue par une plus grande souplesse, car il abandonne cette hypothèse par morceaux. Ce modèle proposé est une généralisation du modèle Touraine et al.. En plus des éléments définis précédemment, le modèle de régression à risques proportionnels dépendant du temps s’écrit comme suit :

\(\lambda_0(t|z) = \lambda_E(t|z) + \sum_{r=1}^R \sum_{s=1}^{S+1} \alpha_{rs} I_s(a + t | x = r) \lambda_P(a + t | z_P)\)

\(I_s(a + t) = 1\) si \(\epsilon_{s-1} \leq a + t < \epsilon_s\), 0 sinon.

Modèle à effets aléatoires de fragilité (Modèle de Rubio)

Ce modèle intègre un effet aléatoire de fragilité \(γ\) qui corrige le risque populationnel. Cependant cette hypothèse n’est pas toujours réaliste dans les études de population, à cause de la diversité des caractéristiques sociodémographiques ou populationnelles qui ne sont pas toujours disponibles dans les tables de mortalité. De ce fait, au lieu de supposer \(γ\) constant, il est pris comme une variable aléatoire positive continue qui suit une loi de Gamma. L’équation mathématique de ce modèle s’écrit :

\(\lambda_0(t|z) = \lambda_E(t|z) + \gamma \cdot \lambda_P(a + t, y + t | z_P)\)

\(\gamma \sim \text{Gamma}\) : variable aléatoire positive représentant la fragilité individuelle.

Modèle flexible B-spline redimensionné (Modèle de Goungounga)

Ce modèle, développé par Juste GOUNGOUNGA et al., est une extension du modèle paramétrique flexible de mortalité en excès proposé par Giorgi et al. . Dans le modèle de Giorgi et al., la mortalité en excès de base et les covariables dépendant du temps sont tous deux modélisés à l’aide de fonctions B-spline spécifiques. Plus précisément, pour un degré de flexibilité plus élevé, Giorgi et al. ont utilisé des B-splines quadratiques (ordre 3) et deux nœuds intérieurs. Pour simplifier, seule l’hypothèse d’effets proportionnels des covariables pronostiques est considérée. La version simplifiée du modèle de Giorgi et al. pour un patient \(i\) s’écrit comme suit :

\(\lambda_0(t_i|Z_i) = \left[ \sum_{j=-2}^2 \nu_j B_{j,3}(t_i) \right] \exp(\beta^T X_i) + \lambda_P(t_i|Z_i)\)

\(ν_j\) représente les coefficients des B-splines. \(X\) représente le vecteur des covariables ayant des effets proportionnels. Selon Cheuvart et Ryan, la mortalité due à d’autres causes peut être ajustée en appliquant un facteur d’échelle α au taux de mortalité de la population générale, tel qu’il est estimé à partir des tables de mortalité. Le nouveau modèle flexible, appelé modèle de B-spline redimensionné (“rescaled B-spline model” ou RBS), se formule ainsi :

\(\lambda_0(t_i|Z_i) = \left[ \sum_{j=-2}^2 \nu_j B_{j,3}(t_i) \right] \exp(\beta^T X_i) + \alpha \lambda_P(t_i|Z_i)\)

Les packages nécéssaires

# Chargement des packages nécéssaires
library(dplyr)
library(lubridate)
library(survival)
library(tidyverse)
library(relsurv)
library(gtsummary)
library(randomForest)
library(readxl)
library(flextable)
library(gt)
library(huxtable)
library(kableExtra)
library(survminer)
library(ggplot2)
library(tidyr)
library(arsenal)
library(coin)
library(cardx)
library(gridExtra)
library(cowplot)
library(officer)
library(xhaz)
library(mexhaz)
library(data.table)
library(tibble)
library(knitr)
library(stringr)
library(numDeriv)
library(compiler)

I. Données du Registre Bourguignon des Cancers Digestifs

I.1. DATA MANAGEMENT

# DATA MANAGEMENT 
# Chargement du jeu de données 
CCR.2171.RE <- read.csv("./CCR_2171_RE_2007_2011_pour_envoi.csv")

# Recodage et traitement des données 
CCR.2171.RENEW <- CCR.2171.RE %>%
    mutate(
      # Mettre au format date
        date_naiss = dmy(date_naiss),
        date_dg = dmy(date_dg),
        date_ddn = dmy(date_ddn),
        date_chir = dmy(date_chir),
        date_admc = dmy("31/08/2018"),
        date_ref = dmy("01/01/1960"),
      
      # Création de la variable annee_dgjr en jours 
        annee_dgjr = as.numeric(as.integer(difftime(date_dg, 
                      date_ref, units = "days"))),
      
      # Création de la variable sexe
        Sexe = factor(sexe, levels = c(1, 2), 
                      labels = c("homme", "femme")),
        
      # Création de la variable age_dg en année 
        age_dg = as.numeric(difftime(date_dg, 
                      date_naiss, units = "days"))/365.241,
      
      # Création de la variable Age_dg en jours
        Age_dg = as.integer(age_dg*365.241),
      
      # Création de la variable age_dgcl en classes
        age_dgcl = as.factor(ifelse(age_dg < 45, "0-44", 
                 ifelse(age_dg < 60 & age_dg >= 45, "45-59", 
                 ifelse(age_dg < 70 & age_dg >= 60, "60-69",
                 ifelse(age_dg < 85 & age_dg >= 70, "70-85", 
                 "85-105"))))),
      
      # Création de la variable time en années
        time = as.numeric(difftime(date_ddn, 
                    date_chir, units = "days")) / 365.241,
      
      # Création de la variable time_chir en annnées 
        time_chir = as.numeric(difftime(date_chir,
                    date_dg, units = "days")) / 365.241,
      
      # Création de la variable Time_chir en jours
        Time_chir = time_chir*365.241,
      
      # Création de la variable time en années
        time_admc = as.numeric(difftime(date_admc, 
                    date_chir, units = "days")) / 365.241,
      
      # Création de la variable Time en jours 
        Time = time*365.241,
      
      # Création de la variable charlsoncl en classes
        charlsoncl = as.factor(ifelse(charlson == 0, "0",
                   ifelse(charlson == 1, "1", 
                   ifelse(charlson == 2, "2",
                   ifelse(charlson == 3, "3","4+"))))),

      # Création de la variable Localisation en factor
        LOC = factor(loc, levels = c(4, 5, 6), 
                    labels = c("colon", "JRS", "rectum")),
      
      # Création de la variable Departement en factor
        Departement = factor(dept, levels = c(21, 71), 
                labels = c("Côte d'Or", "Saone et Loire")),
      
      # Recodage de la variable dept en factor
        dept = as.factor(dept),
      
      # Recodage de la variable sexe en factor 
        sexe = as.factor(sexe),
      
      # Création de la variable temps de censure time_cens
        time_cens = pmin(time, time_admc)*365.241,
      
      # Création de la variable temps de censure en années
        Time_cens = time_cens / 365.241,
      
      # Création de la variable statut de la censure à 10 ans 
        statut_dc_cens = as.integer(ifelse(time > time_admc, 
                         0, statut_dc)),
      
      # Création de la variable Annee_dg pour les années 
        Annee_dg = as.factor(year(date_dg)),
      
      # Création de la variable Statut en factor
        Statut = factor(statut_dc_cens, levels = c(0, 1), 
                       labels = c("vivant", "mort")),
      
      # Création de la variable Stade en factor 
        Stade = factor(stade, levels = c(1, 2, 3, 4), 
                       labels = c("I", "II", "III", "IV")),
      
      # Création de la variable localisation en factor
        Loc = as.factor(ifelse(loc == 4 | loc == 5,
                       "Colon","Rectum")),
      
      # Recodage de la variable obesite en factor 
        obesite = factor(obesite, levels = c(0, 1), 
                       labels = c("non", "oui")),
      
      # Recodage de la variable rt en factor
        rt = factor(rt, levels = c(0, 1), 
                    labels = c("non_faite", "faite")),
      
      # Recodage de la variable ct en factor
        ct = factor(ct, levels = c(0, 1), 
                    labels = c("non_faite", "faite")),
      
      # Recodage de la variable urgence en factor
        urgence = factor(urgence, levels = c(0, 1), 
                    labels=c("opération élective",
                             "opéré en urgence"))
      )

      # Affectation 
        Anneejrs <- 365.241

      # Suppression des individus ayant des âges < 45 et > 83 ans
        CCR.2171.RENEW_P <- CCR.2171.RENEW[
                    CCR.2171.RENEW$age_dg <= 83 
                    & CCR.2171.RENEW$age_dg >= 45, ]

      # Création de la variable age_dgccl
        CCR.2171.RENEW_P <- CCR.2171.RENEW_P %>%
          mutate(
          # Création de la variable âge en classes
          age_cl = as.factor(
                 ifelse(age_dg >= 45 & age_dg < 60,"45-59",
                 ifelse(age_dg >= 60 & age_dg < 70,
                 "60-69","70-83"))),

      # Création de la variable âge en classes
        age_cl2 = as.factor(
                ifelse(age_dg >= 45 & age_dg < 70,
                "45-69", "70-83")),

     # Création de la variable stade2 en factor 
       stade2 = factor(
               ifelse(Stade == "III" | Stade == "IV", 
               "III-IV", "I-II")),

     # Définition de la modalité de référence pour stade2
       stade2 = relevel(stade2, ref = "I-II"),
  
     # Création de la variable stade2 en factor 
      stade3 = factor(
             ifelse(Stade == "III" | Stade == "IV", 
             "III-IV", "I-II")),

     # Définition de la modalité de référence pour stade3
       stade3 = relevel(stade3, ref = "III-IV"),
  
     # Création de la variable Stade2 en factor 
       Stade2 = factor(
              ifelse(Stade == "III" | Stade == "IV", "III-IV",
              ifelse(Stade == "I", "I", "II"))),
  
     # Définition de la modalité de référence pour Stade2
       Stade2 = relevel(Stade2, ref = "III-IV"),
  
     # Création de la variable Annee_dg2
       Annee_dg2 = factor(
                 ifelse(Annee_dg == "2007" | Annee_dg == "2008",
                 "2007-2008", "2009-2011")),

     # Création de la variable charlsoncl2 en classes
       charlsoncl2 = as.factor(
                   ifelse(charlson == 0, "0",
                   ifelse(charlson == 1, "1", 
                   ifelse(charlson == 2, "2", ">2")))),

     # Définition de la modalité de référence pour Stade2
       charlsoncl2 = relevel(charlsoncl2, ref = "0"),
     
     # Création de la variable charlsoncl2 en classes
       charlsoncl3 = as.factor(
                   ifelse(charlson == 0, "0",
                   ifelse(charlson == 1 | charlson == 2, "1-2", 
                   ifelse(charlson == 3 | charlson == 4, "3-4", 
                   ">4")))),

     # Création de la variable temps de censure time_cens
       time_cens5 = pmin(time, 5)*365.241,
      
     # Création de la variable temps de censure en années
       Time_cens5 = time_cens5 / 365.241,
      
     # Création de la variable statut de la censure à 5 ans 
       statut_cens5 = ifelse(time > 5, 0, statut_dc),
     
     # Création de la variable statut de la censure avec label
      Statut_cens5 = factor(statut_cens5, levels = c(0, 1), 
                     labels = c("vivant", "mort"))
  )

# Traitement des valeurs manquantes 
CCR.2171.RENEW_P <- CCR.2171.RENEW_P %>%
  dplyr::select(Time_cens5, statut_cens5, 
        time_cens5, age_cl, sexe, dept,
        dept, Annee_dg, charlsoncl, Stade, date_dg, Loc, loc, 
        age_dg, Age_dg, annee_dgjr, Sexe, time, 
        time_chir, stade3, Time_chir, charlson, 
        Departement, Statut, stade,age_cl2, Statut_cens5,
        Stade2, stade2, Annee_dg2, charlsoncl2, charlsoncl3) %>% 
  na.omit()

# Chargement de la table de mortalité 
setwd("E:/PARCOURS M2 MQERS 2024_2025/MON STAGE/DATASTAGE")
load("mua7522_lisse_rt.RData")

## Création des deux bases : colon+JRS et rectum
CCR.2171.RENEW_P_colon <- filter(CCR.2171.RENEW_P, loc != 6)
CCR.2171.RENEW_P_rectum <- CCR.2171.RENEW_P[
                                CCR.2171.RENEW_P$loc == 6, ]

# Sélection des individus ayant des âges entre 80 et 85
CCR.2171.RENEW_C <- CCR.2171.RENEW %>%
  filter(age_dg <= 85 & age_dg >= 80) %>%
  mutate(
    # Définition de l'âge en classe
     age_group = ifelse(age_dg > 80 & age_dg <= 81, "80-81",
              ifelse(age_dg > 81 & age_dg <= 82, "81-82",
              ifelse(age_dg > 82 & age_dg <= 83, "82-83",
              ifelse(age_dg > 83 & age_dg <= 84, "83-84",
              "84-85")))))

# Importation du fichier ratetable 
lifetable <- read.csv(
       "./mortalite_2171_2007_2020_lisse_HCL.csv")

# Recodage des variables 
lifetable <- lifetable %>%
  mutate(
  # Recodage de la variable sexe en factor 
    sexe = factor(sexe, levels = c(1,2), 
                  labels= c("Homme","Femme")),
    Dept = factor(dept, levels = c(21, 71),
                  labels = c("CO", "SL"))
  )

I.2. Critère de sélection des individus d’étude

Conceptuellement, et pour des raisons mathématiques bien documentées, il est difficile d’estimer la survie nette dans une population très âgée. En effet, au-delà d’un certain âge, il devient peu pertinent d’estimer la survie dans un monde hypothétique où les sujets seraient encore en vie, alors qu’ils sont presque certainement décédés dans le monde réel (par exemple, la survie nette à 15 ans pour un sujet âgé de 95 ans). La première figure illustre les difficultés d’estimation de la survie nette chez les personnes âgées de plus de 85 ans. De plus, la deuxième figure en annexe suggère également des problèmes d’estimation avant 5 ans de suivi chez les sujets âgés de 83 à 85 ans. Par conséquent, les individus âgés de plus de 83 ans ont été exclus de l’analyse. D’autre part, les cancers colorectaux survenant avant l’âge de 45 ans présentent des caractéristiques cliniques particulières et ont donc également été exclus. Enfin, nous avons procédé à la suppression des observations comportant des valeurs manquantes. Le temps de suivi présente trois valeurs manquantes, qui ont conduit à l’exclusion de ces observations incomplètes. La variable localisation de la tumeur comporte quant à elle 11 valeurs manquantes, qui ont également été supprimées. Enfin, la variable stade TNM contient 8 valeurs inconnues, codées « 9 », qui ont été retirées de manière analogue afin d’assurer la qualité des analyses. Cette suppression a été rendue possible par des taux de non-réponse inférieurs à 5\(\%\) pour chacune de ces variables, ce qui limite l’impact de ces exclusions sur la représentativité de l’échantillon et garantit la validité des analyses statistiques. Après ces exclusions, la population d’étude se compose de 1868 individus.

# Estimation de la survie nette 
net_surv_age1 <- rs.surv(Surv(time_cens, statut_dc_cens) ~ age_dgcl,
              ratetable = mua7522.rt ,data = CCR.2171.RENEW,
              rmap = list(age = Age_dg,sex = sexe,dept=dept,
              year = annee_dgjr), method = "pohar-perme")

# Distribution de la survie nette
ggsurvplot(net_surv_age1, 
           data = CCR.2171.RENEW,
           xscale = Anneejrs,
           axes.offset= FALSE,
           xlim = c(-0.3*Anneejrs, 12.5*Anneejrs),
           conf.int = FALSE,
           size = 0.8,
           censor=FALSE,
           risk.table = "nrisk_cumcensor",
           palette = "jco",
           tables.height = 0.25,
           break.x.by = 2*Anneejrs,
           fontsize = 2.8,
           tables.y.text = FALSE,
           tables.theme = theme_cleantable(),
           xlab="Temps après chirurgie (années)",
           ylab="Probabilité de survie nette",
           risk.table.title="Sujets à risque (censures)",
           legend.title="Âge",
           legend.labs=c("0-45","85-105","70-85","60-70","45-60"),
           legend = c(0.13, 0.27), 
           font.title=c(15,"bold","black"),
           conf.int.style = "step"
)

# Estimation de la survie nette 
net_surv_age2 <- rs.surv(Surv(time_cens, statut_dc_cens)~ age_group,
              ratetable = mua7522.rt ,data = CCR.2171.RENEW_C,
              rmap = list(age = Age_dg,sex = sexe,dept=dept,
              year = annee_dgjr), method = "pohar-perme")

# Distribution de la survie nette
ggsurvplot(net_surv_age2, 
           data = CCR.2171.RENEW_C,
           xscale = Anneejrs,
           axes.offset= FALSE,
           xlim = c(-0.3*Anneejrs, 12.5*Anneejrs),
           conf.int = FALSE,
           size = 0.8,
           censor=FALSE,
           risk.table = "nrisk_cumcensor",
           palette = "jco",
           tables.height = 0.25,
           break.x.by = 2*Anneejrs,
           fontsize = 2.8,
           tables.y.text = FALSE,
           tables.theme = theme_cleantable(),
           xlab="Temps après chirurgie (années)",
           ylab="Probabilité de survie nette",
           risk.table.title="Sujets à risque (censures)",
           legend.title="Âge",
           legend.labs=c("80-81","84-85","83-84","82-83","81-82"),
           legend= c(0.13, 0.27), 
           font.title=c(15,"bold","black"),
           conf.int.style = "step"
)

I.3. Analyse descriptive du registre

I.3.1. Selon la localisation du cancer

# Contrôles pour la table
tb1_controls <- tableby.control(
  test = TRUE,
  cat.simplify = FALSE,
  digits = 1)

# Labels pour les statistiques
tb_stats_labels <- c(
  meansd = "Moyenne-ET",
  q1q3 = "Q1-Q3",
  median = "Médiane",
  range = "Min - Max"
)

# Traductions des labels pour les variables
mylabels <- list(
  age_dg = "Âge au diagnostic",
  time = "Délai entre dernier nouvel et chirurgie (An)",
  Time_chir = "Délai entre chirurgie et diagnostic (Jrs)",
  Stade = "Stade TNM",
  Sexe = "Sexe",
  obesite = "Obésité",
  charlsoncl = "Index de Charlson"
)

# Création de la table par localisation du cancer
table_three <- tableby(
  Loc ~ medtest(age_dg, "median", "q1q3") +
         medtest(time, "median", "q1q3") +
         medtest(Time_chir, "median", "q1q3") +
         Stade + Sexe + Departement + Annee_dg +
         charlson + charlsoncl + age_cl,
  data = CCR.2171.RENEW_P,
  test = TRUE,
  control = tb1_controls
)

# Résumé des statistiques descriptives
kable(summary(
  table_three,
  title = "Statistiques descriptives de la population selon la localisation du cancer",
  labelTranslations = mylabels,
  stats.labels = tb_stats_labels
  )
)
Colon (N=1446) Rectum (N=422) Total (N=1868) p value
Âge au diagnostic < 0.001
   Médiane 71.0 67.1 70.1
   Q1-Q3 62.3, 77.3 58.7, 74.3 61.2, 76.5
Délai entre dernier nouvel et chirurgie (An) 0.580
   Médiane 7.9 7.8 7.9
   Q1-Q3 5.6, 9.8 5.5, 9.5 5.5, 9.7
Délai entre chirurgie et diagnostic (Jrs) < 0.001
   Médiane 23.0 112.0 30.0
   Q1-Q3 11.0, 39.0 80.0, 134.0 14.0, 63.2
Stade TNM < 0.001
   I 345 (23.9%) 206 (48.8%) 551 (29.5%)
   II 611 (42.3%) 104 (24.6%) 715 (38.3%)
   III 431 (29.8%) 97 (23.0%) 528 (28.3%)
   IV 59 (4.1%) 15 (3.6%) 74 (4.0%)
Sexe 0.071
   homme 823 (56.9%) 261 (61.8%) 1084 (58.0%)
   femme 623 (43.1%) 161 (38.2%) 784 (42.0%)
Departement 0.487
   Côte d’Or 665 (46.0%) 186 (44.1%) 851 (45.6%)
   Saone et Loire 781 (54.0%) 236 (55.9%) 1017 (54.4%)
Annee_dg 0.787
   2007 296 (20.5%) 84 (19.9%) 380 (20.3%)
   2008 326 (22.5%) 92 (21.8%) 418 (22.4%)
   2009 252 (17.4%) 84 (19.9%) 336 (18.0%)
   2010 253 (17.5%) 76 (18.0%) 329 (17.6%)
   2011 319 (22.1%) 86 (20.4%) 405 (21.7%)
charlson 0.003
   Moyenne-ET 0.8 (1.2) 0.6 (1.1) 0.8 (1.2)
   Min - Max 0.0 - 8.0 0.0 - 7.0 0.0 - 8.0
Index de Charlson 0.006
   0 815 (56.4%) 271 (64.2%) 1086 (58.1%)
   1 303 (21.0%) 87 (20.6%) 390 (20.9%)
   2 183 (12.7%) 40 (9.5%) 223 (11.9%)
   3 84 (5.8%) 10 (2.4%) 94 (5.0%)
   4+ 61 (4.2%) 14 (3.3%) 75 (4.0%)
age_cl < 0.001
   45-59 302 (20.9%) 119 (28.2%) 421 (22.5%)
   60-69 381 (26.3%) 131 (31.0%) 512 (27.4%)
   70-83 763 (52.8%) 172 (40.8%) 935 (50.1%)

I.3.2. Selon le statut vital

# Affectation des resultats à l'object tab
tab <- CCR.2171.RENEW_P %>%
  dplyr::select(-c(Time_cens5, statut_cens5, loc, sexe, 
        dept, Age_dg, annee_dgjr, time_chir, date_dg, 
        stade, time_cens5)) %>% 
  tbl_summary(
    by = Statut,
    type = c(age_dg, time, Time_chir, charlson) ~ "continuous2",
    statistic = all_continuous2() ~ c("{mean} ({median}), {sd}({min}-{max})", "{p25}-{p75}"),
    percent = "row"
  ) %>%
  add_overall(last = TRUE) %>%
  bold_labels()

# Sélection uniquement des colonnes "mort" et "Total" 
tab %>%
  modify_table_styling(columns = "stat_1", hide = TRUE) %>%
  as_kable()
Characteristic mort N = 718 Overall N = 1,868
age_cl
45-59 95 (23%) 421 (100%)
60-69 151 (29%) 512 (100%)
70-83 472 (50%) 935 (100%)
Annee_dg
2007 179 (47%) 380 (100%)
2008 169 (40%) 418 (100%)
2009 126 (38%) 336 (100%)
2010 120 (36%) 329 (100%)
2011 124 (31%) 405 (100%)
charlsoncl
0 334 (31%) 1,086 (100%)
1 163 (42%) 390 (100%)
2 112 (50%) 223 (100%)
3 61 (65%) 94 (100%)
4+ 48 (64%) 75 (100%)
Stade
I 147 (27%) 551 (100%)
II 265 (37%) 715 (100%)
III 256 (48%) 528 (100%)
IV 50 (68%) 74 (100%)
Loc
Colon 559 (39%) 1,446 (100%)
Rectum 159 (38%) 422 (100%)
age_dg
Mean (Median), SD(Min-Max) 72 (74), 9(46-83) 68 (70), 10(45-83)
Q1-Q3 66-79 61-76
Sexe
homme 458 (42%) 1,084 (100%)
femme 260 (33%) 784 (100%)
time
Mean (Median), SD(Min-Max) 4.52 (4.25), 2.57(0.25-11.39) 7.36 (7.90), 3.02(0.25-13.55)
Q1-Q3 2.37-6.55 5.55-9.71
stade3
III-IV 306 (51%) 602 (100%)
I-II 412 (33%) 1,266 (100%)
Time_chir
Mean (Median), SD(Min-Max) 49 (28), 57(0-445) 49 (30), 52(0-445)
Q1-Q3 11-62 14-64
charlson
Mean (Median), SD(Min-Max) 1 (1), 1(0-7) 1 (0), 1(0-8)
Q1-Q3 0-2 0-1
Departement
Côte d’Or 321 (38%) 851 (100%)
Saone et Loire 397 (39%) 1,017 (100%)
age_cl2
45-69 246 (26%) 933 (100%)
70-83 472 (50%) 935 (100%)
Statut_cens5
vivant 299 (21%) 1,449 (100%)
mort 419 (100%) 419 (100%)
Stade2
III-IV 306 (51%) 602 (100%)
I 147 (27%) 551 (100%)
II 265 (37%) 715 (100%)
stade2
I-II 412 (33%) 1,266 (100%)
III-IV 306 (51%) 602 (100%)
Annee_dg2
2007-2008 348 (44%) 798 (100%)
2009-2011 370 (35%) 1,070 (100%)
charlsoncl2
0 334 (31%) 1,086 (100%)
>2 109 (64%) 169 (100%)
1 163 (42%) 390 (100%)
2 112 (50%) 223 (100%)
charlsoncl3
>4 22 (67%) 33 (100%)
0 334 (31%) 1,086 (100%)
1-2 275 (45%) 613 (100%)
3-4 87 (64%) 136 (100%)

I.3.3. Selon la localisation du cancer et du statut vital

# Affectation des resultats à l'object tab
tab <- CCR.2171.RENEW_P %>%
  dplyr::select(-c(Time_cens5, statut_cens5, loc, sexe, 
        dept, Age_dg, annee_dgjr, time_chir, date_dg, 
        stade, time_cens5)) %>% 
  tbl_strata(
    strata = Loc,  
    .tbl_fun = ~ .x %>%
      tbl_summary(
        by = Statut, 
        type = c(age_dg, time, Time_chir, charlson) ~ "continuous2",
        statistic = all_continuous2() ~ c("{mean} ({median}), {sd}({min}-{max})", "{p25}-{p75}"),
        percent = "row"
      ) %>%
      add_overall(last = TRUE)
  ) %>%
  bold_labels()

# Sélection uniquement des colonnes "mort" et "Total"
tab %>%
  modify_table_styling(columns = c("stat_1_1", "stat_1_2"), 
  hide = TRUE)
Characteristic
Colon
Rectum
mort
N = 559
1
Overall
N = 1,446
1
mort
N = 159
1
Overall
N = 422
1
age_cl



    45-59 63 (21%) 302 (100%) 32 (27%) 119 (100%)
    60-69 115 (30%) 381 (100%) 36 (27%) 131 (100%)
    70-83 381 (50%) 763 (100%) 91 (53%) 172 (100%)
Annee_dg



    2007 141 (48%) 296 (100%) 38 (45%) 84 (100%)
    2008 131 (40%) 326 (100%) 38 (41%) 92 (100%)
    2009 98 (39%) 252 (100%) 28 (33%) 84 (100%)
    2010 94 (37%) 253 (100%) 26 (34%) 76 (100%)
    2011 95 (30%) 319 (100%) 29 (34%) 86 (100%)
charlsoncl



    0 253 (31%) 815 (100%) 81 (30%) 271 (100%)
    1 122 (40%) 303 (100%) 41 (47%) 87 (100%)
    2 90 (49%) 183 (100%) 22 (55%) 40 (100%)
    3 56 (67%) 84 (100%) 5 (50%) 10 (100%)
    4+ 38 (62%) 61 (100%) 10 (71%) 14 (100%)
Stade



    I 96 (28%) 345 (100%) 51 (25%) 206 (100%)
    II 223 (36%) 611 (100%) 42 (40%) 104 (100%)
    III 199 (46%) 431 (100%) 57 (59%) 97 (100%)
    IV 41 (69%) 59 (100%) 9 (60%) 15 (100%)
age_dg



    Mean (Median), SD(Min-Max) 72 (75), 9(46-83) 69 (71), 10(45-83) 70 (71), 9(47-83) 66 (67), 9(46-83)
    Q1-Q3 67-79 62-77 62-78 59-74
Sexe



    homme 353 (43%) 823 (100%) 105 (40%) 261 (100%)
    femme 206 (33%) 623 (100%) 54 (34%) 161 (100%)
time



    Mean (Median), SD(Min-Max) 4.54 (4.26), 2.60(0.25-11.39) 7.38 (7.92), 3.04(0.25-13.55) 4.47 (4.08), 2.45(0.57-10.86) 7.28 (7.80), 2.94(0.57-12.57)
    Q1-Q3 2.37-6.59 5.57-9.80 2.49-6.35 5.51-9.50
stade3



    III-IV 240 (49%) 490 (100%) 66 (59%) 112 (100%)
    I-II 319 (33%) 956 (100%) 93 (30%) 310 (100%)
Time_chir



    Mean (Median), SD(Min-Max) 31 (21), 41(0-402) 31 (23), 35(0-402) 109 (112), 62(0-445) 110 (112), 55(0-445)
    Q1-Q3 8-38 11-39 68-140 80-134
charlson



    Mean (Median), SD(Min-Max) 1 (1), 1(0-7) 1 (0), 1(0-8) 1 (0), 1(0-7) 1 (0), 1(0-7)
    Q1-Q3 0-2 0-1 0-1 0-1
Departement



    Côte d'Or 251 (38%) 665 (100%) 70 (38%) 186 (100%)
    Saone et Loire 308 (39%) 781 (100%) 89 (38%) 236 (100%)
age_cl2



    45-69 178 (26%) 683 (100%) 68 (27%) 250 (100%)
    70-83 381 (50%) 763 (100%) 91 (53%) 172 (100%)
Statut_cens5



    vivant 234 (21%) 1,121 (100%) 65 (20%) 328 (100%)
    mort 325 (100%) 325 (100%) 94 (100%) 94 (100%)
Stade2



    III-IV 240 (49%) 490 (100%) 66 (59%) 112 (100%)
    I 96 (28%) 345 (100%) 51 (25%) 206 (100%)
    II 223 (36%) 611 (100%) 42 (40%) 104 (100%)
stade2



    I-II 319 (33%) 956 (100%) 93 (30%) 310 (100%)
    III-IV 240 (49%) 490 (100%) 66 (59%) 112 (100%)
Annee_dg2



    2007-2008 272 (44%) 622 (100%) 76 (43%) 176 (100%)
    2009-2011 287 (35%) 824 (100%) 83 (34%) 246 (100%)
charlsoncl2



    0 253 (31%) 815 (100%) 81 (30%) 271 (100%)
    >2 94 (65%) 145 (100%) 15 (63%) 24 (100%)
    1 122 (40%) 303 (100%) 41 (47%) 87 (100%)
    2 90 (49%) 183 (100%) 22 (55%) 40 (100%)
charlsoncl3



    >4 15 (65%) 23 (100%) 7 (70%) 10 (100%)
    0 253 (31%) 815 (100%) 81 (30%) 271 (100%)
    1-2 212 (44%) 486 (100%) 63 (50%) 127 (100%)
    3-4 79 (65%) 122 (100%) 8 (57%) 14 (100%)
1 n (%)

I.4. Graphiques des tables de mortalité

# Taux de mortalité par âge, année et département, 
# avec facettes par sexe
ggplot(data = lifetable) + 
  aes(agerev, mua, color = interaction(annee, Dept)) +
  geom_line() +
  facet_wrap(~ sexe) +
  xlab("Âge") + 
  ylab("Taux de mortalité")

# Taux de mortalité par âge, année et département, 
# avec ajout de points pour les âges entre 85 et 100
ggplot(data = lifetable) + 
  aes(agerev, mua, color = interaction(annee, Dept)) +
  geom_line() + 
  geom_point(aes(agerev,dc/parev),shape = 4,size=0.1) + 
  xlim(85, 100) +
  facet_wrap(~ sexe) +
  xlab("Âge") + 
  ylab("Taux de mortalité")

# Taux de mortalité par âge, année et sexe, 
# avec facettes par département
ggplot(data = lifetable) + 
  aes(agerev, mua, color = interaction(annee, sexe)) +
  geom_line() +
  facet_wrap(~ Dept) +
  xlab("Âge") + 
  ylab("Taux de mortalité")

I.5. Courbe de survie nette

I.5.1. Pour le groupe côlon + rectum

# Distribution de la survie globale et de la survie nette
# Kaplan Meier : Côlon + Rectum
KM <- survfit(Surv(time_cens5, statut_cens5) ~ 1,
                  data = CCR.2171.RENEW_P)

# Pohar Perme : Côlon + Rectum
net_surv <- rs.surv(Surv(time_cens5, statut_cens5) ~ 1,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Créer une liste contenant les deux ajustements
fit <- list("Survie brute" = KM, "Survie nette" = net_surv)

# Tracer les courbes de survie avec les intervalles
# de confiance en pointillés
ggsurvplot_combine(
  fit,
  data = CCR.2171.RENEW_P,
  conf.int = TRUE, 
  censor = FALSE, 
  palette = c("blue", "red"),
  xscale = Anneejrs,
  xlab = "Temps après chirurgie (années)", 
  ylab = "Probabilité de survie",
  legend.title = "Méthodes",
  legend.labs = c("Survie brute", "Survie nette")
)

#Estimation de la survie nette : Côlon + Rectum
# Selon la variable Stade
net_survst <- rs.surv(Surv(time_cens5, statut_cens5) ~ stade2,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

ggsurvplot(
  net_survst,
  data = CCR.2171.RENEW_P,
  xscale = Anneejrs,
  axes.offset = FALSE,
  xlim = c(-0.3*Anneejrs, 5.5 * Anneejrs),
  conf.int = TRUE,
  censor = FALSE,
  risk.table = "nrisk_cumcensor",
  palette = "jco",
  break.x.by = 2 * Anneejrs,
  tables.height = 0.25,
  fontsize = 2.8,
  tables.y.text = FALSE,
  tables.theme = theme_cleantable(),
  xlab = "Temps après chirurgie (années)",
  ylab = "Probabilité de survie",
  risk.table.title = "Sujets à risque (censures)",
  legend.title = "Stade",
  legend.labs = levels(CCR.2171.RENEW_P$stade2),
  legend = c(0.13, 0.27),
  font.title = c(15, "bold", "black")
)

# Estimation de la survie nette : Côlon + Rectum
# Selon la variable sexe
net_survse <- rs.surv(Surv(time_cens5, statut_cens5) ~ Sexe,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

ggsurvplot(
  net_survse,
  data = CCR.2171.RENEW_P,
  xscale = Anneejrs,
  axes.offset = FALSE,
  xlim = c(-0.3*Anneejrs, 5.5 * Anneejrs),
  conf.int = TRUE,
  censor = FALSE,
  risk.table = "nrisk_cumcensor",
  palette = "jco",
  break.x.by = 2 * Anneejrs,
  tables.height = 0.25,
  fontsize = 2.8,
  tables.y.text = FALSE,
  tables.theme = theme_cleantable(),
  xlab = "Temps après chirurgie (années)",
  ylab = "Probabilité de survie",
  risk.table.title = "Sujets à risque (censures)",
  legend.title = "Sexe",
  legend.labs = levels(CCR.2171.RENEW_P$Sexe),
  legend = c(0.13, 0.27),
  font.title = c(15, "bold", "black")
)

# Estimation de la survie nette : Côlon + Rectum
# Selon la variable âge
net_survag <- rs.surv(Surv(time_cens5, statut_cens5) ~ age_cl,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

ggsurvplot(
  net_survag,
  data = CCR.2171.RENEW_P,
  xscale = Anneejrs,
  axes.offset = FALSE,
  xlim = c(-0.3*Anneejrs, 5.5 * Anneejrs),
  conf.int = TRUE,
  censor = FALSE,
  risk.table = "nrisk_cumcensor",
  palette = "jco",
  break.x.by = 2 * Anneejrs,
  tables.height = 0.25,
  fontsize = 2.8,
  tables.y.text = FALSE,
  tables.theme = theme_cleantable(),
  xlab = "Temps après chirurgie (années)",
  ylab = "Probabilité de survie",
  risk.table.title = "Sujets à risque (censures)",
  legend.title = "Âge",
  legend.labs = levels(CCR.2171.RENEW_P$age_cl),
  legend = c(0.14, 0.3),
  font.title = c(15, "bold", "black")
)

# Estimation de la survie nette : Côlon + Rectum
# Selon la variable le score de Charlson (charlsoncl)
net_survch <- rs.surv(Surv(time_cens5, statut_cens5) ~ charlsoncl2,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

ggsurvplot(
  net_survch,
  data = CCR.2171.RENEW_P,
  xscale = Anneejrs,
  axes.offset = FALSE,
  xlim = c(-0.3*Anneejrs, 5.5 * Anneejrs),
  conf.int = TRUE,
  censor = FALSE,
  risk.table = "nrisk_cumcensor",
  palette = "jco",
  break.x.by = 2 * Anneejrs,
  tables.height = 0.25,
  fontsize = 2.8,
  tables.y.text = FALSE,
  tables.theme = theme_cleantable(),
  xlab = "Temps après chirurgie (années)",
  ylab = "Probabilité de survie",
  risk.table.title = "Sujets à risque (censures)",
  legend.title = "Charlson",
  legend.labs = levels(CCR.2171.RENEW_P$charlsoncl2),
  legend = c(0.13, 0.4),
  font.title = c(15, "bold", "black")
)

# Estimation de la survie nette : Côlon + Rectum
# Selon la variable Departement
net_survde <- rs.surv(Surv(time_cens5, statut_cens5) ~ Departement,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

ggsurvplot(
  net_survde,
  data = CCR.2171.RENEW_P,
  xscale = Anneejrs,
  axes.offset = FALSE,
  xlim = c(-0.3*Anneejrs, 5.5 * Anneejrs),
  conf.int = TRUE,
  censor = FALSE,
  risk.table = "nrisk_cumcensor",
  palette = "jco",
  break.x.by = 2 * Anneejrs,
  tables.height = 0.25,
  fontsize = 2.8,
  tables.y.text = FALSE,
  tables.theme = theme_cleantable(),
  xlab = "Temps après chirurgie (années)",
  ylab = "Probabilité de survie",
  risk.table.title = "Sujets à risque (censures)",
  legend.title = "Département",
  legend.labs = levels(CCR.2171.RENEW_P$Departement),
  legend = c(0.4, 0.27),
  font.title = c(15, "bold", "black")
)

# Estimation de la survie nette : Côlon + Rectum
# Selon la variable Departement
net_survlo <- rs.surv(Surv(time_cens5, statut_cens5) ~ Loc,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

ggsurvplot(
  net_survlo,
  data = CCR.2171.RENEW_P,
  xscale = Anneejrs,
  axes.offset = FALSE,
  xlim = c(-0.3*Anneejrs, 5.5 * Anneejrs),
  conf.int = TRUE,
  censor = FALSE,
  risk.table = "nrisk_cumcensor",
  palette = "jco",
  break.x.by = 2 * Anneejrs,
  tables.height = 0.25,
  fontsize = 2.8,
  tables.y.text = FALSE,
  tables.theme = theme_cleantable(),
  xlab = "Temps après chirurgie (années)",
  ylab = "Probabilité de survie",
  risk.table.title = "Sujets à risque (censures)",
  legend.title = "Localisation",
  legend.labs = levels(CCR.2171.RENEW_P$Loc),
  legend = c(0.4, 0.27),
  font.title = c(15, "bold", "black")
)

# Estimation de la survie nette : Côlon + Rectum
# Selon la variable Annee_dg
net_survan <- rs.surv(Surv(time_cens5, statut_cens5) ~ Annee_dg,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

ggsurvplot(
  net_survan,
  data = CCR.2171.RENEW_P,
  xscale = Anneejrs,
  axes.offset = FALSE,
  xlim = c(-0.3*Anneejrs, 5.5 * Anneejrs),
  conf.int = TRUE,
  censor = FALSE,
  risk.table = "nrisk_cumcensor",
  palette = "jco",
  break.x.by = 2 * Anneejrs,
  tables.height = 0.25,
  fontsize = 2.8,
  tables.y.text = FALSE,
  tables.theme = theme_cleantable(),
  xlab = "Temps après chirurgie (années)",
  ylab = "Probabilité de survie",
  risk.table.title = "Sujets à risque (censures)",
  legend.title = "Annee au diagnostic",
  legend.labs = levels(CCR.2171.RENEW_P$Annee_dg),
  legend = c(0.45, 0.27),
  font.title = c(15, "bold", "black")
)

I.5.2. Pour le groupe côlon

# Distribution de la survie globale et de la survie nette
# Kaplan Meier : Côlon
KMc <- survfit(Surv(time_cens5, statut_cens5) ~ 1,
                  data = CCR.2171.RENEW_P_colon)

# Pohar Perme : Côlon
net_survc <- rs.surv(Surv(time_cens5, statut_cens5) ~ 1,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_colon,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Créer une liste contenant les deux ajustements
fitc <- list("Survie brute" = KMc, "Survie nette" = net_survc)

# Tracer les courbes de survie avec les intervalles
# de confiance en pointillés
ggsurvplot_combine(
  fitc,
  data = CCR.2171.RENEW_P_colon,
  conf.int = TRUE, 
  censor = FALSE, 
  palette = c("blue", "red"),
  xscale = Anneejrs,
  xlab = "Temps après chirurgie (années)", 
  ylab = "Probabilité de survie",
  legend.title = "Méthodes",
  legend.labs = c("Survie brute", "Survie nette")
)

# Estimation de la survie nette : Côlon
# Selon la variable Stade
net_survstc <- rs.surv(Surv(time_cens5, statut_cens5) ~ stade2,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_colon,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

ggsurvplot(
  net_survstc,
  data = CCR.2171.RENEW_P_colon,
  xscale = Anneejrs,
  axes.offset = FALSE,
  xlim = c(-0.3*Anneejrs, 5.5 * Anneejrs),
  conf.int = TRUE,
  censor = FALSE,
  risk.table = "nrisk_cumcensor",
  palette = "jco",
  break.x.by = 2 * Anneejrs,
  tables.height = 0.25,
  fontsize = 2.8,
  tables.y.text = FALSE,
  tables.theme = theme_cleantable(),
  xlab = "Temps après chirurgie (années)",
  ylab = "Probabilité de survie",
  risk.table.title = "Sujets à risque (censures)",
  legend.title = "Stade",
  legend.labs = levels(CCR.2171.RENEW_P_colon$stade2),
  legend = c(0.13, 0.27),
  font.title = c(15, "bold", "black")
)

# Estimation de la survie nette : Côlon
# Selon la variable sexe
net_survsec <- rs.surv(Surv(time_cens5, statut_cens5) ~ Sexe,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_colon,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

ggsurvplot(
  net_survsec,
  data = CCR.2171.RENEW_P_colon,
  xscale = Anneejrs,
  axes.offset = FALSE,
  xlim = c(-0.3*Anneejrs, 5.5 * Anneejrs),
  conf.int = TRUE,
  censor = FALSE,
  risk.table = "nrisk_cumcensor",
  palette = "jco",
  break.x.by = 2 * Anneejrs,
  tables.height = 0.25,
  fontsize = 2.8,
  tables.y.text = FALSE,
  tables.theme = theme_cleantable(),
  xlab = "Temps après chirurgie (années)",
  ylab = "Probabilité de survie",
  risk.table.title = "Sujets à risque (censures)",
  legend.title = "Sexe",
  legend.labs = levels(CCR.2171.RENEW_P_colon$Sexe),
  legend = c(0.13, 0.27),
  font.title = c(15, "bold", "black")
)

# Estimation de la survie nette : Côlon
# Selon la variable âge
net_survagc <- rs.surv(Surv(time_cens5, statut_cens5) ~ age_cl,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_colon,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

ggsurvplot(
  net_survagc,
  data = CCR.2171.RENEW_P_colon,
  xscale = Anneejrs,
  axes.offset = FALSE,
  xlim = c(-0.3*Anneejrs, 5.5 * Anneejrs),
  conf.int = TRUE,
  censor = FALSE,
  risk.table = "nrisk_cumcensor",
  palette = "jco",
  break.x.by = 2 * Anneejrs,
  tables.height = 0.25,
  fontsize = 2.8,
  tables.y.text = FALSE,
  tables.theme = theme_cleantable(),
  xlab = "Temps après chirurgie (années)",
  ylab = "Probabilité de survie",
  risk.table.title = "Sujets à risque (censures)",
  legend.title = "Âge",
  legend.labs = levels(CCR.2171.RENEW_P_colon$age_cl),
  legend = c(0.14, 0.3),
  font.title = c(15, "bold", "black")
)

# Estimation de la survie nette : Côlon
# Selon la variable le score de Charlson (charlsoncl)
net_survchc <- rs.surv(Surv(time_cens5, statut_cens5) ~ charlsoncl2,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_colon,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

ggsurvplot(
  net_survchc,
  data = CCR.2171.RENEW_P_colon,
  xscale = Anneejrs,
  axes.offset = FALSE,
  xlim = c(-0.3*Anneejrs, 5.5 * Anneejrs),
  conf.int = TRUE,
  censor = FALSE,
  risk.table = "nrisk_cumcensor",
  palette = "jco",
  break.x.by = 2 * Anneejrs,
  tables.height = 0.25,
  fontsize = 2.8,
  tables.y.text = FALSE,
  tables.theme = theme_cleantable(),
  xlab = "Temps après chirurgie (années)",
  ylab = "Probabilité de survie",
  risk.table.title = "Sujets à risque (censures)",
  legend.title = "Charlson",
  legend.labs = levels(CCR.2171.RENEW_P_colon$charlsoncl2),
  legend = c(0.13, 0.4),
  font.title = c(15, "bold", "black")
)

# Estimation de la survie nette : Côlon 
# Selon la variable Departement
net_survdec <- rs.surv(Surv(time_cens5, statut_cens5) ~ Departement,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_colon,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

ggsurvplot(
  net_survdec,
  data = CCR.2171.RENEW_P_colon,
  xscale = Anneejrs,
  axes.offset = FALSE,
  xlim = c(-0.3*Anneejrs, 5.5 * Anneejrs),
  conf.int = TRUE,
  censor = FALSE,
  risk.table = "nrisk_cumcensor",
  palette = "jco",
  break.x.by = 2 * Anneejrs,
  tables.height = 0.25,
  fontsize = 2.8,
  tables.y.text = FALSE,
  tables.theme = theme_cleantable(),
  xlab = "Temps après chirurgie (années)",
  ylab = "Probabilité de survie",
  risk.table.title = "Sujets à risque (censures)",
  legend.title = "Département",
  legend.labs = levels(CCR.2171.RENEW_P_colon$Departement),
  legend = c(0.4, 0.27),
  font.title = c(15, "bold", "black")
)

# Estimation de la survie nette : Côlon
# Selon la variable Annee_dg
net_survanc <- rs.surv(Surv(time_cens5, statut_cens5) ~ Annee_dg,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_colon,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

ggsurvplot(
  net_survanc,
  data = CCR.2171.RENEW_P_colon,
  xscale = Anneejrs,
  axes.offset = FALSE,
  xlim = c(-0.3*Anneejrs, 5.5 * Anneejrs),
  conf.int = TRUE,
  censor = FALSE,
  risk.table = "nrisk_cumcensor",
  palette = "jco",
  break.x.by = 2 * Anneejrs,
  tables.height = 0.25,
  fontsize = 2.8,
  tables.y.text = FALSE,
  tables.theme = theme_cleantable(),
  xlab = "Temps après chirurgie (années)",
  ylab = "Probabilité de survie",
  risk.table.title = "Sujets à risque (censures)",
  legend.title = "Annee au diagnostic",
  legend.labs = levels(CCR.2171.RENEW_P_colon$Annee_dg),
  legend = c(0.45, 0.27),
  font.title = c(15, "bold", "black")
)

I.5.3. Pour le groupe rectum

# Distribution de la survie globale et de la survie nette
# Kaplan Meier : Rectum
KMr <- survfit(Surv(time_cens5, statut_cens5) ~ 1,
                  data = CCR.2171.RENEW_P_rectum)

# Pohar Perme : Rectum
net_survr <- rs.surv(Surv(time_cens5, statut_cens5) ~ 1,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_rectum,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Créer une liste contenant les deux ajustements
fitr <- list("Survie brute" = KMc, "Survie nette" = net_survc)

# Tracer les courbes de survie avec les intervalles
# de confiance en pointillés
ggsurvplot_combine(
  fitr,
  data = CCR.2171.RENEW_P_rectum,
  conf.int = TRUE, 
  censor = FALSE, 
  palette = c("blue", "red"),
  xscale = Anneejrs,
  xlab = "Temps après chirurgie (années)", 
  ylab = "Probabilité de survie",
  legend.title = "Méthodes",
  legend.labs = c("Survie brute", "Survie nette")
)

# Estimation de la survie nette : Rectum
# Selon la variable Stade
net_survstr <- rs.surv(Surv(time_cens5, statut_cens5) ~ stade2,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_rectum,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

ggsurvplot(
  net_survstr,
  data = CCR.2171.RENEW_P_rectum,
  xscale = Anneejrs,
  axes.offset = FALSE,
  xlim = c(-0.3*Anneejrs, 5.5 * Anneejrs),
  conf.int = TRUE,
  censor = FALSE,
  risk.table = "nrisk_cumcensor",
  palette = "jco",
  break.x.by = 2 * Anneejrs,
  tables.height = 0.25,
  fontsize = 2.8,
  tables.y.text = FALSE,
  tables.theme = theme_cleantable(),
  xlab = "Temps après chirurgie (années)",
  ylab = "Probabilité de survie",
  risk.table.title = "Sujets à risque (censures)",
  legend.title = "Stade",
  legend.labs = levels(CCR.2171.RENEW_P_rectum$stade2),
  legend = c(0.13, 0.27),
  font.title = c(15, "bold", "black")
)

# Estimation de la survie nette : Rectum
# Selon la variable sexe
net_survser <- rs.surv(Surv(time_cens5, statut_cens5) ~ Sexe,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_rectum,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

ggsurvplot(
  net_survsec,
  data = CCR.2171.RENEW_P_rectum,
  xscale = Anneejrs,
  axes.offset = FALSE,
  xlim = c(-0.3*Anneejrs, 5.5 * Anneejrs),
  conf.int = TRUE,
  censor = FALSE,
  risk.table = "nrisk_cumcensor",
  palette = "jco",
  break.x.by = 2 * Anneejrs,
  tables.height = 0.25,
  fontsize = 2.8,
  tables.y.text = FALSE,
  tables.theme = theme_cleantable(),
  xlab = "Temps après chirurgie (années)",
  ylab = "Probabilité de survie",
  risk.table.title = "Sujets à risque (censures)",
  legend.title = "Sexe",
  legend.labs = levels(CCR.2171.RENEW_P_rectum$Sexe),
  legend = c(0.13, 0.27),
  font.title = c(15, "bold", "black")
)

# Estimation de la survie nette : Rectum
# Selon la variable âge
net_survagr <- rs.surv(Surv(time_cens5, statut_cens5) ~ age_cl,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_rectum,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

ggsurvplot(
  net_survagr,
  data = CCR.2171.RENEW_P_rectum,
  xscale = Anneejrs,
  axes.offset = FALSE,
  xlim = c(-0.3*Anneejrs, 5.5 * Anneejrs),
  conf.int = TRUE,
  censor = FALSE,
  risk.table = "nrisk_cumcensor",
  palette = "jco",
  break.x.by = 2 * Anneejrs,
  tables.height = 0.25,
  fontsize = 2.8,
  tables.y.text = FALSE,
  tables.theme = theme_cleantable(),
  xlab = "Temps après chirurgie (années)",
  ylab = "Probabilité de survie",
  risk.table.title = "Sujets à risque (censures)",
  legend.title = "Âge",
  legend.labs = levels(CCR.2171.RENEW_P_rectum$age_cl),
  legend = c(0.14, 0.3),
  font.title = c(15, "bold", "black")
)

# Estimation de la survie nette : Rectum
# Selon la variable le score de Charlson (charlsoncl)
net_survchr <- rs.surv(Surv(time_cens5, statut_cens5) ~ charlsoncl2,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_rectum,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

ggsurvplot(
  net_survchr,
  data = CCR.2171.RENEW_P_rectum,
  xscale = Anneejrs,
  axes.offset = FALSE,
  xlim = c(-0.3*Anneejrs, 5.5 * Anneejrs),
  conf.int = TRUE,
  censor = FALSE,
  risk.table = "nrisk_cumcensor",
  palette = "jco",
  break.x.by = 2 * Anneejrs,
  tables.height = 0.25,
  fontsize = 2.8,
  tables.y.text = FALSE,
  tables.theme = theme_cleantable(),
  xlab = "Temps après chirurgie (années)",
  ylab = "Probabilité de survie",
  risk.table.title = "Sujets à risque (censures)",
  legend.title = "Charlson",
  legend.labs = levels(CCR.2171.RENEW_P_rectum$charlsoncl2),
  legend = c(0.13, 0.4),
  font.title = c(15, "bold", "black")
)

# Estimation de la survie nette : Rectum
# Selon la variable Departement
net_survder <- rs.surv(Surv(time_cens5, statut_cens5) ~ Departement,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_rectum,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

ggsurvplot(
  net_survder,
  data = CCR.2171.RENEW_P_rectum,
  xscale = Anneejrs,
  axes.offset = FALSE,
  xlim = c(-0.3*Anneejrs, 5.5 * Anneejrs),
  conf.int = TRUE,
  censor = FALSE,
  risk.table = "nrisk_cumcensor",
  palette = "jco",
  break.x.by = 2 * Anneejrs,
  tables.height = 0.25,
  fontsize = 2.8,
  tables.y.text = FALSE,
  tables.theme = theme_cleantable(),
  xlab = "Temps après chirurgie (années)",
  ylab = "Probabilité de survie",
  risk.table.title = "Sujets à risque (censures)",
  legend.title = "Département",
  legend.labs = levels(CCR.2171.RENEW_P_rectum$Departement),
  legend = c(0.4, 0.27),
  font.title = c(15, "bold", "black")
)

# Estimation de la survie nette : Rectum
# Selon la variable Annee_dg
net_survanr <- rs.surv(Surv(time_cens5, statut_cens5) ~ Annee_dg,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_rectum,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

ggsurvplot(
  net_survanr,
  data = CCR.2171.RENEW_P_rectum,
  xscale = Anneejrs,
  axes.offset = FALSE,
  xlim = c(-0.3*Anneejrs, 5.5 * Anneejrs),
  conf.int = TRUE,
  censor = FALSE,
  risk.table = "nrisk_cumcensor",
  palette = "jco",
  break.x.by = 2 * Anneejrs,
  tables.height = 0.25,
  fontsize = 2.8,
  tables.y.text = FALSE,
  tables.theme = theme_cleantable(),
  xlab = "Temps après chirurgie (années)",
  ylab = "Probabilité de survie",
  risk.table.title = "Sujets à risque (censures)",
  legend.title = "Annee au diagnostic",
  legend.labs = levels(CCR.2171.RENEW_P_rectum$Annee_dg),
  legend = c(0.45, 0.27),
  font.title = c(15, "bold", "black")
)

I.6. Estimations de probabilité de survie nette

I.6.1. Pour le groupe côlon + rectum

# Estimation de la survie nette : Côlon & Rectum
# Selon la variable localisation 
net_survlop <- rs.surv(Surv(time_cens5, statut_cens5) ~ Loc,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Affichage des résultats
summary(net_survlop, times = c(1, 2, 5)*365.241)
## Call: rs.surv(formula = Surv(time_cens5, statut_cens5) ~ Loc, data = CCR.2171.RENEW_P, 
##     ratetable = mua7522.rt, method = "pohar-perme", rmap = list(age = Age_dg, 
##         sex = sexe, dept = dept, year = annee_dgjr))
## 
##                 Loc=Colon 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   1407      39    0.993 0.00437        0.984        1.002
##   730   1329      77    0.959 0.00757        0.945        0.974
##  1826   1118     209    0.872 0.01273        0.847        0.897
## 
##                 Loc=Rectum 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    412      10    0.993  0.0076        0.979        1.008
##   730    390      22    0.957  0.0135        0.931        0.984
##  1826    325      62    0.859  0.0228        0.816        0.905
# Estimation de la survie nette : Côlon & Rectum
# Selon la variable Sexe
net_survsep <- rs.surv(Surv(time_cens5, statut_cens5) ~ Sexe,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Affichage des résultats
summary(net_survsep, times = c(1, 2, 5)*365.241)
## Call: rs.surv(formula = Surv(time_cens5, statut_cens5) ~ Sexe, data = CCR.2171.RENEW_P, 
##     ratetable = mua7522.rt, method = "pohar-perme", rmap = list(age = Age_dg, 
##         sex = sexe, dept = dept, year = annee_dgjr))
## 
##                 Sexe=homme 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   1049      35    0.992 0.00554        0.981        1.003
##   730    986      62    0.956 0.00926        0.938        0.974
##  1826    811     174    0.859 0.01551        0.829        0.890
## 
##                 Sexe=femme 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    770      14    0.995 0.00481        0.986        1.005
##   730    733      37    0.962 0.00912        0.945        0.980
##  1826    632      97    0.884 0.01548        0.854        0.914
# Estimation de la survie nette : Côlon & Rectum
# Selon la variable Stade
net_survstp <- rs.surv(Surv(time_cens5, statut_cens5) ~ stade2,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Affichage des résultats
summary(net_survstp, times = c(1, 2, 5)*365.241)
## Call: rs.surv(formula = Surv(time_cens5, statut_cens5) ~ stade2, data = CCR.2171.RENEW_P, 
##     ratetable = mua7522.rt, method = "pohar-perme", rmap = list(age = Age_dg, 
##         sex = sexe, dept = dept, year = annee_dgjr))
## 
##                 stade2=I-II 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   1243      23    1.002 0.00387        0.994        1.009
##   730   1190      52    0.980 0.00704        0.967        0.994
##  1826   1044     143    0.929 0.01236        0.905        0.953
## 
##                 stade2=III-IV 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    576      26    0.976 0.00847        0.959        0.992
##   730    529      47    0.914 0.01398        0.887        0.941
##  1826    399     128    0.744 0.02182        0.702        0.788
# Estimation de la survie nette : Côlon & Rectum
# Selon la variable Departement
net_survdep <- rs.surv(Surv(time_cens5, statut_cens5) ~ Departement,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Affichage des résultats
summary(net_survdep, times = c(1, 2, 5)*365.241)
## Call: rs.surv(formula = Surv(time_cens5, statut_cens5) ~ Departement, 
##     data = CCR.2171.RENEW_P, ratetable = mua7522.rt, method = "pohar-perme", 
##     rmap = list(age = Age_dg, sex = sexe, dept = dept, year = annee_dgjr))
## 
##                 Departement=Côte d'Or 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    830      21    0.994 0.00546        0.983        1.004
##   730    785      45    0.957 0.00967        0.939        0.977
##  1826    652     131    0.852 0.01653        0.821        0.885
## 
##                 Departement=Saone et Loire 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    989      28    0.992 0.00527        0.982        1.003
##   730    934      54    0.960 0.00903        0.943        0.978
##  1826    791     140    0.883 0.01502        0.854        0.913
# Estimation de la survie nette : Côlon & Rectum
# Selon la variable age_cl
net_survagp <- rs.surv(Surv(time_cens5, statut_cens5) ~ age_cl,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Affichage des résultats
summary(net_survagp, times = c(1, 2, 5)*365.241)
## Call: rs.surv(formula = Surv(time_cens5, statut_cens5) ~ age_cl, data = CCR.2171.RENEW_P, 
##     ratetable = mua7522.rt, method = "pohar-perme", rmap = list(age = Age_dg, 
##         sex = sexe, dept = dept, year = annee_dgjr))
## 
##                 age_cl=45-59 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    419       2    1.000 0.00336        0.994        1.007
##   730    408      11    0.980 0.00851        0.963        0.997
##  1826    359      48    0.881 0.01768        0.847        0.917
## 
##                 age_cl=60-69 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    498      14    0.983 0.00728        0.969        0.998
##   730    478      19    0.958 0.01109        0.936        0.980
##  1826    416      61    0.871 0.01831        0.836        0.908
## 
##                 age_cl=70-83 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    902      33    0.995 0.00626        0.983        1.008
##   730    833      69    0.950 0.01100        0.929        0.972
##  1826    668     162    0.863 0.01814        0.828        0.899
# Estimation de la survie nette : Côlon & Rectum
# Selon la variable charlsoncl
net_survchp <- rs.surv(Surv(time_cens5, statut_cens5) ~ charlsoncl2,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Affichage des résultats
summary(net_survchp, times = c(1, 2, 5)*365.241)
## Call: rs.surv(formula = Surv(time_cens5, statut_cens5) ~ charlsoncl2, 
##     data = CCR.2171.RENEW_P, ratetable = mua7522.rt, method = "pohar-perme", 
##     rmap = list(age = Age_dg, sex = sexe, dept = dept, year = annee_dgjr))
## 
##                 charlsoncl2=0 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   1066      20    0.998 0.00418        0.990        1.006
##   730   1027      39    0.978 0.00719        0.964        0.992
##  1826    899     124    0.915 0.01274        0.891        0.940
## 
##                 charlsoncl2=>2 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    154      15    0.938  0.0225        0.895        0.984
##   730    126      28    0.790  0.0358        0.723        0.863
##  1826     91      35    0.624  0.0458        0.541        0.721
## 
##                 charlsoncl2=1 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    384       6    1.006 0.00641        0.994         1.02
##   730    370      13    0.997 0.01154        0.975         1.02
##  1826    294      75    0.868 0.02548        0.820         0.92
## 
##                 charlsoncl2=2 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    215       8    0.987  0.0127        0.963        1.013
##   730    196      19    0.927  0.0232        0.882        0.973
##  1826    159      37    0.832  0.0358        0.765        0.905
# Estimation de la survie nette : Côlon & Rectum
# Selon la variable Annee_dg
net_survanp <- rs.surv(Surv(time_cens5, statut_cens5) ~ Annee_dg,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Affichage des résultats
summary(net_survanp, times = c(1, 2, 5)*365.241)
## Call: rs.surv(formula = Surv(time_cens5, statut_cens5) ~ Annee_dg, 
##     data = CCR.2171.RENEW_P, ratetable = mua7522.rt, method = "pohar-perme", 
##     rmap = list(age = Age_dg, sex = sexe, dept = dept, year = annee_dgjr))
## 
##                 Annee_dg=2007 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    371       9    0.998 0.00806        0.982        1.014
##   730    350      21    0.961 0.01464        0.932        0.990
##  1826    288      60    0.861 0.02543        0.813        0.912
## 
##                 Annee_dg=2008 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    409       9    0.996 0.00727        0.982        1.010
##   730    389      20    0.969 0.01315        0.943        0.995
##  1826    333      56    0.897 0.02255        0.854        0.942
## 
##                 Annee_dg=2009 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    329       7    0.998 0.00793        0.983        1.014
##   730    314      15    0.974 0.01421        0.946        1.002
##  1826    264      49    0.880 0.02593        0.831        0.933
## 
##                 Annee_dg=2010 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    314      15    0.972  0.0118        0.949        0.996
##   730    294      19    0.932  0.0176        0.898        0.968
##  1826    250      44    0.852  0.0269        0.801        0.907
## 
##                 Annee_dg=2011 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    396       9    0.996 0.00749        0.982        1.011
##   730    372      24    0.953 0.01432        0.925        0.982
##  1826    308      62    0.853 0.02383        0.807        0.901

I.6.2. Pour le groupe côlon

# Analyse de la survie nette sur les données du côlon
# Selon la variable Sexe
net_survsepc <- rs.surv(Surv(time_cens5, statut_cens5) ~ Sexe,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_colon,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Affichage des résultats
summary(net_survsepc, times = c(1, 2, 5)*365.241)
## Call: rs.surv(formula = Surv(time_cens5, statut_cens5) ~ Sexe, data = CCR.2171.RENEW_P_colon, 
##     ratetable = mua7522.rt, method = "pohar-perme", rmap = list(age = Age_dg, 
##         sex = sexe, dept = dept, year = annee_dgjr))
## 
##                 Sexe=homme 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    797      26    0.993 0.00629        0.981        1.005
##   730    747      49    0.956 0.01075        0.935        0.977
##  1826    614     132    0.860 0.01798        0.826        0.896
## 
##                 Sexe=femme 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    610      13    0.993 0.00583        0.982        1.004
##   730    582      28    0.963 0.01031        0.943        0.984
##  1826    504      77    0.888 0.01749        0.854        0.922
# Analyse de la survie nette sur les données du côlon
# Selon la variable Stade
net_survstpc <- rs.surv(Surv(time_cens5, statut_cens5) ~ stade2,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_colon,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Affichage des résultats
summary(net_survstpc, times = c(1, 2, 5)*365.241)
## Call: rs.surv(formula = Surv(time_cens5, statut_cens5) ~ stade2, data = CCR.2171.RENEW_P_colon, 
##     ratetable = mua7522.rt, method = "pohar-perme", rmap = list(age = Age_dg, 
##         sex = sexe, dept = dept, year = annee_dgjr))
## 
##                 stade2=I-II 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    940      16    1.004 0.00429        0.995        1.012
##   730    895      44    0.979 0.00836        0.962        0.995
##  1826    786     108    0.930 0.01446        0.902        0.959
## 
##                 stade2=III-IV 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    467      23    0.972 0.00975        0.953        0.991
##   730    434      33    0.921 0.01511        0.892        0.951
##  1826    332     101    0.759 0.02401        0.713        0.808
# Analyse de la survie nette sur les données du côlon
# Selon la variable Departement
net_survdepc <- rs.surv(Surv(time_cens5, statut_cens5) ~ Departement,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_colon,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Affichage des résultats
summary(net_survdepc, times = c(1, 2, 5)*365.241)
## Call: rs.surv(formula = Surv(time_cens5, statut_cens5) ~ Departement, 
##     data = CCR.2171.RENEW_P_colon, ratetable = mua7522.rt, method = "pohar-perme", 
##     rmap = list(age = Age_dg, sex = sexe, dept = dept, year = annee_dgjr))
## 
##                 Departement=Côte d'Or 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    649      16    0.994  0.0061        0.983        1.006
##   730    614      35    0.959  0.0109        0.938        0.981
##  1826    509     104    0.853  0.0188        0.817        0.890
## 
##                 Departement=Saone et Loire 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    758      23    0.991 0.00621        0.979        1.004
##   730    715      42    0.959 0.01048        0.939        0.980
##  1826    609     105    0.889 0.01726        0.855        0.923
# Analyse de la survie nette sur les données du côlon
# Selon la variable age_cl
net_survagpc <- rs.surv(Surv(time_cens5, statut_cens5) ~ age_cl,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_colon,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Affichage des résultats
summary(net_survagpc, times = c(1, 2, 5)*365.241)
## Call: rs.surv(formula = Surv(time_cens5, statut_cens5) ~ age_cl, data = CCR.2171.RENEW_P_colon, 
##     ratetable = mua7522.rt, method = "pohar-perme", rmap = list(age = Age_dg, 
##         sex = sexe, dept = dept, year = annee_dgjr))
## 
##                 age_cl=45-59 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    300       2    0.998 0.00468        0.989        1.008
##   730    292       8    0.977 0.01039        0.957        0.998
##  1826    262      30    0.894 0.02007        0.856        0.934
## 
##                 age_cl=60-69 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    372       9    0.987 0.00785        0.972        1.002
##   730    358      13    0.965 0.01221        0.941        0.989
##  1826    309      48    0.872 0.02126        0.831        0.914
## 
##                 age_cl=70-83 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    735      28    0.993 0.00705        0.980        1.007
##   730    679      56    0.949 0.01225        0.926        0.974
##  1826    547     131    0.864 0.02009        0.825        0.904
# Analyse de la survie nette sur les données du côlon
# Selon la variable charlsoncl2
net_survchpc <- rs.surv(Surv(time_cens5, statut_cens5) ~ charlsoncl2,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_colon,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Affichage des résultats
summary(net_survchpc, times = c(1, 2, 5)*365.241)
## Call: rs.surv(formula = Surv(time_cens5, statut_cens5) ~ charlsoncl2, 
##     data = CCR.2171.RENEW_P_colon, ratetable = mua7522.rt, method = "pohar-perme", 
##     rmap = list(age = Age_dg, sex = sexe, dept = dept, year = annee_dgjr))
## 
##                 charlsoncl2=0 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    800      15    0.997 0.00480        0.988        1.007
##   730    772      28    0.980 0.00818        0.964        0.996
##  1826    677      94    0.917 0.01475        0.888        0.946
## 
##                 charlsoncl2=>2 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    132      13    0.938  0.0245        0.891        0.987
##   730    108      24    0.790  0.0388        0.717        0.869
##  1826     77      31    0.617  0.0497        0.527        0.723
## 
##                 charlsoncl2=1 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    300       3    1.012 0.00588        1.001        1.024
##   730    288      11    1.001 0.01280        0.976        1.026
##  1826    232      55    0.888 0.02844        0.834        0.946
## 
##                 charlsoncl2=2 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    175       8    0.981  0.0154        0.951        1.012
##   730    161      14    0.930  0.0257        0.881        0.982
##  1826    132      29    0.848  0.0398        0.773        0.929
# Analyse de la survie nette sur les données du côlon
# Selon la variable Annee_dg
net_survanpc <- rs.surv(Surv(time_cens5, statut_cens5) ~ Annee_dg,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_colon,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Affichage des résultats
summary(net_survanpc, times = c(1, 2, 5)*365.241)
## Call: rs.surv(formula = Surv(time_cens5, statut_cens5) ~ Annee_dg, 
##     data = CCR.2171.RENEW_P_colon, ratetable = mua7522.rt, method = "pohar-perme", 
##     rmap = list(age = Age_dg, sex = sexe, dept = dept, year = annee_dgjr))
## 
##                 Annee_dg=2007 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    290       6    1.001 0.00838        0.985        1.018
##   730    273      17    0.963 0.01648        0.931        0.996
##  1826    228      44    0.874 0.02842        0.820        0.932
## 
##                 Annee_dg=2008 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    319       7    0.996 0.00825        0.980        1.013
##   730    305      14    0.975 0.01444        0.947        1.004
##  1826    258      47    0.894 0.02592        0.844        0.946
## 
##                 Annee_dg=2009 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    246       6    0.996 0.00976        0.977        1.015
##   730    235      11    0.973 0.01670        0.941        1.006
##  1826    197      38    0.876 0.03041        0.819        0.938
## 
##                 Annee_dg=2010 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    239      14    0.963  0.0147        0.935        0.992
##   730    223      15    0.922  0.0211        0.881        0.964
##  1826    188      35    0.837  0.0316        0.777        0.901
## 
##                 Annee_dg=2011 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    313       6    0.998 0.00778        0.983        1.013
##   730    293      20    0.955 0.01618        0.923        0.987
##  1826    247      45    0.872 0.02663        0.821        0.926

I.6.3. Pour le groupe rectum

# Analyse de la survie nette sur les données du rectum
# Selon la variable Sexe
net_survsepr <- rs.surv(Surv(time_cens5, statut_cens5) ~ Sexe,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_rectum,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Affichage des résultats
summary(net_survsepr, times = c(1, 2, 5)*365.241)
## Call: rs.surv(formula = Surv(time_cens5, statut_cens5) ~ Sexe, data = CCR.2171.RENEW_P_rectum, 
##     ratetable = mua7522.rt, method = "pohar-perme", rmap = list(age = Age_dg, 
##         sex = sexe, dept = dept, year = annee_dgjr))
## 
##                 Sexe=homme 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    252       9    0.987  0.0116        0.964        1.010
##   730    239      13    0.956  0.0181        0.922        0.993
##  1826    197      42    0.854  0.0305        0.796        0.916
## 
##                 Sexe=femme 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    160       1    1.002 0.00621        0.990        1.015
##   730    151       9    0.959 0.01947        0.921        0.998
##  1826    128      20    0.868 0.03315        0.805        0.935
# Analyse de la survie nette sur les données du rectum
# Selon la variable Stade
net_survstpr <- rs.surv(Surv(time_cens5, statut_cens5) ~ stade2,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_rectum,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Affichage des résultats
summary(net_survstpr, times = c(1, 2, 5)*365.241)
## Call: rs.surv(formula = Surv(time_cens5, statut_cens5) ~ stade2, data = CCR.2171.RENEW_P_rectum, 
##     ratetable = mua7522.rt, method = "pohar-perme", rmap = list(age = Age_dg, 
##         sex = sexe, dept = dept, year = annee_dgjr))
## 
##                 stade2=I-II 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    303       7    0.994 0.00864        0.977        1.011
##   730    295       8    0.985 0.01263        0.961        1.010
##  1826    258      35    0.925 0.02359        0.880        0.973
## 
##                 stade2=III-IV 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    109       3    0.992  0.0157        0.961        1.023
##   730     95      14    0.880  0.0354        0.813        0.952
##  1826     67      27    0.676  0.0515        0.583        0.785
# Analyse de la survie nette sur les données du rectum
# Selon la variable Departement
net_survdepr <- rs.surv(Surv(time_cens5, statut_cens5) ~ Departement,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_rectum,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Affichage des résultats
summary(net_survdepr, times = c(1, 2, 5)*365.241)
## Call: rs.surv(formula = Surv(time_cens5, statut_cens5) ~ Departement, 
##     data = CCR.2171.RENEW_P_rectum, ratetable = mua7522.rt, method = "pohar-perme", 
##     rmap = list(age = Age_dg, sex = sexe, dept = dept, year = annee_dgjr))
## 
##                 Departement=Côte d'Or 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    181       5    0.989  0.0121        0.966        1.013
##   730    171      10    0.950  0.0208        0.911        0.992
##  1826    143      27    0.851  0.0343        0.787        0.921
## 
##                 Departement=Saone et Loire 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    231       5    0.996 0.00961        0.977        1.015
##   730    219      12    0.962 0.01756        0.929        0.997
##  1826    182      35    0.866 0.03039        0.808        0.927
# Analyse de la survie nette sur les données du rectum
# Selon la variable age_cl
net_survagpr <- rs.surv(Surv(time_cens5, statut_cens5) ~ age_cl,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_rectum,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Affichage des résultats
summary(net_survagpr, times = c(1, 2, 5)*365.241)
## Call: rs.surv(formula = Surv(time_cens5, statut_cens5) ~ age_cl, data = CCR.2171.RENEW_P_rectum, 
##     ratetable = mua7522.rt, method = "pohar-perme", rmap = list(age = Age_dg, 
##         sex = sexe, dept = dept, year = annee_dgjr))
## 
##                 age_cl=45-59 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    119       0    1.000  0.0000        1.000        1.000
##   730    116       3    0.986  0.0145        0.958        1.014
##  1826     97      18    0.849  0.0360        0.781        0.923
## 
##                 age_cl=60-69 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    126       5    0.972  0.0169        0.939        1.006
##   730    120       6    0.937  0.0247        0.889        0.986
##  1826    107      13    0.869  0.0359        0.801        0.942
## 
##                 age_cl=70-83 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    167       5    1.001  0.0133        0.975        1.028
##   730    154      13    0.954  0.0249        0.906        1.004
##  1826    121      31    0.859  0.0420        0.781        0.946
# Analyse de la survie nette sur les données du rectum
# Selon la variable charlsoncl2
net_survchpr <- rs.surv(Surv(time_cens5, statut_cens5) ~ charlsoncl2,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_rectum,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Affichage des résultats
summary(net_survchpr, times = c(1, 2, 5)*365.241)
## Call: rs.surv(formula = Surv(time_cens5, statut_cens5) ~ charlsoncl2, 
##     data = CCR.2171.RENEW_P_rectum, ratetable = mua7522.rt, method = "pohar-perme", 
##     rmap = list(age = Age_dg, sex = sexe, dept = dept, year = annee_dgjr))
## 
##                 charlsoncl2=0 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    266       5    0.997 0.00842        0.981        1.014
##   730    255      11    0.971 0.01495        0.942        1.001
##  1826    222      30    0.910 0.02519        0.862        0.961
## 
##                 charlsoncl2=>2 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365     22       2    0.941  0.0564        0.837        1.059
##   730     18       4    0.787  0.0908        0.628        0.987
##  1826     14       4    0.667  0.1142        0.477        0.933
## 
##                 charlsoncl2=1 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365     84       3    0.985  0.0199        0.947        1.025
##   730     82       2    0.974  0.0257        0.925        1.026
##  1826     62      20    0.800  0.0560        0.698        0.918
## 
##                 charlsoncl2=2 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365     40       0    1.000  0.0000        1.000        1.000
##   730     35       5    0.909  0.0532        0.811        1.020
##  1826     27       8    0.760  0.0813        0.616        0.937
# Analyse de la survie nette sur les données du rectum
# Selon la variable Annee_dg
net_survanpr <- rs.surv(Surv(time_cens5, statut_cens5) ~ Annee_dg,
          ratetable = mua7522.rt ,data = CCR.2171.RENEW_P_rectum,
          rmap = list(age = Age_dg,sex = sexe,dept=dept,
          year = annee_dgjr), method = "pohar-perme")

# Affichage des résultats
summary(net_survanpr, times = c(1, 2, 5)*365.241)
## Call: rs.surv(formula = Surv(time_cens5, statut_cens5) ~ Annee_dg, 
##     data = CCR.2171.RENEW_P_rectum, ratetable = mua7522.rt, method = "pohar-perme", 
##     rmap = list(age = Age_dg, sex = sexe, dept = dept, year = annee_dgjr))
## 
##                 Annee_dg=2007 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365     81       3    0.983  0.0212        0.942        1.026
##   730     77       4    0.945  0.0313        0.886        1.009
##  1826     60      16    0.815  0.0560        0.712        0.932
## 
##                 Annee_dg=2008 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365     90       2    0.992  0.0153        0.963        1.023
##   730     84       6    0.946  0.0306        0.888        1.008
##  1826     75       9    0.907  0.0450        0.823        0.999
## 
##                 Annee_dg=2009 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365     83       1    0.998  0.0119        0.975        1.022
##   730     79       4    0.975  0.0267        0.924        1.029
##  1826     67      11    0.892  0.0488        0.802        0.993
## 
##                 Annee_dg=2010 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365     75       1    1.000  0.0131        0.975         1.03
##   730     71       4    0.967  0.0292        0.911         1.03
##  1826     62       9    0.904  0.0489        0.813         1.01
## 
##                 Annee_dg=2011 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365     83       3    0.979  0.0200        0.941         1.02
##   730     79       4    0.946  0.0305        0.888         1.01
##  1826     61      17    0.780  0.0526        0.684         0.89

I.7. Modélisation

I.7.1. Choix du nombre d’intervalles pour la modélisation des taux de mortalité de base par morceaux pour les modèles : Estève, Touraine, Mba

La détermination du nombre d’intervalles, ainsi que la localisation de ces derniers, sur lesquels le taux de base en excès est supposé constant par morceaux, joue un rôle déterminant dans la précision des estimations. Plusieurs méthodes ont été proposées dans la littérature afin d’optimiser le choix et la position de ces intervalles. Dans notre étude, nous avons utilisé le critère d’information d’Akaike (AIC) afin d’identifier le nombre optimal d’intervalles permettant de modéliser au mieux les taux de mortalité de base en excès. Pour ce faire, nous avons estimé plusieurs modèles d’Estève intégrant l’ensemble des variables d’étude, en faisant varier le nombre d’intervalles de deux à huit. Le modèle présentant la plus faible valeur d’AIC a été retenu. Ainsi, le modèle optimal selon ce critère est celui comportant six intervalles. Le nombre d’intervalles retenu pour la modélisation des taux de mortalité de base en excès est donc fixé à six pour les modèles suivants : Estève, Touraine et Mba. En ce qui concerne la localisation de ces intervalles, nous avons utilisé des NA dans l’argument interval de la fonction xhaz(), ce qui permet à la fonction de déterminer automatiquement les bornes optimales en fonction des données.

# Cas de huit intervalles
fit_estrb8 <- xhaz(Surv(Time_cens5, statut_cens5) ~ age_cl + 
        Sexe + dept + Annee_dg + stade2 + Loc + charlsoncl2, 
        data=CCR.2171.RENEW_P, ratetable = mua7522.rt,
        interval = c(0, NA, NA, NA, NA, NA, NA, NA, 5), 
        rmap = list(age = 'age_dg', sex = 'sexe', year = 'date_dg',
        dept = 'dept'), baseline = "constant", pophaz = "classic")

# Prédiction de la mortalité de base en excès
predict_estrb8 <- predict(object = fit_estrb8,
        new.data = CCR.2171.RENEW_P,
        times.pts = c(seq(0, 5, 0.1)),
        baseline = TRUE)

# Cas de six intervalles
fit_estrb6 <- xhaz(Surv(Time_cens5, statut_cens5) ~ age_cl + 
        Sexe + dept + Annee_dg + stade2 + Loc + charlsoncl2, 
        data=CCR.2171.RENEW_P, ratetable = mua7522.rt,
        interval = c(0, NA, NA, NA, NA, NA, 5), 
        rmap = list(age = 'age_dg', sex = 'sexe', year = 'date_dg',
        dept = 'dept'), baseline = "constant", pophaz = "classic")

# Prédiction de la mortalité de base en excès
predict_estrb6 <- predict(object = fit_estrb6,
        new.data = CCR.2171.RENEW_P,
        times.pts = c(seq(0, 5, 0.1)),
        baseline = TRUE)

# Cas de quatre intervalles
# Modèle d'estève pour l'ensemble des individus 
fit_estrb4 <- xhaz(Surv(Time_cens5, statut_cens5) ~ age_cl + 
        Sexe + dept + Annee_dg + stade2 + Loc + charlsoncl2, 
        data=CCR.2171.RENEW_P, ratetable = mua7522.rt,
        interval = c(0, NA, NA, NA, 5), 
        rmap = list(age = 'age_dg', sex = 'sexe', year = 'date_dg',
        dept = 'dept'), baseline = "constant", pophaz = "classic")

# Prédiction de la mortalité de base en excès
predict_estrb4 <- predict(object = fit_estrb4,
        new.data = CCR.2171.RENEW_P,
        times.pts = c(seq(0, 5, 0.1)),
        baseline = TRUE)

# Cas de deux intervalles
fit_estrb2 <- xhaz(Surv(Time_cens5, statut_cens5) ~ age_cl + 
        Sexe + dept + Annee_dg + stade2 + Loc + charlsoncl2, 
        data=CCR.2171.RENEW_P, ratetable = mua7522.rt,
        interval = c(0, NA, 5), 
        rmap = list(age = 'age_dg', sex = 'sexe', year = 'date_dg',
        dept = 'dept'), baseline = "constant", pophaz = "classic")

# Cas de trois intervalles
fit_estrb3 <- xhaz(Surv(Time_cens5, statut_cens5) ~ age_cl + 
        Sexe + dept + Annee_dg + stade2 + Loc + charlsoncl2, 
        data=CCR.2171.RENEW_P, ratetable = mua7522.rt,
        interval = c(0, NA, NA, 5), 
        rmap = list(age = 'age_dg', sex = 'sexe', year = 'date_dg',
        dept = 'dept'), baseline = "constant", pophaz = "classic")

# Cas de cinq intervalles 
fit_estrb5 <- xhaz(Surv(Time_cens5, statut_cens5) ~ age_cl + 
        Sexe + dept + Annee_dg + stade2 + Loc + charlsoncl2, 
        data=CCR.2171.RENEW_P, ratetable = mua7522.rt,
        interval = c(0, NA, NA, NA, NA, 5), 
        rmap = list(age = 'age_dg', sex = 'sexe', year = 'date_dg',
        dept = 'dept'), baseline = "constant", pophaz = "classic")

# Cas de sept intervalles
fit_estrb7 <- xhaz(Surv(Time_cens5, statut_cens5) ~ age_cl + 
        Sexe + dept + Annee_dg + stade2 + Loc + charlsoncl2, 
        data=CCR.2171.RENEW_P, ratetable = mua7522.rt,
        interval = c(0, NA, NA, NA, NA, NA, NA, 5), 
        rmap = list(age = 'age_dg', sex = 'sexe', year = 'date_dg',
        dept = 'dept'), baseline = "constant", pophaz = "classic")
# Définir les intervalles
intervalles <- 2:8

# Calculer les AIC pour chaque modèle
aic <- sapply(intervalles, 
       function(i) AIC(get(paste0("fit_estrb", i))))

# Créer un data.frame avec les intervalles en lignes
tableau_AIC <- data.frame(
  Intervalles = intervalles,
  `AIC` = aic)

# Affichage du tableau
kable(tableau_AIC)
Intervalles AIC
2 3151.526
3 3146.621
4 3144.002
5 3134.956
6 3128.864
7 3134.950
8 3131.349

I.7.2. Modèle classique et les différents modèles de correction

# Estimation de la survie nette par la méthoded'Estève
# Modèle d'Estève I
fit_estrb61 <- xhaz(Surv(Time_cens5, statut_cens5) ~ age_cl + 
        Sexe + dept + Annee_dg + stade2 + Loc, 
        data=CCR.2171.RENEW_P, ratetable = mua7522.rt,
        interval = c(0, NA, NA, NA, NA, NA, 5), 
        rmap = list(age = 'age_dg', sex = 'sexe', year = 'date_dg',
        dept = 'dept'), baseline = "constant", pophaz = "classic")

# Prédiction de la mortalité de base en excès
predict_estrb61 <- predict(object = fit_estrb61,
        new.data = CCR.2171.RENEW_P,
        times.pts = c(seq(0, 5, 0.1)),
        baseline = TRUE)

# Affichage des résultats
summary(fit_estrb61)
## Call:
## xhaz(formula = Surv(Time_cens5, statut_cens5) ~ age_cl + Sexe + 
##     dept + Annee_dg + stade2 + Loc, data = CCR.2171.RENEW_P, 
##     ratetable = mua7522.rt, rmap = list(age = "age_dg", sex = "sexe", 
##         year = "date_dg", dept = "dept"), baseline = "constant", 
##     pophaz = "classic", interval = c(0, NA, NA, NA, NA, NA, 5))
## 
## 
##                  coef se(coef) lower 0.95 upper 0.95     z Pr(>|z|)    
## age_cl60-69   0.13355  0.21427   -0.28641    0.55352  0.62   0.5300    
## age_cl70-83   0.22433  0.20338   -0.17428    0.62294  1.10   0.2700    
## Sexefemme    -0.13401  0.16898   -0.46519    0.19718 -0.79   0.4300    
## dept71       -0.19482  0.16767   -0.52344    0.13380 -1.16   0.2500    
## Annee_dg2008 -0.27116  0.25936   -0.77951    0.23718 -1.05   0.3000    
## Annee_dg2009 -0.28179  0.27620   -0.82313    0.25954 -1.02   0.3100    
## Annee_dg2010  0.01054  0.25286   -0.48506    0.50615  0.04   0.9700    
## Annee_dg2011 -0.02031  0.24774   -0.50587    0.46526 -0.08   0.9300    
## stade2III-IV  1.47733  0.19283    1.09940    1.85526  7.66  1.8e-14 ***
## LocRectum     0.35177  0.18547   -0.01174    0.71528  1.90   0.0580 .  
## [0-1.32[      0.00535  0.00218    0.00108    0.00962  2.45   0.0140 *  
## [1.32-1.95[   0.01972  0.00700    0.00601    0.03344  2.82   0.0048 ** 
## [1.95-2.62[   0.01985  0.00680    0.00653    0.03318  2.92   0.0035 ** 
## [2.62-3.4[    0.01693  0.00597    0.00524    0.02863  2.84   0.0045 ** 
## [3.4-4.07[    0.02413  0.00811    0.00824    0.04003  2.98   0.0029 ** 
## [4.07-5[      0.01358  0.00533    0.00314    0.02402  2.55   0.0110 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Excess hazard hazard ratio(s)
## (proportional effect variable(s) for exess hazard ratio(s))
##              exp(coef) lower 0.95 upper 0.95
## age_cl60-69     1.1429     0.7510      1.739
## age_cl70-83     1.2515     0.8401      1.864
## Sexefemme       0.8746     0.6280      1.218
## dept71          0.8230     0.5925      1.143
## Annee_dg2008    0.7625     0.4586      1.268
## Annee_dg2009    0.7544     0.4391      1.296
## Annee_dg2010    1.0106     0.6157      1.659
## Annee_dg2011    0.9799     0.6030      1.592
## stade2III-IV    4.3812     3.0024      6.393
## LocRectum       1.4216     0.9883      2.045
## 
## number of observations: 1868;  number of events: 419
## Likelihood ratio test: 77.2  on 16 degree(s) of freedom, p=5.21e-10
# Modèle d'Estève II
fit_estrb62 <- xhaz(Surv(Time_cens5, statut_cens5) ~ age_cl + 
        Sexe + dept + Annee_dg + stade2 + Loc + charlsoncl2, 
        data=CCR.2171.RENEW_P, ratetable = mua7522.rt,
        interval = c(0, NA, NA, NA, NA, NA, 5), 
        rmap = list(age = 'age_dg', sex = 'sexe', year = 'date_dg',
        dept = 'dept'), baseline = "constant", pophaz = "classic")

# Prédiction de la mortalité de base en excès
predict_estrb62 <- predict(object = fit_estrb62,
        new.data = CCR.2171.RENEW_P,
        times.pts = c(seq(0, 5, 0.1)),
        baseline = TRUE)

# Affichage des résultats
summary(fit_estrb62)
## Call:
## xhaz(formula = Surv(Time_cens5, statut_cens5) ~ age_cl + Sexe + 
##     dept + Annee_dg + stade2 + Loc + charlsoncl2, data = CCR.2171.RENEW_P, 
##     ratetable = mua7522.rt, rmap = list(age = "age_dg", sex = "sexe", 
##         year = "date_dg", dept = "dept"), baseline = "constant", 
##     pophaz = "classic", interval = c(0, NA, NA, NA, NA, NA, 5))
## 
## 
##                    coef  se(coef) lower 0.95 upper 0.95     z Pr(>|z|)    
## age_cl60-69    0.000249  0.215280  -0.421693   0.422190  0.00   1.0000    
## age_cl70-83   -0.046194  0.206691  -0.451301   0.358913 -0.22   0.8200    
## Sexefemme     -0.021821  0.167538  -0.350189   0.306546 -0.13   0.9000    
## dept71        -0.240927  0.164703  -0.563740   0.081885 -1.46   0.1400    
## Annee_dg2008  -0.338290  0.248943  -0.826209   0.149630 -1.36   0.1700    
## Annee_dg2009  -0.537484  0.290414  -1.106684   0.031716 -1.85   0.0640 .  
## Annee_dg2010  -0.088440  0.243376  -0.565448   0.388568 -0.36   0.7200    
## Annee_dg2011   0.016104  0.230135  -0.434953   0.467160  0.07   0.9400    
## stade2III-IV   1.510336  0.181274   1.155046   1.865625  8.33  < 2e-16 ***
## LocRectum      0.423348  0.182476   0.065701   0.780995  2.32   0.0200 *  
## charlsoncl2>2  1.809107  0.227069   1.364059   2.254154  7.97  1.7e-15 ***
## charlsoncl21   0.393329  0.222227  -0.042229   0.828887  1.77   0.0770 .  
## charlsoncl22   0.770427  0.247185   0.285954   1.254901  3.12   0.0018 ** 
## [0-1.32[       0.004524  0.001699   0.001195   0.007853  2.66   0.0077 ** 
## [1.32-1.95[    0.016311  0.005517   0.005498   0.027124  2.96   0.0031 ** 
## [1.95-2.62[    0.015001  0.005088   0.005028   0.024974  2.95   0.0032 ** 
## [2.62-3.4[     0.013317  0.004581   0.004339   0.022295  2.91   0.0036 ** 
## [3.4-4.07[     0.019049  0.006253   0.006793   0.031305  3.05   0.0023 ** 
## [4.07-5[       0.010834  0.004076   0.002846   0.018822  2.66   0.0079 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Excess hazard hazard ratio(s)
## (proportional effect variable(s) for exess hazard ratio(s))
##               exp(coef) lower 0.95 upper 0.95
## age_cl60-69      1.0002     0.6559      1.525
## age_cl70-83      0.9549     0.6368      1.432
## Sexefemme        0.9784     0.7046      1.359
## dept71           0.7859     0.5691      1.085
## Annee_dg2008     0.7130     0.4377      1.161
## Annee_dg2009     0.5842     0.3307      1.032
## Annee_dg2010     0.9154     0.5681      1.475
## Annee_dg2011     1.0162     0.6473      1.595
## stade2III-IV     4.5283     3.1742      6.460
## LocRectum        1.5271     1.0679      2.184
## charlsoncl2>2    6.1050     3.9120      9.527
## charlsoncl21     1.4819     0.9587      2.291
## charlsoncl22     2.1607     1.3310      3.507
## 
## number of observations: 1868;  number of events: 419
## Likelihood ratio test: 127  on 19 degree(s) of freedom, p=0
# Estimation de la survie nette par la méthode de Touraine
# Modèle de Touraine 
fit.correctedrb1 <- xhaz(Surv(Time_cens5, statut_cens5) ~ age_cl + 
        Sexe + dept + Annee_dg + stade2 + Loc + charlsoncl2,
        data = CCR.2171.RENEW_P, ratetable = mua7522.rt,
        interval = c(0, NA, NA, NA, NA, NA, 5),
        rmap = list(age = 'age_dg', sex = 'sexe', year = 'date_dg', 
        dept = 'dept'), baseline = "constant", pophaz = "corrected",
        add.rmap = "charlsoncl2")

# Prédiction de la mortalité de base en excès
predict_correctedrb1 <- predict(object = fit.correctedrb1,
        new.data = CCR.2171.RENEW_P,
        times.pts = c(seq(0, 5, 0.1)),
        baseline = TRUE)

# Affichage des résultats
summary(fit.correctedrb1)
## Call:
## xhaz(formula = Surv(Time_cens5, statut_cens5) ~ age_cl + Sexe + 
##     dept + Annee_dg + stade2 + Loc + charlsoncl2, data = CCR.2171.RENEW_P, 
##     ratetable = mua7522.rt, rmap = list(age = "age_dg", sex = "sexe", 
##         year = "date_dg", dept = "dept"), baseline = "constant", 
##     pophaz = "corrected", add.rmap = "charlsoncl2", interval = c(0, 
##         NA, NA, NA, NA, NA, 5))
## 
## 
##                    coef  se(coef) lower 0.95 upper 0.95     z Pr(>|z|)    
## age_cl60-69    0.062458  0.218398  -0.365595   0.490511  0.29  0.77000    
## age_cl70-83   -0.076142  0.247122  -0.560492   0.408207 -0.31  0.76000    
## Sexefemme      0.037534  0.176295  -0.307997   0.383066  0.21  0.83000    
## dept71        -0.203897  0.172770  -0.542520   0.134726 -1.18  0.24000    
## Annee_dg2008  -0.336048  0.263130  -0.851773   0.179677 -1.28  0.20000    
## Annee_dg2009  -0.435312  0.301123  -1.025501   0.154878 -1.45  0.15000    
## Annee_dg2010  -0.055487  0.254258  -0.553824   0.442850 -0.22  0.83000    
## Annee_dg2011  -0.068989  0.252168  -0.563228   0.425251 -0.27  0.78000    
## stade2III-IV   1.570249  0.236764   1.106200   2.034298  6.63  3.3e-11 ***
## LocRectum      0.473522  0.194251   0.092798   0.854247  2.44  0.01500 *  
## charlsoncl2>2  1.265246  0.383943   0.512733   2.017760  3.30  0.00098 ***
## charlsoncl21   0.416219  0.255323  -0.084205   0.916643  1.63  0.10000 .  
## charlsoncl22   0.776899  0.293855   0.200953   1.352845  2.64  0.00820 ** 
## [0-1.32[       0.003875  0.001854   0.000242   0.007509  2.09  0.03700 *  
## [1.32-1.95[    0.014570  0.005810   0.003181   0.025958  2.51  0.01200 *  
## [1.95-2.62[    0.014791  0.005636   0.003744   0.025838  2.62  0.00870 ** 
## [2.62-3.4[     0.012801  0.005030   0.002941   0.022660  2.54  0.01100 *  
## [3.4-4.07[     0.018855  0.006845   0.005439   0.032272  2.75  0.00590 ** 
## [4.07-5[       0.010558  0.004817   0.001116   0.020000  2.19  0.02800 *  
## log(alpha.0)  -0.099504  0.189440  -0.470799   0.271791 -0.53  0.60000    
## log(alpha.>2)  0.871186  0.239140   0.402479   1.339892  3.64  0.00027 ***
## log(alpha.1)  -0.094657  0.262650  -0.609442   0.420129 -0.36  0.72000    
## log(alpha.2)  -0.069114  0.321789  -0.699808   0.561580 -0.21  0.83000    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Excess hazard hazard ratio(s)
## (proportional effect variable(s) for exess hazard ratio(s))
##               exp(coef) lower 0.95 upper 0.95
## age_cl60-69      1.0644     0.6938      1.633
## age_cl70-83      0.9267     0.5709      1.504
## Sexefemme        1.0382     0.7349      1.467
## dept71           0.8155     0.5813      1.144
## Annee_dg2008     0.7146     0.4267      1.197
## Annee_dg2009     0.6471     0.3586      1.168
## Annee_dg2010     0.9460     0.5747      1.557
## Annee_dg2011     0.9333     0.5694      1.530
## stade2III-IV     4.8078     3.0228      7.647
## LocRectum        1.6056     1.0972      2.350
## charlsoncl2>2    3.5440     1.6698      7.521
## charlsoncl21     1.5162     0.9192      2.501
## charlsoncl22     2.1747     1.2226      3.868
## 
## and corrected scale parameters on population hazard 
##          exp(coef) lower 0.95 upper 0.95
## alpha.0     0.9053     0.6245      1.312
## alpha.>2    2.3897     1.4955      3.819
## alpha.1     0.9097     0.5437      1.522
## alpha.2     0.9332     0.4967      1.753
## 
## number of observations: 1868;  number of events: 419
## Likelihood ratio test: 88.4  on 23 degree(s) of freedom, p=1.3e-09
# Calcul des déciles pour l'âge au décès potentiel
points_rupture <- quantile(CCR.2171.RENEW_P$age_dg + 
          CCR.2171.RENEW_P$Time_cens5, seq(0.1, 0.9, 0.1))

# Affichage des quantiles 
kable(points_rupture, 
  caption = "Tableau des déciles selon l'âge au décès")
Tableau des déciles selon l’âge au décès
x
10% 58.97833
20% 63.90138
30% 67.61154
40% 71.23298
50% 74.54449
60% 77.36964
70% 79.65175
80% 82.02168
90% 84.40153
# Fonction d'évaluation
eval_model <- function(point.rupt) {
  fit <- tryCatch({
    xhaz(
      Surv(Time_cens5, statut_cens5) ~ age_cl + 
      Sexe + dept + Annee_dg + stade2 + Loc + charlsoncl2, 
      data = CCR.2171.RENEW_P, ratetable = mua7522.rt, 
      interval = c(0, NA, NA, NA, NA, NA, 5),
      rmap = list(age = 'age_dg', sex = 'sexe', year = 'date_dg', 
      dept = 'dept'), baseline = "constant", pophaz = "corrected",
      add.rmap = "charlsoncl2", add.rmap.cut = list(breakpoint = TRUE,
      cut = c(point.rupt))
    )
  }, error = function(e) return(NULL))

# Extraction des AIC et BIC  
  aic_val <- tryCatch(AIC(fit), error = function(e) NA)
  bic_val <- tryCatch(BIC(fit), error = function(e) NA)

# Extraction des valeurs AIC et BIC sous forme tableur 
  tibble(
    Point_de_rupture = point.rupt,
    AIC = round(aic_val, 2),
    BIC = round(bic_val, 2))
}

# Appliquer la fonction et compiler les résultats
results_models <- lapply(points_rupture, eval_model) %>% 
  bind_rows()

# Ajouter les ΔAIC, ΔBIC, classements
results_models <- results_models %>%
  mutate(
    Delta_AIC = round(AIC - min(AIC, na.rm = TRUE), 2),
    Delta_BIC = round(BIC - min(BIC, na.rm = TRUE), 2),
    Rang_AIC = rank(AIC, ties.method = "min"),
    Rang_BIC = rank(BIC, ties.method = "min")
  )

# Affichage
kable(results_models)
Point_de_rupture AIC BIC Delta_AIC Delta_BIC Rang_AIC Rang_BIC
58.97833 3300.22 3452.09 0.00 0.00 1 1
63.90138 3368.42 3520.79 68.20 68.70 3 3
67.61154 3331.18 3483.96 30.96 31.87 2 2
71.23298 3379.55 3532.45 79.33 80.36 4 4
74.54449 3463.17 3616.94 162.95 164.85 5 5
77.36964 3567.72 3721.49 267.50 269.40 6 6
79.65175 3634.03 3787.68 333.81 335.59 7 7
82.02168 3674.67 3828.17 374.45 376.08 8 8
84.40153 NA NA NA NA 9 9
# Estimation de la survie nette par la méthode de Mba
# Modèle de Mba
# NB : Nous n'avons pas utilisé les NA dans interval
# Car NA, le modèle de Mba produit des valeurs des bornes différentes 
# De celles du modèle classique (modèle d'Estève)
fit.correctedrb2 <- xhaz(Surv(Time_cens5, statut_cens5) ~ age_cl + 
        Sexe + dept + Annee_dg + stade2 + Loc + charlsoncl2,
        data = CCR.2171.RENEW_P, ratetable = mua7522.rt,
        interval = c(0, 1.32, 1.95, 2.62, 3.4, 4.07, 5),
        rmap = list(age = 'age_dg', sex = 'sexe', 
        year = 'date_dg', dept = 'dept'), 
        baseline = "constant", pophaz = "corrected",
        add.rmap = "charlsoncl2", 
        add.rmap.cut = list(breakpoint = TRUE, cut = c(58.98)))

# Prédiction de la mortalité de base en excès
predict_correctedrb2 <- predict(object = fit.correctedrb2,
                              new.data = CCR.2171.RENEW_P,
                              times.pts = c(seq(0, 5, 0.1)),
                              baseline = TRUE)

# Affichage des résultats
summary(fit.correctedrb2)
## Call:
## xhaz(formula = Surv(Time_cens5, statut_cens5) ~ age_cl + Sexe + 
##     dept + Annee_dg + stade2 + Loc + charlsoncl2, data = CCR.2171.RENEW_P, 
##     ratetable = mua7522.rt, rmap = list(age = "age_dg", sex = "sexe", 
##         year = "date_dg", dept = "dept"), baseline = "constant", 
##     pophaz = "corrected", add.rmap = "charlsoncl2", add.rmap.cut = list(breakpoint = TRUE, 
##         cut = c(58.98)), interval = c(0, 1.32, 1.95, 2.62, 3.4, 
##         4.07, 5))
## 
## 
##                              coef  se(coef) lower 0.95 upper 0.95     z
## age_cl60-69              0.459759  0.242332  -0.015203   0.934721  1.90
## age_cl70-83              0.314356  0.256995  -0.189344   0.818056  1.22
## Sexefemme                0.278153  0.195372  -0.104769   0.661076  1.42
## dept71                  -0.269678  0.182551  -0.627471   0.088116 -1.48
## Annee_dg2008            -0.224372  0.273955  -0.761314   0.312571 -0.82
## Annee_dg2009            -0.414974  0.320056  -1.042271   0.212323 -1.30
## Annee_dg2010             0.030710  0.266351  -0.491329   0.552749  0.12
## Annee_dg2011            -0.169046  0.286733  -0.731031   0.392940 -0.59
## stade2III-IV             1.635164  0.268669   1.108582   2.161746  6.09
## LocRectum                0.484342  0.205830   0.080923   0.887761  2.35
## charlsoncl2>2            0.974632  0.464468   0.064291   1.884973  2.10
## charlsoncl21             0.342448  0.274118  -0.194813   0.879709  1.25
## charlsoncl22             0.672189  0.319983   0.045034   1.299343  2.10
## [0-1.32[                 0.002101  0.001276  -0.000400   0.004602  1.65
## [1.32-1.95[              0.008944  0.004414   0.000293   0.017595  2.03
## [1.95-2.62[              0.009605  0.004706   0.000381   0.018830  2.04
## [2.62-3.4[               0.008580  0.004072   0.000598   0.016561  2.11
## [3.4-4.07[               0.012112  0.005505   0.001322   0.022902  2.20
## [4.07-5[                 0.006182  0.003651  -0.000973   0.013338  1.69
## log(alpha.0_(48.1,59])   1.213159  0.468622   0.294676   2.131642  2.59
## log(alpha.0_(59,88])    -0.114542  0.194803  -0.496348   0.267265 -0.59
## log(alpha.>2_(48.1,59])  3.344028  0.544394   2.277036   4.411020  6.14
## log(alpha.>2_(59,88])    0.966535  0.207203   0.560424   1.372645  4.66
## log(alpha.1_(48.1,59])   2.203528  0.866432   0.505353   3.901704  2.54
## log(alpha.1_(59,88])    -0.045668  0.248086  -0.531908   0.440571 -0.18
## log(alpha.2_(48.1,59])   1.995943  0.756742   0.512755   3.479130  2.64
## log(alpha.2_(59,88])    -0.009711  0.302048  -0.601714   0.582291 -0.03
##                         Pr(>|z|)    
## age_cl60-69               0.0580 .  
## age_cl70-83               0.2200    
## Sexefemme                 0.1500    
## dept71                    0.1400    
## Annee_dg2008              0.4100    
## Annee_dg2009              0.1900    
## Annee_dg2010              0.9100    
## Annee_dg2011              0.5600    
## stade2III-IV             1.2e-09 ***
## LocRectum                 0.0190 *  
## charlsoncl2>2             0.0360 *  
## charlsoncl21              0.2100    
## charlsoncl22              0.0360 *  
## [0-1.32[                  0.1000 .  
## [1.32-1.95[               0.0430 *  
## [1.95-2.62[               0.0410 *  
## [2.62-3.4[                0.0350 *  
## [3.4-4.07[                0.0280 *  
## [4.07-5[                  0.0900 .  
## log(alpha.0_(48.1,59])    0.0096 ** 
## log(alpha.0_(59,88])      0.5600    
## log(alpha.>2_(48.1,59])  8.1e-10 ***
## log(alpha.>2_(59,88])    3.1e-06 ***
## log(alpha.1_(48.1,59])    0.0110 *  
## log(alpha.1_(59,88])      0.8500    
## log(alpha.2_(48.1,59])    0.0084 ** 
## log(alpha.2_(59,88])      0.9700    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Excess hazard hazard ratio(s)
## (proportional effect variable(s) for exess hazard ratio(s))
##               exp(coef) lower 0.95 upper 0.95
## age_cl60-69      1.5837     0.9849      2.547
## age_cl70-83      1.3694     0.8275      2.266
## Sexefemme        1.3207     0.9005      1.937
## dept71           0.7636     0.5339      1.092
## Annee_dg2008     0.7990     0.4671      1.367
## Annee_dg2009     0.6604     0.3527      1.237
## Annee_dg2010     1.0312     0.6118      1.738
## Annee_dg2011     0.8445     0.4814      1.481
## stade2III-IV     5.1303     3.0301      8.686
## LocRectum        1.6231     1.0843      2.430
## charlsoncl2>2    2.6502     1.0664      6.586
## charlsoncl21     1.4084     0.8230      2.410
## charlsoncl22     1.9585     1.0461      3.667
## 
## and corrected scale parameters on population hazard 
##  (non proportional correction using breakpoint approach)
## 
##                    exp(coef) lower 0.95 upper 0.95
## alpha.0_(48.1,59]     3.3641     1.3427      8.429
## alpha.0_(59,88]       0.8918     0.6087      1.306
## alpha.>2_(48.1,59]   28.3330     9.7477     82.353
## alpha.>2_(59,88]      2.6288     1.7514      3.946
## alpha.1_(48.1,59]     9.0569     1.6576     49.487
## alpha.1_(59,88]       0.9554     0.5875      1.554
## alpha.2_(48.1,59]     7.3591     1.6699     32.431
## alpha.2_(59,88]       0.9903     0.5479      1.790
## 
## 
## number of observations: 2048;  number of events: 438
## Likelihood ratio test: 84.1  on 27 degree(s) of freedom, p=8.86e-08
# Calcul des taux de mortalité simple et cumulé
fit.hazrb <- exphaz(Surv(Time_cens5, statut_cens5) ~ age_cl + 
        Sexe + dept + Annee_dg + stade2 + Loc + charlsoncl2,
        data = CCR.2171.RENEW_P, ratetable = mua7522.rt,
        only_ehazard = FALSE, rmap = list(age = 'age_dg', 
        sex = 'sexe', year = 'date_dg', dept = 'dept'))

# Calcul des taux de mortalité cumulé
fit_thernrb <- survexp(Time_cens5*365.24 ~ 1, 
     rmap = list(age = age_dg*365.24, sex = sexe, 
     year = annee_dgjr, dept = dept), method="individual.h",
     data = CCR.2171.RENEW_P, ratetable = mua7522.rt)
 
# Extraction des taux 
CCR.2171.RENEW_P$expected <- fit.hazrb$ehazard
CCR.2171.RENEW_P$expectedCum <- as.vector(fit_thernrb)

# Ajout des covariables 
CCR.2171.RENEW_P <- CCR.2171.RENEW_P %>%
    mutate(
    # Crétion de la variable sexfemale
      sexeF = ifelse(Sexe=="femme", 1, 0),
      
    # Création de la variable age_cl60_70
      age_cl60_69 = ifelse(age_cl=="60-69", 1, 0),
    
    # Création de la variable age_cl70_83
      age_cl70_83 = ifelse(age_cl=="70-83", 1, 0),
    
    # Création de la variable depSL
     deptSL = ifelse(dept=="71", 1, 0),
    
    # Création de la variable Annee_dg2008
      Annee_dg2008 = ifelse(Annee_dg=="2008", 1, 0),
    
    # Création de la variable Annee_dg2009
      Annee_dg2009 = ifelse(Annee_dg=="2009", 1, 0),

    # Création de la variable Annee_dg2010
      Annee_dg2010 = ifelse(Annee_dg=="2010", 1, 0),

    # Création de la variable Annee_dg2011
      Annee_dg2011 = ifelse(Annee_dg=="2011", 1, 0),

    # Création de la variable StadeIII-IV
      StadeIII_IV = ifelse(stade2=="III-IV", 1, 0),
    
    # Création de la variable LocRectum
      LocRectum = ifelse(Loc=="Rectum", 1, 0),
    
    # Création de la variable charlsoncl1
      charlsoncl21 = ifelse(charlsoncl2=="1", 1, 0),
    
    # Création de la variable charlsoncl1
      charlsoncl22 = ifelse(charlsoncl2=="2", 1, 0),
    
    # Création de la variable charlsoncl1
      charlsoncl22p = ifelse(charlsoncl2==">2", 1, 0)
      )

# Modèle d'estimation de mortalité en excès de Rubio
# Exponentiated Weibull Distribution
# Hazard
hexpw <- function(t, lambda, kappa, alpha){
  pdf0 <-  alpha*dweibull(t, scale=lambda,
           shape=kappa)*pweibull(t, scale=lambda, 
           shape=kappa)^(alpha-1) 
  cdf0 <- pweibull(t, scale=lambda, shape=kappa)^alpha
  cdf0 <- ifelse(cdf0==1, 0.9999999, cdf0)
  return(pdf0/(1-cdf0))
}                                                                                      
# Cumulative hazard
Hexpw <- function(t, lambda, kappa, alpha, log.p=FALSE){
  cdf <- pweibull(t, scale=lambda, shape=kappa)^alpha  
  return(-log(1-cdf))
} 

# Fonction de calcul de l'intervalle de confiance
Conf.Int <- function(FUN, MLE, level=0.95){
  sd.int <- abs(qnorm(0.5*(1-level)))
  HESS <- hessian(FUN, x=MLE)
  Fisher.Info <- solve(HESS, tol=1e-18)
  Sigma <- sqrt(diag(Fisher.Info))
  U<- MLE + sd.int*Sigma
  L<- MLE - sd.int*Sigma
  exp.U <- exp(U)
  exp.L <- exp(L)
  C.I <- cbind(L, U, exp.L, exp.U, MLE, Sigma)
  rownames(C.I)<- paste0("par", seq_along(1:length(MLE)))
  colnames(C.I)<- c("Lower","Upper","exp.Lower","exp.Upper","Transf MLE", "Std. Error")
  return(C.I)
}

# Différence des risques cumulés
Hp.diff.new <- as.vector(CCR.2171.RENEW_P[,"expectedCum"])

# Variables utilisées dans la modélisation
# Taux attendus
hp.as = as.vector(CCR.2171.RENEW_P[,"expected"])

# Statut de censure : 0 = vivant, 1 = décédé
ind.cens = as.vector(CCR.2171.RENEW_P[,"statut_cens5"]) 

# Temps de survie.
surv.time = as.vector(CCR.2171.RENEW_P[,"Time_cens5"]) 

# Matrice des variables explicatives.
x2 = as.matrix(CCR.2171.RENEW_P[,c("age_cl60_69",
    "age_cl70_83","sexeF", "deptSL", "Annee_dg2008", 
    "Annee_dg2009", "Annee_dg2010", "Annee_dg2011", 
    "StadeIII_IV", "LocRectum", "charlsoncl21", 
    "charlsoncl22", "charlsoncl22p")]) 

# Fonction pour le modèle de Rubio
log.lik.E = function(par){
  if(any(is.na(par))) return(Inf)
  theta = exp(par[1]); 
  kappa = exp(par[2])/exp(par[1]); 
  ae0 = exp(par[3]); 
  be0 = exp(par[4]); 
  ce0 <- exp(par[5]);
  beta1 <- par[6:(6 + dim(x2)[2]-1)];
  exp.x.beta1 <- exp(x2%*%beta1)
  haz0 = kappa*theta*hp.as/(1 + theta*Hp.diff.new) +
         hexpw(surv.time, ae0, be0, ce0)*exp.x.beta1
  log.Surv0 = -Hexpw(surv.time, ae0, be0, ce0)*exp.x.beta1 -
         kappa*log(1+theta*Hp.diff.new)
  val = sum(- ind.cens*log(haz0) - log.Surv0)
  ifelse(is.na(val), return(Inf), return(val))
}

# Étape d’optimisation
initE <- c(1.2, 1.2, 1.2, 1, 1, rep(0, 13))
OPTE1 = nlminb(initE, log.lik.E, control=list(iter.max=10000))
MLEE <- c(exp(OPTE1$par[1:18]), OPTE1$par[-c(1:18)])
AICE <- 2*OPTE1$objective + 2*length(MLEE)
IC <- Conf.Int(log.lik.E, OPTE1$par, level=0.95)

# Affichage des résultats du modèle Rubio
## Affichage des valeurs des hazards ratios (HR)
MLEE
##  [1] 3.567113e-11 7.679507e-01 2.044437e-03 1.447565e-01 5.785766e+01
##  [6] 1.065364e+00 1.159950e+00 9.247782e-01 8.059488e-01 7.192918e-01
## [11] 6.465383e-01 9.247240e-01 1.021733e+00 3.763313e+00 1.436335e+00
## [16] 1.471421e+00 1.972947e+00 5.147685e+00
# Affichage des résultats du modèle Rubio
## Affichage des intervalles de confiance
IC
##               Lower       Upper    exp.Lower    exp.Upper   Transf MLE
## par1  -49.944235875  1.83087703 2.039360e-22 6.239356e+00 -24.05667942
## par2   -0.680456011  0.15239655 5.063860e-01 1.164622e+00  -0.26402973
## par3  -24.649526522 12.26426096 1.971726e-11 2.119829e+05  -6.19263278
## par4   -3.186248617 -0.67915567 4.132661e-02 5.070449e-01  -1.93270214
## par5    0.067177911  8.04879363 1.069486e+00 3.130017e+03   4.05798577
## par6   -0.332797787  0.45943006 7.169151e-01 1.583171e+00   0.06331613
## par7   -0.282125014  0.57887856 7.541794e-01 1.784037e+00   0.14837677
## par8   -0.377330172  0.22092746 6.856896e-01 1.247233e+00  -0.07820135
## par9   -0.503681930  0.07221170 6.043016e-01 1.074883e+00  -0.21573512
## par10  -0.762763602  0.10378734 4.663758e-01 1.109365e+00  -0.32948813
## par11  -0.944527682  0.07228203 3.888632e-01 1.074958e+00  -0.43612283
## par12  -0.506847546  0.35032765 6.023916e-01 1.419533e+00  -0.07825995
## par13  -0.380081979  0.42308284 6.838053e-01 1.526661e+00   0.02150043
## par14   0.953358317  1.69724102 2.594408e+00 5.458866e+00   1.32529967
## par15   0.030599743  0.69359018 1.031073e+00 2.000886e+00   0.36209496
## par16   0.003631898  0.76882576 1.003639e+00 2.157232e+00   0.38622883
## par17   0.237361461  1.12169544 1.267899e+00 3.070055e+00   0.67952845
## par18   1.190176884  2.08691738 3.287663e+00 8.060031e+00   1.63854713
##       Std. Error
## par1  13.2081797
## par2   0.2124663
## par3   9.4169556
## par4   0.6395763
## par5   2.0361639
## par6   0.2021027
## par7   0.2196478
## par8   0.1526195
## par9   0.1469143
## par10  0.2210630
## par11  0.2593950
## par12  0.2186712
## par13  0.2048927
## par14  0.1897695
## par15  0.1691333
## par16  0.1952061
## par17  0.2255995
## par18  0.2287645
# Affichage des résultats du modèle Rubio
## Affichage de la valeur de l'AIC
AICE
## [1] 3118.929
# Estimation de la survie nette par la méthode de Goungounga
## Modèle de GOUNGOUNGA
fit.correctedrb4 <- mexhazLT(
  Surv(Time_cens5, statut_cens5) ~ age_cl + 
  Sexe + dept + Annee_dg + stade2 + Loc + charlsoncl2,
  data = CCR.2171.RENEW_P, expected = "expected",
  expectedCum = "expectedCum", base = "exp.bs",
  knots = quantile(CCR.2171.RENEW_P$Time_cens5[
  CCR.2171.RENEW_P$statut_cens5==1], probs = c(1:2/3)), 
  degree = 3, pophaz = "rescaled")
## iteration = 0
## Step:
##  [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## Parameter:
##  [1] -1 -1 -1 -1 -1 -1  0  0  0  0  0  0  0  0  0  0  0  0  0  0
## Function Value
## [1] 2225.799
## Gradient:
##  [1] 953.44184 205.63667 203.45829 175.34654 102.41203  55.85222 276.20871
##  [8] 429.19607 426.34277 523.41553 223.07189 178.37873 160.43595 201.24088
## [15] 225.63408 214.56505  39.11229 196.88342  97.71925 117.29800
## 
## iteration = 115
## Parameter:
##  [1] -9.28023963  6.01388330  4.43920868  5.64996204  4.89548326  4.59699296
##  [7]  0.06168975  0.09819352 -0.04837325 -0.20000761 -0.32016559 -0.45081978
## [13] -0.08391189  0.01457022  1.37132929  0.40388399  1.68361022  0.39607250
## [19]  0.73716857 -0.18005001
## Function Value
## [1] 1538.078
## Gradient:
##  [1] -7.994698e-05  1.698247e-05 -5.819345e-05 -6.628128e-05  1.694638e-05
##  [6]  4.179240e-06 -3.716536e-05 -1.097874e-05 -1.645640e-04  4.642629e-05
## [11]  3.015612e-04 -6.342134e-05 -3.739251e-04 -9.647465e-06  3.596317e-05
## [16] -2.759248e-04  1.672137e-05 -6.253117e-05  2.614115e-05 -1.597527e-05
## 
## Relative gradient close to zero.
## Current iterate is probably solution.
## 
## 
## Data
##  Name N.Obs.Tot N.Obs N.Events N.Clust
##  data      1868  1868      419       1
## 
## Details
##  Iter Eval   Base Nb.Leg Nb.Aghq Optim Method Code    LogLik Total.Time
##   115 2356 exp.bs     20      10   nlm    ---    1 -1538.078       7.02
# Prédiction de la mortalité de base en excès
predict_correctedrb4 <- predict(fit.correctedrb4,
            time.pts = CCR.2171.RENEW_P$Time_cens5[
            CCR.2171.RENEW_P$statut_cens5==1],
            data.val = CCR.2171.RENEW_P[1, ])

# Affichage des résultats
summary(fit.correctedrb4)
## Call:
## mexhazLT(formula = Surv(Time_cens5, statut_cens5) ~ age_cl + 
##     Sexe + dept + Annee_dg + stade2 + Loc + charlsoncl2, data = CCR.2171.RENEW_P, 
##     expected = "expected", expectedCum = "expectedCum", pophaz = "rescaled", 
##     base = "exp.bs", degree = 3, knots = quantile(CCR.2171.RENEW_P$Time_cens5[CCR.2171.RENEW_P$statut_cens5 == 
##         1], probs = c(1:2/3)))
## 
## Coefficients:
##                Estimate    StdErr z.value   p.value    
## Intercept     -9.280240  1.642249 -5.6509 1.596e-08 ***
## BS3.1          6.013883  2.175807  2.7640  0.005710 ** 
## BS3.2          4.439209  1.398303  3.1747  0.001500 ** 
## BS3.3          5.649962  1.828225  3.0904  0.001999 ** 
## BS3.4          4.895483  1.566009  3.1261  0.001771 ** 
## BS3.5          4.596993  1.735095  2.6494  0.008063 ** 
## age_cl60-69    0.061690  0.206889  0.2982  0.765567    
## age_cl70-83    0.098194  0.223024  0.4403  0.659733    
## Sexefemme     -0.048373  0.157639 -0.3069  0.758950    
## dept71        -0.200008  0.151192 -1.3229  0.185877    
## Annee_dg2008  -0.320166  0.228055 -1.4039  0.160350    
## Annee_dg2009  -0.450820  0.270167 -1.6687  0.095183 .  
## Annee_dg2010  -0.083912  0.226646 -0.3702  0.711208    
## Annee_dg2011   0.014570  0.212598  0.0685  0.945361    
## stade2III-IV   1.371329  0.193544  7.0853 1.387e-12 ***
## LocRectum      0.403884  0.174101  2.3198  0.020351 *  
## charlsoncl2>2  1.683610  0.231826  7.2624 3.804e-13 ***
## charlsoncl21   0.396072  0.203299  1.9482  0.051388 .  
## charlsoncl22   0.737169  0.232003  3.1774  0.001486 ** 
## log(Alpha)    -0.180050  0.185962 -0.9682  0.332941    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Hazard ratios (for proportional effect variables):
##                  Coef     HR CI.lower CI.upper
## age_cl60-69    0.0617 1.0636   0.7091   1.5955
## age_cl70-83    0.0982 1.1032   0.7125   1.7080
## Sexefemme     -0.0484 0.9528   0.6995   1.2977
## dept71        -0.2000 0.8187   0.6088   1.1011
## Annee_dg2008  -0.3202 0.7260   0.4643   1.1352
## Annee_dg2009  -0.4508 0.6371   0.3752   1.0819
## Annee_dg2010  -0.0839 0.9195   0.5897   1.4338
## Annee_dg2011   0.0146 1.0147   0.6689   1.5392
## stade2III-IV   1.3713 3.9406   2.6966   5.7585
## LocRectum      0.4039 1.4976   1.0647   2.1067
## charlsoncl2>2  1.6836 5.3850   3.4186   8.4823
## charlsoncl21   0.3961 1.4860   0.9976   2.2134
## charlsoncl22   0.7372 2.0900   1.3264   3.2933
## 
## log-likelihood: -1538.0776 (for 20 degree(s) of freedom)
## 
## number of observations: 1868, number of events: 419

I.7.3. Reprèsentations graphiques des taux de mortalité de base en excès

# Mise en page : 3 lignes, 2 colonnes
par(mfrow = c(2, 3), mar = c(4.5, 4.5, 3, 1), oma = c(1, 1, 2, 1))

## -------- Estève I --------
plot(predict_estrb61,
     what = "hazard",
     xlab = "Temps de suivi en années", 
     ylab = "Mortalité en excès de base",
     col = "red",
     main = "Estève I",
     lwd = 2,
     xlim = c(0, 5),
     ylim = c(0, 0.1),
     axes = FALSE)
axis(1); axis(2, at = seq(0, 0.1, by = 0.05)); box()

## -------- Estève II --------
plot(predict_estrb62,
     what = "hazard",
     xlab = "Temps de suivi en années", 
     ylab = "Mortalité en excès de base",
     col = "blue",
     main = "Estève II",
     lwd = 2,
     xlim = c(0, 5),
     ylim = c(0, 0.1),
     axes = FALSE)
axis(1); axis(2, at = seq(0, 0.1, by = 0.05)); box()

## -------- Touraine --------
plot(predict_correctedrb1,
     what = "hazard",
     xlab = "Temps de suivi en années", 
     ylab = "Mortalité en excès de base",
     col = "darkred",
     main = "Touraine",
     lwd = 2,
     xlim = c(0, 5),
     ylim = c(0, 0.1),
     axes = FALSE)
axis(1); axis(2, at = seq(0, 0.1, by = 0.05)); box()

## -------- MBA --------
plot(predict_correctedrb2,
     what = "hazard",
     xlab = "Temps de suivi en années", 
     ylab = "Mortalité en excès de base",
     col = "darkgreen",
     main = "MBA",
     lwd = 2,
     xlim = c(0, 5),
     ylim = c(0, 0.1),
     axes = FALSE)
axis(1); axis(2, at = seq(0, 0.1, by = 0.05)); box()

## -------- Goungounga --------
plot(predict_correctedrb4$results$time.pts,
     predict_correctedrb4$results$hazard,
     type = "l", lwd = 2,
     xlab = "Temps de suivi en années",
     ylab = "Mortalité en excès de base",
     ylim = c(0, 0.1),
     xlim = c(0, 5),
     col = "purple",
     main = "Goungounga")
lines(predict_correctedrb4$results$time.pts,
      predict_correctedrb4$results$hazard.inf,
      lwd = 1, col = "black", lty = 2)
lines(predict_correctedrb4$results$time.pts,
      predict_correctedrb4$results$hazard.sup,
      lwd = 1, col = "black", lty = 2)

## -------- Rubio --------
# Fonction principale et intervalles de confiance
fit <- Vectorize(function(t) 
  hexpw(t, MLEE[3], MLEE[4], MLEE[5]))
ic_l <- Vectorize(function(t) 
  hexpw(t, IC[3, 3], IC[4, 3], IC[5, 3]))
ic_u <- Vectorize(function(t) 
  hexpw(t, IC[3, 4], IC[4, 4], IC[5, 4]))

curve(fit, from = 0, to = 5,
      xlab = "Temps de suivi en années", 
      ylab = "Mortalité en excès de base",
      main = "Rubio", col = "black",
      lwd = 2, ylim = c(0, 0.1), n = 1000)
curve(ic_l, from = 0, to = 5, lwd = 1.5, 
      lty = 2, col = "red", add = TRUE)
curve(ic_u, from = 0, to = 5, lwd = 1.5, 
      lty = 2, col = "red", add = TRUE)

  1. Données de la London School of Hygiene & Tropical Medicine

II.1. DATA MANAGEMENT

# DATA MANAGEMENT 
# Chargement du jeu de données
setwd("E:/PARCOURS M2 MQERS 2024_2025/MON STAGE/DATASTAGE")
load("lung12boost.RData")

# Création de l'index de charlson
lung12boost$charlson <- with(lung12boost, 
  ifelse(agediag >= 40, (agediag - 40) %/% 10, 0) + #Score lié à l’âge
  1 * CM_HES_6_108_cm1 +
  1 * CM_HES_6_108_cm2 +
  1 * CM_HES_6_108_cm3 +
  1 * CM_HES_6_108_cm4 +
  1 * CM_HES_6_108_cm5 +
  1 * CM_HES_6_108_cm6 +
  1 * CM_HES_6_108_cm7 +
  1 * CM_HES_6_108_cm8 +
  1 * CM_HES_6_108_cm9 +
  1 * CM_HES_6_108_cm10 +
  2 * CM_HES_6_108_cm11 +
  2 * CM_HES_6_108_cm12 +
  2 * CM_HES_6_108_cm13 +
  3 * CM_HES_6_108_cm14 +
  6 * CM_HES_6_108_cm15 +
  2 * CM_HES_6_108_cm16 +
  6 * CM_HES_6_108_cm17
  # Obésité (cm18) non prise en compte
)

# Recodage et traitement des données 
lung12boost <- lung12boost %>%
    mutate(
      
      # Mettre au format date
        date_admc = dmy("31/12/2015"),
      
      # Creation de la variable time_admc en jours
        time_admc = as.numeric(difftime(date_admc, 
                     diagdate2, units = "days")),
      
      # Creation de la variable Time_admc en années
        Time_admc = time_admc / 365.241,
      
      # Creation de la variable time en jours
        survtyear = survtday / 365.241,
      
      # Creation de la variable temps de censure time_cens
        survtdaycens = pmin(survtday, time_admc),
      
      # Creation de la variable temps de censure en annees
        survtyearcens = survtdaycens / 365.241,
      
      # Creation de la variable statut de la censure administrative
        deadcens = ifelse(survtday > time_admc, 0, dead),
 
      # Création de la variable statut vital avec des labelles 
        Deadcens = factor(deadcens, levels = c(0, 1), 
                      labels = c("Vivant", "Mort")),

      # Creation de la variable charlsoncl en classes
        charlsoncl = as.factor(
                     ifelse(charlson %in% c(0, 1), "0-1",
                     ifelse(charlson == 2, "2",
                     ifelse(charlson == 3, "3",
                     ifelse(charlson %in% c(4, 5), "4-5",
                            "6+"))))),
      
      # Creation de la variable charlsoncl2 en classes
        charlsoncl1 = as.factor(ifelse(charlson == 0, "0",
                      ifelse(charlson == 1, "1", 
                      ifelse(charlson == 2, "2", ">2")))),

      # Creation de la variable charlsoncl2 en classes
        charlsoncl2 = as.factor(ifelse(charlson == 0, "0",
              ifelse(charlson == 1 | charlson == 2, "1-2", 
              ifelse(charlson == 3 | charlson == 4, "3-4",
                     ">4")))),

      # Creation de la variable classe d'age 
        agediagcl = as.factor(ifelse(agediag < 73, "40-72", 
                    "73-90")),
      
      # Création de la variable Annee_dg pour les années 
        Annee_dg = as.factor(year(diagdate2)),
      
      # Création de la variable Annee_dg pour les années 
        annee_dg = as.numeric(year(diagdate2)),
      
      # Récodage de la variable sex avec des labelles
        sex = factor(sex, levels=c(1,2), 
                     labels=c("male", "female")),
      
      # Création d'une variable âge en jours 
        Agediag = agediag*365.241,
      
      # Création de la variable Stade avec des labelles 
        stade = factor(tnm_group_restr, levels=c(1,2,3,4),
                       labels=c("I","II", "III","IV") ),

      # Creation de la variable classe d'age 
        Stade = as.factor(ifelse(stade=="I" | stade=="II", 
                    "I-II", "III-IV")),      
            
      # Création de la variable statut vital avec des labelles 
        Dead = factor(dead, levels = c(0, 1), 
                      labels = c("Vivant", "Mort"))
      )

# Chargement de la table de mortalité 
expDF.UK <-dget("./expDF_UK.dat")
setDT(expDF.UK)

# Transformation d'un fichier expDF.UK au format ratetable 
dept_ratetable_list = list()
depts = levels(expDF.UK$deprivq)
for (dep in depts){
  dept_ratetable_list = c(dept_ratetable_list,
              list(transrate(expDF.UK[sex== "1" &
              deprivq == dep,.(age , year, yearsurv=1-rate)] %>%
              pivot_wider(id_cols=age, names_from = year,
              values_from = yearsurv) %>%
              select(!age) %>%
              as_tibble() %>% as.matrix(),
              expDF.UK[sex == "2" &
              deprivq==dep,.(age, year,yearsurv=1-rate)] %>%
              pivot_wider(id_cols=age, names_from = year,
              values_from = yearsurv) %>%
              select(!age) %>%
              as_tibble() %>% 
              as.matrix(),
  yearlim=c(expDF.UK[deprivq == dep ,min(year)],
  expDF.UK[,max(year)]), int.length=1)))
}

names(dept_ratetable_list) <- depts
rt_lifetable = joinrate(dept_ratetable_list, dim.name = "deprivq")

# Affectation 
Anneejrs <- 365.241

# Traitement des valeurs manquantes
lung12boost_P <- lung12boost %>%
  select(sex, agediag, gor, typeKC, charlson, Dead,
         survtday, charlsoncl1, dead, dep, tnm_group_restr, 
         survtyear, deadcens, survtyearcens, Stade,
         Annee_dg, diagdate2, survtdaycens, agediagcl,
         Agediag, annee_dg, Deadcens, charlsoncl,
         charlsoncl2, stade) %>%
  na.omit()

II.2. Critère de sélection des individus d’étude : Analyse avec nessie

L’analyse a été restreinte aux individus pour lesquels l’information sur le stade du cancer du poumon était disponible. Compte tenu des limitations méthodologiques liées à l’estimation de la survie nette dans les populations très âgées, une analyse exploratoire a été menée à l’aide de l’outil NESSIE, afin d’identifier les âges pour lesquels cette estimation demeure fiable. Cette analyse a conduit à l’exclusion des patients âgés de moins de 40 ans et de plus de 90 ans. En effet, les individus de moins de 40 ans présentaient un effectif attendus faibles après un suivi de 4 ans, tandis que ceux âgés de plus de 90 ans avaient une survie attendue inférieure à la durée maximale de suivi. Ces deux groupes présentaient donc un risque plus élevé de décéder de causes autres que le cancer du poumon, pouvant compromettre la validité de l’estimation de la survie nette (cf. Tableau ci-dessous). La population d’étude était constituée de 30729 personnes.

# Création de la covariable agegrp
agegrp <- c(15, seq(from = 20, to =95, by = 5), Inf)
lung12boost_P$agegrp <- cut(lung12boost_P$agediag, agegrp)

# Analyse avec nessie 
nessie_analyse <- nessie(Surv(survtdaycens, deadcens) ~ agegrp, 
     data = lung12boost_P, ratetable = rt_lifetable, 
     times =c(0,1,2,3,round(max(lung12boost_P$survtyearcens), 2)), 
     rmap = list(age = Agediag, sex = sex, deprivq = dep, 
     year = diagdate2))
## 
##                   0      1      2      3   4.01 c.exp.surv
## agegrp(15,20]     2    2.0    2.0    2.0    2.0       57.7
## agegrp(20,25]     9    9.0    9.0    9.0    9.0       57.3
## agegrp(25,30]    16   16.0   16.0   16.0   16.0       52.2
## agegrp(30,35]    40   40.0   39.9   39.9   39.9       48.1
## agegrp(35,40]    65   64.9   64.9   64.8   64.7       42.6
## agegrp(40,45]   246  245.6  245.1  244.6  244.1       38.0
## agegrp(45,50]   565  563.5  561.8  559.9  557.9       33.2
## agegrp(50,55]  1153 1148.2 1143.1 1137.5 1131.3       28.8
## agegrp(55,60]  2251 2236.6 2220.9 2203.7 2184.9       24.5
## agegrp(60,65]  3894 3854.3 3811.3 3764.3 3712.8       20.3
## agegrp(65,70]  5266 5182.3 5091.5 4993.5 4885.9       16.6
## agegrp(70,75]  5618 5469.5 5310.1 5138.8 4953.7       13.0
## agegrp(75,80]  5222 4996.4 4756.6 4504.3 4236.1        9.9
## agegrp(80,85]  4141 3840.3 3530.9 3214.8 2893.2        7.2
## agegrp(85,90]  2351 2063.7 1784.5 1516.8 1263.2        5.0
## agegrp(90,95]   698  558.4  435.3  329.8  240.0        3.4
## agegrp(95,Inf]   74   49.7   32.2   20.1   12.3        2.3
# Suppression des individus ayant des âges < 40 et > 90 ans
lung12boost_P <- lung12boost_P[lung12boost_P$agediag <= 90 
                    & lung12boost_P$agediag >= 40, ]

II.3. Analyse descriptive

II.3.1. Selon le statut vital

# Affectation des resultats à l'object tab
tab <- lung12boost_P %>%
  select(c(sex, agediag, agediagcl, gor, typeKC, 
  charlson, survtyearcens, charlsoncl1, Deadcens, 
  dep, Stade, Annee_dg)) %>% 
  tbl_summary(
    by = Deadcens,
    type = c(agediag, survtyearcens, charlson) ~ "continuous2",
    statistic = all_continuous2() ~ c("{mean} ({median}),
    {sd}({min}-{max})", "{p25}-{p75}"), percent = "row") %>%
  add_overall(last = TRUE) %>%
  bold_labels()

# Sélection uniquement des colonnes "mort" et "Total" 
tab %>%
  modify_table_styling(columns = "stat_1", hide = TRUE) %>%
  as_kable()
Characteristic Mort N = 26,204 Overall N = 30,729
sex
male 14,716 (87%) 16,857 (100%)
female 11,488 (83%) 13,872 (100%)
agediag
Mean (Median),SD(Min-Max) 72 (73),10(40-90) 72 (72),10(40-90)
Q1-Q3 66-80 65-79
agediagcl
40-72 12,626 (81%) 15,523 (100%)
73-90 13,578 (89%) 15,206 (100%)
gor
A 2,093 (87%) 2,412 (100%)
B 4,618 (85%) 5,413 (100%)
D 3,209 (84%) 3,805 (100%)
E 2,029 (85%) 2,396 (100%)
F 2,689 (86%) 3,112 (100%)
G 2,800 (86%) 3,256 (100%)
H 2,644 (84%) 3,147 (100%)
J 3,623 (85%) 4,256 (100%)
K 2,499 (85%) 2,932 (100%)
typeKC
NSCLC 22,890 (84%) 27,205 (100%)
SCLC 3,314 (94%) 3,524 (100%)
charlson
Mean (Median),SD(Min-Max) 3.50 (3.00),1.75(0.00-16.00) 3.42 (3.00),1.72(0.00-16.00)
Q1-Q3 2.00-4.00 2.00-4.00
survtyearcens
Mean (Median),SD(Min-Max) 0.71 (0.41),0.78(0.00-3.95) 1.12 (0.57),1.22(0.00-4.01)
Q1-Q3 0.14-1.02 0.16-1.71
charlsoncl1
>2 18,646 (87%) 21,365 (100%)
0 467 (78%) 601 (100%)
1 1,926 (80%) 2,414 (100%)
2 5,165 (81%) 6,349 (100%)
dep
1 3,613 (84%) 4,304 (100%)
2 4,401 (84%) 5,257 (100%)
3 5,258 (86%) 6,105 (100%)
4 6,194 (86%) 7,237 (100%)
5 6,738 (86%) 7,826 (100%)
Stade
I-II 3,911 (55%) 7,146 (100%)
III-IV 22,293 (95%) 23,583 (100%)
Annee_dg
2011 157 (89%) 177 (100%)
2012 25,950 (85%) 30,442 (100%)
2013 97 (88%) 110 (100%)

II.3.2. Selon le statut vital et le type de cancer

# Affectation des resultats à l'object tab
tab <- lung12boost_P %>%
  select(c(sex, agediag, agediagcl, gor, typeKC, 
  charlson, survtyearcens, charlsoncl1, Deadcens, 
  dep, Stade, Annee_dg)) %>% 
  tbl_strata(
    strata = typeKC,  
    .tbl_fun = ~ .x %>%
      tbl_summary(
        by = Deadcens, 
        type = c(agediag, survtyearcens, charlson) ~ "continuous2",
        statistic = all_continuous2() ~ c("{mean} ({median}), {sd}({min}-{max})", "{p25}-{p75}"),
        percent = "row"
      ) %>%
      add_overall(last = TRUE)
  ) %>%
  bold_labels()

# Sélection uniquement des colonnes "mort" et "Total"
Tab <- tab %>%
  modify_table_styling(columns = c("stat_1_1", "stat_1_2"), 
  hide = TRUE) 
# Conversion en flextable
as_flex_table(Tab)

NSCLC

SCLC

Characteristic

Mort
N = 22,8901

Overall
N = 27,2051

Mort
N = 3,3141

Overall
N = 3,5241

sex

male

12,953 (86%)

15,009 (100%)

1,763 (95%)

1,848 (100%)

female

9,937 (81%)

12,196 (100%)

1,551 (93%)

1,676 (100%)

agediag

Mean (Median), SD(Min-Max)

73 (73), 10(40-90)

72 (73), 10(40-90)

70 (70), 9(40-90)

69 (70), 9(40-90)

Q1-Q3

66-80

65-80

63-76

63-76

agediagcl

40-72

10,611 (80%)

13,347 (100%)

2,015 (93%)

2,176 (100%)

73-90

12,279 (89%)

13,858 (100%)

1,299 (96%)

1,348 (100%)

gor

A

1,822 (86%)

2,127 (100%)

271 (95%)

285 (100%)

B

3,971 (84%)

4,722 (100%)

647 (94%)

691 (100%)

D

2,776 (83%)

3,341 (100%)

433 (93%)

464 (100%)

E

1,738 (83%)

2,092 (100%)

291 (96%)

304 (100%)

F

2,353 (86%)

2,752 (100%)

336 (93%)

360 (100%)

G

2,450 (85%)

2,886 (100%)

350 (95%)

370 (100%)

H

2,366 (83%)

2,851 (100%)

278 (94%)

296 (100%)

J

3,207 (84%)

3,810 (100%)

416 (93%)

446 (100%)

K

2,207 (84%)

2,624 (100%)

292 (95%)

308 (100%)

charlson

Mean (Median), SD(Min-Max)

3.56 (3.00), 1.77(0.00-16.00)

3.46 (3.00), 1.74(0.00-16.00)

3.12 (3.00), 1.55(0.00-12.00)

3.09 (3.00), 1.54(0.00-12.00)

Q1-Q3

2.00-4.00

2.00-4.00

2.00-4.00

2.00-4.00

survtyearcens

Mean (Median), SD(Min-Max)

0.72 (0.41), 0.79(0.00-3.95)

1.15 (0.58), 1.24(0.00-4.01)

0.65 (0.46), 0.67(0.00-3.67)

0.82 (0.51), 0.94(0.00-4.00)

Q1-Q3

0.14-1.03

0.17-1.85

0.13-0.93

0.14-1.08

charlsoncl1

>2

16,540 (86%)

19,158 (100%)

2,106 (95%)

2,207 (100%)

0

395 (76%)

521 (100%)

72 (90%)

80 (100%)

1

1,605 (78%)

2,064 (100%)

321 (92%)

350 (100%)

2

4,350 (80%)

5,462 (100%)

815 (92%)

887 (100%)

dep

1

3,223 (83%)

3,875 (100%)

390 (91%)

429 (100%)

2

3,838 (82%)

4,662 (100%)

563 (95%)

595 (100%)

3

4,618 (85%)

5,436 (100%)

640 (96%)

669 (100%)

4

5,439 (85%)

6,429 (100%)

755 (93%)

808 (100%)

5

5,772 (85%)

6,803 (100%)

966 (94%)

1,023 (100%)

Stade

I-II

3,766 (54%)

6,936 (100%)

145 (69%)

210 (100%)

III-IV

19,124 (94%)

20,269 (100%)

3,169 (96%)

3,314 (100%)

Annee_dg

2011

147 (90%)

164 (100%)

10 (77%)

13 (100%)

2012

22,661 (84%)

26,946 (100%)

3,289 (94%)

3,496 (100%)

2013

82 (86%)

95 (100%)

15 (100%)

15 (100%)

1n (%)

II.4. Graphiques des tables de mortalité

# Recodage des variables 
expDF.UK <- expDF.UK %>%
  mutate(
  # Recodage de la variable sexe en factor 
    sex = factor(sex, levels = c(1,2), 
                  labels= c("male","female")),
    
  # Recodage de la variable deprivq en factor
    deprivq = factor(deprivq),
  )

# Taux de mortalité par âge, année et deprivation, 
# avec facettes par sexe
ggplot(data = expDF.UK) +
  aes(x = age, y = rate, group = year, color = factor(year)) +
  geom_line(alpha = 0.6) +
  facet_wrap(sex ~ deprivq) +
  scale_color_viridis_d(option = "plasma") +
  xlab("Âge") + 
  ylab("Taux de mortalité") +
  theme_minimal() +
  theme(legend.position = "right") +
  guides(color = guide_legend(title = "Année"))

# Taux de mortalité par âge, année et sexe, 
# avec facettes par département
p1 <- ggplot(data = expDF.UK) + 
  aes(age, rate, color = interaction(year, sex)) +
  geom_line() +
  facet_wrap(~ deprivq) +
  xlab("Âge") + 
  ylab("Taux de mortalité")

II.5. Courbes de survie nette

# Distribution de la survie globale et de la survie nette
# Kaplan Meier
KM <- survfit(Surv(survtdaycens, deadcens) ~ 1, 
              data = lung12boost_P)

# Pohar Perme : Côlon + Rectum
net_surv <- rs.surv(Surv(survtdaycens, deadcens) ~ 1,
          ratetable = rt_lifetable , data = lung12boost_P,
          rmap = list(age = Agediag, sex = sex, 
          deprivq = dep, year = diagdate2), method = "pohar-perme")

# Créer une liste contenant les deux ajustements
fit <- list("Survie brute" = KM, "Survie nette" = net_surv)

# Tracer les courbes de survie avec les intervalles
# de confiance en pointillés
ggsurvplot_combine(
  fit,
  data = lung12boost_P,
  conf.int = TRUE, 
  censor = FALSE, 
  palette = c("blue", "red"),
  xscale = Anneejrs,
  xlab = "Temps après chirurgie (années)", 
  ylab = "Probabilité de survie",
  legend.title = "Méthodes",
  legend.labs = c("Survie brute", "Survie nette")
)

# Estimation de la survie nette
# Selon la variable agediag_cl
net_surv_ag <- rs.surv(Surv(survtdaycens, deadcens) ~ agediagcl,
            ratetable = rt_lifetable , data = lung12boost_P,
            rmap = list(age = Agediag, sex = sex, 
            deprivq = dep, year = diagdate2), method = "pohar-perme")

# Distribution de la survie nette
ggsurvplot(net_surv_ag, 
           data = lung12boost_P,
           xscale = Anneejrs,
           axes.offset = FALSE,
           xlim = c(-0.3 * Anneejrs, 4.5 * Anneejrs),
           conf.int = FALSE,
           size = 0.8,
           censor = FALSE,
           risk.table = "nrisk_cumcensor",
           palette = "jco",
           tables.height = 0.25,
           break.x.by = 2 * Anneejrs,
           fontsize = 2.8,
           tables.y.text = FALSE,
           tables.theme = theme_cleantable(),
           xlab = "Temps de suivi (Années)",
           ylab = "Probabilité de survie nette",
           risk.table.title = "Sujets à risque (censures)",
           legend.title = "Âge",
           legend.labs = levels(lung12boost_P$agediagcl),
           legend = "right", 
           font.title = c(15, "bold", "black"),
           conf.int.style = "step"
)

# Estimation de la survie nette
# Selon la variable sexe
net_surv_se <- rs.surv(Surv(survtdaycens, deadcens) ~ sex,
            ratetable = rt_lifetable , data = lung12boost_P,
            rmap = list(age = Agediag, sex = sex, 
            deprivq = dep, year = diagdate2), method = "pohar-perme")

# Distribution de la survie nette
ggsurvplot(net_surv_se, 
           data = lung12boost_P,
           xscale = Anneejrs,
           axes.offset = FALSE,
           xlim = c(-0.3 * Anneejrs, 4.5 * Anneejrs),
           conf.int = FALSE,
           size = 0.8,
           censor = FALSE,
           risk.table = "nrisk_cumcensor",
           palette = "jco",
           tables.height = 0.25,
           break.x.by = 2 * Anneejrs,
           fontsize = 2.8,
           tables.y.text = FALSE,
           tables.theme = theme_cleantable(),
           xlab = "Temps de suivi (Années)",
           ylab = "Probabilité de survie nette",
           risk.table.title = "Sujets à risque (censures)",
           legend.title = "Sexe",
           legend.labs = levels(lung12boost_P$sex),
           legend = "right", 
           font.title = c(15, "bold", "black"),
           conf.int.style = "step"
)

# Estimation de la survie nette
# Selon la variable deprivation
net_surv_de <- rs.surv(Surv(survtdaycens, deadcens) ~ dep,
            ratetable = rt_lifetable , data = lung12boost_P,
            rmap = list(age = Agediag, sex = sex, 
            deprivq = dep, year = diagdate2), method = "pohar-perme")

# Distribution de la survie nette
ggsurvplot(net_surv_de, 
           data = lung12boost_P,
           xscale = Anneejrs,
           axes.offset = FALSE,
           xlim = c(-0.3 * Anneejrs, 4.5 * Anneejrs),
           conf.int = FALSE,
           size = 0.8,
           censor = FALSE,
           risk.table = "nrisk_cumcensor",
           palette = "jco",
           tables.height = 0.25,
           break.x.by = 2 * Anneejrs,
           fontsize = 2.8,
           tables.y.text = FALSE,
           tables.theme = theme_cleantable(),
           xlab = "Temps de suivi (Années)",
           ylab = "Probabilité de survie nette",
           risk.table.title = "Sujets à risque (censures)",
           legend.title = "Deprivation",
           legend.labs = levels(lung12boost_P$dep),
           legend = "right", 
           font.title = c(15, "bold", "black"),
           conf.int.style = "step"
)

# Estimation de la survie nette
# Selon la variable typeKC
net_surv_ty <- rs.surv(Surv(survtdaycens, deadcens) ~ typeKC,
            ratetable = rt_lifetable , data = lung12boost_P,
            rmap = list(age = Agediag, sex = sex, 
            deprivq = dep, year = diagdate2), method = "pohar-perme")

# Distribution de la survie nette
ggsurvplot(net_surv_ty, 
           data = lung12boost_P,
           xscale = Anneejrs,
           axes.offset = FALSE,
           xlim = c(-0.3 * Anneejrs, 4.2 * Anneejrs),
           conf.int = FALSE,
           size = 0.8,
           censor = FALSE,
           risk.table = "nrisk_cumcensor",
           palette = "jco",
           tables.height = 0.25,
           break.x.by = 2 * Anneejrs,
           fontsize = 2.8,
           tables.y.text = FALSE,
           tables.theme = theme_cleantable(),
           xlab = "Temps de suivi (Années)",
           ylab = "Probabilité de survie nette",
           risk.table.title = "Sujets à risque (censures)",
           legend.title = "Type de cancer",
           legend.labs = levels(lung12boost_P$typeKC),
           legend = "right", 
           font.title = c(15, "bold", "black"),
           conf.int.style = "step"
)

# Estimation de la survie nette
# Selon la variable Stade
net_surv_st <- rs.surv(Surv(survtdaycens, deadcens) ~ Stade,
            ratetable = rt_lifetable , data = lung12boost_P,
            rmap = list(age = Agediag, sex = sex, 
            deprivq = dep, year = diagdate2), method = "pohar-perme")

# Distribution de la survie nette
ggsurvplot(net_surv_st, 
           data = lung12boost_P,
           xscale = Anneejrs,
           axes.offset = FALSE,
           xlim = c(-0.3 * Anneejrs, 4.2 * Anneejrs),
           conf.int = FALSE,
           size = 0.8,
           censor = FALSE,
           risk.table = "nrisk_cumcensor",
           palette = "jco",
           tables.height = 0.25,
           break.x.by = 2 * Anneejrs,
           fontsize = 2.8,
           tables.y.text = FALSE,
           tables.theme = theme_cleantable(),
           xlab = "Temps de suivi (Années)",
           ylab = "Probabilité de survie nette",
           risk.table.title = "Sujets à risque (censures)",
           legend.title = "Stade du cancer",
           legend.labs = levels(lung12boost_P$Stade),
           legend = "right", 
           font.title = c(15, "bold", "black"),
           conf.int.style = "step"
)

# Estimation de la survie nette
# Selon la variable Charlson
net_surv_ch <- rs.surv(Surv(survtdaycens, deadcens) ~ charlsoncl1,
            ratetable = rt_lifetable , data = lung12boost_P,
            rmap = list(age = Agediag, sex = sex, 
            deprivq = dep, year = diagdate2), method = "pohar-perme")

# Distribution de la survie nette
ggsurvplot(net_surv_ch, 
           data = lung12boost_P,
           xscale = Anneejrs,
           axes.offset = FALSE,
           xlim = c(-0.3 * Anneejrs, 4.2 * Anneejrs),
           conf.int = FALSE,
           size = 0.8,
           censor = FALSE,
           risk.table = "nrisk_cumcensor",
           palette = "jco",
           tables.height = 0.25,
           break.x.by = 2 * Anneejrs,
           fontsize = 2.8,
           tables.y.text = FALSE,
           tables.theme = theme_cleantable(),
           xlab = "Temps de suivi (Années)",
           ylab = "Probabilité de survie nette",
           risk.table.title = "Sujets à risque (censures)",
           legend.title = "Score de charlson",
           legend.labs = levels(lung12boost_P$charlsoncl1),
           legend = "right", 
           font.title = c(15, "bold", "black"),
           conf.int.style = "step"
)

# Estimation de la survie nette
# Selon la variable Annee_dg
net_surv_an <- rs.surv(Surv(survtdaycens, deadcens) ~ Annee_dg,
            ratetable = rt_lifetable , data = lung12boost_P,
            rmap = list(age = Agediag, sex = sex, 
            deprivq = dep, year = diagdate2), method = "pohar-perme")

# Distribution de la survie nette
ggsurvplot(net_surv_an, 
           data = lung12boost_P,
           xscale = Anneejrs,
           axes.offset = FALSE,
           xlim = c(-0.3 * Anneejrs, 4.2 * Anneejrs),
           conf.int = FALSE,
           size = 0.8,
           censor = FALSE,
           risk.table = "nrisk_cumcensor",
           palette = "jco",
           tables.height = 0.25,
           break.x.by = 2 * Anneejrs,
           fontsize = 2.8,
           tables.y.text = FALSE,
           tables.theme = theme_cleantable(),
           xlab = "Temps de suivi (Années)",
           ylab = "Probabilité de survie nette",
           risk.table.title = "Sujets à risque (censures)",
           legend.title = "Année de diagnostic",
           legend.labs = levels(lung12boost_P$Annee_dg),
           legend = "right", 
           font.title = c(15, "bold", "black"),
           conf.int.style = "step"
)

# Estimation de la survie nette
# Selon la variable Région administrative gouvernementale
net_surv_go <- rs.surv(Surv(survtdaycens, deadcens) ~ gor,
            ratetable = rt_lifetable , data = lung12boost_P,
            rmap = list(age = Agediag, sex = sex, 
            deprivq = dep, year = diagdate2), method = "pohar-perme")

# Distribution de la survie nette
ggsurvplot(net_surv_go, 
           data = lung12boost_P,
           xscale = Anneejrs,
           axes.offset = FALSE,
           xlim = c(-0.3 * Anneejrs, 4.2 * Anneejrs),
           conf.int = FALSE,
           size = 0.8,
           censor = FALSE,
           risk.table = "nrisk_cumcensor",
           palette = "jco",
           tables.height = 0.25,
           break.x.by = 2 * Anneejrs,
           fontsize = 2.8,
           tables.y.text = FALSE,
           tables.theme = theme_cleantable(),
           xlab = "Temps de suivi (Années)",
           ylab = "Probabilité de survie nette",
           risk.table.title = "Sujets à risque (censures)",
           legend.title = "Région",
           legend.labs = levels(lung12boost_P$gor),
           legend = "right", 
           font.title = c(15, "bold", "black"),
           conf.int.style = "step"
)

II.6. Estimations de probabilité de survie nette

# Affichage des estimations de probabilité de survie nette 
# Pour la variable Sexe
summary(net_surv_se, times = c(1, 2, 4)*365.241)
## Call: rs.surv(formula = Surv(survtdaycens, deadcens) ~ sex, data = lung12boost_P, 
##     ratetable = rt_lifetable, method = "pohar-perme", rmap = list(age = Agediag, 
##         sex = sex, deprivq = dep, year = diagdate2))
## 
##                 sex=male 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   5691   11164    0.350 0.00378        0.343        0.357
##   730   3321    2367    0.212 0.00331        0.206        0.219
##  1461      3    1185    0.132 0.00386        0.125        0.140
## 
##                 sex=female 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   5520    8351    0.408 0.00427        0.400        0.416
##   730   3510    1999    0.267 0.00390        0.259        0.274
##  1461      8    1138    0.173 0.00453        0.164        0.182
# Affichage des estimations de probabilité de survie nette 
# Pour la variable Stade
summary(net_surv_st, times = c(1, 2, 4)*365.241)
## Call: rs.surv(formula = Surv(survtdaycens, deadcens) ~ Stade, data = lung12boost_P, 
##     ratetable = rt_lifetable, method = "pohar-perme", rmap = list(age = Agediag, 
##         sex = sex, deprivq = dep, year = diagdate2))
## 
##                 Stade=I-II 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   5405    1741    0.782 0.00530        0.772        0.793
##   730   4228    1175    0.631 0.00631        0.619        0.644
##  1461      6     995    0.477 0.00958        0.458        0.496
## 
##                 Stade=III-IV 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   5806   17774   0.2532 0.00289       0.2476       0.2589
##   730   2603    3191   0.1170 0.00217       0.1128       0.1213
##  1461      5    1328   0.0519 0.00214       0.0479       0.0563
# Affichage des estimations de probabilité de survie nette 
# Pour la variable Deprivation
summary(net_surv_de, times = c(1, 2, 4)*365.241)
## Call: rs.surv(formula = Surv(survtdaycens, deadcens) ~ dep, data = lung12boost_P, 
##     ratetable = rt_lifetable, method = "pohar-perme", rmap = list(age = Agediag, 
##         sex = sex, deprivq = dep, year = diagdate2))
## 
##                 dep=1 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   1643    2661    0.391 0.00760        0.376        0.406
##   730   1031     612    0.251 0.00685        0.238        0.265
## 
##                 dep=2 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   1969    3288    0.385 0.00686        0.371        0.398
##   730   1250     717    0.251 0.00622        0.239        0.264
##  1461      2     396    0.165 0.00708        0.152        0.180
## 
##                 dep=3 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   2190    3914    0.370 0.00634        0.358        0.382
##   730   1315     870    0.230 0.00562        0.219        0.241
##  1461      3     474    0.143 0.00562        0.132        0.154
## 
##                 dep=4 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   2605    4631    0.372 0.00584        0.361        0.384
##   730   1585    1016    0.235 0.00522        0.225        0.245
##  1461      3     547    0.152 0.00524        0.142        0.163
## 
##                 dep=5 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   2804    5021    0.371 0.00563        0.360        0.382
##   730   1650    1151    0.226 0.00497        0.217        0.236
##  1461      3     566    0.141 0.00674        0.128        0.155
# Affichage des estimations de probabilité de survie nette 
# Pour la variable Age
summary(net_surv_ag, times = c(1, 2, 4)*365.241)
## Call: rs.surv(formula = Surv(survtdaycens, deadcens) ~ agediagcl, data = lung12boost_P, 
##     ratetable = rt_lifetable, method = "pohar-perme", rmap = list(age = Agediag, 
##         sex = sex, deprivq = dep, year = diagdate2))
## 
##                 agediagcl=40-72 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   6550    8970    0.427 0.00401        0.419        0.435
##   730   4132    2408    0.274 0.00364        0.267        0.281
##  1461      7    1248    0.179 0.00423        0.171        0.188
## 
##                 agediagcl=73-90 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   4661   10545    0.324 0.00396        0.316        0.332
##   730   2699    1958    0.199 0.00349        0.192        0.206
##  1461      4    1075    0.121 0.00402        0.114        0.130
# Affichage des estimations de probabilité de survie nette 
# Pour la variable Charlson
summary(net_surv_ch, times = c(1, 2, 4)*365.241)
## Call: rs.surv(formula = Surv(survtdaycens, deadcens) ~ charlsoncl1, 
##     data = lung12boost_P, ratetable = rt_lifetable, method = "pohar-perme", 
##     rmap = list(age = Agediag, sex = sex, deprivq = dep, year = diagdate2))
## 
##                 charlsoncl1=>2 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   7163   14201    0.350 0.00338        0.343        0.356
##   730   4314    2845    0.220 0.00301        0.214        0.226
##  1461      9    1600    0.136 0.00353        0.129        0.143
## 
##                 charlsoncl1=0 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    284     317    0.474  0.0204        0.435        0.515
##   730    177     106    0.297  0.0187        0.263        0.336
##  1461      1      44    0.223  0.0172        0.191        0.259
## 
##                 charlsoncl1=1 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   1063    1350    0.443 0.01014        0.423        0.463
##   730    660     399    0.278 0.00917        0.260        0.296
##  1461      1     177    0.192 0.00909        0.175        0.211
## 
##                 charlsoncl1=2 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   2701    3647    0.430 0.00627        0.418        0.443
##   730   1680    1016    0.271 0.00567        0.261        0.283
# Affichage des estimations de probabilité de survie nette 
# Pour la variable Annee_dg
summary(net_surv_an, times = c(1, 2, 4)*365.241)
## Call: rs.surv(formula = Surv(survtdaycens, deadcens) ~ Annee_dg, data = lung12boost_P, 
##     ratetable = rt_lifetable, method = "pohar-perme", rmap = list(age = Agediag, 
##         sex = sex, deprivq = dep, year = diagdate2))
## 
##                 Annee_dg=2011 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365     68     109    0.398  0.0377       0.3304        0.479
##   730     40      28    0.243  0.0335       0.1851        0.318
##  1461     11      20    0.124  0.0271       0.0807        0.190
## 
##                 Annee_dg=2012 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365  11111   19328    0.376 0.00285        0.371        0.382
##   730   6776    4321    0.237 0.00255        0.232        0.242
## 
##                 Annee_dg=2013 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365     32      78    0.298  0.0441        0.223        0.398
##   730     15      17    0.142  0.0335        0.089        0.225
# Affichage des estimations de probabilité de survie nette 
# Pour la variable TypeKC
summary(net_surv_ty, times = c(1, 2, 4)*365.241)
## Call: rs.surv(formula = Surv(survtdaycens, deadcens) ~ typeKC, data = lung12boost_P, 
##     ratetable = rt_lifetable, method = "pohar-perme", rmap = list(age = Agediag, 
##         sex = sex, deprivq = dep, year = diagdate2))
## 
##                 typeKC=NSCLC 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365  10240   16962    0.388 0.00304        0.383        0.394
##   730   6446    3782    0.253 0.00276        0.247        0.258
##  1461     10    2146    0.162 0.00329        0.156        0.169
## 
##                 typeKC=SCLC 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    971    2553   0.2813 0.00769       0.2666       0.2967
##   730    385     584   0.1142 0.00549       0.1040       0.1255
##  1461      1     177   0.0613 0.00444       0.0531       0.0706
# Affichage des estimations de probabilité de survie nette 
# Pour la variable Region
summary(net_surv_go, times = c(1, 2, 4)*365.241)
## Call: rs.surv(formula = Surv(survtdaycens, deadcens) ~ gor, data = lung12boost_P, 
##     ratetable = rt_lifetable, method = "pohar-perme", rmap = list(age = Agediag, 
##         sex = sex, deprivq = dep, year = diagdate2))
## 
##                 gor=A 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    824    1588    0.353 0.01000        0.334        0.374
##   730    485     339    0.215 0.00878        0.199        0.233
## 
##                 gor=B 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   2010    3403    0.383 0.00678        0.370        0.396
##   730   1199     809    0.235 0.00603        0.223        0.247
##  1461      1     406    0.151 0.00601        0.140        0.164
## 
##                 gor=D 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   1481    2324    0.402 0.00818        0.387        0.419
##   730    898     582    0.253 0.00740        0.239        0.268
##  1461      5     303    0.173 0.00730        0.159        0.188
## 
##                 gor=E 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365    889    1507    0.382 0.01018        0.362        0.402
##   730    564     325    0.250 0.00921        0.232        0.268
##  1461      1     197    0.151 0.01155        0.130        0.175
## 
##                 gor=F 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   1080    2032    0.358 0.00882        0.341        0.375
##   730    627     453    0.214 0.00766        0.199        0.229
##  1461      2     204    0.144 0.00738        0.131        0.160
## 
##                 gor=G 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   1181    2075    0.373 0.00868        0.357        0.391
##   730    705     476    0.230 0.00768        0.215        0.245
## 
##                 gor=H 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   1197    1950    0.393 0.00894        0.375        0.410
##   730    761     429    0.260 0.00819        0.244        0.276
## 
##                 gor=J 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   1486    2769    0.359 0.00752        0.345        0.374
##   730    945     537    0.235 0.00677        0.222        0.249
##  1461      1     317    0.153 0.00674        0.140        0.166
## 
##                 gor=K 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##   365   1063    1867    0.374 0.00916        0.357        0.393
##   730    647     416    0.235 0.00819        0.220        0.252
##  1461      1     216    0.153 0.00857        0.137        0.171

II.7. Modélisation

II.7.1. Choix du nombre d’intervalles pour la modélisation des taux de mortalité de base pour les modèles : Estève, Touraine, Mba

Six intervalles ont été retenus pour l’analyse des différents modèles : Estève, Touraine et Mba. Nous constatons qu’à partir du modèle d’Estève avec six intervalles, la valeur de l’AIC diminue faiblement, bien que celle-ci continue de décroître légèrement à mesure que le nombre d’intervalles augmente de 2 à 16.

# Cas de huit intervalles
fit_estcr8 <- xhaz(Surv(survtyearcens, deadcens) ~ agediagcl + 
    sex + typeKC + dep + Stade + gor + Annee_dg,
    data=lung12boost_P, ratetable = rt_lifetable,
    interval=c(0, NA, NA, NA, NA, NA, NA , NA,
    max(lung12boost_P$survtyearcens)), 
    rmap = list(age = 'agediag', sex = 'sex', 
    year = 'diagdate2', deprivq = 'dep'), 
    baseline = "constant", pophaz = "classic")

# Prédiction de la mortalité de base en excès
predict_estcr8 <- predict(object = fit_estcr8,
        new.data = lung12boost_P,
        times.pts = c(seq(0, 
        max(lung12boost_P$survtyearcens), 0.1)),
        baseline = TRUE)

# Cas de six intervalles
fit_estcr6 <- xhaz(Surv(survtyearcens, deadcens) ~ agediagcl + sex 
        + typeKC + dep + Stade + gor + Annee_dg, 
        data=lung12boost_P, ratetable = rt_lifetable,
        interval = c(0, NA, NA, NA, NA, NA,
        max(lung12boost_P$survtyearcens)), 
        rmap = list(age = 'agediag', sex = 'sex', 
        year = 'diagdate2', deprivq = 'dep'), 
        baseline = "constant", pophaz = "classic")

# Prédiction de la mortalité de base en excès
predict_estcr6 <- predict(object = fit_estcr6,
        new.data = lung12boost_P,
        times.pts = c(seq(0, 
        max(lung12boost_P$survtyearcens), 0.1)),
        baseline = TRUE)

# Cas de quatre intervalles
# Modèle d'estève pour l'ensemble des individus 
fit_estcr4 <- xhaz(Surv(survtyearcens, deadcens) ~ agediagcl + sex 
        + typeKC + dep + Stade + gor + Annee_dg, 
        data=lung12boost_P, ratetable = rt_lifetable,
        interval = c(0, NA, NA, NA, max(lung12boost_P$survtyearcens)), 
        rmap = list(age = 'agediag', sex = 'sex', 
        year = 'diagdate2', deprivq = 'dep'), 
        baseline = "constant", pophaz = "classic")

# Prédiction de la mortalité de base en excès
predict_estcr4 <- predict(object = fit_estcr4,
        new.data = lung12boost_P,
        times.pts = c(seq(0, 
        max(lung12boost_P$survtyearcens), 0.1)),
        baseline = TRUE)

# Cas de deux intervalles
fit_estcr2 <- xhaz(Surv(survtyearcens, deadcens) ~ agediagcl + sex 
        + typeKC + dep + Stade + gor + Annee_dg, 
        data=lung12boost_P, ratetable = rt_lifetable,
        interval = c(0, NA, max(lung12boost_P$survtyearcens)), 
        rmap = list(age = 'agediag', sex = 'sex', 
        year = 'diagdate2', deprivq = 'dep'), 
        baseline = "constant", pophaz = "classic")

# Cas de trois intervalles
fit_estcr3 <- xhaz(Surv(survtyearcens, deadcens) ~ agediagcl + sex 
        + typeKC + dep + Stade + gor + Annee_dg, 
        data=lung12boost_P, ratetable = rt_lifetable,
        interval = c(0, NA, NA, max(lung12boost_P$survtyearcens)), 
        rmap = list(age = 'agediag', sex = 'sex', 
        year = 'diagdate2', deprivq = 'dep'), 
        baseline = "constant", pophaz = "classic")

# Cas de cinq intervalles 
fit_estcr5 <- xhaz(Surv(survtyearcens, deadcens) ~ agediagcl + sex 
        + typeKC + dep + Stade + gor + Annee_dg, 
        data=lung12boost_P, ratetable = rt_lifetable,
        interval = c(0, NA, NA, NA, NA, 
        max(lung12boost_P$survtyearcens)), 
        rmap = list(age = 'agediag', sex = 'sex', 
        year = 'diagdate2', deprivq = 'dep'), 
        baseline = "constant", pophaz = "classic")

# Cas de sept intervalles
fit_estcr7 <- xhaz(Surv(survtyearcens, deadcens) ~ agediagcl + sex 
        + typeKC + dep + Stade + gor + Annee_dg, 
        data=lung12boost_P, ratetable = rt_lifetable,
        interval = c(0, NA, NA, NA, NA, NA, NA,
        max(lung12boost_P$survtyearcens)), 
        rmap = list(age = 'agediag', sex = 'sex', 
        year = 'diagdate2', deprivq = 'dep'), 
        baseline = "constant", pophaz = "classic")

# Cas de neuf intervalles
fit_estcr9 <- xhaz(Surv(survtyearcens, deadcens) ~ agediagcl + 
    sex + typeKC + dep + Stade + gor + Annee_dg,
    data=lung12boost_P, ratetable = rt_lifetable,
    interval=c(0, NA, NA, NA, NA, NA, NA , NA, NA,
    max(lung12boost_P$survtyearcens)), 
    rmap = list(age = 'agediag', sex = 'sex', 
    year = 'diagdate2', deprivq = 'dep'), 
    baseline = "constant", pophaz = "classic")

# Cas de 10 intervalles
fit_estcr10 <- xhaz(Surv(survtyearcens, deadcens) ~ agediagcl + 
    sex + typeKC + dep + Stade + gor + Annee_dg,
    data=lung12boost_P, ratetable = rt_lifetable,
    interval=c(0, NA, NA, NA, NA, NA, NA, NA, NA, NA,
    max(lung12boost_P$survtyearcens)), 
    rmap = list(age = 'agediag', sex = 'sex', 
    year = 'diagdate2', deprivq = 'dep'), 
    baseline = "constant", pophaz = "classic")


# Cas de 11 intervalles
fit_estcr11 <- xhaz(Surv(survtyearcens, deadcens) ~ agediagcl + 
    sex + typeKC + dep + Stade + gor + Annee_dg,
    data=lung12boost_P, ratetable = rt_lifetable,
    interval=c(0, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
    max(lung12boost_P$survtyearcens)), 
    rmap = list(age = 'agediag', sex = 'sex', 
    year = 'diagdate2', deprivq = 'dep'), 
    baseline = "constant", pophaz = "classic")

# Cas de 12 intervalles
fit_estcr12 <- xhaz(Surv(survtyearcens, deadcens) ~ agediagcl + 
    sex + typeKC + dep + Stade + gor + Annee_dg,
    data=lung12boost_P, ratetable = rt_lifetable,
    interval=c(0, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
    max(lung12boost_P$survtyearcens)), 
    rmap = list(age = 'agediag', sex = 'sex', 
    year = 'diagdate2', deprivq = 'dep'), 
    baseline = "constant", pophaz = "classic")

# Cas de 13 intervalles
fit_estcr13 <- xhaz(Surv(survtyearcens, deadcens) ~ agediagcl + 
    sex + typeKC + dep + Stade + gor + Annee_dg,
    data=lung12boost_P, ratetable = rt_lifetable,
    interval=c(0, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
    max(lung12boost_P$survtyearcens)), 
    rmap = list(age = 'agediag', sex = 'sex', 
    year = 'diagdate2', deprivq = 'dep'), 
    baseline = "constant", pophaz = "classic")

# Cas de 14 intervalles
fit_estcr14 <- xhaz(Surv(survtyearcens, deadcens) ~ agediagcl + 
    sex + typeKC + dep + Stade + gor + Annee_dg,
    data=lung12boost_P, ratetable = rt_lifetable,
    interval=c(0, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
    max(lung12boost_P$survtyearcens)), 
    rmap = list(age = 'agediag', sex = 'sex', 
    year = 'diagdate2', deprivq = 'dep'), 
    baseline = "constant", pophaz = "classic")

# Cas de 15 intervalles
fit_estcr15 <- xhaz(Surv(survtyearcens, deadcens) ~ agediagcl + 
    sex + typeKC + dep + Stade + gor + Annee_dg,
    data=lung12boost_P, ratetable = rt_lifetable,
    interval=c(0, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
    max(lung12boost_P$survtyearcens)), 
    rmap = list(age = 'agediag', sex = 'sex', 
    year = 'diagdate2', deprivq = 'dep'), 
    baseline = "constant", pophaz = "classic")

# Cas de 16 intervalles
fit_estcr16 <- xhaz(Surv(survtyearcens, deadcens) ~ agediagcl + 
    sex + typeKC + dep + Stade + gor + Annee_dg,
    data=lung12boost_P, ratetable = rt_lifetable,
    interval=c(0, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
    max(lung12boost_P$survtyearcens)), 
    rmap = list(age = 'agediag', sex = 'sex', 
    year = 'diagdate2', deprivq = 'dep'), 
    baseline = "constant", pophaz = "classic")
# Définir les intervalles
intervalles <- 2:16

# Calculer les AIC pour chaque modèle
aic <- sapply(intervalles, 
       function(i) AIC(get(paste0("fit_estcr", i))))

# Créer un data.frame avec les intervalles en lignes
tableau_AIC <- data.frame(
  Intervalles = intervalles,
  `AIC` = aic)

# Affichage du tableau
kable(tableau_AIC)
Intervalles AIC
2 49277.87
3 48465.53
4 48411.52
5 48197.50
6 48043.37
7 48005.45
8 47969.67
9 47858.05
10 47850.04
11 47789.20
12 47798.94
13 47765.87
14 47746.25
15 47737.27
16 47735.61

II.8. Modèle classique et les différents modèles de correction

# Estimation de la survie nette par la méthoded'Estève
# Modèle d'Estève I (sans région gor)
fit_estcr61 <- xhaz(Surv(survtyearcens, deadcens) ~ agediagcl + 
    sex + typeKC + dep + Stade + Annee_dg,
    data=lung12boost_P, ratetable = rt_lifetable,
    interval = c(0, NA, NA, NA, NA, NA, 
    max(lung12boost_P$survtyearcens)), 
    rmap = list(age = 'agediag', sex = 'sex', 
    year = 'diagdate2', deprivq = 'dep'), 
    baseline = "constant", pophaz = "classic")

# Prédiction de la mortalité de base en excès
predict_estcr61 <- predict(object = fit_estcr61,
        new.data = lung12boost_P,
        times.pts = c(seq(0, 
        max(lung12boost_P$survtyearcens), 0.1)),
        baseline = TRUE)

# Affichage des résultats
summary(fit_estcr61)
## Call:
## xhaz(formula = Surv(survtyearcens, deadcens) ~ agediagcl + sex + 
##     typeKC + dep + Stade + Annee_dg, data = lung12boost_P, ratetable = rt_lifetable, 
##     rmap = list(age = "agediag", sex = "sex", year = "diagdate2", 
##         deprivq = "dep"), baseline = "constant", pophaz = "classic", 
##     interval = c(0, NA, NA, NA, NA, NA, max(lung12boost_P$survtyearcens)))
## 
## 
##                    coef se(coef) lower 0.95 upper 0.95     z Pr(>|z|)    
## agediagcl73-90  0.34651  0.01307    0.32089    0.37214 26.50  < 2e-16 ***
## sexfemale      -0.12110  0.01306   -0.14670   -0.09550 -9.27  < 2e-16 ***
## typeKCSCLC      0.06893  0.01932    0.03107    0.10679  3.57  0.00036 ***
## dep2            0.02556  0.02340   -0.02031    0.07142  1.09  0.27000    
## dep3            0.08262  0.02255    0.03842    0.12683  3.66  0.00025 ***
## dep4            0.08517  0.02188    0.04229    0.12806  3.89  9.9e-05 ***
## dep5            0.12161  0.02158    0.07931    0.16392  5.63  1.8e-08 ***
## StadeIII-IV     1.59933  0.02093    1.55831    1.64035 76.42  < 2e-16 ***
## Annee_dg2012    0.12937  0.08481   -0.03686    0.29560  1.53  0.13000    
## Annee_dg2013    0.36352  0.13442    0.10007    0.62697  2.70  0.00680 ** 
## [0-0.09[        0.29930  0.02709    0.24619    0.35240 11.05  < 2e-16 ***
## [0.09-0.2[      0.30803  0.02785    0.25344    0.36262 11.06  < 2e-16 ***
## [0.2-0.41[      0.20037  0.01810    0.16489    0.23585 11.07  < 2e-16 ***
## [0.41-0.76[     0.17313  0.01562    0.14253    0.20374 11.09  < 2e-16 ***
## [0.76-1.39[     0.14948  0.01345    0.12311    0.17585 11.11  < 2e-16 ***
## [1.39-4.01[     0.09455  0.00849    0.07792    0.11119 11.14  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Excess hazard hazard ratio(s)
## (proportional effect variable(s) for exess hazard ratio(s))
##                exp(coef) lower 0.95 upper 0.95
## agediagcl73-90    1.4141     1.3784     1.4508
## sexfemale         0.8859     0.8636     0.9089
## typeKCSCLC        1.0714     1.0316     1.1127
## dep2              1.0259     0.9799     1.0740
## dep3              1.0861     1.0392     1.1352
## dep4              1.0889     1.0432     1.1366
## dep5              1.1293     1.0825     1.1781
## StadeIII-IV       4.9497     4.7508     5.1570
## Annee_dg2012      1.1381     0.9638     1.3439
## Annee_dg2013      1.4384     1.1052     1.8719
## 
## number of observations: 30729;  number of events: 26204
## Likelihood ratio test: 9542  on 16 degree(s) of freedom, p=0
# Modèle d'Estève II (avec région gor)
fit_estcr62 <- xhaz(Surv(survtyearcens, deadcens) ~ agediagcl + 
    sex + typeKC + dep + Stade + Annee_dg + gor,
    data=lung12boost_P, ratetable = rt_lifetable,
    interval = c(0, NA, NA, NA, NA, NA, 
    max(lung12boost_P$survtyearcens)), 
    rmap = list(age = 'agediag', sex = 'sex', 
    year = 'diagdate2', deprivq = 'dep'), 
    baseline = "constant", pophaz = "classic")

# Prédiction de la mortalité de base en excès
predict_estcr62 <- predict(object = fit_estcr62,
        new.data = lung12boost_P,
        times.pts = c(seq(0, 
        max(lung12boost_P$survtyearcens), 0.1)),
        baseline = TRUE)

# Affichage des résultats
summary(fit_estcr62)
## Call:
## xhaz(formula = Surv(survtyearcens, deadcens) ~ agediagcl + sex + 
##     typeKC + dep + Stade + Annee_dg + gor, data = lung12boost_P, 
##     ratetable = rt_lifetable, rmap = list(age = "agediag", sex = "sex", 
##         year = "diagdate2", deprivq = "dep"), baseline = "constant", 
##     pophaz = "classic", interval = c(0, NA, NA, NA, NA, NA, max(lung12boost_P$survtyearcens)))
## 
## 
##                    coef se(coef) lower 0.95 upper 0.95     z Pr(>|z|)    
## agediagcl73-90  0.34625  0.01308    0.32061    0.37190 26.47  < 2e-16 ***
## sexfemale      -0.12169  0.01307   -0.14731   -0.09607 -9.31  < 2e-16 ***
## typeKCSCLC      0.06598  0.01934    0.02808    0.10388  3.41  0.00065 ***
## dep2            0.02815  0.02346   -0.01783    0.07413  1.20  0.23000    
## dep3            0.08507  0.02262    0.04072    0.12941  3.76  0.00017 ***
## dep4            0.09528  0.02211    0.05194    0.13862  4.31  1.6e-05 ***
## dep5            0.13355  0.02243    0.08959    0.17751  5.95  2.6e-09 ***
## StadeIII-IV     1.60293  0.02095    1.56188    1.64399 76.52  < 2e-16 ***
## Annee_dg2012    0.12466  0.08485   -0.04165    0.29096  1.47  0.14000    
## Annee_dg2013    0.35852  0.13450    0.09491    0.62213  2.67  0.00770 ** 
## gorB            0.00562  0.02777   -0.04881    0.06005  0.20  0.84000    
## gorD           -0.04267  0.02971   -0.10091    0.01557 -1.44  0.15000    
## gorE           -0.03022  0.03300   -0.09489    0.03446 -0.92  0.36000    
## gorF            0.04778  0.03073   -0.01244    0.10801  1.56  0.12000    
## gorG           -0.02416  0.03091   -0.08474    0.03642 -0.78  0.43000    
## gorH           -0.12026  0.03082   -0.18067   -0.05986 -3.90  9.5e-05 ***
## gorJ            0.02618  0.02958   -0.03180    0.08415  0.88  0.38000    
## gorK            0.00298  0.03165   -0.05905    0.06501  0.09  0.93000    
## [0-0.09[        0.30175  0.02834    0.24622    0.35729 10.65  < 2e-16 ***
## [0.09-0.2[      0.31067  0.02914    0.25356    0.36778 10.66  < 2e-16 ***
## [0.2-0.41[      0.20218  0.01895    0.16505    0.23931 10.67  < 2e-16 ***
## [0.41-0.76[     0.17478  0.01636    0.14272    0.20684 10.69  < 2e-16 ***
## [0.76-1.39[     0.15109  0.01411    0.12342    0.17875 10.70  < 2e-16 ***
## [1.39-4.01[     0.09573  0.00892    0.07824    0.11321 10.73  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Excess hazard hazard ratio(s)
## (proportional effect variable(s) for exess hazard ratio(s))
##                exp(coef) lower 0.95 upper 0.95
## agediagcl73-90    1.4138     1.3780     1.4505
## sexfemale         0.8854     0.8630     0.9084
## typeKCSCLC        1.0682     1.0285     1.1095
## dep2              1.0285     0.9823     1.0769
## dep3              1.0888     1.0416     1.1382
## dep4              1.1000     1.0533     1.1487
## dep5              1.1429     1.0937     1.1942
## StadeIII-IV       4.9676     4.7678     5.1758
## Annee_dg2012      1.1328     0.9592     1.3377
## Annee_dg2013      1.4312     1.0996     1.8629
## gorB              1.0056     0.9524     1.0619
## gorD              0.9582     0.9040     1.0157
## gorE              0.9702     0.9095     1.0351
## gorF              1.0489     0.9876     1.1141
## gorG              0.9761     0.9188     1.0371
## gorH              0.8867     0.8347     0.9419
## gorJ              1.0265     0.9687     1.0878
## gorK              1.0030     0.9427     1.0672
## 
## number of observations: 30729;  number of events: 26204
## Likelihood ratio test: 9590  on 24 degree(s) of freedom, p=0
# Modèles de correction
## Modèle de Touraine 
fit.corrected1r <- xhaz(Surv(survtyearcens, deadcens) ~ agediagcl + 
    sex + typeKC + dep + Stade + Annee_dg + gor,
    data=lung12boost_P, ratetable = rt_lifetable,
    interval = c(0, NA, NA, NA, NA, NA, 
    max(lung12boost_P$survtyearcens)), 
    rmap = list(age = 'agediag', sex = 'sex', 
    year = 'diagdate2', deprivq = 'dep'),
    baseline = "constant", pophaz = "corrected",
    add.rmap = "gor")

# Prédiction de la mortalité de base en excès
predict_corrected1 <- predict(object = fit.corrected1r,
        new.data = lung12boost_P, 
        times.pts = c(seq(0, 
        max(lung12boost_P$survtyearcens), 0.1)),
        baseline = TRUE)

# Affichage des résultats
summary(fit.corrected1r)
## Call:
## xhaz(formula = Surv(survtyearcens, deadcens) ~ agediagcl + sex + 
##     typeKC + dep + Stade + Annee_dg + gor, data = lung12boost_P, 
##     ratetable = rt_lifetable, rmap = list(age = "agediag", sex = "sex", 
##         year = "diagdate2", deprivq = "dep"), baseline = "constant", 
##     pophaz = "corrected", add.rmap = "gor", interval = c(0, NA, 
##         NA, NA, NA, NA, max(lung12boost_P$survtyearcens)))
## 
## 
##                    coef se(coef) lower 0.95 upper 0.95     z Pr(>|z|)    
## agediagcl73-90  0.23815  0.01552    0.20774    0.26856 15.35  < 2e-16 ***
## sexfemale      -0.09860  0.01435   -0.12672   -0.07048 -6.87  6.3e-12 ***
## typeKCSCLC      0.06354  0.02052    0.02333    0.10376  3.10   0.0020 ** 
## dep2            0.02707  0.02540   -0.02271    0.07686  1.07   0.2900    
## dep3            0.07187  0.02455    0.02375    0.12000  2.93   0.0034 ** 
## dep4            0.07340  0.02408    0.02620    0.12060  3.05   0.0023 ** 
## dep5            0.10361  0.02454    0.05551    0.15172  4.22  2.4e-05 ***
## StadeIII-IV     2.02045  0.04149    1.93914    2.10176 48.70  < 2e-16 ***
## Annee_dg2012    0.15284  0.09467   -0.03271    0.33840  1.61   0.1100    
## Annee_dg2013    0.39600  0.14613    0.10959    0.68242  2.71   0.0067 ** 
## gorB           -0.02033  0.03487   -0.08866    0.04801 -0.58   0.5600    
## gorD           -0.01626  0.03651   -0.08781    0.05529 -0.45   0.6600    
## gorE           -0.02626  0.04089   -0.10640    0.05388 -0.64   0.5200    
## gorF            0.05350  0.03827   -0.02151    0.12852  1.40   0.1600    
## gorG            0.00431  0.03793   -0.07003    0.07865  0.11   0.9100    
## gorH           -0.09676  0.03799   -0.17123   -0.02230 -2.55   0.0110 *  
## gorJ            0.05153  0.03624   -0.01951    0.12256  1.42   0.1600    
## gorK            0.02580  0.03873   -0.05011    0.10171  0.67   0.5100    
## [0-0.09[        0.19766  0.02190    0.15472    0.24059  9.02  < 2e-16 ***
## [0.09-0.2[      0.20369  0.02255    0.15949    0.24788  9.03  < 2e-16 ***
## [0.2-0.41[      0.12906  0.01434    0.10096    0.15717  9.00  < 2e-16 ***
## [0.41-0.76[     0.11084  0.01231    0.08671    0.13497  9.00  < 2e-16 ***
## [0.76-1.39[     0.09412  0.01048    0.07357    0.11466  8.98  < 2e-16 ***
## [1.39-4.01[     0.05431  0.00613    0.04230    0.06632  8.86  < 2e-16 ***
## log(alpha.A)    1.30233  0.10001    1.10632    1.49835 13.02  < 2e-16 ***
## log(alpha.B)    1.46779  0.06531    1.33979    1.59579 22.47  < 2e-16 ***
## log(alpha.D)    1.06703  0.08771    0.89513    1.23894 12.17  < 2e-16 ***
## log(alpha.E)    1.25783  0.10968    1.04287    1.47280 11.47  < 2e-16 ***
## log(alpha.F)    1.30779  0.09780    1.11611    1.49947 13.37  < 2e-16 ***
## log(alpha.G)    1.09489  0.11397    0.87151    1.31828  9.61  < 2e-16 ***
## log(alpha.H)    1.07865  0.10203    0.87868    1.27862 10.57  < 2e-16 ***
## log(alpha.J)    1.18012  0.09463    0.99464    1.36560 12.47  < 2e-16 ***
## log(alpha.K)    1.13405  0.10508    0.92811    1.34000 10.79  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Excess hazard hazard ratio(s)
## (proportional effect variable(s) for exess hazard ratio(s))
##                exp(coef) lower 0.95 upper 0.95
## agediagcl73-90    1.2689     1.2309     1.3081
## sexfemale         0.9061     0.8810     0.9319
## typeKCSCLC        1.0656     1.0236     1.1093
## dep2              1.0274     0.9775     1.0799
## dep3              1.0745     1.0240     1.1275
## dep4              1.0762     1.0265     1.1282
## dep5              1.1092     1.0571     1.1638
## StadeIII-IV       7.5417     6.9528     8.1806
## Annee_dg2012      1.1651     0.9678     1.4027
## Annee_dg2013      1.4859     1.1158     1.9787
## gorB              0.9799     0.9152     1.0492
## gorD              0.9839     0.9159     1.0568
## gorE              0.9741     0.8991     1.0554
## gorF              1.0550     0.9787     1.1371
## gorG              1.0043     0.9324     1.0818
## gorH              0.9078     0.8426     0.9779
## gorJ              1.0529     0.9807     1.1304
## gorK              1.0261     0.9511     1.1071
## 
## and corrected scale parameters on population hazard 
##         exp(coef) lower 0.95 upper 0.95
## alpha.A     3.678      3.023      4.474
## alpha.B     4.340      3.818      4.932
## alpha.D     2.907      2.448      3.452
## alpha.E     3.518      2.837      4.361
## alpha.F     3.698      3.053      4.479
## alpha.G     2.989      2.391      3.737
## alpha.H     2.941      2.408      3.592
## alpha.J     3.255      2.704      3.918
## alpha.K     3.108      2.530      3.819
## 
## number of observations: 30729;  number of events: 26204
## Likelihood ratio test: 9538  on 33 degree(s) of freedom, p=0
# Calcul des déciles pour l'âge au décès potentiel
points_rupture <- quantile(lung12boost_P$agediag + 
          lung12boost_P$survtyearcens, seq(0.1, 0.9, 0.1))

# Affichage des quantiles 
kable(points_rupture, 
  caption = "Tableau des déciles selon l'âge au décès")
Tableau des déciles selon l’âge au décès
x
10% 59.41348
20% 64.46564
30% 67.94762
40% 70.74870
50% 73.42646
60% 76.12228
70% 78.98729
80% 82.01643
90% 85.29843
# Fonction d'évaluation
eval_model <- function(point.rupt) {
  fit <- tryCatch({
    xhaz(
    Surv(survtyearcens, deadcens) ~ agediagcl + 
    sex + typeKC + dep + Stade + Annee_dg + gor,
    data=lung12boost_P, ratetable = rt_lifetable,
    interval = c(0, NA, NA, NA, NA, NA, 
    max(lung12boost_P$survtyearcens)), 
    rmap = list(age = 'agediag', sex = 'sex', 
    year = 'diagdate2', deprivq = 'dep'),
    baseline = "constant", pophaz = "corrected",
    add.rmap = "gor", 
    add.rmap.cut = list(breakpoint = TRUE,
    cut = c(point.rupt))
    )
  }, error = function(e) return(NULL))

# Extraction des AIC et BIC  
  aic_val <- tryCatch(AIC(fit), error = function(e) NA)
  bic_val <- tryCatch(BIC(fit), error = function(e) NA)

# Extraction des valeurs AIC et BIC sous forme tableur 
  tibble(
    Point_de_rupture = point.rupt,
    AIC = round(aic_val, 2),
    BIC = round(bic_val, 2))
}

# Appliquer la fonction et compiler les résultats
results_models <- lapply(points_rupture, eval_model) %>% 
  bind_rows()

# Ajouter les ΔAIC, ΔBIC, classements
results_models <- results_models %>%
  mutate(
    Delta_AIC = round(AIC - min(AIC, na.rm = TRUE), 2),
    Delta_BIC = round(BIC - min(BIC, na.rm = TRUE), 2),
    Rang_AIC = rank(AIC, ties.method = "min"),
    Rang_BIC = rank(BIC, ties.method = "min")
  )

# Affichage
results_models 
Point_de_ruptureAICBICDelta_AICDelta_BICRang_AICRang_BIC
59.44.95e+044.99e+040       0       11
64.55.05e+045.08e+04943       943       55
67.95e+04       5.04e+04479       479       33
70.75.06e+045.1e+04 1.11e+031.11e+0366
73.45.14e+045.18e+041.89e+031.89e+0388
76.15.21e+045.25e+042.61e+032.62e+0399
79  5.02e+045.05e+04645       646       44
82  5.08e+045.11e+041.23e+031.23e+0377
85.34.98e+045.02e+04302       302       22
# Modèle de Mba avec 59.41 comme point de rupture 
fit.corrected2r <- xhaz(Surv(survtyearcens, deadcens) ~ agediagcl + 
    sex + typeKC + dep + Stade + Annee_dg + gor,
    data=lung12boost_P, ratetable = rt_lifetable,
    interval = c(0, 0.09, 0.2, 0.41, 0.76, 1.39, 
    max(lung12boost_P$survtyearcens)), 
    rmap = list(age = 'agediag', sex = 'sex', 
    year = 'diagdate2', deprivq = 'dep'),
    baseline = "constant", pophaz = "corrected",
    add.rmap = "gor", 
    add.rmap.cut = list(breakpoint = TRUE, cut = c(59.41)))

# Prédiction de la mortalité de base en excès
predict_corrected2 <- predict(object = fit.corrected2r,
       new.data = lung12boost_P, 
       times.pts = c(seq(0, 
       max(lung12boost_P$survtyearcens), 0.1)),
       baseline = TRUE)

# Affichage des résultats
summary(fit.corrected2r)
## Call:
## xhaz(formula = Surv(survtyearcens, deadcens) ~ agediagcl + sex + 
##     typeKC + dep + Stade + Annee_dg + gor, data = lung12boost_P, 
##     ratetable = rt_lifetable, rmap = list(age = "agediag", sex = "sex", 
##         year = "diagdate2", deprivq = "dep"), baseline = "constant", 
##     pophaz = "corrected", add.rmap = "gor", add.rmap.cut = list(breakpoint = TRUE, 
##         cut = c(59.41)), interval = c(0, 0.09, 0.2, 0.41, 0.76, 
##         1.39, max(lung12boost_P$survtyearcens)))
## 
## 
##                              coef se(coef) lower 0.95 upper 0.95     z Pr(>|z|)
## agediagcl73-90            0.28633  0.01589    0.25519    0.31748 18.02  < 2e-16
## sexfemale                -0.09587  0.01475   -0.12477   -0.06696 -6.50  8.0e-11
## typeKCSCLC                0.06645  0.02096    0.02537    0.10753  3.17   0.0015
## dep2                      0.02975  0.02589   -0.02099    0.08048  1.15   0.2500
## dep3                      0.06497  0.02507    0.01584    0.11409  2.59   0.0095
## dep4                      0.05912  0.02468    0.01074    0.10749  2.39   0.0170
## dep5                      0.08294  0.02523    0.03349    0.13239  3.29   0.0010
## StadeIII-IV               2.20429  0.05199    2.10238    2.30620 42.39  < 2e-16
## Annee_dg2012              0.15897  0.09803   -0.03316    0.35110  1.62   0.1000
## Annee_dg2013              0.38966  0.15029    0.09509    0.68423  2.59   0.0095
## gorB                     -0.02500  0.03616   -0.09588    0.04588 -0.69   0.4900
## gorD                     -0.01132  0.03783   -0.08548    0.06283 -0.30   0.7600
## gorE                     -0.02316  0.04228   -0.10603    0.05970 -0.55   0.5800
## gorF                      0.04789  0.03962   -0.02976    0.12554  1.21   0.2300
## gorG                     -0.00082  0.03938   -0.07801    0.07637 -0.02   0.9800
## gorH                     -0.09169  0.03939   -0.16890   -0.01449 -2.33   0.0200
## gorJ                      0.04403  0.03760   -0.02967    0.11772  1.17   0.2400
## gorK                      0.01777  0.04048   -0.06157    0.09711  0.44   0.6600
## [0-0.09[                  0.15894  0.01883    0.12203    0.19584  8.44  < 2e-16
## [0.09-0.2[                0.16177  0.01916    0.12422    0.19932  8.44  < 2e-16
## [0.2-0.41[                0.09761  0.01163    0.07481    0.12041  8.39  < 2e-16
## [0.41-0.76[               0.08710  0.01037    0.06678    0.10742  8.40  < 2e-16
## [0.76-1.39[               0.07285  0.00871    0.05577    0.08992  8.36  < 2e-16
## [1.39-4.01[               0.04132  0.00502    0.03147    0.05117  8.22  < 2e-16
## log(alpha.A_(40.1,59.4])  3.59014  0.25268    3.09489    4.08538 14.21  < 2e-16
## log(alpha.A_(59.4,94])    1.41295  0.08963    1.23727    1.58862 15.76  < 2e-16
## log(alpha.B_(40.1,59.4])  3.33438  0.21399    2.91496    3.75380 15.58  < 2e-16
## log(alpha.B_(59.4,94])    1.58177  0.05884    1.46645    1.69710 26.88  < 2e-16
## log(alpha.D_(40.1,59.4])  3.47576  0.24615    2.99333    3.95820 14.12  < 2e-16
## log(alpha.D_(59.4,94])    1.20365  0.07781    1.05115    1.35615 15.47  < 2e-16
## log(alpha.E_(40.1,59.4])  2.93644  0.53530    1.88727    3.98561  5.49  4.1e-08
## log(alpha.E_(59.4,94])    1.38429  0.09738    1.19343    1.57515 14.22  < 2e-16
## log(alpha.F_(40.1,59.4])  3.35222  0.30425    2.75590    3.94854 11.02  < 2e-16
## log(alpha.F_(59.4,94])    1.44205  0.08611    1.27327    1.61083 16.75  < 2e-16
## log(alpha.G_(40.1,59.4])  3.85647  0.25231    3.36195    4.35098 15.28  < 2e-16
## log(alpha.G_(59.4,94])    1.24404  0.09950    1.04902    1.43905 12.50  < 2e-16
## log(alpha.H_(40.1,59.4])  2.80464  0.34929    2.12004    3.48925  8.03  1.0e-15
## log(alpha.H_(59.4,94])    1.19904  0.09118    1.02034    1.37774 13.15  < 2e-16
## log(alpha.J_(40.1,59.4])  3.49748  0.28553    2.93785    4.05711 12.25  < 2e-16
## log(alpha.J_(59.4,94])    1.32844  0.08299    1.16579    1.49108 16.01  < 2e-16
## log(alpha.K_(40.1,59.4])  4.11478  0.22308    3.67755    4.55201 18.45  < 2e-16
## log(alpha.K_(59.4,94])    1.27831  0.09229    1.09743    1.45919 13.85  < 2e-16
##                             
## agediagcl73-90           ***
## sexfemale                ***
## typeKCSCLC               ** 
## dep2                        
## dep3                     ** 
## dep4                     *  
## dep5                     ***
## StadeIII-IV              ***
## Annee_dg2012             .  
## Annee_dg2013             ** 
## gorB                        
## gorD                        
## gorE                        
## gorF                        
## gorG                        
## gorH                     *  
## gorJ                        
## gorK                        
## [0-0.09[                 ***
## [0.09-0.2[               ***
## [0.2-0.41[               ***
## [0.41-0.76[              ***
## [0.76-1.39[              ***
## [1.39-4.01[              ***
## log(alpha.A_(40.1,59.4]) ***
## log(alpha.A_(59.4,94])   ***
## log(alpha.B_(40.1,59.4]) ***
## log(alpha.B_(59.4,94])   ***
## log(alpha.D_(40.1,59.4]) ***
## log(alpha.D_(59.4,94])   ***
## log(alpha.E_(40.1,59.4]) ***
## log(alpha.E_(59.4,94])   ***
## log(alpha.F_(40.1,59.4]) ***
## log(alpha.F_(59.4,94])   ***
## log(alpha.G_(40.1,59.4]) ***
## log(alpha.G_(59.4,94])   ***
## log(alpha.H_(40.1,59.4]) ***
## log(alpha.H_(59.4,94])   ***
## log(alpha.J_(40.1,59.4]) ***
## log(alpha.J_(59.4,94])   ***
## log(alpha.K_(40.1,59.4]) ***
## log(alpha.K_(59.4,94])   ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## 
## Excess hazard hazard ratio(s)
## (proportional effect variable(s) for exess hazard ratio(s))
##                exp(coef) lower 0.95 upper 0.95
## agediagcl73-90    1.3315     1.2907     1.3737
## sexfemale         0.9086     0.8827     0.9352
## typeKCSCLC        1.0687     1.0257     1.1135
## dep2              1.0302     0.9792     1.0838
## dep3              1.0671     1.0160     1.1209
## dep4              1.0609     1.0108     1.1135
## dep5              1.0865     1.0341     1.1416
## StadeIII-IV       9.0638     8.1856    10.0362
## Annee_dg2012      1.1723     0.9674     1.4206
## Annee_dg2013      1.4765     1.0998     1.9822
## gorB              0.9753     0.9086     1.0469
## gorD              0.9887     0.9181     1.0648
## gorE              0.9771     0.8994     1.0615
## gorF              1.0491     0.9707     1.1338
## gorG              0.9992     0.9250     1.0794
## gorH              0.9124     0.8446     0.9856
## gorJ              1.0450     0.9708     1.1249
## gorK              1.0179     0.9403     1.1020
## 
## and corrected scale parameters on population hazard 
##  (non proportional correction using breakpoint approach)
## 
##                     exp(coef) lower 0.95 upper 0.95
## alpha.A_(40.1,59.4]    36.239     22.085     59.465
## alpha.A_(59.4,94]       4.108      3.446      4.897
## alpha.B_(40.1,59.4]    28.061     18.448     42.683
## alpha.B_(59.4,94]       4.864      4.334      5.458
## alpha.D_(40.1,59.4]    32.323     19.952     52.363
## alpha.D_(59.4,94]       3.332      2.861      3.881
## alpha.E_(40.1,59.4]    18.849      6.601     53.818
## alpha.E_(59.4,94]       3.992      3.298      4.831
## alpha.F_(40.1,59.4]    28.566     15.735     51.860
## alpha.F_(59.4,94]       4.229      3.573      5.007
## alpha.G_(40.1,59.4]    47.298     28.845     77.555
## alpha.G_(59.4,94]       3.470      2.855      4.217
## alpha.H_(40.1,59.4]    16.521      8.331     32.761
## alpha.H_(59.4,94]       3.317      2.774      3.966
## alpha.J_(40.1,59.4]    33.032     18.875     57.807
## alpha.J_(59.4,94]       3.775      3.208      4.442
## alpha.K_(40.1,59.4]    61.239     39.549     94.823
## alpha.K_(59.4,94]       3.591      2.996      4.302
## 
## 
## number of observations: 31368;  number of events: 26525
## Likelihood ratio test: 9783  on 42 degree(s) of freedom, p=0
# Calcul des taux de mortalité
fit.haz <- exphaz(Surv(survtyearcens, deadcens) ~ agediagcl + 
        sex + typeKC + dep + Stade + gor + Annee_dg,
        data=lung12boost_P, ratetable = rt_lifetable,
        only_ehazard = FALSE, rmap = list(age = 'agediag', 
        sex = 'sex', year = 'diagdate2', deprivq = 'dep'))

# Calcul des taux de mortalité cumulé
fit_thern <- survexp(survtyearcens*365.24 ~ 1, 
          rmap = list(age = Agediag, 
          sex = sex, year = diagdate2, 
          deprivq = dep), method = "individual.h",
          data = lung12boost_P, ratetable = rt_lifetable)
 
# Extraction des taux 
lung12boost_P$expected <- fit.haz$ehazard
lung12boost_P$expectedCum <- as.vector(fit_thern)

# Modèle d'estimation de mortalité en excès de Rubio
# Ajout des covariables 
lung12boost_P <- lung12boost_P %>%
    mutate(
    # Crétion de la variable sexfemale
      sexfemale = ifelse(sex=="female", 1, 0),
      
    # Création de la variable agediagcl73-90
      agediagcl73_90 = ifelse(agediagcl=="73-90", 1, 0),
    
    # Création de la variable typeKCSCLC
      typeKCSCLC = ifelse(typeKC=="SCLC", 1, 0),
    
    # Création de la variable dep2
      dep2 = ifelse(dep=="2", 1, 0),
    
    # Création de la variable dep3
      dep3 = ifelse(dep=="3", 1, 0),

    # Création de la variable dep4
      dep4 = ifelse(dep=="4", 1, 0),

    # Création de la variable dep5
      dep5 = ifelse(dep=="5", 1, 0),
  
    # Création de la variable StadeIII-IV
      StadeIII_IV = ifelse(Stade=="III-IV", 1, 0),
    
    # Création de la variable Annee_dg2012
      Annee_dg2012 = ifelse(Annee_dg=="2012", 1, 0),
    
    # Création de la variable Annee_dg2012
      Annee_dg2013 = ifelse(Annee_dg=="2013", 1, 0),

    # Création de la variable gorB
      gorB = ifelse(gor=="B", 1, 0),
    
    # Création de la variable gorD
      gorD = ifelse(gor=="D", 1, 0),

    # Création de la variable gorE
      gorE = ifelse(gor=="E", 1, 0),
    
    # Création de la variable gorF
      gorF = ifelse(gor=="F", 1, 0),
    
    # Création de la variable gorG
      gorG = ifelse(gor=="G", 1, 0),
    
    # Création de la variable gorH
      gorH = ifelse(gor=="H", 1, 0),
    
    # Création de la variable gorJ
      gorJ = ifelse(gor=="J", 1, 0),

    # Création de la variable gorK
      gorK = ifelse(gor=="K", 1, 0)
      )

# Exponentiated Weibull Distribution
# Hazard
hexpw <- function(t, lambda, kappa, alpha){
  pdf0 <-  alpha*dweibull(t, scale=lambda,
           shape=kappa)*pweibull(t, scale=lambda, 
           shape=kappa)^(alpha-1) 
  cdf0 <- pweibull(t, scale=lambda, shape=kappa)^alpha
  cdf0 <- ifelse(cdf0==1, 0.9999999, cdf0)
  return(pdf0/(1-cdf0))
}                                                                                      
# Cumulative hazard
Hexpw <- function(t, lambda, kappa, alpha, log.p=FALSE){
  cdf <- pweibull(t, scale=lambda, shape=kappa)^alpha  
  return(-log(1-cdf))
} 

# Fonction de calcul de l'intervalle de confiance
Conf.Int <- function(FUN, MLE, level=0.95){
  sd.int <- abs(qnorm(0.5*(1-level)))
  HESS <- hessian(FUN, x=MLE)
  Fisher.Info <- solve(HESS, tol=1e-18)
  Sigma <- sqrt(diag(Fisher.Info))
  U<- MLE + sd.int*Sigma
  L<- MLE - sd.int*Sigma
  exp.U <- exp(U)
  exp.L <- exp(L)
  C.I <- cbind(L, U, exp.L, exp.U, MLE, Sigma)
  rownames(C.I)<- paste0("par", seq_along(1:length(MLE)))
  colnames(C.I)<- c("Lower","Upper","exp.Lower","exp.Upper","Transf MLE", "Std. Error")
  return(C.I)
}

# Difference of cumulative hazards
Hp.diff.new <- as.vector(lung12boost_P[,"expectedCum"])

# Variables utilisées dans le modèle 
# Taux attendus
hp.as = as.vector(lung12boost_P[,"expected"])

# Statut de censure : 0 = vivant, 1 = décédé
ind.cens = as.vector(lung12boost_P[,"deadcens"]) 

# Temps de survie
surv.time = as.vector(lung12boost_P[,"survtyearcens"]) 

# Matrice des variables explicatives 
x2 = as.matrix(lung12boost_P[,c("agediagcl73_90",
    "sexfemale","typeKCSCLC", "dep2", "dep3", "dep4",
    "dep5", "StadeIII_IV", "Annee_dg2012", "Annee_dg2013",
    "gorB", "gorD", "gorE", "gorF", "gorG", "gorH", "gorJ",
    "gorK")]) 

# Fonction pour le modèle de Rubio
log.lik.E = function(par){
  if(any(is.na(par))) return(Inf)
  theta = exp(par[1]); 
  kappa = exp(par[2])/exp(par[1]); 
  ae0 = exp(par[3]); 
  be0 = exp(par[4]); 
  ce0 <- exp(par[5]);
  beta1 <- par[6:(6 + dim(x2)[2]-1)];
  exp.x.beta1 <- exp(x2%*%beta1)
  haz0 = kappa*theta*hp.as/(1 + theta*Hp.diff.new) +
         hexpw(surv.time, ae0, be0, ce0)*exp.x.beta1
  log.Surv0 = -Hexpw(surv.time, ae0, be0, ce0)*exp.x.beta1 -
         kappa*log(1+theta*Hp.diff.new)
  val = sum(- ind.cens*log(haz0) - log.Surv0)
  ifelse(is.na(val), return(Inf), return(val))
}

# Optimisation step
initE <- rep(0, 23)
OPTE1 = nlminb(initE, log.lik.E, control=list(iter.max=10000))
MLEE <- c(exp(OPTE1$par[1:23]), OPTE1$par[-c(1:23)])
AICE <- 2*OPTE1$objective + 2*length(MLEE)
IC <- Conf.Int(log.lik.E, OPTE1$par, level=0.95)
# Affichage des résultats du modèle Rubio
## Affichage des valeurs des hazards ratios (HR)
MLEE
##  [1]  5.715716819  5.736989400  0.001080326  0.136927675 20.703511701
##  [6]  1.224416144  0.909687696  1.058571353  0.991381764  1.033050666
## [11]  1.032117108  1.050585085  8.314789631  0.714292782  0.919408686
## [16]  0.942624318  0.891339292  0.891612018  0.989804541  0.912320744
## [21]  0.827223604  0.967538782  0.936290191
# Affichage des résultats du modèle Rubio
## Affichage des intervalles de confiance
IC
##              Lower        Upper    exp.Lower    exp.Upper   Transf MLE
## par1   1.419961602  2.066477832 4.136962e+00  7.896959658  1.743219717
## par2   1.603229426  1.890639730 4.969054e+00  6.623604641  1.746934578
## par3  -7.524386119 -6.136599555 5.397599e-04  0.002162264 -6.830492837
## par4  -2.026366895 -1.950237930 1.318135e-01  0.142240224 -1.988302413
## par5   2.897715685  3.162890981 1.813268e+01 23.638836672  3.030303333
## par6   0.169083399  0.235844827 1.184219e+00  1.265977850  0.202464113
## par7  -0.124139158 -0.065168702 8.832569e-01  0.936909392 -0.094653930
## par8   0.014888003  0.098952436 1.014999e+00  1.104013787  0.056920219
## par9  -0.060592464  0.043281287 9.412067e-01  1.044231583 -0.008655588
## par10 -0.017795908  0.082828380 9.823615e-01  1.086355353  0.032516236
## par11 -0.017745741  0.080970015 9.824108e-01  1.084338382  0.031612137
## par12 -0.001155158  0.099849623 9.988455e-01  1.105004739  0.049347233
## par13  2.006586521  2.229485104 7.437885e+00  9.295078849  2.118035812
## par14 -0.494710368 -0.178214315 6.097475e-01  0.836763073 -0.336462341
## par15 -0.361796215  0.193747119 6.964243e-01  1.213789300 -0.084024548
## par16 -0.121115502  0.002940571 8.859316e-01  1.002944899 -0.059087466
## par17 -0.181739994 -0.048320256 8.338181e-01  0.952828589 -0.115030125
## par18 -0.188914246 -0.040534150 8.278575e-01  0.960276370 -0.114724198
## par19 -0.078919583  0.058424006 9.241142e-01  1.060164417 -0.010247788
## par20 -0.160653953 -0.022873363 8.515867e-01  0.977386250 -0.091763658
## par21 -0.258566396 -0.120794085 7.721578e-01  0.886216426 -0.189680241
## par22 -0.098742808  0.032743267 9.059757e-01  1.033285226 -0.032999771
## par23 -0.136598880  0.004939245 8.723201e-01  1.004951464 -0.065829817
##       Std. Error
## par1  0.16493064
## par2  0.07332030
## par3  0.35403369
## par4  0.01942101
## par5  0.06764800
## par6  0.01703129
## par7  0.01504376
## par8  0.02144540
## par9  0.02649889
## par10 0.02566993
## par11 0.02518305
## par12 0.02576700
## par13 0.05686293
## par14 0.08074027
## par15 0.14172284
## par16 0.03164754
## par17 0.03403627
## par18 0.03785276
## par19 0.03503727
## par20 0.03514876
## par21 0.03514664
## par22 0.03354298
## par23 0.03610733
# Affichage des résultats du modèle Rubio
## Affichage de la valeur de l'AIC
AICE
## [1] 47401
## Modèle de GOUNGOUNGA
fit.corrected4r <- mexhazLT(
  Surv(survtyearcens, deadcens) ~ agediagcl + 
  sex + typeKC + dep + Stade + gor + Annee_dg,
  data=lung12boost_P, expected = "expected",
  expectedCum = "expectedCum", base = "exp.bs",
  knots = quantile(lung12boost_P$survtyearcens[
  lung12boost_P$deadcens==1], probs = c(1:2/3)), 
  degree = 2, pophaz = "rescaled")
## iteration = 0
## Step:
##  [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## Parameter:
##  [1] -1 -1 -1 -1 -1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
## Function Value
## [1] 49655.82
## Gradient:
##  [1] -16363.82785  -5346.19897  -6730.01844  -1799.46423   -334.94834
##  [6]  -7617.99289  -7201.65366  -2403.28159  -2767.49509  -3323.35984
## [11]  -3841.67326  -4132.63794 -15703.40386  -2874.69822  -1954.09901
## [16]  -1263.09873  -1692.64363  -1782.85059  -1609.24606  -2311.06228
## [21]  -1571.10256 -16205.92878    -65.97528  -3711.35366
## 
## iteration = 72
## Parameter:
##  [1] -1.679989310  0.200688481 -0.615371220 -0.918226573 -2.488378751
##  [6]  0.234264927 -0.097593819  0.061728345  0.027114199  0.070429325
## [11]  0.074860675  0.104103103  2.002344820  0.004214990 -0.044848139
## [16] -0.032122565  0.054971973 -0.020624787 -0.124032497  0.038336176
## [21]  0.006547801  0.147736658  0.385149632  1.231514229
## Function Value
## [1] 23757.61
## Gradient:
##  [1] -3.196245e-03 -1.498847e-03  1.455192e-05  2.364686e-04 -3.961986e-04
##  [6]  6.893970e-03  1.214357e-02 -1.821900e-02 -1.034277e-02 -6.068149e-03
## [11]  8.472853e-03  3.903551e-03 -7.267437e-05 -2.164597e-03  1.440640e-02
## [16] -8.934876e-03  5.631591e-03  5.784386e-03 -6.642949e-03 -5.311449e-03
## [21] -1.422086e-02 -2.444722e-03  5.966285e-04 -2.126930e-03
## 
## Relative gradient close to zero.
## Current iterate is probably solution.
## 
## 
## Data
##  Name N.Obs.Tot N.Obs N.Events N.Clust
##  data     30729 30729    26204       1
## 
## Details
##  Iter Eval   Base Nb.Leg Nb.Aghq Optim Method Code    LogLik Total.Time
##    72 1158 exp.bs     20      10   nlm    ---    1 -23757.61      78.66
# Prédiction de la mortalité de base en excès
predict_corrected4 <- predict(fit.corrected4r,
            time.pts = lung12boost_P$survtyearcens[
            lung12boost_P$deadcens==1],
            data.val = lung12boost_P[1, ])

# Affichage des résultats
summary(fit.corrected4r)
## Call:
## mexhazLT(formula = Surv(survtyearcens, deadcens) ~ agediagcl + 
##     sex + typeKC + dep + Stade + gor + Annee_dg, data = lung12boost_P, 
##     expected = "expected", expectedCum = "expectedCum", pophaz = "rescaled", 
##     base = "exp.bs", degree = 2, knots = quantile(lung12boost_P$survtyearcens[lung12boost_P$deadcens == 
##         1], probs = c(1:2/3)))
## 
## Coefficients:
##                  Estimate     StdErr  z.value   p.value    
## Intercept      -1.6799893  0.1121381 -14.9814 < 2.2e-16 ***
## BS2.1           0.2006885  0.0396762   5.0582 4.233e-07 ***
## BS2.2          -0.6153712  0.0311002 -19.7867 < 2.2e-16 ***
## BS2.3          -0.9182266  0.0692237 -13.2646 < 2.2e-16 ***
## BS2.4          -2.4883788  0.1256832 -19.7988 < 2.2e-16 ***
## agediagcl73-90  0.2342649  0.0154475  15.1653 < 2.2e-16 ***
## sexfemale      -0.0975938  0.0143438  -6.8039 1.018e-11 ***
## typeKCSCLC      0.0617283  0.0205134   3.0092  0.002620 ** 
## dep2            0.0271142  0.0254465   1.0655  0.286633    
## dep3            0.0704293  0.0246024   2.8627  0.004200 ** 
## dep4            0.0748607  0.0240988   3.1064  0.001894 ** 
## dep5            0.1041031  0.0245250   4.2448 2.188e-05 ***
## StadeIII-IV     2.0023448  0.0407136  49.1812 < 2.2e-16 ***
## gorB            0.0042150  0.0305742   0.1379  0.890350    
## gorD           -0.0448481  0.0327556  -1.3692  0.170945    
## gorE           -0.0321226  0.0362581  -0.8859  0.375648    
## gorF            0.0549720  0.0337308   1.6297  0.103159    
## gorG           -0.0206248  0.0338661  -0.6090  0.542518    
## gorH           -0.1240325  0.0339331  -3.6552  0.000257 ***
## gorJ            0.0383362  0.0324015   1.1832  0.236745    
## gorK            0.0065478  0.0347278   0.1885  0.850449    
## Annee_dg2012    0.1477367  0.0952483   1.5511  0.120885    
## Annee_dg2013    0.3851496  0.1462479   2.6335  0.008450 ** 
## log(Alpha)      1.2315142  0.0413883  29.7551 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Hazard ratios (for proportional effect variables):
##                   Coef     HR CI.lower CI.upper
## agediagcl73-90  0.2343 1.2640   1.2263   1.3028
## sexfemale      -0.0976 0.9070   0.8819   0.9329
## typeKCSCLC      0.0617 1.0637   1.0218   1.1073
## dep2            0.0271 1.0275   0.9775   1.0800
## dep3            0.0704 1.0730   1.0225   1.1260
## dep4            0.0749 1.0777   1.0280   1.1299
## dep5            0.1041 1.1097   1.0576   1.1644
## StadeIII-IV     2.0023 7.4064   6.8384   8.0216
## gorB            0.0042 1.0042   0.9458   1.0662
## gorD           -0.0448 0.9561   0.8967   1.0195
## gorE           -0.0321 0.9684   0.9020   1.0397
## gorF            0.0550 1.0565   0.9889   1.1287
## gorG           -0.0206 0.9796   0.9167   1.0468
## gorH           -0.1240 0.8834   0.8265   0.9441
## gorJ            0.0383 1.0391   0.9751   1.1072
## gorK            0.0065 1.0066   0.9403   1.0775
## Annee_dg2012    0.1477 1.1592   0.9618   1.3971
## Annee_dg2013    0.3851 1.4698   1.1035   1.9577
## 
## log-likelihood: -23757.614 (for 24 degree(s) of freedom)
## 
## number of observations: 30729, number of events: 26204

II.7. Reprèsentations graphiques des taux de mortalité de base en excès

# Mise en page : 3 lignes, 2 colonnes
par(mfrow = c(2, 3), mar = c(4.5, 4.5, 3, 1), oma = c(1, 1, 2, 1))

## -------- Estève I --------
plot(predict_estcr61,
     what = "hazard",
     xlab = "Temps de suivi en années", 
     ylab = "Mortalité en excès de base",
     col = "red",
     main = "Estève I",
     lwd = 2,
     xlim = c(0, 4),
     ylim = c(0, 0.4),
     axes = FALSE)
axis(1); axis(2, at = seq(0, 0.4, by = 0.1)); box()

## -------- Estève II --------
plot(predict_estcr62,
     what = "hazard",
     xlab = "Temps de suivi en années", 
     ylab = "Mortalité en excès de base",
     col = "blue",
     main = "Estève II",
     lwd = 2,
     xlim = c(0, 4),
     ylim = c(0, 0.4),
     axes = FALSE)
axis(1); axis(2, at = seq(0, 0.4, by = 0.1)); box()

## -------- Touraine --------
plot(predict_corrected1,
     what = "hazard",
     xlab = "Temps de suivi en années", 
     ylab = "Mortalité en excès de base",
     col = "darkred",
     main = "Touraine",
     lwd = 2,
     xlim = c(0, 4),
     ylim = c(0, 0.4),
     axes = FALSE)
axis(1); axis(2, at = seq(0, 0.4, by = 0.1)); box()

## -------- MBA --------
plot(predict_corrected2,
     what = "hazard",
     xlab = "Temps de suivi en années", 
     ylab = "Mortalité en excès de base",
     col = "darkgreen",
     main = "MBA",
     lwd = 2,
     xlim = c(0, 4),
     ylim = c(0, 0.4),
     axes = FALSE)
axis(1); axis(2, at = seq(0, 0.4, by = 0.1)); box()

## -------- Goungounga --------
plot(predict_corrected4$results$time.pts,
     predict_corrected4$results$hazard,
     type = "l", lwd = 2,
     xlab = "Temps de suivi en années",
     ylab = "Mortalité en excès de base",
     ylim = c(0, 0.4),
     xlim = c(0, 4),
     col = "purple",
     main = "Goungounga")
lines(predict_corrected4$results$time.pts,
      predict_corrected4$results$hazard.inf,
      lwd = 1, col = "black", lty = 2)
lines(predict_corrected4$results$time.pts,
      predict_corrected4$results$hazard.sup,
      lwd = 1, col = "black", lty = 2)

## -------- Rubio --------
# Fonction principale et intervalles de confiance
fit <- Vectorize(function(t) 
  hexpw(t, MLEE[3], MLEE[4], MLEE[5]))
ic_l <- Vectorize(function(t) 
  hexpw(t, IC[3, 3], IC[4, 3], IC[5, 3]))
ic_u <- Vectorize(function(t) 
  hexpw(t, IC[3, 4], IC[4, 4], IC[5, 4]))

curve(fit, from = 0, to = 4,
      xlab = "Temps de suivi en années", 
      ylab = "Mortalité en excès de base",
      main = "Rubio", col = "black",
      lwd = 2, ylim = c(0, 0.4), n = 1000)
curve(ic_l, from = 0, to = 4, lwd = 1.5, 
      lty = 2, col = "red", add = TRUE)
curve(ic_u, from = 0, to = 4, lwd = 1.5, 
      lty = 2, col = "red", add = TRUE)