Competing risk analysisのデモ

References


## Competing risk analysisに関して
##
## 例えばがんのstudyで"再発"に興味があるときに"再発はしていないけど死亡"した
## というのをどう扱うか? というのが生存分析で問題になると思います。
##
## おおざっぱには"打ち切り"として扱うこともできますが、感覚的には、再発はない
## とはいえ患者が死んでいるのにKM曲線のグラフが下に下がらないのは不自然ですし、
## "打ち切り例はもし観察が続けられたなら、studyに残っている患者と同じよな確率
## や時間で再発を経験する"という仮定をみたしていません。(死んだ後に再発はしません)
##
## こういったevent of interest(ここでは再発)を起こしていないけど、それに匹敵する
## 用な重要なイベントが他にもある(ここでは再発無し死亡)場合には、打ち切り(censored)
## 以外に2種類以上のイベント(end-point)が定義されます。
##
## このように他のイベントが起こるとその後の主のイベントの確率がかわる場合competing risk
## eventと呼びます。このように複数のeventを扱えるように拡張した生存分析がcompeting
## risk analysisです。(間違っている点があったらコメントでご指摘いただければ幸いです)
##
## 和文の説明: http://www.occn.zaq.ne.jp/cuhxr802/R-stat-intro_13.pdf
## なぜ打ち切りにしては行けないのかの詳しい解説があります。


## 必要なパッケージを読み込み
library(cmprsk)


### 解析例: Bone Marrow Transplantation誌より、ALL vs AMLの生存データ。
## 参考: http://www.nature.com/bmt/journal/v40/n4/full/1705727a.html
## 参考: http://www.stat.unipg.it/luca/R/
## 通常の生存分析のKaplan-Meier曲線とLogrank検定にあたる解析を行います。

## データをインターネットから読み込み
bmt <- read.csv("http://www.stat.unipg.it/luca/R/bmt.csv", sep = ";")

## データ確認
head(bmt)
  dis ftime status
1   0    13      2
2   0     1      1
3   0    72      0
4   0     7      2
5   0     8      2
6   1    67      0
## dis    疾患 0 ALL; 1 AML
## ftime  フォロー期間 follow-up timeのこと
## status 状態 0 censored(生存); 1 Transplant-related mortality(治療関連死); 2 relapse(再発)
##        # 再発するかに主に興味がありますが、治療関連死が起こるとその後の再発の確率が0になります。
##        # このため治療関連死はcompeting risk eventと考えられます。
##
##   dis ftime status   # 最初6行のデータ
## 1   0    13      2   ALL 13週で再発
## 2   0     1      1   ALL  1週で治療関連死
## 3   0    72      0   ALL 72週で打ち切り
## 4   0     7      2
## 5   0     8      2
## 6   1    67      0


## CumIncidence() 関数をインターネットから読み込み
## 論文の筆者が書いた解析とグラフをいっぺんにやってくれる便利関数です。
## Bone Marrow Transplant. 2007 Aug;40(4):381-7.で引用して使用するとよいと思います。
source("http://www.stat.unipg.it/luca/R/CumIncidence.R")


## 疾患が0,1では何か分からないので、病名記載してALLとAMLにカテゴリー名を変えます。
## factor()関数にlabelsオプションを使って行います。
bmt$dis <- factor(bmt$dis, levels = c(0,1),
                  labels = c("ALL", "AML"))

## 同じくイベント発生状況を、打ち切り、治療関連死、再発に書き換えます。
bmt$status <- factor(bmt$status, levels = c(0,1,2),
                     labels = c("Censored","Transplant-related mortality","Relapse"))

## 発生したイベントの内訳の表示
addmargins(xtabs(data = bmt, ~ dis + status))
     status
dis   Censored Transplant-related mortality Relapse Sum
  ALL        2                            3      12  17
  AML        9                            6       3  18
  Sum       11                            9      15  35
##      status
## dis   Censored Transplant related mortality Relapse Sum
##   ALL        2                            3      12  17
##   AML        9                            6       3  18 # AMLの方が打ち切り(生存)が多そうです
##   Sum       11                            9      15  35


## グループごとの平均follow-up期間を計算します。
## with()はbmtデータフレームの中で作業することを意味し、attach()と似ています。
## tapply(計算される変数, グループ分け変数(複数ok), 計算のための関数)でグループごとの集約をします。
## ここでは疾患とアウトカムで分類して、フォロー期間の平均をグループごとに出します。
with(bmt, tapply(ftime, list(dis, status), mean))
    Censored Transplant-related mortality Relapse
ALL    53.50                        4.333   9.333
AML    32.56                        4.167   6.333
##     Censored Transplant-related mortality  Relapse
## ALL 53.50000                     4.333333 9.333333
## AML 32.55556                     4.166667 6.333333


## 先ほどの便利関数を使います。これは内部的にはcmprskパッケージを利用しています。
## cumulative incidence curvesを求める
with(bmt,
     CumIncidence(ftime = ftime,            # ftimeにフォロー期間の変数を指定
                  fstatus = status,         # fstatusにイベントの種類の変数を指定
                  group = dis,              # groupにグループ分け変数を指定。一群の時は"all"とか適当に。
                  cencode = "Censored",     # cencodeに打ち切りの場合の値を指定。
                  xlab = "Months")          # X軸のラベル。時間の単位を入れます。
     )

+-------------------------------------------------------------------+
| Cumulative incidence function estimates from competing risks data |
+-------------------------------------------------------------------+
Test equality across groups:
                             Statistic  p-value df
Transplant-related mortality     1.302 0.253915  1
Relapse                          7.082 0.007785  1

Estimates at time points:
                                       0     10     20     30     40     50     60     70
ALL Transplant-related mortality 0.05882 0.1176 0.1765 0.1765 0.1765 0.1765 0.1765 0.1765
AML Transplant-related mortality 0.00000 0.3682 0.3682 0.3682 0.3682 0.3682 0.3682 0.3682
ALL Relapse                      0.05882 0.4706 0.5882 0.7059 0.7059 0.7059 0.7059 0.7059
AML Relapse                      0.00000 0.2057 0.2057 0.2057 0.2057 0.2057 0.2057 0.2057

Standard errors:
                                       0      10      20      30      40      50      60      70
ALL Transplant-related mortality 0.05882 0.08056 0.09667 0.09667 0.09667 0.09667 0.09667 0.09667
AML Transplant-related mortality 0.00000 0.12653 0.12653 0.12653 0.12653 0.12653 0.12653 0.12653
ALL Relapse                      0.05882 0.12631 0.12667 0.12077 0.12077 0.12077 0.12077 0.12077
AML Relapse                      0.00000 0.11558 0.11558 0.11558 0.11558 0.11558 0.11558 0.11558

plot of chunk unnamed-chunk-2

## グラフ
## 複数イベントがあるので無イベント(生存)をプロットせずにイベントを下からプロットします。
## イベントの種類ごとに色がつきます。
## 黒の治療関連死は点線のAMLの方が多いようです。
## 赤の再発は実線のALLの方がだいぶ多いようです。
##
## 解析結果
##      # Gray's testでcumulative incidence functionに違いがあるかを検討する。Logrank testに対応します。
## +-------------------------------------------------------------------+
## | Cumulative incidence function estimates from competing risks data |
## +-------------------------------------------------------------------+
## Test equality across groups:
##                              Statistic  p-value df
## Transplant-related mortality     1.302 0.253915  1 # 黒の治療関連死はALL vs AMLで有意差がない
## Relapse                          7.082 0.007785  1 # 赤の再発はALL vs AMLで有意差がある
##
## Estimates at time points:
##                                        0     10     20     30     40     50     60     70 # 月
## ALL Transplant-related mortality 0.05882 0.1176 0.1765 0.1765 0.1765 0.1765 0.1765 0.1765
## AML Transplant-related mortality 0.00000 0.3682 0.3682 0.3682 0.3682 0.3682 0.3682 0.3682
## ALL Relapse                      0.05882 0.4706 0.5882 0.7059 0.7059 0.7059 0.7059 0.7059 # ALL累積再発の推定
## AML Relapse                      0.00000 0.2057 0.2057 0.2057 0.2057 0.2057 0.2057 0.2057
##
## Standard errors:
##                                        0      10      20      30      40      50      60      70
## ALL Transplant-related mortality 0.05882 0.08056 0.09667 0.09667 0.09667 0.09667 0.09667 0.09667
## AML Transplant-related mortality 0.00000 0.12653 0.12653 0.12653 0.12653 0.12653 0.12653 0.12653
## ALL Relapse                      0.05882 0.12631 0.12667 0.12077 0.12077 0.12077 0.12077 0.12077 # そのSE
## AML Relapse                      0.00000 0.11558 0.11558 0.11558 0.11558 0.11558 0.11558 0.11558


## Cumulative incidence estimatesを任意の時間で表示することもできる
with(bmt,
     CumIncidence(ftime = ftime, fstatus = status, group = dis, cencode = "Censored", xlab = "Months",
                  t = c(0, 3, 6, 9, 12, 24, 36, 48))
     )

+-------------------------------------------------------------------+
| Cumulative incidence function estimates from competing risks data |
+-------------------------------------------------------------------+
Test equality across groups:
                             Statistic  p-value df
Transplant-related mortality     1.302 0.253915  1
Relapse                          7.082 0.007785  1

Estimates at time points:
                                       0       3       6      9     12     24     36     48
ALL Transplant-related mortality 0.05882 0.11765 0.11765 0.1176 0.1765 0.1765 0.1765 0.1765
AML Transplant-related mortality 0.00000 0.17708 0.29514 0.3682 0.3682 0.3682 0.3682 0.3682
ALL Relapse                      0.05882 0.17647 0.29412 0.4706 0.4706 0.6471 0.7059 0.7059
AML Relapse                      0.00000 0.05556 0.05556 0.1205 0.2057 0.2057 0.2057 0.2057

Standard errors:
                                       0       3       6       9      12      24      36      48
ALL Transplant-related mortality 0.05882 0.08056 0.08056 0.08056 0.09667 0.09667 0.09667 0.09667
AML Transplant-related mortality 0.00000 0.09575 0.11478 0.12653 0.12653 0.12653 0.12653 0.12653
ALL Relapse                      0.05882 0.09534 0.11429 0.12631 0.12631 0.12450 0.12077 0.12077
AML Relapse                      0.00000 0.05556 0.05556 0.08379 0.11558 0.11558 0.11558 0.11558
## グラフの形が変わったのはX軸の最大値がかわったから。
## +-------------------------------------------------------------------+
## | Cumulative incidence function estimates from competing risks data |
## +-------------------------------------------------------------------+
## Test equality across groups:
##                              Statistic  p-value df
## Transplant-related mortality     1.302 0.253915  1   # 検定結果は同じ
## Relapse                          7.082 0.007785  1
##
## Estimates at time points:
##                                        0       3       6      9     12     24     36     48  # 月がかわった
## ALL Transplant-related mortality 0.05882 0.11765 0.11765 0.1176 0.1765 0.1765 0.1765 0.1765
## AML Transplant-related mortality 0.00000 0.17708 0.29514 0.3682 0.3682 0.3682 0.3682 0.3682
## ALL Relapse                      0.05882 0.17647 0.29412 0.4706 0.4706 0.6471 0.7059 0.7059
## AML Relapse                      0.00000 0.05556 0.05556 0.1205 0.2057 0.2057 0.2057 0.2057
##
## Standard errors:
##                                        0       3       6       9      12      24      36      48
## ALL Transplant-related mortality 0.05882 0.08056 0.08056 0.08056 0.09667 0.09667 0.09667 0.09667
## AML Transplant-related mortality 0.00000 0.09575 0.11478 0.12653 0.12653 0.12653 0.12653 0.12653
## ALL Relapse                      0.05882 0.09534 0.11429 0.12631 0.12631 0.12450 0.12077 0.12077
## AML Relapse                      0.00000 0.05556 0.05556 0.08379 0.11558 0.11558 0.11558 0.11558

## 95%信頼区間を表示することもできる
layout(matrix(1:4, byrow = TRUE, ncol = 2))     # 4つのグラフを1つの画面におさめる

plot of chunk unnamed-chunk-2

with(bmt,
     CumIncidence(ftime = ftime, fstatus = status, group = dis, cencode = "Censored", xlab = "Months",
                  t = c(0, 3, 6, 9, 12, 24, 36, 48),
                  level = 0.95)                 # levelオプションをつけると信頼区間が入り、個別グラフになる。
     )

+-------------------------------------------------------------------+
| Cumulative incidence function estimates from competing risks data |
+-------------------------------------------------------------------+
Test equality across groups:
                             Statistic  p-value df
Transplant-related mortality     1.302 0.253915  1
Relapse                          7.082 0.007785  1

Estimates at time points:
                                       0       3       6      9     12     24     36     48
ALL Transplant-related mortality 0.05882 0.11765 0.11765 0.1176 0.1765 0.1765 0.1765 0.1765
AML Transplant-related mortality 0.00000 0.17708 0.29514 0.3682 0.3682 0.3682 0.3682 0.3682
ALL Relapse                      0.05882 0.17647 0.29412 0.4706 0.4706 0.6471 0.7059 0.7059
AML Relapse                      0.00000 0.05556 0.05556 0.1205 0.2057 0.2057 0.2057 0.2057

Standard errors:
                                       0       3       6       9      12      24      36      48
ALL Transplant-related mortality 0.05882 0.08056 0.08056 0.08056 0.09667 0.09667 0.09667 0.09667
AML Transplant-related mortality 0.00000 0.09575 0.11478 0.12653 0.12653 0.12653 0.12653 0.12653
ALL Relapse                      0.05882 0.09534 0.11429 0.12631 0.12631 0.12450 0.12077 0.12077
AML Relapse                      0.00000 0.05556 0.05556 0.08379 0.11558 0.11558 0.11558 0.11558

95% pointwise confidence intervals:

, , ALL Transplant-related mortality

             0       3       6       9      12      24      36      48
lower 0.003487 0.01819 0.01819 0.01819 0.03991 0.03991 0.03991 0.03991
upper 0.242066 0.31884 0.31884 0.31884 0.39293 0.39293 0.39293 0.39293

, , AML Transplant-related mortality

        0       3      6      9     12     24     36     48
lower NaN 0.04105 0.1024 0.1408 0.1408 0.1408 0.1408 0.1408
upper NaN 0.39118 0.5203 0.6010 0.6010 0.6010 0.6010 0.6010

, , ALL Relapse

             0       3      6      9     12     24     36     48
lower 0.003487 0.04101 0.1023 0.2199 0.2199 0.3552 0.4017 0.4017
upper 0.242066 0.38982 0.5185 0.6872 0.6872 0.8327 0.8755 0.8755

, , AML Relapse

        0        3        6       9      12      24      36      48
lower NaN 0.003365 0.003365 0.01778 0.04188 0.04188 0.04188 0.04188
upper NaN 0.230594 0.230594 0.32913 0.45472 0.45472 0.45472 0.45472




### 解析例続き: Bone Marrow Transplantation誌よりCompeting risk存在下の多変量解析について
## 参考: http://www.nature.com/bmt/journal/v45/n9/full/bmt2009359a.html
## 参考: http://www.stat.unipg.it/luca/R/
## 上記は通常の生存分析でいうところのKM曲線を書いてLogrank検定をしたところにあたります。
## 今度は通常の生存分析でいうところのCox比例ハザードモデルをおこないます。

## 論文筆者のサイトより便利関数を読み込み
source("http://www.stat.unipg.it/luca/R/crr-addson.R")

## 論文筆者のサイトよりデータを読み込み
bmtcrr <- read.csv("http://www.stat.unipg.it/luca/R/bmtcrr.csv")

## データのチェック
## 変数の解説: http://www.nature.com/bmt/journal/v45/n9/full/bmt2009359a.html#tbl1
## 急性白血病に対して幹細胞移植を行い、Relapseが見られたかどうかというデータです。
## 治療関連死がCompeting risk eventとして定義されています。
head(bmtcrr)
  Sex   D   Phase Age Status Source  ftime
1   M ALL Relapse  48      2  BM+PB   0.67
2   F AML     CR2  23      1  BM+PB   9.50
3   M ALL     CR3   7      0  BM+PB 131.77
4   F ALL     CR2  26      2  BM+PB  24.03
5   F ALL     CR2  36      2  BM+PB   1.47
6   M ALL Relapse  17      2  BM+PB   2.23
##   Sex   D   Phase Age Status Source  ftime
## 1   M ALL Relapse  48      2  BM+PB   0.67
## 2   F AML     CR2  23      1  BM+PB   9.50
## 3   M ALL     CR3   7      0  BM+PB 131.77
## 4   F ALL     CR2  26      2  BM+PB  24.03
## 5   F ALL     CR2  36      2  BM+PB   1.47
## 6   M ALL Relapse  17      2  BM+PB   2.23

## 変数名をDからDisへ変更。Statusを読んで意味が分かるように変えておきます。
names(bmtcrr)[2] <- "Dis"
bmtcrr <-
    within(bmtcrr, {
        Status <- factor(Status, levels = 0:2, labels = c("Censored", "Relapse", "Transplant-related death"))
    })


## crr()関数が通常の outcome ~ predictor + predictor... のformula形式に
## 対応していないので説明変数をまとめて行列(model matrix)にしておく必要があります。
## 論文筆者の書いたfactor2ind()関数でダミー変数を作ります。
matrix.of.fixed.covariates <-
    with(bmtcrr,
         cbind(Age,
               factor2ind(Sex,    baseline = "M"),      # Reference levelを設定しながら
               factor2ind(Dis,    baseline = "ALL"),    # ダミー変数にしていきます。
               factor2ind(Phase,  baseline = "Relapse"),
               factor2ind(Source, baseline = "BM+PB")
               )
         )

## 説明変数行列データの確認
## Reference levelではないことを示す変数に0, 1が入っています。Sex:F 0は男性、1は女性です。
head(matrix.of.fixed.covariates)
     Age Sex:F Dis:AML Phase:CR1 Phase:CR2 Phase:CR3 Source:PB
[1,]  48     0       0         0         0         0         0
[2,]  23     1       1         0         1         0         0
[3,]   7     0       0         0         0         1         0
[4,]  26     1       0         0         1         0         0
[5,]  36     1       0         0         1         0         0
[6,]  17     0       0         0         0         0         0
##      Age Sex:F Dis:AML Phase:CR1 Phase:CR2 Phase:CR3 Source:PB
## [1,]  48     0       0         0         0         0         0
## [2,]  23     1       1         0         1         0         0
## [3,]   7     0       0         0         0         1         0
## [4,]  26     1       0         0         1         0         0
## [5,]  36     1       0         0         1         0         0
## [6,]  17     0       0         0         0         0         0


## モデル1の作成: 説明変数としてAge, Sex, Dis, Phaseを含みます。
mod1 <-
    crr(ftime = bmtcrr$ftime,                   # フォロー期間の変数
        fstatus = bmtcrr$Status,                # フォロー終了時の状態変数
        cov1 = matrix.of.fixed.covariates,      # 先ほどの説明変数行列
        cencode = "Censored",                   # 打ち切りのコード
        failcode = "Relapse")                   # Event of interestのコード。再発リスクを検討します。

## 結果を表示
summary(mod1)
Competing Risks Regression

Call:
crr(ftime = bmtcrr$ftime, fstatus = bmtcrr$Status, cov1 = matrix.of.fixed.covariates, 
    failcode = "Relapse", cencode = "Censored")

             coef exp(coef) se(coef)      z p-value
Age       -0.0185     0.982   0.0119 -1.554  0.1200
Sex:F     -0.0352     0.965   0.2900 -0.122  0.9000
Dis:AML   -0.4723     0.624   0.3054 -1.547  0.1200
Phase:CR1 -1.1018     0.332   0.3764 -2.927  0.0034
Phase:CR2 -1.0200     0.361   0.3558 -2.867  0.0041
Phase:CR3 -0.7314     0.481   0.5766 -1.268  0.2000
Source:PB  0.9211     2.512   0.5530  1.666  0.0960

          exp(coef) exp(-coef)  2.5% 97.5%
Age           0.982      1.019 0.959 1.005
Sex:F         0.965      1.036 0.547 1.704
Dis:AML       0.624      1.604 0.343 1.134
Phase:CR1     0.332      3.009 0.159 0.695
Phase:CR2     0.361      2.773 0.180 0.724
Phase:CR3     0.481      2.078 0.155 1.490
Source:PB     2.512      0.398 0.850 7.426

Num. cases = 177
Pseudo Log-likelihood = -267 
Pseudo likelihood ratio test = 24.4  on 7 df,
## Competing Risks Regression   # CRRのフルネームです。
##
## Call:                        # 使われた式
## crr(ftime = bmtcrr$ftime, fstatus = bmtcrr$Status, cov1 = matrix.of.fixed.covariates)
##
##              coef exp(coef) se(coef)      z p-value  # exp(coef)がsubdistribution hazard ratio
## Age       -0.0185     0.982   0.0119 -1.554  0.1200  # competing riskである治療関連死に配慮したHRです。
## Sex:F     -0.0352     0.965   0.2900 -0.122  0.9000
## Dis:AML   -0.4723     0.624   0.3054 -1.547  0.1200
## Phase:CR1 -1.1018     0.332   0.3764 -2.927  0.0034  # 第一寛解期ではsHR0.33で再発状態での移植より再発少ない
## Phase:CR2 -1.0200     0.361   0.3558 -2.867  0.0041  # 第二寛解期ではsHR0.36で再発状態での移植より再発少ない
## Phase:CR3 -0.7314     0.481   0.5766 -1.268  0.2000
## Source:PB  0.9211     2.512   0.5530  1.666  0.0960  # 末梢血のみの移植ではsHR2.5で再発リスク高め(p=0.09)
##
##                                              # sHRとその逆数、sHRの95%信頼区間
##           exp(coef) exp(-coef)  2.5% 97.5%   # Stataではsubhazard ratioというterminologyが使われている様子。
## Age           0.982      1.019 0.959 1.005   # 年齢が1あがるとsHR0.982 (NS)
## Sex:F         0.965      1.036 0.547 1.704   # 女性だと男性よりsHR0.965 (NS)
## Dis:AML       0.624      1.604 0.343 1.134   # AMLだとALLよりsHR0.624 (NS)
## Phase:CR1     0.332      3.009 0.159 0.695   # 第一寛解期だと再発状態での移植よりsHR0.332 有意
## Phase:CR2     0.361      2.773 0.180 0.724   # 第二寛解期だと再発状態での移植よりsHR0.361 有意
## Phase:CR3     0.481      2.078 0.155 1.490   # 第三寛解期だと再発状態での移植よりsHR0.481 (NS)
## Source:PB     2.512      0.398 0.850 7.426   # 末梢血のみだと骨髄+末梢血移植よりsHR2.512 (NS)
##
## Num. cases = 177
## Pseudo Log-likelihood = -267
## Pseudo likelihood ratio test = 24.4  on 7 df,


## 3値以上のカテゴリー変数ではダミー変数ごとのp値だけでなく、overall p-valueを求める必要有り
## ここではPhaseは4行目(Phase:CR1)から6行目まで(Phase:CR3)まであり、overall p-valueが必要。
## 必要な aod パッケージを用意。
inst("aod") ; library(aod)
## Wald chi-squared test
wald.test(Sigma = mod1$var,     # 分散共分散行列を与える モデル名$var でok
          b = mod1$coef,        # それぞれの変数の係数 モデル名$coef でok
          Terms = 4:6)          # 4行目(Phase:CR1)から6行目まで(Phase:CR3)を含める。
Wald test:
----------

Chi-squared test:
X2 = 14.0, df = 3, P(> X2) = 0.0029
## Wald test:
## ----------
##
## Chi-squared test:
## X2 = 14.0, df = 3, P(> X2) = 0.0029  # Phaseというダミー変数化前のもとの変数が有意であることを示す。



## モデル選択
## 様々なモデルを作成してみます。
mod2 <-
    crr(ftime = bmtcrr$ftime,
        fstatus = bmtcrr$Status,
        cov1 = matrix.of.fixed.covariates[,4:6], # Phaseのダミー変数のみのモデル
        cencode = "Censored",
        failcode = "Relapse")

mod3 <-
    crr(ftime = bmtcrr$ftime,
        fstatus = bmtcrr$Status,
        cov1 = matrix.of.fixed.covariates[,c(4:6,7)], # Phase + Source モデル
        cencode = "Censored",
        failcode = "Relapse")

mod4 <-
    crr(ftime = bmtcrr$ftime,
        fstatus = bmtcrr$Status,
        cov1 = matrix.of.fixed.covariates[,c(4:6,7,1)], # Phase + Source + Age モデル
        cencode = "Censored",
        failcode = "Relapse")

mod5 <-
    crr(ftime = bmtcrr$ftime,
        fstatus = bmtcrr$Status,
        cov1 = matrix.of.fixed.covariates[,c(4:6,7,2)], # Phase + Source + Sex モデル
        cencode = "Censored",
        failcode = "Relapse")

mod6 <-
    crr(ftime = bmtcrr$ftime,
        fstatus = bmtcrr$Status,
        cov1 = matrix.of.fixed.covariates[,c(4:6,7,3)], # Phase + Source + Dis モデル
        cencode = "Censored",
        failcode = "Relapse")

## 論文筆者のmodsel.crr()という便利関数でモデル比較ができます。
modsel.crr(mod1, mod2, mod3, mod4, mod5, mod6)
Model selection table

Model 0: Null model
Model 1: crr(ftime = bmtcrr$ftime, fstatus = bmtcrr$Status, cov1 = matrix.of.fixed.covariates, failcode = "Relapse", cencode = "Censored")
Model 2: crr(ftime = bmtcrr$ftime, fstatus = bmtcrr$Status, cov1 = matrix.of.fixed.covariates[, 4:6], failcode = "Relapse", cencode = "Censored")
Model 3: crr(ftime = bmtcrr$ftime, fstatus = bmtcrr$Status, cov1 = matrix.of.fixed.covariates[, c(4:6, 7)], failcode = "Relapse", cencode = "Censored")
Model 4: crr(ftime = bmtcrr$ftime, fstatus = bmtcrr$Status, cov1 = matrix.of.fixed.covariates[, c(4:6, 7, 1)], failcode = "Relapse", cencode = "Censored")
Model 5: crr(ftime = bmtcrr$ftime, fstatus = bmtcrr$Status, cov1 = matrix.of.fixed.covariates[, c(4:6, 7, 2)], failcode = "Relapse", cencode = "Censored")
Model 6: crr(ftime = bmtcrr$ftime, fstatus = bmtcrr$Status, cov1 = matrix.of.fixed.covariates[, c(4:6, 7, 3)], failcode = "Relapse", cencode = "Censored")
  Num.obs logLik Df.fit BIC BIC diff
0     177   -279      0 557     0.00
1     177   -266      7 569    11.87
2     177   -272      3 559     1.18
3     177   -271      4 562     4.85
4     177   -268      5 561     4.08
5     177   -271      5 567     9.92
6     177   -268      5 562     4.11
## Model selection table
##
## Model 0: Null model
## Model 1: crr(ftime = bmtcrr$ftime, fstatus = bmtcrr$Status, cov1 = matrix.of.fixed.covariates, failcode = "Relapse", cencode = "Censored")
## Model 2: crr(ftime = bmtcrr$ftime, fstatus = bmtcrr$Status, cov1 = matrix.of.fixed.covariates[, 4:6], failcode = "Relapse", cencode = "Censored")
## Model 3: crr(ftime = bmtcrr$ftime, fstatus = bmtcrr$Status, cov1 = matrix.of.fixed.covariates[, c(4:6, 7)], failcode = "Relapse", cencode = "Censored")
## Model 4: crr(ftime = bmtcrr$ftime, fstatus = bmtcrr$Status, cov1 = matrix.of.fixed.covariates[, c(4:6, 7, 1)], failcode = "Relapse", cencode = "Censored")
## Model 5: crr(ftime = bmtcrr$ftime, fstatus = bmtcrr$Status, cov1 = matrix.of.fixed.covariates[, c(4:6, 7, 2)], failcode = "Relapse", cencode = "Censored")
## Model 6: crr(ftime = bmtcrr$ftime, fstatus = bmtcrr$Status, cov1 = matrix.of.fixed.covariates[, c(4:6, 7, 3)], failcode = "Relapse", cencode = "Censored")
##   Num.obs  logLik Df.fit    BIC BIC diff     # Bayesian information criteria (BIC)小さいほどよいモデル
## 0     177 -278.71      0 557.41   0.0000     # 説明変数のない空モデルがBIC最小?
## 1     177 -266.52      7 569.28  11.8679
## 2     177 -271.53      3 558.59   1.1761     # Phase変数のみを含むモデルが二番手
## 3     177 -270.78      4 562.27   4.8545
## 4     177 -267.81      5 561.49   4.0797
## 5     177 -270.73      5 567.33   9.9170
## 6     177 -267.82      5 561.53   4.1125


## 以下は二番手だったmod2で行う。

## Model diagnostic。モデルのできを評価する。
head(mod2$res)                                  # これはPhase:CR1, Phase:CR2, Phase:CR3に対する
        [,1]    [,2]     [,3]
[1,] -0.1394 -0.1525 -0.04922
[2,] -0.1407 -0.1539 -0.04967
[3,] -0.1420 -0.1554 -0.05013
[4,] -0.1433 -0.1568 -0.05059
[5,] -0.1446 -0.1583 -0.05106
[6,] -0.1460 -0.1598 -0.05155
##            [,1]       [,2]        [,3]       # Schoenfeld residual(残差)
## [1,] -0.1394277 -0.1525481 -0.04922084
## [2,] -0.1406975 -0.1539373 -0.04966909
## [3,] -0.1419906 -0.1553521 -0.05012558
## [4,] -0.1433077 -0.1567931 -0.05059054
## [5,] -0.1446494 -0.1582611 -0.05106420
## [6,] -0.1460165 -0.1597569 -0.05154682

## グラフで検討。いずれも近似曲線が水平に近いのでよいと考えられる。 
layout(matrix(1:3, ncol = 3))

plot of chunk unnamed-chunk-2

for (j in seq_len(ncol(mod2$res))) {
    scatter.smooth(x = mod2$uft, y = mod2$res[,j],
                   main = names(mod2$coef)[j],
                   xlab = "Failure time",
                   ylab = "Schoenfeld residuals"
        )
    }

plot of chunk unnamed-chunk-2






test2 <-
    with(bmtcrr, {
        crr(ftime = ftime,
            fstatus = Status,
            cov1 = matrix.of.fixed.covariates,
            cencode = "Censored",
            failcode = "Relapse")
    })


bmtcrr2 <-
    within(bmtcrr, {
        Status <- factor(Status, levels = 0:2, labels = c("Censored", "Relapse", "Transplant-related death"))
        Phase <- relevel(Phase, ref = "Relapse")
        Sex <- relevel(Sex, ref = "M")
        Source <- relevel(Source, ref = "PB")
    })

bmtcrr2
    Sex Dis   Phase Age Status Source  ftime
1     M ALL Relapse  48   <NA>  BM+PB   0.67
2     F AML     CR2  23   <NA>  BM+PB   9.50
3     M ALL     CR3   7   <NA>  BM+PB 131.77
4     F ALL     CR2  26   <NA>  BM+PB  24.03
5     F ALL     CR2  36   <NA>  BM+PB   1.47
6     M ALL Relapse  17   <NA>  BM+PB   2.23
7     M ALL     CR1   7   <NA>  BM+PB 124.83
8     F ALL     CR1  17   <NA>  BM+PB  10.53
9     M ALL     CR1  26   <NA>  BM+PB 123.90
10    F ALL Relapse   8   <NA>  BM+PB   2.00
11    M ALL Relapse  29   <NA>  BM+PB   1.70
12    M AML Relapse  13   <NA>  BM+PB   6.33
13    F AML     CR2  29   <NA>  BM+PB   2.30
14    M ALL Relapse   7   <NA>  BM+PB   3.57
15    M ALL Relapse  40   <NA>  BM+PB   0.40
16    F ALL     CR2  20   <NA>  BM+PB   0.87
17    M ALL     CR3  18   <NA>  BM+PB 121.10
18    M AML     CR1  29   <NA>  BM+PB   5.27
19    M ALL     CR1   6   <NA>  BM+PB   7.50
20    M AML     CR2  33   <NA>  BM+PB   6.30
21    F ALL     CR3  27   <NA>  BM+PB   6.13
22    F ALL     CR2  22   <NA>     PB   2.80
23    M ALL     CR3  15   <NA>     PB   5.37
24    M ALL     CR2   8   <NA>     PB   2.87
25    M AML     CR2  46   <NA>     PB  10.90
26    M ALL Relapse  21   <NA>     PB   3.07
27    F ALL Relapse  21   <NA>     PB   1.70
28    M AML Relapse   7   <NA>     PB 114.70
29    M ALL Relapse   4   <NA>     PB   1.80
30    F AML Relapse  53   <NA>     PB 114.10
31    M AML     CR2  27   <NA>     PB  35.00
32    M AML Relapse  28   <NA>     PB 112.93
33    F ALL     CR2  23   <NA>     PB   3.63
34    M ALL Relapse  17   <NA>     PB 112.23
35    F ALL Relapse  19   <NA>     PB   1.67
36    M AML Relapse  39   <NA>     PB   7.63
37    M ALL     CR2  15   <NA>     PB   7.53
38    M AML     CR2  47   <NA>     PB 110.37
39    F AML Relapse  16   <NA>     PB  20.93
40    F ALL     CR1  17   <NA>     PB   6.03
41    M AML     CR3  26   <NA>     PB   2.83
42    M AML     CR2  44   <NA>     PB   9.63
43    F AML Relapse  30   <NA>     PB   3.17
44    M AML Relapse  32   <NA>     PB   9.97
45    M AML     CR1  20   <NA>     PB   5.70
46    M AML     CR1  38   <NA>     PB 106.13
47    M ALL Relapse  17   <NA>     PB   2.53
48    M AML Relapse  38   <NA>     PB   4.20
49    M AML     CR2  36   <NA>     PB   9.17
50    M AML     CR2  34   <NA>     PB   8.73
51    F AML     CR2  42   <NA>     PB  85.10
52    F AML Relapse  58   <NA>     PB   4.80
53    F AML Relapse  30   <NA>     PB   3.57
54    F ALL     CR1  37   <NA>     PB   8.00
55    M ALL     CR1  25   <NA>     PB  82.60
56    F AML     CR1  47   <NA>     PB  81.67
57    F ALL     CR1  46   <NA>     PB   6.00
58    M AML     CR2  21   <NA>     PB   6.20
59    F AML Relapse  17   <NA>     PB   3.47
60    M ALL Relapse  25   <NA>     PB   4.23
61    F AML     CR2  45   <NA>     PB   1.00
62    M ALL     CR1  17   <NA>     PB   7.47
63    M AML Relapse  44   <NA>     PB   6.10
64    F AML     CR1  20   <NA>     PB  77.00
65    M AML     CR2  18   <NA>     PB  76.03
66    F AML     CR2  46   <NA>     PB  28.53
67    F AML Relapse  49   <NA>     PB   1.97
68    M ALL Relapse  17   <NA>     PB  13.07
69    F ALL Relapse  36   <NA>     PB   1.43
70    M ALL     CR2  14   <NA>     PB  72.93
71    M ALL Relapse  22   <NA>     PB  71.83
72    M AML     CR3  38   <NA>     PB  67.80
73    F ALL     CR2  20   <NA>     PB   8.17
74    F ALL Relapse  26   <NA>     PB   7.57
75    M ALL Relapse  20   <NA>     PB   2.40
76    M AML     CR1  28   <NA>     PB  70.20
77    F AML Relapse  45   <NA>     PB   3.03
78    M AML Relapse  17   <NA>     PB   5.87
79    M AML     CR2  40   <NA>     PB  68.83
80    F ALL Relapse   9   <NA>     PB   1.10
81    M AML Relapse  18   <NA>     PB   9.13
82    F AML Relapse  33   <NA>     PB   6.07
83    M ALL     CR1  33   <NA>     PB   3.30
84    F AML     CR2  17   <NA>     PB  59.17
85    F ALL Relapse  17   <NA>     PB   1.23
86    M AML     CR2  28   <NA>     PB  58.23
87    F AML     CR1  26   <NA>     PB  58.00
88    M ALL     CR2  32   <NA>     PB  56.60
89    M AML Relapse  62   <NA>     PB   3.60
90    M AML Relapse  46   <NA>     PB   1.60
91    F AML     CR1  20   <NA>     PB   5.63
92    F AML     CR2  50   <NA>     PB   4.40
93    M AML Relapse  40   <NA>     PB  13.80
94    M AML     CR1  43   <NA>     PB  49.27
95    F AML     CR3  31   <NA>     PB   9.20
96    F ALL     CR1  46   <NA>     PB   8.63
97    M AML Relapse  51   <NA>     PB   6.70
98    M ALL Relapse  39   <NA>     PB   1.10
99    F ALL Relapse  19   <NA>     PB   5.43
100   F ALL     CR1  33   <NA>     PB  45.30
101   F AML Relapse  23   <NA>     PB  12.57
102   F ALL     CR1  50   <NA>     PB  43.67
103   F AML     CR1  36   <NA>     PB  11.80
104   M AML     CR2  36   <NA>     PB  43.20
105   M ALL     CR1  35   <NA>     PB   4.17
106   F AML Relapse  41   <NA>     PB   6.63
107   F AML     CR1  28   <NA>     PB  42.50
108   M AML     CR1  30   <NA>     PB   1.87
109   F AML     CR1  39   <NA>     PB   1.23
110   M AML     CR1  52   <NA>     PB  41.30
111   M AML     CR1  38   <NA>     PB   3.90
112   M AML     CR1  35   <NA>     PB   4.57
113   F AML     CR3  27   <NA>     PB   4.40
114   M AML     CR1  38   <NA>     PB   3.70
115   F AML Relapse  22   <NA>     PB   3.40
116   M ALL     CR1  26   <NA>     PB  20.83
117   F AML     CR2  47   <NA>     PB   8.60
118   M AML Relapse  17   <NA>     PB   1.30
119   M ALL Relapse  21   <NA>     PB   0.37
120   M ALL     CR1  28   <NA>     PB   0.27
121   M ALL     CR2  24   <NA>     PB  35.50
122   F AML     CR1  35   <NA>     PB  35.50
123   F ALL     CR3  19   <NA>     PB   4.07
124   M ALL     CR1  23   <NA>     PB   7.93
125   M ALL     CR2  33   <NA>     PB  17.23
126   F ALL     CR3  38   <NA>     PB   3.43
127   M AML     CR1  36   <NA>     PB   2.00
128   M AML Relapse  48   <NA>     PB   8.80
129   F AML Relapse  25   <NA>     PB  32.63
130   M ALL     CR2  22   <NA>     PB  12.17
131   F AML Relapse  37   <NA>     PB   7.20
132   M AML     CR2  52   <NA>     PB   9.00
133   F AML     CR1  53   <NA>     PB   6.00
134   M AML     CR2  22   <NA>     PB  29.90
135   M ALL     CR2  20   <NA>     PB  29.20
136   M AML     CR1  51   <NA>     PB  28.70
137   M AML Relapse  33   <NA>     PB  42.17
138   F AML     CR3  33   <NA>     PB   0.13
139   M ALL     CR1  20   <NA>     PB   3.63
140   M AML Relapse  37   <NA>     PB  12.27
141   F AML     CR1  14   <NA>     PB  26.13
142   M AML     CR1  33   <NA>     PB   4.73
143   F AML Relapse  55   <NA>     PB   3.93
144   F ALL     CR2  16   <NA>     PB  24.77
145   F AML Relapse  52   <NA>     PB   2.03
146   M ALL     CR1  43   <NA>     PB   6.60
147   M ALL Relapse  18   <NA>     PB   9.43
148   M AML     CR2  39   <NA>     PB   5.00
149   F ALL Relapse  14   <NA>     PB   1.87
150   F AML Relapse  28   <NA>     PB   5.60
151   M ALL Relapse  13   <NA>     PB  19.90
152   F AML Relapse  34   <NA>     PB   1.20
153   M AML Relapse  58   <NA>     PB   5.23
154   F AML     CR2  14   <NA>     PB   3.50
155   M AML Relapse  16   <NA>     PB   2.80
156   M AML Relapse  52   <NA>     PB   1.47
157   M ALL Relapse  48   <NA>     PB   7.63
158   F ALL     CR2  45   <NA>     PB   2.23
159   F AML Relapse  41   <NA>     PB   0.87
160   M AML Relapse  45   <NA>     PB   7.20
161   F ALL Relapse  25   <NA>     PB   3.23
162   F AML     CR2  34   <NA>     PB   3.57
163   M AML     CR1  37   <NA>     PB   4.57
164   F AML Relapse  41   <NA>     PB   1.77
165   F ALL     CR2  32   <NA>     PB  16.83
166   M AML     CR1  28   <NA>     PB  16.60
167   F ALL     CR1  51   <NA>     PB  16.13
168   M ALL     CR2  11   <NA>     PB   2.70
169   M ALL Relapse  20   <NA>     PB  10.30
170   F AML     CR1  54   <NA>     PB  15.17
171   F AML     CR2  17   <NA>     PB   7.00
172   M AML Relapse  44   <NA>     PB   2.53
173   M AML     CR2  21   <NA>     PB  14.50
174   F AML Relapse  44   <NA>     PB   2.30
175   M AML     CR1  23   <NA>     PB  10.20
176   F AML     CR3  27   <NA>     PB   9.97
177   M AML Relapse  48   <NA>     PB   7.63

within(bmtcrr, {
        Status <- factor(Status, levels = 0:2, labels = c("Censored", "Relapse", "Transplant-related death"))
        Phase <- relevel(Phase, ref = "Relapse")
        Sex <- relevel(Sex, ref = "M")
        Source <- relevel(Source, ref = "PB")
    })
    Sex Dis   Phase Age Status Source  ftime
1     M ALL Relapse  48   <NA>  BM+PB   0.67
2     F AML     CR2  23   <NA>  BM+PB   9.50
3     M ALL     CR3   7   <NA>  BM+PB 131.77
4     F ALL     CR2  26   <NA>  BM+PB  24.03
5     F ALL     CR2  36   <NA>  BM+PB   1.47
6     M ALL Relapse  17   <NA>  BM+PB   2.23
7     M ALL     CR1   7   <NA>  BM+PB 124.83
8     F ALL     CR1  17   <NA>  BM+PB  10.53
9     M ALL     CR1  26   <NA>  BM+PB 123.90
10    F ALL Relapse   8   <NA>  BM+PB   2.00
11    M ALL Relapse  29   <NA>  BM+PB   1.70
12    M AML Relapse  13   <NA>  BM+PB   6.33
13    F AML     CR2  29   <NA>  BM+PB   2.30
14    M ALL Relapse   7   <NA>  BM+PB   3.57
15    M ALL Relapse  40   <NA>  BM+PB   0.40
16    F ALL     CR2  20   <NA>  BM+PB   0.87
17    M ALL     CR3  18   <NA>  BM+PB 121.10
18    M AML     CR1  29   <NA>  BM+PB   5.27
19    M ALL     CR1   6   <NA>  BM+PB   7.50
20    M AML     CR2  33   <NA>  BM+PB   6.30
21    F ALL     CR3  27   <NA>  BM+PB   6.13
22    F ALL     CR2  22   <NA>     PB   2.80
23    M ALL     CR3  15   <NA>     PB   5.37
24    M ALL     CR2   8   <NA>     PB   2.87
25    M AML     CR2  46   <NA>     PB  10.90
26    M ALL Relapse  21   <NA>     PB   3.07
27    F ALL Relapse  21   <NA>     PB   1.70
28    M AML Relapse   7   <NA>     PB 114.70
29    M ALL Relapse   4   <NA>     PB   1.80
30    F AML Relapse  53   <NA>     PB 114.10
31    M AML     CR2  27   <NA>     PB  35.00
32    M AML Relapse  28   <NA>     PB 112.93
33    F ALL     CR2  23   <NA>     PB   3.63
34    M ALL Relapse  17   <NA>     PB 112.23
35    F ALL Relapse  19   <NA>     PB   1.67
36    M AML Relapse  39   <NA>     PB   7.63
37    M ALL     CR2  15   <NA>     PB   7.53
38    M AML     CR2  47   <NA>     PB 110.37
39    F AML Relapse  16   <NA>     PB  20.93
40    F ALL     CR1  17   <NA>     PB   6.03
41    M AML     CR3  26   <NA>     PB   2.83
42    M AML     CR2  44   <NA>     PB   9.63
43    F AML Relapse  30   <NA>     PB   3.17
44    M AML Relapse  32   <NA>     PB   9.97
45    M AML     CR1  20   <NA>     PB   5.70
46    M AML     CR1  38   <NA>     PB 106.13
47    M ALL Relapse  17   <NA>     PB   2.53
48    M AML Relapse  38   <NA>     PB   4.20
49    M AML     CR2  36   <NA>     PB   9.17
50    M AML     CR2  34   <NA>     PB   8.73
51    F AML     CR2  42   <NA>     PB  85.10
52    F AML Relapse  58   <NA>     PB   4.80
53    F AML Relapse  30   <NA>     PB   3.57
54    F ALL     CR1  37   <NA>     PB   8.00
55    M ALL     CR1  25   <NA>     PB  82.60
56    F AML     CR1  47   <NA>     PB  81.67
57    F ALL     CR1  46   <NA>     PB   6.00
58    M AML     CR2  21   <NA>     PB   6.20
59    F AML Relapse  17   <NA>     PB   3.47
60    M ALL Relapse  25   <NA>     PB   4.23
61    F AML     CR2  45   <NA>     PB   1.00
62    M ALL     CR1  17   <NA>     PB   7.47
63    M AML Relapse  44   <NA>     PB   6.10
64    F AML     CR1  20   <NA>     PB  77.00
65    M AML     CR2  18   <NA>     PB  76.03
66    F AML     CR2  46   <NA>     PB  28.53
67    F AML Relapse  49   <NA>     PB   1.97
68    M ALL Relapse  17   <NA>     PB  13.07
69    F ALL Relapse  36   <NA>     PB   1.43
70    M ALL     CR2  14   <NA>     PB  72.93
71    M ALL Relapse  22   <NA>     PB  71.83
72    M AML     CR3  38   <NA>     PB  67.80
73    F ALL     CR2  20   <NA>     PB   8.17
74    F ALL Relapse  26   <NA>     PB   7.57
75    M ALL Relapse  20   <NA>     PB   2.40
76    M AML     CR1  28   <NA>     PB  70.20
77    F AML Relapse  45   <NA>     PB   3.03
78    M AML Relapse  17   <NA>     PB   5.87
79    M AML     CR2  40   <NA>     PB  68.83
80    F ALL Relapse   9   <NA>     PB   1.10
81    M AML Relapse  18   <NA>     PB   9.13
82    F AML Relapse  33   <NA>     PB   6.07
83    M ALL     CR1  33   <NA>     PB   3.30
84    F AML     CR2  17   <NA>     PB  59.17
85    F ALL Relapse  17   <NA>     PB   1.23
86    M AML     CR2  28   <NA>     PB  58.23
87    F AML     CR1  26   <NA>     PB  58.00
88    M ALL     CR2  32   <NA>     PB  56.60
89    M AML Relapse  62   <NA>     PB   3.60
90    M AML Relapse  46   <NA>     PB   1.60
91    F AML     CR1  20   <NA>     PB   5.63
92    F AML     CR2  50   <NA>     PB   4.40
93    M AML Relapse  40   <NA>     PB  13.80
94    M AML     CR1  43   <NA>     PB  49.27
95    F AML     CR3  31   <NA>     PB   9.20
96    F ALL     CR1  46   <NA>     PB   8.63
97    M AML Relapse  51   <NA>     PB   6.70
98    M ALL Relapse  39   <NA>     PB   1.10
99    F ALL Relapse  19   <NA>     PB   5.43
100   F ALL     CR1  33   <NA>     PB  45.30
101   F AML Relapse  23   <NA>     PB  12.57
102   F ALL     CR1  50   <NA>     PB  43.67
103   F AML     CR1  36   <NA>     PB  11.80
104   M AML     CR2  36   <NA>     PB  43.20
105   M ALL     CR1  35   <NA>     PB   4.17
106   F AML Relapse  41   <NA>     PB   6.63
107   F AML     CR1  28   <NA>     PB  42.50
108   M AML     CR1  30   <NA>     PB   1.87
109   F AML     CR1  39   <NA>     PB   1.23
110   M AML     CR1  52   <NA>     PB  41.30
111   M AML     CR1  38   <NA>     PB   3.90
112   M AML     CR1  35   <NA>     PB   4.57
113   F AML     CR3  27   <NA>     PB   4.40
114   M AML     CR1  38   <NA>     PB   3.70
115   F AML Relapse  22   <NA>     PB   3.40
116   M ALL     CR1  26   <NA>     PB  20.83
117   F AML     CR2  47   <NA>     PB   8.60
118   M AML Relapse  17   <NA>     PB   1.30
119   M ALL Relapse  21   <NA>     PB   0.37
120   M ALL     CR1  28   <NA>     PB   0.27
121   M ALL     CR2  24   <NA>     PB  35.50
122   F AML     CR1  35   <NA>     PB  35.50
123   F ALL     CR3  19   <NA>     PB   4.07
124   M ALL     CR1  23   <NA>     PB   7.93
125   M ALL     CR2  33   <NA>     PB  17.23
126   F ALL     CR3  38   <NA>     PB   3.43
127   M AML     CR1  36   <NA>     PB   2.00
128   M AML Relapse  48   <NA>     PB   8.80
129   F AML Relapse  25   <NA>     PB  32.63
130   M ALL     CR2  22   <NA>     PB  12.17
131   F AML Relapse  37   <NA>     PB   7.20
132   M AML     CR2  52   <NA>     PB   9.00
133   F AML     CR1  53   <NA>     PB   6.00
134   M AML     CR2  22   <NA>     PB  29.90
135   M ALL     CR2  20   <NA>     PB  29.20
136   M AML     CR1  51   <NA>     PB  28.70
137   M AML Relapse  33   <NA>     PB  42.17
138   F AML     CR3  33   <NA>     PB   0.13
139   M ALL     CR1  20   <NA>     PB   3.63
140   M AML Relapse  37   <NA>     PB  12.27
141   F AML     CR1  14   <NA>     PB  26.13
142   M AML     CR1  33   <NA>     PB   4.73
143   F AML Relapse  55   <NA>     PB   3.93
144   F ALL     CR2  16   <NA>     PB  24.77
145   F AML Relapse  52   <NA>     PB   2.03
146   M ALL     CR1  43   <NA>     PB   6.60
147   M ALL Relapse  18   <NA>     PB   9.43
148   M AML     CR2  39   <NA>     PB   5.00
149   F ALL Relapse  14   <NA>     PB   1.87
150   F AML Relapse  28   <NA>     PB   5.60
151   M ALL Relapse  13   <NA>     PB  19.90
152   F AML Relapse  34   <NA>     PB   1.20
153   M AML Relapse  58   <NA>     PB   5.23
154   F AML     CR2  14   <NA>     PB   3.50
155   M AML Relapse  16   <NA>     PB   2.80
156   M AML Relapse  52   <NA>     PB   1.47
157   M ALL Relapse  48   <NA>     PB   7.63
158   F ALL     CR2  45   <NA>     PB   2.23
159   F AML Relapse  41   <NA>     PB   0.87
160   M AML Relapse  45   <NA>     PB   7.20
161   F ALL Relapse  25   <NA>     PB   3.23
162   F AML     CR2  34   <NA>     PB   3.57
163   M AML     CR1  37   <NA>     PB   4.57
164   F AML Relapse  41   <NA>     PB   1.77
165   F ALL     CR2  32   <NA>     PB  16.83
166   M AML     CR1  28   <NA>     PB  16.60
167   F ALL     CR1  51   <NA>     PB  16.13
168   M ALL     CR2  11   <NA>     PB   2.70
169   M ALL Relapse  20   <NA>     PB  10.30
170   F AML     CR1  54   <NA>     PB  15.17
171   F AML     CR2  17   <NA>     PB   7.00
172   M AML Relapse  44   <NA>     PB   2.53
173   M AML     CR2  21   <NA>     PB  14.50
174   F AML Relapse  44   <NA>     PB   2.30
175   M AML     CR1  23   <NA>     PB  10.20
176   F AML     CR3  27   <NA>     PB   9.97
177   M AML Relapse  48   <NA>     PB   7.63

with(bmtcrr2, model.matrix( ~ Sex + Dis + Phase))[,-1]
    SexF DisAML PhaseCR1 PhaseCR2 PhaseCR3
1      0      0        0        0        0
2      1      1        0        1        0
3      0      0        0        0        1
4      1      0        0        1        0
5      1      0        0        1        0
6      0      0        0        0        0
7      0      0        1        0        0
8      1      0        1        0        0
9      0      0        1        0        0
10     1      0        0        0        0
11     0      0        0        0        0
12     0      1        0        0        0
13     1      1        0        1        0
14     0      0        0        0        0
15     0      0        0        0        0
16     1      0        0        1        0
17     0      0        0        0        1
18     0      1        1        0        0
19     0      0        1        0        0
20     0      1        0        1        0
21     1      0        0        0        1
22     1      0        0        1        0
23     0      0        0        0        1
24     0      0        0        1        0
25     0      1        0        1        0
26     0      0        0        0        0
27     1      0        0        0        0
28     0      1        0        0        0
29     0      0        0        0        0
30     1      1        0        0        0
31     0      1        0        1        0
32     0      1        0        0        0
33     1      0        0        1        0
34     0      0        0        0        0
35     1      0        0        0        0
36     0      1        0        0        0
37     0      0        0        1        0
38     0      1        0        1        0
39     1      1        0        0        0
40     1      0        1        0        0
41     0      1        0        0        1
42     0      1        0        1        0
43     1      1        0        0        0
44     0      1        0        0        0
45     0      1        1        0        0
46     0      1        1        0        0
47     0      0        0        0        0
48     0      1        0        0        0
49     0      1        0        1        0
50     0      1        0        1        0
51     1      1        0        1        0
52     1      1        0        0        0
53     1      1        0        0        0
54     1      0        1        0        0
55     0      0        1        0        0
56     1      1        1        0        0
57     1      0        1        0        0
58     0      1        0        1        0
59     1      1        0        0        0
60     0      0        0        0        0
61     1      1        0        1        0
62     0      0        1        0        0
63     0      1        0        0        0
64     1      1        1        0        0
65     0      1        0        1        0
66     1      1        0        1        0
67     1      1        0        0        0
68     0      0        0        0        0
69     1      0        0        0        0
70     0      0        0        1        0
71     0      0        0        0        0
72     0      1        0        0        1
73     1      0        0        1        0
74     1      0        0        0        0
75     0      0        0        0        0
76     0      1        1        0        0
77     1      1        0        0        0
78     0      1        0        0        0
79     0      1        0        1        0
80     1      0        0        0        0
81     0      1        0        0        0
82     1      1        0        0        0
83     0      0        1        0        0
84     1      1        0        1        0
85     1      0        0        0        0
86     0      1        0        1        0
87     1      1        1        0        0
88     0      0        0        1        0
89     0      1        0        0        0
90     0      1        0        0        0
91     1      1        1        0        0
92     1      1        0        1        0
93     0      1        0        0        0
94     0      1        1        0        0
95     1      1        0        0        1
96     1      0        1        0        0
97     0      1        0        0        0
98     0      0        0        0        0
99     1      0        0        0        0
100    1      0        1        0        0
101    1      1        0        0        0
102    1      0        1        0        0
103    1      1        1        0        0
104    0      1        0        1        0
105    0      0        1        0        0
106    1      1        0        0        0
107    1      1        1        0        0
108    0      1        1        0        0
109    1      1        1        0        0
110    0      1        1        0        0
111    0      1        1        0        0
112    0      1        1        0        0
113    1      1        0        0        1
114    0      1        1        0        0
115    1      1        0        0        0
116    0      0        1        0        0
117    1      1        0        1        0
118    0      1        0        0        0
119    0      0        0        0        0
120    0      0        1        0        0
121    0      0        0        1        0
122    1      1        1        0        0
123    1      0        0        0        1
124    0      0        1        0        0
125    0      0        0        1        0
126    1      0        0        0        1
127    0      1        1        0        0
128    0      1        0        0        0
129    1      1        0        0        0
130    0      0        0        1        0
131    1      1        0        0        0
132    0      1        0        1        0
133    1      1        1        0        0
134    0      1        0        1        0
135    0      0        0        1        0
136    0      1        1        0        0
137    0      1        0        0        0
138    1      1        0        0        1
139    0      0        1        0        0
140    0      1        0        0        0
141    1      1        1        0        0
142    0      1        1        0        0
143    1      1        0        0        0
144    1      0        0        1        0
145    1      1        0        0        0
146    0      0        1        0        0
147    0      0        0        0        0
148    0      1        0        1        0
149    1      0        0        0        0
150    1      1        0        0        0
151    0      0        0        0        0
152    1      1        0        0        0
153    0      1        0        0        0
154    1      1        0        1        0
155    0      1        0        0        0
156    0      1        0        0        0
157    0      0        0        0        0
158    1      0        0        1        0
159    1      1        0        0        0
160    0      1        0        0        0
161    1      0        0        0        0
162    1      1        0        1        0
163    0      1        1        0        0
164    1      1        0        0        0
165    1      0        0        1        0
166    0      1        1        0        0
167    1      0        1        0        0
168    0      0        0        1        0
169    0      0        0        0        0
170    1      1        1        0        0
171    1      1        0        1        0
172    0      1        0        0        0
173    0      1        0        1        0
174    1      1        0        0        0
175    0      1        1        0        0
176    1      1        0        0        1
177    0      1        0        0        0






## bmtcrr <-
##     within(bmtcrr, {
##         Status <- factor(Status, levels = 0:2, labels = c("Censored", "Relapse", "Transplant-related death"))
##         Phase <- relevel(Phase, ref = "Relapse")
##         Sex <- relevel(Sex, ref = "M")
##         Source <- relevel(Source, ref = "PB")
##     })



### 悪性黒色腫の生存データの解析例
## Introductory Statistics with R (ISwR)より

## 必要なパッケージを用意
inst("ISwR") ; library(ISwR)

## 悪性黒色腫のデータを読み込み
data(melanom)

## データの解説
## 悪性黒色腫で手術後の生存率のデータ
data.frame(var = names(melanom))
     var
1     no
2 status
3   days
4    ulc
5  thick
6    sex
##      var
## 1     no     # 患者ID
## 2 status     # 最終フォローの状態 1悪性黒色腫で死亡, 2生存打ち切り, 3他の原因で死亡
## 3   days     # フォロー日数
## 4    ulc     # 1潰瘍有り, 2潰瘍無し
## 5  thick     # 腫瘍の厚さ 1/100mm単位
## 6    sex     # 1女性, 2男性


## 数字コードは分かりにくいので書き換え
## within()の中で変数をいじると最後にもとのデータフレームが改変されたものが出てきます。
melanom <-
    within(melanom, {
        status <- factor(status, levels = 1:3, labels = c("Dead from melanoma", "Censored", "Dead from other cause"))
        ulc <- factor(ulc, levels = 1:2, labels = c("Ulcerated", "No ulcer"))
        sex <- factor(sex, levels = 1:2, labels = c("Female", "Male"))
    })


## 男女で群分けしてイベントごとの累積発生率を比べます。
with(melanom,
     CumIncidence(ftime = days,         # ftimeにフォロー期間の変数を指定
                  fstatus = status,     # fstatusにイベントの種類の変数を指定
                  group = sex,          # groupにグループ分け変数を指定。一群の時は"all"とか適当に。
                  cencode = "Censored", # cencodeに打ち切りの場合の値を指定。
                  xlab = "Days")        # X軸のラベル。時間の単位を入れます。
     )

+-------------------------------------------------------------------+
| Cumulative incidence function estimates from competing risks data |
+-------------------------------------------------------------------+
Test equality across groups:
                      Statistic p-value df
Dead from melanoma       5.8140  0.0159  1
Dead from other cause    0.8544  0.3553  1

Estimates at time points:
                             0    1000    2000    3000    4000    5000
Female Dead from melanoma    0 0.08730 0.18078 0.23565 0.28424 0.28424
Male Dead from melanoma      0 0.19237 0.31010 0.42454 0.42454      NA
Female Dead from other cause 0 0.03175 0.03984 0.05221 0.08538 0.08538
Male Dead from other cause   0 0.03814 0.06694 0.06694 0.13474      NA

Standard errors:
                             0    1000    2000    3000    4000    5000
Female Dead from melanoma    0 0.02525 0.03529 0.04255 0.05249 0.05249
Male Dead from melanoma      0 0.04497 0.05310 0.06534 0.06534      NA
Female Dead from other cause 0 0.01568 0.01753 0.02128 0.03910 0.03910
Male Dead from other cause   0 0.02174 0.02935 0.02935 0.05432      NA
## +-------------------------------------------------------------------+
## | Cumulative incidence function estimates from competing risks data |
## +-------------------------------------------------------------------+
## Test equality across groups:
##                       Statistic p-value df
## Dead from melanoma       5.8140  0.0159  1   # 悪性黒色腫での死亡は有意差あり
## Dead from other cause    0.8544  0.3553  1   # 他の原因の死亡は有意差なし
##
## Estimates at time points:
##                              0    1000    2000    3000    4000    5000
## Female Dead from melanoma    0 0.08730 0.18078 0.23565 0.28424 0.28424
## Male Dead from melanoma      0 0.19237 0.31010 0.42454 0.42454      NA
## Female Dead from other cause 0 0.03175 0.03984 0.05221 0.08538 0.08538
## Male Dead from other cause   0 0.03814 0.06694 0.06694 0.13474      NA
##
## Standard errors:
##                              0    1000    2000    3000    4000    5000
## Female Dead from melanoma    0 0.02525 0.03529 0.04255 0.05249 0.05249
## Male Dead from melanoma      0 0.04497 0.05310 0.06534 0.06534      NA
## Female Dead from other cause 0 0.01568 0.01753 0.02128 0.03910 0.03910
## Male Dead from other cause   0 0.02174 0.02935 0.02935 0.05432      NA





### 濾胞性リンパ腫の生存データの解析例
## 参考: http://www.jstatsoft.org/v38/i02/paper
## 元文献: follicular cell lymphoma data from Pintilie (2007)

## 必要なパッケージをインストール&読み込み
inst("timereg") ; library(timereg)

## インターネットからデータ読み込み
fol <- read.csv("http://www.uhnres.utoronto.ca/labs/hill/datasets/Pintilie/datasets/follic.txt")

## データ確認
head(fol)
  age path1 hgb ldh clinstg blktxcat relsite ch rt survtime stat    dftime dfcens resp stnum
1  56  NH04 140  NA       2        1       B     Y   0.6982    1  0.238193      1   CR     1
2  36  NH02 130  NA       2        1       D     Y  14.5024    1 12.418891      1   CR     2
3  39  NH02 140  NA       2        3          Y  Y   4.9144    1  0.002738      1   NR     3
4  37  NH03 140  NA       1        1             Y  15.6851    1 15.685147      1   CR     4
5  61  NH04 110  NA       2        2             Y   0.2355    1  0.002738      1   NR     5
6  69  NH02 120  NA       1        1             Y   8.4189    1  8.418891      1   CR     6
## stage I or II Follicular lymphomaに対する治療法と生命予後のデータ
##
##  年齢 病理  Hgb LDH  Stage          再発部位 化 放   生存時間 Event  Event時間       治療反応
##   age path1 hgb ldh clinstg blktxcat relsite ch rt   survtime stat       dftime dfcens resp stnum
## 1  56  NH04 140  NA       2        1       B     Y  0.6981520    1  0.238193018      1   CR     1
## 2  36  NH02 130  NA       2        1       D     Y 14.5023956    1 12.418891170      1   CR     2
## 3  39  NH02 140  NA       2        3          Y  Y  4.9144422    1  0.002737851      1   NR     3
## 4  37  NH03 140  NA       1        1             Y 15.6851472    1 15.685147159      1   CR     4
## 5  61  NH04 110  NA       2        2             Y  0.2354552    1  0.002737851      1   NR     5
## 6  69  NH02 120  NA       1        1             Y  8.4188912    1  8.418891170      1   CR     6


## データの作成
evcens <- as.numeric(fol$resp == "NR" | fol$relsite != "")                 # No-response群のflag
crcens <- as.numeric(fol$resp == "CR" & fol$relsite == "" & fol$stat == 1) # CR中死亡群のflag
cause <- ifelse(evcens == 1, 1, ifelse(crcens == 1, 2, 0))      #

## イベントの種類の集計
## 193例の打ち切り例、272名の疾患イベント(治療無反応or再発)、76例のCR中死亡。
## 打ち切りの症例がその後study外で疾患イベントを経験する傾向は非打ち切り群と同様と仮定される
## CR中死亡群はその後再発することはないため、上記仮定を必要とする打ち切りとは見なせない
table(cause)
cause
  0   1   2 
193 272  76 
## cause
##   0   1   2
## 193 272  76


## 他の変数の定義
stage <- as.numeric(fol$clinstg == 2)           # Stage 2なら1、Stage 1なら0
chemo <- as.numeric(fol$ch == "Y")              # 化学療法をしていれば1、なければ0
times1 <- sort(unique(fol$dftime[cause == 1]))  # 疾患イベントの発生時間(重複のぞく)



## test
factor(cause, levels = c(0,1,2), labels = c("Censored", "No response", ""))
  [1] No response No response No response             No response                         Censored    Censored   
 [10] No response No response No response No response No response             No response No response No response
 [19] Censored    No response             No response No response Censored    No response No response No response
 [28]             No response No response Censored    Censored    Censored                No response Censored   
 [37]             No response No response No response No response No response             No response No response
 [46] No response Censored    No response Censored    Censored    No response No response Censored    No response
 [55] No response             No response No response No response No response No response Censored    No response
 [64]             No response                         No response No response No response                        
 [73] Censored    No response No response No response No response No response Censored    No response No response
 [82]             No response Censored                Censored                Censored                No response
 [91]                         No response No response No response No response No response No response No response
[100]                         Censored    Censored    No response No response Censored    No response No response
[109] No response No response No response No response No response Censored                No response            
[118]             No response Censored    No response No response No response No response No response Censored   
[127] Censored    No response No response No response Censored    No response No response No response Censored   
[136]             No response Censored    Censored                No response No response Censored    No response
[145]             No response No response No response Censored    No response Censored    Censored    Censored   
[154] Censored    No response No response No response             Censored                No response Censored   
[163] No response No response             No response No response Censored    No response No response No response
[172] No response Censored    No response No response Censored    Censored    Censored    No response Censored   
[181]             No response             No response                         No response No response            
[190] No response Censored    No response No response             No response No response                        
[199] No response No response No response Censored    No response Censored    Censored    No response No response
[208]             Censored    Censored    No response No response No response No response No response Censored   
[217] Censored                            No response Censored    No response Censored    Censored    No response
[226] No response No response No response No response Censored    No response Censored    No response Censored   
[235] No response No response Censored    No response No response No response No response Censored    No response
[244] No response Censored    No response No response No response No response Censored    No response No response
[253]                         No response             No response No response No response Censored               
[262] No response No response Censored    No response Censored    No response Censored                Censored   
[271] No response No response No response Censored    Censored                Censored    Censored    No response
[280] No response No response Censored    No response No response No response                         No response
[289] Censored    Censored    No response Censored    No response No response No response             Censored   
[298] No response Censored    No response No response Censored                Censored    No response No response
[307] No response Censored    Censored    No response             No response No response No response No response
[316] No response Censored    Censored    No response             No response No response No response No response
[325] Censored    No response No response No response                         No response No response No response
[334] No response No response No response No response No response Censored    Censored    Censored    No response
[343] Censored    No response No response No response Censored                Censored    Censored    Censored   
[352] Censored    Censored    Censored    No response No response                                     Censored   
[361] Censored    No response Censored    Censored    Censored    No response Censored    Censored    No response
[370] No response No response No response             Censored    No response Censored    Censored    No response
[379]             No response No response Censored                No response Censored    No response Censored   
[388] No response             No response No response No response Censored    No response Censored    No response
[397] No response No response No response Censored    No response No response No response Censored    No response
[406] Censored    Censored                No response             Censored    No response No response Censored   
[415]             Censored    Censored    No response No response No response Censored    Censored               
[424] No response No response No response Censored    Censored    No response No response No response            
[433] Censored    Censored                Censored    No response Censored                No response            
[442] Censored    No response Censored    Censored    Censored    Censored    Censored    No response Censored   
[451] Censored    Censored    Censored    No response No response Censored    No response Censored    Censored   
[460] No response Censored    Censored    Censored                No response No response Censored    Censored   
[469] Censored    Censored    Censored    Censored    Censored    No response Censored    Censored    Censored   
[478] Censored    Censored    Censored    Censored    Censored    No response Censored    Censored    Censored   
[487] Censored    Censored    Censored    Censored    No response Censored    Censored    Censored    Censored   
[496] Censored    Censored    No response             No response Censored    Censored    Censored    Censored   
[505] Censored    No response Censored    No response Censored    Censored    No response No response Censored   
[514] Censored    No response No response Censored    No response Censored    Censored    No response Censored   
[523] Censored    Censored    No response No response             No response No response Censored    No response
[532] No response No response Censored    No response Censored    Censored    Censored    Censored    Censored   
[541] Censored   
Levels: Censored No response 
    with(fol,
         CumIncidence(ftime = dftime,            # ftimeにフォロー期間の変数を指定
                      fstatus = cause,         # fstatusにイベントの種類の変数を指定
                      group = "all",              # groupにグループ分け変数を指定。一群なので"all"と名付ける。
                      cencode = 0,     # cencodeに打ち切りの場合の値を指定。
                      xlab = "Years")          # X軸のラベル。時間の単位を入れます。
         )

+-------------------------------------------------------------------+
| Cumulative incidence function estimates from competing risks data |
+-------------------------------------------------------------------+

Estimates at time points:
      0       5      10     15     20     25     30
all 1 0 0.37737 0.49081 0.5405 0.5618 0.5718 0.5718
all 2 0 0.05235 0.09456 0.1444 0.1803 0.2132 0.3452

Standard errors:
      0        5      10      15      20      25      30
all 1 0 0.020953 0.02278 0.02371 0.02493 0.02630 0.02630
all 2 0 0.009646 0.01344 0.01751 0.02123 0.02736 0.08716



## library(timereg)のcomp.risk()を用いて解析
## cause == 0を打ち切り、1疾患イベントと2CR中死亡をcompeting risk eventとして解析
out1 <- comp.risk(Surv(dftime, cause == 0) ~ + 1, data = fol,
                  cause, causeS = 1, n.sim = 5000, cens.code = 0, model = "additive")

## Predicted survival for an additive competing risks model
pout1 <- predict(out1, X = 1)

group <- rep(1, nrow(fol))
fit <- cuminc(ftime = fol$dftime, fstatus = cause, group = group, cencode = 0)
## ftime: failure time; fstatus: failure causes

layout(matrix(1:2, ncol = 2))

plot of chunk unnamed-chunk-2


## (a) Cumulative incidence curves based on the cuminc() function for the two causes
plot(fit, main = "cmprsk", curvlab = c("NR or relapse","Death in CR"), xlab = "Years (a)")

## (b) cumulative incidence curve based on comp.risk() function for relapse
## (solid line) with 95% confidence intervals (dotted lines) and
## 95% confidence bands (broken lines) based on resampling.
plot(pout1, xlim = c(0, 30), xlab = "Years (b)", main = "timereg",
     uniform = 3, se = 2)

plot of chunk unnamed-chunk-2



## Test
with(fol,
     CumIncidence(ftime = dftime, fstatus = cause, group = path1, cencode = 0)
     )

+-------------------------------------------------------------------+
| Cumulative incidence function estimates from competing risks data |
+-------------------------------------------------------------------+
Test equality across groups:
  Statistic p-value df
1     1.647  0.4389  2
2     2.164  0.3390  2

Estimates at time points:
       0       5      10     15     20     25     30
NH02 1 0 0.33953 0.46798 0.5105 0.5420 0.5720     NA
NH03 1 0 0.43012 0.54018 0.6033 0.6199     NA     NA
NH04 1 0 0.36212 0.46622 0.5112 0.5233 0.5233 0.5233
NH02 2 0 0.05289 0.09576 0.1221 0.1362 0.1963     NA
NH03 2 0 0.07875 0.12163 0.1506 0.2213     NA     NA
NH04 2 0 0.02660 0.06790 0.1603 0.1915 0.2152 0.3911

Standard errors:
       0       5      10      15      20      25      30
NH02 1 0 0.03643 0.03972 0.04092 0.04406 0.05092      NA
NH03 1 0 0.03724 0.04027 0.04229 0.04371      NA      NA
NH04 1 0 0.03521 0.03872 0.04042 0.04125 0.04125 0.04125
NH02 2 0 0.01723 0.02385 0.02760 0.03059 0.05190      NA
NH03 2 0 0.02030 0.02594 0.03001 0.04353      NA      NA
NH04 2 0 0.01178 0.02014 0.03298 0.03823 0.04363 0.09612

out1 <- comp.risk(Surv(dftime, cause == 0) ~ path1, data = fol,
                  cause, causeS = 1, n.sim = 5000, cens.code = 0, model = "additive")
plot(out1)

plot of chunk unnamed-chunk-2 plot of chunk unnamed-chunk-2

summary(out1)
Competing risks Model 

Test for nonparametric terms 

Test for non-significant effects 
            Supremum-test of significance p-value H_0: B(t)=0
(Intercept)                          8.34               0.000
path1NH03                            1.73               0.484
path1NH04                            2.42               0.114

Test for time invariant effects 
                  Kolmogorov-Smirnov test p-value H_0:constant effect
(Intercept)                         0.305                       0.000
path1NH03                           0.136                       0.340
path1NH04                           0.133                       0.278
                    Cramer von Mises test p-value H_0:constant effect
(Intercept)                        0.7550                       0.000
path1NH03                          0.1090                       0.242
path1NH04                          0.0605                       0.419

  Call: 
comp.risk(Surv(dftime, cause == 0) ~ path1, data = fol, cause, 
    causeS = 1, n.sim = 5000, cens.code = 0, model = "additive")