2013-11-19
Load necessary packages
> library(survival) # survival analysis
Loading required package: splines
> library(nlme) # longitudinal analysis
> library(JM) # Joint Model
Loading required package: MASS
Import data and basic check
> dat <- read.csv("cancer data.csv")
> str(dat)
'data.frame': 1169 obs. of 9 variables:
$ id : int 1028 1028 1028 1028 1031 1031 1031 1031 1031 1159 ...
$ status : int 0 0 0 0 1 1 1 1 1 0 ...
$ start : int 0 7 13 19 0 6 10 12 15 0 ...
$ end : int 7 13 19 26 6 10 12 15 18 7 ...
$ FV.width : int 0 0 4 4 12 18 0 20 20 7 ...
$ age : int 82 82 82 82 93 93 93 93 93 54 ...
$ sex : Factor w/ 2 levels "F","M": 1 1 1 1 1 1 1 1 1 1 ...
$ smoke : int 0 0 0 0 1 1 1 1 1 0 ...
$ family.history: int 1 1 1 1 0 0 0 0 0 0 ...
> head(dat)
id status start end FV.width age sex smoke family.history
1 1028 0 0 7 0 82 F 0 1
2 1028 0 7 13 0 82 F 0 1
3 1028 0 13 19 4 82 F 0 1
4 1028 0 19 26 4 82 F 0 1
5 1031 1 0 6 12 93 F 1 0
6 1031 1 6 10 18 93 F 1 0
Survival model only.
> fit1.cox <- coxph(Surv(start, end, event = status) ~ FV.width + age+ sex +
+ smoke + family.history, data = dat)
> summary(fit1.cox)
Call:
coxph(formula = Surv(start, end, event = status) ~ FV.width +
age + sex + smoke + family.history, data = dat)
n= 1082, number of events= 93
(87 observations deleted due to missingness)
coef exp(coef) se(coef) z Pr(>|z|)
FV.width 0.07235 1.07503 0.01128 6.42 1.4e-10 ***
age 0.00213 1.00213 0.00931 0.23 0.82
sexM -0.31354 0.73085 0.21558 -1.45 0.15
smoke 1.45148 4.26943 0.35047 4.14 3.5e-05 ***
family.history 0.37679 1.45760 0.31300 1.20 0.23
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(coef) exp(-coef) lower .95 upper .95
FV.width 1.075 0.930 1.052 1.10
age 1.002 0.998 0.984 1.02
sexM 0.731 1.368 0.479 1.12
smoke 4.269 0.234 2.148 8.49
family.history 1.458 0.686 0.789 2.69
Concordance= 0.716 (se = 0.032 )
Rsquare= 0.058 (max possible= 0.563 )
Likelihood ratio test= 65 on 5 df, p=1.13e-12
Wald test = 69.6 on 5 df, p=1.24e-13
Score (logrank) test = 78.1 on 5 df, p=2.11e-15
First fit a longitudinal model (LME model) for FV.width.
> fit2.lme <- lme(FV.width ~ end+ age*end + sex*end + smoke*end + family.history*end,
+ data = dat, random = ~ end | id, na.action = na.omit, method = "ML")
> summary(fit2.lme)
Linear mixed-effects model fit by maximum likelihood
Data: dat
AIC BIC logLik
6370 6440 -3171
Random effects:
Formula: ~end | id
Structure: General positive-definite, Log-Cholesky parametrization
StdDev Corr
(Intercept) 4.7642 (Intr)
end 0.1494 -0.59
Residual 3.6585
Fixed effects: FV.width ~ end + age * end + sex * end + smoke * end + family.history * end
Value Std.Error DF t-value p-value
(Intercept) 6.470 2.0884 901 3.0978 0.0020
end -0.142 0.0834 901 -1.6999 0.0895
age 0.013 0.0355 171 0.3718 0.7105
sexM -1.929 0.9432 171 -2.0456 0.0423
smoke 1.866 1.0295 171 1.8130 0.0716
family.history -1.737 1.3660 171 -1.2715 0.2053
end:age 0.001 0.0015 901 0.9041 0.3662
end:sexM 0.041 0.0382 901 1.0613 0.2888
end:smoke -0.029 0.0417 901 -0.7048 0.4811
end:family.history 0.011 0.0582 901 0.1838 0.8542
Correlation:
(Intr) end age sexM smoke fmly.h end:ag end:sM
end -0.669
age -0.906 0.622
sexM -0.243 0.163 0.057
smoke 0.063 -0.071 -0.318 -0.276
family.history -0.042 0.022 -0.054 0.041 0.021
end:age 0.595 -0.909 -0.675 -0.032 0.238 0.046
end:sexM 0.162 -0.234 -0.033 -0.684 0.188 -0.016 0.041
end:smoke -0.067 0.131 0.241 0.189 -0.686 -0.037 -0.371 -0.274
end:family.history 0.022 -0.017 0.045 -0.013 -0.037 -0.678 -0.089 0.021
end:sm
end
age
sexM
smoke
family.history
end:age
end:sexM
end:smoke
end:family.history 0.085
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-4.3007 -0.4490 -0.0768 0.3299 8.0857
Number of Observations: 1082
Number of Groups: 176
Then use coef() to get the estimated parameters for each patients, and put the fitted value of FV.width in the longitudinal model (I haven't done it yet but should not have much difficulity).
I use the JM package and jointModel() function to fit the joint model. However, I get an error of “non-conformable arrays” when fitting the model, it seems that the algorithm doesn't work for this data, sigh…
> fit3.cox <- coxph(Surv(start, end, event = status) ~ FV.width + age+ sex + smoke
+ + family.history + cluster(id), data = dat, x = TRUE, model = TRUE, cluster(id))
> fit4.jm <- jointModel(fit2.lme, fit3.cox, timeVar = "end", method = "Cox-PH-GH")
Warning: data length [704] is not a sub-multiple or multiple of the number
of rows [1082]
Error: non-conformable arrays