SPPH 551: Big Picture of My Data Analysis

Yumian Hu

2013-11-19

Raw Model

Load necessary packages

> library(survival)  # survival analysis 
Loading required package: splines
> library(nlme)      # longitudinal analysis 
> library(JM)        # Joint Model
Loading required package: MASS

Import data and basic check

> dat <- read.csv("cancer data.csv")
> str(dat)
'data.frame':   1169 obs. of  9 variables:
 $ id            : int  1028 1028 1028 1028 1031 1031 1031 1031 1031 1159 ...
 $ status        : int  0 0 0 0 1 1 1 1 1 0 ...
 $ start         : int  0 7 13 19 0 6 10 12 15 0 ...
 $ end           : int  7 13 19 26 6 10 12 15 18 7 ...
 $ FV.width      : int  0 0 4 4 12 18 0 20 20 7 ...
 $ age           : int  82 82 82 82 93 93 93 93 93 54 ...
 $ sex           : Factor w/ 2 levels "F","M": 1 1 1 1 1 1 1 1 1 1 ...
 $ smoke         : int  0 0 0 0 1 1 1 1 1 0 ...
 $ family.history: int  1 1 1 1 0 0 0 0 0 0 ...
> head(dat)
    id status start end FV.width age sex smoke family.history
1 1028      0     0   7        0  82   F     0              1
2 1028      0     7  13        0  82   F     0              1
3 1028      0    13  19        4  82   F     0              1
4 1028      0    19  26        4  82   F     0              1
5 1031      1     0   6       12  93   F     1              0
6 1031      1     6  10       18  93   F     1              0

Survival model only.

> fit1.cox <- coxph(Surv(start, end, event = status) ~ FV.width + age+ sex + 
+                     smoke + family.history, data = dat)
> summary(fit1.cox)
Call:
coxph(formula = Surv(start, end, event = status) ~ FV.width + 
    age + sex + smoke + family.history, data = dat)

  n= 1082, number of events= 93 
   (87 observations deleted due to missingness)

                   coef exp(coef) se(coef)     z Pr(>|z|)    
FV.width        0.07235   1.07503  0.01128  6.42  1.4e-10 ***
age             0.00213   1.00213  0.00931  0.23     0.82    
sexM           -0.31354   0.73085  0.21558 -1.45     0.15    
smoke           1.45148   4.26943  0.35047  4.14  3.5e-05 ***
family.history  0.37679   1.45760  0.31300  1.20     0.23    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

               exp(coef) exp(-coef) lower .95 upper .95
FV.width           1.075      0.930     1.052      1.10
age                1.002      0.998     0.984      1.02
sexM               0.731      1.368     0.479      1.12
smoke              4.269      0.234     2.148      8.49
family.history     1.458      0.686     0.789      2.69

Concordance= 0.716  (se = 0.032 )
Rsquare= 0.058   (max possible= 0.563 )
Likelihood ratio test= 65  on 5 df,   p=1.13e-12
Wald test            = 69.6  on 5 df,   p=1.24e-13
Score (logrank) test = 78.1  on 5 df,   p=2.11e-15

Two-step Model

First fit a longitudinal model (LME model) for FV.width.

> fit2.lme <- lme(FV.width ~ end+ age*end + sex*end + smoke*end + family.history*end,
+             data = dat, random = ~ end | id, na.action = na.omit, method = "ML")
> summary(fit2.lme)
Linear mixed-effects model fit by maximum likelihood
 Data: dat 
   AIC  BIC logLik
  6370 6440  -3171

Random effects:
 Formula: ~end | id
 Structure: General positive-definite, Log-Cholesky parametrization
            StdDev Corr  
(Intercept) 4.7642 (Intr)
end         0.1494 -0.59 
Residual    3.6585       

Fixed effects: FV.width ~ end + age * end + sex * end + smoke * end + family.history *      end 
                    Value Std.Error  DF t-value p-value
(Intercept)         6.470    2.0884 901  3.0978  0.0020
end                -0.142    0.0834 901 -1.6999  0.0895
age                 0.013    0.0355 171  0.3718  0.7105
sexM               -1.929    0.9432 171 -2.0456  0.0423
smoke               1.866    1.0295 171  1.8130  0.0716
family.history     -1.737    1.3660 171 -1.2715  0.2053
end:age             0.001    0.0015 901  0.9041  0.3662
end:sexM            0.041    0.0382 901  1.0613  0.2888
end:smoke          -0.029    0.0417 901 -0.7048  0.4811
end:family.history  0.011    0.0582 901  0.1838  0.8542
 Correlation: 
                   (Intr) end    age    sexM   smoke  fmly.h end:ag end:sM
end                -0.669                                                 
age                -0.906  0.622                                          
sexM               -0.243  0.163  0.057                                   
smoke               0.063 -0.071 -0.318 -0.276                            
family.history     -0.042  0.022 -0.054  0.041  0.021                     
end:age             0.595 -0.909 -0.675 -0.032  0.238  0.046              
end:sexM            0.162 -0.234 -0.033 -0.684  0.188 -0.016  0.041       
end:smoke          -0.067  0.131  0.241  0.189 -0.686 -0.037 -0.371 -0.274
end:family.history  0.022 -0.017  0.045 -0.013 -0.037 -0.678 -0.089  0.021
                   end:sm
end                      
age                      
sexM                     
smoke                    
family.history           
end:age                  
end:sexM                 
end:smoke                
end:family.history  0.085

Standardized Within-Group Residuals:
    Min      Q1     Med      Q3     Max 
-4.3007 -0.4490 -0.0768  0.3299  8.0857 

Number of Observations: 1082
Number of Groups: 176 

Then use coef() to get the estimated parameters for each patients, and put the fitted value of FV.width in the longitudinal model (I haven't done it yet but should not have much difficulity).

Joint Model

I use the JM package and jointModel() function to fit the joint model. However, I get an error of “non-conformable arrays” when fitting the model, it seems that the algorithm doesn't work for this data, sigh…

> fit3.cox <- coxph(Surv(start, end, event = status) ~ FV.width + age+ sex + smoke 
+                   + family.history + cluster(id), data = dat, x = TRUE, model = TRUE, cluster(id))
> fit4.jm <- jointModel(fit2.lme, fit3.cox, timeVar = "end", method = "Cox-PH-GH")
Warning: data length [704] is not a sub-multiple or multiple of the number
of rows [1082]
Error: non-conformable arrays