1 Sample Groups

1.1 Comparison 01:

kSORT assay genes (table below) predicting acute rejection (Group 3), chronic antibody-mediated rejection (Group 4), and BKV viremia (Group 5), compared to Groups 1 and 2.

1.1.0.1 Sample Summary

1.1.0.2 Data Preprocessing

CEL files were processed using the oligo package. Robust multichip averaging (RMA) was used to background-correct, normalize, and summarize the probe-level data. Annotations were taken from the hugene10sttranscriptcluster database. Control probes were removed before linear modelling.
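As an illustration of the normalization step only, here is a minimal base R sketch of quantile normalization, the method RMA uses to make the intensity distributions of all arrays identical. It does not reproduce RMA's background correction or median-polish summarization, and it is not the oligo implementation. The matrix `x` and the helper `quantile_normalize` are hypothetical names on simulated values, not data from the CEL files.

```r
# Toy log2 intensity matrix: 6 probesets (rows) x 3 arrays (columns).
# All values are simulated; they do not come from the study's CEL files.
set.seed(1)
x <- matrix(rnorm(18, mean = 8, sd = 2), nrow = 6,
            dimnames = list(paste0("probe", 1:6), paste0("array", 1:3)))

# Hypothetical helper: quantile-normalize the columns of a matrix.
quantile_normalize <- function(m) {
  # Rank the values within each array (column).
  ranks <- apply(m, 2, rank, ties.method = "first")
  # Reference distribution: mean of the sorted columns, rank by rank.
  reference <- rowMeans(apply(unname(m), 2, sort))
  # Replace each value by the reference value at its rank.
  out <- apply(ranks, 2, function(r) reference[r])
  dimnames(out) <- dimnames(m)
  out
}

x_norm <- quantile_normalize(x)

# After normalization every array contains the same set of values,
# only assigned to different probesets.
print(apply(unname(x_norm), 2, sort))
```

In the real pipeline this step happens inside the RMA call together with background correction and summarization, so it would not be run separately.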

Exploratory Data Analysis plots

Exploratory data analysis was carried out to examine sample-to-sample variation. The heatmap is generated from Pearson's correlation between each sample pair and is based on all expression values. The multidimensional scaling (MDS) plot is based on the top 500 gene expression differences between samples and is colored according to sample group.
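The two plots can be sketched in base R as follows. This is an assumption-laden illustration: the report does not say which functions produced the plots, so the pairwise distance here (root-mean-square of the 500 largest expression differences per sample pair) is one plausible reading of "top 500 gene expression differences". The objects `expr` and `groups` are simulated stand-ins for the normalized expression matrix and the sample groups.

```r
# Simulated expression matrix: 2000 genes x 12 samples in three groups.
# These values are hypothetical and only stand in for the real data.
set.seed(42)
n_genes <- 2000
groups <- rep(c("G1", "G2", "G3"), each = 4)
expr <- matrix(rnorm(n_genes * length(groups), mean = 8), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)),
                               paste0(groups, "_", seq_along(groups))))
# Shift 100 genes in G3 so the toy data has some group structure.
expr[1:100, groups == "G3"] <- expr[1:100, groups == "G3"] + 2

# Heatmap: Pearson correlation between every sample pair, all genes.
sample_cor <- cor(expr, method = "pearson")
heatmap(sample_cor, symm = TRUE, margins = c(6, 6))

# MDS: distance between two samples is the root-mean-square of the
# `top` largest expression differences for that pair.
top <- 500
n <- ncol(expr)
d <- matrix(0, n, n, dimnames = list(colnames(expr), colnames(expr)))
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    sq_diff <- sort((expr[, i] - expr[, j])^2, decreasing = TRUE)[seq_len(top)]
    d[i, j] <- d[j, i] <- sqrt(mean(sq_diff))
  }
}
mds <- cmdscale(as.dist(d), k = 2)

# Color the points by sample group.
group_factor <- factor(groups)
plot(mds, col = as.integer(group_factor), pch = 19,
     xlab = "Dimension 1", ylab = "Dimension 2")
legend("topright", legend = levels(group_factor),
       col = seq_along(levels(group_factor)), pch = 19)
```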

1.1.1 G3 vs G1 and G2

## 
## Call:
## glm(formula = as.factor(Class) ~ ., family = "binomial", data = g3DF, 
##     weights = na.action(na.omit))
## 
## Deviance Residuals: 
##        Min          1Q      Median          3Q         Max  
## -1.295e-05   2.110e-08   2.110e-08   1.337e-06   1.392e-05  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)
## (Intercept)  4.378e+02  7.043e+06       0        1
## NAMPT       -6.598e+01  6.186e+05       0        1
## PSEN1       -3.426e+01  1.404e+06       0        1
## ITGAX        4.845e+01  1.244e+06       0        1
## RARA         2.579e+01  1.434e+06       0        1
## EPOR        -1.624e+02  2.170e+06       0        1
## CEACAM4      3.051e+01  8.497e+05       0        1
## CFLAR       -1.330e+02  6.239e+05       0        1
## NKTR         2.267e+01  2.501e+06       0        1
## RNF13       -2.661e+01  1.113e+06       0        1
## RYBP         5.044e+01  1.663e+06       0        1
## GZMK         2.884e+00  7.892e+05       0        1
## DUSP1        3.356e+01  5.276e+05       0        1
## MAPK9       -1.924e+01  2.019e+06       0        1
## IFNGR1       6.739e+01  9.345e+05       0        1
## RHEB         1.652e+00  1.606e+06       0        1
## SLC25A37     1.004e+02  7.772e+05       0        1
## RXRA        -2.952e+01  4.597e+05       0        1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 4.0191e+01  on 48  degrees of freedom
## Residual deviance: 1.5303e-09  on 31  degrees of freedom
## AIC: 36
## 
## Number of Fisher Scoring iterations: 25
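
The fit above shows standard errors around 1e+06, p-values of 1, a residual deviance near zero, and 25 Fisher scoring iterations, which is glm's default iteration limit. This pattern is what glm typically produces when the predictors separate the two classes perfectly, which is plausible with 17 predictors and 49 samples; the report itself does not state this, so treat it as an interpretation. A minimal sketch on simulated data (the data frame `toy` and its values are hypothetical, not taken from g3DF) that reproduces the pattern and checks for it:

```r
# Simulated data with a gap between the two classes (perfect separation).
# Values and names are hypothetical; they are not taken from g3DF.
toy <- data.frame(
  Class = factor(rep(c("Normal", "Rejection"), each = 20)),
  gene  = c(seq(-3, -1, length.out = 20), seq(1, 3, length.out = 20))
)

# glm warns that fitted probabilities are numerically 0 or 1; the warnings
# are suppressed here because triggering them is the point of the example.
fit <- suppressWarnings(glm(Class ~ gene, family = binomial, data = toy))

diagnostics <- c(
  residual_deviance = deviance(fit),
  max_std_error     = max(summary(fit)$coefficients[, "Std. Error"]),
  iterations        = fit$iter
)
print(diagnostics)

# A residual deviance of essentially zero means the fitted model assigns
# every training sample to its class with probability 0 or 1.
looks_separated <- deviance(fit) < 1e-6
print(looks_separated)
```

When a check like this flags a model, the coefficient table carries no usable information about individual genes; fewer predictors or a penalized fit are the usual alternatives.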

1.1.2 G4 vs G1 and G2

## 
## Call:
## glm(formula = as.factor(Class) ~ ., family = "binomial", data = g4DF, 
##     weights = na.action(na.omit))
## 
## Deviance Residuals: 
##        Min          1Q      Median          3Q         Max  
## -4.048e-05  -2.100e-08   2.100e-08   2.100e-08   3.465e-05  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)
## (Intercept)  1.578e+02  3.845e+06   0.000    1.000
## NAMPT        1.509e+02  1.981e+06   0.000    1.000
## PSEN1        9.886e-01  4.649e+05   0.000    1.000
## ITGAX        5.560e+00  3.067e+05   0.000    1.000
## RARA         5.951e+01  3.994e+05   0.000    1.000
## EPOR        -1.751e+02  8.799e+05   0.000    1.000
## CEACAM4      1.759e+02  1.427e+05   0.001    0.999
## CFLAR       -5.021e+02  4.962e+05  -0.001    0.999
## NKTR        -7.505e+01  4.569e+05   0.000    1.000
## RNF13       -2.477e+02  9.394e+05   0.000    1.000
## RYBP         2.469e+02  3.869e+05   0.001    0.999
## GZMK        -2.018e+01  7.929e+04   0.000    1.000
## DUSP1        7.184e+01  5.400e+05   0.000    1.000
## MAPK9        9.888e+01  7.523e+05   0.000    1.000
## IFNGR1       4.297e+01  1.136e+06   0.000    1.000
## RHEB         8.816e+01  4.924e+05   0.000    1.000
## SLC25A37     2.261e+02  7.800e+05   0.000    1.000
## RXRA        -2.196e+02  6.617e+05   0.000    1.000
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 8.0201e+01  on 62  degrees of freedom
## Residual deviance: 8.5551e-09  on 45  degrees of freedom
## AIC: 36
## 
## Number of Fisher Scoring iterations: 25

1.1.3 G5 vs G1 and G2

## 
## Call:
## glm(formula = as.factor(Class) ~ ., family = "binomial", data = g5DF, 
##     weights = na.action(na.omit))
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -2.46928  -0.21842   0.05239   0.56639   1.49113  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept) 219.4842   126.9207   1.729   0.0838 .
## NAMPT       -14.6150     7.0697  -2.067   0.0387 *
## PSEN1         9.3106     6.2122   1.499   0.1339  
## ITGAX        -4.2172     4.3173  -0.977   0.3287  
## RARA         -2.8295     3.3621  -0.842   0.4000  
## EPOR         -2.4074     2.8942  -0.832   0.4055  
## CEACAM4       5.9847     3.8414   1.558   0.1193  
## CFLAR        -2.5103     3.8041  -0.660   0.5093  
## NKTR        -12.1980     5.7908  -2.106   0.0352 *
## RNF13        -0.3145     5.7372  -0.055   0.9563  
## RYBP          7.6547     4.3793   1.748   0.0805 .
## GZMK          1.6102     1.4330   1.124   0.2612  
## DUSP1        -2.2398     3.1057  -0.721   0.4708  
## MAPK9         0.2652     2.8027   0.095   0.9246  
## IFNGR1        5.0440     4.2903   1.176   0.2397  
## RHEB         -2.6692     3.2608  -0.819   0.4130  
## SLC25A37      7.1320     3.4838   2.047   0.0406 *
## RXRA        -13.0264     6.4258  -2.027   0.0426 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 75.674  on 60  degrees of freedom
## Residual deviance: 37.017  on 43  degrees of freedom
## AIC: 73.017
## 
## Number of Fisher Scoring iterations: 8

1.1.4 Rejection vs Normal (Pre & Post-Transplant)

## 
## Call:
## glm(formula = as.factor(Class) ~ ., family = binomial(link = "logit"), 
##     data = g34DF, weights = na.action(na.omit))
## 
## Deviance Residuals: 
##        Min          1Q      Median          3Q         Max  
## -3.713e-05  -2.100e-08  -2.100e-08   2.100e-08   4.292e-05  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.469e+02  6.290e+06       0        1
## NAMPT       -2.262e+02  3.201e+06       0        1
## PSEN1        4.179e+01  1.369e+06       0        1
## ITGAX        4.962e+01  9.077e+05       0        1
## RARA         3.547e+01  1.976e+06       0        1
## EPOR         3.562e+01  2.293e+06       0        1
## CEACAM4     -1.600e+02  4.003e+05       0        1
## CFLAR        4.536e+02  1.340e+06       0        1
## NKTR        -8.777e+00  1.362e+06       0        1
## RNF13        1.926e+02  2.329e+06       0        1
## RYBP        -2.656e+02  8.842e+05       0        1
## GZMK        -7.085e-01  3.969e+05       0        1
## DUSP1       -2.974e+01  1.498e+06       0        1
## MAPK9       -4.099e+01  1.259e+06       0        1
## IFNGR1      -2.080e+01  1.866e+06       0        1
## RHEB        -4.273e+01  1.406e+06       0        1
## SLC25A37    -1.166e+02  1.809e+06       0        1
## RXRA         1.465e+02  2.118e+06       0        1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 9.4222e+01  on 69  degrees of freedom
## Residual deviance: 9.2267e-09  on 52  degrees of freedom
## AIC: 36
## 
## Number of Fisher Scoring iterations: 25

1.1.5 G3,G4 and G5 vs G1 and G2

## 
## Call:
## glm(formula = as.factor(Class) ~ ., family = "binomial", data = g345DF, 
##     weights = na.action(na.omit))
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.9548  -0.4397  -0.1100   0.3160   2.8630  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept)  82.0447    48.9149   1.677   0.0935 .
## NAMPT        -4.7563     3.9788  -1.195   0.2319  
## PSEN1         2.5695     3.8725   0.664   0.5070  
## ITGAX        -1.0303     2.0023  -0.515   0.6069  
## RARA         -4.4329     2.6659  -1.663   0.0964 .
## EPOR         -3.7469     2.3787  -1.575   0.1152  
## CEACAM4       3.4260     1.4870   2.304   0.0212 *
## CFLAR        -5.3213     2.4777  -2.148   0.0317 *
## NKTR         -5.2602     2.7803  -1.892   0.0585 .
## RNF13         0.7688     3.6206   0.212   0.8318  
## RYBP          5.6846     2.3926   2.376   0.0175 *
## GZMK         -0.1151     0.7638  -0.151   0.8803  
## DUSP1         0.7883     1.9093   0.413   0.6797  
## MAPK9         1.8072     1.9687   0.918   0.3586  
## IFNGR1        0.4555     2.4189   0.188   0.8506  
## RHEB         -0.7275     2.1230  -0.343   0.7318  
## SLC25A37      7.0630     2.8795   2.453   0.0142 *
## RXRA         -6.5280     3.1250  -2.089   0.0367 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 123.099  on 88  degrees of freedom
## Residual deviance:  52.758  on 71  degrees of freedom
## AIC: 88.758
## 
## Number of Fisher Scoring iterations: 6

1.1.6 G6 vs G1

## 
## Call:
## glm(formula = as.factor(Class) ~ ., family = "binomial", data = g6DF, 
##     weights = na.action(na.omit))
## 
## Deviance Residuals: 
##        Min          1Q      Median          3Q         Max  
## -1.133e-05  -2.110e-08  -2.110e-08   1.140e-06   8.907e-06  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)
## (Intercept) -4.140e+02  1.703e+07       0        1
## NAMPT       -2.788e+01  1.517e+06       0        1
## PSEN1       -3.440e+01  9.226e+05       0        1
## ITGAX       -3.238e+01  5.802e+05       0        1
## RARA         1.144e+02  7.932e+05       0        1
## EPOR         1.321e+01  5.427e+05       0        1
## CEACAM4     -4.022e+01  7.937e+05       0        1
## CFLAR        2.325e+01  1.166e+06       0        1
## NKTR         2.551e+00  9.888e+05       0        1
## RNF13       -8.273e+01  1.509e+06       0        1
## RYBP        -1.674e+01  2.715e+05       0        1
## GZMK        -5.563e+00  2.155e+05       0        1
## DUSP1       -8.365e+00  4.860e+05       0        1
## MAPK9        3.151e+01  6.735e+05       0        1
## IFNGR1       1.129e+02  8.731e+05       0        1
## RHEB        -1.312e+01  6.630e+05       0        1
## SLC25A37     4.719e+01  7.254e+05       0        1
## RXRA        -3.416e+01  8.000e+05       0        1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 5.3834e+01  on 38  degrees of freedom
## Residual deviance: 6.5185e-10  on 21  degrees of freedom
## AIC: 36
## 
## Number of Fisher Scoring iterations: 25

1.1.7 Feature Selection

Is any subset of these 17 genes a better predictor of acute rejection (Group 3), chronic rejection (Group 4), or BKV viremia (Group 5)?

Recursive feature elimination (RFE) with 3-fold cross-validation was used to select the subset of features that maximises the area under the receiver operating characteristic curve (AUC).
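
The idea can be sketched in base R as below. This is a simplified illustration under stated assumptions, not the procedure used for the report: the data are simulated, the helpers `auc` and `cv_auc` are hypothetical, features are ranked by the absolute z statistic of a logistic model, and the ranking is done once on the full data rather than inside each resample as a full RFE implementation would.

```r
# Simulated data: 6 candidate genes, only gene1 and gene2 carry signal.
set.seed(123)
n <- 120
p <- 6
x <- as.data.frame(matrix(rnorm(n * p), n, p,
                          dimnames = list(NULL, paste0("gene", 1:p))))
y <- rbinom(n, 1, plogis(1.5 * x$gene1 - 1.5 * x$gene2))

# AUC via the rank-sum (Mann-Whitney) identity.
auc <- function(score, label) {
  r <- rank(score)
  n1 <- sum(label == 1)
  n0 <- sum(label == 0)
  (sum(r[label == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Mean held-out AUC over the folds for a logistic model on `features`.
cv_auc <- function(features, folds) {
  dat <- data.frame(y = y, x[, features, drop = FALSE])
  mean(sapply(sort(unique(folds)), function(k) {
    train <- folds != k
    fit <- glm(y ~ ., family = binomial, data = dat[train, , drop = FALSE])
    auc(predict(fit, newdata = dat[!train, , drop = FALSE]), y[!train])
  }))
}

folds <- sample(rep(1:3, length.out = n))  # 3-fold assignment
features <- names(x)
results <- data.frame()
while (length(features) >= 1) {
  results <- rbind(results,
                   data.frame(n_features = length(features),
                              auc = cv_auc(features, folds),
                              set = paste(features, collapse = ","),
                              stringsAsFactors = FALSE))
  if (length(features) == 1) break
  # Eliminate the feature with the smallest absolute z statistic.
  fit <- glm(y ~ ., family = binomial,
             data = data.frame(y = y, x[, features, drop = FALSE]))
  z <- abs(summary(fit)$coefficients[-1, "z value"])
  features <- setdiff(features, names(which.min(z)))
}

print(results)
# Subset size with the highest cross-validated AUC.
print(results[which.max(results$auc), ])
```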

1.1.7.1 Rejection vs Normal

The confusion matrix below represents the proportion of correctly assigned classes for the Rejection vs Normal comparison with the kSORT dataset.
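
For reference, a confusion matrix of proportions can be computed in base R as sketched here. The labels are made up for illustration and are not the study's predictions, and the choice to normalize within each observed class (column-wise) is an assumption, since the report does not say how its proportions were scaled.

```r
# Hypothetical observed and predicted labels; not the study's results.
lv <- c("Normal", "Rejection")
observed  <- factor(c("Rejection", "Rejection", "Rejection", "Rejection",
                      "Normal", "Normal", "Normal", "Normal", "Normal", "Normal"),
                    levels = lv)
predicted <- factor(c("Rejection", "Rejection", "Rejection", "Normal",
                      "Normal", "Normal", "Normal", "Normal", "Rejection", "Normal"),
                    levels = lv)

# Counts: rows are predicted classes, columns are observed classes.
counts <- table(Predicted = predicted, Observed = observed)

# Column-wise proportions: within each observed class, the share of
# samples assigned to each predicted class.
props <- prop.table(counts, margin = 2)
print(round(props, 2))

# Overall accuracy: correctly assigned samples over all samples.
accuracy <- sum(diag(counts)) / sum(counts)
print(accuracy)
```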


1.1.7.2 BKV viremia vs Normal

The confusion matrix below represents the proportion of correctly assigned classes for the BKV vs Normal comparison with the kSORT dataset.


2 Comparison 2

We examine whether the following 5 gene transcripts predict acute rejection (Group 3), chronic rejection (Group 4), or BKV viremia (Group 5). These genes are MARCHF8, FLT3, IL1R2, PDCD1, and DCAF12.

2.0.1 Rejection vs Normal

## 
## Call:
## glm(formula = as.factor(Class) ~ ., family = "binomial", data = Q2_1DF, 
##     weights = na.action(na.omit))
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.3638  -0.4914  -0.2517   0.4592   2.1738  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  64.27720   27.46152   2.341 0.019251 *  
## MARCHF8       8.80456    2.46511   3.572 0.000355 ***
## FLT3          0.12287    0.77174   0.159 0.873507    
## IL1R2        -0.09188    0.65049  -0.141 0.887670    
## PDCD1        -1.60371    2.81293  -0.570 0.568595    
## DCAF12      -12.57539    3.24842  -3.871 0.000108 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 94.222  on 69  degrees of freedom
## Residual deviance: 50.615  on 64  degrees of freedom
## AIC: 62.615
## 
## Number of Fisher Scoring iterations: 6

2.0.2 BKV viremia vs Normal

## 
## Call:
## glm(formula = as.factor(Class) ~ ., family = "binomial", data = Q2_2DF, 
##     weights = na.action(na.omit))
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.0741  -1.1699   0.6841   0.8152   1.5445  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)  
## (Intercept) -31.45741   20.60003  -1.527   0.1267  
## MARCHF8      -3.81741    2.12124  -1.800   0.0719 .
## FLT3          0.53397    0.71490   0.747   0.4551  
## IL1R2         0.05057    0.57909   0.087   0.9304  
## PDCD1         0.75821    1.86053   0.408   0.6836  
## DCAF12        5.39137    2.74110   1.967   0.0492 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 75.674  on 60  degrees of freedom
## Residual deviance: 69.290  on 55  degrees of freedom
## AIC: 81.29
## 
## Number of Fisher Scoring iterations: 4

3 Comparison 3

3.1 Exploratory Data Analysis plots

3.1.0.1 Linear Modelling

The limma package was used to fit a group-means parameterization. Moderated F-statistics were calculated using the eBayes function.
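
To make "group-means parameterization" concrete, here is a minimal base R sketch on simulated data. It shows only the design matrix and the ordinary least-squares fit; the empirical Bayes moderation performed by eBayes is not reproduced. The objects `group`, `expr`, and the G3 vs G1 contrast are hypothetical choices for illustration.

```r
# Simulated data: 5 genes x 12 samples in three groups (hypothetical).
set.seed(11)
group <- factor(rep(c("G1", "G2", "G3"), each = 4))
expr <- matrix(rnorm(5 * length(group), mean = 8), nrow = 5,
               dimnames = list(paste0("gene", 1:5), NULL))

# Group-means parameterization: no intercept, one column per group.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)
print(design)

# Least-squares fit for every gene at once (genes become response columns).
coefs <- t(lm.fit(design, t(expr))$coefficients)

# With this design each coefficient is simply the group's mean expression.
group_means <- t(apply(expr, 1, tapply, group, mean))
print(all.equal(coefs, group_means, check.attributes = FALSE))

# A contrast between two coefficients is a log fold change (on the log2
# scale of the expression values); its largest absolute value is:
logFC_G3_vs_G1 <- coefs[, "G3"] - coefs[, "G1"]
print(max(abs(logFC_G3_vs_G1)))
```

Because every coefficient is a group mean, comparisons between groups are expressed as contrasts of coefficients rather than read directly from the model.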

## [1] 0.65633

The maximum absolute value for logFC is 0.65633.

3.1.0.2 Differentially expressed gene lists

4 Limma romer analysis to test for Halloran PBTs

[1] "Comparison NvD"
            NGenes   Up Down Mixed
core KT1    517.00 0.03 0.97  0.01
core KT2     63.00 0.02 0.98  0.01
core ENDAT  112.00 0.34 0.66  0.11
core IRITD5 196.00 0.66 0.34  0.13
core DSAST   19.00 0.41 0.59  0.14
core CMAT    61.00 1.00 0.00  0.26
GST IQR      62.00 0.61 0.39  0.29
core GRIT1   39.00 0.94 0.06  0.33
GSTs         85.00 0.58 0.42  0.37
core IRITD3 302.00 0.96 0.04  0.53
CAT1 IQR    120.00 1.00 0.00  0.56
AMA         173.00 0.99 0.01  0.64
AMA IQR      87.00 0.98 0.02  0.65
CAT1        204.00 1.00 0.00  0.81
BATs IQR     46.00 0.61 0.39  0.97