Load data

## Loading required package: cluster
## Loading required package: survival
## 
## Read in  2308 genes
## Read in  63 samples
## Read in  63 sample labels
## 
## Make sure these figures are correct!!
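The loader's message asks you to confirm the gene, sample, and label counts. As a rough illustration of what is being checked, the sketch below builds a small object by hand in the list layout that the pamr functions take (`x` with genes in rows and samples in columns, `y`, `geneid`, `genenames`). The name `toy.data` and all values are hypothetical stand-ins, not the Khan data.

```r
# Synthetic stand-in for the Khan data (hypothetical values, not the real dataset)
set.seed(1)
n.genes <- 500
n.samples <- 40
x <- matrix(rnorm(n.genes * n.samples), nrow = n.genes)  # genes in rows, samples in columns
y <- factor(rep(c("A", "B", "C", "D"), each = 10))       # one class label per sample

toy.data <- list(
  x = x,
  y = y,
  geneid = as.character(seq_len(n.genes)),
  genenames = paste0("g", seq_len(n.genes))
)

# The same sanity checks the loader asks you to make by eye
cat("genes:", nrow(toy.data$x),
    " samples:", ncol(toy.data$x),
    " labels:", length(toy.data$y), "\n")
stopifnot(ncol(toy.data$x) == length(toy.data$y))
```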

Optional pre-processing

Impute missing expression values (optional)
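A minimal sketch of this optional step, assuming the expression matrix contains `NA` entries. It uses `pamr.knnimpute` with `k = 10` on a synthetic object; `toy.data` and the number of missing cells are hypothetical.

```r
library(pamr)

# Synthetic data with some missing cells (hypothetical, not the Khan data)
set.seed(1)
x <- matrix(rnorm(200 * 20), nrow = 200)
x[sample(length(x), 50)] <- NA                 # knock out 50 random cells
toy.data <- list(x = x,
                 y = factor(rep(c("A", "B"), each = 10)),
                 geneid = as.character(1:200),
                 genenames = paste0("g", 1:200))

# k-nearest-neighbour imputation; returns the data list with x filled in
toy.imputed <- pamr.knnimpute(toy.data, k = 10)

cat("missing before:", sum(is.na(toy.data$x)),
    " missing after:", sum(is.na(toy.imputed$x)), "\n")
```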

Batch adjust (optional)
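A minimal sketch of this optional step, assuming the data object carries a `batchlabels` component, which `pamr.batchadjust` requires. As I understand the pamr documentation, it subtracts each gene's per-batch mean. The data and batch assignment below are hypothetical.

```r
library(pamr)

# Synthetic data with two processing batches (hypothetical, not the Khan data)
set.seed(1)
x <- matrix(rnorm(200 * 20), nrow = 200)
toy.data <- list(x = x,
                 y = factor(rep(c("A", "B"), each = 10)),
                 geneid = as.character(1:200),
                 genenames = paste0("g", 1:200),
                 batchlabels = factor(rep(c("b1", "b2"), times = 10)))

# Gene-wise adjustment over batches; returns the data list with adjusted x
toy.adjusted <- pamr.batchadjust(toy.data)

# Inspect the per-batch mean of the first gene after adjustment
print(tapply(toy.adjusted$x[1, ], toy.data$batchlabels, mean))
```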

Output adjusted data (optional)
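A minimal sketch of this optional step, assuming you want the adjusted object written back to a tab-delimited text file with `pamr.to.excel`. The output path is a temporary file and the data are hypothetical.

```r
library(pamr)

# Synthetic data standing in for an imputed or batch-adjusted object (hypothetical)
set.seed(1)
toy.data <- list(x = matrix(rnorm(50 * 10), nrow = 50),
                 y = factor(rep(c("A", "B"), each = 5)),
                 geneid = as.character(1:50),
                 genenames = paste0("g", 1:50))

# Write the object to a tab-delimited text file
out.file <- tempfile(fileext = ".txt")
pamr.to.excel(toy.data, file = out.file, trace = FALSE)

cat("wrote", file.info(out.file)$size, "bytes to", out.file, "\n")
```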

Non-interactive step-by-step analysis

Train the classifier and cross-validate

## 123456789101112131415161718192021222324252627282930
## Call:
## pamr.train(data = khan.data)
##    threshold nonzero errors
## 1  0.000     2308    2     
## 2  0.262     2289    1     
## 3  0.524     2145    1     
## 4  0.786     1878    0     
## 5  1.048     1494    0     
## 6  1.309     1137    0     
## 7  1.571      853    0     
## 8  1.833      609    0     
## 9  2.095      436    0     
## 10 2.357      330    0     
## 11 2.619      244    0     
## 12 2.881      193    0     
## 13 3.143      151    0     
## 14 3.404      107    0     
## 15 3.666       87    0     
## 16 3.928       68    0     
## 17 4.190       52    0     
## 18 4.452       39    0     
## 19 4.714       32    1     
## 20 4.976       23    4     
## 21 5.238       21    11    
## 22 5.499       16    14    
## 23 5.761       11    16    
## 24 6.023       10    18    
## 25 6.285        9    21    
## 26 6.547        7    21    
## 27 6.809        5    23    
## 28 7.071        4    39    
## 29 7.333        1    40    
## 30 7.595        0    40
## 1234Fold 1 :123456789101112131415161718192021222324252627282930
## Fold 2 :123456789101112131415161718192021222324252627282930
## Fold 3 :123456789101112131415161718192021222324252627282930
## Fold 4 :123456789101112131415161718192021222324252627282930
## Fold 5 :123456789101112131415161718192021222324252627282930
## Fold 6 :123456789101112131415161718192021222324252627282930
## Fold 7 :123456789101112131415161718192021222324252627282930
## Fold 8 :123456789101112131415161718192021222324252627282930
## Call:
## pamr.cv(fit = khan.train, data = khan.data)
##    threshold nonzero errors
## 1  0.000     2308    2     
## 2  0.262     2289    2     
## 3  0.524     2145    2     
## 4  0.786     1878    2     
## 5  1.048     1494    2     
## 6  1.309     1137    2     
## 7  1.571      853    1     
## 8  1.833      609    1     
## 9  2.095      436    1     
## 10 2.357      330    0     
## 11 2.619      244    0     
## 12 2.881      193    0     
## 13 3.143      151    0     
## 14 3.404      107    0     
## 15 3.666       87    0     
## 16 3.928       68    0     
## 17 4.190       52    0     
## 18 4.452       39    1     
## 19 4.714       32    1     
## 20 4.976       23    4     
## 21 5.238       21    12    
## 22 5.499       16    14    
## 23 5.761       11    15    
## 24 6.023       10    18    
## 25 6.285        9    21    
## 26 6.547        7    21    
## 27 6.809        5    25    
## 28 7.071        4    32    
## 29 7.333        1    37    
## 30 7.595        0    37
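The two tables above come from the calls shown in their `Call:` lines, `pamr.train(data = khan.data)` and `pamr.cv(fit = khan.train, data = khan.data)`. The sketch below repeats the same two calls on a synthetic object so it runs without the Khan file. `toy.data` is hypothetical, and picking the largest threshold with minimal cross-validated error is one possible rule, not necessarily the one used here.

```r
library(pamr)

# Synthetic stand-in for khan.data (hypothetical values, not the real dataset)
set.seed(1)
y <- factor(rep(c("A", "B", "C", "D"), each = 10))
x <- matrix(rnorm(500 * 40), nrow = 500)
for (k in 1:4) {                               # give each class 10 marker genes
  rows <- ((k - 1) * 10 + 1):(k * 10)
  x[rows, as.integer(y) == k] <- x[rows, as.integer(y) == k] + 2
}
toy.data <- list(x = x, y = y,
                 geneid = as.character(1:500),
                 genenames = paste0("g", 1:500))

toy.train <- pamr.train(toy.data)              # fits 30 thresholds by default
toy.results <- pamr.cv(toy.train, toy.data)    # cross-validates the same thresholds
print(toy.results)

# One possible rule: the sparsest model among those with minimal CV error
best <- max(which(toy.results$error == min(toy.results$error)))
thr <- toy.results$threshold[best]
cat("chosen threshold:", thr, "\n")
```

In the Khan output above, the cross-validated error count is 0 for thresholds 2.357 through 4.190, so any value in that range is a candidate, with larger values keeping fewer genes.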

Plot the cross-validated error curves
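A minimal sketch of the plotting call, assuming a cross-validation object from `pamr.cv`. The data are synthetic and hypothetical; on the Khan data the argument would be the cross-validation result.

```r
library(pamr)

# Synthetic stand-in for khan.data (hypothetical values, not the real dataset)
set.seed(1)
y <- factor(rep(c("A", "B", "C", "D"), each = 10))
x <- matrix(rnorm(500 * 40), nrow = 500)
for (k in 1:4) {
  rows <- ((k - 1) * 10 + 1):(k * 10)
  x[rows, as.integer(y) == k] <- x[rows, as.integer(y) == k] + 2
}
toy.data <- list(x = x, y = y,
                 geneid = as.character(1:500),
                 genenames = paste0("g", 1:500))

toy.train <- pamr.train(toy.data)
toy.results <- pamr.cv(toy.train, toy.data)

# Error curves against the threshold; use this plot to choose a threshold by eye
pamr.plotcv(toy.results)
```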

Compute the confusion matrix for a particular model

##     BL EWS NB RMS Class Error rate
## BL   8   0  0   0                0
## EWS  0  23  0   0                0
## NB   0   0 12   0                0
## RMS  0   0  0  20                0
## Overall error rate= 0
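The table above shows no misclassified samples in any of the four classes at the chosen threshold. A minimal sketch of the call follows, on synthetic data. The rule used to pick `thr` (the largest threshold that keeps at least 20 genes) is a hypothetical stand-in for choosing the value visually.

```r
library(pamr)

# Synthetic stand-in for khan.data (hypothetical values, not the real dataset)
set.seed(1)
y <- factor(rep(c("A", "B", "C", "D"), each = 10))
x <- matrix(rnorm(500 * 40), nrow = 500)
for (k in 1:4) {
  rows <- ((k - 1) * 10 + 1):(k * 10)
  x[rows, as.integer(y) == k] <- x[rows, as.integer(y) == k] + 2
}
toy.data <- list(x = x, y = y,
                 geneid = as.character(1:500),
                 genenames = paste0("g", 1:500))

toy.train <- pamr.train(toy.data)
toy.results <- pamr.cv(toy.train, toy.data)

# Hypothetical rule standing in for a visual choice from the error plot
thr <- max(toy.train$threshold[toy.train$nonzero >= 20])

# Prints the cross-validated confusion matrix and per-class error rates
pamr.confusion(toy.results, threshold = thr)

# With extra = FALSE the plain table of true vs. predicted classes is returned
tt <- pamr.confusion(toy.results, threshold = thr, extra = FALSE)
```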

Plot the cross-validated class probabilities by class
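A minimal sketch of the call, assuming a cross-validation object and a threshold that lies below the largest one tried. The data and the rule for `thr` are hypothetical.

```r
library(pamr)

# Synthetic stand-in for khan.data (hypothetical values, not the real dataset)
set.seed(1)
y <- factor(rep(c("A", "B", "C", "D"), each = 10))
x <- matrix(rnorm(500 * 40), nrow = 500)
for (k in 1:4) {
  rows <- ((k - 1) * 10 + 1):(k * 10)
  x[rows, as.integer(y) == k] <- x[rows, as.integer(y) == k] + 2
}
toy.data <- list(x = x, y = y,
                 geneid = as.character(1:500),
                 genenames = paste0("g", 1:500))

toy.train <- pamr.train(toy.data)
toy.results <- pamr.cv(toy.train, toy.data)

# Hypothetical rule standing in for a visual choice from the error plot
thr <- max(toy.train$threshold[toy.train$nonzero >= 20])

# One point per sample and class: the cross-validated class probabilities
pamr.plotcvprob(toy.results, toy.data, threshold = thr)
```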

Plot the class centroids
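A minimal sketch of the call. Note that it takes the trained object rather than the cross-validation object. The data and the rule for `thr` are hypothetical.

```r
library(pamr)

# Synthetic stand-in for khan.data (hypothetical values, not the real dataset)
set.seed(1)
y <- factor(rep(c("A", "B", "C", "D"), each = 10))
x <- matrix(rnorm(500 * 40), nrow = 500)
for (k in 1:4) {
  rows <- ((k - 1) * 10 + 1):(k * 10)
  x[rows, as.integer(y) == k] <- x[rows, as.integer(y) == k] + 2
}
toy.data <- list(x = x, y = y,
                 geneid = as.character(1:500),
                 genenames = paste0("g", 1:500))

toy.train <- pamr.train(toy.data)

# Hypothetical rule standing in for a visual choice from the error plot
thr <- max(toy.train$threshold[toy.train$nonzero >= 20])

# Shrunken class centroids for the genes that survive the threshold
pamr.plotcen(toy.train, toy.data, threshold = thr)
```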

Make a gene plot of the most significant genes

List the significant genes

##       id       BL-score EWS-score NB-score RMS-score
##  [1,] GENE1389 -0.0629  0.5972    0        0        
##  [2,] GENE1955 0        0         0        0.5729   
##  [3,] GENE187  0        -0.0576   0        0.5631   
##  [4,] GENE2050 0        -0.5301   0        0        
##  [5,] GENE246  0        0.5219    0        0        
##  [6,] GENE2198 -0.5083  0         0        0        
##  [7,] GENE509  0        0         0        0.4803   
##  [8,] GENE2046 0        0         0        0.4688   
##  [9,] GENE2022 -0.4635  0         0        0        
## [10,] GENE851  -0.4424  0         0        0        
## [11,] GENE1319 0        0.426     0        0        
## [12,] GENE1003 0        0         0        0.4136   
## [13,] GENE1954 0        0.3966    0        0        
## [14,] GENE1    -0.3915  0         0        0        
## [15,] GENE842  0        0         -0.3641  0        
## [16,] GENE1708 0        0.3226    0        0        
## [17,] GENE129  0        0         0        0.3107   
## [18,] GENE1427 -0.3075  0         0        0        
## [19,] GENE566  0        0.2897    0        0        
## [20,] GENE545  0        0.2747    0        0        
## [21,] GENE836  0.2693   0         0        0        
## [22,] GENE1645 0        0.2659    0        0        
## [23,] GENE107  -0.2552  0         -0.0238  0        
## [24,] GENE2162 -0.2552  0         0        0        
## [25,] GENE255  0        0         0.2441   0        
## [26,] GENE846  0.2402   0         0        0        
## [27,] GENE1055 0        0         0        0.2326   
## [28,] GENE819  0        0         -0.2296  0        
## [29,] GENE554  0        0         0        0.2292   
## [30,] GENE742  0        0         0.2248   0        
## [31,] GENE1066 -0.1943  0         0        0        
## [32,] GENE1886 -0.1932  0         0        0        
## [33,] GENE174  0        0         0        0.1917   
## [34,] GENE1911 0        0         0        0.1455   
## [35,] GENE1764 0        0         0.1424   0        
## [36,] GENE1194 0        0         0        0.1296   
## [37,] GENE1916 0.1192   0         0        0        
## [38,] GENE1750 -0.1166  0         0        0        
## [39,] GENE368  0        0.1122    0        0        
## [40,] GENE783  0.0981   0         0        0        
## [41,] GENE603  0        0         0        0.0896   
## [42,] GENE1723 0        0         0        0.0836   
## [43,] GENE544  -0.0818  0         0        0        
## [44,] GENE1896 0        0         0        0.0745   
## [45,] GENE2    0        0         0        0.0667   
## [46,] GENE248  -0.0665  0         0        0        
## [47,] GENE1601 0        0         0.0596   0        
## [48,] GENE338  0        0         0        0.0538   
## [49,] GENE1799 0        -0.0469   0        0        
## [50,] GENE433  0        0         0        0.0439   
## [51,] GENE1980 -0.0265  0.0374    0        0        
## [52,] GENE1105 0        0         0        0.0354   
## [53,] GENE2166 -0.0315  0         0        0        
## [54,] GENE2303 -0.0305  0         0        0        
## [55,] GENE123  0.0302   0         0        0        
## [56,] GENE1387 0.0276   0         0        0        
## [57,] GENE2146 0        0         0        0.0229   
## [58,] GENE788  -0.0225  0         0        0        
## [59,] GENE335  0.0164   0         0        0        
## [60,] GENE1207 0        0         0        0.016    
## [61,] GENE567  -0.0112  0         0        0        
## [62,] GENE1353 0        0         0        0.0084   
## [63,] GENE714  0        0         0        0.0081   
## [64,] GENE2144 0        0         0.0024   0        
## [65,] GENE910  0        0         0        3e-04
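The listing above contains 65 genes, which is consistent with a threshold between 3.928 (68 genes) and 4.190 (52 genes) in the training table. Each score column is the shrunken difference for that class, and a 0 means the gene does not contribute to that class. A minimal sketch of the call on synthetic data follows; `toy.data` and the rule for `thr` are hypothetical.

```r
library(pamr)

# Synthetic stand-in for khan.data (hypothetical values, not the real dataset)
set.seed(1)
y <- factor(rep(c("A", "B", "C", "D"), each = 10))
x <- matrix(rnorm(500 * 40), nrow = 500)
for (k in 1:4) {
  rows <- ((k - 1) * 10 + 1):(k * 10)
  x[rows, as.integer(y) == k] <- x[rows, as.integer(y) == k] + 2
}
toy.data <- list(x = x, y = y,
                 geneid = as.character(1:500),
                 genenames = paste0("g", 1:500))

toy.train <- pamr.train(toy.data)

# Hypothetical rule standing in for a visual choice from the error plot
thr <- max(toy.train$threshold[toy.train$nonzero >= 20])

# Genes surviving the threshold, ordered by their largest absolute score
genes <- pamr.listgenes(toy.train, toy.data, threshold = thr)
cat("genes listed:", nrow(genes), "\n")
```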

Try heterogeneity analysis, with class “BL” taken to be the normal group

## 123456789101112131415161718192021222324252627282930
## 1234Fold 1 :123456789101112131415161718192021222324252627282930
## Fold 2 :123456789101112131415161718192021222324252627282930
## Fold 3 :123456789101112131415161718192021222324252627282930
## Fold 4 :123456789101112131415161718192021222324252627282930
## Fold 5 :123456789101112131415161718192021222324252627282930
## Fold 6 :123456789101112131415161718192021222324252627282930
## Fold 7 :123456789101112131415161718192021222324252627282930
## Fold 8 :123456789101112131415161718192021222324252627282930
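The progress lines above correspond to one training run and one eight-fold cross-validation. A minimal sketch of the heterogeneity variant follows, assuming it is requested through the `hetero` argument of `pamr.train`, with the reference class given by name. The data are synthetic, and class "A" plays the role that "BL" plays here.

```r
library(pamr)

# Synthetic stand-in for khan.data (hypothetical values, not the real dataset)
set.seed(1)
y <- factor(rep(c("A", "B", "C", "D"), each = 10))
x <- matrix(rnorm(500 * 40), nrow = 500)
for (k in 1:4) {
  rows <- ((k - 1) * 10 + 1):(k * 10)
  x[rows, as.integer(y) == k] <- x[rows, as.integer(y) == k] + 2
}
toy.data <- list(x = x, y = y,
                 geneid = as.character(1:500),
                 genenames = paste0("g", 1:500))

# Treat class "A" as the normal group in the heterogeneity analysis
toy.train2 <- pamr.train(toy.data, hetero = "A")
toy.results2 <- pamr.cv(toy.train2, toy.data)
print(toy.results2)
```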

Identify better threshold scalings

## Initial errors: 2.60000 0.10000 3.53333 2.16667 Roc 9.66305 
## Update 1 
## 123456789101112131415161718192021222324252627282930
## Errors 3.03333 0.10000 4.06667 3.86667 Roc 9.39042 
## Update 2 
## 123456789101112131415161718192021222324252627282930
## Errors 3.40000 0.06667 4.50000 5.50000 Roc 9.16457 
## Update 3 
## 123456789101112131415161718192021222324252627282930
## Errors 3.40000 0.10000 4.36667 4.13333 Roc 8.95462 
## Update 4 
## 123456789101112131415161718192021222324252627282930
## Errors 3.70000 0.10000 4.66667 5.73333 Roc 8.56622 
## Update 5 
## 123456789101112131415161718192021222324252627282930
## Errors 3.70000 0.26667 4.56667 4.16667 Roc 10.05755 
## Update 6 
## 123456789101112131415161718192021222324252627282930
## Errors 4.00000 0.23333 4.80000 5.73333 Roc 9.23474 
## Update 7 
## 123456789101112131415161718192021222324252627282930
## Errors 4.00000 0.36667 4.80000 4.16667 Roc 10.91507 
## Update 8 
## 123456789101112131415161718192021222324252627282930
## Errors 4.40000 0.30000 4.83333 5.73333 Roc 10.09324 
## Update 9 
## 123456789101112131415161718192021222324252627282930
## Errors 4.40000 0.40000 4.83333 4.16667 Roc 11.41245 
## Update 10 
## 123456789101112131415161718192021222324252627282930
## Errors 4.73333 0.36667 4.86667 5.73333 Roc 10.92883
## 123456789101112131415161718192021222324252627282930
## 1234Fold 1 :123456789101112131415161718192021222324252627282930
## Fold 2 :123456789101112131415161718192021222324252627282930
## Fold 3 :123456789101112131415161718192021222324252627282930
## Fold 4 :123456789101112131415161718192021222324252627282930
## Fold 5 :123456789101112131415161718192021222324252627282930
## Fold 6 :123456789101112131415161718192021222324252627282930
## Fold 7 :123456789101112131415161718192021222324252627282930
## Fold 8 :123456789101112131415161718192021222324252627282930
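The log above shows ten updates, each reporting four per-class error values, followed by a retraining and a fresh cross-validation. A minimal sketch of that sequence follows, assuming the scales come from `pamr.adaptthresh` and are fed back through the `threshold.scale` argument of `pamr.train`. The data are synthetic and hypothetical.

```r
library(pamr)

# Synthetic stand-in for khan.data (hypothetical values, not the real dataset)
set.seed(1)
y <- factor(rep(c("A", "B", "C", "D"), each = 10))
x <- matrix(rnorm(500 * 40), nrow = 500)
for (k in 1:4) {
  rows <- ((k - 1) * 10 + 1):(k * 10)
  x[rows, as.integer(y) == k] <- x[rows, as.integer(y) == k] + 2
}
toy.data <- list(x = x, y = y,
                 geneid = as.character(1:500),
                 genenames = paste0("g", 1:500))

toy.train <- pamr.train(toy.data)

# Search for per-class threshold scalings (10 tries by default)
toy.scales <- pamr.adaptthresh(toy.train)
print(toy.scales)

# Retrain and cross-validate with the adapted scalings
toy.train3 <- pamr.train(toy.data, threshold.scale = toy.scales)
toy.results3 <- pamr.cv(toy.train3, toy.data)
```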

Interactive analysis

Begin by typing “1” to select pamr.train; once that computation is done, pick “2” for pamr.cv. Typically, you would then go through steps 3 through 8 to generate plots and gene lists. Some of these steps ask for a threshold value, which you choose visually from the plot created by pamr.plotcv. Menu choice 9 is optional.
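A minimal sketch of starting the menu, assuming it is launched with `pamr.menu` on the data object. The call is wrapped in `interactive()` so the script does not block when run in batch mode; `toy.data` is a hypothetical stand-in for the Khan data.

```r
library(pamr)

# Synthetic stand-in for khan.data (hypothetical values, not the real dataset)
set.seed(1)
toy.data <- list(x = matrix(rnorm(500 * 40), nrow = 500),
                 y = factor(rep(c("A", "B", "C", "D"), each = 10)),
                 geneid = as.character(1:500),
                 genenames = paste0("g", 1:500))

# The menu waits for keyboard input, so only start it in an interactive session
if (interactive()) {
  pamr.menu(toy.data)
}
```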

References

The tutorial 1
The tutorial 2
The PNAS paper containing the Khan dataset
The Khan dataset
The JCO paper that implemented the pamr approach

Session Info

sessionInfo()
## R version 3.3.2 (2016-10-31)
## Platform: x86_64-apple-darwin13.4.0 (64-bit)
## Running under: macOS  10.13.3
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] pamr_1.55       survival_2.41-3 cluster_2.0.6  
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_0.12.14    lattice_0.20-35 digest_0.6.13   rprojroot_1.3-2 grid_3.3.2      backports_1.1.2 magrittr_1.5    evaluate_0.10.1 stringi_1.1.6   Matrix_1.2-12   rmarkdown_1.9   splines_3.3.2   tools_3.3.2     stringr_1.3.0   yaml_2.1.16     htmltools_0.3.6 knitr_1.20