## Extracting data for downstream analysis

```r
count     <- brca.tcga$count
fpkm      <- brca.tcga$fpkm
maf       <- brca.tcga$maf
segment   <- brca.tcga$segment
surv.info <- brca.tcga$clin.info
```
MOVICS provides the `getElites()` function to keep only features that meet stringent requirements; the features that survive this filtering are considered elites by MOVICS. Five filtering methods are available: `mad` for median absolute deviation, `sd` for standard deviation, `pca` for principal components analysis, `cox` for univariate Cox proportional hazards regression, and `freq` for binary omics data. The function also handles missing values coded as `NA`, either by removing them directly or by imputing them with k-nearest neighbors using a Euclidean metric, as selected through the `na.action` argument.
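As a rough illustration of what MAD-based filtering involves, the sketch below ranks features (rows) by median absolute deviation and keeps the top `n`. It uses base R only. The helper `top_mad_features` and the toy matrix are hypothetical and not part of MOVICS, and `getElites()` may differ in its details.

```r
# Hypothetical helper (not a MOVICS function): keep the n rows with the largest MAD.
top_mad_features <- function(mat, n) {
  stopifnot(is.matrix(mat), n >= 1)
  # median absolute deviation of each feature across samples
  mads <- apply(mat, 1, stats::mad, na.rm = TRUE)
  # indices of the n most variable features
  keep <- order(mads, decreasing = TRUE)[seq_len(min(n, nrow(mat)))]
  mat[keep, , drop = FALSE]
}

# Toy data: three features measured in ten samples
set.seed(1)
toy <- rbind(
  flat   = rep(5, 10),           # constant feature, MAD of 0
  noisy  = rnorm(10, sd = 3),
  modest = rnorm(10, sd = 0.5)
)

elite <- top_mad_features(toy, n = 2)
rownames(elite)
```

The constant feature has a MAD of zero, so it is the one dropped when two features are kept.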
```r
# identify optimal clustering number
optk.brca <- getClustNum(data        = mo.data,
                         is.binary   = c(FALSE, FALSE, FALSE, TRUE),
                         try.N.clust = 2:8, # trying cluster number from 2 to 8
                         fig.name    = "CLUSTER NUMBER OF TCGA-BRCA")
```
```
calculating Cluster Prediction Index...
100% complete
calculating Gap-statistics...
visualization done...
--the imputed optimal cluster number is 3 arbitrarily, but it would be better referring to other priori knowledge.
```
```r
# convert beta value to M value for stronger signal
indata <- mo.data
indata$meth.beta <- log2(indata$meth.beta / (1 - indata$meth.beta))

# data normalization for heatmap
plotdata <- getStdiz(data       = indata,
                     halfwidth  = c(2, 2, 2, NA),              # no truncation for mutation
                     centerFlag = c(TRUE, TRUE, TRUE, FALSE),  # no center for mutation
                     scaleFlag  = c(TRUE, TRUE, TRUE, FALSE))  # no scale for mutation
```
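To make the two steps above concrete, here is a minimal base-R sketch of the beta-to-M conversion and of centering, scaling, and truncating a matrix. The helpers `beta_to_m` and `standardize_rows` are hypothetical, and treating features as rows that are standardized across samples is an assumption about what `getStdiz()` does, not a description of its implementation.

```r
# Hypothetical helper: logit (base 2) transform of methylation beta values to M values.
beta_to_m <- function(beta) log2(beta / (1 - beta))

beta <- c(0.1, 0.5, 0.9)
m <- beta_to_m(beta)   # a beta of 0.5 maps to an M value of 0

# Hypothetical helper: center and scale each row, then clip to [-halfwidth, halfwidth].
# Assumption: features are rows and are standardized across samples.
standardize_rows <- function(mat, halfwidth = 2) {
  z <- t(scale(t(mat), center = TRUE, scale = TRUE))
  z[z >  halfwidth] <-  halfwidth
  z[z < -halfwidth] <- -halfwidth
  z
}

set.seed(42)
mat <- matrix(rnorm(40), nrow = 4)
mat[1, 1] <- 10                    # an outlier that would otherwise dominate a heatmap scale
z <- standardize_rows(mat, halfwidth = 2)
range(z)
```

Truncation keeps a few extreme values from compressing the color range of a heatmap. Skipping it for a binary mutation matrix, as `halfwidth = NA` does above, leaves the 0/1 coding untouched.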
After identifying cancer subtypes, it is essential to characterize each subtype further by examining how the subtypes differ across multiple aspects. To this end, MOVICS provides downstream analyses commonly used in cancer subtyping research, designed to work directly with the results derived from the GET Module.
```r
# survival comparison
surv.brca <- compSurv(moic.res         = iClusterBayes.res,
                      surv.info        = surv.info,
                      convt.time       = "m", # m for month as unit
                      surv.median.line = "h",
                      xyrs.est         = c(5, 10), # 5 and 10-year survival
                      fig.name         = "KAPLAN-MEIER CURVE OF CONSENSUSMOIC")
```
```
--a total of 643 samples are identified.
--removed missing values.
--leaving 642 observations.
Warning in geom_segment(aes(x = 0, y = max(y2), xend = max(x1), yend = max(y2)), : All aesthetics have length 1, but the data has 4 rows.
ℹ Please consider using `annotate()` or provide this layer with data containing
  a single row.
```
```r
print(surv.brca)
```

```
$fitd
Call:
survdiff(formula = Surv(futime, fustat) ~ Subtype, data = mosurv.res,
    na.action = na.exclude)

              N Observed Expected (O-E)^2/E (O-E)^2/V
Subtype=CS1 113       16    14.65     0.124     0.156
Subtype=CS2  69       13     5.61     9.755    10.786
Subtype=CS3 205       23    25.50     0.246     0.376
Subtype=CS4 148        7    18.95     7.536    10.163
Subtype=CS5 107       16    10.29     3.169     3.689

 Chisq= 21.2  on 4 degrees of freedom, p= 3e-04

$fit
Call: survfit(formula = Surv(futime, fustat) ~ Subtype, data = mosurv.res,
    na.action = na.exclude, error = "greenwood", type = "kaplan-meier",
    conf.type = "plain")

      n events median 0.95LCL 0.95UCL
CS1 113     16     NA    97.2      NA
CS2  69     13   71.9    49.4      NA
CS3 205     23  122.6    94.0      NA
CS4 148      7  216.2   113.5      NA
CS5 107     16  102.5    68.8      NA

$xyrs.est
Call: survfit(formula = Surv(futime, fustat) ~ Subtype, data = mosurv.res)

                Subtype=CS1
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
 1825     25      12    0.823  0.0497        0.731        0.926
 3650      3       4    0.504  0.1601        0.270        0.939

                Subtype=CS2
        time       n.risk      n.event     survival      std.err lower 95% CI
    1825.000        8.000       12.000        0.563        0.107        0.387
upper 95% CI
       0.818

                Subtype=CS3
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
 1825     42      12    0.837  0.0464        0.751        0.933
 3650      8       8    0.558  0.0926        0.403        0.772

                Subtype=CS4
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
 1825     27       4    0.924  0.0403        0.848            1
 3650      4       2    0.616  0.1799        0.348            1

                Subtype=CS5
 time n.risk n.event survival std.err lower 95% CI upper 95% CI
 1825     17      10    0.772  0.0708        0.645        0.924
 3650      3       6    0.401  0.1256        0.217        0.741

$overall.p
[1] 0.00029119

$pairwise.p

	Pairwise comparisons using Log-Rank test

data:  mosurv.res and Subtype

    CS1    CS2     CS3    CS4
CS2 0.1538 -       -      -
CS3 0.6218 0.0079  -      -
CS4 0.0122 3.9e-05 0.0830 -
CS5 0.4563 0.4904  0.1538 0.0019

P value adjustment method: BH
```
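The pieces of this output can be reproduced in miniature with the `survival` package and base R. The sketch below runs an overall log-rank test, then pairwise log-rank tests with Benjamini-Hochberg adjustment, on a small made-up data set. The data frame `toy` and all of its values are hypothetical, and the pairwise loop is a plain re-implementation for illustration, not the code `compSurv()` runs internally. The column names `futime`, `fustat`, and `Subtype` follow the calls printed above.

```r
library(survival)

# Toy data (hypothetical, not TCGA-BRCA): follow-up in days, event flag, subtype label
toy <- data.frame(
  futime  = c(120, 340, 560, 800, 150, 90, 60, 200, 900, 1100, 1300, 700),
  fustat  = c(1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0),
  Subtype = rep(c("CS1", "CS2", "CS3"), each = 4)
)

# Overall log-rank test across all subtypes
fit.diff  <- survdiff(Surv(futime, fustat) ~ Subtype, data = toy)
overall.p <- pchisq(fit.diff$chisq, df = length(fit.diff$n) - 1, lower.tail = FALSE)

# Pairwise log-rank tests, one per pair of subtypes
pairs <- combn(sort(unique(toy$Subtype)), 2, simplify = FALSE)
raw.p <- vapply(pairs, function(pr) {
  sub <- toy[toy$Subtype %in% pr, ]
  d   <- survdiff(Surv(futime, fustat) ~ Subtype, data = sub)
  pchisq(d$chisq, df = 1, lower.tail = FALSE)
}, numeric(1))
names(raw.p) <- vapply(pairs, paste, character(1), collapse = " vs ")

# Benjamini-Hochberg adjustment for multiple comparisons
adj.p <- p.adjust(raw.p, method = "BH")
adj.p
```

In the `$xyrs.est` tables above, the `time` values 1825 and 3650 are the requested 5 and 10 years expressed in days (5 × 365 and 10 × 365).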