This script performs variant analysis on MAF files with maftools. The maftools documentation is available here: https://www.bioconductor.org/packages/release/bioc/vignettes/maftools/inst/doc/maftools.html
# First time only
BiocManager::install("maftools")
# Load every time
library(knitr)
library(maftools)
library(dplyr)
library(VennDiagram)
library(ggplot2)
Input: One concatenated .maf file of all 96 canine CD4+ PTCL samples and one concatenated .maf file of all control samples (5 samples of sorted nodal CD4+ T cells and 2 samples of sorted CD4+ thymocytes). These input files were generated from RNA-seq data with the GATK pipeline. SNPs and INDELs were filtered according to the GATK recommendations for hard filtering (https://gatk.broadinstitute.org/hc/en-us/articles/360035890471-Hard-filtering-germline-short-variants). SNPs with QD < 2, SOR > 3, FS > 60, MQ < 40, MQRankSum < -12.5, or ReadPosRankSum < -8 and INDELs with QD < 2, FS > 200, or ReadPosRankSum < -20 were excluded. After filtering, VCF files were annotated with the Ensembl Variant Effect Predictor (VEP) and converted to MAF files with vcf2maf. Variants annotated as intronic or as having low or modifier impact were also excluded.
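The SNP thresholds listed above can be sketched in base R on a toy annotation table. This is only an illustration of the exclusion logic, not the GATK command that produced the input files. The data frame, the column layout, and the helper `fails()` are hypothetical, and treating a missing annotation as "not failing" is an assumption.

```r
# Toy SNP annotation table; all values are made up for illustration.
snps <- data.frame(
  id             = c("snp1", "snp2", "snp3", "snp4"),
  QD             = c(12.0, 1.5, 20.0, 8.0),
  SOR            = c(1.2, 0.9, 3.5, 2.0),
  FS             = c(5.0, 10.0, 2.0, 75.0),
  MQ             = c(60, 60, 60, 55),
  MQRankSum      = c(0.1, -0.5, NA, 0.0),
  ReadPosRankSum = c(0.3, 0.2, 0.1, NA)
)

# Hypothetical helper: TRUE when a value trips the threshold.
# Assumption: a missing annotation (NA) does not count as a failure.
fails <- function(x, op, cutoff) {
  hit <- op(x, cutoff)
  !is.na(hit) & hit
}

# A SNP is excluded if it trips any one of the thresholds.
snp_fail <- fails(snps$QD, `<`, 2) |
  fails(snps$SOR, `>`, 3) |
  fails(snps$FS, `>`, 60) |
  fails(snps$MQ, `<`, 40) |
  fails(snps$MQRankSum, `<`, -12.5) |
  fails(snps$ReadPosRankSum, `<`, -8)

snps_pass <- snps[!snp_fail, ]
print(as.character(snps_pass$id))
```

In this toy table only `snp1` survives: `snp2` fails on QD, `snp3` on SOR, and `snp4` on FS.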
# set working directory
setwd("C:/Users/edlarsen/Documents/PTCLRNASeq")
# read PTCL maf file
mymaf <- read.maf(maf = 'Cohort_2/Input/AllPTCLs.CanFam31.QD2.vep.filtered.maf')
# read CTRL maf file
mymaf_ctrl <- read.maf(maf = 'Cohort_2/Input/AllCTRLs.CanFam31.QD2.vep.filtered.maf')
# Print a table of the 20 samples with the highest number of variants (last20Samples holds the 20 lowest)
sampleSummary <- getSampleSummary(mymaf)
first20Samples <- head(sampleSummary, 20)
last20Samples <- tail(sampleSummary, 20)
kable(first20Samples, caption = "Top 20 Canine CD4 PTCL Samples with Highest Number of Variants")
Tumor_Sample_Barcode | Frame_Shift_Del | Frame_Shift_Ins | In_Frame_Del | In_Frame_Ins | Missense_Mutation | Nonsense_Mutation | Nonstop_Mutation | Splice_Site | Translation_Start_Site | total |
---|---|---|---|---|---|---|---|---|---|---|
CI155427 | 143 | 709 | 119 | 338 | 704 | 239 | 8 | 9367 | 2 | 11629 |
CI161277 | 151 | 634 | 131 | 332 | 661 | 240 | 8 | 9327 | 4 | 11488 |
CI104568 | 148 | 616 | 123 | 331 | 685 | 206 | 9 | 9348 | 8 | 11474 |
CI124799 | 168 | 627 | 116 | 308 | 604 | 238 | 12 | 9328 | 8 | 11409 |
CI153070 | 192 | 640 | 128 | 302 | 729 | 236 | 11 | 9130 | 4 | 11372 |
CI153427 | 140 | 652 | 128 | 335 | 612 | 235 | 8 | 9255 | 7 | 11372 |
CI158606 | 138 | 538 | 124 | 353 | 560 | 213 | 5 | 9410 | 6 | 11347 |
CI150689 | 138 | 630 | 118 | 326 | 654 | 254 | 14 | 9188 | 6 | 11328 |
CI165189 | 128 | 661 | 132 | 309 | 765 | 232 | 9 | 9078 | 8 | 11322 |
CI124809 | 156 | 656 | 121 | 295 | 593 | 230 | 9 | 9245 | 15 | 11320 |
CI165776 | 152 | 590 | 123 | 321 | 648 | 230 | 12 | 9218 | 4 | 11298 |
CI164860 | 139 | 615 | 126 | 322 | 632 | 243 | 10 | 9203 | 6 | 11296 |
CI167185 | 172 | 571 | 115 | 261 | 639 | 215 | 11 | 9303 | 8 | 11295 |
CI105835 | 149 | 601 | 133 | 308 | 685 | 223 | 7 | 9172 | 7 | 11285 |
CI152837 | 140 | 569 | 105 | 317 | 678 | 246 | 12 | 9204 | 7 | 11278 |
CI171792 | 132 | 711 | 123 | 313 | 579 | 225 | 10 | 9139 | 6 | 11238 |
CI164934 | 149 | 551 | 111 | 257 | 856 | 233 | 11 | 9048 | 4 | 11220 |
CI166465 | 149 | 584 | 127 | 307 | 609 | 202 | 9 | 9224 | 9 | 11220 |
CI163077 | 131 | 619 | 119 | 336 | 659 | 242 | 10 | 9092 | 8 | 11216 |
CI164968 | 157 | 621 | 106 | 258 | 629 | 235 | 14 | 9187 | 8 | 11215 |
# Print a table of the top 20 mutated genes
geneSummary <- getGeneSummary(mymaf)
topVariantGenes <- head(geneSummary, 20)
kable(topVariantGenes, caption = "Top 20 Mutated Genes in Canine CD4 PTCL")
Hugo_Symbol | Frame_Shift_Del | Frame_Shift_Ins | In_Frame_Del | In_Frame_Ins | Missense_Mutation | Nonsense_Mutation | Nonstop_Mutation | Splice_Site | Translation_Start_Site | total | MutatedSamples | AlteredSamples |
---|---|---|---|---|---|---|---|---|---|---|---|---|
COL7A1 | 3 | 7 | 0 | 1 | 40 | 2 | 0 | 3527 | 0 | 3580 | 96 | 96 |
FLNA | 0 | 0 | 0 | 2 | 81 | 0 | 0 | 2230 | 0 | 2313 | 96 | 96 |
WDR90 | 87 | 32 | 2 | 5 | 29 | 0 | 0 | 2089 | 0 | 2244 | 96 | 96 |
SZT2 | 5 | 1 | 186 | 0 | 38 | 3 | 0 | 1970 | 0 | 2203 | 96 | 96 |
NBEAL2 | 10 | 3 | 96 | 2 | 38 | 0 | 0 | 2003 | 0 | 2152 | 96 | 96 |
PIEZO1 | 3 | 2 | 0 | 2 | 28 | 0 | 0 | 2093 | 0 | 2128 | 96 | 96 |
CPSF1 | 39 | 4 | 0 | 0 | 5 | 1 | 0 | 1951 | 0 | 2000 | 96 | 96 |
DYNC1H1 | 2 | 31 | 0 | 73 | 21 | 1 | 0 | 1863 | 0 | 1991 | 96 | 96 |
FASN | 88 | 0 | 2 | 1 | 24 | 0 | 0 | 1875 | 0 | 1990 | 96 | 96 |
STAB1 | 2 | 29 | 0 | 2 | 31 | 3 | 0 | 1877 | 0 | 1944 | 96 | 96 |
KMT2D | 8 | 12 | 14 | 2 | 97 | 5 | 0 | 1704 | 0 | 1842 | 96 | 96 |
PLXNB2 | 2 | 3 | 0 | 1 | 21 | 1 | 0 | 1666 | 0 | 1694 | 96 | 96 |
LAMB2 | 4 | 0 | 0 | 1 | 10 | 1 | 0 | 1630 | 0 | 1646 | 96 | 96 |
SSPO | 4 | 8 | 0 | 8 | 49 | 6 | 0 | 1473 | 0 | 1548 | 96 | 96 |
SCRIB | 3 | 8 | 3 | 0 | 12 | 0 | 0 | 1495 | 0 | 1521 | 96 | 96 |
IPO4 | 1 | 4 | 95 | 1 | 8 | 1 | 0 | 1386 | 1 | 1497 | 96 | 96 |
TRAF7 | 76 | 0 | 1 | 0 | 10 | 0 | 0 | 1396 | 0 | 1483 | 96 | 96 |
PLEC | 3 | 2 | 4 | 1 | 50 | 4 | 0 | 1395 | 0 | 1459 | 96 | 96 |
USP19 | 0 | 80 | 2 | 0 | 5 | 1 | 0 | 1343 | 0 | 1431 | 96 | 96 |
AGRN | 3 | 5 | 0 | 2 | 23 | 1 | 0 | 1392 | 0 | 1426 | 96 | 96 |
# Write maf summary to an output file
write.mafSummary(maf = mymaf, basename = 'Cohort_2/Output/CD4PTCL_maftools')
# Print a table of control samples ranked by number of variants (only 7 controls, so head() returns all of them)
sampleSummary_ctrl <- getSampleSummary(mymaf_ctrl)
first20Samples_ctrl <- head(sampleSummary_ctrl, 20)
last20Samples_ctrl <- tail(sampleSummary_ctrl, 20)
kable(first20Samples_ctrl, caption = "Top 20 Canine CD4 Control Samples with Highest Number of Variants")
Tumor_Sample_Barcode | Frame_Shift_Del | Frame_Shift_Ins | In_Frame_Del | In_Frame_Ins | Missense_Mutation | Nonsense_Mutation | Nonstop_Mutation | Splice_Site | Translation_Start_Site | total |
---|---|---|---|---|---|---|---|---|---|---|
CI80400 | 206 | 524 | 121 | 278 | 662 | 202 | 9 | 8412 | 8 | 10422 |
CI157953 | 142 | 578 | 133 | 254 | 682 | 228 | 10 | 8334 | 5 | 10366 |
CI80397 | 194 | 618 | 119 | 284 | 536 | 204 | 10 | 8387 | 7 | 10359 |
CI156615 | 143 | 554 | 120 | 276 | 586 | 195 | 8 | 8389 | 5 | 10276 |
CI80399 | 191 | 496 | 115 | 275 | 652 | 216 | 7 | 8314 | 9 | 10275 |
CI156616 | 163 | 541 | 106 | 261 | 576 | 202 | 6 | 8195 | 7 | 10057 |
CI157907 | 171 | 589 | 114 | 269 | 536 | 209 | 10 | 8083 | 2 | 9983 |
# Print a table of the top 20 mutated genes
geneSummary_ctrl <- getGeneSummary(mymaf_ctrl)
topVariantGenes_ctrl <- head(geneSummary_ctrl, 20)
kable(topVariantGenes_ctrl, caption = "Top 20 Mutated Genes in Canine CD4 Controls")
Hugo_Symbol | Frame_Shift_Del | Frame_Shift_Ins | In_Frame_Del | In_Frame_Ins | Missense_Mutation | Nonsense_Mutation | Nonstop_Mutation | Splice_Site | Translation_Start_Site | total | MutatedSamples | AlteredSamples |
---|---|---|---|---|---|---|---|---|---|---|---|---|
COL7A1 | 0 | 0 | 0 | 0 | 2 | 1 | 0 | 248 | 0 | 251 | 7 | 7 |
FLNA | 0 | 0 | 0 | 0 | 9 | 0 | 0 | 165 | 0 | 174 | 7 | 7 |
SZT2 | 0 | 0 | 13 | 1 | 3 | 2 | 0 | 153 | 0 | 172 | 7 | 7 |
PIEZO1 | 1 | 0 | 0 | 0 | 3 | 0 | 0 | 162 | 0 | 166 | 7 | 7 |
NBEAL2 | 2 | 1 | 7 | 0 | 2 | 0 | 0 | 153 | 0 | 165 | 7 | 7 |
WDR90 | 7 | 3 | 1 | 0 | 4 | 0 | 0 | 150 | 0 | 165 | 7 | 7 |
CPSF1 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 145 | 0 | 147 | 7 | 7 |
KMT2D | 0 | 0 | 1 | 0 | 10 | 0 | 0 | 130 | 0 | 141 | 7 | 7 |
SSPO | 0 | 1 | 0 | 1 | 6 | 1 | 0 | 131 | 0 | 140 | 7 | 7 |
FASN | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 134 | 0 | 138 | 7 | 7 |
DYNC1H1 | 0 | 2 | 0 | 4 | 3 | 0 | 0 | 123 | 0 | 132 | 7 | 7 |
LAMB2 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 125 | 0 | 126 | 7 | 7 |
PLXNB2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 124 | 0 | 124 | 7 | 7 |
TRAF7 | 6 | 0 | 0 | 0 | 0 | 0 | 0 | 107 | 0 | 113 | 7 | 7 |
SCRIB | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 110 | 0 | 111 | 7 | 7 |
KMT2B | 0 | 2 | 6 | 0 | 0 | 0 | 0 | 102 | 0 | 110 | 7 | 7 |
IPO4 | 0 | 1 | 7 | 0 | 0 | 0 | 0 | 98 | 0 | 106 | 7 | 7 |
PLEC | 1 | 2 | 0 | 0 | 2 | 0 | 0 | 101 | 0 | 106 | 7 | 7 |
CC2D1A | 1 | 0 | 0 | 0 | 2 | 1 | 0 | 100 | 0 | 104 | 7 | 7 |
TLN1 | 0 | 2 | 0 | 0 | 1 | 0 | 0 | 100 | 0 | 103 | 7 | 7 |
# Write maf summary to an output file
write.mafSummary(maf = mymaf_ctrl, basename = 'Cohort_2/Output/CD4CTRL_maftools')
# set colors to annotate mutation types
var_cols = RColorBrewer::brewer.pal(n = 10, name = 'Paired')
names(var_cols) = c(
'In_Frame_Ins',
'Missense_Mutation',
'In_Frame_Del',
'Frame_Shift_Ins',
'Translation_Start_Site',
'Nonstop_Mutation',
'Frame_Shift_Del',
'Multi_Hit',
'Nonsense_Mutation',
'Splice_Site'
)
titvcols = RColorBrewer::brewer.pal(n = 6, name = 'Set3')
names(titvcols) = c("C>T", "C>G", "C>A", "T>A", "T>C", "T>G")
Displays the number of variants in each sample as a stacked barplot and variant types as a boxplot summarized by Variant_Classification.
### PTCLs
plotmafSummary(maf = mymaf,
color = var_cols,
titvColor = titvcols,
rmOutlier = TRUE,
addStat = 'median',
dashboard = TRUE,
titvRaw = FALSE)
plotmafSummary(maf = mymaf_ctrl,
color = var_cols,
titvColor = titvcols,
rmOutlier = TRUE,
addStat = 'median',
dashboard = TRUE,
titvRaw = FALSE)
par(mar = c(5, 0.1, 4, 2))
mafbarplot(
mymaf,
color = var_cols,
n = 20,
genes = NULL,
fontSize = 0.6,
includeCN = FALSE,
legendfontSize = 1,
borderCol = "#34495e",
showPct = TRUE
)
par(mar = c(5, 0.1, 4, 2))
mafbarplot(
mymaf_ctrl,
color = var_cols,
n = 20,
genes = NULL,
fontSize = 0.6,
includeCN = FALSE,
legendfontSize = 1,
borderCol = "#34495e",
showPct = TRUE
)
hPTCLgenes = c("TET2", "DNMT3A", "PTEN", "TP53", "CDKN2A", "MYC", "STAT3", "BCL11B", "BCL6", "CD244", "CD247", "FASLG", "TP63", "TPRG1", "FYN", "IBTK", "LATS1", "ZC3H12D", "TNFAIP3")
# subset PTCL maf for only genes commonly mutated in human PTCL
mymaf_hPTCL <- subsetMaf(mymaf, genes = hPTCLgenes)
## -Processing clinical data
# draw oncoplot
oncoplot(
maf = mymaf_hPTCL,
genes = hPTCLgenes,
colors = var_cols,
titleText = "Canine CD4+ PTCL Variants in Genes Commonly Mutated in Human PTCL"
)
# subset control maf for only genes commonly mutated in human PTCL
mymaf_CTRL_hPTCL <- subsetMaf(mymaf_ctrl, genes = hPTCLgenes)
## -Processing clinical data
# draw oncoplot
oncoplot(
maf = mymaf_CTRL_hPTCL,
genes = hPTCLgenes,
colors = var_cols,
titleText = "Canine CD4+ CTRL Variants in Genes Commonly Mutated in Human PTCL"
)
# define list of genes
genes <- c("PTEN", "SATB1", "MAP2K1", "NLRP14", "KCND2", "PSMA1", "TBC1D26")
# subset PTCL maf for only these genes
mymaf_PTCLgenes <- subsetMaf(mymaf, genes=genes)
## -Processing clinical data
# draw oncoplot
oncoplot(
maf = mymaf_PTCLgenes,
genes = genes,
colors = var_cols,
titleText = "Canine CD4+ PTCL Variants in Genes Commonly Mutated in Canine TCL"
)
# subset control maf for only these genes
mymaf_ctrl_PTCLgenes <- subsetMaf(mymaf_ctrl, genes=genes)
## -Processing clinical data
# draw oncoplot
oncoplot(
maf = mymaf_ctrl_PTCLgenes,
genes = genes,
colors = var_cols,
titleText = "Canine CD4+ CTRL Variants in Genes Commonly Mutated in Canine TCL"
)
# define list of fusion gene partners
fusiongenes = c("GATD3A", "LMO4", "PTMA", "NCL", "JPT1", "MROH1", "TPD52L2", "TOX2", "REV3L", "FYN", "HMGB1", "BZW1", "HSPD1", "CHD3", "PER1", "EIF5A", "GRB10", "IKZF1", "MYC", "TRIB1", "YWHAZ", "KLF10", "SRSF5")
# subset PTCL maf for only fusion gene partners
mymaf_PTCLfusion <- subsetMaf(mymaf, genes = fusiongenes)
## -Processing clinical data
# draw oncoplot
oncoplot(
maf = mymaf_PTCLfusion,
genes = fusiongenes,
colors = var_cols,
titleText = "Canine CD4+ PTCL Variants Called in Fusion Partner Genes"
)
# subset control maf for only fusion gene partners
mymaf_CTRLfusion <- subsetMaf(mymaf_ctrl, genes = fusiongenes)
## -Processing clinical data
# draw oncoplot
oncoplot(
maf = mymaf_CTRLfusion,
genes = fusiongenes,
colors = var_cols,
titleText = "Canine CD4+ CTRL Variants Called in Fusion Partner Genes"
)
Boxplot summarizes the overall distribution of different conversions, and stacked barplot shows fraction of conversions in each sample.
### PTCLs
mymaf.titv = titv(maf = mymaf,
plot = FALSE,
useSyn = TRUE)
# plot titv summary
plotTiTv(res = mymaf.titv, color = titvcols)
mymaf_ctrl.titv = titv(maf = mymaf_ctrl,
plot = FALSE,
useSyn = TRUE)
# plot titv summary
plotTiTv(res = mymaf_ctrl.titv, color = titvcols)
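As a rough illustration of what the six conversion classes in `titvcols` mean, the sketch below collapses reference/alternate allele pairs onto the pyrimidine strand and labels each as a transition or transversion. It is a simplified stand-in, not the maftools `titv()` implementation. The helpers `classify_conversion()` and `is_transition()` and the allele vectors are hypothetical.

```r
# Collapse a reference/alternate allele pair into one of the six
# pyrimidine-centred classes (C>T, C>G, C>A, T>A, T>C, T>G).
classify_conversion <- function(ref, alt) {
  comp <- c(A = "T", C = "G", G = "C", T = "A")
  purine_ref <- ref %in% c("A", "G")
  # When the reference base is a purine, report the change on the
  # complementary strand so that the reference is always C or T.
  ref2 <- ifelse(purine_ref, comp[ref], ref)
  alt2 <- ifelse(purine_ref, comp[alt], alt)
  unname(paste0(ref2, ">", alt2))
}

# Transitions are purine<->purine or pyrimidine<->pyrimidine changes,
# which reduce to C>T and T>C after collapsing.
is_transition <- function(conversion) conversion %in% c("C>T", "T>C")

# Made-up single-nucleotide changes.
ref <- c("C", "G", "A", "T", "G")
alt <- c("T", "A", "C", "C", "T")

conv <- classify_conversion(ref, alt)
print(table(ifelse(is_transition(conv), "Ti", "Tv")))
```

For these five toy changes, `G>A` collapses to `C>T` and `A>C` to `T>G`, giving three transitions and two transversions.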
tumor_data <- mymaf@data
ctrl_data <- mymaf_ctrl@data
# select columns for matching
tumor_vars <- tumor_data[, c("Chromosome", "Start_Position", "End_Position", "Variant_Classification", "Variant_Type")]
ctrl_vars <- ctrl_data[, c("Chromosome", "Start_Position", "End_Position", "Variant_Classification", "Variant_Type")]
# find common variants
shared_variants <- merge(tumor_vars, ctrl_vars,
by = c("Chromosome", "Start_Position", "End_Position", "Variant_Classification", "Variant_Type"),
allow.cartesian = TRUE)
##### Venn Diagram #####
# Build the list of unique variant keys for each group
tumor_vars_unique <- unique(paste(tumor_vars$Chromosome, tumor_vars$Start_Position, tumor_vars$End_Position, tumor_vars$Variant_Classification, tumor_vars$Variant_Type))
ctrl_vars_unique <- unique(paste(ctrl_vars$Chromosome, ctrl_vars$Start_Position, ctrl_vars$End_Position, ctrl_vars$Variant_Classification, ctrl_vars$Variant_Type))
venn1 <- venn.diagram(
x = list(tumor_vars_unique, ctrl_vars_unique),
category.names = c("CD4+ PTCL", "CD4+ CTRL"),
# Output features
filename = NULL,
disable.logging = TRUE,
# Title
main = "Variants Shared Between CD4+ PTCL and \nControl CD4+ Lymphocytes and Thymocytes",
main.cex = 1.5,
main.fontfamily = "sans",
main.fontface = "bold",
# Circles
fill = c(alpha("#440154ff", 0.3), alpha('#21908dff', 0.3)),
lwd = 1,
col = c("#440154ff", '#21908dff'),
# Numbers
cex=1.5,
fontfamily = "sans",
# Categories
cat.cex = 1.5,
cat.fontfamily = "sans",
cat.fontface = "bold",
cat.dist = c(0.05, 0.05),
cat.pos = c(-27, 27),
cat.default.pos = "outer",
cat.col = c("#440154ff", '#21908dff'),
scaled = FALSE
)
grid.newpage()
grid.draw(venn1)
# filter tumor maf for shared variants
shared_condition <- with(tumor_data,
paste(Chromosome, Start_Position, End_Position, Variant_Classification, Variant_Type) %in%
paste(shared_variants$Chromosome, shared_variants$Start_Position, shared_variants$End_Position,
shared_variants$Variant_Classification, shared_variants$Variant_Type))
tumor_data_filtered <- tumor_data[!shared_condition, ]
mymaf_filtered <- read.maf(tumor_data_filtered)
# export
write.table(tumor_data_filtered, file = "ptcl_unique_vars_only_with_splice_vars.maf", sep = "\t", quote = FALSE, row.names = FALSE)
filtered_tumor_data <- mymaf_filtered@data
filtered_tumor_vars <- filtered_tumor_data[, c("Chromosome", "Start_Position", "End_Position", "Variant_Classification", "Variant_Type")]
filtered_tumor_vars_unique <- unique(paste(filtered_tumor_vars$Chromosome, filtered_tumor_vars$Start_Position, filtered_tumor_vars$End_Position, filtered_tumor_vars$Variant_Classification, filtered_tumor_vars$Variant_Type))
paste("Unique variant calls:", length(filtered_tumor_vars_unique), sep=" ")
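The key-based matching used above can be tried on a small made-up example. The sketch builds the same five-column key, finds variants shared with controls, and keeps tumor rows whose key is absent from the control set. The toy tables and the helper `make_key()` are hypothetical. As in the code above, the key does not include allele columns, so two different alternate alleles at the same position with the same classification would be treated as one variant.

```r
key_cols <- c("Chromosome", "Start_Position", "End_Position",
              "Variant_Classification", "Variant_Type")

# Made-up tumor and control variant tables.
toy_tumor <- data.frame(
  Chromosome             = c("1", "1", "2", "2"),
  Start_Position         = c(100, 100, 500, 900),
  End_Position           = c(100, 100, 500, 901),
  Variant_Classification = c("Missense_Mutation", "Missense_Mutation",
                             "Splice_Site", "Frame_Shift_Del"),
  Variant_Type           = c("SNP", "SNP", "SNP", "DEL"),
  Tumor_Sample_Barcode   = c("T1", "T2", "T1", "T2")
)
toy_ctrl <- data.frame(
  Chromosome             = c("2", "3"),
  Start_Position         = c(500, 42),
  End_Position           = c(500, 42),
  Variant_Classification = c("Splice_Site", "Nonsense_Mutation"),
  Variant_Type           = c("SNP", "SNP"),
  Tumor_Sample_Barcode   = c("C1", "C1")
)

# Hypothetical helper: one string key per row from the matching columns.
make_key <- function(df) do.call(paste, c(df[key_cols], sep = "|"))

tumor_key <- make_key(toy_tumor)
ctrl_key  <- make_key(toy_ctrl)

# Variants present in both groups, and tumor rows not seen in controls.
shared_keys    <- intersect(tumor_key, ctrl_key)
toy_tumor_only <- toy_tumor[!(tumor_key %in% ctrl_key), ]

# Count distinct variants, not rows: the same variant can occur in
# several samples.
n_unique_tumor_only <- length(unique(make_key(toy_tumor_only)))
print(n_unique_tumor_only)
```

Here one splice-site variant is shared with the control table, three tumor rows remain, and they represent two distinct variants because the missense variant occurs in both T1 and T2.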
Data visualization of only those variants that were called in tumor samples and not in control samples.
Displays the number of variants in each PTCL sample as a stacked barplot and variant types as a boxplot summarized by Variant_Classification.
plotmafSummary(maf = mymaf_filtered,
color = var_cols,
titvColor = titvcols,
rmOutlier = TRUE,
addStat = 'median',
dashboard = TRUE,
titvRaw = FALSE)
par(mar = c(5, 0.1, 4, 2))
mafbarplot(
mymaf_filtered,
n = 20,
genes = NULL,
color = var_cols,
fontSize = 0.7,
includeCN = FALSE,
legendfontSize = 1,
borderCol = "#34495e",
showPct = TRUE
)
Oncoplot for the top 20 mutated genes after filtering out variants also called in control samples. Note: Multi_Hit marks genes that are mutated more than once in the same sample.
#par(mar = c(5, 2, 4, 2))
oncoplot(maf = mymaf_filtered,
fontSize = 0.5,
top = 20,
colors = var_cols,
titleText = "Top Canine CD4+ PTCL-Specific Variants")
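The Multi_Hit label can be illustrated with a small gene-by-sample matrix built in base R. This is a sketch of the idea only. The rule used here (two or more variant records for the same gene in the same sample become Multi_Hit) is an assumption based on the description above, and the toy table is made up.

```r
# Made-up variant records: GENE1 has two variants in sample S1.
toy <- data.frame(
  Tumor_Sample_Barcode   = c("S1", "S1", "S1", "S2", "S2"),
  Hugo_Symbol            = c("GENE1", "GENE1", "GENE2", "GENE1", "GENE2"),
  Variant_Classification = c("Missense_Mutation", "Splice_Site",
                             "Nonsense_Mutation", "Missense_Mutation",
                             "Frame_Shift_Del")
)

# Gene x sample matrix: a single variant keeps its classification,
# more than one variant in the same cell is collapsed to "Multi_Hit".
onco <- tapply(
  toy$Variant_Classification,
  list(toy$Hugo_Symbol, toy$Tumor_Sample_Barcode),
  function(x) if (length(x) > 1) "Multi_Hit" else x
)
print(onco)
```

In the resulting matrix, the `GENE1`/`S1` cell reads `Multi_Hit`, while every other cell keeps its single classification.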
oncoplot(maf = mymaf_filtered,
colors = var_cols,
pathways = "sigpw",
titleText = "Top 10 Mutated Oncogenic Signaling Pathways in Canine CD4+ PTCL",
gene_mar = 8,
fontSize = 0.8,
topPathways = 10,
collapsePathway = TRUE)
oncoplot(maf = mymaf_filtered,
colors = var_cols,
pathways = "sigpw",
titleText = "Details of Top Mutated Oncogenic Signaling Pathway in Canine CD4+ PTCL",
gene_mar = 8,
fontSize = 0.8,
topPathways = 1)
# draw oncoplot
oncoplot(
maf = mymaf_filtered,
genes = hPTCLgenes,
colors = var_cols,
titleText = "Canine CD4+ PTCL-Specific Variants in Genes Commonly Mutated in Human PTCL"
)
oncoplot(
maf = mymaf_filtered,
genes = genes,
colors = var_cols,
titleText = "Canine CD4+ PTCL-Specific Variants in Genes Commonly Mutated in Canine TCL"
)
oncoplot(
maf = mymaf_filtered,
genes = fusiongenes,
colors = var_cols,
titleText = "Canine CD4+ PTCL-Specific Variants Called in Fusion Partner Genes"
)
Boxplot summarizes the overall distribution of different conversions, and stacked barplot shows fraction of conversions in each sample.
mymafFiltered.titv = titv(maf = mymaf_filtered,
plot = FALSE,
useSyn = TRUE)
# plot titv summary
plotTiTv(res = mymafFiltered.titv, color = titvcols)
Visualizes hypermutated genomic regions in cancer genomes by plotting inter-variant distance on a linear genomic scale. “Kataegis” are defined as those genomic segments containing six or more consecutive mutations with an average inter-mutation distance of less than or equal to 1,000 bp. If tsb = NULL, the most mutated sample is plotted.
top5 <- head(sampleSummary, 5)
top5 <- as.character(top5$Tumor_Sample_Barcode[1:5])
top5
## [1] "CI155427" "CI161277" "CI104568" "CI124799" "CI153070"
for (barcode in top5){
rainfallPlot(maf = mymaf_filtered,
detectChangePoints = TRUE,
tsb = barcode,
width = 10,
height = 5,
pointSize = 0.8)
}
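The kataegis definition given above (six or more consecutive mutations with a mean inter-mutation distance of at most 1,000 bp) can be checked by hand with a simple sliding window. This sketch is not how `rainfallPlot()` works internally (with `detectChangePoints = TRUE` it uses change-point detection); it only applies the textual definition to positions on a single chromosome. The function name and the positions are hypothetical.

```r
# Return the start index of every window of n_mut consecutive mutations
# whose mean inter-mutation distance is at most max_mean_dist.
flag_kataegis_windows <- function(pos, n_mut = 6, max_mean_dist = 1000) {
  pos <- sort(pos)
  if (length(pos) < n_mut) return(integer(0))
  starts <- seq_len(length(pos) - n_mut + 1)
  # The mean of the (n_mut - 1) gaps in a window equals the window span
  # divided by the number of gaps.
  mean_dist <- (pos[starts + n_mut - 1] - pos[starts]) / (n_mut - 1)
  starts[mean_dist <= max_mean_dist]
}

# Made-up positions on one chromosome with a dense cluster near 250 kb.
positions <- c(1000, 250000, 250300, 250900, 251200, 252000, 252400, 900000)
hits <- flag_kataegis_windows(positions)
print(hits)
```

Only the window starting at the second mutation qualifies here: it spans 2,400 bp across five gaps, a mean of 480 bp.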
The somaticInteractions function performs pairwise Fisher’s exact tests to detect significant pairs of mutually exclusive or co-occurring genes.
# exclusivity/co-occurrence analysis on the top 25 tumor-specific mutated genes
somaticInteractions(maf = mymaf_filtered,
top = 25,
pvalue = c(0.05, 0.1),
fontSize = 0.6)
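For a single gene pair, the kind of test described above can be reproduced with base R's `fisher.test()` on a 2x2 table of mutation status. The sketch below uses made-up data for two hypothetical genes in ten samples; it shows one pairwise test only, not the full `somaticInteractions()` procedure.

```r
# Made-up mutation status per sample (TRUE = gene mutated).
geneA <- c(TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE)
geneB <- c(FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE)

# 2x2 contingency table; fixed factor levels keep empty cells in the table.
tab <- table(
  geneA = factor(geneA, levels = c(FALSE, TRUE)),
  geneB = factor(geneB, levels = c(FALSE, TRUE))
)

res <- fisher.test(tab)

# An odds ratio below 1 points toward mutual exclusivity,
# above 1 toward co-occurrence.
print(tab)
print(res$p.value)
print(unname(res$estimate))
```

In this toy example the two genes are never mutated in the same sample, so the estimated odds ratio is below 1 and the p-value falls under 0.05, the pattern expected for a mutually exclusive pair.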
sessionInfo()
## R version 4.4.0 (2024-04-24 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 11 x64 (build 22631)
##
## Matrix products: default
##
##
## locale:
## [1] LC_COLLATE=English_United States.utf8
## [2] LC_CTYPE=English_United States.utf8
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C
## [5] LC_TIME=English_United States.utf8
##
## time zone: America/Denver
## tzcode source: internal
##
## attached base packages:
## [1] grid stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] ggplot2_3.5.1 VennDiagram_1.7.3 futile.logger_1.4.3
## [4] dplyr_1.1.4 maftools_2.20.0 knitr_1.49
##
## loaded via a namespace (and not attached):
## [1] Matrix_1.7-0 gtable_0.3.6 jsonlite_1.8.9
## [4] compiler_4.4.0 tidyselect_1.2.1 jquerylib_0.1.4
## [7] scales_1.3.0 splines_4.4.0 yaml_2.3.8
## [10] fastmap_1.2.0 lattice_0.22-6 DNAcopy_1.78.0
## [13] R6_2.5.1 generics_0.1.3 tibble_3.2.1
## [16] munsell_0.5.1 bslib_0.8.0 pillar_1.9.0
## [19] RColorBrewer_1.1-3 rlang_1.1.3 utf8_1.2.4
## [22] cachem_1.1.0 xfun_0.49 sass_0.4.9
## [25] cli_3.6.2 withr_3.0.2 magrittr_2.0.3
## [28] formatR_1.14 futile.options_1.0.1 digest_0.6.35
## [31] rstudioapi_0.17.1 lifecycle_1.0.4 vctrs_0.6.5
## [34] evaluate_1.0.1 glue_1.7.0 data.table_1.16.4
## [37] farver_2.1.2 lambda.r_1.2.4 codetools_0.2-20
## [40] survival_3.5-8 colorspace_2.1-1 fansi_1.0.6
## [43] rmarkdown_2.29 tools_4.4.0 pkgconfig_2.0.3
## [46] htmltools_0.5.8.1
citation()
## To cite R in publications use:
##
## R Core Team (2024). _R: A Language and Environment for Statistical
## Computing_. R Foundation for Statistical Computing, Vienna, Austria.
## <https://www.R-project.org/>.
##
## A BibTeX entry for LaTeX users is
##
## @Manual{,
## title = {R: A Language and Environment for Statistical Computing},
## author = {{R Core Team}},
## organization = {R Foundation for Statistical Computing},
## address = {Vienna, Austria},
## year = {2024},
## url = {https://www.R-project.org/},
## }
##
## We have invested a lot of time and effort in creating R, please cite it
## when using it for data analysis. See also 'citation("pkgname")' for
## citing R packages.