1 Setup

2 LUAD tumor-only mode

2.1 Load data

Results are plotted for the 442 samples that have a matched normal, out of 507 samples processed in tumor-only mode.

script_dir = "/home/sehyun/Documents/github/PureCN_manuscript"
results_dir = file.path(script_dir, "luad/Results/purity_ploidy")
purecn_puri_ploi = readRDS(file.path(results_dir, "data/luad_ABS_w_tumor_only.rds"))

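# 442 obs. which have matched normal samples
# (the two commented lines below show how luad_442.csv was generated)
# paired = readRDS(file.path(results_dir, "data/luad_ABS_w_matching_normal.rds"))
# write.csv(paired$fullname, "~/Documents/github/PureCN_manuscript/luad/Results/duplication/luad_442.csv")

# write.csv() stores row names in column 1, so the sample names are in column 2
luad_442 = read.csv("~/Documents/github/PureCN_manuscript/luad/Results/duplication/luad_442.csv")[,2]

# keep only the tumor-only results for samples that also have a matched normal
purecn_puri_ploi = purecn_puri_ploi[purecn_puri_ploi$fullname %in% luad_442,]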
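The subsetting step above can be sanity-checked by looking for requested sample names that are absent from the results table. The following is a minimal sketch on toy data; `purecn_toy`, `keep_ids`, and the sample names are hypothetical stand-ins, not objects from this analysis.

```r
# Toy stand-ins for the real objects (hypothetical values).
purecn_toy <- data.frame(
  fullname = c("S1", "S2", "S3", "S4"),
  Purity_tumor_only = c(0.30, 0.55, 0.72, 0.41),
  stringsAsFactors = FALSE
)
keep_ids <- c("S1", "S3", "S5")

# Logical indexing with %in% keeps the rows whose name was requested.
subset_toy <- purecn_toy[purecn_toy$fullname %in% keep_ids, ]

# Names that were requested but are not present in the results table.
missing_ids <- setdiff(keep_ids, purecn_toy$fullname)

nrow(subset_toy)
missing_ids
```

If `missing_ids` is non-empty, the subset has fewer rows than the ID list, which would be worth knowing before reporting a sample count.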

2.2 Purity and Ploidy Comparisons

These are Figure 1C and 1D in the manuscript.

3 Purity outliers

The 80% most purity-concordant samples are marked in red and the remaining 20% are colored in cyan. If I calculate the Pearson correlation using only the concordant samples, it gives 0.98.
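The report does not show how samples were split into concordant and discordant groups. One plausible approach, sketched here on simulated data, is to rank samples by the absolute difference between the two purity estimates and cut at the 80th percentile. The helper `label_concordance()` and the simulated values are assumptions made for illustration; only the column names `Purity_ABS`, `Purity_tumor_only`, and `absdiff` come from the tables in this report.

```r
# Hypothetical helper: label the keep_frac fraction of samples with the
# smallest absolute difference as "concordant", the rest as "discordant".
label_concordance <- function(x, y, keep_frac = 0.8) {
  absdiff <- abs(x - y)
  cutoff <- quantile(absdiff, probs = keep_frac, names = FALSE)
  ifelse(absdiff <= cutoff, "concordant", "discordant")
}

# Simulated purities (not real data).
set.seed(1)
toy <- data.frame(Purity_ABS = runif(50, 0.2, 0.8))
toy$Purity_tumor_only <- toy$Purity_ABS + rnorm(50, sd = 0.05)

toy$absdiff <- label_concordance(toy$Purity_ABS, toy$Purity_tumor_only)
table(toy$absdiff)

# Pearson correlation restricted to the concordant samples.
conc <- toy$absdiff == "concordant"
cor(toy$Purity_ABS[conc], toy$Purity_tumor_only[conc], method = "pearson")
```

Because the concordant group is selected on agreement, its correlation is expected to be high by construction, so it is best read as a description of the group rather than as independent validation.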

3.1 Ploidy of purity outliers

[Left] PureCN versus ABSOLUTE ploidy for the 20% purity-discordant samples (the cyan dots in the plot above). The Pearson correlation coefficient is 0.19.
[Right] PureCN versus ABSOLUTE ploidy for the purity-concordant samples. The Pearson correlation coefficient is 0.65.

-> So do wrong purity estimates lead to wrong ploidy estimates?
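The two panel correlations can be computed in one step by splitting the table on the concordance label. This is a sketch on simulated data, assuming a label column named `absdiff` as in the tables further down; the ploidy values and noise levels are invented for illustration.

```r
# Simulated table: 40 purity-concordant and 10 purity-discordant samples.
set.seed(2)
toy <- data.frame(
  absdiff = rep(c("concordant", "discordant"), times = c(40, 10)),
  Ploidy_ABS = runif(50, 1.5, 5),
  stringsAsFactors = FALSE
)

# Give the discordant group noisier tumor-only ploidy (an assumption).
noise_sd <- ifelse(toy$absdiff == "concordant", 0.3, 1.5)
toy$Ploidy_tumor_only <- toy$Ploidy_ABS + rnorm(50, sd = noise_sd)

# Pearson correlation of the two ploidy estimates within each group.
ploidy_cor <- sapply(
  split(toy, toy$absdiff),
  function(d) cor(d$Ploidy_ABS, d$Ploidy_tumor_only, method = "pearson")
)
ploidy_cor
```

With only about 20% of samples in the discordant group, its coefficient rests on far fewer points, so a confidence interval from `cor.test()` may be more informative than the point estimate alone.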

4 Ploidy outliers

The 80% most ploidy-concordant samples are marked in red and the remaining 20% are colored in cyan. If I calculate the Pearson correlation using only the concordant samples, it gives 0.97.

4.1 Purity of ploidy outliers

[Left] PureCN versus ABSOLUTE purity for the 20% ploidy-discordant samples (the cyan dots in the plot above). The Pearson correlation coefficient is 0.69.
[Right] PureCN versus ABSOLUTE purity for the ploidy-concordant samples. The Pearson correlation coefficient is 0.87.

5 Extreme cases (2%)

5.1 Purity outliers

puriOutlierInd = which(df_puri_outlier$absdiff == "discordant")
df_puri_outlier[puriOutlierInd,]
##            SampleId Purity_ABS Ploidy_ABS             fullname
## 30  TCGA-05-5425-01       0.71       1.08 TCGA-05-5425-01A-02D
## 130 TCGA-50-5930-01       0.95       2.08 TCGA-50-5930-01A-11D
## 143 TCGA-50-6591-01       0.90       4.67 TCGA-50-6591-01A-11D
## 150 TCGA-50-8457-01       0.60       2.03 TCGA-50-8457-01A-11D
## 177 TCGA-55-6984-01       0.81       4.06 TCGA-55-6984-01A-11D
## 328 TCGA-78-7153-01       0.57       4.00 TCGA-78-7153-01A-11D
## 381 TCGA-86-8671-01       1.00       2.00 TCGA-86-8671-01A-11D
## 424 TCGA-97-7546-01       0.67       1.85 TCGA-97-7546-01A-11D
## 454 TCGA-99-AA5R-01       0.96       2.00 TCGA-99-AA5R-01A-11D
##     Purity_tumor_only Ploidy_tumor_only Flagged          Comment capture_kit
## 30               0.30          2.827928   FALSE             <NA>      931070
## 130              0.34          2.122316   FALSE             <NA>      931070
## 143              0.43          2.677163   FALSE             <NA>      931070
## 150              0.16          2.129037    TRUE       LOW PURITY      931070
## 177              0.41          2.054388    TRUE POOR GOF (79.1%)      931070
## 328              0.24          2.356133    TRUE       LOW PURITY      931070
## 381              0.17          1.944552    TRUE       LOW PURITY      931070
## 424              0.24          2.143490    TRUE       LOW PURITY      931070
## 454              0.34          2.003381    TRUE     NON-ABERRANT      931070
##        absdiff
## 30  discordant
## 130 discordant
## 143 discordant
## 150 discordant
## 177 discordant
## 328 discordant
## 381 discordant
## 424 discordant
## 454 discordant
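To see the direction of the disagreement and how many of these samples were flagged, the printed rows can be summarised directly. The sketch below re-enters the purity and `Flagged` columns from the output above by hand; the data frame name `puri_out` and the `diff` column are introduced here for illustration.

```r
# Values copied from the printed purity-outlier table.
puri_out <- data.frame(
  SampleId = c("TCGA-05-5425-01", "TCGA-50-5930-01", "TCGA-50-6591-01",
               "TCGA-50-8457-01", "TCGA-55-6984-01", "TCGA-78-7153-01",
               "TCGA-86-8671-01", "TCGA-97-7546-01", "TCGA-99-AA5R-01"),
  Purity_ABS = c(0.71, 0.95, 0.90, 0.60, 0.81, 0.57, 1.00, 0.67, 0.96),
  Purity_tumor_only = c(0.30, 0.34, 0.43, 0.16, 0.41, 0.24, 0.17, 0.24, 0.34),
  Flagged = c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE),
  stringsAsFactors = FALSE
)

# Signed difference: negative means tumor-only purity is below ABSOLUTE.
puri_out$diff <- puri_out$Purity_tumor_only - puri_out$Purity_ABS

range(puri_out$diff)
table(puri_out$Flagged)
```

In these nine rows the tumor-only purity is lower than the ABSOLUTE purity every time, and six of the nine samples carry a flag.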

5.2 Ploidy outliers

ploiOutlierInd = which(df_ploi_outlier$absdiff == "discordant")
df_ploi_outlier[ploiOutlierInd,]
##            SampleId Purity_ABS Ploidy_ABS             fullname
## 15  TCGA-05-4410-01       0.24       4.49 TCGA-05-4410-01A-21D
## 101 TCGA-49-4512-01       0.32       4.56 TCGA-49-4512-01A-21D
## 134 TCGA-50-5935-01       0.39       5.81 TCGA-50-5935-01A-11D
## 265 TCGA-64-1681-01       0.31       5.63 TCGA-64-1681-01A-11D
## 280 TCGA-67-6217-01       0.25       6.17 TCGA-67-6217-01A-11D
## 400 TCGA-91-7771-01       0.32       5.00 TCGA-91-7771-01A-11D
## 413 TCGA-95-7039-01       0.41       5.66 TCGA-95-7039-01A-11D
## 428 TCGA-97-7554-01       0.29       5.21 TCGA-97-7554-01A-11D
## 465 TCGA-L9-A743-01       0.09       6.48 TCGA-L9-A743-01A-43D
##     Purity_tumor_only Ploidy_tumor_only Flagged          Comment capture_kit
## 15               0.25          2.304502    TRUE    EXCESSIVE LOH      931070
## 101              0.43          2.289707   FALSE             <NA>      931070
## 134              0.52          3.171802   FALSE             <NA>      931070
## 265              0.39          3.158382   FALSE             <NA>      931070
## 280              0.33          2.026858    TRUE POOR GOF (79.4%)      931070
## 400              0.31          2.286902    TRUE POOR GOF (78.7%)      931070
## 413              0.43          3.024252   FALSE             <NA>      931070
## 428              0.30          2.846310   FALSE             <NA>      931070
## 465              0.16          1.750916    TRUE    EXCESSIVE LOH      931070
##        absdiff
## 15  discordant
## 101 discordant
## 134 discordant
## 265 discordant
## 280 discordant
## 400 discordant
## 413 discordant
## 428 discordant
## 465 discordant
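One way to inspect these ploidy outliers is to look at the ratio of the ABSOLUTE ploidy to the tumor-only ploidy rather than the difference. The sketch below re-enters the ploidy and `Flagged` columns from the output above by hand; `ploi_out` and `ratio` are names introduced here for illustration.

```r
# Values copied from the printed ploidy-outlier table.
ploi_out <- data.frame(
  SampleId = c("TCGA-05-4410-01", "TCGA-49-4512-01", "TCGA-50-5935-01",
               "TCGA-64-1681-01", "TCGA-67-6217-01", "TCGA-91-7771-01",
               "TCGA-95-7039-01", "TCGA-97-7554-01", "TCGA-L9-A743-01"),
  Ploidy_ABS = c(4.49, 4.56, 5.81, 5.63, 6.17, 5.00, 5.66, 5.21, 6.48),
  Ploidy_tumor_only = c(2.304502, 2.289707, 3.171802, 3.158382, 2.026858,
                        2.286902, 3.024252, 2.846310, 1.750916),
  Flagged = c(TRUE, FALSE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Ratio of the two ploidy estimates for each sample.
ploi_out$ratio <- ploi_out$Ploidy_ABS / ploi_out$Ploidy_tumor_only

round(ploi_out$ratio, 2)
table(ploi_out$Flagged)
```

A ratio close to 2 would be compatible with the two tools choosing solutions that differ by a whole-genome doubling, but that reading is an assumption on my part and is not something this report establishes.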