Introduction

This report uses TargetQC to evaluate the quality of a clinical Whole Genome Sequencing (WGS) sample. The evaluation covers three main modules: Target Exon QC, Target Gene QC, and Variant Capture QC.

Target Exon QC

This section checks how well the exons of target genes are covered by the sequencing data.

Base distribution across coverage thresholds

First, we look at how much of the target region is covered across different coverage thresholds. In this report, the target region is the exons of protein-coding genes with known phenotype associations in OMIM. Coverage thresholds used: 10X, 20X, 30X, and 40X.

Table: Shows coverage threshold, target region size, average coverage, region size meeting the threshold, and its proportion.

thresholds  size      avg    count     percent
10X         21939429  54.37  13138567  59.89
20X         21939429  54.37  11843446  53.98
30X         21939429  54.37  10625457  48.43
40X         21939429  54.37  9486638   43.24

Figure: Visualization of Table (x-axis = coverage thresholds, y-axis = proportion of region meeting threshold).
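The columns of the table above can be derived from per-base depths over the target region. The following is a minimal sketch of that arithmetic, not TargetQC's own implementation; the function name `threshold_summary` and the flat list of depths as input are assumptions made for illustration.

```python
def threshold_summary(depths, thresholds=(10, 20, 30, 40)):
    """Summarize how many target bases reach each coverage threshold.

    depths: per-base sequencing depths over the target region
            (hypothetical input format, one value per base).
    """
    depths = list(depths)
    size = len(depths)
    if size == 0:
        raise ValueError("target region is empty")
    avg = sum(depths) / size
    rows = []
    for t in thresholds:
        # A base meets the threshold when its depth is at least t.
        count = sum(1 for d in depths if d >= t)
        rows.append({
            "thresholds": f"{t}X",
            "size": size,
            "avg": round(avg, 2),
            "count": count,
            "percent": round(100 * count / size, 2),
        })
    return rows


if __name__ == "__main__":
    for row in threshold_summary([5, 12, 25, 35, 45]):
        print(row)
```

The `percent` column is simply `count / size`; for example, the 10X row of the table corresponds to 13138567 / 21939429.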

Qualified exons

Each exon is assigned a classification label. For WGS, only coverage depth is used; exome sequencing (ES) uses both capture efficiency and coverage depth.

Table: Lists exons with chromosome position (0-based), average coverage (“coverage” column), coverage-based label (“type” column), and capture-based label. For WGS, all exons are treated as “full capture.” For ES, extra columns would show capture percentage.

Exon summary

Counts of exons in each category are summarized.
Table: Matrix of exon counts per category.

capture         well   middle  poor   capture_total
full            44664  8232    18521  71417
partial         0      0       0      0
near            0      0       0      0
no              0      0       0      0
coverage_total  44664  8232    18521  71417
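A matrix like the one above is a cross-tabulation of the two per-exon labels. The sketch below shows one way to build it, assuming each exon is available as a (capture label, coverage label) pair; the function name `exon_summary` and that input format are hypothetical and are not taken from TargetQC.

```python
from collections import Counter

CAPTURE_LEVELS = ("full", "partial", "near", "no")
COVERAGE_TYPES = ("well", "middle", "poor")


def exon_summary(labels):
    """Cross-tabulate exons by capture level (rows) and coverage type (columns).

    labels: iterable of (capture_label, coverage_label) pairs, one per exon.
    """
    counts = Counter(labels)
    matrix = {}
    for cap in CAPTURE_LEVELS:
        row = {cov: counts[(cap, cov)] for cov in COVERAGE_TYPES}
        row["capture_total"] = sum(row.values())
        matrix[cap] = row
    # Column totals, including the grand total under "capture_total".
    columns = (*COVERAGE_TYPES, "capture_total")
    matrix["coverage_total"] = {
        col: sum(matrix[cap][col] for cap in CAPTURE_LEVELS) for col in columns
    }
    return matrix


if __name__ == "__main__":
    # For WGS every exon is labeled "full", so only the first row is non-zero.
    demo = [("full", "well")] * 3 + [("full", "middle")] + [("full", "poor")] * 2
    for name, row in exon_summary(demo).items():
        print(name, row)
```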

Figure: Visualizes the proportion of exons in the three coverage types: well-covered, middle-covered, and poor-covered.

For ES, the figures would instead show the coverage conditions under each capture level and the capture conditions under each coverage level, for example:

Qualified Exons (1)

Figure a: coverage conditions under each capture level

Qualified Exons (2)

Figure b: capture conditions under each coverage level

Target Gene QC

This module evaluates gene coverage from an overall perspective, focusing on protein-coding genes with OMIM phenotype associations.

Coverage range proportions

Proportion of bases in coverage ranges (≥30X, 20–30X, ≤20X) is calculated for:
- All target gene exons.
- Coding regions of protein-coding genes.

Table: Proportions of bases in these three ranges for all target gene exons.
(columns: gene ID, name, type, target transcript ID, 3 coverage range proportions).

Table: The same proportions but for coding regions of protein-coding genes (columns: gene ID, name, type, target transcript ID, 3 coverage range proportions).
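The three proportions in these tables can be computed per gene from the depths of its exon (or coding) bases. The sketch below is illustrative only: the function name is hypothetical, and because the stated ranges share the boundary values 20X and 30X, it assumes that exactly 20X counts as ≤20X, exactly 30X counts as ≥30X, and the middle range is strictly between the two.

```python
def coverage_range_proportions(depths):
    """Return the fraction of bases in each coverage range for one gene.

    depths: per-base depths over the gene's exons or coding regions.
    Boundary handling is an assumption: d >= 30 is "ge30", d <= 20 is
    "le20", and everything strictly between is "20to30".
    """
    depths = list(depths)
    n = len(depths)
    if n == 0:
        raise ValueError("gene has no target bases")
    high = sum(1 for d in depths if d >= 30)
    low = sum(1 for d in depths if d <= 20)
    mid = n - high - low
    return {"ge30": high / n, "20to30": mid / n, "le20": low / n}


if __name__ == "__main__":
    print(coverage_range_proportions([10, 20, 25, 30, 40]))
```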

Figure: Cumulative curve (x-axis = coverage range percentage, y-axis = number of genes reaching that percentage). Curves can be plotted for different gene types (e.g., all target genes or protein-coding genes). For protein-coding genes, coding regions are used.
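The points of such a cumulative curve can be sketched as follows: for each percentage on the x-axis, count the genes whose proportion of bases in the chosen coverage range reaches that percentage. The function name and the choice of x-axis steps are assumptions for illustration, not TargetQC parameters.

```python
def cumulative_gene_counts(proportions, steps=(0, 25, 50, 75, 100)):
    """Count genes whose proportion reaches each percentage step.

    proportions: one value in [0, 1] per gene, e.g. the fraction of
                 bases at >=30X.
    Returns a list of (percentage, number_of_genes) points.
    """
    proportions = list(proportions)
    return [
        (pct, sum(1 for p in proportions if p * 100 >= pct))
        for pct in steps
    ]


if __name__ == "__main__":
    for pct, n_genes in cumulative_gene_counts([1.0, 0.5, 0.25]):
        print(f"{pct}%: {n_genes} genes")
```

The resulting counts never increase as the percentage grows, which gives the curve its descending shape.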

Qualified gene

Genes are graded using user-defined criteria. In this report:
- Well-covered: ≥95% of bases ≥30X.
- Poor-covered: >5% of bases ≤20X.
- Middle-covered: all remaining genes.

Table: Grade (high/middle/low) for each gene.
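The grading rule can be expressed as a small function over the per-gene proportions. This is a sketch under stated assumptions rather than TargetQC's code: the function and parameter names are hypothetical, and a gene that meets neither criterion is assumed to fall into the middle grade.

```python
def grade_gene(prop_ge30, prop_le20, well_min=0.95, poor_max=0.05):
    """Grade one gene from its coverage range proportions.

    prop_ge30: fraction of bases at >=30X.
    prop_le20: fraction of bases at <=20X.
    well_min / poor_max: user-defined cut-offs (defaults follow the
    criteria used in this report).
    """
    if prop_ge30 >= well_min:
        return "well"
    if prop_le20 > poor_max:
        return "poor"
    # Assumption: genes meeting neither criterion are middle-covered.
    return "middle"


if __name__ == "__main__":
    print(grade_gene(0.97, 0.01))
    print(grade_gene(0.50, 0.30))
    print(grade_gene(0.90, 0.04))
```

With the default cut-offs the two criteria cannot both hold: if at least 95% of bases are at ≥30X, at most 5% can be at ≤20X.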

Gene summary

Counts of well-/middle-/poor-covered genes across functional categories (e.g., protein-coding genes, pseudogenes).

Table: Counts per category.

Figure: Visualizes counts (optional: plot for “all” or “protein-coding” genes).
Note that because the target genes selected here are protein-coding genes with known phenotype associations in OMIM, all genes are presented together, without distinguishing protein-coding genes from pseudogenes.

Variant Capture QC

This module evaluates key variant sites using sequencing depth (DP) and B-allele frequency (BAF), focusing on pathogenic/likely pathogenic (p/lp) and disease-causing mutation (DM) sites from ClinVar and HGMD.

Known pathogenic variants quality

Lists detected clinical key sites with details.

Table: Columns: chromosome position, variant type, genotype, DP, BAF.

DP statistics

Percentage of sites with DP ≥30X (reliable) or ≤20X (low quality risk).

Figure: Visualizes DP percentages.
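The two percentages can be sketched as below, given the DP values of the detected key sites. The function name and return keys are hypothetical; only the 30X and 20X cut-offs come from this report.

```python
def dp_statistics(dps, reliable_min=30, low_max=20):
    """Percentage of sites with reliable depth and with low depth.

    dps: sequencing depth (DP) at each detected key variant site.
    """
    dps = list(dps)
    n = len(dps)
    if n == 0:
        raise ValueError("no variant sites given")
    reliable = sum(1 for dp in dps if dp >= reliable_min)
    low = sum(1 for dp in dps if dp <= low_max)
    return {
        "pct_reliable": round(100 * reliable / n, 2),
        "pct_low": round(100 * low / n, 2),
    }


if __name__ == "__main__":
    print(dp_statistics([10, 20, 30, 50]))
```

Sites with DP strictly between the two cut-offs fall into neither group, so the two percentages need not sum to 100.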

BAF deviation

BAF deviation for homozygous (hom) and heterozygous (het) SNVs/indels. Above the figure: percentage of sites with BAF deviation ≥0.05 (abnormal deviation).

Figure: Displays BAF deviations and abnormal site percentages.
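The report does not spell out how BAF deviation is computed. The sketch below assumes the common convention that a heterozygous site is expected at BAF 0.5 and a homozygous alternate site at BAF 1.0, with deviation taken as the absolute difference from that expectation. Both the expected values and the function name are assumptions, not confirmed TargetQC behavior.

```python
# Assumed expected BAF per genotype (not stated in the report).
EXPECTED_BAF = {"het": 0.5, "hom": 1.0}


def baf_abnormal_percentages(sites, cutoff=0.05):
    """Percentage of sites per genotype whose BAF deviation is abnormal.

    sites: iterable of (genotype, baf) pairs, genotype in {"het", "hom"}.
    A site is abnormal when |baf - expected| >= cutoff.
    """
    totals = {g: 0 for g in EXPECTED_BAF}
    abnormal = {g: 0 for g in EXPECTED_BAF}
    for genotype, baf in sites:
        deviation = abs(baf - EXPECTED_BAF[genotype])
        totals[genotype] += 1
        if deviation >= cutoff:
            abnormal[genotype] += 1
    # Genotypes with no sites are omitted to avoid dividing by zero.
    return {
        g: round(100 * abnormal[g] / totals[g], 2)
        for g in EXPECTED_BAF
        if totals[g] > 0
    }


if __name__ == "__main__":
    demo = [("het", 0.48), ("het", 0.40), ("hom", 1.0), ("hom", 0.90)]
    print(baf_abnormal_percentages(demo))
```

Values that sit exactly on the cut-off may be affected by floating-point rounding, so a real implementation might compare with a small tolerance.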

In target regions (if the “regions” option is set)

Quality data for clinical pathogenic variants detected in target regions. Here, the target regions are the CDS regions of protein-coding genes with known phenotype associations in OMIM.

Table: Lists variants in target regions.

Figures: Visualize variant quality metrics.

Conclusion

This report uses the TargetQC v1.0 framework to provide a clear view of a WGS sample’s sequencing quality. It offers a practical resource for investigators who want to evaluate and compare sequencing quality in clinically relevant genes, and it supports more informed selection and optimization of sequencing strategies for clinical genomic testing.