This post shows how to determine fetal gender from cell-free DNA (cfDNA) sequencing data, as is done in an NIPT test.

1. Download data

Download open-access data from the NCBI:

system("prefetch SRR31264073")

The output file is SRR31264073.sralite.

Convert SRR31264073.sralite to paired FASTQ files:

system("fasterq-dump --split-files ./SRR31264073/SRR31264073.sralite")

The output files are SRR31264073_1.fastq and SRR31264073_2.fastq.

I compress these two files into SRR31264073_1.fastq.gz and SRR31264073_2.fastq.gz:

system("gzip SRR31264073_1.fastq  SRR31264073_2.fastq")

Run a QC check on the files. If the reads need trimming, trim them with Trimmomatic:

system("trimmomatic PE SRR31264073_1.fastq.gz SRR31264073_2.fastq.gz \
    SRR31264073_1_trimmed.fastq.gz SRR31264073_1_unpaired.fastq.gz \
    SRR31264073_2_trimmed.fastq.gz SRR31264073_2_unpaired.fastq.gz \
    SLIDINGWINDOW:4:20 MINLEN:50")

In this case trimming is not needed because the read quality is already good, so I proceed to the next step with the untrimmed files.
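The QC check itself is not spelled out above. As a rough sketch only, assuming standard 4-line FASTQ records with Phred+33 quality encoding, the mean base quality of a file could be summarized in Python as below. The helper names `mean_base_quality` and `mean_base_quality_gz` are hypothetical, and the two demo reads are made up. This is a quick sanity check, not a substitute for a full QC report.

```python
import gzip
import io


def mean_base_quality(handle, phred_offset=33):
    """Return (number of reads, mean Phred score over all bases) for a FASTQ text stream."""
    n_reads = 0
    n_bases = 0
    total_q = 0
    while True:
        header = handle.readline()
        if not header:
            break                      # end of file
        if not header.strip():
            continue                   # skip stray blank lines
        handle.readline()              # sequence line (not needed here)
        handle.readline()              # '+' separator line
        qual = handle.readline().rstrip("\n")
        n_reads += 1
        n_bases += len(qual)
        total_q += sum(ord(c) - phred_offset for c in qual)
    mean_q = total_q / n_bases if n_bases else 0.0
    return n_reads, mean_q


def mean_base_quality_gz(path):
    """Same summary for a gzip-compressed FASTQ file on disk."""
    with gzip.open(path, "rt") as handle:
        return mean_base_quality(handle)


# Made-up demo reads: 'I' encodes Q40 and '5' encodes Q20 under Phred+33.
demo = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n5555\n")
n_reads, mean_q = mean_base_quality(demo)
print(n_reads, mean_q)  # 2 reads, mean quality 30.0
```

On the real data this would be called as `mean_base_quality_gz("SRR31264073_1.fastq.gz")`, which reads the whole file and may take a while.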

2. Align reads to the reference genome

system("bwa mem -t 4 /Users/nnthieu/genome_ref/hg38/hg38.fa SRR31264073_1.fastq.gz SRR31264073_2.fastq.gz > SRR31264073_aligned.sam")

Convert the SAM file to BAM, then sort and index it:

system("samtools view -S -b SRR31264073_aligned.sam > SRR31264073_aligned.bam")
system("samtools sort SRR31264073_aligned.bam -o SRR31264073_sorted.bam")
system("samtools index SRR31264073_sorted.bam")
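The alignment and sorting steps can also be chained so that no intermediate SAM or unsorted BAM file is written. The sketch below only builds the shell pipeline as a string and prints it. The helper name `build_align_sort_command` is hypothetical, and the run lines are left commented out because they assume `bwa` and `samtools` are on the PATH and the reference has already been indexed with `bwa index`.

```python
import shlex
import subprocess


def build_align_sort_command(ref, fq1, fq2, out_bam, threads=4):
    """Build a shell pipeline: bwa mem piped into samtools sort, writing a sorted BAM."""
    bwa = ["bwa", "mem", "-t", str(threads), ref, fq1, fq2]
    sort = ["samtools", "sort", "-o", out_bam, "-"]  # "-" tells samtools to read stdin
    return " | ".join(" ".join(shlex.quote(arg) for arg in cmd) for cmd in (bwa, sort))


cmd = build_align_sort_command(
    "/Users/nnthieu/genome_ref/hg38/hg38.fa",
    "SRR31264073_1.fastq.gz",
    "SRR31264073_2.fastq.gz",
    "SRR31264073_sorted.bam",
)
print(cmd)

# Uncomment to actually run the pipeline and index the result:
# subprocess.run(cmd, shell=True, check=True)
# subprocess.run(["samtools", "index", "SRR31264073_sorted.bam"], check=True)
```

The trade-off is that the intermediate SAM file is no longer available for inspection if something goes wrong during alignment.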

3. Count reads mapped to the X and Y chromosomes

# Count reads mapped to X chromosome
system("samtools view -c SRR31264073_sorted.bam chrX ") # chrX_count

# Count reads mapped to Y chromosome
system("samtools view -c SRR31264073_sorted.bam chrY ")  # chrY_count

# Count reads mapped to an autosomal chromosome (e.g., chromosome 1)  
system("samtools view -c SRR31264073_sorted.bam chr1 ")   # chr1_count
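As an alternative to three separate calls, `samtools idxstats` prints one tab-separated line per reference sequence (name, length, mapped count, unmapped count) for an indexed BAM. The sketch below parses that text in Python. The function name `parse_idxstats` is hypothetical, and the example input is illustrative: it reuses the mapped counts reported in this post with the hg38 chromosome lengths and a placeholder 0 in the unmapped column. Counts from `idxstats` may not match `samtools view -c` exactly, since the two commands do not necessarily count the same set of records.

```python
def parse_idxstats(text):
    """Map reference name -> mapped read count from `samtools idxstats` output."""
    counts = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore empty lines
        name, _length, mapped, _unmapped = line.split("\t")
        counts[name] = int(mapped)
    return counts


# Illustrative input only (unmapped column is a placeholder).
# In practice the text could come from:
#   subprocess.run(["samtools", "idxstats", "SRR31264073_sorted.bam"],
#                  capture_output=True, text=True, check=True).stdout
example = (
    "chr1\t248956422\t28401565\t0\n"
    "chrX\t156040895\t10574281\t0\n"
    "chrY\t57227415\t187969\t0\n"
)
counts = parse_idxstats(example)
print(counts["chrX"], counts["chrY"], counts["chr1"])
```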

4. Fetal fraction calculation in Python

Replace these values with actual read counts obtained from SAMtools:

chrX_count = 10574281

chrY_count = 187969

chr1_count = 28401565

Estimate the fetal fraction, as a percentage, from the Y chromosome read count relative to the chrX and chr1 read counts:

fetal_fraction = (chrY_count / (chrX_count + chr1_count)) * 100

With the counts above: (187969 / (10574281 + 28401565)) * 100 = 0.4822705, i.e. about 0.48%.

Interpretation of gender

In Python:

if chrY_count > 0 and fetal_fraction > 4:
    print("Fetal gender: Male")
else:
    print("Fetal gender: Female")

Here fetal_fraction is 0.4822705, which is below the threshold of 4, so the rule above reports the fetal gender as female.
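The calculation and the decision rule can be wrapped into one reusable function. This is a minimal sketch of the rule exactly as stated in this post. The function name `call_fetal_gender` and the `threshold_percent` parameter are hypothetical, and the default of 4 is simply the cutoff used above, not a validated value.

```python
def call_fetal_gender(chrX_count, chrY_count, chr1_count, threshold_percent=4.0):
    """Apply the post's rule: Male if Y reads exist and the Y-based percentage exceeds the threshold."""
    denominator = chrX_count + chr1_count
    if denominator <= 0:
        raise ValueError("chrX_count + chr1_count must be positive")
    fetal_fraction = chrY_count / denominator * 100
    if chrY_count > 0 and fetal_fraction > threshold_percent:
        return "Male", fetal_fraction
    return "Female", fetal_fraction


# Read counts reported in this post
label, fetal_fraction = call_fetal_gender(10574281, 187969, 28401565)
print(f"Fetal gender: {label} (Y-based percentage = {fetal_fraction:.4f})")
```

Keeping the threshold as a parameter makes it easy to see how sensitive the call is to the chosen cutoff.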
