The sequencing data are from one individual (NA12892) of the 1000 Genomes Project.
Google Cloud Platform (GCP), 4 vCPUs, 15 GB RAM
Google Cloud Platform (GCP), 8 vCPUs, 30 GB RAM
Google Cloud Platform (GCP), 24 vCPUs, 32 GB RAM
AWS: c5.18xlarge (72 vCPUs, 144 GB RAM)
AWS: m5.24xlarge (96 vCPUs, 384 GiB RAM)
There are 14 voltage-gated sodium channel subunit genes in humans.
Ten of them code for alpha subunits.
Four of them code for beta subunits.
scn1a/Nav1.1 Chromosome 2, NC_000002.12 (165989160..166149216, complement)
scn2a/Nav1.2 Chromosome 2, NC_000002.12 (165208056..165392310)
scn3a/Nav1.3 Chromosome 2, NC_000002.12 (165087520..165204295, complement)
scn4a/Nav1.4 Chromosome 17, NC_000017.11 (63938554..63972918, complement)
scn5a/Nav1.5 Chromosome 3, NC_000003.12 (38548061..38649673, complement)
scn7a/Nax Chromosome 2, NC_000002.12 (166403573..166494264, complement)
scn8a/Nav1.6 Chromosome 12, NC_000012.12 (51589958..51812864)
scn9a/Nav1.7 Chromosome 2, NC_000002.12 (166195185..166375987, complement)
scn10a/Nav1.8 Chromosome 3, NC_000003.12 (38697110..38794010, complement)
scn11a/Nav1.9 Chromosome 3, NC_000003.12 (38845764..39051945, complement)
scn1b Chromosome 19, NC_000019.10 (35030688..35040449)
scn2b Chromosome 11, NC_000011.10 (118162804..118176622, complement)
scn3b Chromosome 11, NC_000011.10 (123629187..123654607, complement)
scn4b Chromosome 11, NC_000011.10 (118133377..118152915, complement)
scn1a Chromosome 2, NC_000002.12 (165989160..166149216, complement)
scn2a Chromosome 2, NC_000002.12 (165208056..165392310)
scn3a Chromosome 2, NC_000002.12 (165087520..165204295, complement)
scn7a Chromosome 2, NC_000002.12 (166403573..166494264, complement)
scn9a Chromosome 2, NC_000002.12 (166195185..166375987, complement)
scn5a Chromosome 3, NC_000003.12 (38548061..38649673, complement)
scn10a Chromosome 3, NC_000003.12 (38697110..38794010, complement)
scn11a Chromosome 3, NC_000003.12 (38845764..39051945, complement)
scn2b Chromosome 11, NC_000011.10 (118162804..118176622, complement)
scn3b Chromosome 11, NC_000011.10 (123629187..123654607, complement)
scn4b Chromosome 11, NC_000011.10 (118133377..118152915, complement)
scn8a Chromosome 12, NC_000012.12 (51589958..51812864)
scn4a Chromosome 17, NC_000017.11 (63938554..63972918, complement)
scn1b Chromosome 19, NC_000019.10 (35030688..35040449)
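The NCBI-style coordinates above have to be rewritten as chrN:start-end intervals before they can be passed to GATK with -L. A minimal sketch of that conversion follows, assuming the "gene Chromosome N, ACCESSION (start..end[, complement])" layout of the chromosome-sorted list; the function name to_intervals is hypothetical, and the third output column is simply the interval length in bp.

```shell
#!/bin/bash
# Hypothetical helper: turn "gene Chromosome N, ACC (start..end[, complement])"
# lines into "gene<TAB>chrN:start-end<TAB>length_bp".
to_intervals() {
  awk '{
    gene = $1
    chrom = $3; sub(/,$/, "", chrom)        # "2," -> "2"
    range = $5; gsub(/[(),]/, "", range)    # "(165989160..166149216," -> "165989160..166149216"
    split(range, pos, "[.][.]")
    printf "%s\tchr%s:%s-%s\t%d\n", gene, chrom, pos[1], pos[2], pos[2] - pos[1] + 1
  }'
}

# Example with two of the genes listed above.
to_intervals <<'EOF'
scn1a Chromosome 2, NC_000002.12 (165989160..166149216, complement)
scn1b Chromosome 19, NC_000019.10 (35030688..35040449)
EOF
```

The strand annotation (complement) is dropped on purpose: an interval only restricts where variants are called, so the orientation of the gene does not matter here.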
~/gatk-data-ref$ wget -bqc ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR622/SRR622459/SRR622459_1.fastq.gz
~/gatk-data-ref$ wget -bqc ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR622/SRR622459/SRR622459_2.fastq.gz
-rw-rw-r-- 1 ubuntu ubuntu 68029335546 Apr 30 15:50 SRR622459_1.fastq.gz
-rw-rw-r-- 1 ubuntu ubuntu 69219443789 Apr 30 15:54 SRR622459_2.fastq.gz
@SRR622459.1 1/1
GTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGTTAGGGAGATCGGAAG
@SRR622459.1 1/2
CCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACAGATCGGAAG
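Before aligning, it can be worth checking that both mate files hold the same number of records. The sketch below is one possible way to do that with gzip and awk only; the function names fastq_read_count and check_pairs are hypothetical, and on files of this size (about 68 GB each) a full pass will take a long time.

```shell
#!/bin/bash
# Count the records in a gzipped FASTQ file (4 lines per record).
fastq_read_count() {
  gzip -dc "$1" | awk 'END { printf "%.0f\n", NR / 4 }'
}

# Compare the two mate files; in a well-formed paired-end run the counts match.
check_pairs() {
  local n1 n2
  n1=$(fastq_read_count "$1")
  n2=$(fastq_read_count "$2")
  if [ "$n1" = "$n2" ]; then
    echo "OK: $n1 read pairs"
  else
    echo "MISMATCH: $n1 vs $n2" >&2
    return 1
  fi
}
```

Usage would be along the lines of: check_pairs SRR622459_1.fastq.gz SRR622459_2.fastq.gz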
~/gatk-data-ref$ aws s3 cp s3://broad-references/hg38/v0/Homo_sapiens_assembly38.fasta .
~/gatk-data-ref$ aws s3 cp s3://broad-references/hg38/v0/Homo_sapiens_assembly38.dict .
~/gatk-data-ref$ aws s3 cp s3://broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.fai .
~/gatk-data-ref$ aws s3 cp s3://broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.64.alt .
~/gatk-data-ref$ aws s3 cp s3://broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.64.amb .
~/gatk-data-ref$ aws s3 cp s3://broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.64.ann .
~/gatk-data-ref$ aws s3 cp s3://broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.64.bwt .
~/gatk-data-ref$ aws s3 cp s3://broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.64.pac .
~/gatk-data-ref$ aws s3 cp s3://broad-references/hg38/v0/Homo_sapiens_assembly38.fasta.64.sa .
-rw-rw-r-- 1 ubuntu ubuntu 581712 Jan 6 2016 Homo_sapiens_assembly38.dict
-rw-rw-r-- 1 ubuntu ubuntu 3249912778 Jan 5 2016 Homo_sapiens_assembly38.fasta
-rw-rw-r-- 1 ubuntu ubuntu 487553 Nov 6 23:47 Homo_sapiens_assembly38.fasta.64.alt
-rw-rw-r-- 1 ubuntu ubuntu 20199 Nov 6 23:47 Homo_sapiens_assembly38.fasta.64.amb
-rw-rw-r-- 1 ubuntu ubuntu 455474 Nov 6 23:47 Homo_sapiens_assembly38.fasta.64.ann
-rw-rw-r-- 1 ubuntu ubuntu 3217347004 Nov 6 23:47 Homo_sapiens_assembly38.fasta.64.bwt
-rw-rw-r-- 1 ubuntu ubuntu 804336731 Nov 6 23:48 Homo_sapiens_assembly38.fasta.64.pac
-rw-rw-r-- 1 ubuntu ubuntu 1608673512 Nov 6 23:48 Homo_sapiens_assembly38.fasta.64.sa
-rw-rw-r-- 1 ubuntu ubuntu 160928 Dec 1 2016 Homo_sapiens_assembly38.fasta.fai
Alignment with bwa mem produces a SAM (Sequence Alignment/Map) file, which is then converted to a BAM file with samtools view.
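The two steps can also be chained with a pipe, so that the uncompressed SAM is never written to disk. The following is only a sketch of that variant, not the exact command used here. The function name align_to_bam is hypothetical, and the mapping of the read-group values onto the ID/LB/PL/PU/SM tags is an assumption based on the values visible in the bwa log.

```shell
#!/bin/bash
# Sketch: align paired-end reads with bwa mem and stream the SAM output
# straight into samtools view, which writes BAM.
align_to_bam() {
  local ref=$1 fq1=$2 fq2=$3 out_bam=$4 threads=${5:-4}
  # Assumed tag layout; bwa turns the literal "\t" sequences into tabs.
  local rg='@RG\tID:SRR622459\tLB:Q\tPL:illumina\tPU:FCC1H7WACXX\tSM:NA12892'
  bwa mem -M -t "$threads" -R "$rg" "$ref" "$fq1" "$fq2" |
    samtools view -b -o "$out_bam" -
}
```

A call would look like: align_to_bam Homo_sapiens_assembly38.fasta SRR622459_1.fastq.gz SRR622459_2.fastq.gz SRR622459_1.bam 96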
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -M -t 96 -R @RG:SRR622459:Q:illumina:FCC1H7WACXX:NA12892 Homo_sapiens_assembly38.fasta SRR622459_1.fastq.gz SRR622459_2.fastq.gz
[main] Real time: 22680.497 sec; CPU: 1393622.442 sec
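The two timings reported by bwa can be turned into more readable figures with a little arithmetic: wall-clock hours, the average number of busy threads (CPU time divided by real time), and utilisation relative to the 96 threads requested. A small sketch, with the hypothetical function name bwa_time_summary:

```shell
#!/bin/bash
# args: real_seconds cpu_seconds threads_requested
bwa_time_summary() {
  awk -v real="$1" -v cpu="$2" -v threads="$3" 'BEGIN {
    printf "wall_hours=%.2f\n", real / 3600
    printf "avg_busy_threads=%.1f\n", cpu / real
    printf "thread_utilisation_pct=%.1f\n", 100 * cpu / (real * threads)
  }'
}

bwa_time_summary 22680.497 1393622.442 96
```

For this run that works out to about 6.3 hours of wall-clock time, with roughly 61 of the 96 threads busy on average.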
sambamba 0.6.9 by Artem Tarasov and Pjotr Prins (C) 2012-2019
2467570854 + 0 in total (QC-passed reads + QC-failed reads)
10585005 + 0 secondary
11607447 + 0 supplementary
0 + 0 duplicates
2364732708 + 0 mapped (95.83%:N/A)
2445378402 + 0 paired in sequencing
1222689201 + 0 read1
1222689201 + 0 read2
2254822518 + 0 properly paired (92.21%:N/A)
2313333564 + 0 with itself and mate mapped
29206692 + 0 singletons (1.19%:N/A)
27935066 + 0 with mate mapped to a different chr
16234878 + 0 with mate mapped to a different chr (mapQ>=5)
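The percentages in the flagstat output can be re-derived from the raw counts, which is a quick sanity check on how the numbers relate. In the sketch below the mapped percentage is taken over all records, while the properly-paired and singleton percentages are taken over the reads paired in sequencing. The helper name pct is hypothetical.

```shell
#!/bin/bash
# Percentage of n over d, two decimals.
pct() { awk -v n="$1" -v d="$2" 'BEGIN { printf "%.2f\n", 100 * n / d }'; }

total=2467570854; secondary=10585005; supplementary=11607447
mapped=2364732708; paired=2445378402
proper=2254822518; singletons=29206692

# Total records minus secondary and supplementary alignments gives the
# primary reads, which should equal "paired in sequencing".
echo "primary reads:   $((total - secondary - supplementary))"
echo "mapped %:        $(pct "$mapped" "$total")"
echo "properly paired: $(pct "$proper" "$paired")"
echo "singletons %:    $(pct "$singletons" "$paired")"
```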
-rw-rw-r-- 1 ubuntu ubuntu 200710884147 May 2 11:47 SRR622459_1.bam
-rw-rw-r-- 1 ubuntu ubuntu 192007490886 May 2 13:42 SRR622459_1.sorted.bam
-rw-rw-r-- 1 ubuntu ubuntu 9775584 May 2 14:05 SRR622459_1.sorted.bam.bai
-rw-rw-r-- 1 ubuntu ubuntu 4257061759 May 2 15:43 SRR622459_1.sorted.chr20.bam
[bam_sort_core] merging from 192 files and 96 in-memory blocks...
-rw-rw-r-- 1 ubuntu ubuntu 192007490886 May 2 13:42 SRR622459_1.sorted.bam
-rw-rw-r-- 1 ubuntu ubuntu 9775584 May 2 14:05 SRR622459_1.sorted.bam.bai
## HISTOGRAM java.lang.String
Error Type Count
ERROR:INVALID_TAG_NM 245,444
[Sat May 04 19:07:58 UTC 2019] picard.sam.ValidateSamFile done. Elapsed time: 331.75 minutes.
…………..
Tool returned: 0
No errors found
……………..
[Sun May 05 05:03:29 UTC 2019] picard.sam.markduplicates.MarkDuplicates done. Elapsed time: 680.30 minutes.
Runtime.totalMemory()=3481796608
Tool returned: 0
Using GATK jar /home/b0d2647/miniconda3/share/gatk4-4.1.2.0-0/gatk-package-4.1.2.0-local.jar
Running: java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/b0d2647/miniconda3/share/gatk4-4.1.2.0-0/gatk-package-4.1.2.0-local.jar MarkDuplicates -I SRR622459_1.sorted.NmMdTqTgs.bam -O SRR622459_1.sorted.NmMdTqTgs.mdup.bam -M SRR622459_1.sorted.NmMdTqTgs.dupMetrics.txt
LIBRARY UNPAIRED_READS_EXAMINED READ_PAIRS_EXAMINED SECONDARY_OR_SUPPLEMENTARY_RDS UNMAPPED_READS UNPAIRED_READ_DUPLICATES READ_PAIR_DUPLICATES READ_PAIR_OPTICAL_DUPLICATES PERCENT_DUPLICATION ESTIMATED_LIBRARY_SIZE
Q 29206692 1156666782 22192452 102838146 14050105 58665347 0 0.056085 11013720153
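PERCENT_DUPLICATION in the metrics row can be reproduced from the other columns. Picard defines it as duplicate reads over examined reads, with each pair counted as two reads. The sketch below recomputes it from the values above; the function name percent_duplication is hypothetical.

```shell
#!/bin/bash
# args: UNPAIRED_READS_EXAMINED READ_PAIRS_EXAMINED
#       UNPAIRED_READ_DUPLICATES READ_PAIR_DUPLICATES
percent_duplication() {
  awk -v ue="$1" -v pe="$2" -v ud="$3" -v pd="$4" 'BEGIN {
    printf "%.6f\n", (ud + 2 * pd) / (ue + 2 * pe)
  }'
}

percent_duplication 29206692 1156666782 14050105 58665347
```

The result agrees with the reported value of 0.056085, i.e. about 5.6% of the examined reads were marked as duplicates.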
-rw-rw-r-- 1 b0d2647 b0d2647 691071 May 5 05:03 59_1.fxNm.log
-rw-rw-r-- 1 b0d2647 b0d2647 260704633123 May 4 10:23 SRR622459_1.sorted.NmMdTqTgs.bam
-rw-rw-r-- 1 b0d2647 b0d2647 9786952 May 4 18:05 SRR622459_1.sorted.NmMdTqTgs.bam.bai
-rw-rw-r-- 1 b0d2647 b0d2647 5565 May 5 05:03 SRR622459_1.sorted.NmMdTqTgs.dupMetrics.txt
-rw-rw-r-- 1 b0d2647 b0d2647 263473284570 May 5 05:03 SRR622459_1.sorted.NmMdTqTgs.mdup.bam
-rw-rw-r-- 1 b0d2647 b0d2647 9817616 May 5 09:59 SRR622459_1.sorted.NmMdTqTgs.mdup.bam.bai
-rw-rw-r-- 1 b0d2647 b0d2647 16 May 4 19:07 summary-SRR622459_1.sorted.NmMdTqTgs
The duplicate-marked BAM file was not base recalibrated with gatk BaseRecalibrator and gatk ApplyBQSR before haplotype calling (see below), because recalibration is very time-consuming.
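For reference, the skipped recalibration would consist of two commands: one to build the recalibration table and one to apply it. The sketch below only prints those commands instead of running them. The function name bqsr_commands, the output file names and the known-sites VCF are all hypothetical; no known-sites resource was downloaded in these notes.

```shell
#!/bin/bash
# Print (not run) the two BQSR commands for a given BAM.
# args: reference.fasta input.bam known_sites.vcf
bqsr_commands() {
  local ref=$1 in_bam=$2 known_sites=$3
  local table=${in_bam%.bam}.recal.table   # hypothetical output name
  local out_bam=${in_bam%.bam}.bqsr.bam    # hypothetical output name
  printf 'gatk BaseRecalibrator -R %s -I %s --known-sites %s -O %s\n' \
    "$ref" "$in_bam" "$known_sites" "$table"
  printf 'gatk ApplyBQSR -R %s -I %s --bqsr-recal-file %s -O %s\n' \
    "$ref" "$in_bam" "$table" "$out_bam"
}

bqsr_commands Homo_sapiens_assembly38.fasta \
  SRR622459_1.sorted.NmMdTqTgs.mdup.bam known_sites.vcf.gz
```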
#!/bin/bash
gatk HaplotypeCaller -R Homo_sapiens_assembly38.fasta -I SRR622459_1.sorted.NmMdTqTgs.mdup.bam -L chr2:165989160-166149216 -O SRR622459_1.sorted.NmMdTqTgs.mdup.scn1a.vcf &> hcall.59_1.scn1a.log &
gatk HaplotypeCaller -R Homo_sapiens_assembly38.fasta -I SRR622459_1.sorted.NmMdTqTgs.mdup.bam -L chr2:165208056-165392310 -O SRR622459_1.sorted.NmMdTqTgs.mdup.scn2a.vcf &> hcall.59_1.scn2a.log &
gatk HaplotypeCaller -R Homo_sapiens_assembly38.fasta -I SRR622459_1.sorted.NmMdTqTgs.mdup.bam -L chr2:165087520-165204295 -O SRR622459_1.sorted.NmMdTqTgs.mdup.scn3a.vcf &> hcall.59_1.scn3a.log &
gatk HaplotypeCaller -R Homo_sapiens_assembly38.fasta -I SRR622459_1.sorted.NmMdTqTgs.mdup.bam -L chr2:166403573-166494264 -O SRR622459_1.sorted.NmMdTqTgs.mdup.scn7a.vcf &> hcall.59_1.scn7a.log &
gatk HaplotypeCaller -R Homo_sapiens_assembly38.fasta -I SRR622459_1.sorted.NmMdTqTgs.mdup.bam -L chr2:166195185-166375987 -O SRR622459_1.sorted.NmMdTqTgs.mdup.scn9a.vcf &> hcall.59_1.scn9a.log &
gatk HaplotypeCaller -R Homo_sapiens_assembly38.fasta -I SRR622459_1.sorted.NmMdTqTgs.mdup.bam -L chr3:38548061-38649673 -O SRR622459_1.sorted.NmMdTqTgs.mdup.scn5a.vcf &> hcall.59_1.scn5a.log &
gatk HaplotypeCaller -R Homo_sapiens_assembly38.fasta -I SRR622459_1.sorted.NmMdTqTgs.mdup.bam -L chr3:38697110-38794010 -O SRR622459_1.sorted.NmMdTqTgs.mdup.scn10a.vcf &> hcall.59_1.scn10a.log &
gatk HaplotypeCaller -R Homo_sapiens_assembly38.fasta -I SRR622459_1.sorted.NmMdTqTgs.mdup.bam -L chr3:38845764-39051945 -O SRR622459_1.sorted.NmMdTqTgs.mdup.scn11a.vcf &> hcall.59_1.scn11a.log &
gatk HaplotypeCaller -R Homo_sapiens_assembly38.fasta -I SRR622459_1.sorted.NmMdTqTgs.mdup.bam -L chr11:118162804-118176622 -O SRR622459_1.sorted.NmMdTqTgs.mdup.scn2b.vcf &> hcall.59_1.scn2b.log &
gatk HaplotypeCaller -R Homo_sapiens_assembly38.fasta -I SRR622459_1.sorted.NmMdTqTgs.mdup.bam -L chr11:123629187-123654607 -O SRR622459_1.sorted.NmMdTqTgs.mdup.scn3b.vcf &> hcall.59_1.scn3b.log &
gatk HaplotypeCaller -R Homo_sapiens_assembly38.fasta -I SRR622459_1.sorted.NmMdTqTgs.mdup.bam -L chr11:118133377-118152915 -O SRR622459_1.sorted.NmMdTqTgs.mdup.scn4b.vcf &> hcall.59_1.scn4b.log &
gatk HaplotypeCaller -R Homo_sapiens_assembly38.fasta -I SRR622459_1.sorted.NmMdTqTgs.mdup.bam -L chr12:51589958-51812864 -O SRR622459_1.sorted.NmMdTqTgs.mdup.scn8a.vcf &> hcall.59_1.scn8a.log &
gatk HaplotypeCaller -R Homo_sapiens_assembly38.fasta -I SRR622459_1.sorted.NmMdTqTgs.mdup.bam -L chr17:63938554-63972918 -O SRR622459_1.sorted.NmMdTqTgs.mdup.scn4a.vcf &> hcall.59_1.scn4a.log &
gatk HaplotypeCaller -R Homo_sapiens_assembly38.fasta -I SRR622459_1.sorted.NmMdTqTgs.mdup.bam -L chr19:35030688-35040449 -O SRR622459_1.sorted.NmMdTqTgs.mdup.scn1b.vcf &> hcall.59_1.scn1b.log &
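The fourteen near-identical lines above differ only in the gene name and the interval, so the same script could be generated from a two-column table. The sketch below prints one command per gene rather than running it. The function name hc_command is hypothetical, and only two of the fourteen genes are listed to keep the example short.

```shell
#!/bin/bash
ref=Homo_sapiens_assembly38.fasta
bam=SRR622459_1.sorted.NmMdTqTgs.mdup.bam

# Print the HaplotypeCaller command for one gene and its interval.
hc_command() {
  local gene=$1 interval=$2
  printf 'gatk HaplotypeCaller -R %s -I %s -L %s -O %s\n' \
    "$ref" "$bam" "$interval" "${bam%.bam}.${gene}.vcf"
}

# One "gene interval" pair per line.
while read -r gene interval; do
  hc_command "$gene" "$interval"
done <<'EOF'
scn1a chr2:165989160-166149216
scn1b chr19:35030688-35040449
EOF
```

To actually launch the jobs, each printed command could be run in the background with its own log redirect, as in the script above. A final wait after the loop would keep the script alive until every job has finished.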
-rw-rw-r-- 1 b0d2647 b0d2647 204909 May 5 14:47 SRR622459_1.sorted.NmMdTqTgs.mdup.scn10a.vcf
-rw-rw-r-- 1 b0d2647 b0d2647 123814 May 5 14:47 SRR622459_1.sorted.NmMdTqTgs.mdup.scn10a.vcf.idx
-rw-rw-r-- 1 b0d2647 b0d2647 242692 May 5 14:49 SRR622459_1.sorted.NmMdTqTgs.mdup.scn11a.vcf
-rw-rw-r-- 1 b0d2647 b0d2647 123875 May 5 14:49 SRR622459_1.sorted.NmMdTqTgs.mdup.scn11a.vcf.idx
-rw-rw-r-- 1 b0d2647 b0d2647 259643 May 5 11:54 SRR622459_1.sorted.NmMdTqTgs.mdup.scn1a.vcf
-rw-rw-r-- 1 b0d2647 b0d2647 197187 May 5 11:54 SRR622459_1.sorted.NmMdTqTgs.mdup.scn1a.vcf.idx
-rw-rw-r-- 1 b0d2647 b0d2647 177291 May 5 14:54 SRR622459_1.sorted.NmMdTqTgs.mdup.scn1b.vcf
-rw-rw-r-- 1 b0d2647 b0d2647 114343 May 5 14:54 SRR622459_1.sorted.NmMdTqTgs.mdup.scn1b.vcf.idx
-rw-rw-r-- 1 b0d2647 b0d2647 236893 May 5 11:57 SRR622459_1.sorted.NmMdTqTgs.mdup.scn2a.vcf
-rw-rw-r-- 1 b0d2647 b0d2647 155459 May 5 11:57 SRR622459_1.sorted.NmMdTqTgs.mdup.scn2a.vcf.idx
-rw-rw-r-- 1 b0d2647 b0d2647 177127 May 5 14:50 SRR622459_1.sorted.NmMdTqTgs.mdup.scn2b.vcf
-rw-rw-r-- 1 b0d2647 b0d2647 115006 May 5 14:50 SRR622459_1.sorted.NmMdTqTgs.mdup.scn2b.vcf.idx
-rw-rw-r-- 1 b0d2647 b0d2647 178557 May 5 11:58 SRR622459_1.sorted.NmMdTqTgs.mdup.scn3a.vcf
-rw-rw-r-- 1 b0d2647 b0d2647 115402 May 5 11:58 SRR622459_1.sorted.NmMdTqTgs.mdup.scn3a.vcf.idx
-rw-rw-r-- 1 b0d2647 b0d2647 189976 May 5 14:52 SRR622459_1.sorted.NmMdTqTgs.mdup.scn3b.vcf
-rw-rw-r-- 1 b0d2647 b0d2647 115073 May 5 14:52 SRR622459_1.sorted.NmMdTqTgs.mdup.scn3b.vcf.idx
-rw-rw-r-- 1 b0d2647 b0d2647 182634 May 5 14:58 SRR622459_1.sorted.NmMdTqTgs.mdup.scn4a.vcf
-rw-rw-r-- 1 b0d2647 b0d2647 114612 May 5 14:58 SRR622459_1.sorted.NmMdTqTgs.mdup.scn4a.vcf.idx
-rw-rw-r-- 1 b0d2647 b0d2647 183280 May 5 14:53 SRR622459_1.sorted.NmMdTqTgs.mdup.scn4b.vcf
-rw-rw-r-- 1 b0d2647 b0d2647 114991 May 5 14:53 SRR622459_1.sorted.NmMdTqTgs.mdup.scn4b.vcf.idx
-rw-rw-r-- 1 b0d2647 b0d2647 212612 May 5 14:44 SRR622459_1.sorted.NmMdTqTgs.mdup.scn5a.vcf
-rw-rw-r-- 1 b0d2647 b0d2647 123771 May 5 14:44 SRR622459_1.sorted.NmMdTqTgs.mdup.scn5a.vcf.idx
-rw-rw-r-- 1 b0d2647 b0d2647 198907 May 5 11:59 SRR622459_1.sorted.NmMdTqTgs.mdup.scn7a.vcf
-rw-rw-r-- 1 b0d2647 b0d2647 134920 May 5 11:59 SRR622459_1.sorted.NmMdTqTgs.mdup.scn7a.vcf.idx
-rw-rw-r-- 1 b0d2647 b0d2647 226542 May 5 14:56 SRR622459_1.sorted.NmMdTqTgs.mdup.scn8a.vcf
-rw-rw-r-- 1 b0d2647 b0d2647 120588 May 5 14:56 SRR622459_1.sorted.NmMdTqTgs.mdup.scn8a.vcf.idx
-rw-rw-r-- 1 b0d2647 b0d2647 250724 May 5 12:00 SRR622459_1.sorted.NmMdTqTgs.mdup.scn9a.vcf
-rw-rw-r-- 1 b0d2647 b0d2647 197300 May 5 12:00 SRR622459_1.sorted.NmMdTqTgs.mdup.scn9a.vcf.idx
# This file was produced by bcftools stats (1.9+htslib-1.9) and can be plotted using plot-vcfstats.
# The command line was: bcftools stats SRR622459_1.sorted.NmMdTqTgs.mdup.scn1a.vcf
#
# Definition of sets:
# ID [2]id [3]tab-separated file names
ID 0 SRR622459_1.sorted.NmMdTqTgs.mdup.scn1a.vcf
# SN, Summary numbers:
# number of records .. number of data rows in the VCF
# number of no-ALTs .. reference-only sites, ALT is either "." or identical to REF
# number of SNPs .. number of rows with a SNP
# number of MNPs .. number of rows with a MNP, such as CC>TT
# number of indels .. number of rows with an indel
# number of others .. number of rows with other type, for example a symbolic allele or
# a complex substitution, such as ACT>TCGA
# number of multiallelic sites .. number of rows with multiple alternate alleles
# number of multiallelic SNP sites .. number of rows with multiple alternate alleles, all SNPs
#
# Note that rows containing multiple types will be counted multiple times, in each
# counter. For example, a row with a SNP and an indel increments both the SNP and
# the indel counter.
#
# SN [2]id [3]key [4]value
SN 0 number of samples: 1
SN 0 number of records: 424
SN 0 number of no-ALTs: 0
SN 0 number of SNPs: 351
SN 0 number of MNPs: 0
SN 0 number of indels: 73
SN 0 number of others: 0
SN 0 number of multiallelic sites: 6
SN 0 number of multiallelic SNP sites: 3
# This file was produced by bcftools stats (1.9+htslib-1.9) and can be plotted using plot-vcfstats.
# The command line was: bcftools stats SRR622459_1.sorted.NmMdTqTgs.mdup.scn2a.vcf
#
# Definition of sets:
# ID [2]id [3]tab-separated file names
ID 0 SRR622459_1.sorted.NmMdTqTgs.mdup.scn2a.vcf
# SN, Summary numbers:
# number of records .. number of data rows in the VCF
# number of no-ALTs .. reference-only sites, ALT is either "." or identical to REF
# number of SNPs .. number of rows with a SNP
# number of MNPs .. number of rows with a MNP, such as CC>TT
# number of indels .. number of rows with an indel
# number of others .. number of rows with other type, for example a symbolic allele or
# a complex substitution, such as ACT>TCGA
# number of multiallelic sites .. number of rows with multiple alternate alleles
# number of multiallelic SNP sites .. number of rows with multiple alternate alleles, all SNPs
#
# Note that rows containing multiple types will be counted multiple times, in each
# counter. For example, a row with a SNP and an indel increments both the SNP and
# the indel counter.
#
# SN [2]id [3]key [4]value
SN 0 number of samples: 1
SN 0 number of records: 276
SN 0 number of no-ALTs: 0
SN 0 number of SNPs: 228
SN 0 number of MNPs: 0
SN 0 number of indels: 48
SN 0 number of others: 0
SN 0 number of multiallelic sites: 2
SN 0 number of multiallelic SNP sites: 0
# This file was produced by bcftools stats (1.9+htslib-1.9) and can be plotted using plot-vcfstats.
# The command line was: bcftools stats SRR622459_1.sorted.NmMdTqTgs.mdup.scn3a.vcf
#
# Definition of sets:
# ID [2]id [3]tab-separated file names
ID 0 SRR622459_1.sorted.NmMdTqTgs.mdup.scn3a.vcf
# SN, Summary numbers:
# number of records .. number of data rows in the VCF
# number of no-ALTs .. reference-only sites, ALT is either "." or identical to REF
# number of SNPs .. number of rows with a SNP
# number of MNPs .. number of rows with a MNP, such as CC>TT
# number of indels .. number of rows with an indel
# number of others .. number of rows with other type, for example a symbolic allele or
# a complex substitution, such as ACT>TCGA
# number of multiallelic sites .. number of rows with multiple alternate alleles
# number of multiallelic SNP sites .. number of rows with multiple alternate alleles, all SNPs
#
# Note that rows containing multiple types will be counted multiple times, in each
# counter. For example, a row with a SNP and an indel increments both the SNP and
# the indel counter.
#
# SN [2]id [3]key [4]value
SN 0 number of samples: 1
SN 0 number of records: 16
SN 0 number of no-ALTs: 0
SN 0 number of SNPs: 10
SN 0 number of MNPs: 0
SN 0 number of indels: 6
SN 0 number of others: 0
SN 0 number of multiallelic sites: 1
SN 0 number of multiallelic SNP sites: 0
# This file was produced by bcftools stats (1.9+htslib-1.9) and can be plotted using plot-vcfstats.
# The command line was: bcftools stats SRR622459_1.sorted.NmMdTqTgs.mdup.scn4a.vcf
#
# Definition of sets:
# ID [2]id [3]tab-separated file names
ID 0 SRR622459_1.sorted.NmMdTqTgs.mdup.scn4a.vcf
# SN, Summary numbers:
# number of records .. number of data rows in the VCF
# number of no-ALTs .. reference-only sites, ALT is either "." or identical to REF
# number of SNPs .. number of rows with a SNP
# number of MNPs .. number of rows with a MNP, such as CC>TT
# number of indels .. number of rows with an indel
# number of others .. number of rows with other type, for example a symbolic allele or
# a complex substitution, such as ACT>TCGA
# number of multiallelic sites .. number of rows with multiple alternate alleles
# number of multiallelic SNP sites .. number of rows with multiple alternate alleles, all SNPs
#
# Note that rows containing multiple types will be counted multiple times, in each
# counter. For example, a row with a SNP and an indel increments both the SNP and
# the indel counter.
#
# SN [2]id [3]key [4]value
SN 0 number of samples: 1
SN 0 number of records: 38
SN 0 number of no-ALTs: 0
SN 0 number of SNPs: 28
SN 0 number of MNPs: 0
SN 0 number of indels: 10
SN 0 number of others: 0
SN 0 number of multiallelic sites: 0
SN 0 number of multiallelic SNP sites: 0
# This file was produced by bcftools stats (1.9+htslib-1.9) and can be plotted using plot-vcfstats.
# The command line was: bcftools stats SRR622459_1.sorted.NmMdTqTgs.mdup.scn5a.vcf
#
# Definition of sets:
# ID [2]id [3]tab-separated file names
ID 0 SRR622459_1.sorted.NmMdTqTgs.mdup.scn5a.vcf
# SN, Summary numbers:
# number of records .. number of data rows in the VCF
# number of no-ALTs .. reference-only sites, ALT is either "." or identical to REF
# number of SNPs .. number of rows with a SNP
# number of MNPs .. number of rows with a MNP, such as CC>TT
# number of indels .. number of rows with an indel
# number of others .. number of rows with other type, for example a symbolic allele or
# a complex substitution, such as ACT>TCGA
# number of multiallelic sites .. number of rows with multiple alternate alleles
# number of multiallelic SNP sites .. number of rows with multiple alternate alleles, all SNPs
#
# Note that rows containing multiple types will be counted multiple times, in each
# counter. For example, a row with a SNP and an indel increments both the SNP and
# the indel counter.
#
# SN [2]id [3]key [4]value
SN 0 number of samples: 1
SN 0 number of records: 183
SN 0 number of no-ALTs: 0
SN 0 number of SNPs: 156
SN 0 number of MNPs: 0
SN 0 number of indels: 27
SN 0 number of others: 0
SN 0 number of multiallelic sites: 4
SN 0 number of multiallelic SNP sites: 0
# This file was produced by bcftools stats (1.9+htslib-1.9) and can be plotted using plot-vcfstats.
# The command line was: bcftools stats SRR622459_1.sorted.NmMdTqTgs.mdup.scn7a.vcf
#
# Definition of sets:
# ID [2]id [3]tab-separated file names
ID 0 SRR622459_1.sorted.NmMdTqTgs.mdup.scn7a.vcf
# SN, Summary numbers:
# number of records .. number of data rows in the VCF
# number of no-ALTs .. reference-only sites, ALT is either "." or identical to REF
# number of SNPs .. number of rows with a SNP
# number of MNPs .. number of rows with a MNP, such as CC>TT
# number of indels .. number of rows with an indel
# number of others .. number of rows with other type, for example a symbolic allele or
# a complex substitution, such as ACT>TCGA
# number of multiallelic sites .. number of rows with multiple alternate alleles
# number of multiallelic SNP sites .. number of rows with multiple alternate alleles, all SNPs
#
# Note that rows containing multiple types will be counted multiple times, in each
# counter. For example, a row with a SNP and an indel increments both the SNP and
# the indel counter.
#
# SN [2]id [3]key [4]value
SN 0 number of samples: 1
SN 0 number of records: 105
SN 0 number of no-ALTs: 0
SN 0 number of SNPs: 90
SN 0 number of MNPs: 0
SN 0 number of indels: 15
SN 0 number of others: 0
SN 0 number of multiallelic sites: 1
SN 0 number of multiallelic SNP sites: 0
# This file was produced by bcftools stats (1.9+htslib-1.9) and can be plotted using plot-vcfstats.
# The command line was: bcftools stats SRR622459_1.sorted.NmMdTqTgs.mdup.scn8a.vcf
#
# Definition of sets:
# ID [2]id [3]tab-separated file names
ID 0 SRR622459_1.sorted.NmMdTqTgs.mdup.scn8a.vcf
# SN, Summary numbers:
# number of records .. number of data rows in the VCF
# number of no-ALTs .. reference-only sites, ALT is either "." or identical to REF
# number of SNPs .. number of rows with a SNP
# number of MNPs .. number of rows with a MNP, such as CC>TT
# number of indels .. number of rows with an indel
# number of others .. number of rows with other type, for example a symbolic allele or
# a complex substitution, such as ACT>TCGA
# number of multiallelic sites .. number of rows with multiple alternate alleles
# number of multiallelic SNP sites .. number of rows with multiple alternate alleles, all SNPs
#
# Note that rows containing multiple types will be counted multiple times, in each
# counter. For example, a row with a SNP and an indel increments both the SNP and
# the indel counter.
#
# SN [2]id [3]key [4]value
SN 0 number of samples: 1
SN 0 number of records: 226
SN 0 number of no-ALTs: 0
SN 0 number of SNPs: 183
SN 0 number of MNPs: 0
SN 0 number of indels: 44
SN 0 number of others: 0
SN 0 number of multiallelic sites: 3
SN 0 number of multiallelic SNP sites: 0
# This file was produced by bcftools stats (1.9+htslib-1.9) and can be plotted using plot-vcfstats.
# The command line was: bcftools stats SRR622459_1.sorted.NmMdTqTgs.mdup.scn9a.vcf
#
# Definition of sets:
# ID [2]id [3]tab-separated file names
ID 0 SRR622459_1.sorted.NmMdTqTgs.mdup.scn9a.vcf
# SN, Summary numbers:
# number of records .. number of data rows in the VCF
# number of no-ALTs .. reference-only sites, ALT is either "." or identical to REF
# number of SNPs .. number of rows with a SNP
# number of MNPs .. number of rows with a MNP, such as CC>TT
# number of indels .. number of rows with an indel
# number of others .. number of rows with other type, for example a symbolic allele or
# a complex substitution, such as ACT>TCGA
# number of multiallelic sites .. number of rows with multiple alternate alleles
# number of multiallelic SNP sites .. number of rows with multiple alternate alleles, all SNPs
#
# Note that rows containing multiple types will be counted multiple times, in each
# counter. For example, a row with a SNP and an indel increments both the SNP and
# the indel counter.
#
# SN [2]id [3]key [4]value
SN 0 number of samples: 1
SN 0 number of records: 376
SN 0 number of no-ALTs: 0
SN 0 number of SNPs: 306
SN 0 number of MNPs: 0
SN 0 number of indels: 72
SN 0 number of others: 1
SN 0 number of multiallelic sites: 10
SN 0 number of multiallelic SNP sites: 0
# This file was produced by bcftools stats (1.9+htslib-1.9) and can be plotted using plot-vcfstats.
# The command line was: bcftools stats SRR622459_1.sorted.NmMdTqTgs.mdup.scn10a.vcf
#
# Definition of sets:
# ID [2]id [3]tab-separated file names
ID 0 SRR622459_1.sorted.NmMdTqTgs.mdup.scn10a.vcf
# SN, Summary numbers:
# number of records .. number of data rows in the VCF
# number of no-ALTs .. reference-only sites, ALT is either "." or identical to REF
# number of SNPs .. number of rows with a SNP
# number of MNPs .. number of rows with a MNP, such as CC>TT
# number of indels .. number of rows with an indel
# number of others .. number of rows with other type, for example a symbolic allele or
# a complex substitution, such as ACT>TCGA
# number of multiallelic sites .. number of rows with multiple alternate alleles
# number of multiallelic SNP sites .. number of rows with multiple alternate alleles, all SNPs
#
# Note that rows containing multiple types will be counted multiple times, in each
# counter. For example, a row with a SNP and an indel increments both the SNP and
# the indel counter.
#
# SN [2]id [3]key [4]value
SN 0 number of samples: 1
SN 0 number of records: 137
SN 0 number of no-ALTs: 0
SN 0 number of SNPs: 122
SN 0 number of MNPs: 0
SN 0 number of indels: 15
SN 0 number of others: 0
SN 0 number of multiallelic sites: 3
SN 0 number of multiallelic SNP sites: 1
# This file was produced by bcftools stats (1.9+htslib-1.9) and can be plotted using plot-vcfstats.
# The command line was: bcftools stats SRR622459_1.sorted.NmMdTqTgs.mdup.scn11a.vcf
#
# Definition of sets:
# ID [2]id [3]tab-separated file names
ID 0 SRR622459_1.sorted.NmMdTqTgs.mdup.scn11a.vcf
# SN, Summary numbers:
# number of records .. number of data rows in the VCF
# number of no-ALTs .. reference-only sites, ALT is either "." or identical to REF
# number of SNPs .. number of rows with a SNP
# number of MNPs .. number of rows with a MNP, such as CC>TT
# number of indels .. number of rows with an indel
# number of others .. number of rows with other type, for example a symbolic allele or
# a complex substitution, such as ACT>TCGA
# number of multiallelic sites .. number of rows with multiple alternate alleles
# number of multiallelic SNP sites .. number of rows with multiple alternate alleles, all SNPs
#
# Note that rows containing multiple types will be counted multiple times, in each
# counter. For example, a row with a SNP and an indel increments both the SNP and
# the indel counter.
#
# SN [2]id [3]key [4]value
SN 0 number of samples: 1
SN 0 number of records: 308
SN 0 number of no-ALTs: 0
SN 0 number of SNPs: 258
SN 0 number of MNPs: 0
SN 0 number of indels: 51
SN 0 number of others: 0
SN 0 number of multiallelic sites: 3
SN 0 number of multiallelic SNP sites: 1
# This file was produced by bcftools stats (1.9+htslib-1.9) and can be plotted using plot-vcfstats.
# The command line was: bcftools stats SRR622459_1.sorted.NmMdTqTgs.mdup.scn1b.vcf
#
# Definition of sets:
# ID [2]id [3]tab-separated file names
ID 0 SRR622459_1.sorted.NmMdTqTgs.mdup.scn1b.vcf
# SN, Summary numbers:
# number of records .. number of data rows in the VCF
# number of no-ALTs .. reference-only sites, ALT is either "." or identical to REF
# number of SNPs .. number of rows with a SNP
# number of MNPs .. number of rows with a MNP, such as CC>TT
# number of indels .. number of rows with an indel
# number of others .. number of rows with other type, for example a symbolic allele or
# a complex substitution, such as ACT>TCGA
# number of multiallelic sites .. number of rows with multiple alternate alleles
# number of multiallelic SNP sites .. number of rows with multiple alternate alleles, all SNPs
#
# Note that rows containing multiple types will be counted multiple times, in each
# counter. For example, a row with a SNP and an indel increments both the SNP and
# the indel counter.
#
# SN [2]id [3]key [4]value
SN 0 number of samples: 1
SN 0 number of records: 10
SN 0 number of no-ALTs: 0
SN 0 number of SNPs: 8
SN 0 number of MNPs: 0
SN 0 number of indels: 2
SN 0 number of others: 0
SN 0 number of multiallelic sites: 0
SN 0 number of multiallelic SNP sites: 0
# This file was produced by bcftools stats (1.9+htslib-1.9) and can be plotted using plot-vcfstats.
# The command line was: bcftools stats SRR622459_1.sorted.NmMdTqTgs.mdup.scn2b.vcf
#
# Definition of sets:
# ID [2]id [3]tab-separated file names
ID 0 SRR622459_1.sorted.NmMdTqTgs.mdup.scn2b.vcf
# SN, Summary numbers:
# number of records .. number of data rows in the VCF
# number of no-ALTs .. reference-only sites, ALT is either "." or identical to REF
# number of SNPs .. number of rows with a SNP
# number of MNPs .. number of rows with a MNP, such as CC>TT
# number of indels .. number of rows with an indel
# number of others .. number of rows with other type, for example a symbolic allele or
# a complex substitution, such as ACT>TCGA
# number of multiallelic sites .. number of rows with multiple alternate alleles
# number of multiallelic SNP sites .. number of rows with multiple alternate alleles, all SNPs
#
# Note that rows containing multiple types will be counted multiple times, in each
# counter. For example, a row with a SNP and an indel increments both the SNP and
# the indel counter.
#
# SN [2]id [3]key [4]value
SN 0 number of samples: 1
SN 0 number of records: 9
SN 0 number of no-ALTs: 0
SN 0 number of SNPs: 7
SN 0 number of MNPs: 0
SN 0 number of indels: 2
SN 0 number of others: 0
SN 0 number of multiallelic sites: 0
SN 0 number of multiallelic SNP sites: 0
# This file was produced by bcftools stats (1.9+htslib-1.9) and can be plotted using plot-vcfstats.
# The command line was: bcftools stats SRR622459_1.sorted.NmMdTqTgs.mdup.scn3b.vcf
#
# Definition of sets:
# ID [2]id [3]tab-separated file names
ID 0 SRR622459_1.sorted.NmMdTqTgs.mdup.scn3b.vcf
# SN, Summary numbers:
# number of records .. number of data rows in the VCF
# number of no-ALTs .. reference-only sites, ALT is either "." or identical to REF
# number of SNPs .. number of rows with a SNP
# number of MNPs .. number of rows with a MNP, such as CC>TT
# number of indels .. number of rows with an indel
# number of others .. number of rows with other type, for example a symbolic allele or
# a complex substitution, such as ACT>TCGA
# number of multiallelic sites .. number of rows with multiple alternate alleles
# number of multiallelic SNP sites .. number of rows with multiple alternate alleles, all SNPs
#
# Note that rows containing multiple types will be counted multiple times, in each
# counter. For example, a row with a SNP and an indel increments both the SNP and
# the indel counter.
#
# SN [2]id [3]key [4]value
SN 0 number of samples: 1
SN 0 number of records: 66
SN 0 number of no-ALTs: 0
SN 0 number of SNPs: 58
SN 0 number of MNPs: 0
SN 0 number of indels: 8
SN 0 number of others: 1
SN 0 number of multiallelic sites: 1
SN 0 number of multiallelic SNP sites: 0
# This file was produced by bcftools stats (1.9+htslib-1.9) and can be plotted using plot-vcfstats.
# The command line was: bcftools stats SRR622459_1.sorted.NmMdTqTgs.mdup.scn4b.vcf
#
# Definition of sets:
# ID [2]id [3]tab-separated file names
ID 0 SRR622459_1.sorted.NmMdTqTgs.mdup.scn4b.vcf
# SN, Summary numbers:
# number of records .. number of data rows in the VCF
# number of no-ALTs .. reference-only sites, ALT is either "." or identical to REF
# number of SNPs .. number of rows with a SNP
# number of MNPs .. number of rows with a MNP, such as CC>TT
# number of indels .. number of rows with an indel
# number of others .. number of rows with other type, for example a symbolic allele or
# a complex substitution, such as ACT>TCGA
# number of multiallelic sites .. number of rows with multiple alternate alleles
# number of multiallelic SNP sites .. number of rows with multiple alternate alleles, all SNPs
#
# Note that rows containing multiple types will be counted multiple times, in each
# counter. For example, a row with a SNP and an indel increments both the SNP and
# the indel counter.
#
# SN [2]id [3]key [4]value
SN 0 number of samples: 1
SN 0 number of records: 37
SN 0 number of no-ALTs: 0
SN 0 number of SNPs: 32
SN 0 number of MNPs: 0
SN 0 number of indels: 5
SN 0 number of others: 0
SN 0 number of multiallelic sites: 0
SN 0 number of multiallelic SNP sites: 0
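The fourteen reports above are easier to compare once the key SN rows are collected into one table. The following sketch assumes each bcftools stats report was saved to a file named PREFIX.GENE.stats.txt (a hypothetical naming scheme) and that the files are tab-separated, as bcftools writes them. The function name summarise_stats is also hypothetical.

```shell
#!/bin/bash
# Print one row per stats file: gene, records, SNPs, indels.
summarise_stats() {
  local f gene
  printf 'gene\trecords\tSNPs\tindels\n'
  for f in "$@"; do
    gene=${f%.stats.txt}   # drop the assumed suffix
    gene=${gene##*.}       # keep the last dot-separated field (the gene)
    awk -F'\t' -v gene="$gene" '
      $1 == "SN" && $3 == "number of records:" { rec = $4 }
      $1 == "SN" && $3 == "number of SNPs:"    { snp = $4 }
      $1 == "SN" && $3 == "number of indels:"  { indel = $4 }
      END { printf "%s\t%s\t%s\t%s\n", gene, rec, snp, indel }
    ' "$f"
  done
}
```

Called as summarise_stats *.stats.txt, this would give, for example, a row with 424 records, 351 SNPs and 73 indels for scn1a. Note that SNPs and indels can add up to more than the record count, because a row carrying both types is counted in both.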