The alignment of the 5 samples to the CanFam genome hit a handful of hiccups but was ultimately successful. Variant calling, however, could not be completed, which I think is due in part to how the genome was formatted.
Alignment QC
Using SRR24300331.bam as an example, the samstats and MultiQC files show an average quality of 38.9 and 24,800,473,590 total bases mapped.
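The two figures above can be pulled out of a stats file programmatically. The following is a minimal sketch, assuming the samstats file is `samtools stats` output, where summary lines start with `SN` and are tab-separated. The function name `parse_summary_numbers` and the two example lines are illustrative, written by hand from the values quoted above rather than copied from the real file.

```python
def parse_summary_numbers(stats_text):
    """Collect the SN (summary numbers) lines into a dict of name -> float."""
    summary = {}
    for line in stats_text.splitlines():
        if not line.startswith("SN\t"):
            continue  # skip comments and the other report sections
        fields = line.split("\t")
        name = fields[1].rstrip(":")  # e.g. "bases mapped:" -> "bases mapped"
        summary[name] = float(fields[2])  # anything after the value is ignored
    return summary


# Hand-written example lines using the values reported for SRR24300331.
example = (
    "SN\tbases mapped:\t24800473590\t# trailing comment\n"
    "SN\taverage quality:\t38.9\n"
)
summary = parse_summary_numbers(example)
print(summary["average quality"], int(summary["bases mapped"]))
```

For a real file, the text could be read with `open(path).read()` and passed to the same function.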
| Sample Name | Error rate | M Non-Primary | M Reads Mapped | % Mapped | % Proper Pairs | M Total seqs |
|---|---|---|---|---|---|---|
| SRR24300331 | 0.50% | 0.0 | 171.1 | 97.6% | 95.4% | 175.3 |
| SRR24300332 | 0.51% | 0.0 | 145.4 | 97.6% | 95.3% | 149.0 |
| SRR24300333 | 0.59% | 0.0 | 153.6 | 96.4% | 90.5% | 159.3 |
| SRR24300335 | 0.50% | 0.0 | 171.7 | 97.6% | 95.4% | 176.0 |
| SRR24300349 | 5.10% | 0.0 | 416.2 | 99.4% | 71.1% | 418.9 |
| SRR24300350 | 3.96% | 0.0 | 434.5 | 99.5% | 77.1% | 436.9 |
| SRR24300357 | 0.51% | 0.0 | 244.5 | 96.2% | 93.4% | 254.2 |
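As a quick sanity check on the numbers above, the sketch below recomputes % Mapped from the mapped and total read counts and flags samples that look like outliers. The cutoffs (error rate above 1%, proper pairs below 90%) are my own assumptions for illustration, not values from the pipeline or the paper, and the function names are hypothetical.

```python
# (sample, error rate %, M reads mapped, % mapped, % proper pairs, M total seqs)
ROWS = [
    ("SRR24300331", 0.50, 171.1, 97.6, 95.4, 175.3),
    ("SRR24300332", 0.51, 145.4, 97.6, 95.3, 149.0),
    ("SRR24300333", 0.59, 153.6, 96.4, 90.5, 159.3),
    ("SRR24300335", 0.50, 171.7, 97.6, 95.4, 176.0),
    ("SRR24300349", 5.10, 416.2, 99.4, 71.1, 418.9),
    ("SRR24300350", 3.96, 434.5, 99.5, 77.1, 436.9),
    ("SRR24300357", 0.51, 244.5, 96.2, 93.4, 254.2),
]

MAX_ERROR_RATE = 1.0     # percent; assumed cutoff, for illustration only
MIN_PROPER_PAIRS = 90.0  # percent; assumed cutoff, for illustration only


def recomputed_pct_mapped(rows):
    """Recompute % Mapped as reads mapped / total sequences."""
    return {r[0]: round(100 * r[2] / r[5], 1) for r in rows}


def flag_outliers(rows):
    """Return samples with a high error rate or a low proper-pair rate."""
    return [
        r[0] for r in rows
        if r[1] > MAX_ERROR_RATE or r[4] < MIN_PROPER_PAIRS
    ]


print(recomputed_pct_mapped(ROWS))
print(flag_outliers(ROWS))
```

With these assumed cutoffs, the flagged samples are SRR24300349 and SRR24300350, which have by far the highest error rates and the lowest proper-pair percentages in the table.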
MultiQC of alignments
Alignment Plot
Variant Calling
After creating the freebayes.query.txt file with the genomic region listed in the paper (chr37:18000042-20145745), it became clear to me that something had gone wrong with the way the genome was labeled.
The query generated an empty data frame with 17 column headers and no rows. After experimenting with the parameters, I was able to work out that the problem has something to do with the nomenclature of the genome and potentially the way it is indexed.
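One way to test the nomenclature hypothesis is to compare the chromosome name in the query against the sequence names in the reference's `.fai` index, whose first tab-separated column is the sequence name. The sketch below is a rough illustration under that assumption. The index contents shown are hypothetical (made-up names and lengths), not taken from the actual CanFam files, and the helper names are my own.

```python
def contig_names_from_fai(fai_text):
    """Return sequence names (first tab-separated column) from .fai text."""
    return [line.split("\t")[0] for line in fai_text.splitlines() if line.strip()]


def parse_region(region):
    """Split 'contig:start-end' into (contig, start, end)."""
    contig, _, span = region.rpartition(":")
    start, end = span.split("-")
    return contig, int(start), int(end)


def suggest_contig(contig, names):
    """Return the matching name in the index, trying with/without 'chr'."""
    if contig in names:
        return contig
    bare = contig[3:] if contig.startswith("chr") else contig
    for candidate in (bare, "chr" + bare):
        if candidate in names:
            return candidate
    return None  # e.g. the index uses accession-style names


# Hypothetical index: names without the 'chr' prefix, made-up lengths/offsets.
fai_text = "36\t1000\t4\t60\t61\n37\t2000\t1025\t60\t61\nX\t3000\t3063\t60\t61\n"

contig, start, end = parse_region("chr37:18000042-20145745")
names = contig_names_from_fai(fai_text)
print(contig in names, suggest_contig(contig, names))
```

If the query name is absent from the index, the region matches no reference sequence, which would be consistent with the empty result described above. If `suggest_contig` returns `None`, the index is probably using a different naming scheme altogether, and the names would need to be mapped by hand.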