References, resources, data and initial directory contents

STAR-Fusion: Fast and Accurate Fusion Transcript Detection from RNA-Seq. Brian Haas, Alexander Dobin, Nicolas Stransky, Bo Li, Xiao Yang, Timothy Tickle, Asma Bankapur, Carrie Ganote, Thomas Doak, Natalie Pochet, Jing Sun, Catherine Wu, Thomas Gingeras, Aviv Regev. bioRxiv 120295; doi: https://doi.org/10.1101/120295

AWS instance type: r5.2xlarge (64 GB RAM)

~$ git clone https://github.com/STAR-Fusion/STAR-Fusion-Tutorial.git

Cloning into 'STAR-Fusion-Tutorial'...

remote: Enumerating objects: 3, done.

remote: Counting objects: 100% (3/3), done.

remote: Compressing objects: 100% (3/3), done.

remote: Total 28 (delta 0), reused 2 (delta 0), pack-reused 25

Unpacking objects: 100% (28/28), done.

~$ ls -l

drwxrwxr-x 4 ubuntu ubuntu 4096 Mar 29 00:50 STAR-Fusion-Tutorial

~$ cd STAR-Fusion-Tutorial/

~/STAR-Fusion-Tutorial$ ls -l

-rw-rw-r-- 1 ubuntu ubuntu 1719 Mar 29 00:50 AnnotFilterRule.pm

-rw-rw-r-- 1 ubuntu ubuntu 6586 Mar 29 00:50 CTAT_HumanFusionLib.mini.dat.gz

-rw-rw-r-- 1 ubuntu ubuntu 129 Mar 29 00:50 README.md

drwxrwxr-x 2 ubuntu ubuntu 4096 Mar 29 00:50 STAR-Fusion-Tutorial.wiki

-rwxrwxr-x 1 ubuntu ubuntu 188 Mar 29 00:50 cleanMe.sh

-rw-rw-r-- 1 ubuntu ubuntu 11299925 Mar 29 00:50 minigenome.fa

-rw-rw-r-- 1 ubuntu ubuntu 15636276 Mar 29 00:50 minigenome.gtf

-rw-rw-r-- 1 ubuntu ubuntu 41380839 Mar 29 00:50 rnaseq_1.fastq.gz

-rw-rw-r-- 1 ubuntu ubuntu 45350714 Mar 29 00:50 rnaseq_2.fastq.gz

Obtain resource library

~$ wget https://data.broadinstitute.org/Trinity/CTAT_RESOURCE_LIB/GRCh38_v27_CTAT_lib_Feb092018.plug-n-play.tar.gz

100%[++=====================================================================================>] 28.39G 1.51MB/s in 3h 18m

2019-03-29 01:12:21 (2.37 MB/s) - 'GRCh38_v27_CTAT_lib_Feb092018.plug-n-play.tar.gz' saved [30484302229/30484302229]
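With a download this large, it can be worth confirming that the file on disk has the byte count wget reported (30484302229 in the log above) before unpacking it. The following is a minimal sketch; `check_download_size` is a hypothetical helper name, not part of any CTAT tooling.

```shell
# Hypothetical helper: compare a file's size in bytes with an expected value.
# Prints a confirmation on success; prints to stderr and returns 1 on mismatch.
check_download_size() {
  file=$1
  expected=$2
  # wc -c reports the byte count; strip any padding some platforms add.
  actual=$(wc -c < "$file" | tr -d '[:space:]')
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file is $actual bytes"
  else
    echo "Size mismatch for $file: expected $expected, got $actual" >&2
    return 1
  fi
}

# Example call (not executed here):
# check_download_size GRCh38_v27_CTAT_lib_Feb092018.plug-n-play.tar.gz 30484302229
```

A size check only catches truncated transfers; it does not replace a checksum if one is published alongside the archive.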

~$ tar xvf GRCh38_v27_CTAT_lib_Feb092018.plug-n-play.tar.gz

GRCh38_v27_CTAT_lib_Feb092018/

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/AnnotFilterRule.pm

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/pfam_domains.dbm

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa.fai

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_annot.gtf.mini.sortu

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/blast_pairs.idx

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_annot.gtf.gene_spans

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa.gmap.ok

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa.star.idx/

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa.star.idx/build.ok

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa.star.idx/sjdbList.fromGTF.out.tab

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa.star.idx/exonGeTrInfo.tab

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa.star.idx/SAindex

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa.star.idx/transcriptInfo.tab

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa.star.idx/geneInfo.tab

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa.star.idx/sjdbInfo.txt

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa.star.idx/SA

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa.star.idx/chrStart.txt

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa.star.idx/chrName.txt

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa.star.idx/chrLength.txt

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa.star.idx/genomeParameters.txt

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa.star.idx/sjdbList.out.tab

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa.star.idx/Genome

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa.star.idx/chrNameLength.txt

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa.star.idx/exonInfo.tab

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_annot.pep

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/trans.blast.align_coords.align_coords.dat

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/trans.blast.align_coords.align_coords.dbm

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_annot.gtf

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_annot.cds

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/blast_pairs.idx.prev.1518217631

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/fusion_annot_lib.idx

GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_annot.prot_info.dbm

~$ ls -l GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/

-rw-rw-r-- 1 ubuntu ubuntu 2839 Feb 23 2018 AnnotFilterRule.pm

-rw-rw-r-- 1 ubuntu ubuntu 1400848384 Feb 21 2018 blast_pairs.idx

-rw-rw-r-- 1 ubuntu ubuntu 1400848384 Feb 21 2018 blast_pairs.idx.prev.1518217631

-rw-rw-r-- 1 ubuntu ubuntu 2320236544 Feb 21 2018 fusion_annot_lib.idx

-rw-rw-r-- 1 ubuntu ubuntu 190857216 Feb 23 2018 pfam_domains.dbm

-rw-rw-r-- 1 ubuntu ubuntu 108993409 Feb 21 2018 ref_annot.cds

-rw-rw-r-- 1 ubuntu ubuntu 1150835305 Feb 21 2018 ref_annot.gtf

-rw-rw-r-- 1 ubuntu ubuntu 3933998 Feb 21 2018 ref_annot.gtf.gene_spans

-rw-rw-r-- 1 ubuntu ubuntu 49752053 Feb 21 2018 ref_annot.gtf.mini.sortu

-rw-rw-r-- 1 ubuntu ubuntu 37597442 Feb 21 2018 ref_annot.pep

-rw-rw-r-- 1 ubuntu ubuntu 719781888 Feb 21 2018 ref_annot.prot_info.dbm

-rw-rw-r-- 1 ubuntu ubuntu 3139758082 Feb 21 2018 ref_genome.fa

-rw-rw-r-- 1 ubuntu ubuntu 788 Feb 21 2018 ref_genome.fa.fai

-rw-rw-r-- 1 ubuntu ubuntu 0 Feb 21 2018 ref_genome.fa.gmap.ok

drwxrwxr-x 2 ubuntu ubuntu 4096 Feb 21 2018 ref_genome.fa.star.idx

-rw-rw-r-- 1 ubuntu ubuntu 1552146399 Feb 21 2018 trans.blast.align_coords.align_coords.dat

-rw-rw-r-- 1 ubuntu ubuntu 4034531328 Feb 21 2018 trans.blast.align_coords.align_coords.dbm

~$ ls -l GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa.star.idx/

-rw-rw-r-- 1 ubuntu ubuntu 3195539919 Feb 21 2018 Genome

-rw-rw-r-- 1 ubuntu ubuntu 25110292089 Feb 21 2018 SA

-rw-rw-r-- 1 ubuntu ubuntu 1565873619 Feb 21 2018 SAindex

-rw-rw-r-- 1 ubuntu ubuntu 0 Feb 21 2018 build.ok

-rw-rw-r-- 1 ubuntu ubuntu 238 Feb 21 2018 chrLength.txt

-rw-rw-r-- 1 ubuntu ubuntu 138 Feb 21 2018 chrName.txt

-rw-rw-r-- 1 ubuntu ubuntu 376 Feb 21 2018 chrNameLength.txt

-rw-rw-r-- 1 ubuntu ubuntu 273 Feb 21 2018 chrStart.txt

-rw-rw-r-- 1 ubuntu ubuntu 42635542 Feb 21 2018 exonGeTrInfo.tab

-rw-rw-r-- 1 ubuntu ubuntu 17254967 Feb 21 2018 exonInfo.tab

-rw-rw-r-- 1 ubuntu ubuntu 1061759 Feb 21 2018 geneInfo.tab

-rw-rw-r-- 1 ubuntu ubuntu 1018 Feb 21 2018 genomeParameters.txt

-rw-rw-r-- 1 ubuntu ubuntu 10355017 Feb 21 2018 sjdbInfo.txt

-rw-rw-r-- 1 ubuntu ubuntu 9147473 Feb 21 2018 sjdbList.fromGTF.out.tab

-rw-rw-r-- 1 ubuntu ubuntu 9140743 Feb 21 2018 sjdbList.out.tab

-rw-rw-r-- 1 ubuntu ubuntu 12267969 Feb 21 2018 transcriptInfo.tab
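Before starting a run, a quick check that the unpacked library directory contains the files listed above can catch an incomplete extraction early. This is a minimal sketch: `check_ctat_lib` is a hypothetical helper, and the handful of paths it tests is simply a selection from the listings above, not an authoritative list of what STAR-Fusion requires.

```shell
# Hypothetical helper: report entries from the listings above that are absent
# from a CTAT genome lib directory. Returns 1 if anything is missing.
check_ctat_lib() {
  lib=$1
  missing=0
  # Assumption: this selection is illustrative, taken from the ls output above.
  for rel in \
    ref_genome.fa \
    ref_genome.fa.fai \
    ref_annot.gtf \
    fusion_annot_lib.idx \
    ref_genome.fa.star.idx/SA \
    ref_genome.fa.star.idx/build.ok
  do
    if [ ! -e "$lib/$rel" ]; then
      echo "missing: $lib/$rel" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call (not executed here):
# check_ctat_lib ~/GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir
```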

Predict Fusions Using STAR-Fusion: Running STAR-Fusion starting with FASTQ files (typical)

~/STAR-Fusion-Tutorial$ STAR-Fusion --left_fq rnaseq_1.fastq.gz --right_fq rnaseq_2.fastq.gz --genome_lib_dir ~/GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/

CMD: mkdir -p /home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir

……………………………..

  • Running CMD:

/home/ubuntu/miniconda3/bin/STAR --genomeDir /home/ubuntu/GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir//ref_genome.fa.star.idx --outReadsUnmapped None --chimSegmentMin 12 --chimJunctionOverhangMin 12 --chimOutJunctionFormat 1 --alignSJDBoverhangMin 10 --alignMatesGapMax 100000 --alignIntronMax 100000 --alignSJstitchMismatchNmax 5 -1 5 5 --runThreadN 4 --outSAMstrandField intronMotif --outSAMunmapped Within --outSAMtype BAM Unsorted --readFilesIn /home/ubuntu/STAR-Fusion-Tutorial/rnaseq_1.fastq.gz /home/ubuntu/STAR-Fusion-Tutorial/rnaseq_2.fastq.gz --outSAMattrRGline ID:GRPundef --chimMultimapScoreRange 10 --chimMultimapNmax 10 --chimNonchimScoreDropMin 10 --peOverlapNbasesMin 12 --peOverlapMMp 0.1 --genomeLoad NoSharedMemory --twopassMode Basic --readFilesCommand 'gunzip -c'

Mar 29 16:15:18 ..... started STAR run

Mar 29 16:15:18 ..... loading genome

Mar 29 16:17:48 ..... started 1st pass mapping

Mar 29 16:18:06 ..... finished 1st pass mapping

Mar 29 16:18:07 ..... inserting junctions into the genome indices

Mar 29 16:19:35 ..... started mapping

Mar 29 16:19:58 ..... finished successfully

-sample contains 733290

  • Running CMD:

/home/ubuntu/miniconda3/lib/STAR-Fusion/util/STAR-Fusion.map_chimeric_reads_to_genes --genome_lib_dir /home/ubuntu/GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ -J Chimeric.out.junction > /home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/star-fusion.preliminary/star-fusion.junction_breakpts_to_genes.txt

-building interval tree based on /home/ubuntu/GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir//ref_annot.gtf.mini.sortu

-done building interval tree (0.07 min).

-parsing fusion evidence: Chimeric.out.junction

-mapping reads to genes [20000], rate=1200000.00/min

……………………………..

  • Running CMD:

cp /home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/star-fusion.preliminary/star-fusion.fusion_candidates.preliminary.wSpliceInfo.wAnnot.pass.minFFPM.0.1.pass star-fusion.fusion_predictions.tsv

  • Running CMD:

/home/ubuntu/miniconda3/lib/STAR-Fusion/util/column_exclusions.pl star-fusion.fusion_predictions.tsv JunctionReads,SpanningFrags > star-fusion.fusion_predictions.abridged.tsv

STAR-Fusion complete. See output: star-fusion.fusion_candidates.tsv (or .abridged.tsv version)

Additional files/directories/subdirectories and their contents following the above run

~/STAR-Fusion-Tutorial$ ls -l

drwxrwxr-x 6 ubuntu ubuntu 4096 Mar 29 16:20 STAR-Fusion_outdir

~/STAR-Fusion-Tutorial$ ls -l STAR-Fusion_outdir/

-rw-rw-r-- 1 ubuntu ubuntu 132072054 Mar 29 16:19 Aligned.out.bam

-rw-rw-r-- 1 ubuntu ubuntu 2317214 Mar 29 16:19 Chimeric.out.junction

-rw-rw-r-- 1 ubuntu ubuntu 1851 Mar 29 16:19 Log.final.out

-rw-rw-r-- 1 ubuntu ubuntu 25044 Mar 29 16:19 Log.out

-rw-rw-r-- 1 ubuntu ubuntu 447 Mar 29 16:19 Log.progress.out

-rw-rw-r-- 1 ubuntu ubuntu 190840 Mar 29 16:19 SJ.out.tab

drwx------ 2 ubuntu ubuntu 4096 Mar 29 16:18 _STARgenome

drwx------ 2 ubuntu ubuntu 4096 Mar 29 16:18 _STARpass1

drwxrwxr-x 2 ubuntu ubuntu 4096 Mar 29 16:20 _starF_checkpoints

-rw-rw-r-- 1 ubuntu ubuntu 4782 Mar 29 16:20 pipeliner.5985.cmds

-rw-rw-r-- 1 ubuntu ubuntu 8597 Mar 29 16:20 star-fusion.fusion_predictions.abridged.tsv

-rw-rw-r-- 1 ubuntu ubuntu 42353 Mar 29 16:20 star-fusion.fusion_predictions.tsv

drwxrwxr-x 3 ubuntu ubuntu 4096 Mar 29 16:20 star-fusion.preliminary

~/STAR-Fusion-Tutorial$ ls -l STAR-Fusion_outdir/star-fusion.preliminary/

drwxrwxr-x 2 ubuntu ubuntu 4096 Mar 29 16:20 star-fusion.filter.intermediates_dir

-rw-rw-r-- 1 ubuntu ubuntu 58304 Mar 29 16:20 star-fusion.fusion_candidates.preliminary

lrwxrwxrwx 1 ubuntu ubuntu 159 Mar 29 16:20 star-fusion.fusion_candidates.preliminary.filtered -> /home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/star-fusion.preliminary/star-fusion.filter.intermediates_dir/star-fusion.post_blast_and_promiscuity_filter

-rw-rw-r-- 1 ubuntu ubuntu 38764 Mar 29 16:20 star-fusion.fusion_candidates.preliminary.filtered.FFPM

-rw-rw-r-- 1 ubuntu ubuntu 39550 Mar 29 16:20 star-fusion.fusion_candidates.preliminary.wSpliceInfo

-rw-rw-r-- 1 ubuntu ubuntu 42353 Mar 29 16:20 star-fusion.fusion_candidates.preliminary.wSpliceInfo.wAnnot

-rw-rw-r-- 1 ubuntu ubuntu 252 Mar 29 16:20 star-fusion.fusion_candidates.preliminary.wSpliceInfo.wAnnot.annot_filt

-rw-rw-r-- 1 ubuntu ubuntu 42353 Mar 29 16:20 star-fusion.fusion_candidates.preliminary.wSpliceInfo.wAnnot.pass

-rw-rw-r-- 1 ubuntu ubuntu 42353 Mar 29 16:20 star-fusion.fusion_candidates.preliminary.wSpliceInfo.wAnnot.pass.minFFPM.0.1.pass

-rw-rw-r-- 1 ubuntu ubuntu 9002233 Mar 29 16:20 star-fusion.junction_breakpts_to_genes.txt

-rw-rw-r-- 1 ubuntu ubuntu 8032339 Mar 29 16:20 star-fusion.junction_breakpts_to_genes.txt.fail

-rw-rw-r-- 1 ubuntu ubuntu 1401934 Mar 29 16:20 star-fusion.junction_breakpts_to_genes.txt.pass

-rw-rw-r-- 1 ubuntu ubuntu 51475 Mar 29 16:20 star-fusion.junction_read_names

-rw-rw-r-- 1 ubuntu ubuntu 62556 Mar 29 16:20 star-fusion.spanning_frag_names

~/STAR-Fusion-Tutorial$ head STAR-Fusion_outdir/star-fusion.fusion_predictions.abridged.tsv

FusionName JunctionReadCount SpanningFragCount SpliceType LeftGene LeftBreakpoint RightGene RightBreakpoint LargeAnchorSupport FFPM LeftBreakDinuc LeftBreakEntropy RightBreakDinuc RightBreakEntropy annots

TATDN1--GSDMB 81 184 ONLY_REF_SPLICE TATDN1^ENSG00000147687.18 chr8:124539025:- GSDMB^ENSG00000073605.18 chr17:39909924:- YES_LDAS 361.385 GT 1.9219 AG 1.5628 ["CCLE","Klijn_CellLines","FA_CancerSupp","ChimerPub","INTERCHROMOSOMAL[chr8--chr17]"]

THRA--THRA1/BTR 63 93 ONLY_REF_SPLICE THRA^ENSG00000126351.12 chr17:40086853:+ THRA1/BTR^ENSG00000235300.4 chr17:48294347:+ YES_LDAS 212.7399 GT 1.8892 AG 1.9656 ["INTRACHROMOSOMAL[chr17:8.20Mb]"]

TATDN1--GSDMB 32 182 ONLY_REF_SPLICE TATDN1^ENSG00000147687.18 chr8:124539025:- GSDMB^ENSG00000073605.18 chr17:39905985:- YES_LDAS 291.8354 GT 1.9219 AG 1.9086 ["CCLE","Klijn_CellLines","FA_CancerSupp","ChimerPub","INTERCHROMOSOMAL[chr8--chr17]"]

TATDN1--GSDMB 23 182 ONLY_REF_SPLICE TATDN1^ENSG00000147687.18 chr8:124539025:- GSDMB^ENSG00000073605.18 chr17:39906271:- YES_LDAS 279.562 GT 1.9219 AG 1.7819 ["CCLE","Klijn_CellLines","FA_CancerSupp","ChimerPub","INTERCHROMOSOMAL[chr8--chr17]"]

ACACA--STAC2 31 47 ONLY_REF_SPLICE ACACA^ENSG00000278540.4 chr17:37122531:- STAC2^ENSG00000141750.6 chr17:39218173:- YES_LDAS 106.3699 GT 1.9656 AG 1.9656 ["ChimerSeq","CCLE","Klijn_CellLines","FA_CancerSupp","INTRACHROMOSOMAL[chr17:1.80Mb]"]

BCAS4--BCAS3 10 122 ONLY_REF_SPLICE BCAS4^ENSG00000124243.17 chr20:50795173:+ BCAS3^ENSG00000141376.22 chr17:61368327:+ YES_LDAS 180.0107 GT 1.6402 AG 1.9899 ["ChimerPub","ChimerSeq","chimerdb_pubmed","CCLE","FA_CancerSupp","INTERCHROMOSOMAL[chr20--chr17]"]

THRA--THRA1/BTR 17 92 ONLY_REF_SPLICE THRA^ENSG00000126351.12 chr17:40086853:+ THRA1/BTR^ENSG00000235300.4 chr17:48307331:+ YES_LDAS 148.6452 GT 1.8892 AG 1.4295 ["INTRACHROMOSOMAL[chr17:8.20Mb]"]

BCAS4--BCAS3 4 122 ONLY_REF_SPLICE BCAS4^ENSG00000124243.17 chr20:50795173:+ BCAS3^ENSG00000141376.22 chr17:61353588:+ NO_LDAS 171.8284 GT 1.6402 AG 1.3996 ["ChimerPub","ChimerSeq","chimerdb_pubmed","CCLE","FA_CancerSupp","INTERCHROMOSOMAL[chr20--chr17]"]

CCDC6--RET 26 15 ONLY_REF_SPLICE CCDC6^ENSG00000108091.10 chr10:59906122:- RET^ENSG00000165731.18 chr10:43116584:+ YES_LDAS 55.9124 GT 1.8892 AG 1.8323 ["ChimerSeq","ChimerKB","FA_CancerSupp","Cosmic","Mitelman","ChimerPub","HaasMedCancer","Larsson_TCGA","YOSHIHARA_TCGA","INTRACHROMOSOMAL[chr10:16.66Mb]"]
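The FFPM column (fusion fragments per million) in the rows above can be recomputed from JunctionReadCount, SpanningFragCount, and the total of 733290 printed on the "-sample contains" log line. The values shown are consistent with scaling each count to per-million, rounding each to four decimals, and adding the two. That is an observation about these numbers, not a statement of how STAR-Fusion computes the column internally. `ffpm` is a hypothetical helper name.

```shell
# Hypothetical helper: recompute FFPM from junction reads ($1), spanning
# fragments ($2), and the total count from the "-sample contains" line ($3).
ffpm() {
  awk -v j="$1" -v s="$2" -v n="$3" 'BEGIN {
    # Assumption: each component is rounded to 4 decimals before summing,
    # which is what the table values above are consistent with.
    jf = sprintf("%.4f", j * 1000000 / n)
    sf = sprintf("%.4f", s * 1000000 / n)
    printf "%.4f\n", jf + sf
  }'
}

# Example calls (not executed here), using rows from the table above:
# ffpm 81 184 733290    # first TATDN1--GSDMB row
# ffpm 10 122 733290    # first BCAS4--BCAS3 row
```

The helper always prints four decimals, so a table value such as 361.385 appears as 361.3850.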

~/STAR-Fusion-Tutorial$ head STAR-Fusion_outdir/star-fusion.fusion_predictions.abridged.tsv | column -t

FusionName JunctionReadCount SpanningFragCount SpliceType LeftGene LeftBreakpoint RightGene RightBreakpoint LargeAnchorSupport FFPM LeftBreakDinuc LeftBreakEntropy RightBreakDinuc RightBreakEntropy annots

TATDN1--GSDMB 81 184 ONLY_REF_SPLICE TATDN1^ENSG00000147687.18 chr8:124539025:- GSDMB^ENSG00000073605.18 chr17:39909924:- YES_LDAS 361.385 GT 1.9219 AG 1.5628 ["CCLE","Klijn_CellLines","FA_CancerSupp","ChimerPub","INTERCHROMOSOMAL[chr8--chr17]"]

THRA--THRA1/BTR 63 93 ONLY_REF_SPLICE THRA^ENSG00000126351.12 chr17:40086853:+ THRA1/BTR^ENSG00000235300.4 chr17:48294347:+ YES_LDAS 212.7399 GT 1.8892 AG 1.9656 ["INTRACHROMOSOMAL[chr17:8.20Mb]"]

TATDN1--GSDMB 32 182 ONLY_REF_SPLICE TATDN1^ENSG00000147687.18 chr8:124539025:- GSDMB^ENSG00000073605.18 chr17:39905985:- YES_LDAS 291.8354 GT 1.9219 AG 1.9086 ["CCLE","Klijn_CellLines","FA_CancerSupp","ChimerPub","INTERCHROMOSOMAL[chr8--chr17]"]

TATDN1--GSDMB 23 182 ONLY_REF_SPLICE TATDN1^ENSG00000147687.18 chr8:124539025:- GSDMB^ENSG00000073605.18 chr17:39906271:- YES_LDAS 279.562 GT 1.9219 AG 1.7819 ["CCLE","Klijn_CellLines","FA_CancerSupp","ChimerPub","INTERCHROMOSOMAL[chr8--chr17]"]

ACACA--STAC2 31 47 ONLY_REF_SPLICE ACACA^ENSG00000278540.4 chr17:37122531:- STAC2^ENSG00000141750.6 chr17:39218173:- YES_LDAS 106.3699 GT 1.9656 AG 1.9656 ["ChimerSeq","CCLE","Klijn_CellLines","FA_CancerSupp","INTRACHROMOSOMAL[chr17:1.80Mb]"]

BCAS4--BCAS3 10 122 ONLY_REF_SPLICE BCAS4^ENSG00000124243.17 chr20:50795173:+ BCAS3^ENSG00000141376.22 chr17:61368327:+ YES_LDAS 180.0107 GT 1.6402 AG 1.9899 ["ChimerPub","ChimerSeq","chimerdb_pubmed","CCLE","FA_CancerSupp","INTERCHROMOSOMAL[chr20--chr17]"]

THRA--THRA1/BTR 17 92 ONLY_REF_SPLICE THRA^ENSG00000126351.12 chr17:40086853:+ THRA1/BTR^ENSG00000235300.4 chr17:48307331:+ YES_LDAS 148.6452 GT 1.8892 AG 1.4295 ["INTRACHROMOSOMAL[chr17:8.20Mb]"]

BCAS4--BCAS3 4 122 ONLY_REF_SPLICE BCAS4^ENSG00000124243.17 chr20:50795173:+ BCAS3^ENSG00000141376.22 chr17:61353588:+ NO_LDAS 171.8284 GT 1.6402 AG 1.3996 ["ChimerPub","ChimerSeq","chimerdb_pubmed","CCLE","FA_CancerSupp","INTERCHROMOSOMAL[chr20--chr17]"]

CCDC6--RET 26 15 ONLY_REF_SPLICE CCDC6^ENSG00000108091.10 chr10:59906122:- RET^ENSG00000165731.18 chr10:43116584:+ YES_LDAS 55.9124 GT 1.8892 AG 1.8323 ["ChimerSeq","ChimerKB","FA_CancerSupp","Cosmic","Mitelman","ChimerPub","HaasMedCancer","Larsson_TCGA","YOSHIHARA_TCGA","INTRACHROMOSOMAL[chr10:16.66Mb]"]
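Several fusions in the table appear on more than one row because each row is a distinct breakpoint pair. To get one line per fusion, the rows can be collapsed by FusionName. The sketch below assumes the file is tab-separated with a header row whose first column may carry a leading `#`; `fusion_summary` is a hypothetical helper, and "number of rows" and "highest JunctionReadCount" are just one possible choice of summary.

```shell
# Hypothetical helper: for each FusionName, print the number of rows
# (breakpoint pairs) and the highest JunctionReadCount among them.
# Assumption: tab-separated input with a header row.
fusion_summary() {
  awk -F'\t' '
    NR == 1 {
      sub(/^#/, "", $1)                       # tolerate a "#FusionName" header
      for (i = 1; i <= NF; i++) col[$i] = i   # look columns up by name
      next
    }
    {
      name = $(col["FusionName"])
      j = $(col["JunctionReadCount"]) + 0
      rows[name]++
      if (j > best[name]) best[name] = j
    }
    END {
      for (k in rows) printf "%s\t%d\t%d\n", k, rows[k], best[k]
    }
  ' "$1" | sort
}

# Example call (not executed here):
# fusion_summary STAR-Fusion_outdir/star-fusion.fusion_predictions.abridged.tsv
```

Looking columns up by header name rather than by position keeps the helper usable if the column order differs between versions.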

In silico Validation Using FusionInspector

~/STAR-Fusion-Tutorial$ STAR-Fusion --left_fq rnaseq_1.fastq.gz --right_fq rnaseq_2.fastq.gz --genome_lib_dir ~/GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ --FusionInspector validate

-- Skipping CMD:

/home/ubuntu/miniconda3/bin/STAR --genomeDir /home/ubuntu/GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir//ref_genome.fa.star.idx --outReadsUnmapped None --chimSegmentMin 12 --chimJunctionOverhangMin 12 --chimOutJunctionFormat 1 --alignSJDBoverhangMin 10 --alignMatesGapMax 100000 --alignIntronMax 100000 --alignSJstitchMismatchNmax 5 -1 5 5 --runThreadN 4 --outSAMstrandField intronMotif --outSAMunmapped Within --outSAMtype BAM Unsorted --readFilesIn /home/ubuntu/STAR-Fusion-Tutorial/rnaseq_1.fastq.gz /home/ubuntu/STAR-Fusion-Tutorial/rnaseq_2.fastq.gz --outSAMattrRGline ID:GRPundef --chimMultimapScoreRange 10 --chimMultimapNmax 10 --chimNonchimScoreDropMin 10 --peOverlapNbasesMin 12 --peOverlapMMp 0.1 --genomeLoad NoSharedMemory --twopassMode Basic --readFilesCommand 'gunzip -c' , checkpoint [/home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/_starF_checkpoints/run_star_aligner.ok] exists.

-sample contains 733290

………………………………

STAR-Fusion complete. See output: star-fusion.fusion_candidates.tsv (or .abridged.tsv version)

  • Running CMD:

/home/ubuntu/miniconda3/lib/STAR-Fusion/FusionInspector/FusionInspector --fusions star-fusion.fusion_predictions.abridged.tsv --out_prefix finspector --min_junction_reads 1 --min_novel_junction_support 3 --min_spanning_frags_only 5 --prep_for_IGV --max_promiscuity 10 --out_dir /home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate --genome_lib_dir /home/ubuntu/GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ --CPU 4 --left_fq /home/ubuntu/STAR-Fusion-Tutorial/rnaseq_1.fastq.gz --right_fq /home/ubuntu/STAR-Fusion-Tutorial/rnaseq_2.fastq.gz --annotate

…………………………..

CMD:

sort -k1,1 -k2,2n /home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/finspector.bed > /home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/finspector.bed.sorted.bed already processed. Skipping.

Running: bgzip -f /home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/finspector.bed.sorted.bed

Running: touch /home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/chckpts_dir/finspector.bed.bgzip.ok

Running: tabix -p bed /home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/finspector.bed.sorted.bed.gz

Running: touch /home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/chckpts_dir/finspector.bed.tabix.ok

Running: samtools faidx /home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/finspector.fa

Running: touch /home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/chckpts_dir/merged_contig_fai.ok

………………………………….

  • Running CMD: /home/ubuntu/miniconda3/bin/STAR --runThreadN 4 --genomeDir /home/ubuntu/GRCh38_v27_CTAT_lib_Feb092018/ctat_genome_lib_build_dir/ref_genome.fa.star.idx --outSAMtype BAM SortedByCoordinate --twopassMode Basic --alignMatesGapMax 100000 --alignIntronMax 100000 --alignSJDBoverhangMin 10 --genomeSuffixLengthMax 10000 --limitBAMsortRAM 20000000000 --readFilesIn /home/ubuntu/STAR-Fusion-Tutorial/rnaseq_1.fastq.gz /home/ubuntu/STAR-Fusion-Tutorial/rnaseq_2.fastq.gz --genomeFastaFiles /home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/fi_workdir/finspector.fa --outSAMfilter KeepAllAddedReferences --sjdbGTFfile /home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/fi_workdir/finspector.gtf --alignSJstitchMismatchNmax 5 -1 5 5 --readFilesCommand 'gunzip -c'

Mar 29 16:42:44 ..... started STAR run

Mar 29 16:42:44 ..... loading genome

Mar 29 16:44:54 ... generating Suffix Array index

Mar 29 16:49:24 ... completed Suffix Array index

Mar 29 16:49:25 ..... processing annotations GTF

Mar 29 16:49:25 ..... inserting junctions into the genome indices

Mar 29 16:50:53 ..... started 1st pass mapping

Mar 29 16:51:05 ..... finished 1st pass mapping

Mar 29 16:51:06 ..... inserting junctions into the genome indices

Mar 29 16:52:34 ..... started mapping

Mar 29 16:52:47 ..... started sorting BAM

Mar 29 16:52:48 ..... finished successfully

  • Running CMD: mv Aligned.sortedByCoord.out.bam finspector.star.sortedByCoord.out.bam

  • Running CMD: samtools index finspector.star.sortedByCoord.out.bam

Running: touch /home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/chckpts_dir/run_STAR.ok

Running: java -jar /home/ubuntu/miniconda3/lib/STAR-Fusion/FusionInspector/plugins/MarkDuplicates.jar I=/home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/fi_workdir/finspector.star.sortedByCoord.out.bam O=/home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/fi_workdir/finspector.star.cSorted.dupsMarked.bam M=/home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/fi_workdir/finspector.star.cSorted.dupsMarked.bam.stats TMP_DIR=/home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/fi_workdir VALIDATION_STRINGENCY=SILENT

[Fri Mar 29 16:52:49 UTC 2019] picard.sam.MarkDuplicates

INPUT=[/home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/fi_workdir/finspector.star.sortedByCoord.out.bam] OUTPUT=/home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/fi_workdir/finspector.star.cSorted.dupsMarked.bam METRICS_FILE=/home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/fi_workdir/finspector.star.cSorted.dupsMarked.bam.stats TMP_DIR=[/home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/fi_workdir] VALIDATION_STRINGENCY=SILENT PROGRAM_RECORD_ID=MarkDuplicates PROGRAM_GROUP_NAME=MarkDuplicates REMOVE_DUPLICATES=false ASSUME_SORTED=false MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000 MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=8000 SORTING_COLLECTION_SIZE_RATIO=0.25 READ_NAME_REGEX=[a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).* OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false

………………………………………………….

INFO 2019-03-29 16:52:51 MarkDuplicates Read 487046 records. 0 pairs never matched.

INFO 2019-03-29 16:52:51 MarkDuplicates After buildSortedReadEndLists freeMemory: 557182696; totalMemory: 1140850688; maxMemory: 16680747008

INFO 2019-03-29 16:52:51 MarkDuplicates Will retain up to 521273344 duplicate indices before spilling to disk.

INFO 2019-03-29 16:52:53 MarkDuplicates Traversing read pair information and detecting duplicates.

INFO 2019-03-29 16:52:53 MarkDuplicates Traversing fragment information and detecting duplicates.

INFO 2019-03-29 16:52:53 MarkDuplicates Sorting list of duplicate records.

INFO 2019-03-29 16:52:53 MarkDuplicates After generateDuplicateIndexes freeMemory: 2822714136; totalMemory: 7000293376; maxMemory: 16680747008

INFO 2019-03-29 16:52:53 MarkDuplicates Marking 12084 records as duplicates.

INFO 2019-03-29 16:52:53 MarkDuplicates Found 0 optical duplicate clusters.

INFO 2019-03-29 16:52:57 MarkDuplicates Before output close freeMemory: 68119192; totalMemory: 71303168; maxMemory: 16680747008

INFO 2019-03-29 16:52:57 MarkDuplicates After output close freeMemory: 14175400; totalMemory: 16777216; maxMemory: 16680747008

[Fri Mar 29 16:52:57 UTC 2019] picard.sam.MarkDuplicates done. Elapsed time: 0.14 minutes.

Runtime.totalMemory()=16777216

…………………………..

[487000] -retrieving read alignments for 633 spanning frags.

………………………………..

CMD: ln -sf /home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/fi_workdir/finspector.fusion_preds.coalesced.summary.min_frag_thresh.starFfmt.wSpliceInfo.post_blast_filter.post_promisc_filter /home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/finspector.fusion_predictions.post_blast_and_promiscuity_filter

…………………………
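The "Skipping CMD ... checkpoint [...] exists" message and the `touch .../chckpts_dir/*.ok` lines in the log above show why the second invocation did not redo the STAR alignment: each completed step leaves behind an empty `.ok` file, and a step whose file already exists is skipped. The pattern can be sketched in a few lines of shell. This illustrates the general technique only; it is not the pipeline's own code, and `run_with_checkpoint` is a hypothetical name.

```shell
# Hypothetical helper illustrating checkpointed execution:
#   run_with_checkpoint <checkpoint-file> <command> [args...]
# Skips the command if the checkpoint file exists; otherwise runs it and
# creates the checkpoint only when the command succeeds.
run_with_checkpoint() {
  checkpoint=$1
  shift
  if [ -e "$checkpoint" ]; then
    echo "-- Skipping CMD: $*, checkpoint [$checkpoint] exists."
    return 0
  fi
  echo "* Running CMD: $*"
  "$@" && touch "$checkpoint"
}

# Example calls (not executed here; paths are placeholders):
# mkdir -p chckpts_dir
# run_with_checkpoint chckpts_dir/index_bam.ok samtools index my.bam
```

Because the checkpoint is written only after success, a failed step is retried on the next invocation. Deleting a `.ok` file forces that step to run again.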

Additional files/directories/subdirectories and some of their contents following the above run

~/STAR-Fusion-Tutorial$ ls -ltr STAR-Fusion_outdir/

drwxrwxr-x 4 ubuntu ubuntu 4096 Mar 29 16:54 FusionInspector-validate

-rw-rw-r-- 1 ubuntu ubuntu 24626 Mar 29 16:54 FusionInspector.log

-rw-rw-r-- 1 ubuntu ubuntu 0 Mar 29 16:54 _fi_validate_598.ok

~/STAR-Fusion-Tutorial$ ls -ltr STAR-Fusion_outdir/FusionInspector-validate/

-rw-rw-r-- 1 ubuntu ubuntu 2903977 Mar 29 16:40 finspector.gtf

-rw-rw-r-- 1 ubuntu ubuntu 1307933 Mar 29 16:40 finspector.fa

-rw-rw-r-- 1 ubuntu ubuntu 5613 Mar 29 16:40 cytoBand.txt

-rw-rw-r-- 1 ubuntu ubuntu 117434 Mar 29 16:40 finspector.bed

-rw-rw-r-- 1 ubuntu ubuntu 29097 Mar 29 16:42 finspector.bed.sorted.bed.gz

-rw-rw-r-- 1 ubuntu ubuntu 873 Mar 29 16:42 finspector.bed.sorted.bed.gz.tbi

-rw-rw-r-- 1 ubuntu ubuntu 807 Mar 29 16:42 finspector.fa.fai

-rw-rw-r-- 1 ubuntu ubuntu 26255 Mar 29 16:53 finspector.junction_reads.bam

-rw-rw-r-- 1 ubuntu ubuntu 2208 Mar 29 16:53 finspector.junction_reads.bam.bai

-rw-rw-r-- 1 ubuntu ubuntu 34215 Mar 29 16:53 finspector.junction_reads.bam.bed

-rw-rw-r-- 1 ubuntu ubuntu 6230 Mar 29 16:53 finspector.junction_reads.bam.bed.sorted.bed.gz

-rw-rw-r-- 1 ubuntu ubuntu 675 Mar 29 16:53 finspector.junction_reads.bam.bed.sorted.bed.gz.tbi

-rw-rw-r-- 1 ubuntu ubuntu 65864 Mar 29 16:53 finspector.spanning_reads.bam

-rw-rw-r-- 1 ubuntu ubuntu 2832 Mar 29 16:53 finspector.spanning_reads.bam.bai

-rw-rw-r-- 1 ubuntu ubuntu 48476 Mar 29 16:53 finspector.spanning_reads.bam.bed

-rw-rw-r-- 1 ubuntu ubuntu 11417 Mar 29 16:53 finspector.spanning_reads.bam.bed.sorted.bed.gz

-rw-rw-r-- 1 ubuntu ubuntu 714 Mar 29 16:53 finspector.spanning_reads.bam.bed.sorted.bed.gz.tbi

-rw-rw-r-- 1 ubuntu ubuntu 27459384 Mar 29 16:54 finspector.consolidated.cSorted.bam

-rw-rw-r-- 1 ubuntu ubuntu 6248 Mar 29 16:54 finspector.consolidated.cSorted.bam.bai

-rw-rw-r-- 1 ubuntu ubuntu 11775 Mar 29 16:54 finspector.igv.FusionJuncSpan

drwxrwxrwx 4 ubuntu ubuntu 4096 Mar 29 16:54 fi_workdir

lrwxrwxrwx 1 ubuntu ubuntu 205 Mar 29 16:54 finspector.fusion_predictions.post_blast_and_promiscuity_filter -> /home/ubuntu/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate/fi_workdir/finspector.fusion_preds.coalesced.summary.min_frag_thresh.starFfmt.wSpliceInfo.post_blast_filter.post_promisc_filter

-rw-rw-r-- 1 ubuntu ubuntu 154510 Mar 29 16:54 finspector.fusion_predictions.final

-rw-rw-r-- 1 ubuntu ubuntu 5895 Mar 29 16:54 finspector.fusion_predictions.final.abridged

-rw-rw-r-- 1 ubuntu ubuntu 6161 Mar 29 16:54 finspector.fusion_predictions.final.abridged.FFPM

-rw-rw-r-- 1 ubuntu ubuntu 14520 Mar 29 16:54 finspector.fusion_inspector_web.json

-rw-rw-r-- 1 ubuntu ubuntu 8810 Mar 29 16:54 finspector.fusion_predictions.final.abridged.FFPM.annotated

drwxrwxr-x 2 ubuntu ubuntu 4096 Mar 29 16:54 chckpts_dir

~/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate$ head finspector.bed | column -t

ACACA--STAC2 50155 53604 ID=ACACA--STAC2ENST00000619245.1;ACACA--STAC2ACACA^ENSG00000278540.4;ACACA 0 + 50155 53604 0 4 85,216,156,178 0,1085,2115,3271

ACACA--STAC2 43239 48865 ID=ACACA--STAC2ENST00000614450.4;ACACA--STAC2ACACA^ENSG00000278540.4;ACACA 0 + 43239 48865 0 7 44,119,24,144,97,108,55 0,186,1511,2535,3456,4463,5571

ACACA--STAC2 11923 13492 ID=ACACA--STAC2ENST00000614438.1;ACACA--STAC2ACACA^ENSG00000278540.4;ACACA 0 + 11923 13492 0 2 227,342 0,1227

ACACA--STAC2 1000 7569 ID=ACACA--STAC2ENST00000618351.1;ACACA--STAC2ACACA^ENSG00000278540.4;ACACA 0 + 1000 7569 0 2 557,159 0,6410

ACACA--STAC2 71436 73042 ID=ACACA--STAC2ENST00000618053.1;ACACA--STAC2ACACA^ENSG00000278540.4;ACACA 0 + 71436 73042 0 3 46,120,297 0,1046,1309

ACACA--STAC2 1038 17224 ID=ACACA--STAC2ENST00000617548.1;ACACA--STAC2ACACA^ENSG00000278540.4;ACACA 0 + 1038 17224 0 5 519,47,253,111,133 0,9812,10859,14942,16053

ACACA--STAC2 8602 17224 ID=ACACA--STAC2ENST00000614789.4;ACACA--STAC2ACACA^ENSG00000278540.4;ACACA 0 + 8602 17224 0 5 118,47,253,111,133 0,2248,3295,7378,8489

ACACA--STAC2 14492 73105 ID=ACACA--STAC2ENST00000617649.4;ACACA--STAC2ACACA^ENSG00000278540.4;ACACA 0 + 14492 73105 0 55 488,111,133,139,110,82,99,107,111,210,171,162,164,151,104,82,146,151,135,147,189,101,89,125,114,114,90,119,24,144,97,108,57,45,42,216,156,204,156,147,270,98,121,111,144,121,97,97,136,178,113,155,171,137,623 0,1488,2599,3732,4871,5762,6827,7926,9033,10144,11354,12511,13018,14182,15110,16216,16734,17914,19065,19545,20692,21881,22982,24071,25196,26543,27657,28933,30258,31282,32203,33210,34318,34618,35706,36748,37778,38934,40138,41294,42441,43892,44990,46111,47222,47955,49076,50340,51437,52573,53751,54864,55682,56853,57990

ACACA--STAC2 14492 74928 ID=ACACA--STAC2ENST00000612895.4;ACACA--STAC2ACACA^ENSG00000278540.4;ACACA 0 + 14492 74928 0 54 488,133,139,110,82,99,107,111,210,171,162,164,151,104,82,146,151,135,147,189,101,89,125,114,114,90,119,24,144,97,108,57,45,42,216,156,204,156,147,270,98,121,111,144,121,97,97,136,178,113,155,171,137,2446 0,2599,3732,4871,5762,6827,7926,9033,10144,11354,12511,13018,14182,15110,16216,16734,17914,19065,19545,20692,21881,22982,24071,25196,26543,27657,28933,30258,31282,32203,33210,34318,34618,35706,36748,37778,38934,40138,41294,42441,43892,44990,46111,47222,47955,49076,50340,51437,52573,53751,54864,55682,56853,57990

ACACA--STAC2 1146 22501 ID=ACACA--STAC2ENST00000615229.4;ACACA--STAC2ACACA^ENSG00000278540.4;ACACA 0 + 1146 22501 0 10 411,47,253,111,133,139,110,82,99,83 0,9704,10751,14834,15945,17078,18217,19108,20173,21272
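Each record above follows the 12-column BED layout: column 10 is the block (exon) count, column 11 the comma-separated block sizes, and column 12 the block starts relative to column 2. Summing column 11 gives the exonic length of a transcript on the fusion contig. For the first record, the four blocks 85, 216, 156, and 178 add up to 635 bases within a 3449-base span. A minimal sketch follows; `bed12_block_totals` is a hypothetical helper name.

```shell
# Hypothetical helper: for each BED12 record, print contig, start, end, and
# the sum of the block sizes in column 11.
bed12_block_totals() {
  awk '{
    n = split($11, sizes, ",")
    total = 0
    # An empty trailing element (from a trailing comma) simply adds 0.
    for (i = 1; i <= n; i++) total += sizes[i]
    printf "%s\t%d\t%d\t%d\n", $1, $2, $3, total
  }' "$1"
}

# Example call (not executed here):
# bed12_block_totals finspector.bed | head
```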

~/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate$ samtools view finspector.junction_reads.bam | head -10

fustut000014464b 163 ACACA--STAC2 64894 255 36M14465N14M = 79515 14660 CCAGCTGATCCAGCAAGCCTGGATTCTGAAGCCAAGCTCCAGCGATTCAA CC@ACAB>BCCBACBBBBCBB@BBBC@BBBABBCBBBBBBAABA9@AB?B PG:Z:MarkDuplicates NH:i:1 HI:i:1 nM:i:2 AS:i:84

fustut0000185b7b 163 ACACA--STAC2 64894 255 36M14465N14M = 79509 14665 CCAGCTGATCCAGCAAACCTGGATTCTGAAGCCAAGCTCCAGCGATTCAA BBB?B??>@BBB;BBBABB???><@B?AAB?AAAA?B=A@B?B<15:B?? PG:Z:MarkDuplicates NH:i:1 HI:i:1 nM:i:0 AS:i:99

fustut0000078bb4 163 ACACA--STAC2 64896 255 34M14465N16M = 79425 14579 AGCTGATCCAGCAAACCTGGATTCTGAAGCCAAGCTCCAGCGATTCAAGC BCCBCBB@AABBBAB@AAAB@?B@@A@@@@?B??<==54=;>/65;<63/ PG:Z:MarkDuplicates NH:i:1 HI:i:1 nM:i:0 AS:i:99

fustut000011d1a6 83 ACACA--STAC2 64898 255 32M14465N18M = 63619 -15794 CTGATCCAGCAAACCTGGATTCTGAAGCCAAGCTCCAGCGATTCAAGCGC ?67==?@?=@@88?@???@:?<??@A?A?@??@>BBAB??BABBBA@BAB PG:Z:MarkDuplicates NH:i:1 HI:i:1 nM:i:1 AS:i:95

fustut0000143249 163 ACACA--STAC2 64899 255 31M14465N19M = 79551 14702 TGATCCAGCAAACCTGGATTCTGAAGCCAAGCTCCAGCGATTCAAGCGCT CABBCCBBCBCCBCBBACBBBBACAACCCCACABCB@B>>AABBC?B>B? PG:Z:MarkDuplicates NH:i:1 HI:i:1 nM:i:0 AS:i:99

fustut0000102c56 99 ACACA--STAC2 64900 255 30M14465N20M = 79426 14576 GATCCAGCAAACCTGGATTCTGAAGCCAAGCTCCAGCGATTCAAGCGCTC BBBBBBCBBBBBBBBBBBCBBBBABABBBBAB?A?BAB?A?@?@@?A??< PG:Z:MarkDuplicates NH:i:1 HI:i:1 nM:i:0 AS:i:99

fustut0000101d0e 147 ACACA--STAC2 64901 255 29M14465N21M = 62505 -16911 ATCCAGCAAACCTGGATTCTGAAGCCAAGCTCCAGCGATTCAAGCGCTCC B@>A@BBBA:ABBBBBBBBBBBBBBBBBBBABBBCCCBCCCBBBCCBBBB PG:Z:MarkDuplicates NH:i:1 HI:i:1 nM:i:0 AS:i:98

fustut0000110e1d 147 ACACA--STAC2 64901 255 29M14465N21M = 64830 -14586 ATCCAGCAAACCTGGATTCTGAAGCCAAGCTCCAGCGATTCAAGCGCTCC ><@BA?B@@?AB?B@?A?BBABAAABAB?B?BBBBBBBBBBBBBBBBBBB PG:Z:MarkDuplicates NH:i:1 HI:i:1 nM:i:0 AS:i:97

fustut0000001745 147 ACACA--STAC2 64904 255 26M14465N24M = 63572 -15847 CAGCAAACCTGGATTCTGAAGCCAAGCTCCAGCGATTCAAGCGCTCCCTC AA?A<<6?@??AA?@?A?@A??@?A@B?ABABBAABBBAAABAB@BBBAB PG:Z:MarkDuplicates NH:i:1 HI:i:1 nM:i:0 AS:i:99

fustut00001a4553 83 ACACA--STAC2 64904 255 26M14465N22M2S = 63617 -15800 CAGCAAACCTGGATTCTGAAGCCAAGCTCCAGCGATTCAAGCGCTCCCCA =?7?7?6??677>47@<<??=?????@7AB??A=AA?BAB?B@B?BBBCB PG:Z:MarkDuplicates NH:i:1 HI:i:1 nM:i:0 AS:i:97
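Every junction read above carries an `N` operation of length 14465 in its CIGAR string (column 6). In SAM, `N` marks a skipped stretch of the reference, so each of these reads is split across the same gap on the fusion contig. The sketch below walks the CIGAR from the alignment start (column 4) and reports, for each `N`, the last aligned base before the gap and the first aligned base after it. The function name `sam_splice_gaps` is hypothetical, and reading these two positions as the fusion breakpoint is an interpretation, not something the output states.

```shell
# For each SAM record on stdin, print "read_name last_base_before_gap first_base_after_gap"
# for every N (skipped region) operation in the CIGAR string.
# Example use: samtools view finspector.junction_reads.bam | sam_splice_gaps
sam_splice_gaps() {
  awk '{
    pos = $4; cigar = $6
    while (match(cigar, /^[0-9]+[MIDNSHP=X]/)) {
      len = substr(cigar, 1, RLENGTH - 1) + 0
      op  = substr(cigar, RLENGTH, 1)
      if (op == "N") print $1, pos - 1, pos + len
      if (op ~ /[MDN=X]/) pos += len   # only these operations consume the reference
      cigar = substr(cigar, RLENGTH + 1)
    }
  }'
}
```

For the first read (start 64894, CIGAR 36M14465N14M) this prints 64929 and 79395, and the other reads shown resolve to the same pair of positions.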

~/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate$ samtools view finspector.spanning_reads.bam | head -10

fustut0000030301 99 ACACA--STAC2 62533 255 36M1000N14M = 79410 16927 ATCCTCGATGGATGCTAGCAGGCCGTCCTCACCCAACCCAAAAAGGTCAG @B@@@AA@@B@8<B?A?@?<A==@?7=@8A=@==8>6===;38=82CCCB=9ABBB>B PG:Z:MarkDuplicates NH:i:1 HI:i:1 nM:i:0 AS:i:97

fustut0000084fac 163 ACACA--STAC2 63594 255 50M = 79424 15880 CTTTTTTGACTATGGATCTTTCTCAGAGATTATGCAGCCCTGGGCACAGA CBCCCCBCCCCCBCCCCCCBCCCCCCBB@BCCCBCCBBCCBABBCCCCCB PG:Z:MarkDuplicates NH:i:1 HI:i:1 nM:i:1 AS:i:95

fustut00000b744e 99 ACACA--STAC2 63615 255 50M = 79390 15818 CTCAGAGATTATGCAGCCCTGGGCACAGACTGTGGTGGTTGGTAGAGCCA BCA;AACBCCBBCBBBBBBBBBBB?A>B@ABB5B@1@@7><<+15+719/ PG:Z:MarkDuplicates NH:i:1 HI:i:1 nM:i:1 AS:i:88

fustut0000104c44 99 ACACA--STAC2 63621 255 46M4S = 79390 15818 GATTATGCAGCCCTGGGCACAGACTGTGGTGGTTGGTAGAGCCAGGCTAG BABBBBBBBBAAABB?BAB?AB??BB?BA6A@/>7>72:99982:<17;: PG:Z:MarkDuplicates NH:i:1 HI:i:1 nM:i:1 AS:i:90

fustut0000107de9 99 ACACA--STAC2 63637 255 29M1167N21M = 79485 15898 GCACAGACTGTGGTGGTTGGTAGAGCCAGGCTAGGAGGAATAACCGTGGG BCCBB>BCBB=6>;B@(>7?72;91;>5-6=3=(@244);76######## PG:Z:MarkDuplicates NH:i:1 HI:i:1 nM:i:2 AS:i:94

fustut00001a6fb6 99 ACACA--STAC2 63650 255 16M1167N34M = 79412 15812 TGGTTGGTAGAGCCAGGCTAGGAGGAATACCTGTGGGAGTTGTTGCTGTA BAB?CBA:69:=BA3>=C@CBBBB>7=@;B>99= PG:Z:MarkDuplicates NH:i:1 HI:i:1 nM:i:0 AS:i:99

fustut00000232a4 99 ACACA--STAC2 64830 255 3S47M = 79524 14744 AGCCAGGCTAGGAGGAATACCTGTGGGAGTTGTTGCTGTAGAAACCCGAA :@CA=ABC@9@=5:?:2?CCCBB??A>0><@;>A>B>>6:;8?ABB@779 PG:Z:MarkDuplicates NH:i:1 HI:i:1 nM:i:0 AS:i:94

fustut000006da80 99 ACACA--STAC2 64830 255 1S49M = 79526 14746 CCAGGCTAGGAGGAATACCTGTGGGAGTTGTTGCTGTAGAAACCCGAACA CCB@CCB>??AB?AA@BCBCCACCCACB@CBBCC>;>CABCA8@C@>@>A PG:Z:MarkDuplicates NH:i:1 HI:i:1 nM:i:0 AS:i:96
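The second column of each record is the SAM FLAG, a bit field describing the read and its mate; the values seen above are 99, 147, 163 and 83. A small sketch that spells out the paired-end bits, using the bit values from the SAM specification. The function name `decode_sam_flag` and the label strings are illustrative only, and bits above 128 (secondary, QC fail, duplicate, supplementary) are left out.

```shell
# Print the names of the paired-end related bits set in a SAM FLAG value.
# The subshell body "( ... )" keeps the helper variables out of the caller's shell.
decode_sam_flag() (
  flag=$1
  out=""
  for pair in 1:paired 2:proper_pair 4:unmapped 8:mate_unmapped \
              16:reverse 32:mate_reverse 64:first_in_pair 128:second_in_pair; do
    bit=${pair%%:*}
    name=${pair#*:}
    if [ $(( flag & bit )) -ne 0 ]; then
      out="$out $name"
    fi
  done
  echo "${out# }"
)
```

For instance, `decode_sam_flag 99` prints `paired proper_pair mate_reverse first_in_pair`, so the spanning reads shown with flag 99 are forward-strand first mates whose partner aligns in reverse orientation further along the contig.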

~/STAR-Fusion-Tutorial/STAR-Fusion_outdir/FusionInspector-validate$ samtools view finspector.consolidated.cSorted.bam | head -10

fustut00000fbb76 99 ACACA--STAC2 171 3 50M = 226 105 GAGAAAAAAAAAAAAAGGCAAAAAAGCACAATATTGCAGTCTGCAGAATC A:ABBBBB@>B>ABAA=>8@9?A?=9;>?=>7:?<990:7:<::69>>>; PG:Z:MarkDuplicates NH:i:2 HI:i:1 nM:i:0 AS:i:98

fustut0000171425 163 ACACA--STAC2 206 3 50M = 358 202 GCAGTCTGCAGAATCTTCTTTCTCCCCCTTTCACAAGAGACCACATTCTG AC@BABBCBBBBCBBCBBBBBBBBABAABBBBBBCBBBBA@ABB?7?AAB PG:Z:MarkDuplicates NH:i:2 HI:i:1 nM:i:0 AS:i:98

fustut00000fbb76 147 ACACA--STAC2 226 3 50M = 171 -105 TCTCCCCCTTTCACAAGAGACCACATTCTGACACCAACTTCCGTGGAAAC =B=AAAA@6?2?ABBBBBB?BB?BA67AABAB?BBC?B>>BBBBCCBCBB PG:Z:MarkDuplicates NH:i:2 HI:i:1 nM:i:0 AS:i:98

fustut0000017d0a 163 ACACA--STAC2 227 3 50M = 377 200 CTCCCCCTTTCACAAGAGACCACATTCTGACACCAACTTCCGTGGAAACA BCBCBBBCB@CBBBB@@BBCAB6AB@BA?>@BAA:2>A?AAB=A<<9;<9 PG:Z:MarkDuplicates NH:i:2 HI:i:1 nM:i:0 AS:i:98

fustut00001b1af0 99 ACACA--STAC2 251 3 50M = 367 166 TTCTGACACCAACTTCCGTGGAAACACTGGCGCTAGCTCCAAACTAACAA BBBBBBBBBBBB@BBB@B@BABBB@BBBBA@AA@A@@>?=6A@@@?@A@? PG:Z:MarkDuplicates NH:i:2 HI:i:1 nM:i:0 AS:i:98

fustut0000066a67 99 ACACA--STAC2 259 3 50M = 301 92 CCAACTTCCGTGGAAACACTGGCGCTAGCTCCAAACTAACAATCGCCAAA BBBBBBBBBC@CB=ABABBABBBBABABBA@?A=??AAA@AB?BA@BB?A?>A5B=AA;A=A@;?=;A==5>?=;?;9395?5==9>5?94A PG:Z:MarkDuplicates NH:i:2 HI:i:1 nM:i:0 AS:i:98

fustut000012fcc3 99 ACACA--STAC2 325 3 50M = 433 158 GGAGAAGGAAGTGAGGCACTTCGCCGGGCAGGAGTCCACGTTGGGAGGAG AB=A66;?=?A?BB>AAABA?AA89::><-:=B:<>B9AA>=,>>1> PG:Z:MarkDuplicates NH:i:2 HI:i:1 nM:i:0 AS:i:98
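The first records of the consolidated BAM differ from the junction and spanning reads in two columns: MAPQ (column 5) is 3 instead of 255, and the `NH:i:` tag is 2 instead of 1. Assuming these alignments follow STAR's convention, 255 marks a uniquely mapped read and 3 a read that maps to two locations, which matches the NH values. The sketch below tallies the MAPQ and NH combinations in a stream of SAM records; the function name `sam_mapq_nh_counts` is hypothetical.

```shell
# Count SAM records on stdin by (MAPQ, NH tag) and print "MAPQ NH count" lines.
# Records without an NH tag are reported with NH = NA.
# Example use: samtools view finspector.consolidated.cSorted.bam | sam_mapq_nh_counts
sam_mapq_nh_counts() {
  awk '{
    nh = "NA"
    for (i = 12; i <= NF; i++)          # optional tags start at column 12
      if ($i ~ /^NH:i:/) nh = substr($i, 6)
    count[$5 " " nh]++
  }
  END { for (k in count) print k, count[k] }' | sort
}
```

Run over a whole BAM, this gives a quick view of how many reads are uniquely placed and how many are multi-mapped.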

Visualize in IGV

finspector.bed # reference transcript structure annotations in BED format

finspector.consolidated.cSorted.bam # reads aligned to the fusion contigs

finspector.junction_reads.bam # junction / split-reads supporting fusions

finspector.spanning_reads.bam # fusion spanning fragment evidence
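IGV needs an index next to each BAM file before it will load it. A minimal sketch that checks a directory for the four files listed above plus a `.bam.bai` index per BAM and prints whatever is missing. The function name `igv_missing_inputs` is made up, and the `.bam.bai` naming is an assumption (it is what `samtools index` writes by default); the files may already be indexed in your output directory.

```shell
# Print the name of every expected IGV input that is absent from the given directory.
# Prints nothing when all four files and the three BAM indexes are present.
igv_missing_inputs() (
  dir=${1:-.}
  for f in finspector.bed \
           finspector.consolidated.cSorted.bam \
           finspector.junction_reads.bam \
           finspector.spanning_reads.bam; do
    [ -e "$dir/$f" ] || echo "$f"
    case $f in
      *.bam) [ -e "$dir/$f.bai" ] || echo "$f.bai" ;;
    esac
  done
)
```

Any index reported as missing can be created with `samtools index` on the corresponding BAM file.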

