University of Nevada, Reno Vintage Logo
igv from the terminal and use the GUI to load our reference genome, choose the X chromosome to view, then enter XIST as our locus of interest.$ igv
stringtie command with the options: ‘-p 11’ to set our threads to 11, ‘-G’ to designate our reference annotation file, ‘-o’ to designate out output file destination (can include full path/ here) and format as [output.gtf], ‘-l’ to set the prefix label for the outputs, and finally [input.bam] to denote our input files.$ stringtie -p 11 -G chrX_data/genes/chrX.gtf -o ERR188044_chrX.gtf -l ERR188044 ERR188044_chrX.bam
bash script to automate this step for us.#!/usr/bin/bash
#bash script for stringtie; assemble transcripts using stringtie
SAMPLES="ERR188044 ERR188104 ERR188234 ERR188245 ERR188257 ERR188273 ERR188337 ERR188383 ERR188401 ERR188428 ERR188454 ERR204916"
for SAMPLE in $SAMPLES; do
stringtie -p 11 -G ~/chrX_data/genes/chrX.gtf -o ${SAMPLE}_chrX.gtf -l ${SAMPLE} ${SAMPLE}_chrX.bam
done
#this works
perl script I wrote for the same task.#!/usr/bin/perl
#perl script for stringtie; assemble transcripts using stringtie
use warnings;
use strict;
my @samples = qw(ERR188044 ERR188104 ERR188234 ERR188245 ERR188257 ERR188273 ERR188337 ERR188383 ERR188401 ERR188428 ERR188454 ERR204916);
foreach(@samples){
do {
system("stringtie", "-p 11", "-G ~/chrX_data/genes/chrX.gtf", "-o ${_}_chrX.gtf", "-l ${_}" "${_}_chrX.bam");
}
}
#perl works too
stringtie command with the options: ‘–merge’ to use the merge function of stringtie, ‘-p 11’ to use 11 threads, ‘-G’ to designate our reference annotation file, ‘-o’ to assign our output file name and format to [output.gtf], and ‘chrX_data/mergelist.txt’ to provide the program with all of our inputs. Note: if you do not run the transcript assembly command/script in the same directory you are running this command, you will need to generate the mergelist.txt file manually, making sure to include the full path/ to each .gtf file.stringtie --merge -p 11 -G chrX_data/genes/chrX.gtf -o stringtie_merged.gtf chrX_data/mergelist.txt
conda install, then utilized gffcompare with options: ‘-r’ to include our reference annotation file, ‘-G’ to tell gffcompare to compare all the transcripts in our input file [stringtie_merged.gtf], and ‘-o’ to denote out output prefix.conda install gffcompare
gffcompare -r chrX_data/genes/chrX.gtf -G -o merged stringtie_merged.gtf
stringtie with options: ‘-e’ to limit the estimation and output of transcripts to only those that match the reference annotation, ‘-B’ turns our outputs into Ballgown input table files (.ctab) with our desired coverage data for our reference transcripts, ‘-p 11’ to denote the use of 11 threads, ‘-G’ to designate our reference annotation file, ‘-o’ to denote the creation of a new ballgown/ directory with all our individaul /[output.gtf] files contained within their own individual /[output_prefix]/ directories, and finally we enter our [input.bam] files.stringtie -e -B -p 11 -G stringtie_merged.gtf -o ballgown/ERR188044/ERR188044_chrX.gtf ERR188044_chrX.bam
bash script to automate this step for us.#!/usr/bin/bash
#bash script for stringtie; calculate abundances and generate ballgown dir containing tables for use in R
SAMPLES="ERR188044 ERR188104 ERR188234 ERR188245 ERR188257 ERR188273 ERR188337 ERR188383 ERR188401 ERR188428 ERR188454 ERR204916"
for SAMPLE in $SAMPLES; do
stringtie -e -B -p 11 -G stringtie_merged.gtf -o ~/ballgown/${SAMPLE}/${SAMPLE}_chrX.gtf ${SAMPLE}_chrX.bam
done
#this works
perl script for those of us who operate better under pressure.#!/usr/bin/perl
#perl script for stringtie; calculate abundances and generate ballgown/ dir containing tables for use in R
use warnings;
use strict;
my @samples = qw(ERR188044 ERR188104 ERR188234 ERR188245 ERR188257 ERR188273 ERR188337 ERR188383 ERR188401 ERR188428 ERR188454 ERR204916);
foreach(@samples){
do {
system("stringtie", "-e", "-B", "-p 11", "-G stringtie_merged.gtf", "-o ~/ballgown/${_}/${_}_chrX.gtf", "${_}_chrX.bam");
}
}
#perl works too
igvtools command in our terminal so that we may view all of our samples in IGV. Here we use the igvtools command with the following options: ‘count’ which computes coverage density for an alignment file and provides either .tdf or .wig output files, ‘-z’ is the max-zoom, ‘-w’ denotes window size, ‘-e’ which extends the read by the designated number in bp before counting, our [input.bam] file comes next, followed by our [output.tdf] file, and finally the path to our reference genome assembly.$ igvtools count -z 5 -w 25 -e 250 ERR188044_chrX.bam ERR188044_chrX.tdf ~/igv/genomes/hg38.genome
bash script to automate this process for us!#!/usr/bin/bash
#bash script for igvtools; generate .tdf files using igvtools
SAMPLES="ERR188044 ERR188104 ERR188234 ERR188245 ERR188257 ERR188273 ERR188337 ERR188383 ERR188401 ERR188428 ERR188454 ERR204916"
for SAMPLE in $SAMPLES; do
igvtools count -z 5 -w 25 -e 250 ${SAMPLE}_chrX.bam ${SAMPLE}_chrX.tdf ~/igv/genomes/hg38.genome
done
#this works
perl scripts!#!/usr/bin/perl
#perl script for igvtools; generate .tdf files using igvtools
use warnings;
use strict;
my @samples = qw(ERR188044 ERR188104 ERR188234 ERR188245 ERR188257 ERR188273 ERR188337 ERR188383 ERR188401 ERR188428 ERR188454 ERR204916);
foreach(@samples){
do {
system("igvtools", "count", "-z 5", "-w 25", "-e 250", "${_}_chrX.bam", "${_}_chrX.tdf", "~/igv/genomes/hg38.genome");
}
}
#perl works too