We’re getting some new flounder RNA-seq data, so I’m playing with Trinity/Trinotate to make sure that it all works.
I picked up some sample data from the Trinity GitHub page at
https://github.com/trinityrnaseq/trinityrnaseq/tree/master/sample_data/test_DATA
setwd("~/Documents/Trinity-Test")
files <- list.files(pattern = "\\.fq$")  # pattern is a regular expression, not a shell glob
files
[1] "Sp_ds.10k.left.fq" "Sp_ds.10k.right.fq"
The data files are very small, 1.8 MB apiece, so they should hopefully run very quickly.
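As a quick sanity check on the inputs, here is a minimal sketch that reports each file's size and read count. The helper names `fastq_sizes_mb` and `fastq_read_counts` are my own (hypothetical), and the read count assumes plain, uncompressed FASTQ with exactly four lines per record.

```r
# Size of each file in megabytes (1 MB taken as 1e6 bytes).
fastq_sizes_mb <- function(paths) {
  sizes <- file.size(paths) / 1e6
  names(sizes) <- basename(paths)
  sizes
}

# Number of reads per file, assuming plain 4-line FASTQ records.
fastq_read_counts <- function(paths) {
  counts <- vapply(paths, function(p) length(readLines(p)) / 4, numeric(1))
  names(counts) <- basename(paths)
  counts
}

# Apply both helpers to every .fq file in the working directory.
fq_files <- list.files(pattern = "\\.fq$")
fastq_sizes_mb(fq_files)
fastq_read_counts(fq_files)
```

If the left and right files report different read counts, the pairing is probably broken and it is worth sorting that out before running the assembler.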
I’ll be following the example from here:
From this, it looks like we jump directly into Trinity.
system(paste0("Trinity --seqType fq --left ", files[1], " --right ", files[2], " --CPU 8 --trimmomatic --max_memory 20G"))
Left read files: $VAR1 = [
'Sp_ds.10k.left.fq'
];
Right read files: $VAR1 = [
'Sp_ds.10k.right.fq'
];
Trinity version: Trinity-v2.4.0
-currently using the latest production release of Trinity.
Friday, March 31, 2017: 11:33:06 CMD: java -Xmx64m -XX:ParallelGCThreads=2 -jar /home/shared/trinityrnaseq-Trinity-v2.4.0/util/support_scripts/ExitTester.jar 0
Friday, March 31, 2017: 11:33:06 CMD: java -Xmx64m -XX:ParallelGCThreads=2 -jar /home/shared/trinityrnaseq-Trinity-v2.4.0/util/support_scripts/ExitTester.jar 1
Friday, March 31, 2017: 11:33:06 CMD: mkdir -p /home/sean/Documents/Trinity-Test/trinity_out_dir
Friday, March 31, 2017: 11:33:06 CMD: mkdir -p /home/sean/Documents/Trinity-Test/trinity_out_dir/chrysalis
----------------------------------------------------------------------------------
-------------- Trinity Phase 1: Clustering of RNA-Seq Reads ---------------------
----------------------------------------------------------------------------------
---------------------------------------------------------------
------ Quality Trimming Via Trimmomatic ---------------------
<< ILLUMINACLIP:/home/shared/trinityrnaseq-Trinity-v2.4.0/trinity-plugins/Trimmomatic/adapters/TruSeq3-PE.fa:2:30:10 SLIDINGWINDOW:4:5 LEADING:5 TRAILING:5 MINLEN:25 >>
---------------------------------------------------------------
## Running Trimmomatic on read files: /home/sean/Documents/Trinity-Test/Sp_ds.10k.left.fq, /home/sean/Documents/Trinity-Test/Sp_ds.10k.right.fq
Friday, March 31, 2017: 11:33:06 CMD: java -jar /home/shared/trinityrnaseq-Trinity-v2.4.0/trinity-plugins/Trimmomatic/trimmomatic.jar PE -threads 8 -phred33 /home/sean/Documents/Trinity-Test/Sp_ds.10k.left.fq /home/sean/Documents/Trinity-Test/Sp_ds.10k.right.fq Sp_ds.10k.left.fq.P.qtrim Sp_ds.10k.left.fq.U.qtrim Sp_ds.10k.right.fq.P.qtrim Sp_ds.10k.right.fq.U.qtrim ILLUMINACLIP:/home/shared/trinityrnaseq-Trinity-v2.4.0/trinity-plugins/Trimmomatic/adapters/TruSeq3-PE.fa:2:30:10 SLIDINGWINDOW:4:5 LEADING:5 TRAILING:5 MINLEN:25
TrimmomaticPE: Started with arguments:
-threads 8 -phred33 /home/sean/Documents/Trinity-Test/Sp_ds.10k.left.fq /home/sean/Documents/Trinity-Test/Sp_ds.10k.right.fq Sp_ds.10k.left.fq.P.qtrim Sp_ds.10k.left.fq.U.qtrim Sp_ds.10k.right.fq.P.qtrim Sp_ds.10k.right.fq.U.qtrim ILLUMINACLIP:/home/shared/trinityrnaseq-Trinity-v2.4.0/trinity-plugins/Trimmomatic/adapters/TruSeq3-PE.fa:2:30:10 SLIDINGWINDOW:4:5 LEADING:5 TRAILING:5 MINLEN:25
Using PrefixPair: 'TACACTCTTTCCCTACACGACGCTCTTCCGATCT' and 'GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT'
ILLUMINACLIP: Using 1 prefix pairs, 0 forward/reverse sequences, 0 forward only sequences, 0 reverse only sequences
Input Read Pairs: 10000 Both Surviving: 9543 (95.43%) Forward Only Surviving: 104 (1.04%) Reverse Only Surviving: 182 (1.82%) Dropped: 171 (1.71%)
TrimmomaticPE: Completed successfully
Friday, March 31, 2017: 11:33:06 CMD: cp Sp_ds.10k.left.fq.P.qtrim Sp_ds.10k.left.fq.PwU.qtrim.fq
Friday, March 31, 2017: 11:33:06 CMD: cp Sp_ds.10k.right.fq.P.qtrim Sp_ds.10k.right.fq.PwU.qtrim.fq
Friday, March 31, 2017: 11:33:06 CMD: touch trimmomatic.ok
Friday, March 31, 2017: 11:33:06 CMD: gzip Sp_ds.10k.left.fq.P.qtrim Sp_ds.10k.left.fq.U.qtrim Sp_ds.10k.right.fq.P.qtrim Sp_ds.10k.right.fq.U.qtrim &
---------------------------------------------------------------
------------ In silico Read Normalization ---------------------
-- (Removing Excess Reads Beyond 50 Coverage --
---------------------------------------------------------------
# running normalization on reads: $VAR1 = [
[
'Sp_ds.10k.left.fq.PwU.qtrim.fq'
],
[
'Sp_ds.10k.right.fq.PwU.qtrim.fq'
]
];
Friday, March 31, 2017: 11:33:06 CMD: /home/shared/trinityrnaseq-Trinity-v2.4.0/util/insilico_read_normalization.pl --seqType fq --JM 20G --max_cov 50 --CPU 8 --output /home/sean/Documents/Trinity-Test/trinity_out_dir/insilico_read_normalization --max_pct_stdev 10000 --left Sp_ds.10k.left.fq.PwU.qtrim.fq --right Sp_ds.10k.right.fq.PwU.qtrim.fq --pairs_together --PARALLEL_STATS
Converting input files. (both directions in parallel)CMD: seqtk-trinity seq -A /home/sean/Documents/Trinity-Test/trinity_out_dir/Sp_ds.10k.left.fq.PwU.qtrim.fq >> left.fa
CMD: seqtk-trinity seq -A /home/sean/Documents/Trinity-Test/trinity_out_dir/Sp_ds.10k.right.fq.PwU.qtrim.fq >> right.fa
CMD finished (0 seconds)
CMD finished (0 seconds)
CMD: touch left.fa.ok
CMD finished (0 seconds)
CMD: touch right.fa.ok
CMD finished (0 seconds)
Done converting input files.CMD: cat left.fa right.fa > both.fa
CMD finished (0 seconds)
CMD: touch both.fa.ok
-------------------------------------------
----------- Jellyfish --------------------
-- (building a k-mer catalog from reads) --
-------------------------------------------
CMD finished (0 seconds)
CMD: /home/shared/trinityrnaseq-Trinity-v2.4.0/util/..//trinity-plugins/jellyfish/bin/jellyfish count -t 8 -m 25 -s 100000000 --canonical both.fa
CMD finished (1 seconds)
CMD: /home/shared/trinityrnaseq-Trinity-v2.4.0/util/..//trinity-plugins/jellyfish/bin/jellyfish histo -t 8 -o jellyfish.K25.min2.kmers.fa.histo mer_counts.jf
CMD finished (0 seconds)
CMD: /home/shared/trinityrnaseq-Trinity-v2.4.0/util/..//trinity-plugins/jellyfish/bin/jellyfish dump -L 2 mer_counts.jf > jellyfish.K25.min2.kmers.fa
CMD finished (0 seconds)
CMD: touch jellyfish.K25.min2.kmers.fa.success
CMD finished (0 seconds)
CMD: /home/shared/trinityrnaseq-Trinity-v2.4.0/util/..//Inchworm/bin/fastaToKmerCoverageStats --reads left.fa --kmers jellyfish.K25.min2.kmers.fa --kmer_size 25 --num_threads 8 --DS > left.fa.K25.stats
CMD: /home/shared/trinityrnaseq-Trinity-v2.4.0/util/..//Inchworm/bin/fastaToKmerCoverageStats --reads right.fa --kmers jellyfish.K25.min2.kmers.fa --kmer_size 25 --num_threads 8 --DS > right.fa.K25.stats
-reading Kmer occurrences...-reading Kmer occurrences...
done parsing 116588 Kmers, 116588 added, taking 1 seconds.
done parsing 116588 Kmers, 116588 added, taking 1 seconds.
STATS_GENERATION_TIME: 0 seconds.
CMD finished (1 seconds)
STATS_GENERATION_TIME: 0 seconds.
CMD finished (1 seconds)
CMD: touch left.fa.K25.stats.ok
CMD finished (0 seconds)
CMD: touch right.fa.K25.stats.ok
-sorting each stats file by read name.
CMD finished (0 seconds)
CMD: /usr/bin/sort --parallel=8 -k5,5 -T . -S 10G left.fa.K25.stats > left.fa.K25.stats.sort
CMD: /usr/bin/sort --parallel=8 -k5,5 -T . -S 10G right.fa.K25.stats > right.fa.K25.stats.sort
CMD finished (0 seconds)
CMD finished (0 seconds)
CMD: touch left.fa.K25.stats.sort.ok
CMD finished (0 seconds)
CMD: touch right.fa.K25.stats.sort.ok
CMD finished (0 seconds)
CMD: /home/shared/trinityrnaseq-Trinity-v2.4.0/util/..//util/support_scripts//nbkc_merge_left_right_stats.pl --left left.fa.K25.stats.sort --right right.fa.K25.stats.sort --sorted > pairs.K25.stats
-opening left.fa.K25.stats.sort
-opening right.fa.K25.stats.sort
-done opening files.
CMD finished (0 seconds)
CMD: touch pairs.K25.stats.ok
CMD finished (0 seconds)
CMD: /home/shared/trinityrnaseq-Trinity-v2.4.0/util/..//util/support_scripts//nbkc_normalize.pl pairs.K25.stats 50 10000 > pairs.K25.stats.C50.pctSD10000.accs
9543 / 9543 = 100.00% reads selected during normalization.
0 / 9543 = 0.00% reads discarded as likely aberrant based on coverage profiles.
0 / 9543 = 0.00% reads missing kmer coverage (N chars included?).
CMD finished (0 seconds)
CMD: touch pairs.K25.stats.C50.pctSD10000.accs.ok
CMD finished (0 seconds)
CMD: touch /home/sean/Documents/Trinity-Test/trinity_out_dir/insilico_read_normalization/Sp_ds.10k.left.fq.PwU.qtrim.fq.normalized_K25_C50_pctSD10000.fq.ok
CMD finished (0 seconds)
CMD: touch /home/sean/Documents/Trinity-Test/trinity_out_dir/insilico_read_normalization/Sp_ds.10k.right.fq.PwU.qtrim.fq.normalized_K25_C50_pctSD10000.fq.ok
CMD finished (0 seconds)
CMD: ln -sf /home/sean/Documents/Trinity-Test/trinity_out_dir/insilico_read_normalization/Sp_ds.10k.left.fq.PwU.qtrim.fq.normalized_K25_C50_pctSD10000.fq left.norm.fq
CMD finished (0 seconds)
CMD: ln -sf /home/sean/Documents/Trinity-Test/trinity_out_dir/insilico_read_normalization/Sp_ds.10k.right.fq.PwU.qtrim.fq.normalized_K25_C50_pctSD10000.fq right.norm.fq
-removing tmp dir /home/sean/Documents/Trinity-Test/trinity_out_dir/insilico_read_normalization/tmp_normalized_reads
CMD finished (0 seconds)
Normalization complete. See outputs:
/home/sean/Documents/Trinity-Test/trinity_out_dir/insilico_read_normalization/Sp_ds.10k.left.fq.PwU.qtrim.fq.normalized_K25_C50_pctSD10000.fq
/home/sean/Documents/Trinity-Test/trinity_out_dir/insilico_read_normalization/Sp_ds.10k.right.fq.PwU.qtrim.fq.normalized_K25_C50_pctSD10000.fq
Friday, March 31, 2017: 11:33:08 CMD: touch /home/sean/Documents/Trinity-Test/trinity_out_dir/insilico_read_normalization/normalization.ok
Converting input files. (in parallel)Friday, March 31, 2017: 11:33:08 CMD: cat /home/sean/Documents/Trinity-Test/trinity_out_dir/insilico_read_normalization/left.norm.fq | seqtk-trinity seq -A - >> left.fa
Friday, March 31, 2017: 11:33:08 CMD: cat /home/sean/Documents/Trinity-Test/trinity_out_dir/insilico_read_normalization/right.norm.fq | seqtk-trinity seq -A - >> right.fa
Friday, March 31, 2017: 11:33:08 CMD: touch left.fa.ok
Friday, March 31, 2017: 11:33:08 CMD: touch right.fa.ok
Friday, March 31, 2017: 11:33:08 CMD: touch left.fa.ok right.fa.ok
Friday, March 31, 2017: 11:33:08 CMD: cat left.fa right.fa > /home/sean/Documents/Trinity-Test/trinity_out_dir/both.fa
Friday, March 31, 2017: 11:33:08 CMD: touch /home/sean/Documents/Trinity-Test/trinity_out_dir/both.fa.ok
-------------------------------------------
----------- Jellyfish --------------------
-- (building a k-mer catalog from reads) --
-------------------------------------------
* Running CMD: /home/shared/trinityrnaseq-Trinity-v2.4.0/trinity-plugins/jellyfish/bin/jellyfish count -t 8 -m 25 -s 100000000 --canonical /home/sean/Documents/Trinity-Test/trinity_out_dir/both.fa
* Running CMD: /home/shared/trinityrnaseq-Trinity-v2.4.0/trinity-plugins/jellyfish/bin/jellyfish dump -L 1 mer_counts.jf > jellyfish.kmers.fa
* Running CMD: /home/shared/trinityrnaseq-Trinity-v2.4.0/trinity-plugins/jellyfish/bin/jellyfish histo -t 8 -o jellyfish.kmers.fa.histo mer_counts.jf
----------------------------------------------
--------------- Inchworm ---------------------
-- (Linear contig construction from k-mers) --
----------------------------------------------
* Running CMD: /home/shared/trinityrnaseq-Trinity-v2.4.0/Inchworm/bin//inchworm --kmers jellyfish.kmers.fa --run_inchworm -K 25 -L 25 --monitor 1 --DS --num_threads 6 --PARALLEL_IWORM > /home/sean/Documents/Trinity-Test/trinity_out_dir/inchworm.K25.L25.DS.fa.tmp
* Running CMD: mv /home/sean/Documents/Trinity-Test/trinity_out_dir/inchworm.K25.L25.DS.fa.tmp /home/sean/Documents/Trinity-Test/trinity_out_dir/inchworm.K25.L25.DS.fa
Friday, March 31, 2017: 11:33:11 CMD: touch /home/sean/Documents/Trinity-Test/trinity_out_dir/inchworm.K25.L25.DS.fa.finished
--------------------------------------------------------
-------------------- Chrysalis -------------------------
-- (Contig Clustering & de Bruijn Graph Construction) --
--------------------------------------------------------
inchworm_target: /home/sean/Documents/Trinity-Test/trinity_out_dir/both.fa
bowite_reads_fa: /home/sean/Documents/Trinity-Test/trinity_out_dir/both.fa
chrysalis_reads_fa: /home/sean/Documents/Trinity-Test/trinity_out_dir/both.fa
* Running CMD: /home/shared/trinityrnaseq-Trinity-v2.4.0/util/support_scripts/filter_iworm_by_min_length_or_cov.pl /home/sean/Documents/Trinity-Test/trinity_out_dir/inchworm.K25.L25.DS.fa 100 10 > /home/sean/Documents/Trinity-Test/trinity_out_dir/chrysalis/inchworm.K25.L25.DS.fa.min100
* Running CMD: bowtie2-build -o 3 /home/sean/Documents/Trinity-Test/trinity_out_dir/chrysalis/inchworm.K25.L25.DS.fa.min100 /home/sean/Documents/Trinity-Test/trinity_out_dir/chrysalis/inchworm.K25.L25.DS.fa.min100 1>/dev/null
* Running CMD: bash -c " set -o pipefail;bowtie2 --local -k 2 --threads 8 -f --score-min G,46,0 -x /home/sean/Documents/Trinity-Test/trinity_out_dir/chrysalis/inchworm.K25.L25.DS.fa.min100 /home/sean/Documents/Trinity-Test/trinity_out_dir/both.fa | samtools view -@ 8 -F4 -Sb - | samtools sort -m 1342177280 -@ 8 -no - - > /home/sean/Documents/Trinity-Test/trinity_out_dir/chrysalis/iworm.bowtie.nameSorted.bam"
* Running CMD: /home/shared/trinityrnaseq-Trinity-v2.4.0/util/support_scripts/scaffold_iworm_contigs.pl /home/sean/Documents/Trinity-Test/trinity_out_dir/chrysalis/iworm.bowtie.nameSorted.bam /home/sean/Documents/Trinity-Test/trinity_out_dir/inchworm.K25.L25.DS.fa > /home/sean/Documents/Trinity-Test/trinity_out_dir/chrysalis/iworm_scaffolds.txt
* Running CMD: /home/shared/trinityrnaseq-Trinity-v2.4.0/Chrysalis/GraphFromFasta -i /home/sean/Documents/Trinity-Test/trinity_out_dir/inchworm.K25.L25.DS.fa -r /home/sean/Documents/Trinity-Test/trinity_out_dir/both.fa -min_contig_length 200 -min_glue 2 -glue_factor 0.05 -min_iso_ratio 0.05 -t 8 -k 24 -kk 48 -scaffolding /home/sean/Documents/Trinity-Test/trinity_out_dir/chrysalis/iworm_scaffolds.txt > /home/sean/Documents/Trinity-Test/trinity_out_dir/chrysalis/iworm_cluster_welds_graph.txt
* Running CMD: /home/shared/trinityrnaseq-Trinity-v2.4.0/Chrysalis/BubbleUpClustering -i /home/sean/Documents/Trinity-Test/trinity_out_dir/inchworm.K25.L25.DS.fa -weld_graph /home/sean/Documents/Trinity-Test/trinity_out_dir/chrysalis/iworm_cluster_welds_graph.txt -min_contig_length 200 > /home/sean/Documents/Trinity-Test/trinity_out_dir/chrysalis/GraphFromIwormFasta.out
* Running CMD: /home/shared/trinityrnaseq-Trinity-v2.4.0/Chrysalis/CreateIwormFastaBundle -i /home/sean/Documents/Trinity-Test/trinity_out_dir/chrysalis/GraphFromIwormFasta.out -o /home/sean/Documents/Trinity-Test/trinity_out_dir/chrysalis/bundled_iworm_contigs.fasta -min 200
* Running CMD: /home/shared/trinityrnaseq-Trinity-v2.4.0/Chrysalis/ReadsToTranscripts -i /home/sean/Documents/Trinity-Test/trinity_out_dir/both.fa -f /home/sean/Documents/Trinity-Test/trinity_out_dir/chrysalis/bundled_iworm_contigs.fasta -o /home/sean/Documents/Trinity-Test/trinity_out_dir/chrysalis/readsToComponents.out -t 8 -max_mem_reads 50000000
* Running CMD: /usr/bin/sort --parallel=8 -T . -S 20G -k 1,1n /home/sean/Documents/Trinity-Test/trinity_out_dir/chrysalis/readsToComponents.out > /home/sean/Documents/Trinity-Test/trinity_out_dir/chrysalis/readsToComponents.out.sort
Friday, March 31, 2017: 11:33:14 CMD: mkdir -p read_partitions/Fb_0/CBin_0
Friday, March 31, 2017: 11:33:14 CMD: mkdir -p read_partitions/Fb_0/CBin_1
Friday, March 31, 2017: 11:33:14 CMD: touch partitioned_reads.files.list.ok
Friday, March 31, 2017: 11:33:14 CMD: /home/shared/trinityrnaseq-Trinity-v2.4.0/util/support_scripts/write_partitioned_trinity_cmds.pl --reads_list_file partitioned_reads.files.list --CPU 1 --max_memory 1G --run_as_paired --seqType fa --trinity_complete --full_cleanup > recursive_trinity.cmds
Friday, March 31, 2017: 11:33:14 CMD: touch recursive_trinity.cmds.ok
Friday, March 31, 2017: 11:33:14 CMD: touch recursive_trinity.cmds.ok
--------------------------------------------------------------------------------
------------ Trinity Phase 2: Assembling Clusters of Reads ---------------------
--------------------------------------------------------------------------------
Friday, March 31, 2017: 11:33:14 CMD: /home/shared/trinityrnaseq-Trinity-v2.4.0/trinity-plugins/parafly/bin/ParaFly -c recursive_trinity.cmds -CPU 8 -v
Number of Commands: 159
succeeded(1) 0.628931% completed.
succeeded(2) 1.25786% completed.
succeeded(3) 1.88679% completed.
succeeded(4) 2.51572% completed.
succeeded(5) 3.14465% completed.
succeeded(6) 3.77358% completed.
succeeded(7) 4.40252% completed.
succeeded(8) 5.03145% completed.
succeeded(9) 5.66038% completed.
succeeded(10) 6.28931% completed.
succeeded(11) 6.91824% completed.
succeeded(12) 7.54717% completed.
succeeded(13) 8.1761% completed.
succeeded(14) 8.80503% completed.
succeeded(15) 9.43396% completed.
succeeded(16) 10.0629% completed.
succeeded(17) 10.6918% completed.
succeeded(18) 11.3208% completed.
succeeded(19) 11.9497% completed.
succeeded(20) 12.5786% completed.
succeeded(21) 13.2075% completed.
succeeded(22) 13.8365% completed.
succeeded(23) 14.4654% completed.
succeeded(24) 15.0943% completed.
succeeded(25) 15.7233% completed.
succeeded(26) 16.3522% completed.
succeeded(27) 16.9811% completed.
succeeded(28) 17.6101% completed.
succeeded(29) 18.239% completed.
succeeded(30) 18.8679% completed.
succeeded(31) 19.4969% completed.
succeeded(32) 20.1258% completed.
succeeded(33) 20.7547% completed.
succeeded(34) 21.3836% completed.
succeeded(35) 22.0126% completed.
succeeded(36) 22.6415% completed.
succeeded(37) 23.2704% completed.
succeeded(38) 23.8994% completed.
succeeded(39) 24.5283% completed.
succeeded(40) 25.1572% completed.
succeeded(41) 25.7862% completed.
succeeded(42) 26.4151% completed.
succeeded(43) 27.044% completed.
succeeded(44) 27.673% completed.
succeeded(45) 28.3019% completed.
succeeded(46) 28.9308% completed.
succeeded(47) 29.5597% completed.
succeeded(48) 30.1887% completed.
succeeded(49) 30.8176% completed.
succeeded(50) 31.4465% completed.
succeeded(51) 32.0755% completed.
succeeded(52) 32.7044% completed.
succeeded(53) 33.3333% completed.
succeeded(54) 33.9623% completed.
succeeded(55) 34.5912% completed.
succeeded(56) 35.2201% completed.
succeeded(57) 35.8491% completed.
succeeded(58) 36.478% completed.
succeeded(59) 37.1069% completed.
succeeded(60) 37.7359% completed.
succeeded(61) 38.3648% completed.
succeeded(62) 38.9937% completed.
succeeded(63) 39.6226% completed.
succeeded(64) 40.2516% completed.
succeeded(65) 40.8805% completed.
succeeded(66) 41.5094% completed.
succeeded(67) 42.1384% completed.
succeeded(68) 42.7673% completed.
succeeded(69) 43.3962% completed.
succeeded(70) 44.0252% completed.
succeeded(71) 44.6541% completed.
succeeded(72) 45.283% completed.
succeeded(73) 45.9119% completed.
succeeded(74) 46.5409% completed.
succeeded(75) 47.1698% completed.
succeeded(76) 47.7987% completed.
succeeded(77) 48.4277% completed.
succeeded(78) 49.0566% completed.
succeeded(79) 49.6855% completed.
succeeded(80) 50.3145% completed.
succeeded(81) 50.9434% completed.
succeeded(82) 51.5723% completed.
succeeded(83) 52.2013% completed.
succeeded(84) 52.8302% completed.
succeeded(85) 53.4591% completed.
succeeded(86) 54.0881% completed.
succeeded(87) 54.717% completed.
succeeded(88) 55.3459% completed.
succeeded(89) 55.9748% completed.
succeeded(90) 56.6038% completed.
succeeded(91) 57.2327% completed.
succeeded(92) 57.8616% completed.
succeeded(93) 58.4906% completed.
succeeded(94) 59.1195% completed.
succeeded(95) 59.7484% completed.
succeeded(96) 60.3774% completed.
succeeded(97) 61.0063% completed.
succeeded(98) 61.6352% completed.
succeeded(99) 62.2641% completed.
succeeded(100) 62.8931% completed.
succeeded(101) 63.522% completed.
succeeded(102) 64.1509% completed.
succeeded(103) 64.7799% completed.
succeeded(104) 65.4088% completed.
succeeded(105) 66.0377% completed.
succeeded(106) 66.6667% completed.
succeeded(107) 67.2956% completed.
succeeded(108) 67.9245% completed.
succeeded(109) 68.5535% completed.
succeeded(110) 69.1824% completed.
succeeded(111) 69.8113% completed.
succeeded(112) 70.4403% completed.
succeeded(113) 71.0692% completed.
succeeded(114) 71.6981% completed.
succeeded(115) 72.327% completed.
succeeded(116) 72.956% completed.
succeeded(117) 73.5849% completed.
succeeded(118) 74.2138% completed.
succeeded(119) 74.8428% completed.
succeeded(120) 75.4717% completed.
succeeded(121) 76.1006% completed.
succeeded(122) 76.7296% completed.
succeeded(123) 77.3585% completed.
succeeded(124) 77.9874% completed.
succeeded(125) 78.6163% completed.
succeeded(126) 79.2453% completed.
succeeded(127) 79.8742% completed.
succeeded(128) 80.5031% completed.
succeeded(129) 81.1321% completed.
succeeded(130) 81.761% completed.
succeeded(131) 82.3899% completed.
succeeded(132) 83.0189% completed.
succeeded(133) 83.6478% completed.
succeeded(134) 84.2767% completed.
succeeded(135) 84.9057% completed.
succeeded(136) 85.5346% completed.
succeeded(137) 86.1635% completed.
succeeded(138) 86.7924% completed.
succeeded(139) 87.4214% completed.
succeeded(140) 88.0503% completed.
succeeded(141) 88.6792% completed.
succeeded(142) 89.3082% completed.
succeeded(143) 89.9371% completed.
succeeded(144) 90.566% completed.
succeeded(145) 91.195% completed.
succeeded(146) 91.8239% completed.
succeeded(147) 92.4528% completed.
succeeded(148) 93.0818% completed.
succeeded(149) 93.7107% completed.
succeeded(150) 94.3396% completed.
succeeded(151) 94.9686% completed.
succeeded(152) 95.5975% completed.
succeeded(153) 96.2264% completed.
succeeded(154) 96.8553% completed.
succeeded(155) 97.4843% completed.
succeeded(156) 98.1132% completed.
succeeded(157) 98.7421% completed.
succeeded(158) 99.3711% completed.
succeeded(159) 100% completed.
All commands completed successfully. :-)
** Harvesting all assembled transcripts into a single multi-fasta file...
Friday, March 31, 2017: 11:33:51 CMD: find read_partitions/ -name '*inity.fasta' | /home/shared/trinityrnaseq-Trinity-v2.4.0/util/support_scripts/partitioned_trinity_aggregator.pl TRINITY_DN > Trinity.fasta.tmp
###################################################################
Butterfly assemblies are written to /home/sean/Documents/Trinity-Test/trinity_out_dir/Trinity.fasta
###################################################################
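The Trimmomatic summary line in the log above is worth a second look, since it says how many read pairs actually made it into the assembly. A small sketch that pulls the counts out of that line and checks that they add up; the function name `parse_trimmomatic_summary` is hypothetical, and the regular expressions assume the summary wording shown in this particular log.

```r
# Summary line copied from the Trimmomatic output above.
summary_line <- paste(
  "Input Read Pairs: 10000 Both Surviving: 9543 (95.43%)",
  "Forward Only Surviving: 104 (1.04%)",
  "Reverse Only Surviving: 182 (1.82%) Dropped: 171 (1.71%)"
)

# Extract the integer that follows each label in the summary line.
parse_trimmomatic_summary <- function(line) {
  labels <- c(
    input        = "Input Read Pairs",
    both         = "Both Surviving",
    forward_only = "Forward Only Surviving",
    reverse_only = "Reverse Only Surviving",
    dropped      = "Dropped"
  )
  vapply(labels, function(label) {
    m <- regmatches(line, regexec(paste0(label, ": ([0-9]+)"), line))[[1]]
    as.numeric(m[2])
  }, numeric(1))
}

counts <- parse_trimmomatic_summary(summary_line)

# The four outcome categories should account for every input pair.
stopifnot(sum(counts[-1]) == counts[["input"]])

# Percentage of input pairs in each category.
round(100 * counts[-1] / counts[["input"]], 2)
```

The 9543 pairs where both mates survive are the same 9543 that the normalization step reports further down the log.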
Wow, less than a minute to finish. Why can’t everything be that fast?
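To put a number on that, here is a rough sketch that takes the first and last timestamps from the log above and computes the difference. It assumes both stamps fall on the same day (true here, 2017-03-31), and it only measures up to the last timestamped command, so the real wall-clock time is slightly longer.

```r
# First and last timestamps as they appear in the Trinity log.
log_stamps <- c(
  "Friday, March 31, 2017: 11:33:06",
  "Friday, March 31, 2017: 11:33:51"
)

# Keep only the HH:MM:SS part, which avoids locale-dependent day and month names.
clock <- sub(".*: ([0-9]{2}:[0-9]{2}:[0-9]{2})$", "\\1", log_stamps)

# Attach the (known) run date and compute the difference in seconds.
times <- as.POSIXct(paste("2017-03-31", clock), tz = "UTC")
elapsed_secs <- as.numeric(difftime(times[2], times[1], units = "secs"))
elapsed_secs
```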
setwd("~/Documents/Trinity-Test/trinity_out_dir")
list.files()
[1] "both.fa" "both.fa.ok" "both.fa.read_count"
[4] "chrysalis" "inchworm.K25.L25.DS.fa" "inchworm.K25.L25.DS.fa.finished"
[7] "inchworm.kmer_count" "insilico_read_normalization" "jellyfish.kmers.fa"
[10] "jellyfish.kmers.fa.histo" "left.fa.ok" "partitioned_reads.files.list"
[13] "partitioned_reads.files.list.ok" "pipeliner.4911.cmds" "read_partitions"
[16] "recursive_trinity.cmds" "recursive_trinity.cmds.completed" "recursive_trinity.cmds.ok"
[19] "right.fa.ok" "scaffolding_entries.sam" "Sp_ds.10k.left.fq.P.qtrim.gz"
[22] "Sp_ds.10k.left.fq.PwU.qtrim.fq" "Sp_ds.10k.left.fq.U.qtrim.gz" "Sp_ds.10k.right.fq.P.qtrim.gz"
[25] "Sp_ds.10k.right.fq.PwU.qtrim.fq" "Sp_ds.10k.right.fq.U.qtrim.gz" "trimmomatic.ok"
[28] "Trinity.fasta" "Trinity.timing"
system("head Trinity.fasta")
>TRINITY_DN116_c0_g1_i1 len=205 path=[1217:0-204] [-1, 1217, -2]
TTTGGAAACCTTGACCAGTGGGCCAGTTGGTGGTGTTGGTGGTGTAAGAACCAGTGCTGG
TGTCAACAGTGTCCAAGTATTGGGTAACGGTGGGATAAACAGCAAAGTTGGTCAAGTAAA
GAGCAGCACTGGGTTCATCAGTGGAAACGACATTCCAGGTGATTTGCTCCTCGCCGTTGG
TTTGCCAAGTGTCACCATTGGTGGG
>TRINITY_DN116_c0_g2_i1 len=420 path=[241:0-419] [-1, 241, -2]
GGCTTACCCTGGTCGTCCTGAGCAAATTTATGCTCAATCTCAACAATTTAACATTGTTGA
GGGTGCTGCTTCTTCTTCTTCCTCTTCCTCTTCTTCCTCCAGCTCTTTGGTTTCCTCCAC
AACCTCTTCTTCCAGCTCTGCCACTCCTTCGACCACTTCTTCCTCCTCCTCCTCTTCTTC
TTCCTCTTCCTCATCCTCATCTAAATCTTCATCCTCTTCTTCCAAGTCTTCCTCTCGTAG
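The header lines carry useful structure: the accession up to `_g1` identifies the Trinity "gene", the `_i1` suffix identifies the isoform, and `len=` gives the transcript length. A minimal sketch for splitting headers into those parts; `parse_trinity_header` is a hypothetical helper, and the regular expression assumes the `TRINITY_DN..._c..._g..._i...` naming seen in this output.

```r
# Split one Trinity FASTA header into gene ID, isoform ID, and length.
parse_trinity_header <- function(header) {
  pattern <- "^>(TRINITY_DN[0-9]+_c[0-9]+_g[0-9]+)_(i[0-9]+) len=([0-9]+)"
  m <- regmatches(header, regexec(pattern, header))[[1]]
  data.frame(
    gene    = m[2],
    isoform = paste(m[2], m[3], sep = "_"),
    length  = as.integer(m[4]),
    stringsAsFactors = FALSE
  )
}

# The two headers shown in the head output above.
headers <- c(
  ">TRINITY_DN116_c0_g1_i1 len=205 path=[1217:0-204] [-1, 1217, -2]",
  ">TRINITY_DN116_c0_g2_i1 len=420 path=[241:0-419] [-1, 241, -2]"
)

transcripts <- do.call(rbind, lapply(headers, parse_trinity_header))
transcripts
```

To run this over the whole assembly, the headers could be collected with `grep("^>", readLines("Trinity.fasta"), value = TRUE)` and passed through the same function.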
Looks like we’ve got transcript data! Now on to Trinotate by way of TransDecoder.
I’ll be following the workflow from the Trinotate website at https://trinotate.github.io/#SeqAnalyses
Trinotate requires the Trinity.fasta file produced by Trinity, as well as a file of most-likely longest-ORF peptide candidates from TransDecoder, so let’s make that second file now.
system("TransDecoder.LongOrfs -t Trinity.fasta")
CMD: /home/shared/TransDecoder-3.0.1/util/compute_base_probs.pl Trinity.fasta 0 > Trinity.fasta.transdecoder_dir/base_freqs.dat
-first extracting base frequencies, we'll need them later.
CMD: touch Trinity.fasta.transdecoder_dir/base_freqs.dat.ok
- extracting ORFs from transcripts.
-total transcripts to examine: 115
[100/115] = 86.96% done
#################################
### Done preparing long ORFs. ###
##################################
Use file: Trinity.fasta.transdecoder_dir/longest_orfs.pep for Pfam and/or BlastP searches to enable homology-based coding region identification.
Then, run TransDecoder.Predict for your final coding region predictions.
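Before moving on to the prediction step, it can be reassuring to count how many sequences each file holds. This is a sketch with a hypothetical helper, `count_fasta_records`, that counts lines starting with `>`; the file paths are the ones named in the output above, and the calls are guarded so the snippet does nothing if the files are not present.

```r
# Count FASTA records by counting header lines (those starting with ">").
count_fasta_records <- function(path) {
  sum(startsWith(readLines(path), ">"))
}

# Paths taken from the TransDecoder.LongOrfs output above.
assembly <- "Trinity.fasta"
long_orfs <- "Trinity.fasta.transdecoder_dir/longest_orfs.pep"

# Only count files that exist in the current working directory.
for (path in c(assembly, long_orfs)) {
  if (file.exists(path)) {
    cat(path, ":", count_fasta_records(path), "records\n")
  }
}
```

For `Trinity.fasta`, the count should agree with the "total transcripts to examine: 115" line that TransDecoder printed.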
system("TransDecoder.Predict -t Trinity.fasta")
CMD: /home/shared/TransDecoder-3.0.1/util/get_top_longest_fasta_entries.pl Trinity.fasta.transdecoder_dir/longest_orfs.cds 5000 > Trinity.fasta.transdecoder_dir/longest_orfs.cds.top_longest_5000
CMD: /home/shared/TransDecoder-3.0.1/util/bin/cd-hit-est -r 1 -i Trinity.fasta.transdecoder_dir/longest_orfs.cds.top_longest_5000 -T 1 -c 0.80 -o Trinity.fasta.transdecoder_dir/longest_orfs.cds.top_longest_5000.nr80 -M 0
================================================================
Program: CD-HIT, V4.6 (+OpenMP), Mar 31 2017, 09:18:00
Command: /home/shared/TransDecoder-3.0.1/util/bin/cd-hit-est
-r 1 -i
Trinity.fasta.transdecoder_dir/longest_orfs.cds.top_longest_5000
-T 1 -c 0.80 -o
Trinity.fasta.transdecoder_dir/longest_orfs.cds.top_longest_5000.nr80
-M 0
Started: Fri Mar 31 11:42:35 2017
================================================================
Output
----------------------------------------------------------------
total seq: 55
longest and shortest : 1023 and 303
Total letters: 25215
Sequences have been sorted
Approximated minimal memory consumption:
Sequence : 0M
Buffer : 1 X 12M = 12M
Table : 1 X 16M = 16M
Miscellaneous : 4M
Total : 33M
Table limit with the given memory limit:
Max number of representatives: 4000000
Max number of word counting entries: 295011500
comparing sequences from 0 to 55
55 finished 34 clusters
Apprixmated maximum memory consumption: 33M
writing new database
writing clustering information
program completed !
Total CPU time 0.13
CMD: /home/shared/TransDecoder-3.0.1/util/get_top_longest_fasta_entries.pl Trinity.fasta.transdecoder_dir/longest_orfs.cds.top_longest_5000.nr80 500 > Trinity.fasta.transdecoder_dir/longest_orfs.cds.top_500_longest
CMD: touch Trinity.fasta.transdecoder_dir/longest_orfs.cds.top_500_longest.ok
CMD: /home/shared/TransDecoder-3.0.1/util/seq_n_baseprobs_to_logliklihood_vals.pl Trinity.fasta.transdecoder_dir/longest_orfs.cds.top_500_longest Trinity.fasta.transdecoder_dir/base_freqs.dat > Trinity.fasta.transdecoder_dir/hexamer.scores
CMD: touch Trinity.fasta.transdecoder_dir/hexamer.scores.ok
CMD: /home/shared/TransDecoder-3.0.1/util/score_CDS_liklihood_all_6_frames.pl Trinity.fasta.transdecoder_dir/longest_orfs.cds Trinity.fasta.transdecoder_dir/hexamer.scores > Trinity.fasta.transdecoder_dir/longest_orfs.cds.scores
CMD: touch Trinity.fasta.transdecoder_dir/longest_orfs.cds.scores.ok
CMD: /home/shared/TransDecoder-3.0.1/util/index_gff3_files_by_isoform.pl Trinity.fasta.transdecoder_dir/longest_orfs.gff3
#####################
Counts of kept entries according to attributes:
FRAMESCORE 31
FRAMESCORE|LONGORF 3
########################
-indexing [TRINITY_DN104_c0_g1::TRINITY_DN104_c0_g1_i1::g.10]
-indexing [TRINITY_DN104_c0_g1::TRINITY_DN104_c0_g1_i1::g.9]
-indexing [TRINITY_DN104_c0_g1::TRINITY_DN104_c0_g1_i1::g.11]
-indexing [TRINITY_DN110_c0_g1::TRINITY_DN110_c0_g1_i1::g.37]
-indexing [TRINITY_DN110_c0_g1::TRINITY_DN110_c0_g1_i1::g.38]
-indexing [TRINITY_DN110_c0_g2::TRINITY_DN110_c0_g2_i1::g.39]
-indexing [TRINITY_DN110_c0_g2::TRINITY_DN110_c0_g2_i1::g.40]
-indexing [TRINITY_DN116_c0_g2::TRINITY_DN116_c0_g2_i1::g.1]
-indexing [TRINITY_DN116_c0_g2::TRINITY_DN116_c0_g2_i1::g.2]
-indexing [TRINITY_DN120_c0_g1::TRINITY_DN120_c0_g1_i1::g.27]
-indexing [TRINITY_DN120_c0_g1::TRINITY_DN120_c0_g1_i1::g.28]
-indexing [TRINITY_DN123_c0_g1::TRINITY_DN123_c0_g1_i1::g.15]
-indexing [TRINITY_DN123_c0_g5::TRINITY_DN123_c0_g5_i1::g.16]
-indexing [TRINITY_DN127_c0_g2::TRINITY_DN127_c0_g2_i1::g.7]
-indexing [TRINITY_DN127_c0_g2::TRINITY_DN127_c0_g2_i1::g.6]
-indexing [TRINITY_DN127_c0_g3::TRINITY_DN127_c0_g3_i1::g.8]
-indexing [TRINITY_DN128_c0_g1::TRINITY_DN128_c0_g1_i1::g.13]
-indexing [TRINITY_DN129_c0_g1::TRINITY_DN129_c0_g1_i1::g.29]
-indexing [TRINITY_DN129_c0_g1::TRINITY_DN129_c0_g1_i1::g.30]
-indexing [TRINITY_DN129_c0_g1::TRINITY_DN129_c0_g1_i1::g.31]
-indexing [TRINITY_DN129_c0_g3::TRINITY_DN129_c0_g3_i1::g.33]
-indexing [TRINITY_DN129_c0_g3::TRINITY_DN129_c0_g3_i1::g.32]
-indexing [TRINITY_DN131_c0_g1::TRINITY_DN131_c0_g1_i1::g.4]
-indexing [TRINITY_DN134_c0_g1::TRINITY_DN134_c0_g1_i1::g.22]
-indexing [TRINITY_DN134_c0_g1::TRINITY_DN134_c0_g1_i1::g.23]
-indexing [TRINITY_DN137_c0_g1::TRINITY_DN137_c0_g1_i1::g.18]
-indexing [TRINITY_DN137_c0_g1::TRINITY_DN137_c0_g1_i1::g.17]
-indexing [TRINITY_DN139_c0_g1::TRINITY_DN139_c0_g1_i1::g.5]
-indexing [TRINITY_DN141_c0_g1::TRINITY_DN141_c0_g1_i1::g.24]
-indexing [TRINITY_DN141_c0_g1::TRINITY_DN141_c0_g1_i1::g.26]
-indexing [TRINITY_DN141_c0_g1::TRINITY_DN141_c0_g1_i1::g.25]
-indexing [TRINITY_DN142_c0_g1::TRINITY_DN142_c0_g1_i1::g.41]
-indexing [TRINITY_DN142_c0_g1::TRINITY_DN142_c0_g1_i1::g.42]
-indexing [TRINITY_DN142_c0_g2::TRINITY_DN142_c0_g2_i1::g.44]
-indexing [TRINITY_DN142_c0_g2::TRINITY_DN142_c0_g2_i1::g.43]
-indexing [TRINITY_DN143_c0_g1::TRINITY_DN143_c0_g1_i1::g.14]
-indexing [TRINITY_DN144_c0_g1::TRINITY_DN144_c0_g1_i1::g.12]
-indexing [TRINITY_DN150_c0_g1::TRINITY_DN150_c0_g1_i1::g.21]
-indexing [TRINITY_DN154_c0_g1::TRINITY_DN154_c0_g1_i1::g.3]
-indexing [TRINITY_DN155_c0_g1::TRINITY_DN155_c0_g1_i1::g.20]
-indexing [TRINITY_DN155_c0_g1::TRINITY_DN155_c0_g1_i1::g.19]
-indexing [TRINITY_DN156_c0_g1::TRINITY_DN156_c0_g1_i1::g.34]
-indexing [TRINITY_DN156_c0_g1::TRINITY_DN156_c0_g1_i1::g.35]
-indexing [TRINITY_DN156_c0_g1::TRINITY_DN156_c0_g1_i1::g.36]
-indexing [TRINITY_DN4_c0_g2::TRINITY_DN4_c0_g2_i1::g.54]
-indexing [TRINITY_DN58_c0_g1::TRINITY_DN58_c0_g1_i1::g.55]
-indexing [TRINITY_DN60_c0_g1::TRINITY_DN60_c0_g1_i1::g.53]
-indexing [TRINITY_DN62_c0_g1::TRINITY_DN62_c0_g1_i1::g.52]
-indexing [TRINITY_DN62_c0_g1::TRINITY_DN62_c0_g1_i1::g.51]
-indexing [TRINITY_DN69_c0_g2::TRINITY_DN69_c0_g2_i1::g.45]
-indexing [TRINITY_DN80_c0_g1::TRINITY_DN80_c0_g1_i1::g.50]
-indexing [TRINITY_DN83_c0_g2::TRINITY_DN83_c0_g2_i1::g.47]
-indexing [TRINITY_DN83_c0_g2::TRINITY_DN83_c0_g2_i1::g.46]
-indexing [TRINITY_DN85_c0_g3::TRINITY_DN85_c0_g3_i1::g.49]
-indexing [TRINITY_DN96_c0_g2::TRINITY_DN96_c0_g2_i1::g.48]
Indexed TRINITY_DN104_c0_g1::TRINITY_DN104_c0_g1_i1::g.10::m.10
Indexed TRINITY_DN104_c0_g1::TRINITY_DN104_c0_g1_i1::g.9::m.9
Indexed TRINITY_DN104_c0_g1::TRINITY_DN104_c0_g1_i1::g.11::m.11
Indexed TRINITY_DN110_c0_g1::TRINITY_DN110_c0_g1_i1::g.37::m.37
Indexed TRINITY_DN110_c0_g1::TRINITY_DN110_c0_g1_i1::g.38::m.38
Indexed TRINITY_DN110_c0_g2::TRINITY_DN110_c0_g2_i1::g.39::m.39
Indexed TRINITY_DN110_c0_g2::TRINITY_DN110_c0_g2_i1::g.40::m.40
Indexed TRINITY_DN116_c0_g2::TRINITY_DN116_c0_g2_i1::g.1::m.1
Indexed TRINITY_DN116_c0_g2::TRINITY_DN116_c0_g2_i1::g.2::m.2
Indexed TRINITY_DN120_c0_g1::TRINITY_DN120_c0_g1_i1::g.27::m.27
Indexed TRINITY_DN120_c0_g1::TRINITY_DN120_c0_g1_i1::g.28::m.28
Indexed TRINITY_DN123_c0_g1::TRINITY_DN123_c0_g1_i1::g.15::m.15
Indexed TRINITY_DN123_c0_g5::TRINITY_DN123_c0_g5_i1::g.16::m.16
Indexed TRINITY_DN127_c0_g2::TRINITY_DN127_c0_g2_i1::g.7::m.7
Indexed TRINITY_DN127_c0_g2::TRINITY_DN127_c0_g2_i1::g.6::m.6
Indexed TRINITY_DN127_c0_g3::TRINITY_DN127_c0_g3_i1::g.8::m.8
Indexed TRINITY_DN128_c0_g1::TRINITY_DN128_c0_g1_i1::g.13::m.13
Indexed TRINITY_DN129_c0_g1::TRINITY_DN129_c0_g1_i1::g.29::m.29
Indexed TRINITY_DN129_c0_g1::TRINITY_DN129_c0_g1_i1::g.30::m.30
Indexed TRINITY_DN129_c0_g1::TRINITY_DN129_c0_g1_i1::g.31::m.31
Indexed TRINITY_DN129_c0_g3::TRINITY_DN129_c0_g3_i1::g.33::m.33
Indexed TRINITY_DN129_c0_g3::TRINITY_DN129_c0_g3_i1::g.32::m.32
Indexed TRINITY_DN131_c0_g1::TRINITY_DN131_c0_g1_i1::g.4::m.4
Indexed TRINITY_DN134_c0_g1::TRINITY_DN134_c0_g1_i1::g.22::m.22
Indexed TRINITY_DN134_c0_g1::TRINITY_DN134_c0_g1_i1::g.23::m.23
Indexed TRINITY_DN137_c0_g1::TRINITY_DN137_c0_g1_i1::g.18::m.18
Indexed TRINITY_DN137_c0_g1::TRINITY_DN137_c0_g1_i1::g.17::m.17
Indexed TRINITY_DN139_c0_g1::TRINITY_DN139_c0_g1_i1::g.5::m.5
Indexed TRINITY_DN141_c0_g1::TRINITY_DN141_c0_g1_i1::g.24::m.24
Indexed TRINITY_DN141_c0_g1::TRINITY_DN141_c0_g1_i1::g.26::m.26
Indexed TRINITY_DN141_c0_g1::TRINITY_DN141_c0_g1_i1::g.25::m.25
Indexed TRINITY_DN142_c0_g1::TRINITY_DN142_c0_g1_i1::g.41::m.41
Indexed TRINITY_DN142_c0_g1::TRINITY_DN142_c0_g1_i1::g.42::m.42
Indexed TRINITY_DN142_c0_g2::TRINITY_DN142_c0_g2_i1::g.44::m.44
Indexed TRINITY_DN142_c0_g2::TRINITY_DN142_c0_g2_i1::g.43::m.43
Indexed TRINITY_DN143_c0_g1::TRINITY_DN143_c0_g1_i1::g.14::m.14
Indexed TRINITY_DN144_c0_g1::TRINITY_DN144_c0_g1_i1::g.12::m.12
Indexed TRINITY_DN150_c0_g1::TRINITY_DN150_c0_g1_i1::g.21::m.21
Indexed TRINITY_DN154_c0_g1::TRINITY_DN154_c0_g1_i1::g.3::m.3
Indexed TRINITY_DN155_c0_g1::TRINITY_DN155_c0_g1_i1::g.20::m.20
Indexed TRINITY_DN155_c0_g1::TRINITY_DN155_c0_g1_i1::g.19::m.19
Indexed TRINITY_DN156_c0_g1::TRINITY_DN156_c0_g1_i1::g.34::m.34
Indexed TRINITY_DN156_c0_g1::TRINITY_DN156_c0_g1_i1::g.35::m.35
Indexed TRINITY_DN156_c0_g1::TRINITY_DN156_c0_g1_i1::g.36::m.36
Indexed TRINITY_DN4_c0_g2::TRINITY_DN4_c0_g2_i1::g.54::m.54
Indexed TRINITY_DN58_c0_g1::TRINITY_DN58_c0_g1_i1::g.55::m.55
Indexed TRINITY_DN60_c0_g1::TRINITY_DN60_c0_g1_i1::g.53::m.53
Indexed TRINITY_DN62_c0_g1::TRINITY_DN62_c0_g1_i1::g.52::m.52
Indexed TRINITY_DN62_c0_g1::TRINITY_DN62_c0_g1_i1::g.51::m.51
Indexed TRINITY_DN69_c0_g2::TRINITY_DN69_c0_g2_i1::g.45::m.45
Indexed TRINITY_DN80_c0_g1::TRINITY_DN80_c0_g1_i1::g.50::m.50
Indexed TRINITY_DN83_c0_g2::TRINITY_DN83_c0_g2_i1::g.47::m.47
Indexed TRINITY_DN83_c0_g2::TRINITY_DN83_c0_g2_i1::g.46::m.46
Indexed TRINITY_DN85_c0_g3::TRINITY_DN85_c0_g3_i1::g.49::m.49
Indexed TRINITY_DN96_c0_g2::TRINITY_DN96_c0_g2_i1::g.48::m.48
CMD: /home/shared/TransDecoder-3.0.1/util/gene_list_to_gff.pl Trinity.fasta.transdecoder_dir/longest_orfs.cds.scores.selected Trinity.fasta.transdecoder_dir/longest_orfs.gff3.inx > Trinity.fasta.transdecoder_dir/longest_orfs.cds.best_candidates.gff3
CMD: /home/shared/TransDecoder-3.0.1/util/remove_eclipsed_ORFs.pl Trinity.fasta.transdecoder_dir/longest_orfs.cds.best_candidates.gff3 > Trinity.fasta.transdecoder_dir/longest_orfs.cds.eclipsed_removed.gff3
-indexing [TRINITY_DN104_c0_g1::TRINITY_DN104_c0_g1_i1::g.9]
-indexing [TRINITY_DN110_c0_g1::TRINITY_DN110_c0_g1_i1::g.37]
-indexing [TRINITY_DN110_c0_g2::TRINITY_DN110_c0_g2_i1::g.39]
-indexing [TRINITY_DN116_c0_g2::TRINITY_DN116_c0_g2_i1::g.2]
-indexing [TRINITY_DN120_c0_g1::TRINITY_DN120_c0_g1_i1::g.28]
-indexing [TRINITY_DN123_c0_g1::TRINITY_DN123_c0_g1_i1::g.15]
-indexing [TRINITY_DN123_c0_g5::TRINITY_DN123_c0_g5_i1::g.16]
-indexing [TRINITY_DN127_c0_g2::TRINITY_DN127_c0_g2_i1::g.6]
-indexing [TRINITY_DN127_c0_g3::TRINITY_DN127_c0_g3_i1::g.8]
-indexing [TRINITY_DN128_c0_g1::TRINITY_DN128_c0_g1_i1::g.13]
-indexing [TRINITY_DN129_c0_g1::TRINITY_DN129_c0_g1_i1::g.30]
-indexing [TRINITY_DN129_c0_g3::TRINITY_DN129_c0_g3_i1::g.32]
-indexing [TRINITY_DN131_c0_g1::TRINITY_DN131_c0_g1_i1::g.4]
-indexing [TRINITY_DN134_c0_g1::TRINITY_DN134_c0_g1_i1::g.22]
-indexing [TRINITY_DN137_c0_g1::TRINITY_DN137_c0_g1_i1::g.18]
-indexing [TRINITY_DN139_c0_g1::TRINITY_DN139_c0_g1_i1::g.5]
-indexing [TRINITY_DN141_c0_g1::TRINITY_DN141_c0_g1_i1::g.24]
-indexing [TRINITY_DN142_c0_g1::TRINITY_DN142_c0_g1_i1::g.42]
-indexing [TRINITY_DN142_c0_g2::TRINITY_DN142_c0_g2_i1::g.44]
-indexing [TRINITY_DN143_c0_g1::TRINITY_DN143_c0_g1_i1::g.14]
-indexing [TRINITY_DN144_c0_g1::TRINITY_DN144_c0_g1_i1::g.12]
-indexing [TRINITY_DN150_c0_g1::TRINITY_DN150_c0_g1_i1::g.21]
-indexing [TRINITY_DN154_c0_g1::TRINITY_DN154_c0_g1_i1::g.3]
-indexing [TRINITY_DN155_c0_g1::TRINITY_DN155_c0_g1_i1::g.20]
-indexing [TRINITY_DN156_c0_g1::TRINITY_DN156_c0_g1_i1::g.34]
-indexing [TRINITY_DN4_c0_g2::TRINITY_DN4_c0_g2_i1::g.54]
-indexing [TRINITY_DN58_c0_g1::TRINITY_DN58_c0_g1_i1::g.55]
-indexing [TRINITY_DN60_c0_g1::TRINITY_DN60_c0_g1_i1::g.53]
-indexing [TRINITY_DN62_c0_g1::TRINITY_DN62_c0_g1_i1::g.51]
-indexing [TRINITY_DN69_c0_g2::TRINITY_DN69_c0_g2_i1::g.45]
-indexing [TRINITY_DN80_c0_g1::TRINITY_DN80_c0_g1_i1::g.50]
-indexing [TRINITY_DN83_c0_g2::TRINITY_DN83_c0_g2_i1::g.47]
-indexing [TRINITY_DN85_c0_g3::TRINITY_DN85_c0_g3_i1::g.49]
-indexing [TRINITY_DN96_c0_g2::TRINITY_DN96_c0_g2_i1::g.48]
CMD: cp Trinity.fasta.transdecoder_dir/longest_orfs.cds.eclipsed_removed.gff3 Trinity.fasta.transdecoder.gff3
CMD: /home/shared/TransDecoder-3.0.1/util/gff3_file_to_bed.pl Trinity.fasta.transdecoder.gff3 > Trinity.fasta.transdecoder.bed
-indexing [TRINITY_DN104_c0_g1::TRINITY_DN104_c0_g1_i1::g.9]
-indexing [TRINITY_DN110_c0_g1::TRINITY_DN110_c0_g1_i1::g.37]
-indexing [TRINITY_DN110_c0_g2::TRINITY_DN110_c0_g2_i1::g.39]
-indexing [TRINITY_DN116_c0_g2::TRINITY_DN116_c0_g2_i1::g.2]
-indexing [TRINITY_DN120_c0_g1::TRINITY_DN120_c0_g1_i1::g.28]
-indexing [TRINITY_DN123_c0_g1::TRINITY_DN123_c0_g1_i1::g.15]
-indexing [TRINITY_DN123_c0_g5::TRINITY_DN123_c0_g5_i1::g.16]
-indexing [TRINITY_DN127_c0_g2::TRINITY_DN127_c0_g2_i1::g.6]
-indexing [TRINITY_DN127_c0_g3::TRINITY_DN127_c0_g3_i1::g.8]
-indexing [TRINITY_DN128_c0_g1::TRINITY_DN128_c0_g1_i1::g.13]
-indexing [TRINITY_DN129_c0_g1::TRINITY_DN129_c0_g1_i1::g.30]
-indexing [TRINITY_DN129_c0_g3::TRINITY_DN129_c0_g3_i1::g.32]
-indexing [TRINITY_DN131_c0_g1::TRINITY_DN131_c0_g1_i1::g.4]
-indexing [TRINITY_DN134_c0_g1::TRINITY_DN134_c0_g1_i1::g.22]
-indexing [TRINITY_DN137_c0_g1::TRINITY_DN137_c0_g1_i1::g.18]
-indexing [TRINITY_DN139_c0_g1::TRINITY_DN139_c0_g1_i1::g.5]
-indexing [TRINITY_DN141_c0_g1::TRINITY_DN141_c0_g1_i1::g.24]
-indexing [TRINITY_DN142_c0_g1::TRINITY_DN142_c0_g1_i1::g.42]
-indexing [TRINITY_DN142_c0_g2::TRINITY_DN142_c0_g2_i1::g.44]
-indexing [TRINITY_DN143_c0_g1::TRINITY_DN143_c0_g1_i1::g.14]
-indexing [TRINITY_DN144_c0_g1::TRINITY_DN144_c0_g1_i1::g.12]
-indexing [TRINITY_DN150_c0_g1::TRINITY_DN150_c0_g1_i1::g.21]
-indexing [TRINITY_DN154_c0_g1::TRINITY_DN154_c0_g1_i1::g.3]
-indexing [TRINITY_DN155_c0_g1::TRINITY_DN155_c0_g1_i1::g.20]
-indexing [TRINITY_DN156_c0_g1::TRINITY_DN156_c0_g1_i1::g.34]
-indexing [TRINITY_DN4_c0_g2::TRINITY_DN4_c0_g2_i1::g.54]
-indexing [TRINITY_DN58_c0_g1::TRINITY_DN58_c0_g1_i1::g.55]
-indexing [TRINITY_DN60_c0_g1::TRINITY_DN60_c0_g1_i1::g.53]
-indexing [TRINITY_DN62_c0_g1::TRINITY_DN62_c0_g1_i1::g.51]
-indexing [TRINITY_DN69_c0_g2::TRINITY_DN69_c0_g2_i1::g.45]
-indexing [TRINITY_DN80_c0_g1::TRINITY_DN80_c0_g1_i1::g.50]
-indexing [TRINITY_DN83_c0_g2::TRINITY_DN83_c0_g2_i1::g.47]
-indexing [TRINITY_DN85_c0_g3::TRINITY_DN85_c0_g3_i1::g.49]
-indexing [TRINITY_DN96_c0_g2::TRINITY_DN96_c0_g2_i1::g.48]
CMD: /home/shared/TransDecoder-3.0.1/util/gff3_file_to_proteins.pl --gff3 Trinity.fasta.transdecoder.gff3 --fasta Trinity.fasta > Trinity.fasta.transdecoder.pep
-indexing [TRINITY_DN104_c0_g1::TRINITY_DN104_c0_g1_i1::g.9]
-indexing [TRINITY_DN110_c0_g1::TRINITY_DN110_c0_g1_i1::g.37]
-indexing [TRINITY_DN110_c0_g2::TRINITY_DN110_c0_g2_i1::g.39]
-indexing [TRINITY_DN116_c0_g2::TRINITY_DN116_c0_g2_i1::g.2]
-indexing [TRINITY_DN120_c0_g1::TRINITY_DN120_c0_g1_i1::g.28]
-indexing [TRINITY_DN123_c0_g1::TRINITY_DN123_c0_g1_i1::g.15]
-indexing [TRINITY_DN123_c0_g5::TRINITY_DN123_c0_g5_i1::g.16]
-indexing [TRINITY_DN127_c0_g2::TRINITY_DN127_c0_g2_i1::g.6]
-indexing [TRINITY_DN127_c0_g3::TRINITY_DN127_c0_g3_i1::g.8]
-indexing [TRINITY_DN128_c0_g1::TRINITY_DN128_c0_g1_i1::g.13]
-indexing [TRINITY_DN129_c0_g1::TRINITY_DN129_c0_g1_i1::g.30]
-indexing [TRINITY_DN129_c0_g3::TRINITY_DN129_c0_g3_i1::g.32]
-indexing [TRINITY_DN131_c0_g1::TRINITY_DN131_c0_g1_i1::g.4]
-indexing [TRINITY_DN134_c0_g1::TRINITY_DN134_c0_g1_i1::g.22]
-indexing [TRINITY_DN137_c0_g1::TRINITY_DN137_c0_g1_i1::g.18]
-indexing [TRINITY_DN139_c0_g1::TRINITY_DN139_c0_g1_i1::g.5]
-indexing [TRINITY_DN141_c0_g1::TRINITY_DN141_c0_g1_i1::g.24]
-indexing [TRINITY_DN142_c0_g1::TRINITY_DN142_c0_g1_i1::g.42]
-indexing [TRINITY_DN142_c0_g2::TRINITY_DN142_c0_g2_i1::g.44]
-indexing [TRINITY_DN143_c0_g1::TRINITY_DN143_c0_g1_i1::g.14]
-indexing [TRINITY_DN144_c0_g1::TRINITY_DN144_c0_g1_i1::g.12]
-indexing [TRINITY_DN150_c0_g1::TRINITY_DN150_c0_g1_i1::g.21]
-indexing [TRINITY_DN154_c0_g1::TRINITY_DN154_c0_g1_i1::g.3]
-indexing [TRINITY_DN155_c0_g1::TRINITY_DN155_c0_g1_i1::g.20]
-indexing [TRINITY_DN156_c0_g1::TRINITY_DN156_c0_g1_i1::g.34]
-indexing [TRINITY_DN4_c0_g2::TRINITY_DN4_c0_g2_i1::g.54]
-indexing [TRINITY_DN58_c0_g1::TRINITY_DN58_c0_g1_i1::g.55]
-indexing [TRINITY_DN60_c0_g1::TRINITY_DN60_c0_g1_i1::g.53]
-indexing [TRINITY_DN62_c0_g1::TRINITY_DN62_c0_g1_i1::g.51]
-indexing [TRINITY_DN69_c0_g2::TRINITY_DN69_c0_g2_i1::g.45]
-indexing [TRINITY_DN80_c0_g1::TRINITY_DN80_c0_g1_i1::g.50]
-indexing [TRINITY_DN83_c0_g2::TRINITY_DN83_c0_g2_i1::g.47]
-indexing [TRINITY_DN85_c0_g3::TRINITY_DN85_c0_g3_i1::g.49]
-indexing [TRINITY_DN96_c0_g2::TRINITY_DN96_c0_g2_i1::g.48]
CMD: /home/shared/TransDecoder-3.0.1/util/gff3_file_to_proteins.pl --gff3 Trinity.fasta.transdecoder.gff3 --fasta Trinity.fasta --seqType CDS > Trinity.fasta.transdecoder.cds
-indexing [TRINITY_DN104_c0_g1::TRINITY_DN104_c0_g1_i1::g.9]
-indexing [TRINITY_DN110_c0_g1::TRINITY_DN110_c0_g1_i1::g.37]
-indexing [TRINITY_DN110_c0_g2::TRINITY_DN110_c0_g2_i1::g.39]
-indexing [TRINITY_DN116_c0_g2::TRINITY_DN116_c0_g2_i1::g.2]
-indexing [TRINITY_DN120_c0_g1::TRINITY_DN120_c0_g1_i1::g.28]
-indexing [TRINITY_DN123_c0_g1::TRINITY_DN123_c0_g1_i1::g.15]
-indexing [TRINITY_DN123_c0_g5::TRINITY_DN123_c0_g5_i1::g.16]
-indexing [TRINITY_DN127_c0_g2::TRINITY_DN127_c0_g2_i1::g.6]
-indexing [TRINITY_DN127_c0_g3::TRINITY_DN127_c0_g3_i1::g.8]
-indexing [TRINITY_DN128_c0_g1::TRINITY_DN128_c0_g1_i1::g.13]
-indexing [TRINITY_DN129_c0_g1::TRINITY_DN129_c0_g1_i1::g.30]
-indexing [TRINITY_DN129_c0_g3::TRINITY_DN129_c0_g3_i1::g.32]
-indexing [TRINITY_DN131_c0_g1::TRINITY_DN131_c0_g1_i1::g.4]
-indexing [TRINITY_DN134_c0_g1::TRINITY_DN134_c0_g1_i1::g.22]
-indexing [TRINITY_DN137_c0_g1::TRINITY_DN137_c0_g1_i1::g.18]
-indexing [TRINITY_DN139_c0_g1::TRINITY_DN139_c0_g1_i1::g.5]
-indexing [TRINITY_DN141_c0_g1::TRINITY_DN141_c0_g1_i1::g.24]
-indexing [TRINITY_DN142_c0_g1::TRINITY_DN142_c0_g1_i1::g.42]
-indexing [TRINITY_DN142_c0_g2::TRINITY_DN142_c0_g2_i1::g.44]
-indexing [TRINITY_DN143_c0_g1::TRINITY_DN143_c0_g1_i1::g.14]
-indexing [TRINITY_DN144_c0_g1::TRINITY_DN144_c0_g1_i1::g.12]
-indexing [TRINITY_DN150_c0_g1::TRINITY_DN150_c0_g1_i1::g.21]
-indexing [TRINITY_DN154_c0_g1::TRINITY_DN154_c0_g1_i1::g.3]
-indexing [TRINITY_DN155_c0_g1::TRINITY_DN155_c0_g1_i1::g.20]
-indexing [TRINITY_DN156_c0_g1::TRINITY_DN156_c0_g1_i1::g.34]
-indexing [TRINITY_DN4_c0_g2::TRINITY_DN4_c0_g2_i1::g.54]
-indexing [TRINITY_DN58_c0_g1::TRINITY_DN58_c0_g1_i1::g.55]
-indexing [TRINITY_DN60_c0_g1::TRINITY_DN60_c0_g1_i1::g.53]
-indexing [TRINITY_DN62_c0_g1::TRINITY_DN62_c0_g1_i1::g.51]
-indexing [TRINITY_DN69_c0_g2::TRINITY_DN69_c0_g2_i1::g.45]
-indexing [TRINITY_DN80_c0_g1::TRINITY_DN80_c0_g1_i1::g.50]
-indexing [TRINITY_DN83_c0_g2::TRINITY_DN83_c0_g2_i1::g.47]
-indexing [TRINITY_DN85_c0_g3::TRINITY_DN85_c0_g3_i1::g.49]
-indexing [TRINITY_DN96_c0_g2::TRINITY_DN96_c0_g2_i1::g.48]
transdecoder is finished. See output files Trinity.fasta.transdecoder.*
list.files(pattern = "*.pep")
[1] "Trinity.fasta.transdecoder.pep"
There’s the transdecoder.pep file we needed. On to blastx!
system("blastx -query /home/sean/Documents/Trinity-Test/trinity_out_dir/Trinity.fasta -db /home/shared/Trinotate-3.0.2/admin/uniprot_sprot.pep -num_threads 8 -max_target_seqs 1 -outfmt 6 > blastx.outfmt6")
BLAST Database error: No alias or index file found for protein database [/home/shared/Trinotate-3.0.2/admin/uniprot_sprot.pep] in search path [/home/sean/Documents/Trinity-Test/trinity_out_dir::]
It looks like I forgot to run the makeblastdb command, so I did that real quick via sudo makeblastdb -in uniprot_sprot.pep -dbtype prot. Hopefully it’ll work better now.
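One way to catch this before launching a search, as a rough sketch: check that the index files makeblastdb writes next to the FASTA file are actually there. The helper name blast_db_ready is made up for illustration, and the .phr/.pin/.psq extensions assume a single-volume protein database; a large database split into volumes would look different.

```r
# Hypothetical helper: TRUE if the protein BLAST index files sit next to `db`.
# Assumes a single-volume database (.phr, .pin, .psq written by makeblastdb).
blast_db_ready <- function(db) {
  all(file.exists(paste0(db, c(".phr", ".pin", ".psq"))))
}

db <- "/home/shared/Trinotate-3.0.2/admin/uniprot_sprot.pep"
if (!blast_db_ready(db)) {
  message("No BLAST index found for ", db,
          "; run makeblastdb -in ", db, " -dbtype prot first")
}
```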
system("blastx -query /home/sean/Documents/Trinity-Test/trinity_out_dir/Trinity.fasta -db /home/shared/Trinotate-3.0.2/admin/uniprot_sprot.pep -num_threads 8 -max_target_seqs 1 -outfmt 6 > blastx.outfmt6")
Let’s check if blastx made our file.
list.files(pattern = "*.outfmt6")
[1] "blastx.outfmt6"
system("head blastx.outfmt6")
TRINITY_DN116_c0_g1_i1 YJBA_SCHPO 100.00 44 0 0 205 74 22 65 3e-23 94.0
TRINITY_DN116_c0_g2_i1 YJBA_SCHPO 100.00 23 0 0 2 70 93 115 2e-07 52.8
TRINITY_DN116_c0_g2_i1 YJBA_SCHPO 100.00 43 0 0 290 418 189 231 4e-06 48.5
TRINITY_DN152_c0_g1_i1 RL6_SCHPO 100.00 79 0 0 237 1 66 144 2e-49 160
TRINITY_DN154_c0_g1_i1 GLD1_SCHPO 100.00 133 0 0 554 156 318 450 1e-93 286
TRINITY_DN131_c0_g1_i1 GHT5_SCHPO 100.00 232 0 0 1 696 158 389 3e-156 452
TRINITY_DN131_c0_g2_i1 PTR31_ARATH 42.11 38 19 1 177 73 165 202 3.1 29.3
TRINITY_DN118_c0_g1_i1 RTN1_SCHPO 100.00 96 0 0 289 2 24 119 9e-49 162
TRINITY_DN133_c0_g2_i1 LSD90_SCHPO 97.06 34 1 0 102 1 1 34 1e-12 69.3
TRINITY_DN139_c0_g1_i1 HSP72_SCHPO 99.08 109 1 0 327 1 15 123 8e-73 234
Looks like it did!
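To do more than eyeball the file, the table can be pulled into R. This is only a sketch: the twelve column names are the standard fields of BLAST's default -outfmt 6, three of the lines printed above stand in for the real file, and the 1e-5 e-value cutoff is an arbitrary choice for illustration.

```r
# Column names for BLAST's default tabular output (-outfmt 6).
outfmt6_cols <- c("qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                  "qstart", "qend", "sstart", "send", "evalue", "bitscore")

# Three of the lines printed by head above, used here instead of the real file.
sample_hits <- c(
  "TRINITY_DN116_c0_g1_i1\tYJBA_SCHPO\t100.00\t44\t0\t0\t205\t74\t22\t65\t3e-23\t94.0",
  "TRINITY_DN131_c0_g1_i1\tGHT5_SCHPO\t100.00\t232\t0\t0\t1\t696\t158\t389\t3e-156\t452",
  "TRINITY_DN131_c0_g2_i1\tPTR31_ARATH\t42.11\t38\t19\t1\t177\t73\t165\t202\t3.1\t29.3"
)

hits <- read.table(text = sample_hits, sep = "\t", col.names = outfmt6_cols,
                   stringsAsFactors = FALSE)

# Keep only hits below the (arbitrary) e-value cutoff.
good_hits <- hits[hits$evalue < 1e-5, ]
good_hits[, c("qseqid", "sseqid", "pident", "evalue")]
```

For the real file, replacing text = sample_hits with file = "blastx.outfmt6" should be the only change needed.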
Next we run blastp.
system("blastp -query Trinity.fasta.transdecoder.pep -db /home/shared/Trinotate-3.0.2/admin/uniprot_sprot.pep -num_threads 8 -max_target_seqs 1 -outfmt 6 > blastp.outfmt6")
On to HMMER, which required a similar database prep step of sudo hmmpress Pfam-A.hmm. Also, the databases are under the admin directory of Trinotate, so I’ll need to add that to my path variable.
system("hmmscan --cpu 8 --domtblout TrinotatePFAM.out /home/shared/Trinotate-3.0.2/admin/Pfam-A.hmm Trinity.fasta.transdecoder.pep > pfam.log")
Let’s check if that was successful.
system("head pfam.log")
# hmmscan :: search sequence(s) against a profile database
# HMMER 3.1b2 (February 2015); http://hmmer.org/
# Copyright (C) 2015 Howard Hughes Medical Institute.
# Freely distributed under the GNU General Public License (GPLv3).
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query sequence file: Trinity.fasta.transdecoder.pep
# target HMM database: /home/shared/Trinotate-3.0.2/admin/Pfam-A.hmm
# per-dom hits tabular output: TrinotatePFAM.out
# number of worker threads: 8
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Looks good I guess?
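The head of pfam.log only shows the banner, so it doesn't say whether any domains were found. A slightly stronger check, sketched here, is to count the non-comment lines in the --domtblout file, where comment lines start with # and each remaining row is one domain hit. The helper name count_domtbl_hits is made up for this example.

```r
# Hypothetical helper: number of non-comment, non-empty lines in an
# hmmscan --domtblout file (one line per reported domain hit).
count_domtbl_hits <- function(path) {
  lines <- readLines(path)
  sum(!grepl("^#", lines) & nzchar(lines))
}

if (file.exists("TrinotatePFAM.out")) {
  count_domtbl_hits("TrinotatePFAM.out")
}
```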
Next, SignalP
system("signalp -f short -n signalp.out Trinity.fasta.transdecoder.pep")
# SignalP-4.1 euk predictions
# name Cmax pos Ymax pos Smax pos Smean D ? Dmaxcut Networks-used
# No sequences predicted with a signal peptide
TRINITY_DN104_c0_g1::TRINITY_DN104_c0_g1_i1::g.9::m.9 0.113 24 0.105 24 0.125 4 0.098 0.101 N 0.450 SignalP-noTM
TRINITY_DN110_c0_g1::TRINITY_DN110_c0_g1_i1::g.37::m.37 0.109 64 0.107 38 0.119 27 0.098 0.102 N 0.450 SignalP-noTM
TRINITY_DN110_c0_g2::TRINITY_DN110_c0_g2_i1::g.39::m.39 0.108 9 0.103 70 0.107 36 0.095 0.099 N 0.450 SignalP-noTM
TRINITY_DN116_c0_g2::TRINITY_DN116_c0_g2_i1::g.2::m.2 0.252 37 0.275 37 0.482 30 0.242 0.262 N 0.500 SignalP-TM
TRINITY_DN120_c0_g1::TRINITY_DN120_c0_g1_i1::g.28::m.28 0.125 41 0.152 41 0.339 36 0.152 0.152 N 0.450 SignalP-noTM
TRINITY_DN123_c0_g1::TRINITY_DN123_c0_g1_i1::g.15::m.15 0.129 35 0.117 11 0.155 1 0.108 0.112 N 0.450 SignalP-noTM
TRINITY_DN123_c0_g5::TRINITY_DN123_c0_g5_i1::g.16::m.16 0.108 19 0.112 16 0.137 13 0.116 0.114 N 0.450 SignalP-noTM
TRINITY_DN127_c0_g2::TRINITY_DN127_c0_g2_i1::g.6::m.6 0.119 39 0.133 31 0.222 15 0.154 0.144 N 0.450 SignalP-noTM
TRINITY_DN127_c0_g3::TRINITY_DN127_c0_g3_i1::g.8::m.8 0.123 14 0.135 14 0.284 20 0.131 0.133 N 0.450 SignalP-noTM
TRINITY_DN128_c0_g1::TRINITY_DN128_c0_g1_i1::g.13::m.13 0.159 29 0.155 29 0.218 20 0.149 0.152 N 0.450 SignalP-noTM
TRINITY_DN129_c0_g1::TRINITY_DN129_c0_g1_i1::g.30::m.30 0.118 26 0.116 26 0.168 25 0.113 0.114 N 0.450 SignalP-noTM
TRINITY_DN129_c0_g3::TRINITY_DN129_c0_g3_i1::g.32::m.32 0.117 43 0.136 43 0.376 36 0.137 0.136 N 0.450 SignalP-noTM
TRINITY_DN131_c0_g1::TRINITY_DN131_c0_g1_i1::g.4::m.4 0.112 18 0.140 18 0.303 11 0.165 0.150 N 0.500 SignalP-TM
TRINITY_DN134_c0_g1::TRINITY_DN134_c0_g1_i1::g.22::m.22 0.131 70 0.117 70 0.189 49 0.098 0.107 N 0.450 SignalP-noTM
TRINITY_DN137_c0_g1::TRINITY_DN137_c0_g1_i1::g.18::m.18 0.238 23 0.175 23 0.198 21 0.128 0.150 N 0.450 SignalP-noTM
TRINITY_DN139_c0_g1::TRINITY_DN139_c0_g1_i1::g.5::m.5 0.108 31 0.105 58 0.114 38 0.099 0.102 N 0.450 SignalP-noTM
TRINITY_DN141_c0_g1::TRINITY_DN141_c0_g1_i1::g.24::m.24 0.108 36 0.102 36 0.109 15 0.095 0.098 N 0.450 SignalP-noTM
TRINITY_DN142_c0_g1::TRINITY_DN142_c0_g1_i1::g.42::m.42 0.126 67 0.153 15 0.264 13 0.196 0.176 N 0.450 SignalP-noTM
TRINITY_DN142_c0_g2::TRINITY_DN142_c0_g2_i1::g.44::m.44 0.113 19 0.104 59 0.112 54 0.094 0.099 N 0.450 SignalP-noTM
TRINITY_DN143_c0_g1::TRINITY_DN143_c0_g1_i1::g.14::m.14 0.116 18 0.117 18 0.133 7 0.117 0.117 N 0.450 SignalP-noTM
TRINITY_DN144_c0_g1::TRINITY_DN144_c0_g1_i1::g.12::m.12 0.110 41 0.105 41 0.110 28 0.096 0.100 N 0.450 SignalP-noTM
TRINITY_DN150_c0_g1::TRINITY_DN150_c0_g1_i1::g.21::m.21 0.107 66 0.120 12 0.163 1 0.108 0.114 N 0.450 SignalP-noTM
TRINITY_DN154_c0_g1::TRINITY_DN154_c0_g1_i1::g.3::m.3 0.113 24 0.128 11 0.201 10 0.142 0.136 N 0.450 SignalP-noTM
TRINITY_DN155_c0_g1::TRINITY_DN155_c0_g1_i1::g.20::m.20 0.143 43 0.132 43 0.219 41 0.114 0.122 N 0.450 SignalP-noTM
TRINITY_DN156_c0_g1::TRINITY_DN156_c0_g1_i1::g.34::m.34 0.108 62 0.106 44 0.124 35 0.101 0.103 N 0.450 SignalP-noTM
TRINITY_DN4_c0_g2::TRINITY_DN4_c0_g2_i1::g.54::m.54 0.188 43 0.154 24 0.248 3 0.168 0.162 N 0.450 SignalP-noTM
TRINITY_DN58_c0_g1::TRINITY_DN58_c0_g1_i1::g.55::m.55 0.109 33 0.106 43 0.147 32 0.100 0.103 N 0.450 SignalP-noTM
TRINITY_DN60_c0_g1::TRINITY_DN60_c0_g1_i1::g.53::m.53 0.113 62 0.111 11 0.141 1 0.106 0.108 N 0.450 SignalP-noTM
TRINITY_DN62_c0_g1::TRINITY_DN62_c0_g1_i1::g.51::m.51 0.112 23 0.122 12 0.153 1 0.119 0.120 N 0.450 SignalP-noTM
TRINITY_DN69_c0_g2::TRINITY_DN69_c0_g2_i1::g.45::m.45 0.107 68 0.155 11 0.330 2 0.181 0.169 N 0.450 SignalP-noTM
TRINITY_DN80_c0_g1::TRINITY_DN80_c0_g1_i1::g.50::m.50 0.122 27 0.106 27 0.110 24 0.091 0.098 N 0.450 SignalP-noTM
TRINITY_DN83_c0_g2::TRINITY_DN83_c0_g2_i1::g.47::m.47 0.122 55 0.133 23 0.209 18 0.163 0.145 N 0.500 SignalP-TM
TRINITY_DN85_c0_g3::TRINITY_DN85_c0_g3_i1::g.49::m.49 0.127 28 0.111 28 0.113 27 0.097 0.104 N 0.450 SignalP-noTM
TRINITY_DN96_c0_g2::TRINITY_DN96_c0_g2_i1::g.48::m.48 0.215 21 0.153 21 0.161 43 0.106 0.128 N 0.450 SignalP-noTM
For some reason signalp didn’t produce the signalp.out file it was supposed to, so I just redirected its standard output to signalp.out instead. Hopefully that will work the same way?
system("signalp -f short -n signalp.out Trinity.fasta.transdecoder.pep > signalp.out")
# No sequences predicted with a signal peptide
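The short-format table printed above can also be read into R to tabulate the calls directly. This is a sketch using two of the lines above as sample input. The column names follow the "# name Cmax pos ..." header, with the repeated pos columns renamed to keep them unique and the "?" column renamed to call (both renamings are assumptions of this example).

```r
# Column names based on the SignalP short-format header line.
signalp_cols <- c("name", "Cmax", "Cmax_pos", "Ymax", "Ymax_pos", "Smax",
                  "Smax_pos", "Smean", "D", "call", "Dmaxcut", "networks")

# Header plus two of the prediction lines shown above.
sample_out <- c(
  "# SignalP-4.1 euk predictions",
  "TRINITY_DN104_c0_g1::TRINITY_DN104_c0_g1_i1::g.9::m.9 0.113 24 0.105 24 0.125 4 0.098 0.101 N 0.450 SignalP-noTM",
  "TRINITY_DN116_c0_g2::TRINITY_DN116_c0_g2_i1::g.2::m.2 0.252 37 0.275 37 0.482 30 0.242 0.262 N 0.500 SignalP-TM"
)

# Lines starting with "#" are skipped as comments.
sp <- read.table(text = sample_out, comment.char = "#",
                 col.names = signalp_cols, stringsAsFactors = FALSE)

# Count Y/N calls; every row here is "N", matching the
# "No sequences predicted with a signal peptide" message.
table(sp$call)
```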
Next, TMHMM. Looks like I only added /home/shared/tmhmm-2.0c/ to my path, and not /home/shared/tmhmm-2.0c/bin like I needed to. Oops. Once more!
system("/home/shared/tmhmm-2.0c/bin/tmhmm --short < Trinity.fasta.transdecoder.pep > tmhmm.out")
system("head tmhmm.out")
TRINITY_DN104_c0_g1::TRINITY_DN104_c0_g1_i1::g.9::m.9 len=341 ExpAA=1.02 First60=0.70 PredHel=0 Topology=o
TRINITY_DN110_c0_g1::TRINITY_DN110_c0_g1_i1::g.37::m.37 len=138 ExpAA=0.00 First60=0.00 PredHel=0 Topology=o
TRINITY_DN110_c0_g2::TRINITY_DN110_c0_g2_i1::g.39::m.39 len=175 ExpAA=0.00 First60=0.00 PredHel=0 Topology=o
TRINITY_DN116_c0_g2::TRINITY_DN116_c0_g2_i1::g.2::m.2 len=139 ExpAA=66.89 First60=20.85 PredHel=3 Topology=i16-38o81-103i115-137o
TRINITY_DN120_c0_g1::TRINITY_DN120_c0_g1_i1::g.28::m.28 len=106 ExpAA=0.04 First60=0.04 PredHel=0 Topology=o
TRINITY_DN123_c0_g1::TRINITY_DN123_c0_g1_i1::g.15::m.15 len=111 ExpAA=0.08 First60=0.08 PredHel=0 Topology=o
TRINITY_DN123_c0_g5::TRINITY_DN123_c0_g5_i1::g.16::m.16 len=137 ExpAA=0.05 First60=0.00 PredHel=0 Topology=o
TRINITY_DN127_c0_g2::TRINITY_DN127_c0_g2_i1::g.6::m.6 len=210 ExpAA=4.12 First60=4.11 PredHel=0 Topology=o
TRINITY_DN127_c0_g3::TRINITY_DN127_c0_g3_i1::g.8::m.8 len=165 ExpAA=0.85 First60=0.31 PredHel=0 Topology=o
TRINITY_DN128_c0_g1::TRINITY_DN128_c0_g1_i1::g.13::m.13 len=107 ExpAA=11.40 First60=11.38 PredHel=0 Topology=o
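Each line of the --short output is an ID followed by key=value fields, so it is easy to turn into a data frame. The sketch below uses two of the lines above as input; parse_tmhmm_short is a made-up helper name, and it only pulls out a few of the fields.

```r
# Hypothetical helper: parse TMHMM --short lines (ID followed by key=value fields).
parse_tmhmm_short <- function(lines) {
  fields <- strsplit(lines, "[ \t]+")
  # Return the value for `key` from one split line, or NA if the key is absent.
  get_value <- function(parts, key) {
    hit <- grep(paste0("^", key, "="), parts, value = TRUE)
    sub(paste0("^", key, "="), "", hit[1])
  }
  data.frame(
    id       = vapply(fields, `[`, character(1), 1),
    len      = as.integer(vapply(fields, get_value, character(1), key = "len")),
    ExpAA    = as.numeric(vapply(fields, get_value, character(1), key = "ExpAA")),
    PredHel  = as.integer(vapply(fields, get_value, character(1), key = "PredHel")),
    Topology = vapply(fields, get_value, character(1), key = "Topology"),
    stringsAsFactors = FALSE
  )
}

sample_tmhmm <- c(
  "TRINITY_DN104_c0_g1::TRINITY_DN104_c0_g1_i1::g.9::m.9 len=341 ExpAA=1.02 First60=0.70 PredHel=0 Topology=o",
  "TRINITY_DN116_c0_g2::TRINITY_DN116_c0_g2_i1::g.2::m.2 len=139 ExpAA=66.89 First60=20.85 PredHel=3 Topology=i16-38o81-103i115-137o"
)

tm <- parse_tmhmm_short(sample_tmhmm)
# Peptides with at least one predicted transmembrane helix.
tm[tm$PredHel > 0, ]
```

For the real file, parse_tmhmm_short(readLines("tmhmm.out")) should work the same way.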
That’s some data. On to RNAMMER. Hopefully this works, as it was the thing we had to hack together to install.
system("/home/shared/Trinotate-3.0.2/util/rnammer_support/RnammerTranscriptome.pl --transcriptome Trinity.fasta --path_to_rnammer /home/shared/RNAMMER/rnammer")
CMD: /home/shared/Trinotate-3.0.2/util/rnammer_support/util/superScaffoldGenerator.pl Trinity.fasta transcriptSuperScaffold 100
acc: TRINITY_DN116_c0_g1_i1
acc: TRINITY_DN116_c0_g2_i1
acc: TRINITY_DN152_c0_g1_i1
acc: TRINITY_DN154_c0_g1_i1
acc: TRINITY_DN131_c0_g1_i1
acc: TRINITY_DN131_c0_g2_i1
acc: TRINITY_DN118_c0_g1_i1
acc: TRINITY_DN133_c0_g1_i1
acc: TRINITY_DN133_c0_g2_i1
acc: TRINITY_DN139_c0_g1_i1
acc: TRINITY_DN127_c0_g1_i1
acc: TRINITY_DN127_c0_g2_i1
acc: TRINITY_DN127_c0_g3_i1
acc: TRINITY_DN140_c0_g1_i1
acc: TRINITY_DN104_c0_g1_i1
acc: TRINITY_DN104_c0_g2_i1
acc: TRINITY_DN144_c0_g1_i1
acc: TRINITY_DN128_c0_g1_i1
acc: TRINITY_DN128_c0_g2_i1
acc: TRINITY_DN143_c0_g1_i1
acc: TRINITY_DN123_c0_g1_i1
acc: TRINITY_DN123_c0_g2_i1
acc: TRINITY_DN123_c0_g3_i1
acc: TRINITY_DN123_c0_g4_i1
acc: TRINITY_DN123_c0_g5_i1
acc: TRINITY_DN126_c0_g1_i1
acc: TRINITY_DN137_c0_g1_i1
acc: TRINITY_DN137_c0_g2_i1
acc: TRINITY_DN122_c0_g1_i1
acc: TRINITY_DN122_c0_g2_i1
acc: TRINITY_DN112_c0_g1_i1
acc: TRINITY_DN112_c0_g2_i1
acc: TRINITY_DN155_c0_g1_i1
acc: TRINITY_DN150_c0_g1_i1
acc: TRINITY_DN157_c0_g1_i1
acc: TRINITY_DN117_c0_g1_i1
acc: TRINITY_DN136_c0_g1_i1
acc: TRINITY_DN111_c0_g1_i1
acc: TRINITY_DN111_c0_g2_i1
acc: TRINITY_DN125_c0_g1_i1
acc: TRINITY_DN148_c0_g1_i1
acc: TRINITY_DN105_c0_g1_i1
acc: TRINITY_DN105_c0_g2_i1
acc: TRINITY_DN153_c0_g1_i1
acc: TRINITY_DN134_c0_g1_i1
acc: TRINITY_DN145_c0_g1_i1
acc: TRINITY_DN119_c0_g1_i1
acc: TRINITY_DN119_c0_g2_i1
acc: TRINITY_DN119_c0_g3_i1
acc: TRINITY_DN119_c0_g4_i1
acc: TRINITY_DN141_c0_g1_i1
acc: TRINITY_DN141_c0_g2_i1
acc: TRINITY_DN120_c0_g1_i1
acc: TRINITY_DN146_c0_g1_i1
acc: TRINITY_DN138_c0_g1_i1
acc: TRINITY_DN114_c0_g1_i1
acc: TRINITY_DN129_c0_g1_i1
acc: TRINITY_DN129_c0_g2_i1
acc: TRINITY_DN129_c0_g3_i1
acc: TRINITY_DN124_c0_g1_i1
acc: TRINITY_DN124_c0_g2_i1
acc: TRINITY_DN124_c1_g1_i1
acc: TRINITY_DN130_c0_g1_i1
acc: TRINITY_DN130_c0_g2_i1
acc: TRINITY_DN147_c0_g1_i1
acc: TRINITY_DN156_c0_g1_i1
acc: TRINITY_DN132_c0_g1_i1
acc: TRINITY_DN158_c0_g1_i1
acc: TRINITY_DN110_c0_g1_i1
acc: TRINITY_DN110_c0_g2_i1
acc: TRINITY_DN142_c0_g1_i1
acc: TRINITY_DN142_c0_g2_i1
acc: TRINITY_DN151_c0_g1_i1
acc: TRINITY_DN69_c0_g1_i1
acc: TRINITY_DN69_c0_g2_i1
acc: TRINITY_DN65_c0_g1_i1
acc: TRINITY_DN83_c0_g1_i1
acc: TRINITY_DN83_c0_g2_i1
acc: TRINITY_DN6_c0_g1_i1
acc: TRINITY_DN45_c0_g1_i1
acc: TRINITY_DN78_c0_g1_i1
acc: TRINITY_DN12_c0_g1_i1
acc: TRINITY_DN96_c0_g1_i1
acc: TRINITY_DN96_c0_g2_i1
acc: TRINITY_DN66_c0_g1_i1
acc: TRINITY_DN61_c0_g1_i1
acc: TRINITY_DN89_c0_g1_i1
acc: TRINITY_DN1_c0_g1_i1
acc: TRINITY_DN81_c0_g1_i1
acc: TRINITY_DN73_c0_g1_i1
acc: TRINITY_DN85_c0_g1_i1
acc: TRINITY_DN85_c0_g2_i1
acc: TRINITY_DN85_c0_g3_i1
acc: TRINITY_DN80_c0_g1_i1
acc: TRINITY_DN62_c0_g1_i1
acc: TRINITY_DN62_c0_g2_i1
acc: TRINITY_DN9_c0_g1_i1
acc: TRINITY_DN38_c0_g1_i1
acc: TRINITY_DN38_c0_g2_i1
acc: TRINITY_DN27_c0_g1_i1
acc: TRINITY_DN60_c0_g1_i1
acc: TRINITY_DN40_c0_g1_i1
acc: TRINITY_DN95_c0_g1_i1
acc: TRINITY_DN49_c0_g1_i1
acc: TRINITY_DN67_c0_g1_i1
acc: TRINITY_DN28_c0_g1_i1
acc: TRINITY_DN91_c0_g1_i1
acc: TRINITY_DN93_c0_g1_i1
acc: TRINITY_DN93_c0_g2_i1
acc: TRINITY_DN17_c0_g1_i1
acc: TRINITY_DN7_c0_g1_i1
acc: TRINITY_DN4_c0_g1_i1
acc: TRINITY_DN4_c0_g2_i1
acc: TRINITY_DN58_c0_g1_i1
acc: TRINITY_DN59_c0_g1_i1
Done.
CMD: perl /home/shared/RNAMMER/rnammer -S euk -m tsu,lsu,ssu -gff tmp.superscaff.rnammer.gff < transcriptSuperScaffold.fasta
CMD: /home/shared/Trinotate-3.0.2/util/rnammer_support/util/rnammer_supperscaffold_gff_to_indiv_transcripts.pl -R tmp.superscaff.rnammer.gff -T transcriptSuperScaffold.bed > Trinity.fasta.rnammer.gff
WARNING: No RNAMMER features are described in file: tmp.superscaff.rnammer.gff at /home/shared/Trinotate-3.0.2/util/rnammer_support/util/rnammer_supperscaffold_gff_to_indiv_transcripts.pl line 46.
Error, cmd: /home/shared/Trinotate-3.0.2/util/rnammer_support/util/rnammer_supperscaffold_gff_to_indiv_transcripts.pl -R tmp.superscaff.rnammer.gff -T transcriptSuperScaffold.bed > Trinity.fasta.rnammer.gff died with ret 65280 at /home/shared/Trinotate-3.0.2/util/rnammer_support/RnammerTranscriptome.pl line 80.
Well… Hmm. I tested RNAMMER the way the Trinotate manual suggested and it worked. I’m not 100% sure what’s going on here, but since RNAMMER is optional for the functionality of Trinotate, I’ll skip this for now and do some more research into why RNAMMER isn’t outputting anything to the .gff file. Strange.
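Since system() returns the command's exit status, an optional step like this one can be wrapped so that a failure is reported without stopping the rest of a script. The run_step helper below is a hypothetical sketch, not part of Trinotate, and it assumes a Unix-like shell.

```r
# Hypothetical wrapper: run a shell command, report a non-zero exit status,
# and only stop the script when the step is not optional.
run_step <- function(cmd, optional = FALSE) {
  status <- system(cmd)
  if (status != 0) {
    msg <- sprintf("command exited with status %d: %s", status, cmd)
    if (optional) warning(msg, call. = FALSE) else stop(msg, call. = FALSE)
  }
  invisible(status == 0)
}

# Example with a trivial command; an optional step would be called as
# run_step("<the RnammerTranscriptome.pl command above>", optional = TRUE)
ok <- run_step("exit 0")
ok
```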
Now we start combining things in Trinotate’s SQL database. Note, the next command uses Trinity, not Trinotate. Don’t make the same mistake I did.
system("/home/shared/trinityrnaseq-Trinity-v2.4.0/util/support_scripts/get_Trinity_gene_to_trans_map.pl Trinity.fasta > Trinity.fasta.gene_trans_map")
system("head Trinity.fasta.gene_trans_map")
TRINITY_DN116_c0_g1 TRINITY_DN116_c0_g1_i1
TRINITY_DN116_c0_g2 TRINITY_DN116_c0_g2_i1
TRINITY_DN152_c0_g1 TRINITY_DN152_c0_g1_i1
TRINITY_DN154_c0_g1 TRINITY_DN154_c0_g1_i1
TRINITY_DN131_c0_g1 TRINITY_DN131_c0_g1_i1
TRINITY_DN131_c0_g2 TRINITY_DN131_c0_g2_i1
TRINITY_DN118_c0_g1 TRINITY_DN118_c0_g1_i1
TRINITY_DN133_c0_g1 TRINITY_DN133_c0_g1_i1
TRINITY_DN133_c0_g2 TRINITY_DN133_c0_g2_i1
TRINITY_DN139_c0_g1 TRINITY_DN139_c0_g1_i1
Looks like there’s some stuff there.
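A slightly more convincing check than "some stuff" is to confirm that each gene ID is just the transcript ID with its isoform suffix removed, which is how Trinity names things. The sketch below uses two of the lines above and assumes the map file is tab-separated.

```r
# Two lines from the head output above (assumed tab-separated in the real file).
map_lines <- c(
  "TRINITY_DN116_c0_g1\tTRINITY_DN116_c0_g1_i1",
  "TRINITY_DN116_c0_g2\tTRINITY_DN116_c0_g2_i1"
)
gene_trans <- read.table(text = map_lines, sep = "\t",
                         col.names = c("gene_id", "transcript_id"),
                         stringsAsFactors = FALSE)

# A Trinity gene ID is the transcript ID without its trailing
# isoform suffix (_i1, _i2, ...).
expected_gene <- sub("_i[0-9]+$", "", gene_trans$transcript_id)
all(expected_gene == gene_trans$gene_id)
```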
system("/home/shared/Trinotate-3.0.2/Trinotate /home/shared/Trinotate-3.0.2/admin/Trinotate.sqlite init --gene_trans_map Trinity.fasta.gene_trans_map --transcript_fasta Trinity.fasta --transdecoder_pep Trinity.fasta.transdecoder.pep")
CMD: /home/shared/Trinotate-3.0.2/util/trinotateSeqLoader/TrinotateSeqLoader.pl --sqlite /home/shared/Trinotate-3.0.2/admin/Trinotate.sqlite --gene_trans_map Trinity.fasta.gene_trans_map --transcript_fasta Trinity.fasta --transdecoder_pep Trinity.fasta.transdecoder.pep --bulk_load
-parsing gene/trans map file.... done.
DBD::SQLite::db do failed: attempt to write a readonly database at /home/shared/Trinotate-3.0.2/util/trinotateSeqLoader/TrinotateSeqLoader.pl line 93.
No such file or directory at /home/shared/Trinotate-3.0.2/util/trinotateSeqLoader/TrinotateSeqLoader.pl line 93.
Error, cmd: /home/shared/Trinotate-3.0.2/util/trinotateSeqLoader/TrinotateSeqLoader.pl --sqlite /home/shared/Trinotate-3.0.2/admin/Trinotate.sqlite --gene_trans_map Trinity.fasta.gene_trans_map --transcript_fasta Trinity.fasta --transdecoder_pep Trinity.fasta.transdecoder.pep --bulk_load died with ret 512 at /home/shared/Trinotate-3.0.2/Trinotate line 126.
Well, looks like installing things in /home/shared/ may have finally bitten us. Trinotate wants to write to the Trinotate.sqlite file stored in the Trinotate admin directory, but doesn’t have the rights to do that. I’ll have to figure out how to fix that long term, but for now I’ll just run anything that needs to update those files with sudo in the terminal, and paste the commands here.
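One possible longer-term fix, untested here, is to work on a per-project copy of the SQLite file in a directory the user can write to, and point Trinotate at that copy instead of the shared one. The local_sqlite_copy helper below is hypothetical. Ideally the source would be a fresh, unloaded Trinotate.sqlite.

```r
# Hypothetical helper: copy the SQLite file into `dest_dir` and confirm
# that the copy is writable by the current user.
local_sqlite_copy <- function(src, dest_dir = getwd()) {
  dest <- file.path(dest_dir, basename(src))
  # copy.mode = FALSE: don't carry over the source's permission bits.
  if (!file.copy(src, dest, overwrite = FALSE, copy.mode = FALSE)) {
    stop("could not copy ", src, " to ", dest, " (does it already exist?)")
  }
  # file.access() returns 0 when the requested access (mode 2 = write) is allowed.
  if (file.access(dest, mode = 2) != 0) {
    stop(dest, " is not writable")
  }
  dest
}

# Usage (not run here):
# sqlite <- local_sqlite_copy("/home/shared/Trinotate-3.0.2/admin/Trinotate.sqlite")
```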
Next we load the blastx results.
system("sudo /home/shared/Trinotate-3.0.2/Trinotate /home/shared/Trinotate-3.0.2/admin/Trinotate.sqlite LOAD_swissprot_blastx blastx.outfmt6")
And then the blastp results.
system("sudo /home/shared/Trinotate-3.0.2/Trinotate /home/shared/Trinotate-3.0.2/admin/Trinotate.sqlite LOAD_swissprot_blastp blastp.outfmt6")
Then the PFAM results.
system("sudo /home/shared/Trinotate-3.0.2/Trinotate /home/shared/Trinotate-3.0.2/admin/Trinotate.sqlite LOAD_pfam TrinotatePFAM.out")
And the signalp results.
system("sudo /home/shared/Trinotate-3.0.2/Trinotate /home/shared/Trinotate-3.0.2/admin/Trinotate.sqlite LOAD_signalp signalp.out")
Finally the transmembrane domains!
system("sudo /home/shared/Trinotate-3.0.2/Trinotate /home/shared/Trinotate-3.0.2/admin/Trinotate.sqlite LOAD_tmhmm tmhmm.out")
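Because every load step has the same shape (Trinotate, the SQLite file, a LOAD_ subcommand, a result file), the commands can be built from a single lookup, which makes it harder to pair a subcommand with the wrong file. In this sketch the blastx results go with LOAD_swissprot_blastx and the blastp results with LOAD_swissprot_blastp. The commands are only printed, not run.

```r
trinotate <- "/home/shared/Trinotate-3.0.2/Trinotate"
sqlite    <- "/home/shared/Trinotate-3.0.2/admin/Trinotate.sqlite"

# Trinotate load subcommand -> result file produced earlier in this post.
loads <- c(
  LOAD_swissprot_blastx = "blastx.outfmt6",
  LOAD_swissprot_blastp = "blastp.outfmt6",
  LOAD_pfam             = "TrinotatePFAM.out",
  LOAD_signalp          = "signalp.out",
  LOAD_tmhmm            = "tmhmm.out"
)

# One command string per load step.
load_cmds <- paste(trinotate, sqlite, names(loads), loads)
print(load_cmds)

# Uncomment to actually run them (needs write access to the SQLite file):
# for (cmd in load_cmds) system(cmd)
```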
I believe we’re done loading results, except for the missing RNAMMER results. Let’s make an annotation file.
system("/home/shared/Trinotate-3.0.2/Trinotate /home/shared/Trinotate-3.0.2/admin/Trinotate.sqlite report > results.txt")
CMD: /home/shared/Trinotate-3.0.2/util/Trinotate_report_writer.pl --sqlite /home/shared/Trinotate-3.0.2/admin/Trinotate.sqlite
library(readr)
results <- read_delim("~/Documents/Trinity-Test/trinity_out_dir/results.txt",
"\t", escape_double = FALSE, trim_ws = TRUE)
Parsed with column specification:
cols(
`#gene_id` = col_character(),
transcript_id = col_character(),
sprot_Top_BLASTX_hit = col_character(),
RNAMMER = col_character(),
prot_id = col_character(),
prot_coords = col_character(),
sprot_Top_BLASTP_hit = col_character(),
Pfam = col_character(),
SignalP = col_character(),
TmHMM = col_character(),
eggnog = col_character(),
Kegg = col_character(),
gene_ontology_blast = col_character(),
gene_ontology_pfam = col_character(),
transcript = col_character(),
peptide = col_character()
)
head(results)
Well, hopefully that’s what we were looking for?
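A quick way to judge whether the report contains real annotations is to count how many rows have something in each column. The sketch below assumes Trinotate writes "." for empty fields, which is worth confirming against the actual results.txt. It runs on a toy data frame with made-up values, and annotated_fraction is a hypothetical helper.

```r
# Toy stand-in for `results`; the IDs and hit strings here are made up.
toy <- data.frame(
  transcript_id        = c("t1", "t2", "t3"),
  sprot_Top_BLASTX_hit = c("hitA", ".", "hitB"),
  Pfam                 = c(".", ".", "domA"),
  stringsAsFactors = FALSE
)

# Fraction of rows with something other than "." (or NA) in each column.
annotated_fraction <- function(df, cols) {
  vapply(df[cols], function(x) mean(!is.na(x) & x != "."), numeric(1))
}

annotated_fraction(toy, c("sprot_Top_BLASTX_hit", "Pfam"))
```

On the real data this would be called as annotated_fraction(results, c("sprot_Top_BLASTX_hit", "sprot_Top_BLASTP_hit", "Pfam")).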