PLoS One. 2013;8(3):e59582. doi: 10.1371/journal.pone.0059582.
AWS r5.2xlarge and CentOS Linux 7 on a virtual machine
drwxrwxr-x 2 ubuntu ubuntu 4096 Apr 10 02:48 f344_pfc_nicotine
drwxrwxr-x 2 ubuntu ubuntu 4096 Apr 10 03:28 f344_pfc_saline
-rw-rw-r-- 1 ubuntu ubuntu 1918376826 Apr 7 13:08 SRR869032.sra
-rw-rw-r-- 1 ubuntu ubuntu 1512702873 Apr 10 03:07 SRR869032_1.fastq.gz
-rw-rw-r-- 1 ubuntu ubuntu 1565027421 Apr 10 03:07 SRR869032_2.fastq.gz
-rw-rw-r-- 1 ubuntu ubuntu 1989086241 Apr 7 13:09 SRR869033.sra
-rw-rw-r-- 1 ubuntu ubuntu 1559510818 Apr 10 03:11 SRR869033_1.fastq.gz
-rw-rw-r-- 1 ubuntu ubuntu 1612258618 Apr 10 03:11 SRR869033_2.fastq.gz
-rw-rw-r-- 1 ubuntu ubuntu 1978327667 Apr 7 13:10 SRR869034.sra
-rw-rw-r-- 1 ubuntu ubuntu 1568979784 Apr 10 03:12 SRR869034_1.fastq.gz
-rw-rw-r-- 1 ubuntu ubuntu 1615385331 Apr 10 03:12 SRR869034_2.fastq.gz
-rw-rw-r-- 1 ubuntu ubuntu 2681330074 Apr 7 12:59 SRR869044.sra
-rw-rw-r-- 1 ubuntu ubuntu 2119745955 Apr 10 03:10 SRR869044_1.fastq.gz
-rw-rw-r-- 1 ubuntu ubuntu 2202580482 Apr 10 03:10 SRR869044_2.fastq.gz
-rw-rw-r-- 1 ubuntu ubuntu 2709383194 Apr 7 13:01 SRR869045.sra
-rw-rw-r-- 1 ubuntu ubuntu 2103892097 Apr 10 03:12 SRR869045_1.fastq.gz
-rw-rw-r-- 1 ubuntu ubuntu 2198363062 Apr 10 03:12 SRR869045_2.fastq.gz
-rw-rw-r-- 1 ubuntu ubuntu 2614023328 Apr 7 13:02 SRR869046.sra
-rw-rw-r-- 1 ubuntu ubuntu 2054942375 Apr 10 03:13 SRR869046_1.fastq.gz
-rw-rw-r-- 1 ubuntu ubuntu 2138446269 Apr 10 03:13 SRR869046_2.fastq.gz
-rw-rw-r-- 1 ubuntu ubuntu 2927607333 Apr 9 19:26 rn6.fa
-rw-rw-r-- 1 ubuntu ubuntu 40654 Apr 9 19:26 rn6.fa.fai
-rw-rw-r-- 1 ubuntu ubuntu 61891341 Apr 9 19:27 refGene.gtf
-rw-rw-r-- 1 ubuntu ubuntu 5143380 Apr 9 19:27 refGene.input
-rw-rw-r-- 1 ubuntu ubuntu 5226507 Apr 9 19:27 refGene.txt
$ STAR --runThreadN 16 --runMode genomeGenerate --genomeDir StarIndexRat/ --genomeFastaFiles rn6/rn6.fa
Apr 09 18:39:04 ….. started STAR run
Apr 09 18:39:04 … starting to generate Genome files
Apr 09 18:40:18 … starting to sort Suffix Array. This may take a long time…
Apr 09 18:40:32 … sorting Suffix Array chunks and saving them to disk…
Apr 09 19:05:33 … loading chunks from disk, packing SA…
Apr 09 19:06:42 … finished generating suffix array
Apr 09 19:06:42 … generating Suffix Array index
Apr 09 19:10:57 … completed Suffix Array index
Apr 09 19:10:57 … writing Genome to disk …
Apr 09 19:10:59 … writing Suffix Array to disk …
Apr 09 19:12:19 … writing SAindex to disk
Apr 09 19:12:27 ….. finished successfully
-rw-rw-r-- 1 ubuntu ubuntu 3095134208 Apr 9 19:10 Genome
-rw-rw-r-- 1 ubuntu ubuntu 22521351645 Apr 9 19:12 SA
-rw-rw-r-- 1 ubuntu ubuntu 1565873619 Apr 9 19:12 SAindex
-rw-rw-r-- 1 ubuntu ubuntu 5213 Apr 9 18:40 chrLength.txt
-rw-rw-r-- 1 ubuntu ubuntu 19347 Apr 9 18:40 chrName.txt
-rw-rw-r-- 1 ubuntu ubuntu 24560 Apr 9 18:40 chrNameLength.txt
-rw-rw-r-- 1 ubuntu ubuntu 10387 Apr 9 18:40 chrStart.txt
-rw-rw-r-- 1 ubuntu ubuntu 479 Apr 9 19:10 genomeParameters.txt
~/results/rnaseq/Rn/star_rat$ STAR \
    --runThreadN 12 \
    --genomeDir StarIndexRat/ \
    --sjdbGTFfile ~/genome_rat/refGene.gtf \
    --sjdbOverhang 100 \
    --readFilesIn ~/PLoS-One-8-e59582/f344_pfc_saline/SRR869044_1.fastq.gz \
                  ~/PLoS-One-8-e59582/f344_pfc_saline/SRR869044_2.fastq.gz \
    --readFilesCommand zcat \
    --quantMode TranscriptomeSAM \
    --outFileNamePrefix 1align/
Apr 10 03:30:30 ….. started STAR run
Apr 10 03:30:30 ….. loading genome
Apr 10 03:32:39 ….. processing annotations GTF
Apr 10 03:32:43 ….. inserting junctions into the genome indices
Apr 10 03:37:23 ….. started mapping
Apr 10 03:42:29 ….. finished successfully
-rw-rw-r-- 1 ubuntu ubuntu 15933444986 Apr 10 03:42 Aligned.out.sam
-rw-rw-r-- 1 ubuntu ubuntu 2358643579 Apr 10 03:42 Aligned.toTranscriptome.out.bam
-rw-rw-r-- 1 ubuntu ubuntu 1862 Apr 10 03:42 Log.final.out
-rw-rw-r-- 1 ubuntu ubuntu 272935 Apr 10 03:42 Log.out
-rw-rw-r-- 1 ubuntu ubuntu 836 Apr 10 03:42 Log.progress.out
-rw-rw-r-- 1 ubuntu ubuntu 7368886 Apr 10 03:42 SJ.out.tab
drwx------ 2 ubuntu ubuntu 4096 Apr 10 03:32 _STARgenome
Started job on | Apr 10 03:30:30
Started mapping on | Apr 10 03:37:23
Finished on | Apr 10 03:42:29
Mapping speed, Million of reads per hour | 489.23
Number of input reads | 41584891
Average input read length | 100
UNIQUE READS:
Uniquely mapped reads number | 34796331
Uniquely mapped reads % | 83.68%
Average mapped length | 99.37
Number of splices: Total | 7910199
Number of splices: Annotated (sjdb) | 7194940
Number of splices: GT/AG | 7764677
Number of splices: GC/AG | 62004
Number of splices: AT/AC | 7634
Number of splices: Non-canonical | 75884
Mismatch rate per base, % | 0.37%
Deletion rate per base | 0.01%
Deletion average length | 1.61
Insertion rate per base | 0.00%
Insertion average length | 1.35
MULTI-MAPPING READS:
Number of reads mapped to multiple loci | 2553852
% of reads mapped to multiple loci | 6.14%
Number of reads mapped to too many loci | 57308
% of reads mapped to too many loci | 0.14%
UNMAPPED READS:
% of reads unmapped: too many mismatches | 0.00%
% of reads unmapped: too short | 9.92%
% of reads unmapped: other | 0.13%
CHIMERIC READS:
Number of chimeric reads | 0
% of chimeric reads | 0.00%
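The Log.final.out summaries are easier to compare across samples when the key figures are pulled into one line each. The following is a minimal sketch, not part of the original session: `star_summary` is a hypothetical helper name, and it assumes the `key | value` layout of Log.final.out shown above.

```shell
#!/usr/bin/env bash
# Print one tab-separated line for a STAR Log.final.out file:
# input reads, % uniquely mapped, % multi-mapped, % unmapped (too short).
star_summary() {
    awk -F'|' '
        {
            # Trim surrounding whitespace from the key and the value.
            key = $1; val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
        }
        key == "Number of input reads"              { n = val }
        key == "Uniquely mapped reads %"            { u = val }
        key == "% of reads mapped to multiple loci" { m = val }
        key == "% of reads unmapped: too short"     { s = val }
        END { printf "%s\t%s\t%s\t%s\n", n, u, m, s }
    ' "$1"
}

# Example use, assuming the output directories are named as in this post:
# for d in 1align 2align 3align 4align 5align 6align; do
#     printf '%s\t' "$d"; star_summary "$d/Log.final.out"
# done
```

Only four fields are extracted here; further rows of the log can be added with the same `key == "..."` pattern.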
~/results/rnaseq/Rn/star_rat/transcriptomeOut$ STAR --runThreadN 12 --genomeDir ~/genomeAndIndices/Rn/star/ --sjdbGTFfile ~/genomeAndIndices/Rn/rn6.refGene.gtf --sjdbOverhang 100 --readFilesIn ~/data/rnaseq/Rn/PLoS-One-8-e59582/f344_pfc_saline/SRR869045_1.fastq.gz ~/data/rnaseq/Rn/PLoS-One-8-e59582/f344_pfc_saline/SRR869045_2.fastq.gz --readFilesCommand zcat --quantMode TranscriptomeSAM --outFileNamePrefix 2align/
Apr 11 15:08:19 ….. started STAR run
Apr 11 15:08:20 ….. loading genome
Apr 11 15:12:01 ….. processing annotations GTF
Apr 11 15:12:25 ….. inserting junctions into the genome indices
Apr 11 15:37:46 ….. started mapping
Apr 11 15:58:54 ….. finished successfully
-rw-rw-r-- 1 ubuntu ubuntu 14869436644 Apr 11 15:58 Aligned.out.sam
-rw-rw-r-- 1 ubuntu ubuntu 2216166668 Apr 11 15:58 Aligned.toTranscriptome.out.bam
-rw-rw-r-- 1 ubuntu ubuntu 1863 Apr 11 15:58 Log.final.out
-rw-rw-r-- 1 ubuntu ubuntu 290971 Apr 11 15:58 Log.out
-rw-rw-r-- 1 ubuntu ubuntu 2488 Apr 11 15:58 Log.progress.out
-rw-rw-r-- 1 ubuntu ubuntu 7278953 Apr 11 15:58 SJ.out.tab
drwx------ 2 ubuntu ubuntu 4096 Apr 11 15:12 _STARgenome
Started job on | Apr 11 15:08:19
Started mapping on | Apr 11 15:37:46
Finished on | Apr 11 15:58:54
Mapping speed, Million of reads per hour | 116.02
Number of input reads | 40866035
Average input read length | 100
UNIQUE READS:
Uniquely mapped reads number | 32594115
Uniquely mapped reads % | 79.76%
Average mapped length | 99.35
Number of splices: Total | 7372181
Number of splices: Annotated (sjdb) | 6695155
Number of splices: GT/AG | 7235135
Number of splices: GC/AG | 58523
Number of splices: AT/AC | 6995
Number of splices: Non-canonical | 71528
Mismatch rate per base, % | 0.38%
Deletion rate per base | 0.01%
Deletion average length | 1.61
Insertion rate per base | 0.01%
Insertion average length | 1.35
MULTI-MAPPING READS:
Number of reads mapped to multiple loci | 2353903
% of reads mapped to multiple loci | 5.76%
Number of reads mapped to too many loci | 52876
% of reads mapped to too many loci | 0.13%
UNMAPPED READS:
% of reads unmapped: too many mismatches | 0.00%
% of reads unmapped: too short | 14.23%
% of reads unmapped: other | 0.12%
CHIMERIC READS:
Number of chimeric reads | 0
% of chimeric reads | 0.00%
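The "Mapping speed" line can be cross-checked from the other fields of the same log: it matches the number of input reads (in millions) divided by the time between "Started mapping on" and "Finished on" (in hours). The sketch below recomputes it; `mapping_speed` is a hypothetical helper name, and it assumes both timestamps fall on the same day.

```shell
#!/usr/bin/env bash
# Recompute "Mapping speed, Million of reads per hour" from the number of
# input reads and the mapping start/finish times (HH:MM:SS, same day).
mapping_speed() {
    awk -v reads="$1" -v start="$2" -v end="$3" 'BEGIN {
        split(start, a, ":"); split(end, b, ":")
        secs = (b[1] * 3600 + b[2] * 60 + b[3]) - (a[1] * 3600 + a[2] * 60 + a[3])
        printf "%.2f\n", reads / 1e6 / (secs / 3600)
    }'
}

mapping_speed 41584891 03:37:23 03:42:29   # first sample  (log reports 489.23)
mapping_speed 40866035 15:37:46 15:58:54   # second sample (log reports 116.02)
```

The figure covers only the mapping phase, so the time spent loading the genome and inserting junctions is not reflected in it.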
~/results/rnaseq/Rn/star_rat/transcriptomeOut$ STAR --runThreadN 12 --genomeDir ~/genomeAndIndices/Rn/star/ --sjdbGTFfile ~/genomeAndIndices/Rn/rn6.refGene.gtf --sjdbOverhang 100 --readFilesIn ~/data/rnaseq/Rn/PLoS-One-8-e59582/f344_pfc_saline/SRR869046_1.fastq.gz ~/data/rnaseq/Rn/PLoS-One-8-e59582/f344_pfc_saline/SRR869046_2.fastq.gz --readFilesCommand zcat --quantMode TranscriptomeSAM --outFileNamePrefix 3align/
Apr 11 16:00:52 ….. started STAR run
Apr 11 16:00:52 ….. loading genome
Apr 11 16:02:35 ….. processing annotations GTF
Apr 11 16:02:37 ….. inserting junctions into the genome indices
Apr 11 16:05:23 ….. started mapping
Apr 11 16:11:29 ….. finished successfully
-rw-rw-r-- 1 ubuntu ubuntu 14998319245 Apr 11 16:11 Aligned.out.sam
-rw-rw-r-- 1 ubuntu ubuntu 2183814631 Apr 11 16:11 Aligned.toTranscriptome.out.bam
-rw-rw-r-- 1 ubuntu ubuntu 1863 Apr 11 16:11 Log.final.out
-rw-rw-r-- 1 ubuntu ubuntu 290970 Apr 11 16:11 Log.out
-rw-rw-r-- 1 ubuntu ubuntu 954 Apr 11 16:11 Log.progress.out
-rw-rw-r-- 1 ubuntu ubuntu 7220923 Apr 11 16:11 SJ.out.tab
drwx------ 2 ubuntu ubuntu 4096 Apr 11 16:02 _STARgenome
Started job on | Apr 11 16:00:52
Started mapping on | Apr 11 16:05:23
Finished on | Apr 11 16:11:29
Mapping speed, Million of reads per hour | 387.26
Number of input reads | 39370935
Average input read length | 100
UNIQUE READS:
Uniquely mapped reads number | 32717958
Uniquely mapped reads % | 83.10%
Average mapped length | 99.35
Number of splices: Total | 7102328
Number of splices: Annotated (sjdb) | 6458206
Number of splices: GT/AG | 6971391
Number of splices: GC/AG | 55634
Number of splices: AT/AC | 6999
Number of splices: Non-canonical | 68304
Mismatch rate per base, % | 0.40%
Deletion rate per base | 0.01%
Deletion average length | 1.61
Insertion rate per base | 0.01%
Insertion average length | 1.35
MULTI-MAPPING READS:
Number of reads mapped to multiple loci | 2462309
% of reads mapped to multiple loci | 6.25%
Number of reads mapped to too many loci | 51917
% of reads mapped to too many loci | 0.13%
UNMAPPED READS:
% of reads unmapped: too many mismatches | 0.00%
% of reads unmapped: too short | 10.38%
% of reads unmapped: other | 0.13%
CHIMERIC READS:
Number of chimeric reads | 0
% of chimeric reads | 0.00%
~/results/rnaseq/Rn/star_rat/transcriptomeOut$ STAR --runThreadN 12 --genomeDir ~/genomeAndIndices/Rn/star/ --sjdbGTFfile ~/genomeAndIndices/Rn/rn6.refGene.gtf --sjdbOverhang 100 --readFilesIn ~/data/rnaseq/Rn/PLoS-One-8-e59582/f344_pfc_nicotine/SRR869032_1.fastq.gz ~/data/rnaseq/Rn/PLoS-One-8-e59582/f344_pfc_nicotine/SRR869032_2.fastq.gz --readFilesCommand zcat --quantMode TranscriptomeSAM --outFileNamePrefix 4align/
Apr 11 16:21:26 ….. started STAR run
Apr 11 16:21:26 ….. loading genome
Apr 11 16:23:09 ….. processing annotations GTF
Apr 11 16:23:10 ….. inserting junctions into the genome indices
Apr 11 16:25:24 ….. started mapping
Apr 11 16:28:58 ….. finished successfully
-rw-rw-r-- 1 ubuntu ubuntu 10933116395 Apr 11 16:28 Aligned.out.sam
-rw-rw-r-- 1 ubuntu ubuntu 1617775311 Apr 11 16:28 Aligned.toTranscriptome.out.bam
-rw-rw-r-- 1 ubuntu ubuntu 1863 Apr 11 16:28 Log.final.out
-rw-rw-r-- 1 ubuntu ubuntu 291006 Apr 11 16:28 Log.out
-rw-rw-r-- 1 ubuntu ubuntu 600 Apr 11 16:28 Log.progress.out
-rw-rw-r-- 1 ubuntu ubuntu 6778075 Apr 11 16:28 SJ.out.tab
drwx------ 2 ubuntu ubuntu 4096 Apr 11 16:23 _STARgenome
Started job on | Apr 11 16:21:26
Started mapping on | Apr 11 16:25:24
Finished on | Apr 11 16:28:58
Mapping speed, Million of reads per hour | 486.66
Number of input reads | 28929485
Average input read length | 100
UNIQUE READS:
Uniquely mapped reads number | 23884850
Uniquely mapped reads % | 82.56%
Average mapped length | 99.37
Number of splices: Total | 5337591
Number of splices: Annotated (sjdb) | 4843698
Number of splices: GT/AG | 5237284
Number of splices: GC/AG | 42360
Number of splices: AT/AC | 5126
Number of splices: Non-canonical | 52821
Mismatch rate per base, % | 0.38%
Deletion rate per base | 0.01%
Deletion average length | 1.61
Insertion rate per base | 0.01%
Insertion average length | 1.35
MULTI-MAPPING READS:
Number of reads mapped to multiple loci | 1745210
% of reads mapped to multiple loci | 6.03%
Number of reads mapped to too many loci | 41759
% of reads mapped to too many loci | 0.14%
UNMAPPED READS:
% of reads unmapped: too many mismatches | 0.00%
% of reads unmapped: too short | 11.13%
% of reads unmapped: other | 0.14%
CHIMERIC READS:
Number of chimeric reads | 0
% of chimeric reads | 0.00%
~/results/rnaseq/Rn/star_rat/transcriptomeOut$ STAR --runThreadN 12 --genomeDir ~/genomeAndIndices/Rn/star/ --sjdbGTFfile ~/genomeAndIndices/Rn/rn6.refGene.gtf --sjdbOverhang 100 --readFilesIn ~/data/rnaseq/Rn/PLoS-One-8-e59582/f344_pfc_nicotine/SRR869033_1.fastq.gz ~/data/rnaseq/Rn/PLoS-One-8-e59582/f344_pfc_nicotine/SRR869033_2.fastq.gz --readFilesCommand zcat --quantMode TranscriptomeSAM --outFileNamePrefix 5align/
Apr 11 16:32:44 ….. started STAR run
Apr 11 16:32:44 ….. loading genome
Apr 11 16:34:26 ….. processing annotations GTF
Apr 11 16:34:28 ….. inserting junctions into the genome indices
Apr 11 16:36:40 ….. started mapping
Apr 11 16:40:24 ….. finished successfully
-rw-rw-r-- 1 ubuntu ubuntu 11152871716 Apr 11 16:40 Aligned.out.sam
-rw-rw-r-- 1 ubuntu ubuntu 1650315098 Apr 11 16:40 Aligned.toTranscriptome.out.bam
-rw-rw-r-- 1 ubuntu ubuntu 1863 Apr 11 16:40 Log.final.out
-rw-rw-r-- 1 ubuntu ubuntu 291006 Apr 11 16:40 Log.out
-rw-rw-r-- 1 ubuntu ubuntu 600 Apr 11 16:40 Log.progress.out
-rw-rw-r-- 1 ubuntu ubuntu 6943028 Apr 11 16:40 SJ.out.tab
drwx------ 2 ubuntu ubuntu 4096 Apr 11 16:34 _STARgenome
Started job on | Apr 11 16:32:44
Started mapping on | Apr 11 16:36:40
Finished on | Apr 11 16:40:24
Mapping speed, Million of reads per hour | 483.07
Number of input reads | 30057458
Average input read length | 100
UNIQUE READS:
Uniquely mapped reads number | 24242026
Uniquely mapped reads % | 80.65%
Average mapped length | 99.39
Number of splices: Total | 5464310
Number of splices: Annotated (sjdb) | 4969845
Number of splices: GT/AG | 5365023
Number of splices: GC/AG | 42481
Number of splices: AT/AC | 5178
Number of splices: Non-canonical | 51628
Mismatch rate per base, % | 0.37%
Deletion rate per base | 0.01%
Deletion average length | 1.60
Insertion rate per base | 0.00%
Insertion average length | 1.36
MULTI-MAPPING READS:
Number of reads mapped to multiple loci | 1816245
% of reads mapped to multiple loci | 6.04%
Number of reads mapped to too many loci | 44886
% of reads mapped to too many loci | 0.15%
UNMAPPED READS:
% of reads unmapped: too many mismatches | 0.00%
% of reads unmapped: too short | 13.03%
% of reads unmapped: other | 0.13%
CHIMERIC READS:
Number of chimeric reads | 0
% of chimeric reads | 0.00%
~/results/rnaseq/Rn/star_rat/transcriptomeOut$ STAR --runThreadN 12 --genomeDir ~/genomeAndIndices/Rn/star/ --sjdbGTFfile ~/genomeAndIndices/Rn/rn6.refGene.gtf --sjdbOverhang 100 --readFilesIn ~/data/rnaseq/Rn/PLoS-One-8-e59582/f344_pfc_nicotine/SRR869034_1.fastq.gz ~/data/rnaseq/Rn/PLoS-One-8-e59582/f344_pfc_nicotine/SRR869034_2.fastq.gz --readFilesCommand zcat --quantMode TranscriptomeSAM --outFileNamePrefix 6align/
Apr 11 16:41:46 ….. started STAR run
Apr 11 16:41:46 ….. loading genome
Apr 11 16:42:54 ….. processing annotations GTF
Apr 11 16:42:55 ….. inserting junctions into the genome indices
Apr 11 16:45:08 ….. started mapping
Apr 11 16:48:49 ….. finished successfully
-rw-rw-r-- 1 ubuntu ubuntu 11451915825 Apr 11 16:48 Aligned.out.sam
-rw-rw-r-- 1 ubuntu ubuntu 1719135420 Apr 11 16:48 Aligned.toTranscriptome.out.bam
-rw-rw-r-- 1 ubuntu ubuntu 1862 Apr 11 16:48 Log.final.out
-rw-rw-r-- 1 ubuntu ubuntu 291006 Apr 11 16:48 Log.out
-rw-rw-r-- 1 ubuntu ubuntu 600 Apr 11 16:48 Log.progress.out
-rw-rw-r-- 1 ubuntu ubuntu 6840437 Apr 11 16:48 SJ.out.tab
drwx------ 2 ubuntu ubuntu 4096 Apr 11 16:42 _STARgenome
Started job on | Apr 11 16:41:46
Started mapping on | Apr 11 16:45:08
Finished on | Apr 11 16:48:49
Mapping speed, Million of reads per hour | 485.67
Number of input reads | 29814687
Average input read length | 100
UNIQUE READS:
Uniquely mapped reads number | 25029582
Uniquely mapped reads % | 83.95%
Average mapped length | 99.40
Number of splices: Total | 5664801
Number of splices: Annotated (sjdb) | 5152217
Number of splices: GT/AG | 5560246
Number of splices: GC/AG | 44589
Number of splices: AT/AC | 5504
Number of splices: Non-canonical | 54462
Mismatch rate per base, % | 0.38%
Deletion rate per base | 0.01%
Deletion average length | 1.61
Insertion rate per base | 0.00%
Insertion average length | 1.36
MULTI-MAPPING READS:
Number of reads mapped to multiple loci | 1834753
% of reads mapped to multiple loci | 6.15%
Number of reads mapped to too many loci | 40924
% of reads mapped to too many loci | 0.14%
UNMAPPED READS:
% of reads unmapped: too many mismatches | 0.00%
% of reads unmapped: too short | 9.63%
% of reads unmapped: other | 0.13%
CHIMERIC READS:
Number of chimeric reads | 0
% of chimeric reads | 0.00%
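The six alignment runs above differ only in the input FASTQ pair and the output prefix, so they could be generated from a small sample table instead of being typed one by one. The sketch below is a dry run that only prints the commands; `star_cmd` is a hypothetical helper name, and it assumes the directory layout of the last five commands (the first sample was originally run with different paths).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original session): print the STAR
# command for one sample so that every run uses identical options.
star_cmd() {
    local prefix=$1 group=$2 run=$3
    local reads="$HOME/data/rnaseq/Rn/PLoS-One-8-e59582/$group"
    printf 'STAR --runThreadN 12 --genomeDir %s --sjdbGTFfile %s --sjdbOverhang 100 --readFilesIn %s %s --readFilesCommand zcat --quantMode TranscriptomeSAM --outFileNamePrefix %s/\n' \
        "$HOME/genomeAndIndices/Rn/star/" \
        "$HOME/genomeAndIndices/Rn/rn6.refGene.gtf" \
        "$reads/${run}_1.fastq.gz" "$reads/${run}_2.fastq.gz" \
        "$prefix"
}

# Output prefix, treatment group and SRA run for each of the six samples.
while IFS=: read -r prefix group run; do
    star_cmd "$prefix" "$group" "$run"
done <<'EOF'
1align:f344_pfc_saline:SRR869044
2align:f344_pfc_saline:SRR869045
3align:f344_pfc_saline:SRR869046
4align:f344_pfc_nicotine:SRR869032
5align:f344_pfc_nicotine:SRR869033
6align:f344_pfc_nicotine:SRR869034
EOF
```

Printing the commands first makes it easy to review them; piping the loop's output to `bash` would then execute them in order.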
The BAM files were imported into CentOS as directories (1s, 2s, 3s, 4s, 5s and 6s) for RSEM analysis. The genome and GTF files were also imported (and renamed).
drwxr-xr-x. 2 bdash bdash 45 Apr 11 20:17 1s
drwxr-xr-x. 2 bdash bdash 45 Apr 11 18:21 2s
drwxr-xr-x. 2 bdash bdash 45 Apr 11 20:20 3s
drwxr-xr-x. 2 bdash bdash 45 Apr 11 20:21 4s
drwxr-xr-x. 2 bdash bdash 45 Apr 11 20:27 5s
drwxr-xr-x. 2 bdash bdash 45 Apr 11 20:27 6s
-rw-rw-r--. 1 bdash bdash 2927607333 Apr 9 15:26 rn6.fa
-rw-rw-r--. 1 bdash bdash 40654 Apr 9 15:26 rn6.fa.fai
-rw-rw-r--. 1 bdash bdash 61891341 Apr 9 15:27 rn6.refGene.gtf
rsem-extract-reference-transcripts ./rsemRef 0 rn6.refGene.gtf None 0 rn6.fa
Parsed 200000 lines
Parsing gtf File is done!
rn6.fa is processed!
18939 transcripts are extracted and 0 transcripts are omitted.
Extracting sequences is done!
Group File is generated!
Transcript Information File is generated!
Chromosome List File is generated!
Extracted Sequences File is generated!
rsem-preref ./rsemRef.transcripts.fa 1 ./rsemRef
Refs.makeRefs finished!
Refs.saveRefs finished!
./rsemRef.idx.fa is generated!
./rsemRef.n2g.idx.fa is generated!
-rw-rw-r--. 1 bdash bdash 1293 Apr 11 21:06 rsemRef.chrlist
-rw-rw-r--. 1 bdash bdash 93718 Apr 11 21:06 rsemRef.grp
-rw-rw-r--. 1 bdash bdash 43007289 Apr 11 21:06 rsemRef.idx.fa
-rw-rw-r--. 1 bdash bdash 43007289 Apr 11 21:06 rsemRef.n2g.idx.fa
-rw-rw-r--. 1 bdash bdash 45858412 Apr 11 21:06 rsemRef.seq
-rw-rw-r--. 1 bdash bdash 6041108 Apr 11 21:06 rsemRef.ti
-rw-rw-r--. 1 bdash bdash 43007289 Apr 11 21:06 rsemRef.transcripts.fa
drwxrwxr-x. 3 bdash bdash 66 Apr 11 22:41 1out
drwxrwxr-x. 3 bdash bdash 66 Apr 11 23:56 2out
drwxrwxr-x. 3 bdash bdash 66 Apr 12 03:48 3out
drwxrwxr-x. 3 bdash bdash 66 Apr 12 03:13 4out
drwxrwxr-x. 3 bdash bdash 66 Apr 12 02:48 5out
drwxrwxr-x. 3 bdash bdash 66 Apr 12 01:54 6out
-rw-rw-r--. 1 bdash bdash 309141 Apr 11 22:41 rsem1.log
-rw-rw-r--. 1 bdash bdash 353855 Apr 11 23:56 rsem2.log
-rw-rw-r--. 1 bdash bdash 255588 Apr 12 03:48 rsem3.log
-rw-rw-r--. 1 bdash bdash 326662 Apr 12 03:13 rsem4.log
-rw-rw-r--. 1 bdash bdash 366472 Apr 12 02:48 rsem5.log
-rw-rw-r--. 1 bdash bdash 294402 Apr 12 01:54 rsem6.log
drwxrwxr-x. 2 bdash bdash 163 Apr 11 21:20 rsem_ref
data = read.table("1out/.isoforms.results", header=T, stringsAsFactors=F)
head(data)
data = read.table("1out/.isoforms.results", header=T, stringsAsFactors=F)
idx = order(data[,"TPM"], decreasing=T)
data[idx[1:10], c("gene_id", "transcript_id" ,"expected_count", "TPM", "IsoPct")]
data = read.table("2out/.isoforms.results", header=T, stringsAsFactors=F)
idx = order(data[,"TPM"], decreasing=T)
data[idx[1:10], c("gene_id", "transcript_id", "expected_count", "TPM", "IsoPct")]
data = read.table("3out/.isoforms.results", header=T, stringsAsFactors=F)
idx = order(data[,"TPM"], decreasing=T)
data[idx[1:10], c("gene_id", "transcript_id", "expected_count", "TPM", "IsoPct")]
data = read.table("4out/.isoforms.results", header=T, stringsAsFactors=F)
idx = order(data[,"TPM"], decreasing=T)
data[idx[1:10], c("gene_id", "transcript_id", "expected_count", "TPM", "IsoPct")]
data = read.table("5out/.isoforms.results", header=T, stringsAsFactors=F)
idx = order(data[,"TPM"], decreasing=T)
data[idx[1:10], c("gene_id", "transcript_id", "expected_count", "TPM", "IsoPct")]
data = read.table("6out/.isoforms.results", header=T, stringsAsFactors=F)
idx = order(data[,"TPM"], decreasing=T)
data[idx[1:10], c("gene_id", "transcript_id", "expected_count", "TPM", "IsoPct")]
"1out/.isoforms.results" "2out/.isoforms.results" "3out/.isoforms.results"
"4out/.isoforms.results" "5out/.isoforms.results" "6out/.isoforms.results"
"NM_022258" 0.00 1.00 0.00 0.00 1.00 0.00
"NM_133400" 0.00 0.00 0.00 0.00 0.00 0.00
"NM_012488" 33.00 51.50 14.00 34.50 137.00 23.50
"NM_012488_2" 33.00 51.50 14.00 34.50 137.00 23.50
"NM_138524" 13.00 8.00 19.00 9.00 19.00 20.00
"NM_022240" 0.00 0.00 0.00 0.00 5.00 1.00
"NR_002156" 3.00 1.00 0.00 1.00 1.00 0.00
"NM_001106795" 166.00 171.00 141.00 110.00 108.00 112.00
"NM_023104" 854.00 580.00 647.00 491.00 474.00 547.00
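The count matrix above has the layout written by RSEM's `rsem-generate-data-matrix`, which takes a list of results files and prints the expected counts to standard output. The command that produced it is not shown in this post, so the following is only a sketch under that assumption; `results_files` is a hypothetical helper name, and the output name isoMat.txt is taken from the EBSeq step further down.

```shell
#!/usr/bin/env bash
# Build the space-separated list of RSEM results files for the six samples.
# The argument is "isoforms" or "genes".
results_files() {
    local kind=$1 i out=""
    for i in 1 2 3 4 5 6; do
        out="$out${out:+ }${i}out/.${kind}.results"
    done
    printf '%s\n' "$out"
}

# Assumption: the matrix was built with rsem-generate-data-matrix.
# Run it only when the tool and the first results file are actually present.
if command -v rsem-generate-data-matrix >/dev/null 2>&1 \
        && [ -f 1out/.isoforms.results ]; then
    # Word splitting of the file list is intended here.
    rsem-generate-data-matrix $(results_files isoforms) > isoMat.txt
fi
```

Calling `results_files genes` and redirecting to geneMat.txt would give the gene-level matrix in the same way.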
library(EBSeq)
Loading required package: blockmodeling
To cite package 'blockmodeling' in publications please use package citation and (at least) one of
the articles:
Žiberna, Aleš (2007). Generalized blockmodeling of valued networks. Social Networks 29(1),
105-126.
Žiberna, Aleš (2008). Direct and indirect approaches to blockmodeling of valued networks in terms
of regular equivalence. Journal of Mathematical Sociology 32(1), 57-84.
Ziberna, Ales (2018). Generalized and Classical Blockmodeling of Valued Networks, R package
version 0.3.4.
To see these entries in BibTeX format, use 'print(<citation>, bibtex=TRUE)', 'toBibtex(.)', or set
'options(citation.bibtex.max=999)'.
Loading required package: gplots
Attaching package: 'gplots'
The following object is masked from 'package:stats':
lowess
Loading required package: testthat
IsoMat <- data.matrix(read.table(file="isoMat.txt"))
str(IsoMat)
num [1:18939, 1:6] 0 0 33 33 13 0 3 166 854 0 ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:18939] "NM_022258" "NM_133400" "NM_012488" "NM_012488_2" ...
..$ : chr [1:6] "X1out..isoforms.results" "X2out..isoforms.results" "X3out..isoforms.results" "X4out..isoforms.results" ...
Sizes=MedianNorm(IsoMat)
head(Sizes)
X1out..isoforms.results X2out..isoforms.results X3out..isoforms.results X4out..isoforms.results
1.2368789 1.1461308 1.1211385 0.8246803
X5out..isoforms.results X6out..isoforms.results
0.8683307 0.8872458
EBOut=EBTest(Data=IsoMat, Conditions=as.factor(rep(c("C1","C2"),each=3)),sizeFactors=Sizes, maxround=5)
Removing transcripts with 100 th quantile < = 0
15529 transcripts will be tested
iteration 1 done
time 5.74
iteration 2 done
time 2.9
iteration 3 done
time 3.72
iteration 4 done
time 3.61
iteration 5 done
time 2.94
names(EBOut)
[1] "Alpha" "Beta" "P" "PFromZ" "Z" "PoissonZ"
[7] "RList" "MeanList" "VarList" "QList1" "QList2" "C1Mean"
[13] "C2Mean" "C1EstVar" "C2EstVar" "PoolVar" "DataList" "PPDE"
[19] "f0" "f1" "AllZeroIndex" "PPMat" "PPMatWith0" "ConditionOrder"
[25] "Conditions" "DataNorm"
EBDERes=GetDEResults(EBOut, FDR=0.05)
str(EBDERes)
List of 3
$ DEfound: chr "NM_001013054"
$ PPMat : num [1:18939, 1:2] 1 NA 0.999 0.999 0.996 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:18939] "NM_022258" "NM_133400" "NM_012488" "NM_012488_2" ...
.. ..$ : chr [1:2] "PPEE" "PPDE"
$ Status : Named chr [1:18939] "EE" "Filtered: Low Expression" "EE" "EE" ...
..- attr(*, "names")= chr [1:18939] "NM_022258" "NM_133400" "NM_012488" "NM_012488_2" ...
EBDERes$DEfound
[1] "NM_001013054"
head(EBDERes$PPMat)
PPEE PPDE
NM_022258 1.0000000 4.682651e-14
NM_133400 NA NA
NM_012488 0.9985787 1.421285e-03
NM_012488_2 0.9985787 1.421285e-03
NM_138524 0.9959625 4.037536e-03
NM_022240 1.0000000 8.863191e-13
table(EBDERes$Status)
DE EE Filtered: Fold Change Ratio Filtered: Low Expression
1 15506 22 3410
data = read.table("1out/.genes.results", header=T, stringsAsFactors=F)
idx = order(data[,"TPM"], decreasing=T)
data[idx[1:10], c("gene_id", "expected_count", "TPM")]
data = read.table("2out/.genes.results", header=T, stringsAsFactors=F)
idx = order(data[,"TPM"], decreasing=T)
data[idx[1:10], c("gene_id", "expected_count", "TPM")]
data = read.table("3out/.genes.results", header=T, stringsAsFactors=F)
idx = order(data[,"TPM"], decreasing=T)
data[idx[1:10], c("gene_id", "expected_count", "TPM")]
data = read.table("4out/.genes.results", header=T, stringsAsFactors=F)
idx = order(data[,"TPM"], decreasing=T)
data[idx[1:10], c("gene_id", "expected_count", "TPM")]
data = read.table("5out/.genes.results", header=T, stringsAsFactors=F)
idx = order(data[,"TPM"], decreasing=T)
data[idx[1:10], c("gene_id", "expected_count", "TPM")]
data = read.table("6out/.genes.results", header=T, stringsAsFactors=F)
idx = order(data[,"TPM"], decreasing=T)
data[idx[1:10], c("gene_id", "expected_count", "TPM")]
"1out/.genes.results" "2out/.genes.results" "3out/.genes.results" "4out/.genes.results" "5out/.genes.results" "6out/.genes.results"
"A1bg" 0.00 1.00 0.00 0.00 1.00 0.00
"A1cf" 0.00 0.00 0.00 0.00 0.00 0.00
"A2m" 66.00 103.00 28.00 69.00 274.00 47.00
"A3galt2" 13.00 8.00 19.00 9.00 19.00 20.00
"A4galt" 0.00 0.00 0.00 0.00 5.00 1.00
"AA926063" 3.00 1.00 0.00 1.00 1.00 0.00
"Aaas" 166.00 171.00 141.00 110.00 108.00 112.00
"Aacs" 854.00 580.00 647.00 491.00 474.00 547.00
"Aadac" 0.00 0.00 0.00 0.00 0.00 0.00
library(EBSeq)
GeneMat <- data.matrix(read.table(file="geneMat.txt"))
str(GeneMat)
num [1:17322, 1:6] 0 0 66 13 0 3 166 854 0 0 ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:17322] "A1bg" "A1cf" "A2m" "A3galt2" ...
..$ : chr [1:6] "X1out..genes.results" "X2out..genes.results" "X3out..genes.results" "X4out..genes.results" ...
Sizes=MedianNorm(GeneMat)
head(Sizes)
X1out..genes.results X2out..genes.results X3out..genes.results X4out..genes.results X5out..genes.results
1.2366119 1.1459937 1.1202284 0.8244768 0.8681165
X6out..genes.results
0.8868599
EBOut=EBTest(Data=GeneMat, Conditions=as.factor(rep(c("C1","C2"),each=3)),sizeFactors=Sizes, maxround=5)
Removing transcripts with 100 th quantile < = 0
14396 transcripts will be tested
iteration 1 done
time 4.92
iteration 2 done
time 3.55
iteration 3 done
time 3.15
iteration 4 done
time 2.74
iteration 5 done
time 2.95
names(EBOut)
[1] "Alpha" "Beta" "P" "PFromZ" "Z" "PoissonZ"
[7] "RList" "MeanList" "VarList" "QList1" "QList2" "C1Mean"
[13] "C2Mean" "C1EstVar" "C2EstVar" "PoolVar" "DataList" "PPDE"
[19] "f0" "f1" "AllZeroIndex" "PPMat" "PPMatWith0" "ConditionOrder"
[25] "Conditions" "DataNorm"
EBDERes=GetDEResults(EBOut, FDR=0.05)
str(EBDERes)
List of 3
$ DEfound: chr "Adprhl1"
$ PPMat : num [1:17322, 1:2] 1 NA 1 0.998 1 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:17322] "A1bg" "A1cf" "A2m" "A3galt2" ...
.. ..$ : chr [1:2] "PPEE" "PPDE"
$ Status : Named chr [1:17322] "EE" "Filtered: Low Expression" "EE" "EE" ...
..- attr(*, "names")= chr [1:17322] "A1bg" "A1cf" "A2m" "A3galt2" ...
EBDERes$DEfound
[1] "Adprhl1"
head(EBDERes$PPMat)
PPEE PPDE
A1bg 1.0000000 2.739067e-14
A1cf NA NA
A2m 0.9997427 2.573493e-04
A3galt2 0.9981938 1.806185e-03
A4galt 1.0000000 5.372723e-13
AA926063 1.0000000 1.520845e-14
table(EBDERes$Status)
DE EE Filtered: Fold Change Ratio Filtered: Low Expression
1 14392 3 2926
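As a quick sanity check, the status tables can be reconciled with the numbers EBSeq printed earlier: DE + EE + "Filtered: Fold Change Ratio" should equal the number of transcripts (or genes) tested, and adding "Filtered: Low Expression" should give the number of rows in the input matrix. The sketch below only does that arithmetic on values copied from the output above; `check_counts` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Arguments: total rows, rows tested, then the four status counts
# (DE, EE, filtered by fold-change ratio, filtered by low expression).
check_counts() {
    local total=$1 tested=$2 de=$3 ee=$4 fc=$5 low=$6
    [ $((de + ee + fc)) -eq "$tested" ] &&
        [ $((de + ee + fc + low)) -eq "$total" ]
}

check_counts 18939 15529 1 15506 22 3410 && echo "isoform level: consistent"
check_counts 17322 14396 1 14392 3 2926 && echo "gene level: consistent"
```

Both calls succeed with the figures reported in this post, so no rows are unaccounted for at either level.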