Since we’ve got trimmed input fastq files and a prepped genome, it’s time to start the alignment. This is the slow step, but the files aren’t super huge/there arenlt a ton, so hopefully this will only take a few days.
setwd("~/Documents/C-virginica-BSSeq/")
bsmap.genome <- "~/Documents/C-virginica-BSSeq/genome/C_virginica_1.0_genome.fasta"
trimmed.input.files <- list.files(pattern = "trimmed*")[2:7]
trimmed.input.files
[1] "trimmed-2112_lane1_ACAGTG_L001_R1.fastq" "trimmed-2112_lane1_ATCACG_R1.fastq"
[3] "trimmed-2112_lane1_CAGATC_R1.fastq" "trimmed-2112_lane1_GCCAAT_L001_R1.fastq"
[5] "trimmed-2112_lane1_TGACCA_L001_R1_001.fastq" "trimmed-2112_lane1_TTAGGC_L001_R1.fastq"
for(i in 1:length(trimmed.input.files)) {
system(paste0("bsmap -p 8 -d ", bsmap.genome ," -a ", trimmed.input.files[i]," -o ", substr(trimmed.input.files[i], 1, 24), "_bsmap_out.sam", " > ", substr(trimmed.input.files[i], 1, 24), "_bsmap_output.txt" ))
}
[bsmap] @Fri Apr 28 08:55:18 2017 loading reference file: /home/srlab/Documents/C-virginica-BSSeq/genome/C_virginica_1.0_genome.fasta (format: FASTA)
[bsmap] @Fri Apr 28 08:55:25 2017 358 reference seqs loaded, total size 685793667 bp. 7 secs passed
[bsmap] @Fri Apr 28 08:55:50 2017 create seed table. 32 secs passed
[bsmap] @Fri Apr 28 08:55:50 2017 Single-end alignment(8 threads),
Input read file: trimmed-2112_lane1_ACAGTG_L001_R1.fastq (format: FASTQ)
Output file: trimmed-2112_lane1_ACAGT_bsmap_out.sam (format: SAM)
[bsmap] @Fri Apr 28 09:02:47 2017 total reads: 12260444 total time: 449 secs
aligned reads: 3080002 (25.1%), unique reads: 1660724 (13.5%), non-unique reads: 1419278 (11.6%)
[bsmap] @Fri Apr 28 09:02:47 2017 loading reference file: /home/srlab/Documents/C-virginica-BSSeq/genome/C_virginica_1.0_genome.fasta (format: FASTA)
[bsmap] @Fri Apr 28 09:02:55 2017 358 reference seqs loaded, total size 685793667 bp. 8 secs passed
[bsmap] @Fri Apr 28 09:03:20 2017 create seed table. 33 secs passed
[bsmap] @Fri Apr 28 09:03:20 2017 Single-end alignment(8 threads),
Input read file: trimmed-2112_lane1_ATCACG_R1.fastq (format: FASTQ)
Output file: trimmed-2112_lane1_ATCAC_bsmap_out.sam (format: SAM)
[bsmap] @Fri Apr 28 09:18:58 2017 total reads: 28381961 total time: 971 secs
aligned reads: 5619655 (19.8%), unique reads: 3756062 (13.2%), non-unique reads: 1863593 (6.6%)
[bsmap] @Fri Apr 28 09:18:58 2017 loading reference file: /home/srlab/Documents/C-virginica-BSSeq/genome/C_virginica_1.0_genome.fasta (format: FASTA)
[bsmap] @Fri Apr 28 09:19:05 2017 358 reference seqs loaded, total size 685793667 bp. 7 secs passed
[bsmap] @Fri Apr 28 09:19:29 2017 create seed table. 31 secs passed
[bsmap] @Fri Apr 28 09:19:29 2017 Single-end alignment(8 threads),
Input read file: trimmed-2112_lane1_CAGATC_R1.fastq (format: FASTQ)
Output file: trimmed-2112_lane1_CAGAT_bsmap_out.sam (format: SAM)
[bsmap] @Fri Apr 28 09:38:07 2017 total reads: 31266957 total time: 1149 secs
aligned reads: 5695529 (18.2%), unique reads: 3530523 (11.3%), non-unique reads: 2165006 (6.9%)
[bsmap] @Fri Apr 28 09:38:07 2017 loading reference file: /home/srlab/Documents/C-virginica-BSSeq/genome/C_virginica_1.0_genome.fasta (format: FASTA)
[bsmap] @Fri Apr 28 09:38:15 2017 358 reference seqs loaded, total size 685793667 bp. 8 secs passed
[bsmap] @Fri Apr 28 09:38:38 2017 create seed table. 31 secs passed
[bsmap] @Fri Apr 28 09:38:38 2017 Single-end alignment(8 threads),
Input read file: trimmed-2112_lane1_GCCAAT_L001_R1.fastq (format: FASTQ)
Output file: trimmed-2112_lane1_GCCAA_bsmap_out.sam (format: SAM)
[bsmap] @Fri Apr 28 09:52:20 2017 total reads: 20811265 total time: 853 secs
aligned reads: 1814226 (8.7%), unique reads: 1116613 (5.4%), non-unique reads: 697613 (3.4%)
[bsmap] @Fri Apr 28 09:52:20 2017 loading reference file: /home/srlab/Documents/C-virginica-BSSeq/genome/C_virginica_1.0_genome.fasta (format: FASTA)
[bsmap] @Fri Apr 28 09:52:27 2017 358 reference seqs loaded, total size 685793667 bp. 7 secs passed
[bsmap] @Fri Apr 28 09:52:50 2017 create seed table. 30 secs passed
[bsmap] @Fri Apr 28 09:52:50 2017 Single-end alignment(8 threads),
Input read file: trimmed-2112_lane1_TGACCA_L001_R1_001.fastq (format: FASTQ)
Output file: trimmed-2112_lane1_TGACC_bsmap_out.sam (format: SAM)
[bsmap] @Fri Apr 28 09:53:56 2017 total reads: 2827107 total time: 96 secs
aligned reads: 924657 (32.7%), unique reads: 566750 (20.0%), non-unique reads: 357907 (12.7%)
[bsmap] @Fri Apr 28 09:53:56 2017 loading reference file: /home/srlab/Documents/C-virginica-BSSeq/genome/C_virginica_1.0_genome.fasta (format: FASTA)
[bsmap] @Fri Apr 28 09:54:04 2017 358 reference seqs loaded, total size 685793667 bp. 8 secs passed
[bsmap] @Fri Apr 28 09:54:28 2017 create seed table. 32 secs passed
[bsmap] @Fri Apr 28 09:54:28 2017 Single-end alignment(8 threads),
Input read file: trimmed-2112_lane1_TTAGGC_L001_R1.fastq (format: FASTQ)
Output file: trimmed-2112_lane1_TTAGG_bsmap_out.sam (format: SAM)
[bsmap] @Fri Apr 28 10:08:22 2017 total reads: 21462840 total time: 866 secs
aligned reads: 2914601 (13.6%), unique reads: 1825092 (8.5%), non-unique reads: 1089509 (5.1%)
Add a new chunk by clicking the Insert Chunk button on the toolbar or by pressing Ctrl+Alt+I.
When you save the notebook, an HTML file containing the code and output will be saved alongside it (click the Preview button or press Ctrl+Shift+K to preview the HTML file).