Using BWA-mem (http://bio-bwa.sourceforge.net) to aligned the fastq files against the pig reference genome (Sscrofa11.1).
$ bwa index pig_reference_genome/Sus_scrofa.Sscrofa11.1.dna.toplevel.fa
$ mkdir 01-BWA_mapped $ for i in $(ls 00-Fastq/); do bwa mem -t 12 ../pig_reference_genome/Sus_scrofa.Sscrofa11.1.dna.toplevel.fa 00-Fastq/$i > 01-BWA_mapped/$i"_unsorted_aligned.sam"; done
$ for i in $(ls 00-Fastq/*R1*); do a=`basename $i | cut -d"_" -f1,2`; bwa mem -t 12 ../pig_reference_genome/Sus_scrofa.Sscrofa11.1.dna.toplevel.fa $i 00-Fastq/$a"_L002_R2_001.fastq.gz" > 01-BWA_mapped/$a"_unsorted_aligned.sam"; done
When BWA is finished, samtools will be used to process the files.
$ for i in $(ls 01-BWA_mapped/); do a=`basename $i | cut -d"." -f1`; samtools view -@ 12 -bS 01-BWA_mapped/$i > 01-BWA_mapped/$a".unsorted_aligned.bam"; done
$ rm 01-BWA_mapped/*.sam $ for i in $(ls 01-BWA_mapped/); do a=`basename $i | cut -d"." -f1`; samtools sort -@ 12 01-BWA_mapped/$i > 01-BWA_mapped/$a"_sorted_aligned.bam"; done
$ rm 01-BWA_mapped/*unsorted_aligned.bam $ for i in $(ls 01-BWA_mapped/*_sorted_aligned.bam); do a=`basename $i | cut -d"." -f1`; samtools rmdup $i 01-BWA_mapped/$a"_RD.bam"; done
$ for i in $(ls 01-BWA_mapped/*_sorted_aligned_RD.bam); do a=`basename $i | cut -d"." -f1`; samtools view -@ 12 -b -q 10 $i > 01-BWA_mapped/$a"_uniq.bam"; done
$ for i in $(ls 01-BWA_mapped/*_sorted_aligned_RD_uniq.bam); do samtools index -@ 12 $i; done
After mapping reads to the genome, macs3 - peak calling will be used to identify regions of ChIP enrichment.
It is strongly recommended installing MACS program in a virtual environment, so that you have full control of your installation and wonโt mess up with your system libraries.
A simple way to create a virtual environment of Python3 is:
$ python3 -m venv MyPythonEnv/
$ source MyPythonEnv/bin/activate
$ git clone https://github.com/macs3-project/MACS.git
$ cd MACS
$ python setup.py install
$ cd ../ $ ln -s MACS/bin/macs3The program MACS3 (https://github.com/taoliu/MACS) is probably the most used program to call ChIP-seq peaks (regions where there is an enrichment of your ChIP-seq mark).
The command line for macs3 need to be done using control and treatment parameters
Download iMacs3pipeline
$ git clone https://github.com/hanielcedraz/iMacs3Pipeline.git
$ cd iMacs3Pipeline
$ chmod +x macs3Pipeline.RFor running macs3Pipeline.R you need a sample file containing the name of each sample and each file (treated and input) like the example
| SAMPLE_ID | Treated | Input |
|---|---|---|
| scrofaPig5 | scrofaPig5_treated_H3K27ac_muscle_sorted_aligned_RD_uniq.bam | scrofaPig5_Input_muscle_sorted_aligned_RD_uniq.bam |
| scrofaPig4 | scrofaPig4_treated_H3K27ac_muscle_sorted_aligned_RD_uniq.bam | scrofaPig4_Input_muscle_sorted_aligned_RD_uniq.bam |
| scrofaPig6 | scrofaPig6_treated_H3K27ac_muscle_sorted_aligned_RD_uniq.bam | scrofaPig6_Input_muscle_sorted_aligned_RD_uniq.bam |
$ macs3Pipeline.R -i 01-BWA_mapped -s 3 $ macs3Pipeline.R -i 01-BWA_mapped --broad -s 3 $ for i in $(ls 01-BWA_mapped/\*uniq.bam); do a=\`basename \$i \| cut -d"\_" -f1\`; macs3 callpeak -f BAMPE -t \$i -g hs --outdir 02-MACS3_narrow/ -n \$a -B -q 0.05 ; done
$ for i in \$(ls 01-BWA_mapped/\*uniq.bam); do a=\`basename \$i \| cut -d"\_" -f1\`; macs3 callpeak -f BAMPE -t \$i -g hs --outdir 02-MACS3_broad -n \$a -B -q 0.05 --broad --broad-cutoff 0.05 ; done