Step 1. - Mapping and Processing

Using BWA-mem (http://bio-bwa.sourceforge.net) to aligned the fastq files against the pig reference genome (Sscrofa11.1).

First we need to index the reference genome

    $ bwa index pig_reference_genome/Sus_scrofa.Sscrofa11.1.dna.toplevel.fa

Creating a directory for the output generated from BWA

    $ mkdir 01-BWA_mapped

Alingning reads against the pig reference genome using bwa-mem

Single-end files

    $ for i in $(ls 00-Fastq/); do bwa mem -t 12 ../pig_reference_genome/Sus_scrofa.Sscrofa11.1.dna.toplevel.fa 00-Fastq/$i > 01-BWA_mapped/$i"_unsorted_aligned.sam"; done

Paired-end files

    $ for i in $(ls 00-Fastq/*R1*); do a=`basename $i | cut -d"_" -f1,2`; bwa mem -t 12 ../pig_reference_genome/Sus_scrofa.Sscrofa11.1.dna.toplevel.fa $i 00-Fastq/$a"_L002_R2_001.fastq.gz" > 01-BWA_mapped/$a"_unsorted_aligned.sam"; done


When BWA is finished, samtools will be used to process the files.

Convert .sam file into .bam format

    $ for i in $(ls 01-BWA_mapped/); do a=`basename $i | cut -d"." -f1`; samtools view -@ 12 -bS 01-BWA_mapped/$i > 01-BWA_mapped/$a".unsorted_aligned.bam"; done

Removing .sam files to save space

    $ rm 01-BWA_mapped/*.sam

Sort the alignment file

    $ for i in $(ls 01-BWA_mapped/); do a=`basename $i | cut -d"." -f1`; samtools sort -@ 12 01-BWA_mapped/$i > 01-BWA_mapped/$a"_sorted_aligned.bam"; done

Removing unsorted .bam files to save space

    $ rm 01-BWA_mapped/*unsorted_aligned.bam

Remove PCR duplicates

    $ for i in $(ls 01-BWA_mapped/*_sorted_aligned.bam); do a=`basename $i | cut -d"." -f1`; samtools rmdup $i 01-BWA_mapped/$a"_RD.bam"; done

only keep reads that are uniquely aligned against the reference genome

    $ for i in $(ls 01-BWA_mapped/*_sorted_aligned_RD.bam); do a=`basename $i | cut -d"." -f1`; samtools view -@ 12 -b -q 10 $i > 01-BWA_mapped/$a"_uniq.bam"; done

Indexing the alignment

    $ for i in $(ls 01-BWA_mapped/*_sorted_aligned_RD_uniq.bam); do samtools index -@ 12 $i; done

Step 2. - Peak calling

After mapping reads to the genome, macs3 - peak calling will be used to identify regions of ChIP enrichment.

Installing MACS3

Prepare a virtual Python environment and installing macs3

It is strongly recommended installing MACS program in a virtual environment, so that you have full control of your installation and wonโ€™t mess up with your system libraries.

A simple way to create a virtual environment of Python3 is:

Creating and activating python environment

    $ python3 -m venv MyPythonEnv/
    $ source MyPythonEnv/bin/activate

Cloning git repository and installing MACS3

    $ git clone https://github.com/macs3-project/MACS.git
    $ cd MACS
    $ python setup.py install
    $ cd ../

Running macs3

The program MACS3 (https://github.com/taoliu/MACS) is probably the most used program to call ChIP-seq peaks (regions where there is an enrichment of your ChIP-seq mark).

The command line for macs3 need to be done using control and treatment parameters

Download iMacs3pipeline

    $ git clone https://github.com/hanielcedraz/iMacs3Pipeline.git
    $ cd iMacs3Pipeline
    $ chmod +x macs3Pipeline.R

For running macs3Pipeline.R you need a sample file containing the name of each sample and each file (treated and input) like the example

SAMPLE_ID Treated Input
scrofaPig5 scrofaPig5_treated_H3K27ac_muscle_sorted_aligned_RD_uniq.bam scrofaPig5_Input_muscle_sorted_aligned_RD_uniq.bam
scrofaPig4 scrofaPig4_treated_H3K27ac_muscle_sorted_aligned_RD_uniq.bam scrofaPig4_Input_muscle_sorted_aligned_RD_uniq.bam
scrofaPig6 scrofaPig6_treated_H3K27ac_muscle_sorted_aligned_RD_uniq.bam scrofaPig6_Input_muscle_sorted_aligned_RD_uniq.bam

- Peak calling on TF ChIP-seq:

    $ macs3Pipeline.R -i 01-BWA_mapped -s 3

- Peak calling on Histone Mark ChIP-seq:

    $ macs3Pipeline.R -i 01-BWA_mapped --broad -s 3

- Peak calling on ATAC-seq:

    $ for i in $(ls 01-BWA_mapped/\*uniq.bam); do a=\`basename \$i \| cut -d"\_" -f1\`; macs3 callpeak -f BAMPE -t \$i -g hs --outdir 02-MACS3_narrow/ -n \$a -B -q 0.05 ; done

    $ for i in \$(ls 01-BWA_mapped/\*uniq.bam); do a=\`basename \$i \| cut -d"\_" -f1\`; macs3 callpeak -f BAMPE -t \$i -g hs --outdir 02-MACS3_broad -n \$a -B -q 0.05 --broad --broad-cutoff 0.05 ; done