Now activate the environment you created last week.
conda activate bfblab
#Install FastQC for sequence quality checks
conda install -c bioconda fastqc
#Install Megahit for de novo assembly
conda install -c bioconda megahit
#Then display the Megahit usage help:
megahit -h
#Install gdown for file downloading
conda install -c conda-forge gdown
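Before moving on, it can help to confirm that the three tools are reachable from the active environment. The following is a minimal sketch; check_tools is a hypothetical helper name introduced here for illustration, and it only tests whether each command is on the PATH, not whether it runs correctly.

```shell
# Report whether each named tool is reachable on PATH.
# check_tools is a hypothetical helper, not part of conda or the tools themselves.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "MISSING: $tool"
            missing=1
        fi
    done
    return "$missing"
}

check_tools fastqc megahit gdown || echo "Re-run the matching conda install command."
```

If a tool is reported as missing, first check that the bfblab environment is still activated.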
First, create a separate folder for the lab exercises:
cd ~
mkdir Lab07
cd Lab07
Download the data from these links using gdown:
#Download the first reads file (the URL is quoted so the shell does not interpret the ‘?’):
gdown "https://drive.google.com/uc?id=1CiRkrUcP3S_oNiluGQ1Mh1qSU-3FpKdF"
#Download the second reads file:
gdown "https://drive.google.com/uc?id=14di4CJ_J8TrRwISRQm0Nbt9_tILcYF9N"
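A download can occasionally leave an incomplete file, so it may be worth testing the archives before using them. This is a sketch only: check_gz is a hypothetical helper name, and the file names read_1.fq.gz and read_2.fq.gz are assumed to match the names used in the later steps.

```shell
# Test each gzip archive without extracting it.
# check_gz is a hypothetical helper; the file names below are assumptions.
check_gz() {
    for f in "$@"; do
        if gzip -t "$f" 2>/dev/null; then
            echo "OK: $f"
        else
            echo "BAD or missing: $f"
        fi
    done
}

check_gz read_1.fq.gz read_2.fq.gz
```

If a file is reported as bad, delete it and repeat the corresponding gdown command.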
View the contents of the FASTQ reads file. Note the use of ‘|’ (pipe), which sends the decompressed output of gunzip to less:
gunzip -c read_1.fq.gz | less
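The same pipe technique can be used to count the reads in each file. The sketch below assumes the standard FASTQ layout of exactly four lines per read (no wrapped sequence lines); count_reads is a hypothetical helper name.

```shell
# Count reads in a gzipped FASTQ file, assuming four lines per record.
# count_reads is a hypothetical helper introduced for illustration.
count_reads() {
    gunzip -c "$1" | awk 'END { print NR / 4 }'
}

# Print the count for each reads file that is present in the current folder.
for f in read_1.fq.gz read_2.fq.gz; do
    if [ -f "$f" ]; then
        echo "$f: $(count_reads "$f") reads"
    fi
done
```

For paired-end data the two files are expected to contain the same number of reads, so differing counts suggest a problem with one of the downloads.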
Check the quality of the sequences:
Run FastQC on both the read_1 and read_2 files as follows:
fastqc *.gz
Open the generated HTML reports in a web browser and investigate the results.
Next, assemble the reads de novo with Megahit:
megahit -1 read_1.fq.gz -2 read_2.fq.gz -o Unknown_genome_megahit
This command creates a folder called ‘Unknown_genome_megahit’. Navigate into that folder:
cd Unknown_genome_megahit/
ls -al
#Open the final contigs FASTA file
less final.contigs.fa
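Scrolling through the contigs with less gives a first impression, but a few summary numbers are often easier to interpret. The following sketch reports the number of contigs, their total length, and the longest contig; contig_stats is a hypothetical helper name, and it relies only on the general FASTA layout (header lines starting with ‘>’), not on anything specific to Megahit.

```shell
# Print contig count, total length, and longest contig for a FASTA file.
# Sequence lines are summed per record, so wrapped sequences are handled.
# contig_stats is a hypothetical helper introduced for illustration.
contig_stats() {
    awk '
        /^>/ { if (len > max) max = len; n++; len = 0; next }
             { len += length($0); total += length($0) }
        END  { if (len > max) max = len
               printf "contigs=%d total_bp=%d longest_bp=%d\n", n, total, max }
    ' "$1"
}

# Run only when the assembly output is present in the current folder.
if [ -f final.contigs.fa ]; then
    contig_stats final.contigs.fa
fi
```

A small number of long contigs generally indicates a more contiguous assembly than many short ones.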