This R lab provides practice at automating downloads from NCBI: I download the accession list as a .seq file, read it into R, and then modify the script to make sure I find and report all MT CO1 gene sequences for my organism, Aedes aegypti.
Load the required library.
library("ape")
Read in the accession number list from NCBI, where I searched (MT CO1) AND "Aedes aegypti" and downloaded the accession list for all 28 hits from the search.
Sequences<-read.table("sequence.seq", header=FALSE)
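Before downloading, it may be worth confirming that the table holds the expected number of distinct accession IDs. The sketch below is one way to do that. The helper name check_accessions and the toy IDs are assumptions made for illustration; they are not part of the lab or of any package.

```r
# Hypothetical helper: summarise the accession IDs in column V1 of a table
# read with read.table(..., header = FALSE).
check_accessions <- function(acc_table, expected_n) {
  ids <- trimws(as.character(acc_table$V1))   # drop stray whitespace
  list(
    n_total = length(ids),                    # rows read from the file
    n_unique = length(unique(ids)),           # distinct accession IDs
    matches_expected = length(unique(ids)) == expected_n
  )
}

# Toy table standing in for sequence.seq; the IDs are made-up placeholders.
toy <- data.frame(V1 = c("ACC0001", "ACC0002", "ACC0002"))
result <- check_accessions(toy, expected_n = 28)
```

In this lab the call would be check_accessions(Sequences, expected_n = 28), and matches_expected should be TRUE if all 28 hits were exported without duplicates.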
Download all of the listed sequences from GenBank. Since I read in my NCBI list as a table, I am specifying the column that the accession ID is in (V1).
Myseq <- read.GenBank(Sequences$V1,
                      seq.names = Sequences$V1,
                      species.names = TRUE,
                      as.character = TRUE)
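To make sure every requested sequence actually came back, the returned list can be compared against the accession list. The sketch below assumes the result is a list carrying a "species" attribute, which is what the ape documentation describes for species.names = TRUE. The helper name summarise_download and the toy sequences are hypothetical.

```r
# Hypothetical helper: compare what was requested with what was returned.
summarise_download <- function(seqs, requested_ids) {
  list(
    n_requested = length(requested_ids),
    n_returned = length(seqs),
    all_returned = length(seqs) == length(requested_ids),
    species = table(attr(seqs, "species"))   # counts per species name
  )
}

# Toy list standing in for Myseq; sequences and IDs are made-up placeholders.
toy_seqs <- list(ACC0001 = c("a", "t", "g"), ACC0002 = c("c", "c"))
attr(toy_seqs, "species") <- c("Aedes_aegypti", "Aedes_aegypti")
result <- summarise_download(toy_seqs, c("ACC0001", "ACC0002"))
```

In this lab the call would be summarise_download(Myseq, Sequences$V1). If all_returned is FALSE, or the species table lists anything other than Aedes aegypti, the search results probably need a closer look before reporting.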
Write the sequences to a FASTA file.
write.dna(Myseq, "Aedes aegypti.fasta", format = "fasta")
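As a final check, the number of records in the written file can be counted by looking for FASTA header lines, which start with ">". This is a minimal base-R sketch; the helper name count_fasta_records and the temporary toy file are assumptions for illustration.

```r
# Hypothetical helper: count FASTA records by counting header lines.
count_fasta_records <- function(path) {
  sum(startsWith(readLines(path), ">"))
}

# Toy FASTA file with two made-up records, written to a temporary path.
tmp <- tempfile(fileext = ".fasta")
writeLines(c(">seq1", "ATGC", ">seq2", "GGCC"), tmp)
n_records <- count_fasta_records(tmp)
```

For this lab, count_fasta_records("Aedes aegypti.fasta") would be expected to return 28 if every hit from the search was downloaded and written.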