This R lab provides practice at automating downloads from NCBI: I download the accession list as a .seq file, read it into R, and then modify the script to make sure I find and report all MT CO1 gene sequences for my organism, Aedes aegypti.

Load the required library.

library("ape")

Read in the accession number list from NCBI, where I searched (MT CO1) AND "Aedes aegypti" and downloaded the accession list for all 28 hits from the search.

Sequences <- read.table("sequence.seq", header = FALSE, stringsAsFactors = FALSE)
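
Before downloading, it can help to check the accession list for duplicates or stray whitespace. The following is a minimal sketch, assuming one accession ID per line; the temporary file and the placeholder IDs are made up for illustration and stand in for sequence.seq.

```r
# Hypothetical stand-in for sequence.seq: one accession ID per line.
# These IDs are made-up placeholders, not real GenBank records.
acc_file <- tempfile(fileext = ".seq")
writeLines(c("PLACEHOLDER1.1", "PLACEHOLDER2.1",
             "PLACEHOLDER2.1", "PLACEHOLDER3.1"), acc_file)

# Read the list the same way as in the lab; IDs land in column V1.
acc_table <- read.table(acc_file, header = FALSE, stringsAsFactors = FALSE)

# Trim whitespace and drop repeated IDs so each record is requested once.
acc_ids <- unique(trimws(acc_table$V1))

# Compare the number of rows read with the number of unique IDs.
cat("Rows read:", nrow(acc_table), "\n")
cat("Unique accession IDs:", length(acc_ids), "\n")
```

With the real file, the number of unique IDs should match the number of hits reported by the NCBI search.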

Download all of the sequences from GenBank. Because I read the NCBI list in as a table, I specify the column that holds the accession IDs (V1).

Myseq <- read.GenBank(Sequences$V1,
                      seq.names = Sequences$V1,
                      species.names = TRUE,
                      as.character = TRUE)
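
Since the goal is to report every sequence, one way to confirm that nothing was skipped is to compare the requested IDs against the names of the downloaded list. This is only a sketch: missing_accessions is a hypothetical helper (not part of ape), the IDs are placeholders, and it assumes the downloaded list is named by accession ID, as it is when seq.names is set to the accession column.

```r
# Hypothetical helper (not part of ape): report requested accessions
# that are absent from the names of the downloaded sequence list.
missing_accessions <- function(requested, downloaded) {
  setdiff(as.character(requested), names(downloaded))
}

# Mock data with placeholder IDs, standing in for Sequences$V1 and Myseq.
requested_ids <- c("PLACEHOLDER1.1", "PLACEHOLDER2.1", "PLACEHOLDER3.1")
mock_download <- list(PLACEHOLDER1.1 = c("a", "t", "g"),
                      PLACEHOLDER3.1 = c("g", "g", "c"))

# Any ID listed here was requested but not returned.
not_found <- missing_accessions(requested_ids, mock_download)
print(not_found)
```

In the lab itself the call would be missing_accessions(Sequences$V1, Myseq); an empty result means every accession in the list has a matching sequence.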

Write the sequences to a FASTA file.

write.dna(Myseq, "Aedes aegypti.fasta", format = "fasta")
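
To confirm the file was written correctly, the FASTA can be read back and its record count compared with the number of sequences written. The sketch below uses mock sequences with placeholder names and a temporary file rather than the real download, so the objects mock_seqs and fasta_file are assumptions for illustration.

```r
library("ape")

# Mock sequences with placeholder names, in the same list-of-characters
# shape that read.GenBank returns when as.character = TRUE.
mock_seqs <- list(PLACEHOLDER1.1 = c("a", "t", "g", "c"),
                  PLACEHOLDER2.1 = c("g", "g", "c", "a"))

# Write to a temporary FASTA file, then read it back in.
fasta_file <- tempfile(fileext = ".fasta")
write.dna(mock_seqs, fasta_file, format = "fasta")
read_back <- read.FASTA(fasta_file)

# The number of records read should equal the number written.
cat("Written:", length(mock_seqs), " Read back:", length(read_back), "\n")
```

For the real data, the same comparison would use Myseq and "Aedes aegypti.fasta", and the count should equal the 28 hits from the search.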