Directly pasted from https://rpubs.com/profbiot/readGenBank (until line break)

Use library command to make ape functions accessible by this script

library(ape)
## Warning: package 'ape' was built under R version 4.3.3

Use paste() function to create a chr vector of accession numbers for Gasterosteus sequences

These sequences all belong to one genus of sticklebacks

To do: change the tutorial's accessions to the sequences that previous Endicott Bioinformatics students uploaded, MT103163-MT103183

seq1 <- paste("JQ", seq(983161, 983255), sep = "") # paste() prepends "JQ" to each number, returning a character vector of 95 accession strings
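The student-uploaded range noted above (MT103163-MT103183) can be built the same way. This is a minimal sketch: `seq2` is a hypothetical variable name that does not appear in the original tutorial, and `paste0()` is used as shorthand for `paste(..., sep = "")`.

```r
# seq2 is a hypothetical name; it holds MT103163 through MT103183
seq2 <- paste0("MT", 103163:103183)

# Quick checks before passing the vector to read.GenBank()
length(seq2)       # number of accession strings built
head(seq2, 3)      # first few accessions
anyDuplicated(seq2) # 0 means no accession is repeated
```

The resulting vector could be passed to read.GenBank() in place of seq1.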

Download all of these consecutive sequences from GenBank

This would be really hard to do by hand

Note that the downloaded sequences are stored in a single variable, a list with one element per sequence

sequences <- read.GenBank(seq1,
                          seq.names = seq1,
                          species.names = TRUE,
                          as.character = TRUE)

Write the sequences to a fasta file

write.dna(sequences, "fish.fasta", format = "fasta")

Pan paniscus (Bonobo) Mitochondrial CO1 Gene Sequence Analysis

This script automates the download of mitochondrial CO1 gene sequences for Pan paniscus (bonobo) from GenBank using the ape package in R. The NCBI taxonomy ID for Pan paniscus is 9597 (txid9597).

Read in search result file containing accession numbers

accessions <- read.table("bonobo.seq",
                         stringsAsFactors = FALSE)$V1
str(accessions)
##  chr [1:21] "GU189677.1" "GU189676.1" "GU189675.1" "GU189674.1" ...

Download all sequences from GenBank

bonoboSeqs <- read.GenBank(accessions,
                           seq.names = accessions,
                           species.names = TRUE,
                           as.character = TRUE)

Display information about downloaded sequences

cat("Successfully downloaded",
    length(bonoboSeqs),
    "sequences\n")
cat("Sequence names:\n")
## Sequence names:
cat(names(bonoboSeqs),
    "\n")

Export sequences to FASTA file

write.dna(bonoboSeqs,
          "bonobo_CO1.fasta",
          format = "fasta")
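
Before relying on the exported file, it can help to confirm that every requested accession came back. The sketch below is an illustration only: `missing_accessions()` is a hypothetical helper, not an ape function, and it assumes the downloaded list is named by accession (as when seq.names is set to the accession vector). A toy list stands in for a real download so the example runs without a network connection.

```r
# Hypothetical helper (not part of ape): returns the requested accessions
# that do not appear among the names of the downloaded list
missing_accessions <- function(requested, downloaded) {
  setdiff(requested, names(downloaded))
}

# Toy stand-in for a download result; the bases here are placeholders
requested <- c("GU189677.1", "GU189676.1", "GU189675.1")
toy <- setNames(list(c("a", "t", "g"), c("c", "c", "a")),
                c("GU189677.1", "GU189675.1"))

# Accessions that were requested but are absent from the toy result
missing_accessions(requested, toy)
```

With the real objects, the call would be missing_accessions(accessions, bonoboSeqs); an empty result means nothing was dropped.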