Write R code to do each of the following tasks:
Search for DNA sequences from the organism “Chlamydia trachomatis” in the ACNUC “genbank” database.
# Load seqinr, which provides choosebank(), query(), getSequence(), etc.
library(seqinr)
choosebank("genbank")
Q <- query("Q", "SP=Chlamydia trachomatis")
How many sequences were retrieved?
Q$name
## [1] "Q"
Q$nelem
## [1] 43496
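How many bases are there in the longest sequence among them?
# Slower alternative: download every sequence and measure its length.
# lens <- c()
# for (val in seq_along(Q$req)) {
#   lens <- c(lens, length(getSequence(Q$req[[val]])))
# }
# max(lens)
#
# Faster: read the length stored with each record.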
max(sapply(Q$req, getLength))
## [1] 1083893
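To also find which record is the longest, which.max() can be applied to the same vector of lengths. The following is a minimal offline sketch, assuming a hypothetical toy list in place of Q$req (the record names and lengths are made up for illustration and do not come from GenBank):

```r
# Hypothetical stand-in for Q$req: each record carries a name and a length.
records <- list(
  list(name = "seqA", length = 120L),
  list(name = "seqB", length = 987L),
  list(name = "seqC", length = 45L)
)

# Collect one length per record, like sapply(Q$req, getLength).
lens <- vapply(records, function(r) r$length, integer(1))

# which.max() returns the position of the maximum, max() its value.
longest_idx  <- which.max(lens)
longest_len  <- max(lens)
longest_name <- records[[longest_idx]]$name

cat(longest_name, "is the longest with", longest_len, "bases\n")
```

With the real query, the same idea would presumably read getName(Q$req[[which.max(sapply(Q$req, getLength))]]).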
For the first three sequences, print out the accession numbers.
getName(Q$req[1])
## [1] "A01434"
getName(Q$req[2])
## [1] "A27838"
getName(Q$req[3])
## [1] "A27849"
For the 1000th sequence, print out the nucleotide bases in the range 50 to 75.
s1000 <- getSequence(Q$req[[1000]])
# Print bases 50 to 75 of the sequence
s1000[50:75]
## [1] "t" "a" "g" "c" "t" "a" "a" "g" "t" "c" "g" "t" "a" "t" "t" "c" "t" "t" "t"
## [20] "g" "g" "g" "t" "g" "a" "a"
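R indexing is 1-based and both ends of a range are included, so 50:75 selects 75 - 50 + 1 = 26 bases, which matches the 26 elements printed above. A small offline sketch of the same slicing, assuming a made-up toy sequence rather than a real GenBank record:

```r
# Hypothetical toy sequence, used only to illustrate range indexing.
dna <- "atggctagctaagtcgtattc"

# Split the string into a vector of single characters,
# the same shape that getSequence() returns.
bases <- strsplit(dna, "")[[1]]

# Positions 5 to 10, both ends included: 10 - 5 + 1 = 6 bases.
window <- bases[5:10]

print(window)
print(paste(window, collapse = ""))
```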
What is the length of the 250th sequence?
Q[["req"]][[250]]
## name length frame ncbicg
## "AF087303" "175" "2" "11"
The length of the 250th sequence is 175 bases.
Export the 150th, 151st, and 152nd sequences into a FASTA file.
# Write all three records into one FASTA file, as the task asks.
write.fasta(sequences = getSequence(Q$req[150:152]), names = getName(Q$req[150:152]), file.out = "Seq_150_152.fasta")
closebank()
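The export step can be checked without a server connection by writing a few sequences to a file and reading them back. This is a minimal sketch, assuming the seqinr package is installed. The sequences and the names toy150, toy151, and toy152 are hypothetical placeholders, not the real records.

```r
library(seqinr)

# Hypothetical toy sequences standing in for the three ACNUC records.
seqs <- list(s2c("atgcatgc"), s2c("ggccttaa"), s2c("ttttaaaacccc"))
ids  <- c("toy150", "toy151", "toy152")

# Write all three sequences into a single temporary FASTA file.
out_file <- tempfile(fileext = ".fasta")
write.fasta(sequences = seqs, names = ids, file.out = out_file)

# Read the file back and inspect names and lengths.
back <- read.fasta(out_file, seqtype = "DNA")
back_lengths <- unname(sapply(back, length))

print(names(back))
print(back_lengths)
```

Reading the file back confirms that one file holds one entry per sequence, each under its own header line.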
• Handwritten answers are not allowed!
• Use R Markdown (https://rmarkdown.rstudio.com/) and provide a neatly formatted PDF file showing both code and output.
• Include your name as a comment at the beginning of the script file.