This is an R Markdown document. The data for this analysis were collected on 13 January 2017. I have a pool of 100 bp paired-end reads from Cedrela species. These reads were obtained via hybridization capture, targeted enrichment, and short-read sequencing on the Illumina HiSeq 3000.
I used kmercountexact.sh from BBTools to produce a k-mer frequency distribution from these reads.
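For readers new to the term: a k-mer frequency distribution records, for each depth d, how many distinct k-mers occur exactly d times in the reads. The base R sketch below illustrates the idea on two toy reads. The helpers `count_kmers` and `kmer_histogram` are hypothetical names made up for this illustration, and the sketch makes no claim about how kmercountexact.sh works internally (for one thing, it does not merge reverse complements).

```r
# Toy illustration only; count_kmers() and kmer_histogram() are hypothetical
# helpers written for this example, not part of BBTools or any package.

# Count every k-mer (substring of length k) across a character vector of reads.
count_kmers <- function(reads, k) {
  kmers <- unlist(lapply(reads, function(r) {
    n <- nchar(r) - k + 1              # number of k-mers in this read
    if (n < 1) return(character(0))    # read shorter than k
    substring(r, 1:n, k:nchar(r))
  }))
  table(kmers)
}

# Turn per-k-mer counts into a depth histogram:
# depth = number of occurrences, count = number of distinct k-mers at that depth.
kmer_histogram <- function(reads, k) {
  counts <- count_kmers(reads, k)
  depth_tab <- table(as.integer(counts))
  data.frame(depth = as.integer(names(depth_tab)),
             count = as.integer(depth_tab))
}

toy_reads <- c("ACGTACGT", "CGTACGTA")   # made-up sequences
toy_hist <- kmer_histogram(toy_reads, k = 4)
print(toy_hist)
```

The resulting two-column table has the same depth/count shape as the histogram file plotted below, only on a far smaller scale.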
library(ggplot2)
Load Data
ced_gen <- read.table("CEOD_khist.txt")
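# Plot rows 5-100 of the histogram; V1 is the k-mer depth, V2 the count at that depth
khist <- ggplot(ced_gen[5:100, ], aes(x = V1, y = V2)) +
  geom_vline(xintercept = 25, color = "red") +  # reference line at depth 25
  geom_line(size = 1) +  # ggplot2 >= 3.4.0 prefers linewidth = 1 for lines
  theme_bw() +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        plot.title = element_text(hjust = 0)) +
  labs(x = "k-mer Depth", y = "Count") +
  ggtitle("K-mer Frequency")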
khist
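The vertical line at depth 25 in the plot is hard-coded. If it is meant to mark the main coverage peak (an assumption; the text does not say), the position could also be estimated from the histogram itself. The sketch below uses a hypothetical helper, `find_main_peak`, that returns the depth with the highest count inside a chosen depth window. The default window of 5 to 100 mirrors the rows plotted above and is likewise an assumption. The demonstration runs on synthetic data, not on the Cedrela histogram.

```r
# Hypothetical helper: depth with the highest count between min_depth and max_depth.
# The lower bound skips the low-depth spike that error k-mers typically produce.
find_main_peak <- function(depth, count, min_depth = 5, max_depth = 100) {
  keep <- depth >= min_depth & depth <= max_depth
  if (!any(keep)) stop("no depths in the requested range")
  depth[keep][which.max(count[keep])]
}

# Synthetic histogram (made-up numbers): a steep low-depth spike
# plus a bell-shaped bump centred on depth 25.
toy_depth <- 1:100
toy_count <- 1e4 * exp(-2 * toy_depth) + 500 * dnorm(toy_depth, mean = 25, sd = 5)

toy_peak <- find_main_peak(toy_depth, toy_count)
print(toy_peak)
```

With the data loaded above, the equivalent call would be `find_main_peak(ced_gen$V1, ced_gen$V2)`, and its result could replace the fixed `xintercept` value.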
Option: save
# ggsave("khist.jpg", plot = khist, width = 5, height = 3.5)