Setup

snp_data_path <- params$snp_dataset
sample_id <- params$sample_id
library("dplyr")
library("ggplot2")
library("knitr")
library("devtools")
source("code/functions/rscript.R")

Project Description

Multiple samples of the SARS-CoV-2 virus were collected and genotyped, then compared to a reference genome. A single nucleotide polymorphism (SNP) is any locus at which a sample has a different nucleotide than the reference genome. Base R and dplyr (Wickham et al., 2023) were used to subset the data by sample ID. Base R and ggplot2 (Wickham, 2016) were used to create figures from the subsetted data. Together, RStudio, dplyr, and ggplot2 allowed us to isolate and visualize SNP variants by sample ID.
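
The SNP definition above can be illustrated with a minimal base R sketch. The two sequences are made-up toy strings, not real SARS-CoV-2 data, and the sketch assumes both are already aligned and of equal length (no insertions or deletions). It only illustrates the definition; it is not the variant-calling procedure that produced the dataset used in this report.

```r
# Hypothetical toy sequences; a real viral genome is far longer.
reference  <- "ATGGCTAGCT"
sample_seq <- "ATGACTAGGT"

# Split each string into single nucleotides.
ref_bases    <- strsplit(reference, "")[[1]]
sample_bases <- strsplit(sample_seq, "")[[1]]

# A SNP is any position where the sample base differs from the reference base.
snp_positions <- which(ref_bases != sample_bases)

# Collect position, reference base, and alternate base in one table.
snps <- data.frame(
  pos = snp_positions,
  ref = ref_bases[snp_positions],
  alt = sample_bases[snp_positions]
)
print(snps)
```

Here the sample differs from the reference at positions 4 and 9, so `snps` has two rows.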

Subsetting Data

# subset the SNP data to the requested sample ID and save the result as a CSV
result <- subset_snp_data(snp_data_path, sample_id)
write.csv(result, paste0("output/snp_subset_", sample_id, ".csv"))
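
The body of `subset_snp_data()` lives in `code/functions/rscript.R` and is not shown in this report. As a rough illustration of subsetting by sample ID, the base R sketch below filters a data frame on a `sample` column. The function name `subset_by_sample_sketch`, the toy data, and the choice to pass a data frame rather than a file path are all assumptions made for the example; the real function may work differently.

```r
# Hypothetical stand-in for subset_snp_data(); the real function is defined
# in code/functions/rscript.R and may behave differently.
subset_by_sample_sketch <- function(snp_data, sample_id) {
  # Assumes a `sample` column, as used by the plots in this report.
  stopifnot("sample" %in% names(snp_data))
  # Keep only the rows whose sample ID matches the requested one(s).
  snp_data[snp_data$sample %in% sample_id, , drop = FALSE]
}

# Toy data with made-up sample IDs and values.
toy_snps <- data.frame(
  sample = c("S1", "S1", "S2"),
  pos    = c(241L, 3037L, 241L),
  alt    = c("T", "T", "T"),
  qual   = c(225, 210, 198)
)

toy_subset <- subset_by_sample_sketch(toy_snps, "S1")
print(toy_subset)
```

With dplyr loaded, `filter(toy_snps, sample == "S1")` would select the same rows.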

Figures

# use ggplot to make a scatterplot colored by sample
result %>%
  ggplot(aes(x = pos,
             y = qual,
             color = sample)) +
  geom_point() +
  labs(title = "SNP Quality by Position in the SARS-CoV-2 Genome")

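# save into output/figures/ with the sample ID as a file-name prefix
ggsave(paste0("output/figures/", sample_id, "_snp_quality_by_position.png"))

# use ggplot to plot the alternate nucleotide against position, colored by sample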
result %>%
  ggplot(aes(x = alt,
             y = pos,
             color = sample)) +
  geom_point() +
  labs(title = "Change from Reference by Position in the SARS-CoV-2 Genome")

# save into output/figures/ with the sample ID as a file-name prefix
ggsave(paste0("output/figures/", sample_id, "_nucleotide_change_by_position.png"))
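
Because `paste0()` joins its arguments with no separator, it can help to build figure paths explicitly before passing them to `ggsave()`. The sketch below is one possible approach, assuming a hypothetical sample ID of `"S1"` and using a temporary directory so that nothing is written into the project; in the report itself the directory would be `output/figures`.

```r
# Hypothetical sample ID and a temporary directory for illustration only.
sample_id  <- "S1"
figure_dir <- file.path(tempdir(), "output", "figures")

# Create the directory (and any missing parents) if it does not exist yet.
dir.create(figure_dir, recursive = TRUE, showWarnings = FALSE)

# file.path() supplies the "/" separators; paste0() adds the "_" between
# the sample ID and the plot name.
figure_path <- file.path(
  figure_dir,
  paste0(sample_id, "_snp_quality_by_position.png")
)
print(basename(figure_path))
```

The resulting `figure_path` could then be passed as the first argument to `ggsave()`.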

Session Info

session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.4.1 (2024-06-14 ucrt)
##  os       Windows 11 x64 (build 22631)
##  system   x86_64, mingw32
##  ui       RTerm
##  language (EN)
##  collate  English_United States.utf8
##  ctype    English_United States.utf8
##  tz       America/Los_Angeles
##  date     2025-01-09
##  pandoc   3.2 @ C:/Program Files/RStudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package     * version date (UTC) lib source
##  bslib         0.8.0   2024-07-29 [1] CRAN (R 4.4.1)
##  cachem        1.1.0   2024-05-16 [1] CRAN (R 4.4.1)
##  cli           3.6.3   2024-06-21 [1] CRAN (R 4.4.1)
##  colorspace    2.1-1   2024-07-26 [1] CRAN (R 4.4.1)
##  devtools    * 2.4.5   2022-10-11 [1] CRAN (R 4.4.1)
##  digest        0.6.37  2024-08-19 [1] CRAN (R 4.4.1)
##  dplyr       * 1.1.4   2023-11-17 [1] CRAN (R 4.4.1)
##  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.4.1)
##  evaluate      1.0.0   2024-09-17 [1] CRAN (R 4.4.1)
##  fansi         1.0.6   2023-12-08 [1] CRAN (R 4.4.1)
##  farver        2.1.2   2024-05-13 [1] CRAN (R 4.4.1)
##  fastmap       1.2.0   2024-05-15 [1] CRAN (R 4.4.1)
##  fs            1.6.4   2024-04-25 [1] CRAN (R 4.4.1)
##  generics      0.1.3   2022-07-05 [1] CRAN (R 4.4.1)
##  ggplot2     * 3.5.1   2024-04-23 [1] CRAN (R 4.4.1)
##  glue          1.7.0   2024-01-09 [1] CRAN (R 4.4.1)
##  gtable        0.3.5   2024-04-22 [1] CRAN (R 4.4.1)
##  highr         0.11    2024-05-26 [1] CRAN (R 4.4.1)
##  htmltools     0.5.8.1 2024-04-04 [1] CRAN (R 4.4.1)
##  htmlwidgets   1.6.4   2023-12-06 [1] CRAN (R 4.4.1)
##  httpuv        1.6.15  2024-03-26 [1] CRAN (R 4.4.1)
##  jquerylib     0.1.4   2021-04-26 [1] CRAN (R 4.4.1)
##  jsonlite      1.8.9   2024-09-20 [1] CRAN (R 4.4.1)
##  knitr       * 1.48    2024-07-07 [1] CRAN (R 4.4.1)
##  labeling      0.4.3   2023-08-29 [1] CRAN (R 4.4.0)
##  later         1.3.2   2023-12-06 [1] CRAN (R 4.4.1)
##  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.4.1)
##  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.4.1)
##  memoise       2.0.1   2021-11-26 [1] CRAN (R 4.4.1)
##  mime          0.12    2021-09-28 [1] CRAN (R 4.4.0)
##  miniUI        0.1.1.1 2018-05-18 [1] CRAN (R 4.4.1)
##  munsell       0.5.1   2024-04-01 [1] CRAN (R 4.4.1)
##  pillar        1.9.0   2023-03-22 [1] CRAN (R 4.4.1)
##  pkgbuild      1.4.4   2024-03-17 [1] CRAN (R 4.4.1)
##  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.4.1)
##  pkgload       1.4.0   2024-06-28 [1] CRAN (R 4.4.1)
##  profvis       0.4.0   2024-09-20 [1] CRAN (R 4.4.1)
##  promises      1.3.0   2024-04-05 [1] CRAN (R 4.4.1)
##  purrr         1.0.2   2023-08-10 [1] CRAN (R 4.4.1)
##  R6            2.5.1   2021-08-19 [1] CRAN (R 4.4.1)
##  ragg          1.3.3   2024-09-11 [1] CRAN (R 4.4.1)
##  Rcpp          1.0.13  2024-07-17 [1] CRAN (R 4.4.1)
##  remotes       2.5.0   2024-03-17 [1] CRAN (R 4.4.1)
##  rlang         1.1.4   2024-06-04 [1] CRAN (R 4.4.1)
##  rmarkdown     2.28    2024-08-17 [1] CRAN (R 4.4.1)
##  rstudioapi    0.17.0  2024-10-16 [1] CRAN (R 4.4.1)
##  sass          0.4.9   2024-03-15 [1] CRAN (R 4.4.1)
##  scales        1.3.0   2023-11-28 [1] CRAN (R 4.4.1)
##  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.4.1)
##  shiny         1.9.1   2024-08-01 [1] CRAN (R 4.4.1)
##  systemfonts   1.1.0   2024-05-15 [1] CRAN (R 4.4.1)
##  textshaping   0.4.0   2024-05-24 [1] CRAN (R 4.4.1)
##  tibble        3.2.1   2023-03-20 [1] CRAN (R 4.4.1)
##  tidyselect    1.2.1   2024-03-11 [1] CRAN (R 4.4.1)
##  urlchecker    1.0.1   2021-11-30 [1] CRAN (R 4.4.1)
##  usethis     * 3.0.0   2024-07-29 [1] CRAN (R 4.4.1)
##  utf8          1.2.4   2023-10-22 [1] CRAN (R 4.4.1)
##  vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.4.1)
##  withr         3.0.1   2024-07-31 [1] CRAN (R 4.4.1)
##  xfun          0.47    2024-08-17 [1] CRAN (R 4.4.1)
##  xtable        1.8-4   2019-04-21 [1] CRAN (R 4.4.1)
##  yaml          2.3.10  2024-07-26 [1] CRAN (R 4.4.1)
## 
##  [1] C:/Users/nayak/AppData/Local/R/win-library/4.4
##  [2] C:/Program Files/R/R-4.4.1/library
## 
## ──────────────────────────────────────────────────────────────────────────────

Sources Cited

Wickham, H. et al. (2023) dplyr: A grammar of data manipulation.
Wickham, H. (2016) ggplot2: Elegant graphics for data analysis. Springer-Verlag New York.