Dr Paul Brennan
My aim was to create a package that visualises proteins using data from the UniProt website.
Workflow:
library(magrittr)
library(drawProteins)
library(httr)
library(ggplot2)
# accession numbers of hair keratin
"Q14533 P19012 P02538" %>%
drawProteins::get_features() ->
protein_json
[1] "Download has worked"
# turn JSON object into a dataframe
protein_json %>%
drawProteins::feature_to_dataframe() ->
prot_data
# series of functions to visualise
prot_data %>%
geom_chains() %>%
geom_domains() %>%
geom_region() %>%
geom_motif() %>%
geom_phospho(size = 8) -> p
p <- p + theme_bw(base_size = 20) + # white background and change text size
theme(panel.grid.minor=element_blank(),
panel.grid.major=element_blank()) +
theme(axis.ticks = element_blank(),
axis.text.y = element_blank()) +
theme(panel.border = element_blank())
New concepts
devtools::document() - Roxygen tags
devtools::test()
devtools::check()
Finally, with a bit of effort, I managed to bring devtools::check() and Travis CI into parallel
No ERRORS but WARNINGS and NOTES
How to be a good documentor
Becoming a better tester
Bioconductor has more checks than CRAN
(add bioc_required: true to .travis.yml)
devtools::check() shows: 0 errors | 0 warnings | 1 note
geom_domains: no visible binding for global variable ‘prot_data’
# show the function
geom_chains <- function(prot_data = prot_data,
outline = "black",
fill = "grey",
label_chains = TRUE,
labels = prot_data[prot_data$type == "CHAIN",]$entryName,
size = 0.5,
label_size = 4){
p <- ggplot2::ggplot() +
ggplot2::ylim(0.5, max(prot_data$order)+0.5) +
Some arguments are passed in, e.g. prot_data, BUT not all of them…
geom_domains <- function(p,
label_domains = TRUE,
label_size = 4){
p <- p +
ggplot2::geom_rect(data= prot_data[prot_data$type == "DOMAIN",],
mapping=ggplot2::aes(xmin=begin,
Clearer here: prot_data and begin are NOT passed in to the function, so R has to find them…
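The lookup behaviour behind this note can be reproduced outside the package. The sketch below is illustrative only: the toy data frame and the two helper functions (count_domains_implicit, count_domains_explicit) are hypothetical names, not part of drawProteins. It uses codetools::findGlobals(), from the codetools package that ships with R, to list the names a function would have to resolve outside its own arguments and local variables.

```r
# Illustrative sketch only: toy data and hypothetical helper names,
# not the drawProteins implementation.
library(codetools)  # ships with R; provides findGlobals()

# Toy stand-in for the feature data frame
# (column names follow the excerpts above).
prot_data <- data.frame(
  type  = c("CHAIN", "DOMAIN", "DOMAIN"),
  begin = c(1, 10, 60),
  end   = c(100, 40, 90)
)

# Implicit version: prot_data is not an argument, so R resolves it by
# searching the enclosing environments, ending at the global environment.
count_domains_implicit <- function() {
  nrow(prot_data[prot_data$type == "DOMAIN", ])
}

# Explicit version: the data frame is passed in as an argument,
# so the function no longer depends on a global variable.
count_domains_explicit <- function(prot_data) {
  nrow(prot_data[prot_data$type == "DOMAIN", ])
}

# Names each function needs from outside its own scope.
implicit_globals <- findGlobals(count_domains_implicit)
explicit_globals <- findGlobals(count_domains_explicit)

"prot_data" %in% implicit_globals  # TRUE: must be found outside the function
"prot_data" %in% explicit_globals  # FALSE: supplied as an argument
```

Both versions return the same count when a prot_data object happens to exist in the workspace. Only the implicit one silently depends on it, and that hidden dependency is what the check is pointing at.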
At the CaRdiff User Group
Options