Visualizing protein features such as domain architectures helps researchers understand protein function, compare related proteins, locate functional sites, generate hypotheses, and teach. It can also inform early drug design by highlighting structural features relevant to therapeutic targeting, support protein engineering by pinpointing regions that can be modified, and make results easier to communicate among collaborators. In this way it serves both basic research and applied science.
drawProteins is an R package for visualizing protein schematics. It helps in creating detailed and informative plots of protein domain architectures, which can be useful for presenting structural information about proteins.
First, we need to use the get_features function to download data directly from UniProt.
# Retrieving data from UniProt for NF2L2_HUMAN
prot_json <- drawProteins::get_features("Q16236")
Second, we use the feature_to_dataframe function to convert the imported data into a dataframe.
# Converting data into a dataframe
prot_df <- drawProteins::feature_to_dataframe(prot_json)
Let’s take a look at the dataframe using head(prot_df).
## type description begin end
## featuresTemp CHAIN Nuclear factor erythroid 2-related factor 2 1 605
## featuresTemp.1 DOMAIN bZIP 497 560
## featuresTemp.2 REGION Disordered 334 449
## featuresTemp.3 REGION Basic motif 499 518
## featuresTemp.4 REGION Leucine-zipper 522 529
## featuresTemp.5 REGION Disordered 571 605
## length accession entryName taxid order
## featuresTemp 604 Q16236 NF2L2_HUMAN 9606 1
## featuresTemp.1 63 Q16236 NF2L2_HUMAN 9606 1
## featuresTemp.2 115 Q16236 NF2L2_HUMAN 9606 1
## featuresTemp.3 19 Q16236 NF2L2_HUMAN 9606 1
## featuresTemp.4 7 Q16236 NF2L2_HUMAN 9606 1
## featuresTemp.5 34 Q16236 NF2L2_HUMAN 9606 1
For Nrf2, the types of features include CHAIN, DOMAIN, REGION, MOTIF, COMPBIAS, MOD_RES, CARBOHYD, VAR_SEQ, VARIANT, MUTAGEN, CONFLICT, TURN, STRAND and HELIX.
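To see which feature types a dataframe holds, and how many of each, base R's table() is enough. The sketch below uses a small stand-in dataframe, demo_df, typed in by hand from the six rows printed above; that is an assumption made so the example runs without a download, and with live data you would use prot_df instead. It also shows that, in the printed output, the length column matches end - begin, so the number of residues a feature spans is one more than that.

```r
# Stand-in for prot_df: the six rows shown in the head() output above,
# entered by hand (not downloaded).
demo_df <- data.frame(
  type = c("CHAIN", "DOMAIN", "REGION", "REGION", "REGION", "REGION"),
  description = c("Nuclear factor erythroid 2-related factor 2", "bZIP",
                  "Disordered", "Basic motif", "Leucine-zipper", "Disordered"),
  begin = c(1, 497, 334, 499, 522, 571),
  end = c(605, 560, 449, 518, 529, 605),
  stringsAsFactors = FALSE
)

# Count how many features of each type are present
type_counts <- table(demo_df$type)
print(type_counts)

# The printed length column corresponds to end - begin;
# the residue count is one more than that.
demo_df$length <- demo_df$end - demo_df$begin
demo_df$n_residues <- demo_df$length + 1
print(demo_df[, c("type", "begin", "end", "length", "n_residues")])
```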
Following the procedure described in the drawProteins vignette, we first set the canvas for the protein(s), then plot the chains, and lastly the domains. After that we can draw other features, such as regions, repeats, motifs and phosphorylation sites.
Setting the canvas with draw_canvas(prot_df).
Plotting the chains with draw_chains(ptn_plot, prot_df).
Plotting the domains with draw_domains(ptn_plot, prot_df).
Plotting regions with draw_regions(ptn_plot, prot_df).
Plotting repeats with draw_repeat(ptn_plot, prot_df).
Plotting motifs with draw_motif(ptn_plot, prot_df).
Plotting phosphorylation sites with draw_phospho(ptn_plot, prot_df).
Nrf2 has two phosphorylation sites at positions 40 and 215.
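Phosphorylation sites appear in the feature table as MOD_RES rows whose description names a phosphorylated residue, so they can be listed with a simple filter. This is a minimal sketch on a hand-built stand-in, mod_df: the two positions come from the sentence above, while the description strings and the extra non-phospho row are illustrative placeholders, not downloaded data. With live data the same filter could be applied to prot_df.

```r
# Hypothetical stand-in for the MOD_RES rows of prot_df.
# Positions 40 and 215 come from the text; the descriptions and the
# third row are placeholders for illustration only.
mod_df <- data.frame(
  type = c("MOD_RES", "MOD_RES", "MOD_RES"),
  description = c("Phosphoserine", "Phosphoserine", "Some other modification"),
  begin = c(40, 215, 300),
  end = c(40, 215, 300),
  stringsAsFactors = FALSE
)

# Keep modified residues whose description mentions phosphorylation
is_phospho <- mod_df$type == "MOD_RES" & grepl("Phospho", mod_df$description)
phospho_sites <- mod_df[is_phospho, c("description", "begin")]
print(phospho_sites)
```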
# Load the packages used below; theme_solid() comes from ggthemes
library(drawProteins)
library(ggplot2)
library(ggthemes)

# Build the schematic layer by layer
ptn_plot <- draw_canvas(prot_df)
ptn_plot <- draw_chains(ptn_plot, prot_df, fill = 'gray95', outline = 'gray15')
ptn_plot <- draw_domains(ptn_plot, prot_df)
ptn_plot <- draw_repeat(ptn_plot, prot_df)
ptn_plot <- draw_motif(ptn_plot, prot_df)
ptn_plot <- draw_phospho(ptn_plot, prot_df, size = 8)

# Style the plot, then display it
ptn_plot <- ptn_plot +
  scale_fill_brewer(palette = 'Spectral', name = NULL) +
  theme_solid() +
  theme(legend.position = 'top')
ptn_plot
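A draw_* layer only has something to show when the matching feature type is present in the data. As a rough check, the sketch below compares the feature types listed earlier for Nrf2 with the type each layer is understood to draw. The layer-to-type mapping (layer_types) is an assumption inferred from the function names and UniProt's feature keys; the text above does not state it.

```r
# Feature types reported for Nrf2 in the text above
nrf2_types <- c("CHAIN", "DOMAIN", "REGION", "MOTIF", "COMPBIAS", "MOD_RES",
                "CARBOHYD", "VAR_SEQ", "VARIANT", "MUTAGEN", "CONFLICT",
                "TURN", "STRAND", "HELIX")

# Assumed mapping from each drawing function to the feature type it uses
layer_types <- c(draw_chains = "CHAIN",
                 draw_domains = "DOMAIN",
                 draw_regions = "REGION",
                 draw_repeat = "REPEAT",
                 draw_motif = "MOTIF",
                 draw_phospho = "MOD_RES")

# TRUE where the protein has at least one feature of the required type
has_data <- layer_types %in% nrf2_types
names(has_data) <- names(layer_types)
print(has_data)
```

Under that mapping, only draw_repeat comes back FALSE for this list of types, which suggests the draw_repeat call has no repeat features to add to the Nrf2 plot.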