2 Visualizing Protein Features

Visualizing protein features, such as domain architectures, helps in understanding protein function, comparing related proteins, identifying functional sites, generating hypotheses, and teaching. It can aid early drug design by revealing structural features relevant to therapeutic targeting, support protein engineering by pinpointing modifiable regions, and improve communication among researchers. In this way, visualization supports both basic and applied research.

2.1 drawProteins

drawProteins is an R package for visualizing protein schematics. It helps in creating detailed and informative plots of protein domain architectures, which can be useful for presenting structural information about proteins.

2.2 Using drawProteins to create a graphical representation of the Nrf2 protein

2.2.1 Importing data

First, we use the get_features function to download the feature annotations for Nrf2 (UniProt accession Q16236) directly from UniProt.

# Retrieving data from UniProt for NF2L2_HUMAN
prot_json <- drawProteins::get_features("Q16236")
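As in the drawProteins vignette, get_features can also take several accessions at once, passed as a single space-separated string. The sketch below assumes a working internet connection; the second accession, Q14145 (KEAP1_HUMAN), and the object names are illustrative additions that are not used elsewhere in this chapter.

```r
# Accessions to fetch; Q14145 (KEAP1_HUMAN) is an illustrative addition
accessions <- c("Q16236", "Q14145")

# get_features() expects one string with the accessions separated by spaces
query <- paste(accessions, collapse = " ")

# Only contact UniProt when drawProteins is installed
if (requireNamespace("drawProteins", quietly = TRUE)) {
  multi_json <- drawProteins::get_features(query)
  multi_df <- drawProteins::feature_to_dataframe(multi_json)
  # Number of feature rows retrieved per protein
  print(table(multi_df$entryName))
}
```

Each protein should get its own value in the order column of the resulting dataframe, which is what would place the proteins on separate rows of a plot.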

2.2.2 Creating a dataframe with features

Second, we use the feature_to_dataframe function to convert the imported data into a dataframe.

# Converting data into a dataframe
prot_df <- drawProteins::feature_to_dataframe(prot_json)

Let’s take a look at the dataframe using head(prot_df).

##                  type                                 description begin end
## featuresTemp    CHAIN Nuclear factor erythroid 2-related factor 2     1 605
## featuresTemp.1 DOMAIN                                        bZIP   497 560
## featuresTemp.2 REGION                                  Disordered   334 449
## featuresTemp.3 REGION                                 Basic motif   499 518
## featuresTemp.4 REGION                              Leucine-zipper   522 529
## featuresTemp.5 REGION                                  Disordered   571 605
##                length accession   entryName taxid order
## featuresTemp      604    Q16236 NF2L2_HUMAN  9606     1
## featuresTemp.1     63    Q16236 NF2L2_HUMAN  9606     1
## featuresTemp.2    115    Q16236 NF2L2_HUMAN  9606     1
## featuresTemp.3     19    Q16236 NF2L2_HUMAN  9606     1
## featuresTemp.4      7    Q16236 NF2L2_HUMAN  9606     1
## featuresTemp.5     34    Q16236 NF2L2_HUMAN  9606     1

For Nrf2, the types of features include CHAIN, DOMAIN, REGION, MOTIF, COMPBIAS, MOD_RES, CARBOHYD, VAR_SEQ, VARIANT, MUTAGEN, CONFLICT, TURN, STRAND and HELIX.
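Two details of this dataframe are easy to check by hand. The sketch below rebuilds only the six rows printed above (the object name feat is hypothetical), confirms that in those rows the length column equals end minus begin rather than the residue count, and tallies the feature types. Running the same tally on the full prot_df should list every type named above.

```r
# First six rows of prot_df, copied from the head() output above
feat <- data.frame(
  type        = c("CHAIN", "DOMAIN", "REGION", "REGION", "REGION", "REGION"),
  description = c("Nuclear factor erythroid 2-related factor 2", "bZIP",
                  "Disordered", "Basic motif", "Leucine-zipper", "Disordered"),
  begin       = c(1, 497, 334, 499, 522, 571),
  end         = c(605, 560, 449, 518, 529, 605),
  length      = c(604, 63, 115, 19, 7, 34),
  stringsAsFactors = FALSE
)

# In these rows, length is end - begin (not end - begin + 1)
length_matches <- all(feat$length == feat$end - feat$begin)
print(length_matches)

# Number of rows per feature type in this six-row excerpt
type_counts <- table(feat$type)
print(type_counts)
```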

2.2.3 Visualizing

Following the procedure described in the drawProteins vignette, we first set the canvas for the protein(s), then plot the chains, and lastly the domains. After that, we can draw other features, such as regions, repeats, motifs, and phosphorylation sites.

2.2.3.1 Canvas

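Setting the canvas with draw_canvas(prot_df). The result is stored in an object (ptn_plot in this chapter) so that each later step can add a layer to it.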

2.2.3.2 Chains

Plotting the chains with draw_chains(ptn_plot, prot_df).

2.2.3.3 Domains

Plotting the domains with draw_domains(ptn_plot, prot_df).

2.2.3.4 Regions

Plotting regions with draw_regions(ptn_plot, prot_df).

2.2.3.5 Repeat

Plotting repeats with draw_repeat(ptn_plot, prot_df).

2.2.3.6 Motif

Plotting motifs with draw_motif(ptn_plot, prot_df).

2.2.3.7 Phosphorylation sites

Plotting phosphorylation sites with draw_phospho(ptn_plot, prot_df). Nrf2 has two phosphorylation sites at positions 40 and 215.
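To see which rows a phosphorylation layer would be drawn from, the feature table can be filtered by hand. The sketch below assumes that phosphorylation sites are the MOD_RES rows whose description contains "Phospho", which is my understanding of how drawProteins selects them and not something stated in this chapter. The dataframe toy_df is a hand-made stand-in for prot_df: positions 40 and 215 come from the text, while the description strings and the non-phospho row are illustrative assumptions.

```r
# Hand-made stand-in for prot_df (see assumptions in the text above)
toy_df <- data.frame(
  type        = c("CHAIN", "MOD_RES", "MOD_RES", "MOD_RES"),
  description = c("Nuclear factor erythroid 2-related factor 2",
                  "Phosphoserine", "Phosphoserine",
                  "N6-acetyllysine"),          # hypothetical non-phospho row
  begin       = c(1, 40, 215, 596),
  end         = c(605, 40, 215, 596),
  stringsAsFactors = FALSE
)

# Keep modified residues whose description mentions phosphorylation
is_phospho <- toy_df$type == "MOD_RES" & grepl("Phospho", toy_df$description)
phospho_sites <- toy_df[is_phospho, c("description", "begin")]
print(phospho_sites)
```

Applying the same filter to the real prot_df is a quick way to confirm the site positions before plotting them.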

2.3 Wrapping it up

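# Packages: drawProteins for the draw_* layers, ggplot2 for scales and themes,
# ggthemes for theme_solid()
library(drawProteins)
library(ggplot2)
library(ggthemes)

# Build the schematic layer by layer, starting from an empty canvas
ptn_plot <- draw_canvas(prot_df)
ptn_plot <- draw_chains(ptn_plot, prot_df, fill = 'gray95', outline = 'gray15')
ptn_plot <- draw_domains(ptn_plot, prot_df)
ptn_plot <- draw_repeat(ptn_plot, prot_df)
ptn_plot <- draw_motif(ptn_plot, prot_df)
ptn_plot <- draw_phospho(ptn_plot, prot_df, size = 8)

# Styling: color palette, blank background, legend on top
ptn_plot <- ptn_plot + 
  scale_fill_brewer(palette = 'Spectral', name = NULL) + 
  theme_solid() + 
  theme(legend.position = 'top')

# Display the finished plot
ptn_plot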