RefDB Data Analysis App

Xi Chen (Bill)

December 26, 2015

Background and Introduction

NMR is a powerful tool for studying protein structure, dynamics, and interaction perturbations. However, the quality of these NMR studies depends directly on the quality of the referencing, since chemical shift values for specific nuclei in proteins are measured relative to a “reference” value. Protein NMR studies are often caught in a “chicken-and-egg” dilemma, a circular dependency between reference correction and assignment: a correct reference is required for a decent assignment, and a coherent assignment is needed to validate the referencing.

Any research on proteins will encounter two types of technology: NMR (nuclear magnetic resonance) and crystallography. For many reasons, NMR offers greater flexibility and convenience, and the data it produces are intuitive and easy to understand. However, as noted above, NMR has a fundamental issue: the data must be referenced before any analysis.

Therefore, to better understand the RefDB data, I made this app, which illustrates the statistics in the dataset.

Shiny App: RefDB Data Analysis

What does the app look like?

RefDB Data Analysis

Simple Histogram

For the CA resonance of alanine in helix structure

library(ggplot2)
# Alanine CA shifts in helices; read.table stores them in column V1
A.H.CA <- read.table("~/Dropbox/NMR Referencing/NMRRefCorr/data/Generated-NA/A-H-CA.txt", quote="\"", comment.char="")
# Density histogram overlaid with a normal curve using the sample mean and sd
plt <- ggplot(data = A.H.CA, aes(x = V1)) +
  geom_histogram(aes(y = after_stat(density)), fill = 'red', binwidth = 0.20, color = 'black') +
  stat_function(fun = dnorm, args = list(mean = mean(A.H.CA$V1, na.rm = TRUE), sd = sd(A.H.CA$V1, na.rm = TRUE)))
suppressWarnings(print(plt))

This figure shows that the data are roughly normally distributed.
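A histogram gives only a visual impression of normality. As a rough complement, a Q-Q plot and a Shapiro-Wilk test can be sketched as follows. The original A-H-CA.txt file is not reproduced here, so the `shifts` vector is simulated (an assumption, not RefDB data), with its mean and sd borrowed from the Ala helix entry of the alpha-carbon table in this document.

```r
# Simulated stand-in for the alanine helix CA shifts (NOT the real RefDB data).
# Mean and sd are taken from the Ala row of the alpha-carbon table (Helix.mu, Helix.sd).
set.seed(1)
shifts <- rnorm(500, mean = 54.83, sd = 1.05)

# Q-Q plot: points lying close to the reference line suggest approximate normality.
qqnorm(shifts, main = "Normal Q-Q plot (simulated Ala helix CA shifts)")
qqline(shifts, col = "red")

# Shapiro-Wilk test: a small p-value would argue against normality.
sw <- shapiro.test(shifts)
print(sw$p.value)
```

With the real data, `shifts` would be replaced by `A.H.CA$V1`. Note that `shapiro.test` accepts between 3 and 5000 values, so a larger dataset would need to be subsampled first.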

Calculated RefDB Statistics for alpha Carbon

##     Residue Coil.mu Coil.sd Helix.mu Helix.sd Beta.mu Beta.sd Avg.mu Avg.sd
## 1       Ala  52.840   1.640   54.830     1.05  51.530    1.48  53.44  1.910
## 2   Cys(ox)  55.600   2.580   58.000     2.54  54.990    2.00  56.02  2.670
## 3  Cys(red)  57.530   3.050   61.310     3.50  56.880    2.02  58.40  3.320
## 4       Cys  56.565   2.815   59.655     3.02  55.935    2.01  57.21  2.995
## 5       Asp  54.180   1.600   56.700     1.61  53.870    1.64  54.90  2.010
## 6       Glu  56.870   1.820   59.110     1.16  55.520    1.67  57.66  2.090
## 7       Phe  57.980   2.020   60.810     1.90  56.650    1.59  58.43  2.570
## 8       Gly  45.510   1.050   46.910     1.10  45.220    1.17  45.63  1.180
## 9       His  55.860   1.960   59.040     1.74  55.090    1.78  56.65  2.440
## 10      Ile  61.030   1.900   64.570     1.74  60.050    1.57  61.89  2.650
## 11      Lys  56.590   1.780   58.930     1.44  55.400    1.34  57.12  2.110
## 12      Leu  54.920   1.700   57.520     1.23  54.080    1.31  55.78  2.060
## 13      Met  55.670   1.540   58.090     1.81  54.580    1.24  56.58  2.200
## 14      Asn  53.230   1.510   55.450     1.42  52.740    1.47  53.69  1.820
## 15      Pro  63.470   1.260   65.490     1.08  62.640    1.03  63.61  1.460
## 16      Gln  56.120   1.720   58.470     1.19  54.830    1.41  56.77  2.050
## 17      Arg  56.420   1.940   58.930     1.55  55.140    1.64  57.11  2.290
## 18      Ser  58.380   1.690   60.880     1.61  57.540    1.40  58.74  2.010
## 19      Thr  61.640   2.070   65.610     2.39  61.060    1.59  62.31  2.650
## 20      Val  62.060   2.160   66.160     1.55  60.830    1.64  62.82  2.900
## 21      Trp  57.780   1.710   60.010     1.77  56.410    1.87  58.05  2.340
## 22      Tyr  57.970   2.170   60.980     1.76  56.830    1.71  58.21  2.520
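The table does not say how the combined Cys row was obtained. Its values are consistent with a simple element-wise average of the Cys(ox) and Cys(red) rows, which the short check below illustrates. This is an observation about the numbers, not a statement of the method actually used; the variable names are made up for this sketch.

```r
# Values copied from the Cys(ox) and Cys(red) rows of the table above.
cys_ox  <- c(Coil.mu = 55.60, Coil.sd = 2.58, Helix.mu = 58.00, Helix.sd = 2.54,
             Beta.mu = 54.99, Beta.sd = 2.00, Avg.mu = 56.02, Avg.sd = 2.67)
cys_red <- c(Coil.mu = 57.53, Coil.sd = 3.05, Helix.mu = 61.31, Helix.sd = 3.50,
             Beta.mu = 56.88, Beta.sd = 2.02, Avg.mu = 58.40, Avg.sd = 3.32)

# Values copied from the combined Cys row.
cys_row <- c(Coil.mu = 56.565, Coil.sd = 2.815, Helix.mu = 59.655, Helix.sd = 3.02,
             Beta.mu = 55.935, Beta.sd = 2.01, Avg.mu = 57.21, Avg.sd = 2.995)

# Element-wise average of the oxidized and reduced rows.
cys_avg <- (cys_ox + cys_red) / 2

# all.equal compares with a small numeric tolerance rather than exact equality.
print(all.equal(cys_avg, cys_row))
```

If that is indeed how the row was built, its sd columns are averages of two standard deviations rather than pooled standard deviations, which is worth keeping in mind when using the Cys row for anything quantitative.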