The RMSDs between predicted and observed (reference-corrected) chemical shifts are:

The last histogram closely resembles a gamma distribution, so I analyze the results further on the \(x^{1/3}\) scale, since the cube root of a gamma-distributed variable is approximately normal.
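The effect of the cube-root transformation can be checked on simulated data. The sketch below is only an illustration: the gamma shape and rate are arbitrary values chosen for the example, not estimates from the RMSD data, and `skewness` is a helper defined here.

```r
# Simulated gamma data (shape and rate are arbitrary illustration values).
set.seed(1)
x <- rgamma(5000, shape = 2, rate = 4)

# Sample skewness: third central moment divided by the cubed standard deviation.
skewness <- function(v) mean((v - mean(v))^3) / sd(v)^3

# Cube-root transformation.
y <- x^(1/3)

skew_raw  <- skewness(x)   # clearly positive for a gamma sample
skew_cbrt <- skewness(y)   # much closer to 0 after the transformation
```

A skewness near zero after the transformation is what makes a normal mixture a plausible model on the \(x^{1/3}\) scale.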

Here I used the mixtools package to fit a mixture of two distributions.

There are only two possible classifications for the data:

  • Classification 1: the means are 0.3688677 and 0.4651583, and the standard deviations are 0.12236236 and 0.04692031.

  • Classification 2: the means are 0.2100500 and 0.4031785, and the standard deviations are 0.0792411 and 0.1059064.
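For readers unfamiliar with how such a two-component fit is obtained, here is a minimal sketch of the EM algorithm for a mixture of two normals in base R. It is a hand-written stand-in, not the mixtools code (mixtools provides `normalmixEM` for this purpose), and the data are synthetic; `fit_two_normals` and all parameter values are assumptions made for the illustration.

```r
# Hand-written EM for a two-component normal mixture (illustration only;
# this is not the mixtools implementation).
fit_two_normals <- function(x, mu, sigma, lambda = c(0.5, 0.5), iters = 200) {
  for (i in seq_len(iters)) {
    # E-step: posterior probability that each point belongs to each component.
    d1 <- lambda[1] * dnorm(x, mu[1], sigma[1])
    d2 <- lambda[2] * dnorm(x, mu[2], sigma[2])
    r1 <- d1 / (d1 + d2)
    r2 <- 1 - r1
    # M-step: update mixing weights, means and standard deviations.
    lambda <- c(mean(r1), mean(r2))
    mu     <- c(sum(r1 * x) / sum(r1), sum(r2 * x) / sum(r2))
    sigma  <- c(sqrt(sum(r1 * (x - mu[1])^2) / sum(r1)),
                sqrt(sum(r2 * (x - mu[2])^2) / sum(r2)))
  }
  list(lambda = lambda, mu = mu, sigma = sigma)
}

# Synthetic data (hypothetical; not the RMSD values from this report).
set.seed(42)
x <- c(rnorm(600, 0.21, 0.05), rnorm(400, 0.45, 0.05))

# Starting values matter: EM can converge to different local solutions.
fit <- fit_two_normals(x, mu = c(0.2, 0.5), sigma = c(0.1, 0.1))
```

Because EM only finds a local optimum, rerunning the fit from different starting values can return different sets of means and standard deviations, which is one way to end up with more than one candidate classification.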

Using a grid search, I estimate that the cutoff for files usable for covariance estimation is 0.37.

For every value below 0.37, I use the corresponding data file to estimate the covariance for all the amino acids.
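Applying the cutoff amounts to a simple filter over per-file values. In this sketch the file names and scores are hypothetical placeholders; only the cutoff of 0.37 comes from the text.

```r
# Hypothetical per-file scores; the names stand in for data file names.
scores <- c(file_a = 0.21, file_b = 0.36, file_c = 0.37, file_d = 0.52)
cutoff <- 0.37

# Keep only the files whose score is strictly below the cutoff.
good_files <- names(scores)[scores < cutoff]
```

A file sitting exactly at the cutoff is excluded here, since the selection rule is "below 0.37".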

## [1] "Sample size will be 729"

Estimating the covariance and testing significance

To test significance, I define good data as files with values < 0.37 and bad data as files with values > 0.48, and test whether the covariance differs between the two samples. 20 of the 57 tests are reported as significant, so I estimate the covariance from the good-data samples (< 0.37) only. The results are below.
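The test used to compare the two covariances is not specified here. As one possible approach, the sketch below uses a permutation test on the difference in covariance between two groups. This is an assumed substitute, not necessarily the test behind the p-values in this report; `cov_perm_test` and the synthetic data are hypothetical.

```r
# Permutation test for a difference in covariance between two groups.
# x1, y1: paired values in group 1; x2, y2: paired values in group 2.
cov_perm_test <- function(x1, y1, x2, y2, n_perm = 2000) {
  observed <- cov(x1, y1) - cov(x2, y2)
  x <- c(x1, x2)
  y <- c(y1, y2)
  n1 <- length(x1)
  n  <- length(x)
  perm <- replicate(n_perm, {
    idx <- sample(n, n1)   # randomly reassign pairs to group 1
    cov(x[idx], y[idx]) - cov(x[-idx], y[-idx])
  })
  # Two-sided p-value with the usual +1 correction.
  (sum(abs(perm) >= abs(observed)) + 1) / (n_perm + 1)
}

# Synthetic groups (hypothetical): group 1 is negatively correlated,
# group 2 is uncorrelated.
set.seed(7)
x1 <- rnorm(300); y1 <- -0.6 * x1 + rnorm(300, sd = 0.8)
x2 <- rnorm(300); y2 <- rnorm(300)
p_value <- cov_perm_test(x1, y1, x2, y2)
```

A small p-value indicates that the covariance differs between the good-data and bad-data groups, which is the argument for not pooling them when estimating the covariance.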

##       v     p.value               
##  [1,] "A-B" "0.737142714850322"   
##  [2,] "A-C" "0.0102042045470334"  
##  [3,] "A-H" "0.15238928550741"    
##  [4,] "R-B" "0.852376358109126"   
##  [5,] "R-C" "0.000754221548288747"
##  [6,] "R-H" "0.616361406848271"   
##  [7,] "N-B" "0.00828378008976971" 
##  [8,] "N-C" "0.59184421895955"    
##  [9,] "N-H" "0.349122849369314"   
## [10,] "D-B" "0.0171643909170487"  
## [11,] "D-C" "3.41196493591767e-06"
## [12,] "D-H" "0.049300158039949"   
## [13,] "C-B" "0.0189771133982759"  
## [14,] "C-C" "0.362608626431839"   
## [15,] "C-H" "0.00114826100651721" 
## [16,] "Q-B" "0.513328561865517"   
## [17,] "Q-C" "0.00652971216172382" 
## [18,] "Q-H" "0.163798135846292"   
## [19,] "E-B" "0.403456859932128"   
## [20,] "E-C" "0.291462646922404"   
## [21,] "E-H" "0.0265514929937438"  
## [22,] "H-B" "0.483089639659335"   
## [23,] "H-C" "0.773501951820805"   
## [24,] "H-H" "0.359683642963597"   
## [25,] "I-B" "0.166032256398974"   
## [26,] "I-C" "0.355185645314864"   
## [27,] "I-H" "0.0253774714413806"  
## [28,] "L-B" "0.0410519270691061"  
## [29,] "L-C" "0.00151316083480402" 
## [30,] "L-H" "0.570446967817319"   
## [31,] "K-B" "0.565190727075138"   
## [32,] "K-C" "0.124131758351224"   
## [33,] "K-H" "1.05135657779698e-05"
## [34,] "M-B" "0.988063039237699"   
## [35,] "M-C" "0.0100082605335099"  
## [36,] "M-H" "0.920674171571159"   
## [37,] "F-B" "0.441535379458004"   
## [38,] "F-C" "0.0157426704106727"  
## [39,] "F-H" "0.515723869755458"   
## [40,] "P-B" "0.115839386987015"   
## [41,] "P-C" "0.041013062171352"   
## [42,] "P-H" "0.407787408572974"   
## [43,] "S-B" "0.0140567153901503"  
## [44,] "S-C" "0.348738036097518"   
## [45,] "S-H" "0.0225359135698493"  
## [46,] "T-B" "6.68529146352626e-06"
## [47,] "T-C" "0.000823267392149107"
## [48,] "T-H" "0.00032282638144876" 
## [49,] "Y-B" "0.257062103123471"   
## [50,] "Y-C" "0.00580678697605563" 
## [51,] "Y-H" "0.952058985867915"   
## [52,] "W-B" "0.498675977146035"   
## [53,] "W-C" "0.385558685405271"   
## [54,] "W-H" "0.453685420324706"   
## [55,] "V-B" "0.221829774085398"   
## [56,] "V-C" "0.439396782123487"   
## [57,] "V-H" "4.1903946979005e-10"

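Number of p-values reported as less than 0.05. Note that a direct count of the table above gives 24; the reported figure is consistent with the four values printed in scientific notation (D-C, K-H, T-B and V-H) being missed.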

## [1] 20
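The table above prints the p-values in quotes, that is, as character strings. In R, comparing a character vector with a number converts the number to text and compares the strings, so a value such as "3.41196493591767e-06" does not register as below 0.05. The sketch below shows the difference on the p-values copied from the table; the variable names are chosen for this example.

```r
# p-values exactly as printed in the table above (character strings).
p_chr <- c(
  "0.737142714850322", "0.0102042045470334", "0.15238928550741",
  "0.852376358109126", "0.000754221548288747", "0.616361406848271",
  "0.00828378008976971", "0.59184421895955", "0.349122849369314",
  "0.0171643909170487", "3.41196493591767e-06", "0.049300158039949",
  "0.0189771133982759", "0.362608626431839", "0.00114826100651721",
  "0.513328561865517", "0.00652971216172382", "0.163798135846292",
  "0.403456859932128", "0.291462646922404", "0.0265514929937438",
  "0.483089639659335", "0.773501951820805", "0.359683642963597",
  "0.166032256398974", "0.355185645314864", "0.0253774714413806",
  "0.0410519270691061", "0.00151316083480402", "0.570446967817319",
  "0.565190727075138", "0.124131758351224", "1.05135657779698e-05",
  "0.988063039237699", "0.0100082605335099", "0.920674171571159",
  "0.441535379458004", "0.0157426704106727", "0.515723869755458",
  "0.115839386987015", "0.041013062171352", "0.407787408572974",
  "0.0140567153901503", "0.348738036097518", "0.0225359135698493",
  "6.68529146352626e-06", "0.000823267392149107", "0.00032282638144876",
  "0.257062103123471", "0.00580678697605563", "0.952058985867915",
  "0.498675977146035", "0.385558685405271", "0.453685420324706",
  "0.221829774085398", "0.439396782123487", "4.1903946979005e-10"
)

# Text comparison: 0.05 is converted to "0.05" and strings are compared.
n_chr <- sum(p_chr < 0.05)

# Numeric comparison: convert the strings to numbers first.
n_num <- sum(as.numeric(p_chr) < 0.05)
```

With these values the numeric comparison counts 24 p-values below 0.05, while the text comparison counts 20. Storing p-values as numbers, for example in a data frame rather than a character matrix, avoids the problem.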

The actual covariance values are:

Each entry lists, in order: sample size, covariance, correlation.

## $`A-B`
## [1] 1186.0000000   -0.9933715   -0.3765986
## 
## $`A-C`
## [1] 1766.000000   -0.577054   -0.279880
## 
## $`A-H`
## [1] 2604.0000000   -0.3102361   -0.3439466
## 
## $`R-B`
## [1] 1064.0000000   -1.0488134   -0.4012831
## 
## $`R-C`
## [1] 1338.0000000   -0.7331641   -0.2009029
## 
## $`R-H`
## [1] 1.480000e+03 3.160306e-03 2.446663e-03
## 
## $`N-B`
## [1] 626.00000000   0.22535841   0.08110761
## 
## $`N-C`
## [1] 1820.0000000   -0.4554736   -0.2103610
## 
## $`N-H`
## [1] 780.0000000  -0.2001824  -0.1306576
## 
## $`D-B`
## [1] 837.0000000  -0.1029460  -0.0381642
## 
## $`D-C`
## [1] 2523.0000000   -0.4825455   -0.1164453
## 
## $`D-H`
## [1] 1.328000e+03 5.057505e-02 3.286730e-02
## 
## $`C-B`
## [1] 433.0000000  -6.0550625  -0.4229983
## 
## $`C-C`
## [1] 452.0000000  -7.9736806  -0.5084138
## 
## $`C-H`
## [1] 350.0000000 -19.5210649  -0.3542063
## 
## $`Q-B`
## [1] 657.0000000  -0.8440757  -0.3335678
## 
## $`Q-C`
## [1] 1128.0000000   -0.9333932   -0.2411145
## 
## $`Q-H`
## [1] 1381.0000000   -0.2018717   -0.1663326
## 
## $`E-B`
## [1] 1189.0000000   -1.0354833   -0.4141322
## 
## $`E-C`
## [1] 2144.0000000   -0.9967293   -0.3922832
## 
## $`E-H`
## [1] 2640.0000000   -0.1796031   -0.1493640
## 
## $`H-B`
## [1] 470.00000000   0.19239885   0.05524868
## 
## $`H-C`
## [1] 747.00000000  -0.17222415  -0.04460166
## 
## $`H-H`
## [1] 562.0000000   0.2840652   0.1219700
## 
## $`I-B`
## [1] 1834.0000000   -0.4351673   -0.1634190
## 
## $`I-C`
## [1] 916.0000000  -0.7237291  -0.2381589
## 
## $`I-H`
## [1] 1369.0000000    0.4364401    0.2206447
## 
## $`L-B`
## [1] 2065.0000000   -0.8016776   -0.3382421
## 
## $`L-C`
## [1] 1801.0000000   -0.5365736   -0.2101911
## 
## $`L-H`
## [1] 2823.0000000   -0.3101273   -0.2103686
## 
## $`K-B`
## [1] 1378.0000000   -0.7245197   -0.3080199
## 
## $`K-C`
## [1] 2148.0000000   -0.8171199   -0.2791819
## 
## $`K-H`
## [1] 1.980000e+03 1.947758e-02 1.680974e-02
## 
## $`M-B`
## [1] 388.00000000   0.10227293   0.03311725
## 
## $`M-C`
## [1] 625.0000000  -0.7526935  -0.2329029
## 
## $`M-H`
## [1] 693.0000000   1.1257735   0.3831066
## 
## $`F-B`
## [1] 1174.0000000   -0.4465210   -0.1633602
## 
## $`F-C`
## [1] 786.00000000  -0.22068540  -0.06214466
## 
## $`F-H`
## [1] 1009.0000000    0.3150024    0.1408849
## 
## $`P-B`
## [1] 641.00000000  -0.02472701  -0.01662971
## 
## $`P-C`
## [1] 2303.00000000   -0.05374849   -0.03466999
## 
## $`P-H`
## [1] 419.0000000  -0.2007258  -0.2295327
## 
## $`S-B`
## [1] 1147.0000000   -0.5161696   -0.2283037
## 
## $`S-C`
## [1] 2918.0000000   -0.7349567   -0.3661219
## 
## $`S-H`
## [1] 1154.0000000   -0.3616235   -0.2617560
## 
## $`T-B`
## [1] 1530.0000000   -0.9174117   -0.2191130
## 
## $`T-C`
## [1] 1757.0000000   -1.3653929   -0.4591621
## 
## $`T-H`
## [1] 1022.0000000   -1.3694568   -0.5102328
## 
## $`Y-B`
## [1] 1046.0000000   -0.3466151   -0.1249385
## 
## $`Y-C`
## [1] 712.00000000  -0.11854118  -0.02741412
## 
## $`Y-H`
## [1] 742.0000000   0.2044582   0.1041339
## 
## $`W-B`
## [1] 404.0000000  -0.6447850  -0.2250234
## 
## $`W-C`
## [1] 264.0000000  -0.8121571  -0.2211414
## 
## $`W-H`
## [1] 327.0000000  -0.4963807  -0.2212419
## 
## $`V-B`
## [1] 2545.0000000   -1.4931074   -0.5317318
## 
## $`V-C`
## [1] 1313.0000000   -1.3305446   -0.4439499
## 
## $`V-H`
## [1] 1550.0000000   -0.3254695   -0.2745089
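Each triple above (sample size, covariance, correlation) can be produced by a small helper like the following. This is a sketch on synthetic numbers; `cov_summary` is a hypothetical name, and dropping incomplete pairs is an assumption about how missing values would be handled.

```r
# Summarise one pair of variables as (sample size, covariance, correlation).
cov_summary <- function(x, y) {
  ok <- complete.cases(x, y)   # drop pairs with a missing value
  c(n = sum(ok), cov = cov(x[ok], y[ok]), cor = cor(x[ok], y[ok]))
}

# Synthetic example (hypothetical values, not the chemical-shift data).
x <- c(1.2, 0.8, NA, 1.9, 1.1, 0.4)
y <- c(-0.5, 0.1, 0.3, -1.0, -0.2, 0.6)
res <- cov_summary(x, y)
```

Reporting the correlation next to the covariance is useful because the covariance depends on the scale of the two variables, while the correlation is bounded between -1 and 1 and can be compared across entries.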