In this report we check the rmsd values of the CA atoms in the protein PNP under different force constant compared to the rmsd values of the starting long MD trajectory used to estimate the principal curve for the binding path gate01. All the rmsd values in this report are calculated using the same frame of reference, which is the first frame from the MD trajectory.
The data to be analyzed comes from two different sources:
A long MD trajectory that displays the binding of the ligand (DIH) to the binding pocket of the protein (PNP) using a path that we called gate01 in which the ligand crosses a protein loop before arribing to the binding site.
Images comming from the OTF string method calculation that represent converged binding paths of DIH to PNP. The number of images are 21 (from primary OTF calculations) and 40 (from OTF calculations derived from reparametrizations of the primary calculation). In these images, we also specify several force constants that were applied to the protein using ColVar module of namd to keep the protein from rotating/translating.
# get into the folder
wd0<-getwd()
wd1<-paste(wd0,'/RMSD_CA', sep="")
setwd(wd1)
# load data
rmsdMD<-read.table("rmsdCA_SergioMDGate01.txt")
rmsdMD<-rmsdMD$V1
rmsd40K200<-read.table("rmsdCA_Img40K200.txt")
rmsd40K200<-rmsd40K200$V1
rmsd40K500<-read.table("rmsdCA_Img40K500.txt")
rmsd40K500<-rmsd40K500$V1
rmsd40K1000<-read.table("rmsdCA_Img40K1000.txt")
rmsd40K1000<-rmsd40K1000$V1
#create data frame with all the RMSDs
maxLen<-length(rmsdMD)
c1<-c(rmsd40K200, rep(NA, maxLen - length(rmsd40K200)))
c2<-c(rmsd40K500, rep(NA, maxLen - length(rmsd40K500)))
c3<-c(rmsd40K1000, rep(NA, maxLen - length(rmsd40K1000)))
all<-data.frame(rmsdMD,c1,c2,c3)
colnames(all) <- c("MD","I40K200","I40K500","I40K1000")
head(all)
## MD I40K200 I40K500 I40K1000
## 1 0.8630241156 3.829731 4.013243 3.934663
## 2 0.0004908327 3.848924 3.875071 3.973791
## 3 0.8565245867 3.456392 3.533393 3.608837
## 4 0.8623427153 3.491021 3.422280 3.420280
## 5 0.8345490098 3.389672 3.218209 3.464648
## 6 1.0214922428 3.607123 3.438020 3.517963
Now we simply explore the data using simple plots/boxplots:
#plot
layout(matrix(c(1,1,2,3), 2, 2, byrow = TRUE))
boxplot(all,ylab="RMSD",col=(c("grey","blue","green","red")))
plot(rmsdMD,type="l",xlab="time",ylab="RMSD",main="Molecular Dynamics")
plot(rmsd40K200,type="l",col="blue",xlab="Images",ylab="RMSD",main="OTF string method calc.")
lines(rmsd40K500,type="l",col="green")
lines(rmsd40K1000,type="l",col="red")
legend(5,4.6,c("I40K200","I40K500","I40K1000"),lty=c(1,1,1),lwd=c(1,1,1),col=c("blue","green","red"),pt.cex = 1,cex=0.7)
Although the RMSD values from OTF calculations are sliglty hihger than those from MD, the difference is not large. There is also homogeinity and even a good degree of correlation among the RMSD values from the images of the OTF method.