Mihaly Varadi - 17/02/2017
The goal of this analysis was to apply principal component analysis (PCA) to the atomic coordinates of structurally aligned IDP ensembles in order to compare ensembles with one another. PCA is so far the only published methodology for ensemble comparisons. We aim to rectify this limitation.
In the PED there were 4 sets of ensembles, each set describing a single protein:
These 14 ensembles were used in the PCA in the following manner:
1.) The conformer which had an Rg value closest to the ensemble average was selected and all other conformers were aligned to this one.
2.) The aligned set of conformers was used to extract the X, Y, Z atomic coordinates using get_coords_for_pca.py
3.) The coordinates were saved in pca_table.csv files
4.) These files were used for PCA (in the script below)
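Steps 1 to 3 can be illustrated with a minimal NumPy sketch. This is not the get_coords_for_pca.py script itself, whose contents are not shown here. The function names are hypothetical, Rg is computed without mass weighting, and the alignment uses the Kabsch algorithm; all three are assumptions, since the original alignment tool and atom selection are not stated.

```python
import numpy as np

def radius_of_gyration(coords):
    """Unweighted Rg of one conformer; coords has shape (n_atoms, 3).
    Assumption: no mass weighting."""
    centered = coords - coords.mean(axis=0)
    return float(np.sqrt((centered ** 2).sum(axis=1).mean()))

def pick_reference(ensemble):
    """Index of the conformer whose Rg is closest to the ensemble mean Rg."""
    rg = np.array([radius_of_gyration(c) for c in ensemble])
    return int(np.argmin(np.abs(rg - rg.mean())))

def kabsch_align(mobile, target):
    """Rigid-body superposition of `mobile` onto `target` (Kabsch algorithm).
    Both arrays have shape (n_atoms, 3) with atoms in the same order."""
    mob_c = mobile - mobile.mean(axis=0)
    tgt_c = target - target.mean(axis=0)
    # Covariance between the two centered coordinate sets
    H = mob_c.T @ tgt_c
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return mob_c @ R.T + target.mean(axis=0)

def build_pca_table(ensemble):
    """Align every conformer to the reference and flatten to one row each.
    Returns an array of shape (n_conformers, 3 * n_atoms)."""
    ensemble = np.asarray(ensemble, dtype=float)
    ref = ensemble[pick_reference(ensemble)]
    aligned = [kabsch_align(c, ref) for c in ensemble]
    return np.stack([a.reshape(-1) for a in aligned])
```

A table built this way could then be written out with `np.savetxt("pca_table.csv", table, delimiter=",")`; the exact column layout of the original pca_table.csv files is not known and may differ.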
These two ensembles of Sendai nucleocapsid protein have been generated using different force fields. The two ensembles are markedly different according to the PCA plot: one is significantly more compact than the other.
PCA can be used to make a high-level comparison between ensembles, but it may not provide more detail than the distribution of Rg values does. When ensembles are markedly different, the PCA plots show the difference strikingly.
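One possible way to probe whether PCA adds information beyond Rg is to correlate the first principal component scores with the per-conformer Rg values. The sketch below is an illustrative check, not part of the original analysis. It assumes a coordinate table with one conformer per row and flattened X, Y, Z columns, computes PCA through an SVD of the centered table, and again uses an unweighted Rg. All function names are hypothetical.

```python
import numpy as np

def pca_scores(table, n_components=2):
    """Project rows of `table` (n_conformers, n_features) onto the leading PCs.
    Returns the scores and the fraction of variance explained per component."""
    centered = table - table.mean(axis=0)
    U, S, _ = np.linalg.svd(centered, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / (S ** 2).sum()
    return scores, explained[:n_components]

def rg_from_row(row):
    """Unweighted Rg from one flattened row of X, Y, Z coordinates."""
    coords = row.reshape(-1, 3)
    centered = coords - coords.mean(axis=0)
    return float(np.sqrt((centered ** 2).sum(axis=1).mean()))

def pc1_rg_correlation(table):
    """Pearson correlation between PC1 scores and per-conformer Rg.
    The sign is arbitrary because the direction of a PC is arbitrary."""
    scores, _ = pca_scores(table, n_components=1)
    rg = np.array([rg_from_row(r) for r in table])
    return float(np.corrcoef(scores[:, 0], rg)[0, 1])
```

An absolute correlation close to 1 would suggest that PC1 mostly tracks compactness, in which case the PCA plot and the Rg distribution carry similar information for that ensemble.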