Mihaly Varadi - 15th Apr 2015
Last modified - 4th Jul 2017
This report details the analysis of the ensemble data stored in the PED regarding the relative solvent accessible surface area of the conformers (stored in PDB format).
The following PED entries were considered for the analysis:
MD-based: PED2AAA, PED3AAD, PED4AAB, PED5AAD, PED6AAD
Pool-based: PED5AAA, PED5AAB, PED6AAA, PED6AAC, PED7AAC
Other entries failed the steric clash filter, the phi/psi filter, or both.
A random subset of 1000 folded proteins was retrieved from the Protein Data Bank. The selection criteria were: unique, single-chain protein structures with at most 30% similarity. Therefore the reference set of folded proteins contained:
The PDB files are saved in the REF subdirectory.
The Parameter optimized surfaces (POPS) software was used for the solvent accessible surface area and solvation free energy calculations.
Fraternali, F. and Cavallo, L. Parameter optimized surfaces (POPS): analysis of key interactions and conformational changes in the ribosome. Nucleic Acids Research 30 (2002) 2950-2960
The Python wrapper script asa_pipeline.py was used to run POPS and to extract the solvent accessible surface area ratios (in chain / isolated reference) and the solvation free energies. The Python script residue_specific_values.py was used to generate the per-residue-type data tables. The raw POPS output files are saved in the solvent_accessibility_analysis folder.
Residue-level SASA was calculated using POPS. Residue-level values are the sum of the atomic-level surface areas. The ensemble-to-isolated ratios were calculated by dividing the summed atomic surface areas of each residue by the reference surface area of the corresponding isolated residue. Since POPS uses a constant reference value for each of the 20 amino acids, the ratio can be higher than 1 if the residue in the ensemble is even more extended than the reference.
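The ratio calculation described above can be illustrated with a minimal Python sketch. It is not taken from asa_pipeline.py. The function name residue_sasa_ratios, the input record layout, and the reference values in REFERENCE_SASA are assumptions made for illustration only; in particular, the numbers are placeholders and not the constants used by POPS.

```python
from collections import defaultdict

# Placeholder reference areas (in square angstroms) for isolated residues.
# These are illustrative values only, NOT the POPS constants.
REFERENCE_SASA = {"ALA": 110.0, "GLY": 85.0}


def residue_sasa_ratios(atom_records, reference=REFERENCE_SASA):
    """Return the SASA ratio (in chain / isolated reference) per residue.

    atom_records is an iterable of (chain, residue_number, residue_name,
    atomic_sasa) tuples; this layout is an assumption of the sketch.
    """
    totals = defaultdict(float)
    for chain, resnum, resname, atomic_sasa in atom_records:
        # Residue-level SASA is the sum of the atomic-level surface areas.
        totals[(chain, resnum, resname)] += atomic_sasa
    # Divide each residue total by the constant reference for its type.
    return {
        key: total / reference[key[2]]
        for key, total in totals.items()
    }


atoms = [
    ("A", 1, "ALA", 50.0),
    ("A", 1, "ALA", 71.0),
    ("A", 2, "GLY", 42.5),
]
ratios = residue_sasa_ratios(atoms)
```

With these placeholder numbers the alanine sums to 121.0 against a reference of 110.0, so its ratio is above 1, which is the situation described for residues that are more extended than the reference.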
The residue-specific SASA scores are saved in the sasa_ratios folder (and PED, PDB subfolders).
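As a rough illustration of how per-residue-type tables could be assembled from such ratios, the following Python sketch groups values by residue type. It is not the content of residue_specific_values.py. The function names, the (residue_name, ratio) input format, and the choice of count and mean as summary columns are all assumptions.

```python
from collections import defaultdict
from statistics import mean


def group_by_residue_type(pairs):
    """Collect ratios into one list per residue type.

    pairs is an iterable of (residue_name, ratio) tuples; this input
    format is an assumption of the sketch.
    """
    by_type = defaultdict(list)
    for resname, ratio in pairs:
        by_type[resname].append(ratio)
    # Sort by residue name so the table has a stable row order.
    return dict(sorted(by_type.items()))


def summarize(by_type):
    """Return {residue_name: (count, mean_ratio)} for each residue type."""
    return {
        resname: (len(values), mean(values))
        for resname, values in by_type.items()
    }


pairs = [("GLY", 0.25), ("ALA", 1.0), ("ALA", 0.5)]
table = group_by_residue_type(pairs)
summary = summarize(table)
```

Keeping the raw per-type lists separate from the summary makes it possible to compare full distributions between the PED ensembles and the folded reference set, not just their averages.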