Solvent accessible surface area analysis (SASA)

Mihaly Varadi - 15th Apr 2015

Last modified - 4th Jul 2017



EXECUTIVE SUMMARY

This report describes the analysis of the relative solvent accessible surface area of the conformers in the ensembles stored in the PED (in PDB format).



1. Data assembly

The following PED entries were considered for the analysis:

MD-based: PED2AAA, PED3AAD, PED4AAB, PED5AAD, PED6AAD

Pool-based: PED5AAA, PED5AAB, PED6AAA, PED6AAC, PED7AAC

The other entries failed the steric clash filtering, the phi/psi filtering, or both.

A random subset of 1,000 folded proteins was retrieved from the Protein Data Bank. The selection criteria were: unique, single-chain protein structures with at most 30% similarity. The resulting reference set of folded proteins contained:

  • 1,000 PDB files
  • 232,890 residues

The PDB files are saved in the REF subdirectory.
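
The residue count of a reference set like the one above could be reproduced with a short script. The following is only a minimal sketch, not the script used for this report: the function names, the "*.pdb" file extension and the choice to count only the first model of each file are assumptions made for illustration.

```python
from pathlib import Path


def count_residues(pdb_text):
    """Count distinct residues in the ATOM records of PDB-format text."""
    seen = set()
    for line in pdb_text.splitlines():
        if line.startswith("ENDMDL"):
            break  # assumption: only the first model is counted
        if line.startswith("ATOM"):
            # Fixed PDB columns: chain ID (22), residue number (23-26),
            # insertion code (27); Python slices are zero-based.
            seen.add((line[21:22], line[22:26], line[26:27]))
    return len(seen)


def count_directory(directory):
    """Return (number of files, total residues) for *.pdb files in a folder."""
    files = sorted(Path(directory).glob("*.pdb"))  # extension is an assumption
    total = sum(count_residues(f.read_text()) for f in files)
    return len(files), total


# Truncated ATOM records (coordinates omitted), enough to show the counting.
example = "\n".join((
    "ATOM      1  N   MET A   1",
    "ATOM      2  CA  MET A   1",
    "ATOM      3  N   GLY A   2",
))
n_residues = count_residues(example)
```

Calling count_directory("REF") would then give the number of files and residues in the REF subdirectory, provided the files follow the assumed naming.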



2. Calculation

The Parameter optimized surfaces (POPS) software was used for the solvent accessible surface area and solvation free energy calculations.

Fraternali, F. and Cavallo, L. Parameter optimized surfaces (POPS): analysis of key interactions and conformational changes in the ribosome. Nucleic Acids Research 30 (2002) 2950-2960

The Python wrapper script asa_pipeline.py was used to run POPS and to extract the solvent accessible surface area ratios (in chain / isolated reference) and the solvation free energies. The Python script residue_specific_values.py was used to generate the data tables per residue type. The raw POPS output files are saved in the solvent_accessibility_analysis folder.
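
To illustrate what a data table per residue type involves, the sketch below groups per-residue ratios by residue type and reports the count and mean for each type. It is not the code of residue_specific_values.py: the function name, the input format (pairs of residue type and ratio) and the summary statistics are assumptions, and the numbers are made up.

```python
from collections import defaultdict
from statistics import mean


def per_residue_type_table(records):
    """Group (residue_type, ratio) pairs and summarise each residue type.

    Returns a dict mapping residue type to (number of residues, mean ratio).
    """
    grouped = defaultdict(list)
    for residue_type, ratio in records:
        grouped[residue_type].append(ratio)
    return {
        residue_type: (len(ratios), mean(ratios))
        for residue_type, ratios in sorted(grouped.items())
    }


# Made-up ratios, for illustration only.
records = [("ALA", 0.5), ("ALA", 1.0), ("GLY", 0.25)]
table = per_residue_type_table(records)
```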

2.1 Solvent accessible surface area (SASA)

Residue-level SASA was calculated using POPS. Residue-level values are the sums of the atomic surface areas. The ensemble-to-isolated ratios were calculated by dividing the summed atomic surface area of each residue by the reference surface area of the isolated residue. Since POPS uses a constant reference value for each of the 20 amino acids, the ratio can exceed 1 if the residue in the ensemble is more extended than the reference.
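
The ratio described above can be written as a few lines of Python. This is a minimal sketch under stated assumptions: the function name is hypothetical, and the atomic areas and the reference area are made-up numbers, not POPS constants.

```python
def sasa_ratio(atomic_areas, reference_area):
    """Return the summed atomic SASA of a residue divided by its reference area."""
    if reference_area <= 0:
        raise ValueError("reference area must be positive")
    return sum(atomic_areas) / reference_area


# Made-up values: three atoms with areas in square angstroms and a
# hypothetical reference area for the isolated residue.
ratio = sasa_ratio([10.0, 20.0, 30.0], 50.0)
is_more_extended = ratio > 1  # the residue exposes more area than the reference
```

Here the summed area (60.0) is larger than the reference (50.0), so the ratio is above 1, which is the case the paragraph above describes.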

The residue-specific SASA scores are saved in the sasa_ratios folder (in the PED and PDB subfolders).



3. Exploratory analysis

3.1 SASA ratios per residue

3.1.1.1 IDP dataset

3.1.1.2 MD IDP dataset

3.1.1.3 Pool IDP dataset

3.1.2 Folded reference dataset

3.1.3 Residue-specific differences between the IDP and folded datasets