Analysis of the results of a label-free mass spectrometry experiment on paraffin-embedded samples, stratified as follows:
| Sample | Etiology | Cellular Composition |
|---|---|---|
| BH-321-P1 | LAA | RBC |
| GOTH-168-P5 | LAA | PLT/FIB |
| GOTH-172-P5 | LAA | PLT/FIB |
| ATH-012-P2 | LAA | PLT/FIB |
| ATH-011-P1 | LAA | RBC |
| BH-316-P1 | LAA | RBC |
| NICN-213-P2 | LAA | PLT/FIB |
| BH-323-P3 | LAA | RBC |
| NICN-167-P2 | LAA | PLT/FIB |
| NICN-196-P1 | LAA | PLT/FIB |
| BH-215-P2* | LAA | PLT/FIB |
| BH-308-P2 | CEE | PLT/FIB |
| GOTH-158-P2 | CEE | PLT/FIB |
| BH-287-P1 | CEE | RBC |
| BH-278-P3 | CEE | PLT/FIB |
| BH-326-P2 | CEE | PLT/FIB |
| BH-364-P2 | CEE | PLT/FIB |
| NICN-193-P4 | CEE | PLT/FIB |
| ATH-018-P1 | CEE | PLT/FIB |
| NICN-198-P1 | CEE | PLT/FIB |

The plot below shows no relationship between etiology group and the protein content of the samples.
The Venn diagram generated from the fourth data set shows that 77 proteins are shared between the CEE and LAA groups, 33 proteins are unique to the LAA samples, and 34 proteins are unique to the CEE samples. The 77 shared proteins are the ones used for the comparison between groups.
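
The three Venn regions are plain set operations on the proteins detected in each group. The sketch below is illustrative only: the function name `venn_counts` and the accessions are hypothetical, and it assumes a protein counts as "detected" in a group if it was identified in at least one sample of that group, which the analysis above does not state.

```python
def venn_counts(detected_a, detected_b):
    """Return (only_a, shared, only_b) as sets of protein identifiers."""
    a, b = set(detected_a), set(detected_b)
    return a - b, a & b, b - a


# Hypothetical accessions, for illustration only (not the study data).
laa = {"P02768", "P68871", "P02675", "P01024"}
cee = {"P02768", "P68871", "P02671"}

only_laa, shared, only_cee = venn_counts(laa, cee)
print(len(only_laa), len(shared), len(only_cee))  # prints: 2 2 1
```

Only the `shared` set would be carried forward into a between-group comparison, since a protein missing from one group has no values to compare.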
# Data Processing
After log2 transformation and normalization of the data, I performed an exploratory principal component analysis (PCA). The results show no clear separation between the groups.
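
As a rough illustration of the preprocessing step, the sketch below log2-transforms raw intensities and then centers each sample on its median. The normalization method is not named above, so median centering is an assumption here, as are the function name `log2_median_normalize` and the toy values. PCA would then be run on the resulting matrix.

```python
import math
from statistics import median


def log2_median_normalize(intensities):
    """Log2-transform and median-center each sample.

    intensities: dict mapping sample -> {protein: raw intensity}.
    Values that are None or <= 0 are treated as missing and dropped.
    """
    normalized = {}
    for sample, proteins in intensities.items():
        logged = {
            p: math.log2(v)
            for p, v in proteins.items()
            if v is not None and v > 0
        }
        if not logged:  # sample with no usable values
            normalized[sample] = {}
            continue
        center = median(logged.values())
        # Shift so that every sample has a median log2 intensity of 0.
        normalized[sample] = {p: x - center for p, x in logged.items()}
    return normalized


# Toy values, for illustration only (not the study data).
raw = {
    "S1": {"A": 8.0, "B": 16.0, "C": 32.0},
    "S2": {"A": 64.0, "B": 128.0, "C": None},
}
norm = log2_median_normalize(raw)
print(norm["S1"])  # {'A': -1.0, 'B': 0.0, 'C': 1.0}
```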
The correlation matrices of co-abundant proteins were used to build the adjacency matrices required for the network analysis. The correlation analysis was run on the following data sets:
The network analysis was built from the correlation matrices above, keeping only highly correlated protein pairs (correlation ≥ 0.80 or ≤ -0.80), and resulted in the graphs below. Node size is proportional to degree (the number of connections a protein has); red edges represent positive correlations and blue edges represent negative correlations. The optimal community structure of each graph was computed by maximizing the modularity score.
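
The step from correlation matrix to network can be sketched as follows. This is a minimal illustration, not the code used for the graphs: it assumes Pearson correlation (the correlation type is not stated above), and the names `pearson` and `correlation_network` and the toy profiles are hypothetical. Community detection by modularity maximization is not shown; it would normally be delegated to a graph library.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-zero variance assumed)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_network(abundance, threshold=0.80):
    """Build a thresholded co-abundance network.

    abundance: dict mapping protein -> list of values across samples.
    Returns (edges, degree): edges maps a protein pair to "positive" or
    "negative"; degree counts the edges attached to each protein.
    """
    proteins = sorted(abundance)
    edges = {}
    degree = {p: 0 for p in proteins}
    for i, p in enumerate(proteins):
        for q in proteins[i + 1:]:
            r = pearson(abundance[p], abundance[q])
            if abs(r) >= threshold:  # keep only strongly correlated pairs
                edges[(p, q)] = "positive" if r > 0 else "negative"
                degree[p] += 1
                degree[q] += 1
    return edges, degree


# Toy profiles, for illustration only (not the study data).
toy = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.1, 5.9, 8.0],
    "C": [4.0, 3.0, 2.1, 1.0],
    "D": [1.0, -1.0, 1.0, -1.0],
}
edges, degree = correlation_network(toy)
print(edges)
print(degree)
```

In this toy example A, B, and C form a fully connected triangle (one positive and two negative edges), while D stays isolated because none of its correlations reaches the threshold. In the graphs, the edge label would map to edge color and the degree to node size.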
I used a linear model to assess differential abundance (expression) between the two groups. This analysis identified no differentially abundant proteins.
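
For reference, a two-group comparison of this kind can be written as a per-protein linear model. The formulation below assumes ordinary least squares with a single group indicator; the exact model, any variance moderation, and the significance threshold are not specified above, so this is a generic sketch and not necessarily the model that was fitted. For protein $j$ and sample $i$, with $y_{ij}$ the normalized log2 abundance:

$$
y_{ij} = \beta_{0j} + \beta_{1j}\, g_i + \varepsilon_{ij},
\qquad
g_i =
\begin{cases}
0 & \text{if sample } i \text{ is LAA} \\
1 & \text{if sample } i \text{ is CEE}
\end{cases}
$$

$$
H_0:\ \beta_{1j} = 0,
\qquad
\hat{\beta}_{1j} = \bar{y}_{\mathrm{CEE},j} - \bar{y}_{\mathrm{LAA},j},
\qquad
t_j = \frac{\hat{\beta}_{1j}}{\mathrm{SE}(\hat{\beta}_{1j})}
$$

Under this setup $\hat{\beta}_{1j}$ is the estimated log2 fold change between the groups, and the test on $t_j$ is equivalent to a pooled-variance two-sample t-test. Because one test is run per protein, the p-values would typically be adjusted for multiple testing before any protein is called differentially abundant.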