1 Sample Information
2 Coverage Analysis
3 Overlapping proteins
- 3.1 Protein datasets
4 RBC Analysis
5 FIB Analysis
6 Differential Analysis
- 6.1 Co-expression network
- 6.2 Pathway Enrichment Analysis

1 Sample Information

This report analyses the results of a label-free mass spectrometry experiment on paraffin-embedded samples, stratified as follows:

Red blood cell-rich clot analogues (RBC): n = 5
Fibrin-rich clot analogues (FIB): n = 5

2 Coverage Analysis

In the previously conducted analysis, filtering the data on the coverage score proved to be a poor choice because most proteins had low coverage values. The same analysis was repeated here. The table below gives the summary statistics of the coverage score for each sample; together with the plots, it shows similar behaviour: the median coverage score is close to 10 in every sample, and three quarters of the proteins score at or below roughly 23.

Statistic	Fib1	Fib2	Fib3	Fib4	Fib5	RBC1	RBC2	RBC3	RBC4	RBC5
Min.	0.18	0.07	0.06	0.28	0.19	0.19	0.04	0.08	0.30	0.13
1st Qu.	4.73	4.535	4.38	4.38	4.30	5.03	5.01	4.71	5.6175	5.02
Median	10.17	10.075	9.98	9.525	9.42	10.94	11.69	10.95	11.065	10.90
Mean	15.71651	15.68055	15.3679	15.47617	14.15617	16.54819	15.68757	15.77537	16.03192	15.75879
3rd Qu.	21.55	23.03	20.65	21.46	19.35	22.54	20.87	21.4225	21.89	20.62
Max.	97.28	91.55	94.37	93.88	94.37	100.00	100.00	100.00	100.00	100.00
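The six statistics in the table can be reproduced for any single sample with a few lines of code. The following is a minimal Python sketch, not the code used for this report; the function name `coverage_summary` and the toy values are hypothetical, and the quartile method (linear interpolation between order statistics) is an assumption about how the table was produced.

```python
from statistics import mean, quantiles


def coverage_summary(values):
    """Return min, quartiles, mean and max of one sample's coverage scores."""
    ordered = sorted(values)
    # method="inclusive" interpolates linearly between order statistics
    # (assumed here; the report does not state the quartile definition).
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    return {
        "Min.": ordered[0],
        "1st Qu.": q1,
        "Median": median,
        "Mean": mean(ordered),
        "3rd Qu.": q3,
        "Max.": ordered[-1],
    }


# Hypothetical coverage scores for one sample (not taken from the experiment).
toy_coverage = [0.5, 4.0, 10.0, 22.0, 95.0]
summary = coverage_summary(toy_coverage)
```

A mean well above the median, as in the toy data and in every column of the table, indicates a right-skewed distribution: a few proteins with very high coverage and many with low coverage.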

3 Overlapping proteins

The Venn diagram shows that 101 proteins overlap between the RBC (Set_1) and FIB (Set_2) groups; this intersection is used for the comparison between groups. Within each group, 161 proteins are common to the RBC samples and 216 proteins are common to the FIB samples.
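The overlap counts can be understood as set intersections. The sketch below assumes that "common to the samples" means detected in every sample of the group; the function name `common_proteins` and the protein identifiers are hypothetical placeholders.

```python
def common_proteins(samples):
    """Return the proteins detected in every sample of a group."""
    return set.intersection(*(set(sample) for sample in samples))


# Hypothetical identifiers; each set is the protein list of one sample.
rbc_samples = [{"P1", "P2", "P3", "P4"}, {"P1", "P2", "P3"}, {"P1", "P2", "P3", "P5"}]
fib_samples = [{"P2", "P3", "P6", "P7"}, {"P2", "P3", "P6"}, {"P2", "P3", "P6", "P8"}]

rbc_common = common_proteins(rbc_samples)  # within-group overlap (Set_1)
fib_common = common_proteins(fib_samples)  # within-group overlap (Set_2)
shared = rbc_common & fib_common           # centre of the Venn diagram
```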

3.1 Protein datasets

The correlation matrices of the co-abundant proteins were used to create the adjacency matrices required for the network analysis. There are three distinct datasets: i) fibrin samples, with the proteins that overlap within the group; ii) red blood cell samples, with the proteins that overlap within the group; and iii) all samples, with the 101 proteins that are common to all samples.
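As an illustration of how a correlation matrix becomes an adjacency structure, the sketch below computes pairwise Pearson correlations and keeps only pairs whose absolute correlation exceeds a threshold. The 0.80 cut-off is the one quoted in the network sections of this report; the function names, the dictionary layout and the abundance values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def signed_adjacency(abundance, threshold=0.80):
    """Map each protein to its neighbours, keeping the signed correlation."""
    names = list(abundance)
    adjacency = {name: {} for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(abundance[a], abundance[b])
            if abs(r) > threshold:  # keep strong positive or negative pairs
                adjacency[a][b] = r
                adjacency[b][a] = r
    return adjacency


# Hypothetical abundance profiles across five samples.
toy_abundance = {
    "protA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "protB": [2.1, 3.9, 6.2, 8.0, 9.8],
    "protC": [5.0, 4.1, 2.9, 2.0, 1.1],
    "protD": [3.0, 1.0, 4.0, 1.0, 5.0],
}
adjacency = signed_adjacency(toy_abundance)
```

Keeping the sign of each correlation allows positive and negative edges to be drawn in different colours later.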

4 RBC Analysis

4.1 Pearson correlation

The correlation matrix above was used to extract the abundance profiles of the top 50 most highly correlated proteins, with a correlation threshold of +/- 0.80. Network analysis of these proteins produced the graph below. Node size is proportional to degree (how connected the protein is); red edges represent positive correlations and blue edges represent negative correlations. The optimal community structure of the graph was calculated by maximising the modularity score. This analysis yielded the 4 communities shown as node colours and listed in the tables below.
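To make degree and modularity concrete, the sketch below scores a partition with the standard modularity formula for an unweighted, undirected graph and finds the best partition by exhaustive search. This is only feasible for very small graphs and is not the implementation used for this report; edge signs are ignored, and all names and the toy graph are hypothetical.

```python
def degrees(edges, nodes):
    """Count the edges attached to each node."""
    degree = {node: 0 for node in nodes}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


def modularity(edges, membership):
    """Fraction of within-community edges minus the fraction expected by chance."""
    m = len(edges)
    degree = degrees(edges, membership)
    inside = sum(1 for a, b in edges if membership[a] == membership[b]) / m
    totals = {}
    for node, community in membership.items():
        totals[community] = totals.get(community, 0) + degree[node]
    expected = sum((total / (2 * m)) ** 2 for total in totals.values())
    return inside - expected


def partitions(nodes):
    """Yield every way of assigning the nodes to communities exactly once."""
    def grow(prefix, used):
        if len(prefix) == len(nodes):
            yield dict(zip(nodes, prefix))
            return
        for label in range(used + 1):
            yield from grow(prefix + [label], max(used, label + 1))
    yield from grow([], 0)


def optimal_communities(edges, nodes):
    """Exhaustive search for the partition with maximal modularity."""
    return max(partitions(nodes), key=lambda p: modularity(edges, p))


# Hypothetical graph: two triangles joined by a single edge.
toy_nodes = ["a", "b", "c", "d", "e", "f"]
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
toy_degree = degrees(toy_edges, toy_nodes)
best = optimal_communities(toy_edges, toy_nodes)
best_score = modularity(toy_edges, best)
```

In the toy graph the search separates the two triangles, and the two bridging nodes have the highest degree, so they would be drawn as the largest nodes.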

4.2 Network Analysis

4.3 Pathway Enrichment Analysis

Pathway enrichment analysis was conducted on communities 1, 2 and 4 using the InterMineR R package. Each pathway was tested for over-representation in each community relative to what is expected by chance, and a p-value was computed for each pathway. The plots below show the top 10 enriched pathways for these communities; hover over the bars for p-value information.

Community 3 yielded no significantly enriched pathways.
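Over-representation is commonly assessed with a one-sided hypergeometric test, and the sketch below assumes that test; the exact settings used in the InterMineR call are not stated in this report. The function name, argument names and counts are hypothetical.

```python
from math import comb


def enrichment_pvalue(universe, annotated, selected, overlap):
    """P(X >= overlap) when `selected` proteins are drawn from `universe`,
    of which `annotated` belong to the pathway (hypergeometric upper tail)."""
    upper = min(annotated, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(annotated, k) * comb(universe - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background proteins, 5 in the pathway,
# a community of 4 proteins, 3 of which are in the pathway.
p_value = enrichment_pvalue(universe=20, annotated=5, selected=4, overlap=3)
```

A small p-value means that seeing at least this many pathway members in the community would be unlikely if the community were a random draw from the background.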

5 FIB Analysis

5.1 Pearson correlation analysis

5.2 Network analysis

The method used to construct the co-expression network for the RBC samples was also applied to the FIB dataset, in this case using the 216 overlapping proteins. The optimal community analysis yielded four communities, which are analysed for pathway enrichment below.

5.3 Pathway enrichment analysis

6 Differential Analysis

Here I used a linear model approach to assess differential abundance between the two groups; this analysis yielded 58 differentially abundant proteins. The table below shows the metrics of the top-ranked proteins from the linear model fit.
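For a single protein and two groups, a linear model with one group indicator reduces to a comparison of group means, which the sketch below illustrates. It is an ordinary least-squares version with a pooled-variance t-statistic; whether the original fit used moderated statistics is not stated here. The function name, the direction of the contrast (second group minus first) and the values are assumptions.

```python
from math import sqrt
from statistics import mean, variance


def two_group_fit(group_a, group_b):
    """Fit y = b0 + b1 * is_group_b by least squares for one protein.

    Returns (log_fc, t_statistic), where log_fc is the slope b1, i.e. the
    difference in mean log-abundance (group_b minus group_a; assumed direction).
    """
    n_a, n_b = len(group_a), len(group_b)
    log_fc = mean(group_b) - mean(group_a)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return log_fc, log_fc / standard_error


# Hypothetical log2 abundances of one protein in five samples per group.
toy_a = [10.0, 11.0, 12.0, 11.0, 11.0]
toy_b = [13.0, 14.0, 13.0, 15.0, 15.0]
log_fc, t_stat = two_group_fit(toy_a, toy_b)
```

Repeating the fit for every protein and ranking by the test statistic gives a table of top-ranked proteins of the kind described above.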

6.1 Co-expression network

The differentially abundant protein scores were used to perform the co-expression analysis for the comparison between groups. Node colours represent the logFC values, and only edges with an absolute correlation greater than 0.80 were kept. An adjusted p-value below 0.05 was considered significant.

## Step 1 ...computing correlation
## Step 2 ...computing null distribution
## Step 3 ...computing probs
## Step 4 ...adjusting pvals
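The p-value adjustment step can be illustrated with the Benjamini-Hochberg procedure. The report does not name the adjustment method, so this choice is an assumption, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p-values for four edges.
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
significant = [i for i, p in enumerate(adjusted) if p < 0.05]
```

In this toy example three raw p-values fall below 0.05, but only one remains below the threshold after adjustment.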

6.2 Pathway Enrichment Analysis

Pathway enrichment analysis was conducted on the statistically significant positively and negatively associated proteins.

Proteomics Analysis

Mariel Barbachan

05/05/2020