1 Sample Information

This report analyses the results of a label-free mass spectrometry experiment on paraffin-embedded samples, stratified as follows:

1.1 Third Batch of Samples: Cellular composition criteria

  • Red blood cell-rich clots (RBC): n = 4
  • Mixed clots (mixed): n = 3
  • Platelet/fibrin clots (Fib): n = 5

2 Coverage Analysis

The coverage analysis was conducted on both data sets. The summary statistics of the coverage score for each sample are shown in the tables below. Based on these statistics and the plots, coverage behaves similarly across samples.

Third batch of samples: Cellular composition
Statistic       rbc1      rbc2      rbc3      rbc4      mix1      mix2      mix3      fib1      fib2      fib3      fib4      fib5
Min.         0.13000   0.25000   0.24000   0.05000   0.04000   0.30000   0.04000   0.45000    0.6500   0.26000   0.15000   0.19000
1st Qu.      4.03500   5.13000   4.57000   4.47000   4.54000   4.11250   4.28000   5.03000    5.3725   3.28250   4.28000   4.79500
Median       8.95000  11.00000   9.59000  10.29000   9.75000   8.73000   9.72000  10.31000    7.1800   7.31500   9.78000   9.57000
Mean        14.02422  14.88716  14.31953  14.42899  14.78163  13.55094  14.25736  15.23341   15.4750  11.51224  14.81137  15.98963
3rd Qu.     19.26000  20.05000  18.89000  19.16000  20.30000  17.94250  19.76000  21.08750   13.2950  14.51500  19.21000  22.23500
Max.        97.28000  97.28000  97.28000  97.28000 100.00000  97.28000  97.28000  66.90000   87.7600  68.47000  95.92000  63.27000
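
The summary rows above (minimum, quartiles, median, mean, maximum) can be computed for any sample's coverage scores. The following is a minimal Python sketch, assuming the quartiles are obtained by linear interpolation between order statistics. The function name `coverage_summary` and the example scores are hypothetical and are not taken from the experiment.

```python
import statistics

def coverage_summary(scores):
    """Six-number summary of one sample's coverage scores (in percent)."""
    # method="inclusive" interpolates linearly between order statistics.
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return {
        "Min.": min(scores),
        "1st Qu.": q1,
        "Median": median,
        "Mean": statistics.fmean(scores),
        "3rd Qu.": q3,
        "Max.": max(scores),
    }

# Hypothetical per-protein coverage scores for a single sample (not real data).
example_scores = [0.5, 4.0, 9.0, 12.0, 20.0, 97.0]
summary = coverage_summary(example_scores)
for name, value in summary.items():
    print(f"{name:8s}{value:10.5f}")
```

Applying the function to each sample in turn would give one column of the table per sample.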

The Venn diagram generated from the second data set shows that 101 proteins are shared between the RBC and FIB groups; this intersection is used for the comparison between groups. A further 60 proteins are unique to the RBC samples, and 115 proteins are unique to the FIB samples.
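
The three regions of a two-group Venn diagram are plain set operations on the protein identifiers detected in each group. The sketch below illustrates this, assuming each group is represented by a collection of identifiers. The function name `venn_regions` and the identifiers `P1` to `P5` are hypothetical placeholders.

```python
def venn_regions(group_a, group_b):
    """Split two collections of protein identifiers into Venn regions."""
    a, b = set(group_a), set(group_b)
    return {
        "shared": a & b,   # detected in both groups
        "only_a": a - b,   # unique to the first group
        "only_b": b - a,   # unique to the second group
    }

# Hypothetical identifiers, for illustration only.
rbc_proteins = ["P1", "P2", "P3"]
fib_proteins = ["P2", "P3", "P4", "P5"]
regions = venn_regions(rbc_proteins, fib_proteins)
print({name: len(members) for name, members in regions.items()})
```

With the counts reported above, the RBC set would contain 101 + 60 = 161 proteins and the FIB set 101 + 115 = 216 proteins.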

3 RBC Analysis

3.1 Pearson correlation

The abundance profiles of the top 50 highly correlated proteins were extracted from the correlation matrix above using a correlation threshold of ±0.80, and a network analysis of these proteins resulted in the graph below. The node size is proportional to the degree (how connected the protein is); red edges represent positive correlations and blue edges represent negative correlations. The optimal community structure of the graph was calculated by maximising the modularity score. This analysis resulted in the six communities represented by the node colours and shown in the table below.
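
The network construction described above can be sketched in a few steps: compute pairwise Pearson correlations, keep pairs with an absolute correlation of at least 0.80 as signed edges, and count the degree of each node. The Python sketch below is illustrative only and is not the code used for this report. The protein names and profiles are hypothetical, and the `modularity` helper only scores a given partition (ignoring edge signs); the search for the partition with maximal modularity is a separate optimisation that is not shown.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_network(profiles, threshold=0.80):
    """Return signed edges with |r| >= threshold and the degree of each node."""
    edges = {}
    for p, q in combinations(sorted(profiles), 2):
        r = pearson(profiles[p], profiles[q])
        if abs(r) >= threshold:
            edges[(p, q)] = "positive" if r > 0 else "negative"
    degree = {name: 0 for name in profiles}
    for p, q in edges:
        degree[p] += 1
        degree[q] += 1
    return edges, degree

def modularity(edges, community):
    """Modularity Q of a partition (dict: node -> label), edge signs ignored."""
    edges = list(edges)
    m = len(edges)
    internal, total_degree = {}, {}
    for p, q in edges:
        for node in (p, q):
            label = community[node]
            total_degree[label] = total_degree.get(label, 0) + 1
        if community[p] == community[q]:
            internal[community[p]] = internal.get(community[p], 0) + 1
    # Q = sum over communities of (internal edges / m) - (degree sum / 2m)^2
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in total_degree.items())

# Hypothetical abundance profiles across four samples (not real data).
profiles = {
    "P1": [1.0, 2.0, 3.0, 4.0],
    "P2": [2.0, 4.0, 6.0, 8.5],
    "P3": [4.0, 3.0, 2.0, 1.0],
    "P4": [1.0, 3.0, 2.0, 3.0],
}
edges, degree = correlation_network(profiles)
print(edges)
print(degree)
```

In a plot of such a network, `degree` would drive the node size and the edge sign would drive the edge colour.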

3.2 Network analysis



4 MIX Analysis

4.1 Pearson correlation

4.2 Network analysis

The co-expression network for the FIB data set was constructed with the same method used for the RBC samples, in this case using the 115 proteins unique to the FIB samples from the second batch. The optimal community analysis resulted in three communities, listed in the table below.

5 Platelet/Fibrin Analysis

5.1 Pearson correlation

6 Differential Analysis

Here I used a linear model approach to assess differential abundance (expression) between the two groups; this analysis resulted in 40 differentially abundant proteins. The tables below show the metrics of the top-ranked proteins from the linear model fit. The data were pre-processed with log-transformation and quantile normalisation to ensure that the abundance distributions of the samples are similar across the entire experiment and not skewed.
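
The pre-processing and model fit described above can be illustrated with a small Python sketch. It is not the pipeline used for this report. It assumes a base-2 logarithm, a protein-by-sample matrix without zeros or missing values, and an ordinary least-squares fit, which for two groups reduces to a difference of group means with a pooled-variance t-statistic. All function names and numbers are hypothetical.

```python
import math
import statistics

def log2_transform(matrix):
    """Log2-transform every intensity (assumes strictly positive values)."""
    return [[math.log2(v) for v in row] for row in matrix]

def quantile_normalise(matrix):
    """Give every sample (column) the same distribution of values.

    Rows are proteins, columns are samples. Ties are broken by row order,
    which is a simplification.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    columns = [[matrix[i][j] for i in range(n_rows)] for j in range(n_cols)]
    sorted_cols = [sorted(col) for col in columns]
    # Mean of the r-th smallest value across all samples.
    rank_means = [statistics.fmean([sc[r] for sc in sorted_cols])
                  for r in range(n_rows)]
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for j, col in enumerate(columns):
        order = sorted(range(n_rows), key=lambda i: col[i])
        for rank, i in enumerate(order):
            result[i][j] = rank_means[rank]
    return result

def two_group_fit(values, groups):
    """Difference in group means (group 1 minus group 0) and its t-statistic."""
    a = [v for v, g in zip(values, groups) if g == 0]
    b = [v for v, g in zip(values, groups) if g == 1]
    diff = statistics.fmean(b) - statistics.fmean(a)
    pooled_var = ((len(a) - 1) * statistics.variance(a)
                  + (len(b) - 1) * statistics.variance(b)) / (len(a) + len(b) - 2)
    std_err = math.sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    return diff, diff / std_err

# Hypothetical 3-protein x 2-sample matrix (not real data).
normalised = quantile_normalise([[5.0, 4.0], [2.0, 1.0], [3.0, 6.0]])
print(normalised)

# Hypothetical log-abundances of one protein in two groups of three samples.
diff, t_stat = two_group_fit([1.0, 2.0, 3.0, 5.0, 6.0, 7.0], [0, 0, 0, 1, 1, 1])
print(diff, t_stat)
```

Ranking proteins by the absolute t-statistic, or by a p-value derived from it, would then give a list of top-ranked candidates. Dedicated differential-abundance packages typically use moderated statistics rather than this plain t-statistic.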