MGV

https://github.com/baobabprince/MGV.git/report.qmd

Author: Rotem Hadar

Published: March 8, 2023

1 Summary table before filtering duplicates

   time  gel     Number of samples  Mean      Median  SD
1  0     Gel     28                 6725.900  3680.5  7204.986
2  0     No gel  28                 9132.704  8427.0  5962.919
3  24    h       26                 8521.962  7751.5  5669.882
4  5     Gel     28                 5547.864  2675.5  6111.219
5  5     No gel  29                 8127.179  7932.0  4367.264

Full Table

   gel     Number of samples  Mean      Median  SD        IQR1    IQR3
1  Gel     70                 6515.815  3724.0  6613.201  201.0   11448.5
2  h       28                 8729.357  7751.5  6215.736  4367.5  11182.5
3  No gel  66                 8617.333  8099.0  5094.618  4899.0  11733.0

   Number of samples  Mean      Median  SD        IQR1  IQR3
1  139                7742.163  8011    5876.105  2802  11325

SI Appendix, Table S2. Number of reads per sample type.

G0 – sample collected with lubrication and stored at -80°C within one hour of sampling.

N0 – sample collected without lubrication and stored at -80°C within one hour of sampling.

G5 – sample collected with lubrication and stored at -80°C within five hours of sampling.

N5 – sample collected without lubrication and stored at -80°C within five hours of sampling.

G24 – sample collected 24 hours after first sampling and stored at -80°C within one hour.

The rarefaction threshold was set to a minimum of 2000 reads per sample.
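As an illustration of what a 2000-read threshold implies, the following is a minimal sketch of rarefaction by subsampling reads without replacement. It is not the report's own code, which is not shown. The function name `rarefy`, the feature names, and the counts are hypothetical, and it assumes that samples below the threshold are dropped.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from a {feature: count} dict.

    Returns None when the sample holds fewer than `depth` reads, which is
    assumed here to mean the sample is excluded from the rarefied table.
    """
    total = sum(counts.values())
    if total < depth:
        return None
    # Expand to one entry per read, then draw without replacement.
    reads = [feature for feature, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    return dict(Counter(rng.sample(reads, depth)))


# Hypothetical samples (feature names and counts are made up).
deep = {"ASV_a": 3000, "ASV_b": 1500, "ASV_c": 500}
shallow = {"ASV_a": 150, "ASV_b": 51}

rarefied_deep = rarefy(deep, depth=2000)        # kept, exactly 2000 reads
rarefied_shallow = rarefy(shallow, depth=2000)  # None: below the threshold
```

Under this assumption, a sample with only a few hundred reads would not contribute to downstream diversity analyses.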

2 Reads number

2.1 Summary statistics

Paired p-values, gel vs. no gel

time  n   pval         median  sd
0     27  0.009609835  4577.5  6612.864
5     28  0.003493442  6106.5  5403.185

Paired p-values, immediate freeze vs. 5-hour delay

gel     n   pval       median  sd
Gel     28  0.7120570  985.0   6754.861
No gel  28  0.1817423  8172.5  5897.082

Figure 1a. Number of reads per sample type. 149 samples were included. Each point represents one sample; the red dashed line marks the selected rarefaction threshold (2000 reads/sample). Red, blue, and green circles represent samples collected with gel, without gel, and 24 hours after primary sample collection (in vivo), respectively. Read numbers were compared between groups using a paired t-test (paired on patient ID). Boxes show the median and interquartile range; whiskers extend from the minimum to the maximum values within each cohort.
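Pairing on patient ID means that only patients with both measurements enter the test, which may be why the number of pairs can be smaller than the group sizes. The following is a minimal sketch of the paired t statistic using only the Python standard library. The report's own code is not shown, and the function name `paired_t`, the patient IDs, and the read counts are hypothetical. Obtaining the p-value additionally requires the t distribution with n - 1 degrees of freedom (for example via `scipy.stats.ttest_rel`), which is left out here.

```python
import math
import statistics


def paired_t(x_by_patient, y_by_patient):
    """Return (t statistic, degrees of freedom, number of pairs)."""
    # Keep only patients that have both measurements.
    shared = sorted(set(x_by_patient) & set(y_by_patient))
    diffs = [x_by_patient[p] - y_by_patient[p] for p in shared]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1, n


# Hypothetical read counts keyed by patient ID (values are made up).
gel = {"P1": 1200, "P2": 3400, "P3": 800, "P4": 5100}
no_gel = {"P1": 2200, "P2": 3000, "P3": 2800, "P5": 9100}

# P4 and P5 lack a partner sample, so only three pairs are tested.
t_stat, df, n_pairs = paired_t(gel, no_gel)
```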

3 Alpha diversity

Paired t-test of alpha diversity for each group

Figure 1b. Alpha diversity, as quantified by Faith's PD, plotted against sample type. Within-sample comparison between gel and no-gel samples was performed using a paired t-test. The number of samples in each group is shown in the boxplot.
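Faith's PD is the total branch length of the phylogenetic tree that connects the taxa observed in a sample to the root. The following is a minimal sketch of that definition on a toy tree. The tree, the tip names, and the function name `faith_pd` are hypothetical, and the report's own calculation is not shown.

```python
def faith_pd(tree, observed_tips):
    """Sum the lengths of all branches on the paths from observed tips to the root.

    `tree` maps each non-root node to (parent, branch_length).
    """
    visited = set()
    total = 0.0
    for tip in observed_tips:
        node = tip
        # Walk towards the root, counting each branch only once.
        while node in tree and node not in visited:
            visited.add(node)
            parent, length = tree[node]
            total += length
            node = parent
    return total


# Hypothetical rooted tree: ((A:1, B:2)X:0.5, C:3)root
toy_tree = {
    "A": ("X", 1.0),
    "B": ("X", 2.0),
    "X": ("root", 0.5),
    "C": ("root", 3.0),
}

pd_ab = faith_pd(toy_tree, ["A", "B"])        # 1 + 2 + 0.5 = 3.5
pd_all = faith_pd(toy_tree, ["A", "B", "C"])  # 3.5 + 3 = 6.5
```

Because shared internal branches are counted once, adding a close relative of an observed taxon raises the value less than adding a distant one.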

3.1 Summary statistics

4 PCoA - unweighted UniFrac

Figure 2. Weighted UniFrac PCoA for all patient samples. Shapes represent the three sequencing runs; each color represents a different patient.
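The heading of this section names unweighted UniFrac, so the sketch below follows that variant: the distance between two samples is the branch length found in only one of them, divided by the branch length found in either. The weighted variant named in the caption would additionally use relative abundances and is not shown. The toy tree, tip names, and function names are hypothetical, and the report's own calculation is not shown.

```python
def branches_to_root(tree, tips):
    """Return the set of branches (named by child node) linking `tips` to the root."""
    covered = set()
    for tip in tips:
        node = tip
        while node in tree and node not in covered:
            covered.add(node)
            node = tree[node][0]  # move to the parent
    return covered


def unweighted_unifrac(tree, tips_a, tips_b):
    """Unique branch length divided by total observed branch length."""
    a = branches_to_root(tree, tips_a)
    b = branches_to_root(tree, tips_b)

    def length(nodes):
        return sum(tree[n][1] for n in nodes)

    total = length(a | b)
    if total == 0:
        return 0.0
    return length(a ^ b) / total


# Hypothetical rooted tree: ((A:1, B:2)X:0.5, C:3)root
toy_tree = {
    "A": ("X", 1.0),
    "B": ("X", 2.0),
    "X": ("root", 0.5),
    "C": ("root", 3.0),
}

# Branches A and C are unique (1 + 3 = 4); all observed branches sum to 6.5.
d_partial = unweighted_unifrac(toy_tree, ["A", "B"], ["B", "C"])
d_same = unweighted_unifrac(toy_tree, ["A", "B"], ["A", "B"])
d_disjoint = unweighted_unifrac(toy_tree, ["A"], ["C"])
```

A matrix of such pairwise distances is what a PCoA would take as input.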

5 Clustering by microbial population resulted in 3 main clusters

K-means clustering for Axis 1 and 2, k = 3
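To make the clustering step concrete, the following is a minimal sketch of k-means (Lloyd's algorithm) on two PCoA axes with k = 3. It is not the report's implementation, which is not shown. The coordinates are made up, and the deterministic farthest-point initialisation is an assumption chosen to keep the example reproducible.

```python
import math


def kmeans(points, k, max_iter=100):
    """Cluster 2-D points with Lloyd's algorithm; returns (labels, centers)."""
    # Farthest-point initialisation (an assumption, not the report's method).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest center for every point.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


# Hypothetical (Axis 1, Axis 2) coordinates for nine samples.
coords = [
    (-0.30, 0.10), (-0.28, 0.12), (-0.32, 0.08),
    (0.25, 0.20), (0.27, 0.22), (0.24, 0.18),
    (0.02, -0.30), (0.00, -0.28), (0.03, -0.33),
]

labels, centers = kmeans(coords, k=3)
```

Clustering on only the first two axes ignores variation captured by later axes, so the grouping reflects the dominant gradients only.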

6 Heatmap

Figure 3. Heatmap showing the relative abundance of bacteria (rows) across the 108(???) samples (columns, sorted by Patient ID). Top color bars show the Patient ID, storage group??? (what colors??? Or should we remove this bar???), and gel treatment group (gel and no-gel represented by orange and green, respectively????). *** This figure needs to be replotted without the unreadable labels on the top.

7 Heatmap ordered by cluster

8 Most prevalent sequences

[1] 108