Baseline characteristics of all screened women. Continuous variables are shown as mean ± SD; categorical variables as count (percentage).

Characteristic	N = 449
Age (years)	40.2 ± 14.9
Age group
<30	139 (31%)
30-50	175 (39%)
>50	135 (30%)
Season
Autumn	96 (21%)
Spring	169 (38%)
Summer	80 (18%)
Winter	104 (23%)
Patient status (categorical)
Fertile	238 (53%)
Pregnancy	40 (8.9%)
Premenopause	50 (11%)
Menopause	121 (27%)
Vaginal pH
4	73 (16%)
5	266 (59%)
6	99 (22%)
7	10 (2.2%)
8	1 (0.2%)
Sexually active
No	33 (7.3%)
Yes	393 (88%)
Unknown	23 (5.1%)
Itching symptoms	26 (5.8%)
Burning symptoms	38 (8.5%)
Leukorrhea (abnormal discharge)	70 (16%)
Chlamydia trachomatis	2 (0.5%)
Ureaplasma spp.	108 (24%)
Mycoplasma hominis	10 (2.2%)
Trichomonas vaginalis
negative	449 (100%)
Candida spp.	54 (12%)
Escherichia coli	143 (32%)
Proteus spp.	147 (33%)
Pseudomonas spp.	109 (24%)
Gardnerella vaginalis	281 (63%)
Staphylococcus aureus	212 (47%)
Enterococcus faecalis	262 (58%)
Neisseria gonorrhoeae
negative	449 (100%)
Streptococcus agalactiae (GBS)	30 (6.7%)

Symptoms dumbbell plot

For each pathogen and each symptom (itching, burning, leukorrhea) we computed the percentage of symptomatic women among pathogen-positive and pathogen-negative tests. Differences in proportions were explored with 2×2 tests (chi-square, or Fisher's exact test when any expected count was <5), with Benjamini–Hochberg false discovery rate (BH–FDR) control across comparisons.
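
The test selection and multiplicity correction described above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the code used for the report; the function names are hypothetical, the chi-square test is assumed to be uncorrected (no continuity correction), and the Fisher p-value uses the common "sum of tables no more likely than the observed one" definition.

```python
from math import comb, erfc, sqrt

def two_by_two_p(a, b, c, d):
    """Two-sided p-value for the 2x2 table [[a, b], [c, d]].

    Pearson chi-square (1 df, no continuity correction) is used unless any
    expected count is below 5, in which case Fisher's exact test is used.
    """
    n = a + b + c + d
    r1, r2, c1, c2 = a + b, c + d, a + c, b + d
    expected = [r1 * c1 / n, r1 * c2 / n, r2 * c1 / n, r2 * c2 / n]
    if min(expected) < 5:
        # Fisher: add up hypergeometric probabilities of all tables that are
        # no more likely than the observed one (small tolerance for ties).
        def prob(x):
            return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)
        p_obs = prob(a)
        lo, hi = max(0, c1 - r2), min(r1, c1)
        total = sum(prob(x) for x in range(lo, hi + 1)
                    if prob(x) <= p_obs * (1 + 1e-9))
        return min(1.0, total)
    chi2 = sum((o - e) ** 2 / e for o, e in zip((a, b, c, d), expected))
    return erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 df

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical counts (symptomatic/asymptomatic among positives, then negatives).
tables = [(3, 1, 1, 3), (20, 30, 30, 20)]
adjusted = bh_adjust([two_by_two_p(*t) for t in tables])
```

A comparison would be flagged only if its adjusted p-value falls below the chosen FDR level (0.05 in the figures).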

Figure 1. Percent symptomatic among pathogen-positive (yellow) and pathogen-negative (grey) tests; connector length shows the absolute difference. No differences were significant after BH–FDR correction.

Across pathogens, symptom prevalence was broadly similar between positive and negative tests; absolute differences were generally small and none remained significant after BH–FDR. This supports the interpretation that attendance at the clinic was not symptom-driven and that symptoms alone do not distinguish pathogen positivity in this setting.

Univariable and multivariable models for relative risk of infection by collected factors

We modeled per-pathogen positivity using Poisson regression with a log link and robust (heteroskedasticity-consistent, HC) standard errors, reporting risk ratios (RRs) with 95% CIs. Primary covariates were age group (three levels: <30 as reference, 30-50, >50), reproductive status, sexual activity, and season (reference: autumn). Only pathogens with more than 10 positive tests were included in the analysis. For display, we summarized a common subset of predictors in a faceted forest plot (one panel per predictor). BH–FDR was applied within each predictor, across pathogens.
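
To make the RR and its robust interval concrete, the sketch below covers only the simplest (univariable, single binary predictor) case. There the Poisson log-link fit reproduces the two group proportions, and the HC0 sandwich variance of log(RR) reduces to the closed form (1 - p1)/(n1 p1) + (1 - p0)/(n0 p0). The function name and the counts are hypothetical, HC0 is an assumption (the text does not say which HC variant was used), and the adjusted models in the report would need a regression library rather than this shortcut.

```python
from math import exp, log, sqrt

def risk_ratio_robust(pos_exposed, n_exposed, pos_ref, n_ref, z=1.959964):
    """RR with a 95% CI for one binary predictor (unadjusted case only).

    Equivalent to a Poisson log-link model with a single binary covariate
    and HC0 robust standard errors; requires at least one positive per group.
    """
    p1 = pos_exposed / n_exposed  # risk in the exposed group
    p0 = pos_ref / n_ref          # risk in the reference group
    log_rr = log(p1 / p0)
    # (1 - p) / (n * p) per group; n * p is simply the number of positives.
    se = sqrt((1 - p1) / pos_exposed + (1 - p0) / pos_ref)
    return exp(log_rr), exp(log_rr - z * se), exp(log_rr + z * se)

# Hypothetical counts: 30/100 positive in one season vs 20/100 in the reference.
rr, ci_low, ci_high = risk_ratio_robust(30, 100, 20, 100)
```

An interval that includes 1.0 corresponds to a point whose whiskers cross the vertical reference line in the forest plot.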

Figure 2. Adjusted RRs (95% CIs) from Poisson models with robust SEs; vertical line = RR 1.0. Filled points denote BH–FDR-adjusted p < 0.05.

Adjusted associations were generally modest and heterogeneous by organism. Higher pH, sexual activity and reproductive status showed very limited, pathogen-specific associations. Some seasonal effects emerged, in particular for Staphylococcus aureus and Pseudomonas spp., which showed significantly lower positivity rates in spring and summer than in autumn.

Monthly Incidence

For each pathogen and month we plotted the proportion positive among tests performed (positives/tests), with Wilson 95% CIs.
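
For reference, the Wilson score interval for a monthly proportion can be computed as below. This is a minimal standard-library sketch; the function name and the example counts are hypothetical and not taken from the monthly data.

```python
from math import sqrt

def wilson_ci(positives, tests, z=1.959964):
    """Wilson score interval (95% by default) for positives/tests."""
    if tests == 0:
        return (float("nan"), float("nan"))  # no tests that month
    p = positives / tests
    denom = 1 + z ** 2 / tests
    centre = (p + z ** 2 / (2 * tests)) / denom
    half = z * sqrt(p * (1 - p) / tests + z ** 2 / (4 * tests ** 2)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))

# Hypothetical month with 5 positives out of 10 tests.
low, high = wilson_ci(5, 10)
```

Unlike the simple normal-approximation interval, the Wilson interval stays within [0, 1] and remains informative when a month has zero positives or few tests, which is why it suits months with small denominators.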

Figure 3. Monthly proportion positive (positives/tests) with Wilson 95% CIs; numbers under the axis show monthly tests; dashed line: generalized additive model with a cyclic month term.

Several organisms showed stable monthly proportions, while others displayed suggestive seasonality (e.g., XXX). Some fluctuations coincided with months having fewer tests and wider CIs.

UpSet Plot

We visualized co-detections among common pathogens using an UpSet plot, ranking intersections by size. We evaluated pairwise independence using 2×2 tests (chi-square or Fisher's exact test) comparing observed vs expected co-detections given marginal prevalences, with BH–FDR control. For selected pairs we further examined adjusted co-occurrence.

Figure 4. UpSet plot of the most common co-detection sets; bar labels show counts.

Pathogen 1	Pathogen 2	Observed	Expected	Adjusted p (BH–FDR)
Gardnerella vaginalis	Enterococcus faecalis	211	179.129	<0.001
Staphylococcus aureus	Enterococcus faecalis	159	135.144	<0.001
Gardnerella vaginalis	Staphylococcus aureus	158	144.944	0.031
Proteus spp.	Enterococcus faecalis	128	93.708	<0.001
Escherichia coli	Enterococcus faecalis	116	91.158	<0.001
Proteus spp.	Staphylococcus aureus	96	75.825	<0.001
Escherichia coli	Staphylococcus aureus	92	73.762	0.001
Pseudomonas spp.	Staphylococcus aureus	70	56.224	0.016
Pseudomonas spp.	Gardnerella vaginalis	62	74.523	0.018
Pseudomonas spp.	Enterococcus faecalis	57	69.484	0.023
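
Under independence, the expected number of co-detections for a pair is the product of the two marginal counts divided by the number of women analysed. The sketch below applies that formula to marginal counts from the baseline table. The function name is hypothetical, and the denominator of 411 is an inference: it reproduces the tabulated expected values, whereas 449 does not, but the report does not state which denominator was used or why.

```python
def expected_codetections(n_a, n_b, n_total):
    """Expected co-detections of A and B if the two were independent."""
    return n_a * n_b / n_total

# Marginal positive counts taken from the baseline table.
gardnerella, enterococcus, staphylococcus, proteus = 281, 262, 212, 147

# ASSUMPTION: inferred from the tabulated expected values, not stated in the text.
N_ASSUMED = 411

pairs = {
    "Gardnerella x Enterococcus": (gardnerella, enterococcus),
    "Staphylococcus x Enterococcus": (staphylococcus, enterococcus),
    "Proteus x Enterococcus": (proteus, enterococcus),
}
expected = {name: expected_codetections(a, b, N_ASSUMED)
            for name, (a, b) in pairs.items()}
```

Comparing each observed count with its expected value gives the direction of the departure from independence: an observed count above the expected one indicates more co-detection than chance alone would produce, and a count below it indicates less.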

Co-detections were frequent for combinations that included Gardnerella, Enterococcus, and Staphylococcus, with the largest intersections reaching 25 individuals. In the unadjusted pairwise analysis, several pairs were observed more often than expected by chance, in particular pairs among Gardnerella, Enterococcus, Staphylococcus, Proteus and E. coli. Pseudomonas was co-detected with Staphylococcus more often than expected, but with Gardnerella and Enterococcus less often than expected (62 vs 74.5 and 57 vs 69.5, respectively). All listed pairs had more than 50 co-detections.

Infection pairs analysis

To assess whether pathogens co-occur beyond case-mix, we modeled each pathogen A as the outcome in a Poisson log-link model with robust SEs and included pathogen B as a predictor with covariate adjustment. We fit models in both directions (A~B and B~A) and averaged the two log-RRs to obtain a single symmetric estimate per pair; BH–FDR control was applied across pairs.
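
Averaging the two log-RRs is the same as taking the geometric mean of the two directional RRs. A minimal sketch of that step follows; the function name and the input values are hypothetical, and how the standard error or p-value of the combined estimate was obtained is not described in the text, so it is not shown.

```python
from math import exp, log

def symmetrized_rr(rr_a_given_b, rr_b_given_a):
    """Combine the A~B and B~A risk ratios by averaging on the log scale."""
    return exp((log(rr_a_given_b) + log(rr_b_given_a)) / 2)

# Hypothetical directional estimates for one pathogen pair.
rr_pair = symmetrized_rr(2.0, 8.0)
```

Working on the log scale keeps the result symmetric around RR = 1: directional estimates of 2.0 and 0.5 average to exactly 1.0, which an arithmetic mean of the RRs would not give.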

Figure 5. Adjusted co-occurrence heat map. Cells show the RR for the symmetrized A↔B association; * denotes BH–FDR-adjusted p < 0.05.

After adjustment, several bacterial pairs showed positive co-occurrence, in particular those involving Enterococcus faecalis and E. coli; others showed moderate associations (e.g., Gardnerella with Proteus or Staphylococcus). Most remaining pairs had RRs close to 1. Findings were consistent with the UpSet intersections and robust to covariate adjustment.

IST Analysis

Lofaro D

2025-03-13
