1 Background

Glyphosate is one of the most widely used herbicides globally and has well-documented toxic effects on non-target soil microorganisms, including nitrogen-fixing Rhizobium bacteria. While the baseline toxicity of glyphosate is treated as static in this study, environmental factors such as soil pH and salinity are known to alter the bioavailability and chemical speciation of glyphosate, potentially modifying its toxicity to microbial communities.

This study investigates the interactive effects of pH and salinity on the toxicity of glyphosate to Rhizobium bacteria, as measured by Colony Forming Unit (CFU) counts. Glyphosate concentration was held constant to isolate the modulating roles of pH (which influences glyphosate’s ionic form and soil binding affinity) and salinity (which affects osmotic stress and ionic competition). Understanding these interactions is critical for assessing the true ecological risk of glyphosate in diverse soil environments.


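library(dplyr)       # %>%, group_by(), summarise(), mutate(), bind_rows()
library(ggplot2)     # all figures
library(kableExtra)  # kable_styling()

# Load the cleaned CFU count data
Data <- readr::read_rds("Cleaned_rhizobium_data")

knitr::kable(Data,
             caption = "CFU count results",
             booktabs = TRUE) %>%
  kable_styling(latex_options = c("striped", "hold_position"))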
CFU count results
pH Salinity_(ms/cm) Rep CFU_counts CFU_counts_log pH_f Salinity_f
5 1.2 replicate_1 0 0.000000 5 1.2
5 6.7 replicate_1 0 0.000000 5 6.7
5 13.1 replicate_1 0 0.000000 5 13.1
7 1.2 replicate_1 0 0.000000 7 1.2
7 6.7 replicate_1 0 0.000000 7 6.7
7 13.1 replicate_1 0 0.000000 7 13.1
9 1.2 replicate_1 300 5.707110 9 1.2
9 6.7 replicate_1 345 5.846439 9 6.7
9 13.1 replicate_1 360 5.888878 9 13.1
7 0.0 replicate_1 522 6.259582 7 0
5 1.2 replicate_2 0 0.000000 5 1.2
5 6.7 replicate_2 0 0.000000 5 6.7
5 13.1 replicate_2 0 0.000000 5 13.1
7 1.2 replicate_2 0 0.000000 7 1.2
7 6.7 replicate_2 60 4.110874 7 6.7
7 13.1 replicate_2 75 4.330733 7 13.1
9 1.2 replicate_2 255 5.545177 9 1.2
9 6.7 replicate_2 344 5.843544 9 6.7
9 13.1 replicate_2 407 6.011267 9 13.1
7 0.0 replicate_2 540 6.293419 7 0
5 1.2 replicate_3 0 0.000000 5 1.2
5 6.7 replicate_3 0 0.000000 5 6.7
5 13.1 replicate_3 0 0.000000 5 13.1
7 1.2 replicate_3 0 0.000000 7 1.2
7 6.7 replicate_3 0 0.000000 7 6.7
7 13.1 replicate_3 0 0.000000 7 13.1
9 1.2 replicate_3 195 5.278115 9 1.2
9 6.7 replicate_3 300 5.707110 9 6.7
9 13.1 replicate_3 375 5.929589 9 13.1
7 0.0 replicate_3 540 6.293419 7 0
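The CFU_counts_log column is not described in the text, but its values are consistent with a natural log of (count + 1), which keeps zero counts at zero. The short check below is only a sketch under that assumption; the vector name cfu_example is illustrative and the four counts are taken from the table above.

```r
# A few raw counts transcribed from the table above
cfu_example <- c(0, 300, 345, 522)

# log1p(x) computes log(1 + x), so a count of 0 maps to 0
log_example <- round(log1p(cfu_example), 6)
print(log_example)
# 0.000000 5.707110 5.846439 6.259582
```

These match the tabulated CFU_counts_log values for the same rows, which suggests (but does not prove) that log1p() or an equivalent transform was used during data cleaning.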
Summary <- Data %>%
  group_by(pH_f, Salinity_f) %>%
  summarise(mean_cfu = mean(CFU_counts),
            sd_cfu = sd(CFU_counts), 
            .groups = "drop")

knitr::kable(Summary, 
         caption = "Summary CFU count results",
        booktabs = TRUE) %>%
  kable_styling(latex_options = c("striped", "hold_position"))
Summary CFU count results
pH_f Salinity_f mean_cfu sd_cfu
5 1.2 0.0000 0.00000
5 6.7 0.0000 0.00000
5 13.1 0.0000 0.00000
7 0 534.0000 10.39230
7 1.2 0.0000 0.00000
7 6.7 20.0000 34.64102
7 13.1 25.0000 43.30127
9 1.2 250.0000 52.67827
9 6.7 329.6667 25.69695
9 13.1 380.6667 24.00694

2 Exploratory Data Visualisation

Before conducting inferential analyses, the raw data were visualised to examine distributional patterns and potential interactions between pH and salinity on CFU counts.

2.1 Boxplot of CFU Counts by pH and Salinity

The boxplot below provides an overview of the distribution and spread of CFU counts across pH levels, grouped by salinity treatment. Each box represents the interquartile range (IQR) of bacterial counts within a treatment combination, with the central line indicating the median. Whiskers extend to the most extreme observations lying within 1.5× the IQR of the box edges, and individual points beyond this range are plotted as outliers.

This plot is particularly useful for identifying differences in central tendency and variability between treatment groups, as well as for detecting potential outliers in the data.
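To make the whisker rule concrete, the sketch below applies it by hand to the nine pH 9 counts from the results table. Pooling across salinity is done here purely for illustration (Figure 1 draws one box per pH–salinity combination, each with only three observations), and the variable names are illustrative.

```r
# All nine CFU counts recorded at pH 9 (pooled across salinity for illustration)
ph9 <- c(300, 345, 360, 255, 344, 407, 195, 300, 375)

q <- quantile(ph9, c(0.25, 0.75), names = FALSE)  # box edges (Q1, Q3)
iqr <- q[2] - q[1]

# Fences sit 1.5 x IQR beyond the box edges
lower_fence <- q[1] - 1.5 * iqr
upper_fence <- q[2] + 1.5 * iqr

# Whiskers stop at the most extreme observations inside the fences
whiskers <- c(min(ph9[ph9 >= lower_fence]), max(ph9[ph9 <= upper_fence]))
outliers <- ph9[ph9 < lower_fence | ph9 > upper_fence]

print(whiskers)  # 255 407
print(outliers)  # 195
```

Here the box runs from 300 to 360, so the fences fall at 210 and 450; the count of 195 lies outside them and would be drawn as an individual point.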

ggplot(Data, aes(x = pH_f, y = CFU_counts, fill = Salinity_f)) +
  geom_boxplot() +
  labs(
    title = "CFU Counts across pH and Salinity",
    x = "pH Levels",
    y = "CFU Counts",
    fill = "Salinity"
  )
Figure 1. Distribution of CFU counts across pH levels for each salinity treatment. Boxes represent the IQR; horizontal lines indicate medians.



2.2 Jitter Plot of CFU Counts by pH and Salinity

To complement the boxplot, a jitter plot displays each individual observation in the dataset. Points are horizontally jittered to reduce overplotting, allowing the full sample size and raw data spread within each treatment combination to be seen clearly.

This plot is valuable for assessing whether apparent differences in medians (from the boxplot) are consistent across individual replicates, and for identifying any clusters or gaps in the data that aggregated summaries might obscure.

ggplot(Data, aes(x = pH_f, y = CFU_counts, color = Salinity_f)) +
  geom_jitter(width = 0.2, height = 0, size = 3, alpha = 0.7) +
  labs(
    title = "Scatter of CFU Counts by pH and Salinity",
    x = "pH Levels",
    y = "CFU Counts",
    color = "Salinity"
  )
Figure 2. Individual CFU count observations by pH level and salinity treatment. Points are jittered horizontally to improve visibility.



2.3 Mean CFU Counts with Error Bars across pH and Salinity

This plot summarises the data by displaying group means ± one standard deviation for each pH–salinity combination, with lines connecting the means within each salinity level. It gives a cleaner picture of the overall patterns and direction of the relationships than the raw data plots above. Because 0 mS/cm was tested only at pH 7, that treatment appears as a single point.

The error bars convey the degree of variability within each treatment group. Diverging or crossing lines between salinity levels would suggest a potential interaction between pH and salinity on bacterial CFU counts, a key hypothesis tested in the inferential analysis below.

ggplot(Summary, aes(x = pH_f, y = mean_cfu, group = Salinity_f, color = Salinity_f)) +
  geom_line(linewidth = 1) +  # connect group means; three points per series are too few for geom_smooth()
  geom_point(size = 3) +
  geom_errorbar(aes(ymin = mean_cfu - sd_cfu, ymax = mean_cfu + sd_cfu), width = 0.2) +
  labs(
    title = "Mean CFU Counts across pH and Salinity",
    x = "pH Levels",
    y = "Mean CFU Counts",
    color = "Salinity"
  )
Figure 3. Mean CFU counts by pH level and salinity treatment. Error bars represent ± 1 standard deviation. Lines connect group means to illustrate trends.



3 Inferential Statistics

Prior to conducting inferential tests, the normality of the CFU count data was assessed visually using a Q-Q plot. The points deviated clearly from the theoretical reference line, indicating that the normality assumption required for parametric tests such as ANOVA was violated. Consequently, non-parametric alternatives were used throughout this analysis. The Kruskal-Wallis test was used to assess the main effects of pH and salinity on CFU counts, followed by Dunn’s post-hoc test with Bonferroni correction for pairwise comparisons. To test the interaction between pH and salinity, a rank-based two-way analysis in the style of the Scheirer-Ray-Hare test was applied by fitting a linear model to the ranked CFU counts.

3.1 Normality Check

Data %>%
  ggplot(aes(sample = CFU_counts)) +
  stat_qq() +
  stat_qq_line(color = "red") +
  labs(
    title = "Q-Q Plot of CFU Counts",
    x = "Theoretical Quantiles",
    y = "Sample Quantiles"
  )
Figure 4. Q-Q plot of raw CFU counts.


3.2 Kruskal-Wallis Test

kruskal_ph <- Data %>%
  rstatix::kruskal_test(CFU_counts ~ pH_f)

kruskal_sal <- Data %>%
  rstatix::kruskal_test(CFU_counts ~ Salinity_f)

kruskal_combined <- bind_rows(kruskal_ph, kruskal_sal)

knitr::kable(kruskal_combined,
             caption = "Table 1. Kruskal-Wallis Test Results for pH and Salinity",
             booktabs = TRUE,
             digits = 4) %>%
  kable_styling(latex_options = c("striped", "hold_position"))
Table 1. Kruskal-Wallis Test Results for pH and Salinity
.y. n statistic df p method
CFU_counts 30 14.4344 2 0.0007 Kruskal-Wallis
CFU_counts 30 9.8669 3 0.0197 Kruskal-Wallis

Both pH (H = 14.4, df = 2, p = 0.0007) and salinity (H = 9.87, df = 3, p = 0.020) had statistically significant effects on Rhizobium CFU counts under constant glyphosate exposure. Each Kruskal-Wallis test is one-way and ignores the other factor, so these results show that CFU counts differ across pH levels and across salinity levels; they do not by themselves establish whether the two factors act independently.
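As a cross-check that needs only base R, the sketch below rebuilds the 30 observations from the results table and runs stats::kruskal.test(), the base-R Kruskal-Wallis function. The data frame name cfu and its column names are illustrative, not the names used in the original data file.

```r
# Counts transcribed from the results table; within each replicate the row
# order is pH 5, 5, 5, 7, 7, 7, 9, 9, 9 at salinity 1.2, 6.7, 13.1 (repeated),
# followed by the pH 7 / salinity 0 row.
cfu <- data.frame(
  pH = factor(rep(c(rep(c(5, 7, 9), each = 3), 7), times = 3)),
  salinity = factor(rep(c(rep(c(1.2, 6.7, 13.1), times = 3), 0), times = 3)),
  count = c(0, 0, 0, 0, 0, 0, 300, 345, 360, 522,    # replicate 1
            0, 0, 0, 0, 60, 75, 255, 344, 407, 540,  # replicate 2
            0, 0, 0, 0, 0, 0, 195, 300, 375, 540)    # replicate 3
)

# One-way Kruskal-Wallis tests, one factor at a time
kw_ph <- kruskal.test(count ~ pH, data = cfu)
kw_sal <- kruskal.test(count ~ salinity, data = cfu)

print(kw_ph)
print(kw_sal)
```

If the transcription is right, the two statistics should agree with Table 1 (about 14.43 on 2 df and 9.87 on 3 df).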

3.3 Dunn’s Post-Hoc Test

dunn_ph <- Data %>%
  rstatix::dunn_test(CFU_counts ~ pH_f, p.adjust.method = "bonferroni")

dunn_sal <- Data %>%
  rstatix::dunn_test(CFU_counts ~ Salinity_f, p.adjust.method = "bonferroni")

knitr::kable(dunn_ph,
             caption = "Table 2. Dunn Post-Hoc Comparisons for pH",
             booktabs = TRUE,
             digits = 4) %>%
  kable_styling(latex_options = c("striped", "hold_position"))
Table 2. Dunn Post-Hoc Comparisons for pH
.y. group1 group2 n1 n2 statistic p p.adj p.adj.signif
CFU_counts 5 7 9 12 1.8530 0.0639 0.1917 ns
CFU_counts 5 9 9 9 3.7936 0.0001 0.0004 ***
CFU_counts 7 9 12 9 2.2026 0.0276 0.0829 ns
knitr::kable(dunn_sal,
             caption = "Table 3. Dunn Post-Hoc Comparisons for Salinity",
             booktabs = TRUE,
             digits = 4) %>%
  kable_styling(latex_options = c("striped", "hold_position"))
Table 3. Dunn Post-Hoc Comparisons for Salinity
.y. group1 group2 n1 n2 statistic p p.adj p.adj.signif
CFU_counts 0 1.2 3 9 -3.0731 0.0021 0.0127 *
CFU_counts 0 6.7 3 9 -2.7339 0.0063 0.0376 *
CFU_counts 0 13.1 3 9 -2.5181 0.0118 0.0708 ns
CFU_counts 1.2 6.7 9 9 0.4797 0.6315 1.0000 ns
CFU_counts 1.2 13.1 9 9 0.7849 0.4325 1.0000 ns
CFU_counts 6.7 13.1 9 9 0.3052 0.7602 1.0000 ns

For pH, Dunn’s test revealed that pH 9 was significantly different from pH 5 (p.adj = 0.0004, ***), while pH 5 vs pH 7 (p.adj = 0.192) and pH 7 vs pH 9 (p.adj = 0.083) did not differ significantly. For salinity, significant differences were observed between 0 mS/cm and 1.2 mS/cm (p.adj = 0.013, *) and between 0 mS/cm and 6.7 mS/cm (p.adj = 0.038, *). The comparison between 0 mS/cm and 13.1 mS/cm fell just short of significance (p.adj = 0.071), and all comparisons among non-zero salinity levels were non-significant (p.adj = 1.000). Note that the 0 mS/cm group contains only three observations, all at pH 7.
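The Bonferroni correction used here simply multiplies each unadjusted p-value by the number of comparisons in the family and caps the result at 1. The sketch below applies base R's p.adjust() to the rounded unadjusted p-values from Tables 2 and 3; the vector names are illustrative.

```r
# Unadjusted p-values as printed (rounded to 4 decimals) in Tables 2 and 3
p_ph <- c("5 vs 7" = 0.0639, "5 vs 9" = 0.0001, "7 vs 9" = 0.0276)
p_sal <- c("0 vs 1.2" = 0.0021, "0 vs 6.7" = 0.0063, "0 vs 13.1" = 0.0118,
           "1.2 vs 6.7" = 0.6315, "1.2 vs 13.1" = 0.4325, "6.7 vs 13.1" = 0.7602)

# Bonferroni: p * (number of comparisons), capped at 1
adj_ph <- p.adjust(p_ph, method = "bonferroni")    # 3 comparisons
adj_sal <- p.adjust(p_sal, method = "bonferroni")  # 6 comparisons

# The same result written out by hand
adj_sal_manual <- pmin(1, p_sal * length(p_sal))

print(round(adj_ph, 4))
print(round(adj_sal, 4))
```

Small differences in the last decimal place relative to the tables (for example 0.0126 here against 0.0127 in Table 3) are expected, because the inputs above are the rounded p-values rather than the full-precision ones.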

3.4 Scheirer-Ray-Hare Test

Data <- Data %>%
  mutate(ranked_CFU = rank(CFU_counts))

srh_model <- lm(ranked_CFU ~ pH_f * Salinity_f, data = Data)
srh_results <- as.data.frame(summary(aov(srh_model))[[1]])
srh_results <- round(srh_results, 4)
srh_results[is.na(srh_results)] <- ""

knitr::kable(srh_results,
             caption = "Table 4. Scheirer-Ray-Hare Test Results",
             booktabs = TRUE,
             col.names = c("Df", "Sum Sq", "Mean Sq", "F value", "Pr(>F)")) %>%
  kable_styling(latex_options = c("striped", "hold_position"))
Table 4. Scheirer-Ray-Hare Test Results
Df Sum Sq Mean Sq F value Pr(>F)
pH_f 2 948.9375 474.4687 80.305 0
Salinity_f 3 811.2292 270.4097 45.7675 0
pH_f:Salinity_f 4 28.1667 7.0417 1.1918 0.3448
Residuals 20 118.1667 5.9083

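The rank-based two-way analysis indicated significant main effects of both pH (F = 80.3, p < 0.001) and salinity (F = 45.8, p < 0.001). The interaction term (pH × salinity) was not statistically significant (F = 1.19, p = 0.345), so there is no evidence that the effect of pH on Rhizobium CFU counts under glyphosate stress depends on salinity. Two caveats apply. First, the values in Table 4 are F-tests from an ANOVA on ranks; the Scheirer-Ray-Hare statistic proper is H = SS / MS_total for each term, referred to a chi-squared distribution. Second, the design is unbalanced because 0 mS/cm was tested only at pH 7, so the sequential sums of squares depend on the order in which the terms enter the model.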
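For comparison, the Scheirer-Ray-Hare H statistics can be derived from the sums of squares already reported in Table 4. The sketch below hard-codes those values, so it inherits the same sequential (order-dependent) sums of squares; the object names are illustrative and this is not the code used for the original analysis.

```r
# Degrees of freedom and sums of squares copied from Table 4
srh <- data.frame(
  term = c("pH_f", "Salinity_f", "pH_f:Salinity_f", "Residuals"),
  df = c(2, 3, 4, 20),
  sum_sq = c(948.9375, 811.2292, 28.1667, 118.1667)
)

# MS_total = total sum of squares of the ranks / total degrees of freedom
ms_total <- sum(srh$sum_sq) / sum(srh$df)

# H = SS_term / MS_total, compared against chi-squared with the term's df
effects <- srh[srh$term != "Residuals", ]
effects$H <- effects$sum_sq / ms_total
effects$p <- pchisq(effects$H, df = effects$df, lower.tail = FALSE)

print(effects)
```

Computed this way, H for pH is about 14.43, the same as the Kruskal-Wallis statistic in Table 1, and the interaction remains clearly non-significant, so the qualitative conclusion about the interaction is unchanged.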


Analysis conducted in R. All figures and tables generated using ggplot2, knitr, and kableExtra.