Formative Coursework BIO3097

Author

Oliwia Wieczorek

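Libraries

library(vegan)      # diversity() for the Shannon index
library(ggplot2)    # scatter plots with fitted lines
library(patchwork)  # combining plots with | and /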

Importing the data

asv_table <- read.csv("C:/Users/oliwi/OneDrive/Desktop/uni/SAR11_ASV_table.csv")
env_table <- read.csv("C:/Users/oliwi/OneDrive/Desktop/uni/SAR11_ENV_table.csv", header = TRUE, row.names = 1, check.names = FALSE)

Compute Alpha-Diversity (Shannon Index)

alpha_diversity <- diversity(asv_table, index = "shannon")
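As a rough illustration of what the Shannon index measures, the sketch below computes it by hand in base R for a small made-up count table. It assumes samples are in rows and ASVs in columns, and uses the natural logarithm, which is the default in vegan's diversity(). The function name shannon_index and the toy counts are hypothetical and are not part of the coursework data.

```r
# Shannon index H' = -sum(p_i * log(p_i)) for one sample,
# where p_i is the proportion of reads assigned to ASV i.
shannon_index <- function(counts) {
  counts <- counts[counts > 0]  # zero counts contribute nothing to the sum
  p <- counts / sum(counts)     # relative abundance of each ASV
  -sum(p * log(p))              # natural log
}

# Hypothetical toy table: rows are samples, columns are ASVs
toy_counts <- rbind(
  even   = c(25, 25, 25, 25),
  uneven = c(97, 1, 1, 1)
)

# Apply the function to every row (sample)
toy_shannon <- apply(toy_counts, 1, shannon_index)
print(round(toy_shannon, 3))
```

The perfectly even sample reaches the maximum for four ASVs, log(4), while the sample dominated by one ASV scores much lower.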

I lost the script I used to add the Shannon values to the data set and save it. After saving, I edited the file in Excel so that only the rows needed remained.
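The lost step could be redone in R along the following lines. This is only a sketch and not the original script: the demo objects, their values, and the output file name are hypothetical stand-ins, and because the rows removed in Excel are not recorded, complete.cases() is used purely as a placeholder filter.

```r
# Hypothetical stand-ins for env_table and alpha_diversity
env_demo <- data.frame(
  Temp      = c(9.8, 12.1, 15.4),
  Phosphate = c(0.45, 0.30, 0.08),
  row.names = c("S1", "S2", "S3")
)
shannon_demo <- c(S3 = 1.2, S1 = 2.4, S2 = 1.9)  # deliberately in a different order

# Match Shannon values to samples by name rather than by position
env_demo$Shannon <- unname(shannon_demo[rownames(env_demo)])

# Placeholder filter (assumption): keep rows with no missing values
diversity_demo <- env_demo[complete.cases(env_demo), ]

# Save the combined table; a temporary folder is used here for illustration
out_file <- file.path(tempdir(), "diversity_demo.csv")
write.csv(diversity_demo, out_file, row.names = TRUE)
```

Doing the filtering in a script rather than in Excel keeps a record of which rows were dropped and why.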

importing the new data set

diveristy_datar <- read.csv("C:/Users/oliwi/OneDrive/Desktop/uni/diveristy_datar.csv")

linear model of each variable

model_temp <- lm(Shannon ~ Temp, data=diveristy_datar)
summary(model_temp)

Call:
lm(formula = Shannon ~ Temp, data = diveristy_datar)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.4846 -0.4050  0.1527  0.3912  0.8611 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  3.73786    0.24562  15.218  < 2e-16 ***
Temp        -0.14606    0.01886  -7.744 4.14e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.5499 on 115 degrees of freedom
Multiple R-squared:  0.3427,    Adjusted R-squared:  0.337 
F-statistic: 59.96 on 1 and 115 DF,  p-value: 4.142e-12

model_oxygen <- lm(Shannon ~ Oxygen, data=diveristy_datar)
summary(model_oxygen)

Call:
lm(formula = Shannon ~ Oxygen, data = diveristy_datar)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.43135 -0.56552  0.05754  0.57882  1.19571 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)   
(Intercept)  1.9778782  0.7058316   2.802  0.00596 **
Oxygen      -0.0003923  0.0027370  -0.143  0.88628   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.6783 on 115 degrees of freedom
Multiple R-squared:  0.0001786, Adjusted R-squared:  -0.008515 
F-statistic: 0.02054 on 1 and 115 DF,  p-value: 0.8863

model_Salinity <- lm(Shannon ~ Salinity, data=diveristy_datar)
summary(model_Salinity)
# The salinity summary was not captured in this render; its R-squared (0.0388) is reported in the comparison below.

model_phosphate <- lm(Shannon ~ Phosphate, data=diveristy_datar)
summary(model_phosphate)

Call:
lm(formula = Shannon ~ Phosphate, data = diveristy_datar)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.87007 -0.23313  0.02436  0.26350  0.89763 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.08916    0.05967   18.25   <2e-16 ***
Phosphate    2.94108    0.18146   16.21   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.3743 on 115 degrees of freedom
Multiple R-squared:  0.6955,    Adjusted R-squared:  0.6929 
F-statistic: 262.7 on 1 and 115 DF,  p-value: < 2.2e-16

To determine which factor explains diversity best, compare R-squared values:

r_squared <- c(
  summary(model_temp)$r.squared,
  summary(model_oxygen)$r.squared,
  summary(model_Salinity)$r.squared,
  summary(model_phosphate)$r.squared
)

predictors <- c("Temp", "Oxygen", "Salinity", "Phosphate")
print(r_squared)
[1] 0.3427151379 0.0001786052 0.0387783191 0.6955178655
best_predictor <- predictors[which.max(r_squared)]
cat("Best predictor of Shannon Diversity is:", best_predictor)  # Phosphate
Best predictor of Shannon Diversity is: Phosphate
# Phosphate R-squared: 0.6955178655
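The same comparison can be written more compactly by fitting the models in a loop and keeping the predictor names attached to the R-squared values, which avoids having to keep two vectors in the same order. The sketch below runs on simulated data, so the data frame demo, its values, and the names predictors_demo, models, r2 and best_demo are all hypothetical; only the pattern is meant to carry over.

```r
# Simulated stand-in data (assumption): Shannon depends on Phosphate only
set.seed(1)
demo <- data.frame(
  Temp      = runif(50, 8, 18),
  Phosphate = runif(50, 0, 0.6)
)
demo$Shannon <- 1 + 3 * demo$Phosphate + rnorm(50, sd = 0.3)

# Fit one simple linear model per predictor
predictors_demo <- c("Temp", "Phosphate")
models <- lapply(predictors_demo, function(v) {
  lm(reformulate(v, response = "Shannon"), data = demo)
})
names(models) <- predictors_demo

# Named vector of R-squared values, sorted from best to worst
r2 <- sapply(models, function(m) summary(m)$r.squared)
print(sort(r2, decreasing = TRUE))

best_demo <- names(which.max(r2))
```

Because each model has a single predictor, comparing plain R-squared values is reasonable here; models with different numbers of predictors would call for adjusted R-squared instead.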

visualising

# Each panel: points plus a fitted linear model with its confidence band.
# linewidth replaces the deprecated size argument (ggplot2 3.4.0 or later).
p1 <- ggplot(diveristy_datar, aes(x = Temp, y = Shannon)) +
  geom_point() +
  geom_smooth(method = "lm", color = "red") +
  labs(title = "Shannon vs Temperature", x = "Temperature (°C)", y = "Shannon Index") +
  theme_minimal() +  # keeps the grid
  theme(axis.line = element_line(linewidth = 1.2, color = "black"))  # bold axis lines

p2 <- ggplot(diveristy_datar, aes(x = Oxygen, y = Shannon)) +
  geom_point() +
  geom_smooth(method = "lm", color = "blue") +
  labs(title = "Shannon vs Oxygen", x = "Oxygen (mg/L)", y = "Shannon Index") +
  theme_minimal() +
  theme(axis.line = element_line(linewidth = 1.2, color = "black"))

p3 <- ggplot(diveristy_datar, aes(x = Salinity, y = Shannon)) +
  geom_point() +
  geom_smooth(method = "lm", color = "green") +
  labs(title = "Shannon vs Salinity", x = "Salinity (PSU)", y = "Shannon Index") +
  theme_minimal() +
  theme(axis.line = element_line(linewidth = 1.2, color = "black"))

p4 <- ggplot(diveristy_datar, aes(x = Phosphate, y = Shannon)) +
  geom_point() +
  geom_smooth(method = "lm", color = "purple") +
  labs(title = "Shannon vs Phosphate", x = "Phosphate (mg/L)", y = "Shannon Index") +
  theme_minimal() +
  theme(axis.line = element_line(linewidth = 1.2, color = "black"))

# patchwork layout: two plots per row, two rows
(p1 | p2) / (p3 | p4)

300-word explanation

SAR11 is one of the most common bacterial groups in the ocean and plays an important role in breaking down organic matter and cycling nutrients. In the Western English Channel (WEC), factors such as temperature, oxygen, salinity, and nutrients can influence SAR11 diversity. Our analysis found that phosphate was the strongest single predictor of SAR11 Shannon diversity (R² = 0.6955), ahead of temperature (0.3427), salinity (0.0388), and oxygen (0.0002). This suggests that phosphate concentration is strongly associated with SAR11 diversity.

Phosphate is an essential macronutrient for marine bacteria because it’s needed for making DNA, RNA, and ATP. In nutrient-poor environments like the open ocean, phosphate is often limited, which can restrict microbial growth (Karl, 2014). SAR11 has adapted by developing high-affinity phosphate uptake systems, giving it an advantage over other bacteria in low-phosphate conditions (LeBrun et al., 2018). Because of this, SAR11 diversity might be expected to increase when phosphate levels are low, since it faces less competition from bacteria that need more phosphate, and to decrease when phosphate is more available and other bacterial groups grow more. The fitted slope, however, is positive (2.94), meaning that in this dataset Shannon diversity was higher where phosphate was higher, so this competition-based explanation does not account for the observed pattern on its own.

Even though our regression analysis shows a strong correlation, it’s important to remember that correlation does not mean causation. Other factors, like dissolved organic carbon, predation, or viral infections, could also affect SAR11 diversity (Morris et al., 2012). Additionally, our dataset is relatively small (117 samples), and seasonal changes in the WEC could add more complexity. Future studies using metagenomic sequencing and long-term monitoring could help us better understand how phosphate influences SAR11 populations.

Generative AI statement

References

Karl, D.M., 2014. Microbially mediated transformations of phosphorus in the sea: New views of an old cycle. Annual Review of Marine Science, 6(1), pp.279-337.

LeBrun, E.S., King, R.S., Back, J.A. and Kang, S., 2018. Microbial community structure and function decoupling across a phosphorus gradient in streams. Microbial Ecology, 75, pp.64-73.

Morris, J.J., Lenski, R.E. and Zinser, E.R., 2012. The black queen hypothesis: Evolution of dependencies through adaptive gene loss. mBio, 3, e00036-12. https://doi.org/10.1128/mBio.00036-12

Morris, R.M., Rappé, M.S., Connon, S.A., Vergin, K.L., Siebold, W.A., Carlson, C.A. and Giovannoni, S.J., 2002. SAR11 clade dominates ocean surface bacterioplankton communities. Nature, 420(6917), pp.806-810.