AIM: Which predictors best explain the Ebed ratios between brown, red and green algae? We investigate this question using a GAM approach. We focus here on the ratios \(\frac{Ebed_{brown}}{Ebed_{red}}\), \(\frac{Ebed_{green}}{Ebed_{red}}\) and \(\frac{Ebed_{green}}{Ebed_{brown}}\) as response variables influenced by the following main predictors: \(ag(440)\), \(chl\), \(tsm\), \(nap_{astar}\) and \(z\).
Document Note: The maps, plots and app within this document are interactive, so feel free to explore them, for example by zooming in and out on the maps and on the plots. Clicking on a legend entry selects which time series are displayed.
Spectral F0, a, bb, Kd, Ebed and action spectra of Green, Brown and Red algae.
Observations:
* The depth distribution (z) could better represent shallower depths, perhaps by sampling it from a uniform law.
* Brown always wins against Green; why is the result so clear-cut?
* Red almost always wins over Brown and Green, perhaps because there are few shallow depths with low chl, tsm and ag440.
* Why are some Ebed values so high?
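library(tidyverse)  # read_csv(), dplyr, tidyr, ggplot2
library(mgcv); library(broom); library(knitr)  # gam(), glance(), kable()
library(visibly)    # assumption: source of plot_gam() and plot_gam_3d() used further down
vars <- read_csv('Synth_vars_RGB.csv')
# Long format: one row per (variable, value) pair, keeping the B/R ratio as the response
vars_gam_BR <- vars %>%
  dplyr::select(-c(Ebed_par, EbedAbs_green, EbedAbs_brown, EbedAbs_red)) %>%
  pivot_longer(-Ebed_BR_ratio, values_to = "values", names_to = "variables")
# One panel per variable, each with a single-predictor GAM smooth
pgam_BR <- ggplot(vars_gam_BR, aes(x = values, y = Ebed_BR_ratio)) + geom_point(color = '#ff5500', alpha = .75) +
  geom_smooth(se = FALSE, lwd = .5, color = '#00aaff', method = 'gam') + facet_wrap(~variables, scales = 'free_x') +
  labs(x = '') + theme_light()
pgam_BR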
Figure 1 - Ratio between Ebed_brown and Ebed_red as a function of its predictors. The blue lines are fitted with a single-predictor GAM.
As expected, the Ebed B/R ratio decreases with increasing ag440, tsm and chl. In other words, red algae do better than brown algae at higher ag440, tsm and chl. The ratio also decreases with depth (zz), which is expected too, as red algae are known to do better at greater depths. There is no relationship between Ebed B/R and n, which is expected and indicates no sampling bias. The relationships between the Ebed B/R ratio and the two other ratios are interesting. When Red does better than Brown (low B/R ratio), Red also does better than Green (Ebed_GR_ratio); this is expected given the similar photosynthetic requirements and abilities of Brown and Green. Finally, when Brown does better than Green, Red does better than Brown (Ebed_GB_ratio). In the next sections, we focus on the interactive effects of chl, tsm, ag440 and depth on the Ebed RGB ratios.
# Fit one single-predictor GAM per variable and tabulate its AIC
vars_gam_BR %>%
  group_by(variables) %>%
  do(glance(gam(Ebed_BR_ratio ~ s(values, bs = 'cr'), data = .))) %>%
  select(variables, AIC) %>%
  kable(caption = 'Table 1 - AIC for GAMs using single predictor.')
| variables | AIC |
|---|---|
| ag440 | -2298.604 |
| chl | -2183.246 |
| Ebed_GB_ratio | -2497.060 |
| Ebed_GR_ratio | -4489.205 |
| n | -2122.814 |
| tsm | -2246.249 |
| zz | -2403.018 |
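Table 1 also lists `n` and the two other ratios, because every column except `Ebed_BR_ratio` was pivoted into `variables`. The sketch below shows one way to run the same single-predictor comparison on a chosen set of predictors only. It uses simulated toy data: the `toy` data frame, its value ranges and the `predictors` vector are assumptions for illustration, not the contents of `Synth_vars_RGB.csv`.

```r
library(mgcv)

set.seed(1)
# Toy data (hypothetical): the ratio depends on depth only, not on chl
toy <- data.frame(zz = runif(200, 0, 20), chl = runif(200, 0, 5))
toy$Ebed_BR_ratio <- 0.8 - 0.01 * toy$zz + rnorm(200, sd = 0.05)

# Predictors to compare (hypothetical selection)
predictors <- c("zz", "chl")

# Fit one cubic-regression-spline GAM per predictor and collect its AIC
aic_by_var <- sapply(predictors, function(v) {
  f <- reformulate(sprintf('s(%s, bs = "cr")', v), response = "Ebed_BR_ratio")
  AIC(gam(f, data = toy))
})

# Lower AIC indicates the better-supported single-predictor model
sort(aic_by_var)
```

On the real data, replacing `toy` with `vars` and extending `predictors` to `c("zz", "chl", "tsm", "ag440")` should give the AIC values of the corresponding rows of Table 1.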
Observations:
mod_gam_zz <- gam(Ebed_BR_ratio ~ s(zz, bs="cr"), data=vars) #cr: cubic regression splines
summary(mod_gam_zz)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## Ebed_BR_ratio ~ s(zz, bs = "cr")
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.746989 0.002291 326 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(zz) 5.788 6.8 48.66 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.248 Deviance explained = 25.2%
## GCV = 0.0052852 Scale est. = 0.0052493 n = 1000
Deviance explained: 25.2%
mod_gam_tsm <- gam(Ebed_BR_ratio ~ s(tsm, bs="cr"), data=vars) #cr: cubic regression splines
summary(mod_gam_tsm)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## Ebed_BR_ratio ~ s(tsm, bs = "cr")
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.746989 0.002476 301.8 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(tsm) 7.782 8.374 17.3 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.122 Deviance explained = 12.9%
## GCV = 0.0061824 Scale est. = 0.0061281 n = 1000
Deviance explained: 12.9%
mod_gam_chl <- gam(Ebed_BR_ratio ~ s(chl, bs="cr"), data=vars)
summary(mod_gam_chl)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## Ebed_BR_ratio ~ s(chl, bs = "cr")
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.746989 0.002555 292.4 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(chl) 7.481 8.137 9.172 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.0646 Deviance explained = 7.16%
## GCV = 0.0065844 Scale est. = 0.0065286 n = 1000
Deviance explained: 7.16%
mod_gam_ag440 <- gam(Ebed_BR_ratio ~ s(ag440, bs="cr"), data=vars)
summary(mod_gam_ag440)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## Ebed_BR_ratio ~ s(ag440, bs = "cr")
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.746989 0.002414 309.5 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(ag440) 5.849 6.724 29.83 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.165 Deviance explained = 17%
## GCV = 0.0058669 Scale est. = 0.0058267 n = 1000
Deviance explained: 17%
Observations:
mod_gam2 <- gam(Ebed_BR_ratio ~ s(chl, bs="cr") + s(tsm, bs="cr") + s(ag440, bs="cr") + s(zz, bs="cr"), data=vars)
summary(mod_gam2)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## Ebed_BR_ratio ~ s(chl, bs = "cr") + s(tsm, bs = "cr") + s(ag440,
## bs = "cr") + s(zz, bs = "cr")
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.746989 0.001684 443.7 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## s(chl) 3.181 3.623 37.31 <2e-16 ***
## s(tsm) 8.326 8.770 35.57 <2e-16 ***
## s(ag440) 6.610 7.477 53.81 <2e-16 ***
## s(zz) 5.696 6.707 89.42 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.594 Deviance explained = 60.4%
## GCV = 0.0029068 Scale est. = 0.0028346 n = 1000
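For reference, `mod_gam2` is an additive model with a Gaussian family and identity link, as reported in the summary above. Writing \(y_i\) for the ratio \(\frac{Ebed_{brown}}{Ebed_{red}}\) of observation \(i\) (a notation introduced here only for this equation), it can be written as:

```latex
\[
y_i = \beta_0 + f_1(chl_i) + f_2(tsm_i) + f_3(ag440_i) + f_4(zz_i) + \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2)
\]
```

Here \(\beta_0\) is the intercept (estimated at 0.747) and each \(f_j\) is a penalized cubic regression spline (`bs = "cr"`). Each predictor contributes independently of the others, so this model cannot represent interactions.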
plot_gam(mod_gam2, ncol = 2) + ylab("Pred. Ebed B/R") + theme_light()#all predictors
Figure 2 - Ratio between Ebed_brown and Ebed_red as a function of its predictors. The red lines are fitted with a multi-predictor GAM.
plot_gam_3d(model = mod_gam2, main_var = tsm, second_var = chl, palette='bilbao', direction = -1)
Figure 3 - Prediction of the ratio between Ebed_brown and Ebed_red using a GAM using multiple predictors (zz, tsm, chl, ag440). Showing chl and tsm.
#plot_gam_3d(model = mod_gam2, main_var = chl, second_var = zz, palette='bilbao', direction = -1)
plot_gam_3d(model = mod_gam2, main_var = tsm, second_var = zz, palette='bilbao', direction = -1)
Figure 4 - Prediction of the ratio between Ebed_brown and Ebed_red using a GAM using multiple predictors (zz, tsm, chl, ag440). Showing zz and tsm.
plot_gam_3d(model = mod_gam2, main_var = ag440, second_var = zz, palette='bilbao', direction = -1)
Figure 5 - Prediction of the ratio between Ebed_brown and Ebed_red using a GAM using multiple predictors (zz, tsm, chl, ag440). Showing zz and ag440.
Observations:
mod_gam2_int <- gam(Ebed_BR_ratio ~ te(tsm, chl,ag440, bs='cr') + s(zz, bs = 'cr'), data=vars)
summary(mod_gam2_int)
##
## Family: gaussian
## Link function: identity
##
## Formula:
## Ebed_BR_ratio ~ te(tsm, chl, ag440, bs = "cr") + s(zz, bs = "cr")
##
## Parametric coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.746989 0.001606 465.1 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Approximate significance of smooth terms:
## edf Ref.df F p-value
## te(tsm,chl,ag440) 57.145 66.15 16.13 <2e-16 ***
## s(zz) 5.271 6.27 98.36 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## R-sq.(adj) = 0.63 Deviance explained = 65.4%
## GCV = 0.002754 Scale est. = 0.0025794 n = 1000
Deviance explained: 65.4%
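In equation form, and again writing \(y_i\) for the ratio \(\frac{Ebed_{brown}}{Ebed_{red}}\) of observation \(i\) (notation introduced here for illustration), `mod_gam2_int` replaces the three separate smooths of tsm, chl and ag440 with a single tensor-product smooth:

```latex
\[
y_i = \beta_0 + f_{te}(tsm_i, chl_i, ag440_i) + f_{zz}(zz_i) + \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2)
\]
```

The term \(f_{te}\) lets the effect of each optical variable depend on the values of the other two, which is why it uses far more effective degrees of freedom (about 57) than a single-variable smooth. Depth still enters additively through \(f_{zz}\), so this model contains no interaction between depth and the optical variables.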
plot_gam_3d(model = mod_gam2_int, main_var = tsm, second_var = zz, palette='bilbao', direction = -1)
Figure 6 - Prediction of the ratio between Ebed_brown and Ebed_red using a GAM with multiple predictors and interactions (te(tsm, chl, ag440) + s(zz)). Showing zz and tsm.
plot_gam_3d(model = mod_gam2_int, main_var = tsm, second_var = chl, palette='bilbao', direction = -1)
AIC(mod_gam_zz,mod_gam_chl,mod_gam_tsm,mod_gam_ag440,mod_gam2,mod_gam2_int)
## df AIC
## mod_gam_zz 7.787644 -2403.018
## mod_gam_chl 9.480988 -2183.246
## mod_gam_tsm 9.781875 -2246.249
## mod_gam_ag440 7.849464 -2298.604
## mod_gam2 25.812641 -3001.467
## mod_gam2_int 64.416126 -3059.008
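A quick way to read this table is to express each AIC as a difference from the best (lowest) value. The sketch below copies the AIC values from the output above into a named vector; the names `aic` and `delta_aic` are introduced here for illustration.

```r
# AIC values copied from the AIC() output above
aic <- c(
  mod_gam_zz    = -2403.018,
  mod_gam_chl   = -2183.246,
  mod_gam_tsm   = -2246.249,
  mod_gam_ag440 = -2298.604,
  mod_gam2      = -3001.467,
  mod_gam2_int  = -3059.008
)

# Difference from the lowest AIC: 0 for the best model, larger means less support
delta_aic <- aic - min(aic)
sort(round(delta_aic, 3))
```

The interaction model has the lowest AIC, the additive model `mod_gam2` sits about 57.5 units above it, and every single-predictor model is more than 650 units above it.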
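Accounting for the interaction between tsm, chl and ag440 in the GAM gives the highest deviance explained (65.4%) and the lowest AIC (-3059.0). The gain over the additive model without interaction (60.4% deviance explained, AIC -3001.5) is modest, however, and comes at the cost of many more effective degrees of freedom (about 64 versus 26).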
We do see some evidence here that red algae do better than brown algae in deeper waters and at higher chl, tsm and ag440.
Changing the distribution of depth could be worth investigating, for instance having more shallow waters.
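As a rough sketch of what such a change could look like, the code below compares a uniform depth draw with one skewed towards shallow water. Everything specific in it is an assumption for illustration: the maximum depth `z_max`, its unit, and the Beta(1, 3) shape are not taken from the synthetic dataset.

```r
set.seed(42)
n_sim <- 1000   # same sample size as reported in the GAM summaries
z_max <- 30     # hypothetical maximum depth (assumed to be in metres)

# Uniform law: every depth between 0 and z_max is equally likely
zz_uniform <- runif(n_sim, min = 0, max = z_max)

# Shallow-skewed alternative: a Beta(1, 3) draw rescaled to [0, z_max]
zz_shallow <- z_max * rbeta(n_sim, shape1 = 1, shape2 = 3)

# Compare the two depth distributions
summary(zz_uniform)
summary(zz_shallow)

# Share of samples in the shallowest third of the depth range
mean(zz_uniform < z_max / 3)
mean(zz_shallow < z_max / 3)
```

Either vector could then replace the depth input of the simulation before refitting the GAMs, to check whether brown algae win more often once shallow conditions are better represented.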
Also, red algae win most of the time here, even in shallow waters, whereas we would expect brown algae to win in waters with intermediate and low chl, tsm and ag440.