AIM : How to best predict Ebed algal (Brown, Red and Green) using Ebed PAR int, Ebed and Kd at different lambda (412, 443, 488, 531, 555, 645, 667)? We investigate the question using a linear multiple regression and GAM approach.

Document Note : The maps, plots and app within this document are interactive so make sure you give them a play like zooming in and out in the maps but also on the plots. Clicking on the legend allows to only select and display the time series needed.

Table of contents

  1. Synthetic data

  2. Predicting E algal Brown

  3. Predicting E algal Red

  4. Predicting E algal Green

  5. Conclusion

Synthetic data

vars <- read_csv('Synth_Ealgal_Elam_Kdlam_EbedInt.csv')

p <- vars %>% 
  pivot_longer(-n,names_to = 'Vars',values_to = 'Values') %>%
  ggplot(data=.) +
  geom_histogram(aes(x=Values)) +
  facet_wrap(~Vars, scales = 'free') +
  theme_light()
p
Figure 1 - Distribution of variables.

Figure 1 - Distribution of variables.

Observations:

Predicting E algal Brown

Linear regression

vars_brown <- vars %>% 
  pivot_longer(-EbedAction_brown,names_to = 'Vars',values_to = 'Values')
  
p <- ggplot(data=vars_brown,aes(x=Values,y=EbedAction_brown)) +
  geom_point() +
  facet_wrap(~Vars, scales = 'free') +
  geom_smooth(method = 'lm') +
  theme_light()
p
Figure 2 - Linear model to explain E algal Brown.

Figure 2 - Linear model to explain E algal Brown.

kable(do(group_by(vars_brown,Vars), glance(lm(EbedAction_brown ~ Values, data = .)))
 ,caption = 'Table 1 - Linear model.')
Table 1 - Linear model.
Vars r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC deviance df.residual nobs
Ebed_412 0.8624664 0.8623286 26.497017 6.258407e+03 0.0000000 1 -4694.970 9395.939 9410.663 700687.747 998 1000
Ebed_443 0.9624955 0.9624579 13.836756 2.561213e+04 0.0000000 1 -4045.266 8096.532 8111.255 191072.916 998 1000
Ebed_488 0.9623554 0.9623177 13.862578 2.551309e+04 0.0000000 1 -4047.131 8100.261 8114.984 191786.737 998 1000
Ebed_531 0.8982067 0.8981047 22.795662 8.806178e+03 0.0000000 1 -4544.508 9095.016 9109.739 518602.902 998 1000
Ebed_555 0.8784377 0.8783159 24.911047 7.211783e+03 0.0000000 1 -4633.249 9272.498 9287.221 619319.127 998 1000
Ebed_645 0.9135667 0.9134801 21.005504 1.054847e+04 0.0000000 1 -4462.722 8931.444 8946.167 440348.724 998 1000
Ebed_667 0.8525104 0.8523626 27.439314 5.768578e+03 0.0000000 1 -4729.914 9465.829 9480.552 751410.139 998 1000
Ebed_par 0.9959511 0.9959470 4.546341 2.454877e+05 0.0000000 1 -2932.260 5870.520 5885.244 20627.875 998 1000
EbedAction_green 0.9987831 0.9987819 2.492381 8.191397e+05 0.0000000 1 -2331.176 4668.352 4683.075 6199.539 998 1000
EbedAction_red 0.9964931 0.9964896 4.231127 2.835814e+05 0.0000000 1 -2860.406 5726.812 5741.535 17866.632 998 1000
Kd_412 0.0215241 0.0205437 70.675288 2.195363e+01 0.0000032 1 -5676.034 11358.067 11372.790 4985006.370 998 1000
Kd_443 0.0249092 0.0239321 70.552932 2.549440e+01 0.0000005 1 -5674.301 11354.602 11369.325 4967760.787 998 1000
Kd_488 0.0264839 0.0255085 70.495938 2.715002e+01 0.0000002 1 -5673.493 11352.985 11367.708 4959737.859 998 1000
Kd_531 0.0247064 0.0237291 70.560268 2.528160e+01 0.0000006 1 -5674.405 11354.809 11369.533 4968793.871 998 1000
Kd_555 0.0232567 0.0222780 70.612688 2.376286e+01 0.0000013 1 -5675.147 11356.295 11371.018 4976179.445 998 1000
Kd_645 0.0201623 0.0191805 70.724456 2.053598e+01 0.0000066 1 -5676.729 11359.458 11374.181 4991944.761 998 1000
Kd_667 0.0194159 0.0184333 70.751387 1.976072e+01 0.0000098 1 -5677.110 11360.219 11374.943 4995747.290 998 1000
n 0.0007533 -0.0002479 71.421487 7.523792e-01 0.3859325 1 -5686.536 11379.073 11393.796 5090826.771 998 1000

Observations:

  • Very good linear relationships with Ebed_par
  • Good-ish linear relationships with Ebed(\(\lambda\)). Some more expo fit-like.
  • Also with EbedAction Green and Red, not used in the next. We are also interested in explaining deviation in their relationships, but another story.
  • Negative relationship but way weaker relationship with Kds, we are not including them in the regression this time.

Multiple linear regression

vars_mlm_b <- lm(EbedAction_brown ~ Ebed_par + Ebed_412 + Ebed_443 + Ebed_488 + Ebed_531 + Ebed_555 + Ebed_645 + Ebed_667,data=vars)
summary(vars_mlm_b)
## 
## Call:
## lm(formula = EbedAction_brown ~ Ebed_par + Ebed_412 + Ebed_443 + 
##     Ebed_488 + Ebed_531 + Ebed_555 + Ebed_645 + Ebed_667, data = vars)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -22.4222  -0.0133  -0.0029   0.0338  17.9193 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   0.005097   0.045407   0.112  0.91064    
## Ebed_par      0.267175   0.025381  10.527  < 2e-16 ***
## Ebed_412    108.038560   5.211571  20.731  < 2e-16 ***
## Ebed_443     12.384190   6.253666   1.980  0.04795 *  
## Ebed_488     58.360596   7.646015   7.633 5.39e-14 ***
## Ebed_531     10.872783   5.804605   1.873  0.06134 .  
## Ebed_555     -9.660195   9.230408  -1.047  0.29556    
## Ebed_645    -42.784415  15.141026  -2.826  0.00481 ** 
## Ebed_667     28.058157   5.921033   4.739 2.46e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.085 on 991 degrees of freedom
## Multiple R-squared:  0.9998, Adjusted R-squared:  0.9998 
## F-statistic: 5.412e+05 on 8 and 991 DF,  p-value: < 2.2e-16

Observations:

  • Most significant predictor is Ebed_par again
  • Interesting significance of Ebed_412, Ebed_488, Ebed_645 and Ebed_667 (Blue and Red light, cf aaction spectra) BUT non linear relationship with 500<\(\lambda\)<600, green light.
  • \(R^2 = 0.9998\) almost like just with taking Ebed_par alone…
  • Green Ebed, non linear relationship so better use other regression method like GAM?
layout(matrix(c(1,2,3,4),2,2)) # optional 4 graphs/page
plot(vars_mlm_b)
Figure 3 - Assumptions check of linear model to explain E algal Brown.

Figure 3 - Assumptions check of linear model to explain E algal Brown.

Observations:

  • Q-Q plot suggests more extreme values than “pure” normal distribution.
  • To include uniform depth distribution so Ebed will be normally distributed?

GAM

p <- vars %>% 
  dplyr::select(Ebed_412,Ebed_443,Ebed_488,Ebed_531,Ebed_555,Ebed_645,Ebed_667,Ebed_par,EbedAction_brown) %>% 
  pivot_longer(-EbedAction_brown) %>% 
  ggplot(data=.,aes(x=value,y=EbedAction_brown)) +
     geom_point() +
     facet_wrap(~name, scales = 'free') +
     geom_smooth(method = 'gam') +
     theme_light()
p
Figure 4 - GAM model to explain E algal Brown. No interactions.

Figure 4 - GAM model to explain E algal Brown. No interactions.

vars_brown <- vars %>% 
  dplyr::select(Ebed_412,Ebed_443,Ebed_488,Ebed_531,Ebed_555,Ebed_645,Ebed_667,Ebed_par,EbedAction_brown) %>% 
  pivot_longer(-EbedAction_brown)

kable(do(group_by(vars_brown,name), glance(gam(EbedAction_brown ~ value, data = .)))
 ,caption = 'Table 2 - GAM no interactions.')
Table 2 - GAM no interactions.
name df logLik AIC BIC deviance df.residual nobs
Ebed_412 2 -4694.970 9395.939 9410.663 700687.75 998 1000
Ebed_443 2 -4045.266 8096.532 8111.255 191072.92 998 1000
Ebed_488 2 -4047.131 8100.261 8114.984 191786.74 998 1000
Ebed_531 2 -4544.508 9095.016 9109.739 518602.90 998 1000
Ebed_555 2 -4633.249 9272.498 9287.221 619319.13 998 1000
Ebed_645 2 -4462.722 8931.444 8946.167 440348.72 998 1000
Ebed_667 2 -4729.914 9465.829 9480.552 751410.14 998 1000
Ebed_par 2 -2932.260 5870.520 5885.244 20627.87 998 1000

Observations:

  • Best predictor based on lowest AIC: Ebed_par, then Ebed_443, then Ebed_488, then Ebed_645, then Ebed_531, then Ebed_555, then Ebed_412, then Ebed_667.
  • What about interactions?
vars_gam_b <- gam(EbedAction_brown ~ s(Ebed_par, bs='cr') + s(Ebed_412, bs='cr') + s(Ebed_443, bs='cr') + s(Ebed_488, bs='cr') + s(Ebed_531, bs='cr') + s(Ebed_555, bs='cr') + s(Ebed_645, bs='cr') + s(Ebed_667, bs='cr'), data=vars)

plot_gam(vars_gam_b, ncol = 3) + ylab("Pred. Ebed Algal Brown") + theme_light()#all predictors
Figure 4 - GAM model to explain E algal Brown.

Figure 4 - GAM model to explain E algal Brown.

Observations:

  • Interesting negative relationships between Ealgal Brown with 555 and 645 (match with action spectra of Brown)
  • Parabolic relationships with 443 and 531. Not sure how to interpret these, especially the 443.
  • Positive relationships with 412, 488, 667 and PAR, makes sense and consistent with action spectra.
summary(vars_gam_b)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## EbedAction_brown ~ s(Ebed_par, bs = "cr") + s(Ebed_412, bs = "cr") + 
##     s(Ebed_443, bs = "cr") + s(Ebed_488, bs = "cr") + s(Ebed_531, 
##     bs = "cr") + s(Ebed_555, bs = "cr") + s(Ebed_645, bs = "cr") + 
##     s(Ebed_667, bs = "cr")
## 
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 19.82191    0.02686   737.9   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df       F  p-value    
## s(Ebed_par) 3.294  3.724  27.206  < 2e-16 ***
## s(Ebed_412) 6.998  7.098 108.838  < 2e-16 ***
## s(Ebed_443) 5.731  5.993  26.288  < 2e-16 ***
## s(Ebed_488) 2.845  3.172  43.267  < 2e-16 ***
## s(Ebed_531) 4.032  4.503   7.530 3.67e-06 ***
## s(Ebed_555) 1.386  1.638   9.025 0.000588 ***
## s(Ebed_645) 3.123  3.295  11.545  < 2e-16 ***
## s(Ebed_667) 4.879  5.020  37.885  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =      1   Deviance explained =  100%
## GCV = 0.74645  Scale est. = 0.7216    n = 1000

Observations:

  • Deviance explained = 100%
  • All predictors significant
plot_gam_3d(model = vars_gam_b, main_var = Ebed_412, second_var = Ebed_555, palette='bilbao', direction = 1)

Figure 4bis - Prediction of Ebed_brown using a GAM using multiple predictors (Ebed_par, Ebed_412, Ebed_443, Ebed_488, Ebed_531 ,Ebed_555, Ebed_645, Ebed_667). Showing Ebed_412 and Ebed_555.

Observations:

  • High Ebed_555 but low Ebed 412 lead to low Ebed Brown!!
plot_gam_3d(model = vars_gam_b, main_var = Ebed_412, second_var = Ebed_667, palette='bilbao', direction = 1)

Figure 4bis - Prediction of Ebed_brown using a GAM using multiple predictors (Ebed_par, Ebed_412, Ebed_443, Ebed_488, Ebed_531 ,Ebed_555, Ebed_645, Ebed_667). Showing Ebed_412 and Ebed_667.

Observations:

  • High Ebed_412 and high/low Ebed_667 lead to high Ebed Brown!!
  • 667 less important of a predictor than 412?

Predicting E algal red

Linear regression

vars_red <- vars %>% 
  pivot_longer(-EbedAction_red,names_to = 'Vars',values_to = 'Values')
  
p <- ggplot(data=vars_red,aes(x=Values,y=EbedAction_red)) +
  geom_point() +
  facet_wrap(~Vars, scales = 'free') +
  geom_smooth(method = 'lm') +
  theme_light()
p
Figure 5 - Linear model to explain E algal Red

Figure 5 - Linear model to explain E algal Red

kable(do(group_by(vars_red,Vars), glance(lm(EbedAction_red ~ Values, data = .)))
 ,caption = 'Table 2 - Linear model.')
Table 2 - Linear model.
Vars r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC deviance df.residual nobs
Ebed_412 0.8310128 0.8308435 31.017469 4.907772e+03 0.0000000 1 -4852.488 9710.976 9725.699 960159.21 998 1000
Ebed_443 0.9456264 0.9455719 17.594363 1.735649e+04 0.0000000 1 -4285.516 8577.032 8591.755 308942.49 998 1000
Ebed_488 0.9725025 0.9724749 12.511980 3.529621e+04 0.0000000 1 -3944.624 7895.248 7909.972 156236.55 998 1000
Ebed_531 0.9227212 0.9226437 20.975373 1.191627e+04 0.0000000 1 -4461.287 8928.573 8943.296 439086.33 998 1000
Ebed_555 0.9050944 0.9049993 23.244771 9.517716e+03 0.0000000 1 -4564.018 9134.035 9148.759 539238.74 998 1000
Ebed_645 0.8912972 0.8911882 24.877107 8.182993e+03 0.0000000 1 -4631.886 9269.771 9284.494 617632.71 998 1000
Ebed_667 0.8221632 0.8219850 31.819272 4.613887e+03 0.0000000 1 -4878.010 9762.019 9776.743 1010441.11 998 1000
Ebed_par 0.9973445 0.9973418 3.888256 3.748219e+05 0.0000000 1 -2775.898 5557.796 5572.520 15088.29 998 1000
EbedAction_brown 0.9964931 0.9964896 4.468307 2.835814e+05 0.0000000 1 -2914.947 5835.894 5850.617 19925.83 998 1000
EbedAction_green 0.9928302 0.9928231 6.388980 1.381979e+05 0.0000000 1 -3272.512 6551.024 6565.748 40737.43 998 1000
Kd_412 0.0244693 0.0234918 74.524642 2.503291e+01 0.0000007 1 -5729.067 11464.135 11478.858 5542814.35 998 1000
Kd_443 0.0283046 0.0273309 74.378002 2.907079e+01 0.0000001 1 -5727.098 11460.196 11474.919 5521023.05 998 1000
Kd_488 0.0300933 0.0291215 74.309511 3.096498e+01 0.0000000 1 -5726.176 11458.353 11473.076 5510859.55 998 1000
Kd_531 0.0282283 0.0272546 74.380922 2.899016e+01 0.0000001 1 -5727.137 11460.274 11474.997 5521456.47 998 1000
Kd_555 0.0266544 0.0256791 74.441132 2.732950e+01 0.0000002 1 -5727.946 11461.892 11476.616 5530399.24 998 1000
Kd_645 0.0230979 0.0221191 74.577005 2.359678e+01 0.0000014 1 -5729.770 11465.540 11480.263 5550606.27 998 1000
Kd_667 0.0222846 0.0213049 74.608044 2.274693e+01 0.0000021 1 -5730.186 11466.372 11481.095 5555227.55 998 1000
n 0.0007774 -0.0002238 75.424172 7.764425e-01 0.3784435 1 -5741.065 11488.131 11502.854 5677428.15 998 1000

Observations:

  • Very good linear relationships with Ebed_par
  • Also with EbedAction Green and Brown, not used in the next. We are also interested in explaining deviation in their relationships, but another story.
  • Less Good-ish linear relationships with Ebed(\(\lambda\)). Some more expo fit-like.
  • Negative relationship but way weaker relationship with Kds

Multiple linear regression

vars_mlm_r <- lm(EbedAction_red ~ Ebed_par + Ebed_412 + Ebed_443 + Ebed_488 + Ebed_531 + Ebed_555 + Ebed_645 + Ebed_667,data=vars)
summary(vars_mlm_r)
## 
## Call:
## lm(formula = EbedAction_red ~ Ebed_par + Ebed_412 + Ebed_443 + 
##     Ebed_488 + Ebed_531 + Ebed_555 + Ebed_645 + Ebed_667, data = vars)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -20.9027  -0.1642   0.0228   0.0582  24.1409 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -0.03109    0.09108  -0.341 0.732956    
## Ebed_par      0.17325    0.05091   3.403 0.000693 ***
## Ebed_412     21.61340   10.45372   2.068 0.038943 *  
## Ebed_443    118.60969   12.54403   9.455  < 2e-16 ***
## Ebed_488     41.75247   15.33689   2.722 0.006596 ** 
## Ebed_531     24.49114   11.64327   2.103 0.035677 *  
## Ebed_555     50.49340   18.51498   2.727 0.006501 ** 
## Ebed_645    -54.08429   30.37089  -1.781 0.075252 .  
## Ebed_667     87.28131   11.87681   7.349 4.17e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.176 on 991 degrees of freedom
## Multiple R-squared:  0.9992, Adjusted R-squared:  0.9992 
## F-statistic: 1.499e+05 on 8 and 991 DF,  p-value: < 2.2e-16

Observations:

  • Most significant predictor is Ebed_par again
  • Interesting significance of Ebed_555 this time compared to brown
  • \(R^2 = 0.9992\) almost like just with taking Ebed_par alone…
  • Worth keeping Kds in regression? Non linear relationship so better use other regression method like GAM?
layout(matrix(c(1,2,3,4),2,2)) # optional 4 graphs/page
plot(vars_mlm_r)
Figure 6 - Assumptions check of linear model to explain E algal Red

Figure 6 - Assumptions check of linear model to explain E algal Red

Observations:

  • Q-Q plot suggests more extreme values than “pure” normal distribution.

GAM

vars_gam_r <- gam(EbedAction_red ~ s(Ebed_par, bs='cr') + s(Ebed_412, bs='cr') + s(Ebed_443, bs='cr') + s(Ebed_488, bs='cr') + s(Ebed_531, bs='cr') + s(Ebed_555, bs='cr') + s(Ebed_645, bs='cr') + s(Ebed_667, bs='cr'), data=vars)

plot_gam(vars_gam_r, ncol = 2) + ylab("Pred. Ebed Algal Red") + theme_light()#all predictors
Figure 7 - GAM model to explain E algal Red

Figure 7 - GAM model to explain E algal Red

summary(vars_gam_r)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## EbedAction_red ~ s(Ebed_par, bs = "cr") + s(Ebed_412, bs = "cr") + 
##     s(Ebed_443, bs = "cr") + s(Ebed_488, bs = "cr") + s(Ebed_531, 
##     bs = "cr") + s(Ebed_555, bs = "cr") + s(Ebed_645, bs = "cr") + 
##     s(Ebed_667, bs = "cr")
## 
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 22.63920    0.05657   400.2   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                edf Ref.df      F p-value    
## s(Ebed_par) 1.6920 2.0213 32.630 < 2e-16 ***
## s(Ebed_412) 7.1283 7.2622 33.913 < 2e-16 ***
## s(Ebed_443) 5.1762 5.6051 11.706 < 2e-16 ***
## s(Ebed_488) 2.7561 3.0627 13.127 < 2e-16 ***
## s(Ebed_531) 3.4140 3.8800  4.446 0.00208 ** 
## s(Ebed_555) 0.9864 0.9875  5.507 0.01991 *  
## s(Ebed_645) 2.7264 2.8857 14.646 < 2e-16 ***
## s(Ebed_667) 4.9484 5.0670 17.616 < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Rank: 72/73
## R-sq.(adj) =  0.999   Deviance explained = 99.9%
## GCV = 3.2987  Scale est. = 3.2003    n = 1000

Observations:

  • Deviance explained = 99.9%
  • Again, Ebed_Par best explain and very linear
  • Ebed_555 less significant than Ebed_blue, how to interpret that?
plot_gam_3d(model = vars_gam_r, main_var = Ebed_555, second_var = Ebed_443, palette='bilbao', direction = 1)

Figure 7bis - Prediction of Ebed_red using a GAM using multiple predictors (Ebed_par, Ebed_412, Ebed_443, Ebed_488, Ebed_531, Ebed_555, Ebed_645, Ebed_667). Showing Ebed_443 and Ebed_555.

Predicting E algal Green

Linear regression

vars_green <- vars %>% 
  pivot_longer(-EbedAction_green,names_to = 'Vars',values_to = 'Values')
  
p <- ggplot(data=vars_green,aes(x=Values,y=EbedAction_green)) +
  geom_point() +
  facet_wrap(~Vars, scales = 'free') +
  geom_smooth(method = 'lm') +
  theme_light()
p
Figure 8 - Linear model to explain E algal Green

Figure 8 - Linear model to explain E algal Green

kable(do(group_by(vars_green,Vars), glance(lm(EbedAction_green ~ Values, data = .)))
 ,caption = 'Table 3 - Linear model.')
Table 3 - Linear model.
Vars r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC deviance df.residual nobs
Ebed_412 0.8783311 0.8782092 24.120173 7.204587e+03 0.0000000 1 -4600.986 9207.972 9222.695 580619.186 998 1000
Ebed_443 0.9703817 0.9703521 11.900643 3.269742e+04 0.0000000 1 -3894.530 7795.060 7809.783 141342.056 998 1000
Ebed_488 0.9526864 0.9526390 15.041239 2.009530e+04 0.0000000 1 -4128.733 8263.466 8278.190 225786.387 998 1000
Ebed_531 0.8794141 0.8792932 24.012583 7.278256e+03 0.0000000 1 -4596.516 9199.031 9213.754 575450.945 998 1000
Ebed_555 0.8587624 0.8586209 25.987573 6.068108e+03 0.0000000 1 -4675.556 9357.112 9371.835 674003.220 998 1000
Ebed_645 0.9285967 0.9285252 18.477773 1.297895e+04 0.0000000 1 -4334.506 8675.012 8689.735 340745.233 998 1000
Ebed_667 0.8723838 0.8722559 24.702648 6.822322e+03 0.0000000 1 -4624.848 9255.696 9270.419 609000.389 998 1000
Ebed_par 0.9929335 0.9929264 5.812911 1.402312e+05 0.0000000 1 -3178.019 6362.038 6376.761 33722.354 998 1000
EbedAction_brown 0.9987831 0.9987819 2.412195 8.191397e+05 0.0000000 1 -2298.475 4602.949 4617.673 5807.048 998 1000
EbedAction_red 0.9928302 0.9928231 5.855213 1.381979e+05 0.0000000 1 -3185.270 6376.540 6391.263 34214.949 998 1000
Kd_412 0.0188827 0.0178996 68.493765 1.920762e+01 0.0000130 1 -5644.680 11295.361 11310.084 4682013.123 998 1000
Kd_443 0.0220131 0.0210332 68.384406 2.246362e+01 0.0000025 1 -5643.082 11292.165 11306.888 4667074.195 998 1000
Kd_488 0.0235881 0.0226097 68.329321 2.410963e+01 0.0000011 1 -5642.277 11290.553 11305.276 4659558.301 998 1000
Kd_531 0.0220133 0.0210334 68.384401 2.246378e+01 0.0000025 1 -5643.082 11292.165 11306.888 4667073.436 998 1000
Kd_555 0.0207235 0.0197422 68.429481 2.111970e+01 0.0000049 1 -5643.741 11293.483 11308.206 4673228.660 998 1000
Kd_645 0.0180742 0.0170903 68.521981 1.837006e+01 0.0000200 1 -5645.092 11296.184 11310.908 4685871.414 998 1000
Kd_667 0.0173608 0.0163762 68.546869 1.763216e+01 0.0000292 1 -5645.455 11296.911 11311.634 4689275.905 998 1000
n 0.0007404 -0.0002608 69.124139 7.394850e-01 0.3900324 1 -5653.842 11313.683 11328.406 4768590.291 998 1000

Observations:

  • Very good linear relationships with Ebed_par
  • Also with EbedAction Brown and Red, not used in the next. We are also interested in explaining deviation in their relationships, but another story.
  • Negative relationship but way weaker relationship with Kds

Multiple linear regression

vars_mlm_g <- lm(EbedAction_green ~ Ebed_par + Ebed_412 + Ebed_443 + Ebed_488 + Ebed_531 + Ebed_555 + Ebed_645 + Ebed_667,data=vars)
summary(vars_mlm_g)
## 
## Call:
## lm(formula = EbedAction_green ~ Ebed_par + Ebed_412 + Ebed_443 + 
##     Ebed_488 + Ebed_531 + Ebed_555 + Ebed_645 + Ebed_667, data = vars)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -22.4114  -0.0277  -0.0119   0.0782  23.5325 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   0.01738    0.05931   0.293  0.76960    
## Ebed_par      0.32439    0.03315   9.786  < 2e-16 ***
## Ebed_412     61.73225    6.80678   9.069  < 2e-16 ***
## Ebed_443     21.71382    8.16785   2.658  0.00798 ** 
## Ebed_488     49.00755    9.98638   4.907 1.08e-06 ***
## Ebed_531     -8.27486    7.58133  -1.091  0.27533    
## Ebed_555    -32.72179   12.05574  -2.714  0.00676 ** 
## Ebed_645    -81.03303   19.77553  -4.098 4.51e-05 ***
## Ebed_667     79.55153    7.73340  10.287  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.417 on 991 degrees of freedom
## Multiple R-squared:  0.9996, Adjusted R-squared:  0.9996 
## F-statistic: 2.971e+05 on 8 and 991 DF,  p-value: < 2.2e-16

Observations:

  • Most significant predictor is Ebed_par again, but also Blue light an red peak absorption (Ebed_667)
  • Interesting negative slope for 531<\(\lambda\)<645
  • \(R^2 = 0.9996\) almost like just with taking Ebed_par alone…
  • Worth keeping Kds in regression? Non linear relationship so better use other regression method like GAM?
layout(matrix(c(1,2,3,4),2,2)) # optional 4 graphs/page
plot(vars_mlm_g)
Figure 9 - Assumptions check of linear model to explain E algal Green

Figure 9 - Assumptions check of linear model to explain E algal Green

Observations:

  • Q-Q plot suggests more extreme values than “pure” normal distribution.

GAM

vars_gam_g <- gam(EbedAction_green ~ s(Ebed_par, bs='cr') + s(Ebed_412, bs='cr') + s(Ebed_443, bs='cr') + s(Ebed_488, bs='cr') + s(Ebed_531, bs='cr') + s(Ebed_555, bs='cr') + s(Ebed_645, bs='cr') + s(Ebed_667, bs='cr'), data=vars)

plot_gam(vars_gam_g, ncol = 2) + ylab("Pred. Ebed Algal Green") + theme_light()#all predictors
Figure 10 - GAM model to explain E algal Green

Figure 10 - GAM model to explain E algal Green

summary(vars_gam_g)
## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## EbedAction_green ~ s(Ebed_par, bs = "cr") + s(Ebed_412, bs = "cr") + 
##     s(Ebed_443, bs = "cr") + s(Ebed_488, bs = "cr") + s(Ebed_531, 
##     bs = "cr") + s(Ebed_555, bs = "cr") + s(Ebed_645, bs = "cr") + 
##     s(Ebed_667, bs = "cr")
## 
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 18.05158    0.04083   442.1   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##               edf Ref.df      F  p-value    
## s(Ebed_par) 2.387  2.710 10.118 2.03e-05 ***
## s(Ebed_412) 7.037  7.140 29.033  < 2e-16 ***
## s(Ebed_443) 5.695  5.970 28.148  < 2e-16 ***
## s(Ebed_488) 2.402  2.697 29.490  < 2e-16 ***
## s(Ebed_531) 4.124  4.591  8.122 9.30e-07 ***
## s(Ebed_555) 2.082  2.577  0.932   0.4213    
## s(Ebed_645) 3.120  3.293  2.354   0.0647 .  
## s(Ebed_667) 4.963  5.081  1.672   0.1371    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =      1   Deviance explained =  100%
## GCV = 1.7237  Scale est. = 1.6672    n = 1000

Observations:

  • Deviance explained = 100%
  • Again, Ebed_Par best explain and \(\lambda\)<531
  • Not significant Ebed green light
  • Less linear effects of Ebed than for Brown and Red??
plot_gam_3d(model = vars_gam_g, main_var = Ebed_531, second_var = Ebed_443, palette='bilbao', direction = -1)

Figure 11 - Prediction of Ebed_green using a GAM using multiple predictors (Ebed_par, Ebed_412, Ebed_443, Ebed_488, Ebed_531, Ebed_555, Ebed_645, Ebed_667). Showing Ebed_443 and Ebed_531.

Conclusion - Model comparison

What can be changed?