AIM : Which predictors are best explaining the Ebed algal ratio between Brown, Red and Green algae? We investigate the question using a GAM approach. We focus here on the ratio $\frac{Ebed_{brown}}{Ebed_{red}}$, $\frac{Ebed_{green}}{Ebed_{red}}$ and $\frac{Ebed_{green}}{Ebed_{brown}}$ as response variables influenced by 4 main predictors: $b_{bp}$, $a_{det}$, $a_{ph}$ ,and $z$. Other factors investigated in GAM_Synth_RatioRGB_04.Rmd.

Document Note : Some plots within this document are interactive so make sure you give them a play like zooming in and out.

Synthetic data
GAM - Single predictor
- AIC of single predictor model
- z
- adet
- bbp
- aph
GAM - Multiple predictors, no interaction
- All predictors
GAM - Multiple predictors + interactions
Conclusion
What about Brown vs Green?
What about Green vs Red?
All GAMs

Synthetic data

Figure 1 - Spectral F0, a, bb, Kd, Ebed and action spectra of Green, Brown and Red algae.

Figure 2 - Distribution of the 10,000 sets of synthetic chl, tsm, ag440 and zz. Distribution of Ebed PAR. Distribution of EBed ratio Green/Brown, Brown/Red and Green/Red.

Observations:

Z better represent shallower depths
Ratio algae is interesting, showing different winners depending on light condition! Although quite dominated by Red (most B/R and G/R <1) We investigate this using a GAM approach in the following
Ebed conversion from instant light to daily light doses to re-do (and chat with Matt, especially unit)

Are predictors correlated?

Figure 3 - Correlation between variables.

GAM - Single predictor

vars <- read_csv('Synth_vars_RGB_05.csv')

vars_gam_BR <- vars %>% 
  dplyr::select(-c(Ebed_par,EbedAction_green,EbedAction_brown,EbedAction_red,ag440,tsm,chl)) %>% 
  pivot_longer(-Ebed_BR_ratio,values_to = "values",names_to = "variables")


pgam_BR <- ggplot(vars_gam_BR,aes(x=values,y=Ebed_BR_ratio)) + geom_point(color='#ff5500',alpha=.75) +
  geom_smooth(se=F, lwd=.5, color='#00aaff',method='gam') + facet_wrap(~variables, scales='free_x') +
  labs(x='') + theme_light()
pgam_BR

Figure 4 - Ratio between Ebed_brown and Ebed_red as a function of its predictors. The blue lines are fitted using a GAM using single predictor.

Observations:

Expected negative relationship between the ratio Ebed B/R with increasing adet, aph and bbp. In other words, red algae are doing better than brown for higher values of these predictors.
Also clear negative relationship with depth (zz) which is expected too as red are known to do better at deeper depth.
No relationship between Ebed B/R and n, which is expected, no sampling biases.
Interesting relationships between the Ebed B/R ratio with the 2 other ratios

In the next, we are going to focus on the interactive effects of adet, aph and bbp and depth on the Ebed RGB ratios.

AIC of single predictor model

kable(do(group_by(vars_gam_BR,variables), glance(gam(Ebed_BR_ratio ~ s(values, bs='cr'), data = .))
)%>%select(variables,AIC),caption = 'Table 1 - AIC for GAMs using single predictor.')

Table 1 - AIC for GAMs using single predictor.
variables	AIC
adet	-16986.86
aph	-16012.25
bbp	-16090.65
Ebed_GB_ratio	-17230.07
Ebed_GR_ratio	-41159.64
n	-15449.07
zz	-26607.29

Observations:

The model including zz has the lowest AIC, potentially explaining the best compared to other predictors (outside of Ebed_GR/B_ratio)
Followed by adet, bbp, aph, and n in increasing order of AIC

z

mod_gam_zz <- gam(Ebed_BR_ratio ~ s(zz, bs="cr"), data=vars) #cr:  cubic regression splines
summary(mod_gam_zz)

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## Ebed_BR_ratio ~ s(zz, bs = "cr")
## 
## Parametric coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.7751527  0.0006393    1212   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##         edf Ref.df    F p-value    
## s(zz) 8.732  8.968 2292  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.673   Deviance explained = 67.3%
## GCV = 0.0040917  Scale est. = 0.0040877  n = 10000

Deviance explained: 67.3%

adet

mod_gam_adet <- gam(Ebed_BR_ratio ~ s(adet, bs="cr"), data=vars) #cr:  cubic regression splines
summary(mod_gam_adet)

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## Ebed_BR_ratio ~ s(adet, bs = "cr")
## 
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.775153   0.001034   749.4   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##           edf Ref.df     F p-value    
## s(adet) 6.229  7.128 234.6  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.143   Deviance explained = 14.4%
## GCV = 0.010708  Scale est. = 0.0107    n = 10000

Deviance explained: 14.4%

bbp

mod_gam_bbp <- gam(Ebed_BR_ratio ~ s(bbp, bs="cr"), data=vars) #cr:  cubic regression splines
summary(mod_gam_bbp)

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## Ebed_BR_ratio ~ s(bbp, bs = "cr")
## 
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.775153   0.001082   716.5   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##          edf Ref.df     F p-value    
## s(bbp) 4.779  5.406 124.1  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.0627   Deviance explained = 6.31%
## GCV = 0.011712  Scale est. = 0.011705  n = 10000

Deviance explained: 6.31%

aph

mod_gam_aph <- gam(Ebed_BR_ratio ~ s(aph, bs="cr"), data=vars) #cr:  cubic regression splines
summary(mod_gam_aph)

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## Ebed_BR_ratio ~ s(aph, bs = "cr")
## 
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.775153   0.001086   713.6   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##          edf Ref.df     F p-value    
## s(aph) 3.598  4.123 141.6  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.0552   Deviance explained = 5.55%
## GCV = 0.011804  Scale est. = 0.011799  n = 10000

Deviance explained: 5.55%

Observations:

Deviance explained by zz: 67.3%
Followed by adet (14.4%), bbp(6.31%), aph (5.55%)
Pretty low $\rightarrow$ Need to take multiple predictors to explain ratio of Ebed BR, not surprising

GAM - Multiple predictors, no interaction

All predictors

mod_gam2 <- gam(Ebed_BR_ratio ~ s(adet, bs="cr")+ s(bbp, bs="cr")+ s(aph, bs="cr") + s(zz, bs="cr"), data=vars)
summary(mod_gam2)

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## Ebed_BR_ratio ~ s(adet, bs = "cr") + s(bbp, bs = "cr") + s(aph, 
##     bs = "cr") + s(zz, bs = "cr")
## 
## Parametric coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.7751527  0.0003924    1975   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##           edf Ref.df      F p-value    
## s(adet) 6.662  7.550 1137.9  <2e-16 ***
## s(bbp)  4.796  5.426  118.2  <2e-16 ***
## s(aph)  4.131  4.713  743.7  <2e-16 ***
## s(zz)   8.826  8.986 6035.0  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.877   Deviance explained = 87.7%
## GCV = 0.0015436  Scale est. = 0.0015397  n = 10000

Observations:

Deviance explained: 87.7%

plot_gam(mod_gam2, ncol = 2) + ylab("Pred. Ebed B/R") + theme_light()#all predictors

Figure 5 - Ratio between Ebed_brown and Ebed_red as a function of its predictors. The red lines are fitted using a GAM using multiple predictors.

plot_gam_3d(model = mod_gam2, main_var = zz, second_var = bbp, palette='bilbao', direction = -1)

Figure 6 - Prediction of the ratio between Ebed_brown and Ebed_red using a GAM using multiple predictors (zz, adet, bbp, aph). Showing zz and bbp

#(model = mod_gam2, main_var = tsm, second_var = zz, palette='bilbao', direction = -1)
#plot_gam_3d(model = mod_gam2, main_var = chl, second_var = zz, palette='bilbao', direction = -1)

plot_gam_3d(model = mod_gam2, main_var = bbp, second_var = aph, palette='bilbao', direction = -1)

Figure 7 - Prediction of the ratio between Ebed_brown and Ebed_red using a GAM using multiple predictors (zz, adet, bbp, aph). Showing aph and bbp.

plot_gam_3d(model = mod_gam2, main_var = adet, second_var = aph, palette='bilbao', direction = -1)

Figure 8 - Prediction of the ratio between Ebed_brown and Ebed_red using a GAM using multiple predictors (zz, adet, bbp, aph). Showing adet and aph

Observations:

Deviance explained by all predictors (zz, ): 87.7%. $\rightarrow$ What if we take interactions into account?
Deeper, higher predictor values favor red, expected $\rightarrow$ Shape of gam surfaces are different and interesting to dig a wee deeper.

GAM - Multiple predictors + interactions

mod_gam4_int <- gam(Ebed_BR_ratio ~ te(adet, aph, bbp, bs='cr') + s(zz, bs = 'cr'), data=vars)
summary(mod_gam4_int)

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## Ebed_BR_ratio ~ te(adet, aph, bbp, bs = "cr") + s(zz, bs = "cr")
## 
## Parametric coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.7751527  0.0003619    2142   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                     edf Ref.df    F p-value    
## te(adet,aph,bbp) 44.057 49.907  425  <2e-16 ***
## s(zz)             8.933  8.998 7032  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.895   Deviance explained = 89.6%
## GCV = 0.0013168  Scale est. = 0.0013097  n = 10000

Deviance explained: 89.6%

plot_gam_3d(model = mod_gam4_int, main_var = bbp, second_var = zz, palette='bilbao', direction = -1)

Figure 9 - 3D plot of a GAM taking into account the interaction between bbp (x-axis), zz (y-axis), aph and adet to explain the ratio of EalgalBrown/Red. This model explains 89.6% of the deviance.

plot_gam_3d(model = mod_gam4_int, main_var = bbp, second_var = aph, palette='bilbao', direction = -1)

Last 2 plots to comment.

What if we take zz into account in interactions?

mod_gam5_int <- gam(Ebed_BR_ratio ~ te(adet, aph, bbp, zz, bs='cr'), data=vars)
# Takes time, ~1min
summary(mod_gam5_int)

"Family: gaussian 
Link function: identity 

Formula:
Ebed_BR_ratio ~ te(adet, aph, bbp, zz, bs = "cr")

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 0.775153   0.000363    2135   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
                      edf Ref.df   F p-value    
te(adet,aph,bbp,zz) 149.9  174.5 486  <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  0.894   Deviance explained = 89.6%
GCV = 0.001338  Scale est. = 0.0013178  n = 10000"

Deviance explained: 89.6%

Not much difference between taking zz inside interaction terms? Worth keeping depth into account? What are the hypothesis behind these?

Conclusion - Model comparison

AIC(mod_gam_zz,mod_gam_bbp,mod_gam_adet,mod_gam_aph,mod_gam2,mod_gam4_int)

##                     df       AIC
## mod_gam_zz   10.732027 -26607.29
## mod_gam_bbp   6.779250 -16090.65
## mod_gam_adet  8.229365 -16986.86
## mod_gam_aph   5.598141 -16012.25
## mod_gam2     26.414673 -36355.46
## mod_gam4_int 54.990585 -37944.95

It seems like taking into account the interaction of adet, bbp and aph with depth outside (mod_gam4_int) of interaction (mod_gam5_int: zz inside interaction makes even higher deviance explained) in the GAM is what gives the highest proportion of the deviance explained. Not taking into account interaction between adet, bbp and aph perform similarly too, what difference in hypotheses?

We do see here some evidence that red algae do better than Brown in deeper zz and higher adet, bbp and aph conditions.

Summary plots of potential best model (mod_gam4_int)

p_bbpaph <- plot_gam_2d(model = mod_gam4_int, main_var = bbp, second_var = aph, palette='bilbao', direction = -1)

p_bbpzz <- plot_gam_2d(model = mod_gam4_int, main_var = zz, second_var = bbp, palette='bilbao', direction = -1)

p_bbpadet <- plot_gam_2d(model = mod_gam4_int, main_var = bbp, second_var = adet, palette='bilbao', direction = -1)

p_adetzz <- plot_gam_2d(model = mod_gam4_int, main_var = zz, second_var = adet, palette='bilbao', direction = -1)

p_adetaph <- plot_gam_2d(model = mod_gam4_int, main_var = aph, second_var = adet, palette='bilbao', direction = -1)

p_aphzz <- plot_gam_2d(model = mod_gam4_int, main_var = zz, second_var = aph, palette='bilbao', direction = -1)

g <- grid.arrange(arrangeGrob(p_bbpaph,p_bbpzz, ncol = 2),
                  arrangeGrob(p_bbpadet,p_adetzz, ncol = 2),
                  arrangeGrob(p_adetaph,p_aphzz, ncol = 2),
                  nrow = 3)

Figure 11 - 2D plots of a GAM taking into account the interaction between bbp, aph and adet + zz to explain the ratio of EalgalBrown/Red. This model explains 87.7% of the deviance.

Check GAM assumptions: concurvity

plot_gam_check(mod_gam4_int) #+ theme_light()#all predictors

Figure 12 - GAM model check to explain the ratio between Ebed_brown and Ebed_red using a GAM using multiple predictors (zz, adet, bbp, aph).

kable(concurvity(mod_gam4_int))

	te(adet,aph,bbp)	s(zz)
worst	0.0175937	0.0175937
observed	0.0005034	0.0136954
estimate	0.0013736	0.0123971

Observations:

Concurvity low (<< 1), not surprising given very small correlation between predictors. GAM seems worth doing
QQ plot doesnt look too bad?
Not much tail in density vs residuals plot?

What can be changed?

Look at distribution of satellite-derived aph, bbp and adet. To use observed values in simulated dataset? At least make sure we encompass these values.

bbp <- read_csv('C:/Users/thoralf/OneDrive - NIWA/Documents/Sat_Data/NETCDF_ORIGINALS/BBP/MO/BBP_RockyReefs_012003_122020.csv')
adet <- read_csv('C:/Users/thoralf/OneDrive - NIWA/Documents/Sat_Data/NETCDF_ORIGINALS/ADET/MO/ADET_RockyReefs_012003_122020.csv')

kpar_trend <- read_csv('C:/Users/thoralf/OneDrive - NIWA/Documents/Sat_Data/NETCDF_ORIGINALS/KPAR/MO/KPAR_Trends_NZ_RockyReefs_No_NA_STL_MK2.csv')  %>% 
  mutate(Significance = if_else(sens_pvalue<0.1,'Significant','Not Significant'),
         Sign = if_else(sens_slope>0,'Positive','Negative')) %>% 
  mutate(likelihood = (1 - sens_pvalue/2)*100,
         likelihood_cut = cut(likelihood,c(50, 66, 90, 95, 99, 100),
                              labels = c("About as likely as not", "Likely", "Very likely", "Extremely likely", "Virtually certain"),
                              include.lowest = T)) 

bbp_adet_kpartrend <- left_join(bbp,adet,by=c('x','y','Date','geometry')) %>% 
  left_join(.,kpar_trend,by=c('x','y')) %>% 
  drop_na() %>% 
  mutate(Date = as.yearmon(Date,"%b.%Y")) %>% 
  mutate(season_num=month(floor_date(as.Date(Date), unit="season")),
         season=recode_factor(season_num, `12` = "Summer", `3` = "Autumn", `6` = "Winter", `9` = "Spring"))

h <- bbp_adet_kpartrend %>%
  pivot_longer(c(bbp,adet),names_to = 'sat_prod', values_to = 'values') %>% 
  ggplot(.,aes(x=values,fill=season)) +
  geom_histogram() +
  facet_wrap(~sat_prod,scales = 'free') +
  theme_light()
h

Figure 13 - Histogram of satellite-derived BBP and ADET at rocky reefs around NZ.

Look at time series of aph, bbp and adet. Superimpose in figure 11 (gam surface function of bbp, aph, adet). Add colour of time and see if there are some interesting trajectories along ratio Ealgal gradients.

pix_highest_kpar_trends_bbpadet <- bbp_adet_kpartrend %>% 
  filter(likelihood_cut == 'Virtually certain') %>%
  arrange(desc(sens_slope)) %>% 
  mutate(season_num=month(floor_date(as.Date(Date), unit="season")),
         season=recode_factor(season_num, `12` = "Summer", `3` = "Autumn", `6` = "Winter", `9` = "Spring"))

pix_highest_kpar_trends_bbpadet_top1 <- pix_highest_kpar_trends_bbpadet[1:216,] 

p_bbpadet + geom_point(data= pix_highest_kpar_trends_bbpadet_top1,aes(x=bbp,y=adet,col=season))

Figure 14 - Ratio between Ebed_green and Ebed_brown as a function of its predictors bbp and adet. Points: satellite-derived BBP and ADET at rocky reefs around NZ (from 1 pixel, Jan 2003 to Dec 2020).

What about Brown vs Green?

vars_gam_GB <- vars %>% 
  dplyr::select(-c(Ebed_par,EbedAction_green,EbedAction_brown,EbedAction_red,ag440,tsm,chl)) %>% 
  pivot_longer(-Ebed_GB_ratio,values_to = "values",names_to = "variables")


pgam_GB <- ggplot(vars_gam_GB,aes(x=values,y=Ebed_GB_ratio)) + geom_point(color='#ff5500',alpha=.75) +
  geom_smooth(se=F, lwd=.5, color='#00aaff',method='gam') + facet_wrap(~variables, scales='free_x') +
  labs(x='') + theme_light()
pgam_GB

Figure 15 - Ratio between Ebed_green and Ebed_brown as a function of its predictors. The blue lines are fitted using a GAM using single predictor.

kable(do(group_by(vars_gam_GB,variables), glance(gam(Ebed_GB_ratio ~ s(values, bs='cr'), data = .))
)%>%select(variables,AIC),caption = 'Table 2 - AIC for GAMs using single predictor.')

Table 2 - AIC for GAMs using single predictor.
variables	AIC
adet	-32961.90
aph	-33357.97
bbp	-32711.32
Ebed_BR_ratio	-33839.74
Ebed_GR_ratio	-35153.14
n	-31601.14
zz	-34748.52

mod_gam_gb <- gam(Ebed_GB_ratio ~ te(adet, aph, bbp, bs='cr') + s(zz, bs="cr"), data=vars)
summary(mod_gam_gb)

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## Ebed_GB_ratio ~ te(adet, aph, bbp, bs = "cr") + s(zz, bs = "cr")
## 
## Parametric coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.9606260  0.0003097    3102   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                     edf Ref.df     F p-value    
## te(adet,aph,bbp) 44.219 52.798 168.4  <2e-16 ***
## s(zz)             7.099  7.964 885.6  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.614   Deviance explained = 61.6%
## GCV = 0.00096402  Scale est. = 0.00095897  n = 10000

p_bbpaph_gb <- plot_gam_2d(model = mod_gam_gb, main_var = bbp, second_var = aph, palette='bamako', direction = -1)

p_bbpzz_gb <- plot_gam_2d(model = mod_gam_gb, main_var = zz, second_var = bbp, palette='bamako', direction = -1)

p_bbpadet_gb <- plot_gam_2d(model = mod_gam_gb, main_var = bbp, second_var = adet, palette='bamako', direction = -1)

p_adetzz_gb <- plot_gam_2d(model = mod_gam_gb, main_var = zz, second_var = adet, palette='bamako', direction = -1)

p_adetaph_gb <- plot_gam_2d(model = mod_gam_gb, main_var = aph, second_var = adet, palette='bamako', direction = -1)

p_aphzz_gb <- plot_gam_2d(model = mod_gam_gb, main_var = zz, second_var = aph, palette='bamako', direction = -1)

g <- grid.arrange(arrangeGrob(p_bbpaph_gb,p_bbpzz_gb, ncol = 2),
                  arrangeGrob(p_bbpadet_gb,p_adetzz_gb, ncol = 2),
                  arrangeGrob(p_adetaph_gb,p_aphzz_gb, ncol = 2),
                  nrow = 3)

Figure 16 - Prediction of the ratio between Ebed_green and Ebed_brown using a GAM using multiple predictors (zz, adet, bbp, aph). This model explains 61.6% of the deviance.

What about Green vs Red?

vars_gam_GR <- vars %>% 
  dplyr::select(-c(Ebed_par,EbedAction_green,EbedAction_brown,EbedAction_red,ag440,tsm,chl)) %>% 
  pivot_longer(-Ebed_GR_ratio,values_to = "values",names_to = "variables")


pgam_GR <- ggplot(vars_gam_GR,aes(x=values,y=Ebed_GR_ratio)) + geom_point(color='#ff5500',alpha=.75) +
  geom_smooth(se=F, lwd=.5, color='#00aaff',method='gam') + facet_wrap(~variables, scales='free_x') +
  labs(x='') + theme_light()
pgam_GR

Figure 17 - Ratio between Ebed_green and Ebed_red as a function of its predictors. The blue lines are fitted using a GAM using single predictor.

kable(do(group_by(vars_gam_GR,variables), glance(gam(Ebed_GR_ratio ~ s(values, bs='cr'), data = .))
)%>%select(variables,AIC),caption = 'Table 3 - AIC for GAMs using single predictor.')

Table 3 - AIC for GAMs using single predictor.
variables	AIC
adet	-13545.75
aph	-13032.69
bbp	-13049.46
Ebed_BR_ratio	-40138.83
Ebed_GB_ratio	-17319.79
n	-12869.19
zz	-27652.25

mod_gam_gr <- gam(Ebed_GR_ratio ~ te(adet, aph, bbp, bs='cr') + s(zz, bs="cr"), data=vars)
summary(mod_gam_gr)

## 
## Family: gaussian 
## Link function: identity 
## 
## Formula:
## Ebed_GR_ratio ~ te(adet, aph, bbp, bs = "cr") + s(zz, bs = "cr")
## 
## Parametric coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.746255   0.000449    1662   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##                     edf Ref.df      F p-value    
## te(adet,aph,bbp) 53.985 60.015  138.1  <2e-16 ***
## s(zz)             8.933  8.998 6804.0  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.875   Deviance explained = 87.6%
## GCV = 0.0020292  Scale est. = 0.0020162  n = 10000

p_bbpaph_gr <- plot_gam_2d(model = mod_gam_gr, main_var = bbp, second_var = aph, palette='tokyo', direction = 1)

p_bbpzz_gr <- plot_gam_2d(model = mod_gam_gr, main_var = zz, second_var = bbp, palette='tokyo', direction = 1)

p_bbpadet_gr <- plot_gam_2d(model = mod_gam_gr, main_var = bbp, second_var = adet, palette='tokyo', direction = 1)

p_adetzz_gr <- plot_gam_2d(model = mod_gam_gr, main_var = zz, second_var = adet, palette='tokyo', direction = 1)

p_adetaph_gr <- plot_gam_2d(model = mod_gam_gr, main_var = aph, second_var = adet, palette='tokyo', direction = 1)

p_aphzz_gr <- plot_gam_2d(model = mod_gam_gr, main_var = zz, second_var = aph, palette='tokyo', direction = 1)

g <- grid.arrange(arrangeGrob(p_bbpaph_gr,p_bbpzz_gr, ncol = 2),
                  arrangeGrob(p_bbpadet_gr,p_adetzz_gr, ncol = 2),
                  arrangeGrob(p_adetaph_gr,p_aphzz_gr, ncol = 2),
                  nrow = 3)

$Figure 18 - Prediction of the ratio between Ebed_green and Ebed_red using a GAM using multiple predictors (zz, adet, bbp, aph). This model explains 61.6% of the deviance.$

Figure 18 - Prediction of the ratio between Ebed_green and Ebed_red using a GAM using multiple predictors (zz, adet, bbp, aph). This model explains 61.6% of the deviance.

All GAM

g <- grid.arrange(arrangeGrob(p_bbpaph,p_bbpaph_gb,p_bbpaph_gr, ncol = 3),
                  arrangeGrob(p_bbpzz,p_bbpzz_gb,p_bbpzz_gr, ncol = 3),
                  arrangeGrob(p_bbpadet,p_bbpadet_gb,p_bbpadet_gr, ncol = 3),
                  arrangeGrob(p_adetzz,p_adetzz_gb,p_adetzz_gr, ncol = 3),
                  arrangeGrob(p_aphzz,p_aphzz_gb,p_aphzz_gr, ncol = 3),
                  arrangeGrob(p_adetaph,p_adetaph_gb,p_adetaph_gr, ncol = 3),
                  nrow = 6)

Figure 19 - Prediction of the ratio between Ebed_brown, Ebed_green and Ebed_red using a GAM using multiple predictors (zz, adet, bbp, aph). These models explain respectively X%, X% and 61.6% of the deviance.

GAM_Synth_RGBratio

Francois Thoral

10/03/2022

Table of contents

Synthetic data

GAM - Single predictor

AIC of single predictor model

z

adet

bbp

aph

GAM - Multiple predictors, no interaction

All predictors

GAM - Multiple predictors + interactions

Conclusion - Model comparison

Summary plots of potential best model (mod_gam4_int)

Check GAM assumptions: concurvity

What can be changed?

What about Brown vs Green?

What about Green vs Red?

All GAM