ANALYTICAL APPROACH

Spring conditions are considered here as a proxy for climate change. These spring conditions were summarised via a PCA to avoid collinearity among the different environmental variables.

The PCA results are available in the script METEOROLIGCAL_DATA.R (https://rpubs.com/paquittet/1390293).


Spring



Analytical approach:




INITIALISATION

# PACKAGES
library(tidyverse)
library(splines)  # for function bf() to splined emergence date
library(kableExtra)  # for nice table
library(ggtext)  # better annotation on gpplot2
library(patchwork)  # display multiple ggplots

# WD
dir = "C:/Users/quittet/Documents/marmot-quantitative-genetics"

# LOAD DATA
load("~/marmot-quantitative-genetics/1_DATA_ANALYSIS/3_Data/pheno_abiotic.rds")

load("C:/Users/quittet/Documents/marmot-quantitative-genetics/1_DATA_ANALYSIS/3_Data/df_abiotic_short.rds")



TEMPORAL VARIATION IN SPRING CONDITIONS

plot_index <- function(data, y, ylab, title){
  
  # GET OVERALL MEAN
  overall_mean <- mean(data[[y]], na.rm = T)
  data_temp <- data |>
    mutate(anomalie = if_else(data[[y]] <= overall_mean, "#C44E52", "#4C72B0"))
  
  # PLOT
plot <- data_temp |>
    ggplot(aes(
      x = as.numeric(as.character(year)),
      y = !!(sym(y)),
      group = 1
    )) +
  geom_hline(yintercept = overall_mean, linetype = 2, size = 1) +
     geom_point(size = 3) +
    geom_point(color = data_temp$anomalie, size = 2) +
    geom_smooth(method = "lm",
              color = "#FFF5EE",
              fill = "#383838",
              linewidth = 1,
              alpha = 0.5,
              se = T) +
  theme_classic(base_size = 20) +
  theme(legend.position = "none",
          panel.grid = element_line(color = "#CFCFCF",
                                    size = 0.2,
                                    linetype = 2)) +
  ylab(ylab) +
  theme(legend.position = "none", axis.title.x = element_blank())
  return(plot)
}
# RENAME DF WITH CLIMATIC VARIABLES
df_clim <- df_abiotic_short
df_abiotic_short$spring_index_1 <- df_abiotic_short$spring_index_1

# CLIMATIC VARIABLES
ClimVars <- c(
  "spring_index_1",
  "spring_index_2"
)

df_clim <- df_clim[ClimVars]

# SCALE
df_clim <- as.data.frame(apply(df_clim, 2, scale))
df_clim$year <- df_abiotic_short$year

# PLOT
# AXIS 1
plot1 <- plot_index(data = df_clim, y = "spring_index_1", ylab = "Aridity and temperature", title = "Spring conditions")

# AXIS 2
plot2 <- plot_index(data = df_clim, y = "spring_index_2", ylab = "NDVI", title = "")

# COMBINE PLOTS
(plot1 | plot2)  # patchwork syntax



DE-TRENDED VARIABLES

References: Grosbois et al. (2008).

The idea is to remove the temporal trend from a climatic variable \(x\). This avoids correlations arising from confounding effects between the time effect of \(y\) and the time effect of \(x\).

The de-trended variable \(x_{d,i}\) at each time point \(i\) is defined as:

\[ x_{d,i} = x_i - (\alpha+\beta t_i) \]

where \(\alpha\) and \(\beta\) are the regression parameters of the covariate \(x\) against the linear trend \(t\). The de-trended variable \(x_{d,i}\) is essentially the residuals from the regression of the model above. Note that quadratic effects can also be tested for de-trending.

The model is then built using our response variable with \(x_d\) instead of \(x\), while retaining the time effect \(t\):

\[ y = a + b_1t+b_2x_{d} \]

The de-trended variable is now interpreted as the temporal variation of \(x\) around the linear trend.


Building the de-trended variables

The spring variables are therefore de-trended by constructing the variable .<variable>_dt.

# GET SCALED CLIMATIC VARIABLES
# VARIABLES TO DETREND
to_detrend_temp <- names(models_best)[models_best != "constant"]

# CONSTRUCT DETREND VARIABLES
for(i in to_detrend_temp) {
  df_res_temp <- models_all[[i]][["residuals"]]
  index_temp <- match(pheno_abiotic$annee_capture, df_res_temp$time)
  pheno_abiotic[[paste0(i, "_dt")]] <- df_res_temp$res[index_temp]
}


Residuals analysis

If de-trending has worked correctly, no pattern should be visible in the residuals. The residuals of each variable that showed a significant linear trend are plotted below:

The trends have been successfully removed!



MODEL SELECTION

We select the fixed part of the model with a frequentist approach.


Extract best model per trait

Data preparation

# FILTER DATA
pheno_abiotic_mod <- pheno_abiotic |>
  filter(
    !is.na(age_annee) &
      !is.na(annee_emergence) &
      !is.na(litter_size_naissance)  &
      !is.na(age_d) &
      !is.na(mother) &
      age_annee == 0 &
      age_d == 0 &
      annee_emergence %in% 1990:2025 &
      masse < 800
  ) |>
  arrange(age_d) |>
  filter(!duplicated(individual_id)) |>
  mutate(
    annee_emergence_num = as.numeric(as.character(annee_emergence))
  )

nrow(pheno_abiotic_mod)

# REMOVE NAs
vars_model <- c(
  # CONTROL VARIABLE
  "annee_emergence_num",
  "tibia",
  "sexe",
  "inbreeding",
  "capture_time",
  
  # SOCIAL
  "litter_size_naissance",
  "taille_groupe_naissance",
  
  # CLIMATIC
  "spring_index_1_dt",
  "spring_index_2_dt"
)

pheno_abiotic_mod <- pheno_abiotic_mod[complete.cases(pheno_abiotic_mod[, vars_model]), ]


# SCALE QUANTITATIVE PREDICTORS
for(i in vars_model[!vars_model %in% c("tibia", "sexe", "presence_male_sub", "winter_index_1_y", "winter_index_2_y", "spring_index_1_dt", "spring_index_2_dt", "summer_index_1_dt", "summer_index_2_dt")]) {
  pheno_abiotic_mod[[i]] <- as.numeric(scale(pheno_abiotic_mod[[i]]))
}


Fixed effects selection with MuMIn::dredge()

We test the fixed part of the model with constant random effects: mother. We set the argument REML = F to use ML methods for model selection.


Mass

# GET FULL MODEL
full_model_mass <- lmerTest::lmer(
  masse ~
    # CONTROL
    sexe + 
    capture_time +
    annee_emergence_num +
    
    # SOCIAL
    litter_size_naissance +
    taille_groupe_naissance +
    
    # CLIMATIC
    spring_index_1_dt +
    spring_index_2_dt +
    
    # RANDOM
    (1|mother),
  
  REML = F,
  data = pheno_abiotic_mod
)

# MODEL SELECTION
options(na.action = "na.fail")  # required
model_selection_masse <- MuMIn::dredge(
  full_model_mass,
  evaluate = TRUE, 
  fixed = c(
    "annee_emergence_num",
    "capture_time"
  )  #' ... note: fixed argument forces variables to be included in all models
  )


The list of all models with \(\Delta AICc < 2\) is shown below:
Model selection for body mass (models with ΔAICc ≤ 2)
Fixed part
Spring
Model performance
\(\mu\) year Capture time Litter size Group size Sex Sp. index1 Sp. index2 AICc df \(\Delta\) w
378.24 -2.5 30.13 -46.82 -14.3 + 9.87 31.34 16830.01 10 0 1
385.63 -2.06 29.57 -46.75 -14.34 - 9.78 31.47 16841.31 9 11.3 0
377.46 -4.72 29.53 -47.47 -15.55 + - 35.72 16845.98 9 15.96 0
384.75 -4.27 28.98 -47.4 -15.57 - - 35.81 16856.78 8 26.77 0
381.58 -11.08 29.35 -45.67 - + 11.64 29.18 16864.81 9 34.8 0
NOTE: Bold correspond to the best parsimonious model, + means factor has been added


The best model (\(\Delta AICc < 2\) & with the fewest parameters) is therefore:

## Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
## Formula: masse ~ annee_emergence_num + capture_time + sexe + litter_size_naissance +      taille_groupe_naissance + spring_index_1_dt + spring_index_2_dt +      (1 | mother)
##    Data: pheno_abiotic_mod
## 
## REML criterion at convergence: 16776.3
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.9910 -0.5084 -0.0301  0.4782  3.9709 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  mother   (Intercept) 5234     72.35   
##  Residual             4566     67.57   
## Number of obs: 1464, groups:  mother, 136
## 
## Fixed effects:
##                         Estimate Std. Error       df t value Pr(>|t|)    
## (Intercept)              378.215      6.914  144.478  54.705  < 2e-16 ***
## annee_emergence_num       -2.418      5.392  250.273  -0.449 0.654177    
## capture_time              30.174      2.545 1451.806  11.857  < 2e-16 ***
## sexem                     13.544      3.708 1335.887   3.653 0.000269 ***
## litter_size_naissance    -46.826      2.118 1415.090 -22.104  < 2e-16 ***
## taille_groupe_naissance  -14.314      2.344 1449.958  -6.106 1.31e-09 ***
## spring_index_1_dt          9.864      2.325 1412.954   4.242 2.36e-05 ***
## spring_index_2_dt         31.362      3.645 1434.613   8.605  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) ann_m_ cptr_t sexem  lttr__ tll_g_ sp__1_
## ann_mrgnc_n -0.041                                          
## capture_tim -0.047  0.051                                   
## sexem       -0.291 -0.027  0.056                            
## lttr_sz_nss  0.038  0.003 -0.013 -0.007                     
## tll_grp_nss  0.077 -0.262 -0.039  0.006  0.086              
## sprng_nd_1_  0.026  0.099  0.058  0.011  0.072  0.125       
## sprng_nd_2_  0.000  0.096  0.083 -0.011 -0.070 -0.095 -0.282



Tibia

# REMOVE NA IN TIBIA
pheno_abiotic_mod <- pheno_abiotic_mod[complete.cases(pheno_abiotic_mod[, "tibia"]), ]

# GET FULL MODEL
full_model_tibia <- lmerTest::lmer(
  tibia ~
    # CONTROL
    sexe + 
    capture_time +
    splines::bs(annee_emergence_num) +
    
    # SOCIAL
    litter_size_naissance +
    taille_groupe_naissance +
    
    # CLIMATIC
    spring_index_1_dt +
    spring_index_2_dt +
    
    # RANDOM
    (1|mother),
  
  REML = F,
  data = pheno_abiotic_mod
)

# MODEL SELECTION
options(na.action = "na.fail")  # required

model_selection_tibia <- MuMIn::dredge(
  full_model_tibia,
  fixed = c(
    "capture_time",
    "sexe"
  )  #' ... note: fixed argument forces variables to be included in all models
  ) 

options(na.action = "na.omit")  # reset immediately after


The list of all models with \(\Delta AICc < 2\) is shown below:

Model selection for tibia length (models with ΔAICc ≤ 2)
Fixed part
Spring
Model performance
\(\mu\) year Capture time Litter size Group size Sex Sp. index1 Sp. index2 AICc df \(\Delta\) w
62.54 + 0.91 -1.43 -0.77 + 0.29 1.21 7964.88 12 0 0.89
62.42 + 0.9 -1.44 -0.8 + - 1.34 7968.98 11 4.09 0.11
63.74 + 0.88 -1.41 -0.72 + 0.51 - 8005.33 11 40.45 0
63.26 + 0.87 -1.36 - + 0.38 1.11 8009.07 11 44.19 0
63.14 + 0.85 -1.38 - + - 1.28 8017.32 10 52.44 0
NOTE: Bold correspond to the best parsimonious model, + means factor has been added


The best model (\(\Delta AICc < 2\) & with the fewest parameters) is therefore:

## Linear mixed model fit by REML. t-tests use Satterthwaite's method ['lmerModLmerTest']
## Formula: formula_tibia
##    Data: pheno_abiotic_mod
## 
## REML criterion at convergence: 7948.7
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.9573 -0.5345  0.0422  0.6115  3.2537 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  mother   (Intercept)  7.987   2.826   
##  Residual             11.101   3.332   
## Number of obs: 1464, groups:  mother, 136
## 
## Fixed effects:
##                                    Estimate Std. Error        df t value Pr(>|t|)    
## (Intercept)                         62.5505     0.7781  376.4312  80.385  < 2e-16 ***
## sexem                                1.1098     0.1825 1346.9134   6.081 1.56e-09 ***
## capture_time                         0.9173     0.1247 1390.0267   7.358 3.17e-13 ***
## splines::bs(annee_emergence_num)1  -27.2211     1.7664 1117.5394 -15.411  < 2e-16 ***
## splines::bs(annee_emergence_num)2   -2.7665     1.2501  450.6351  -2.213   0.0274 *  
## splines::bs(annee_emergence_num)3  -10.6871     0.9876  387.2161 -10.822  < 2e-16 ***
## litter_size_naissance               -1.4267     0.1045 1443.8106 -13.653  < 2e-16 ***
## taille_groupe_naissance             -0.7718     0.1130 1431.7656  -6.829 1.26e-11 ***
## spring_index_1_dt                    0.2866     0.1160 1413.7753   2.471   0.0136 *  
## spring_index_2_dt                    1.2057     0.1842 1434.2837   6.546 8.21e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) sexem  cptr_t s::(__)1 s::(__)2 s::(__)3 lttr__ tll_g_ sp__1_
## sexem       -0.106                                                              
## capture_tim  0.004  0.048                                                       
## spln::(__)1 -0.714  0.004 -0.144                                                
## spln::(__)2 -0.415 -0.043  0.159 -0.122                                         
## spln::(__)3 -0.819 -0.013 -0.045  0.676    0.139                                
## lttr_sz_nss -0.069 -0.005 -0.036  0.146   -0.052    0.085                       
## tll_grp_nss  0.136  0.005 -0.051  0.012   -0.105   -0.158    0.089              
## sprng_nd_1_  0.062  0.016  0.050 -0.084   -0.092    0.058    0.056  0.112       
## sprng_nd_2_ -0.236 -0.009  0.037  0.284    0.010    0.228   -0.028 -0.075 -0.294





MODEL DIAGNOSTICS

Residuals diagnostics



Mass

We observe a slight heteroscedasticity with increased residual values with increased predicted mass.


Tibia



Multicollinearity

We check the collinearity of the predictors using the VIF score with the car::vif() function.


Mass

##                         vif_score
## annee_emergence_num      1.111712
## capture_time             1.019790
## sexe                     1.004288
## litter_size_naissance    1.014271
## taille_groupe_naissance  1.109748
## spring_index_1_dt        1.138766
## spring_index_2_dt        1.119412


Tibia

##                                  vif_score.GVIF vif_score.Df vif_score.GVIF..1..2.Df..
## sexe                                   1.005317            1                  1.002655
## capture_time                           1.059915            1                  1.029522
## splines::bs(annee_emergence_num)       1.325691            3                  1.048110
## litter_size_naissance                  1.036649            1                  1.018160
## taille_groupe_naissance                1.087783            1                  1.042968
## spring_index_1_dt                      1.190632            1                  1.091161
## spring_index_2_dt                      1.208884            1                  1.099492

All is good!



Goodness of fit

Using the performance package, we compute the marginal \(R^2\) (variance explained by fixed effects) and the conditional \(R^2\) (fixed + random effects).


Mass

performance::r2(best_model_masse)
## # R2 for Mixed Models
## 
##   Conditional R2: 0.664
##      Marginal R2: 0.280


Tibia

performance::r2(best_model_tibia)
## # R2 for Mixed Models
## 
##   Conditional R2: 0.644
##      Marginal R2: 0.388



ANODEV

The deviance analysis is a statistical approach that quantifies the contribution of climatic variables to the temporal variation of a trait. The idea is to compare a baseline model including the climatic variables as covariates, a constant model, and a full time-dependent model (i.e. year as a factor). Concretely, each of these 3 models is fitted and the likelihood is extracted. The \(F\) statistics test the null hypothesis that the climatic variables have no effect on the trait. In addition, the ANODEV \(R^2\) quantifies the proportion of temporal variation in the trait explained by each climatic variable.

The ANODEV statistic is:

\[ F = \frac{(D_0 - D_c)/(df_0 - df_c)}{(D_c - D_t)/(df_c - df_t)} \]

Where:

and

\[ R^2_{dev} = \frac{D_0 - D_c}{D_0 - D_t} \]


The function code is available below by clicking the Show button ->

ANODEV <- function(data,
                   response,
                   fixed_effects,
                   random_effects,
                   clim_vars,
                   time_var) {
  
  # STORE ANODEV RESULTS
  results <- data.frame(
    ClimVar = character(),
    Dev = numeric(),
    R2dev = numeric(),
    F_stat = numeric(),
    p = numeric(),
    n = numeric(),
    J = numeric(),
    stringsAsFactors = FALSE
  )
  
  # STORE FULL TIME & CONSTANT MODEL DEVIANCES
  results_dev <- data.frame(
    model = character(),
    dev = numeric(),
    k = numeric(),
    stringsAsFactors = FALSE
  )
  
  # CONSTANT MODEL (Fcst)
  form_const <- as.formula(
    paste(response, "~",
          paste(fixed_effects, collapse = " + "),
          "+", paste(random_effects, collapse = " + "))
  )
  
  m0 <- lmerTest::lmer(form_const,
                       data = data,
                       REML = FALSE)
  
  
  # ANODEV PER CLIMATIC VARIABLE
  for (clim in clim_vars) {
    
    # CLIMATIC MODEL (Fco)
    form_clim <- as.formula(
      paste(response, "~",
            paste(fixed_effects, collapse = " + "),
            "+", clim,
            "+", paste(random_effects, collapse = " + "))
    )
    
    mc <- lmerTest::lmer(form_clim, data = data, REML = FALSE)
    
    
    # TIME MODEL (Ft)
    form_time <- as.formula(
      paste(response, "~",
            paste(fixed_effects, collapse = " + "),
            "+ as.factor(", time_var, ")",
            "+", paste(random_effects, collapse = " + "))
    )
    
    mt <- lmerTest::lmer(form_time, data = data, REML = FALSE)
    
    # Deviances
    D0 <- deviance(m0)
    Dc <- deviance(mc)
    Dt <- deviance(mt)
    
    # Df
    df0 <- attr(x = logLik(m0), which = "df")
    dfc <- attr(x = logLik(mc), which = "df")
    dft <- attr(x = logLik(mt), which = "df")
    
    # Skalski parameters
    J <- dfc - df0 
    n <- dft - df0 
    
    # F statistic
    F_stat <- ((D0 - Dc)/(J)) /
      ((Dc - Dt)/(n - J))
    
    # p-value
    p_val <- pf(F_stat,
                df1 = J,
                df2 = n - J,
                lower.tail = FALSE)
    
    # R2 dev
    R2dev <- (D0 - Dc)/(D0 - Dt)
    
    
    # Store results
    results <- rbind(results, data.frame(
      ClimVar = clim,
      Dev = Dc,
      R2dev = round(R2dev, 2),
      F_stat = round(F_stat, 2),
      p = round(p_val, 3),
      n = n,
      J = J
    ))
    
  }
  
  
  # MODEL WITH ALL CLIMATIC VARIABLES
  clim_all <- paste(clim_vars, collapse = " + ")
  
  form_clim_all <- as.formula(
    paste(response, "~",
          paste(fixed_effects, collapse = " + "),
          "+", clim_all,
          "+", paste(random_effects, collapse = " + "))
  )
  
  mc_all <- lme4::lmer(form_clim_all, data = data, REML = FALSE)
  
  
  # FULL TIME-DEPENDENT MODEL
  form_time_all <- as.formula(
    paste(response, "~",
          paste(fixed_effects, collapse = " + "),
          "+ as.factor(", time_var, ")",
          "+", paste(random_effects, collapse = " + "))
  )
  
  mt_all <- lme4::lmer(form_time_all, data = data, REML = FALSE)
  
  # Deviances
  D0 <- deviance(m0)
  Dc <- deviance(mc_all)
  Dt <- deviance(mt_all)
  
  # Df
  df0 <- attr(x = logLik(m0), which = "df")
  dfc <- attr(x = logLik(mc_all), which = "df")
  dft <- attr(x = logLik(mt_all), which = "df")
  
  
  # Skalski parameters
  J <- dfc - df0 
  n <- dft - df0 
  
  
  # Total F
  F_stat <- ((D0 - Dc)/(J)) /
    ((Dc - Dt)/(n - J))
  
  
  # Total p
  p_val <- pf(F_stat,
              df1 = J,
              df2 = n - J,
              lower.tail = FALSE)
  
  
  # Total R2
  R2dev <- (D0 - Dc)/(D0 - Dt)
  
  
  # Add TOTAL row
  results <- rbind(results, data.frame(
    ClimVar = "TOTAL",
    Dev = Dc,
    R2dev = round(R2dev, 2),
    F_stat = round(F_stat, 2),
    p = round(p_val, 3),
    n = n,
    J = J
  ))
  
  # MODELS
  results_dev[1:3, "model"] <- c("D0", "Dc", "Dt")
  results_dev[1:3, "dev"] <- c(D0, Dc, Dt)
  results_dev[1:3, "k"] <- c(df0, dfc, dft)
  
  
  return(list(results = results, results_dev = results_dev))
}



ANODEV : Masse

ANODEV MASS
Climatic variable Dev \(R^2\) \(F_{cst/co/t}\) p n J
spring aridity 16881.95 0.12 3.44 0.075 26 1
spring NDVI 16827.85 0.27 9.04 0.006 26 1
TOTAL 16809.86 0.31 5.48 0.011 26 2


ANODEV : Tibia

ANODEV TIBIA LENGTH
Climatic variable Dev \(R^2\) \(F_{cst/co/t}\) p n J
spring aridity 7983.147 0.07 1.60 0.219 24 1
spring NDVI 7946.795 0.18 5.06 0.034 24 1
TOTAL 7940.667 0.20 2.75 0.086 24 2



Table X. ANODEV results for mass and tibia length. \(R^2_{dev}\) is the proportion of deviance explained by the climatic variable. \(F_{cst/co/t}\) tests whether a covariate or temporal model best fits the data.
Trait Climatic variable Dev \(R^2_{dev}\) \(F_{cst/co/t}\) p n J
Mass spring aridity 16881.949 0.12 3.44 0.075 26 1
spring NDVI 16827.852 0.27 9.04 0.006 26 1
Tibia length spring aridity 7983.147 0.07 1.60 0.219 24 1
spring NDVI 7946.795 0.18 5.06 0.034 24 1