R-scripts and files used to generate this report are available at: https://github.com/Bioshifts/MethodologicalAdjustment
Code demonstrating the entire processes is bellow, including data processing and decisions made along the way. Note that code is hidden by default – considering that most would be interested in the outputs themselves. However, if you wish to see code you can expand all code chunks by clicking at the button at the top right corner of this page, and selecting “Show All Code”. Alternatively, you may want to expand code chunks case-by-case by clicking at the button code at the beginning of each code chunk.
The bioshifts v1 data with corrected range shifts using the methods described bellow can be found at the group’s google drive under: “BIOSHIFTS Working Group/DATA/Bioshifts_v1/biov1_method_corrected_shifts.csv”
Previous explorations of the Bioshifts database have identified immense influence of methods on the observed range shift values. This suggests that methods can affect the potential of range shift detection. Here we propose a way for dealing with methodological variation, with the ultimate goal to generate range shift measurements that are free from this variation and so comparable across studies.
Our approach consisted of fitting a class- and study-level linear mixed effect model, which we use to correct observed range shift values (Figure 1).
Pre-processing
Before modeling, we removed studies with low-species number ( e.g. <
3 ) and variables that are co-linear with climate velocity (e.g. start
year and duration). We do not include parameter (leading edge, trailing
edge, centroid) because methods do not differ across range position
(i.e., assume effect of method is the same across range position). It’s
important to note that the specific methodological variables considered
may vary depending on the version or the Bioshifts database being
studied.
We expect methods affect the magnitude of observed range shifts, and that they have a lesser influence on the direction of shift (a detected southwards range shift that is actually a northward shift should be rare). Therefore, we fitted models using absolute shift values.
1) Fixed effect structure of the models:
We fitted separate models based on the gradient (latitude or elevation). We do not assign any particular weights to the data during this step. The fixed effect structure of these models without weights is as follows:
\[ StudyLevelShift \sim NtemporalUnits + Extent + GrainSize + PrAb + Quality \]
NtemporalUnits represents the number of sampling years. Extent represents the total latitudinal extent of the study area for latitudinal shifts, or total elevation range at the study area for elevation shifts. GrainSize is an ordinal variable representing the spatial resolution of the data used to generate range shift estimates. Levels of the variable GrainSize are: small (< 1km2), moderate (> 1km2 and < 10km2), large (> 10 km2), and very large (> 100 km2). PrAb is a two levels variable representing whether the observation is of a shift in occupancy or a shift in abundance. Quality represents whether how the uncertainty coming from the geolocalisation of the sites was taken into account. Levels of the variable quality are raw (the raw data were used without attempting to account for spatial/temporal biases), resurveyed (the same sites were used through time), and balanced (a method was used to account for sampling biases).
Two different statistical models are used at this step:
2) Gamma vs Gaussian models:
Gamma Model: This model uses the
glmmTMB
package and is fitted with a Gamma distribution and
a log link function. The Gamma distribution is suitable for
positive-valued data, and the log link function transforms the response
variable.
Gaussian Model: This model uses both the
glmmTMB
package and the nlme
package and is
fitted with a Gaussian distribution and identity link. The response
variable in this case is the log-transformed absolute shifts at the
study-level. We fitted Gaussian models with the two packages because the
nlme
packages allows greater flexibility in terms of
weighting structure (see bellow).
3) Random effect structure: Random intercept vs random
slope
We included random effect of Class on the intercept because methods
might differ across classes. We also constructed a model with a random
slope to account for potential variability between different taxonomic
Classes:
Random Intercept only (1|Class):
This model considers Class as a random intercept. It means that the
effect of Class (grouping of similar entities) is allowed to vary
randomly across the data, accounting for the differences between the
taxonomic classes.
Random Intercept and Random Slope
(Extent|Class):
This model considers Extent (e.g., extent of geographical area covered
in a study) as a random slope within each Class. It means that the
effect of Extent can vary randomly across the data, and it can differ
between the different taxonomic classes.
We selected the random structure in model with the lowest AIC, and a weight structure is applied to the selected model.
4) Adding Weights to the Model:
We explore multiple sources of variability in data using weights, given that study-level shifts differ in variability. In this step, we contrast different weight schemes. Overall, these models aim to identify the most appropriate statistical approach and weighting scheme for analyzing the data on range shifts and understanding the factors that influence these shifts.
Weighting Precision Around Mean: The weighting parameter used is the inverse of the study-level shift variability. This means that observations with more precise or less variable shift measurements are given higher weights during model fitting.
\[ weights = \frac{1}{shiftVar} \]
Weighting Taxa Diversity: The weighting is based on taxa diversity, meaning that species richness within a particular class and study is taken into account during model fitting, and observations with more species having higher weights during model fitting.
\[ weights = NtaxaClass \]
For both Precision Around Mean and Taxa Diversity, four variance functions (constant, fixed, power, and exponential) were applied and contrasted to determine the most suitable one based on AIC and inspection of residuals.
Weighting Two Sources of Variability: Here, two sources of variability are combined as weights in the model. The first source is Taxa Diversity, and the second source is the number of temporal units (e.g., time periods) used in the analysis.
\[ weights = varComb(varExp(NtaxaClass), varExp(NtemporalUnits)) \]
5) Applying correction to species-level range shifts:
We used predicted shift values obtained from the study-level models. These represent the expected range shift given the set of methodological conditions based on the analyses. To better understand and interpret these predicted values, we standardized them to a reference level. This reference level represents ideal methodological conditions, which can serve as a benchmark for comparison.
Two different methods were used for standardization:
𝚫Difference: In this method, we calculated the difference between predicted study-level shift and the reference value. By subtracting the reference value from the predicted study-level shift, we obtain the standardized value (Δ), which tells us how much the observed shift deviates from the ideal condition.
\[ 𝚫Difference = PredictedStudyLevel - PredictedIdeal \]
𝚫Ratio: In the 𝚫Ratio, we calculated the ratio between the predicted study-level shift and the reference value. Dividing the predicted study-level shift by the reference value gives us a ratio that of the observed shift compared to the ideal condition. If the standardized values are close to 1, it indicates that the model’s predictions are similar to the ideal conditions.
\[ 𝚫Ratio = \frac{PredictedStudyLevel}{PredictedIdeal} \]
Finally, we calculate corrected species-level shift using the two methods as:
\[ CorrShiftDiff = ObservedShift - 𝚫Difference \]
\[ CorrShiftRatio = ObservedShift * 𝚫Ratio \]
After applying correction to species range shifts using the 𝚫Difference method, predicted absolute values were back-transformed to negative values when appropriate.
library(tidyverse)
library(nlme)
library(car)
library(googledrive)
library(rgdal)
library(pbapply)
library(DHARMa)
library(lme4)
library(statmod)
library(corrplot)
library(glmmTMB)
library(knitr)
library(DT)
library(ade4)
library(FD)
library(plotly)
library(dbscan)
library(ggpubr)
library(R.utils)
rm(list=ls())
theme_set(theme_bw())
# function to find the mode of each predictor
getmode <- function(v) {
uniqv <- unique(v)
uniqv[which.max(tabulate(match(v, uniqv)))]
}
# Range values between 0 and 1
range01 <- function(x){(x-min(x))/(max(x)-min(x))}
# Create a table with AIC and R2 from a model list and vector with model names
source("scripts/aic_r2_table.R")
# Predict fixed effect
source("scripts/predict_fixed.R")
if(!file.exists(here::here("data/bioshiftsv3_metadata.csv"))){
# # From Google drive
# drive_path <- readWindowsShortcut("G:/My Drive.lnk")$pathname
# bioshifts_path <- readWindowsShortcut(paste0(drive_path,"\\BIOSHIFTS Working Group.lnk"))$pathname
# shp_path <- here::here(bioshifts_path,("DATA/BIOSHIFTS_v3/CleanDatabasev3/ShapefilesBioShiftsv3"))
# all_shps <- list.files(shp_path,pattern = "dbf", full.names = TRUE)
# From local folder
all_shps <- list.files("C:/Users/brunn/ShadowDrive/CreateGeodatabaseBioShifts/Data/ShapefilesBioShiftsv3",
pattern = "shp", full.names = TRUE)
rs_meta <- pblapply(all_shps, terra::vect)
rs_meta <- pblapply(rs_meta, data.frame)
rs_meta <- data.table::rbindlist(rs_meta)
write.csv(rs_meta, here::here("data/bioshiftsv3_metadata.csv"),row.names = FALSE)
} else {
rs_meta <- read.csv("data/bioshiftsv3_metadata.csv")
}
# Read in data
rs <- read.delim("data/Shifts2018_checkedtaxo.txt")
rs <- rs[-which(rs$Type %in% c("BAT","LON")),]
rs$Type[rs$Type == "HOR"] <- "LAT"
length(unique(rs$ID)) # 322
## [1] 322
# add geospatial data
rs_meta$ID <- rs_meta$Name
# check if all IDs in v1 exist in geospatial data
all(rs$ID %in% rs_meta$ID) # No
## [1] FALSE
# what is missing?
unique(rs$ID[which(!rs$ID %in% rs_meta$ID)])
## [1] "A101_P1" "A120_P1" "A124_P1" "A127_P1" "A140_P1" "A148_P1" "A150_P1"
## [8] "A175_P1" "A175_P2" "A182_P1" "A189_P1" "A197_P1" "A198_P1" "A200_P1"
## [15] "A210_P1" "A223_P1" "A231_P1" "A3_P1" "A45_P1" "A50_P1" "A73_P1"
## [22] "A91_P1" "A96_P1" "A97_P1"
# these are all problematic studies. remove them all
rs <- rs %>%
filter(ID %in% rs_meta$ID)
# merge
rs <- merge(rs,rs_meta,by="ID",all.y = TRUE) # all.y to keep only studies with geospatial data
# Get Extent (=Lat/Ele extent)
rs$Extent <- rs$LatExtentk
rs$Extent[which(rs$Type=="ELE")] <- rs$EleExtentm[which(rs$Type=="ELE")]
# Missing extent
tmp <- rs %>%
filter(Type == "ELE" & is.na(Extent))
unique(tmp$ID)
## character(0)
# log transform Lat/Ele extent
rs$LogExtent <- log(rs$Extent)
#remove bacteria and fungi
table(rs$Kingdom)
##
## Animalia Bacteria Fungi Plantae
## 15591 1 106 14369
rs <- rs[-which(rs$Kingdom %in% c("Bacteria","Fungi")),]
#remove taxa not identified to the species level (genus or family level data) or with no valide names
rs <- rs[-which(is.na(rs$Class)),]
rs <- rs[-which(rs$valid_species_name=="no"),]
# new coding scheme for type and param
rs$Gradient <- rs$Type
rs$Position <- rs$Param
## make new var for unique study id + class
rs$Source = rs$ID
rs$ID = paste(rs$Source, rs$Gradient, rs$Class, sep = "_")
## log transform area
# names(rs)
rs$LogArea = log(rs$ID.area)
hist(rs$LogArea)
## group irregular sampling with multiple sampling
rs$Sampling = ifelse(rs$Sampling %in% c("IRR","MULTIPLE(continuous)"),"MULT", rs$Sampling)
table(rs$Sampling)
##
## CONTINUOUS MULT TWO
## 3083 1246 25483
## order factors in a logical way
table(rs$Grain_size)
##
## large moderate small very_large
## 6853 10655 9746 2557
rs$Grain <- factor(rs$Grain_size, levels = c("small", "moderate", "large","very_large"))
rs$ContinuousGrain <-
as.numeric(
as.character(
ifelse(
rs$Grain == "small", 1,
ifelse(rs$Grain == "moderate",
2, #10,
ifelse(rs$Grain == "large",
3, #100,
4))))) #we have a few very large in the full database
hist(rs$ContinuousGrain)
rs$ContinuousGrain2 <-
as.numeric(
as.character(
ifelse(
rs$Grain == "small", 1,
ifelse(rs$Grain == "moderate",
10,
ifelse(rs$Grain == "large",
100,
1000))))) #we have a few very large in the full database
rs$ContinuousGrain2 <- rs$ContinuousGrain2
hist(rs$ContinuousGrain2)
#harmonize column names
rs$Quality <- ifelse(rs$Uncertainty_Distribution %in% c("RESAMPLING"),"BALANCED",
ifelse(rs$Uncertainty_Distribution %in% c("MODEL","MODEL+RESAMPLING(same)","RESAMPLING(same)+MODEL","RESAMPLING+MODEL"),"BALANCED",
ifelse(rs$Uncertainty_Distribution %in% c("DETECTABILITY","RESAMPLING(same)+DETECTABILITY"),"BALANCED",
ifelse(rs$Uncertainty_Distribution %in% c("RESAMPLING(same)"),"RESURVEYED",rs$Uncertainty_Distribution))))
table(rs$Quality) #to be thought about how to split the modalities?
##
## BALANCED RAW RESURVEYED
## 14016 9440 6356
rs$PrAb <- ifelse(rs$Data %in% c("occurence-based", "occurrence-based"),"occurrence","abundance")
table(rs$PrAb)
##
## abundance occurrence
## 9426 20386
##compute number of species per class
NtaxaClass <- tapply(rs$ID,rs$ID,length)
rs$NtaxaClass <- NtaxaClass[match(rs$ID,names(NtaxaClass))]
NClass <- table(rs$Source,rs$Class)
NClass <- ifelse(NClass > 0,1,0)
NClass <- apply(NClass,1,sum)
rs$NClass <- NClass[match(rs$Source,names(NClass))]
# Number of temporal units
rs$NtempUnits <- rs$Nperiodes
these <- which(rs$NtempUnits > rs$DUR)
rs$NtempUnits[these] <- rs$DUR[these]
# log NtempUnits
rs$LogNtempUnits <- log(rs$NtempUnits)
## Save the "full data"
rs_full <- rs
## Create study level "full data"
SL_variables <- c("ID","Article_ID","Class","Source",
"Grain","ContinuousGrain","ContinuousGrain2",
"Quality","PrAb","NtaxaClass","NClass",
"LogNtempUnits","NtempUnits","Gradient","Sampling",
"Extent","LogExtent")
rs_SL_full <- rs_full[,c(SL_variables)]
rs_SL_full <- na.omit(unique(rs_SL_full))
# any study with more >1 NtempUnits?
test <- rs_SL_full %>%
group_by(ID) %>%
summarise(N=length(unique(NtempUnits)))
any(test$N>1)
## [1] FALSE
# No. Good!
# any study with more >1 Extent?
test <- rs_SL_full %>%
group_by(ID) %>%
summarise(N=length(unique(Extent)))
any(test$N>1)
## [1] FALSE
# Yes...
# which ones?
these <- unique(test$ID[which(test$N>1)])
# View(rs_SL_full %>% filter(ID %in% these))
#
# any(duplicated(rs_SL_full$ID))
#
# rs_SL_full$ID[which(duplicated(rs_SL_full$ID))]
#
# rs_SL_full <- rs_SL_full %>%
# group_by(ID) %>%
# mutate(study_level_shift = mean(SHIFT),
# study_level_var = var(SHIFT),
# study_level_sd = sd(SHIFT),
# study_level_shift_abs = mean(abs(SHIFT)),
# study_level_var_abs = var(abs(SHIFT)),
# study_level_sd_abs = sd(abs(SHIFT)),
# NtempUnits = mean(NtempUnits),
# LogNtempUnits = mean(LogNtempUnits),
# Extent = mean(Extent),
# LogExtent = mean(LogExtent)) %>%
# select(-SHIFT)
#
# dim(rs_SL_full) # 685
rs_SL_full_lat <- filter(rs_SL_full, grepl("LAT",ID))
rs_SL_full_ele <- filter(rs_SL_full, grepl("ELE",ID))
head(rs[rs$Source == "A51_P1",]) #varies between 13 to 97 years so better to put a cap on the differences
## ID Publi_name Article_ID Study_ID Type Param
## 27205 A51_P1_ELE_Aves Arachnothera_juliae A51 P1 ELE LE
## 27206 A51_P1_ELE_Aves Pycnonotus_goiavier A51 P1 ELE TE
## 27207 A51_P1_ELE_Aves Eurylaimus_ochromalus A51 P1 ELE TE
## 27208 A51_P1_ELE_Aves Pycnonotus_goiavier A51 P1 ELE LE
## 27209 A51_P1_ELE_Aves Pellorneum_pyrrogenys A51 P1 ELE TE
## 27210 A51_P1_ELE_Aves Geokichla_citrina A51 P1 ELE TE
## Trend Sign.trend Unit.trend Azimuth Sign.azim Unit.azim START END DUR
## 27205 333 <NA> m NA NA <NA> 1978 2010 33
## 27206 0 <NA> m NA NA <NA> 1970 2010 41
## 27207 227 <NA> m NA NA <NA> 1997 2010 14
## 27208 -1015 <NA> m NA NA <NA> 1970 2010 41
## 27209 475 <NA> m NA NA <NA> 1996 2010 15
## 27210 591 <NA> m NA NA <NA> 1998 2010 13
## Observation Hemisphere HOR LAT LON ELE BAT LE TE O ECO TAX N Sampling
## 27205 <NA> N 0 0 0 1 0 1 1 0 T A 58 TWO
## 27206 <NA> N 0 0 0 1 0 1 1 0 T A 58 TWO
## 27207 <NA> N 0 0 0 1 0 1 1 0 T A 58 TWO
## 27208 <NA> N 0 0 0 1 0 1 1 0 T A 58 TWO
## 27209 <NA> N 0 0 0 1 0 1 1 0 T A 58 TWO
## 27210 <NA> N 0 0 0 1 0 1 1 0 T A 58 TWO
## Nperiodes first_year last_year Data Uncertainty_Distribution
## 27205 2 1978 2011 occurrence-based RAW
## 27206 2 1978 2011 occurrence-based RAW
## 27207 2 1978 2011 occurrence-based RAW
## 27208 2 1978 2011 occurrence-based RAW
## 27209 2 1978 2011 occurrence-based RAW
## 27210 2 1978 2011 occurrence-based RAW
## Uncertainty_Parameter Grain_size ID.area ID.latitude baseline.temp
## 27205 no small 0.01847762 6.07545 25.75278
## 27206 no small 0.01847762 6.07545 25.75278
## 27207 no small 0.01847762 6.07545 25.75278
## 27208 no small 0.01847762 6.07545 25.75278
## 27209 no small 0.01847762 6.07545 25.75278
## 27210 no small 0.01847762 6.07545 25.75278
## trend.mean trend.sd trend.nb.pixels v.lat.mean v.lat.sd v.lat.nb.pixels
## 27205 0.01513407 NA 1 0.01568606 0.1160768 1
## 27206 0.01513407 NA 1 0.01568606 0.1160768 1
## 27207 0.01513407 NA 1 0.01568606 0.1160768 1
## 27208 0.01513407 NA 1 0.01568606 0.1160768 1
## 27209 0.01513407 NA 1 0.01568606 0.1160768 1
## 27210 0.01513407 NA 1 0.01568606 0.1160768 1
## ele.gradient v.ele.mean v.ele.sd SHIFT UNIT old_name
## 27205 0.005318239 2.845691 NA 10.09091 m/year Arachnothera_juliae
## 27206 0.005318239 2.845691 NA 0.00000 m/year Pycnonotus_goiavier
## 27207 0.005318239 2.845691 NA 16.21429 m/year Eurylaimus_ochromalus
## 27208 0.005318239 2.845691 NA -24.75610 m/year Pycnonotus_goiavier
## 27209 0.005318239 2.845691 NA 31.66667 m/year Pellorneum_pyrrogenys
## 27210 0.005318239 2.845691 NA 45.46154 m/year Geokichla_citrina
## taxo_level checkedby Kingdom Phylum Class Order Family
## 27205 Species Lise Animalia Chordata Aves Passeriformes Nectariniidae
## 27206 Species Lise Animalia Chordata Aves Passeriformes Pycnonotidae
## 27207 Species Lise Animalia Chordata Aves Passeriformes Eurylaimidae
## 27208 Species Lise Animalia Chordata Aves Passeriformes Pycnonotidae
## 27209 Species Lise Animalia Chordata Aves Passeriformes Timaliidae
## 27210 Species Lise Animalia Chordata Aves Passeriformes Turdidae
## Genus Species New_name group Temp
## 27205 Arachnothera juliae Arachnothera_juliae vertebrate endotherme
## 27206 Pycnonotus goiavier Pycnonotus_goiavier vertebrate endotherme
## 27207 Eurylaimus ochromalus Eurylaimus_ochromalus vertebrate endotherme
## 27208 Pycnonotus goiavier Pycnonotus_goiavier vertebrate endotherme
## 27209 Pellorneum pyrrogenys Pellorneum_pyrrogenys vertebrate endotherme
## 27210 Geokichla citrina Geokichla_citrina vertebrate endotherme
## valid_species_name Name Article DOI LatCentDeg LonCentDeg
## 27205 yes A51_P1 Harris_al_2012_RBZ n/a 116.5626 6.07545
## 27206 yes A51_P1 Harris_al_2012_RBZ n/a 116.5626 6.07545
## 27207 yes A51_P1 Harris_al_2012_RBZ n/a 116.5626 6.07545
## 27208 yes A51_P1 Harris_al_2012_RBZ n/a 116.5626 6.07545
## 27209 yes A51_P1 Harris_al_2012_RBZ n/a 116.5626 6.07545
## 27210 yes A51_P1 Harris_al_2012_RBZ n/a 116.5626 6.07545
## Areakm2 LatExtentk EleExtentm Eleminm Elemaxm Elemeanm Extent LogExtent
## 27205 226.2 17.87 3567.62 427.4 3995.01 472.13 3567.62 8.179654
## 27206 226.2 17.87 3567.62 427.4 3995.01 472.13 3567.62 8.179654
## 27207 226.2 17.87 3567.62 427.4 3995.01 472.13 3567.62 8.179654
## 27208 226.2 17.87 3567.62 427.4 3995.01 472.13 3567.62 8.179654
## 27209 226.2 17.87 3567.62 427.4 3995.01 472.13 3567.62 8.179654
## 27210 226.2 17.87 3567.62 427.4 3995.01 472.13 3567.62 8.179654
## Gradient Position Source LogArea Grain ContinuousGrain ContinuousGrain2
## 27205 ELE LE A51_P1 -3.991195 small 1 1
## 27206 ELE TE A51_P1 -3.991195 small 1 1
## 27207 ELE TE A51_P1 -3.991195 small 1 1
## 27208 ELE LE A51_P1 -3.991195 small 1 1
## 27209 ELE TE A51_P1 -3.991195 small 1 1
## 27210 ELE TE A51_P1 -3.991195 small 1 1
## Quality PrAb NtaxaClass NClass NtempUnits LogNtempUnits
## 27205 RAW occurrence 88 1 2 0.6931472
## 27206 RAW occurrence 88 1 2 0.6931472
## 27207 RAW occurrence 88 1 2 0.6931472
## 27208 RAW occurrence 88 1 2 0.6931472
## 27209 RAW occurrence 88 1 2 0.6931472
## 27210 RAW occurrence 88 1 2 0.6931472
## Exclude studies with varying number of years
dur <- paste0(rs$ID,"_",rs$DUR)
dur_details <- aggregate(DUR ~ ID,data=rs,range)
#select studies with less than 2 years of difference
VaryingYears <- dur_details$ID[which(dur_details$DUR[,2] - dur_details$DUR[,1] > 2)] #12 studies to remove
rs <- rs[-which(rs$ID %in% VaryingYears),]
# take the min only for studies with less than 2 years of difference
rs = rs %>%
group_by(ID) %>%
mutate(DUR = min(DUR),
START = min(START),
END = min(END)) %>%
ungroup()
## get rid of study with insanely large area (and weird study overall)
rs <- filter(rs, LogArea < 9)
## split raw data by gradient at the species level
rs_lat <- filter(rs, Gradient == "LAT")
rs_ele <- filter(rs, Gradient == "ELE")
## Create the study-level dataset
## group by study ID and class and calculate a mean range shift value and variance for each class within studies
rs_SL <- rs[,c(SL_variables,"SHIFT")]
rs_SL <- rs_SL %>%
group_by(ID) %>%
mutate(study_level_shift = mean(SHIFT),
study_level_var = var(SHIFT),
study_level_sd = sd(SHIFT),
study_level_shift_abs = mean(abs(SHIFT)),
study_level_var_abs = var(abs(SHIFT)),
study_level_sd_abs = sd(abs(SHIFT)),
NtempUnits = mean(NtempUnits)) %>%
select(-SHIFT) %>%
ungroup() %>%
distinct()
dim(rs_SL) #596
## [1] 572 23
## filter to studies with at least X species
rs_SL <- filter(rs_SL, rs_SL$NtaxaClass >= 3) #LC: here Ntaxa refers to total number of species in a study (not a taxonomic class)
range(rs_SL$NtaxaClass)
## [1] 3 3596
nlevels(factor(rs_SL$ID))
## [1] 374
length(unique(paste(rs_SL$Source, rs_SL$Gradient, rs_SL$Class, sep = "_"))) # 325 studies should remain
## [1] 374
rs_SL <- rs_SL[complete.cases(rs_SL),]
## split data into latitudinal and elevational observations
rs_SL_lat <- filter(rs_SL, rs_SL$Gradient == "LAT")
rs_SL_ele <- filter(rs_SL, rs_SL$Gradient == "ELE")
#a few checks
range(rs_SL_lat$NtaxaClass)
## [1] 3 1199
range(rs_SL_ele$NtaxaClass)
## [1] 3 3596
range(rs_SL_lat$study_level_var)
## [1] 0.02249571 997.55689917
range(rs_SL_ele$study_level_var)
## [1] 0.01085069 568.50323929
table(rs_SL_lat$ContinuousGrain)
##
## 1 2 3 4
## 81 44 46 46
table(rs_SL_ele$ContinuousGrain)
##
## 1 2 3 4
## 126 29 1 1
table(rs_SL_lat$Quality)
##
## BALANCED RAW RESURVEYED
## 69 101 47
table(rs_SL_ele$Quality)
##
## BALANCED RAW RESURVEYED
## 60 41 56
table(rs_SL_lat$Sampling)
##
## CONTINUOUS MULT TWO
## 43 17 157
table(rs_SL_ele$Sampling)
##
## CONTINUOUS MULT TWO
## 9 1 147
table(rs_SL_lat$PrAb)
##
## abundance occurrence
## 79 138
table(rs_SL_ele$PrAb)
##
## abundance occurrence
## 34 123
#rs_SL_lat[rs_SL_lat$study_level_var > 1000,]
#var(rs$SHIFT[rs$ID == "A116_P1__Insecta"])
#remove groups with <3 observations TO BE DISCUSSED
#KEEP_lat <- names(which(table(rs_SL_lat$Class) > 3))
#rs_SL_lat <- rs_SL_lat[which(rs_SL_lat$Class %in% KEEP_lat),]
#KEEP_ele <- names(which(table(rs_SL_ele$Class) > 3))
#rs_SL_ele <- rs_SL_ele[which(rs_SL_ele$Class %in% KEEP_ele),]
write.csv(rs_SL_lat,"outputs/LATData_model.csv",row.names=F)
write.csv(rs_SL_ele,"outputs/ELEData_model.csv",row.names=F)
Fit one model per gradient (latitudinal, elevational)
rs_SL_lat <- read.csv("outputs/LATData_model.csv")
rs_SL_lat$Class <- factor(rs_SL_lat$Class)
dim(rs_SL_lat)
## [1] 217 23
nlevels(factor(rs_SL_lat$Source))
## [1] 120
nlevels(factor(rs_SL_lat$Class))
## [1] 37
# 120 studies, 37 taxonomic groups
hist(rs_SL_lat$study_level_shift_abs)
hist(rs_SL_lat$study_level_var)
hist(rs_SL_lat$LogExtent)
hist(rs_SL_lat$ContinuousGrain)
hist(rs_SL_lat$LogNtempUnits, breaks = 50)
ggplot(rs_SL_lat, aes(x=Quality)) + geom_bar()
ggplot(rs_SL_lat, aes(x=PrAb)) + geom_bar()
ggplot(rs_SL_lat, aes(y=Class)) + geom_bar()
ggplot(rs_SL_lat, aes(y=Sampling)) + geom_bar()
corrplot(
cor(rs_SL_lat[,c("study_level_shift_abs",
"study_level_var",
"NtempUnits",
"LogExtent",
"NtaxaClass")]),
method = "number",
type = "lower",
diag = FALSE)
# we cannot use study_level_var and study_level_shift_abs in the same model
plot(rs_SL_lat$study_level_shift_abs, 1/rs_SL_lat$study_level_var)
plot(log(rs_SL_lat$study_level_shift_abs), log(1/rs_SL_lat$study_level_var)) # do not fit models with study_level_var?
# study_level_var is correlated with Extent
plot(log(1/study_level_var) ~ LogExtent, rs_SL_lat)
cor.test(rs_SL_lat$LogExtent,1/rs_SL_lat$study_level_var) # do not fit models with study_level_var?
##
## Pearson's product-moment correlation
##
## data: rs_SL_lat$LogExtent and 1/rs_SL_lat$study_level_var
## t = -2.4162, df = 215, p-value = 0.01652
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.28950808 -0.03005944
## sample estimates:
## cor
## -0.1625929
# NtaxaClass is OK
plot(study_level_shift_abs ~ NtaxaClass, rs_SL_lat)
plot(log(study_level_shift_abs) ~ log(NtaxaClass), rs_SL_lat)
plot(log(1/study_level_var) ~ log(NtaxaClass), rs_SL_lat)
# LogNtempUnits is OK
plot(study_level_shift_abs ~ LogNtempUnits, rs_SL_lat)
plot(log(study_level_shift_abs) ~ log(LogNtempUnits), rs_SL_lat)
# we cannot use sampling and LogNtempUnits in the same model
boxplot(rs_SL_lat$LogNtempUnits~rs_SL_lat$Sampling) # do not fit models with sampling
# more correlation tests
boxplot(log(study_level_var) ~ ContinuousGrain, rs_SL_lat)
cor.test(rs_SL_lat$ContinuousGrain,log(rs_SL_lat$study_level_var))
##
## Pearson's product-moment correlation
##
## data: rs_SL_lat$ContinuousGrain and log(rs_SL_lat$study_level_var)
## t = 1.7343, df = 215, p-value = 0.0843
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.01597405 0.24678402
## sample estimates:
## cor
## 0.1174603
plot(log(NtaxaClass) ~ ContinuousGrain, rs_SL_lat)
boxplot(LogExtent ~ ContinuousGrain, rs_SL_lat)
# N temporal units
p1 <- ggplot(rs_lat, aes(x = NtempUnits, y = abs(SHIFT), size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Species-level data")+
xlab("Log N temporal units") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_lat, aes(x = NtempUnits, y = study_level_shift_abs, size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Study-level data")+
xlab("Log N temporal units") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
# Extent
p1 <- ggplot(rs_lat, aes(x = LogExtent, y = abs(SHIFT), size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Species-level data")+
xlab("Study area\n(Latitudinal extent km)") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_lat, aes(x = LogExtent, y = study_level_shift_abs, size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Study-level data")+
xlab("Study area\n(Latitudinal extent km)") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
# Grain size
p1 <- ggplot(rs_lat, aes(x = ContinuousGrain, y = abs(SHIFT), size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Species-level data") +
xlab("Grain size categories") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_lat, aes(x = ContinuousGrain, y = study_level_shift_abs, size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Study-level data")+
xlab("Grain size categories") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
# Sampling
p1 <- ggplot(rs_lat, aes(x = Sampling, y = abs(SHIFT), size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black",alpha=.2, aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Species-level data")+
xlab("Sampling") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_lat, aes(x = Sampling, y = study_level_shift_abs, size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black", aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Study-level data")+
xlab("Sampling") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
# Quality
p1 <- ggplot(rs_lat, aes(x = Quality, y = abs(SHIFT), size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black",alpha=.2, aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Species-level data")+
xlab("Quality") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_lat, aes(x = Quality, y = study_level_shift_abs, size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black", aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Study-level data")+
xlab("Quality") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
# PrAb
p1 <- ggplot(rs_lat, aes(x = PrAb, y = abs(SHIFT), size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black",alpha=.2, aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Species-level data")+
xlab("Quality") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_lat, aes(x = PrAb, y = study_level_shift_abs, size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black", aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Study-level data")+
xlab("Quality") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
# N temporal units
p1 <- ggplot(rs_lat, aes(x = NtempUnits, y = SHIFT, size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Species-level data")+
xlab("Log N temporal units") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_lat, aes(x = NtempUnits, y = study_level_shift, size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Study-level data")+
xlab("Log N temporal units") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
# Extent
p1 <- ggplot(rs_lat, aes(x = LogExtent, y = SHIFT, size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Species-level data")+
xlab("Study area\n(Latitudinal extent km)") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_lat, aes(x = LogExtent, y = study_level_shift, size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Study-level data")+
xlab("Study area\n(Latitudinal extent km)") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
# Grain size
p1 <- ggplot(rs_lat, aes(x = ContinuousGrain, y = SHIFT, size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Species-level data")+
xlab("Grain size categories") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_lat, aes(x = ContinuousGrain, y = study_level_shift, size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Study-level data")+
xlab("Grain size categories") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
## Warning: Removed 1 rows containing missing values (`geom_point()`).
# Sampling
p1 <- ggplot(rs_lat, aes(x = Sampling, y = SHIFT, size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black",alpha=.2, aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Species-level data")+
xlab("Sampling") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_lat, aes(x = Sampling, y = study_level_shift, size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black", aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Study-level data")+
xlab("Sampling") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
## Warning: The following aesthetics were dropped during statistical transformation: size
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## Warning: The following aesthetics were dropped during statistical transformation: size
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
# Quality
p1 <- ggplot(rs_lat, aes(x = Quality, y = SHIFT, size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black",alpha=.2, aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Species-level data")+
xlab("Quality") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_lat, aes(x = Quality, y = study_level_shift, size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black", aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Study-level data")+
xlab("Quality") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
## Warning: The following aesthetics were dropped during statistical transformation: size
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## The following aesthetics were dropped during statistical transformation: size
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
# PrAb
p1 <- ggplot(rs_lat, aes(x = PrAb, y = SHIFT, size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black",alpha=.2, aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Species-level data")+
xlab("Quality") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_lat, aes(x = PrAb, y = study_level_shift, size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black", aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Study-level data")+
xlab("Quality") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
## Warning: The following aesthetics were dropped during statistical transformation: size
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## The following aesthetics were dropped during statistical transformation: size
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
We expect methods affect shift magnitude, and to a way leaser extent they affect direction (?), so we decided to model absolute shifts rather than raw shifts.
# Gamma intercept
tmb_lat_gamma_int <- glmmTMB(
study_level_shift_abs ~
LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb +
(1|Class),
family = Gamma(link = "log"),
data = rs_SL_lat)
# Gamma slope
tmb_lat_gamma_slp <- glmmTMB(
study_level_shift_abs ~
LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb +
(LogExtent|Class),
family = Gamma(link = "log"),
data = rs_SL_lat)
# Gaussian intercept
tmb_lat_gaussian_int <- glmmTMB(
log(study_level_shift_abs) ~
LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb +
(1|Class),
data = rs_SL_lat)
# Gaussian slope
tmb_lat_gaussian_slp <- glmmTMB(
log(study_level_shift_abs) ~
LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb +
(LogExtent|Class),
data = rs_SL_lat)
## Warning in fitTMB(TMBStruc): Model convergence problem; non-positive-definite
## Hessian matrix. See vignette('troubleshooting')
## Warning in fitTMB(TMBStruc): Model convergence problem; singular convergence
## (7). See vignette('troubleshooting')
###########
# Get results
tmb_list <- list(tmb_lat_gamma_int,
tmb_lat_gamma_slp,
tmb_lat_gaussian_int,
tmb_lat_gaussian_slp)
tmb_names <- c("Gamma intercept",
"Gamma slope",
"Gaussian intercept",
"Gaussian slope")
# AIC and R2 table
DT::datatable(aic_r2_table(tmb_list, tmb_names))
## Loading required package: data.table
##
## Attaching package: 'data.table'
## The following objects are masked from 'package:lubridate':
##
## hour, isoweek, mday, minute, month, quarter, second, wday, week,
## yday, year
## The following objects are masked from 'package:dplyr':
##
## between, first, last
## The following object is masked from 'package:purrr':
##
## transpose
## Loading required package: piecewiseSEM
##
## This is piecewiseSEM version 2.3.01.
##
##
## Questions or bugs can be addressed to <jlefcheck@umces.edu>.
# Residual distribution
for(i in 1:length(tmb_list)){
try({
plot(simulateResiduals(tmb_list[[i]]), main = tmb_names[i])
})
}
## qu = 0.75, log(sigma) = -2.56479 : outer Newton did not converge fully.
## qu = 0.75, log(sigma) = -2.363777 : outer Newton did not converge fully.
Gaussian intercept is the best model according to AIC. This model also show good residual distribution.
# The selected model
tmb_lat <- tmb_lat_gaussian_int
summary(tmb_lat)
## Family: gaussian ( identity )
## Formula:
## log(study_level_shift_abs) ~ LogNtempUnits + LogExtent + ContinuousGrain +
## Quality + PrAb + (1 | Class)
## Data: rs_SL_lat
##
## AIC BIC logLik deviance df.resid
## 512.8 543.3 -247.4 494.8 208
##
## Random effects:
##
## Conditional model:
## Groups Name Variance Std.Dev.
## Class (Intercept) 0.2866 0.5354
## Residual 0.4686 0.6845
## Number of obs: 217, groups: Class, 37
##
## Dispersion estimate for gaussian family (sigma^2): 0.469
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.44715 0.60584 -4.039 5.36e-05 ***
## LogNtempUnits -0.19402 0.06184 -3.138 0.00170 **
## LogExtent 0.50629 0.08219 6.160 7.26e-10 ***
## ContinuousGrain 0.09772 0.04803 2.035 0.04190 *
## QualityRAW 0.20472 0.12714 1.610 0.10736
## QualityRESURVEYED -0.17089 0.14218 -1.202 0.22939
## PrAboccurrence -0.40687 0.14112 -2.883 0.00394 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# NtaxaClass model
tmb_lat_class <- try(
update(tmb_lat, weights = NtaxaClass))
# summary(tmb_lat_class)
# 1/study_level_var model
tmb_lat_var <- try(
update(tmb_lat, weights = range01(log(1/study_level_var_abs)))) # precision
# NtempUnits model
tmb_lat_temp <- try(
update(tmb_lat,
.~ LogExtent + ContinuousGrain + Quality + PrAb + (1 | Class),
weights = LogNtempUnits))
# NtaxaClass + NtempUnits model
tmb_lat_class_temp <- try(
update(tmb_lat,
.~ LogExtent + ContinuousGrain + Quality + PrAb + (1 | Class),
weights = log(NtaxaClass) * LogNtempUnits)) # two sources of variability (N species and sample size)
##############################################################
tmb_list <- list(tmb_lat,
tmb_lat_var,
tmb_lat_class,
tmb_lat_temp,
tmb_lat_class_temp)
tmb_names <- c("No weights",
"1/variance",
"NtaxaClass",
"NtempUnits",
"NtaxaClass + NtempUnits")
# AIC and R2 table
DT::datatable(aic_r2_table(tmb_list, tmb_names))
# Dharma residuals
for(i in 1:length(tmb_list)){
try({
plot(simulateResiduals(tmb_list[[i]]), main = tmb_names[i])
})
}
## qu = 0.75, log(sigma) = -2.56479 : outer Newton did not converge fully.
## qu = 0.75, log(sigma) = -2.363777 : outer Newton did not converge fully.
## Warning in getSimulations.glmmTMB(fittedModel, nsim = n, type = "normal", :
## Model was fit with prior weights. These will be ignored in the simulation. See
## ?getSimulations for details
## Warning in newton(lsp = lsp, X = G$X, y = G$y, Eb = G$Eb, UrS = G$UrS, L = G$L,
## : Fitting terminated with step failure - check results carefully
## Warning in newton(lsp = lsp, X = G$X, y = G$y, Eb = G$Eb, UrS = G$UrS, L = G$L,
## : Model was fit with prior weights. These will be ignored in the simulation.
## See ?getSimulations for details
## Warning in getSimulations.glmmTMB(fittedModel, nsim = n, type = "normal", :
## Model was fit with prior weights. These will be ignored in the simulation. See
## ?getSimulations for details
## Warning in getSimulations.glmmTMB(fittedModel, nsim = n, type = "normal", :
## Model was fit with prior weights. These will be ignored in the simulation. See
## ?getSimulations for details
# qqnorm plot
{
par(mfrow=c(2,3))
for(i in 1:length(tmb_list)){
qqnorm(resid(tmb_list[[i]]), main = tmb_names[i])
qqline(resid(tmb_list[[i]]))
}
}
# histogram residuals
{
par(mfrow=c(2,3))
for(i in 1:length(tmb_list)){
hist(resid(tmb_list[[i]]), main = tmb_names[i], xlab = "Residual")
}
}
# residuals vs fitted
{
par(mfrow=c(2,3))
for(i in 1:length(tmb_list)){
plot(fitted(tmb_list[[i]]),resid(tmb_list[[i]]), main = tmb_names[i],
xlab="Fitted values",ylab = "Residuals")
}
}
The model with the lowest AIC is the “1/variance”, followed by the one without weights. All models have acceptable residual distribution.
nlme only fits Gaussian models. Therefore, here we test random intercept vs random slopes.
lme_lat_gaussian_int <- try(lme(
log(study_level_shift_abs) ~
LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb,
random = ~1|Class,
data = rs_SL_lat), silent = TRUE)
lme_lat_gaussian_slp <- try(lme(
log(study_level_shift_abs) ~
LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb,
random = ~LogExtent|Class,
data = rs_SL_lat), silent = TRUE)
###########
# Get results
lme_list <- list(lme_lat_gaussian_int,
lme_lat_gaussian_slp)
lme_names <- c("Gaussian intercept",
"Gaussian slope")
# AIC and R2 table
DT::datatable(aic_r2_table(lme_list, lme_names))
# Residual distribution
for(i in 1:length(lme_list)){
try({
plot(simulateResiduals(lme_list[[i]]), main = lme_names[i])
}, silent = TRUE)
}
## Warning in checkModel(fittedModel): DHARMa: fittedModel not in class of
## supported models. Absolutely no guarantee that this will work!
## Warning in checkModel(fittedModel): DHARMa: fittedModel not in class of
## supported models. Absolutely no guarantee that this will work!
The model with a random slope does not converges.
Here we contrast 3 weighting parameters:
1) 1/study_level_var_abs
2) NtaxaClass
3) NtaxaClass + LogNtempUnits
For each of these parameters, we tested multiple variance functions
# The selected model
lme_lat <- lme_lat_gaussian_int
summary(lme_lat)
## Linear mixed-effects model fit by REML
## Data: rs_SL_lat
## AIC BIC logLik
## 533.4977 563.6217 -257.7488
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.5507015 0.6948875
##
## Fixed effects: log(study_level_shift_abs) ~ LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -2.4412740 0.6092196 174 -4.007215 0.0001
## LogNtempUnits -0.1940158 0.0628066 174 -3.089100 0.0023
## LogExtent 0.5055169 0.0826801 174 6.114133 0.0000
## ContinuousGrain 0.0976624 0.0487850 174 2.001892 0.0468
## QualityRAW 0.2046510 0.1291787 174 1.584248 0.1150
## QualityRESURVEYED -0.1702981 0.1441215 174 -1.181629 0.2390
## PrAboccurrence -0.4059768 0.1427542 174 -2.843886 0.0050
## Correlation:
## (Intr) LgNtmU LgExtn CntnsG QltRAW QRESUR
## LogNtempUnits -0.050
## LogExtent -0.930 -0.071
## ContinuousGrain -0.287 -0.288 0.129
## QualityRAW 0.064 -0.050 -0.216 0.068
## QualityRESURVEYED -0.155 0.167 0.033 0.050 0.447
## PrAboccurrence -0.322 0.290 0.092 0.144 0.101 0.086
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.09908495 -0.60288880 -0.03025387 0.62938224 3.27542928
##
## Number of Observations: 217
## Number of Groups: 37
##########################
# 1) 1/study_level_var_abs
my_variable <- as.formula("~ study_level_var_abs")
weight_options <- list(varIdent(form = my_variable),
varFixed(my_variable),
varPower(form = my_variable),
varExp(form = my_variable))
# Fit
lme_lat_var <- lapply(weight_options, function(x){
update(lme_lat, weights = x)
})
# histogram residuals
{
par(mfrow=c(2,2))
for(i in 1:length(lme_lat_var)){
try(hist(resid(lme_lat_var[[i]]),main = class(weight_options[[i]])[[1]]))
}
}
# residuals vs fitted
{
par(mfrow=c(2,2))
for(i in 1:length(lme_lat_var)){
plot(fitted(lme_lat_var[[i]]),resid(lme_lat_var[[i]]),
main = class(weight_options[[i]])[[1]],
xlab="Fitted values",ylab = "Residuals")
}
}
# qqnorm plot
{
par(mfrow=c(2,2))
for(i in 1:length(lme_lat_var)){
qqnorm(resid(lme_lat_var[[i]]),
main = class(weight_options[[i]])[[1]])
qqline(resid(lme_lat_var[[i]]))
}
}
# AIC and R2 table
DT::datatable(aic_r2_table(lme_lat_var, sapply(weight_options, function(x) class(x)[[1]])))
# summary model
lapply(lme_lat_var, function(x) summary(x))
## [[1]]
## Linear mixed-effects model fit by REML
## Data: rs_SL_lat
## AIC BIC logLik
## 533.4977 563.6217 -257.7488
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.5507015 0.6948875
##
## Fixed effects: log(study_level_shift_abs) ~ LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -2.4412740 0.6092196 174 -4.007215 0.0001
## LogNtempUnits -0.1940158 0.0628066 174 -3.089100 0.0023
## LogExtent 0.5055169 0.0826801 174 6.114133 0.0000
## ContinuousGrain 0.0976624 0.0487850 174 2.001892 0.0468
## QualityRAW 0.2046510 0.1291787 174 1.584248 0.1150
## QualityRESURVEYED -0.1702981 0.1441215 174 -1.181629 0.2390
## PrAboccurrence -0.4059768 0.1427542 174 -2.843886 0.0050
## Correlation:
## (Intr) LgNtmU LgExtn CntnsG QltRAW QRESUR
## LogNtempUnits -0.050
## LogExtent -0.930 -0.071
## ContinuousGrain -0.287 -0.288 0.129
## QualityRAW 0.064 -0.050 -0.216 0.068
## QualityRESURVEYED -0.155 0.167 0.033 0.050 0.447
## PrAboccurrence -0.322 0.290 0.092 0.144 0.101 0.086
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.09908495 -0.60288880 -0.03025387 0.62938224 3.27542928
##
## Number of Observations: 217
## Number of Groups: 37
##
## [[2]]
## Linear mixed-effects model fit by REML
## Data: rs_SL_lat
## AIC BIC logLik
## 794.9419 825.0659 -388.471
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.5518732 0.6331793
##
## Variance function:
## Structure: fixed weights
## Formula: ~study_level_var_abs
## Fixed effects: log(study_level_shift_abs) ~ LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -1.9935160 0.5764445 174 -3.458297 0.0007
## LogNtempUnits -0.1010358 0.0718081 174 -1.407026 0.1612
## LogExtent 0.3361240 0.0919585 174 3.655171 0.0003
## ContinuousGrain -0.0436097 0.0456051 174 -0.956246 0.3403
## QualityRAW 0.7797228 0.1506752 174 5.174858 0.0000
## QualityRESURVEYED 0.2016302 0.1126776 174 1.789444 0.0753
## PrAboccurrence -0.6587409 0.1809530 174 -3.640397 0.0004
## Correlation:
## (Intr) LgNtmU LgExtn CntnsG QltRAW QRESUR
## LogNtempUnits 0.015
## LogExtent -0.906 -0.193
## ContinuousGrain -0.092 -0.320 -0.018
## QualityRAW 0.051 -0.040 -0.165 -0.083
## QualityRESURVEYED 0.479 0.213 -0.631 0.112 0.231
## PrAboccurrence -0.053 0.332 -0.250 -0.051 0.208 0.251
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -5.01331454 0.01464133 0.28698506 0.58823027 3.13452231
##
## Number of Observations: 217
## Number of Groups: 37
##
## [[3]]
## Linear mixed-effects model fit by REML
## Data: rs_SL_lat
## AIC BIC logLik
## 534.1468 567.6179 -257.0734
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.5634731 0.6572764
##
## Variance function:
## Structure: Power of variance covariate
## Formula: ~study_level_var_abs
## Parameter estimates:
## power
## 0.03520022
## Fixed effects: log(study_level_shift_abs) ~ LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -2.4068074 0.5986633 174 -4.020302 0.0001
## LogNtempUnits -0.1957462 0.0621496 174 -3.149597 0.0019
## LogExtent 0.4909565 0.0817333 174 6.006814 0.0000
## ContinuousGrain 0.0935788 0.0488264 174 1.916562 0.0569
## QualityRAW 0.2410644 0.1288646 174 1.870680 0.0631
## QualityRESURVEYED -0.1736397 0.1405396 174 -1.235522 0.2183
## PrAboccurrence -0.4070470 0.1443815 174 -2.819246 0.0054
## Correlation:
## (Intr) LgNtmU LgExtn CntnsG QltRAW QRESUR
## LogNtempUnits -0.046
## LogExtent -0.927 -0.078
## ContinuousGrain -0.284 -0.297 0.126
## QualityRAW 0.068 -0.043 -0.218 0.052
## QualityRESURVEYED -0.126 0.174 -0.003 0.049 0.444
## PrAboccurrence -0.307 0.294 0.067 0.132 0.105 0.096
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.30671544 -0.55750402 0.02781574 0.77350311 3.16741416
##
## Number of Observations: 217
## Number of Groups: 37
##
## [[4]]
## Linear mixed-effects model fit by REML
## Data: rs_SL_lat
## AIC BIC logLik
## 523.1192 556.5902 -251.5596
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.5587764 0.6408116
##
## Variance function:
## Structure: Exponential of variance covariate
## Formula: ~study_level_var_abs
## Parameter estimates:
## expon
## 0.001857796
## Fixed effects: log(study_level_shift_abs) ~ LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -2.3241216 0.5800041 174 -4.007078 0.0001
## LogNtempUnits -0.1730623 0.0595588 174 -2.905741 0.0041
## LogExtent 0.4622914 0.0789066 174 5.858718 0.0000
## ContinuousGrain 0.1080122 0.0467968 174 2.308112 0.0222
## QualityRAW 0.1790855 0.1233973 174 1.451291 0.1485
## QualityRESURVEYED -0.1551612 0.1359925 174 -1.140955 0.2555
## PrAboccurrence -0.2885558 0.1404481 174 -2.054536 0.0414
## Correlation:
## (Intr) LgNtmU LgExtn CntnsG QltRAW QRESUR
## LogNtempUnits -0.049
## LogExtent -0.925 -0.082
## ContinuousGrain -0.295 -0.273 0.131
## QualityRAW 0.064 -0.053 -0.205 0.045
## QualityRESURVEYED -0.144 0.174 0.018 0.048 0.446
## PrAboccurrence -0.307 0.310 0.061 0.156 0.076 0.097
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.29259075 -0.58919682 0.03299913 0.70020347 3.13939611
##
## Number of Observations: 217
## Number of Groups: 37
##########################
# 2) NtaxaClass
my_variable <- as.formula("~ 1/NtaxaClass")
weight_options <- list(varIdent(form = my_variable),
varFixed(my_variable),
varPower(form = my_variable),
varExp(form = my_variable))
# Fit
lme_lat_class <- lapply(weight_options, function(x){
try(update(lme_lat, weights = x))
})
# histogram residuals
{
par(mfrow=c(2,2))
for(i in 1:length(lme_lat_class)){
try(hist(resid(lme_lat_class[[i]]),main = class(weight_options[[i]])[[1]]))
}
}
# residuals vs fitted
{
par(mfrow=c(2,2))
for(i in 1:length(lme_lat_class)){
try(plot(fitted(lme_lat_class[[i]]),resid(lme_lat_class[[i]]),
main = class(weight_options[[i]])[[1]],
xlab="Fitted values",ylab = "Residuals"))
}
}
# qqnorm plot
{
par(mfrow=c(2,2))
for(i in 1:length(lme_lat_class)){
try(qqnorm(resid(lme_lat_class[[i]]),
main = class(weight_options[[i]])[[1]]))
try(qqline(resid(lme_lat_class[[i]])))
}
}
# AIC and R2 table
DT::datatable(aic_r2_table(lme_lat_class, sapply(weight_options, function(x) class(x)[[1]])))
# summary model
lapply(lme_lat_class, function(x) try(summary(x)))
## [[1]]
## Linear mixed-effects model fit by REML
## Data: rs_SL_lat
## AIC BIC logLik
## 533.4977 563.6217 -257.7488
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.5507015 0.6948875
##
## Fixed effects: log(study_level_shift_abs) ~ LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -2.4412740 0.6092196 174 -4.007215 0.0001
## LogNtempUnits -0.1940158 0.0628066 174 -3.089100 0.0023
## LogExtent 0.5055169 0.0826801 174 6.114133 0.0000
## ContinuousGrain 0.0976624 0.0487850 174 2.001892 0.0468
## QualityRAW 0.2046510 0.1291787 174 1.584248 0.1150
## QualityRESURVEYED -0.1702981 0.1441215 174 -1.181629 0.2390
## PrAboccurrence -0.4059768 0.1427542 174 -2.843886 0.0050
## Correlation:
## (Intr) LgNtmU LgExtn CntnsG QltRAW QRESUR
## LogNtempUnits -0.050
## LogExtent -0.930 -0.071
## ContinuousGrain -0.287 -0.288 0.129
## QualityRAW 0.064 -0.050 -0.216 0.068
## QualityRESURVEYED -0.155 0.167 0.033 0.050 0.447
## PrAboccurrence -0.322 0.290 0.092 0.144 0.101 0.086
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.09908495 -0.60288880 -0.03025387 0.62938224 3.27542928
##
## Number of Observations: 217
## Number of Groups: 37
##
## [[2]]
## Linear mixed-effects model fit by REML
## Data: rs_SL_lat
## AIC BIC logLik
## 674.0736 704.1976 -328.0368
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.6027634 4.638077
##
## Variance function:
## Structure: fixed weights
## Formula: ~1/NtaxaClass
## Fixed effects: log(study_level_shift_abs) ~ LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -1.4595751 0.5328458 174 -2.739207 0.0068
## LogNtempUnits -0.2226465 0.0587588 174 -3.789164 0.0002
## LogExtent 0.4201687 0.0768569 174 5.466899 0.0000
## ContinuousGrain -0.0287806 0.0490695 174 -0.586526 0.5583
## QualityRAW 0.1335398 0.1021747 174 1.306976 0.1929
## QualityRESURVEYED -0.2790245 0.1230153 174 -2.268209 0.0245
## PrAboccurrence -0.6440590 0.1241472 174 -5.187868 0.0000
## Correlation:
## (Intr) LgNtmU LgExtn CntnsG QltRAW QRESUR
## LogNtempUnits 0.335
## LogExtent -0.898 -0.499
## ContinuousGrain -0.305 -0.158 0.098
## QualityRAW 0.291 0.347 -0.461 0.004
## QualityRESURVEYED 0.209 0.355 -0.330 -0.168 0.521
## PrAboccurrence -0.270 0.171 0.028 0.134 0.105 0.177
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -4.84298648 -0.39866059 0.01288189 0.46365561 4.30215131
##
## Number of Observations: 217
## Number of Groups: 37
##
## [[3]]
## Linear mixed-effects model fit by REML
## Data: rs_SL_lat
## AIC BIC logLik
## 526.8818 560.3529 -253.4409
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.5615717 0.9474142
##
## Variance function:
## Structure: Power of variance covariate
## Formula: ~1/NtaxaClass
## Parameter estimates:
## power
## 0.1064028
## Fixed effects: log(study_level_shift_abs) ~ LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -2.2807461 0.5789205 174 -3.939654 0.0001
## LogNtempUnits -0.1794017 0.0600957 174 -2.985266 0.0032
## LogExtent 0.4854499 0.0790291 174 6.142675 0.0000
## ContinuousGrain 0.0799024 0.0474494 174 1.683951 0.0940
## QualityRAW 0.1731711 0.1214523 174 1.425836 0.1557
## QualityRESURVEYED -0.1631199 0.1357614 174 -1.201519 0.2312
## PrAboccurrence -0.4319845 0.1358949 174 -3.178813 0.0018
## Correlation:
## (Intr) LgNtmU LgExtn CntnsG QltRAW QRESUR
## LogNtempUnits 0.019
## LogExtent -0.924 -0.152
## ContinuousGrain -0.331 -0.254 0.164
## QualityRAW 0.111 0.004 -0.263 0.044
## QualityRESURVEYED -0.082 0.196 -0.051 0.013 0.467
## PrAboccurrence -0.317 0.266 0.080 0.159 0.086 0.130
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.564678600 -0.661652047 -0.003470179 0.652144636 2.844564321
##
## Number of Observations: 217
## Number of Groups: 37
##
## [[4]]
## Linear mixed-effects model fit by REML
## Data: rs_SL_lat
## AIC BIC logLik
## 520.6408 554.1118 -250.3204
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.575876 0.5475617
##
## Variance function:
## Structure: Exponential of variance covariate
## Formula: ~1/NtaxaClass
## Parameter estimates:
## expon
## 1.988932
## Fixed effects: log(study_level_shift_abs) ~ LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -2.4040330 0.5691356 174 -4.224007 0.0000
## LogNtempUnits -0.1523503 0.0578878 174 -2.631821 0.0093
## LogExtent 0.4870975 0.0771452 174 6.314036 0.0000
## ContinuousGrain 0.0933897 0.0457378 174 2.041849 0.0427
## QualityRAW 0.1476989 0.1194466 174 1.236527 0.2179
## QualityRESURVEYED -0.1408404 0.1326195 174 -1.061989 0.2897
## PrAboccurrence -0.3454114 0.1314168 174 -2.628366 0.0093
## Correlation:
## (Intr) LgNtmU LgExtn CntnsG QltRAW QRESUR
## LogNtempUnits 0.001
## LogExtent -0.925 -0.127
## ContinuousGrain -0.346 -0.254 0.187
## QualityRAW 0.113 -0.020 -0.260 0.029
## QualityRESURVEYED -0.095 0.195 -0.037 0.011 0.473
## PrAboccurrence -0.309 0.260 0.077 0.161 0.076 0.132
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.302899540 -0.639129017 0.003250849 0.648176530 2.759057956
##
## Number of Observations: 217
## Number of Groups: 37
##########################
# NtempUnits model
lme_lat_temp <-nlme::lme(
log(study_level_shift_abs) ~ LogExtent + ContinuousGrain + Quality + PrAb,
random = ~ 1 | Class,
weights = varExp(form = ~1/LogNtempUnits),
data = rs_SL_lat)
hist(resid(lme_lat_temp))
plot(predict(lme_lat_temp),resid(lme_lat_temp))
summary(lme_lat_temp)
## Linear mixed-effects model fit by REML
## Data: rs_SL_lat
## AIC BIC logLik
## 539.1006 569.2673 -260.5503
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.5555131 0.7320209
##
## Variance function:
## Structure: Exponential of variance covariate
## Formula: ~1/LogNtempUnits
## Parameter estimates:
## expon
## -0.02724066
## Fixed effects: log(study_level_shift_abs) ~ LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -2.5260382 0.6212676 175 -4.065942 0.0001
## LogExtent 0.4865162 0.0842332 175 5.775823 0.0000
## ContinuousGrain 0.0532925 0.0479117 175 1.112305 0.2675
## QualityRAW 0.1825462 0.1315774 175 1.387367 0.1671
## QualityRESURVEYED -0.0994335 0.1444599 175 -0.688312 0.4922
## PrAboccurrence -0.2787999 0.1392715 175 -2.001845 0.0468
## Correlation:
## (Intr) LgExtn CntnsG QltRAW QRESUR
## LogExtent -0.938
## ContinuousGrain -0.316 0.113
## QualityRAW 0.063 -0.223 0.062
## QualityRESURVEYED -0.148 0.044 0.107 0.464
## PrAboccurrence -0.325 0.121 0.245 0.121 0.041
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.49949576 -0.63584774 -0.07258622 0.66662585 3.30643264
##
## Number of Observations: 217
## Number of Groups: 37
aic_r2_table(list(lme_lat_temp))
## Model AIC R2_mar R2_cond
## 1 Model 1 539.1 0.19 0.48
qqnorm(resid(lme_lat_temp));qqline(resid(lme_lat_temp))
##########################
# NtaxaClass + NtempUnits model
lme_lat_class_temp <-nlme::lme(
log(study_level_shift_abs) ~ LogExtent + ContinuousGrain + Quality + PrAb,
random = ~ 1 | Class,
weights = varComb(varExp(form = ~1/NtaxaClass),
varExp(form = ~1/LogNtempUnits)),
data = rs_SL_lat)
hist(resid(lme_lat_class_temp))
plot(predict(lme_lat_class_temp),resid(lme_lat_class_temp))
summary(lme_lat_class_temp)
## Linear mixed-effects model fit by REML
## Data: rs_SL_lat
## AIC BIC logLik
## 523.5557 557.0743 -251.7778
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.5780469 0.5438538
##
## Combination of variance functions:
## Structure: Exponential of variance covariate
## Formula: ~1/NtaxaClass
## Parameter estimates:
## expon
## 2.116788
## Structure: Exponential of variance covariate
## Formula: ~1/LogNtempUnits
## Parameter estimates:
## expon
## 0.00786819
## Fixed effects: log(study_level_shift_abs) ~ LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -2.4055438 0.5746180 175 -4.186336 0.0000
## LogExtent 0.4611313 0.0772520 175 5.969185 0.0000
## ContinuousGrain 0.0634887 0.0446834 175 1.420854 0.1571
## QualityRAW 0.1396562 0.1206498 175 1.157533 0.2486
## QualityRESURVEYED -0.0720077 0.1314635 175 -0.547739 0.5846
## PrAboccurrence -0.2560108 0.1282094 175 -1.996817 0.0474
## Correlation:
## (Intr) LgExtn CntnsG QltRAW QRESUR
## LogExtent -0.933
## ContinuousGrain -0.359 0.163
## QualityRAW 0.115 -0.266 0.023
## QualityRESURVEYED -0.094 -0.015 0.061 0.487
## PrAboccurrence -0.320 0.114 0.244 0.083 0.088
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.3035080 -0.6257098 -0.0166655 0.6831095 2.7082614
##
## Number of Observations: 217
## Number of Groups: 37
aic_r2_table(list(lme_lat_class_temp))
## Model AIC R2_mar R2_cond
## 1 Model 1 523.56 0.21 0.63
qqnorm(resid(lme_lat_class_temp));qqline(resid(lme_lat_class_temp))
All models have acceptable residual distribution.
The model with “1/variance” has lowest AIC than the model combining
NtaxaClass and LogNtempUnits.
The “1/variance” model with a varExp function was the best based on AIC
and residuals distribution.
# mod_lat <- lme_lat_var[[4]]
mod_lat <- tmb_lat_var
Predict shifts to all possible combinations of methodological variables. For this, we create a grid with all possible combination of methodological variables. Note that some combination of methodological variables do not exist in the real database. This grid can be useful for illustrative purposes: visualization of predicted shifts along levels of a given methodological variable.
new_data <- data.frame(
expand_grid(
LogNtempUnits = seq(from = min(rs_SL_lat$LogNtempUnits),
to = max(rs_SL_lat$LogNtempUnits),
length.out = 5),
LogExtent = seq(from = min(rs_SL_lat$LogExtent),
to = max(rs_SL_lat$LogExtent),
length.out = 5),
ContinuousGrain = unique(rs_SL_lat$ContinuousGrain),
PrAb = unique(rs_SL_lat$PrAb),
Quality = unique(rs_SL_lat$Quality),
Class = NA,
study_level_var_abs = NA))
pred_lat <- predict_fixed(mod_lat, new_data) # Use fixed effects only
new_data$pred_val <- pred_lat
## get mode of each predictor
time_mode = median(rs_SL_lat$LogNtempUnits)
time_mode = new_data$LogNtempUnits[which.min(abs(new_data$LogNtempUnits-time_mode))]
mode_grain = getmode(rs_SL_lat$ContinuousGrain)
mode_prab = getmode(rs_SL_lat$PrAb)
mode_quality = getmode(rs_SL_lat$Quality)
area = median(rs_SL_lat$LogExtent)
area = new_data$LogExtent[which.min(abs(new_data$LogExtent-area))]
############################################
## plot predictions and raw data together:
## AREA
new_data %>%
filter(PrAb == mode_prab, ## plot predictions for most common levels
LogNtempUnits == time_mode,
ContinuousGrain == mode_grain,
Quality == mode_quality
) %>%
ggplot(aes(x = LogExtent, y = pred_val)) +
geom_line() +
geom_point(data = rs_lat, aes(x = LogExtent, y = abs(SHIFT), colour = ID),
#I subseted only the studies used in the model
size = 0.25, alpha = 0.75) + ## small circles are raw shifts within studies
geom_point(data = rs_SL_lat, aes(x = LogExtent, y = study_level_shift_abs, fill = ID, size = log(NtaxaClass)),
shape = 21, colour = "black", alpha = 0.75) + ## large circles are mean study-level shift per class
theme(legend.position = "none") +
labs(y = "Mean range shift rate (class within study)")
# Only SL
p1 <- new_data %>%
filter(PrAb == mode_prab, ## plot predictions for most common levels
LogNtempUnits == time_mode,
ContinuousGrain == mode_grain,
Quality == mode_quality
) %>%
ggplot(aes(x = LogExtent, y = pred_val)) +
geom_line() +
geom_point(data = rs_SL_lat, aes(x = LogExtent, y = study_level_shift_abs, fill = ID, size = log(NtaxaClass)),
shape = 21, colour = "black", alpha = 0.75) + ## large circles are mean study-level shift per class
theme(legend.position = "none") +
labs(y = "Mean range shift rate (class within study)")
p1
## Contrast with the expected linear relationship
p1 +
geom_smooth(method = "lm",
data = rs_SL_lat,
aes(x = LogExtent, y = study_level_shift_abs),
color = "darkgray",se = FALSE, linetype = "dashed")
#############################################
# plot predictions and raw data together:
## N temporal units
new_data %>%
filter(PrAb == mode_prab, ## plot predictions for most common levels
LogExtent == area,
ContinuousGrain == mode_grain,
Quality == mode_quality
) %>%
ggplot(aes(x = LogNtempUnits, y = pred_val)) +
geom_line() +
geom_point(data = rs_lat,
aes(x = LogNtempUnits, y = abs(SHIFT), colour = ID),
size = 0.25, alpha = 0.75) +
geom_point(data = rs_SL_lat,
aes(x = LogNtempUnits, y = study_level_shift_abs,
fill = ID, size = log(NtaxaClass)),
shape = 21, colour = "black", alpha = 0.75) + #
theme(legend.position = "none") +
labs(y = "Mean range shift rate (class within study)")
# Only SL
p1 <- new_data %>%
filter(PrAb == mode_prab, ## plot predictions for most common levels
LogExtent == area,
ContinuousGrain == mode_grain,
Quality == mode_quality
) %>%
ggplot(aes(x = LogNtempUnits, y = pred_val)) +
geom_line() +
geom_point(data = rs_SL_lat,
aes(x = LogNtempUnits, y = study_level_shift_abs,
fill = ID, size = log(NtaxaClass)),
shape = 21, colour = "black", alpha = 0.75) +
theme(legend.position = "none") +
labs(y = "Mean range shift rate (class within study)")
p1
## Contrast with the expected linear relationship
p1 +
geom_smooth(method = "lm",
data = rs_SL_lat,
aes(x = LogNtempUnits, y = study_level_shift_abs),
color = "darkgray",se = FALSE, linetype = "dashed")
############################################
## GRAIN
new_data %>%
filter(
LogExtent == area,
LogNtempUnits == time_mode,
PrAb == mode_prab,
Quality == mode_quality
) %>%
ggplot(aes(x = ContinuousGrain, y = pred_val)) +
geom_line()+
theme(legend.position = "none") +
geom_point(data = rs_lat,
aes(x = ContinuousGrain, y = abs(SHIFT), colour = ID),
size = 0.25, alpha = 0.75,
position = position_jitterdodge(jitter.width = 0.03, jitter.height = 0)) +
geom_point(data = rs_SL_lat,
aes(x = ContinuousGrain, y = study_level_shift_abs,
fill = ID, size = log(NtaxaClass)),
shape = 21, colour = "black", alpha = 0.75,
position = position_jitterdodge(jitter.width = 0.03, jitter.height = 0)) +
labs(y = "Mean range shift rate (class within study)")
# Only SL
p1 <- new_data %>%
filter(
LogExtent == area,
LogNtempUnits == time_mode,
PrAb == mode_prab,
Quality == mode_quality
) %>%
ggplot(aes(x = ContinuousGrain, y = pred_val)) +
geom_line()+
theme(legend.position = "none") +
geom_point(data = rs_SL_lat,
aes(x = ContinuousGrain, y = study_level_shift_abs,
fill = ID, size = log(NtaxaClass)),
shape = 21, colour = "black", alpha = 0.75,
position = position_jitterdodge(jitter.width = 0.03, jitter.height = 0)) +
labs(y = "Mean range shift rate (class within study)")
p1
## Contrast with the expected linear relationship
p1 +
geom_smooth(method = "lm",data = rs_SL_lat,
aes(x = ContinuousGrain, y = study_level_shift_abs),
color = "darkgray",se = FALSE, linetype = "dashed")
############################################
## PRAB
new_data %>%
filter(
LogExtent == area,
LogNtempUnits == time_mode,
ContinuousGrain == mode_grain,
Quality == mode_quality
) %>%
ggplot(aes(x = PrAb, y = pred_val)) +
geom_boxplot() +
theme(legend.position = "none") +
geom_point(data = rs_lat,
aes(x = PrAb, y = abs(SHIFT), colour = ID),
size = 0.25, alpha = 0.75,
position = position_jitterdodge(jitter.width = 0.03,
jitter.height = 0)) +
geom_point(data = rs_SL_lat,
aes(x = PrAb, y = study_level_shift_abs,
fill = ID, size = log(NtaxaClass)),
shape = 21, colour = "black", alpha = 0.75,
position = position_jitterdodge(jitter.width = 0.03,
jitter.height = 0)) +
labs(y = "Mean range shift rate (class within study)")
# Only SL
p1 <- new_data %>%
filter(
LogExtent == area,
LogNtempUnits == time_mode,
ContinuousGrain == mode_grain,
Quality == mode_quality
) %>%
ggplot(aes(x = PrAb, y = pred_val)) +
geom_boxplot() +
theme(legend.position = "none") +
geom_point(data = rs_SL_lat,
aes(x = PrAb, y = study_level_shift_abs,
fill = ID, size = log(NtaxaClass)),
shape = 21, colour = "black", alpha = 0.75,
position = position_jitterdodge(jitter.width = 0.03,
jitter.height = 0)) +
labs(y = "Mean range shift rate (class within study)")
p1
## Contrast with the expected linear relationship
p1 +
geom_boxplot(data=rs_SL_lat,
aes(x = PrAb, y = study_level_shift_abs), color = "red", alpha = .5)
rs_SL_ele <- read.csv("outputs/ELEData_model.csv")
rs_SL_ele$Class <- factor(rs_SL_ele$Class)
dim(rs_SL_ele)
## [1] 157 23
nlevels(factor(rs_SL_ele$Source))
## [1] 92
nlevels(factor(rs_SL_ele$Class))
## [1] 16
# 120 studies, 37 taxonomic groups
hist(rs_SL_ele$study_level_shift_abs)
hist(rs_SL_ele$study_level_var)
hist(rs_SL_ele$LogExtent)
hist(rs_SL_ele$ContinuousGrain)
hist(rs_SL_ele$LogNtempUnits, breaks = 50)
ggplot(rs_SL_ele, aes(x=Quality)) + geom_bar()
ggplot(rs_SL_ele, aes(x=PrAb)) + geom_bar()
ggplot(rs_SL_ele, aes(y=Class)) + geom_bar()
ggplot(rs_SL_ele, aes(y=Sampling)) + geom_bar()
# we cannot use sampling and LogNtempUnits in the same model
corrplot(
cor(rs_SL_ele[,c("study_level_shift_abs",
"study_level_var",
"NtempUnits",
"LogExtent",
"NtaxaClass")]),
method = "number",
type = "lower",
diag = FALSE)
plot(rs_SL_ele$study_level_shift_abs, 1/rs_SL_ele$study_level_var)
plot(log(rs_SL_ele$study_level_shift_abs), log(1/rs_SL_ele$study_level_var))
# study_level_var is not correlated with Extent
plot(log(1/study_level_var) ~ LogExtent, rs_SL_ele)
cor.test(rs_SL_ele$LogExtent,1/rs_SL_ele$study_level_var)
##
## Pearson's product-moment correlation
##
## data: rs_SL_ele$LogExtent and 1/rs_SL_ele$study_level_var
## t = -0.22586, df = 155, p-value = 0.8216
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.1742818 0.1388939
## sample estimates:
## cor
## -0.01813887
# NtaxaClass is OK
plot(study_level_shift_abs ~ NtaxaClass, rs_SL_ele)
plot(log(study_level_shift_abs) ~ log(NtaxaClass), rs_SL_ele)
plot(log(1/study_level_var) ~ log(NtaxaClass), rs_SL_ele)
# LogNtempUnits is OK
plot(study_level_shift_abs ~ LogNtempUnits, rs_SL_ele)
plot(log(study_level_shift_abs) ~ log(LogNtempUnits), rs_SL_ele)
# we cannot use sampling and LogNtempUnits in the same model
boxplot(rs_SL_ele$LogNtempUnits~rs_SL_ele$Sampling) # do not fit models with sampling
# more correlation tests
boxplot(log(study_level_var) ~ ContinuousGrain, rs_SL_ele)
cor.test(rs_SL_ele$ContinuousGrain,log(rs_SL_ele$study_level_var))
##
## Pearson's product-moment correlation
##
## data: rs_SL_ele$ContinuousGrain and log(rs_SL_ele$study_level_var)
## t = 0.77415, df = 155, p-value = 0.44
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.0955050 0.2165941
## sample estimates:
## cor
## 0.06206156
boxplot(log(NtaxaClass) ~ ContinuousGrain, rs_SL_ele)
boxplot(LogExtent ~ ContinuousGrain, rs_SL_ele)
# N temporal units
p1 <- ggplot(rs_ele, aes(x = NtempUnits, y = abs(SHIFT), size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Species-level data")+
xlab("Log N temporal units") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_ele, aes(x = NtempUnits, y = study_level_shift_abs, size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Study-level data")+
xlab("Log N temporal units") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
# Extent
p1 <- ggplot(rs_full %>%
filter(Gradient == "ELE"),
aes(x = LogExtent, y = abs(SHIFT), size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Species-level data")+
xlab("Study area\n(Elevation extent km)") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_ele, aes(x = LogExtent, y = study_level_shift_abs, size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Study-level data")+
xlab("Study area\n(Elevation extent km)") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
# Grain size
p1 <- ggplot(rs_ele, aes(x = ContinuousGrain, y = abs(SHIFT), size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Species-level data")+
xlab("Grain size categories") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_ele, aes(x = ContinuousGrain, y = study_level_shift_abs, size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Study-level data")+
xlab("Grain size categories") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
# Sampling
p1 <- ggplot(rs_ele, aes(x = Sampling, y = abs(SHIFT), size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black",alpha=.2, aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Species-level data")+
xlab("Sampling") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_ele, aes(x = Sampling, y = study_level_shift_abs, size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black", aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Study-level data")+
xlab("Sampling") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
# Quality
p1 <- ggplot(rs_ele, aes(x = Quality, y = abs(SHIFT), size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black",alpha=.2, aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Species-level data")+
xlab("Quality") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_ele, aes(x = Quality, y = study_level_shift_abs, size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black", aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Study-level data")+
xlab("Quality") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
# PrAb
p1 <- ggplot(rs_ele, aes(x = PrAb, y = abs(SHIFT), size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black",alpha=.2, aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Species-level data")+
xlab("Quality") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_ele, aes(x = PrAb, y = study_level_shift_abs, size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black", aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Study-level data")+
xlab("Quality") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
# N temporal units
p1 <- ggplot(rs_full %>%
filter(Gradient == "ELE"),
aes(x = NtempUnits, y = SHIFT, size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Species-level data")+
xlab("Log N temporal units") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_ele, aes(x = NtempUnits, y = study_level_shift, size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Study-level data")+
xlab("Log N temporal units") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
# Extent
p1 <- ggplot(rs_full %>%
filter(Gradient == "ELE"),
aes(x = LogExtent, y = SHIFT, size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Species-level data")+
xlab("Study area\n(Latitudinal extent km)") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_ele, aes(x = LogExtent, y = study_level_shift, size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Study-level data")+
xlab("Study area\n(Latitudinal extent km)") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
# Grain size
p1 <- ggplot(rs_full %>%
filter(Gradient == "ELE"),
aes(x = ContinuousGrain, y = SHIFT, size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Species-level data")+
xlab("Grain size categories") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_ele, aes(x = ContinuousGrain, y = study_level_shift, size = NtaxaClass, fill = Class))+
geom_point(shape = 21, colour = "black") +
ggtitle("Study-level data")+
xlab("Grain size categories") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
# Sampling
p1 <- ggplot(rs_full %>%
filter(Gradient == "ELE"),
aes(x = Sampling, y = SHIFT, size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black",alpha=.2, aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Species-level data")+
xlab("Sampling") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_ele, aes(x = Sampling, y = study_level_shift, size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black", aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Study-level data")+
xlab("Sampling") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
## Warning: The following aesthetics were dropped during statistical transformation: size
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## The following aesthetics were dropped during statistical transformation: size
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
# Quality
p1 <- ggplot(rs_full %>%
filter(Gradient == "ELE"),
aes(x = Quality, y = SHIFT, size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black",alpha=.2, aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Species-level data")+
xlab("Quality") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_ele, aes(x = Quality, y = study_level_shift, size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black", aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Study-level data")+
xlab("Quality") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
## Warning: The following aesthetics were dropped during statistical transformation: size
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## The following aesthetics were dropped during statistical transformation: size
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
# PrAb
p1 <- ggplot(rs_full %>%
filter(Gradient == "ELE"),
aes(x = PrAb, y = SHIFT, size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black",alpha=.2, aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Species-level data")+
xlab("Quality") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
p2 <- ggplot(rs_SL_ele, aes(x = PrAb, y = study_level_shift, size = NtaxaClass))+
geom_jitter(shape = 21, colour = "black", aes(fill = Class)) +
geom_boxplot(alpha=.5) +
ggtitle("Study-level data")+
xlab("Quality") + ylab("Study level average shift\n(absolute km/year)")+
theme_classic() + theme(legend.position = "none")
gridExtra::grid.arrange(p1,p2,nrow=1)
## Warning: The following aesthetics were dropped during statistical transformation: size
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## The following aesthetics were dropped during statistical transformation: size
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
Looking at the data it seems it makes more sense using absolute shift values than raw values.
# Gamma intercept
tmb_ele_gamma_int <- glmmTMB(
study_level_shift_abs ~
LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb +
(1|Class),
family = Gamma(link = "log"),
data = rs_SL_ele)
# Gamma slope
tmb_ele_gamma_slp <- glmmTMB(
study_level_shift_abs ~
LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb +
(LogExtent|Class),
family = Gamma(link = "log"),
data = rs_SL_ele)
## Warning in fitTMB(TMBStruc): Model convergence problem; non-positive-definite
## Hessian matrix. See vignette('troubleshooting')
## Warning in fitTMB(TMBStruc): Model convergence problem; singular convergence
## (7). See vignette('troubleshooting')
# Gaussian intercept
tmb_ele_gaussian_int <- glmmTMB(
log(study_level_shift_abs) ~
LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb +
(1|Class),
data = rs_SL_ele)
# Gaussian slope
tmb_ele_gaussian_slp <- glmmTMB(
log(study_level_shift_abs) ~
LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb +
(LogExtent|Class),
data = rs_SL_ele)
## Warning in fitTMB(TMBStruc): Model convergence problem; singular convergence
## (7). See vignette('troubleshooting')
###########
# Get results
tmb_list <- list(tmb_ele_gamma_int,
tmb_ele_gamma_slp,
tmb_ele_gaussian_int,
tmb_ele_gaussian_slp)
tmb_names <- c("Gamma intercept",
"Gamma slope",
"Gaussian intercept",
"Gaussian slope")
# AIC and R2 table
DT::datatable(aic_r2_table(tmb_list, tmb_names))
# Residual distribution
for(i in 1:length(tmb_list)){
try({
plot(simulateResiduals(tmb_list[[i]]), main = tmb_names[i])
})
}
Gaussian intercept is the best model according to AIC. This model also show good residual distribution.
# The selected model
tmb_ele <- tmb_ele_gaussian_int
summary(tmb_ele)
## Family: gaussian ( identity )
## Formula:
## log(study_level_shift_abs) ~ LogNtempUnits + LogExtent + ContinuousGrain +
## Quality + PrAb + (1 | Class)
## Data: rs_SL_ele
##
## AIC BIC logLik deviance df.resid
## 352.5 380.0 -167.2 334.5 148
##
## Random effects:
##
## Conditional model:
## Groups Name Variance Std.Dev.
## Class (Intercept) 0.05742 0.2396
## Residual 0.46120 0.6791
## Number of obs: 157, groups: Class, 16
##
## Dispersion estimate for gaussian family (sigma^2): 0.461
##
## Conditional model:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.56387 0.66892 -0.843 0.39925
## LogNtempUnits 0.28225 0.09980 2.828 0.00468 **
## LogExtent 0.17746 0.08239 2.154 0.03125 *
## ContinuousGrain -0.02780 0.12619 -0.220 0.82566
## QualityRAW 0.16540 0.15320 1.080 0.28031
## QualityRESURVEYED -0.14673 0.13368 -1.098 0.27236
## PrAboccurrence -0.21046 0.13835 -1.521 0.12821
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# NtaxaClass model
tmb_ele_class <- try(
update(tmb_ele, weights = NtaxaClass))
# summary(tmb_ele_class)
# 1/study_level_var model
tmb_ele_var <- try(
update(tmb_ele, weights = range01(log(1/study_level_var_abs)))) # precision
# NtempUnits model
tmb_ele_temp <- try(
update(tmb_ele,
.~ LogExtent + ContinuousGrain + Quality + PrAb + (1 | Class),
weights = LogNtempUnits))
# NtaxaClass + NtempUnits model
tmb_ele_class_temp <- try(
update(tmb_ele,
.~ LogExtent + ContinuousGrain + Quality + PrAb + (1 | Class),
weights = log(NtaxaClass) * LogNtempUnits)) # two sources of variability (N species and sample size)
##############################################################
tmb_list <- list(tmb_ele,
tmb_ele_var,
tmb_ele_class,
tmb_ele_temp,
tmb_ele_class_temp)
tmb_names <- c("No weights",
"1/variance",
"NtaxaClass",
"NtempUnits",
"NtaxaClass + NtempUnits")
# AIC and R2 table
DT::datatable(aic_r2_table(tmb_list, tmb_names))
## Warning: Can't compute random effect variances. Some variance components equal
## zero. Your model may suffer from singularity (see `?lme4::isSingular`
## and `?performance::check_singularity`).
## Solution: Respecify random structure! You may also decrease the
## `tolerance` level to enforce the calculation of random effect variances.
## Random effect variances not available. Returned R2 does not account for random effects.
## Warning: Can't compute random effect variances. Some variance components equal
## zero. Your model may suffer from singularity (see `?lme4::isSingular`
## and `?performance::check_singularity`).
## Solution: Respecify random structure! You may also decrease the
## `tolerance` level to enforce the calculation of random effect variances.
## Random effect variances not available. Returned R2 does not account for random effects.
# Dharma residuals
for(i in 1:length(tmb_list)){
try({
plot(simulateResiduals(tmb_list[[i]]), main = tmb_names[i])
})
}
## Warning in getSimulations.glmmTMB(fittedModel, nsim = n, type = "normal", :
## Model was fit with prior weights. These will be ignored in the simulation. See
## ?getSimulations for details
## Warning in getSimulations.glmmTMB(fittedModel, nsim = n, type = "normal", :
## Model was fit with prior weights. These will be ignored in the simulation. See
## ?getSimulations for details
## Warning in getSimulations.glmmTMB(fittedModel, nsim = n, type = "normal", :
## Model was fit with prior weights. These will be ignored in the simulation. See
## ?getSimulations for details
## Warning in getSimulations.glmmTMB(fittedModel, nsim = n, type = "normal", :
## Model was fit with prior weights. These will be ignored in the simulation. See
## ?getSimulations for details
# qqnorm plot
{
par(mfrow=c(2,3))
for(i in 1:length(tmb_list)){
qqnorm(resid(tmb_list[[i]]), main = tmb_names[i])
qqline(resid(tmb_list[[i]]))
}
}
# histogram residuals
{
par(mfrow=c(2,3))
for(i in 1:length(tmb_list)){
hist(resid(tmb_list[[i]]), main = tmb_names[i], xlab = "Residual")
}
}
# residuals vs fitted
{
par(mfrow=c(2,3))
for(i in 1:length(tmb_list)){
plot(fitted(tmb_list[[i]]),resid(tmb_list[[i]]), main = tmb_names[i],
xlab="Fitted values",ylab = "Residuals")
}
}
The model with the lowest AIC is the one with “1/variance”, followed by “NtempUnits”. However, residual distribution of the “1/variance” model is problematic. The model with “NtempUnits” has low AIC and good residual distribution.
nlme only fits Gaussian models. Therefore, here we test random intercept vs random slopes.
lme_ele_gaussian_int <- try(lme(
log(study_level_shift_abs) ~
LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb,
random = ~1|Class,
data = rs_SL_ele), silent = TRUE)
lme_ele_gaussian_slp <- try(lme(
log(study_level_shift_abs) ~
LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb,
random = ~LogExtent|Class,
data = rs_SL_ele), silent = TRUE)
###########
# Get results
lme_list <- list(lme_ele_gaussian_int,
lme_ele_gaussian_slp)
lme_names <- c("Gaussian intercept",
"Gaussian slope")
# AIC and R2 table
DT::datatable(aic_r2_table(lme_list, lme_names))
# Residual distribution
for(i in 1:length(lme_list)){
try({
plot(simulateResiduals(lme_list[[i]]), main = lme_names[i])
}, silent = TRUE)
}
## Warning in checkModel(fittedModel): DHARMa: fittedModel not in class of
## supported models. Absolutely no guarantee that this will work!
## Warning in checkModel(fittedModel): DHARMa: fittedModel not in class of
## supported models. Absolutely no guarantee that this will work!
The model with a random intercept has the lowest AIC.
Here we contrast 3 weighting parameters:
1) 1/study_level_var_abs
2) NtaxaClass
3) NtaxaClass + LogNtempUnits
For each of these parameters, we tested multiple variance functions
# The selected model
lme_ele <- lme_ele_gaussian_int
summary(lme_ele)
## Linear mixed-effects model fit by REML
## Data: rs_SL_ele
## AIC BIC logLik
## 370.1906 397.2863 -176.0953
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.2618918 0.6927744
##
## Fixed effects: log(study_level_shift_abs) ~ LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -0.5665101 0.6833711 135 -0.8289933 0.4086
## LogNtempUnits 0.2818703 0.1018657 135 2.7670781 0.0065
## LogExtent 0.1776792 0.0841615 135 2.1111709 0.0366
## ContinuousGrain -0.0283435 0.1289498 135 -0.2198022 0.8264
## QualityRAW 0.1627400 0.1557477 135 1.0448950 0.2979
## QualityRESURVEYED -0.1484327 0.1361030 135 -1.0905910 0.2774
## PrAboccurrence -0.2086462 0.1408523 135 -1.4813122 0.1409
## Correlation:
## (Intr) LgNtmU LgExtn CntnsG QltRAW QRESUR
## LogNtempUnits -0.102
## LogExtent -0.923 -0.052
## ContinuousGrain -0.064 0.112 -0.228
## QualityRAW -0.318 -0.136 0.276 0.026
## QualityRESURVEYED -0.299 0.068 0.158 0.157 0.463
## PrAboccurrence -0.187 0.029 -0.005 0.129 -0.111 0.069
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.79473828 -0.67384968 -0.02456627 0.65023332 2.87768510
##
## Number of Observations: 157
## Number of Groups: 16
##########################
# 1) 1/study_level_var_abs
my_variable <- as.formula("~ study_level_var_abs")
weight_options <- list(varIdent(form = my_variable),
varFixed(my_variable),
varPower(form = my_variable),
varExp(form = my_variable))
# Fit
lme_ele_var <- lapply(weight_options, function(x){
update(lme_ele, weights = x)
})
# histogram residuals
{
par(mfrow=c(2,2))
for(i in 1:length(lme_ele_var)){
try(hist(resid(lme_ele_var[[i]]),main = class(weight_options[[i]])[[1]]))
}
}
# residuals vs fitted
{
par(mfrow=c(2,2))
for(i in 1:length(lme_ele_var)){
plot(fitted(lme_ele_var[[i]]),resid(lme_ele_var[[i]]),
main = class(weight_options[[i]])[[1]],
xlab="Fitted values",ylab = "Residuals")
}
}
# qqnorm plot
{
par(mfrow=c(2,2))
for(i in 1:length(lme_ele_var)){
qqnorm(resid(lme_ele_var[[i]]),
main = class(weight_options[[i]])[[1]])
qqline(resid(lme_ele_var[[i]]))
}
}
# AIC and R2 table
DT::datatable(aic_r2_table(lme_ele_var, sapply(weight_options, function(x) class(x)[[1]])))
# summary model
lapply(lme_ele_var, function(x) summary(x))
## [[1]]
## Linear mixed-effects model fit by REML
## Data: rs_SL_ele
## AIC BIC logLik
## 370.1906 397.2863 -176.0953
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.2618918 0.6927744
##
## Fixed effects: log(study_level_shift_abs) ~ LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -0.5665101 0.6833711 135 -0.8289933 0.4086
## LogNtempUnits 0.2818703 0.1018657 135 2.7670781 0.0065
## LogExtent 0.1776792 0.0841615 135 2.1111709 0.0366
## ContinuousGrain -0.0283435 0.1289498 135 -0.2198022 0.8264
## QualityRAW 0.1627400 0.1557477 135 1.0448950 0.2979
## QualityRESURVEYED -0.1484327 0.1361030 135 -1.0905910 0.2774
## PrAboccurrence -0.2086462 0.1408523 135 -1.4813122 0.1409
## Correlation:
## (Intr) LgNtmU LgExtn CntnsG QltRAW QRESUR
## LogNtempUnits -0.102
## LogExtent -0.923 -0.052
## ContinuousGrain -0.064 0.112 -0.228
## QualityRAW -0.318 -0.136 0.276 0.026
## QualityRESURVEYED -0.299 0.068 0.158 0.157 0.463
## PrAboccurrence -0.187 0.029 -0.005 0.129 -0.111 0.069
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.79473828 -0.67384968 -0.02456627 0.65023332 2.87768510
##
## Number of Observations: 157
## Number of Groups: 16
##
## [[2]]
## Linear mixed-effects model fit by REML
## Data: rs_SL_ele
## AIC BIC logLik
## 557.6328 584.7286 -269.8164
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.3560825 0.5516225
##
## Variance function:
## Structure: fixed weights
## Formula: ~study_level_var_abs
## Fixed effects: log(study_level_shift_abs) ~ LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -0.3095504 0.7821234 135 -0.395782 0.6929
## LogNtempUnits 0.5503917 0.2928678 135 1.879318 0.0624
## LogExtent -0.0578956 0.1034053 135 -0.559890 0.5765
## ContinuousGrain 0.5843595 0.1745176 135 3.348428 0.0011
## QualityRAW 0.4482744 0.1741657 135 2.573838 0.0111
## QualityRESURVEYED -0.1142241 0.1121303 135 -1.018673 0.3102
## PrAboccurrence -0.5114893 0.1891612 135 -2.703987 0.0077
## Correlation:
## (Intr) LgNtmU LgExtn CntnsG QltRAW QRESUR
## LogNtempUnits -0.272
## LogExtent -0.874 -0.032
## ContinuousGrain 0.046 0.066 -0.370
## QualityRAW -0.164 0.035 0.128 -0.081
## QualityRESURVEYED 0.216 0.033 -0.306 0.051 0.278
## PrAboccurrence -0.118 0.053 -0.148 0.227 -0.069 0.046
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -4.7575487 0.1934655 0.4876003 0.8908965 3.8365471
##
## Number of Observations: 157
## Number of Groups: 16
##
## [[3]]
## Linear mixed-effects model fit by REML
## Data: rs_SL_ele
## AIC BIC logLik
## 368.2133 398.3196 -174.1066
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.2951556 0.7710153
##
## Variance function:
## Structure: Power of variance covariate
## Formula: ~study_level_var_abs
## Parameter estimates:
## power
## -0.07838948
## Fixed effects: log(study_level_shift_abs) ~ LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -0.4034978 0.6724225 135 -0.6000659 0.5495
## LogNtempUnits 0.2602537 0.0889548 135 2.9256861 0.0040
## LogExtent 0.1840592 0.0823818 135 2.2342228 0.0271
## ContinuousGrain -0.0990278 0.1244354 135 -0.7958176 0.4275
## QualityRAW 0.1295775 0.1509162 135 0.8586055 0.3921
## QualityRESURVEYED -0.0751246 0.1326141 135 -0.5664907 0.5720
## PrAboccurrence -0.1883007 0.1352009 135 -1.3927477 0.1660
## Correlation:
## (Intr) LgNtmU LgExtn CntnsG QltRAW QRESUR
## LogNtempUnits -0.072
## LogExtent -0.926 -0.066
## ContinuousGrain -0.066 0.121 -0.217
## QualityRAW -0.307 -0.194 0.273 0.014
## QualityRESURVEYED -0.307 0.062 0.169 0.146 0.454
## PrAboccurrence -0.201 0.008 0.016 0.118 -0.095 0.106
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.1918563 -0.8099605 -0.1749449 0.5460862 3.2864242
##
## Number of Observations: 157
## Number of Groups: 16
##
## [[4]]
## Linear mixed-effects model fit by REML
## Data: rs_SL_ele
## AIC BIC logLik
## 362.0615 392.1679 -171.0308
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.20329 0.6329828
##
## Variance function:
## Structure: Exponential of variance covariate
## Formula: ~study_level_var_abs
## Parameter estimates:
## expon
## 0.003861337
## Fixed effects: log(study_level_shift_abs) ~ LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -0.6385182 0.6464150 135 -0.9877836 0.3250
## LogNtempUnits 0.2722494 0.1066754 135 2.5521296 0.0118
## LogExtent 0.1676456 0.0792938 135 2.1142340 0.0363
## ContinuousGrain 0.0293767 0.1217038 135 0.2413784 0.8096
## QualityRAW 0.1790103 0.1484502 135 1.2058605 0.2300
## QualityRESURVEYED -0.1966395 0.1304901 135 -1.5069302 0.1342
## PrAboccurrence -0.1786052 0.1355370 135 -1.3177591 0.1898
## Correlation:
## (Intr) LgNtmU LgExtn CntnsG QltRAW QRESUR
## LogNtempUnits -0.142
## LogExtent -0.921 -0.026
## ContinuousGrain -0.071 0.102 -0.226
## QualityRAW -0.336 -0.072 0.284 0.039
## QualityRESURVEYED -0.304 0.076 0.164 0.161 0.468
## PrAboccurrence -0.188 0.050 -0.009 0.134 -0.120 0.041
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.90472019 -0.62135350 0.05226188 0.71693426 2.39625091
##
## Number of Observations: 157
## Number of Groups: 16
##########################
# 2) NtaxaClass
my_variable <- as.formula("~ 1/NtaxaClass")
weight_options <- list(varIdent(form = my_variable),
varFixed(my_variable),
varPower(form = my_variable),
varExp(form = my_variable))
# Fit
lme_ele_class <- lapply(weight_options, function(x){
try(update(lme_ele, weights = x))
})
# histogram residuals
{
par(mfrow=c(2,2))
for(i in 1:length(lme_ele_class)){
try(hist(resid(lme_ele_class[[i]]),main = class(weight_options[[i]])[[1]]))
}
}
# residuals vs fitted
{
par(mfrow=c(2,2))
for(i in 1:length(lme_ele_class)){
try(plot(fitted(lme_ele_class[[i]]),resid(lme_ele_class[[i]]),
main = class(weight_options[[i]])[[1]],
xlab="Fitted values",ylab = "Residuals"))
}
}
# qqnorm plot
{
par(mfrow=c(2,2))
for(i in 1:length(lme_ele_class)){
try(qqnorm(resid(lme_ele_class[[i]]),
main = class(weight_options[[i]])[[1]]))
try(qqline(resid(lme_ele_class[[i]])))
}
}
# AIC and R2 table
DT::datatable(aic_r2_table(lme_ele_class, sapply(weight_options, function(x) class(x)[[1]])))
# summary model
lapply(lme_ele_class, function(x) try(summary(x)))
## [[1]]
## Linear mixed-effects model fit by REML
## Data: rs_SL_ele
## AIC BIC logLik
## 370.1906 397.2863 -176.0953
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.2618918 0.6927744
##
## Fixed effects: log(study_level_shift_abs) ~ LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -0.5665101 0.6833711 135 -0.8289933 0.4086
## LogNtempUnits 0.2818703 0.1018657 135 2.7670781 0.0065
## LogExtent 0.1776792 0.0841615 135 2.1111709 0.0366
## ContinuousGrain -0.0283435 0.1289498 135 -0.2198022 0.8264
## QualityRAW 0.1627400 0.1557477 135 1.0448950 0.2979
## QualityRESURVEYED -0.1484327 0.1361030 135 -1.0905910 0.2774
## PrAboccurrence -0.2086462 0.1408523 135 -1.4813122 0.1409
## Correlation:
## (Intr) LgNtmU LgExtn CntnsG QltRAW QRESUR
## LogNtempUnits -0.102
## LogExtent -0.923 -0.052
## ContinuousGrain -0.064 0.112 -0.228
## QualityRAW -0.318 -0.136 0.276 0.026
## QualityRESURVEYED -0.299 0.068 0.158 0.157 0.463
## PrAboccurrence -0.187 0.029 -0.005 0.129 -0.111 0.069
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.79473828 -0.67384968 -0.02456627 0.65023332 2.87768510
##
## Number of Observations: 157
## Number of Groups: 16
##
## [[2]]
## Linear mixed-effects model fit by REML
## Data: rs_SL_ele
## AIC BIC logLik
## 433.1665 460.2622 -207.5832
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.2512939 4.110173
##
## Variance function:
## Structure: fixed weights
## Formula: ~1/NtaxaClass
## Fixed effects: log(study_level_shift_abs) ~ LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -0.2283451 0.7143937 135 -0.319635 0.7497
## LogNtempUnits 0.4538880 0.0621407 135 7.304194 0.0000
## LogExtent 0.1432886 0.0852335 135 1.681131 0.0950
## ContinuousGrain -0.0996393 0.0877098 135 -1.136011 0.2580
## QualityRAW -0.0220492 0.1455722 135 -0.151466 0.8798
## QualityRESURVEYED -0.1039861 0.1038899 135 -1.000925 0.3187
## PrAboccurrence -0.2168552 0.0984883 135 -2.201836 0.0294
## Correlation:
## (Intr) LgNtmU LgExtn CntnsG QltRAW QRESUR
## LogNtempUnits -0.077
## LogExtent -0.956 -0.025
## ContinuousGrain -0.124 0.206 -0.122
## QualityRAW -0.400 -0.471 0.431 -0.012
## QualityRESURVEYED -0.263 -0.015 0.198 0.207 0.445
## PrAboccurrence -0.177 0.069 0.026 0.423 -0.320 -0.183
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -3.9586319 -0.5169851 -0.0447035 0.4280858 2.7716126
##
## Number of Observations: 157
## Number of Groups: 16
##
## [[3]]
## Linear mixed-effects model fit by REML
## Data: rs_SL_ele
## AIC BIC logLik
## 368.7637 398.87 -174.3818
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.2610468 0.9109973
##
## Variance function:
## Structure: Power of variance covariate
## Formula: ~1/NtaxaClass
## Parameter estimates:
## power
## 0.08970922
## Fixed effects: log(study_level_shift_abs) ~ LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -0.3784942 0.6925878 135 -0.546493 0.5856
## LogNtempUnits 0.3032638 0.0954944 135 3.175724 0.0019
## LogExtent 0.1603629 0.0847878 135 1.891344 0.0607
## ContinuousGrain -0.0489349 0.1193912 135 -0.409870 0.6826
## QualityRAW 0.1033126 0.1547291 135 0.667700 0.5055
## QualityRESURVEYED -0.1377403 0.1319591 135 -1.043811 0.2984
## PrAboccurrence -0.2318803 0.1341852 135 -1.728062 0.0863
## Correlation:
## (Intr) LgNtmU LgExtn CntnsG QltRAW QRESUR
## LogNtempUnits -0.113
## LogExtent -0.931 -0.030
## ContinuousGrain -0.057 0.112 -0.220
## QualityRAW -0.320 -0.163 0.286 0.026
## QualityRESURVEYED -0.302 0.069 0.170 0.165 0.463
## PrAboccurrence -0.186 0.047 0.003 0.149 -0.128 0.055
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.88534287 -0.63003323 -0.05091425 0.65444661 2.92470286
##
## Number of Observations: 157
## Number of Groups: 16
##
## [[4]]
## Linear mixed-effects model fit by REML
## Data: rs_SL_ele
## AIC BIC logLik
## 370.8647 400.971 -175.4323
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.267146 0.6468247
##
## Variance function:
## Structure: Exponential of variance covariate
## Formula: ~1/NtaxaClass
## Parameter estimates:
## expon
## 0.6999818
## Fixed effects: log(study_level_shift_abs) ~ LogNtempUnits + LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -0.3926583 0.6884726 135 -0.5703324 0.5694
## LogNtempUnits 0.2856459 0.1011927 135 2.8227929 0.0055
## LogExtent 0.1622643 0.0845785 135 1.9185068 0.0572
## ContinuousGrain -0.0411185 0.1253765 135 -0.3279599 0.7435
## QualityRAW 0.1091734 0.1560608 135 0.6995567 0.4854
## QualityRESURVEYED -0.1512619 0.1341335 135 -1.1276966 0.2614
## PrAboccurrence -0.2266836 0.1384148 135 -1.6377120 0.1038
## Correlation:
## (Intr) LgNtmU LgExtn CntnsG QltRAW QRESUR
## LogNtempUnits -0.119
## LogExtent -0.926 -0.033
## ContinuousGrain -0.058 0.109 -0.227
## QualityRAW -0.310 -0.136 0.269 0.027
## QualityRESURVEYED -0.297 0.071 0.157 0.159 0.464
## PrAboccurrence -0.183 0.042 -0.005 0.127 -0.114 0.068
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.92188020 -0.64012774 -0.04210522 0.69059415 2.95213169
##
## Number of Observations: 157
## Number of Groups: 16
##########################
# NtempUnits model
lme_ele_temp <-nlme::lme(
log(study_level_shift_abs) ~ LogExtent + ContinuousGrain + Quality + PrAb,
random = ~ 1 | Class,
weights = varExp(form = ~1/LogNtempUnits),
data = rs_SL_ele)
hist(resid(lme_ele_temp))
plot(predict(lme_ele_temp),resid(lme_ele_temp))
summary(lme_ele_temp)
## Linear mixed-effects model fit by REML
## Data: rs_SL_ele
## AIC BIC logLik
## 374.9583 402.1138 -178.4791
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.2673134 0.7387208
##
## Variance function:
## Structure: Exponential of variance covariate
## Formula: ~1/LogNtempUnits
## Parameter estimates:
## expon
## -0.03110809
## Fixed effects: log(study_level_shift_abs) ~ LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -0.3629888 0.6937616 136 -0.5232183 0.6017
## LogExtent 0.1879246 0.0857591 136 2.1913075 0.0301
## ContinuousGrain -0.0653682 0.1307824 136 -0.4998245 0.6180
## QualityRAW 0.2139081 0.1577765 136 1.3557661 0.1774
## QualityRESURVEYED -0.1735447 0.1386184 136 -1.2519603 0.2127
## PrAboccurrence -0.2215600 0.1437848 136 -1.5409137 0.1257
## Correlation:
## (Intr) LgExtn CntnsG QltRAW QRESUR
## LogExtent -0.934
## ContinuousGrain -0.053 -0.225
## QualityRAW -0.338 0.273 0.040
## QualityRESURVEYED -0.295 0.162 0.151 0.477
## PrAboccurrence -0.186 -0.002 0.125 -0.107 0.067
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.75233236 -0.63702510 -0.01948146 0.66653593 2.80292465
##
## Number of Observations: 157
## Number of Groups: 16
aic_r2_table(list(lme_ele_temp))
## Model AIC R2_mar R2_cond
## 1 Model 1 374.96 0.06 0.17
qqnorm(resid(lme_ele_temp));qqline(resid(lme_ele_temp))
##########################
# NtaxaClass + NtempUnits model
lme_ele_class_temp <-nlme::lme(
log(study_level_shift_abs) ~ LogExtent + ContinuousGrain + Quality + PrAb,
random = ~ 1 | Class,
weights = varComb(varExp(form = ~1/NtaxaClass),
varExp(form = ~1/LogNtempUnits)),
data = rs_SL_ele)
hist(resid(lme_ele_class_temp))
plot(predict(lme_ele_class_temp),resid(lme_ele_class_temp))
summary(lme_ele_class_temp)
## Linear mixed-effects model fit by REML
## Data: rs_SL_ele
## AIC BIC logLik
## 375.7758 405.9486 -177.8879
##
## Random effects:
## Formula: ~1 | Class
## (Intercept) Residual
## StdDev: 0.2690567 0.7515249
##
## Combination of variance functions:
## Structure: Exponential of variance covariate
## Formula: ~1/NtaxaClass
## Parameter estimates:
## expon
## 0.6759289
## Structure: Exponential of variance covariate
## Formula: ~1/LogNtempUnits
## Parameter estimates:
## expon
## -0.09129755
## Fixed effects: log(study_level_shift_abs) ~ LogExtent + ContinuousGrain + Quality + PrAb
## Value Std.Error DF t-value p-value
## (Intercept) -0.13918651 0.6961218 136 -0.1999456 0.8418
## LogExtent 0.16595178 0.0860363 136 1.9288576 0.0558
## ContinuousGrain -0.07100645 0.1271129 136 -0.5586096 0.5773
## QualityRAW 0.14789586 0.1584144 136 0.9336013 0.3522
## QualityRESURVEYED -0.17644655 0.1365301 136 -1.2923639 0.1984
## PrAboccurrence -0.24786636 0.1412801 136 -1.7544327 0.0816
## Correlation:
## (Intr) LgExtn CntnsG QltRAW QRESUR
## LogExtent -0.937
## ContinuousGrain -0.045 -0.225
## QualityRAW -0.334 0.270 0.040
## QualityRESURVEYED -0.292 0.161 0.153 0.479
## PrAboccurrence -0.183 0.000 0.119 -0.106 0.063
##
## Standardized Within-Group Residuals:
## Min Q1 Med Q3 Max
## -2.87715955 -0.64359053 -0.04739503 0.67808785 2.89301266
##
## Number of Observations: 157
## Number of Groups: 16
aic_r2_table(list(lme_ele_class_temp))
## Model AIC R2_mar R2_cond
## 1 Model 1 375.78 0.05 0.16
qqnorm(resid(lme_ele_class_temp));qqline(resid(lme_ele_class_temp))
All models have acceptable residual distribution.
The model with “1/variance” has lowest AIC than the model combining
NtaxaClass and LogNtempUnits.
The “1/variance” model with a varExp function was the best based on AIC
and residuals distribution.
# mod_ele <- lme_ele_var[[4]]
mod_ele <- tmb_ele_var
Predict shifts to all possible combinations of methodological variables. For this, we create a grid with all possible combination of methodological variables. Note that some combination of methodological variables do not exist in the real database. This grid can be useful for illustrative purposes: visualization of predicted shifts along levels of a given methodological variable.
new_data <- data.frame(
expand_grid(
LogNtempUnits = seq(from = min(rs_SL_ele$LogNtempUnits),
to = max(rs_SL_ele$LogNtempUnits),
length.out = 5),
LogExtent = seq(from = min(rs_SL_ele$LogExtent),
to = max(rs_SL_ele$LogExtent),
length.out = 5),
ContinuousGrain = unique(rs_SL_ele$ContinuousGrain),
PrAb = unique(rs_SL_ele$PrAb),
Quality = unique(rs_SL_ele$Quality),
Class = NA,
study_level_var_abs = NA))
pred_ele <- predict_fixed(mod_ele, new_data) # Use fixed effects only
new_data$pred_val <- pred_ele
## get mode of each predictor
time_mode = median(rs_SL_ele$LogNtempUnits)
time_mode = new_data$LogNtempUnits[which.min(abs(new_data$LogNtempUnits-time_mode))]
mode_grain = getmode(rs_SL_ele$ContinuousGrain)
mode_prab = getmode(rs_SL_ele$PrAb)
mode_quality = getmode(rs_SL_ele$Quality)
area = median(rs_SL_ele$LogExtent)
area = new_data$LogExtent[which.min(abs(new_data$LogExtent-area))]
############################################
## plot predictions and raw data together:
## AREA
new_data %>%
filter(PrAb == mode_prab, ## plot predictions for most common levels
LogNtempUnits == time_mode,
ContinuousGrain == mode_grain,
Quality == mode_quality
) %>%
ggplot(aes(x = LogExtent, y = pred_val)) +
geom_line() +
geom_point(data = rs_ele, aes(x = LogExtent, y = abs(SHIFT), colour = ID),
#I subseted only the studies used in the model
size = 0.25, alpha = 0.75) + ## small circles are raw shifts within studies
geom_point(data = rs_SL_ele, aes(x = LogExtent, y = study_level_shift_abs, fill = ID, size = log(NtaxaClass)),
shape = 21, colour = "black", alpha = 0.75) + ## large circles are mean study-level shift per class
theme(legend.position = "none") +
labs(y = "Mean range shift rate (class within study)")
# Only SL
p1 <- new_data %>%
filter(PrAb == mode_prab, ## plot predictions for most common levels
LogNtempUnits == time_mode,
ContinuousGrain == mode_grain,
Quality == mode_quality
) %>%
ggplot(aes(x = LogExtent, y = pred_val)) +
geom_line() +
geom_point(data = rs_SL_ele, aes(x = LogExtent, y = study_level_shift_abs, fill = ID, size = log(NtaxaClass)),
shape = 21, colour = "black", alpha = 0.75) + ## large circles are mean study-level shift per class
theme(legend.position = "none") +
labs(y = "Mean range shift rate (class within study)")
p1
## Contrast with the expected linear relationship
p1 +
geom_smooth(method = "lm",
data = rs_SL_ele,
aes(x = LogExtent, y = study_level_shift_abs),
color = "darkgray",se = FALSE, linetype = "dashed")
#############################################
# plot predictions and raw data together:
## N temporal units
new_data %>%
filter(PrAb == mode_prab, ## plot predictions for most common levels
LogExtent == area,
ContinuousGrain == mode_grain,
Quality == mode_quality
) %>%
ggplot(aes(x = LogNtempUnits, y = pred_val)) +
geom_line() +
geom_point(data = rs_ele,
aes(x = LogNtempUnits, y = abs(SHIFT), colour = ID),
size = 0.25, alpha = 0.75) +
geom_point(data = rs_SL_ele,
aes(x = LogNtempUnits, y = study_level_shift_abs,
fill = ID, size = log(NtaxaClass)),
shape = 21, colour = "black", alpha = 0.75) + #
theme(legend.position = "none") +
labs(y = "Mean range shift rate (class within study)")
# Only SL
p1 <- new_data %>%
filter(PrAb == mode_prab, ## plot predictions for most common levels
LogExtent == area,
ContinuousGrain == mode_grain,
Quality == mode_quality
) %>%
ggplot(aes(x = LogNtempUnits, y = pred_val)) +
geom_line() +
geom_point(data = rs_SL_ele,
aes(x = LogNtempUnits, y = study_level_shift_abs,
fill = ID, size = log(NtaxaClass)),
shape = 21, colour = "black", alpha = 0.75) +
theme(legend.position = "none") +
labs(y = "Mean range shift rate (class within study)")
p1
## Contrast with the expected linear relationship
p1 +
geom_smooth(method = "lm",
data = rs_SL_ele,
aes(x = LogNtempUnits, y = study_level_shift_abs),
color = "darkgray",se = FALSE, linetype = "dashed")
############################################
## GRAIN
new_data %>%
filter(
LogExtent == area,
LogNtempUnits == time_mode,
PrAb == mode_prab,
Quality == mode_quality
) %>%
ggplot(aes(x = ContinuousGrain, y = pred_val)) +
geom_line()+
theme(legend.position = "none") +
geom_point(data = rs_ele,
aes(x = ContinuousGrain, y = abs(SHIFT), colour = ID),
size = 0.25, alpha = 0.75,
position = position_jitterdodge(jitter.width = 0.03, jitter.height = 0)) +
geom_point(data = rs_SL_ele,
aes(x = ContinuousGrain, y = study_level_shift_abs,
fill = ID, size = log(NtaxaClass)),
shape = 21, colour = "black", alpha = 0.75,
position = position_jitterdodge(jitter.width = 0.03, jitter.height = 0)) +
labs(y = "Mean range shift rate (class within study)")
# Only SL
p1 <- new_data %>%
filter(
LogExtent == area,
LogNtempUnits == time_mode,
PrAb == mode_prab,
Quality == mode_quality
) %>%
ggplot(aes(x = ContinuousGrain, y = pred_val)) +
geom_line()+
theme(legend.position = "none") +
geom_point(data = rs_SL_ele,
aes(x = ContinuousGrain, y = study_level_shift_abs,
fill = ID, size = log(NtaxaClass)),
shape = 21, colour = "black", alpha = 0.75,
position = position_jitterdodge(jitter.width = 0.03, jitter.height = 0)) +
labs(y = "Mean range shift rate (class within study)")
p1
## Contrast with the expected linear relationship
p1 +
geom_smooth(method = "lm",data = rs_SL_ele,
aes(x = ContinuousGrain, y = study_level_shift_abs),
color = "darkgray",se = FALSE, linetype = "dashed")
############################################
## PRAB
new_data %>%
filter(
LogExtent == area,
LogNtempUnits == time_mode,
ContinuousGrain == mode_grain,
Quality == mode_quality
) %>%
ggplot(aes(x = PrAb, y = pred_val)) +
geom_boxplot() +
theme(legend.position = "none") +
geom_point(data = rs_ele,
aes(x = PrAb, y = abs(SHIFT), colour = ID),
size = 0.25, alpha = 0.75,
position = position_jitterdodge(jitter.width = 0.03,
jitter.height = 0)) +
geom_point(data = rs_SL_ele,
aes(x = PrAb, y = study_level_shift_abs,
fill = ID, size = log(NtaxaClass)),
shape = 21, colour = "black", alpha = 0.75,
position = position_jitterdodge(jitter.width = 0.03,
jitter.height = 0)) +
labs(y = "Mean range shift rate (class within study)")
# Only SL
p1 <- new_data %>%
filter(
LogExtent == area,
LogNtempUnits == time_mode,
ContinuousGrain == mode_grain,
Quality == mode_quality
) %>%
ggplot(aes(x = PrAb, y = pred_val)) +
geom_boxplot() +
theme(legend.position = "none") +
geom_point(data = rs_SL_ele,
aes(x = PrAb, y = study_level_shift_abs,
fill = ID, size = log(NtaxaClass)),
shape = 21, colour = "black", alpha = 0.75,
position = position_jitterdodge(jitter.width = 0.03,
jitter.height = 0)) +
labs(y = "Mean range shift rate (class within study)")
p1
## Contrast with the expected linear relationship
p1 +
geom_boxplot(data=rs_SL_ele,
aes(x = PrAb, y = study_level_shift_abs), color = "red", alpha = .5)
Next step is to calculate the predicted study-level shift for each study, holding methods constant across all studies.
We want to leave variation due to class in, so we will ignore the random effect predict mean study level range shift per class, holding all methodological variables at their median/mode value.
We apply correction to species-level shifts based on the estimated study-level effect of methods using two approaches: Delta and ratio. The delta difference consists on the difference between the predicted shift at the study-level and the predicted shift under ideal method conditions. In contrast, the 𝚫Ratio is the ratio between the predicted study-level shift and predicted shift under ideal method conditions.
The ideal method was defined as the 95% quantile of Log N temporal units, Log extent of the study area (using latitudinal extent for latitudinal shifts or elevation extent for elevation shifts), grain = “small”, PrAb = “abundance”, and Quality = “resurveyed”. We these represent a set of methodological conditions that would (at least in theory) generate accurate range shift measurements.
var_meth_names <- c("ContinuousGrain","LogNtempUnits","LogExtent","PrAb","Quality")
#############
## 1) Predict to the full dataset study level shift
pred_rs_SL_lat <- predict_fixed(mod_lat,
mutate(rs_SL_full_lat,
study_level_var_abs = NA,
Class = NA)) # using fixed effects only
rs_SL_full_lat$Pred_SL <- pred_rs_SL_lat
pred_rs_SL_ele <- predict_fixed(mod_ele,
mutate(rs_SL_full_ele,
study_level_var_abs = NA,
Class = NA)) # using fixed effects only
rs_SL_full_ele$Pred_SL <- pred_rs_SL_ele
# Any predicted negative shift
# We do not expect models to predict negative shifts as we fit models using a Gamma distribution
any(exp(pred_rs_SL_lat) < 0)
## [1] FALSE
any(exp(pred_rs_SL_ele) < 0)
## [1] FALSE
#############
## 2) filter predictions on the ideal data
## 2.1) Latitude
# Option 1)
# Select the set of condition that (in theory) would provide a good shift estimate
# e.g., large grain, more time periods, abundance data, resurveyed...
# Option 2)
# Select the most common set of conditions for each single method variable
# Option 3)
# Select the most common set of conditions in the database == higher occupancy in a multidimensional space built with methodological variables
# Option 1)
ideal_data_lat1 <- data.frame(ContinuousGrain = quantile(rs_SL_lat$ContinuousGrain, 0.05),
LogNtempUnits = quantile(rs_SL_lat$LogNtempUnits, 0.95),
LogExtent = quantile(rs_SL_lat$LogExtent, 0.95),
PrAb = "abundance",
Quality = "RESURVEYED")
ideal_data_lat1
## ContinuousGrain LogNtempUnits LogExtent PrAb Quality
## 5% 1 3.621447 8.203747 abundance RESURVEYED
# Option 2)
ideal_data_lat2 <- rs_SL_lat[,var_meth_names] %>% unique()
ideal_data_lat2 <- apply(ideal_data_lat2, 2, getmode)
ideal_data_lat2 <- data.frame(matrix(ncol = length(ideal_data_lat2),data = ideal_data_lat2))
names(ideal_data_lat2) <- var_meth_names
ideal_data_lat2[,1:3] <- as.numeric(ideal_data_lat2[,1:3])
ideal_data_lat2
## ContinuousGrain LogNtempUnits LogExtent PrAb Quality
## 1 1 0.6931472 6.827608 occurrence RAW
# Option 3)
# Gower's distance
lat_meth_dis <- rs_SL_lat %>% select(var_meth_names) %>% data.frame %>% gowdis
# PCoA
rs_pcoa <- dudi.pco(lat_meth_dis, scannf = FALSE, nf = 3)
# Desity
rs_dens <- pointdensity(rs_pcoa$li, eps = .1, type = "density")
ideal_data_lat3 <- rs_SL_lat[which(rs_dens == max(rs_dens))[1],var_meth_names]
ideal_data_lat3
## ContinuousGrain LogNtempUnits LogExtent PrAb Quality
## 145 1 1.386294 7.070911 occurrence BALANCED
# Check whether the selected method conditions exists in the multidimensional method space
# create a methods table
lat_meth <- data.frame(rs_SL_lat[,var_meth_names])
# add ideal methodo
lat_meth <- rbind(data.frame(lat_meth, Reference = "Data"),
data.frame(ideal_data_lat1, Reference = "Ideal 1"),
data.frame(ideal_data_lat2, Reference = "Ideal 2"),
data.frame(ideal_data_lat3, Reference = "Ideal 3"))
ref_data <- 1:length(which(lat_meth$Reference=="Data"))
# Gower's distance
lat_meth_dis <- gowdis(lat_meth[,var_meth_names])
# PCoA
rs_pcoa <- dudi.pco(lat_meth_dis, scannf = FALSE, nf = 3)
plot_ly() %>%
add_trace(data = rs_pcoa$li[ref_data,], x = ~A1, y = ~A2, z = ~A3,
type = "scatter3d", mode = "markers",
marker = list(size = 5, opacity = .5)) %>%
add_trace(data = rs_pcoa$li[ref_data,], x = ~A1, y = ~A2, z = ~A3,
type="mesh3d", opacity=.2, alphahull=0) %>%
add_trace(data = rs_pcoa$li[-ref_data,], x = ~A1, y = ~A2, z = ~A3,
type = "scatter3d", mode = "markers",
color = lat_meth$Reference[-ref_data],
marker = list(size = 20, opacity = .5)) %>%
layout(title = "Latitude")
## 2.2) Elevation
# Option 1)
# Select the set of condition that (in theory) would provide a good shift estimate
# e.g., large grain, more time periods, abundance data, resurveyed...
# Option 2)
# Select the most common set of conditions in the database
# Option 3)
# Select the set of conditions that generates an average predicted value
# Option 1)
ideal_data_ele1 <- data.frame(ContinuousGrain = quantile(rs_SL_ele$ContinuousGrain, 0.05),
LogNtempUnits = quantile(rs_SL_ele$LogNtempUnits, 0.95),
LogExtent = quantile(rs_SL_ele$LogExtent, 0.95),
PrAb = "abundance",
Quality = "RESURVEYED")
ideal_data_ele1
## ContinuousGrain LogNtempUnits LogExtent PrAb Quality
## 5% 1 2.542443 8.554775 abundance RESURVEYED
# Option 2)
ideal_data_ele2 <- rs_SL_ele[,var_meth_names] %>% unique()
ideal_data_ele2 <- apply(ideal_data_ele2, 2, getmode)
ideal_data_ele2 <- data.frame(matrix(ncol = length(ideal_data_ele2),data = ideal_data_ele2))
names(ideal_data_ele2) <- var_meth_names
ideal_data_ele2[,1:3] <- as.numeric(ideal_data_ele2[,1:3])
ideal_data_ele2
## ContinuousGrain LogNtempUnits LogExtent PrAb Quality
## 1 1 0.6931472 8.452354 occurrence BALANCED
# Option 3)
# Gower's distance
ele_meth_dis <- rs_SL_ele %>% select(var_meth_names) %>% data.frame %>% gowdis
# PCoA
rs_pcoa <- dudi.pco(ele_meth_dis, scannf = FALSE, nf = 3)
# Desity
rs_dens <- pointdensity(rs_pcoa$li, eps = .1, type = "density")
ideal_data_ele3 <- rs_SL_ele[which(rs_dens == max(rs_dens))[1],var_meth_names]
ideal_data_ele3
## ContinuousGrain LogNtempUnits LogExtent PrAb Quality
## 58 1 2.484907 7.498726 occurrence BALANCED
# Check whether the selected method conditions exists in the multidimensional method space
# create a methods table
ele_meth <- data.frame(rs_SL_ele[,var_meth_names])
# add ideal methodo
ele_meth <- rbind(data.frame(ele_meth, Reference = "Data"),
data.frame(ideal_data_ele1, Reference = "Ideal 1"),
data.frame(ideal_data_ele2, Reference = "Ideal 2"),
data.frame(ideal_data_ele3, Reference = "Ideal 3"))
ref_data <- 1:length(which(ele_meth$Reference=="Data"))
# Gower's distance
ele_meth_dis <- gowdis(ele_meth[,var_meth_names])
# PCoA
rs_pcoa <- dudi.pco(ele_meth_dis, scannf = FALSE, nf = 3)
plot_ly() %>%
add_trace(data = rs_pcoa$li[ref_data,], x = ~A1, y = ~A2, z = ~A3,
type = "scatter3d", mode = "markers",
marker = list(size = 5, opacity = .5)) %>%
add_trace(data = rs_pcoa$li[ref_data,], x = ~A1, y = ~A2, z = ~A3,
type="mesh3d", opacity=.2, alphahull=0) %>%
add_trace(data = rs_pcoa$li[-ref_data,], x = ~A1, y = ~A2, z = ~A3,
type = "scatter3d", mode = "markers",
color = ele_meth$Reference[-ref_data],
marker = list(size = 20, opacity = .5)) %>%
layout(title = "Elevation")
#############
## 3) Adding the predicted ideal shift
## Latitude
Pred_ideal1 = predict_fixed(mod_lat,
mutate(ideal_data_lat1,
study_level_var_abs = NA,
Class = NA)) # using fixed effects only
Pred_ideal2 = predict_fixed(mod_lat,
mutate(ideal_data_lat2,
study_level_var_abs = NA,
Class = NA)) # using fixed effects only
Pred_ideal3 = predict_fixed(mod_lat,
mutate(ideal_data_lat3,
study_level_var_abs = NA,
Class = NA)) # using fixed effects only
rs_SL_full_lat <- mutate(rs_SL_full_lat,
Type = "LAT",
Pred_ideal1 = Pred_ideal1,
Pred_ideal2 = Pred_ideal2,
Pred_ideal3 = Pred_ideal3)
## Elevation
Pred_ideal1 = predict_fixed(mod_ele,
mutate(ideal_data_ele1,
study_level_var_abs = NA,
Class = NA)) # using fixed effects only
Pred_ideal2 = predict_fixed(mod_ele,
mutate(ideal_data_ele2,
study_level_var_abs = NA,
Class = NA)) # using fixed effects only
Pred_ideal3 = predict_fixed(mod_ele,
mutate(ideal_data_ele3,
study_level_var_abs = NA,
Class = NA)) # using fixed effects only
rs_SL_full_ele <- mutate(rs_SL_full_ele,
Type = "ELE",
Pred_ideal1 = Pred_ideal1,
Pred_ideal2 = Pred_ideal2,
Pred_ideal3 = Pred_ideal3)
## bind ele and lat corr shift data
corr_rs <- rbind(rs_SL_full_lat,
rs_SL_full_ele)
# Inverse of log of predictions
corr_rs$Pred_ideal1 <- exp(corr_rs$Pred_ideal1)
corr_rs$Pred_ideal2 <- exp(corr_rs$Pred_ideal2)
corr_rs$Pred_ideal3 <- exp(corr_rs$Pred_ideal3)
corr_rs$Pred_SL <- exp(corr_rs$Pred_SL)
## Add variables
corr_rs$Article_ID <- sapply(corr_rs$ID, function(x) {
strsplit(x,"_")[[1]][1]
})
corr_rs$Study_ID <- sapply(corr_rs$ID, function(x) {
strsplit(x,"_")[[1]][2]
})
corr_rs$Class <- sapply(corr_rs$ID, function(x) {
strsplit(x,"_")[[1]][4]
})
## 4) calculate deltaSHIFT
## 3.1) difference between predicted study-level shift and predicted reference study-level shift
{
par(mfrow = c(1,3))
corr_rs$SLDiff1 <- corr_rs$Pred_SL - corr_rs$Pred_ideal1
hist(corr_rs$SLDiff1,main = "",xlab="SLDiff1")
corr_rs$SLDiff2 <- corr_rs$Pred_SL - corr_rs$Pred_ideal2
hist(corr_rs$SLDiff2,main = "",xlab="SLDiff2")
corr_rs$SLDiff3 <- corr_rs$Pred_SL - corr_rs$Pred_ideal3
hist(corr_rs$SLDiff3,main = "",xlab="SLDiff3")
}
## 4.2) ratio between predicted study-level shift and predicted reference study-level shift
{
par(mfrow = c(1,3))
corr_rs$SLRatio1 <- corr_rs$Pred_SL / corr_rs$Pred_ideal1
hist(corr_rs$SLRatio1,main = "",xlab="SLRatio1")
corr_rs$SLRatio2 <- corr_rs$Pred_SL / corr_rs$Pred_ideal2
hist(corr_rs$SLRatio2,main = "",xlab="SLRatio2")
corr_rs$SLRatio3 <- corr_rs$Pred_SL / corr_rs$Pred_ideal3
hist(corr_rs$SLRatio3,main = "",xlab="SLRatio3")
}
## 5) calculate corrected range shift
rs <- merge(rs_full,
corr_rs %>%
select(c("Article_ID","Study_ID","Class","Type","ID",
names(corr_rs)[!names(corr_rs) %in% names(rs_full)])),
by = c("Article_ID","Study_ID","Class","Type","ID"),
all.x = TRUE)
rs$CorrShiftDiff1 <- abs(rs$SHIFT) - rs$SLDiff1
rs$CorrShiftDiff2 <- abs(rs$SHIFT) - rs$SLDiff2
rs$CorrShiftDiff3 <- abs(rs$SHIFT) - rs$SLDiff3
rs$CorrShiftRatio1 <- rs$SHIFT * rs$SLRatio1
rs$CorrShiftRatio2 <- rs$SHIFT * rs$SLRatio2
rs$CorrShiftRatio3 <- rs$SHIFT * rs$SLRatio3
## Look at CorrShiftDiff
# Corrected vs abs shift
p1 <- ggplot(rs, aes(x = abs(SHIFT), y = CorrShiftDiff1, color = as.factor(sign(SHIFT))))+
geom_point(alpha=.3) + theme_classic() +
geom_vline(xintercept = 0, color = "gray")+
geom_hline(yintercept = 0, color = "gray")+
geom_abline(intercept = 0, slope = 1)
p2 <- ggplot(rs, aes(x = abs(SHIFT), y = CorrShiftDiff2, color = as.factor(sign(SHIFT))))+
geom_point(alpha=.3) + theme_classic() +
geom_vline(xintercept = 0, color = "gray")+
geom_hline(yintercept = 0, color = "gray")+
geom_abline(intercept = 0, slope = 1)
p3 <- ggplot(rs, aes(x = abs(SHIFT), y = CorrShiftDiff3, color = as.factor(sign(SHIFT))))+
geom_point(alpha=.3) + theme_classic() +
geom_vline(xintercept = 0, color = "gray")+
geom_hline(yintercept = 0, color = "gray")+
geom_abline(intercept = 0, slope = 1)
ggarrange(p1,p2,p3,ncol=3, nrow=1, common.legend = TRUE, legend="bottom")
# change sign for negative shifts
rs$neg_shifts <- ifelse(sign(rs$SHIFT) == -1 & rs$SHIFT != 0, 1, 0)
rs$CorrShiftDiff1[which(rs$neg_shifts == 1)] <- rs$CorrShiftDiff1[which(rs$neg_shifts == 1)] * -1
rs$CorrShiftDiff2[which(rs$neg_shifts == 1)] <- rs$CorrShiftDiff2[which(rs$neg_shifts == 1)] * -1
rs$CorrShiftDiff3[which(rs$neg_shifts == 1)] <- rs$CorrShiftDiff3[which(rs$neg_shifts == 1)] * -1
p1 <- ggplot(rs, aes(x = SHIFT, y = CorrShiftDiff1, color = as.factor(sign(SHIFT))))+
geom_point(alpha=.3) + theme_classic() +
geom_vline(xintercept = 0, color = "gray")+
geom_hline(yintercept = 0, color = "gray")+
geom_abline(intercept = 0, slope = 1)
p2 <- ggplot(rs, aes(x = SHIFT, y = CorrShiftDiff2, color = as.factor(sign(SHIFT))))+
geom_point(alpha=.3) + theme_classic() +
geom_vline(xintercept = 0, color = "gray")+
geom_hline(yintercept = 0, color = "gray")+
geom_abline(intercept = 0, slope = 1)
p3 <- ggplot(rs, aes(x = SHIFT, y = CorrShiftDiff3, color = as.factor(sign(SHIFT))))+
geom_point(alpha=.3) + theme_classic() +
geom_vline(xintercept = 0, color = "gray")+
geom_hline(yintercept = 0, color = "gray")+
geom_abline(intercept = 0, slope = 1)
ggarrange(p1,p2,p3,ncol=3, nrow=1, common.legend = TRUE, legend="bottom")
## Look at CorrShiftRatio
p1 <- ggplot(rs, aes(x = SHIFT, y = CorrShiftRatio1, color = as.factor(sign(SHIFT))))+
geom_point(alpha=.3) + theme_classic() +
geom_vline(xintercept = 0, color = "gray")+
geom_hline(yintercept = 0, color = "gray")+
geom_abline(intercept = 0, slope = 1)
p2 <- ggplot(rs, aes(x = SHIFT, y = CorrShiftRatio2, color = as.factor(sign(SHIFT))))+
geom_point(alpha=.3) + theme_classic() +
geom_vline(xintercept = 0, color = "gray")+
geom_hline(yintercept = 0, color = "gray")+
geom_abline(intercept = 0, slope = 1)
p3 <- ggplot(rs, aes(x = SHIFT, y = CorrShiftRatio3, color = as.factor(sign(SHIFT))))+
geom_point(alpha=.3) + theme_classic() +
geom_vline(xintercept = 0, color = "gray")+
geom_hline(yintercept = 0, color = "gray")+
geom_abline(intercept = 0, slope = 1)
ggarrange(p1,p2,p3,ncol=3, nrow=1, common.legend = TRUE, legend="bottom")
###################
## Save new database
write.csv(rs,
"outputs/biov1_method_corrected_shifts.csv",
row.names = FALSE)
## Upload to google drive
# googledrive::drive_upload(media = "outputs/biov1_corrected_shifts.csv",
# path = paste0("BIOSHIFTS Working Group/DATA/Bioshifts_v1/",
# "biov1_method_corrected_shifts.csv"))
In general, the 𝚫Difference corrects more the small values, whereas the 𝚫Ratio corrects more the larger values. The magnitude of correction is much greater under the 𝚫Ratio.
# Raw vs predicted study-level shift
test <- rs_full %>%
group_by(ID,Type) %>%
summarise(SHIFT_SL = mean(abs(SHIFT)))
## `summarise()` has grouped output by 'ID'. You can override using the `.groups`
## argument.
test <- merge(corr_rs[,c("ID","Pred_SL")],
test[,c("ID","Type","SHIFT_SL")],
by = "ID",
all = "TRUE")
# Find which one observations were used for fitting the model
test <- test %>%
mutate(Model = ifelse(ID %in% rs_SL$ID, 1, 0)) %>%
na.omit
# Plot
ggplot(test,aes(Pred_SL,SHIFT_SL))+
geom_point(alpha=.5)+
theme_classic()+
geom_abline(slope = 1, intercept = 0) +
labs(x = "Predicted study-level shift", y = "Observed study-level shift")
ggplot(test,aes(Pred_SL,SHIFT_SL))+
geom_point(alpha=.5)+
theme_classic()+
geom_abline(slope = 1, intercept = 0) +
labs(x = "Predicted study-level shift", y = "Observed study-level shift")+
facet_wrap(.~Type, scales = "free")
ggplot(test,aes(Pred_SL,SHIFT_SL))+
geom_point(alpha=.5)+
theme_classic()+
geom_abline(slope = 1, intercept = 0) +
labs(x = "Predicted study-level shift", y = "Observed study-level shift")+
facet_wrap(Model~Type, scales = "free")
# read in corrected shifts
rs <- read.csv("outputs/biov1_method_corrected_shifts.csv")
## Histograms
## Raw vs corrected shift
rs %>%
gather(key = "shift_type", value = "shift_value",
c(SHIFT, CorrShiftDiff1)) %>%
ggplot(., aes(x = shift_value, fill = shift_type)) +
geom_histogram(alpha=.5) +
theme_bw() +
facet_wrap(Gradient~Position, scales = "free") +
scale_fill_discrete(labels = c("Corrected shift (𝚫Difference)", "Observed shift")) +
labs(x = "Range shift", y = "Frequency", fill = "")
## Distribution of SL difference btw predicted shift and ideal SL shift
ggplot(rs, aes(x = SLDiff1)) +
geom_histogram() +
theme_bw() +
labs(x = "Study-level difference", y = "Frequency")
ggplot(rs, aes(x = SLDiff1)) +
geom_histogram() +
theme_bw() +
facet_grid(Gradient~Position, scales = "free") +
labs(x = "𝚫Difference", y = "Frequency")
## plot raw vs corrected species-level shift
ggplot(rs, aes(x = SHIFT, y = CorrShiftDiff1, colour = Source)) +
geom_point() +
theme_bw() +
theme(legend.position = "none") +
labs(x = "Observed range shift", y = "Corrected shift (𝚫Difference)") +
geom_abline(intercept = 0, slope = 1)
ggplot(rs, aes(x = SHIFT, y = CorrShiftDiff1, colour = Source)) +
geom_point() +
theme_bw() +
theme(legend.position = "none") +
labs(x = "Observed range shift", y = "Corrected shift (𝚫Difference)") +
geom_abline(intercept = 0, slope = 1) +
facet_grid(Gradient~Position, scales = "free")
## plot raw vs delta correction
ggplot(rs, aes(x = SLDiff1, y = abs(SHIFT), colour = Source)) +
geom_point() +
theme_bw() +
theme(legend.position = "none") +
labs(y = "Absolute observed shift", x = "𝚫Difference") +
geom_abline(intercept = 0, slope = 1) +
facet_grid(Gradient~Position, scales = "free")
The correction applied tends to be larger for larger shifts. However, note that little correction is applied for some very large shifts at the elevation LE. In general, the magnitude of correction is very low (range delta difference: -2.6288904, 4.3595131)
rs$sign_change <- ifelse(sign(rs$CorrShiftDiff1) != sign(rs$SHIFT) & !rs$SHIFT==0, 1, 0)
rs$sign_change_posneg <- ifelse(sign(rs$CorrShiftDiff1)==-1 & sign(rs$SHIFT)==1 & !rs$SHIFT==0, 1, 0)
rs$sign_change_negpos <- ifelse(sign(rs$CorrShiftDiff1)==1 & sign(rs$SHIFT)==-1 & !rs$SHIFT==0, 1, 0)
## N change in sign for observed shift to predicted shift
rs %>%
summarise(sign_change = length(which(sign_change==1)),
total_grain_obs = length(Grain),
sign_change_percent = (sign_change/total_grain_obs)*100,
sign_change_posneg_percent = (length(which(sign_change_posneg==1))/total_grain_obs)*100,
sign_change_negpos_percent = (length(which(sign_change_negpos==1))/total_grain_obs)*100,
sign_change_posneg_max_shift = max(SHIFT[which(sign_change_posneg==1)]),
sign_change_posneg_mean_shift = mean(SHIFT[which(sign_change_posneg==1)]),
sign_change_negpos_min_shift = min(SHIFT[which(sign_change_negpos==1)]),
sign_change_negpos_mean_shift = mean(SHIFT[which(sign_change_negpos==1)])) %>%
t %>%
datatable
rs %>%
group_by(Param) %>%
summarise(sign_change = length(which(sign_change==1)),
total_grain_obs = length(Grain),
sign_change_percent = (sign_change/total_grain_obs)*100,
sign_change_posneg_percent = (length(which(sign_change_posneg==1))/total_grain_obs)*100,
sign_change_negpos_percent = (length(which(sign_change_negpos==1))/total_grain_obs)*100) %>%
t %>% as.data.frame() %>% janitor::row_to_names(1) %>%
datatable
rs %>%
group_by(Type) %>%
summarise(sign_change = length(which(sign_change==1)),
total_grain_obs = length(Grain),
sign_change_percent = (sign_change/total_grain_obs)*100,
sign_change_posneg_percent = (length(which(sign_change_posneg==1))/total_grain_obs)*100,
sign_change_negpos_percent = (length(which(sign_change_negpos==1))/total_grain_obs)*100) %>%
t %>% as.data.frame() %>% janitor::row_to_names(1) %>%
datatable
# is change in sign related to grain?
rs %>%
group_by(Grain) %>%
summarise(sign_change = length(which(sign_change==1)),
total_grain_obs = length(Grain),
sign_change_percent = (sign_change/total_grain_obs)*100,
sign_change_posneg_percent = (length(which(sign_change_posneg==1))/total_grain_obs)*100,
sign_change_negpos_percent = (length(which(sign_change_negpos==1))/total_grain_obs)*100) %>%
t %>% as.data.frame() %>% janitor::row_to_names(1) %>%
datatable
We have a total of 7% total sign change (observed positive shift becomes negative, or a negative shift becomes positive, after applying a correction). This does in includes shifts that were originally equal to zero. The number of positive shifts becoming negative is proportional to the number of negative shifts becoming positive (3.5 and 3.2, respectively).
# related to the magnitude of shift?
rs %>%
filter(SHIFT != 0) %>%
select(sign_change,CorrShiftDiff1,Gradient,Position,Source) %>%
na.omit() %>%
ggplot(., aes(x = as.factor(sign_change), y = CorrShiftDiff1)) +
geom_jitter(aes(color=Source),alpha=.1,width = .1)+
theme_bw() +
theme(legend.position = "none") +
labs(y = "Corrected shift (𝚫Difference)", x = "Sign change") +
facet_wrap(Gradient~Position, scales = "free")
Shift sign change occurs mainly for short shifts.
# read in corrected shifts
rs <- read.csv("outputs/biov1_method_corrected_shifts.csv")
## Histograms
## Raw vs corrected shift
rs %>%
gather(key = "shift_type", value = "shift_value",
c(SHIFT, CorrShiftRatio1)) %>%
ggplot(., aes(x = shift_value, fill = shift_type)) +
geom_histogram(alpha=.5) +
theme_bw() +
facet_wrap(Gradient~Position, scales = "free") +
scale_fill_discrete(labels = c("Corrected shift (𝚫Ratio)", "Observed shift")) +
labs(x = "Range shift", y = "Frequency", fill = "")
## Distribution of SL difference btw predicted shift and ideal SL shift
ggplot(rs, aes(x = SLRatio1)) +
geom_histogram() +
theme_bw() +
labs(x = "Study-level difference", y = "Frequency")
ggplot(rs, aes(x = SLRatio1)) +
geom_histogram() +
theme_bw() +
facet_grid(Gradient~Position, scales = "free") +
labs(x = "𝚫Ratio", y = "Frequency")
## plot raw vs corrected species-level shift
ggplot(rs, aes(x = SHIFT, y = CorrShiftRatio1, colour = Source)) +
geom_point() +
theme_bw() +
theme(legend.position = "none") +
labs(x = "Observed range shift", y = "Corrected shift (𝚫Ratio)") +
geom_abline(intercept = 0, slope = 1)
ggplot(rs, aes(x = SHIFT, y = CorrShiftRatio1, colour = Source)) +
geom_point() +
theme_bw() +
theme(legend.position = "none") +
labs(x = "Observed range shift", y = "Corrected shift (𝚫Ratio)") +
geom_abline(intercept = 0, slope = 1) +
facet_grid(Gradient~Position, scales = "free")
The correction applied tends to be larger for larger shifts.
ggplot(rs, aes(x = CorrShiftRatio1, y = CorrShiftDiff1, colour = Source)) +
geom_point() +
theme_bw() +
theme(legend.position = "none") +
labs(x = "Corrected shift (𝚫Ratio)", y = "Corrected shift (𝚫Difference)") +
geom_abline(intercept = 0, slope = 1) +
theme(aspect.ratio=1)
## Warning: Removed 1 rows containing missing values (`geom_point()`).
ggplot(rs, aes(x = CorrShiftRatio1, y = CorrShiftDiff1, colour = Source)) +
geom_point() +
theme_bw() +
theme(legend.position = "none") +
labs(x = "Corrected shift (𝚫Ratio)", y = "Corrected shift (𝚫Difference)") +
geom_abline(intercept = 0, slope = 1) +
facet_grid(Gradient~Position, scales = "free")+
theme(aspect.ratio=1)
## Warning: Removed 1 rows containing missing values (`geom_point()`).
rs <- rs %>%
mutate(velocity = ifelse(Type == "LAT", v.lat.mean, v.ele.mean))
rs %>%
mutate(ObsShift = SHIFT) %>%
select(velocity,CorrShiftDiff1,CorrShiftRatio1,ObsShift,Gradient) %>%
gather(Condition, Shift, -c(velocity,Gradient)) %>%
mutate(Condition =
ifelse(Condition=="CorrShiftRatio1", "𝚫Ratio",
ifelse(Condition=="CorrShiftDiff1","𝚫Difference", "Observed Shift"))) %>%
ggplot(.,aes(x = velocity, y = Shift))+
geom_smooth(aes(color = Condition))+
# geom_point(alpha=.1)+
labs(x = "Velocity", y = "Range shift")+
facet_wrap(.~Gradient)
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
## Warning: Removed 5 rows containing non-finite values (`stat_smooth()`).
rs %>%
mutate(ObsShift = SHIFT) %>%
select(velocity,CorrShiftRatio1,CorrShiftDiff1,ObsShift,Gradient,ECO) %>%
gather(Condition, Shift, -c(velocity,Gradient,ECO)) %>%
mutate(Condition =
ifelse(Condition=="CorrShiftRatio1", "𝚫Ratio",
ifelse(Condition=="CorrShiftDiff1","𝚫Difference", "Observed Shift"))) %>%
ggplot(.,aes(x = velocity, y = Shift))+
geom_smooth(aes(color = Condition))+
# geom_point(alpha=.1)+
labs(x = "Velocity", y = "Range shift")+
facet_wrap(Gradient~ECO)
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
## Warning: Removed 5 rows containing non-finite values (`stat_smooth()`).
rs %>%
mutate(ObsShift = SHIFT) %>%
select(velocity,CorrShiftDiff1,CorrShiftRatio1,ObsShift,Gradient,Position,ECO) %>%
gather(Condition, Shift, -c(velocity,Gradient,Position,ECO)) %>%
mutate(Condition =
ifelse(Condition=="CorrShiftRatio1", "𝚫Ratio",
ifelse(Condition=="CorrShiftDiff1","𝚫Difference", "Observed Shift"))) %>%
ggplot(.,aes(x = velocity, y = Shift))+
geom_smooth(aes(color = Condition))+
# geom_point(alpha=.1)+
labs(x = "Velocity", y = "Range shift")+
facet_grid(Gradient~Position,scales = "free")
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
## Warning: Removed 5 rows containing non-finite values (`stat_smooth()`).
rs %>%
mutate(ObsShift = SHIFT) %>%
select(velocity,CorrShiftDiff1,CorrShiftRatio1,ObsShift,Gradient,Position,ECO) %>%
gather(Condition, Shift, -c(velocity,Gradient,Position,ECO)) %>%
mutate(Condition =
ifelse(Condition=="CorrShiftRatio1", "𝚫Ratio",
ifelse(Condition=="CorrShiftDiff1","𝚫Difference", "Observed Shift"))) %>%
ggplot(.,aes(x = velocity, y = Shift))+
geom_smooth(aes(color = Condition))+
# geom_point(alpha=.1)+
labs(x = "Velocity", y = "Range shift")+
facet_grid(Gradient*ECO~Position,scales = "free")
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
## Warning: Removed 5 rows containing non-finite values (`stat_smooth()`).
any(is.na(rs$CorrShiftDiff1))
## [1] TRUE
rs_plot <- rs %>%
mutate(ObsShift = SHIFT) %>%
select(velocity,CorrShiftDiff1,CorrShiftDiff2,CorrShiftDiff3,ObsShift,Gradient,Position,ECO) %>%
gather(Condition, Shift, -c(velocity,Gradient,Position,ECO)) %>%
mutate(Condition =
ifelse(Condition=="CorrShiftDiff1", "Option1 (Ideal methods)",
ifelse(Condition=="CorrShiftDiff2","Option2 (Common method var)",
ifelse(Condition=="CorrShiftDiff3","Option3 (Common method density)", "Observed Shift"))))
ggplot(rs_plot,aes(x = velocity, y = Shift, color = Condition, fill = Condition))+
geom_smooth(alpha = .2) +
geom_vline(xintercept = 0, color = "gray")+
labs(x = "Velocity", y = "Range shift")+
facet_wrap(.~Gradient) +
theme_classic() + theme(legend.position="bottom") + guides(color=guide_legend(nrow=2,byrow=TRUE))
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
## Warning: Removed 7 rows containing non-finite values (`stat_smooth()`).
ggplot(rs_plot,aes(x = velocity, y = Shift, color = Condition, fill = Condition))+
geom_smooth(alpha = .2) +
geom_vline(xintercept = 0, color = "gray")+
labs(x = "Velocity", y = "Range shift")+
facet_wrap(Gradient~ECO) +
theme_classic() + theme(legend.position="bottom") + guides(color=guide_legend(nrow=2,byrow=TRUE))
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
## Warning: Removed 7 rows containing non-finite values (`stat_smooth()`).
ggplot(rs_plot,aes(x = velocity, y = Shift, color = Condition, fill = Condition))+
geom_smooth(alpha = .2) +
geom_vline(xintercept = 0, color = "gray")+
labs(x = "Velocity", y = "Range shift")+
facet_grid(Gradient~Position,scales = "free") +
theme_classic() + theme(legend.position="bottom") + guides(color=guide_legend(nrow=2,byrow=TRUE))
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
## Warning: Removed 7 rows containing non-finite values (`stat_smooth()`).
ggplot(rs_plot,aes(x = velocity, y = Shift, color = Condition, fill = Condition))+
geom_smooth(alpha = .2) +
geom_vline(xintercept = 0, color = "gray")+
labs(x = "Velocity", y = "Range shift")+
facet_grid(Gradient*ECO~Position,scales = "free") +
theme_classic() + theme(legend.position="bottom") + guides(color=guide_legend(nrow=2,byrow=TRUE))
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
## Warning: Removed 7 rows containing non-finite values (`stat_smooth()`).
rs_plot <- rs %>%
mutate(ObsShift = SHIFT) %>%
select(velocity,CorrShiftRatio1,CorrShiftRatio2,CorrShiftRatio3,ObsShift,Gradient,Position,ECO) %>%
gather(Condition, Shift, -c(velocity,Gradient,Position,ECO)) %>%
mutate(Condition =
ifelse(Condition=="CorrShiftRatio1", "Option1 (Ideal methods)",
ifelse(Condition=="CorrShiftRatio2","Option2 (Common method var)",
ifelse(Condition=="CorrShiftRatio3","Option3 (Common method density)", "Observed Shift"))))
ggplot(rs_plot,aes(x = velocity, y = Shift, color = Condition, fill = Condition))+
geom_smooth(alpha = .2) +
geom_vline(xintercept = 0, color = "gray")+
labs(x = "Velocity", y = "Range shift")+
facet_wrap(.~Gradient) +
theme_classic() + theme(legend.position="bottom") + guides(color=guide_legend(nrow=2,byrow=TRUE))
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
## Warning: Removed 7 rows containing non-finite values (`stat_smooth()`).
ggplot(rs_plot,aes(x = velocity, y = Shift, color = Condition, fill = Condition))+
geom_smooth(alpha = .2) +
geom_vline(xintercept = 0, color = "gray")+
labs(x = "Velocity", y = "Range shift")+
facet_wrap(Gradient~ECO) +
theme_classic() + theme(legend.position="bottom") + guides(color=guide_legend(nrow=2,byrow=TRUE))
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
## Warning: Removed 7 rows containing non-finite values (`stat_smooth()`).
ggplot(rs_plot,aes(x = velocity, y = Shift, color = Condition, fill = Condition))+
geom_smooth(alpha = .2) +
geom_vline(xintercept = 0, color = "gray")+
labs(x = "Velocity", y = "Range shift")+
facet_grid(Gradient~Position,scales = "free") +
theme_classic() + theme(legend.position="bottom") + guides(color=guide_legend(nrow=2,byrow=TRUE))
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
## Warning: Removed 7 rows containing non-finite values (`stat_smooth()`).
ggplot(rs_plot,aes(x = velocity, y = Shift, color = Condition, fill = Condition))+
geom_smooth(alpha = .2) +
geom_vline(xintercept = 0, color = "gray")+
labs(x = "Velocity", y = "Range shift")+
facet_grid(Gradient*ECO~Position,scales = "free") +
theme_classic() + theme(legend.position="bottom") + guides(color=guide_legend(nrow=2,byrow=TRUE))
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
## Warning: Removed 7 rows containing non-finite values (`stat_smooth()`).
Main difference between methods:
ΔRatio Method: The predictions become more spread out or variable for
larger observed shifts.
ΔDifference Method: The predictions become more scattered or varied for shifts close to zero (i.e., small changes in range shifts).
Which method to select?
The choice of method depends on the specific objective of the
correction. We can provide the following guidance:
ΔRatio Method: If the goal is to correct larger absolute shift values (i.e., significant changes in range shifts across studies with different methodologies), then the ΔRatio method should be selected.
ΔDifference Method: On the other hand, if the aim is to correct shorter absolute shift values (i.e., negligible changes in short range shifts).
How do we know if our method generated meaningful corrected
shift values?
To evaluate the meaningfulness of the corrected shift values generated
by our methods, we may want to propose a simulation-based approach.
Here, we can suggest generating a distribution of known (true) shift
values through simulations. We can then apply a certain level of
distortion to these known shift values to represent the effect of
methodological variables on range shift detection. After applying the
shift correction approach, we can compare the generated corrected shift
values with the known true shift values. If the corrected shifts
approximate the true shifts accurately, it indicates that the methods
generate meaningful corrections.