To be continued…
We’ll work through 2 processes:
1. ISSA process method well described into by Chatterjee et AL. in 2024
(Section III.1) 2. The more classic State– space models method using and
testing the aniMotum package described by Jonsen et AL. in 2022 (Section
III.2) 3. Conclusion between the results brought from the two
methods
In addition, we’ll also provide descriptive analysis regarding the nest phenology depending the factor “enclosure/open habitats”: 1. Nesting 2. Chick-rearing
STEP 1 - Download and extract the RProject from GitLab to your device
:
https://gitlab.com/TheMaskou/tly_habitat_selection_bauges/-/archive/master/tly_habitat_selection_bauges-master.zip
STEP 2 - Open the analysis.R file located in /R folder and run its code - also detailed step by step here below.
STEP 3 - Read below how the study has been processed through the R code, and get better informed with the details provided.
Run the following code to set-up the project: set the working directory, check and update R and Rstudio (optional), load and update packages and functions, download the data from G-drive.
#Settings the working directory - here below the default (delete the dash and run on .R)
#current_wd <- dirname(rstudioapi::getActiveDocumentContext()$path)
#Settings the project
suppressWarnings(source(here::here("project_setting.R")))
## Error in eval(ei, envir): object 'user_choice_update_R' not found
#Cleaning Global environment (delete the dash here below)
#rm(cran_packages, github_packages, n_i_p_cran, n_i_p_ghub)
#Specify the Google link to download the files from (drive_id = );
#Set the location where g_data will be stored (path_data = ./data, the default);
#Run the code and download the data
yn_data_download(drive_id = "1gtA-lACNH2kIbn5SuXUm17p8-KMH7Jsw",
path_data = path_data)
## Error in yn_data_download(drive_id = "1gtA-lACNH2kIbn5SuXUm17p8-KMH7Jsw", : could not find function "yn_data_download"
Open the data just downloaded and used in this study, and load them in the environment of Rstudio :
#Opening the 'sf' data set (birds)
data_sum <- readRDS(here("data", "data_sum.rds"))
data_all <- read_rds(here("data", "tot.ind.bauges.rds")) %>% st_transform(2154)
#Optional data
data_BDD <- st_read(here("data", "bilan_BDD_nid_poule_bauges_2024.gpkg"))
## Reading layer `t_nid_nid' from data source
## `D:\Documents\Etudes supérieures & Pro\5-POST_MASTER\FRANCE\OFB\2024 - Méribel\Bauges_Sc\bauges_tly_female-habitat-selection\data\bilan_BDD_nid_poule_bauges_2024.gpkg'
## using driver `GPKG'
## Simple feature collection with 9 features and 15 fields (with 2 geometries empty)
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 6.224168 ymin: 45.61211 xmax: 6.255551 ymax: 45.70174
## Geodetic CRS: WGS 84
#Opening the area data
study_area <- read_stars(here("data", "lidar_map.tif"))
Information about the data you should have now appearing in your environment: - data_sum: table that summarize all of the meta information concerning the birds studied - edited by the author - data_BDD: table providing additional meta information to the data_sum - provided by OFB - data_all: data-set providing all of the GPS information - collected from GPS trackers and cleaned-up by Marc Montadert - study_area: raster manually completed on QGIS to acquire good enough data. Indeed, since from the 2 initial raster, resolutions and metadata were too different, we could not use efficiently an automated command (ie. packages as ‘terra’ ; ‘sf’ or ‘stars’).
Open here the data for one specified bird (Optional) :
#Bird selected
bird <- "Arnica"
#Opening the bird selected data
head(assign(bird,
st_read(here("data", paste0(bird, ".gpkg"))),
envir = .GlobalEnv))
## Reading layer `Arnica' from data source
## `D:\Documents\Etudes supérieures & Pro\5-POST_MASTER\FRANCE\OFB\2024 - Méribel\Bauges_Sc\bauges_tly_female-habitat-selection\data\Arnica.gpkg'
## using driver `GPKG'
## Simple feature collection with 598 features and 21 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 951250.1 ymin: 6506947 xmax: 951765.1 ymax: 6507531
## Projected CRS: RGF93 v1 / Lambert-93
## Simple feature collection with 6 features and 21 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: 951283.9 ymin: 6507434 xmax: 951420 ymax: 6507499
## Projected CRS: RGF93 v1 / Lambert-93
## ani_nom tag_id date code_jour jour heure
## 1 Arnica 17459 2017-07-18 20:03:41 199 2017-07-18 19:03:41
## 2 Arnica 17459 2017-07-18 20:13:26 199 2017-07-18 19:13:26
## 3 Arnica 17459 2017-07-18 20:24:35 199 2017-07-18 19:24:35
## 4 Arnica 17459 2017-07-18 20:33:20 199 2017-07-18 19:33:20
## 5 Arnica 17459 2017-07-18 20:43:20 199 2017-07-18 19:43:20
## 6 Arnica 17459 2017-07-18 20:53:27 199 2017-07-18 19:53:27
## heure_relative_solaire am_pm period_jour period_chant saison cor
## 1 00:12:54 apresmidi jour journee ete FALSE
## 2 00:03:09 apresmidi jour journee ete FALSE
## 3 -00:08:00 apresmidi crepuscule journee ete FALSE
## 4 -00:16:45 apresmidi crepuscule journee ete FALSE
## 5 -00:26:45 apresmidi crepuscule journee ete FALSE
## 6 -00:36:52 apresmidi crepuscule journee ete FALSE
## dist dt rel_angle altitude mnt_altitude temperature acc_x acc_y acc_z
## 1 71.30711 585 NA 1714 1785.21 22 42 -79 1026
## 2 22.22461 669 2.0198721 1721 1761.20 23 76 -77 1026
## 3 23.20683 525 1.9254387 1764 1756.01 23 81 56 1028
## 4 81.92260 600 -1.7887462 1756 1767.50 23 -293 182 765
## 5 44.21306 607 -0.4373428 1750 1772.85 20 219 -573 829
## 6 26.11233 593 1.6212039 1779 1784.86 19 155 -600 799
## geom
## 1 POINT (951393 6507434)
## 2 POINT (951420 6507499)
## 3 POINT (951397.8 6507498)
## 4 POINT (951407.2 6507477)
## 5 POINT (951326.8 6507461)
## 6 POINT (951283.9 6507471)
Information about the data you should have now appearing in your environment: - “bird”: a sf object called depending the name of the bird you want to work on and that you specified line 84 is now in your environment. NB: Rather change and re-specify the ‘bird’ object with the bird’s name you want to work on than using filter() on data_sum. Unique bird files are stored in your /data.
## III.1 ISSA process method (Chatterjee et AL., 2024)
# Presentation of the method
Integrated step- selection analysis (ISSA) offers an appealing framework for modelling complex movements in response to local environmental features. ISSA is the ability to model both habitat selection, using covariates at the end of the movement step, and habitat- dependent movement, by allowing an animal’s step lengths (distances between sequential locations) and turn angles (changes in direction) to depend on habitat at the start of the movement step. Tentative movement parameters are first estimated using observed step lengths and turn angles, ignoring the effect of habitat selection on observed movements, and then movement parameters are updated using additional coefficients from the fitted conditional logistic regression model. Mixed ISSA can be used to study among-individual variation in both movement and habitat-selection parameters.
# Prepare the data: step lengths & turn angles extraction
Packages used : ‘sf’, ‘amt’
#Create .X and .Y columns
Arnica_track <- Arnica %>%
mutate(x = st_coordinates(geom)[, "X"],
y = st_coordinates(geom)[,"Y"])
#Create the bird_track
Arnica_track <- make_track(Arnica_track,
.x = x,
.y = y,
.t = date)
#Extract step lengths & turn angles (in radians)
Arnica_track <- steps(Arnica_track)
hist(Arnica_track$ta_)
# Fit a distribution for step length
Packages used : ‘fitdistrplus’
#Visualize sl_
hist(Arnica_track$sl_)
#Fitting different distributions to sl_
sl_exp <- fitdist(Arnica_track$sl_, "exp" , method="mle")
sl_logis <- fitdist(Arnica_track$sl_, "logis" , method="mle")
sl_norm <- fitdist(Arnica_track$sl_, "norm" , method="mle")
sl_wb <- fitdist(Arnica_track$sl_, "weibull", method="mle", start = list(scale = 1, shape = 0.5)) # (Fandos et AL., 2023)
#Visualize the fitness sl_ data VS distribution law
par(mfrow = c(2,3))
plot(sl_exp)
plot(sl_logis)
plot(sl_norm)
plot(sl_wb)
#Log-Likelihood comparison (highest is best - position 1)
print(sort(sapply(list("Exp dist. Log(L):" = sl_exp,
"Logi dist. Log(L):" = sl_logis,
"Norm dist. Log(L):" = sl_norm,
"Weibull dist. Log(L):" = sl_wb), logLik), decreasing = T))
## Weibull dist. Log(L): Exp dist. Log(L): Logi dist. Log(L):
## -2462.030 -2478.465 -2615.043
## Norm dist. Log(L):
## -2730.887
#AIC comparison for distribution selection (lowest is best - position 1)
print(sort(sapply(list("Exp dist. AIC" = sl_exp,
"Logi dist. AIC" = sl_logis,
"Norm dist. AIC" = sl_norm,
"Weibull dist. AIC" = sl_wb), AIC)))
## Weibull dist. AIC Exp dist. AIC Logi dist. AIC Norm dist. AIC
## 4928.060 4958.930 5234.087 5465.774
#GOF comparison for distribution selection (lowest is best - position 1)
#Anderson-Darling test focused in - gives more weight to the tails of the distribution (McClintock & Michelot, 2021)
print(sort(sapply(list("Exp dist. GOF" = sl_exp,
"Logi dist. GOF" = sl_logis,
"Norm dist. GOF" = sl_norm,
"Weibull dist. GOF" = sl_wb), function(x) gofstat(x)$ad)))
## Weibull dist. GOF.1-mle-weibull Exp dist. GOF.1-mle-exp
## 7.046866 14.972344
## Logi dist. GOF.1-mle-logis Norm dist. GOF.1-mle-norm
## 21.650764 Inf
#Clean environment
rm(sl_logis, sl_norm)
Reminder: AIC is an estimator of prediction error and relative
quality of statistical models for a given set of data. Based on the
classic formula, the AIC for von Mises distribution is computed
as:
- AIC = 2Np − 2log(L), where Np is the number of parameters and L
log_likelihood
Regarding the steps length data ‘sl_’ distribution, has been tested - visually plot() and regarding the AIC values - different function of density to fit the data.
We then find above:
- the Weibull distribution is the most likely against Exponential,
Logistic and Normal distributions (log(L): -2462 > -2478 > -2615
< 2730) and suggests a more appropriate model for turn angles
distribution - the Weibull distribution is rather selected by the Akaike
Information criterion than the Exponential, Logistic and Normal
distributions (AIC: 4928 < 4958 < 5234 < 5465) - Regarding the
GOF - Anderson-Darling test result focused on, (McClintock &
Michelot, 2021) - the Weibull value is the lowest in comparison to the
Exponential, Logistic and Normal distribution values (7 < 14 < 21
< Infinity). It suggests the Weibull distribution as the one having
the less differences with our data (sl_) in their distributions.
We will then fit an Weibull model to the sl_ data-set - step lengths. A very flexible distribution (Wilson et AL., 2022; Fandos et AL., 2022)
# Fit a distribution for turn angle
Packages used : ‘fitdistrplus’, ‘circular’
#Visualize ta_
hist(na.omit(Arnica_track$ta_))
#Fitting von Mises distributions to ta_ (assuming that is the best distribution)
ta_unif <- fitdist(c(na.omit(Arnica_track$ta_)), "unif", method = "mle") #c() use because works with vector only
ta_vm <- mle.vonmises(na.omit(Arnica_track$ta_)) #mle.vonmises() used because fitdist() doesn't support von Mises method
## Warning in as.circular(x): an object is coerced to the class 'circular' using default value for the following components:
## type: 'angles'
## units: 'radians'
## template: 'none'
## modulo: 'asis'
## zero: 0
## rotation: 'counter'
## conversion.circularxradians0counter2pi
ta_wc <- mle.wrappedcauchy(na.omit(Arnica_track$ta_)) #mle.vonmises() used because fitdist() doesn't support Wrapped Cauchy method
## Warning in as.circular(x): an object is coerced to the class 'circular' using default value for the following components:
## type: 'angles'
## units: 'radians'
## template: 'none'
## modulo: 'asis'
## zero: 0
## rotation: 'counter'
## conversion.circularxradians0counter2pi
#Visualize the fitness ta_ data VS distribution law
#Struggling, might have to create a custumized function?
#Distribution selection:
#Log-Likelihood comparison (highest is best)
print(sort(sapply(list(
"van Mises dist. Log(L):" = suppressWarnings(
sum(log(dvonmises(na.omit(Arnica_track$ta_), mu = ta_vm$mu, kappa = ta_vm$kappa)))),
"Uniform dist. Log(L):" = logLik(ta_unif),
"Wrapped Cauchy dist. Log(L):" = suppressWarnings(
sum(log(dwrappedcauchy(na.omit(Arnica_track$ta_), mu = ta_wc$mu, rho = ta_wc$rho))))),
as.numeric), decreasing = T))
## Wrapped Cauchy dist. Log(L): van Mises dist. Log(L):
## -1065.168 -1067.496
## Uniform dist. Log(L):
## -1092.334
#AIC comparison for distribution selection (lowest is best - position 1)
print(sort(sapply(list(
"Uniform dist. AIC:" = ta_unif$aic,
"Wrapped Cauchy dist. AIC:" = 2*2 - 2 * suppressWarnings(
sum(log(dwrappedcauchy(na.omit(Arnica_track$ta_), mu = ta_wc$mu, rho = ta_wc$rho)))),
"von Mises dist. AIC:" = 2*2 - 2 * suppressWarnings(
sum(log(dvonmises(na.omit(Arnica_track$ta_), mu = ta_vm$mu, kappa = ta_vm$kappa))))),
as.numeric)))
## Wrapped Cauchy dist. AIC: von Mises dist. AIC: Uniform dist. AIC:
## 2134.335 2138.993 2188.668
#Calculate U² statistic for GOF (lowest is best - position 1)
print(sort(sapply(list(
"GOF for von Mises dist.:" = GOF_vm(na.omit(Arnica_track$ta_)),
"GOF for Wrapped Cauchy dist.:" = GOF_wc(na.omit(Arnica_track$ta_)),
"GOF for Uniform dist.:" = GOF_unif(na.omit(Arnica_track$ta_))),
as.numeric)))
## GOF for Uniform dist.: GOF for von Mises dist.:
## 95.65582 125.00311
## GOF for Wrapped Cauchy dist.:
## 8325.79088
#Cleaning environment
rm(ta_unif)
Visually, the distribution of our turn angles - hist(na.omit(Arnica_track$ta_)) - data would fit what is called von Mises, and it’s relyable on the study made by Lond et AL, 2022. Since the hist() suggests a uniform distribution too, both are tested.
The packages do not always perform the extraction of criteria (log-likelihood, AIC, GOF) to select the best distribution to apply. Some have been done manually.
Reminder: Likelihood is the product of every probabilities for each x
to appear. To simplify with log the likelihood to transform all products
in sums. Based on the classic formula, the Log-Likelihood (L) has been
computed as:
- L = Log OF the sum of every ‘x’ THAT ARE distributed under the ‘von
Mises function of density’ WITH the 2 parameters ‘mu’ and ‘kappa’.
Reminder: We will use for the Goodness Of Fit (GOF) the Watson’s U² statistic to evaluate the fit of circular data. Extracted manually, functions build with internet (Might have to be better dug out).
We then find above:
Log-Likelihood and AIC are metrics that assess how well a Wrapped Cauchy model explains the data based on the likelihood of observing the given data under that model. As the lowest AIC provided, it indicates a better model when considering both fit and complexity. Even if the U² value does not favor this model, we will fit a Wrapped Cauchy distribution to our data-set of ta_ - turn angles.
# Random steps VS Observed steps using
Packages used : ‘amt’
#To be continued...
#random_steps()
# III.1 Prepare the data & Fit the model (Jonsen et AL., 2022)
Packages used : ‘animotum’
#To be continued...