HaMaarag_SoN23_BirdsNational_Time_Abundance

Data ; Science ; Environment

Statistical and Ecological Solutions

email: gavishyoni@gmail.com

phone: +972-58-4479847

General information on the current analysis:

Project - HaMaarag, State of Nature report, 2023

Group - Birds

Scale - National

Species group - Non Synanthrope species

Factors - Time, Unit

Response - Abundance

Take home messages:

- Abundance decreased non-significantly (p=0.221) by 1.010% per year (1.010 fold per year) and by 8.731% (1.096 fold) over the entire period:

Predicted value 2012: 5.98 Individuals per sample
Predicted value 2021: 5.45 Individuals per sample

A few general comments:

1. The aim of the analysis is to explore the effect of time and/or proximity to settlements and/or proximity to agriculture on the species richness, total abundance, geometric mean of abundance and community structure of birds in Israel

2. Definition of non rare species:

For species richness:
- All species were included in the analysis
For total abundance and geometric mean of abundance (satisfy both criteria):
- Appeared in at least 50 samples over all campaigns
- At least 10 individuals over all campaigns
For community structure (satisfy both criteria):
- Appeared in at least 10 samples over all campaigns
- At least 10 individuals over all campaigns

3. The starting model (maximum complexity) for all analyses was kept constant at:

Proximity to agricultural + temporal contrasts:

-  Response ~ agriculture * time + unit + cos_td_rad + sin_td_rad + (1|site/point)

Proximity to settlements + temporal contrast:

-  Response ~ settlements * time + unit + cos_td_rad + sin_td_rad + (1|site/point)

Temporal contrast:

-  Response ~ time + unit + cos_td_rad + sin_td_rad + (1|site/point)

4. Accounting for spatial structure and repeated measures:

Each sampling point was visited several times in different years/campaigns
Each sampling point is nested within sites
Site and point are thus included as a nested random factor in the model

- For community analysis we used (1|point) instead of (1|site/point) as random factor

5. Accounting for activity levels (random factors):

“cos_td_rad”, “sin_td_rad” - measure the date difference from 21 June, as the cos or sin of the difference in days in radians
Wind, Precipitation, Temperature and Cloud are not included in the analysis although they may affect activity level since the data is too sparse (many NAs)
The hour of sampling was not recorded in T0 - the two related variables c(“cos_hsun”, “sin_hsun”) are ignored in the analysis

6. Model complexity:

This analysis was restricted to monotonic increase/decrease
More flexible models that allow some curvature may unveil different patterns and trends

1. Prepare the working environment

2. Load data and source functions

3. Arrange the Traits table

Add to the traits table the species code, order, family, IUCN category and group, and residency status in Israel
Table shows number of NAs, class, mean value (for integers and numeric) and number of factor levels (for character and factors)

Column	NAs	Class	Mean	Levels
SciName	0	factor	NA	189
SPECIES_CODE	0	character	NA	189
HebName	0	factor	NA	189
PRIMARY_COM_NAME	0	factor	NA	189
Order	0	character	NA	20
Family	0	character	NA	53
Synanthrope	0	integer	0.180	NA
AlienInvasive	0	integer	0.021	NA
NativeInvasive	0	integer	0.021	NA
synanthrope_or_invasive	0	logical	NA	NA
batha_specialist	70	logical	NA	NA
oblig_insectivore	32	logical	NA	NA
IUCN_AllCat	43	character	NA	9
IUCN	0	character	NA	3
Resident	0	integer	0.434	NA
Resident_IL	0	factor	NA	2

4. Arrange and filter the Plots table

Table shows number of NAs, class, mean value (for integers and numeric) and number of factor levels (for character and factors)

Column	NAs	Class	Mean	Levels
unit	0	factor	NA	8
subunit	926	factor	NA	4
site	0	factor	NA	70
year	0	integer	2016.665	NA
year_ct	0	numeric	4.665	NA
settlements	791	factor	NA	3
agriculture	1169	factor	NA	3
habitat	1145	factor	NA	5
dunes	1559	factor	NA	3
land_use	1455	factor	NA	4
point_name	0	factor	NA	574
date	0	POSIXct	NA	NA
date	0	POSIXt	NA	NA
datetime	441	POSIXct	NA	NA
datetime	441	POSIXt	NA	NA
td_sc	0	numeric	0.129	NA
td_rad	0	numeric	-0.783	NA
cos_td_rad	0	numeric	0.629	NA
sin_td_rad	0	numeric	-0.635	NA
timediff_Jun21	0	difftime	NA	NA
monitors_name	1023	factor	NA	14
wind	1449	ordered	NA	5
wind	1449	factor	NA	5
precipitation	1390	ordered	NA	3
precipitation	1390	factor	NA	3
temperature	1433	ordered	NA	5
temperature	1433	factor	NA	5
clouds	1519	ordered	NA	5
clouds	1519	factor	NA	5
h_from_sunrise	441	difftime	NA	NA
cos_hsun	441	numeric	0.817	NA
sin_hsun	441	numeric	0.442	NA
pilot	0	logical	NA	NA
campaign	0	character	NA	5
Unit_Label	0	character	NA	8
Region	0	character	NA	3

5. Clean Data and identify non-rare species

Filter columns less relevant to the analysis
Remove all observations with 0 abundance (legacy from former filtering)
Add features from the trait table
Keep only Non Synanthrope species
Extract general properties of each column in Data : BasicInfo.R
Identify species that meet the two criteria for non-rare : Get_InciAbunRare.R
Table shows number of NAs, class, mean value (for integers and numeric) and number of factor levels (for character and factors)

Column	NAs	Class	Mean	Levels
SciName	0	character	NA	65
year	0	integer	2016.993	NA
point_name	0	character	NA	565
campaign	0	character	NA	5
unit	0	character	NA	8
site	0	character	NA	70
year_ct	0	integer	4.993	NA
count_under_250	0	integer	2.690	NA
agriculture	3092	character	NA	3
settlements	2562	character	NA	3
cos_td_rad	0	numeric	0.601	NA
sin_td_rad	0	numeric	-0.671	NA
SPECIES_CODE	0	data.table	NA	NA
SPECIES_CODE	0	data.frame	NA	NA
HebName	0	data.table	NA	NA
HebName	0	data.frame	NA	NA
PRIMARY_COM_NAME	0	data.table	NA	NA
PRIMARY_COM_NAME	0	data.frame	NA	NA
Order	0	data.table	NA	NA
Order	0	data.frame	NA	NA
Family	0	data.table	NA	NA
Family	0	data.frame	NA	NA
Synanthrope	0	data.table	NA	NA
Synanthrope	0	data.frame	NA	NA
AlienInvasive	0	data.table	NA	NA
AlienInvasive	0	data.frame	NA	NA
NativeInvasive	0	data.table	NA	NA
NativeInvasive	0	data.frame	NA	NA
synanthrope_or_invasive	0	data.table	NA	NA
synanthrope_or_invasive	0	data.frame	NA	NA
batha_specialist	181	data.table	NA	NA
batha_specialist	181	data.frame	NA	NA
oblig_insectivore	297	data.table	NA	NA
oblig_insectivore	297	data.frame	NA	NA
IUCN_AllCat	0	data.table	NA	NA
IUCN_AllCat	0	data.frame	NA	NA
IUCN	0	data.table	NA	NA
IUCN	0	data.frame	NA	NA
Resident_IL	0	data.table	NA	NA
Resident_IL	0	data.frame	NA	NA

A total of 17 species satisfied the non-rare criteria, including:

Curruca melanocephala ; Galerida cristata ; Chloris chloris ; Alectoris chukar ; Troglodytes troglodytes ; Carduelis carduelis ; Ammomanes deserti ; Iduna pallida ; Emberiza calandra ; Lanius senator ; Lanius excubitor ; Curruca communis ; Burhinus oedicnemus ; Scotocerca inquieta ; Oenanthe melanura ; Cercotrichas galactotes ; Curruca conspicillata

6. Create a dataframe with the relevant variables for analysis

For Richness, we use all species.
For abundance, mean abundance (geometric) and community structure we filter rare species.
Create a data-table with a row for each combination of ‘year’ and ‘point_name’
- First two columns are ‘year’ and ‘point_name’
- Column for each species
Add empty plots (plots with no observations)
Filter Rare species
Calculate total abundance
Calculate the geometric mean abundance (GMA)
- For each row
- For species with at least one individual (Note)
Add the relevant predictor variables

Note:

when calculating a geometric mean (GMA), any species with 0 abundance would result with a geometric mean of 0 (since the geometric mean relies on the product). Since most samples include at least one species with 0 abundance, we based the geometric mean only on the species with at least one individual. For empty plots- NaN is returned and these rows should be removed when modeling GMA.

Table shows number of NAs, class, mean value (for integers and numeric) and number of factor levels (for character and factors)

Column	NAs	Class	Mean	Levels
year	0	numeric	2016.667	NA
point_name	0	character	NA	574
Curruca melanocephala	0	numeric	2.121	NA
Galerida cristata	0	numeric	1.561	NA
Chloris chloris	0	numeric	0.627	NA
Alectoris chukar	0	numeric	0.572	NA
Troglodytes troglodytes	0	numeric	0.253	NA
Carduelis carduelis	0	numeric	0.388	NA
Ammomanes deserti	0	numeric	0.176	NA
Iduna pallida	0	numeric	0.125	NA
Emberiza calandra	0	numeric	0.223	NA
Lanius senator	0	numeric	0.099	NA
Lanius excubitor	0	numeric	0.078	NA
Curruca communis	0	numeric	0.123	NA
Burhinus oedicnemus	0	numeric	0.095	NA
Scotocerca inquieta	0	numeric	0.073	NA
Oenanthe melanura	0	numeric	0.071	NA
Cercotrichas galactotes	0	numeric	0.068	NA
Curruca conspicillata	0	numeric	0.068	NA
TotalAbun	0	numeric	6.721	NA
GMA	95	numeric	2.990	NA
site	0	factor	NA	70
campaign	0	character	NA	5
year_ct	0	numeric	4.667	NA
agriculture	1168	factor	NA	3
settlements	791	factor	NA	3
unit	0	factor	NA	8
Unit_Label	0	character	NA	8
Region	0	character	NA	3
cos_td_rad	0	numeric	0.629	NA
sin_td_rad	0	numeric	-0.636	NA
year_ct_Sq	0	numeric	29.703	NA
Response	0	numeric	6.721	NA

The data contains 95 empty plot sampling events (plot sampling with response=0).

Of these, 16 plots were empty throughout the entire monitoring period.

These 16 plots represent 16 plot sampling events.

Of these 16 plots, 0 plots span more than a single year.

7. Explore the Abundance data frame

Visualize the relationship of the response to various explanatory variables

Main conclusions:

Considerable variability in total abundance between units, with highest values in the Mediterranean maquis unit and lowest in the inland sands unit
Considerable variability between and within sites

Notes:

unit Batha vegetation refers to Herbaceous and Dwarf-Shrub Vegetation
unit Transition Zone refers to Mediterranean-Desert Transition Zone
unit Loess Areas refers to Loess Covered Areas in the Northern Negev

samples with abundance > 110 (not shown in figures)
	1478
year	2021
point_name	Kerem Maharal Near 3
Unit_Label	Mediterranean Maquis
Region	Mediterranean
site	Kerem Maharal
TotalAbun	117

Number of samples
Year	Arid South	Negev Highlands	Inland Sands	Loess Areas	Transition Zone	Planted Conifer Forest	Mediterranean Maquis	Batha Vegetation
2012	30	30	0	24	30	0	89	0
2014	30	45	0	27	30	45	0	60
2015	0	0	0	0	0	45	89	0
2016	30	45	0	29	29	0	0	60
2017	0	0	12	0	0	45	88	0
2018	30	45	0	30	30	0	0	60
2019	0	0	12	0	0	45	89	0
2020	30	45	0	30	30	0	0	60
2021	0	0	12	0	0	45	89	0

8. Model Abundance - Idenitify Family and Fixed factors

We fit generalized linear mixed-effects model (glmmTMB) with:

The response is: TotalAbun (internally, “Response”)
The fixed factors are: year_ct ; unit ; cos_td_rad ; sin_td_rad
The groping factors are: site ; point_name - with the later nested within the first (each site contains several plots)

Objectives:

Identify the best combination of fixed factors
Decide on the family (poisson or negative binomial)

Workflow:

Identify the best model for Poisson:
- Fit the most complex model with glmmTMB
- Simplify the model with MuMIn::dredge(rank = “AIC”) - removal of fixed terms
- Select the best model
Identify the best model for Negative binomial:
- Fit the most complex model with glmmTMB
- Simplify the model with MuMIn::dredge(rank = “AIC”) - removal of fixed terms
- Select the best model
Select the optimal model in terms of family and fixed factors.
- Use model selection to compare the best poisson and best negative binomial
- Diagnose the best model with glmmTMB:diagnose and look at the sigma (for NB)

Simplification of the poisson models

Best model kept all terms
Second best model DeltaAIC > 2
Keep first model

	cond((Int))	cond(cos_td_rad)	cond(sin_td_rad)	cond(Unit_Label)	cond(year_ct)	df	logLik	AIC	delta	weight
16	0.148	0.463	-0.711		-0.025	13	-5034.514	10095.03	0.000	9.999891e-01
8	0.102	0.498	-0.617		NA	12	-5046.942	10117.88	22.857	1.088310e-05
15	0.692	NA	-0.391		-0.029	12	-5057.990	10139.98	44.952	1.732868e-10
12	1.030	0.454	-0.709	NA	-0.025	6	-5071.672	10155.34	60.317	7.986115e-14
7	0.686	NA	-0.253		NA	11	-5074.537	10171.07	76.046	3.068132e-17
4	0.964	0.488	-0.616	NA	NA	5	-5083.734	10177.47	82.441	1.253804e-18
11	1.533	NA	-0.391	NA	-0.028	5	-5093.666	10197.33	102.305	6.090948e-23
3	1.500	NA	-0.256	NA	NA	4	-5109.495	10226.99	131.963	2.210958e-29
14	0.961	-0.125	NA		-0.007	12	-5101.920	10227.84	132.812	1.446418e-29
6	0.915	-0.091	NA		NA	11	-5102.930	10227.86	132.833	1.431340e-29
5	0.852	NA	NA		NA	10	-5106.070	10232.14	137.111	1.685124e-30
13	0.850	NA	NA		0.001	11	-5106.052	10234.10	139.076	6.310282e-31
10	1.770	-0.134	NA	NA	-0.007	5	-5136.389	10282.78	187.751	1.700098e-41
2	1.720	-0.100	NA	NA	NA	4	-5137.442	10282.89	187.857	1.612062e-41
1	1.655	NA	NA	NA	NA	3	-5141.187	10288.37	193.346	1.036428e-42
9	1.650	NA	NA	NA	0.001	4	-5141.145	10290.29	195.263	3.973160e-43

Simplification of the negative binomial models

Best model kept all the terms
Second model DeltaAIC > 2
Keep first model

	cond((Int))	cond(cos_td_rad)	cond(sin_td_rad)	cond(Unit_Label)	cond(year_ct)	df	logLik	AIC	delta	weight
8	0.217	0.552	-0.506		NA	13	-4411.979	8849.958	0.000	5.624061e-01
16	0.241	0.535	-0.544		-0.010	14	-4411.230	8850.460	0.502	4.375647e-01
15	0.870	NA	-0.180		-0.015	13	-4422.788	8871.577	21.619	1.136619e-05
7	0.864	NA	-0.104		NA	12	-4424.480	8872.960	23.002	5.690782e-06
6	0.857	0.110	NA		NA	12	-4424.603	8873.207	23.249	5.030707e-06
5	0.933	NA	NA		NA	11	-4425.938	8873.876	23.918	3.600903e-06
14	0.838	0.123	NA		0.003	13	-4424.551	8875.102	25.144	1.950121e-06
13	0.947	NA	NA		-0.004	12	-4425.780	8875.561	25.603	1.550635e-06
4	1.114	0.489	-0.468	NA	NA	6	-4448.700	8909.401	59.443	6.954070e-14
12	1.142	0.474	-0.504	NA	-0.009	7	-4448.065	8910.130	60.172	4.828188e-14
11	1.671	NA	-0.172	NA	-0.013	6	-4456.537	8925.073	75.115	2.747779e-17
3	1.655	NA	-0.105	NA	NA	5	-4457.838	8925.676	75.718	2.032862e-17
1	1.719	NA	NA	NA	NA	4	-4459.289	8926.578	76.620	1.295089e-17
2	1.670	0.075	NA	NA	NA	5	-4458.674	8927.348	77.390	8.811965e-18
9	1.730	NA	NA	NA	-0.003	5	-4459.217	8928.433	78.475	5.120904e-18
10	1.657	0.084	NA	NA	0.002	6	-4458.650	8929.299	79.341	3.321085e-18

Model selection

The Negative binomial model (AIC weight = 1) outperformed the Poisson model

	cond((Int))	disp((Int))	cond(cos_td_rad)	cond(sin_td_rad)	cond(Unit_Label)	cond(year_ct)	family	df	logLik	AIC	delta	weight
Best_NB	0.241		0.535	-0.544		-0.010	nbinom2(log)	14	-4411.230	8850.46	0.000	1.000000e+00
Best_Po	0.148		0.463	-0.711		-0.025	poisson(log)	13	-5034.514	10095.03	1244.568	5.565952e-271

Diagnostics and sigma

Diagnostic of best negative binomial model reveals minor issues

## Unusually large Z-statistics (|x|>5):
## 
##                       sin_td_rad            Unit_LabelLoess Areas 
##                        -5.202071                         5.409626 
## Unit_LabelPlanted Conifer Forest   Unit_LabelMediterranean Maquis 
##                         5.556414                         6.654735 
##       Unit_LabelBatha Vegetation                    d~(Intercept) 
##                        10.129983                        17.909080 
##        theta_1|point_name:site.1                   theta_1|site.1 
##                        -9.957956                       -10.387473 
## 
## Large Z-statistics (estimate/std err) suggest a *possible* failure of
## the Wald approximation - often also associated with parameters that are
## at or near the edge of their range (e.g. random-effects standard
## deviations approaching 0).  (Alternately, they may simply represent
## very well-estimated parameters; intercepts of non-centered models may
## fall in this category.) While the Wald p-values and standard errors
## listed in summary() may be unreliable, profile confidence intervals
## (see ?confint.glmmTMB) and likelihood ratio test p-values derived by
## comparing models (e.g. ?drop1) are probably still OK.  (Note that the
## LRT is conservative when the null value is on the boundary, e.g. a
## variance or zero-inflation value of 0 (Self and Liang 1987; Stram and
## Lee 1994; Goldman and Whelan 2000); in simple cases these p-values are
## approximately twice as large as they should be.)

The dispersion parameter of the best negative binomial model was >1, suggesting the negative binomial model is better

## [1] 3.235275

Main conclusions:

Continue with the best negative binomial model

## Formula:          
## Response ~ cos_td_rad + sin_td_rad + Unit_Label + year_ct + (1 |  
##     site/point_name)
## Data: MainData
##       AIC       BIC    logLik  df.resid 
##  8850.460  8925.696 -4411.230      1580 
## Random-effects (co)variances:
## 
## Conditional model:
##  Groups          Name        Std.Dev.
##  point_name:site (Intercept) 0.2096  
##  site            (Intercept) 0.2510  
## 
## Number of obs: 1594 / Conditional model: point_name:site, 574; site, 70
## 
## Dispersion parameter for nbinom2 family (): 3.24 
## 
## Fixed Effects:
## 
## Conditional model:
##                      (Intercept)                        cos_td_rad  
##                          0.24131                           0.53453  
##                       sin_td_rad         Unit_LabelNegev Highlands  
##                         -0.54391                           0.55731  
##           Unit_LabelInland Sands             Unit_LabelLoess Areas  
##                          0.41276                           0.93479  
##        Unit_LabelTransition Zone  Unit_LabelPlanted Conifer Forest  
##                          0.78200                           0.82802  
##   Unit_LabelMediterranean Maquis        Unit_LabelBatha Vegetation  
##                          0.96648                           1.59728  
##                          year_ct  
##                         -0.01015

## glmmTMB::glmmTMB(formula = Response ~ cos_td_rad + sin_td_rad + 
##     Unit_Label + year_ct + (1 | site/point_name), data = MainData, 
##     family = "nbinom2", ziformula = ~0, dispformula = ~1, na.action = na.fail)

9. Model Abundance - Identfiy best family variant of the negative binomial model

The glmmTMB function support multiple variants of many families and various link function

For poisson: ‘poisson’ vs. ‘compois’
For negative binomial: ‘nbinom1’ vs. ‘nbinom2’
For Gamma: Different link functions (log vs. identity vs. inverse)

Objectives:

Identify the best model family/link variant based on the selected fixed factors

Workflow:

Fit an alternative model with family == ‘nbinom1’
Simplify the alternative model with MuMIn::dredge()
Select the best alternative model
Use model selection to compare the best original and alternative nodels (MuMIn::model.sel)
Select the optimal model in terms of family variant
Diagnose the best model with glmmTMB:diagnose

Simplification of the negative binomial models (nbinom1)

Best model kept all terms
Second best model DeltaAIC > 2
Keep the first

	cond((Int))	cond(cos_td_rad)	cond(sin_td_rad)	cond(Unit_Label)	cond(year_ct)	df	logLik	AIC	delta	weight
16	0.453	0.519	-0.580		-0.015	14	-4447.651	8923.302	0.000	7.080189e-01
8	0.430	0.539	-0.519		NA	13	-4449.537	8925.074	1.772	2.919430e-01
15	1.052	NA	-0.231		-0.019	13	-4458.639	8943.278	19.976	3.252781e-05
7	1.052	NA	-0.136		NA	12	-4461.544	8947.087	23.785	4.843230e-06
5	1.149	NA	NA		NA	11	-4465.029	8952.059	28.757	4.032874e-07
6	1.124	0.038	NA		NA	12	-4464.831	8953.662	30.360	1.808752e-07
13	1.155	NA	NA		-0.002	12	-4464.987	8953.974	30.672	1.548103e-07
14	1.121	0.040	NA		0.000	13	-4464.829	8955.659	32.357	6.666322e-08
12	1.206	0.453	-0.530	NA	-0.015	7	-4479.263	8972.527	49.225	1.448774e-11
4	1.172	0.470	-0.469	NA	NA	6	-4480.964	8973.928	50.626	7.188622e-12
11	1.706	NA	-0.217	NA	-0.018	6	-4487.604	8987.207	63.905	9.401576e-15
3	1.687	NA	-0.129	NA	NA	5	-4490.022	8990.045	66.743	2.275575e-15
1	1.769	NA	NA	NA	NA	4	-4492.998	8993.995	70.693	3.156753e-16
2	1.754	0.025	NA	NA	NA	5	-4492.914	8995.827	72.525	1.262838e-16
9	1.777	NA	NA	NA	-0.002	5	-4492.963	8995.926	72.624	1.201927e-16
10	1.756	0.023	NA	NA	0.000	6	-4492.912	8997.825	74.523	4.651438e-17

Model selection

The ‘nbinom2’ model outperformed the ‘nbinom1’ model.

	cond((Int))	disp((Int))	cond(cos_td_rad)	cond(sin_td_rad)	cond(Unit_Label)	cond(year_ct)	family	df	logLik	AIC	delta	weight
BestModel	0.241		0.535	-0.544		-0.010	nbinom2(log)	14	-4411.230	8850.460	0.000	1.000000e+00
Best_Opt1	0.453		0.519	-0.580		-0.015	nbinom1(log)	14	-4447.651	8923.302	72.842	1.522462e-16

Diagnostics and sigma

Diagnostic of best model reveals minor issues

## Unusually large Z-statistics (|x|>5):
## 
##                       sin_td_rad            Unit_LabelLoess Areas 
##                        -5.202071                         5.409626 
## Unit_LabelPlanted Conifer Forest   Unit_LabelMediterranean Maquis 
##                         5.556414                         6.654735 
##       Unit_LabelBatha Vegetation                    d~(Intercept) 
##                        10.129983                        17.909080 
##        theta_1|point_name:site.1                   theta_1|site.1 
##                        -9.957956                       -10.387473 
## 
## Large Z-statistics (estimate/std err) suggest a *possible* failure of
## the Wald approximation - often also associated with parameters that are
## at or near the edge of their range (e.g. random-effects standard
## deviations approaching 0).  (Alternately, they may simply represent
## very well-estimated parameters; intercepts of non-centered models may
## fall in this category.) While the Wald p-values and standard errors
## listed in summary() may be unreliable, profile confidence intervals
## (see ?confint.glmmTMB) and likelihood ratio test p-values derived by
## comparing models (e.g. ?drop1) are probably still OK.  (Note that the
## LRT is conservative when the null value is on the boundary, e.g. a
## variance or zero-inflation value of 0 (Self and Liang 1987; Stram and
## Lee 1994; Goldman and Whelan 2000); in simple cases these p-values are
## approximately twice as large as they should be.)

Main conclusions:

Continue with the ‘nbinom2’ model

## Formula:          
## Response ~ cos_td_rad + sin_td_rad + Unit_Label + year_ct + (1 |  
##     site/point_name)
## Data: MainData
##       AIC       BIC    logLik  df.resid 
##  8850.460  8925.696 -4411.230      1580 
## Random-effects (co)variances:
## 
## Conditional model:
##  Groups          Name        Std.Dev.
##  point_name:site (Intercept) 0.2096  
##  site            (Intercept) 0.2510  
## 
## Number of obs: 1594 / Conditional model: point_name:site, 574; site, 70
## 
## Dispersion parameter for nbinom2 family (): 3.24 
## 
## Fixed Effects:
## 
## Conditional model:
##                      (Intercept)                        cos_td_rad  
##                          0.24131                           0.53453  
##                       sin_td_rad         Unit_LabelNegev Highlands  
##                         -0.54391                           0.55731  
##           Unit_LabelInland Sands             Unit_LabelLoess Areas  
##                          0.41276                           0.93479  
##        Unit_LabelTransition Zone  Unit_LabelPlanted Conifer Forest  
##                          0.78200                           0.82802  
##   Unit_LabelMediterranean Maquis        Unit_LabelBatha Vegetation  
##                          0.96648                           1.59728  
##                          year_ct  
##                         -0.01015

## glmmTMB::glmmTMB(formula = Response ~ cos_td_rad + sin_td_rad + 
##     Unit_Label + year_ct + (1 | site/point_name), data = MainData, 
##     family = "nbinom2", ziformula = ~0, dispformula = ~1, na.action = na.fail)

10. Diagnose the best model with simulated residuals (DHARMa)

General info

To diagnose the model, we simulate residuals with the DHARMa package. For more details why DHARMa should be used with GLMM, see:

https://cran.r-project.org/web/packages/DHARMa/vignettes/DHARMa.html

To interpret the residuals:

Scaled residual value of 0.5 means that half of the simulated data are higher than the observed value, and half of them lower
Scaled residual value of 0.99 would mean that nearly all simulated data are lower than the observed value.
The minimum/maximum values for the residuals are 0 and 1
For a correctly specified model we would expect asymptotically
- A uniform (flat) distribution of the scaled residuals
- Uniformity in y direction if we plot against any predictor

Dispersion:

Compare the variance of the observed raw residuals (red line) against the variance of the simulated residuals (histogram)
- Dispersion > 1 indicates overdispersion
- Dispersion < 1 indicates underdispersion
Observed variance (red line) did not differ from the simulated variances (histogram)
Dispersion > 1 but n.s.
No evidence for over or under dispersion

## 
##  DHARMa nonparametric dispersion test via sd of residuals fitted vs.
##  simulated
## 
## data:  simulationOutput
## dispersion = 1.367, p-value = 0.016
## alternative hypothesis: two.sided

Outliers:

Understanding outliers:
- When all simulated values for the observation are higher or smaller than the observation itself
- Tests if the number of Outliers is larger or smaller than expected
There are 7 outliers in the simulated residuals
Test is n.s.
No evidence for excessive number of outliers

## 
##  DHARMa outlier test based on exact binomial test with approximate
##  expectations
## 
## data:  SimulatedRes
## outliers at both margin(s) = 8, observations = 1594, p-value = 0.4283
## alternative hypothesis: true probability of success is not equal to 0.003992016
## 95 percent confidence interval:
##  0.002169186 0.009865008
## sample estimates:
## frequency of outliers (expected: 0.00399201596806387 ) 
##                                            0.005018821

Zero inflation:

The expected distribution of zeros (histogram) and the observed value (red line)
- ratioObsSim < 1 : the observed data has less zeros than expected
- ratioObsSim > 1 : the observed data has more zeros than expected
No evidence for zero inflation:
- Red line within histogram
- Test is n.s

## 
##  DHARMa zero-inflation test via comparison to expected zeros with
##  simulation under H0 = fitted model
## 
## data:  simulationOutput
## ratioObsSim = 1.0796, p-value = 0.532
## alternative hypothesis: two.sided

Quantiles:

For numeric:
- Solid black / red lines: qunatile regressions (0.25, 0.5, 0.75) of the observed residuals against predicted value
  - Significant regressions in red
- Points - simulated residuals
- Dashed horizontal lines - the expected value for the 0.25, 0.5, and 0.75 quantiles (simulated residuals have a uniform distribution)
For categorical:
- Within-group uniformity of categorical predictors [asymptotic one-sample Kolmogorov-Smirnov test]
- Gray: the residual within the group are distributed according to the model assumptions (dashed lines)
- Red: the residual within the group deviate from the model assumptions (dashed lines)
- homogeneity of variances of categorical predictors [Levene’s Test]
- Non-significant (gray): group variances are constant
- Significant (red): group variances are not constant
Results:
- The 0.25 and 0.5 quantile regressions of the observed residuals against predicted values were significant, but lines are mostly horizontal and deviation occurs mainly in the extremes
- The 0.75 quantile regression of the observed residuals against predicted values was non-significant

## 
##  Test for location of quantiles via qgam
## 
## data:  simulationOutput
## p-value = 0.009107
## alternative hypothesis: both

year:

See categorical above for details (years treated as categorical)
Residuals in 2017 deviated from expected
Residuals in all other years did not deviate from expected
Variances of different years differ significantly from one another

Unit:

See categorical above for details
Residuals of 1 out of 8 units deviated from expected:
- Negev Highlands
Variances of the units differ significantly from one another

## $uniformity
## $uniformity$details
## catPred: Arid South
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  dd[x, ]
## D = 0.14578, p-value = 0.003406
## alternative hypothesis: two-sided
## 
## ------------------------------------------------------------ 
## catPred: Negev Highlands
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  dd[x, ]
## D = 0.087058, p-value = 0.0829
## alternative hypothesis: two-sided
## 
## ------------------------------------------------------------ 
## catPred: Inland Sands
## 
##  Exact one-sample Kolmogorov-Smirnov test
## 
## data:  dd[x, ]
## D = 0.18601, p-value = 0.1455
## alternative hypothesis: two-sided
## 
## ------------------------------------------------------------ 
## catPred: Loess Areas
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  dd[x, ]
## D = 0.11975, p-value = 0.03609
## alternative hypothesis: two-sided
## 
## ------------------------------------------------------------ 
## catPred: Transition Zone
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  dd[x, ]
## D = 0.059894, p-value = 0.659
## alternative hypothesis: two-sided
## 
## ------------------------------------------------------------ 
## catPred: Planted Conifer Forest
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  dd[x, ]
## D = 0.0756, p-value = 0.1527
## alternative hypothesis: two-sided
## 
## ------------------------------------------------------------ 
## catPred: Mediterranean Maquis
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  dd[x, ]
## D = 0.052242, p-value = 0.1771
## alternative hypothesis: two-sided
## 
## ------------------------------------------------------------ 
## catPred: Batha Vegetation
## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  dd[x, ]
## D = 0.036211, p-value = 0.9114
## alternative hypothesis: two-sided
## 
## 
## $uniformity$p.value
## [1] 0.003406054 0.082898104 0.145545594 0.036086718 0.659025079 0.152712776
## [7] 0.177094531 0.911361465
## 
## $uniformity$p.value.cor
## [1] 0.02724843 0.49738862 0.72772797 0.25260703 1.00000000 0.72772797 0.72772797
## [8] 1.00000000
## 
## 
## $homogeneity
## Levene's Test for Homogeneity of Variance (center = median)
##         Df F value  Pr(>F)    
## group    7  4.8347 2.1e-05 ***
##       1586                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Uniformity of residuals:

Kolmogorov-Smirnov Test (KS) of the uniformity of simulated residuals
- Residuals did not deviate from expected
Dispersion test n.s. (see above)
Outliers test n.s. (see above)
qq plot -
- Quantiles of the observed residuals vs. qunatiles of the simulated residuals
- The expected value is the 1:1 line (red)
qqline is close to the expected line

## 
##  Asymptotic one-sample Kolmogorov-Smirnov test
## 
## data:  simulationOutput$scaledResiduals
## D = 0.046494, p-value = 0.002033
## alternative hypothesis: two-sided

Main conclusions:

The 0.25 and 0.5 quantile regressions of the observed residuals against predicted values were significant, but lines are mostly horizontal and deviation occurs mainly in the extremes
Some issues with uniformity of residuals within groups (overall, years, units)
Other than that, no other issues
Continue to compute marginal effects

11. Compute marginal effects for figures

Marginal effects are extracted with ggeffects:ggeffect() that calls effects::Effect() internally
Marginal effects are only calculated and plotted for terms that appear in the final model:
- year
- unit
When calculating change for years:
- FoldIncrease: Value_year / Value_year_0
- PercentIncrease: 100 * [(Value_year - Value_year_0) / Value_year_0]
- FoldDecline: Value_year_0 / Value_year
- PercentDecline: 100 * [(Value_year_0 - Value_year) / Value_year_0]
- Delta: Value_year - Value_year_0
When calculating change for Agriculture / Settlements / Units:
- Fold: Value_level / Value_min
- PercentDiffernce: 100 * [(Value_level - Value_min) / Value_min]
- Delta: Value_level - Value_min

13. Figures for years

14. Figures for unit

15. Calculate the folds and percent change

Calculate the folds and percent change - contrast (agriculture / settlements):
- If Near > Far (coefficient > 0):
  - Contrast_Type: “Increase”
  - Contrast_Fold: exp(coefficient)
  - Contrast_Percent: 100 * (exp(coefficient) - 1)
- If Near < Far (coefficient > 0):
  - Contrast_Type: “Decrease”
  - Contrast_Fold: 1 / exp(coefficient)
  - Contrast_Percent: 100 * (1 - exp(coefficient))
- If Near = Far (coefficient = 0):
  - Contrast_Type: “NoChange”
  - Contrast_Fold: 1
  - Contrast_Percent: 0
Calculate the folds and percent change - temporal trend (years):
- Num_Years: Number of years
- If LastYear > FirstYear (coefficient > 0):
  - Year_Type: “Increase”
  - YearFold_1Year: exp(coefficient)
  - YearPercent_1Year: 100 * (exp(coefficient) - 1)
  - YearFold_NumYear: exp(coefficient * Num_Years)
  - YearPercent_NumYear: 100 * (exp(coefficien * Num_Years) - 1)
- If LastYear < FirstYear (coefficient > 0):
  - Year_Type: “Decrease”
  - YearFold_1Year: 1 / exp(coefficient)
  - YearPercent_1Year: 100 * (1 - exp(coefficient))
  - YearFold_NumYear: 1 / exp(coefficient * Num_Years)
  - YearPercent_NumYear: 100 * (1 - exp(coefficient * Num_Years))
- If LastYear = FirstYear (coefficient = 0):
  - Year_Type: “NoChange”
  - YearFold_1Year: 1
  - YearPercent_1Year: 0
  - YearFold_NumYear: 1
  - YearPercent_NumYear: 0

16. Extract information and write to files

Summary(BestModel) is written into:
- SoN23_BirdsNational_Time_Abundance_NonSynanthrope_ModSummary.txt
Fixed effects from the best model are written into:
- SoN23_BirdsNational_Time_Abundance_NonSynanthrope_Coeff.csv
General information on the model is written into:
- SoN23_BirdsNational_Time_Abundance_NonSynanthrope_ModelInfo.csv
Marginal effects of years / units are written to:
- SoN23_BirdsNational_Time_Abundance_NonSynanthrope_MargEffe_Years.csv
- SoN23_BirdsNational_Time_Abundance_NonSynanthrope_MargEffe_Units.csv
A figure is written (png, svg, pdf) for each of the marginal effect figures plotted in this document
- Names are the predictors (interactions) + indication if the points are observations, partial residuals or none:
  - “AgriUnit_Observation” : agriculture * unit ; observations
  - “SettYearUnit_PartResi” : settlements * year * unit ; partial residuals
  - “AgriYear_None” : agriculture * year, without points
- Names with “CapY_100” means that the maximal value of Y was truncated at 100 and some samples are not plotted
- Names with “_Heb” mean that all text elements are in Hebrew

17. Summary of main findings

Data structure:

The data contains 1594 samples in which 17 species passed as non-rare (and thus included in the analysis)
Total abundances per sample ranges from 0 to 117 individuals

Visual inspection of Response:

No apparent trends in total abundance with time
Considerable variability in total abundance between units, with highest values in the Mediterranean maquis unit and lowest in the inland sands unit

Model fitting (glmmTMB):

Negative binomial outperformed poisson
nbinom2 outperformed nbinom1
Diagnostics with DHARMa:
- The 0.25 and 0.5 quantile regressions of the observed residuals against predicted values were significant, but lines are mostly horizontal and deviation occurs mainly in the extremes
- Some issues with uniformity of residuals within groups (overall, years, units)

Best model call:

## glmmTMB::glmmTMB(formula = Response ~ cos_td_rad + sin_td_rad + 
##     Unit_Label + year_ct + (1 | site/point_name), data = MainData, 
##     family = "nbinom2", ziformula = ~0, dispformula = ~1, na.action = na.fail)

Best model summary:

##  Family: nbinom2  ( log )
## Formula:          
## Response ~ cos_td_rad + sin_td_rad + Unit_Label + year_ct + (1 |  
##     site/point_name)
## Data: MainData
## 
##      AIC      BIC   logLik deviance df.resid 
##   8850.5   8925.7  -4411.2   8822.5     1580 
## 
## Random effects:
## 
## Conditional model:
##  Groups          Name        Variance Std.Dev.
##  point_name:site (Intercept) 0.04392  0.2096  
##  site            (Intercept) 0.06298  0.2510  
## Number of obs: 1594, groups:  point_name:site, 574; site, 70
## 
## Dispersion parameter for nbinom2 family (): 3.24 
## 
## Conditional model:
##                                   Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                       0.241306   0.186624   1.293  0.19601    
## cos_td_rad                        0.534530   0.110482   4.838 1.31e-06 ***
## sin_td_rad                       -0.543913   0.104557  -5.202 1.97e-07 ***
## Unit_LabelNegev Highlands         0.557312   0.171071   3.258  0.00112 ** 
## Unit_LabelInland Sands            0.412763   0.245670   1.680  0.09293 .  
## Unit_LabelLoess Areas             0.934791   0.172801   5.410 6.32e-08 ***
## Unit_LabelTransition Zone         0.781998   0.174690   4.476 7.59e-06 ***
## Unit_LabelPlanted Conifer Forest  0.828017   0.149020   5.556 2.75e-08 ***
## Unit_LabelMediterranean Maquis    0.966479   0.145232   6.655 2.84e-11 ***
## Unit_LabelBatha Vegetation        1.597276   0.157678  10.130  < 2e-16 ***
## year_ct                          -0.010151   0.008288  -1.225  0.22062    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Best model coefficients:

Effect	Estimate	Std. Error	z value	Pr(>\|z\|)	2.5 %	97.5 %
(Intercept)	0.241	0.187	1.293	0.196	-0.124	0.607
cos_td_rad	0.535	0.110	4.838	0.000	0.318	0.751
sin_td_rad	-0.544	0.105	-5.202	0.000	-0.749	-0.339
Unit_LabelNegev Highlands	0.557	0.171	3.258	0.001	0.222	0.893
Unit_LabelInland Sands	0.413	0.246	1.680	0.093	-0.069	0.894
Unit_LabelLoess Areas	0.935	0.173	5.410	0.000	0.596	1.273
Unit_LabelTransition Zone	0.782	0.175	4.476	0.000	0.440	1.124
Unit_LabelPlanted Conifer Forest	0.828	0.149	5.556	0.000	0.536	1.120
Unit_LabelMediterranean Maquis	0.966	0.145	6.655	0.000	0.682	1.251
Unit_LabelBatha Vegetation	1.597	0.158	10.130	0.000	1.288	1.906
year_ct	-0.010	0.008	-1.225	0.221	-0.026	0.006

Effect plots - year:

Slightly decreasing, almost horizontal
Partial residuals look better

x	predicted	std.error	conf.low	conf.high	group	FoldIncrease	PercentIncrease	FoldDecline	PercentDecline	Delta
0	5.976	0.053	5.385	6.632	1	1.000	0.000	1.000	0.000	0.000
2	5.856	0.043	5.379	6.376	1	0.980	-2.010	1.021	2.010	-0.120
3	5.797	0.040	5.358	6.272	1	0.970	-2.999	1.031	2.999	-0.179
4	5.738	0.039	5.320	6.189	1	0.960	-3.979	1.041	3.979	-0.238
5	5.680	0.039	5.265	6.128	1	0.951	-4.949	1.052	4.949	-0.296
6	5.623	0.041	5.193	6.089	1	0.941	-5.909	1.063	5.909	-0.353
7	5.566	0.044	5.107	6.067	1	0.931	-6.859	1.074	6.859	-0.410
8	5.510	0.049	5.010	6.060	1	0.922	-7.800	1.085	7.800	-0.466
9	5.454	0.054	4.907	6.063	1	0.913	-8.731	1.096	8.731	-0.522

Effect plots - unit:

Partial residuals looks better
Highest in Herbaceous and Dwarf-Shrub Vegetation (Batha vegetation in the figures) and Mediterranean Maquis
Lowest in the Inland Sands unit
Intermediate in all other units

x	predicted	std.error	conf.low	conf.high	group	Fold	PercentDiffernce	Delta
Arid South	2.401	0.128	1.867	3.087	1	1.000	0.000	0.000
Negev Highlands	4.192	0.116	3.341	5.259	1	1.746	74.597	1.791
Inland Sands	3.628	0.209	2.406	5.470	1	1.511	51.099	1.227
Loess Areas	6.115	0.114	4.889	7.648	1	2.547	154.668	3.714
Transition Zone	5.248	0.120	4.149	6.639	1	2.186	118.584	2.847
Planted Conifer Forest	5.495	0.077	4.729	6.385	1	2.289	128.878	3.094
Mediterranean Maquis	6.311	0.070	5.502	7.240	1	2.629	162.867	3.910
Batha Vegetation	11.860	0.092	9.903	14.204	1	4.940	393.956	9.459

Results summary:

Taxa	Birds
Scale	National
SpeGroup	Resident
DataGroup	Time
Response	Abundance
Call	glmmTMB::glmmTMB(formula = Response ~ cos_td_rad + sin_td_rad + Unit_Label + year_ct + (1 \| site/point_name), data = MainData, family = “nbinom2”, ziformula = ~0, dispformula = ~1, na.action = na.fail)
N_Obs	1594
df	14
df.residuals	1580
logLIk	-4411.23
AIC	8850.46
BIC	8925.696
deviance	8822.46
sigma	3.235275
ContrastFar_Marginal	NA
ContrastNear_Marginal	NA
Contrast_Type	NA
Contrast_Fold	NA
Num_Years	9
Year_Start_Marginal	5.976214
Year_End_Marginal	5.45442
Year_Type	Decrease
YearFold_1Year	1.010203
YearFold_NumYear	1.095664
YearPercent_1Year	1.009987
YearPercent_NumYear	8.731179

Session info:

## R version 4.2.3 (2023-03-15 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 22621)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=he_IL.UTF-8  LC_CTYPE=he_IL.UTF-8    LC_MONETARY=he_IL.UTF-8
## [4] LC_NUMERIC=C            LC_TIME=he_IL.UTF-8    
## 
## attached base packages:
## [1] grid      splines   stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] extrafont_0.19    gridExtra_2.3     ggeffects_1.2.1   ggnewscale_0.4.9 
##  [5] gghalves_0.1.4    GGally_2.1.2      data.table_1.14.8 effects_4.2-3    
##  [9] carData_3.0-5     DHARMa_0.4.6      MuMIn_1.47.5      glmmTMB_1.1.8    
## [13] EnvStats_2.8.1    kableExtra_1.3.4  knitr_1.42        lubridate_1.9.2  
## [17] forcats_1.0.0     stringr_1.5.0     dplyr_1.1.1       purrr_1.0.1      
## [21] readr_2.1.4       tidyr_1.3.0       tibble_3.2.1      ggplot2_3.4.2    
## [25] tidyverse_2.0.0   rmarkdown_2.21   
## 
## loaded via a namespace (and not attached):
##   [1] TH.data_1.1-2      minqa_1.2.5        colorspace_2.1-1  
##   [4] ellipsis_0.3.2     sjlabelled_1.2.0   snakecase_0.11.0  
##   [7] estimability_1.4.1 rstudioapi_0.14    farver_2.1.1      
##  [10] fansi_1.0.4        mvtnorm_1.2-0      xml2_1.3.3        
##  [13] codetools_0.2-19   doParallel_1.0.17  cachem_1.0.7      
##  [16] jsonlite_1.8.4     nloptr_2.0.3       Cairo_1.6-0       
##  [19] Rttf2pt1_1.3.12    shiny_1.7.4        compiler_4.2.3    
##  [22] httr_1.4.5         emmeans_1.8.6      Matrix_1.5-5      
##  [25] fastmap_1.1.1      survey_4.2         cli_3.6.1         
##  [28] later_1.3.0        htmltools_0.5.5    tools_4.2.3       
##  [31] coda_0.19-4        gtable_0.3.3       glue_1.6.2        
##  [34] Rcpp_1.0.10        jquerylib_0.1.4    vctrs_0.6.1       
##  [37] svglite_2.1.1      nlme_3.1-162       extrafontdb_1.0   
##  [40] iterators_1.0.14   insight_0.19.1     xfun_0.38         
##  [43] rbibutils_2.2.16   lme4_1.1-32        rvest_1.0.3       
##  [46] mime_0.12          timechange_0.2.0   lifecycle_1.0.3   
##  [49] gap.datasets_0.0.6 MASS_7.3-58.3      zoo_1.8-11        
##  [52] scales_1.2.1       ragg_1.2.5         promises_1.2.0.1  
##  [55] hms_1.1.3          parallel_4.2.3     sandwich_3.1-0    
##  [58] TMB_1.9.6          RColorBrewer_1.1-3 yaml_2.3.7        
##  [61] sass_0.4.5         reshape_0.8.9      stringi_1.7.12    
##  [64] highr_0.10         gap_1.5-3          foreach_1.5.2     
##  [67] boot_1.3-28.1      qgam_1.3.4         Rdpack_2.6        
##  [70] rlang_1.1.0        pkgconfig_2.0.3    systemfonts_1.0.4 
##  [73] evaluate_0.20      lattice_0.21-8     labeling_0.4.2    
##  [76] tidyselect_1.2.0   plyr_1.8.8         magrittr_2.0.3    
##  [79] R6_2.5.1           generics_0.1.3     multcomp_1.4-23   
##  [82] DBI_1.1.3          pillar_1.9.0       withr_2.5.0       
##  [85] mgcv_1.8-42        survival_3.5-5     nnet_7.3-18       
##  [88] utf8_1.2.3         tzdb_0.3.0         digest_0.6.31     
##  [91] webshot_0.5.4      xtable_1.8-6       httpuv_1.6.9      
##  [94] numDeriv_2022.9-1  textshaping_0.3.6  stats4_4.2.3      
##  [97] munsell_0.5.0      viridisLite_0.4.1  bslib_0.4.2       
## [100] mitools_2.4

HaMaarag_SoN23_BirdsNational_Time_Abundance_NonSynanthrope

Yoni Gavish

2024-03-05

Dr. Yoni Gavish & Ron Chen

Data ; Science ; Environment

Statistical and Ecological Solutions

General information on the current analysis:

Take home messages:

- Abundance decreased non-significantly (p=0.221) by 1.010% per year (1.010 fold per year) and by 8.731% (1.096 fold) over the entire period:

A few general comments:

1. Prepare the working environment

2. Load data and source functions

3. Arrange the Traits table

4. Arrange and filter the Plots table

5. Clean Data and identify non-rare species

6. Create a dataframe with the relevant variables for analysis

7. Explore the Abundance data frame

8. Model Abundance - Idenitify Family and Fixed factors

9. Model Abundance - Identfiy best family variant of the negative binomial model

10. Diagnose the best model with simulated residuals (DHARMa)

11. Compute marginal effects for figures

13. Figures for years

14. Figures for unit

15. Calculate the folds and percent change

16. Extract information and write to files

17. Summary of main findings

Data structure:

Visual inspection of Response:

Model fitting (glmmTMB):

Best model call:

Best model summary:

Best model coefficients:

Effect plots - year:

Effect plots - unit:

Results summary:

Session info: