Reptiles - CoastalSands alpha diversity boxplots and GLMs

Model gma, abundance and richness

Richness done with rare species. Abundance and mean abundance done without rare species.

richness

Explore data and plot mean-variance plot. There is a strong relationship, indicating that employing GLMs is the proper way to analyze, rather than OLS (assumption of homogeneity is violated).

## [1] "RICHNESS WITH RARE SPECIES"

## [1] "RICHNESS WITHOUT RARE SPECIES"

## Overlapping points were shifted along the y-axis to make them visible.

## 
##  PIPING TO 2nd MVFACTOR

## Only the variables Chalcides.sepsoides, Stenodactylus.sthenodactylus, Acanthodactylus.scutellatus, Chalcides.ocellatus, Acanthodactylus.schreiberi, Lytorhynchus.diadema, Daboia.palaestinae, Psammophis.schokari, Spalerosophis.diadema, Macroprotodon.cucullatus, Eryx.jaculus, Testudo.graeca were included in the plot 
## (the variables with highest total abundance).

Fit Poisson glm, check for existence of overdispersion

## [1] "Estimating overdispersion parameter phi: (Res. Dev.)/(n-p) where n=number of observations; p=number of parameters in the model."

## [1] "od = 1.11449779007892"

Overdispersion parameter is >1 (but marginally) and therefore Poisson is inappropriate. Check to see if negative binomial / gaussian are a better fit (gaussian is essentially OLS regression)

## Warning in theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace =
## control$trace > : iteration limit reached

## Warning in theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace =
## control$trace > : iteration limit reached

##                 df      AIC
## mdl_r.poiss.int  6 265.0697
## mdl_r.gauss.int  7 265.3580
## mdl_r.negb.int   7 267.0699

Seems that the Poisson fits the data better. Fit fixed and mixed models. Find the most parsimonious model using AIC (lowest AIC is the most probable model, preventing loss of information)

## fixed-effect model matrix is rank deficient so dropping 2 columns / coefficients

## [1] "which model is a better fit?"

##     df      AIC
## m0   6 265.0697
## me0  7 267.0273

## Start:  AIC=265.07
## richness ~ settlements * year_ct * dunes
## 
## 
## Step:  AIC=265.07
## richness ~ settlements + year_ct + dunes + settlements:year_ct + 
##     settlements:dunes + year_ct:dunes
## 
## 
## Step:  AIC=265.07
## richness ~ settlements + year_ct + dunes + settlements:year_ct + 
##     year_ct:dunes
## 
##                       Df Deviance    AIC
## - settlements:year_ct  1   55.949 263.29
## - year_ct:dunes        1   56.373 263.72
## <none>                     55.725 265.07
## 
## Step:  AIC=263.29
## richness ~ settlements + year_ct + dunes + year_ct:dunes
## 
##                 Df Deviance    AIC
## - settlements    1   55.949 261.29
## - year_ct:dunes  1   56.383 261.73
## <none>               55.949 263.29
## 
## Step:  AIC=261.29
## richness ~ year_ct + dunes + year_ct:dunes
## 
##                 Df Deviance    AIC
## - year_ct:dunes  1   56.384 259.73
## <none>               55.949 261.29
## 
## Step:  AIC=259.73
## richness ~ year_ct + dunes
## 
##           Df Deviance    AIC
## - dunes    1   57.193 258.54
## <none>         56.384 259.73
## - year_ct  1   85.069 286.41
## 
## Step:  AIC=258.54
## richness ~ year_ct
## 
##           Df Deviance    AIC
## <none>         57.193 258.54
## - year_ct  1   85.997 285.34

## 
## Call:  glm(formula = richness ~ year_ct, family = poisson, data = P)
## 
## Coefficients:
## (Intercept)      year_ct  
##      1.2488       0.1161  
## 
## Degrees of Freedom: 55 Total (i.e. Null);  54 Residual
## Null Deviance:       86 
## Residual Deviance: 57.19     AIC: 258.5

## [1] "summary of m0: full glm model with interactions between settlements, dunes and year:"

Observations	56
Dependent variable	richness
Type	Generalized linear model
Family	poisson
Link	log

χ²(5)	30.27
Pseudo-R² (Cragg-Uhler)	0.42
Pseudo-R² (McFadden)	0.11
AIC	265.07
BIC	277.22

	Est.	S.E.	z val.	p
(Intercept)	1.40	0.20	6.86	0.00
settlementsNear	-0.13	0.30	-0.43	0.66
year_ct	0.09	0.04	2.57	0.01
dunesshifting	-0.33	0.30	-1.07	0.28
settlementsNear:year_ct	0.02	0.05	0.47	0.64
settlementsNear:dunesshifting	NA	NA	NA	NA
year_ct:dunesshifting	0.04	0.05	0.80	0.42
settlementsNear:year_ct:dunesshifting	NA	NA	NA	NA
Standard errors: MLE

## [1] "summary of best model m1: richness ~year_ct:"

Observations	56
Dependent variable	richness
Type	Generalized linear model
Family	poisson
Link	log

χ²(1)	28.80
Pseudo-R² (Cragg-Uhler)	0.40
Pseudo-R² (McFadden)	0.10
AIC	258.54
BIC	262.59

	exp(Est.)	2.5%	97.5%	z val.	p
(Intercept)	3.49	2.73	4.45	10.01	0.00
year_ct	1.12	1.08	1.17	5.31	0.00
Standard errors: MLE

## Warning in predict.lm(object, newdata, se.fit, scale = residual.scale, type =
## if (type == : prediction from a rank-deficient fit may be misleading

## Warning in predict.lm(object, newdata, se.fit, scale = residual.scale, type =
## if (type == : prediction from a rank-deficient fit may be misleading

## Warning in predict.lm(object, newdata, se.fit, scale = residual.scale, type =
## if (type == : prediction from a rank-deficient fit may be misleading

## [1] "check if using only far data produces different results for the dunes factor (no shifting dunes near settlements)."

Observations	38
Dependent variable	richness
Type	Generalized linear model
Family	poisson
Link	log

χ²(3)	19.94
Pseudo-R² (Cragg-Uhler)	0.41
Pseudo-R² (McFadden)	0.11
AIC	176.21
BIC	182.77

	Est.	S.E.	z val.	p
(Intercept)	1.40	0.20	6.86	0.00
year_ct	0.09	0.04	2.57	0.01
dunesshifting	-0.33	0.30	-1.07	0.28
year_ct:dunesshifting	0.04	0.05	0.80	0.42
Standard errors: MLE

Richness: significant effect found for year.

geometric mean of abundance

Explore data.

## [1] "GEOMETRIC MEAN ABUNDANCE WITH RARE SPECIES"

## [1] "GEOMETRIC MEAN ABUNDANCE WITHOUT RARE SPECIES"

## Warning: Removed 1 rows containing non-finite values (`stat_boxplot()`).
## Removed 1 rows containing non-finite values (`stat_boxplot()`).
## Removed 1 rows containing non-finite values (`stat_boxplot()`).

Fit Gamma glmm, compare with gaussian

##                 df      AIC
## mdl_g.gamma.int  7 191.2509
## mdl_g.gauss.int  7 183.5736

Seems that the gaussian fits the data better. Fit fixed and mixed models. Find the most parsimonious model using AIC (lowest AIC is the most probable model, preventing loss of information)

## Warning in glmer(data = P_no_rare, formula = gma ~ settlements * year_ct * :
## calling glmer() with family=gaussian (identity link) as a shortcut to lmer() is
## deprecated; please call lmer() directly

## fixed-effect model matrix is rank deficient so dropping 2 columns / coefficients

## [1] "which model is a better fit?"

##     df      AIC
## m0   7 183.5736
## me0  8 194.9551

## Start:  AIC=183.57
## gma ~ settlements * year_ct * dunes
## 
## 
## Step:  AIC=183.57
## gma ~ settlements + year_ct + dunes + settlements:year_ct + settlements:dunes + 
##     year_ct:dunes
## 
## 
## Step:  AIC=183.57
## gma ~ settlements + year_ct + dunes + settlements:year_ct + year_ct:dunes
## 
##                       Df Deviance    AIC
## - settlements:year_ct  1   70.906 182.05
## - year_ct:dunes        1   71.098 182.20
## <none>                     70.289 183.57
## 
## Step:  AIC=182.05
## gma ~ settlements + year_ct + dunes + year_ct:dunes
## 
##                 Df Deviance    AIC
## - year_ct:dunes  1   71.262 180.33
## <none>               70.906 182.05
## - settlements    1   73.566 182.08
## 
## Step:  AIC=180.33
## gma ~ settlements + year_ct + dunes
## 
##               Df Deviance    AIC
## - dunes        1   73.233 179.83
## - settlements  1   73.848 180.29
## <none>             71.262 180.33
## - year_ct      1  126.309 209.81
## 
## Step:  AIC=179.83
## gma ~ settlements + year_ct
## 
##               Df Deviance    AIC
## - settlements  1   74.357 178.67
## <none>             73.233 179.83
## - year_ct      1  128.280 208.66
## 
## Step:  AIC=178.67
## gma ~ year_ct
## 
##           Df Deviance    AIC
## <none>         74.357 178.67
## - year_ct  1  128.639 206.81

## 
## Call:  glm(formula = gma ~ year_ct, family = gaussian, data = P_no_rare)
## 
## Coefficients:
## (Intercept)      year_ct  
##      1.5602       0.3974  
## 
## Degrees of Freedom: 54 Total (i.e. Null);  53 Residual
##   (1 observation deleted due to missingness)
## Null Deviance:       128.6 
## Residual Deviance: 74.36     AIC: 178.7

## [1] "summary of m0: full glm model with interactions between settlements, dunes and year:"

Observations	55 (1 missing obs. deleted)
Dependent variable	gma
Type	Linear regression

χ²(5)	58.35
Pseudo-R² (Cragg-Uhler)	0.47
Pseudo-R² (McFadden)	0.16
AIC	183.57
BIC	197.62

	Est.	S.E.	t val.	p
(Intercept)	1.55	0.55	2.82	0.01
settlementsNear	-0.06	0.84	-0.08	0.94
year_ct	0.47	0.11	4.34	0.00
dunesshifting	0.05	0.78	0.07	0.95
settlementsNear:year_ct	-0.11	0.16	-0.66	0.51
settlementsNear:dunesshifting	NA	NA	NA	NA
year_ct:dunesshifting	-0.12	0.15	-0.75	0.46
settlementsNear:year_ct:dunesshifting	NA	NA	NA	NA
Standard errors: MLE

## [1] "summary of best model m1: gma ~year_ct:"

Observations	55 (1 missing obs. deleted)
Dependent variable	gma
Type	Linear regression

χ²(1)	54.28
Pseudo-R² (Cragg-Uhler)	0.43
Pseudo-R² (McFadden)	0.15
AIC	178.67
BIC	184.69

	exp(Est.)	2.5%	97.5%	t val.	p
(Intercept)	4.76	2.51	9.04	4.77	0.00
year_ct	1.49	1.31	1.69	6.22	0.00
Standard errors: MLE

## Warning in predict.lm(object, newdata, se.fit, scale = residual.scale, type =
## if (type == : prediction from a rank-deficient fit may be misleading

## Warning in predict.lm(object, newdata, se.fit, scale = residual.scale, type =
## if (type == : prediction from a rank-deficient fit may be misleading

## Warning in predict.lm(object, newdata, se.fit, scale = residual.scale, type =
## if (type == : prediction from a rank-deficient fit may be misleading

Significant effects found for gma only for year.

abundance

Explore data

## [1] "ABUNDANCE WITH RARE SPECIES"

## [1] "ABUNDANCE WITHOUT RARE SPECIES"

Fit poisson glmm, compare with negative binomial and gaussian

##                 df       AIC
## mdl_a.negb.int   7  528.7272
## mdl_a.poiss.int  6 1732.1042
## mdl_a.gauss.int  7  559.4941

Seems that the negative binomial fits the data better. Fit fixed and mixed models. Find the most parsimonious model using AIC (lowest AIC is the most probable model, preventing loss of information)

## boundary (singular) fit: see help('isSingular')

## [1] "which model is a better fit?"

##     df      AIC
## m0   7 528.7272
## me0  8 530.7272

## Start:  AIC=526.73
## abundance ~ settlements * year_ct * dunes
## 
## 
## Step:  AIC=526.73
## abundance ~ settlements + year_ct + dunes + settlements:year_ct + 
##     settlements:dunes + year_ct:dunes
## 
## 
## Step:  AIC=526.73
## abundance ~ settlements + year_ct + dunes + settlements:year_ct + 
##     year_ct:dunes
## 
##                       Df Deviance    AIC
## - settlements:year_ct  1   62.141 524.77
## - year_ct:dunes        1   62.261 524.89
## <none>                     62.098 526.73
## 
## Step:  AIC=524.77
## abundance ~ settlements + year_ct + dunes + year_ct:dunes
## 
##                 Df Deviance    AIC
## - settlements    1   62.135 522.79
## - year_ct:dunes  1   62.231 522.89
## <none>               62.111 524.77
## 
## Step:  AIC=522.79
## abundance ~ year_ct + dunes + year_ct:dunes
## 
##                 Df Deviance    AIC
## - year_ct:dunes  1   62.242 520.92
## <none>               62.117 522.79
## 
## Step:  AIC=520.92
## abundance ~ year_ct + dunes
## 
##           Df Deviance    AIC
## - dunes    1   64.093 520.88
## <none>         62.136 520.92
## - year_ct  1   92.289 549.07
## 
## Step:  AIC=520.85
## abundance ~ year_ct
## 
##           Df Deviance    AIC
## <none>         62.318 520.85
## - year_ct  1   91.205 547.73

## 
## Call:  glm.nb(formula = abundance ~ year_ct, data = P_no_rare, init.theta = 1.293529425, 
##     link = log)
## 
## Coefficients:
## (Intercept)      year_ct  
##      2.0275       0.3603  
## 
## Degrees of Freedom: 55 Total (i.e. Null);  54 Residual
## Null Deviance:       91.21 
## Residual Deviance: 62.32     AIC: 522.8

## [1] "summary of m0: full glm model with interactions between settlements, dunes and year:"

## Error in glm.control(...) : 
##   unused argument (family = list("Negative Binomial(1.338)", "log", function (mu) 
## log(mu), function (eta) 
## pmax(exp(eta), .Machine$double.eps), function (mu) 
## mu + mu^2/.Theta, function (y, mu, wt) 
## 2 * wt * (y * log(pmax(1, y)/mu) - (y + .Theta) * log((y + .Theta)/(mu + .Theta))), function (y, n, mu, wt, dev) 
## {
##     term <- (y + .Theta) * log(mu + .Theta) - y * log(mu) + lgamma(y + 1) - .Theta * log(.Theta) + lgamma(.Theta) - lgamma(.Theta + y)
##     2 * sum(term * wt)
## }, function (eta) 
## pmax(exp(eta), .Machine$double.eps), expression({
##     if (any(y < 0)) stop("negative values not allowed for the negative binomial family")
##     n <- rep(1, nobs)
##     mustart <- y + (y == 0)/6
## }), function (mu) 
## all(mu > 0), function (eta) 
## TRUE, function (object, nsim) 
## {
##     ftd <- fitted(object)
##     rnegbin(nsim * length(ftd), ftd, .Theta)
## }))

## Warning: Something went wrong when calculating the pseudo R-squared. Returning NA
## instead.

Observations	56
Dependent variable	abundance
Type	Generalized linear model
Family	Negative Binomial(1.338)
Link	log

χ²(NA)	NA
Pseudo-R² (Cragg-Uhler)	NA
Pseudo-R² (McFadden)	NA
AIC	528.73
BIC	542.90

	Est.	S.E.	z val.	p
(Intercept)	2.30	0.41	5.59	0.00
settlementsNear	-0.20	0.60	-0.33	0.74
year_ct	0.33	0.08	4.08	0.00
dunesshifting	-0.67	0.59	-1.13	0.26
settlementsNear:year_ct	0.03	0.12	0.30	0.76
settlementsNear:dunesshifting	NA	NA	NA	NA
year_ct:dunesshifting	0.07	0.11	0.57	0.57
settlementsNear:year_ct:dunesshifting	NA	NA	NA	NA
Standard errors: MLE

## [1] "summary of best model m1: abundance ~year_ct:"

## Error in glm.control(...) : 
##   unused argument (family = list("Negative Binomial(1.2935)", "log", function (mu) 
## log(mu), function (eta) 
## pmax(exp(eta), .Machine$double.eps), function (mu) 
## mu + mu^2/.Theta, function (y, mu, wt) 
## 2 * wt * (y * log(pmax(1, y)/mu) - (y + .Theta) * log((y + .Theta)/(mu + .Theta))), function (y, n, mu, wt, dev) 
## {
##     term <- (y + .Theta) * log(mu + .Theta) - y * log(mu) + lgamma(y + 1) - .Theta * log(.Theta) + lgamma(.Theta) - lgamma(.Theta + y)
##     2 * sum(term * wt)
## }, function (eta) 
## pmax(exp(eta), .Machine$double.eps), expression({
##     if (any(y < 0)) stop("negative values not allowed for the negative binomial family")
##     n <- rep(1, nobs)
##     mustart <- y + (y == 0)/6
## }), function (mu) 
## all(mu > 0), function (eta) 
## TRUE, function (object, nsim) 
## {
##     ftd <- fitted(object)
##     rnegbin(nsim * length(ftd), ftd, .Theta)
## }))

## Warning: Something went wrong when calculating the pseudo R-squared. Returning NA
## instead.

Observations	56
Dependent variable	abundance
Type	Generalized linear model
Family	Negative Binomial(1.2935)
Link	log

χ²(NA)	NA
Pseudo-R² (Cragg-Uhler)	NA
Pseudo-R² (McFadden)	NA
AIC	522.85
BIC	528.92

	exp(Est.)	2.5%	97.5%	z val.	p
(Intercept)	7.60	4.68	12.32	8.22	0.00
year_ct	1.43	1.31	1.57	7.52	0.00
Standard errors: MLE

## Warning in predict.lm(object, newdata, se.fit, scale = residual.scale, type =
## if (type == : prediction from a rank-deficient fit may be misleading

## Warning in predict.lm(object, newdata, se.fit, scale = residual.scale, type =
## if (type == : prediction from a rank-deficient fit may be misleading

## Warning in predict.lm(object, newdata, se.fit, scale = residual.scale, type =
## if (type == : prediction from a rank-deficient fit may be misleading

Abundance data: significant effects found for KKL plantings settlements, year (decrease) and for interactions between year and KKL plantings.

Reptiles - CoastalSands alpha diversity boxplots and GLMs

Ron Chen

2023-01-30

Coastal Plain Sands

Model gma, abundance and richness

richness

geometric mean of abundance

abundance