Species distribution modeling


## Warning: package 'dismo' was built under R version 4.1.3

## Loading required package: raster

## Warning: package 'raster' was built under R version 4.1.3

## Loading required package: sp

## Warning: package 'sp' was built under R version 4.1.3

## Warning: package 'maptools' was built under R version 4.1.3

## Checking rgeos availability: TRUE
## Please note that 'maptools' will be retired by the end of 2023,
## plan transition at your earliest convenience;
## some functionality will be moved to 'sp'.

## Loading required namespace: rJava

## This is MaxEnt version 3.4.3

1

We will use the same data to illustrate all models, except that some models cannot use categorical variables. So for those models we drop the categorical variables from the predictors stack.
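As a minimal sketch of that step, the categorical layer can be removed with `dropLayer()` from the raster package. This assumes the predictors are the example rasters shipped with dismo, which is consistent with the column names printed further down but is not stated in this document; the object names `predictors` and `pred_nf` are illustrative.

```r
library(dismo)  # also attaches raster and sp

# Example predictor rasters shipped with dismo (assumed data source)
files <- list.files(file.path(system.file(package = "dismo"), "ex"),
                    pattern = "grd$", full.names = TRUE)
predictors <- stack(files)

# 'biome' is the categorical layer; drop it for models that need numeric input
pred_nf <- dropLayer(predictors, "biome")
names(pred_nf)
```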

2

We use the Bradypus data as presence records for a species. First we split them into a training and a testing set.

To speed up processing, let’s restrict the predictions to a smaller area (defined by a rectangular extent):

Next we sample background data and split it into a training and a testing set. The first layer in the RasterStack is used as a ‘mask’. That ensures that random points only occur within the spatial extent of the rasters, and within cells that are not NA, and that there is only a single absence point per cell. Here we further restrict the background points to be within 12.5% of our specified extent ‘ext’.
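The data preparation described above could look roughly like the sketch below. It assumes the Bradypus records and rasters shipped with dismo, a five-fold split with one fold held out, 1000 background points, and `extf = 1.25`. Those choices match the 23 test presences, 200 test absences, and the extent printed in the outputs, but the seeds and object names are assumptions rather than the code actually used here.

```r
library(dismo)

# Predictors and presence records shipped with dismo (assumed data source)
predictors <- stack(list.files(file.path(system.file(package = "dismo"), "ex"),
                               pattern = "grd$", full.names = TRUE))
pred_nf <- dropLayer(predictors, "biome")
bradypus <- read.csv(file.path(system.file(package = "dismo"), "ex", "bradypus.csv"))
bradypus <- bradypus[, -1]  # keep lon/lat, drop the species column

# Presence points: hold out one of five folds for testing
set.seed(0)
group <- kfold(bradypus, 5)
pres_train <- bradypus[group != 1, ]
pres_test  <- bradypus[group == 1, ]

# Rectangular extent (xmin, xmax, ymin, ymax)
ext <- extent(-90, -32, -33, 23)

# Background points: the first layer of pred_nf acts as the mask,
# and sampling is limited to 'ext' adjusted by the multiplier extf
set.seed(10)
backg <- randomPoints(pred_nf, n = 1000, ext = ext, extf = 1.25)
colnames(backg) <- c("lon", "lat")
group <- kfold(backg, 5)
backg_train <- backg[group != 1, ]
backg_test  <- backg[group == 1, ]
```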

# Bioclim

## class          : ModelEvaluation 
## n presences    : 23 
## n absences     : 200 
## AUC            : 0.6890217 
## cor            : 0.1765706 
## max TPR+TNR at : 0.08592151

## Find a threshold

## [1] 0.08592151
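To make the two summary numbers concrete, the base R sketch below computes an AUC and the threshold that maximizes TPR + TNR for a handful of made-up scores. The helper names `auc_rank` and `max_tpr_tnr` and all values are hypothetical; dismo's `evaluate()` and `threshold()` may differ in detail, for example in which candidate thresholds are searched.

```r
# Hypothetical model scores, not taken from the models in this document
pres_scores <- c(0.9, 0.8, 0.35, 0.7)      # predictions at test presence points
abs_scores  <- c(0.1, 0.4, 0.3, 0.2, 0.6)  # predictions at test background points

# AUC as the probability that a random presence outscores a random absence
# (ties count one half)
auc_rank <- function(p, a) {
  mean(outer(p, a, function(x, y) (x > y) + 0.5 * (x == y)))
}

# Threshold maximizing sensitivity (TPR) + specificity (TNR),
# searched over the observed score values
max_tpr_tnr <- function(p, a) {
  cand <- sort(unique(c(p, a)))
  total <- sapply(cand, function(t) mean(p >= t) + mean(a < t))
  cand[which.max(total)]
}

auc_value <- auc_rank(pres_scores, abs_scores)
best_cut  <- max_tpr_tnr(pres_scores, abs_scores)

# Presence/absence call for a new score using that cutoff
new_score <- 0.75
is_present <- new_score >= best_cut
```

Cells scoring at or above the cutoff are treated as predicted presences, which is how a continuous prediction raster is turned into a presence/absence map.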

We then use the RasterStack of predictor variables to make a prediction, which is returned as a RasterLayer:

## class      : RasterLayer 
## dimensions : 112, 116, 12992  (nrow, ncol, ncell)
## resolution : 0.5, 0.5  (x, y)
## extent     : -90, -32, -33, 23  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs 
## source     : memory
## names      : layer 
## values     : 0, 0.7096774  (min, max)

## The Domain algorithm

## class          : ModelEvaluation 
## n presences    : 23 
## n absences     : 200 
## AUC            : 0.7097826 
## cor            : 0.2138087 
## max TPR+TNR at : 0.7107224

## Mahalanobis distance

## class          : ModelEvaluation 
## n presences    : 23 
## n absences     : 200 
## AUC            : 0.7686957 
## cor            : 0.1506777 
## max TPR+TNR at : 0.1116504

## Classical regression models

##   pa bio1 bio12 bio16 bio17 bio5 bio6 bio7 bio8 biome
## 1  1  263  1639   724    62  338  191  147  261     1
## 2  1  263  1639   724    62  338  191  147  261     1
## 3  1  253  3624  1547   373  329  150  179  271     1
## 4  1  243  1693   775   186  318  150  168  264     1
## 5  1  252  2501  1081   280  326  154  172  270     1
## 6  1  240  1214   516   146  317  150  168  261     2

Generalized Linear Models

Here we fit two basic GLMs. All continuous variables are used (the categorical biome variable is left out), without interaction terms.

## Logistic regression:

## 
## Call:
## glm(formula = pa ~ bio1 + bio5 + bio6 + bio7 + bio8 + bio12 + 
##     bio16 + bio17, family = binomial(link = "logit"), data = envtrain)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -1.73818  -0.49933  -0.23999  -0.06579   2.91501  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)   
## (Intercept)  3.0540581  1.5774812   1.936  0.05286 . 
## bio1         0.0906378  0.0577068   1.571  0.11626   
## bio5         0.2403921  0.2520843   0.954  0.34028   
## bio6        -0.3227360  0.2541727  -1.270  0.20417   
## bio7        -0.3212603  0.2528026  -1.271  0.20380   
## bio8        -0.0133843  0.0255062  -0.525  0.59976   
## bio12        0.0020951  0.0006931   3.023  0.00250 **
## bio16       -0.0023747  0.0014546  -1.633  0.10256   
## bio17       -0.0047152  0.0015473  -3.047  0.00231 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 596.69  on 892  degrees of freedom
## Residual deviance: 449.77  on 884  degrees of freedom
## AIC: 467.77
## 
## Number of Fisher Scoring iterations: 8

##  (Intercept)         bio1         bio5         bio6         bio7         bio8 
##  3.054058115  0.090637822  0.240392091 -0.322736029 -0.321260278 -0.013384258 
##        bio12        bio16        bio17 
##  0.002095136 -0.002374688 -0.004715157

## class          : ModelEvaluation 
## n presences    : 23 
## n absences     : 200 
## AUC            : 0.8156522 
## cor            : 0.308183 
## max TPR+TNR at : -2.565312

## class          : ModelEvaluation 
## n presences    : 23 
## n absences     : 200 
## AUC            : 0.7886957 
## cor            : 0.3629842 
## max TPR+TNR at : 0.08711959
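The first evaluation above reports a negative threshold (-2.565312). A likely reason is that `predict()` on a `glm` object returns values on the link scale by default, which for a logit link means log-odds rather than probabilities. The sketch below illustrates this on synthetic data; the data frame `d` and its variables are made up, and the second model is assumed to be a Gaussian GLM, which the document does not state.

```r
set.seed(1)

# Synthetic data (hypothetical; not the Bradypus data)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$pa <- rbinom(n, 1, plogis(-1 + 1.5 * d$x1 - 0.5 * d$x2))

# Logistic regression and a Gaussian GLM with the same additive formula
gm1 <- glm(pa ~ x1 + x2, family = binomial(link = "logit"), data = d)
gm2 <- glm(pa ~ x1 + x2, family = gaussian(link = "identity"), data = d)

# predict() on a glm defaults to the link scale (log-odds for gm1)
link <- predict(gm1, newdata = d)
prob <- predict(gm1, newdata = d, type = "response")

# plogis() is the inverse logit, so it maps log-odds to probabilities
same_scale <- isTRUE(all.equal(unname(prob), unname(plogis(link))))

# The same conversion applied to the threshold reported above
thr_prob <- plogis(-2.565312)
```

On the probability scale that threshold is about 0.07, which is comparable in size to the thresholds reported for the other models.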

# Machine learning methods

## This is MaxEnt version 3.4.3

## A response plot:

## class          : ModelEvaluation 
## n presences    : 23 
## n absences     : 200 
## AUC            : 0.8336957 
## cor            : 0.3789954 
## max TPR+TNR at : 0.1772358

## Random Forest

## Warning: package 'randomForest' was built under R version 4.1.3

## randomForest 4.7-1.1

## Type rfNews() to see new features/changes/bug fixes.

## Warning in randomForest.default(m, y, ...): The response has five or fewer
## unique values. Are you sure you want to do regression?

## class          : ModelEvaluation 
## n presences    : 23 
## n absences     : 200 
## AUC            : 0.8580435 
## cor            : 0.5010053 
## max TPR+TNR at : 0.1060667

## Support Vector Machines

## Warning: package 'kernlab' was built under R version 4.1.3

## 
## Attaching package: 'kernlab'

## The following objects are masked from 'package:raster':
## 
##     buffer, rotated

## class          : ModelEvaluation 
## n presences    : 23 
## n absences     : 200 
## AUC            : 0.7576087 
## cor            : 0.3738667 
## max TPR+TNR at : 0.02857293

# Combining model predictions

### Now we can compute the simple average:

### Weighted mean of three models:
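A small base R sketch of both combinations follows. The AUC values are copied from the evaluations in this document, but which three models were combined is an assumption (here a GLM, Random Forest, and SVM), as is the weighting by squared distance of AUC from 0.5, which follows the approach in the dismo vignette. The per-cell predictions are hypothetical.

```r
# AUC values reported above (the choice of these three models is assumed)
auc <- c(glm = 0.7886957, rf = 0.8580435, svm = 0.7576087)

# Weight each model by how far its AUC is from 0.5 (a random model gets 0),
# then normalize so the weights sum to 1
w <- (auc - 0.5)^2
w <- w / sum(w)

# Hypothetical predictions from the three models for three raster cells
preds <- cbind(glm = c(0.10, 0.40, 0.80),
               rf  = c(0.05, 0.55, 0.90),
               svm = c(0.20, 0.30, 0.70))

# Simple average: every model counts equally
simple_avg <- rowMeans(preds)

# Weighted mean: better-scoring models contribute more
weighted_avg <- as.vector(preds[, names(w)] %*% w)
```

With RasterLayer predictions the same arithmetic applies cell by cell.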

Ciaran Kelly

10/11/2022
