Making Response Plots with Maxnet

By Laura Duncan
May 1, 2017

Maxnet is an R package that for fitting Maxent species distribution models using the R package glmnet, rather than outsourcing to the Maxent java application. The Maxnet package was released recently (February 11, 2017, paper forthcoming by Phillips et al. in Ecography). In order to actually map species distribution models, the maxnet package will require developments such as incorperating a linkage between the Maxnet package and dismo. However, Maxnet can produce valuable response plots for maxent models within R. This tutorial shows how to bring together a species occurrence dataset with environmental predictor values in order to find response plots using maxnet.

First, we will be needing the following packages. If they are not installed, you will need to click the “install packages” icon under the packages tab in Rstudio and install them.

library(raster)
library(rgdal)
library(maps)
library(mapdata)
library(BIEN)
library(dismo)
library(rJava)
library(maptools)
library(jsonlite)
library(glmnet)
library(maxnet)

Next, we need to import climate data for our predictor variables. I used bioclimatic predictor data from WorldClim (http://worldclim.org, Hijmans et al. 2005). The WorldClim dataset has obtained current climate information at high resolution and predicted the future climate incorperating climate change using machine learning algorithms. Here, we access the WorldClim dataset at a resolution of 2.5 arcseconds of latitude and longitude. We can then trim down the environmental data to encompass only the variables and geographic area we are interested in stuyding, so I trimmed the data to contain only variables pertaining to precipitation or temperature, in the continental United States.

#get data
#current environmental data from Worldclim
currentEnv=getData("worldclim", var="bio", res=2.5)
#predictions for 2070
futureEnv=getData('CMIP5', var='bio', res=2.5, rcp=85, model='HE', year=70)
names(futureEnv)=names(currentEnv)
#limit bioclimactic predictors
currentEnv=dropLayer(currentEnv, c("bio2", "bio3", "bio4", "bio10", "bio11", "bio13", "bio14", "bio15" ))
futureEnv=dropLayer(futureEnv, c("bio2", "bio3", "bio4", "bio10", "bio11", "bio13", "bio14", "bio15" ))

#making model past v. future climate:
model.extent<-extent(-130,-60,25,50) #numbers are for map of continental US
modelEnv=crop(currentEnv,model.extent)
modelFutureEnv=crop(futureEnv, model.extent)

Next, we access species. I used species occurrence records for rice, Oryza sativa, from the BIEN dataset (http://bien.nceas.ucsb.edu/bien/), and trimmed the set to include only values with both latitude and longitude recorded in the continental United States.

#access species from BIEN dataset
Oryza_sativa<-BIEN_occurrence_species(species="Oryza sativa")
ZP<-BIEN_occurrence_species(species="Zizania palustris")
#get rid of samples without latitude and longitude values
Oryza_sativa=subset(Oryza_sativa, !is.na(longitude) & !is.na(latitude))
#limit data range (this is to North America, can change) 
Oryza_sativa <- Oryza_sativa[Oryza_sativa$lon < -60 & Oryza_sativa$lat > 25 , ]

Here, I check to see how the Oryza sativa occurrences look before putting them in conversation with the Worldclim predictions, by mapping their datapoints in red. I can also layer in other data that I have imported, such as the species occurrence records shown here in purple for wild rice, Zizania palustris.

map('worldHires',fill=T , col= "light yellow", bg="light blue",xlim = c(-130,-60),ylim = c(25,50))
points(cbind(Oryza_sativa$longitude,Oryza_sativa$latitude),col="red", pch=20, cex=0.3)
points(cbind(ZP$longitude,ZP$latitude),col="purple", pch=20, cex=0.3)

plot of chunk unnamed-chunk-5

I can also plot the species occurrence records in tandem with the WorldClim current and future climate data. Here, I choose the current Worldclim data for predictor bio17, the Precipitation of the Driest Quarter.

plot(modelEnv[["bio17"]]/10, main="Precipitation of Driest Quarter")
map('worldHires',xlim = c(-130,-60), ylim = c(25,50), fill=FALSE, add=TRUE)
points(cbind(Oryza_sativa$longitude,Oryza_sativa$latitude),col="red", pch=20, cex=0.3
Error: <text>:4:0: unexpected end of input
2: map('worldHires',xlim = c(-130,-60), ylim = c(25,50), fill=FALSE, add=TRUE)
3: points(cbind(Oryza_sativa$longitude,Oryza_sativa$latitude),col="red", pch=20, cex=0.3
  ^

By overlaying the datapoints against the 2070 predictions, we can compare the future predictions to our previous plot:

#compare to 2070
#plot(futureEnv[["bio17"]]/10, main="Precipitation of Driest Quarter")
plot(modelFutureEnv[["bio1"]]/10, main="Future Annual Mean Temperature")
map('worldHires',xlim = c(-130,-60), ylim = c(25,50), fill=FALSE, add=TRUE)
points(cbind(Oryza_sativa$longitude,Oryza_sativa$latitude),col="red", pch=20, cex=0.3)

plot of chunk unnamed-chunk-7

Here, we can already see that changes will be taking place in the future! But it would be useful to have a measure of how climate variables and species distribution seem to be correlated. This is where the maxnet package comes into play. In order to use the maxnet package, though, we will need to combine our datasets so that the presense points for occurance and absance values (random background points) are together with predictor variables to make a dataset that maxnet can use. I have commented line-by-line in order to elucidate the function of each line of code.

#we will withold 20% of our data as test data to test the accuracy of the model later
#using the other 80% of the data to make the model
Oryza_sativaocc=cbind.data.frame(Oryza_sativa$longitude,Oryza_sativa$latitude) #first, trim the data frame to include only latitudes and longitudes for the model
fold <- kfold(Oryza_sativaocc, k=5) # add an index that makes five random groups of observations
Oryza_sativatest <- Oryza_sativaocc[fold == 1, ] # hold out one fifth as test data
Oryza_sativatrain <- Oryza_sativaocc[fold != 1, ] # the other four fifths are training data

#now we make a new environment that extracts the model environment values for our occurance values
TrainEnv <- extract(modelEnv, Oryza_sativatrain)
#then, we take 1000 random background points, and extract the values for our model environment for those points
#these are our background points, our absence points
set.seed(0)
backgr <- randomPoints(modelEnv, 1000)
absvals <- extract(modelEnv, backgr)
#we make a presence/absence collumn where 1=presence for our occurance points, and 0=absence for our background points
presabs <- c(rep(1, nrow(TrainEnv)), rep(0, nrow(absvals)))
#we make a dataset that combines the presence and absence data, the background points with their model environment values, 
#and the occurance points with their model environment values
sdmdata <- data.frame(cbind(presabs, rbind(TrainEnv, absvals)))
#we make a subset of that dataset, without the presence and absence values
data <- sdmdata[,-1]
#we run the maxnet function to fit the SDM
#maxnet fits using glmnet
Oryza_sativa.me<-maxnet(sdmdata$presabs, data) 

Now that we have modeled using maxnet, we can access our variable response plots. The maxnet package is capable of producing three types of response plots based on the linear predictor (lp) and entropy of the exponential model over background data: exponential, which uses the function exp(lp), complementary log-log (cloglog), which uses the function 1-exp(-exp(entropy+lp)), and logistic, which uses the function 1/(1+exp(-entropy-lp)), according to the maxnet reference manual: https://cran.r-project.org/web/packages/maxnet/maxnet.pdf

Here is the command to generate exponential response plots:

plot(Oryza_sativa.me, type="exponential")

plot of chunk unnamed-chunk-9

Here is the command to generate the cloglog response plots:

plot(Oryza_sativa.me, type="cloglog")

plot of chunk unnamed-chunk-10

And here is the command to generate logistic response plots:

plot(Oryza_sativa.me, type="logistic")

plot of chunk unnamed-chunk-11

A helpful next step in the development of the Maxnet package would be to connect maxnet's response capabilities with dismo's graphing capabilities. Unfortunately, the recently released Maxent package is not to such a point, yet. However, the maxent package can fully investigate species distribution modeling, by accessing the maxent function through a java application. If you choose to use maxent for producing species distribution models, the applet can be downloaded here: http://biodiversityinformatics.amnh.org/open_source/maxent/. Once you download maxent, you'll need to make sure it is placed in the following folder which can be reached through the command: system.file(“java”, package=“dismo”). A great tutorial for species distribution mapping using Maxent can be found here: https://rpubs.com/kerkhoffa/SDMMaxent