Threats, context, and factors needed to protect African biodiversity.

We’ll use long-term data from Kruger National Park to explore R. This will lead into the Endangered Species Act.

Resources

Resources/data/knpHerbivores.rdata

Objectives

For next time

Read The future of sub-Saharan Africa’s biodiversity in the face of climate and societal change.

  1. We continue to look at long-term population data for megaherbivores at Kruger National Park. Climate change is but one of several threats to African biodiversity. Select one of the challenges included in Figure 1 of this paper and summarize how funding from developed countries could be targeted to have the most impact, directly or indirectly, on conservation efforts.

Post no more than two paragraphs to Sakai for your group.

  1. Post your answers to questions on the analysis from last time to Sakai. Do this in groups.

Substrates at KNP, with acidic, infertile rocks in the west and fertile volcanic rocks in the east.

Continue today

Background reading: Clark, J. S., C. L. Scher, and M. Swift. 2020. The emergent interactions that govern biodiversity change. Proceedings of the National Academy of Sciences, 117, 17074-17083.

Here is a function to read:

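columnSplit <- function(vec, sep='_', ASFACTOR = FALSE, ASNUMERIC = FALSE,
                        LASTONLY = FALSE){
  # split each element of vec on sep; return a matrix with one row per element
  # assumes every element splits into the same number of pieces as vec[1]
  vec <- as.character(vec)
  nc  <- length( strsplit(vec[1], sep)[[1]] )
  mat <- matrix( unlist( strsplit(vec, sep) ), ncol=nc, byrow=TRUE )
  if(LASTONLY && ncol(mat) > 2){
    # keep the last piece separate; rejoin all earlier pieces into one column
    # columnPaste is a helper that is not defined in this section
    rnn <- mat[,1]
    for(k in 2:(ncol(mat)-1)){
      rnn <- columnPaste(rnn,mat[,k])
    }
    mat <- cbind(rnn,mat[,ncol(mat)])
  }
  if(ASNUMERIC){
    # use ncol(mat), because LASTONLY can reduce the number of columns
    mat <- matrix( as.numeric(mat), ncol=ncol(mat) )
  }
  if(ASFACTOR){
    mat <- data.frame(mat)
  }
  mat
}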
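
As a minimal sketch of what the default call does, the snippet below splits a few made-up site labels on the underscore using only base R. The labels are invented for illustration and are not taken from xdata. With all logical arguments left at their defaults, columnSplit should return the same kind of character matrix.

```r
# hypothetical labels, invented for illustration
labels <- c( 'shp_1', 'shp_10.1', 'shp_10.2' )

# split each label on '_' and stack the pieces as rows of a character matrix
parts <- strsplit( labels, '_' )
mat   <- matrix( unlist( parts ), ncol = 2, byrow = TRUE )

mat
dim( mat )   # one row per label, one column per piece
```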

Load data

There are three objects to load here:

load( 'data/knpHerbivores.rdata', verbose = T )
## Loading objects:
##   xdata
##   ydata
##   edata

All three of these objects are organized as site-year (rows) by variables (columns). The data.frame ydata holds counts from a fixed-wing aircraft. Sampling effort, in km\(^2\), is held in edata. Predictors are in the data.frame xdata.

The first goal is to look at the relationship between rainfall and species abundances. I use the function tapply to obtain precipitation means by year and by site. In this example the site is identified by the column shape. I then determine the anomalies from the mean.

prec     <- xdata$rainfall
precYear <- tapply( prec, xdata$year, mean, na.rm=T)   # aggregate to mean for year
precSite <- tapply( prec, xdata$shape, mean, na.rm=T)  # aggregate to mean for site
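
To see what tapply returns, here is a small sketch on a toy data.frame. The object toy and its values are invented for illustration; only the column names mirror xdata. Each call returns one value per unique level of the grouping vector, named by that level.

```r
# toy data, invented for illustration: three sites observed in two years
toy <- data.frame(
  shape    = c( 'a', 'a', 'b', 'b', 'c', 'c' ),
  year     = c( 2000, 2001, 2000, 2001, 2000, 2001 ),
  rainfall = c( 400, 600, 500, 700, NA, 300 )
)

# one mean per unique year; the names of the result are the years
toyYear <- tapply( toy$rainfall, toy$year, mean, na.rm = TRUE )

# one mean per unique site; the names of the result are the site labels
toySite <- tapply( toy$rainfall, toy$shape, mean, na.rm = TRUE )

toyYear
toySite
```

The missing value for site c in 2000 is dropped by na.rm = TRUE, so the means for that year and that site are based on the remaining observations.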

Exercise 1: Look at the objects xdata, precYear, and precSite.

How are they related?
  1. What are the dimensions of each object?
  2. What are the names (or rownames/colnames) of the elements in each object?
  3. How do their dimensions relate to one another?
  4. How many sites are there?
  5. How many years are there?

Year plot

The function plot( x, y ) needs the numeric vectors x and y. They must have the same length. I use this function to plot annual precipitation:

year <- as.numeric( names(precYear) )
plot( jitter( xdata$year ), prec, cex=.1 )
lines( year, precYear, lwd = 2 )

Exercise 2: Look at the objects xdata$year and prec.

How are they related?
  1. What are the lengths of each vector?
  2. What are the names of elements in each object, and what do these names represent?
  3. What is the object created by names(precYear)?
  4. What is the object created by as.numeric( names(precYear) ), and why is it used here?

Exercise 3: For a normally distributed random variable, 95% of points lie within 1.96 standard deviations of the mean. Add lines to the graph that bound 95% of observations, that is, the mean \(\pm\) 1.96 standard deviations. To get the standard deviation use the function tapply, replacing mean with sd (for standard deviation).

Map plot

I now want to see how rainfall is distributed across the map. The locations of sites are held in the xdata columns lon and lat. The mean precipitation for each site is held in the object precSite that I created above. So I need to get the location of each site in precSite from xdata. I do this by matching the names of precSite with the rows in xdata that have the matching shape column. Once I have these matching rows, I can get lon and lat for those rows. Here’s the algorithm:

Step 1. match the sites between names(precSite) and xdata$shape:

mm    <- match( names( precSite ), xdata$shape )    

# check that they match:
all.equal( names(precSite), xdata$shape[mm] )    # are they equivalent?
## [1] TRUE
# show me:
rbind( names(precSite), xdata$shape[mm] )[,1:5] # they are the same
##      [,1]    [,2]       [,3]       [,4]       [,5]      
## [1,] "shp_1" "shp_10.1" "shp_10.2" "shp_10.3" "shp_10.4"
## [2,] "shp_1" "shp_10.1" "shp_10.2" "shp_10.3" "shp_10.4"
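
The behavior of match can be checked on a small example. The vectors below are invented for illustration. For each element of its first argument, match returns the position of the first occurrence in the second argument. That is enough here only under the assumption that a site has the same lon and lat in every row where it appears.

```r
# made-up values for illustration
sites <- c( 'a', 'b', 'c' )                       # unique site names
rows  <- c( 'b', 'a', 'c', 'a', 'b', 'c', 'c' )   # site label for each row of a table

# for each element of sites, the position of its first occurrence in rows
m <- match( sites, rows )

m             # positions in rows
rows[ m ]     # the labels found at those positions
```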

Exercise 5: Which rows in xdata match the names of precSite?

Step 2. Generate a color palette for dry to wet:

Here’s a source I often use for colors. I select a diverging color scheme and export. From the JavaScript option I copy the hex codes. I pass this color vector to the function colorRampPalette, which returns a function cfun. Calling cfun( n ) returns a new vector of n interpolated colors:

library( scales )
co    <- c('#8c510a','#bf812d','#dfc27d','#80cdc1','#35978f','#01665e') # color ramp
cfun  <- colorRampPalette( co )    
n     <- 12
show_col( cfun( n ), borders = NA )

# show more colors:
show_col( cfun( 36 ), borders = NA, cex = .7 )

Step 3. Assign a color to each precSite value, ranging from low (dry) to high (wet):

bin    <- cut( precSite, n )        # assign each value to a color bin
bnames <- names( table( bin ) )     # find the unique bins
cc     <- match( bin, bnames )      # replace bin name with the interval number
col    <- cfun( n )[ cc ]           # assign the color for that bin
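
A toy version of this step may help. The values in x, the number of bins nb, and the two-color palette pal are invented for illustration. This sketch uses as.integer on the factor returned by cut, which gives the same interval numbers as matching against the bin names.

```r
# made-up values for illustration
x  <- c( 10, 55, 30, 95, 70 )
nb <- 3                                   # number of bins (hypothetical choice)

# a two-color palette function, for illustration only
pal <- colorRampPalette( c( '#8c510a', '#01665e' ) )

bin  <- cut( x, nb )                      # factor of equal-width intervals
k    <- as.integer( bin )                 # interval number, 1 = lowest values
colx <- pal( nb )[ k ]                    # one color per element of x

data.frame( x, bin, k, colx )
```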

Step 4. Assign a symbol size to each precSite value, ranging from low (small) to high (large):

Symbol size in the plot function is a scaling factor, the default being cex = 1. I select a range of scaling factors from 0.4 to 1.4:

cex   <- .4 + (precSite - min(precSite))/(max(precSite) - min(precSite))
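
The same rescaling can be checked on a few made-up numbers. The vector x is invented for illustration. The fraction maps the smallest value to 0 and the largest to 1, and adding .4 shifts that range upward.

```r
# made-up values for illustration
x <- c( 300, 450, 600, 750 )

# rescale to the interval from 0 to 1, then add the smallest symbol size
s <- .4 + ( x - min(x) )/( max(x) - min(x) )

s
range( s )    # smallest and largest scaling factors
```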

Exercise 6: What is the object cfun, and what arguments does it take? How long are the vectors col and cex, and what do they hold?

Here is a map of mean precipitation for each site, based on color and size:

plot( xdata$lon[mm], xdata$lat[mm], cex = cex, col = col, asp = 1, pch = 16 )

Exercise 7: What is the range of rainfall across this map? Why is asp used in the call to plot?

Anomalies

I can express variation over years or over sites as anomalies. An anomaly is often taken as the difference between an observation and its mean value, divided by its standard deviation. Here are anomalies from the site means and the year means.

site  <- as.character( xdata$shape )  # character mode needed for names
yname <- as.character( xdata$year )

precYearSd   <- tapply( prec, yname, sd, na.rm=T)               # standard deviation for year
precSiteSd   <- tapply( prec, site, sd, na.rm=T)                # standard deviation for site
precSiteAnom <- (prec - precSite[ site ])/precSiteSd[ site ]    # yr anom for site
precYearAnom <- (prec - precYear[ yname ])/precYearSd[ yname ]  # site anom for year

par( mfrow = c(2,1), bty = 'n', mar = c(4,4,1,1) )
plot( jitter( xdata$year ), prec, cex=.1, xlab = '')            # the absolute value
lines( as.numeric(names(precYear)), precYear )

plot( jitter( xdata$year ), precYearAnom, cex=.1, xlab = 'Year') # distribution of anomalies by year
abline(h=0, lty=2)
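
Here is a minimal sketch of the same calculation on toy data. The vectors rain and grp are invented for illustration and do not come from xdata. Group summaries from tapply are expanded back to one value per observation by indexing with the group labels.

```r
# toy data, invented for illustration
rain <- c( 400, 600, 500, 700, 300, 500 )
grp  <- c( 'a', 'a', 'a', 'b', 'b', 'b' )       # group label for each observation

gMean <- tapply( rain, grp, mean )              # one mean per group
gSd   <- tapply( rain, grp, sd )                # one standard deviation per group

# indexing by name returns one group summary per observation
anom <- ( rain - gMean[ grp ] )/gSd[ grp ]

anom
tapply( anom, grp, mean )                       # anomalies are centered within each group
tapply( anom, grp, sd )                         # and scaled within each group
```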

Exercise 8: What are the objects called site and yname? Why are they characters? What purpose do they serve? What are the units for anomalies?