Processing Environmental Data

Hannah Owens
1 February, 2017

Getting Started

Libraries

library(raster); 
  #For raster-based loading, calculations, and mapping
library(rgdal);
  #For reading the M polygon

Order of Operations

  • Introducing raster stacks
  • Masking environmental datasets to training regions
  • Reducing correlation among environmental variables

Introducing Raster Stacks

Remember this?

The simple way to load a raster data layer

altitude <- raster(x = "~/Dropbox/GBIF:BID/FINAL_DATA/ProvidedData/ForOpenLab/Terrestrial/BioClim2_5/Present/altitude.asc");
crs(altitude) <- "+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"
plot(altitude, main="Altitude");

Enter the raster stack

setwd("~/Dropbox/GBIF:BID/FINAL_DATA/ProvidedData/ForOpenLab/Terrestrial/BioClim2_5/Present/");
envtList <- list.files(pattern = "\\.asc$");
  #pattern is a regular expression; this matches only file names ending in .asc
envtStack <- stack(envtList);
crs(envtStack) <- "+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"
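
A minimal sketch of what a raster stack is, using two small synthetic layers in place of the .asc files. The names altitudeDemo, bio1Demo, and demoStack are hypothetical and appear only in this example.

```r
library(raster)

# Two synthetic layers that share the same extent and resolution
# (hypothetical stand-ins for the real environmental layers).
set.seed(1)
altitudeDemo <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
values(altitudeDemo) <- runif(ncell(altitudeDemo), min = 0, max = 3000)

bio1Demo <- raster(altitudeDemo)  # copies the grid geometry, not the values
values(bio1Demo) <- 25 - 0.006 * values(altitudeDemo)

# A stack bundles layers on a common grid into one object.
demoStack <- stack(altitudeDemo, bio1Demo)
names(demoStack) <- c("altitude", "bio1")

nlayers(demoStack)   # number of layers
names(demoStack)     # layer names
demoStack[["bio1"]]  # extract a single layer by name
```

All layers in a stack must share one grid, which is why every file in the folder needs the same extent and resolution.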

Enter the raster stack

plot(envtStack);

Masking

Defining a training region from which background is drawn.

  • mask(x, mask,…)
  • Input can be raster or raster stack
  • Mask can be a shapefile or a raster
  • A raster mask must match the input's extent and resolution
    • Use extend() and/or crop() to align extents
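
The points above can be sketched on synthetic data. This is an illustration only: envtLayer and trainingRegion are hypothetical names, the triangle is an arbitrary polygon, and its coordinates are assumed to be in the same coordinate system as the raster.

```r
library(raster)

# Synthetic 10 x 10 layer (hypothetical stand-in for an environmental layer).
set.seed(1)
envtLayer <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
values(envtLayer) <- runif(ncell(envtLayer))

# Arbitrary triangular training region built with sp classes.
triangle <- cbind(x = c(2, 8, 5, 2), y = c(2, 2, 8, 2))
trainingRegion <- sp::SpatialPolygons(
  list(sp::Polygons(list(sp::Polygon(triangle)), ID = "M")))

# crop() trims the raster to the polygon's bounding box;
# mask() then sets cells outside the polygon to NA.
cropped <- crop(envtLayer, trainingRegion)
masked <- mask(cropped, trainingRegion)

ncell(cropped)               # cells inside the bounding box
sum(!is.na(values(masked)))  # cells that remain inside the polygon
```

Cropping first is optional with a polygon mask, but it keeps the masked output small.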

Masking: Loading your "M"

setwd("~/Dropbox/ENMSeminar/Labs:Homeworks/Lab5/Lab5Data/");
copperPheasant <- readOGR("./SyrmaticusSoemmerringii/Syrmaticus soemmerringii.shp");
## OGR data source with driver: ESRI Shapefile 
## Source: "/Users/HannahOwens/Dropbox/ENMSeminar/Labs:Homeworks/Lab5/Lab5Data/SyrmaticusSoemmerringii/Syrmaticus soemmerringii.shp", layer: "Syrmaticus soemmerringii"
## with 1 features
## It has 1 fields
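crs(copperPheasant) <- crs(envtStack);
  #Labels the polygon with the stack's CRS; this does not reproject, so the shapefile must already be in WGS84 longitude/latitude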

Masking: Loading your "M"

plot(envtStack[[1]], main="Altitude");
plot(copperPheasant, add = TRUE);

Masking: Taking it to the next level

copperPheasantTraining <- mask(envtStack, copperPheasant)
writeRaster(
  copperPheasantTraining, filename = "SyrmaticusSoemmerringii", 
  format = "ascii", bylayer = TRUE, suffix = names(envtStack), 
  NAflag = -9999, overwrite = TRUE);
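
To see what a by-layer write produces, here is a small sketch that writes a synthetic two-layer stack to a temporary folder. The stack, the "demoSpecies" base name, and the output folder are all hypothetical.

```r
library(raster)

# Synthetic two-layer stack (hypothetical stand-in for the masked stack).
set.seed(2)
layerA <- raster(nrows = 5, ncols = 5, xmn = 0, xmx = 5, ymn = 0, ymx = 5)
values(layerA) <- runif(ncell(layerA))
layerB <- layerA * 2
demoStack <- stack(layerA, layerB)
names(demoStack) <- c("altitude", "bio1")

# Write into a scratch folder so nothing in the working directory is touched.
outDir <- file.path(tempdir(), "writeRasterDemo")
dir.create(outDir, showWarnings = FALSE)

# bylayer = TRUE writes one file per layer; suffix supplies the per-layer name.
writeRaster(demoStack, filename = file.path(outDir, "demoSpecies"),
            format = "ascii", bylayer = TRUE, suffix = names(demoStack),
            NAflag = -9999, overwrite = TRUE)

writtenFiles <- list.files(outDir, pattern = "\\.asc$")
writtenFiles
```

Each layer ends up in its own .asc file, which is the layout most niche modeling tools expect.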

Masking: Taking it to the next level

plot(copperPheasantTraining, ylim = c(24,46), xlim = c(122,146));

Raster Correlations

When it comes to predictor variables, more isn't always better!

  • Highly correlated predictors carry redundant information and can lead to overfit models.

Raster Correlations

Checking variable correlations.

pairs(copperPheasantTraining);
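
With many layers, a pairs() plot gets hard to read. One possible complement is a numeric correlation matrix computed from the cell values. The sketch below uses synthetic layers, and the 0.8 cutoff is an assumption for illustration, not a value from this lab.

```r
library(raster)

# Three synthetic layers: varB is nearly a rescaled copy of varA,
# varC is independent noise.
set.seed(3)
template <- raster(nrows = 20, ncols = 20, xmn = 0, xmx = 20, ymn = 0, ymx = 20)
varA <- setValues(template, rnorm(ncell(template)))
varB <- varA * 2 + setValues(template, rnorm(ncell(template), sd = 0.1))
varC <- setValues(template, rnorm(ncell(template)))
demoStack <- stack(varA, varB, varC)
names(demoStack) <- c("varA", "varB", "varC")

# getValues() returns a cells-by-layers matrix; "complete.obs" drops
# cells that are NA in any layer, such as masked-out cells.
corMatrix <- cor(getValues(demoStack), use = "complete.obs", method = "pearson")
round(corMatrix, 2)

# List each pair once whose absolute correlation exceeds the cutoff.
threshold <- 0.8  # assumed cutoff, adjust to taste
highPairs <- which(abs(corMatrix) > threshold & upper.tri(corMatrix), arr.ind = TRUE)
data.frame(var1 = rownames(corMatrix)[highPairs[, "row"]],
           var2 = colnames(corMatrix)[highPairs[, "col"]],
           r = corMatrix[highPairs])
```

Which member of a correlated pair to keep is a biological judgment; the matrix only flags the candidates.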

Raster Correlations

Removing the highly correlated variables.

reducedCopperPheasantTraining <- copperPheasantTraining[[-3:-6]];
  #Removes Bio 10 through Bio 13.
reducedCopperPheasantTraining <- reducedCopperPheasantTraining[[-6:-8]];
  #Removes Bio 17 through Bio 19
reducedCopperPheasantTraining <- reducedCopperPheasantTraining[[-9:-11]];
  #Removes Bio 5 through Bio 7
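
Numeric indices depend on the order in which the files were listed, and they shift after every removal. Dropping layers by name is one alternative that avoids this. The sketch below uses a synthetic stack with hypothetical layer names; check names() on your own stack before adapting it.

```r
library(raster)

# Synthetic stack with hypothetical layer names in alphabetical file order.
template <- raster(nrows = 4, ncols = 4, xmn = 0, xmx = 4, ymn = 0, ymx = 4)
layerNames <- c("altitude", "bio1", "bio10", "bio11", "bio2")
demoLayers <- lapply(seq_along(layerNames), function(i) {
  setValues(template, rep(i, ncell(template)))
})
demoStack <- stack(demoLayers)
names(demoStack) <- layerNames

# Look up positions by name, then drop them all in one step.
toDrop <- c("bio10", "bio11")
reducedDemo <- dropLayer(demoStack, match(toDrop, names(demoStack)))
names(reducedDemo)
```

Because the lookup is by name, the same call keeps working if files are added to or removed from the folder.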

Raster Correlations

Final set.

pairs(reducedCopperPheasantTraining);
