Visualising and manipulating environmental data using the packages raster, ggplot2 and sf

Creating a new project

One of the most powerful and useful aspects of RStudio is its project management functionality. This allows us to ensure all our R Code, Data and R-generated figures are all stored in a single project folder in your directory.

To create a New Project in RStudio:

  • Click the File menu button, then New Project….
  • Click Existing Directory.
  • Browse… to the location where your saved folder ANM302_ComputerLab is stored.
  • Click on this folder, then Click Open. You want to read data (and write code, data and figures) to this ANM302_ComputerLab folder, not the in R or Shapes sub folders that lie within.
  • Click Create Project.

This step will allow you to keep your code for ANM302 together and easily accessible.

Now save and close any script files you have open, then quit RStudio.

Next, use your file browser to navigate to the ANM302_ComputerLab folder where your ANM302 data is stored. You should now see a file with the extension .Rproj. This is your ANM302 project.

Remembering that you have quit RStudio, you can now relaunch it by double-clicking your .Rproj file. This should be your “go-to” way of opening RStudio. It has one specific advantage beyond simply keeping track of your scripts: it automatically sets the working directory, saving us a bit of time and effort every week.

Installing packages

If this is the first time you have run this R script, you will need to install the necessary R packages to your local R library. We’ve put together a handy R script that allows you to install these packages in a couple of lines of code. You can source this R function from your local file using the code below.

#### Install packages
source("R/ANM302_install_packages.R")

#### Update packages
update.packages(ask = FALSE,
                lib = my_lib,
                repos = favourite_CRAN_mirror)

Note that you only install R packages once (well, until the next time you download R from the web).

Loading R libraries

Once packages are installed, you can load them from your local library using the library() function.

#### Load packages
library(raster)
library(rgdal)
library(maptools)
library(maps)
library(sf)
library(ggplot2)
library(tmap)
library(tmaptools)
library(ggthemes)
library(dplyr)

Importing data as a RasterLayer

The first step in manipulating environmental data in R is to import your dataset as a RasterLayer. This is a gridded object, with a value for the variable of interest in each grid cell. A gridded object is essentially a matrix of data organised in rows and columns. This is the format of pixelated images, such as .jpg. In the spatial context, grids are extremely useful, because we can think of maps as being specified in terms of latitude (rows) and longitude (columns).

We know that koala can be sensitive to heat stress, so perhaps a good place to start would be to explore the maximum temperatures that our study region in Southeast Queensland experiences.

We can get an indication of maximum temperature using climatological variables from the WorldClim portal http://worldclim.org. These climatological variables represent average conditions over the period 1970-2000. We’ll use the new version 2.0 dataset, at 30 arc-second resolution. As our surveys are taking place in September, we’ll use the monthly layer with the ‘09’ suffix.

We’ll use the raster function to import the data directly into our current R environment as a RasterLayer.

R knows this is an object with spatial characteristics.

tmax <- raster("Data/wc2.0_30s_tmax_09.tif")

We can look at the properties of the object, such as dimensions, resolution, extent and coordinate reference system by simply calling the object to the Console.

tmax
## class      : RasterLayer 
## dimensions : 21600, 43200, 933120000  (nrow, ncol, ncell)
## resolution : 0.008333333, 0.008333333  (x, y)
## extent     : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs 
## source     : wc2.0_30s_tmax_09.tif 
## names      : wc2.0_30s_tmax_09 
## values     : -57.9, 44.5  (min, max)

Initial visualisation

Let’s plot the data to check it. Note that because this is a BIG, global dataset, so it might take a moment or two to plot, especially on older machines.

plot(tmax)

Looks good.

But, this global dataset is too large for our purposes. Let’s try restricting it to the extent of the Sunshine Coast and surrounds.

We can do that using the crop function.

Cropping a raster to a smaller spatial extent

First, we need to define a spatial extent, using the extent function. For this, we need to know the vertices (corners) of our study area.

The function extent takes four arguments, in order: xmin, xmax, ymin, ymax. Remembering that we don’t have to name the arguments if we enter values in the same order as the arguments themselves, we can supply vertices using approximate latitude and longitude of the area we’re interested in.

studyarea <- extent(c(151, 154, -27, -24))

Or, alternatively, it’s possible to take a spatial extent from an existing raster or spatial object (assuming we have one, which we don’t, but we can use studyarea, which we just created to illustrate the principle).

studyarea2 <- extent(studyarea)

Now we can use this extent object to crop the original global RasterLayer, tmax, to the area we’ve specified.

The new extent is shown in the object summary.

tmaxSC <- crop(tmax, studyarea)
tmaxSC
## class      : RasterLayer 
## dimensions : 360, 360, 129600  (nrow, ncol, ncell)
## resolution : 0.008333333, 0.008333333  (x, y)
## extent     : 151, 154, -27, -24  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs 
## source     : memory
## names      : wc2.0_30s_tmax_09 
## values     : 17.5, 25.9  (min, max)

Let’s run a quick check plot, too…

plot(tmaxSC)

Great! Looks familiar! So we now have the data for our study region.

End of Computer Lab 1

Reprojection to an appropriate Coordinate Reference System

Currently, our data are projected in a latitude-longitude Coordinate Reference System (CRS). However, this will distort the data, because the Earth is a sphere and we are plotting on a 2-dimensional plane. To eliminate associated artefacts, we’re going to want to make sure our data are projected appropriately.

Let’s reproject the data from a latitude-longitude format to a Universal Transverse Mercator (UTM) projection, which allows us to measure things in metres.

The UTM system divides the Earth’s surface into 60 zones, each 6° of longitude in width. Each has a corresponding alphanumeric code. You can see all of the UTM zones here: http://www.dmap.co.uk/utmworld.pdf. The region of Southeast Queensland that we are interested in is in UTM Zone 56J.

We need to tell R how to reproject our dataset to this UTM zone, using a standardised character string called a PROJ4 string. We can get the relevant PROJ4 string for our region from https://epsg.io (the previous resource at http://spatialreference.org is now depreciated).

epsg.io doesn’t need the ‘J’ from ‘56J’, so we can drop that. It does, however, need to know that we are in the Southern Hemisphere, so we add an ‘S’ to make ‘UTM Zone 56S’ and use that as our search term on the home page. That gives us three options. We need to select the ‘WGS84’ option. This is the datum that is used in most GPS technologies. If we look at the EPSG number next to the reference to “WGS 84 / UTM zone 56S”, it gives us the numeric code assigned to this projection - “32756”.

We’ll assign it to an object, here called UTM56S.

UTM56S <- CRS("EPSG:32756")

Now we’ll use this to reproject our cropped RasterLayer, tmaxSC. We’ll use the built-in projectRaster function.

tmaxSC.UTM <- projectRaster(tmaxSC, crs = UTM56S)
tmaxSC.UTM
## class      : RasterLayer 
## dimensions : 372, 377, 140244  (nrow, ncol, ncell)
## resolution : 837, 923  (x, y)
## extent     : 291528, 607077, 7007032, 7350388  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=56 +south +datum=WGS84 +units=m +no_defs 
## source     : memory
## names      : wc2.0_30s_tmax_09 
## values     : 17.5917, 25.9  (min, max)

Success! You can see that the coord.ref description has changed to the new projection, as has the extent.

Just while we’re at it, let’s take the CRS of the original RasterLayer; we’ll likely need it later:

lonlat <- projection(tmaxSC) # This is the same as coding lonlat <- CRS("EPSG:4326")

Changing colour scales

R has a number of built-in colour scales. There is a good guide to colours in R here:https://www.nceas.ucsb.edu/~frazier/RSpatialGuides/colorPaletteCheatsheet.pdf

We could use the rainbow scale to replot the data. The rev function reverses the colour scale, so that warmer colours are in warmer shades.

plot(tmaxSC.UTM, col = rev(rainbow(255)))

We can also change the range of values defining the colour scale, using the zlim argument to plot.

plot(tmaxSC.UTM, col = rev(rainbow(255)), zlim = c(15,27))

End of Computer Lab 2

Changing the spatial resolution of a RasterLayer

You may wish to change the spatial resolution of your RasterLayer. This means changing the size of the grid cells. You can do this using the resample function.

First, let’s get the resolution of the original layer (this is in metres, as x and then y):

res(tmaxSC.UTM)
## [1] 837 923

Then, we need to define a new raster grid with appropriate extent, and projection. We can redefine the resolution using a new object, res.new

Let’s go for 1km x 1km.

res.new <- c(1000,1000)
newgrid <- raster(ext = extent(tmaxSC.UTM), crs = projection(tmaxSC.UTM), res = res.new)
tmaxSC.1km <- resample(tmaxSC.UTM, newgrid)
tmaxSC.1km
## class      : RasterLayer 
## dimensions : 343, 316, 108388  (nrow, ncol, ncell)
## resolution : 1000, 1000  (x, y)
## extent     : 291528, 607528, 7007388, 7350388  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=56 +south +datum=WGS84 +units=m +no_defs 
## source     : memory
## names      : wc2.0_30s_tmax_09 
## values     : 17.81538, 25.8987  (min, max)
plot(tmaxSC.1km, col = rev(rainbow(255)), zlim = c(15,27))

We could also use aggregate to make the spatial resolution of an existing RasterLayer coarser. Let’s aggregate our 1-km resolution RasterLayer to make a 5-km resolution RasterLayer.

tmaxSC.5km <- aggregate(tmaxSC.1km, fact = 5, fun=mean, na.rm = TRUE)
tmaxSC.5km
## class      : RasterLayer 
## dimensions : 69, 64, 4416  (nrow, ncol, ncell)
## resolution : 5000, 5000  (x, y)
## extent     : 291528, 611528, 7005388, 7350388  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=56 +south +datum=WGS84 +units=m +no_defs 
## source     : memory
## names      : wc2.0_30s_tmax_09 
## values     : 18.68501, 25.83309  (min, max)
plot(tmaxSC.5km, col = rev(rainbow(255)), zlim = c(15,27))

End of Computer Lab 3

Plotting with ggplot vs tmap

First, ggplot

OK, so we’ve used base plotting routines, but we’re going to get more fancy, so we need to start playing in ggplot. Unfortunately, we need to convert a RasterLayer into a data.frame before we can use ggplot to make a map. Fortunately, however, this isn’t difficult:

dgrid <- as.data.frame(tmaxSC.1km, xy = TRUE) # Returns xy coordinates for each cell with the corresponding values in those cells
    names(dgrid)[3] <- "vals" # Rename the values to something less unwieldy

Now, we can use ggplot:

ggplot() +
    geom_tile(data = dgrid,
                        aes(x = x, y = y, fill = vals))

Not perfect, but interesting. Let’s save this plot to an object, then modify it:

p1 <- ggplot() +
            geom_tile(data = dgrid,
                                aes(x = x, y = y, fill = vals))

OK, so we now have an object called p1, which is the basic plot, which we can now modify!

Let’s start by modifying the colour. Previously, we have used rainbow, but it is more common to use a red-yellow-blue diverging colour palette. But where can we find out what sorts of palette’s are available, you might ask? Well, here is the most useful place I know of: http://colorbrewer2.org/#type=sequential&scheme=BuGn&n=3.

The default page identifies a blue-green (‘BuGN’) sequential palette. Let’s see if we can find a red-yellow-blue diverging palette. Yes, it’s called ‘RdYlBu’. Here’s how we use it:

p2 <- p1 +
        scale_fill_distiller(type = "seq", palette = "RdYlBu")
p2

This seems to have warm areas mapped in red, cool areas in blue, and intermediate temperatures mapped in lighter shades. We could reverse this pattern using the argument direction = -1, but we don’t need to do so here. But can we get rid of that dark grey sea?

p2 <- p1 +
        scale_fill_distiller(type = "seq", palette = "RdYlBu", na.value = "grey99", # grey99 is pretty close to white, but not quite...
                                                 name = "Temperature (°C)") # A name for the legend 
p2

Another thing that appears problematic is the aspect ratio: the map seems “stretched out” horizontally. Since our base RasterLayer has a resolution of 1 km x 1 km, we could use an equal-coordinate projection on p2 to resolve this:

p3 <- p2 +
            coord_equal()
p3

Looking good…but perhaps we want to clear up some niggles, like the coordinates being huge numbers. We could fix this sort of thing manually, but we can also resort to a theme. There are plenty of neat themes in package ggthemes, but the one we want is theme_map:

p4 <- p3 +
            theme_map() + # Use a clean map theme
            theme(legend.position = c(0.525, 0.615), # Place the legend near the top in an area of blank sea
                        legend.background = element_blank())  # Make the the legend background transparent
p4

That’s looking quite good now. Of course, we have built the plot stage by stage here (to save us repeating the code), but you could easily have done it in one go.

Adding contours

Perhaps we want to add isoterms at 1° intervals to the plot…we can do this in ggplot using the contour geom:

p5 <- p4 +
    geom_contour(data = dgrid,
                                aes(x = x, y = y, z = vals),
                             binwidth = 1, # Contours at 1° intervals
                             colour = grey(0.5), # Make them light grey
                             size = .5, # Half the default line width
                             alpha = .5) # Slightly transparent
p5

Adding a scalebar and north arrow

The ggspatial package allows the functionality to add a scalebar and a north arrow to our ggplot2 objects. If you have installed the ggspatial package, you can add a scale bar using the annotation_scale() and annotation_north_arrow() functions

library(ggspatial)
p6 <- p5 + 
  ggspatial::annotation_scale(width_hint = 0.5,  #A scale bar 
                              location = "br") + # In the bottom-right 
  ggspatial::annotation_north_arrow(which_north = "true", # Point to actual north -
                                    location = "tr", # In the bottom-right-hand corner of the plot 
                                    height = unit(.5, "cm"), width = unit(.5, "cm"), # Make it small-ish 
                                    pad_x = unit(1, "cm"), pad_y = unit(1, "cm"), # move it in from the frame edge by a cm in each direction 
                                    style = north_arrow_orienteering(text_size = 5)) # Make the "N" half the size of its 10-point default 
p6

Next, tmap
tm_shape(tmaxSC.1km) + # The baseline data; uses "+" to add complexity
    tm_raster(palette = "-RdYlBu", # Note that the legend scale runs low (top) to high (bottom), so reverse the colour sequence
                        style = "cont", # If you omit this, you get temperature categories, instead
                        title = "Mean Sept Temp (°C)") + # Legend title
    tm_layout(legend.position = c("right", "top"), # Position of the legend
                        legend.title.size = .8, # Make legend title slightly smaller
                        legend.text.size = .6) + # Make labelling text smaller
    tm_scale_bar() + # Add a scale bar
    tm_compass(position = c("right", "top")) + # Add a compass arrow in the top right - note how it knows that there is a legend there, already
    tm_graticules() # If you happen to want them

For those who want to try something really cool:
In your console, try switching to an interactive view by running this code: tmap_mode("view")…then recreate the map. Notice how you can now zoom in and out? If you search the internet a bit, you will find all sorts of cool background layers you can add like this. To return to the conventional map for printing, run tmap_mode("plot").

Save the map as a PDF

We should know how to do this from ANM203…

ggsave("My first map.pdf", p5)

End of Computer Lab 4

Writing a RasterLayer to a portable format

We’ve made some new objects for our project, which we now need to save outside the R environment. To do this, we can use the writeRaster function.

R has a native format for raster files, known as .grd. To save in .grd format,

writeRaster(tmaxSC.UTM, filename = "tmaxSC_UTM.grd", overwrite= TRUE)
## class      : RasterLayer 
## dimensions : 372, 377, 140244  (nrow, ncol, ncell)
## resolution : 837, 923  (x, y)
## extent     : 291528, 607077, 7007032, 7350388  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=56 +south +datum=WGS84 +units=m +no_defs 
## source     : tmaxSC_UTM.grd 
## names      : wc2.0_30s_tmax_09 
## values     : 17.5917, 25.9  (min, max)

This works well enough, but the resultant files are not transferable to other platforms such as ArcGIS. An sensible alternative is to save as a different, more portable format, such as ASCII or GeoTIFF.

NB. Be sure to change the file extension, and don’t use full stops in the filename, or this might not work.

writeRaster(tmaxSC, filename = "tmaxSC.asc", format = "ascii", overwrite = TRUE) # An ASCII file readable by other GIS packages
writeRaster(tmaxSC.UTM, filename = "tmaxSC_UTM.tif", format = "GTiff", overwrite = TRUE) # A georeferenced TIFF, or GeoTIFF

For a list of other possible formats, see ?writeRaster.

End of Computer Lab 4

Making a RasterStack

RasterStacks allow us to store several RasterLayers of matching extent, resolution and projection in one multi-layer object.

Let’s make a RasterStack of maximum temperature and minimum temperature over the study area.

First, we need to repeat the steps above to create a RasterLayer of minimum temperature.

tmin <- raster("Data/wc2.0_30s_tmin_09.tif")
tminSC <- crop(tmin, studyarea)
tminSC.UTM <- projectRaster(tminSC, crs=UTM56S)
plot(tminSC.UTM, col = rev(rainbow(255)))

Now we can stack the two RasterLayers using the stack function.

tStack <- stack(tmaxSC.UTM, tminSC.UTM)
tStack
## class      : RasterStack 
## dimensions : 372, 377, 140244, 2  (nrow, ncol, ncell, nlayers)
## resolution : 837, 923  (x, y)
## extent     : 291528, 607077, 7007032, 7350388  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=56 +south +datum=WGS84 +units=m +no_defs 
## names      : wc2.0_30s_tmax_09, wc2.0_30s_tmin_09 
## min values :         17.591704,          6.837061 
## max values :              25.9,              18.2

We can now see in the dimensions slot of tStack that there are two layers, and these are listed by names.

Basic raster algebra

One advantage of a RasterStack is that we can do quick-and-dirty plots:

plot(tStack)

Results are not pretty, but are OK for an initial inspection.

Another advantage (because the values in each grid cell line up across RasterLayers) is that we can manipulate layers simultaneously, and do fancy maths. Let’s illustrate this by using our RasterStack of maximum and minimum temperatures to calculate a new RasterLayer of temperature difference.

To achieve this, we will subtract minimum temperature tmin from maximum temperature tmax. We can do this easily, just by subtracting our tmin RasterLayer from the tmax RasterLayer.

tvar.UTM <- tmaxSC.UTM - tminSC.UTM
tvar.UTM
## class      : RasterLayer 
## dimensions : 372, 377, 140244  (nrow, ncol, ncell)
## resolution : 837, 923  (x, y)
## extent     : 291528, 607077, 7007032, 7350388  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=56 +south +datum=WGS84 +units=m +no_defs 
## source     : memory
## names      : layer 
## values     : 5, 14.9785  (min, max)

As we can see, this yields a single RasterLayer of the same resolution and extent as the RasterStack, and with each grid cell containing the difference between the minimum and maximum temperatures for that grid cell.

Performing calculations over a RasterStack using overlay

We could also use the function overlay, with a short user-defined function that we want to apply to the layers, using x and y. For more information on the range of ways in which one can apply this function, see ?overlay:

tvarSC.UTM <- overlay(tStack, fun = function(x, y) {x-y})
tvarSC.UTM
## class      : RasterLayer 
## dimensions : 372, 377, 140244  (nrow, ncol, ncell)
## resolution : 837, 923  (x, y)
## extent     : 291528, 607077, 7007032, 7350388  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=56 +south +datum=WGS84 +units=m +no_defs 
## source     : memory
## names      : layer 
## values     : 5, 14.9785  (min, max)

Now try to use ggplot() or tmap() to create a more impactful figure using this new tvarSC.UTM object.

End of Computer Lab 5

Representing point locations as SpatialPoints

OK, so rasters are a great way to store, manipulate and plot data. But we often want to extract data from maps. This entails querying spatial points with in the mapped domain, and extracting resultant data.

Assuming we have a series of sample locations in the file called PointLocations.csv, let’s see what we can do:

pts <- read.csv("PointLocations.csv") # Get the data
head(pts)
##          x         y
## 1 152.8292 -25.43750
## 2 152.3125 -26.97083
## 3 151.5542 -24.60417
## 4 151.4875 -24.42917
## 5 151.9542 -26.10417
## 6 151.5125 -24.82083

So, we have a data frame of x (= longitude) and y (= latitude) coordinates. Before we can really use these in a mapping context (at least in a serious way), we need to convert them into a type of spatial object. We do this with the sf (Simple Features) package.

When we create new spatial points objects from an excel table, we need to assgn our new points object a CRS (in this case WGS84).

pts.sp <- st_as_sf(pts, coords = c("x", "y")) # Convert pts from a data frame into an sf object, using x and y as coordinates
st_crs(pts.sp) <- CRS(lonlat) # Make sure that we assign the lon-lat CRS so that R knows how this object is projected
pts.sp
## Simple feature collection with 50 features and 0 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: 151.0208 ymin: -26.97083 xmax: 152.9625 ymax: -24.0375
## CRS:           GEOGCRS["unknown",
##     DATUM["World Geodetic System 1984",
##         ELLIPSOID["WGS 84",6378137,298.257223563,
##             LENGTHUNIT["metre",1]],
##         ID["EPSG",6326]],
##     PRIMEM["Greenwich",0,
##         ANGLEUNIT["degree",0.0174532925199433],
##         ID["EPSG",8901]],
##     CS[ellipsoidal,2],
##         AXIS["longitude",east,
##             ORDER[1],
##             ANGLEUNIT["degree",0.0174532925199433,
##                 ID["EPSG",9122]]],
##         AXIS["latitude",north,
##             ORDER[2],
##             ANGLEUNIT["degree",0.0174532925199433,
##                 ID["EPSG",9122]]]]
## First 10 features:
##                      geometry
## 1   POINT (152.8292 -25.4375)
## 2  POINT (152.3125 -26.97083)
## 3  POINT (151.5542 -24.60417)
## 4  POINT (151.4875 -24.42917)
## 5  POINT (151.9542 -26.10417)
## 6  POINT (151.5125 -24.82083)
## 7  POINT (152.7292 -26.75417)
## 8  POINT (151.1292 -26.04583)
## 9  POINT (152.6792 -25.37083)
## 10 POINT (151.0708 -26.47917)

Great. R tells us that we now have 50 point ‘geometries’ that are projected on the standard lon-lat scale. Obviously, we couldn’t plot these directly onto our UTM-projected map, because the map and the points are on different projections, so we need to fix that before proceeding.

Reprojecting spatial data

Let’s reproject our lon-lat sf geometries into UTM space:

pts.UTM <- st_transform(pts.sp, UTM56S)
pts.UTM
## Simple feature collection with 50 features and 0 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: 302702 ymin: 7016610 xmax: 496272.5 ymax: 7340466
## Projected CRS: WGS 84 / UTM zone 56S
## First 10 features:
##                    geometry
## 1  POINT (482822.8 7186596)
## 2    POINT (431771 7016610)
## 3  POINT (353627.2 7278112)
## 4  POINT (346664.7 7297420)
## 5  POINT (395424.3 7112360)
## 6  POINT (349668.9 7254072)
## 7    POINT (473071 7040764)
## 8  POINT (312823.5 7117898)
## 9  POINT (467722.5 7193951)
## 10 POINT (307701.4 7069806)

As simple as that…although it takes a little practice to avoid making mistakes!

Plotting spatial objects over raster objects

Now, we can simply add our sptial points to a plot of a projected raster:

plot(tmaxSC.UTM, col = rev(rainbow(255)))
plot(pts.UTM, add = TRUE)

Or in ggplot:

p5 +
    geom_sf(data = pts.UTM)

Notice how R adjusts the projection of the plot to match the projection of the sf point geometries we added. We can now also see the power of writing a ggplot to an object: it saves us many lines of typing code!

Sampling environmental data from point locations

Plotting the points is just one trick. A equally powerful trick is extracting data from the raster cells corresponding to those points. For this, we use the extract function:

tmax.pts <- extract(tmaxSC.UTM, pts.UTM)
tmax.pts
##  [1] 24.50460 24.31241 23.21335 24.64500 23.77582 22.78361 23.21331 23.04236
##  [9] 24.50000 23.34173 21.22707 24.79328 24.58616 24.20788 22.63787 24.49802
## [17] 25.10000 23.77705 24.97118 24.54060 24.70000 21.29193 25.04275 22.73327
## [25] 22.34494 22.79420 23.83144 24.92029 23.09407 24.19928 23.20700 23.93175
## [33] 24.60000 24.58031 23.10618 23.60000 24.00734 22.42545 24.93936 25.06429
## [41] 22.87119 23.39578 24.77719 24.74081 22.87933 23.75841 23.67101 25.23747
## [49] 25.13868 23.07944

Plot the extracted values as a histogram - are they as we would expect?

hist(tmax.pts, col="grey", main="Max temp (°C)")

Sampling environmental data from point locations over a RasterStack

Extracting values from a single RasterLayer is a useful trick, but it would be much more useful if we could extract values over an entire RasterStack. And it turns out that we can:

tStack.pts <- as.data.frame(extract(tStack, pts.UTM))
head(tStack.pts)
##   wc2.0_30s_tmax_09 wc2.0_30s_tmin_09
## 1          24.50460          14.69953
## 2          24.31241          10.10000
## 3          23.21335          12.20597
## 4          24.64500          12.35187
## 5          23.77582          10.13756
## 6          22.78361          10.69949

As you can see, the names have been taken directly from the filenames used to build the RasterStack, so it is easy to keep track of which values go where. We can rename the variables using names:

names(tStack.pts) <- c("tmax", "tmin")
head(tStack.pts)
##       tmax     tmin
## 1 24.50460 14.69953
## 2 24.31241 10.10000
## 3 23.21335 12.20597
## 4 24.64500 12.35187
## 5 23.77582 10.13756
## 6 22.78361 10.69949

Combining extracted data and spatial locations into a dataframe

Of course, just having the values isn’t that useful if we don’t also have the meta-data that come along with those values, but luckily R returns the extracted values in the sequence of points used to extract them. So tying together the coordinates (or other meta-data) and extracted values is straightforward (except for the fact that we ):

latlons <- st_coordinates(pts.sp) # Convert sf point geometries into a matrix of coordinates
    # Note that we made pts.UTM from pts.sp, so these are in the same order!
latlons <- as.data.frame(latlons) # Convert from matrix to data frame
names(latlons) <- c("lon","lat") # Change the names
coords.UTM <- as.data.frame(st_coordinates(pts.UTM))
names(coords.UTM) <- c("UTMx", "UTMy")
temps <- cbind(latlons, coords.UTM, tStack.pts)
head(temps)
##        lon       lat     UTMx    UTMy     tmax     tmin
## 1 152.8292 -25.43750 482822.8 7186596 24.50460 14.69953
## 2 152.3125 -26.97083 431771.0 7016610 24.31241 10.10000
## 3 151.5542 -24.60417 353627.2 7278112 23.21335 12.20597
## 4 151.4875 -24.42917 346664.7 7297420 24.64500 12.35187
## 5 151.9542 -26.10417 395424.3 7112360 23.77582 10.13756
## 6 151.5125 -24.82083 349668.9 7254072 22.78361 10.69949

Presto!

End of Computer Lab 6

Importing shapefiles

Lets import a shapefile of Local Government Authority (LGA) boundaries from ESRI ArcGIS into our R environment.

**NB*.** Do not move or copy the .shp out of its folder, or you risk corrupting it!

councils <- st_read("Shapes/Local_Government_Areas.shp")
## Reading layer `Local_Government_Areas' from data source 
##   `/Users/rdwyer2/Library/CloudStorage/OneDrive-UniversityoftheSunshineCoast/USC Teaching/ANM302/2021/Workshops/ANM302_ComputerLab/Shapes/Local_Government_Areas.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 78 features and 8 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 137.9946 ymin: -29.17927 xmax: 153.5519 ymax: -9.087991
## Geodetic CRS:  WGS 84
st_crs(councils) <- lonlat # Remember, I told you that you might need this CRS!

OK, so what sort of object do we have now?

class(councils) 
## [1] "sf"         "data.frame"

Interesting, it’s an sf that includes a data.frame.

What variables are in the data.frame?

names(councils) 
## [1] "ADMINTYPEN" "ADMINAREAN" "LGA_CODE"   "ABBREV_NAM" "LGA"       
## [6] "CA_AREA_SQ" "SHAPE_Leng" "SHAPE_Area" "geometry"

One of these is the listing of local government areas. Let’s look at it, then make sure that R knows this is a factor variable:

councils$LGA
##  [1] "Torres Shire"                     "Torres Strait Island Regional"   
##  [3] "Townsville City"                  "Weipa Town"                      
##  [5] "Western Downs Regional"           "Whitsunday Regional"             
##  [7] "Winton Shire"                     "Woorabinda Aboriginal Shire"     
##  [9] "Wujal Wujal Aboriginal Shire"     "Yarrabah Aboriginal Shire"       
## [11] "Aurukun Shire"                    "Balonne Shire"                   
## [13] "Banana Shire"                     "Barcaldine Regional"             
## [15] "Barcoo Shire"                     "Blackall Tambo Regional"         
## [17] "Boulia Shire"                     "Brisbane City"                   
## [19] "Bulloo Shire"                     "Bundaberg Regional"              
## [21] "Burdekin Shire"                   "Burke Shire"                     
## [23] "Cairns Regional"                  "Carpentaria Shire"               
## [25] "Cassowary Coast Regional"         "Central Highlands Regional"      
## [27] "Charters Towers Regional"         "Cherbourg Aboriginal Shire"      
## [29] "Cloncurry Shire"                  "Cook Shire"                      
## [31] "Croydon Shire"                    "Diamantina Shire"                
## [33] "Doomadgee Aboriginal Shire"       "Douglas Shire"                   
## [35] "Etheridge Shire"                  "Flinders Shire"                  
## [37] "Fraser Coast Regional"            "Gladstone Regional"              
## [39] "Gold Coast City"                  "Goondiwindi Regional"            
## [41] "Gympie Regional"                  "Hinchinbrook Shire"              
## [43] "Hope Vale Aboriginal Shire"       "Ipswich City"                    
## [45] "Isaac Regional"                   "Kowanyama Aboriginal Shire"      
## [47] "Livingstone Shire"                "Lockhart River Aboriginal Shire" 
## [49] "Lockyer Valley Regional"          "Logan City"                      
## [51] "Longreach Regional"               "Mackay Regional"                 
## [53] "Mapoon Aboriginal Shire"          "Maranoa Regional"                
## [55] "Mareeba Shire"                    "McKinlay Shire"                  
## [57] "Moreton Bay Regional"             "Mornington Shire"                
## [59] "Mount Isa City"                   "Murweh Shire"                    
## [61] "Napranum Aboriginal Shire"        "Noosa Shire"                     
## [63] "North Burnett Regional"           "Northern Peninsula Area Regional"
## [65] "Palm Island Aboriginal Shire"     "Paroo Shire"                     
## [67] "Pormpuraaw Aboriginal Shire"      "Quilpie Shire"                   
## [69] "Redland City"                     "Richmond Shire"                  
## [71] "Rockhampton Regional"             "Scenic Rim Regional"             
## [73] "Somerset Regional"                "South Burnett Regional"          
## [75] "Southern Downs Regional"          "Sunshine Coast Regional"         
## [77] "Tablelands Regional"              "Toowoomba Regional"
councils$LGA <- as.factor(councils$LGA)

Let’s subset the sf to just the four regional Local Government Authorities that we are interested in. We’ll do this using squar-bracket indexing, but we could as easily have used subset, filter or some other similar function:

counc4 <- councils[councils$LGA == "Noosa Shire" | councils$LGA == "Sunshine Coast Regional" | councils$LGA == "Fraser Coast Regional" | councils$LGA == "Gympie Regional",]

As with all spatial objects, we need to check the Coordinate Reference System is appropriate

st_crs(counc4)
## Coordinate Reference System:
##   User input: +proj=longlat +datum=WGS84 +no_defs 
##   wkt:
## GEOGCRS["unknown",
##     DATUM["World Geodetic System 1984",
##         ELLIPSOID["WGS 84",6378137,298.257223563,
##             LENGTHUNIT["metre",1]],
##         ID["EPSG",6326]],
##     PRIMEM["Greenwich",0,
##         ANGLEUNIT["degree",0.0174532925199433],
##         ID["EPSG",8901]],
##     CS[ellipsoidal,2],
##         AXIS["longitude",east,
##             ORDER[1],
##             ANGLEUNIT["degree",0.0174532925199433,
##                 ID["EPSG",9122]]],
##         AXIS["latitude",north,
##             ORDER[2],
##             ANGLEUNIT["degree",0.0174532925199433,
##                 ID["EPSG",9122]]]]

This is in longitude-latitude. Let’s reproject to UTM Zone 56S

counc4.UTM <- st_transform(counc4, UTM56S)
counc4.UTM
## Simple feature collection with 4 features and 8 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 375109.2 ymin: 7015250 xmax: 536365.9 ymax: 7268579
## Projected CRS: WGS 84 / UTM zone 56S
##          ADMINTYPEN              ADMINAREAN LGA_CODE     ABBREV_NAM
## 37 LOCAL GOVERNMENT   FRASER COAST REGIONAL     3220   FRASER COAST
## 41 LOCAL GOVERNMENT         GYMPIE REGIONAL     3620         GYMPIE
## 62 LOCAL GOVERNMENT             NOOSA SHIRE     5740          NOOSA
## 76 LOCAL GOVERNMENT SUNSHINE COAST REGIONAL     6720 SUNSHINE COAST
##                        LGA CA_AREA_SQ SHAPE_Leng SHAPE_Area
## 37   Fraser Coast Regional  7993.4859   5.500605 0.71778841
## 41         Gympie Regional  6936.0164   6.121007 0.62608890
## 62             Noosa Shire   870.4707   1.682407 0.07869282
## 76 Sunshine Coast Regional  2286.5828   2.753923 0.20737350
##                          geometry
## 37 MULTIPOLYGON (((465469.9 71...
## 41 MULTIPOLYGON (((468187.3 70...
## 62 MULTIPOLYGON (((482757 7068...
## 76 MULTIPOLYGON (((500294.2 70...

Piece of cake!

Now, we’ll plot the boundaries of these four Local Government Authorities on one of the RasterLayers that we made earlier. To do this, we simply use the add = TRUE argument to plot.

plot(tmaxSC.UTM)
plot(st_geometry(counc4.UTM), add = TRUE)

Note that we plot the st_geometry, because the sf object has many different variables, any of which could be used to provide a colour fill for the spatial polygons. Anyway, it seems as if the shapfile lines up perfectly with our reprojected RasterLayer, so all is good in the world.

As with many other objects, we can save this transformed sf object to our working directory as a shapefile, using the st_write function:

st_write(counc4.UTM,"LGA_areas.shp",append=FALSE)
## Deleting layer `LGA_areas' using driver `ESRI Shapefile'
## Writing layer `LGA_areas' to data source `LGA_areas.shp' using driver `ESRI Shapefile'
## Writing 4 features with 8 fields and geometry type Multi Polygon.

Here, we will receive errors if we try to write the shapefile when one with the same name already exists, so watch out for that.

Rasterizing a shapefile

It’s one thing to plot a shapefile, but it would be more useful if we could turn it into a RasterLayer. Let’s try:

LGA.raster <- rasterize(counc4.UTM, tmaxSC.UTM) # Here we turn convert the sf object we have into a raster based on the structure of tmaxSC.UTM
LGA.raster
## class      : RasterLayer 
## dimensions : 372, 377, 140244  (nrow, ncol, ncell)
## resolution : 837, 923  (x, y)
## extent     : 291528, 607077, 7007032, 7350388  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=56 +south +datum=WGS84 +units=m +no_defs 
## source     : memory
## names      : layer 
## values     : 1, 4  (min, max)
## attributes :
##  ID       ADMINTYPEN              ADMINAREAN LGA_CODE     ABBREV_NAM
##   1 LOCAL GOVERNMENT   FRASER COAST REGIONAL     3220   FRASER COAST
##   2 LOCAL GOVERNMENT         GYMPIE REGIONAL     3620         GYMPIE
##   3 LOCAL GOVERNMENT             NOOSA SHIRE     5740          NOOSA
##   4 LOCAL GOVERNMENT SUNSHINE COAST REGIONAL     6720 SUNSHINE COAST
##                      LGA CA_AREA_SQ SHAPE_Leng SHAPE_Area
##    Fraser Coast Regional  7993.4859   5.500605 0.71778841
##          Gympie Regional  6936.0164   6.121007 0.62608890
##              Noosa Shire   870.4707   1.682407 0.07869282
##  Sunshine Coast Regional  2286.5828   2.753923 0.20737350

Note that there are four values in the RasterLayer (1 - 4). These are the factor levels of the four councils we have selected, in the sf object (which also happens to be alphabetical order).

Let’s explore with a plot…

plot(LGA.raster, col = rainbow(4))
plot(pts.UTM, add = TRUE)

You can see that some of our points fall outside these LGA boundaries. Let’s see what happens when we extract from the raster object

temps$council <- extract(LGA.raster, temps[,3:4])
summary(temps)
##       lon             lat              UTMx             UTMy        
##  Min.   :151.0   Min.   :-26.97   Min.   :302702   Min.   :7016610  
##  1st Qu.:151.5   1st Qu.:-26.40   1st Qu.:347416   1st Qu.:7079165  
##  Median :151.8   Median :-25.85   Median :377402   Median :7140406  
##  Mean   :151.9   Mean   :-25.62   Mean   :385361   Mean   :7166331  
##  3rd Qu.:152.1   3rd Qu.:-24.76   3rd Qu.:412147   3rd Qu.:7260814  
##  Max.   :153.0   Max.   :-24.04   Max.   :496272   Max.   :7340466  
##                                                                     
##       tmax            tmin           council    
##  Min.   :21.23   Min.   : 8.393   Min.   :1.00  
##  1st Qu.:23.10   1st Qu.:10.082   1st Qu.:1.25  
##  Median :23.88   Median :11.837   Median :2.00  
##  Mean   :23.83   Mean   :11.607   Mean   :2.10  
##  3rd Qu.:24.63   3rd Qu.:12.770   3rd Qu.:2.00  
##  Max.   :25.24   Max.   :14.977   Max.   :4.00  
##                                   NA's   :40

You’ll notice there are 40 NAs in the summary. We started with 50 points, so we know that only 10 of these fall inside one of our four councils.

Let’s look at the points that are not NAs, using the is.na(), function preceded by a !:

temps[!is.na(temps$council),]
##         lon       lat     UTMx    UTMy     tmax     tmin council
## 1  152.8292 -25.43750 482822.8 7186596 24.50460 14.69953       1
## 7  152.7292 -26.75417 473071.0 7040764 23.21331 10.88569       4
## 9  152.6792 -25.37083 467722.5 7193951 24.50000 14.19884       1
## 13 152.1292 -26.14583 412954.9 7107874 24.58616 10.52429       2
## 16 152.6458 -25.31250 464351.9 7200402 24.49802 14.40198       1
## 18 152.9208 -25.94583 492073.6 7130312 23.77705 13.37640       2
## 19 152.6875 -26.25417 468793.2 7096131 24.97118 11.77118       2
## 24 152.2458 -25.95417 424494.1 7129174 22.73327 10.53941       2
## 29 152.9208 -26.08750 492083.1 7114624 23.09407 12.01048       2
## 42 152.9625 -26.78750 496272.5 7037100 23.39578 12.59488       4

We have numeric identifiers for the four council areas. You’ll need to be sure of the LGAs these numeric identifiers apply to. A shortcut to do this is to look at the plot we made earlier. Alternatively, we can figure out what the names are and add them. We can access the names of the levels of the councils as follows:

cnames <- levels(droplevels(counc4$LGA))
cnames
## [1] "Fraser Coast Regional"   "Gympie Regional"        
## [3] "Noosa Shire"             "Sunshine Coast Regional"

NOTE: droplevels removes unused levels. If you fail to do this, you get ALL levels in the original dataset.

With this information, we can create a new variable in temps that provides these names.:

temps$c_Name <- cnames[temps$council]
temps[!is.na(temps$council),]
##         lon       lat     UTMx    UTMy     tmax     tmin council
## 1  152.8292 -25.43750 482822.8 7186596 24.50460 14.69953       1
## 7  152.7292 -26.75417 473071.0 7040764 23.21331 10.88569       4
## 9  152.6792 -25.37083 467722.5 7193951 24.50000 14.19884       1
## 13 152.1292 -26.14583 412954.9 7107874 24.58616 10.52429       2
## 16 152.6458 -25.31250 464351.9 7200402 24.49802 14.40198       1
## 18 152.9208 -25.94583 492073.6 7130312 23.77705 13.37640       2
## 19 152.6875 -26.25417 468793.2 7096131 24.97118 11.77118       2
## 24 152.2458 -25.95417 424494.1 7129174 22.73327 10.53941       2
## 29 152.9208 -26.08750 492083.1 7114624 23.09407 12.01048       2
## 42 152.9625 -26.78750 496272.5 7037100 23.39578 12.59488       4
##                     c_Name
## 1    Fraser Coast Regional
## 7  Sunshine Coast Regional
## 9    Fraser Coast Regional
## 13         Gympie Regional
## 16   Fraser Coast Regional
## 18         Gympie Regional
## 19         Gympie Regional
## 24         Gympie Regional
## 29         Gympie Regional
## 42 Sunshine Coast Regional

Cropping a RasterLayer to an sf object

You also might wish to crop your RasterLayer to the extent of an sf object.

Let’s use the Noosa Shire LGA object of our counc4.UTM sf polygon

# Extract the Noosa Shire-only polygon
councN.UTM <- counc4.UTM[counc4.UTM$LGA == "Noosa Shire",]
councN.UTM # inspect the new sf object containing only Noosa Shire 
## Simple feature collection with 1 feature and 8 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 476356.4 ymin: 7068997 xmax: 511963 ymax: 7109130
## Projected CRS: WGS 84 / UTM zone 56S
##          ADMINTYPEN  ADMINAREAN LGA_CODE ABBREV_NAM         LGA CA_AREA_SQ
## 62 LOCAL GOVERNMENT NOOSA SHIRE     5740      NOOSA Noosa Shire   870.4707
##    SHAPE_Leng SHAPE_Area                       geometry
## 62   1.682407 0.07869282 MULTIPOLYGON (((482757 7068...
# mask and crop the raster
tmax.Noosa <- tmaxSC.UTM %>% 
  crop(councN.UTM)

plot(tmax.Noosa) # Plots the new raster cropped to the same extent as the Noosa Shire Polygon
plot(councN.UTM[1], add = TRUE,colour=NA) # Plot the Noosa Shire polygon on top of the raster

Now, you can see that the RasterLayer data is cropped to the extent of the sf object. This is really helpful when we’re working on projects that require a known spatial footprint. Note, however, that both objects in the call to crop must have the same projection, or things might not turn out the way you expect.

Masking a RasterLayer to an sf object

Alternatively, you might wish to mask an area of a RasterLayer using an sf object

tmax.NoNoosa <- tmaxSC.UTM %>% 
  mask(councN.UTM,inverse=T)
plot(tmax.NoNoosa) 

We can see that Noosa Shire LGA has now disappeared from our RasterLayer. If we were to extract values from the RasterLayer that fell within Noosa Shire, they would now be NA. Of course we could mask everthing but the councils in a fairly similar way:

tmax.NoosaMask <- tmaxSC.UTM %>% 
  mask(councN.UTM) %>%
  crop(councN.UTM)
plot(tmax.NoosaMask) 

Conclusion

With these basics in hand, you should now be able to:

1 - Read in raster-type data 2 - Stack, crop, reproject, resample and mask raster-type data 3 - Read in spatial points using sf 4 - Extract data from a single RasterLayer or a RasterStack, and store the output in a data.frame 5 - Read a shapefile into R using sf and use the resulting object to maipulate raster-type data 6 - Produce pretty maps using tmap

These are the skills that will underpin your initial data analysis for Task 3.

Remember that both the raster and sf packages have excellent vignettes get them from CRAN, which you could use if you need to check functions, or to explore the functionality of the packages further.