Convolutional neural networks (CNNs) can be used to map plant species from high-resolution red-green-blue (RGB) drone imagery. A typical workflow is to acquire the imagery, create RGB orthomosaics from it with specialized photogrammetry software, and then use these orthomosaics as a basemap on which annotations are drawn in vector format (e.g. polygons outlining individual plants, identified to species). These annotations may be restricted to specific areas within the imagery (e.g. plots).
This tutorial assumes that the CNN will be used for semantic segmentation. It could, however, be adapted for instance segmentation, since in this dataset individual trees were delineated as individual polygons and identified to species from the imagery.
To train such a CNN, one would usually start with:
- a high-resolution RGB orthomosaic
- vector annotations (polygons outlining individual plants, identified to species)
The goal is to create:
- RGB image tiles of a fixed size (e.g. 512 x 512 pixels)
- corresponding single-band raster masks, in which each pixel value encodes a class
These tiles can be restricted to areas of interest in which annotations were made, as shown in this tutorial.
For this example we will use a high-resolution orthomosaic from Station de biologie des Laurentides (SBL). The imagery was acquired on September 2, 2021.
The data for this tutorial can be downloaded here.
Note: the GeoTiff has four bands (band 4 is transparency, bands 1:3 are RGB). We only want the RGB bands.
library(stars) # library to deal with rasters
library(tidyverse) # library with useful data wrangling functions
bands_sel <- 1:3 # bands to select (RGB only)
img <- read_stars("2021-09-02-sbl-cloutier-z3-P4RTK-Terra_UTM.tif", # read the GeoTiff
proxy = TRUE) %>% # use a proxy so the imagery is not loaded into memory
slice(band, bands_sel)
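We can print the proxy object and its dimensions to confirm that only the three RGB bands were kept; since this is a proxy, only metadata is displayed and nothing is loaded into memory:
img # print the stars_proxy metadata
st_dimensions(img) # extent, resolution and CRS of the imagery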
plot(img, rgb = 1:3, # plot RGB orthomosaic (low res version)
downsample = 8) # reduce resolution by a factor of 8
The areas of interest are 30 m x 30 m “virtual plots”, stored in a GeoJSON file. We make sure they are in the same CRS as the orthomosaic.
Note: the GeoJSON layer is in WGS84 (latitude / longitude); we project it to UTM.
aoi <- read_sf("GrilleZone3_Intersect.geojson") %>%
st_transform(st_crs(img)) # same UTM projection as orthomosaic
plot(st_geometry(aoi))
The tree annotations are also stored in a GeoJSON file. We import them, again making sure they are in the same CRS as the orthomosaic.
tree_annot <- read_sf("TemplatePolygone.geojson") %>%
st_transform(st_crs(img)) # reproject to UTM
We then add a numeric ID for each species / class, in alphabetical order.
tree_annot <- tree_annot %>%
mutate(sp_id = as.numeric(as.factor(Label))) # numeric species ID, in alphabetical order of Label
tree_annot[, c("Label", "sp_id")] # we inspect it
We also need to store a table listing all classes (mostly species) together with their numeric IDs. This table is our legend: it will allow us to translate numbers back to species later on. We save it as a .csv file for later use.
# Get species names
sp_names <- tree_annot %>%
st_drop_geometry() %>% # drop geometry, keep the attribute table
select(Label, sp_id) %>% # keep only the Label and sp_id columns
distinct() %>% # find the unique values
arrange(Label) %>%
drop_na()
sp_names # show the result
write_csv(sp_names, "sp_names.csv")
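Later on, for example after prediction, this legend can be read back to translate numeric class IDs into species labels. A minimal sketch, where predicted_ids stands for a hypothetical vector of predicted class numbers:
# Translate numeric IDs back to species labels (sketch)
legend <- read_csv("sp_names.csv")
predicted_ids <- c(1, 3, 2) # hypothetical CNN output
legend$Label[match(predicted_ids, legend$sp_id)]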
For simplicity, we crop the tree annotation layer to the extent of the areas of interest. We then plot it.
species <- tree_annot %>%
st_crop(st_bbox(aoi)) # crop to extent of areas of interest layer
plot(species["Label"], main = "Classes") # plot the tree polygons
We can see that not all the area was annotated. This is typical of real field data. It takes a lot of time to draw these polygons and validate the species identity of every tree!
We now need to create a grid defining the tiles that will be used to extract the RGB data and masks.
The first step is to determine the spatial resolution of the imagery, because we need the ground distance corresponding to the number of pixels we want to extract. We do this for both spatial dimensions (x and y), because resolutions are not always exactly the same along both.
img_xres <- abs(st_dimensions(img)[[1]]$delta) # x resolution
img_yres <- abs(st_dimensions(img)[[2]]$delta) # y resolution
This is an important parameter: if the CNN we will be using expects tiles of 512 x 512 pixels, then it should be set to 512 to ensure that tiles and masks have the right dimensions.
n_pixels <- 512 # tile size (width/height)
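Multiplying the tile size in pixels by the pixel resolution gives the ground footprint of each tile (in metres here, since the orthomosaic is in UTM):
n_pixels * c(img_xres, img_yres) # ground size of one tile along x and y, in metres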
We then create an empty raster that has the same extent as the orthomosaic, except that its spatial resolution (along the x and y dimensions) is coarser by a factor of n_pixels. This will serve as our grid for the tiles.
grid_stars <- st_as_stars(st_bbox(img), # empty raster, extent of orthomosaic
dx = img_xres * n_pixels, # x resolution of tiles
dy = img_yres * n_pixels) # y resolution of tiles
To extract the tiles, we need to convert the grid to a polygon layer.
grid_sf <- st_as_sfc(grid_stars, as_points = FALSE)
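We can check how many candidate tiles the grid contains:
length(grid_sf) # number of candidate tiles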
We can plot the orthomosaic and the grid to see the result.
plot(img, rgb = 1:3, reset = FALSE)
## downsample set to c(84)
plot(grid_sf, add = TRUE)
The areas of interest, cells, or “virtual plots” are used to select the regions from which tiles and masks will be extracted. Here, we use the status variable to guide this selection, keeping only cells with no value (i.e. excluding those flagged À valider (“to validate”) or Non utilisé (“not used”)). We plot the results.
aoi_sel <- aoi %>%
filter(is.na(status)) # keep only the cells with no special notes
plot(st_geometry(aoi_sel), reset = FALSE)
plot(species["Label"], add = TRUE)
We keep only the tiles and masks that intersect the selected cells. This assumes, of course, that we carefully chose which cells to use. This can be done with the following code.
Note: a more conservative selection could be made by keeping only tiles that fall entirely within the selected cells, using st_within() instead of st_intersects(). This would retain fewer tiles and masks (see the sketch after the plot below).
tiles_intersect <- st_intersects(grid_sf, aoi_sel)
sel_logical <- lengths(tiles_intersect) > 0 # logical vector indicating which cells to keep
grid_sf_sel <- grid_sf[sel_logical] # we filter only those cells
We can then view the result. The tiles that will be extracted are shown in red.
plot(st_geometry(aoi_sel), reset = FALSE)
plot(grid_sf_sel, border = "red", add = TRUE)
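As mentioned in the note above, a stricter selection would keep only the tiles that fall entirely within the selected cells. A minimal sketch of this alternative (not run here; grid_sf_sel_strict is a hypothetical object name):
# Stricter alternative: keep only tiles entirely within a selected cell
tiles_within <- st_within(grid_sf, aoi_sel)
grid_sf_sel_strict <- grid_sf[lengths(tiles_within) > 0]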
We are now ready to extract the tiles and masks. To speed things up, we will run this in parallel on multiple cores.
We first need to set everything up for parallel computing. This means telling R how many cores to use. We use all but one, so that one core remains free for other tasks if needed; you could, however, use all available cores to speed things up further.
library(foreach)
library(doParallel)
n_cores <- detectCores() - 1 # use all cores but one
cl <- makeCluster(n_cores) # we start the parallel cluster
registerDoParallel(cl) # we register it
clusterExport(cl, "bands_sel") # we need to export that object to all cores
Then we run the first parallel loop. For every tile (or cell) in our grid grid_sf_sel, this loop generates a stars_proxy object pointing to the corresponding 512 x 512 pixel RGB tile of the imagery. The resulting list is as long as the number of cells in the grid. To be clear, this first step only creates the metadata and stores it as a list of stars_proxy objects; no pixel values are read yet.
# Loop to extract tile locations (as stars_proxy objects)
tiles_stars_proxy <- foreach(i = 1:length(grid_sf_sel),
.packages = c('tidyverse', 'stars')) %dopar% {
# Extract tile i from img
tile_i <- img %>%
st_crop(grid_sf_sel[i])
} # end of foreach loop for tile locations
Then, using this list, the second loop actually reads the pixel values and creates the RGB tiles extracted from the orthomosaic. They are stored as GeoTiff files.
Make sure that the folders tiles and masks exist in your working directory before running the next loops. These are used to store the outputs.
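If they do not exist yet, both folders can be created from R:
dir.create("tiles", showWarnings = FALSE) # folder for the RGB tiles
dir.create("masks", showWarnings = FALSE) # folder for the masks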
tiles <- foreach(i = 1:length(tiles_stars_proxy),
.packages = c('tidyverse', 'stars')) %dopar% {
# Extract tile i
tile_i <- tiles_stars_proxy[[i]] %>%
# write GeoTiff
write_stars(paste0("tiles/tile_", str_pad(i, 6, pad = "0"), ".tif"),
type = "Byte",
options = c("COMPRESS=LZW"))
}
Next, we create the masks. We first need to define a background class (corresponding to pixels without annotations). We give it a value of 0, since our species (and other class) IDs start at 1.
bckgnd_class <- 0
This loop crops the species layer into small tiles and rasterizes the classes into single-band rasters. All of these masks are then stored in a folder.
masks <- foreach(i = 1:length(grid_sf_sel),
.packages = c('tidyverse', 'stars')) %dopar% {
# Extract species cell i from polygon layer
suppressWarnings(
species_i <- species %>%
st_intersection(grid_sf_sel[i]) )
# Create one-band template for rasterization
template <- tiles_stars_proxy[[i]] %>%
st_as_stars() %>%
slice(band, 1)
# Set values to background class
template[[1]][] <- bckgnd_class
# Rasterize the species vector tile
mask_i <- st_rasterize(species_i[, 'sp_id'],
template) %>%
# Export as GeoTiff
write_stars(paste0("masks/mask_", str_pad(i, 6, pad = "0"), ".tif"),
type = "Byte",
options = c("COMPRESS=LZW"))
} # end foreach loop masks
Finally, we stop the cluster.
stopCluster(cl)
We now have many tiles and masks ready to be used to train a semantic segmentation CNN!
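As a quick sanity check, we can confirm that the number of tiles written to disk matches the number of masks:
length(list.files("tiles", pattern = "\\.tif$")) # number of tiles written
length(list.files("masks", pattern = "\\.tif$")) # number of masks written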
The individual tiles and masks can be opened in QGIS or other GIS software to check that everything worked well (it did!). Here, we simply inspect a single tile and its mask (the 8th one).
num <- 8
tile_num <- read_stars(paste0("tiles/tile_", str_pad(num, 6, pad = "0"), ".tif"))
plot(tile_num, rgb = 1:3,
downsample = 0)
We can also overlay the tree polygons as red borders.
trees <- species %>%
st_crop(st_bbox(tile_num))
plot(tile_num, rgb = 1:3,
downsample = 0, reset = FALSE)
plot(st_geometry(trees), add = TRUE, border = "red", col = NA)
Finally we can look at the associated mask for that tile.
mask_num <- read_stars(paste0("masks/mask_", str_pad(num, 6, pad = "0"), ".tif"))
plot(mask_num, useRaster = FALSE)
## downsample set to c(1,1)