Convolutional neural networks (CNNs) can be used to map plant species from high-resolution red-green-blue (RGB) drone imagery. A typical workflow is to acquire the imagery, create RGB orthomosaics from it with specialized photogrammetry software, and then use these orthomosaics as a basemap on which annotations are drawn in vector format (e.g. polygons outlining individual plants, identified to species). These annotations may be restricted to specific areas within the imagery (e.g. plots).
This tutorial assumes that the CNN will be used for semantic segmentation. It could, however, be adapted for instance segmentation, since in this dataset individual trees were delineated as individual polygons and identified to species from the imagery.
To train such a CNN, one would usually start with:
- a high-resolution RGB orthomosaic
- vector annotations (polygons outlining individual plants, identified to species)
The goal is to create:
- RGB image tiles of a fixed size (e.g. 512 x 512 pixels)
- corresponding single-band raster masks, in which each pixel value encodes a class
These tiles can be restricted to areas of interest in which annotations were made, as shown in this tutorial.
For this example we will use a high-resolution orthomosaic from Station de biologie des Laurentides (SBL). The imagery was acquired on September 2, 2021.
The data for this tutorial can be downloaded here.
Note: the GeoTiff has four bands (band 4 is transparency, bands 1:3 are RGB). We only want the RGB bands.
library(stars) # library to deal with rasters
library(tidyverse) # library with useful data wrangling functions
bands_sel <- 1:3 # bands to select (RGB only)
img <- read_stars("2021-09-02-sbl-cloutier-z3-P4RTK-Terra_UTM.tif", # read the GeoTiff
proxy = TRUE) %>% # use a proxy so the imagery is not loaded into memory
slice(band, bands_sel)
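We can print the proxy object and its dimensions to confirm that only the three RGB bands were kept; since this is a proxy, only metadata is displayed and nothing is loaded into memory:
img # print the stars_proxy metadata
st_dimensions(img) # extent, resolution and CRS of the imagery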
plot(img, rgb = 1:3, # plot RGB orthomosaic (low res version)
downsample = 8) # reduce resolution by a factor of 8
The areas of interest are 30 m x 30 m “virtual plots”, stored in a GeoJSON file. We make sure they are in the same CRS as the orthomosaic.
Note: the GeoJSON layer is in WGS84 (latitude / longitude); we project it to UTM.
aoi <- read_sf("GrilleZone3_Intersect.geojson") %>%
st_transform(st_crs(img)) # same UTM projection as orthomosaic
plot(st_geometry(aoi))
The tree annotations are also stored in a GeoJSON file. We import them, again making sure they are in the same CRS as the orthomosaic.
tree_annot <- read_sf("TemplatePolygone.geojson") %>%
st_transform(st_crs(img)) # reproject to UTM
We then add a numeric ID for each species / class, in alphabetical order.
tree_annot <- tree_annot %>%
mutate(sp_id = as.numeric(as.factor(Label))) # numeric species ID, in alphabetical order of Label
tree_annot[, c("Label", "sp_id")] # we inspect it
We also need to store a table listing all classes (mostly species) together with their numeric IDs. This table is our legend: it will allow us to translate numbers back to species later on. We save it as a .csv file for later use.
# Get species names
sp_names <- tree_annot %>%
st_drop_geometry() %>% # drop geometry, keep the attribute table
select(Label, sp_id) %>% # keep only the Label and sp_id columns
distinct() %>% # find the unique values
arrange(Label) %>%
drop_na()
sp_names # show the result
write_csv(sp_names, "sp_names.csv")
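Later on, for example after prediction, this legend can be read back to translate numeric class IDs into species labels. A minimal sketch, where predicted_ids stands for a hypothetical vector of predicted class numbers:
# Translate numeric IDs back to species labels (sketch)
legend <- read_csv("sp_names.csv")
predicted_ids <- c(1, 3, 2) # hypothetical CNN output
legend$Label[match(predicted_ids, legend$sp_id)]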
For simplicity, we crop the tree annotation layer to the extent of the areas of interest. We then plot it.
species <- tree_annot %>%
st_crop(st_bbox(aoi)) # crop to extent of areas of interest layer
plot(species["Label"], main = "Classes") # plot the tree polygons
We can see that not all the area was annotated. This is typical of real field data. It takes a lot of time to draw these polygons and validate the species identity of every tree!
We now need to create a grid defining the tiles that will be used to extract the RGB data and masks.
The first step is to determine the spatial resolution of the imagery, because we need the ground distance corresponding to the number of pixels we want to extract. We do this for both spatial dimensions (x and y), because resolutions are not always exactly the same along both.
img_xres <- abs(st_dimensions(img)[[1]]$delta) # x resolution
img_yres <- abs(st_dimensions(img)[[2]]$delta) # y resolution
This is an important parameter: if the CNN we will be using expects tiles of 512 x 512 pixels, then it should be set to 512 to ensure that tiles and masks have the right dimensions.
n_pixels <- 512 # tile size (width/height)
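Multiplying the tile size in pixels by the pixel resolution gives the ground footprint of each tile (in metres here, since the orthomosaic is in UTM):
n_pixels * c(img_xres, img_yres) # ground size of one tile along x and y, in metres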
We then create an empty raster that has the same extent as the orthomosaic, except that its spatial resolution (along the x and y dimensions) is coarser by a factor of n_pixels. This will serve as our grid for the tiles.
grid_stars <- st_as_stars(st_bbox(img), # empty raster, extent of orthomosaic
dx = img_xres * n_pixels, # x resolution of tiles
dy = img_yres * n_pixels) # y resolution of tiles
To extract the tiles, we need to convert the grid to a polygon layer.
grid_sf <- st_as_sfc(grid_stars, as_points = FALSE)
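We can check how many candidate tiles the grid contains:
length(grid_sf) # number of candidate tiles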
We can plot the orthomosaic and the grid to see the result.
plot(img, rgb = 1:3, reset = FALSE)
## downsample set to c(84)
plot(grid_sf, add = TRUE)
The areas of interest, cells, or “virtual plots” are used to select the regions from which tiles and masks will be extracted. Here, we use the status variable to guide this selection, keeping only cells with no value (i.e. excluding those flagged À valider (“to validate”) or Non utilisé (“not used”)). We plot the results.
aoi_sel <- aoi %>%
filter(is.na(status)) # keep only the cells with no special notes
plot(st_geometry(aoi_sel), reset = FALSE)
plot(species["Label"], add = TRUE)
We keep only the tiles and masks that intersect the selected cells. This assumes, of course, that we carefully chose which cells to use. This can be done with the following code.
Note: a more conservative selection could be made by keeping only tiles that fall entirely within the selected cells, using st_within() instead of st_intersects(). This would retain fewer tiles and masks (see the sketch after the plot below).
tiles_intersect <- st_intersects(grid_sf, aoi_sel)
sel_logical <- lengths(tiles_intersect) > 0 # logical vector indicating which cells to keep
grid_sf_sel <- grid_sf[sel_logical] # we filter only those cells
We can then view the result. The tiles that will be extracted are shown in red.
plot(st_geometry(aoi_sel), reset = FALSE)
plot(grid_sf_sel, border = "red", add = TRUE)
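As mentioned in the note above, a stricter selection would keep only the tiles that fall entirely within the selected cells. A minimal sketch of this alternative (not run here; grid_sf_sel_strict is a hypothetical object name):
# Stricter alternative: keep only tiles entirely within a selected cell
tiles_within <- st_within(grid_sf, aoi_sel)
grid_sf_sel_strict <- grid_sf[lengths(tiles_within) > 0]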
We are now ready to extract the tiles and masks. To speed things up, we will run this in parallel on multiple cores.
We first need to set everything up for parallel computing. This means telling R how many cores to use. We use all but one, so that one core remains free for other tasks if needed; you could, however, use all available cores to speed things up further.
library(foreach)
library(doParallel)
n_cores <- detectCores() - 1 # use all cores but one
cl <- makeCluster(n_cores) # we start the parallel cluster
registerDoParallel(cl) # we register it
clusterExport(cl, "bands_sel") # we need to export that object to all cores
Then we run the first parallel loop. For every tile (or cell) in our grid grid_sf_sel, this loop generates a stars_proxy object pointing to the corresponding 512 x 512 pixel RGB tile of the imagery. The resulting list is as long as the number of cells in the grid. To be clear, this first step only creates the metadata and stores it as a list of stars_proxy objects; no pixel values are read yet.
# Loop to extract tile locations (as stars_proxy objects)
tiles_stars_proxy <- foreach(i = 1:length(grid_sf_sel),
.packages = c('tidyverse', 'stars')) %dopar% {
# Extract tile i from img
tile_i <- img %>%
st_crop(grid_sf_sel[i])
} # end of foreach loop for tile locations
Then, using this list, the second loop actually reads the pixel values and creates the RGB tiles extracted from the orthomosaic. They are stored as GeoTiff files.
Make sure that the folders tiles and masks exist in your working directory before running the next loops. These are used to store the outputs.
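If they do not exist yet, both folders can be created from R:
dir.create("tiles", showWarnings = FALSE) # folder for the RGB tiles
dir.create("masks", showWarnings = FALSE) # folder for the masks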
tiles <- foreach(i = 1:length(tiles_stars_proxy),
.packages = c('tidyverse', 'stars')) %dopar% {
# Extract tile i
tile_i <- tiles_stars_proxy[[i]] %>%
# write GeoTiff
write_stars(paste0("tiles/tile_", str_pad(i, 6, pad = "0"), ".tif"),
type = "Byte",
options = c("COMPRESS=LZW"))
}
Next, we create the masks. We first need to define a background class (corresponding to pixels without annotations). We give it a value of 0, since our species (and other class) IDs start at 1.
bckgnd_class <- 0
This loop crops the species layer into small tiles and rasterizes the classes into single-band rasters. All of these masks are then stored in a folder.
masks <- foreach(i = 1:length(grid_sf_sel),
.packages = c('tidyverse', 'stars')) %dopar% {
# Extract species cell i from polygon layer
suppressWarnings(
species_i <- species %>%
st_intersection(grid_sf_sel[i]) )
# Create one-band template for rasterization
template <- tiles_stars_proxy[[i]] %>%
st_as_stars() %>%
slice(band, 1)
# Set values to background class
template[[1]][] <- bckgnd_class
# Rasterize the species vector tile
mask_i <- st_rasterize(species_i[, 'sp_id'],
template) %>%
# Export as GeoTiff
write_stars(paste0("masks/mask_", str_pad(i, 6, pad = "0"), ".tif"),
type = "Byte",
options = c("COMPRESS=LZW"))
} # end foreach loop masks
Finally, we stop the cluster.
stopCluster(cl)
We now have many tiles and masks ready to be used to train a semantic segmentation CNN!
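As a quick sanity check, we can confirm that the number of tiles written to disk matches the number of masks:
length(list.files("tiles", pattern = "\\.tif$")) # number of tiles written
length(list.files("masks", pattern = "\\.tif$")) # number of masks written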
The individual tiles and masks can be opened in QGIS or other GIS software to check that everything worked well (it did!). Here, we simply inspect a single tile and its mask (the 8th one).
num <- 8
tile_num <- read_stars(paste0("tiles/tile_", str_pad(num, 6, pad = "0"), ".tif"))
plot(tile_num, rgb = 1:3,
downsample = 0)
We can also overlay the tree polygons as red borders.
trees <- species %>%
st_crop(st_bbox(tile_num))
plot(tile_num, rgb = 1:3,
downsample = 0, reset = FALSE)
plot(st_geometry(trees), add = TRUE, border = "red", col = NA)
Finally we can look at the associated mask for that tile.
mask_num <- read_stars(paste0("masks/mask_", str_pad(num, 6, pad = "0"), ".tif"))
plot(mask_num, useRaster = FALSE)
## downsample set to c(1,1)