Task 1 for Summer 2024 OpenTopography Internship

Task Description

From Jira – The dataset, Pellston, MI: Effects of Forest Canopy Structural Change on Carbon Uptake has DTM that are identical to the DSM data. In addition, the VRT is built based on the hillshade data, and not the DSM elevation grid.

After some inspection, it’s possible that the DSM on OpenTopography is actually a DTM. This might mean that the ground points are mis-classified in the metadata, or that incorrect methods were used for generating both/either DTM and DSM.

The task for me is to quality check the raster data from OpenTopography, generate a separate DTM and DSM from bulk-downloaded point cloud data here in R, and compare them to each other using differencing and visualization. THEN, I need to create a Cloud-Optimized GeoTIFF (COG) for both the DTM and DSM, make sure that values “make sense” and send it off to Matt.

View OpenTopo rasters

First, I’m going to download, read, and plot OpenTopography rasters to see what we have going on there.

I downloaded the OpenTopo bare earth (be) and highest hit (hh) bulk download raster data, then converted the ArcGrid format to .tif in QGIS by accessing Tools > search “convert” > GDAL > Raster conversion > Translate (convert format)

Both rasters were saved, and will be read in and visualized below!

ot_dsm = terra::rast(MI09_Hardiman_HH)
ot_dtm = terra::rast(MI09_Hardiman_BE)

OpenTopo DSM:

OpenTopo DTM:

Hmm… both ugly and the same! That could be partially due to the visualization methods I’ve selected for today. I’ll go ahead and generate the new DTM and DSM using the bulk-downloaded point clouds from OpenTopography.

Generating new rasters

In this section I am going to create a DSM which is missing from OpenTopography, but also a DTM in case they want to just update the current option.

Steps:

  • load packages
  • set directory
  • create catalog of lidar tiles
  • optimize processing chunks
  • ground classification
  • generate DTM using ground points at 1m resolution
  • generate DSM using first-return points
  • visualize

Load packages

We are using lidR to process data and terra to save the DTM

library(lidR)
library(terra)
library(gridExtra)

Set directory

setwd(directory)

Create catalog

The catalog is used by lidR to compile the (many) lidar files/tiles usually used in a job. When the raster is generated from the point cloud, the tiles are automatically merged and saved.

Filtering noise

lidR has a few methods for classifying and removing noise depending on the algorithm that you select. I haven’t tried it before, but what a great time to do so!
One option is filter_poi() which you can specify the class to include after reading in the point cloud (or catalog of clouds). lidR suggests adding the filter = argument within the readLAS() function because it’s more memory efficient. In the case of a catalog, we have to separate the argument using opt_filter() for efficiency. Also, the lidR package uses the same library as LAStools for reading and writing LAS and LAZ files, so this may look familiar to my reviewers.

#create and read lidar catalog 
ctg = readLAScatalog("MI09_Hardiman/")
opt_filter(ctg) = "-drop_class 7"
#check data for completeness and validity
las_check(ctg)
## 
##  Checking headers consistency
##   - Checking file version consistency... ✓
##   - Checking scale consistency... ✓
##   - Checking offset consistency... ✓
##   - Checking point type consistency... ✓
##   - Checking VLR consistency... ✓
##   - Checking CRS consistency... ✓
##  Checking the headers
##   - Checking scale factor validity... ✓
##   - Checking Point Data Format ID validity... ✓
##  Checking preprocessing already done 
##   - Checking negative outliers... ✓
##   - Checking normalization... no
##  Checking the geometry
##   - Checking overlapping tiles... ✓
##   - Checking point indexation... no
#plot catalog
plot(ctg, mapview=T, map.type="Esri.WorldImagery")

Ground classification

Ground classified points will be used for the DTM. First, we should specify to write to disk, then classify ground points.
HOWEVER – after reading the information about the dataset, ground points are already classified, and the rasterize_terrain() function automatically points to class 2 (ground) points.
SO: I’m including the code for classifying ground points for reference, but we will not be evaluating this code chunk due to the ground points already being classified.

#opt_output_files(ctg) = paste0(tempdir(), "{*}_classified")
#classified_ctg = classify_ground(ctg, csf())

Optimizing processing

Changing the buffer size and chunk size can speed up the processing time. Increasing buffer helps spatial consistency when the point cloud is less dense, but makes processing slower. When chunks are too large to be loaded in memory, chunk size can be reduced.

#opt_chunk_size(ctg) = 500
#opt_chunk_buffer(ctg) = 10

Generate DTM

Input set as lidar catalog, resolution = 1, tin algorithm, and function automatically uses ground class (2)

dtm = lidR::rasterize_terrain(ctg, res = 1, algorithm = tin())

## [1] "DTM successfully generated in: 15.380411863327 minutes."

Generate DSM

dsm = lidR::rasterize_canopy(ctg, res = 1, algorithm = dsmtin())

## [1] "DSM successfully generated in: 16.418203830719 minutes."

Quality check

My quality check methods include using a third-party DEM and checking elevation values around the newly generated rasters, generating canopy height models (CHM) and checking values of vegetation and objects on the ground for height verification, and looking at the overall distribution of values in each raster through the use of histograms.

Unfortunately, due to non-matching extents and incongruous data types, as well as my own weaning patience with this task in R, I’m going to move over to QGIS to “get on with it”. I’ll still include the code below, and hopefully will circle back later.

External data

Download an external dataset in order to check the OpenTopo data & my DTM and DSM.

#read in USGS DEM as a raster
usgs_dem = terra::rast("verify_data/USGS_13_N46W085_20240205.tif")
#clip extent to match other layers
terra::ext(usgs_dem) = c(672000, 682000, 5045000, 5052000)
#plot for visualization
usgs_dem_slope = terrain(usgs_dem, "slope", unit="radians")
usgs_dem_aspect = terrain(usgs_dem, "aspect", unit="radians")
usgs_dem_hill = shade(usgs_dem_slope, usgs_dem_aspect, 45, 270)
plot(usgs_dem_hill, col = grey(0:100/100), legend=F,mar=c(2,2,1,4))
## Warning: [is.lonlat] coordinates are out of range for lon/lat
plot(usgs_dem,col=terrain.colors(25,alpha=0.35),add=T,main="USGS DEM with Hillshade")
## Warning: [is.lonlat] coordinates are out of range for lon/lat

Histograms

ot_dsm_hist = ggplot() + geom_histogram(data = ot_dsm, aes(x=MI09_Hardiman_hh), ggtitle("OT DSM Elevation Vals"))
ot_dtm_hist = ggplot() + geom_histogram(data = ot_dtm, aes(x=MI09_Hardiman_be), ggtitle("OT DTM Elevation Vals"))
dsm_hist = ggplot() + geom_histogram(data = dsm, aes(x=Z), ggtitle("DSM Elevation Vals"))
dtm_hist = ggplot() + geom_histogram(data = dtm, aes(x=Z), ggtitle("DTM Elevation Vals"))
dem_hist = ggplot() + geom_histogram(data = usgs_dem, aes(x=Layer_1), ggtitle("USGS DEM Elevation Vals"))
grid.arrange(ot_dsm_hist, ot_dtm_hist, dsm_hist, dtm_hist, dem_hist, ncol=2)

CHMs

First test the original OpenTopo data: OT_DSM - OT_DTM

ot_chm = ot_dsm - ot_dtm
ot_chm_slope = terrain(ot_chm, "slope", unit="radians")
ot_chm_aspect = terrain(ot_chm, "aspect", unit="radians")
ot_chm_hill = shade(ot_chm_slope, ot_chm_aspect, 45, 270)
plot(ot_chm_hill, col = grey(0:100/100), legend=F,mar=c(2,2,1,4))
plot(ot_chm,col=terrain.colors(25,alpha=0.4),add=T,main="OpenTopo CHM with Hillshade")

Next test my data: DSM - DTM

chm = dsm - dtm
chm_slope = terrain(chm, "slope", unit="radians")
chm_aspect = terrain(chm, "aspect", unit="radians")
chm_hill = shade(chm_slope, chm_aspect, 45, 270)
plot(chm_hill, col = grey(0:100/100), legend=F,mar=c(2,2,1,4))
plot(chm,col=terrain.colors(25,alpha=0.4),add=T,main="CHM with Hillshade")

Now my DSM vs the OT DTM which may continue to be used

chm_mix = dsm - ot_dtm 
chm_mix_slope = terrain(chm_mix, "slope", unit="radians")
chm_mix_aspect = terrain(chm_mix, "aspect", unit="radians")
chm_mix_hill = shade(chm_mix_slope, chm_mix_aspect, 45, 270)
plot(chm_mix_hill, col = grey(0:100/100), legend=F,mar=c(2,2,1,4))
plot(chm_mix, col=terrain.colors(25,alpha=0.4),add=T,main="CHM (OT + Allison) with Hillshade")

Generate COGs

Finally, the last step! Now that we have a new DTM & DSM that should be accurate, we need to convert them into Cloud-Optimized GeoTIFFs (COGs)

To do this, we use the sf package in R which provides an interface with GDAL utilities. Provided the typical parameters for generating COGs and slightly adjusting to fit R and updated versions of GDAL, here’s what we have:

DTM:

sf::gdal_utils(
  util = "translate",
  source = "dtm.tif",
  destination = "dtm_cog.tif",
  options = c(
    "-of", "GTIFF",
    "-f", "COG",
    "-co", "COMPRESS=DEFLATE",
    "-co", "BLOCKSIZE=512"
  )
)
dtm_cog = terra::rast("dtm_cog.tif")

DSM:

sf::gdal_utils(
  util = "translate",
  source = "dsm.tif",
  destination = "dsm_cog.tif",
  options = c(
    "-of", "GTIFF",
    "-f", "COG",
    "-co", "COMPRESS=DEFLATE",
    "-co", "BLOCKSIZE=512"
  )
)
dsm_cog = terra::rast("dsm_cog.tif")

DTM from OT:

sf::gdal_utils(
  util = "translate",
  source = "MI09_Hardiman_be/MI09_Hardiman_be.tif",
  destination = "ot_dtm_cog.tif",
  options = c(
    "-of", "GTIFF",
    "-f", "COG",
    "-co", "COMPRESS=DEFLATE",
    "-co", "BLOCKSIZE=512"
  )
)
ot_dtm_cog = terra::rast("ot_dtm_cog.tif")