Task 1 for Summer 2024 OpenTopography Internship
From Jira – The dataset, Pellston, MI: Effects of Forest Canopy Structural Change on Carbon Uptake has DTM that are identical to the DSM data. In addition, the VRT is built based on the hillshade data, and not the DSM elevation grid.
After some inspection, it’s possible that the DSM on OpenTopography is actually a DTM. This might mean that the ground points are mis-classified in the metadata, or that incorrect methods were used for generating both/either DTM and DSM.
The task for me is to quality check the raster data from OpenTopography, generate a separate DTM and DSM from bulk-downloaded point cloud data here in R, and compare them to each other using differencing and visualization. THEN, I need to create a Cloud-Optimized GeoTIFF (COG) for both the DTM and DSM, make sure that values “make sense” and send it off to Matt.
First, I’m going to download, read, and plot OpenTopography rasters to see what we have going on there.
I downloaded the OpenTopo bare earth (be) and highest hit (hh) bulk download raster data, then converted the ArcGrid format to .tif in QGIS by accessing Tools > search “convert” > GDAL > Raster conversion > Translate (convert format)
Both rasters were saved, and will be read in and visualized below!
ot_dsm = terra::rast(MI09_Hardiman_HH)
ot_dtm = terra::rast(MI09_Hardiman_BE)
Hmm… both ugly and the same! That could be partially due to the visualization methods I’ve selected for today. I’ll go ahead and generate the new DTM and DSM using the bulk-downloaded point clouds from OpenTopography.
In this section I am going to create a DSM which is missing from OpenTopography, but also a DTM in case they want to just update the current option.
Steps:
We are using lidR to process data and terra to save the DTM
library(lidR)
library(terra)
library(gridExtra)
setwd(directory)
The catalog is used by lidR to compile the (many) lidar files/tiles usually used in a job. When the raster is generated from the point cloud, the tiles are automatically merged and saved.
lidR has a few methods for classifying and removing noise depending
on the algorithm that you select. I haven’t tried it before, but what a
great time to do so!
One option is filter_poi() which you can specify the class
to include after reading in the point cloud (or catalog of clouds). lidR
suggests adding the filter = argument within the
readLAS() function because it’s more memory efficient. In
the case of a catalog, we have to separate the argument using
opt_filter() for efficiency. Also, the lidR
package uses the same library as LAStools for reading and
writing LAS and LAZ files, so this may look familiar to my
reviewers.
#create and read lidar catalog
ctg = readLAScatalog("MI09_Hardiman/")
opt_filter(ctg) = "-drop_class 7"
#check data for completeness and validity
las_check(ctg)
##
## Checking headers consistency
## - Checking file version consistency...[0;32m ✓[0m
## - Checking scale consistency...[0;32m ✓[0m
## - Checking offset consistency...[0;32m ✓[0m
## - Checking point type consistency...[0;32m ✓[0m
## - Checking VLR consistency...[0;32m ✓[0m
## - Checking CRS consistency...[0;32m ✓[0m
## Checking the headers
## - Checking scale factor validity...[0;32m ✓[0m
## - Checking Point Data Format ID validity...[0;32m ✓[0m
## Checking preprocessing already done
## - Checking negative outliers...[0;32m ✓[0m
## - Checking normalization...[0;31m no[0m
## Checking the geometry
## - Checking overlapping tiles...[0;32m ✓[0m
## - Checking point indexation...[0;31m no[0m
#plot catalog
plot(ctg, mapview=T, map.type="Esri.WorldImagery")
Ground classified points will be used for the DTM. First, we should
specify to write to disk, then classify ground points.
HOWEVER – after reading the information about the dataset, ground points
are already classified, and the rasterize_terrain()
function automatically points to class 2 (ground) points.
SO: I’m including the code for classifying ground points for reference,
but we will not be evaluating this code chunk due to the ground points
already being classified.
#opt_output_files(ctg) = paste0(tempdir(), "{*}_classified")
#classified_ctg = classify_ground(ctg, csf())
Changing the buffer size and chunk size can speed up the processing time. Increasing buffer helps spatial consistency when the point cloud is less dense, but makes processing slower. When chunks are too large to be loaded in memory, chunk size can be reduced.
#opt_chunk_size(ctg) = 500
#opt_chunk_buffer(ctg) = 10
Input set as lidar catalog, resolution = 1, tin algorithm, and function automatically uses ground class (2)
dtm = lidR::rasterize_terrain(ctg, res = 1, algorithm = tin())
## [1] "DTM successfully generated in: 15.380411863327 minutes."
dsm = lidR::rasterize_canopy(ctg, res = 1, algorithm = dsmtin())
## [1] "DSM successfully generated in: 16.418203830719 minutes."
My quality check methods include using a third-party DEM and checking elevation values around the newly generated rasters, generating canopy height models (CHM) and checking values of vegetation and objects on the ground for height verification, and looking at the overall distribution of values in each raster through the use of histograms.
Unfortunately, due to non-matching extents and incongruous data types, as well as my own weaning patience with this task in R, I’m going to move over to QGIS to “get on with it”. I’ll still include the code below, and hopefully will circle back later.
Download an external dataset in order to check the OpenTopo data & my DTM and DSM.
#read in USGS DEM as a raster
usgs_dem = terra::rast("verify_data/USGS_13_N46W085_20240205.tif")
#clip extent to match other layers
terra::ext(usgs_dem) = c(672000, 682000, 5045000, 5052000)
#plot for visualization
usgs_dem_slope = terrain(usgs_dem, "slope", unit="radians")
usgs_dem_aspect = terrain(usgs_dem, "aspect", unit="radians")
usgs_dem_hill = shade(usgs_dem_slope, usgs_dem_aspect, 45, 270)
plot(usgs_dem_hill, col = grey(0:100/100), legend=F,mar=c(2,2,1,4))
## Warning: [is.lonlat] coordinates are out of range for lon/lat
plot(usgs_dem,col=terrain.colors(25,alpha=0.35),add=T,main="USGS DEM with Hillshade")
## Warning: [is.lonlat] coordinates are out of range for lon/lat
ot_dsm_hist = ggplot() + geom_histogram(data = ot_dsm, aes(x=MI09_Hardiman_hh), ggtitle("OT DSM Elevation Vals"))
ot_dtm_hist = ggplot() + geom_histogram(data = ot_dtm, aes(x=MI09_Hardiman_be), ggtitle("OT DTM Elevation Vals"))
dsm_hist = ggplot() + geom_histogram(data = dsm, aes(x=Z), ggtitle("DSM Elevation Vals"))
dtm_hist = ggplot() + geom_histogram(data = dtm, aes(x=Z), ggtitle("DTM Elevation Vals"))
dem_hist = ggplot() + geom_histogram(data = usgs_dem, aes(x=Layer_1), ggtitle("USGS DEM Elevation Vals"))
grid.arrange(ot_dsm_hist, ot_dtm_hist, dsm_hist, dtm_hist, dem_hist, ncol=2)
First test the original OpenTopo data: OT_DSM - OT_DTM
ot_chm = ot_dsm - ot_dtm
ot_chm_slope = terrain(ot_chm, "slope", unit="radians")
ot_chm_aspect = terrain(ot_chm, "aspect", unit="radians")
ot_chm_hill = shade(ot_chm_slope, ot_chm_aspect, 45, 270)
plot(ot_chm_hill, col = grey(0:100/100), legend=F,mar=c(2,2,1,4))
plot(ot_chm,col=terrain.colors(25,alpha=0.4),add=T,main="OpenTopo CHM with Hillshade")
Next test my data: DSM - DTM
chm = dsm - dtm
chm_slope = terrain(chm, "slope", unit="radians")
chm_aspect = terrain(chm, "aspect", unit="radians")
chm_hill = shade(chm_slope, chm_aspect, 45, 270)
plot(chm_hill, col = grey(0:100/100), legend=F,mar=c(2,2,1,4))
plot(chm,col=terrain.colors(25,alpha=0.4),add=T,main="CHM with Hillshade")
Now my DSM vs the OT DTM which may continue to be used
chm_mix = dsm - ot_dtm
chm_mix_slope = terrain(chm_mix, "slope", unit="radians")
chm_mix_aspect = terrain(chm_mix, "aspect", unit="radians")
chm_mix_hill = shade(chm_mix_slope, chm_mix_aspect, 45, 270)
plot(chm_mix_hill, col = grey(0:100/100), legend=F,mar=c(2,2,1,4))
plot(chm_mix, col=terrain.colors(25,alpha=0.4),add=T,main="CHM (OT + Allison) with Hillshade")
Finally, the last step! Now that we have a new DTM & DSM that should be accurate, we need to convert them into Cloud-Optimized GeoTIFFs (COGs)
To do this, we use the sf package in R which provides an
interface with GDAL utilities. Provided the typical parameters for
generating COGs and slightly adjusting to fit R and updated
versions of GDAL, here’s what we have:
DTM:
sf::gdal_utils(
util = "translate",
source = "dtm.tif",
destination = "dtm_cog.tif",
options = c(
"-of", "GTIFF",
"-f", "COG",
"-co", "COMPRESS=DEFLATE",
"-co", "BLOCKSIZE=512"
)
)
dtm_cog = terra::rast("dtm_cog.tif")
DSM:
sf::gdal_utils(
util = "translate",
source = "dsm.tif",
destination = "dsm_cog.tif",
options = c(
"-of", "GTIFF",
"-f", "COG",
"-co", "COMPRESS=DEFLATE",
"-co", "BLOCKSIZE=512"
)
)
dsm_cog = terra::rast("dsm_cog.tif")
DTM from OT:
sf::gdal_utils(
util = "translate",
source = "MI09_Hardiman_be/MI09_Hardiman_be.tif",
destination = "ot_dtm_cog.tif",
options = c(
"-of", "GTIFF",
"-f", "COG",
"-co", "COMPRESS=DEFLATE",
"-co", "BLOCKSIZE=512"
)
)
ot_dtm_cog = terra::rast("ot_dtm_cog.tif")