Chapter 10 Notes

Harold Nelson

2023-04-08

The Plan

I want to obtain cropland data and watershed boundaries for Thurston county so that I could do a zonal analysis showing how crops vary across the watersheds.

Setup

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.1     ✔ purrr   0.3.4
## ✔ tibble  3.2.1     ✔ dplyr   1.1.1
## ✔ tidyr   1.2.1     ✔ stringr 1.4.1
## ✔ readr   2.1.2     ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(sf)
## Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
library(terra)
## terra 1.7.3
## 
## Attaching package: 'terra'
## 
## The following object is masked from 'package:tidyr':
## 
##     extract
library(scales)
## 
## Attaching package: 'scales'
## 
## The following object is masked from 'package:terra':
## 
##     rescale
## 
## The following object is masked from 'package:purrr':
## 
##     discard
## 
## The following object is masked from 'package:readr':
## 
##     col_factor
library(colorspace)
## 
## Attaching package: 'colorspace'
## 
## The following object is masked from 'package:terra':
## 
##     RGB
library(ggspatial)
library(tmap)
library(CropScapeR)
## 
## Recommendated citation for the 'CropScapeR' package is:
## Bowen Chen (2020). CropScapeR: Access Cropland Data Layer Data via the 'CropScape' Web Service. R package version 1.1.5. 
## 
## For more information, visit the package website at: https://github.com/cbw1243/CropScapeR. If you have any question/suggestion/comment regarding the 'CropScapeR' package, please send emails to Bowen Chen (bwchen0719@gmail.com).
source("rasterdf.R")
load("thurston_sf.Rdata")

The Cropland Data

The following code using the CropScapeR package gets the crop raster data for Thurston county in 2010 and 2020.

GetCDLData(aoi = 53067,
       year = 2020,
       type = "f",
       save_path = "th_crops_2020.tif",
       readr = FALSE)
## Data is saved at:th_crops_2020.tif
## NULL
th_crops_2020 = rast("th_crops_2020.tif")

GetCDLData(aoi = 53067,
       year = 2010,
       type = "f",
       save_path = "th_crops_2010.tif",
       readr = FALSE)
## Data is saved at:th_crops_2010.tif
## NULL
th_crops_2010 = rast("th_crops_2010.tif")

Watersheds Data

The following code from the package nhdplusTools got the national file of watersheds boundaries.

This chunk was not run for this knitting because it takes a long time to execute.

library(nhdplusTools)
download_wbd(
  outdir = "./wbd",
  url = paste0("https://prd-tnm.s3.amazonaws.com/StagedProducts/",
               "Hydrography/WBD/National/GDB/WBD_National_GDB.zip"),
  progress = TRUE
)

The resulting data stored in /wbd is a geodata package, which contains multiple layers. First, use st_layers() to get the names of the layers.

layers = st_layers("WBD_national_GDB.gdb")

layers
## Driver: OpenFileGDB 
## Available layers:
##                     layer_name     geometry_type features fields crs_name
## 1                      WBDHU12     Multi Polygon   103015     20    NAD83
## 2             NWISDrainageLine Multi Line String   127876     12    NAD83
## 3                       WBDHU8     Multi Polygon     2413     15    NAD83
## 4  NonContributingDrainageLine Multi Line String        0     10    NAD83
## 5                       WBDHU4     Multi Polygon      245     15    NAD83
## 6                      WBDHU16     Multi Polygon     7202     20    NAD83
## 7  NonContributingDrainageArea     Multi Polygon        0     12    NAD83
## 8             NWISDrainageArea     Multi Polygon     2733     19    NAD83
## 9                       WBDHU6     Multi Polygon      405     15    NAD83
## 10                      WBDHU2     Multi Polygon       22     15    NAD83
## 11                     WBDHU14     Multi Polygon     8138     20    NAD83
## 12                     WBDLine Multi Line String   394995      8    NAD83
## 13                     WBDHU10     Multi Polygon    18866     17    NAD83
## 14           FeatureToMetadata                NA   432096      3     <NA>
## 15           ExternalCrosswalk                NA        0      7     <NA>
## 16                       HUMod                NA       23      5     <NA>
## 17           MetaProcessDetail                NA     1635     16     <NA>
## 18            MetaSourceDetail                NA     1638     27     <NA>
## 19        ProcessingParameters                NA        2      2     <NA>
## 20                UpdateStatus                NA        0      2     <NA>

Which Layer?

To make sense of this I asked ChatGPT. You can see the conversation here. https://sharegpt.com/c/xvFW7MG

I decided that the layer I wanted was WBDHUB10.

The following code does the work.

wsheds = st_read("WBD_national_GDB.gdb","WBDHU10")
## Reading layer `WBDHU10' from data source 
##   `/Users/haroldnelson/Library/CloudStorage/Dropbox/GDSWR/gdswr_data/Chapter10/WBD_National_GDB.gdb' 
##   using driver `OpenFileGDB'
## Warning in CPL_read_ogr(dsn, layer, query, as.character(options), quiet, :
## GDAL Message 1: organizePolygons() received a polygon with more than 100 parts.
## The processing may be really slow. You can skip the processing by setting
## METHOD=SKIP, or only make it analyze counter-clock wise parts by setting
## METHOD=ONLY_CCW if you can assume that the outline of holes is counter-clock
## wise defined
## Simple feature collection with 18866 features and 17 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -179.2297 ymin: -14.61019 xmax: 179.8567 ymax: 71.43957
## Geodetic CRS:  NAD83
save(wsheds,file = "wsheds.Rdata")

Intersecting with Thurston County

The following code failed because the naional watersheds boundary was not a valid sf object.

thurston_sf = st_transform(thurston_sf,crs(wsheds))
thurston_wsheds = st_intersection(thurston_sf,wsheds)
## Error in wk_handle.wk_wkb(wkb, s2_geography_writer(oriented = oriented, : Loop 201 is not valid: Edge 0 has duplicate vertex with edge 10

I then had another conversation with ChatGPT, which you can examine here https://sharegpt.com/c/R55wXZc

That led me to do the following. Note that ChatGPT gave me two possibilities. st_repair() doesn’t exist, even though it probably did at one time. Fortunately, st_make_valid() worked.

valcheck = st_is_valid(wsheds)
sum(!valcheck)
## [1] 10
wsheds_repaired = st_make_valid(wsheds)
valcheck = st_is_valid(wsheds_repaired)
sum(!valcheck)
## [1] 0

Finally

I was able to get the Thurston County watersheds.

thurston_wsheds = st_intersection(thurston_sf,wsheds_repaired)
## Warning: attribute variables are assumed to be spatially constant throughout all
## geometries
thurston_wsheds %>% 
  ggplot() + geom_sf()