This Exercise will introduce you to reading vector and raster data into R, and tabulating raster areas using a vector (watershed boundary). For more information on spatial analysis in R, see: https://www.paulamoraga.com/book-spatial/spatial-data-in-r.html

Focus on sf and stars (the new raster package): https://r-spatial.org/book/07-Introsf.html

0. Open R, and click on File > New File > R Script

For each line of code below, enter it into the R script.
Then, put the cursor on the line, and press “>Run” button in the upper right corner of script window.

1. Install the libraries (aka packages) you need.

You only have to “install.packages” once. After you have run this, you can comment it out with “#”

Note: Until 2023, R used the rgdal and regeos packages for spatial data. That has been changed to the “sf” package.

Install packages—only once! To make it run (the first time), set eval=TRUE. To make it not run after that, set eval=FALSE as shown below. Your Rmd file will not knit if this chunk is still set to be evaluated during knitting.

### 1) Install (run once per machine) ----
install.packages(c(
  "sf",
  "stars",
  "htmlTable",
  "kableExtra",  # For pretty tables
  "knitr",
  # optional but handy:
  "lwgeom"       # fixes invalid polygons
  # "terra",     # alternative to stars for rasters
  # "exactextractr" # optional: precise polygon/raster summaries
      # Much faster than the raster summary we are using below.
))
#### 2) Load every new R session ----
library(sf)
## Linking to GEOS 3.12.1, GDAL 3.8.4, PROJ 9.3.1; sf_use_s2() is TRUE
library(stars)
## Loading required package: abind
library(kableExtra)  # For pretty tables
library(knitr)
# optional helpers:
# library(lwgeom)

To see what each of these libraries do, do a google search on, e.g. “R cran raster package”

2. Loading data.

Set the input directory and filename for the watershed boundary shapefile. * This is the watershed boundary of the shapefile you delineated in Exercise 2. Note that there is no .shp extension on the shapefile name.

Call the directory with the shapefile “indir.shp” and the shapefile name “fname.shp”. Call the directory with the land cover raster “indir.lc” and the raster filename “fname.lc”.

Now read in the watershed boundary shapefile (or geopackage) and land cover raster.

#### 4) Read data with sf/stars ----
# Vector

# Note:  you can also read in a geopackage, just replace ".shp" with ".gpkg".

# Instead of using indir.shp,paste... as the argument for file.path you can also
# copy and paste the entire path of your file directly.  
# e.g. file.path("C:/mydir/GEOG576/whateverpath/wshed.shp")

wshed <- st_read(file.path(indir.wshed, paste0(fname.wshed, ".shp")))
## Reading layer `lospen_wshed_10m_utm11' from data source 
##   `K:\My Drive\Gdrive\mydocuments\SDSU_classes\past\2020_2021_701_576\2020F_576_701_498\576\hw\data\spatial\DEM\biggs_output\lospen_wshed_10m_utm11.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 3 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: 488586.1 ymin: 3641729 xmax: 504566.1 ymax: 3651339
## Projected CRS: WGS 84 / UTM zone 11N
# Raster
lc <- read_stars(file.path(indir.lc, fname.lc))

Check the coordinate reference system (aka projection) of the watershed shapefile:

crs.wshed <- st_crs(wshed)
crs.wshed
## Coordinate Reference System:
##   User input: WGS 84 / UTM zone 11N 
##   wkt:
## PROJCRS["WGS 84 / UTM zone 11N",
##     BASEGEOGCRS["WGS 84",
##         DATUM["World Geodetic System 1984",
##             ELLIPSOID["WGS 84",6378137,298.257223563,
##                 LENGTHUNIT["metre",1]]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",4326]],
##     CONVERSION["UTM zone 11N",
##         METHOD["Transverse Mercator",
##             ID["EPSG",9807]],
##         PARAMETER["Latitude of natural origin",0,
##             ANGLEUNIT["Degree",0.0174532925199433],
##             ID["EPSG",8801]],
##         PARAMETER["Longitude of natural origin",-117,
##             ANGLEUNIT["Degree",0.0174532925199433],
##             ID["EPSG",8802]],
##         PARAMETER["Scale factor at natural origin",0.9996,
##             SCALEUNIT["unity",1],
##             ID["EPSG",8805]],
##         PARAMETER["False easting",500000,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8806]],
##         PARAMETER["False northing",0,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8807]]],
##     CS[Cartesian,2],
##         AXIS["(E)",east,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["(N)",north,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     ID["EPSG",32611]]

Now check the coordinate reference system of the land cover raster:

crs.lc <- st_crs(lc)
crs.lc
## Coordinate Reference System:
##   User input: WGS 84 / UTM zone 11N 
##   wkt:
## PROJCRS["WGS 84 / UTM zone 11N",
##     BASEGEOGCRS["WGS 84",
##         ENSEMBLE["World Geodetic System 1984 ensemble",
##             MEMBER["World Geodetic System 1984 (Transit)"],
##             MEMBER["World Geodetic System 1984 (G730)"],
##             MEMBER["World Geodetic System 1984 (G873)"],
##             MEMBER["World Geodetic System 1984 (G1150)"],
##             MEMBER["World Geodetic System 1984 (G1674)"],
##             MEMBER["World Geodetic System 1984 (G1762)"],
##             MEMBER["World Geodetic System 1984 (G2139)"],
##             ELLIPSOID["WGS 84",6378137,298.257223563,
##                 LENGTHUNIT["metre",1]],
##             ENSEMBLEACCURACY[2.0]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",4326]],
##     CONVERSION["UTM zone 11N",
##         METHOD["Transverse Mercator",
##             ID["EPSG",9807]],
##         PARAMETER["Latitude of natural origin",0,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8801]],
##         PARAMETER["Longitude of natural origin",-117,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8802]],
##         PARAMETER["Scale factor at natural origin",0.9996,
##             SCALEUNIT["unity",1],
##             ID["EPSG",8805]],
##         PARAMETER["False easting",500000,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8806]],
##         PARAMETER["False northing",0,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8807]]],
##     CS[Cartesian,2],
##         AXIS["(E)",east,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["(N)",north,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     USAGE[
##         SCOPE["Navigation and medium accuracy spatial referencing."],
##         AREA["Between 120°W and 114°W, northern hemisphere between equator and 84°N, onshore and offshore. Canada - Alberta; British Columbia (BC); Northwest Territories (NWT); Nunavut. Mexico. United States (USA)."],
##         BBOX[0,-120,84,-114]],
##     ID["EPSG",32611]]

Do they match? To check:

crs.wshed == crs.lc
## [1] TRUE

If they don’t match for some reason, you’ll need to project one dataset into the projection of the other:

wshed.proj <- st_transform(wshed,crs=st_crs(lc))

Now plot the watershed and land cover raster to see what they look like.

plot(lc, reset=FALSE)
## downsample set to 4
plot(st_geometry(wshed), add = TRUE, lwd = 2,col=NA,border="red")

#### 6) Crop/mask raster to watershed polygon ----
lc_ws <- lc[wshed]        # stars supports polygon subsetting
plot(lc_ws); plot(st_geometry(wshed), add = TRUE)

Create a color key for NLCD classes (optional, but helpful for plotting later).

nlcd_colors <- c(
  "11" = "#466B9F",  # Open Water
  "12" = "#D1DEF8",  # Perennial Ice/Snow
  "21" = "#DECACA",  # Developed, Open Space
  "22" = "#D99482",  # Developed, Low Intensity
  "23" = "#EE0000",  # Developed, Medium Intensity
  "24" = "#AB0000",  # Developed, High Intensity
  "31" = "#B3ACA3",  # Barren Land
  "41" = "#68AA63",  # Deciduous Forest
  "42" = "#1C6330",  # Evergreen Forest
  "43" = "#B5CA8F",  # Mixed Forest
  "52" = "#CCB879",  # Shrub/Scrub
  "71" = "#E2E2C1",  # Grassland/Herbaceous
  "81" = "#C9C977",  # Pasture/Hay
  "82" = "#99C247",  # Cultivated Crops
  "90" = "#82B3D1",  # Woody Wetlands
  "95" = "#C1E5F0"   # Emergent Herbaceous Wetlands
)

lc.key <- data.frame(
  Code = c(11,21,22,23,24,31,42,43,52,71,81,82,90,95),
  Name = c("Open Water","Developed, Open Space","Developed, Low Intensity",
           "Developed, Medium Intensity","Developed, High Intensity","Barren",
           "Evergreen Forest","Mixed Forest","Shrub/Scrub","Grassland",
           "Pasture","Crops","Woody Wetlands","Emergent wetlands")
)

Plot the map with the legend. There’s currently overlap…see if you can fix it.

# expand the plotting margins
par(xpd = TRUE, mar = c(12, 3, 3, 3))  # XPD allows the legend to be outside plot area

plot(
  lc,
  col = nlcd_colors[as.character(sort(unique(as.vector(lc[[1]]))))],
  key.pos = NULL,            # <— removes the default numeric legend
  reset = FALSE,
  main = NULL
)
## downsample set to 4
plot(st_geometry(wshed),
     add = TRUE,
     col = NA,
     border = "black",
     lwd = 2)

legend(
  "bottomleft",
  legend = lc.key$Name,
  fill = nlcd_colors[as.character(lc.key$Code)],
  border = "gray40",
  cex = 0.8,
  bty = "n",
  ncol = 2,
  title = "NLCD Land Cover Classes",
  inset = c(0, 0)  # push legend further right if needed
)

#### 7) Tabulate NLCD classes inside the watershed ----
tbl <- table(lc_ws)                 # counts by NLCD code
lc.df <- data.frame(Code = as.integer(names(tbl)), Cells = as.integer(tbl))
lc.df <- merge(lc.df, lc.key, by = "Code", all.x = TRUE)
### 8) Compute area (derive pixel size instead of hard-coding 30 m) ----
dims <- st_dimensions(lc_ws)
dx <- abs(dims[[1]]$delta)  # pixel width (map units)
dy <- abs(dims[[2]]$delta)  # pixel height
cell_area_km2 <- (dx * dy) / 1e6

lc.df$Area.km2     <- lc.df$Cells * cell_area_km2
lc.df$Area.percent <- 100 * lc.df$Cells / sum(lc.df$Cells)

# tidy printing
lc.df$Area.km2     <- round(lc.df$Area.km2, 2)
lc.df$Area.percent <- round(lc.df$Area.percent, 1)
#### 9) Sort & format output ----
lc.out <- lc.df[order(lc.df$Area.km2, decreasing = TRUE), 
                c("Name","Area.km2","Area.percent")]

lc.out$Area.km2     <- format(lc.out$Area.km2, digits = 3)
lc.out$Area.percent <- format(lc.out$Area.percent, digits = 3)
### 9) Display results in a kable----
kable(
  lc.out,
  caption = "Table 1. Land cover in the Los Peñasquitos watershed.",
  align = c("l", "r", "r"),
  col.names = c("Land Cover Class", "Area (km²)", "Percent of Total"),
  row.names= FALSE
)
Table 1. Land cover in the Los Peñasquitos watershed.
Land Cover Class Area (km²) Percent of Total
Shrub/Scrub 48.42 44.2
Developed, Medium Intensity 22.39 20.4
Developed, Low Intensity 16.77 15.3
Developed, Open Space 15.32 14.0
Developed, High Intensity 3.05 2.8
Grassland 2.99 2.7
Woody Wetlands 0.35 0.3
Pasture 0.17 0.2
Mixed Forest 0.09 0.1
Barren 0.06 0.1
Evergreen Forest 0.03 0.0
Emergent wetlands 0.02 0.0
Open Water 0.00 0.0
Crops 0.00 0.0

Or you can make the table a bit fancier with kableExtra.

kable(
  lc.out[, c("Name", "Area.km2", "Area.percent")],
  caption = "Table 1. Land cover in the Los Peñasquitos watershed.",
  col.names = c("Land Cover Class", "Area (km²)", "Percent of Total"),
  align = c("l", "r", "r"),
  row.names = FALSE,
  format = "html"   # ensures styling works in HTML output
) %>%
  kable_styling(
    full_width = FALSE,                   # keeps table compact
    bootstrap_options = c("striped", "hover", "condensed", "responsive"),
    position = "center",
    font_size = 13
  ) %>%
  column_spec(1, bold = TRUE) %>%         # make class names bold
  column_spec(2:3, width = "6em") %>%     # set column widths
  add_header_above(c(" " = 1, "Land Cover Summary" = 2)) %>%  # grouped header
  footnote(
    general = "Areas are derived from NLCD data (30 m resolution).",
    general_title = "Note: ",
    footnote_as_chunk = TRUE
  )
Table 1. Land cover in the Los Peñasquitos watershed.
Land Cover Summary
Land Cover Class Area (km²) Percent of Total
Shrub/Scrub 48.42 44.2
Developed, Medium Intensity 22.39 20.4
Developed, Low Intensity 16.77 15.3
Developed, Open Space 15.32 14.0
Developed, High Intensity 3.05 2.8
Grassland 2.99 2.7
Woody Wetlands 0.35 0.3
Pasture 0.17 0.2
Mixed Forest 0.09 0.1
Barren 0.06 0.1
Evergreen Forest 0.03 0.0
Emergent wetlands 0.02 0.0
Open Water 0.00 0.0
Crops 0.00 0.0
Note: Areas are derived from NLCD data (30 m resolution).