Remote Sensing Data in R

Overview

Remotely sensed data have a huge array of uses when applying GIS to global environmental problems. In a lot of our work in this class, we’re using remotely sensed data that have already been classified or otherwise processed for specific purposes.

In this tutorial, though, we’ll have a look at what remote sensing data look like when they’re a bit closer to the format in which they come from the satellite. We’ll be looking, first at an area of Egypt’s West Desert that has been the subject of massive agricultural expansion efforts.

After that, we’ll have a look at a DEM of the area around Mount Everest to see how to make impressive topographic maps using R and the ggplot2 package universe.

The Data

In this tutorial, we’ll be working with data from NASA/USGS’s LandSat program. LandSat data can be downloaded from USGS’s EarthExplorer.

We have two LandSat scenes from the same area in Egypt’s Western Desert, one in June 2014 and one in June 2024. You can access these data in this zip file.

Packages

Most of the LandSat work we’ll be doing will use the terra package, but later on when we get to plotting, we’ll want to use some functions from packages that help us make some nice maps. This includes the ggplot2 package, which is part of the tidyverse, as well as the tidyterra package, which provides functions for plotting terra objects in ggplot2. Finally, we’ll be using a package called ggspatial, which allows us to add north arrows and scale bars to maps in ggplot2.

Let’s install and activate those packages:

require(tidyverse)

Loading required package: tidyverse

Warning: package 'ggplot2' was built under R version 4.3.3

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.0
✔ purrr     1.0.2

── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

require(terra)

Loading required package: terra

Warning: package 'terra' was built under R version 4.3.3

terra 1.7.78

Attaching package: 'terra'

The following object is masked from 'package:tidyr':

    extract

require(tidyterra)

Loading required package: tidyterra

Warning: package 'tidyterra' was built under R version 4.3.3


Attaching package: 'tidyterra'

The following object is masked from 'package:stats':

    filter

require(ggspatial)

Loading required package: ggspatial

LandSat Data in `R`

To start off, we’ll read in sections of two LandSat scenes from Egypt’s West Desert. The two images were taken a decade apart, so we can see the changes in the area very clearly. The first, taken in June of 2014, was taken by the LandSat 8 satellite, while the second, taken in June of 2024, was taken by LandSat 9.

I have already processed these images a bit to cut down the file size and compile them into a single TIF file to make it easier to load them in. Don’t forget to set your working directory appropriately, first!

setwd("G:/My Drive/My Classes/GIS for Global Environment/Labs/Remote Sensing Data in R")

landsat_2014 <- rast("LandSat_8_June_2014_clip.tif")

Task 1: Alter the above code to read in the 2024 LandSat 9 image and assign it to an object called landsat_2024.

So let’s have a quick look at what we’ve got here:

plot(landsat_2014)

plot(landsat_2024)

LandSat Bands

So let’s break down what we’re looking at here. Each of those two raster objects have seven components, which we call bands, stacked on top of each other. You can think of band as a different measure for each pixel, almost like columns in a table. In this case, the seven bands correspond to the seven different wavelengths at which LandSat 8 and 9 collect data. In other words, each band corresponds to a different section of the electromagnetic spectrum.

Here’s how the bands map out:

We’re only going to use the first seven of the bands, to save download time. The other bands collect information on features like temperature and cloud cover.

Now it’s not super helpful to know the wavelengths if we don’t really know what they represent. Here’s a more reaily interpretable breakdown:

1 Coastal / Aerosol
2 Blue
3 Green
4 Red
5 Near Infra-Red
6 Short Wave Infra-Red 1
7 Short Wave Infra-Red 2
8 Panchromatic
9 Cirrus
10 Thermal 1
11 Thermal 2

The real power of these different bands comes from knowing how different features on the planet’s surface interact with different parts of the electromagnetic spectrum. By highlighting the energy LandSat detects at different wavelengths, we can highlight specific features of the planet’s surface.

Naming raster bands in `terra`

Just to make things easier for us, we can actually give the bands in our rasters informative names to make it easier to keep track of what we’re doing when we’re working with them as we move through the tutorial.

Naming SpatRaster bands is easy. We just use the names() function, like this:

names(landsat_2014) <- c("coastal", "blue", "green", "red",
                         "nir", "swnir1", "swnir2")

Task 2: Create a line of code to assign the same names to the landsat_2024 raster:

We can pull out a specific band of a SpatRaster like this:

landsat_2014[["green"]]

class       : SpatRaster 
dimensions  : 2155, 4286, 1  (nrow, ncol, nlyr)
resolution  : 30, 30  (x, y)
extent      : 166185, 294765, 3305055, 3369705  (xmin, xmax, ymin, ymax)
coord. ref. : WGS 84 / UTM zone 36N (EPSG:32636) 
source      : LandSat_8_June_2014_clip.tif 
name        : green 
min value   :  7990 
max value   : 27369

Task 3: Create a line of code that will plot only the Blue band from the landsat_2024 object. It should look something like this:

Notice that even though a lot of the image is bluish, that’s just a coincidence - R just happened to assign a color ramp where blue was low, and, not surprisingly, there isn’t a lot of blue in Egypt’s West Desert!

You can also use the band number to extract a band. So to get the green band from the landsat_2014 object, we could also do this:

landsat_2014[[3]]

class       : SpatRaster 
dimensions  : 2155, 4286, 1  (nrow, ncol, nlyr)
resolution  : 30, 30  (x, y)
extent      : 166185, 294765, 3305055, 3369705  (xmin, xmax, ymin, ymax)
coord. ref. : WGS 84 / UTM zone 36N (EPSG:32636) 
source      : LandSat_8_June_2014_clip.tif 
name        : green 
min value   :  7990 
max value   : 27369

Plotting true-color images in `terra`

Because we have the red, green, and blue sections of the visual spectrum in our LandSat images, we can use them to make what amounts to a visual-spectrum photograph of the area we are studying. terra has a special function, plotRGB(), that allows us to accomplish this task. We just need to tell it which bands of the raster correspond to which colors. plotRGB() ignores band names, so you have to designate the correct bands using their number:

plotRGB(landsat_2014, r = 4, g = 3, b = 2)

Yup, looks like a desert to me!

Task 4: Plot a true-color image of the landsat_2024 object. It should look something like this:

Wow! It looks like over the course of the past decade the West Desert has grown a five o’ clock shadow!

Of course, if you read the materials for today, you know exactly what’s going on: there has been a major push for “greening” the West Desert in order to boost Egyptian food production. All those little dots are new agricultural fields, irrigated by circular irrigation units like this one:

Calculating normalized difference indices

To be clear, just looking at the visual spectrum really doesn’t take much advantage of the richness of LandSat and similar satellite data. We have several other bands that we can use to really highlight particular aspects of the landscape.

One common way of taking advantage of the information recorded in different bands of a satellite image is computing normalized difference indices. These measures compare the values of different bands in the image to identify key features of interest.

There are literally dozens of these indices, each used for slightly different purposes. We’ll just give an example of a couple here, but you can see a list of a bunch of them, as well as how to calculate them at this link.

The Normalized Difference Vegetation Index (NDVI)

We’re going to start with by far the most commonly used difference index, the Normalized Difference Vegetation Index (NDVI). This measure compares the infra-red and red bands to identify vegetated areas. The idea is that healthy plants tend to absorb red light while reflecting infra-red, so areas that reflect relatively more infra-red than red should be more highly vegetated.

We can compute NDVI using the following formula:

\[NDVI = \frac{NIR - Red}{NIR + Red}\]

So to compute this value for an image, all we need to do is:

ndvi_numerator <- landsat_2014[["nir"]] - landsat_2014[["red"]]

ndvi_denomenator <- landsat_2014[["nir"]] + landsat_2014[["red"]]

ndvi_2014 <- ndvi_numerator/ndvi_denomenator

plot(ndvi_2014)

Not surprisingly, most of the desert areas have an NDVI of 0 or below (technically values below 0 should all just be interpreted as the absence of vegetation), while irrigated areas have higher values. The resulting image highlights the green areas in the West Desert in 2014.

Task 5: Compute the NDVI for the region in 2024 assign it to an object called ndvi_2024. Then plot it. It should look something like this:

Once we have computed difference indices, we can treat them like any other SpatRaster. That means we can also perform map algebra on them. For example, we could subtract the NDVI for 2014 from the NDVI for 2024 to get a map highlighting the places where NDVI changed the most.

Task 6: Create a plot showing the areas where NDVI changed the most between 2014 and 2024. It should look something like this:

This new plot really highlights the areas of new cropland development.

We could also use map algebra to try to estimate how much new cropland was developed between 2014 and 2024. We’ll use the hist() function to draw a quick histogram to look at the distribution of NDVI in 2014 and 2024:

hist(ndvi_2014)

Warning: [hist] a sample of 11% of the cells was used

hist(ndvi_2024)

Warning: [hist] a sample of 11% of the cells was used

From the histogram, it appears as though the bulk of the values are below 0.1, which is probably marking out desert pixels. So what we would be looking for would be pixels that are below, say, 0.1 in 2014 but above it in 2024. Here’s how we can get those:

ndvi_change <- (ndvi_2014 < 0.1) * (ndvi_2024 > 0.1)

So why does that work? Well, the comparative statements in the parentheses give us back binary (1/0) rasters that are 1 if the condition is true. Because anything times 0 equals 0, if we multiply the two rasters together, we get a new raster that has a 1 everywhere both of the conditional statements are true.

Here’s what the resulting raster looks like:

plot(ndvi_change)

That looks good, for a rough estimate. Let’s say we use that as our cutoff. Because we know that LandSat has about a 30-meter resolution, we can add up all the pixels greater than 0.05 and multiply by 900 to estimate the square meters of change in irrigated land in our image between 2014 and 2024. Then we can divide by 1,000,000 to convert to square kilometers:

#First, we need to convert the raster into a matrix that we can
#add up:

ndvi_mat <- as.matrix(ndvi_change)

sum(ndvi_mat >= 0.05) * 900 / 1000000

[1] 1100.385

To make this a bit more intuitive, we can convert square kilometers to football (i.e. soccer) fields:

1100 * 140 #It can be handy to know that there are about 140

[1] 154000

#football/soccer fields in a square kilometer when communicating
#geographic information.

Okay, so 154,000 soccer fields of new irrigated land does seem like quite a bit!

The Normalized Difference Built-up Index (NDBI)

The Normalized Difference Built-up Index (NDBI) is very similar to the NDVI, except it is designed to highlight areas of human construction. In this case, we compare the near infra-red and, for LandSat 8 and 9, the first short-wave infra-red bands:

\[NDBI = \frac{SWIR - NIR} {SWIR + NIR}\]

So the computation would be like this:

ndbi_numerator <- landsat_2024[["swnir1"]] - landsat_2024[["nir"]]

ndbi_denomenator <- landsat_2024[["swnir1"]] + landsat_2024[["nir"]]

ndbi_2024 <- ndbi_numerator/ndbi_denomenator

plot(ndbi_2024)

You’ll probably notice an immediate problem here - the NDBI doesn’t seem to be doing a great job of distinguishing built-up from desert in this area. In a more heavily vegetated area, however, it might be more useful.

Task 7: Select another difference index from the list at this link and calculate the change in that index between 2014 and 2024. Plot your results and describe what you think you’re seeing.

Plotting with the `tidyterra` and `ggplot2` packages

These plots that we’ve been making so far are fine for checking our progress as we work through a project, but they aren’t exactly pretty. Luckily, there are some very good packages to help you make high-quality graphics in R.

Throughout most of our tutorials, we’re going to be leaning pretty heavily on the ggplot2 package, which is the most widely used workhorse for high-quality graphics in R. It’s so important that there have been a bunch of other packages that have been developed to extend it to allow you to make all kinds of data visualizations, including maps.

tidyterra is one of those packages. It works as a translator between the terra package and ggplot2, allowing you to make high-quality maps with raster data. We’re going to take advantage of those capabilities to spruce up our LandSat maps.

In what follows, I’m going to assume that you’re familiar with the basic idea of layers as they are implemented in ggplot2. If you haven’t watched the Equitable Equations video for today, you should definitely do that first. If you have, you might still want to check out this five-minute video for a quick refresher.

We’ll start by plotting a true-color image of our area in 2014:

map_1 <- ggplot() + #This first line tells R that we're building a ggplot2 object.
  #Normally, we would include an aes() function in here to set our axes, etc., but
  #tidyterra works a bit differently, because it is designed to have us point R to the
  #correct raster layer when we plot it. Notice that this line ends with a + sign. That
  #is how we add (literally) additional characteristics to our evolving ggplot2 object.
  geom_spatraster(data = ndvi_2014) +
  #The line above is called a geometry - that's what ggplot2 uses to draw
  #an image based on data. In this case, we're directing it to the
  #ndvi_2014 object.
  scale_fill_continuous("NDVI", low = "black", high = "green") + 
  #This line tells ggplot2 how to draw the color ramp for the raster.
  annotation_scale(location = "br", pad_x = unit(0.25, "in"),
                   pad_y = unit(0.25, "in")) + #This line adds a scale
  #bar. The location argument shows where to place it (br = bottom right;
  #tr = top right; bl = bottom left; tl = top left). The
  #pad_ arguments show how far you want it from the corner of the map.
  #The unit() function specifies what measurement units to use for the
  #offset.
  annotation_north_arrow(location = "bl", pad_x = unit(0.25, "in"),
                   pad_y = unit(0.25, "in"))+ #This adds the north arrow.
  #The syntax is pretty similar to the scalebar function.
  labs(title = "NDVI in Egypt's West Desert, 2014",
       subtitle = "LandSat Image, June 2014") +
  theme_minimal() #This line sets the color, font, and other formatting patterns for

<SpatRaster> resampled to 500996 cells.

  #the plot.

map_1 #After you create the object, you have to run it as a separate line for R to plot it.

Task 8: Use ggplot2 to create a polished map showing the changes in NDVI between 2014 and 2024 in Egypt’s West Desert.