1. Introduction

Getting satellite image datasets for the study area is a first task in several research projects. It is usually conducted throught a myriad of web applications set up by the data providers. This means that, many times, such a task is conducted independently of the remaining research processes.

This notebook tests functionalities of getSpatialData an R package that aims to provide homogeneous function bundles to query, download, prepare and transform various kinds of spatial datasets from open sources, including satellite sensor data. It supports both sf and sp classes as area-of-interest (AOI) inputs. The package is being developed by Jakob Schwalb-Willmann.

This notebook aims to illustrate Geomatica Basica students at Universidad Nacional de Colombia the potential of R for geospatial data processing and analysis including remote sensing tasks.

Note that several chunks of code are commented. You would need to uncomment them and run each one by one. It is advised to run the code in interactive mode (not as a notebook).

Let’s start by installing the current beta version of the getSpatialData, using devtools.

### run this chunk from the console - line-by-line
#   install.package("devtools")
#   devtools::install_github("16EAGLE/getSpatialData")

2. Functionalities

The following Landsat functions are publicly available:

The following generic functions are available:

3. Semantics

The following universal semantics on data are used by getSpatialData (from smallest to biggest entity):

The following universal semantics on computational steps are used by getSpatialData:

4. Getting started to test the library

4.1 Landsat query, preview and download

The following code represents a working chain for querying, filtering, previewing and downloading Landsat data within R. The workflow for dealing with MODIS and Sentinel data using getSpatialData is very similar.

## Load packages
library(getSpatialData)
library(raster)
## Loading required package: sp
library(sf)
## Linking to GEOS 3.5.1, GDAL 2.2.2, PROJ 4.9.2
library(sp)
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.0     ✓ purrr   0.3.4
## ✓ tibble  3.0.1     ✓ dplyr   0.8.5
## ✓ tidyr   1.0.2     ✓ stringr 1.4.0
## ✓ readr   1.3.1     ✓ forcats 0.5.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x tidyr::extract() masks raster::extract()
## x dplyr::filter()  masks stats::filter()
## x dplyr::lag()     masks stats::lag()
## x dplyr::select()  masks raster::select()

Let’s read geospatial data representing Montes de Maria, a region in the Colombian Caribbean composing several municipalities from Bolivar and Sucre. Its economy is based on agricultural activities, with a tradition of cattle raising and peasant cultivation of cassava, yams, corn, rice, banana, tobacco, coffee and avocado. Recently, agroindustrial crops have been developed including oil palm, cocoa and hot pepper. Thus, in my case, this region is the area-of-interest.

I would use the geojson format which is becoming popular for encoding a variety of geographic data structures. If you want to know more about this format go to this link.

## Use st_read function from the  sf library
regions <- st_read("./datos/Montes.geojson")
## Reading layer `Montes' from data source `/cloud/project/datos/Montes.geojson' using driver `GeoJSON'
## Simple feature collection with 15 features and 11 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -75.70418 ymin: 9.22535 xmax: -74.77214 ymax: 10.14548
## CRS:            4326

Let’s review the structure of the regions object:

str(regions)
## Classes 'sf' and 'data.frame':   15 obs. of  12 variables:
##  $ DPTO_CCDGO: Factor w/ 2 levels "13","70": 1 1 1 1 1 1 1 2 2 2 ...
##  $ MPIO_CCDGO: Factor w/ 15 levels "13212","13244",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ MPIO_CNMBR: Factor w/ 15 levels "CHALÁN","CÓRDOBA",..: 2 3 4 6 10 11 15 13 9 1 ...
##  $ MPIO_CRSLC: Factor w/ 11 levels "1780","1853",..: 6 8 2 7 4 1 4 9 5 11 ...
##  $ MPIO_NAREA: num  597 946 383 560 443 ...
##  $ MPIO_NANO : int  2017 2017 2017 2017 2017 2017 2017 2017 2017 2017 ...
##  $ DPTO_CNMBR: Factor w/ 2 levels "BOLÍVAR","SUCRE": 1 1 1 1 1 1 1 2 2 2 ...
##  $ Shape_Leng: num  1.206 2.069 0.898 1.405 1.349 ...
##  $ Shape_Area: num  0.0492 0.0779 0.0316 0.0461 0.0365 ...
##  $ layer     : Factor w/ 2 levels "bol_montes","suc_montes": 1 1 1 1 1 1 1 2 2 2 ...
##  $ path      : Factor w/ 2 levels "/Users/ivanlizarazo/Documents/ivan/PR/montes/bol_montes.shp|layername=bol_montes",..: 1 1 1 1 1 1 1 2 2 2 ...
##  $ geometry  :sfc_MULTIPOLYGON of length 15; first list element: List of 1
##   ..$ :List of 1
##   .. ..$ : num [1:3362, 1:2] -75 -75 -75 -75 -75 ...
##   ..- attr(*, "class")= chr  "XY" "MULTIPOLYGON" "sfg"
##  - attr(*, "sf_column")= chr "geometry"
##  - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
##   ..- attr(*, "names")= chr  "DPTO_CCDGO" "MPIO_CCDGO" "MPIO_CNMBR" "MPIO_CRSLC" ...

Let’s plot the simple feature collection:

# texts and labels
p <- ggplot(regions) +
  geom_sf(aes(fill = MPIO_NAREA), colour="white")

p + geom_sf_label(aes(label = MPIO_CNMBR), colour = "black", size = 2.0)

Now, let’s dissolve the internal boundaries of our region:

montes <-
  regions %>%
  summarise(area = sum(MPIO_NAREA))

What is inside the montes object?

montes
## Simple feature collection with 1 feature and 1 field
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: -75.70418 ymin: 9.22535 xmax: -74.77214 ymax: 10.14548
## CRS:            4326
##       area                       geometry
## 1 6513.845 POLYGON ((-75.11082 9.44062...

Let’s visualize the montes object:

p <- ggplot(montes) + geom_sf(aes(fill = area), colour="black") 
p + geom_sf_text(aes(label = "Montes de Maria"), colour = "white", size = 5.0)

Set area of interest for this session:

## set AOI for this session
set_aoi(st_geometry(montes))
view_aoi() #view AOI in viewer, which will look like this:

Set login USGS Earth Explorer credentials and archive directory.

First, you need to have an user account at this link. Then, you can run the code below:

## set login credentials and archive directory
## login_USGS(username = "ials-un") #asks you for password
## set_archive("./datos/")

## Use getSentinel_query to search for data (using the session AOI)
## records <- getLandsat_query(time_range = c("2020-01-01", "2020-04-30"), 
##                            platform = "Landsat-8") #or "Sentinel-1" or "Sentinel-3"

See results:

# records

Filter the records to keep only Landsat 8 images with LandCloudCover lower than 5%:

## Filter the records
## Uncomment to run the code
##colnames(records) #see all available filter attributes
##unique(records$processinglevel) #use one of the, e.g. to see available processing levels

##records_filtered <- records[which(records$product == "LANDSAT_8_C1"),] #filter by Level
##records_filtered <- records_filtered[as.numeric(records_filtered$LandCloudCover) <= 5, ] #filter by clouds
##records_filtered
#browser records or your filtered records

See filtered records:

library(knitr)
include_graphics("./datos/filtered_records.png")

Let’s preview the most recent Landsat 8 image within AOI. It correspond to the sixth record (i.e. the Landsat image acquired on 2020-03-29):

## Preview a single record with no map
## See other preview options on package documentation
## getLandsat_preview(record = records_filtered[6,], on_map = FALSE)

Now, let’s find what products levels are available:

#print available levels for the most recent image
#records_filtered[6,]$levels_available
library(knitr)
include_graphics("./datos/levels.png")

Please note that Landsat 8 Level 1 products provide raw digital numbers that can be converted into Top-of-the-Atmosphere (TOA) reflectance values. On the contrary, Landsat 8 Level 2 product provide surface reflectance (SR) values.

If you want to know more about this topic, it is a good idea to visit: - Landsat Level 1 site - Landsat Collection 2 - Landsat Collection 2 Level 2.

Let’s download the selected level (e.g. the NDVI index calculated using surface reflectance values). Note that this task may take several minutes.

## Uncomment the code
## download record 6 with level "sr_ndvi" (will be processed on demand by ESPA)
## files <- getLandsat_data(records = records_filtered[6,], level = "sr_ndvi", source = "auto")

The following image shows the message received after running the above chunk of code:

The following image shows details of the downloaded file:

In case you want to download Landsat Level 1 products, use:

## download record 5 with level "l1" (will direct to Amazon Web Services automaticaly)
## files <- getLandsat_data(records = records_filtered[6,], level = "l1", source = "auto")

In case you want to download surface reflectance products (i.e. all Level 2 bands), use:

## download record 5 with level "sr" (will be processed on demand by ESPA)
## files <- getLandsat_data(records = records_filtered[6,], level = "sr", source ## = "auto")

4.2 Landsat unzip, crop, and view

Note that I have moved the downloaded file to an upper directory (i.e ./datos), and that I have change the original name to a shorter one (i.e. sr_ndvi.tar.gz).

## Uncompress the downloaded file 
## untar("./datos/sr_ndvi.tar.gz")

The following image shows the uncompressed files:

Note that the NDVI values are stored into the file with prefix *_sr_ndvi.tif* (111.7 Mbytes). The .xml file contains metadata. The *_pixel_qa.tif* describes pixel quality.

Now, we will be use the raster library to read the NDVI:

## Finally, read the Landsat data
ndvi <- raster("./datos/LC08_L1TP_009053_20200329_20200409_01_T1_sr_ndvi.tif")

What is ndvi?

## 
ndvi 
## class      : RasterLayer 
## dimensions : 7731, 7571, 58531401  (nrow, ncol, ncell)
## resolution : 30, 30  (x, y)
## extent     : 411885, 639015, 1002585, 1234515  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=18 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0 
## source     : /cloud/project/datos/LC08_L1TP_009053_20200329_20200409_01_T1_sr_ndvi.tif 
## names      : LC08_L1TP_009053_20200329_20200409_01_T1_sr_ndvi 
## values     : -32768, 32767  (min, max)

Note that the Landsat 8-based NDVI is projected using the UTM 18N coordinate reference system.

Now, it is time for cropping the NDVI:

## if necessary, reproject AOI to file CRS
aoi <- st_transform(montes, crs(ndvi))

Let’s check the result:

## 
crs(aoi)
## [1] "+proj=utm +zone=18 +datum=WGS84 +units=m +no_defs "

Finally, let’s conduct the cropping task:

## crop NDVI to a new file
## uncomment for running
## r_crop <- cropFAST("./datos/LC08_L1TP_009053_20200329_20200409_01_T1_sr_ndvi.tif", 
##                   ext = aoi, filename = "./datos/NDVI_montes.tiff")

Read the cropped file:

## Read the cropped NDVI
#r <- raster("./datos/NDVI_montes.tiff")

Check the contents:

## What is r?
#r 

Note file size corresponding to the cropped NDVI. In my case, it is just 1.4 KB!

It seems that there is something wrong with the cropFAST function.

We will continue working with the original NDVI. Let’s aggregate its raster cells to lower memory usage:

#r2 <- aggregate(ndvi, fact=4, fun=mean, expand=FALSE, na.rm=TRUE)

What is inside the aggregated NDVI?

## What is r2?
#r2 

Normalization of NDVI values between 0 and 255:

normalize <- function(x) {
  min <- raster::minValue(x)
  max <- raster::maxValue(x)
  return(255* (x - min) / (max - min))
}
#r3 <- normalize(r2)

Now, let’s convert the RasterLayer object (i.e. the cropped NDVI) into an image and add it to Leaflet map using the addRasterImage function:

#library(leaflet)
#library(RColorBrewer)
#pal <- colorNumeric(c("red", "orange", "yellow", "green", "darkgreen"), values(r3),
#  na.color = "transparent")
#leaflet() %>% addTiles() %>%
#  addRasterImage(r3, colors = pal, opacity = 0.8) %>%
#  addLegend(pal = pal, values = values(r3),
#    title = "NDVI from Landsat 8 - 29.03.2020")

This is the result:

Now, it is your turn to replicate, adapt and improve this notebook. Use your own study area as defined in the course. Good luck!!!