Spatial data science with R

Exercises for Meeting May 24 2019

Work on the exercise for Data Cubes Exercises described in https://keen-swartz-3146c4.netlify.com/raster.html#exercises-2

Exercise 1

NDVI, normalized differenced vegetation index, is computed as (NIR-R)/(NIR+R), with NIR the near infrared and R the red band. Read the L7_ETMs.tif file into object x, and distribute the band dimensions over attributes by split(x, "band"). Then, compute NDVI by using an expression that uses the NIR (band 4) and R (band 3) attributes directly.

tif = system.file("tif/L7_ETMs.tif", package = "stars")
library(stars)

## Loading required package: abind

## Loading required package: sf

## Linking to GEOS 3.6.1, GDAL 2.2.3, PROJ 4.9.3

library(sf)
x = read_stars(tif)
x

## stars object with 3 dimensions and 1 attribute
## attribute(s):
##   L7_ETMs.tif    
##  Min.   :  1.00  
##  1st Qu.: 54.00  
##  Median : 69.00  
##  Mean   : 68.91  
##  3rd Qu.: 86.00  
##  Max.   :255.00  
## dimension(s):
##      from  to  offset delta                       refsys point values    
## x       1 349  288776  28.5 +proj=utm +zone=25 +south... FALSE   NULL [x]
## y       1 352 9120761 -28.5 +proj=utm +zone=25 +south... FALSE   NULL [y]
## band    1   6      NA    NA                           NA    NA   NULL

plot(x)

per_band=split(x, "band")

ndvi_brazil <- (per_band[4,] - per_band[3,]) / (per_band[4,] + per_band[3,])
plot(ndvi_brazil,col=colorRampPalette(c("red","yellow","green"))(50),main="NDVI for Brazil Imagery")

Exercise 2

Compute NDVI for the S2 image, using st_apply and an a function ndvi = function(x) (x[4]-x[3])/(x[4]+x[3]). Plot the result, and write the result to a GeoTIFF. Explain the difference in runtime between plotting and writing.

#WARNING!

#The next file size is around 1GB. Uncomment the line if you DO NOT already have the file in your local machine!

#install.packages("starsdata", repos = "http://pebesma.staff.ifgi.de", type = "source")

granule = system.file("sentinel/S2A_MSIL1C_20180220T105051_N0206_R051_T32ULE_20180221T134037.zip", package = "starsdata")
base_name = strsplit(basename(granule), ".zip")[[1]]
s2 = paste0("SENTINEL2_L1C:/vsizip/", granule, "/", base_name, ".SAFE/MTD_MSIL1C.xml:10m:EPSG_32632")
(p = read_stars(s2, proxy = TRUE))

## stars_proxy object with 1 attribute in file:
## $`MTD_MSIL1C.xml:10m:EPSG_32632`
## [1] "SENTINEL2_L1C:/vsizip/C:/Users/Violet/Documents/R/win-library/3.5/starsdata/sentinel/S2A_MSIL1C_20180220T105051_N0206_R051_T32ULE_20180221T134037.zip/S2A_MSIL1C_20180220T105051_N0206_R051_T32ULE_20180221T134037.SAFE/MTD_MSIL1C.xml:10m:EPSG_32632"
## 
## dimension(s):
##      from    to offset delta                       refsys point values    
## x       1 10980  3e+05    10 +proj=utm +zone=32 +datum...    NA   NULL [x]
## y       1 10980  6e+06   -10 +proj=utm +zone=32 +datum...    NA   NULL [y]
## band    1     4     NA    NA                           NA    NA   NULL

ndvi = function(x) (x[4]-x[3])/(x[4]+x[3])
ndvi_s2<-st_apply(p, c("x", "y"), ndvi)
plot(ndvi_s2, col=colorRampPalette(c("red","yellow","green"))(50),main="NDVI for Sentinel Imagery")

start.time <- Sys.time()
write_stars(p, "p2.tif")

## ===========================================================================

end.time <- Sys.time()
time.takenG <- end.time - start.time
time.takenG

## Time difference of 1.983504 mins

start.time <- Sys.time()
plot(ndvi_s2, col=colorRampPalette(c("red","yellow","green"))(50),main="NDVI for Sentinel Imagery")

end.time <- Sys.time()
time.takenP <- end.time - start.time
time.takenP

## Time difference of 7.168569 secs

The time for the GeoTIFF generation is 1.98350378274918 mins and for the plot 7.16856908798218 secs. These differences are because of the pixel size and image resolution for the outputs. Meanwhile, the plot shows a basic result that renders according to zoom levels, the TIFF image (apart from all the metadata stored?) contains the complete resolution values for the image.

Exercise 3

Use st_transform to transform the stars object read from L7_ETMs.tiff to EPSG 4326. Print the object. Is this a regular grid? Plot the first band using arguments axes=TRUE and borders=NA, and explain why this takes such a long time.

l7_transform<-st_transform(x,crs =4326)
l7_transform

## stars object with 3 dimensions and 1 attribute
## attribute(s):
##   L7_ETMs.tif    
##  Min.   :  1.00  
##  1st Qu.: 54.00  
##  Median : 69.00  
##  Mean   : 68.91  
##  3rd Qu.: 86.00  
##  Max.   :255.00  
## dimension(s):
##      from  to offset delta                       refsys point
## x       1 349     NA    NA +proj=longlat +datum=WGS8... FALSE
## y       1 352     NA    NA +proj=longlat +datum=WGS8... FALSE
## band    1   6     NA    NA                           NA    NA
##                               values    
## x    [349x352] -34.9165,...,-34.8261 [x]
## y     [349x352] -8.0408,...,-7.94995 [y]
## band                            NULL    
## curvilinear grid

It is not a regular grid. It is a curvilinear grid.

start.time <- Sys.time()
plot(l7_transform[,,,1],axes=TRUE,borders=NA)

end.time <- Sys.time()
time.takenL7 <- end.time - start.time
time.takenL7

## Time difference of 34.45923 secs

The computing time for this occasion is 34.4592349529266 secs. It takes much time because of the computation of the values for each pixel to render it properly in a lower resolution by scaling the result to fit the user screen. First time running the exercise in local machine took 2 minutes, the second time around 15 seconds, this may be due to the cache store in the workspace.

Exercise 4

Use st_warp to warp the L7_ETMs.tif object to EPSG 4326, and plot the resulting object with axes=TRUE. Why is the plot created much faster than after st_transform?

warp_trans<-st_warp(x,crs=4326)
plot(warp_trans[,,,1],axes=TRUE,borders=NA)

start.time <- Sys.time()
plot(warp_trans[,,,1],axes=TRUE,borders=NA)

end.time <- Sys.time()
time.takenwarp <- end.time - start.time
time.takenwarp

## Time difference of 2.156435 secs

This plot is much faster, it only took 2.1564347743988 secs. This is because the st_transform function leads to lossless transformation[3]. It keeps the image data integration: For gridded spatial data, a curvilinear grid with a transformed grid cell (centers) is returned [3]. In the other hand, st_wraps converts the object into a regular grid in the new CRS. The st_wraps function implies a resampling is done over the image, causing the loss of data that can not be reversed[3].

References

[1] Spatial Data Science.Edzer Pebesma, Roger Bivand

[2] R Spatial: Plotting and subsetting stars objects

[3] R Stars Library Documentation