1. Introduction

This notebook illustrates how to calculate uniband and multiband image statistics using the raster library.

I will use a spatial and spectral subset of a Landsat 8 scene collected in 2013. The subset is a seven-band image which covers the area known as Montes de Maria, in the Colombian Caribbean region. I explained in a previous notebook how to find, download, and subset Landsat images.

2. Reading the image

Let’s use the raster library to create a raster object from a Landsat multispectral TIF file obtained in a previous notebook.

library(raster)
## Loading required package: sp
landsat13 <- brick("./landsat/montes-landsat.TIF")

Let’s check what is landsat13:

landsat13
## class      : RasterBrick 
## dimensions : 3392, 3409, 11563328, 7  (nrow, ncol, ncell, nlayers)
## resolution : 30, 30  (x, y)
## extent     : 422745, 525015, 1019775, 1121535  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=18 +datum=WGS84 +units=m +no_defs 
## source     : /Users/ivanlizarazo/Documents/ivan/PR/datos/landsat/montes-landsat.TIF 
## names      : ultra.blue,  blue, green,   red,   NIR, SWIR1, SWIR2 
## min values :          0,     0,     0,     0,     0,     0,     0 
## max values :      20788, 21783, 23294, 27555, 43270, 65535, 65535

We can see the spatial resolution, extent, number of layers, coordinate reference system and more.

3. Computing TOA reflectance

Note that digital values range from 0 to 65535. In order to obtain apparent reflectance values it is necessary to review the metadata and look for the conversion factors:

Reflectance conversion factors As the conversion factors are equal for bands 1 to 7, we can convert all bands in a single operation:

toa13 <-  -0.1 + landsat13*2.0E-5

Let’s check the output:

toa13
## class      : RasterBrick 
## dimensions : 3392, 3409, 11563328, 7  (nrow, ncol, ncell, nlayers)
## resolution : 30, 30  (x, y)
## extent     : 422745, 525015, 1019775, 1121535  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=18 +datum=WGS84 +units=m +no_defs 
## source     : /private/var/folders/l_/1nv5l0dx78g_d3k3_t85lrjr0000gn/T/RtmpDYCDt2/raster/r_tmp_2021-03-05_204931_10069_15267.grd 
## names      : ultra.blue,    blue,   green,     red,     NIR,   SWIR1,   SWIR2 
## min values :       -0.1,    -0.1,    -0.1,    -0.1,    -0.1,    -0.1,    -0.1 
## max values :    0.31576, 0.33566, 0.36588, 0.45110, 0.76540, 1.21070, 1.21070

4. Computing uniband statistics

In a previous notebook, we saw how to use terra for computing uniband stats. Let’s do it now with raster:

Mean value for all bands:

toa13_mean <- cellStats(toa13, 'mean')
toa13_mean
## ultra.blue       blue      green        red        NIR      SWIR1      SWIR2 
## 0.07200951 0.05691908 0.04736394 0.03298331 0.18072476 0.11212926 0.04841935

Standard deviation value for all bands:

toa13_sd <- cellStats(toa13, 'sd')
toa13_sd
## ultra.blue       blue      green        red        NIR      SWIR1      SWIR2 
## 0.05569004 0.05104050 0.04863306 0.04600959 0.12010867 0.09146298 0.05948233

First, image visualization:

par(mfrow=c(1, 2))
tmp <- stretch(toa13[[1]], minv=0, maxv=255, minq=0, maxq=1)
plot(tmp, col = gray(0:100 / 100), main='L8 2013 B1 - linear stretch', legend=FALSE, axes=FALSE)
tmp <- stretch(toa13[[2]], minv=0, maxv=255, minq=0, maxq=1)
plot(tmp, col = gray(0:100 / 100), main='L8 2013 B2 - linear stretch', legend=FALSE, axes=FALSE)

Now, the corresponding histograms:

hist(toa13, c(1,2), maxpixels=1000000, plot=TRUE, main="Histograms of B1 and B2", xlab= "TOA reflectance", ylab="Frequency", col="wheat3")

par(mfrow=c(1, 2))
tmp <- stretch(toa13[[3]], minv=0, maxv=255, minq=0, maxq=1)
plot(tmp, col = gray(0:100 / 100), main='L8 2013 B3 - linear stretch', legend=FALSE, axes=FALSE)
tmp <- stretch(toa13[[4]], minv=0, maxv=255, minq=0, maxq=1)
plot(tmp, col = gray(0:100 / 100), main='L8 2013 B4 - linear stretch', legend=FALSE, axes=FALSE)

hist(toa13, c(3,4), maxpixels=1000000, plot=TRUE, main="Histograms of B3 and B4", xlab= "TOA reflectance", ylab="Frequency", col="darkgreen")

par(mfrow=c(1, 2))
tmp <- stretch(toa13[[5]], minv=0, maxv=255, minq=0, maxq=1)
plot(tmp, col = gray(0:100 / 100), main='L8 2013 B5 - linear stretch', legend=FALSE, axes=FALSE)
tmp <- stretch(toa13[[6]], minv=0, maxv=255, minq=0, maxq=1)
plot(tmp, col = gray(0:100 / 100), main='L8 2013 B6 - linear stretch', legend=FALSE, axes=FALSE)

hist(toa13, c(5,6), maxpixels=1000000, plot=TRUE, main="Histograms of B5 and B6", xlab= "TOA reflectance", ylab="Frequency", col="springgreen")

The next tasks are for you:

For a conceptual explanation of these two measures see this link.

5. Computing multiband statistics

par(mfrow=c(1,2))
# Create an RGB image from the raster stack
plotRGB(toa13, r = 4, g = 3, b = 2,main='L8 2013 RGB432', stretch='lin',  colNA='white')
plotRGB(toa13, r = 5, g = 6, b = 4,main='L8 2013 RGB564', stretch='lin', colNA='white')

Multiband statistics include the covariance matrix, the correlation matrix, and pairwise scatterplots.

A pairwise scatterplot is usually accompanied by the corresponding correlation coefficient.

In remote sensing, the most common used correlation coefficient is the Pearson correlation (r), which measures a linear dependence between two bands (x and y). It is meaningful only when x and y are characterized by a normal distribution. The plot of y = f(x) is named the linear regression curve.

Let’s obtain pairwise plots and correlation coefficients:

pairs(toa13[[1:3]], hist=TRUE, cor=TRUE, use="pairwise.complete.obs", maxpixels=100000)

pairs(toa13[[4:6]], hist=TRUE, cor=TRUE, use="pairwise.complete.obs", maxpixels=100000)

Covariance is a measure of the relationship between two random variables. The metric evaluates how much – to what extent – the variables change together. In other words, it is essentially a measure of the variance between two variables. However, the metric does not assess the dependency between variables.

Unlike the correlation coefficient, covariance is measured in units. The units are computed by multiplying the units of the two variables. The variance can take any positive or negative values. The values are interpreted as follows:

The covariance formula is similar to the formula for correlation and deals with the calculation of data points from the average value in a dataset. For example, the covariance between two random variables X and Y can be calculated using the following formula (for population):

# Small fig.width
include_graphics("./pngs/Covariance.png")

Now, let’s create the covariance matrix for our Landsat image:

covmat <- cov(as.matrix(toa13))
# Note the values rescaling
round(covmat*1000,2)
##            ultra.blue blue green  red   NIR SWIR1 SWIR2
## ultra.blue       3.10 2.84  2.67 2.44  4.87  3.86  2.73
## blue             2.84 2.61  2.46 2.26  4.43  3.60  2.55
## green            2.67 2.46  2.37 2.20  4.57  3.71  2.57
## red              2.44 2.26  2.20 2.12  4.09  3.69  2.56
## NIR              4.87 4.43  4.57 4.09 14.43  9.29  5.59
## SWIR1            3.86 3.60  3.71 3.69  9.29  8.37  5.34
## SWIR2            2.73 2.55  2.57 2.56  5.59  5.34  3.54

Covariance and correlation both primarily assess the relationship between variables. The closest analogy to the relationship between them is the relationship between the variance and standard deviation.

Covariance measures the total variation of two random variables from their expected values. Using covariance, we can only gauge the direction of the relationship (whether the variables tend to move in tandem or show an inverse relationship). However, it does not indicate the strength of the relationship, nor the dependency between the variables.

On the other hand, correlation measures the strength of the relationship between variables. Correlation is the scaled measure of covariance. It is dimensionless. In other words, the correlation coefficient is always a pure value and not measured in any units.

The relationship between the two concepts can be expressed using the formula below:

# Small fig.width
include_graphics("./pngs/Correlation.png")

Now, let’s create the correlation matrix for our Landsat image:

#boxplot(toa13[[4:6]], maxpixels=100000)
cmat <- cor(as.matrix(toa13))
round(cmat, 2)
##            ultra.blue blue green  red  NIR SWIR1 SWIR2
## ultra.blue       1.00 1.00  0.99 0.95 0.73  0.76  0.82
## blue             1.00 1.00  0.99 0.96 0.72  0.77  0.84
## green            0.99 0.99  1.00 0.98 0.78  0.83  0.89
## red              0.95 0.96  0.98 1.00 0.74  0.88  0.94
## NIR              0.73 0.72  0.78 0.74 1.00  0.85  0.78
## SWIR1            0.76 0.77  0.83 0.88 0.85  1.00  0.98
## SWIR2            0.82 0.84  0.89 0.94 0.78  0.98  1.00

It would be nice to get the p-values for every correlation coefficient, isn’t?

Let’s try with a pair of bands:

cor.test(x=as.matrix(toa13[[4]]), y=as.matrix(toa13[[5]]), method="pearson")
## 
##  Pearson's product-moment correlation
## 
## data:  as.matrix(toa13[[4]]) and as.matrix(toa13[[5]])
## t = 3747.6, df = 11563326, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.7403116 0.7408321
## sample estimates:
##      cor 
## 0.740572

6. Further tasks

I explored here how to obtain and display bands statistics. However, I used the R old-style way for visualization. There may be much modern ways to make graphs and charts more informative, clear, and attractive. You can make many changes.

For example, we can use the corrplot package to create a graphical display of a correlation matrix, highlighting the most correlated variables in a data table.

In this plot, correlation coefficients are colored according to the value. Correlation matrix can be also reordered according to the degree of association between variables.

library(corrplot)
## corrplot 0.84 loaded
ncmat <- round(cmat, 2)
corrplot(ncmat, type = "upper", order = "hclust", 
         tl.col = "black", tl.srt = 45)

The function chart.Correlation()[ in the package PerformanceAnalytics], can be used to display a chart of a correlation matrix.

tmp1 <- as.matrix(sample(toa13,500000))
library(PerformanceAnalytics)
## Loading required package: xts
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
## 
## Attaching package: 'PerformanceAnalytics'
## The following object is masked from 'package:graphics':
## 
##     legend
chart.Correlation(tmp1, histogram=TRUE, pch=19)

The rasterVis package and/or the leaflet package may help for such purpose.

7. Reproducibility

sessionInfo()
## R version 4.0.4 (2021-02-15)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Mojave 10.14.6
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] PerformanceAnalytics_2.0.4 xts_0.12.1                
## [3] zoo_1.8-8                  corrplot_0.84             
## [5] png_0.1-7                  ggplot2_3.3.3             
## [7] knitr_1.31                 raster_3.4-5              
## [9] sp_1.4-5                  
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.6        bslib_0.2.4       compiler_4.0.4    pillar_1.4.7     
##  [5] jquerylib_0.1.3   highr_0.8         tools_4.0.4       digest_0.6.27    
##  [9] jsonlite_1.7.2    evaluate_0.14     lifecycle_1.0.0   tibble_3.0.6     
## [13] gtable_0.3.0      lattice_0.20-41   pkgconfig_2.0.3   rlang_0.4.10     
## [17] DBI_1.1.1         yaml_2.2.1        rgdal_1.5-23      xfun_0.21        
## [21] withr_2.4.1       dplyr_1.0.4       stringr_1.4.0     generics_0.1.0   
## [25] sass_0.3.1        vctrs_0.3.6       tidyselect_1.1.0  grid_4.0.4       
## [29] glue_1.4.2        R6_2.5.0          rmarkdown_2.7     purrr_0.3.4      
## [33] magrittr_2.0.1    scales_1.1.1      codetools_0.2-18  htmltools_0.5.1.1
## [37] ellipsis_0.3.1    assertthat_0.2.1  colorspace_2.0-0  quadprog_1.5-8   
## [41] stringi_1.5.3     munsell_0.5.0     crayon_1.4.1