1 Load library

library(sp)
library(rgdal)
library(raster)
library(ggplot2)
library(viridis)
library(rasterVis)

2 Load data

# Load data
tay <- raster('taycrop.tif')
# Get properties of the Tay raster
tay
## class      : RasterLayer 
## band       : 1  (of  12  bands)
## dimensions : 507, 848, 429936  (nrow, ncol, ncell)
## resolution : 9.217891e-05, 9.217891e-05  (x, y)
## extent     : -4.320218, -4.242051, 56.45366, 56.50039  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs 
## source     : taycrop.tif 
## names      : taycrop

3 Create individual raster layers for each of the spectral bands

b1 <- raster('taycrop.tif', band=1)
b2 <- raster('taycrop.tif', band=2)
b3 <- raster('taycrop.tif', band=3)
b4 <- raster('taycrop.tif', band=4)
b5 <- raster('taycrop.tif', band=5)
b6 <- raster('taycrop.tif', band=6)
b7 <- raster('taycrop.tif', band=7)
b8 <- raster('taycrop.tif', band=8)
b9 <- raster('taycrop.tif', band=9)
b10 <- raster('taycrop.tif', band=10)
b11 <- raster('taycrop.tif', band=11)
b12 <- raster('taycrop.tif', band=12)

4 Compare brands

We can now compare two bands to see if they have the same extent, number of rows and column, projection, resolution and origin. As can be seen below, bands 2 and 3 match.

compareRaster(b2, b3)
## [1] TRUE

Note: Checking the coordinate systems and extents of rasters is a very useful skill - quite often when you have problems with working with multiple raster objects, it’s because of differences in coordinate systems or extents.

5 Plot brand

Using the plot or image function

plot(b8)

image(b8)

Using zoom for a specific area:

plot(b8)
zoom(b8)    # run this line, then click twice on your plot to define a box

Alternatively, an extent can be cropped and plotted from the plot image using the same double click method described above and the code below. By using drawExtent and crop, Zooming in allows you to visualise spatial data for specific areas you might be interested in:

plot(tay)
e <- drawExtent()    # run this line, then click twice on your plot to define a box

cropped_tay <- crop(b7, e)
plot(cropped_tay)

6 Visualise spectral bands

6.1 Visualise single layer

6.1.1 Plot with viridis

image(b8, col= viridis_pal(option="D")(10), main="Sentinel 2 image of Loch Tay")

6.1.2 Save plot

png('tayplot.png', width = 4, height = 4, units = "in", res = 300)                
# to save plot
image(b8, col= viridis_pal(option="D")(10), main="Sentinel 2 image of Loch Tay")

dev.off()                                           
## quartz_off_screen 
##                 2

dev.off() is a function that “clears the slate”. it just means you are done using that specific plot. if you don’t dev.off(), that can create problems when you want to save another plot

6.2 Visualise a multi-layered raster object

6.2.1 Solution1: Create a raster stack

First, we will create a raster stack, a multi-layered raster object, of the red(b4), green(b3) and blue(b2) bands (red, green, and blue (RGB) values)

To save plot, using png() function together with dev.off():

# this code specifies how we want to save the plot
png('RGB.png', width = 5, height = 4, units = "in", res = 300)
tayRGB <- stack(list(b4, b3, b2))              # creates raster stack
plotRGB(tayRGB, axes = TRUE, stretch = "lin", main = "Sentinel RGB colour composite")

dev.off()
## quartz_off_screen 
##                 2

6.2.2 Solution2: false colour composite (FCC)

Another popular way to visualise remote sensing data is using a false colour composite (FCC) (to hop mau gia), where the red, green, and blue bands have been replaced in order to accentuate vegetation (tham thuc vat)

In a FCC: 1. The red bands is replaced by the near infrared band (band 8 in Sentinel 2), 2. The green band by red and 3. The blue band by green.

This creates an image where the vegetation stands out in red.

Check (help(plotRGB)) for more information and other arguments for the function.

Now, let’s plot 1 band, for example Band 8:

# Plot band 8

gplot(b8) +
  geom_raster(aes(x = x, y = y, fill = value)) +
  # value is the specific value (of reflectance) each pixel is associated with
  scale_fill_viridis_c() +
  coord_quickmap() +
  ggtitle("West of Loch tay, raster plot") +
  xlab("Longitude") +
  ylab("Latitude") +
  theme_classic() +                         # removes defalut grey background
  theme(plot.title = element_text(hjust = 0.5),             # centres plot title
        text = element_text(size=20),                   # font size
        axis.text.x = element_text(angle = 90, hjust = 1))

ggsave("ggtay.png", scale = 1.5, dpi = 300)         # to save plot

To save plot, by using ggplot2 we dont need to use dev.off()

Now, let’s plot all bands

# create a stack of all the bands
t <- stack(b1,b2, b3, b4, b5, b6, b7, b8, b9, b10, b11, b12)
gplot(t) +
  geom_raster(aes(x = x, y = y, fill = value))+
  scale_fill_viridis_c() +
  facet_wrap(~variable) +
  coord_quickmap()+
  ggtitle("Sentinel 2 Loch tay, raster plots") +
  xlab("Longitude") +
  ylab("Latitude") +
  theme_classic() +
  theme(text = element_text(size=20),
        axis.text.x = element_text(angle = 90, hjust = 1)) +
  theme(plot.title = element_text(hjust = 0.5))

ggsave("allbands.png", scale = 1.5, dpi = 300)      # to save plot

A quick plot:

s_tay <- brick('taycrop.tif')
plot(s_tay)

7 Manipulate rasters: NDVI and KMN classification

7.1 The Normalised Difference Vegetation Index (NDVI)

The Normalised Difference Vegetation Index (NDVI) is a widely used vegetation index that quantifies vegetation presence, health or structure. It is calculated using the Near Infrared (NIR) and Red bandwith of the spectrum. Healthy vegetation reflects light strongly in the NIR part of the spectrum and absorbs light in red part of the visible spectrum for photosynthesis.

A high ratio between light refected in the NIR part of the spectrum and light reflected in the red part of the spectrum would represent areas that potentially have healthy vegetation. It is worth noting that different plant species absorb light in the red part of the spectrum at different rates. The same plant will also absorb light in the red band differently depending on whether it is stressed or healthy, or the time of year. It is often used over large areas as an indication of land cover change.

The NDVI ratio is calculated using (NIR - Red) / (NIR + Red). For example, a pixel with an NDVI of less than 0.2 is not likely to be dominated by vegetation, and an NDVI of 0.6 and above is likely to be dense vegetation.

In R, we can calculate the NDVI by creating a function and using raster math operations where NIR = band 8 and Red = band 4 in Sentinel 2 images. We will first use the raster brick we created earlier from the original file.

7.1.1 Create NDVI function

# Created a VI function (vegetation index)
VI <- function(img, k, i) {
  bk <- img[[k]]
  bi <- img[[i]]
  vi <- (bk - bi) / (bk + bi)
  return(vi)
}

# For Sentinel 2, the relevant bands to use are:
# NIR = 8, red = 4

7.1.2 Plot

ndvi <- VI(s_tay, 8, 4) # 8 and 4 refer to the bands we'll use
plot(ndvi, col = rev(terrain.colors(10)), main = 'Sentinel 2, Loch Tay-NDVI')

7.1.3 Plot and save

ndvi <- VI(s_tay, 8, 4)

# png() > plot() > dev.off 
png('ndviplot.png', width = 4, height = 4, units = "in", res = 300)
plot(ndvi, col = rev(terrain.colors(10)), main = 'Sentinel 2, Loch Tay-NDVI')
dev.off()
## quartz_off_screen 
##                 2

7.1.4 Histogram

To find out the distribution of the pixel NDVI values, we can plot a histogram.

# Create histogram of NDVI data and save: # png() > plot() > dev.off 
png('ndvihist.png', width = 4, height = 4, units = "in", res = 300)

hist(ndvi,
     main = "Distribution of NDVI values",
     xlab = "NDVI",
     ylab= "Frequency",
     col = "aquamarine3", #color in R 
     xlim = c(-0.5, 1),
     breaks = 30,
     xaxt = 'n') # stat frequency 
axis(side = 1, at = seq(-0.5,1, 0.05), labels = seq(-0.5,1, 0.05))

dev.off()
## quartz_off_screen 
##                 2
# Create histogram of NDVI data, not save 
hist(ndvi,
     main = "Distribution of NDVI values",
     xlab = "NDVI",
     ylab= "Frequency",
     col = "aquamarine3", #color in R 
     xlim = c(-0.5, 1),
     breaks = 30,
     xaxt = 'n') # stat frequency 
axis(side = 1, at = seq(-0.5,1, 0.05), labels = seq(-0.5,1, 0.05))

Note: The histogram is strongly skewed to the right, towards highh NDVI values, indicating a highly vegetated area.

Now that we know that this area has lots of vegetation, we can also mask (che) the pixels with an NDVI value of less than 0.4 (less likely to be vegetation) to highlight where the vegetated areas occur.

veg <- reclassify(ndvi, cbind(-Inf, 0.4, NA))
# We are reclassifying our object and making all values between negative infinity and 0.4 be NAs

Next: plot veg

plot(veg, main = 'Veg cover')

AND Next: plot veg & Save:

png('ndvimask.png', width = 4, height = 4, units = "in", res = 300)
plot(veg, main = 'Veg cover')
dev.off()
## quartz_off_screen 
##                 2

7.1.5 Save the raster itself, not just plots

To save a raster object, use the writeraster function:

writeRaster(x = ndvi,
            filename="/Users/admin/Desktop/Linh Data Studio/Raster data & Remote sensing/tay_ndvi_2018.tif",    
            format = "GTiff",                   # save as a tif
            datatype = 'INT2S',
            overwrite=TRUE)                     # save as a INTEGER rather than a float

7.2 KMN classification

In this context, unsupervised means that we are not using training data for the clustering. This type of classification can be useful when not a lot is known about an area.

In the example below, we are going to use the kmeans algorithm. The algorithm groups pixels that have similar spectral properties in the same cluster.

In general, we are going to create 10 clusters using the NDVI raster we have just created above, but first, we need to convert the raster into an array (getValues), which is the object format required for the classification.

Let go!

7.2.1 Convert raster to array for classification

We are going to create 10 clusters using the NDVI raster we have just created above, but first, we need to convert the raster into an array (getValues), which is the object format required for the classification.

# convert the raster to vector/matrix ('getValues' converts the RasterLAyer to array) )
nr <-getValues(ndvi)
str(nr)
##  num [1:429936] 0.791 0.791 0.785 0.783 0.783 ...

7.2.2 Set.seed

It is important to set the seed generator because kmeans initiates the centres in random locations. The seed generator just generates random numbers

set.seed(99)

7.2.3 Clustering

# create 10 clusters, allow 500 iterations, start with 5 random sets using 'Lloyd' method
kmncluster <- kmeans(na.omit(nr), centers = 10, iter.max = 500,
                     nstart = 5, algorithm = "Lloyd")

# kmeans returns an object of class 'kmeans'
str(kmncluster)
## List of 9
##  $ cluster     : int [1:429936] 7 7 7 7 7 7 7 4 4 7 ...
##  $ centers     : num [1:10, 1] -0.525 0.421 0.846 0.696 0.233 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : chr [1:10] "1" "2" "3" "4" ...
##   .. ..$ : NULL
##  $ totss       : num 156503
##  $ withinss    : num [1:10] 67.4 23.6 28.1 28.5 32.4 ...
##  $ tot.withinss: num 484
##  $ betweenss   : num 156020
##  $ size        : int [1:10] 4650 8155 71872 45902 10754 58153 58643 18762 4109 148936
##  $ iter        : int 287
##  $ ifault      : NULL
##  - attr(*, "class")= chr "kmeans"

Note:

Kmeans returns an object with 9 elements.

  1. The length of the cluster element within kmncluster is 429936 which is the same as the length of nr created from the ndvi object.

  2. The cell values of kmncluster$cluster range between 1 to 10 corresponding to the input number of clusters we provided in the kmeans() function.

  3. kmncluster$cluster indicates the cluster label for the corresponding pixel.

7.2.4 Convert arrary back to raster for visualisation

To visualise the results, we need to convert the kmncluster$cluster array back to a RasterLayer of the same dimension as the ndvi object.

# First create a copy of the ndvi layer
knr <- ndvi

# Now replace raster cell values with kmncluster$cluster
# array
knr[] <- kmncluster$cluster

knr
## class      : RasterLayer 
## dimensions : 507, 848, 429936  (nrow, ncol, ncell)
## resolution : 9.217891e-05, 9.217891e-05  (x, y)
## extent     : -4.320218, -4.242051, 56.45366, 56.50039  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs 
## source     : memory
## names      : layer 
## values     : 1, 10  (min, max)
# First create a copy of the ndvi layer
knr <- ndvi

# Alternative way to achieve the same result
values(knr) <- kmncluster$cluster
knr
## class      : RasterLayer 
## dimensions : 507, 848, 429936  (nrow, ncol, ncell)
## resolution : 9.217891e-05, 9.217891e-05  (x, y)
## extent     : -4.320218, -4.242051, 56.45366, 56.50039  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs 
## source     : memory
## names      : layer 
## values     : 1, 10  (min, max)

Plot with NDVI:

par(mfrow = c(1, 2))
plot(ndvi, col = rev(terrain.colors(10)), main = "NDVI")
plot(knr, main = "Kmeans", col = viridis_pal(option = "D")(10))

Plot with RGB:

par(mar = c(10.8, 5, 10.8, 2), mfrow = c(1, 2))
plotRGB(tayRGB, axes = TRUE, stretch = "lin", main = "RGB")
plot(knr, main = "Kmeans", yaxt = 'n', col = viridis_pal(option = "D")(10))

Plot with RGB & save:

png('rgb_kmeans.png', width = 10, height = 8, units = "in", res = 300)
par(mar = c(10.8, 5, 10.8, 2), mfrow = c(1, 2))
plotRGB(tayRGB, axes = TRUE, stretch = "lin", main = "RGB")
plot(knr, main = "Kmeans", yaxt = 'n', col = viridis_pal(option = "D")(10))
dev.off()
## quartz_off_screen 
##                 2