Open Source Spatial Analytics - A15

Sarah Woolard

4/22/2021

Methods:

This model predicts the percentage of forest canopy cover for each cell of a 30-meter-resolution Landsat image using four predictor variables derived from the Landsat bands:

  • Normalized Difference Vegetation Index (NDVI)
  • Brightness
  • Greenness
  • Wetness
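
As a point of reference, NDVI is conventionally computed from the near-infrared and red bands as (NIR - Red) / (NIR + Red). The sketch below illustrates that formula in base R. The reflectance values and the helper name `calc_ndvi` are made up for illustration; the raster stack used in this assignment already contains the NDVI layer.

```r
# Hypothetical helper: NDVI from near-infrared and red reflectance.
calc_ndvi <- function(nir, red) {
  (nir - red) / (nir + red)
}

# Made-up reflectance values for three cells
# (dense canopy, sparse canopy, nearly bare ground).
nir <- c(0.50, 0.30, 0.20)
red <- c(0.05, 0.10, 0.18)

ndvi <- calc_ndvi(nir, red)
print(round(ndvi, 3))
```

Denser, healthier vegetation reflects more near-infrared and less red light, which pushes the index toward 1.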

Training and validation data are provided to fit the model and to assess how accurately it predicts percent canopy cover before the predictions are applied to a raster image. A raster stack containing all four predictor variables is also included, along with a forest mask layer that restricts predictions to the forested extent. The model uses the support vector machine (SVM) algorithm with a radial kernel, a tune length of 10, and 5-fold cross-validation, and is optimized with respect to the RMSE metric. The fitted SVM model is used to predict the validation data, from which MSE and RMSE are calculated, and is then applied to the raster stack. Before prediction, the four layers of the stack are renamed ndvi, Brightness, Greeness, and Wetness to match the predictor names in the training data. Finally, the prediction raster is multiplied by the forest mask so that the final map reports predicted percent canopy cover only within forested areas.
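
To make the cross-validation step concrete, the sketch below walks through 5-fold cross-validation by hand in base R. It is only an illustration: the data are simulated, and a linear model (`lm`) stands in for the SVM so that the example runs without extra packages. In the assignment itself, caret's `train()` handles the fold splitting and tuning internally.

```r
set.seed(34)

# Simulated data standing in for the training table (not the assignment data).
n <- 100
sim <- data.frame(ndvi = runif(n, 0.2, 0.9))
sim$pCC <- 20 + 70 * sim$ndvi + rnorm(n, sd = 5)

# Randomly assign each row to one of k folds of equal size.
k <- 5
fold_id <- sample(rep(seq_len(k), length.out = n))

# For each fold: fit on the other k - 1 folds, then score on the held-out fold.
fold_rmse <- sapply(seq_len(k), function(i) {
  fit  <- lm(pCC ~ ndvi, data = sim[fold_id != i, ])
  pred <- predict(fit, newdata = sim[fold_id == i, ])
  sqrt(mean((sim$pCC[fold_id == i] - pred)^2))
})

# The cross-validated RMSE is the average over the k held-out folds.
cv_rmse <- mean(fold_rmse)
print(fold_rmse)
print(cv_rmse)
```

Every row is held out exactly once, so the averaged RMSE reflects performance on data the model did not see during fitting.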

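# Packages used in this report
library(ggplot2)    # scatterplots
library(cowplot)    # plot_grid()
library(caret)      # train(), trainControl()
library(Metrics)    # assumed source of rmse() and mse()
library(raster)     # raster(), predict() on raster stacks
library(tmap)       # map layout
library(tmaptools)  # get_brewer_pal()

NDVI <- ggplot(train, aes(x=ndvi, y=pCC)) + 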
  geom_point(color="black", fill = "tan", size = .75, shape=21) + 
  ggtitle("NDVI & Percentage Canopy Cover (pCC)") +
  theme(plot.title = element_text(size=8)) 

bright <- ggplot(train, aes(x = Brightness, y = pCC)) + 
  geom_point(color = "black", fill ="yellow", size = .7, shape = 21) + 
  ggtitle("Brightness & Percentage Canopy Cover (pCC)") +
  theme(plot.title = element_text(size=8)) 

green <- ggplot(train, aes(x = Greeness, y = pCC)) + 
  geom_point(color ="black", fill= "green", size = .7, shape = 21) +
  ggtitle("Greenness & Percentage Canopy Cover (pCC)") +
  theme(plot.title = element_text(size=8)) 

wet <- ggplot(train, aes(x = Wetness, y = pCC)) + 
  geom_point(color = "black", fill = "blue", size = .7, shape = 21) +
  ggtitle("Wetness & Percentage Canopy Cover (pCC)") +
  theme(plot.title = element_text(size=8)) 

plot_grid(NDVI, bright, green, wet)

Results/Discussion:

Greater NDVI values, which represent “healthier” vegetation, are associated with higher percent canopy cover. Brightness and percent canopy cover show a slight positive relationship, but overall increased brightness appears to have little effect on pCC. Greenness behaves similarly: higher greenness values tend to coincide with higher pCC, but the relationship is not strong. Wetness, like NDVI, has a strong positive relationship with percent canopy cover: as wetness increases, so does pCC. Of the four predictor variables, NDVI and wetness have the strongest relationships with percent canopy cover.
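
One way to put numbers on these visual impressions is the Pearson correlation between each predictor and pCC. The sketch below uses simulated data, since the training table is not reproduced in this report. The column names mirror those in the plots, but the values and the strengths of association are invented for illustration.

```r
set.seed(34)

# Simulated stand-in for the training table (values are invented).
n <- 200
pCC <- runif(n, 10, 95)
sim <- data.frame(
  pCC        = pCC,
  ndvi       = 0.2 + 0.007 * pCC + rnorm(n, sd = 0.03),  # strong association
  Brightness = 0.3 + 0.001 * pCC + rnorm(n, sd = 0.10)   # weak association
)

# Pearson correlation of each predictor with pCC.
predictors <- setdiff(names(sim), "pCC")
cors <- sapply(predictors, function(v) cor(sim[[v]], sim$pCC))
print(round(cors, 2))
```

Assuming the real training table has the same column layout, replacing `sim` with `train` would give the corresponding values for the actual data.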

set.seed(34)
trainctrl_1 <- trainControl(method = "cv", number = 5, verboseIter = FALSE)

set.seed(34)
svm.model <- train(pCC ~ ., data = train, method = "svmRadial",
                   tuneLength = 10, preProcess = c("center", "scale"),
                   trControl = trainctrl_1, metric = "RMSE")
svm.predict <- predict(svm.model, val)

svm_rmse <- rmse(val$pCC, svm.predict)
svm_mse <- mse(val$pCC, svm.predict)

names(image) <- c("ndvi", "Brightness", "Greeness", "Wetness")
predict(image, svm.model, overwrite = TRUE, filename = "C:/SW/Grad_WVU/693c/A15/predict_image.img")
## class      : RasterLayer 
## dimensions : 503, 660, 331980  (nrow, ncol, ncell)
## resolution : 30, 30  (x, y)
## extent     : 610997, 630797, 4314959, 4330049  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=17 +datum=NAD83 +units=m +no_defs 
## source     : C:/SW/Grad_WVU/693c/A15/predict_image.img 
## names      : predict_image 
## values     : 14.26239, 88.73397  (min, max)
raster_result <- raster("C:/SW/Grad_WVU/693c/A15/predict_image.img")
result_mask <- raster_result*mask
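
The effect of multiplying by the mask depends on how non-forest cells are coded, which this report does not state. The base R sketch below uses small matrices in place of rasters (an assumption made so that it runs without the raster package) to show the difference: a mask coded 1/0 turns non-forest cells into 0, while a mask coded 1/NA turns them into NA.

```r
# Made-up 2 x 3 grid of predicted canopy cover (%).
pred <- matrix(c(80, 65, 40, 25, 90, 55), nrow = 2)

# Forest mask coded 1 = forest, 0 = non-forest (assumed coding).
mask01 <- matrix(c(1, 0, 1, 1, 0, 1), nrow = 2)

# Same mask with non-forest cells coded NA instead of 0.
maskNA <- ifelse(mask01 == 1, 1, NA)

masked_zero <- pred * mask01  # non-forest cells become 0
masked_na   <- pred * maskNA  # non-forest cells become NA

print(masked_zero)
print(masked_na)
```

With a 1/0 mask, the resulting zeros fall into the lowest legend class of a map and may be read as very low canopy cover, whereas NA cells are treated as missing.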

The reported mean square error (MSE) is 44.78. Its units are percent squared because it is the average of the squared differences between predicted and actual percent canopy cover. The root mean square error (RMSE) is 6.69. Because it is the square root of the MSE, its units are percentage points, the same units as pCC.

RMSE is the square root of the mean of the squared residuals, which here are the differences between the model's predicted pCC and the actual pCC in the validation data. A lower RMSE indicates a more accurate model. An RMSE of 6.69 means that predictions typically differ from the observed canopy cover by roughly 7 percentage points, which suggests a notable amount of error and uncertainty in predicting percent canopy cover in forests.
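
The relationship between the two metrics can be checked directly. The sketch below computes MSE and RMSE by hand in base R on a few made-up values (not the validation data), and then confirms that the square root of the reported MSE matches the reported RMSE.

```r
# Made-up actual and predicted canopy cover values (%), for illustration only.
actual    <- c(50, 62, 71, 85)
predicted <- c(55, 60, 65, 90)

errors <- actual - predicted
mse_manual  <- mean(errors^2)    # units: percent squared
rmse_manual <- sqrt(mse_manual)  # units: percentage points

print(mse_manual)
print(rmse_manual)

# Consistency check on the reported values: sqrt(44.78) rounds to 6.69.
reported_check <- round(sqrt(44.78), 2)
print(reported_check)
```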

tm_shape(result_mask) + 
  tm_raster(style = "pretty", n = 7, title = "Predicted Canopy Cover (%)",
            palette = get_brewer_pal("YlGn", plot = FALSE)) + 
  tm_layout(legend.outside = TRUE,
            main.title = "Prediction of Percent Canopy Cover (pCC) in Forested Areas",
            main.title.size = 1, main.title.fontface = "bold") +
  tm_compass(position = c(0.02, "bottom"), bg.color = "white", bg.alpha = .75) + 
  tm_scale_bar(position = c(0.14, "bottom"), bg.color = "white", bg.alpha = .75)