Machine Learning for Land Cover Classification

Objectives

  1. To learn how to build models using machine learning and how to use such models for spatial prediction (e.g. land cover classification).

  2. To learn how to evaluate such models using test data.

  3. To become familiar with the caret package in R and the Random Forest classification algorithm.

Overview

This tutorial demonstrates the basic workflow of building, testing, tuning and deploying a model for land cover classification. The data is a Landsat 8 (OLI and TIRS) multispectral image covering Binduri District, Bawku Municipal and Pusiga District in north-eastern Ghana. The caret package in R provides a standard platform for working with a variety of classification algorithms. In this tutorial, however, we focus only on the Random Forest algorithm, which has gained wide use for spatial prediction in remote sensing. We will build a model with the Random Forest algorithm and use it to make spatial predictions on the Landsat 8 image of the Binduri, Bawku and Pusiga area. The 10 land cover classes digitized as regions of interest (roi) in QGIS serve as the response variable. The predictor variables are the blue, green, red, nir (near-infrared), swir1 (shortwave infrared 1) and swir2 (shortwave infrared 2) bands of the Landsat image, together with ndvi (Normalized Difference Vegetation Index) and dem (Digital Elevation Model).

Let’s load the following libraries.
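The package list itself is not shown in this text, so the following is only a sketch inferred from the functions the tutorial names: stack(), plotRGB() and predict() from raster, read_sf() from sf, and createDataPartition() and confusionMatrix() from caret. The randomForest package is assumed here as the backend for caret's Random Forest method.

```r
# Packages assumed from the functions used in this tutorial (an inference,
# not the author's original list).
pkgs <- c("raster", "sf", "caret", "randomForest")

# Check which of these packages are installed.
isInstalled <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missingPkgs <- pkgs[!isInstalled]

if (length(missingPkgs) > 0) {
  message("Install these first: ", paste(missingPkgs, collapse = ", "))
} else {
  # Attach every package so its functions can be called without a prefix.
  invisible(lapply(pkgs, library, character.only = TRUE))
}
```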

Let’s load our raster image file using the stack() function. NB: remember to set your working directory using setwd().
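A minimal sketch of this step follows. The file name is hypothetical, and the layer names assume that the file holds eight layers in the order in which the predictors are listed in the overview; adjust both to match your own data.

```r
# Hypothetical file name: replace with the path to your own image stack.
rasterPath <- "bawku_landsat8.tif"

if (file.exists(rasterPath)) {
  # stack() reads a multi-band file as a RasterStack, one layer per band.
  bawkuStack <- raster::stack(rasterPath)
  # Assumed layer order (an assumption, not confirmed by the source data).
  names(bawkuStack) <- c("blue", "green", "red", "nir",
                         "swir1", "swir2", "ndvi", "dem")
  print(bawkuStack)
} else {
  message("File not found: ", rasterPath,
          " (check setwd() and the file name)")
}
```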

This is a Landsat 8 satellite image covering the Binduri, Bawku Municipal and Pusiga districts of Ghana, downloaded from the USGS satellite data platform. Its path and row are 194 and 052 respectively. The regions of interest (roi) were digitized in QGIS from the Landsat image and Google Satellite imagery, with expert knowledge of the site and terrain. A total of 392 polygons were digitized and grouped into 10 land cover classes.

Let’s plot a 5,4,3 false colour composite of our study area (imageLayer) using the plotRGB() function.
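In Landsat 8 band numbering, bands 5, 4 and 3 are near-infrared, red and green. The sketch below wraps plotRGB() in a small helper (plotFalseColour is an illustrative name, not part of any package). Its default indices assume the stack is ordered blue, green, red, nir, and so on, which puts nir at position 4 rather than 5; change them if your stack is ordered differently.

```r
# Draw a false colour composite: NIR, red and green shown as red, green, blue.
# The default indices assume layers ordered blue, green, red, nir, ...
plotFalseColour <- function(img, nirBand = 4, redBand = 3, greenBand = 2) {
  raster::plotRGB(img, r = nirBand, g = redBand, b = greenBand,
                  stretch = "lin")  # linear stretch for better contrast
}
```

Calling plotFalseColour() on the raster stack loaded earlier draws the composite; healthy vegetation appears in shades of red.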

Let’s load our digitized polygons (shapefiles) using the read_sf() function.
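As a sketch of this step, assuming a hypothetical file name and assuming the land cover label is stored in a column called Class (the column name used for the response later in this tutorial):

```r
# Hypothetical file name: replace with the path to your digitized polygons.
roiPath <- "roi_training.shp"

if (file.exists(roiPath)) {
  roiTraining <- sf::read_sf(roiPath)
  # List the distinct land cover labels and count them.
  # "Class" is an assumed column name.
  print(unique(roiTraining$Class))
  print(length(unique(roiTraining$Class)))
} else {
  message("File not found: ", roiPath,
          " (check setwd() and the file name)")
}
```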

The output shows that there are indeed 10 unique classes.

It is also very important to make sure that our raster layer (bawkuStack) and the sf object (roiTraining) have the same coordinate reference system.
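One way to make this check is sketched below. The helpers sameCrs and alignCrs are illustrative names; they compare the two coordinate reference systems through sf::st_crs() and reproject the polygons only when the systems differ.

```r
# TRUE when the polygons and the raster share a coordinate reference system.
sameCrs <- function(roi, img) {
  sf::st_crs(roi) == sf::st_crs(raster::crs(img))
}

# Return the polygons unchanged if the CRS already matches, otherwise
# reproject them to the CRS of the raster.
alignCrs <- function(roi, img) {
  if (sameCrs(roi, img)) {
    return(roi)
  }
  sf::st_transform(roi, sf::st_crs(raster::crs(img)))
}
```

Reprojecting the polygons rather than the raster avoids resampling the pixel values.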

The two objects have the same coordinate reference system.

Data Partitioning

At this stage we partition our dataset into two: a training dataset and a test dataset. We train the model with the training data and evaluate its performance using the test data.

We will use the caret function createDataPartition() to split the data into training and testing sets.
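A minimal sketch of the split, assuming the pixel values extracted under the polygons are held in a data frame with a factor column named Class. The 70/30 proportion and the seed are assumptions for illustration, since the text does not state the values used, and splitTrainTest is an illustrative helper name.

```r
# Split a data frame into training and test sets, stratified by class so that
# each class keeps roughly the same share in both sets.
splitTrainTest <- function(df, p = 0.7, seed = 123) {
  set.seed(seed)  # make the random split reproducible
  idx <- caret::createDataPartition(df$Class, p = p, list = FALSE)
  list(trainData = df[idx, ], testData = df[-idx, ])
}
```

Because createDataPartition() samples within each class, rare classes are still represented in both sets.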

Let’s now look at the count of the various classes in trainData.

The proportion of each response class in the training data.

The data is clearly imbalanced, though not severely. This is very common in geospatial studies involving land cover analysis.

We will also look at the scale of the values of the various predictors by printing the first six (6) rows of trainData.

We can clearly see the different scales of measurement, especially for dem and ndvi. However, differences in scale are not an issue for tree-based classifiers (decision trees, Random Forest, bagging, etc.).
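Counts and proportions can both be obtained with base R's table() and prop.table(). The sketch below uses a small made-up label vector, not the tutorial's data, and classSummary is an illustrative helper name.

```r
# Tabulate class labels as counts and proportions.
classSummary <- function(classes) {
  counts <- table(classes)
  data.frame(Class = names(counts),
             n = as.integer(counts),
             proportion = as.numeric(prop.table(counts)))
}

# Made-up labels for illustration only.
toyClasses <- c("Dam", "River", "River", "Settlements", "Settlements",
                "Settlements", "Agriculture", "Agriculture")
print(classSummary(toyClasses))
```

With the real data, the call would be classSummary(trainData$Class).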

Model Training

At this stage, we are ready to build our model. As stated in the objectives, we will develop the model using the Random Forest algorithm.

Random Forest

Random Forest is an ensemble of decision trees. Each tree is grown on a random sample of the training data drawn with replacement (a bootstrap sample). It can be used for both regression and classification: regression when the response variable is numeric, and classification when the response variable is a factor (categorical). Each tree in the ensemble (forest) is a model on its own, and at each split within a tree a random subset of the predictor variables (of size mtry) is sampled as candidates for splitting. A new sample is passed down each tree until it reaches a terminal node, which assigns it to a class. Since every tree in the ensemble has equal weight, the votes from all trees are aggregated and the class with the most votes is the prediction.

From the brief explanation above, we have 2 hyperparameters we can tune.

  1. The number of trees (ntree).

  2. The number of predictor variables randomly sampled as candidates at each split (mtry). For classification, the default is the square root of the number of predictors, rounded down.
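For the eight predictors used here, the default mtry for classification works out as follows. The grid of candidate values is an assumption for illustration; the text does not state which values were searched.

```r
# The eight predictors named in the overview.
predictors <- c("blue", "green", "red", "nir", "swir1", "swir2", "ndvi", "dem")
nPredictors <- length(predictors)

# Default mtry for classification: floor of the square root of the
# number of predictors.
defaultMtry <- floor(sqrt(nPredictors))
print(defaultMtry)  # 2

# Assumed tuning grid: every mtry value from 2 up to all predictors.
mtryGrid <- expand.grid(mtry = 2:nPredictors)
print(mtryGrid$mtry)
```

Setting mtry equal to the number of predictors removes the random feature sampling, which makes the forest equivalent to bagging.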

NB: Random Forest handles multicollinearity quite well.

Let’s define our response and predictor variables.

After building the model, we will use the test data (not seen by the model) to provide an unbiased evaluation of the model. We use the Kappa Index as the main performance metric for choosing the best model.

It is recommended to use ntree >= 500.
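A sketch of the training call with caret follows. The use of 10-fold cross-validation, the mtry grid and the seed are assumptions, since the text only says that cross-validation was used; trainRf is an illustrative helper name.

```r
# Train a Random Forest with caret, tuning mtry by cross-validation and
# selecting the best value by Kappa.
trainRf <- function(trainData, predictors, response = "Class",
                    folds = 10, ntree = 500, seed = 123) {
  set.seed(seed)  # reproducible resampling
  ctrl <- caret::trainControl(method = "cv", number = folds)
  caret::train(x = trainData[, predictors],
               y = trainData[[response]],
               method = "rf",        # randomForest backend
               metric = "Kappa",     # metric used to pick the best mtry
               ntree = ntree,        # passed through to randomForest()
               trControl = ctrl,
               tuneGrid = expand.grid(mtry = 2:length(predictors)))
}
```

With the tutorial's data this would be called as rfModel <- trainRf(trainData, predictors), where predictors holds the eight predictor column names; printing rfModel shows the cross-validation results for each mtry.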

Cross-validation results for rfModel.

The cross-validation results show a high Kappa Index of 0.9725665 at mtry = 3. This indicates high concordance between the observed and predicted classes.

Model Evaluation using Test Data

We will now evaluate the model using the test data. The confusion matrix shows how well the model predicted the various classes in the test data. We will also compare the overall accuracy with the No Information Rate (the accuracy obtained by always predicting the most frequent class), which indicates whether the model is worth using.

Evaluation of rfModel using the test data.

rfPredict <- predict(rfModel,testData)
confusionMatrix(rfPredict,testData$Class)
Confusion Matrix and Statistics

                               Reference
Prediction                      Agriculture Agriculture_shallow_recession
  Agriculture                           183                             2
  Agriculture_shallow_recession           0                            43
  Burntland                               1                             0
  Dam                                     0                             0
  Plantation                              0                             0
  River                                   0                             0
  River_DriedUp                           2                             0
  Settlements                             0                             0
  Tree_Shrub_Savanna                      0                             0
  Wooded_Savanna                          0                             2
                               Reference
Prediction                      Burntland Dam Plantation River River_DriedUp
  Agriculture                           0   0          0     0             0
  Agriculture_shallow_recession         0   0          0     0             0
  Burntland                           127   0          0     0             0
  Dam                                   0  34          0     0             0
  Plantation                            0   0        165     0             0
  River                                 0   0          0    31             0
  River_DriedUp                         0   0          0     0            17
  Settlements                           1   0          0     0             0
  Tree_Shrub_Savanna                    0   0          0     0             0
  Wooded_Savanna                        0   0          0     0             0
                               Reference
Prediction                      Settlements Tree_Shrub_Savanna Wooded_Savanna
  Agriculture                             1                  0              0
  Agriculture_shallow_recession           0                  0              5
  Burntland                               0                  0              0
  Dam                                     0                  0              0
  Plantation                              0                  1              0
  River                                   0                  0              0
  River_DriedUp                           0                  0              0
  Settlements                           197                  0              0
  Tree_Shrub_Savanna                      0                163              6
  Wooded_Savanna                          0                  1             46

Overall Statistics
                                          
               Accuracy : 0.9786          
                 95% CI : (0.9678, 0.9865)
    No Information Rate : 0.1926          
    P-Value [Acc > NIR] : < 2.2e-16       
                                          
                  Kappa : 0.975           
                                          
 Mcnemar's Test P-Value : NA              

Statistics by Class:

                     Class: Agriculture Class: Agriculture_shallow_recession
Sensitivity                      0.9839                              0.91489
Specificity                      0.9964                              0.99490
Pos Pred Value                   0.9839                              0.89583
Neg Pred Value                   0.9964                              0.99592
Prevalence                       0.1809                              0.04572
Detection Rate                   0.1780                              0.04183
Detection Prevalence             0.1809                              0.04669
Balanced Accuracy                0.9902                              0.95490
                     Class: Burntland Class: Dam Class: Plantation Class: River
Sensitivity                    0.9922    1.00000            1.0000      1.00000
Specificity                    0.9989    1.00000            0.9988      1.00000
Pos Pred Value                 0.9922    1.00000            0.9940      1.00000
Neg Pred Value                 0.9989    1.00000            1.0000      1.00000
Prevalence                     0.1245    0.03307            0.1605      0.03016
Detection Rate                 0.1235    0.03307            0.1605      0.03016
Detection Prevalence           0.1245    0.03307            0.1615      0.03016
Balanced Accuracy              0.9955    1.00000            0.9994      1.00000
                     Class: River_DriedUp Class: Settlements
Sensitivity                       1.00000             0.9949
Specificity                       0.99802             0.9988
Pos Pred Value                    0.89474             0.9949
Neg Pred Value                    1.00000             0.9988
Prevalence                        0.01654             0.1926
Detection Rate                    0.01654             0.1916
Detection Prevalence              0.01848             0.1926
Balanced Accuracy                 0.99901             0.9969
                     Class: Tree_Shrub_Savanna Class: Wooded_Savanna
Sensitivity                             0.9879               0.80702
Specificity                             0.9930               0.99691
Pos Pred Value                          0.9645               0.93878
Neg Pred Value                          0.9977               0.98876
Prevalence                              0.1605               0.05545
Detection Rate                          0.1586               0.04475
Detection Prevalence                    0.1644               0.04767
Balanced Accuracy                       0.9905               0.90196

A Kappa Index of 0.975 indicates high concordance between the observed and the predicted classes. The accuracy of 0.9786, with a 95% confidence interval of (0.9678, 0.9865), is also very encouraging compared with the No Information Rate of 0.1926. We can also see high values of sensitivity and specificity for most classes of the response variable; the lowest sensitivity is 0.807, for Wooded_Savanna.
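The headline statistics can be checked by hand from the rfModel confusion matrix printed above. The sketch below re-enters that matrix (rows are predictions, columns are the reference classes) and recomputes the accuracy, the No Information Rate, the Kappa Index and the per-class sensitivity using only base R. Kappa is computed as (observed agreement - expected agreement) / (1 - expected agreement). The variable names are illustrative.

```r
classes <- c("Agriculture", "Agriculture_shallow_recession", "Burntland",
             "Dam", "Plantation", "River", "River_DriedUp", "Settlements",
             "Tree_Shrub_Savanna", "Wooded_Savanna")

# Counts copied from the rfModel confusion matrix, one row per predicted class.
cm <- matrix(c(
  183,  2,   0,  0,   0,  0,  0,   1,   0,  0,
    0, 43,   0,  0,   0,  0,  0,   0,   0,  5,
    1,  0, 127,  0,   0,  0,  0,   0,   0,  0,
    0,  0,   0, 34,   0,  0,  0,   0,   0,  0,
    0,  0,   0,  0, 165,  0,  0,   0,   1,  0,
    0,  0,   0,  0,   0, 31,  0,   0,   0,  0,
    2,  0,   0,  0,   0,  0, 17,   0,   0,  0,
    0,  0,   1,  0,   0,  0,  0, 197,   0,  0,
    0,  0,   0,  0,   0,  0,  0,   0, 163,  6,
    0,  2,   0,  0,   0,  0,  0,   0,   1, 46
), nrow = 10, byrow = TRUE,
dimnames = list(Prediction = classes, Reference = classes))

n <- sum(cm)                               # number of test samples
accuracy <- sum(diag(cm)) / n              # observed agreement
noInfoRate <- max(colSums(cm)) / n         # share of the largest reference class
expected <- sum(rowSums(cm) * colSums(cm)) / n^2   # agreement expected by chance
kappaIndex <- (accuracy - expected) / (1 - expected)
sensitivity <- diag(cm) / colSums(cm)      # per-class recall

print(round(c(accuracy = accuracy, noInfoRate = noInfoRate,
              kappa = kappaIndex), 4))
print(round(sensitivity, 4))
```

The results agree with the confusionMatrix() output: 1028 test samples, an accuracy of 0.9786, a No Information Rate of 0.1926 and a Kappa Index of 0.975.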

Feature Selection (Recursive Feature Elimination)

In machine learning, we are often faced with a large number of predictors, some of which contribute little or nothing to the performance of the model. In such situations, we need a way to drop these non-informative features (predictors), either to improve the performance metric (be it Kappa Index or Accuracy) or to find a subset of features that reduces the complexity of the model.

NB: Multicollinearity does not affect the predictive performance of Random Forest models. However, it makes variable importance harder to interpret.

RFE begins with all predictors (backward selection) and works its way down, removing predictors over a given number of iterations until the subset (combination of predictors) with the best model performance is found.

In practical model development, a large number of features (predictors) increases:

  1. Cost (the cost of collecting or measuring information on the predictors).
  2. Computation time.
  3. Complexity of the model.

NB: This is just a demonstration; we are not burdened with a large number of features (predictors). Our interest here is whether there is a subset that reduces the model complexity and yet increases the Kappa Index. There could also be a trade-off, where a reduction in complexity leads to a small reduction in the performance metric (Kappa Index).

Recursive feature elimination (ntree = 500)
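A sketch of recursive feature elimination with caret's rfe() follows. The 10-fold cross-validation, the candidate subset sizes and the seed are assumptions for illustration, and runRfe is an illustrative helper name.

```r
# Recursive feature elimination with Random Forest as the underlying model.
runRfe <- function(trainData, predictors, response = "Class",
                   folds = 10, ntree = 500, seed = 123) {
  set.seed(seed)  # reproducible resampling
  ctrl <- caret::rfeControl(functions = caret::rfFuncs,  # Random Forest helpers
                            method = "cv", number = folds)
  caret::rfe(x = trainData[, predictors],
             y = trainData[[response]],
             sizes = seq_along(predictors),  # try every subset size
             metric = "Kappa",               # metric used to pick the subset
             rfeControl = ctrl,
             ntree = ntree)                  # passed through to randomForest()
}
```

With the tutorial's data this would be called as rfModelRfe <- runRfe(trainData, predictors); printing the result shows the resampled performance for each subset size, and caret's predictors() function lists the selected variables.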

Cross-validation results for the recursive feature elimination.

From running recursive feature elimination (rfe), the top five variables were nir, dem, ndvi, swir1 and green. The cross-validation assessment indicates that all eight (8) features were selected, at a Kappa Index of 0.9676.

Evaluation of the testData with the recursive feature elimination model.

rfElemPredict <- predict(rfModelRfe,testData)

confusionMatrix(rfElemPredict$pred,testData$Class)
Confusion Matrix and Statistics

                               Reference
Prediction                      Agriculture Agriculture_shallow_recession
  Agriculture                           184                             2
  Agriculture_shallow_recession           0                            43
  Burntland                               1                             0
  Dam                                     0                             0
  Plantation                              0                             0
  River                                   0                             0
  River_DriedUp                           1                             0
  Settlements                             0                             0
  Tree_Shrub_Savanna                      0                             0
  Wooded_Savanna                          0                             2
                               Reference
Prediction                      Burntland Dam Plantation River River_DriedUp
  Agriculture                           0   0          0     0             0
  Agriculture_shallow_recession         0   0          0     0             0
  Burntland                           127   0          0     0             0
  Dam                                   0  34          0     0             0
  Plantation                            0   0        165     0             0
  River                                 0   0          0    31             0
  River_DriedUp                         0   0          0     0            17
  Settlements                           1   0          0     0             0
  Tree_Shrub_Savanna                    0   0          0     0             0
  Wooded_Savanna                        0   0          0     0             0
                               Reference
Prediction                      Settlements Tree_Shrub_Savanna Wooded_Savanna
  Agriculture                             2                  0              0
  Agriculture_shallow_recession           0                  1              5
  Burntland                               0                  0              0
  Dam                                     0                  0              0
  Plantation                              0                  0              0
  River                                   0                  0              0
  River_DriedUp                           0                  0              0
  Settlements                           195                  0              0
  Tree_Shrub_Savanna                      1                163              5
  Wooded_Savanna                          0                  1             47

Overall Statistics
                                          
               Accuracy : 0.9786          
                 95% CI : (0.9678, 0.9865)
    No Information Rate : 0.1926          
    P-Value [Acc > NIR] : < 2.2e-16       
                                          
                  Kappa : 0.975           
                                          
 Mcnemar's Test P-Value : NA              

Statistics by Class:

                     Class: Agriculture Class: Agriculture_shallow_recession
Sensitivity                      0.9892                              0.91489
Specificity                      0.9952                              0.99388
Pos Pred Value                   0.9787                              0.87755
Neg Pred Value                   0.9976                              0.99591
Prevalence                       0.1809                              0.04572
Detection Rate                   0.1790                              0.04183
Detection Prevalence             0.1829                              0.04767
Balanced Accuracy                0.9922                              0.95439
                     Class: Burntland Class: Dam Class: Plantation Class: River
Sensitivity                    0.9922    1.00000            1.0000      1.00000
Specificity                    0.9989    1.00000            1.0000      1.00000
Pos Pred Value                 0.9922    1.00000            1.0000      1.00000
Neg Pred Value                 0.9989    1.00000            1.0000      1.00000
Prevalence                     0.1245    0.03307            0.1605      0.03016
Detection Rate                 0.1235    0.03307            0.1605      0.03016
Detection Prevalence           0.1245    0.03307            0.1605      0.03016
Balanced Accuracy              0.9955    1.00000            1.0000      1.00000
                     Class: River_DriedUp Class: Settlements
Sensitivity                       1.00000             0.9848
Specificity                       0.99901             0.9988
Pos Pred Value                    0.94444             0.9949
Neg Pred Value                    1.00000             0.9964
Prevalence                        0.01654             0.1926
Detection Rate                    0.01654             0.1897
Detection Prevalence              0.01751             0.1907
Balanced Accuracy                 0.99951             0.9918
                     Class: Tree_Shrub_Savanna Class: Wooded_Savanna
Sensitivity                             0.9879               0.82456
Specificity                             0.9930               0.99691
Pos Pred Value                          0.9645               0.94000
Neg Pred Value                          0.9977               0.98978
Prevalence                              0.1605               0.05545
Detection Rate                          0.1586               0.04572
Detection Prevalence                    0.1644               0.04864
Balanced Accuracy                       0.9905               0.91074

These results are essentially the same as those of the initial model rfModel: the overall accuracy and the Kappa Index are unchanged.

We will therefore proceed to make spatial predictions with the initial model rfModel. This is because rfModelRfe (the recursive feature elimination model) offered no reduction in complexity and no improvement in the performance metric (Kappa Index) relative to the initial model rfModel.

Spatial Prediction

We will make spatial predictions on the image stack using the initial model rfModel. We will use the raster::predict() function.
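A minimal sketch of the prediction step follows. It assumes that the layer names of the raster stack match the predictor column names the model was trained on, which raster::predict() relies on to build the input for the model. The helper name predictLandCover and the optional output file are illustrative.

```r
# Apply a fitted model to every pixel of a raster stack.
# outFile = "" keeps the result in memory (or a temporary file); give a
# file name such as "landcover.tif" (hypothetical) to write it to disk.
predictLandCover <- function(img, model, outFile = "") {
  raster::predict(object = img, model = model, filename = outFile)
}
```

With the objects named in this tutorial, the call would be predictLandCover(bawkuStack, rfModel). The result is a single-layer raster in which each pixel holds the integer code of its predicted class.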

Further Reading


Kuhn, M., and Johnson, K. (2013). Applied Predictive Modeling. 1st edn. New York: Springer.

Hastie, T., Tibshirani, R., and Friedman, J.H. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd edn. New York: Springer.