Generate predictor variables from the PlanetScope imagery
The PlanetScope imagery
Planet surface reflectance (SR) imagery was used to generate the predictor variables. SR data represent the proportion of light reflected from the Earth’s surface after correction for atmospheric effects, which makes scenes from different acquisition dates directly comparable. Using scenes from several seasons strengthens the model’s ability to generalise by capturing seasonal differences in vegetation.
The PlanetScope sensor provides imagery in eight spectral bands: Red, Near-Infrared (NIR), Blue, Green, Green I, Coastal Blue, Yellow, and Red Edge. Among these, the Red, Green, and Blue bands were combined to generate RGB composites, enabling a detailed visual assessment of Aberfoyle Forest and enhancing the distinction of vegetation patterns.
Create vegetation indices (Summer)
Vegetation indices from the summer scene can improve the Random Forest model by providing features that capture the peak of vegetative growth, when foliage is densest and most uniform. This allows for a clearer distinction between species or vegetation types. Vegetation indices also help to reduce noise in the raw spectral data, which can make predictions more reliable.
Since the assessment focuses on closed canopies, the effects of soil are minimal. For this reason, it makes sense to use indices that focus on small differences in plant health, chlorophyll levels, and canopy structure. MCARI, GNDVI, NDVI, and MSAVI2 each highlight these aspects, and together they provide a fuller picture that helps distinguish species better than indices that only correct for soil or water.
MCARI is highly sensitive to variations in chlorophyll and canopy structure, making it particularly useful for distinguishing species with subtle differences in greenness.
GNDVI, which uses the green band, is more responsive to chlorophyll levels than NDVI and can better differentiate species with similar biomass but differing health.
NDVI remains a reliable general-purpose index and serves as a solid baseline. It complements MCARI and GNDVI effectively.
MSAVI2 can be valuable when species vary in crown density or have incomplete canopy cover, as it adds structural contrast.
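To make the four indices concrete, the sketch below computes them for a single pixel in Python. It is an illustration, not the code used in this analysis. The formulas are the commonly published definitions; mapping MCARI's 700, 670, and 550 nm terms to the PlanetScope Red Edge, Red, and Green bands is an assumption, as are the function names and the sample reflectance values.

```python
import math

def ndvi(nir, red):
    # Normalised Difference Vegetation Index
    return (nir - red) / (nir + red)

def gndvi(nir, green):
    # Green NDVI: same form as NDVI, with the green band in place of red
    return (nir - green) / (nir + green)

def msavi2(nir, red):
    # Modified Soil-Adjusted Vegetation Index 2
    a = 2.0 * nir + 1.0
    return (a - math.sqrt(a * a - 8.0 * (nir - red))) / 2.0

def mcari(rededge, red, green):
    # Modified Chlorophyll Absorption in Reflectance Index.
    # Assumption: Red Edge ~ 700 nm, Red ~ 670 nm, Green ~ 550 nm.
    return ((rededge - red) - 0.2 * (rededge - green)) * (rededge / red)

# Hypothetical surface reflectance values (0-1 scale) for one canopy pixel
pixel = {"nir": 0.45, "red": 0.05, "green": 0.08, "rededge": 0.20}

print("NDVI  ", round(ndvi(pixel["nir"], pixel["red"]), 3))
print("GNDVI ", round(gndvi(pixel["nir"], pixel["green"]), 3))
print("MSAVI2", round(msavi2(pixel["nir"], pixel["red"]), 3))
print("MCARI ", round(mcari(pixel["rededge"], pixel["red"], pixel["green"]), 3))
```

In a raster workflow the same arithmetic would be applied band-wise to the whole summer scene rather than to one pixel.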
Combining Spectral Bands and Vegetation Indices into a Multi-Layered Image
The imagery dataset contains several raster images, with each one showing either a spectral band from the Planet sensor or a vegetation index. Since these images cover the same area and have the same resolution, they can be stacked to form a single multi-layered image. This final image brings together the original PlanetScope bands for each season along with the calculated vegetation indices.
The list below shows a total of 36 predictor variables.
| Covariates |
|---|
| coastal.blue_winter |
| blue_winter |
| green.I_winter |
| green_winter |
| yellow_winter |
| red_winter |
| rededge_winter |
| nir_winter |
| coastal.blue_spring |
| blue_spring |
| green.I_spring |
| green_spring |
| yellow_spring |
| red_spring |
| rededge_spring |
| nir_spring |
| coastal.blue_summer |
| blue_summer |
| green.I_summer |
| green_summer |
| yellow_summer |
| red_summer |
| rededge_summer |
| nir_summer |
| coastal.blue_autumn |
| blue_autumn |
| green.I_autumn |
| green_autumn |
| yellow_autumn |
| red_autumn |
| rededge_autumn |
| nir_autumn |
| NDVI |
| GNDVI |
| MCARI |
| MSAVI2 |
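The layer names in the table follow a regular pattern: eight bands for each of four seasons, plus the four summer indices. As a small check on the count of 36, the Python sketch below rebuilds the list. The variable names are illustrative and not taken from the original script.

```python
# Band and season labels as they appear in the covariate table
bands = ["coastal.blue", "blue", "green.I", "green",
         "yellow", "red", "rededge", "nir"]
seasons = ["winter", "spring", "summer", "autumn"]
indices = ["NDVI", "GNDVI", "MCARI", "MSAVI2"]

# One layer per band per season, followed by the summer vegetation indices
covariates = [f"{band}_{season}" for season in seasons for band in bands]
covariates += indices

print(len(covariates))  # 8 bands x 4 seasons + 4 indices
```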
Field data (The response variable)
The response variable represents the various cover classes identified in the study area, based on data from the Forester Subcompartment database. To keep the spectral signatures clear, I am only using compartments that are completely covered by a single component.
Random Forest Classification
Forward Feature Selection (FFS)
If highly autocorrelated variables lead to overfitting, then removing these variables should solve the problem. The ffs (forward feature selection) function in the CAST package selects predictor variables using a user-defined cross-validation scheme.
As the number of variables in the model increases, performance improves in terms of Kappa, and the results become more consistent. Kappa stabilises over 0.7 with sixteen variables.
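The idea behind forward feature selection can be sketched in a few lines of Python. This is a simplified illustration of the greedy search, not CAST's implementation: it starts from the best-scoring pair of variables and keeps adding the single variable that improves the score most, stopping when nothing improves it. The scoring function here is a made-up stand-in for a cross-validated Kappa, and the usefulness values are hypothetical.

```python
from itertools import combinations

def forward_feature_selection(features, score):
    """Greedy forward selection: best pair first, then one variable at a time."""
    selected = list(max(combinations(features, 2), key=score))
    best_score = score(selected)
    remaining = [f for f in features if f not in selected]
    while remaining:
        # Try each remaining variable and keep the one with the highest score
        candidate = max(remaining, key=lambda f: score(selected + [f]))
        candidate_score = score(selected + [candidate])
        if candidate_score <= best_score:
            break  # no remaining variable improves the model
        selected.append(candidate)
        best_score = candidate_score
        remaining.remove(candidate)
    return selected, best_score

# Hypothetical stand-in for a cross-validated Kappa: each variable adds some
# value, and every extra variable carries a small penalty.
USEFULNESS = {"nir_spring": 0.40, "red_winter": 0.25, "NDVI": 0.10,
              "blue_summer": 0.0, "yellow_autumn": 0.0}

def toy_score(subset):
    return sum(USEFULNESS[f] for f in subset) - 0.02 * len(subset)

selected, kappa_like = forward_feature_selection(list(USEFULNESS), toy_score)
print(selected, round(kappa_like, 2))
```

In the real workflow each call to the scoring function would mean fitting and cross-validating a model, which is why the search is slow with many candidate variables.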
Tune Spatial cross-validation
I will evaluate the classification using spatial cross-validation with the blockCV R package. Spatial cross-validation is particularly useful for dealing with spatial autocorrelation, which can lead conventional random cross-validation to overestimate model performance. The blockCV package divides spatial data into blocks, which reduces the spatial dependence between the training and validation sets. Because whole blocks are held out together, nearby and therefore similar samples do not end up on both sides of the split, resulting in a more realistic assessment of model performance.
The figure below shows the autocorrelation of the predictors. The plots display the range of spatial autocorrelation for each input raster covariate and also show the spatial blocks created using the median of these ranges. Based on this, the area should be divided into blocks that are 1200 meters wide. For each round of k-fold cross-validation, all data from the spatial blocks assigned to that fold are left out.
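To illustrate what blocking does, the Python sketch below assigns sample points to 1200 m square blocks and then allocates whole blocks to folds. It is a minimal sketch and not blockCV's algorithm: the grid origin, the systematic block-to-fold allocation, the function names, and the coordinates are all assumptions made for the example.

```python
import math

def block_id(x, y, block_size=1200.0, origin=(0.0, 0.0)):
    """Return the (row, col) index of the square block containing a point."""
    col = math.floor((x - origin[0]) / block_size)
    row = math.floor((y - origin[1]) / block_size)
    return (row, col)

def assign_folds(points, k, block_size=1200.0):
    """Give every point the fold of its block, so blocks are never split."""
    blocks = sorted({block_id(x, y, block_size) for x, y in points})
    # Systematic allocation: blocks are dealt out to folds in turn
    fold_of_block = {b: i % k for i, b in enumerate(blocks)}
    return [fold_of_block[block_id(x, y, block_size)] for x, y in points]

# Hypothetical projected coordinates in metres
points = [(100.0, 100.0), (1100.0, 900.0), (1300.0, 100.0), (2500.0, 1300.0)]
print(assign_folds(points, k=2))
```

The first two points fall in the same block and therefore always share a fold, which is the property that keeps neighbouring samples out of both training and validation data at once.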
Running Random Forest
I split the data into a training set containing 80% of the samples and a validation set containing the remaining 20%. The model uses 14 predictors and a response variable with 11 classes. The R code configures a 10-fold spatial cross-validation using trainControl, with method = "cv" specifying cross-validation. The random forest model (method = "rf") is trained using the train function with cross-validation (trControl = ctrl_sp_spatial), a specified tuning grid (tuneGrid), and 300 trees (ntree = 300), optimizing for the Kappa metric. The classes could be distinguished with a high cross-validated Kappa of about 0.79. The optimal mtry value for the model is 2.
Random Forest
18198 samples
14 predictor
11 classes: 'Beech', 'Birch', 'Corsican pine', 'Douglas fir', 'Larch', 'Norway spruce', 'Oak', 'Other Broadleaves', 'Other Conifers', 'Scots pine', 'Sweet chestnut'

No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 16463, 16603, 16695, 16318, 16341, 16708, …

Resampling results across tuning parameters:

| mtry | Accuracy | Kappa |
|---|---|---|
| 2 | 0.8259865 | 0.7911304 |
| 8 | 0.8241937 | 0.7895839 |
| 14 | 0.8155421 | 0.7794587 |

Kappa was used to select the optimal model using the largest value. The final value used for the model was mtry = 2.
The image below provides insight into how variables help distinguish between different species in the forest.
For instance, in the Lodgepole pine panel, the nir_spring and red_winter variables show high importance for classifying this species. This suggests that reflectance in these bands during those seasons is especially useful for distinguishing Lodgepole pine from other classes.
| | Beech | Birch | Corsican pine | Douglas fir | Larch | Norway spruce | Oak | Other Broadleaves | Other Conifers | Scots pine | Sweet chestnut | Sum | UA (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Beech | 1133 | 3 | 0 | 0 | 7 | 0 | 71 | 3 | 0 | 1 | 7 | 1225 | 92.5 |
| Birch | 4 | 276 | 0 | 0 | 13 | 1 | 102 | 1 | 0 | 0 | 1 | 398 | 69.3 |
| Corsican pine | 0 | 0 | 451 | 6 | 0 | 0 | 0 | 0 | 0 | 35 | 0 | 492 | 91.7 |
| Douglas fir | 0 | 0 | 1 | 2313 | 0 | 3 | 0 | 0 | 25 | 2 | 0 | 2344 | 98.7 |
| Larch | 0 | 0 | 0 | 0 | 3133 | 1 | 13 | 2 | 0 | 0 | 1 | 3150 | 99.5 |
| Norway spruce | 0 | 0 | 1 | 9 | 0 | 1818 | 0 | 0 | 10 | 11 | 0 | 1849 | 98.3 |
| Oak | 13 | 1 | 0 | 0 | 19 | 0 | 4849 | 1 | 0 | 0 | 24 | 4907 | 98.8 |
| Other Broadleaves | 2 | 2 | 0 | 0 | 2 | 1 | 38 | 247 | 0 | 0 | 16 | 308 | 80.2 |
| Other Conifers | 0 | 0 | 13 | 107 | 2 | 31 | 0 | 1 | 989 | 13 | 0 | 1156 | 85.6 |
| Scots pine | 0 | 0 | 17 | 1 | 9 | 4 | 0 | 0 | 2 | 819 | 0 | 852 | 96.1 |
| Sweet chestnut | 39 | 1 | 0 | 0 | 4 | 0 | 32 | 2 | 0 | 0 | 1439 | 1517 | 94.9 |
| Sum | 1191 | 283 | 483 | 2436 | 3189 | 1859 | 5105 | 257 | 1026 | 881 | 1488 | 18198 | NA |
| PA (%) | 95.1 | 97.5 | 93.4 | 95.0 | 98.2 | 97.8 | 95.0 | 96.1 | 96.4 | 93.0 | 96.7 | NA | 96.0 |
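In the table, UA (user's accuracy) is the diagonal value divided by the row sum, PA (producer's accuracy) is the diagonal value divided by the column sum, and the bottom-right value is the overall accuracy. The Python sketch below shows how these figures, together with Cohen's Kappa, can be derived from any square confusion matrix. It is an illustration with a small made-up two-class matrix, not the code that produced the table, and the function name is hypothetical.

```python
def accuracy_metrics(matrix):
    """UA, PA, overall accuracy (all in %) and Cohen's Kappa.

    Assumes a square matrix in which every class has at least one
    sample in its row and in its column.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    diag = [matrix[i][i] for i in range(n)]

    ua = [100.0 * d / r for d, r in zip(diag, row_sums)]
    pa = [100.0 * d / c for d, c in zip(diag, col_sums)]

    observed = sum(diag) / total
    # Agreement expected by chance, from the row and column marginals
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (observed - expected) / (1.0 - expected)
    return {"ua": ua, "pa": pa, "overall": 100.0 * observed, "kappa": kappa}

# Made-up two-class example
toy = [[40, 10],
       [5, 45]]
metrics = accuracy_metrics(toy)
print(metrics)
```

Applied to the Birch row of the table, the same UA calculation gives 276 / 398, or 69.3%, which shows that most of the Birch errors come from samples in the Oak column.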