1. Introduction

This notebook explores estimation of vegetation properties from Sentinel-2 data and PROSAIL hybrid inversion. It provides the complete process step by step, including registration on the Copernicus Data Space platform, downloading Sentinel-2 iamgery, image preprocessing and masking, hybrid model training, application of the model on an image, and visualization of the estimated properties.

For the process, we use several functionalities of the R package prosail, based on the canopy reflectance model 4SAIL coupled with the leaf model PROSPECT. In turn, prosail includes the package prospect.Both packages were developed by Jean-Baptiste Feret and Florian de Boissieu. See package documentation at https://jbferet.gitlab.io/prosail/index.html

The SAIL model is a four-stream representation of the radiative transfer equation which distinguishes two direct fluxes (incident solar flux and radiance in the viewing direction) and two diffuse fluxes (upward and downward hemispherical flux).

Four-stream RT modeling concept in SAIL model (Verhoef et al., 2007)

Four-stream RT modeling concept in SAIL model (Verhoef et al., 2007)

PROSAIL is a popular and efficient radiative transfer model (RTM) that has been successfully used to simulate satellite reflectance measurements from randomly sampled bio-physical variables. It is the composite model of the PROSPECT RTM, which simulates the optical properties of leaves from leaf variables (e.g. pigment concentrations), and the SAIL RTM which simulates canopy reflectances from canopy variable and leaf reflectance. Several version of PROSPECT and SAIL enable to take into account different properties of the vegetation canopy into account. Finally, the spectral sensitivity of an instrument (commonly referred to as spectral response function as illustrated at https://www.cesbio.cnrs.fr/multitemp/revised-spectral-bands-for-sentinel-2a/) enables estimation of surface reflectance band measurements (e.g. S2 bands).

Flowchart of the PROSAIL model

Flowchart of the PROSAIL model

PROSAIL input parameters

PROSAIL input parameters

Note that a lot of the code that follows has been adapted from Jean-Baptiste Féret website at https://jbferet.gitlab.io/prosail/articles/prosail4.html

2. Setup

First of all, we need to install several libraries. The best way to do it is by using the Console.

#clean the memory
rm(list=ls())
#run the following lines, one at a time, from the R console (NOT FROM HERE!)
#install.packages("mlr3verse")
#devtools::install_github('jbferet/preprocS2')
#install.packages("package_name")

Then, we load the libraries:

# libraries
library(preprocS2)
library(prosail)
library(mlr3verse)
library(sf)
library(leaflet)
library(leafem)
library(RColorBrewer)

3. Access to CDSE STAC catalog

The preprocS2 library is used here to obtain Sentinel-2 multispectral images. It uses the STAC API endpoint provided by Microsoft Planetary Computer (MPC). However, some products optionally required by preprocS2 are not provided by MPC. This is the case for raster data corresponding to the geometry of acquisition of Sentinel-2 images.

These products are available from the STAC catalog provided by the Sentinel-hub Catalog API via Copernicus Dataspace at https://dataspace.copernicus.eu/.

To be able to download data from the Sentinel-hub STAC Catalog API via Copernicus Dataspace, you need to create an account on https://dataspace.copernicus.eu/. After you have registered and logged into CDSE, you can create and activate the OAuth clients following this link: https://shapps.dataspace.copernicus.eu/dashboard/#/account/settings

Go to the OAuth clients tab and click on the Create button to start the process.:

Then, complete the form that displays and click on Create:

Then, you will be see the client ID as well as the client secret word. Copy such data to a safe place.

Finally, create a .Renviron file and save it on your home directory:

More information about configuring environment variables in R can be found at https://docs.posit.co/ide/user/ide/guide/environments/r/managing-r.html

Once the R session is restarted, you can use all functionalities of preprocS2.

4. Identification and download of Sentinel-2 images

The Sentinel-2 acquisition corresponds to an agricultural area in Mosquera (Colombia). A polygon corresponding to the study area can be obtained at https://geojson.io/#map=13.67/4.67359/-74.22275

Area of interest

Area of interest

4.1 Define area of interest

Using the information obtained from geojson.io, the aoi can be defined:

# define bounding box for area of interest
aoi_bbox <- sf::st_bbox(obj = c('xmin' = -74.24, 'ymin' = 4.65, 
                                'xmax' = -74.20, 'ymax' = 4.69))
aoi <- preprocS2::bbox_to_poly(aoi_bbox, crs = 4326)

4.2 Identify date of the selected data

Go to Copernicus browser at https://browser.dataspace.copernicus.eu/?zoom=15&lat=4.68703&lng=-74.20984 and identify latest date of acquisition of Sentinel-2 L2A over the study area:

Date of data acquisition

Date of data acquisition

4.3 Getting data and acquisition geometry

# define input & output directories
output_dir_s2 <- './S2_images'
dir.create(path = output_dir_s2, showWarnings = F, recursive = T)
# https://browser.dataspace.copernicus.eu/
datetime <- as.Date("2025-01-12")
# save aoi as vector file
path_aoi <- 'sena_aoi.GPKG'
sf::st_write(obj = aoi, dsn = path_aoi, driver = 'GPKG', append=FALSE)
## Deleting layer `sena_aoi' using driver `GPKG'
## Writing layer `sena_aoi' to data source `sena_aoi.GPKG' using driver `GPKG'
## Writing 1 features with 0 fields and geometry type Polygon.
# get tiling grid kml from https://sentiwiki.copernicus.eu/web/s2-products
path_S2tilinggrid <- 'Sentinel-2_tiling_grid.kml'
# get list of available data
(list_files <- get_s2_raster(aoi_path = path_aoi, datetime = datetime, 
                            output_dir = output_dir_s2, 
                            path_S2tilinggrid = path_S2tilinggrid, 
                            siteName = 'Sena', overwrite = TRUE, 
                            geomAcq = TRUE))
## get S2 tiles corresponding to aoi
## get S2 geometry of acquisition of tiles overlapping with aoi
## download S2 collection
## $Refl_L2A
## [1] "./S2_images/raster_samples/Sena_001_2025-01-12.tiff"
## 
## $Binary_mask
## [1] "./S2_images/raster_samples/Sena_001_2025-01-12_BIN.tiff"
## 
## $vegetation_mask
## [1] "./S2_images/raster_samples/Sena_001_2025-01-12_BIN_v2.tiff"
## 
## $provider_mask
## [1] "./S2_images/raster_samples/Sena_001_2025-01-12_SCL.tiff"
## 
## $geometryAcquisition
## $geometryAcquisition$`2025-01-12`
## $geometryAcquisition$`2025-01-12`$saa
## [1] "./S2_images/geomAcq_S2/saa_2025-01-12.tiff"
## 
## $geometryAcquisition$`2025-01-12`$sza
## [1] "./S2_images/geomAcq_S2/sza_2025-01-12.tiff"
## 
## $geometryAcquisition$`2025-01-12`$vaa
## [1] "./S2_images/geomAcq_S2/vaa_2025-01-12.tiff"
## 
## $geometryAcquisition$`2025-01-12`$vza
## [1] "./S2_images/geomAcq_S2/vza_2025-01-12.tiff"

Now, obtain reflectance data and sensor geometry:

# Sentinel-2 L2A reflectance
Refl_L2A <- list_files$Refl_L2A
# Sentinel-2 binary mask identifying vegetation, discarding clouds & shadows
vegetation_mask <- list_files$vegetation_mask
# Sentinel-2 mask from provider ( = SCL from ESA products)
SCL <- list_files$provider_mask
# Sentinel-2 geometry of acquisition (if requested from CDSE)
geometryAcquisition <- list_files$geometryAcquisition

Check the output:

geometryAcquisition
## $`2025-01-12`
## $`2025-01-12`$saa
## [1] "./S2_images/geomAcq_S2/saa_2025-01-12.tiff"
## 
## $`2025-01-12`$sza
## [1] "./S2_images/geomAcq_S2/sza_2025-01-12.tiff"
## 
## $`2025-01-12`$vaa
## [1] "./S2_images/geomAcq_S2/vaa_2025-01-12.tiff"
## 
## $`2025-01-12`$vza
## [1] "./S2_images/geomAcq_S2/vza_2025-01-12.tiff"

5. Training and application of hybrid inversion model

The default parameterization for the distribution of input PROSAIL parameters applied to simulate a training dataset follows the distribution described in the ATBD document

However, PROSAIL inversion can be fully parameterized, in order to adapt the distribution of input parameters, geometry of acquisition, soil parameters, spectral bands to be used, noise level, etc.

The package bigRaster can be used to handle inversion over large rasters if needed, but may be more time consuming over smaller raster data.

output_dir_BP <- './PROSAIL'
dir.create(path = output_dir_BP, showWarnings = FALSE,recursive = TRUE)
# get sensor response for Sentinel-2
SensorName <- 'Sentinel_2'
SRF <- GetRadiometry(SensorName)

# define parameters to estimate
Parms2Estimate <- c('lai', 'CHL', 'EWT', 'LMA', 'fCover', 'fAPAR', 'albedo')

# define spectral bands required to train SVR model for each variable
S2BandSelect <- Bands2Select <- list()
for (parm in Parms2Estimate){
  S2BandSelect[[parm]] <- c('B3','B4','B5','B6','B7','B8','B11','B12')
  Bands2Select[[parm]] <- match(S2BandSelect[[parm]],SRF$Spectral_Bands)
}
ImgBandNames <- SRF$Spectral_Bands
ImgBandNames
##  [1] "B2"  "B3"  "B4"  "B5"  "B6"  "B7"  "B8"  "B8A" "B11" "B12"
# get S2 geometry from preprocS2
angles_path <- preprocS2::get_s2_angles(path_angles = geometryAcquisition[["2025-01-12"]], 
                                        path_bbox = path_aoi)
# define ranges for zenith angles of acquisition
GeomAcq <- list('min' = data.frame('tto' = angles_path$MinAngle['vza'],
                                   'tts' = angles_path$MinAngle['sza'], 
                                   'psi' = angles_path$MinAngle['psi']), 
                'max' = data.frame('tto' = angles_path$MaxAngle['vza'], 
                                   'tts' = angles_path$MaxAngle['sza'], 
                                   'psi' = angles_path$MaxAngle['psi']))

# users can define GeomAcq following their own conditions by following the list 
# structure above if preprocS2 is not applied to get imagery

Now, let’s train the regression model:

# train regression model
# set method = 'svmRadial' or method = 'svmLinear' to use SVR implemented in the 
# R package caret instead of liquidSVM
modelSVR <- prosail::train_prosail_inversion(Parms2Estimate = Parms2Estimate,
                                    atbd = TRUE, GeomAcq = GeomAcq, 
                                    SRF = SRF, 
                                    Bands2Select = Bands2Select, 
                                    Path_Results = output_dir_BP, 
                                    method = 'svmRadial') #'liquidSVM')
## Loading required package: ggplot2
## Loading required package: lattice

If needed, check the package documentation.

# apply regression model on Sentinel-2 raster data
# bigRaster = F by default. Do not mention it if the package is not installed
# Sentinel-2 imagery is provided with reflectance defined as integer between 0 and 10000
# Adapt MultiplyingFactor if not the case 
# MultiplyingFactor = 1 if reflectance defined between 0 and 1
BPvars <- prosail::Apply_prosail_inversion(raster_path = Refl_L2A, 
                                  HybridModel = modelSVR, 
                                  PathOut = output_dir_BP,
                                  SelectedBands = S2BandSelect, 
                                  bandname = ImgBandNames, 
                                  MaskRaster = vegetation_mask, 
                                  bigRaster = F, 
                                  MultiplyingFactor = 10000)
## [1] "Computing lai"
## [1] "Computing CHL"
## [1] "Computing EWT"
## [1] "Computing LMA"
## [1] "Computing fCover"
## [1] "Computing fAPAR"
## [1] "Computing albedo"
## [1] "processing completed"

What we got? Where is it?

BPvars
## $mean
## $mean$lai
## [1] "./PROSAIL/Sena_001_2025-01-12_lai.tiff"
## 
## $mean$CHL
## [1] "./PROSAIL/Sena_001_2025-01-12_CHL.tiff"
## 
## $mean$EWT
## [1] "./PROSAIL/Sena_001_2025-01-12_EWT.tiff"
## 
## $mean$LMA
## [1] "./PROSAIL/Sena_001_2025-01-12_LMA.tiff"
## 
## $mean$fCover
## [1] "./PROSAIL/Sena_001_2025-01-12_fCover.tiff"
## 
## $mean$fAPAR
## [1] "./PROSAIL/Sena_001_2025-01-12_fAPAR.tiff"
## 
## $mean$albedo
## [1] "./PROSAIL/Sena_001_2025-01-12_albedo.tiff"
## 
## 
## $SD
## $SD$lai
## [1] "./PROSAIL/Sena_001_2025-01-12_lai_STD.tiff"
## 
## $SD$CHL
## [1] "./PROSAIL/Sena_001_2025-01-12_CHL_STD.tiff"
## 
## $SD$EWT
## [1] "./PROSAIL/Sena_001_2025-01-12_EWT_STD.tiff"
## 
## $SD$LMA
## [1] "./PROSAIL/Sena_001_2025-01-12_LMA_STD.tiff"
## 
## $SD$fCover
## [1] "./PROSAIL/Sena_001_2025-01-12_fCover_STD.tiff"
## 
## $SD$fAPAR
## [1] "./PROSAIL/Sena_001_2025-01-12_fAPAR_STD.tiff"
## 
## $SD$albedo
## [1] "./PROSAIL/Sena_001_2025-01-12_albedo_STD.tiff"

6. Visualization of vegetation parameters

Let’s check the names of the obtained variables:

list.files("PROSAIL")
##  [1] "PROSAIL_LUT_InputParms.txt"         "PROSAIL_LUT_Reflectance.txt"       
##  [3] "Sena_001_2025-01-12_albedo_STD.tif" "Sena_001_2025-01-12_albedo.tif"    
##  [5] "Sena_001_2025-01-12_CHL_STD.tif"    "Sena_001_2025-01-12_CHL.tif"       
##  [7] "Sena_001_2025-01-12_EWT_STD.tif"    "Sena_001_2025-01-12_EWT.tif"       
##  [9] "Sena_001_2025-01-12_fAPAR_STD.tif"  "Sena_001_2025-01-12_fAPAR.tif"     
## [11] "Sena_001_2025-01-12_fCover_STD.tif" "Sena_001_2025-01-12_fCover.tif"    
## [13] "Sena_001_2025-01-12_lai_STD.tif"    "Sena_001_2025-01-12_lai.tif"       
## [15] "Sena_001_2025-01-12_LMA_STD.tif"    "Sena_001_2025-01-12_LMA.tif"

For the sake of illustration, let’s choose LAI as our variable of interest.

Let’s highlight that the leaf area index (LAI) is typically defined as half the total leaf surface area per unit of ground surface area and is a crucial variable in terrestrial ecosystems for describing biophysical changes in vegetation and canopy structure. It directly influences the efficiency of vegetation transpiration, photosynthesis, and energy balance states. Recognized by the World Meteorological Organization as a key land surface parameter influencing global climate change, the LAI plays a pivotal role in the exchanges of carbon, water, and energy between vegetation and the atmosphere.

Let’s go. First, we read the raster file containing the leaf area index property:

(lai_2025 <- terra::rast("PROSAIL/Sena_001_2025-01-12_lai.tif"))
## class       : SpatRaster 
## size        : 443, 444, 1  (nrow, ncol, nlyr)
## resolution  : 10, 10  (x, y)
## extent      : 584290, 588730, 514020, 518450  (xmin, xmax, ymin, ymax)
## coord. ref. : WGS 84 / UTM zone 18N (EPSG:32618) 
## source      : Sena_001_2025-01-12_lai.tif 
## name        : Sena_001_2025-01-12_lai 
## min value   :               0.7555608 
## max value   :               5.9836369
names(lai_2025) <- "lai"
lai_2025
## class       : SpatRaster 
## size        : 443, 444, 1  (nrow, ncol, nlyr)
## resolution  : 10, 10  (x, y)
## extent      : 584290, 588730, 514020, 518450  (xmin, xmax, ymin, ymax)
## coord. ref. : WGS 84 / UTM zone 18N (EPSG:32618) 
## source      : Sena_001_2025-01-12_lai.tif 
## name        :       lai 
## min value   : 0.7555608 
## max value   : 5.9836369
# change this chunk to meet your preferences
paleta <- colorNumeric(c("orange", "yellow", "cyan", "green"), terra::values(lai_2025), na.color = "transparent")
leaflet() %>% addTiles() %>%
  addProviderTiles("Esri.WorldImagery") %>%
  addRasterImage(lai_2025, colors = paleta, opacity = 0.6) %>%
  addLegend(pal = paleta, values = terra::values(lai_2025),
    title = "LAI estimation - PROSAIL") 

Is there an optimum LAI?

CID Bio-Science states that LAI levels vary with canopy architecture, which depends on the cultivars, geography, and field cultural practices. Then there are differences which arise from the types of crops and fruits. It provides LAI values for some fruits:

  • Apples can be between 1.5 and 5.

  • Peaches can be 7 to 10.

  • Mangoes is, on average, 2.94 and can lie between 1.18 and 4.48.

  • Oranges is high, between 9 and 11.

Let the researchers speak

The following article examines relationship between yield, phenology and LAI max during four soybean growing seasons in southern Brazil:

Relationship between yield and LAI max

Relationship between yield and LAI max

In case you need additional information on LAI, visit this page.

Furthermore, you may find helpful a recent paper on intercomparison of LAI derived from satellite data published at https://www.mdpi.com/2072-4292/17/7/1233.

That’s all. I hope this notebook will spark your interest to dive into PROSAIL using R.

7. References

Berger K, Atzberger C, Danner M, D’Urso G, Mauser W, Vuolo F & Hank T 2018. Evaluation of the PROSAIL Model Capabilities for Future Hyperspectral Model Environments: A Review Study. Remote Sensing, 10:85. https://doi.org/10.3390/rs10010085

Féret, J.-B. & de Boissieu, F. (2024). prospect: an R package to link leaf optical properties with their chemical and structural properties with the leaf model PROSPECT. Journal of Open Source Software, 9(94), 6027, https://doi.org/10.21105/joss.06027

Jacquemoud S, Verhoef W, Baret F, Bacour C, Zarco-Tejada PJ, Asner GP, François C & Ustin SL, 2009. PROSPECT+ SAIL models: A review of use for vegetation characterization. Remote Sensing of Environment, 113:S56–S66. https://doi.org/doi:10.1016/j.rse.2008.01.026

sessionInfo()
## R version 4.5.1 (2025-06-13)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sequoia 15.4
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: US/Pacific
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] caret_7.0-1        lattice_0.22-7     ggplot2_3.5.2      RColorBrewer_1.1-3
##  [5] leafem_0.2.4       leaflet_2.2.2      sf_1.0-21          mlr3verse_0.3.1   
##  [9] mlr3_1.0.0         prosail_2.5.10     preprocS2_2.5.9   
## 
## loaded via a namespace (and not attached):
##   [1] splines_4.5.1           palmerpenguins_0.1.1    tibble_3.3.0           
##   [4] hardhat_1.4.1           mlr3viz_0.10.1          expint_0.1-8           
##   [7] pROC_1.18.5             XML_3.99-0.18           rpart_4.1.24           
##  [10] lifecycle_1.0.4         httr2_1.1.2             globals_0.18.0         
##  [13] prabclus_2.3-4          MASS_7.3-65             crosstalk_1.2.1        
##  [16] backports_1.5.0         magrittr_2.0.3          sass_0.4.10            
##  [19] rmarkdown_2.29          jquerylib_0.1.4         yaml_2.3.10            
##  [22] askpass_1.2.1           flexmix_2.3-20          sp_2.2-0               
##  [25] DBI_1.2.3               lubridate_1.9.4         abind_1.4-8            
##  [28] sfsmisc_1.1-20          purrr_1.0.4             mlr3misc_0.18.0        
##  [31] nnet_7.3-20             pracma_2.4.4            rappdirs_0.3.3         
##  [34] ipred_0.9-15            paradox_1.0.1           satellite_1.0.5        
##  [37] lava_1.8.1              listenv_0.9.1           mlr3learners_0.12.0    
##  [40] terra_1.8-54            units_0.8-7             spacefillr_0.4.0       
##  [43] parallelly_1.45.0       codetools_0.2-20        tidyselect_1.2.1       
##  [46] raster_3.6-32           farver_2.1.2            matrixStats_1.5.0      
##  [49] stats4_4.5.1            base64enc_0.1-3         jsonlite_2.0.0         
##  [52] geojsonsf_2.0.3         e1071_1.7-16            CDSE_0.2.1             
##  [55] progressr_0.15.1        survival_3.8-3          iterators_1.0.14       
##  [58] foreach_1.5.2           tools_4.5.1             progress_1.2.3         
##  [61] bbotk_1.6.0             crsuggest_0.4           Rcpp_1.1.0             
##  [64] glue_1.8.0              prodlim_2025.04.28      leaflet.providers_2.0.0
##  [67] xfun_0.52               dplyr_1.1.4             withr_3.0.2            
##  [70] fastmap_1.2.0           mlr3cluster_0.1.11      mlr3filters_0.8.1      
##  [73] openssl_2.3.3           mlr3fselect_1.3.0       digest_0.6.37          
##  [76] truncnorm_1.0-9         timechange_0.3.0        R6_2.6.1               
##  [79] mlr3tuning_1.4.0        colorspace_2.1-1        wk_0.9.4               
##  [82] jpeg_0.1-11             mlr3tuningspaces_0.6.0  diptest_0.77-1         
##  [85] generics_0.1.4          data.table_1.17.6       recipes_1.3.1          
##  [88] robustbase_0.99-4-1     class_7.3-23            prettyunits_1.2.0      
##  [91] httr_1.4.7              htmlwidgets_1.6.4       mlr3data_0.9.0         
##  [94] ModelMetrics_1.2.2.2    pkgconfig_2.0.3         gtable_0.3.6           
##  [97] timeDate_4041.110       modeltools_0.2-24       mlr3pipelines_0.8.0    
## [100] htmltools_0.5.8.1       mlr3measures_1.0.0      clue_0.3-66            
## [103] scales_1.4.0            png_0.1-8               gower_1.0.2            
## [106] lgr_0.4.4               knitr_1.50              rstudioapi_0.17.1      
## [109] mlr3hyperband_0.6.0     reshape2_1.4.4          uuid_1.2-1             
## [112] checkmate_2.3.2         nlme_3.1-168            curl_6.4.0             
## [115] proxy_0.4-27            cachem_1.1.0            stringr_1.5.1          
## [118] KernSmooth_2.23-26      parallel_4.5.1          s2_1.1.9               
## [121] pillar_1.10.2           grid_4.5.1              vctrs_0.6.5            
## [124] mapview_2.11.2          prospect_1.7.2          cluster_2.1.8.1        
## [127] evaluate_1.0.4          rstac_1.0.1             cli_3.6.5              
## [130] compiler_4.5.1          rlang_1.1.6             crayon_1.5.3           
## [133] future.apply_1.20.0     mclust_6.1.1            classInt_0.4-11        
## [136] plyr_1.8.9              simsalapar_1.0-12       mlr3inferr_0.1.0       
## [139] stringi_1.8.7           gridBase_0.4-7          stars_0.6-8            
## [142] Matrix_1.7-3            hms_1.1.3               future_1.58.0          
## [145] fpc_2.2-13              kernlab_0.9-33          lutz_0.3.2             
## [148] bslib_0.9.0             mlr3mbo_0.3.0           DEoptimR_1.1-3-1