This outline of the workflow to filter the ECOSTRESS data is based on the material in the following documents:
1) ECOSTRESS L2 User Guide
2) ECOSTRESS L2 ATBD
3) ECOSTRESS Cloud Detection Algorithm ATBD

There are a total of 24 different Science DataSets (SDS) available on AppEEARS from the L2, L3 and L4 products for ECOSTRESS. The L1 products that are available for download are the geolocation-height, solar azimuth and other variables used to produce the L2 and higher-level products. We downloaded all but one of the L2, L3 and L4 SDS that were available for a given date (the Water Use Efficiency SDS was not downloaded). The products are:

We are ultimately interested in the Evapotranspiration data, which is relatively sparse. The number of scenes for which higher-level (L3 and L4) products are available is considerably smaller than for L2 products:

## CloudMask   LST_doy   LST_err    SDS_QC Emis1_doy Emis2_doy Emis3_doy Emis4_doy 
##       239       228       239       239       103       230       105       230 
## Emis5_doy 
##       229
## Emis1_err Emis2_err Emis3_err Emis4_err Emis5_err    EmisWB       PWV  ETcanopy 
##       109       239       109       239       239       230       239        98 
##   ETdaily 
##        98
##      ETsoil    L3_L4_QA  ETinst_doy Uncertainty 
##          98          97         104         104

Maximising the initial data we have for L3 and L4 products while ensuring good data quality calls for a judicious application of the results of the Quality Control / Quality Assurance algorithm provided with the data, particularly for the LST and the Cloud Mask algorithms. The L3_L4_QA algorithm is based on and closely follows these two algorithms, so ensuring good quality in the L2 products also ensures good quality in the L3 and L4 products.

The QC values for L2 products are available as a separate science dataset, and the same values apply to both the LST and Emissivity data. The values are stored in decimal (base 10), and AppEEARS also makes available a QC lookup table in which the binary values have been converted to decimal. Each QC value has to be looked up in this table to understand the quality flags associated with it.
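The lookup described above amounts to unpacking a decimal value into groups of bits. The following is a minimal sketch in Python, for illustration only, and is not the code used to produce the outputs in this document. The field names and bit positions in `ASSUMED_QC_FIELDS` are assumptions and should be checked against the L2 User Guide or the AppEEARS QC lookup table before use.

```python
# Assumed bit layout: (field name, lowest bit, number of bits).
# These names and positions are illustrative assumptions, not taken from
# this document; confirm them against the L2 User Guide or lookup table.
ASSUMED_QC_FIELDS = [
    ("mandatory_qa", 0, 2),
    ("data_quality", 2, 2),
    ("cloud_ocean", 4, 2),
    ("iterations", 6, 2),
    ("atmospheric_opacity", 8, 2),
    ("mmd", 10, 2),
    ("emissivity_accuracy", 12, 2),
    ("lst_accuracy", 14, 2),
]


def to_binary(value, width=16):
    """Return the zero-padded binary string for a decimal QC value."""
    return format(value, f"0{width}b")


def decode_qc(value, fields=ASSUMED_QC_FIELDS, width=16):
    """Split a decimal QC value into the named bit fields in `fields`."""
    if not 0 <= value < 2 ** width:
        raise ValueError(f"QC value {value} does not fit in {width} bits")
    decoded = {}
    for name, low_bit, n_bits in fields:
        mask = (1 << n_bits) - 1          # e.g. 0b11 for a 2-bit field
        decoded[name] = (value >> low_bit) & mask
    return decoded


if __name__ == "__main__":
    example = 37  # arbitrary example value, not taken from the data
    print(to_binary(example))
    print(decode_qc(example))
```

Passing a different `fields` list lets the same function decode any other packed quality layer without changing the logic.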

Based on the information in the L2 User Guide, section 2.4, ‘Quality Assurance’, which states that “L2 retrieval runs on all pixels regardless of cloud, and the user may then further inspect the cloud bit mask to tailor the result to his/her needs, or to even produce their own cloud mask”, the steps I intended to follow to filter the data are as follows:

At this point, there are a few decisions to make.
- i. What LST error value is a good threshold for retaining a pixel?
- ii. What PWV value is a good threshold for retaining a pixel?
- iii. How should we decide whether pixels with cloud detected are retained?
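One way to keep these three decisions open is to treat each as an explicit parameter and compare how much data survives under different choices. The sketch below is in Python for illustration only; the function names, the tuple layout and the threshold values in the example are all hypothetical, and none of them is a recommended setting.

```python
def keep_pixel(lst_err, pwv, cloudy, max_lst_err, max_pwv, keep_cloudy=False):
    """Return True if a pixel passes all three (hypothetical) criteria."""
    if lst_err is None or pwv is None:
        return False                      # missing data is never retained
    if lst_err > max_lst_err:
        return False                      # decision i: LST error threshold
    if pwv > max_pwv:
        return False                      # decision ii: PWV threshold
    if cloudy and not keep_cloudy:
        return False                      # decision iii: cloudy pixels
    return True


def retained_fraction(pixels, **criteria):
    """Fraction of (lst_err, pwv, cloudy) tuples that pass `criteria`."""
    if not pixels:
        return 0.0
    kept = sum(keep_pixel(*pixel, **criteria) for pixel in pixels)
    return kept / len(pixels)


if __name__ == "__main__":
    # Made-up pixels: (LST error, PWV, cloud detected)
    pixels = [
        (0.5, 1.0, False),
        (2.5, 1.0, False),
        (0.5, 4.0, False),
        (0.5, 1.0, True),
    ]
    strict = retained_fraction(pixels, max_lst_err=1.0, max_pwv=2.0)
    relaxed = retained_fraction(pixels, max_lst_err=1.0, max_pwv=2.0,
                                keep_cloudy=True)
    print(strict, relaxed)
```

Running the same scene through several parameter combinations gives a direct view of the trade-off between data volume and data quality.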

This process of filtering the data is illustrated by applying it to two different days of data: data available on 2019-07-26 22:21:26 and data available on 2019-01-21 23:58:02.

For 2019-07-26 22:21:26, 19 of the 23 files are available:

##  [1] "/home/vemurk01/qnap_geo/ECOSTRESS/ECO2CLD.001_SDS_CloudMask_doy2019207222126_aid0001.tif"                                  
##  [2] "/home/vemurk01/qnap_geo/ECOSTRESS/ECO2LSTE.001_SDS_Emis2_doy2019207222126_aid0001.tif"                                     
##  [3] "/home/vemurk01/qnap_geo/ECOSTRESS/ECO2LSTE.001_SDS_Emis2_err_doy2019207222126_aid0001.tif"                                 
##  [4] "/home/vemurk01/qnap_geo/ECOSTRESS/ECO2LSTE.001_SDS_Emis4_doy2019207222126_aid0001.tif"                                     
##  [5] "/home/vemurk01/qnap_geo/ECOSTRESS/ECO2LSTE.001_SDS_Emis4_err_doy2019207222126_aid0001.tif"                                 
##  [6] "/home/vemurk01/qnap_geo/ECOSTRESS/ECO2LSTE.001_SDS_Emis5_doy2019207222126_aid0001.tif"                                     
##  [7] "/home/vemurk01/qnap_geo/ECOSTRESS/ECO2LSTE.001_SDS_Emis5_err_doy2019207222126_aid0001.tif"                                 
##  [8] "/home/vemurk01/qnap_geo/ECOSTRESS/ECO2LSTE.001_SDS_EmisWB_doy2019207222126_aid0001.tif"                                    
##  [9] "/home/vemurk01/qnap_geo/ECOSTRESS/ECO2LSTE.001_SDS_LST_doy2019207222126_aid0001.tif"                                       
## [10] "/home/vemurk01/qnap_geo/ECOSTRESS/ECO2LSTE.001_SDS_LST_err_doy2019207222126_aid0001.tif"                                   
## [11] "/home/vemurk01/qnap_geo/ECOSTRESS/ECO2LSTE.001_SDS_PWV_doy2019207222126_aid0001.tif"                                       
## [12] "/home/vemurk01/qnap_geo/ECOSTRESS/ECO2LSTE.001_SDS_QC_doy2019207222126_aid0001.tif"                                        
## [13] "/home/vemurk01/qnap_geo/ECOSTRESS/ECO3ANCQA.001_L3_L4_QA_ECOSTRESS_L2_QC_doy2019207222126_aid0001.tif"                     
## [14] "/home/vemurk01/qnap_geo/ECOSTRESS/ECO3ETPTJPL.001_EVAPOTRANSPIRATION_PT_JPL_ETcanopy_doy2019207222126_aid0001.tif"         
## [15] "/home/vemurk01/qnap_geo/ECOSTRESS/ECO3ETPTJPL.001_EVAPOTRANSPIRATION_PT_JPL_ETdaily_doy2019207222126_aid0001.tif"          
## [16] "/home/vemurk01/qnap_geo/ECOSTRESS/ECO3ETPTJPL.001_EVAPOTRANSPIRATION_PT_JPL_ETinst_doy2019207222126_aid0001.tif"           
## [17] "/home/vemurk01/qnap_geo/ECOSTRESS/ECO3ETPTJPL.001_EVAPOTRANSPIRATION_PT_JPL_ETinstUncertainty_doy2019207222126_aid0001.tif"
## [18] "/home/vemurk01/qnap_geo/ECOSTRESS/ECO3ETPTJPL.001_EVAPOTRANSPIRATION_PT_JPL_ETsoil_doy2019207222126_aid0001.tif"           
## [19] "/home/vemurk01/qnap_geo/ECOSTRESS/ECO4ESIPTJPL.001_Evaporative_Stress_Index_PT_JPL_ESIavg_doy2019207222126_aid0001.tif"
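The file names above carry the acquisition time in the `doy` token: `doy2019207222126` reads as year 2019, day of year 207, time 22:21:26, which is 2019-07-26 22:21:26. The sketch below shows one way to recover the product prefix and timestamp from a file name. It is written in Python for illustration only, and the helper names are hypothetical.

```python
import os
import re
from datetime import datetime

# Matches the 13-digit token after "_doy": YYYY + DDD + HHMMSS.
_DOY_PATTERN = re.compile(r"_doy(\d{13})_")


def acquisition_time(path):
    """Parse the acquisition timestamp from the doy token in a file name."""
    match = _DOY_PATTERN.search(os.path.basename(path))
    if match is None:
        raise ValueError(f"no doy token found in {path!r}")
    # %j is the zero-padded day of the year (001-366).
    return datetime.strptime(match.group(1), "%Y%j%H%M%S")


def product_prefix(path):
    """Return the leading product identifier, e.g. 'ECO2LSTE.001'."""
    return os.path.basename(path).split("_", 1)[0]


if __name__ == "__main__":
    name = "ECO2LSTE.001_SDS_LST_doy2019207222126_aid0001.tif"
    print(product_prefix(name), acquisition_time(name))
```

Grouping files by the parsed timestamp is a simple way to check which of the expected layers are present for a given scene.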

Step 1: Plotting the raw data

The files for 2019-07-26 22:21:26 contain 1588360 pixels, of which 1524780 (approximately 96%) have LST data. After reading in all the available raster files for 2019-07-26 22:21:26, the LST data for this scene appears as in the image below:

Step 2: Applying the Land-Water mask

After applying the land-water mask provided with the cloudmask dataset, only about 15% of the pixels are expected to have reliable data quality. In other words, nearly 85% of the data is excluded when the quality/cloudmask criteria are applied rigorously.

The land-water mask appeared to systematically exclude a large portion of data over land, so I examined the cloudmask data for 10 random scenes, to see if this was true:

The scenes above suggest that there is some pattern to the way the land-water mask excludes certain data, and we might need to build our own land-water mask, or adapt the current one to suit our needs better.
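Checking how the land-water mask behaves comes down to testing a single bit of each cloudmask value and counting the pixels flagged as land. The sketch below is in Python for illustration only. The bit position `ASSUMED_WATER_BIT` and the convention that 0 means land are assumptions that should be confirmed against the Cloud Detection Algorithm ATBD.

```python
# Assumption: bit 5 of the cloudmask value holds the land/water flag,
# with 0 = land and 1 = water. Verify against the cloud mask ATBD.
ASSUMED_WATER_BIT = 5


def is_land(cloud_mask_value, water_bit=ASSUMED_WATER_BIT):
    """Return True if the (assumed) land/water bit is 0, i.e. land."""
    return ((cloud_mask_value >> water_bit) & 1) == 0


def land_summary(values, water_bit=ASSUMED_WATER_BIT):
    """Count land pixels and their share (%) of all pixels in `values`.

    Missing pixels (None) count towards the total but never as land.
    """
    if not values:
        return 0, 0.0
    n_land = sum(1 for v in values if v is not None and is_land(v, water_bit))
    return n_land, 100.0 * n_land / len(values)


if __name__ == "__main__":
    # Made-up cloudmask values; 32 and 33 have bit 5 set.
    print(land_summary([0, 32, 1, 33]))
```

Changing `water_bit` or inverting the test makes it easy to try an adapted mask on the same scenes and compare how much land data each variant keeps.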

Step 3: Applying QC codes

## number of pixels under land: 334625
## proportion of pixels under land: 21.0673

The sparse data in the image above illustrates the need for judicious filtering of the data using the quality control and cloudmask values. Since ~80% of pixels carry cloudmask values that indicate less-than-good data quality, fine-tuning how the cloudmask values are applied is essential to retain as much data of acceptable quality as possible.
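A useful first step in that fine-tuning is to tabulate how many pixels fall under each distinct cloudmask value, so that the values responsible for most of the excluded data can be examined individually. The sketch below is in Python for illustration only; the function name and the sample values are hypothetical.

```python
from collections import Counter


def value_shares(values):
    """Return (value, count, percent) per distinct value, largest first.

    Missing pixels (None) are ignored, so percentages are relative to
    the pixels that actually have a cloudmask value.
    """
    counts = Counter(v for v in values if v is not None)
    total = sum(counts.values())
    return [(value, n, 100.0 * n / total) for value, n in counts.most_common()]


if __name__ == "__main__":
    # Made-up cloudmask values for a handful of pixels.
    for value, n, pct in value_shares([1, 1, 1, 33, None]):
        print(f"value {value:3d} ({value:08b}): {n} pixels, {pct:.1f}%")
```

Printing the binary form next to each value makes it easier to see which individual flags drive the largest exclusions.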