We aim to create a simplified, analysis-ready version of AER FloodScan that can be shared with users on the HDX platform. The goal is to facilitate near-real-time monitoring and contextualization of flooding across humanitarian responses. To achieve this, we propose the following simplified data structure and file format for sharing:
A 90-day rotating zip file containing 90 Cloud Optimized GeoTIFFs (one per day), each with the following bands: SFED, SFED_ANOMALY, and SFED_BASELINE.
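For orientation, a single day's file can be read straight from the zip with terra; a minimal sketch (the inner file name is hypothetical, following the YYYYMMDD naming convention used in the Somalia example below):

Code
box::use(terra[rast])
# read one daily COG from inside the zip via GDAL's /vsizip/ virtual file system
# (the inner file name here is hypothetical)
r_day <- rast("/vsizip/20240115_aer_area_300s_SFED_90d.zip/aer_area_300s_SFED_20240115.tif")
# one layer per band; note the anomaly band appears as "SFED_ANOM" in the code below
names(r_day)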
Below we display the file structure and illustrate some crude but promising ways this simplification could be easily used by a wide user base on HDX. For the sake of this example, we simulate a file package that would have been downloaded for Somalia on 15 January 2024.
Code
box::use(terra[...])
box::use(sf[...])
box::use(tidyterra[...])
box::use(dplyr[...])
box::use(stringr[...])
box::use(lubridate[...])
box::use(purrr[...])
box::use(ggplot2[...])
box::use(forcats[...])
box::use(tidyr[...])
box::use(readr[...])
box::use(gghdx[...])
box::use(rnaturalearth)
box::use(extract = exactextractr)
box::use(AzureStor[...])
box::use(gganimate[...])
box::use(tmap[...])
box::use(zoo)
box::use(../../R/utils) # module with the download_fieldmaps_sf() helper
box::use(../../src/utils/blob)
box::use(paths = ../../R/path_utils)

sf_use_s2(FALSE)
gghdx()

Sys.setenv(AZURE_SAS = Sys.getenv("DSCI_AZ_SAS_DEV"))
Sys.setenv(AZURE_STORAGE_ACCOUNT = Sys.getenv("DSCI_AZ_STORAGE_ACCOUNT"))

# averages will be calculated from historical data up to this date
END_DATE_BASELINE <- as.Date("2020-12-31")
SFED_THRESHOLD <- 0.01

bc <- blob$load_containers()
gc <- bc$GLOBAL_CONT
pc <- bc$PROJECTS_CONT
fps <- paths$load_paths(virtual = TRUE)

lgdf <- utils$download_fieldmaps_sf(
  iso3 = "SOM",
  layer = c("som_adm0", "som_adm1", "som_adm2")
)
Data Structure
We imagine that many users will use a desktop GIS application to inspect and contextualize the individual .tif files contained in the zip. Here is a screenshot of a QGIS environment with the AER FloodScan data loaded alongside some extra contextual layers.
To explain the data more comprehensively in this document, we will load all of the rasters in the zip at once using R rather than QGIS.
Code
# utility function to extract the date from a file/source name
extract_date <- function(x) {
  as.Date(str_extract(x, "\\d{8}"), format = "%Y%m%d")
}

ZIP_PATH <- here::here("20240115_aer_area_300s_SFED_90d.zip")

# get the names of the tifs inside the zip - sort by date
tif_files <- tibble(
  tif_name = str_subset(unzip(ZIP_PATH, list = TRUE)$Name, "\\.tif$"),
  tif_date = extract_date(tif_name)
) |>
  arrange(tif_date)

# format for GDAL Virtual File System (VFS)
ZIP_VFS <- paste0("/vsizip/", ZIP_PATH, "/", tif_files$tif_name)

# read in all rasters at once
r <- rast(ZIP_VFS)
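As a quick sanity check, the package should span the expected 90-day window; a small sketch assuming three bands per daily file, as proposed above:

Code
# 90 daily files, each contributing 3 layers (SFED, SFED_ANOM, SFED_BASELINE)
nlyr(r) / 3
# dates should cover the 90 days up to the download date (2024-01-15)
range(tif_files$tif_date)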
We subset the SFED_ANOM band to get a time series of anomaly values:
r_anom <- r[[names(r) == "SFED_ANOM"]]

# now that we have subset to the anomaly data we can rename the layers by
# their date for easier time-series construction/manipulation
r_date <- extract_date(basename(sources(r_anom)))
set.names(r_anom, r_date)
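With layers named by date, individual days can be pulled out directly by name; a minimal sketch assuming the objects created above:

Code
# select a single day's anomaly layer by its date name
r_anom[["2024-01-15"]]
# or the most recent week of layers
r_anom[[as.character(max(r_date) - 0:6)]]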
Mapping
We can then map this data at any time step to understand specifically where flooding/flood anomalies are occurring. It is likely that many users will use desktop GIS software for this. Here we show an animation of the last 90 days where each frame represents 1 day.
Code
# copy the anomaly raster and mask out values at or below the SFED threshold
# so only flooded cells are displayed
r_anom_map <- deepcopy(r_anom)
r_anom_map[r_anom_map <= SFED_THRESHOLD] <- NA
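The animation itself could be produced along these lines; a minimal sketch assuming the objects created above and the packages attached in the setup chunk (styling and output settings are illustrative):

Code
# reshape the masked raster stack into a long data frame of daily values
df_anom <- as.data.frame(r_anom_map, xy = TRUE) |>
  pivot_longer(-c(x, y), names_to = "date", values_to = "sfed_anom") |>
  filter(!is.na(sfed_anom)) |>
  mutate(date = as.Date(date))

# one frame per day over the 90-day window
p_anim <- ggplot() +
  geom_sf(data = lgdf$som_adm0, fill = "grey95", color = "grey40") +
  geom_raster(data = df_anom, aes(x = x, y = y, fill = sfed_anom)) +
  scale_fill_viridis_c(name = "SFED anomaly") +
  labs(title = "SFED anomaly: {current_frame}") +
  transition_manual(date)

animate(p_anim, nframes = length(unique(df_anom$date)), fps = 5)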
While mapping is useful for understanding where events are happening, the results can be difficult to quantify directly. We therefore imagine that zonal statistic time series derived from the simplified data set could be useful in many contexts.
Below we’ve taken the daily average SFED anomaly across Bay State in Somalia and created an animation where each frame represents a day. We’ve expanded the analysis time range here to the past 365 days to provide more context for this document, but in real-time applications users would have access to the current day and the past 90 days.
Underneath the SFED anomaly plot we show a theoretical example illustrating how SFED anomaly dates could be flagged according to one type of momentum indicator known as the MACD crossover. A crossover occurs when the shorter-window rolling average crosses the longer-window rolling average, indicating a shift in momentum. There are many interesting methods by which anomaly values could be flagged and thresholded; this one was chosen purely for illustration (a sketch is given after the zonal statistics chunk below).
Code
# zonal means per admin 1
df_zstat_adm1 <- extract$exact_extract(
  r_anom,
  lgdf$som_adm1,
  fun = "mean",
  append_cols = c("ADM1_EN", "ADM1_PCODE"),
  progress = FALSE
) |>
  pivot_longer(-starts_with("ADM")) |>
  separate(name, into = c("stat", "date"), sep = "\\.") |>
  mutate(date = as.Date(date))
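The MACD-style crossover flag described above can then be computed from this time series; a minimal sketch for the Bay admin 1 series (the 7- and 21-day windows are illustrative choices):

Code
# rolling averages of the zonal-mean anomaly for Bay
df_macd <- df_zstat_adm1 |>
  filter(ADM1_EN == "Bay") |>
  arrange(date) |>
  mutate(
    ma_short = zoo$rollmeanr(value, k = 7, fill = NA),
    ma_long = zoo$rollmeanr(value, k = 21, fill = NA),
    # crossover: the short-window average moves above the long-window average
    crossover = ma_short > ma_long & lag(ma_short) <= lag(ma_long)
  )

# dates flagged as shifts in momentum
df_macd |>
  filter(crossover)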