We at CivicDataLab are engaged in a data experiment for preparing an intelligent data model for flood response and management. As part of the experiment, we have to assign weights to multiple socio-economic variables by assessing the relationship of the selected variable with the flood-proneness of the region. We aim to use a data-driven approach, borrowing concepts from Bayesian statistics, to calculate the weight of different classes of a variable to the flood-proneness of the region.
Bayesian statistics calculates probability based on conditions, hence also known as conditional probability. Here, we use Bayesian statistics to estimate the probability at which different categories of a given socio-economic variable experiences flooding in any sub-district. For a given variable, it is the proportion of the affect flood has on a category to the affect of the entire category in a study area. The proportion is recalculated for different years of floods and averaged to get a probability, which is taken as the weight of each category of a given variable.
Here, we calculate the yearly probability of physical infrastructure layers as VarWt = P(No. of cells that are flooded and having the infrastructure) / P(No. of cells with the infrastructure in the studied area). The physical infrastructure layer shown in the code is that of hospital. The same logic can be applied to different road categories and embankments (CONDITIONAL_PROB_LAYER_Rd_Arterial.R).
For the work, we have to load the following libraries:
packages <- c("terra")
#read the district shapefile
state_4326 <- vect("assam_State.shp")
The data for the analysis are (1) the physical infrastructure layer , which in this case is the raster map showing the point locations of hospital,(2) the yearly flooding probability map, (3) the micro-level study area is the sub-districts of the state of Assam.
layer_type <- 'hosptial'
#reading all the subdistrict of assam
districts <- vect("ASSAM_SUBDISTRICT_4326.shp")
dummy_var <- rast("conditioningfactors/slope.tif")
dummy_rast <- rast(ext=ext(dummy_var),crs=crs(dummy_var, proj=T, describe=FALSE, parse=FALSE),
nrow=dim(dummy_var)[1],ncol=dim(dummy_var)[2])
###***reading the layer of physical infr***###
infr <- rast(sprintf("state/phyinfr/%s_rd.tif",layer_type))
infr <- infr > 0
plot(infr)
Next, we calculate the proportion of the layer affected by floods each
year. We create a map which shows the infrastructure that is affected by
floods and extract the flooded area of each sub-district. We divide this
with the spatial extent of the infrastructure for each sub-district.
year <- c(2018:2018)
condProb <- function(yr){
flood <- rast(sprintf("conditioningfactors/floodedareas_grtr1in5_%sAug.tif",yr))
#resampling pop map to same extents # flood maps does not work as the terra::extract function goes out of available disk space
#therefore, resample the flood to extents and resolution of population
pop <- rast(sprintf('state/pop_ind_ppp_%s_UNadj.tif',yr))
dummy <- rast(ext=ext(pop),crs=crs(pop, proj=T, describe=FALSE, parse=FALSE), nrow=dim(pop)[1], ncol=dim(pop)[2])
flood_RS <-resample(flood, dummy, method = 'bilinear')
flooded_and_infr <- flood_RS * infr
flooded_and_infr <- flooded_and_infr > 0
flooded_and_infr[flooded_and_infr == 0] <- NA
plot(flooded_and_infr, main = "Infrastrucutre affectd by floods")
more65_sub <- terra::extract(flooded_and_infr, districts, sum, na.rm=T)
pop_more65 <- terra::extract(infr, districts, sum, na.rm=T)
cond_prob <- (more65_sub[,2]/pop_more65[,2])
return(cond_prob)
}
cond_prob <- sapply(year, condProb)
Then, we take the average of the computed value and rasterise it. The
rasterised image can also be taken as the vulnerability of
administrative unit due to the number and location of hospitals.
districts$avgProb <- rowMeans(cond_prob)
cond_prob_table <- data.frame(layer = layer_type,
rc = districts$sdtname,
prob = districts$avgProb)
#rasterise the sub-district shape file with the average conditional probability
pop <- rast('state/pop_ind_ppp_2015_UNadj.tif')
sampl <- rast(nrow = nrow(pop), ncol = ncol(pop))
ext(sampl) <- ext(pop)
res(sampl) <- res(pop)
crs(sampl) <- crs(pop)
#this will also be the vulnerability of adminstrative unit due to the number and location of hospitals
dar_avgProb <- rasterize(districts , sampl, 'avgProb')
plot(rast(sprintf("outputs_vulnerability/%s_State_estimatedconditionalProb_005.tif",layer_type)),
main = sprintf('Estimated conditional probability of flooding \n affecting %s at Revenue circles',layer_type),
col = heat.colors(100))
The product of the average of the computed value with the infrastructure
will give the vulnerability of the infrastructure to floods with repect
to the floode area in the revenue circle and infrastructure.
layer_vul <- dar_avgProb * infr
plot(layer_vul, main = sprintf("%s weighted with \n Probability (Flooding|%s)",layer_type,layer_type))
#save the layers
The overall impact of floods on the infrastructure or the flood risk on the infrastructure is the product of the vulnerability of the layer and the yearly flooding probability.
#cal flood risk of the layer each year
#run the loop for all layers
for (yr in 2018:2018){
layr <- rast(sprintf("outputs_vulnerability/%s_State_005.tif",layer_type))
layr <-resample(layr, dummy_rast)
fld <- rast(sprintf("conditioningfactors/floodedareas_grtr1in5_%sAug.tif",yr))
fld <-resample(fld, dummy_rast)
plot(rast("conditioningfactors/floodedareas_grtr1in5_2018Aug_RS.tif"),
main = "flooded areas")
risk <- fld * layr
plot(risk, main = "Flood risk to the infrastructure")
}