Final Project

Riverine Flooding Risk in AZ

Introduction

I will be examining the annualized frequency of riverine flooding events on the tract level in Arizona. Riverine flooding occurs when the water in rivers or creeks swells and overflows the banks, creating potentially hazardous conditions for the surrounding area. Riverine flooding occurs over time but is made worse when sudden torrential rain causes flash floods. These rain events are especially common during Arizona’s monsoon season from June to September. Arizona’s arid landscape of washes and gullies makes it a prime target of landslides, making these flooding events more dangerous. By better understanding the risks of riverine flooding, we can alleviate the negative outcomes for the surrounding environment and population.

Data Preparation

## Loading in Libraries 
library(tidyverse)
library(sf)
library(tmap)
library(wordcountaddin)
library(spdep)
library(kableExtra)

## Read in CSV File
dat <- read_csv("raw_data/NRI_Table_CensusTracts_Arizona.csv")

## Select the Important Columns to be Included
dat <- dat |>
  select(STATE, TRACTFIPS, RFLD_AFREQ)

## Rename the columns
dat <- dat |>
  rename(Annualized_Frequency = RFLD_AFREQ, GEOID = TRACTFIPS)

## Just Get AZ Tracts for AZ From Tigris
tct_tiger_sf <- read_sf("raw_data/tct_tiger_sf.gpkg")

## Select the Important Columns to be Included for the Spatial Data Layer
tct_tiger_sf <- tct_tiger_sf |>
  select(GEOID)

## Complete Table Join
az_tct_risk_sp <- left_join(tct_tiger_sf,
                            dat,
                            by = "GEOID")

My data sources were from the National Risk Index where I imported the census tract data for Arizona, and I also brought in a spatial data layer by downloading AZ tract level data from Tigris. I wrangled my data so I could specifically work with tracts in Arizona and my variable of interest, annualized frequency of riverine flooding by event days per year. Throughout the data wrangling process, I also selected the rows with data specifically for the annualized frequency of riverine flooding in Arizona and renamed these rows, so that I could better organize my data. I then joined my National Risk Index layer with my spatial data layer by the GeoID, so that I would only have to work with one table rather than two.

Exploratory Spatial Data Analysis

Data Description and Summary

## Calculate Mean Annualized Frequency
mean <- mean(az_tct_risk_sp$Annualized_Frequency)

## Calculate Minimum Annualized Frequency
min <- min(az_tct_risk_sp$Annualized_Frequency)
  
## Calculate Maximum Annualized Frequency
max <- max(az_tct_risk_sp$Annualized_Frequency)

## Calculate Number of Observations
obs <- nrow(az_tct_risk_sp)

## Plot Histogram
ggplot(data = az_tct_risk_sp,
       aes(Annualized_Frequency)) + 
  geom_histogram(binwidth = 1,
                 fill = "lightblue",
                 col = "grey20") +
  xlab("Annualized Frequency of Riverine Flooding Event Days per Year in AZ")

The variable of interest, annualized frequency, helps to understand where the tracts in Arizona with the highest and lowest annual rates of riverine flooding are. This variable is calculated by taking the recorded number of event days over the period of record. Of the 1765 tracts in Arizona that I am observing, the largest annualized frequency of riverine flooding event days was 12.6666667, the smallest annualized frequency was 0 , the average annualized frequency was 4.1362134. Around 800 of the tracts experienced an annualized frequency of between 0-1.25 riverine flooding event days, but a little less than 500 had an annualized frequency between 6-7.5 events. It is difficult to see a general trend on the histogram as there are a significant tracts with small values as well as moderate annualized frequency.

Geographic Distribution and Clustering

## Make copy
az_af_org <- az_tct_risk_sp

## Remove NAs
az_tct_risk_sp <- az_tct_risk_sp |>
  filter(!is.na(Annualized_Frequency))

## Create Queen case neighbors
az_queen <- poly2nb(az_tct_risk_sp,
                    queen = TRUE)

## First, convert our neighbors to weight matrix
az_q_weights <- nb2listw(az_queen,
                         style = "B",         
                         zero.policy = TRUE)  

## Moran's I
az_af_tct_moran <- moran.test(az_tct_risk_sp$Annualized_Frequency,
                             az_q_weights,          
                             zero.policy = TRUE, 
                             randomisation = TRUE)

## MORAN'S I VALUE
moran_i_value <- az_af_tct_moran$estimate[1]

## P-Value for MORAN'S I
moran_p_value <- az_af_tct_moran$p.value

## LISA Map
az_af_lisa <- localmoran(az_tct_risk_sp$Annualized_Frequency,# Variable we're analyzing
                            listw = az_q_weights,      # Weights object
                            alternative = "two.sided", # Clustering or Dispersion
                            zero.policy = TRUE) |>     # Best to keep TRUE for LISA
  as_tibble() |>                                     # Better object type
  mutate(across(everything(), as.vector))            # Remove junk from localmoran output

## Add values required for LISA category
az_af_lisa <- az_af_lisa |>
  mutate(SCVAR =  scale(az_tct_risk_sp$Annualized_Frequency) |> as.vector(), 
         LAGVAR = lag.listw(az_q_weights, scale(az_tct_risk_sp$Annualized_Frequency)),  
         LISACAT = case_when(SCVAR >= 0 & LAGVAR >= 0 & `Pr(z != E(Ii))` <= 0.05 ~ 1,
                             SCVAR <= 0 & LAGVAR <= 0 & `Pr(z != E(Ii))` <= 0.05 ~ 2,
                             SCVAR >= 0 & LAGVAR <= 0 & `Pr(z != E(Ii))` <= 0.05 ~ 3,
                             SCVAR <= 0 & LAGVAR >= 0 & `Pr(z != E(Ii))` <= 0.05 ~ 4,
                             `Pr(z != E(Ii))` > 0.05 ~ 5))

## This simply adds a label based on the values
az_af_lisa <- az_af_lisa |>
  mutate(CATNAME = case_when(LISACAT == 1 ~ "HH",
                             LISACAT == 2 ~ "LL",
                             LISACAT == 3 ~ "HL",
                             LISACAT == 4 ~ "LH",
                             LISACAT == 5 ~ "Not Significant"))

## Quick SUMMARY of LISA output
lisa_summary <- table(az_af_lisa$CATNAME)

## Add LISA category column to the spatial data for mapping!
az_tct_risk_sp <- az_tct_risk_sp |>
  mutate(LISACAT = az_af_lisa$LISACAT,
         CATNAME = az_af_lisa$CATNAME)

## Choropleth Map 
az_af_tmap <- 
  tm_shape(az_af_org) +
  tmap_mode("view") +
  tm_polygons(col = "grey50") +
  tm_shape(az_tct_risk_sp) + 
  tm_polygons("Annualized_Frequency",
              title = "Annualized_Frequency (Days per Year)",
              style = "jenks",
              palette = "Blues",
              colorNA = "white",
              border.col = "black",
              border.alpha = 0.25,
              legend.hist = TRUE) +
  tm_layout(frame = FALSE,
            legend.outside = TRUE)

## LISA map
lisa_tmap <- 
  tm_shape(az_af_org) +
  tmap_mode("view") +
  tm_polygons(col = "grey50") +
  tm_shape(az_tct_risk_sp) + 
  tm_polygons("LISACAT", 
              title = "LISA Category",
              breaks = c(1, 2, 3, 4, 5, 6),
              palette =  c("red", 
                           "blue", 
                           "lightpink", 
                           "skyblue", 
                           "grey90"),
              colorNA = "white",
              labels = c("High-High", 
                         "Low-Low",
                         "High-Low",
                         "Low-High", 
                         "Not significant"),
              border.col = "black", 
              border.alpha = 0.25) +
  tm_layout(frame = FALSE,
            legend.outside = TRUE)

## Map Them Together
tmap_arrange(az_af_tmap, 
             lisa_tmap,
             ncol = 1,
             nrow = 2)

## Print nice version of the LISA results
kable(lisa_summary, 
      digits = 1,
      format.args = list(big.mark = ",",
                         scientific = FALSE,
                         drop0trailing = TRUE),
      caption = "LISA Summary Results") %>% 
  kable_styling(bootstrap_options = c("striped", 
                                      "hover", 
                                      "condensed", 
                                      "responsive"), 
                full_width = F)

LISA Summary Results
Var1	Freq
HH	235
HL	5
LH	36
LL	216
Not Significant	1,273

Areas with the highest annualized frequency of riverine flooding event days are to the south and northwest of Arizona, but there are some areas in the central part of Arizona that appear to experience a moderate annual frequency of riverine flooding event days. The data from the choropleth map also appears to be clustered. The Moran’s I value was 0.5516272, so this means the data is in fact clustered. From the LISA results we can see that there are 216 low-low clusters and 235 high-high clusters, so the clustering tracts are relatively similar between high and low clusters. There are also 1,273 tracts that have no significant value for clustering. The LISA map shows how there are high-high clusters to the south and northwest where the choropleth map showed the high values of annualized frequency. The LISA map also shows low-low clusters to the northeast, central, and southwest where the choropleth map showed lower frequency values.

Conclusions

Across the 1765 tracts observed in Arizona, the annual frequency of riverine flooding ranged from 0 to around 12 event days. The tracts are clustered in areas of high-high concentration to the northwest and south as well as low-low clusters in the northeast, central, and southwest regions of Arizona. This is to be expected as these flooding events are caused by geographical features such as rivers and streams. Understanding riverine flooding events and where the risks are is especially important as climate change impacts the region through more frequent torrential rain and erratic storm systems.

Document Statistics

wordcountaddin::text_stats()

Method	koRpus	stringi
Word count	642	635
Character count	4025	4025
Sentence count	28	Not available
Reading time	3.2 minutes	3.2 minutes