Prior Literature

Research Questions

  • What has been the marginal effect of flood events on municipal budget factors in the past?
  • How do recurring or multiple flood events affect municipal budgets?
  • What is the timing of effects? What is the flood level threshold where effects are felt?

Data Sources

Municipal Financial Data

Municipal financial data is drawn from the state-standardized annual financial reports filed by municipalities (cities, townships, and boroughs). For Pennsylvania, data is available for download here: PA Financial Reports

This data includes revenues, taxes, expenditures, and debt for each year from 2006 to 2020. For analysis, all monetary variables are expressed in 2020 dollars.
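
The conversion to 2020 dollars can be sketched as follows. This is a minimal illustration, assuming a price index keyed by year; the index values are placeholders rather than actual CPI figures, and `to_2020_dollars` is a hypothetical helper name, not something from the source data.

```python
# Placeholder price index (NOT real CPI values); swap in an actual deflator series.
PRICE_INDEX = {2006: 80.0, 2013: 90.0, 2020: 100.0}

def to_2020_dollars(nominal, year, index=PRICE_INDEX, base_year=2020):
    """Rescale a nominal amount observed in `year` into base-year dollars."""
    return nominal * index[base_year] / index[year]

# With the placeholder index, 1,000 nominal dollars from 2006 become 1,250 in 2020 dollars.
real_2006 = to_2020_dollars(1000.0, 2006)
real_2020 = to_2020_dollars(500.0, 2020)  # base-year amounts are unchanged
print(real_2006, real_2020)
```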

Municipal Geographic Boundaries
Municipal boundary records are generally maintained by state agencies. For Pennsylvania, municipal boundaries were downloaded in 2023 from Pennsylvania Department of Transportation records: PA Municipal Boundary Shapefile

There are a few anomalous municipality names in the financial data that do not match geographic boundaries, but the majority of financial reports match to state-defined geographic boundaries: 97% of records match.
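
One way a match rate like this could be computed is sketched below. It assumes records are joined on a normalized (county, municipality) pair, since municipality names can repeat across counties. The normalization rules and the example names are hypothetical and are not taken from the actual matching procedure.

```python
import re

def normalize(name):
    """Uppercase, drop punctuation, collapse whitespace, and standardize suffixes."""
    s = re.sub(r"[^A-Za-z0-9 ]", "", name).upper()
    s = re.sub(r"\s+", " ", s).strip()
    s = re.sub(r"\bTOWNSHIP\b", "TWP", s)
    s = re.sub(r"\bBOROUGH\b", "BORO", s)
    return s

def match_rate(financial, boundaries):
    """Share of (county, municipality) financial records found in the boundary file."""
    keys = {(normalize(c), normalize(m)) for c, m in boundaries}
    matched = [rec for rec in financial if (normalize(rec[0]), normalize(rec[1])) in keys]
    return len(matched) / len(financial)

# Hypothetical names for illustration only.
financial = [("Example", "Sample Twp."), ("Example", "Demo Borough"), ("Example", "Unknown Place")]
boundaries = [("EXAMPLE", "Sample Township"), ("Example", "Demo Boro")]
rate = match_rate(financial, boundaries)
print(f"{rate:.0%} of financial records matched")
```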

Flood Event Data
Historical flood events are from the NOAA National Centers for Environmental Information (NCEI) Storm Events Database. This dataset contains NOAA’s official record of storms and significant weather phenomena. From 1996 onwards, the dataset includes all event types, including multiple types of flooding events. The NCEI receives event data from the National Weather Service (NWS), which in turn gathers data from several sources: public agencies and officials (county, state, federal), local law enforcement, SKYWARN spotters, NWS damage surveys, the insurance industry, and news media and the general public. The data used here covers 1996 - 2022.

The NWS Storm Events dataset captures flooding events that are smaller, more localized, and more frequent than FEMA’s federally declared disasters dataset. It is the core data source of ASU’s SHELDUS dataset.

The quality of locational data (especially latitude and longitude) in this dataset varies over time. From 1996 - 2006, the NWS used a range / direction process, referenced from the nearest town. After 2006, the NWS implemented a Google Maps interface that allowed for more precise geolocating. Because our geography of interest is municipalities, we’ve kept 1996 - 2006 data here, but it may be prudent to limit analysis to post-2006 in future work. Similarly, damage figures should be treated as estimates only: they come from a variety of sources and are expressed in nominal dollars.

Data Overview

Within the NWS Storm Events database, we categorize several events as flood-related: Coastal Floods, Flash Floods, Lakeshore Floods, Tropical Storms, and Debris Flows. The vast majority of flood events are categorized as “Floods” or “Flash Floods.”

Matching the flood event data with municipal boundaries is not perfect. The database is standardized at the county or “county-equivalent” level, meaning all records have a valid county and state designation. However, many records do have lat/lons for the beginning and end of an event; they also have municipality names listed for those affected. By matching via lat/lons and municipality names, 72% of the historical flood events in Pennsylvania can be successfully linked to a municipal boundary. 20% of records are explicitly listed as “countywide” in their effects. 8% of records are not explicitly “countywide,” and have some narrative description from which municipal associations could be gleaned - however, the process for attribution can be subjective.

However, the quality of the Storm Events Database geolocation data is higher in later years (as described above). Given that municipal financial data for Pennsylvania is only available after 2006, there is a higher matching rate of storm events to municipalities during the study period. Below is the match rate for events that occurred after 2006: 94% of storm events in the study period can be matched to municipalities by latitude/longitude or municipality name.
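
The two-step matching described above (coordinates first, then listed municipality name) can be sketched in plain Python. This is an illustration under simplifying assumptions: each event has a single point rather than separate begin and end coordinates, boundaries are simple polygons, and the municipalities "Alpha" and "Beta" and the event records are made up. A real workflow would more likely use a GIS library for the spatial join.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; `polygon` is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray cast from the point cross this edge?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def match_event(event, municipalities):
    """Return (municipality, method), trying coordinates first, then the listed name."""
    if event.get("lon") is not None and event.get("lat") is not None:
        for name, polygon in municipalities.items():
            if point_in_polygon(event["lon"], event["lat"], polygon):
                return name, "coordinates"
    listed = (event.get("place") or "").strip().upper()
    for name in municipalities:
        if listed and listed == name.upper():
            return name, "name"
    return None, "unmatched"

# Hypothetical boundaries (two adjacent unit squares) and hypothetical events.
municipalities = {
    "Alpha": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "Beta": [(1, 0), (2, 0), (2, 1), (1, 1)],
}
events = [
    {"lon": 0.5, "lat": 0.5, "place": ""},        # falls inside Alpha
    {"lon": None, "lat": None, "place": "beta"},  # no coordinates, name only
    {"lon": 5.0, "lat": 5.0, "place": ""},        # outside every boundary
]
results = [match_event(e, municipalities) for e in events]
rate = sum(1 for name, _ in results if name is not None) / len(events)
print(results, rate)
```

Events that remain unmatched after both steps correspond to the "countywide" and narrative-only records discussed earlier.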

Geographic Distribution of Flood Events
Flood events in the database are distributed throughout the state, but high-population cities (Philadelphia, Pittsburgh) have more flood events reported.

The subset of events that are not attributed to a municipality, generally recorded as “countywide,” are more likely to be in counties closer to the coast, and are likely the result of tropical storms or hurricanes. “Countywide” impacts will be included by assuming all municipalities in the county are affected by the event, and distributing any recorded damages across them weighted by population.
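
The population-weighted allocation of countywide damages amounts to giving each municipality its population share of the county total. A minimal sketch, with a hypothetical function name and made-up populations and damages:

```python
def allocate_countywide_damage(total_damage, populations):
    """Split a county-level damage figure across municipalities by population share."""
    total_pop = sum(populations.values())
    if total_pop <= 0:
        raise ValueError("county population must be positive")
    return {name: total_damage * pop / total_pop for name, pop in populations.items()}

# Hypothetical county with three municipalities and $100,000 in recorded damages.
shares = allocate_countywide_damage(100_000.0, {"A": 6_000, "B": 3_000, "C": 1_000})
print(shares)
```

By construction the allocated amounts sum to the county total, so no damages are lost or double counted within a county.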

Table A1 shows the summary statistics of municipalities exposed to flood events and those not exposed. Comparing treated municipalities (those with at least one flood event during the 2006 - 2020 time period) with untreated municipalities, we see that treated municipalities have greater population and greater land area. Treated municipalities also have higher per capita tax revenue, driven more by real estate tax revenue (i.e. property value) than by earned income tax revenue.

Table A1: Comparison of Treated and Non-treated Municipalities (2020)

| Variable | Control (N=1310) Mean | Control Std. Dev. | Treatment (N=1173) Mean | Treatment Std. Dev. | Diff. in Means | Std. Error |
|---|---|---|---|---|---|---|
| Population (thousands) | 2.4 | 4.2 | 7.9 | 46.7 | 5.5*** | 1.4 |
| Land Area (sq miles) | 15.4 | 19.2 | 20.3 | 20.3 | 4.9*** | 0.8 |
| Total Revenues | 2167.0 | 7342.1 | 16587.8 | 301449.7 | 14420.8 | 8899.4 |
| Total Expenditures | 2056.5 | 7044.5 | 16216.0 | 298621.2 | 14159.5 | 8815.8 |
| Revenues Over Expenditures | 110.5 | 876.3 | 371.9 | 4075.0 | 261.4* | 122.8 |
| Real Estate Tax Revenues | 394.0 | 1396.1 | 2125.6 | 21340.2 | 1731.6** | 631.1 |
| Earned Income Tax Revenues | 338.5 | 1025.0 | 3012.4 | 62893.9 | 2673.9 | 1856.5 |
| Per Capita Tax Revenues | 2.6 | 7.1 | 6.6 | 30.9 | 4.0*** | 0.9 |
| Intergovernmental Revenues-Federal Government | 23.7 | 153.2 | 826.7 | 23983.2 | 803.0 | 707.9 |
| Total Debt | 1156.9 | 7383.9 | 9670.8 | 165982.2 | 8513.9+ | 4903.2 |

| Town Type | Control N | Control Pct. | Treatment N | Treatment Pct. |
|---|---|---|---|---|
| 1TWP | 19 | 1.5 | 71 | 6.1 |
| 2TWP | 674 | 51.5 | 760 | 64.8 |
| BORO | 603 | 46.0 | 302 | 25.7 |
| CITY | 14 | 1.1 | 40 | 3.4 |

Monetary variables in thousands of dollars
These differences remain even when removing the clear outliers of major cities (Philadelphia, Pittsburgh). Table A2 shows the summary statistics of exposed and unexposed municipalities with Philadelphia and Pittsburgh removed. While the differences in means are smaller, the differences between treated and untreated municipalities are still statistically significant.

Table A2: Outliers Removed - Comparison of Treated and Non-treated Municipalities (2020)

| Variable | Control (N=1310) Mean | Control Std. Dev. | Treatment (N=1171) Mean | Treatment Std. Dev. | Diff. in Means | Std. Error |
|---|---|---|---|---|---|---|
| Population (thousands) | 2.4 | 4.2 | 6.3 | 9.5 | 3.9*** | 0.3 |
| Land Area (sq miles) | 15.4 | 19.2 | 20.2 | 19.9 | 4.7*** | 0.8 |
| Total Revenues | 2167.0 | 7342.1 | 6965.3 | 16242.2 | 4798.3*** | 522.5 |
| Total Expenditures | 2056.5 | 7044.5 | 6650.9 | 16103.2 | 4594.4*** | 515.4 |
| Revenues Over Expenditures | 110.5 | 876.3 | 314.4 | 2668.1 | 203.9* | 82.6 |
| Real Estate Tax Revenues | 394.0 | 1396.1 | 1386.4 | 3431.4 | 992.4*** | 108.7 |
| Earned Income Tax Revenues | 338.5 | 1025.0 | 1066.1 | 2192.9 | 727.6*** | 70.9 |
| Per Capita Tax Revenues | 2.6 | 7.1 | 6.6 | 30.9 | 4.0*** | 0.9 |
| Intergovernmental Revenues-Federal Government | 23.7 | 153.2 | 84.0 | 421.9 | 60.3*** | 13.2 |
| Total Debt | 1156.9 | 7383.9 | 4429.4 | 20448.2 | 3272.5*** | 638.9 |

| Town Type | Control N | Control Pct. | Treatment N | Treatment Pct. |
|---|---|---|---|---|
| 1TWP | 19 | 1.5 | 71 | 6.1 |
| 2TWP | 674 | 51.5 | 760 | 64.9 |
| BORO | 603 | 46.0 | 302 | 25.8 |
| CITY | 14 | 1.1 | 38 | 3.2 |

Monetary variables in thousands of dollars
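
The difference-in-means columns can be approximately reproduced from the reported means, standard deviations, and group sizes. The sketch below assumes an unequal-variance (Welch-style) standard error; the method actually used to build the tables is not stated, so this is a consistency check rather than a description of it. The inputs are the Population row of Table A1.

```python
import math

def diff_in_means(mean_c, sd_c, n_c, mean_t, sd_t, n_t):
    """Treatment-minus-control difference and its unequal-variance standard error."""
    diff = mean_t - mean_c
    se = math.sqrt(sd_c ** 2 / n_c + sd_t ** 2 / n_t)
    return diff, se

# Population (thousands), Table A1: control 2.4 (4.2), N=1310; treatment 7.9 (46.7), N=1173.
diff, se = diff_in_means(2.4, 4.2, 1310, 7.9, 46.7, 1173)
print(round(diff, 1), round(se, 1))  # 5.5 1.4
```

The large treatment-group standard deviations in Table A1 inflate the standard errors, which helps explain why more of the differences are statistically significant once Philadelphia and Pittsburgh are removed in Table A2.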

Research Design

Given the systematic differences between exposed and non-exposed municipalities, we are testing two different research designs:
* An incident-specific dataset comparing treated units to later-treated units
* Propensity score matching of treatment and control groups, controlling for observable characteristics

(working on this stage)

Model Structure

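Classic Diff-in-Diff estimation:

\[ Y_{it} = \beta \, Flood_{i} \times Post_{it} + \delta_{i} + \delta_{t} + \epsilon_{it} \]

where \(i\) indexes municipalities and \(t\) indexes years, \(\delta_{i}\) and \(\delta_{t}\) are municipality and year fixed effects, and \(Post_{it}\) equals one in the years after municipality \(i\)'s flood event. \(\beta\) represents the average effect across the entire observational buffer period.

Diff-in-Diff with lag years:

\[ Y_{it} = \sum_{T} \beta_{T} \, Flood_{i} \times EventTime_{it}^{T} + Flood_{i} + \delta_{i} + \delta_{t} + \epsilon_{it} \]

where \(EventTime_{it}^{T}\) is an indicator for year \(t\) being \(T\) years from municipality \(i\)'s flood event, so each \(\beta_{T}\) captures the effect \(T\) years after the flood. Because \(Flood_{i}\) does not vary over time, its standalone term is absorbed by \(\delta_{i}\) when municipality fixed effects are included.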
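
As a rough illustration of what \(\beta\) measures, the sketch below computes a simple two-group, two-period difference-in-differences from cell means, plus the event-time index used in the lag specification. It assumes a single common treatment date; in that balanced case the cell-mean calculation coincides with the fixed-effects estimate. The function names and the toy numbers are hypothetical and are not results from the data.

```python
from statistics import mean

def did_estimate(rows):
    """rows: list of (flood, post, y). Returns the 2x2 difference-in-differences."""
    def cell(flood, post):
        return mean(y for f, p, y in rows if f == flood and p == post)
    return (cell(1, 1) - cell(1, 0)) - (cell(0, 1) - cell(0, 0))

def event_time(year, flood_year):
    """Years relative to the flood; None for never-flooded municipalities."""
    return None if flood_year is None else year - flood_year

# Toy outcomes: flooded municipalities fall by 1 while unflooded ones rise by 1.
rows = [
    (1, 0, 10.0), (1, 0, 12.0), (1, 1, 9.0), (1, 1, 11.0),
    (0, 0, 5.0), (0, 0, 7.0), (0, 1, 6.0), (0, 1, 8.0),
]
beta_hat = did_estimate(rows)
print(beta_hat)                  # -2.0
print(event_time(2012, 2011))    # 1
```

In the lag specification, one indicator per value of `event_time` would be interacted with the flood dummy, with one pre-flood period omitted as the reference category.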