NFIP Data

Most insurance market data in the US is private, with one major exception: the National Flood Insurance Program (NFIP). The NFIP covers the vast majority of flood insurance policies in the US; while there is a nascent private market, FEMA’s NFIP still administers over 95% of flood insurance policies.1 To examine reported differences in claims processing across place and socioeconomic demographics, we use publicly available data on NFIP claims at the census tract level. Some fields core to our analysis (such as the dates that claims are opened and closed) are not publicly available; we obtained those data through a FOIA request to FEMA (received February 2024).


Data Cleaning

Filtering

Our core dataset includes all NFIP claims between 1970(?)-2022, but we filter to a subset of claims that reflects the focus of our analysis. Table 1 shows the data filtering steps, and the effect on our sample.

Our analysis relies on comparing insurance claims with census and ACS data, so we limit claims to geographies surveyed under the Census (within the contiguous US states; excluding Puerto Rico and other territories), and to those with valid Census Tract identifiers. We also limit the sample to claims after 1990, because the Census prior to 1990 did not have complete coverage of census tracts across the US and did not gather comparable socioeconomic data for several variables. We focus on claims from residential insurance policies (filtering out insurance held by commercial or industrial entities), and on claims where the building is someone’s primary residence (filtering out second-home claims).

Table 1: NFIP Claims Data Cleaning
Data Filtering Step N remaining % remaining
All Claims 2582604 100%
US States 2537158 98%
Valid Census Tracts 2418644 94%
Between 1990-2022 2007210 78%
Residential 1854518 72%
Primary Residences 1215059 47%
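
The filtering sequence in Table 1 can be sketched as an ordered list of predicates, recording how many claims survive each step. This is a minimal illustration only: the field names (`state`, `tract_geoid`, `year_of_loss`, `is_residential`, `is_primary_residence`) and the set of excluded jurisdiction codes are hypothetical and are not the column names of the FEMA data. The validity check assumes an 11-digit tract GEOID.

```python
# Hypothetical jurisdiction codes to exclude (territories); adjust to the data.
NON_STATE_CODES = {"PR", "VI", "GU", "AS", "MP"}


def is_valid_tract(geoid):
    """Assumed validity rule: an 11-digit tract GEOID (state + county + tract)."""
    return isinstance(geoid, str) and len(geoid) == 11 and geoid.isdigit()


def filter_claims(claims):
    """Apply the Table 1 filters in order; return (kept_claims, step_counts)."""
    steps = [
        ("US States", lambda c: c["state"] not in NON_STATE_CODES),
        ("Valid Census Tracts", lambda c: is_valid_tract(c["tract_geoid"])),
        ("Between 1990-2022", lambda c: 1990 <= c["year_of_loss"] <= 2022),
        ("Residential", lambda c: c["is_residential"]),
        ("Primary Residences", lambda c: c["is_primary_residence"]),
    ]
    kept = list(claims)
    counts = [("All Claims", len(kept))]
    for name, keep in steps:
        # Each step filters the output of the previous step, as in Table 1.
        kept = [c for c in kept if keep(c)]
        counts.append((name, len(kept)))
    return kept, counts
```

Because the steps are cumulative, the order matters for the intermediate counts but not for the final sample.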
Calculations

To compare monetized damage and claims payments regardless of building value, we calculate ratios over property value and assessed damage. To account for the fact that different policyholders set different deductible levels (resulting in different claims payments), we calculate a “total payment” variable that sums the claim paid and the deductible. Unless otherwise noted, the payment ratios are based on this “total payment” composite value of claim + deductible. All monetary variables were converted to 2022 dollars using the CPI.2

  • Monetary ratios:
    • damage over value: Assessed Damage / Property Value
    • payment over value: (Claim Payment + Deductible) / Property Value
    • payment over damage: (Claim Payment + Deductible) / Assessed Damage
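
As a rough sketch of the ratio definitions above, the functions below deflate an amount to 2022 dollars and compute the three ratios from the “total payment” (claim payment plus deductible). The function and argument names are hypothetical, and the handling of missing or non-positive denominators (returning `None`) is an assumption rather than the documented treatment.

```python
def to_2022_dollars(amount, cpi_loss_year, cpi_2022):
    """Rescale a nominal amount by the ratio of the 2022 CPI to the loss-year CPI."""
    return amount * cpi_2022 / cpi_loss_year


def _ratio(numerator, denominator):
    # Assumption: a ratio is undefined when either input is missing
    # or the denominator is not positive.
    if numerator is None or denominator is None or denominator <= 0:
        return None
    return numerator / denominator


def payment_ratios(claim_payment, deductible, assessed_damage, property_value):
    """Return the three monetary ratios; all inputs should be in the same dollars."""
    if claim_payment is None or deductible is None:
        total_payment = None
    else:
        total_payment = claim_payment + deductible
    return {
        "damage_over_value": _ratio(assessed_damage, property_value),
        "payment_over_value": _ratio(total_payment, property_value),
        "payment_over_damage": _ratio(total_payment, assessed_damage),
    }
```

For example, a $45,000 payment with a $5,000 deductible on $50,000 of assessed damage to a $200,000 building gives a payment-over-damage ratio of 1.0 and a payment-over-value ratio of 0.25.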

To compare the timing of claims processing, we also calculate several time variables: the number of days between the date of loss and the date the claim was submitted; the days between the date the claim was submitted and the date the claim was closed; and the total days between the loss and the claim closing (the sum of the prior two). When applicable, we also calculate the time between when a claim is opened, reopened, and closed.

  • Calculated time frames:
    • loss to claim: time between date of loss and date of claim submitted
    • claim to close: time between date of claim submitted and date of claim closed
    • loss to close: total time between date of loss and date of claim closed
    • claim to reopen: time between claim originally opened, and then reopened (presumably after a denial)
    • reopen to close: time between a claim reopened, and then finally closed
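
The time variables listed above reduce to date differences in days. The sketch below assumes each claim carries `datetime.date` values for the loss, submission, reopening, and closing dates; it also assumes the “originally opened” date is the submission date, which the source does not state explicitly. Names are hypothetical.

```python
from datetime import date


def days_between(start, end):
    """Whole days from start to end, or None if either date is missing."""
    if start is None or end is None:
        return None
    return (end - start).days


def claim_time_variables(date_of_loss, date_submitted, date_closed, date_reopened=None):
    """Compute the five time frames; reopen variables are None if never reopened."""
    return {
        "loss_to_claim": days_between(date_of_loss, date_submitted),
        "claim_to_close": days_between(date_submitted, date_closed),
        "loss_to_close": days_between(date_of_loss, date_closed),
        # Assumption: "originally opened" is treated as the submission date.
        "claim_to_reopen": days_between(date_submitted, date_reopened),
        "reopen_to_close": days_between(date_reopened, date_closed),
    }
```

When all three core dates are present, `loss_to_close` equals `loss_to_claim + claim_to_close` by construction.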
Data Record Completeness

Most fields have fairly high record completeness (Table 2). Not all fields are relevant for all claims (e.g. Flood Event names are relevant only for events that are a named catastrophe). There are missing data in the amount paid, building value, and flood intensity records that will further reduce the sample size for analysis.

Additionally, this table does not capture records that are complete, but erroneous and excluded from analysis, such as extreme time outliers or records of payment that exceed the value of the home significantly. These will be filtered out in subsequent steps.

Table 2: NFIP Data Record Completeness
fields record_count record_percent
AmountPaidonBuildingClaim 933708 76.4%
YearofLoss 1222415 100.0%
TotalBuildingInsuranceCoverage 1210166 99.0%
BuildingDamageAmount 981238 80.3%
BuildingPropertyValue 980259 80.2%
NetBuildingPaymentAmount 932821 76.3%
OccupancyType 1222415 100.0%
BuildingReplacementCost 918037 75.1%
PolicyCount 1222415 100.0%
WaterDepth 1109942 90.8%
FloodWaterDuration 1116632 91.3%
NumberofUnits 1220545 99.8%
AmountPaid_over_value_total 984056 80.5%
AmountPaid_over_damage_total 1002375 82.0%
Damage_over_value_total 984056 80.5%
loss_to_claim 1222415 100.0%
claim_to_close 1220639 99.9%
loss_to_close 1220639 99.9%
claim_to_reopen 170274 13.9%
reopen_to_close 170125 13.9%
ClaimStatus 1222415 100.0%
FloodInsuranceClaimsOffice_FICO_Number 859190 70.3%
FloodCharacteristicsIndicator 16662 1.4%
BuildingDeductibleCode 1211890 99.1%
BuildingNonPaymentReason 246368 20.2%
CauseofDamageCode 1202869 98.4%
PrimaryResidenceIndicator 1222415 100.0%
RentalPropertyIndicator 1222415 100.0%
FloodEvent 930723 76.1%
ElevationCertificateIndicator 302422 24.7%
RatedFloodZone 1216463 99.5%
ElevatedBuildingIndicator 1222415 100.0%
FloodproofedIndicator 1222415 100.0%


Claims Processing Time

How long do consumers take to file a claim after a loss? What is the average claim processing time?

The distribution of processing time is long-tailed: a relatively small number of records appear to take a very long time to process (Figure 2). Some of these very long times are almost certainly due to record errors, as they indicate a processing time of more than 10 years. To correct for those likely erroneous outliers, we filter out the top 5% of claims processing times for each time stage.
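
One simple way to express the top-5% trim is to sort the non-missing values for a given time variable and drop the largest 5% by count. This is a sketch under that assumption; the source does not specify how ties at the cutoff or missing values are handled, and the function name is hypothetical.

```python
def trim_top_share(values, share=0.05):
    """Return the non-missing values, sorted, with the largest `share` removed."""
    present = sorted(v for v in values if v is not None)
    # Number of records to drop, rounded down.
    n_drop = int(len(present) * share)
    return present[: len(present) - n_drop]
```

Applied separately to each time stage (for example, once to the claim-to-close values and once to the loss-to-claim values), this keeps the lower 95% of each distribution.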

Policyholders typically submit claims within a week of the loss date (median 5 days). The median claim processing time (the period between the claim being made and the claim closing) is 51 days, or approximately 7 weeks. The maximum recorded claims processing time is still well over a year (425 days); while this is a numerical outlier, a processing time of that length cannot be assumed to be a data error. NFIP claims that are “closed without payment” have a significantly shorter claim-to-close time. This supports the intuition that decisions to decline payment happen quickly, while decisions for payment take longer to assess and resolve.

Table 3: Average Claims Processing Time
time_variable Minimum Median Mean Maximum SD
claim_to_close 0 51 69 425 69
loss_to_claim 0 5 11 70 15
loss_to_close 0 65 87 466 75
Table 4: Average Claims Processing Time, CWOP
time_variable Minimum Median Mean Maximum SD
claim_to_close 0 26 46 425 61
loss_to_claim 0 7 15 70 16
loss_to_close 0 47 72 466 74

Approximately 70% of NFIP claims are attributed to FEMA-designated “catastrophes”: named storms that usually result in the temporary establishment of local NFIP processing or support offices. Catastrophe-related claims have longer claims processing times than non-catastrophe claims (see Table 5).

Table 5: Average Claims Processing Time, by Cause of Loss
time_variable Minimum Median Mean Maximum SD
Non-Catastrophe
claim_to_close 0 38 57 425 63
loss_to_claim 0 3 9 70 14
loss_to_close 0 51 75 466 75
Catastrophe
claim_to_close 0 58 74 425 70
loss_to_claim 0 5 12 70 15
loss_to_close 0 72 93 466 74
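
Summary tables like Tables 3 to 5 can be produced by grouping records and computing the same five statistics per group. The sketch below uses Python's `statistics` module; the record layout and key names are hypothetical, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, median, stdev


def summarize(values):
    """Min, median, mean, max, and sample SD of the non-missing values."""
    vals = [v for v in values if v is not None]
    if not vals:
        return None
    return {
        "min": min(vals),
        "median": median(vals),
        "mean": mean(vals),
        "max": max(vals),
        # stdev needs at least two observations.
        "sd": stdev(vals) if len(vals) > 1 else None,
    }


def summarize_by_group(records, group_key, value_key):
    """Group dict records by `group_key` and summarize `value_key` in each group."""
    groups = {}
    for record in records:
        groups.setdefault(record[group_key], []).append(record[value_key])
    return {group: summarize(vals) for group, vals in groups.items()}
```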

How have claims submission and processing times varied over time?

NFIP claim processing over time suggests a few trends (Figure 3). The time between a loss and the policyholder filing a claim has decreased over time. Claim processing time increases in years with major natural disasters, which likely increase the volume of claims. Otherwise, the time for a claim to be processed seems relatively stable, except in big loss years (like the year of Hurricane Katrina). This suggests that comparing claim-to-close times across years is reasonable even without time fixed effects, provided the analysis controls for catastrophe or volume of claims.


Claims Payment Amounts

On average, how much of assessed damage is paid in claims? How has the proportion of property value damaged, and the amount paid on that damage, varied over time?

There is not full data coverage for claim payment, damage estimate, or value estimate. There are also some records with illogical payment values: claim payments that are higher than the assessed damage or the property value, or negative monetary values (resulting in ratios below 0 or above 1). For analysis based on claims payment ratios, we remove those illogical payment records. The result is that about 54% of claim records have valid claim payment ratio estimates.
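
The validity rule described above amounts to keeping only ratios that are present and fall within [0, 1]. A minimal sketch, assuming the ratios have already been computed and that missing values are represented as `None` (function names are hypothetical):

```python
def is_valid_ratio(ratio):
    """A ratio is usable only if it is present and lies between 0 and 1 inclusive."""
    return ratio is not None and 0 <= ratio <= 1


def share_valid(ratios):
    """Proportion of records with a valid ratio, out of all records supplied."""
    ratios = list(ratios)
    if not ratios:
        return None
    return sum(is_valid_ratio(r) for r in ratios) / len(ratios)
```

The bounds are inclusive here because the ratio tables report minimums of 0 and maximums of 1.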

The share of assessed damage covered by claim payment (accounting for the deductible) is high: the median is over 95% of assessed damage. The payment coverage is slightly lower for contents damage than for building damage, though the sample of such claims is much smaller, as assessed building contents damage is reported much less often. The median flood insurance claim payment is about 14% of total home value.

Table 7: Average Claims Payment Ratios
payment_ratio median mean min max sd
Building
payment/damage 0.962 0.903 0 1 0.152
payment/value 0.139 0.261 0 1 0.273
damage/value 0.145 0.278 0 1 0.297
Contents
payment/damage 0.926 0.848 0 1 0.193
payment/value 0.254 0.351 0 1 0.304
damage/value 0.378 0.461 0 1 0.371
Total
payment/damage 0.958 0.890 0 1 0.168
payment/value 0.141 0.259 0 1 0.265
damage/value 0.134 0.252 0 1 0.265

NFIP claim payments and assessed damage as a proportion of property value have stayed generally stable over time, with the exception of major natural disasters: Katrina and Sandy appear to have caused damage that was on average a larger proportion of property value, and claim payments seem to have matched that. The proportion of the assessed damage paid in claim payments (represented by the red line) appears to have increased slightly over time.


Claims Closed without Payment

How many claims were closed with no payment over time? What are common causes of nonpayment?

We define a claim as closed without payment if it is labeled “closed without payment,” or if it is labeled “closed” but has a payment of $0 or NA. A significant proportion (20-25%) of claims submitted are closed without payment. The likelihood of a claim being closed without payment does not appear to be associated with major storms / catastrophes.
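
The closed-without-payment rule can be written as a small classifier. The status strings below are taken from the labels quoted in the text, but their exact spelling and casing in the data are assumptions, as is representing NA as `None`.

```python
def is_closed_without_payment(status, amount_paid):
    """True if the claim is labeled CWOP, or is closed with a $0 or missing payment."""
    label = (status or "").strip().lower()
    if label == "closed without payment":
        return True
    if label == "closed":
        # NA (None) and zero payments both count as no payment.
        return amount_paid is None or amount_paid == 0
    return False
```

Open claims are never counted as closed without payment under this rule, regardless of the payment amount.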

The reasons for claims being closed without payment vary, and the categories are broad and vague. Of claims denied payment, 23% have “Other” or no reason listed, leaving much ambiguity as to why those claims were denied. Another 23% of claims are denied due to “not actual flood”, and 13% due to “no demonstrable damage”; both categories could reflect damage attributed to the prior state of the building, or damage not due to the storm event. Note: counts and distributions are almost identical for building and contents claims denied.

Table 8: Top 10 Reasons for Building Claims without Payments
Reason_cwop_bldg n_claims percent
Not actual flood 59228 23
Claim denied that was less than deductible 42969 17
Other 37035 14
No demonstrable damage 32621 13
Erroneous assignment 27020 10
NA 22892 9
Error-delete claim (no assignment) 12903 5
Failure to pursue claim 7703 3
Not insured, wind damage 6930 3
Seepage 2082 1

Census Data

For historical census data, we use the Longitudinal Tract Database (LTDB) and American Community Survey (ACS) estimates.

FEMA’s NFIP data associates claims with census geographies (census tracts and block groups). However, the GEOIDs associated with all claims prior to 2020 are from 2010 census geographies, including pre-2010 claims. Census geography boundaries change every 10 years, with the decennial census. Census survey questions (and the data they produce) also change over time, and differ between census products (decennial censuses and ACS surveys). Therefore, associating NFIP claims with census data in a way that is comparable across all years (1990 - 2022) requires historical census estimates interpolated to 2010 tract geographies, and crosswalks between census tables/variables over time. The LTDB provides that, based on areal interpolation of historical census tracts.

There are a few other sources for historical geography crosswalks, notably NHGIS. NHGIS has developed similar geographic crosswalks, based on an arguably more accurate interpolation method (areal and population interpolation, with census block groups). However, their published standardized data is limited (only population and race variables), and they do not provide guidance on how to standardize variables across census products. In the future, it would be worth developing historical census estimates using NHGIS and comparing results, as a robustness check.


Data Cleaning

Interpolation

Prior to the start of the ACS, the only source for socioeconomic data estimates is the decennial census (published every 10 years). We linearly interpolate data between the 1990 and 2000 decennial census, and the 2000 and 2010 ACS, to develop estimates for intercensal years. Starting in 2008, we use ACS 5-year estimates for the mid-year value (e.g. for 2008, we use the 2006-2010 ACS 5-yr estimate), following the findings of Weden et al.3 For 2021 and 2022, we extrapolate from the most recent ACS sample that includes those years (the 2018 - 2022 ACS 5-yr estimate). Figure 6 shows an example of the resulting data series, for a single sample tract.
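
The intercensal step is ordinary linear interpolation between two anchor years. A minimal sketch for a single tract and variable follows; the function name is hypothetical and the anchor values in the usage note are made up for illustration.

```python
def interpolate_years(year_a, value_a, year_b, value_b):
    """Linearly interpolate a value for every year from year_a through year_b."""
    if year_b <= year_a:
        raise ValueError("year_b must be later than year_a")
    span = year_b - year_a
    return {
        year: value_a + (value_b - value_a) * (year - year_a) / span
        for year in range(year_a, year_b + 1)
    }
```

For example, with an illustrative tract value of 1,000 in 1990 and 2,000 in 2000, `interpolate_years(1990, 1000, 2000, 2000)[1995]` gives 1500.0, and the endpoints reproduce the anchor values.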

ACS Data Precision Suppression

While the ACS provides more frequent and in-depth socioeconomic data, it also has considerable sampling error, especially at smaller geographies. All estimates are published alongside margins of error, which allow us to assess the precision of ACS estimates. Generally, precision increases with sample size (population in the census tract, or estimate), meaning sub-population estimates are more likely to have lower precision, especially in lower-population tracts.4

There is no standard threshold for acceptable precision in estimates; generally, a small coefficient of variation (CV), the ratio of the standard error to the estimate itself, indicates higher reliability. Guidance has generally been that CVs above .30 are highly unreliable, and that value can be used as a suppression threshold; some sources cite lower thresholds between .10 and .15. For our purposes, however, suppressing annual estimates creates a challenge: there is known bias in which tracts would be removed from analysis if those data points are removed, and suppression cannot be applied consistently over time for the full history of claims because decennial census data (prior to 2008) does not include MOEs. In Figure 6, the CV Precision Threshold applied is .30; many, but not all, ACS data points fail to meet that threshold.
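
A CV can be derived from a published ACS estimate and its margin of error. ACS margins of error are published at the 90% confidence level, so the standard error is the MOE divided by 1.645. The sketch below applies the .30 threshold; treating an undefined CV (zero or missing estimate) as suppressed is an assumption, and the function names are hypothetical.

```python
Z_90 = 1.645  # z-score for the 90% confidence level used for ACS margins of error


def coefficient_of_variation(estimate, moe):
    """CV = standard error / estimate, with SE = MOE / 1.645; None if undefined."""
    if estimate is None or moe is None or estimate == 0:
        return None
    standard_error = moe / Z_90
    return standard_error / abs(estimate)


def is_suppressed(estimate, moe, threshold=0.30):
    """True if the CV exceeds the threshold (or cannot be computed)."""
    cv = coefficient_of_variation(estimate, moe)
    # Assumption: estimates with an undefined CV are treated as suppressed.
    return cv is None or cv > threshold
```

For instance, an estimate of 1,000 with an MOE of 329 has a standard error of about 200 and a CV of about 0.20, which passes the .30 threshold.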

Table 9 displays the proportion of all Census Tract estimates that would be suppressed under a CV threshold of .30. Estimates of population by race, and measures of poverty, largely do not meet the precision threshold.

Table 9: Proportion of ACS Estimates Suppressed Under CV Threshold (0.30)
name_standard 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020
hh_total 0.02 0.02 0.02 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.02 0.02 0.02
med_homevalue 0.02 0.02 0.02 0.02 0.02 0.01 0.01 0.01 0.01 0.01 0.03 0.03 0.03
med_inc 0.02 0.02 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.04 0.04 0.04
med_inc_black 0.31 0.32 0.32 0.27 0.27 0.15 0.15 0.15 0.15 0.15 0.15 0.15 0.14
med_inc_hisp 0.35 0.36 0.37 0.31 0.32 0.18 0.18 0.18 0.18 0.18 0.18 0.18 0.18
med_inc_percap 0 0 0 0 0 0 0 0 0 0 0.01 0.01 0.01
med_inc_white 0.09 0.09 0.09 0.08 0.08 0.05 0.05 0.05 0.05 0.05 0.09 0.09 0.09
med_rent 0.07 0.07 0.06 0.05 0.05 0.03 0.03 0.04 0.04 0.04 0.06 0.06 0.06
pct_edu_col 0.36 0.29 0.09 0.08 0.07 0.06 0.06 0.06 0.06 0.06 0.09 0.09 0.09
pct_edu_hs 0.25 0.18 0.03 0.03 0.02 0.02 0.02 0.02 0.03 0.03 0.06 0.06 0.06
pct_owners 0.03 0.03 0.03 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.04 0.04 0.04
pct_pop_asian 0.54 0.55 0.56 0.57 0.58 0.59 0.59 0.59 0.59 0.59 0.56 0.57 0.57
pct_pop_black 0.61 0.65 0.66 0.66 0.67 0.67 0.67 0.67 0.68 0.68 0.66 0.65 0.66
pct_pop_hisp 0.7 0.71 0.71 0.72 0.72 0.73 0.72 0.73 0.73 0.73 0.72 0.72 0.72
pct_pop_white 0.76 0.76 0.76 0.76 0.76 0.77 0.76 0.76 0.76 0.76 0.76 0.75 0.75
pct_pov 0.49 0.46 0.42 0.39 0.36 0.35 0.34 0.36 0.37 0.4 0.54 0.56 0.56
pct_unemp 0.99 0.98 0.6 0.55 0.58 0.65 0.72 0.8 0.86 0.9 0.94 0.95 0.95
pct_units_vac 0.62 0.6 0.59 0.58 0.57 0.56 0.55 0.56 0.56 0.58 0.66 0.67 0.7
pop_edu 0.02 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.02 0.01
pop_total 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01
units_total 0.02 0.02 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.02 0.02 0.02
Calculations

(To be filled in…)

Compare NFIP claims sample vs. whole

Reference for Census MOE calculations for computed variables


Joining Census data with NFIP claims

When joining NFIP claims with ACS / Census data, there are some records with GEOIDs that do not match the ACS data. For pre-2018 years, it is hard to know exactly why the FEMA / NFIP census tract IDs do not match; it could be record errors.

The years from 2018 - 2022 reflect a different problem. By using mid-year estimates for single years, we are applying the 2020 5-year estimates to 2018 (and so forth). In 2020, with the decennial census, the census tract IDs change. Our assumption was that starting in 2020, the NFIP claims were also geocoded to 2020 GEOIDs.

Table X: Summary of Matching Claims to Census Data
Year Matched Claims Unmatched Claims Total Claims Proportion Unmatched
1990 2315 2 2317 0
1991 5586 3 5589 0
1992 12977 6 12983 0
1993 7790 11 7801 0
1994 6806 3 6809 0
1995 27727 16 27743 0
1996 22740 25 22765 0
1997 13784 9 13793 0
1998 32477 30 32507 0
1999 39397 31 39428 0
2000 14972 5 14977 0
2001 41094 18 41112 0
2002 22548 25 22573 0
2003 30650 59 30709 0
2004 44990 55 45045 0
2005 205662 55 205717 0
2006 20165 21 20186 0
2007 19504 7 19511 0
2008 64735 19 64754 0
2009 26675 4 26679 0
2010 25704 17 25721 0
2011 62501 24 62525 0
2012 104795 30 104825 0
2013 14876 45 14921 0
2014 11857 2 11859 0
2015 22355 4 22359 0
2016 59357 22 59379 0
2017 104534 47 104581 0
2018 20314 7147 27461 0.26
2019 17508 7126 24634 0.29
2020 14116 7558 21674 0.35
2021 24887 9490 34377 0.28
2022 26886 10859 37745 0.29
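
A match summary like the table above can be produced by checking each claim's (year, GEOID) pair against the set of pairs available in the census series. The sketch below assumes claims and census records are reduced to such pairs; the function name and data layout are hypothetical.

```python
from collections import defaultdict


def match_summary(claims, census_keys):
    """Count matched and unmatched claims per year.

    claims: iterable of (year, geoid) pairs, one per claim.
    census_keys: set of (year, geoid) pairs present in the census data.
    """
    counts = defaultdict(lambda: {"matched": 0, "unmatched": 0})
    for year, geoid in claims:
        key = "matched" if (year, geoid) in census_keys else "unmatched"
        counts[year][key] += 1
    summary = {}
    for year in sorted(counts):
        matched = counts[year]["matched"]
        unmatched = counts[year]["unmatched"]
        total = matched + unmatched
        summary[year] = {
            "matched": matched,
            "unmatched": unmatched,
            "total": total,
            "prop_unmatched": round(unmatched / total, 2),
        }
    return summary
```

Checking the unmatched GEOIDs for 2018 onward against a 2020-vintage tract list would be one way to test whether the mismatch comes from the change in tract IDs.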

  1. Kousky et al. The Emerging Private Residential Flood Insurance Market in the United States. 2018. RFF.

  2. Consumer Price Index, All Urban Consumers. https://data.bls.gov/timeseries/CUUR0000SA0

  3. Weden, Margaret, Christine Peterson, Jeremy Miles, and Regina Shih. Evaluating Linearly Interpolated Intercensal Estimates of Demographic and Socioeconomic Characteristics of U.S. Counties and Census Tracts 2001–2009. 2016.

  4. https://www.census.gov/content/dam/Census/library/publications/2018/acs/acs_general_handbook_2018_ch07.pdf