Data Quality: Gap in 2016 data

First, note that the 2016 data contain a gap for both Harris and Dallas counties.

What the data shows

Outage event counts by year. National totals are consistent (~116k–189k records) — the gap is Texas/county-specific.
Year   Dallas   Harris   Note
2014      173      271
2015      725      373
2016        0       10   ⚠ Sparse / missing
2017      117      168
2018     1073     1412
2019      937      992
2020      952     1404
2021     1011     1008
2022     1033     1066
2023      996     1090

Why 2016 is missing

The 2016 gap is likely a reporting / utility coverage gap in the Eagle-I data collection system:

  • National record count is normal: 2015 = 115,962 records · 2016 = 116,438 · 2017 = 118,929. The dataset as a whole is not truncated.
  • Texas coverage collapsed: Texas went from 214 counties reporting in 2015 to 97 counties in 2016 to 190 counties back in 2017. 131 counties that reported in both adjacent years are simply absent in 2016.
  • Harris County has only 10 records in 2016 (vs 373 in 2015 and 168 in 2017); Dallas County has zero records in 2016 entirely.
  • Other large states (Florida, Virginia, Pennsylvania) have thousands of 2016 records, so it is not a national collection outage.

The most likely cause is that one or more Texas utilities (Oncor for Dallas, CenterPoint for Houston) stopped reporting to the DOE feed in 2016. Coverage resumed in 2017.
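
The coverage check described above can be sketched in a few lines. This is a minimal illustration, not the analysis code behind the report: the record layout `(year, state, county)` and the sample rows are hypothetical, and real Eagle-I files carry more columns.

```python
from collections import defaultdict

# Hypothetical records: (year, state, county). Illustrative rows only.
records = [
    (2015, "Texas", "Dallas"), (2015, "Texas", "Harris"), (2015, "Texas", "Travis"),
    (2016, "Texas", "Harris"),
    (2017, "Texas", "Dallas"), (2017, "Texas", "Harris"), (2017, "Texas", "Travis"),
]

def counties_by_year(rows, state):
    """Map each year to the set of counties in `state` with at least one record."""
    seen = defaultdict(set)
    for year, st, county in rows:
        if st == state:
            seen[year].add(county)
    return seen

def missing_in_gap_year(rows, state, year):
    """Counties present in both adjacent years but absent in `year`."""
    seen = counties_by_year(rows, state)
    both_neighbours = seen[year - 1] & seen[year + 1]
    return sorted(both_neighbours - seen[year])

coverage = counties_by_year(records, "Texas")
print({y: len(c) for y, c in sorted(coverage.items())})  # counties reporting per year
print(missing_in_gap_year(records, "Texas", 2016))       # ['Dallas', 'Travis']
```

Counting distinct reporting counties, rather than raw records, is what separates a coverage gap from a year that simply had few outages.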


Outage Frequency

Month-to-month trend

Seasonal pattern (monthly heatmap)

Key finding: Dallas peaks in May–July, likely from summer heat-driven grid stress. Harris County (Houston) shows a different pattern: January, November, and December are high, reflecting Gulf Coast winter storm exposure alongside a summer peak.
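
A seasonal table of this kind can be built by pooling events across years and counting them per county and calendar month. The sketch below assumes a hypothetical list of `(county, start_timestamp)` pairs; the timestamps are made up for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical outage start times per county (illustrative values only).
events = [
    ("Dallas", "2019-06-09 14:00"), ("Dallas", "2019-06-23 02:15"),
    ("Dallas", "2020-07-01 18:30"), ("Harris", "2021-01-12 07:45"),
    ("Harris", "2022-12-23 21:00"), ("Harris", "2020-07-25 11:00"),
]

def monthly_counts(rows):
    """Count events per (county, calendar month), pooling all years."""
    counts = Counter()
    for county, start in rows:
        month = datetime.strptime(start, "%Y-%m-%d %H:%M").month
        counts[(county, month)] += 1
    return counts

counts = monthly_counts(events)
for county in ("Dallas", "Harris"):
    # One row per county, twelve columns (January to December).
    print(county, [counts[(county, m)] for m in range(1, 13)])
```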


Outage Duration

Summary statistics

Duration statistics by county. Harris mean is pulled up by catastrophic events.
County   Total events   Min (hrs)   Median (hrs)   Mean (hrs)   90th pct (hrs)   99th pct (hrs)   Max (hrs)   Extreme (≥24h)
Dallas           7017        0.25           1.25         2.61             6.25            23.50      124.50               67
Harris           7794        0.25           1.25         3.51             8.00            29.77      636.75              119
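
The summary columns above can be reproduced with standard-library tools. The sketch below uses hypothetical durations and assumes linear interpolation between order statistics for percentiles; the report does not state which percentile method was used.

```python
import statistics

def percentile(values, q):
    """Percentile (q in 0..100) with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

def summarize(durations_hrs, extreme_threshold=24):
    """Summary statistics for a list of outage durations in hours."""
    return {
        "n": len(durations_hrs),
        "min": min(durations_hrs),
        "median": statistics.median(durations_hrs),
        "mean": statistics.fmean(durations_hrs),
        "p90": percentile(durations_hrs, 90),
        "max": max(durations_hrs),
        "extreme": sum(d >= extreme_threshold for d in durations_hrs),
    }

# Hypothetical durations (hours): mostly short events plus two very long ones.
durations = [0.25, 0.5, 1.0, 1.25, 1.25, 2.0, 3.0, 6.0, 30.0, 100.0]
stats = summarize(durations)
print(stats)
```

In this toy sample the two long events lift the mean far above the median, the same effect that pulls the Harris mean upward.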

Distribution (boxplot)

Extreme outages (≥ 24 hours) by year

Key finding: Harris County (Houston) has 119 extreme outages versus 67 in Dallas, about 1.8 times as many. Harris’s 2019 and 2023 peaks in the share of outages lasting over 24 hours likely reflect Tropical Storm Imelda (2019) and later hurricane activity. The Harris County maximum of 636.75 hours (~26.5 days) corresponds to Hurricane Harvey (2017).


Customers Affected


Income & Outage Burden

Data and methodology

This section examines the relationship between income and outage burden at two geographic levels.

County level. Income indicators for Harris and Dallas counties are pulled directly from the ACS 2022 5-year estimates. Outage burden is taken directly from the Eagle-I data.

ZCTA level. Because Eagle-I data is county-level only, sub-county equity analysis requires linking ZIP Code Tabulation Areas (ZCTAs) to county outage totals. Each ZCTA is assigned to the county that contains at least 50% of its land area; ACS 2022 5-year median household income and population are retrieved by ZCTA. I allocated county outage totals to ZCTAs in proportion to population.
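
The assignment and allocation steps can be sketched as follows. The function names, ZCTA labels, land-area shares, and populations are all hypothetical and serve only to show the mechanics.

```python
def assign_zctas(land_share_in_county, threshold=0.5):
    """Return ZCTAs whose land-area share inside the county is >= threshold."""
    return [z for z, share in land_share_in_county.items() if share >= threshold]

def allocate_by_population(county_total, population):
    """Split a county total across ZCTAs in proportion to population."""
    total_pop = sum(population.values())
    return {z: county_total * p / total_pop for z, p in population.items()}

# Hypothetical inputs (not taken from the report's data).
land_share = {"Z1": 1.00, "Z2": 0.80, "Z3": 0.55, "Z4": 0.30}
population = {"Z1": 20_000, "Z2": 50_000, "Z3": 30_000, "Z4": 10_000}

kept = assign_zctas(land_share)  # Z4 is dropped: only 30% of its land is in the county
allocated = allocate_by_population(1_000, {z: population[z] for z in kept})
print(allocated)  # {'Z1': 200.0, 'Z2': 500.0, 'Z3': 300.0}
```

Each ZCTA receives the county total scaled by its population share, so the allocated values carry no information beyond population itself.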

Income tier classification. Tier boundaries are derived from the population-weighted cumulative ZCTA income distribution, pooled across both Harris and Dallas counties. T1 and T2 are the income values below which approximately one-third and two-thirds of the combined county population reside.
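
One way to derive such thresholds is a population-weighted quantile over ZCTA median incomes. The sketch below is an assumption about the mechanics, not the report's exact procedure: the incomes and populations are hypothetical, and the handling of values exactly on a boundary is a choice made here.

```python
def weighted_quantile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q (0 < q <= 1)."""
    pairs = sorted(zip(values, weights))
    total = sum(weights)
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running / total >= q:
            return value
    return pairs[-1][0]

def income_tier(income, t1, t2):
    """Classify an income; boundary values fall in the lower tier (an assumption)."""
    if income <= t1:
        return "low"
    return "middle" if income <= t2 else "high"

# Hypothetical pooled ZCTA data: median household income and population.
incomes = [40_000, 50_000, 60_000, 75_000, 90_000, 150_000]
pops = [30_000, 30_000, 20_000, 30_000, 20_000, 20_000]

t1 = weighted_quantile(incomes, pops, 1 / 3)
t2 = weighted_quantile(incomes, pops, 2 / 3)
print(t1, t2)  # 50000 75000
```

Weighting by population means the thresholds split residents, not ZCTAs, into thirds.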

County-level comparison

The table below places income indicators and outage burden side by side for both counties. Income data are directly observed at the county level (ACS 2022); outage figures are from Eagle-I 2014–2023.

County-level income and outage burden indicators. Income: ACS 2022 5-Year Estimates. Outages: Eagle-I 2014-2023.
Metric                                   Dallas     Harris
Income indicators (ACS 2022)
  Median household income               $70,732    $70,789
  Mean household income                $104,220   $104,780
  Poverty rate                            14.0%      15.8%
  Gini coefficient                        0.492      0.497
Outage burden (Eagle-I 2014-2023)
  Total outage events (2014-2023)         7,017      7,794
  Outage events per 1,000 residents        2.69       1.65
  Total outage duration (hours)          18,308     27,392
  Mean duration per outage (hours)         2.61       3.51
  Extreme outages (>= 24 hours)              67        119
  Share of outages >= 24 hours             1.0%       1.5%

Key finding: Harris and Dallas counties are closely matched on income, and both Gini coefficients exceed the US average of 0.486, indicating above-average within-county income inequality. Harris County has the greater outage burden, with more total events (7,794 vs 7,017) and a longer mean duration per event (3.51 vs 2.61 hours), but Dallas has a higher rate of outages per 1,000 residents (2.69 vs 1.65) because of its smaller population. Harris’s substantially higher total duration and extreme outage count reflect exposure to catastrophic weather events, Hurricane Harvey (2017) and Winter Storm Uri (2021) in particular.
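
The derived rows of the table can be cross-checked from its raw rows. The sketch below copies the published figures and recomputes mean duration and the share of extreme outages. It also backs out the population implied by the per-1,000 rate, which is an inference from the table and not a census figure.

```python
# Figures copied from the county-level table.
table = {
    "Dallas": {"events": 7017, "total_hours": 18308, "extreme": 67, "per_1000": 2.69},
    "Harris": {"events": 7794, "total_hours": 27392, "extreme": 119, "per_1000": 1.65},
}

def derived(row):
    """Recompute derived indicators from the raw counts in one table row."""
    mean_hours = row["total_hours"] / row["events"]
    extreme_share_pct = 100 * row["extreme"] / row["events"]
    # Population implied by the published rate; an inference, not census data.
    implied_population = 1000 * row["events"] / row["per_1000"]
    return mean_hours, extreme_share_pct, implied_population

for county, row in table.items():
    mean_hours, share, pop = derived(row)
    print(f"{county}: mean {mean_hours:.2f} h, extreme share {share:.1f}%, "
          f"implied population ~{pop:,.0f}")
```

The recomputed means (2.61 and 3.51 hours) and extreme shares (1.0% and 1.5%) match the table, and the implied populations of roughly 2.6 million and 4.7 million explain why Dallas has fewer events but a higher per-resident rate.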

ZCTA level comparison

Within-county income distribution (ZCTA level)

Population-weighted distribution of ZCTA median household incomes in Harris and Dallas counties. Histogram bins weighted by ZCTA total population (ACS 2022 5-year, Table B19013). Dashed lines indicate empirical tertile breaks derived from the pooled population-weighted income distribution across both counties.

Key finding: Both counties show right-skewed ZCTA income distributions, with the bulk of residents in ZCTAs below $80,000 and long upper tails reflecting high-income enclaves — River Oaks (77024, ~$200k) in Harris and Highland Park (75225, ~$195k) in Dallas. Harris has a heavier left tail, consistent with its slightly higher poverty rate at the county level. The pooled tertile thresholds (T1 = $55,000, T2 = $80,000) apply reasonably to both counties given their similar distributional shapes.

Population share vs. outage burden share

Share of county population and allocated outage burden by income tier. If outage burden were distributed in exact proportion to population, all three bars within each income tier group would be equal in height. See Section 6.5 for interpretation of the near-equal pattern.

Important note: The near-identical bar heights in the figure above should not be read as evidence that outage burden is equitably distributed. The pattern is a direct consequence of the allocation method: distributing county outage totals to ZCTAs proportionally by population mathematically constrains each tier’s outage share to approximate its population share. The three bars within each group will be nearly equal by construction, regardless of the true spatial distribution of outages.
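
The by-construction effect is easy to demonstrate with a toy example. In the sketch below, the "true" outages are deliberately concentrated in low-income ZCTAs, yet the population-proportional allocation still reproduces the population shares. All names and numbers are hypothetical, and in this simplified setting the two shares coincide up to floating-point rounding.

```python
def tier_shares(tiers, weights):
    """Share of the total of `weights` that falls in each income tier."""
    total = sum(weights.values())
    shares = {}
    for z, tier in tiers.items():
        shares[tier] = shares.get(tier, 0.0) + weights[z] / total
    return shares

# Hypothetical ZCTAs: tiers cycle low/middle/high, populations grow with the index.
names = [f"Z{i}" for i in range(12)]
tiers = {z: ["low", "middle", "high"][i % 3] for i, z in enumerate(names)}
population = {z: 10_000 + 1_000 * i for i, z in enumerate(names)}

# "True" outages (unobserved in practice): five times the per-capita rate in low-income ZCTAs.
true_outages = {z: (5.0 if tiers[z] == "low" else 1.0) * population[z] for z in names}

# Only the county total is observed, so it is allocated by population.
county_total = sum(true_outages.values())
total_pop = sum(population.values())
allocated = {z: county_total * population[z] / total_pop for z in names}

pop_share = tier_shares(tiers, population)
alloc_share = tier_shares(tiers, allocated)
true_share = tier_shares(tiers, true_outages)
for tier in ("low", "middle", "high"):
    print(tier, round(pop_share[tier], 3), round(alloc_share[tier], 3), round(true_share[tier], 3))
```

The allocated share tracks the population share in every tier even though the true share for the low-income tier is much larger, which is why the figure cannot reveal inequity.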

What the ZCTA analysis does contribute is the baseline population distribution across income tiers. Low-income ZCTAs (below $55,000) account for approximately 37% of Dallas County population and 39% of Harris County population. These figures serve as the proportionality benchmark against which genuine sub-county outage data, if obtained, could be tested.

The county-level comparison in Section 6.2 is the more interpretable result given the data available. Harris County’s higher mean outage duration, greater number of extreme events, and modestly higher poverty rate are consistent with a pattern of greater outage burden in the higher-poverty county, though the income gap between the two counties is small enough that caution is warranted in drawing strong conclusions.