Question. Did the gas company over-bill Young Israel for the meter readings recorded in March through August 2025?
Answer. Yes — and the evidence is strong.
Monthly therms (black = normal reads, red = flagged) against the HDD-model prediction (dashed line) and its 95% prediction interval (shaded band). April and May 2025 sit dramatically above the band; March 2025 sits below it. The June 2025 zero falls below the band and the July and August 2025 zeros lie inside its lower half, but all three are physically implausible (see §4).
Cumulative actual billing (solid black) versus the counterfactual model-predicted billing (dashed blue), both plotted against cumulative HDD. The model line is anchored to actual at the last pre-suspect month (Feb 2025), so the two coincide by construction up to that point and the gap that opens afterwards is exactly the cumulative over-billing. The bracket at August 2025 marks the end of the disputed window: ~3,800 therms of excess. Lines run roughly parallel after that, indicating the model resumes tracking actual billing once normal reads resume.
Recommendation. Pursue a refund or rebilling correction for the March–August 2025 window, anchored on the ~3,803-therm aggregate excess (or ~2,188 therms if the gas company demands a conservative 95% lower bound).
| month | therms | hdd_65f | flagged |
|---|---|---|---|
| May 2024 | 205 | 236.4 | |
| Jun 2024 | 75 | 24.7 | |
| Jul 2024 | 62 | 1.1 | |
| Aug 2024 | 43 | 12.8 | |
| Sep 2024 | 48 | 63.7 | |
| Oct 2024 | 123 | 287.3 | |
| Nov 2024 | 422 | 495.9 | |
| Dec 2024 | 1396 | 922.4 | |
| Jan 2025 | 2405 | 1121.2 | |
| Feb 2025 | 2805 | 952.3 | |
| Mar 2025 | 1350 | 707.6 | yes |
| Apr 2025 | 5476 | 448.9 | yes |
| May 2025 | 2998 | 236.2 | yes |
| Jun 2025 | 0 | 61.0 | yes |
| Jul 2025 | 0 | 2.2 | yes |
| Aug 2025 | 0 | 20.2 | yes |
| Sep 2025 | 123 | 55.8 | |
| Oct 2025 | 130 | 299.8 | |
| Nov 2025 | 654 | 617.3 | |
| Dec 2025 | 1311 | 1030.6 | |
Sources. Meter reads were provided by the customer. HDD data are from degreedays.net for KBOS (Boston Logan), base temperature 65°F.
Period covered. May 2024 through December 2025 (20 monthly observations). Six months are flagged on the source spreadsheet as candidate outliers: March, April, May, June, July, and August 2025.
We model monthly therms as the sum of a non-heating baseline (hot water + cooking) plus a linear function of heating degree days in the current and previous calendar month. The lag captures gas burned before the meter-read date but appearing on the following month’s bill:
\[ \text{therms}_t \;=\; x_{\text{baseline}} + \beta_1\,\text{HDD}_t + \beta_2\,\text{HDD}_{t-1} + \varepsilon_t \]
where \(x_{\text{baseline}} = 57\) therms/month is fixed at the customer’s empirical prior-summer mean (Jun–Sep 2024), and \(\beta_1, \beta_2\) are estimated from the data. Pinning the intercept to the empirical baseline does three useful things:
The model is fit only on the 13 non-suspect months that have a prior-month HDD available (May 2024 is also non-suspect, but the data contain no April 2024 HDD, so it drops out of the lagged fit). The disputed months therefore play no role in determining what “normal” looks like; they are evaluated as held-out observations against the clean-month fit. (Fitting on the contaminated data would let a single large outlier like April 2025 drag the regression line and partly mask itself.)
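The fit described above can be sketched in a few lines of plain Python. This is an illustrative reconstruction, not the report's original code: the names `DATA`, `BASELINE`, and `fit_pinned` are hypothetical, and the closed-form 2×2 normal equations stand in for whatever regression routine was actually used. The data are copied from the table above.

```python
# (month, therms, HDD base 65F, flagged) copied from the data table
DATA = [
    ("May 2024", 205, 236.4, False), ("Jun 2024", 75, 24.7, False),
    ("Jul 2024", 62, 1.1, False), ("Aug 2024", 43, 12.8, False),
    ("Sep 2024", 48, 63.7, False), ("Oct 2024", 123, 287.3, False),
    ("Nov 2024", 422, 495.9, False), ("Dec 2024", 1396, 922.4, False),
    ("Jan 2025", 2405, 1121.2, False), ("Feb 2025", 2805, 952.3, False),
    ("Mar 2025", 1350, 707.6, True), ("Apr 2025", 5476, 448.9, True),
    ("May 2025", 2998, 236.2, True), ("Jun 2025", 0, 61.0, True),
    ("Jul 2025", 0, 2.2, True), ("Aug 2025", 0, 20.2, True),
    ("Sep 2025", 123, 55.8, False), ("Oct 2025", 130, 299.8, False),
    ("Nov 2025", 654, 617.3, False), ("Dec 2025", 1311, 1030.6, False),
]
BASELINE = 57.0  # fixed non-heating baseline, therms/month


def fit_pinned(rows, baseline=BASELINE):
    """Least squares for (therms - baseline) = b1*HDD_t + b2*HDD_{t-1}, no intercept."""
    s_hh = s_ll = s_hl = s_hy = s_ly = 0.0
    for hdd, lag, therms in rows:
        y = therms - baseline  # remove the pinned baseline before fitting
        s_hh += hdd * hdd
        s_ll += lag * lag
        s_hl += hdd * lag
        s_hy += hdd * y
        s_ly += lag * y
    det = s_hh * s_ll - s_hl ** 2  # determinant of the 2x2 normal equations
    b1 = (s_hy * s_ll - s_hl * s_ly) / det
    b2 = (s_hh * s_ly - s_hl * s_hy) / det
    return b1, b2


# Clean months only; the first row is skipped because it has no prior-month HDD.
clean_rows = [
    (DATA[i][2], DATA[i - 1][2], DATA[i][1])
    for i in range(1, len(DATA))
    if not DATA[i][3]
]
b1, b2 = fit_pinned(clean_rows)
print(f"n = {len(clean_rows)}, current-month HDD = {b1:.3f}, lagged HDD = {b2:.3f}")
```

Run on the tabulated data, this sketch should land on the reported estimates to within rounding, which is a useful check that the published table and the published coefficients describe the same fit.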
Diagnostics reported below:
Robust regression (`MASS::rlm`, Huber loss): an independent estimator that automatically down-weights outliers. If the suspect months are truly anomalous, the robust fit’s coefficients should resemble the clean-only fit’s coefficients, and the suspect-month observations should receive small weights.

| term | estimate | std_err | t_value | p_value |
|---|---|---|---|---|
| hdd65 | -0.014 | 0.281 | -0.05 | 0.961 |
| hdd_lag1 | 2.367 | 0.360 | 6.57 | 4.0e-05 |
The previous-month HDD coefficient (≈ 2.37 therms per HDD) carries essentially all the heating signal: most of the gas burned during a given calendar month shows up on the next month’s meter read, consistent with monthly read dates trailing the calendar month. The current-month HDD coefficient is statistically indistinguishable from zero (p = 0.96). The fit is tight (residual σ ≈ 219 therms against winter readings of 1,400–2,800 therms), and the resulting predictions are physically sensible at every point in the year, because the floor is the empirical baseline rather than wherever an unconstrained OLS intercept happens to land.
| month | actual | predicted | pi_95 | residual | outside_pi |
|---|---|---|---|---|---|
| Mar 2025 | 1350 | 2301 | [1,689, 2,913] | -951 | yes |
| Apr 2025 | 5476 | 1725 | [1,149, 2,302] | 3751 | yes |
| May 2025 | 2998 | 1116 | [585, 1,648] | 1882 | yes |
| Jun 2025 | 0 | 615 | [110, 1,120] | -615 | yes |
| Jul 2025 | 0 | 201 | [-282, 685] | -201 | |
| Aug 2025 | 0 | 62 | [-420, 543] | -62 | |
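As a rough check on the table above, the sketch below recomputes each suspect month's prediction from the published (rounded) coefficients, compares the actual read against the published prediction interval, and accumulates the running gap between actual and predicted billing. The names `SUSPECT` and `evaluate` are hypothetical; because the coefficients are rounded to three decimals, predictions can differ from the table by up to about one therm.

```python
BASELINE, B_HDD, B_LAG = 57.0, -0.014, 2.367  # published, rounded coefficients

# (month, actual therms, HDD_t, HDD_{t-1}, PI lower, PI upper) from the tables above
SUSPECT = [
    ("Mar 2025", 1350, 707.6, 952.3, 1689, 2913),
    ("Apr 2025", 5476, 448.9, 707.6, 1149, 2302),
    ("May 2025", 2998, 236.2, 448.9, 585, 1648),
    ("Jun 2025", 0, 61.0, 236.2, 110, 1120),
    ("Jul 2025", 0, 2.2, 61.0, -282, 685),
    ("Aug 2025", 0, 20.2, 2.2, -420, 543),
]


def evaluate(rows):
    """Return per-month prediction, residual, PI flag, and cumulative excess."""
    results, running = [], 0.0
    for month, actual, hdd, lag, lo, hi in rows:
        predicted = BASELINE + B_HDD * hdd + B_LAG * lag
        residual = actual - predicted
        running += residual  # cumulative over-billing since the anchor month
        results.append({
            "month": month,
            "predicted": predicted,
            "residual": residual,
            "outside_pi": actual < lo or actual > hi,
            "cumulative_excess": running,
        })
    return results


results = evaluate(SUSPECT)
for r in results:
    print(f"{r['month']}: predicted {r['predicted']:7.1f}, residual {r['residual']:8.1f}, "
          f"outside PI: {r['outside_pi']}, cumulative excess {r['cumulative_excess']:8.1f}")
```

The cumulative column shows the shape of the dispute: the gap dips negative in March, jumps in April and May, and then shrinks as the three zero reads partially offset the spikes, ending near the reported aggregate excess.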
Monthly therms with 95% prediction interval (shaded) from the clean-only HDD model. Red points: flagged months. April and May 2025 sit far above the upper PI; March 2025 sits below the lower PI.
| month | therms | rstudent | cooks_d | dffits | rstud_flag | cooks_flag |
|---|---|---|---|---|---|---|
| Apr 2025 | 5476 | 5.72 | 0.846 | 2.75 | yes | yes |
| Mar 2025 | 1350 | -2.17 | 0.463 | -1.31 | | yes |
| May 2025 | 2998 | 1.49 | 0.099 | 0.57 | | |
| Jun 2025 | 0 | -0.99 | 0.044 | -0.36 | | |
| Jun 2024 | 75 | -0.97 | 0.050 | -0.39 | | |
| Feb 2025 | 2805 | -0.78 | 0.100 | -0.54 | | |
| Dec 2024 | 1396 | 0.43 | 0.021 | 0.24 | | |
| Jul 2025 | 0 | -0.33 | 0.004 | -0.11 | | |
April 2025 is the only point that exceeds the Bonferroni-corrected externally-studentized-residual threshold (|rstudent| = 5.72 > 3.60). It also has by far the largest Cook’s distance (0.85), confirming that even with the suspect months included, the regression cannot smooth them away. March 2025 also exceeds the Cook’s-D threshold despite a more modest studentized residual, consistent with the under-billing story.
| statistic | value |
|---|---|
| Prior-summer (Jun–Sep 2024) baseline mean (therms/mo) | 57.0 |
| Prior-summer (Jun–Sep 2024) baseline SD | 14.4 |
| Z-score of a single 0 reading vs. baseline | -3.95 |
| Pr(single zero \| Normal baseline) | 3.97e-05 |
| Pr(three consecutive zeros \| indep. Normal) | 6.28e-14 |
| Pr(three consecutive zeros \| Poisson, rate=mean) | 5.44e-75 |
The prior summer (June–September 2024) shows a tight baseline of ~57 therms/month for hot water and cooking. A single zero reading is ~4 SDs below that baseline; three independent zero readings have a joint probability somewhere between 6 × 10⁻¹⁴ (under a Normal model that allows some negative noise) and 5 × 10⁻⁷⁵ (under a Poisson model with the same mean). Either bound is far beyond any reasonable rejection threshold. The customer cannot have actually consumed exactly zero therms in three consecutive summer months.
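The figures in the table above follow from a few lines of arithmetic, sketched here with the Python standard library. The variable names are illustrative; the Normal tail uses the identity Φ(z) = ½·erfc(−z/√2), and the Poisson case uses P(X = 0) = e^(−λ) with λ set to the baseline mean.

```python
import math
from statistics import mean, stdev

summer_2024 = [75, 62, 43, 48]  # Jun-Sep 2024 therms from the data table

mu = mean(summer_2024)          # baseline mean, therms/month
sd = stdev(summer_2024)         # sample standard deviation
z = (0 - mu) / sd               # z-score of a single zero reading

# Normal model: probability of a reading at or below zero
p_single = 0.5 * math.erfc(-z / math.sqrt(2))
p_three_normal = p_single ** 3  # three independent months

# Poisson model with rate equal to the baseline mean: P(X = 0) = exp(-rate)
p_three_poisson = math.exp(-3 * mu)

print(f"mean = {mu:.1f}, sd = {sd:.1f}, z = {z:.2f}")
print(f"P(single zero | Normal)   = {p_single:.3g}")
print(f"P(three zeros | Normal)   = {p_three_normal:.3g}")
print(f"P(three zeros | Poisson)  = {p_three_poisson:.3g}")
```

Both models treat the three months as independent, which is the assumption behind the joint probabilities quoted in the text.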
If the gas company shifted real consumption between months via estimated reads, the only physically meaningful quantity is the total therms delivered across the disputed window. We test the sum of the residuals, using the full covariance of the fitted coefficients to get an honest standard error:
| quantity | value |
|---|---|
| Sum of actual therms billed (Mar–Aug 2025) | 9,824 |
| Sum of model-predicted therms | 6,021 |
| Excess billed (actual − predicted) | 3,803 |
| Standard error of the excess | 734 |
| 95% CI on the excess | [2,188, 5,418] |
| Test statistic | t = 5.18 on 11 df |
| One-sided p-value (H0: excess ≤ 0) | 0.00015 |
Even the lower bound of the 95% confidence interval, ~2,188 therms, is substantial. Summing the upper prediction-interval bounds of all six suspect months gives 9,211 therms, which is still below the 9,824 therms actually billed: the bill exceeds what the model would tolerate even under its most generous month-by-month interpretation.
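The aggregate figures can be re-derived from the reported totals, as in the sketch below. It takes the predicted sum and the standard error as given (reproducing the standard error would need the full coefficient covariance, which is not tabulated here) and hard-codes the two-sided 95% critical value of Student's t on 11 degrees of freedom, 2.201, because the Python standard library has no t quantile function. All variable names are illustrative.

```python
T_CRIT = 2.201  # two-sided 95% critical value, Student's t with 11 df (table value)

actual = [1350, 5476, 2998, 0, 0, 0]            # Mar-Aug 2025 billed therms
predicted_total = 6021                          # reported sum of model predictions
se_excess = 734                                 # reported standard error of the excess
pi_upper = [2913, 2302, 1648, 1120, 685, 543]   # monthly upper PI bounds

actual_total = sum(actual)
excess = actual_total - predicted_total
t_stat = excess / se_excess
ci_low = excess - T_CRIT * se_excess
ci_high = excess + T_CRIT * se_excess
pi_upper_total = sum(pi_upper)

print(f"actual total  = {actual_total}")
print(f"excess        = {excess}")
print(f"t statistic   = {t_stat:.2f}")
print(f"95% CI        = [{ci_low:.0f}, {ci_high:.0f}]")
print(f"sum of upper PI bounds = {pi_upper_total} (billed: {actual_total})")
```

Because the inputs are rounded to whole therms, the interval endpoints from this sketch may differ from the table by about one therm.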
The primary model in §1 pins the non-heating intercept at the customer’s empirical prior-summer baseline (57 therms/month) instead of estimating it from the data. The natural sensitivity check is to drop that pin and let ordinary least squares estimate the intercept freely from the clean months — both because OLS is the methodologically conventional default, and because we want to confirm that the headline excess doesn’t hinge on the modeling choice. If the two fits roughly agree, the baseline-pinning is doing what it’s supposed to do (excluding physically-impossible negative summer predictions) without distorting the substantive answer.
| quantity | Baseline-pinned (primary) | Unconstrained OLS |
|---|---|---|
| Sum predicted | 6,021 | 5,293 |
| Excess billed | 3,803 | 4,531 |
| 95% CI on excess | [2,188, 5,418] | [2,830, 6,231] |
The two specifications agree within their confidence intervals. The baseline-pinned model is the more defensible headline — it forecloses the “you got free credit for the impossible zero readings” objection.
| month | therms | suspect | weight |
|---|---|---|---|
| Apr 2025 | 5476 | TRUE | 0.071 |
| May 2025 | 2998 | TRUE | 0.138 |
| Mar 2025 | 1350 | TRUE | 0.283 |
| Jun 2025 | 0 | TRUE | 0.528 |
| Jun 2024 | 75 | FALSE | 0.623 |
| Jul 2024 | 62 | FALSE | 1.000 |
| Aug 2024 | 43 | FALSE | 1.000 |
| Sep 2024 | 48 | FALSE | 1.000 |
The robust fit’s coefficients (intercept −67, HDD 0.09, lag1 HDD 2.42) are close to the unconstrained clean fit’s (intercept −97, HDD 0.21, lag1 HDD 2.31). A completely different estimator, told nothing about which months we suspected, reaches the same conclusion about which months are anomalous.
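To illustrate how a Huber-type robust fit assigns the weights discussed above, here is a minimal iteratively reweighted least squares (IRLS) sketch in plain Python. It is an approximation of the idea, not a reimplementation of `MASS::rlm`: the tuning constant 1.345 and the MAD/0.6745 scale estimate are the conventional Huber choices and are assumed here, the helper names (`solve`, `wls`, `huber_irls`) are hypothetical, and the resulting weights need not match the table exactly.

```python
from statistics import median

# (month, therms, HDD base 65F) copied from the data table; all months are used
DATA = [
    ("May 2024", 205, 236.4), ("Jun 2024", 75, 24.7), ("Jul 2024", 62, 1.1),
    ("Aug 2024", 43, 12.8), ("Sep 2024", 48, 63.7), ("Oct 2024", 123, 287.3),
    ("Nov 2024", 422, 495.9), ("Dec 2024", 1396, 922.4), ("Jan 2025", 2405, 1121.2),
    ("Feb 2025", 2805, 952.3), ("Mar 2025", 1350, 707.6), ("Apr 2025", 5476, 448.9),
    ("May 2025", 2998, 236.2), ("Jun 2025", 0, 61.0), ("Jul 2025", 0, 2.2),
    ("Aug 2025", 0, 20.2), ("Sep 2025", 123, 55.8), ("Oct 2025", 130, 299.8),
    ("Nov 2025", 654, 617.3), ("Dec 2025", 1311, 1030.6),
]
K = 1.345  # conventional Huber tuning constant (assumed)


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def wls(X, y, w):
    """Weighted least squares via the normal equations X'WX b = X'Wy."""
    p = len(X[0])
    xtwx = [[sum(wi * xi[a] * xi[b] for wi, xi in zip(w, X)) for b in range(p)]
            for a in range(p)]
    xtwy = [sum(wi * xi[a] * yi for wi, xi, yi in zip(w, X, y)) for a in range(p)]
    return solve(xtwx, xtwy)


def huber_irls(X, y, k=K, iters=50):
    """Alternate WLS fits with Huber weights w = min(1, k*scale/|residual|)."""
    w = [1.0] * len(y)
    for _ in range(iters):
        beta = wls(X, y, w)
        resid = [yi - sum(b * x for b, x in zip(beta, xi)) for xi, yi in zip(X, y)]
        scale = median(abs(r) for r in resid) / 0.6745  # MAD-based scale
        w = [min(1.0, k * scale / abs(r)) if r else 1.0 for r in resid]
    return beta, w


# Design rows: intercept, current-month HDD, previous-month HDD (first month has no lag)
months = [DATA[i][0] for i in range(1, len(DATA))]
X = [[1.0, DATA[i][2], DATA[i - 1][2]] for i in range(1, len(DATA))]
y = [float(DATA[i][1]) for i in range(1, len(DATA))]

beta, weights = huber_irls(X, y)
weights_by_month = dict(zip(months, weights))
for month, wt in sorted(weights_by_month.items(), key=lambda kv: kv[1])[:5]:
    print(f"{month}: weight {wt:.3f}")
```

The point of the design is that the estimator is never told which months are suspect: the weights come entirely from the size of each residual relative to the robust scale.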
The most conservative possible reading: take March 2025 and the three summer zeros at face value, exclude only the two largest spikes (April and May 2025) from the training set, and ask whether those two months are still statistically extreme.
| term | estimate | std_err | p_value |
|---|---|---|---|
| hdd65 | 0.536 | 0.334 | 0.129 |
| hdd_lag1 | 1.502 | 0.391 | 0.002 |
| month | actual | predicted | pi_95 | outside_pi |
|---|---|---|---|---|
| Apr 2025 | 5476 | 1361 | [651, 2,070] | yes |
| May 2025 | 2998 | 858 | [185, 1,531] | yes |
Even under this maximally-charitable refit, Apr 2025 (5,476) exceeds the upper PI by ~3,406 therms and May 2025 (2,998) exceeds it by ~1,467 therms. The case for over-billing on these two months does not depend on the treatment of the other suspect observations.
| term | estimate | std_err | p_value |
|---|---|---|---|
| hdd_lag1 | 2.35 | 0.121 | 2.1e-10 |
Dropping the (insignificant) current-month HDD term still leaves a model with R² ≈ 0.97 and the same qualitative conclusions for Apr/May 2025.
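A lag-only fit is simple enough to write in closed form. The sketch below assumes the lag-only model keeps the 57-therm baseline pinned (the single-row coefficient table suggests there is no free intercept) and is fit on the same 13 clean months; under those assumptions the slope is Σ HDD₍t−1₎·(therms − 57) / Σ HDD₍t−1₎². The names are illustrative.

```python
BASELINE = 57.0  # pinned non-heating baseline (assumed to carry over to this model)

# (previous-month HDD, therms) for the 13 clean months, from the data table
CLEAN = [
    (236.4, 75), (24.7, 62), (1.1, 43), (12.8, 48), (63.7, 123),
    (287.3, 422), (495.9, 1396), (922.4, 2405), (1121.2, 2805),
    (20.2, 123), (55.8, 130), (299.8, 654), (617.3, 1311),
]

# One-parameter least squares through the pinned baseline
num = sum(lag * (therms - BASELINE) for lag, therms in CLEAN)
den = sum(lag * lag for lag, _ in CLEAN)
b_lag = num / den

# Predictions for the two headline months use the previous month's HDD
pred_apr = BASELINE + b_lag * 707.6  # April 2025, lagged HDD = March 2025
pred_may = BASELINE + b_lag * 448.9  # May 2025, lagged HDD = April 2025

print(f"lag-only slope = {b_lag:.3f} therms per HDD")
print(f"Apr 2025: predicted {pred_apr:.0f} vs 5476 billed")
print(f"May 2025: predicted {pred_may:.0f} vs 2998 billed")
```

With the slope essentially unchanged from the two-regressor model, the April and May predictions stay far below the billed amounts, which is what "the same qualitative conclusions" amounts to in practice.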
Residuals vs. fitted values, clean-only model. The flagged points are visually distinguishable from the cluster of well-behaved residuals around zero.
Three independent lines of evidence point to over-billing during March–August 2025:
The pattern matches a meter that was reading low (or being estimated low) through early 2025, followed by one or two large “catch-up” estimated bills in April and May that substantially overshot true consumption, and then three months of further estimated reads at zero. Real metered service appears to resume from September 2025 onward, with usage tracking the HDD model cleanly through year-end.