This document is a consolidated reference for the Vermont tax revenue forecasting project. The forecasting target is annual Vermont state tax revenue across approximately 38 categories. Forecasts are updated twice yearly:
January update: Forecasts FY2026 through FY2030 (five years). Individual filer data available through December of the prior year covers the first six months of the current fiscal year, making FY2026 a hybrid nowcast-plus-forecast problem.
July update: Forecasts FY2027 through FY2031 (five years). FY2026 is fully observed and serves as the structural base year. The legislature votes on the first two forecast years (FY2027–FY2028), making these the most consequential.
The two most important forecasting challenges are (1) large-payer concentration — a handful of filers, particularly in CORP and ESTATE, can dominate year-to-year volatility in ways that macro models cannot capture — and (2) within-year real-time signals — individual filer data for the first half of the fiscal year provides contemporaneous information that improves the current-year nowcast.
Source file: Revenue Analysis Data - CALM F0126.xlsx, sheet CALM Data Entry.
Cleaned table: calm_clean_tbl.
The raw file is read in three range segments and column-bound.
Cleaning: first column renamed to Date and parsed as date; last row dropped; character columns coerced to numeric; columns whose last value is NA dropped; pirefpos dropped as redundant.
Aggregate/subtotal columns (TAXREV, OTHEREV, GENREV, TRANSREV, OTHBIG5)
are excluded as forecast targets.
Important: CALM captures gross tax collections at the source, before any fund allocation. Changes in how revenue is split across the General Fund, Education Fund, Transportation Fund, or other dedicated funds are downstream accounting matters that have no effect on CALM values and are irrelevant to forecasting.
Major Taxes
| Code | Description | Start | End | Complete Rate |
|---|---|---|---|---|
| PINCOME | Personal Income Tax | 1977-07 | 2025-12 | 1.000 |
| PIWITH | PIT — Withholding | 1987-07 | 2025-12 | 0.794 |
| PIEST | PIT — Estimated payments | 1987-07 | 2025-12 | 0.794 |
| PIPAID | PIT — Total paid | 1987-07 | 2025-12 | 0.794 |
| PIREF | PIT — Refunds | 1987-07 | 2025-12 | 0.794 |
| PIOTHER | PIT — Other | 2003-07 | 2025-12 | 0.464 |
| S&U | Sales & Use Tax | 1977-07 | 2025-12 | 1.000 |
| CORP | Corporate Income Tax | 1977-07 | 2025-12 | 1.000 |
| M&R | Meals & Rooms Tax | 1977-07 | 2025-12 | 1.000 |
| ESTATE | Estate Tax | 1977-07 | 2025-12 | 1.000 |
Additional categories include CIG, LIQ, INSUR, BANK, GAS, DIESEL, MVP&U, MVFEES, PARIM, PROPT, LOT, INT, and OTHREV, spanning excise, property, business, non-tax, and transportation revenues.
Target variables are transformed to annual fiscal-year year-over-year percent change: monthly values are summed to FY totals (dates shifted +6 months before annual grouping, so July 1977–June 1978 = FY1978), then Pct = Value / lag(Value) - 1. FY2026 is excluded because it is the forecast target.
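The fiscal-year transformation can be sketched as follows. This is a minimal Python illustration, not the project's own pipeline; the function names and the flat monthly sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date

def fiscal_year(d: date) -> int:
    """July-June fiscal year, labeled by the calendar year in which it ends.

    Equivalent to shifting the date forward six months and taking the year.
    """
    return d.year + 1 if d.month >= 7 else d.year

def fy_pct_change(monthly):
    """monthly: iterable of (date, value). Returns {fy: Value / lag(Value) - 1}."""
    totals = defaultdict(float)
    for d, value in monthly:
        totals[fiscal_year(d)] += value
    years = sorted(totals)
    return {
        fy: totals[fy] / totals[prev] - 1
        for prev, fy in zip(years, years[1:])
        if fy == prev + 1 and totals[prev] != 0  # only consecutive fiscal years
    }

def fy_months(fy: int):
    """First-of-month dates for July (fy - 1) through June (fy)."""
    return [date(fy - 1, m, 1) for m in range(7, 13)] + [date(fy, m, 1) for m in range(1, 7)]

# Hypothetical data: 10 per month in FY1978, 11 per month in FY1979.
sample = [(d, 10.0) for d in fy_months(1978)] + [(d, 11.0) for d in fy_months(1979)]
result = fy_pct_change(sample)
print(result)
```

The sketch does not detect partial fiscal years, so an incomplete year such as FY2026 would need to be dropped before calling it.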
Key properties: near-zero annual AR1 for most taxes (range -0.44 to +0.45); ESTATE is the most volatile (SD = 0.722, AR1 = -0.436) and least forecastable from macro variables (max r = 0.442); M&R is the most forecastable (top macro predictor r = 0.830); MVP&U has the strongest single mechanistic predictor link (motor vehicle PCE, r = 0.781); CORP has moderate macro correlations (max r = 0.531) and is heavily affected by large-payer concentration; INT has an extreme right tail (p99 = 15.4); GAS has a significant negative time trend (r = -0.420).
Variables excluded from statistical modeling: OTHTT (0 observations), ELECANNABIS (Inf/NaN from structural break), SPEC (NaN/Inf from sign-changing base values), DIESLIC (4 observations).
Variables requiring simplified modeling: MFASSMT (12 obs), TIBGAS (15 obs), TIBDIESEL (15 obs), PIOTHER (22 obs), PARIM (19 obs), BANK (40 obs).
Strong positive clusters: TIBDIESEL ↔︎ DIESEL (r = 0.918), TIBDIESEL ↔︎ BEV (r = 0.908), PINCOME ↔︎ PIEST (r = 0.883), PINCOME ↔︎ PIWITH (r = 0.864), PIWITH ↔︎ INSUR (r = 0.857), M&R ↔︎ TIBGAS (r = 0.826), MVP&U ↔︎ S&U (r = 0.795).
Notable near-zero or negative: ESTATE ↔︎ M&R (r = -0.026), PARIM ↔︎ ESTATE (r = -0.546), INT near-zero with almost all taxes.
| Abbreviation | Full File Name | Notes |
|---|---|---|
| BFT_ALLOC | BFT Allocations to December 2025 Period - AS OF 121825 | Prior: File BFT-2 |
| BFT_RTN | BFT Returns CY20 Through CY25 - AS OF 121825 | Prior: File BFT-1 |
| BIT_CIT_NORETURN | BIT & CIT - Payments on Periods With No Return - Prepared 20260105 | New; 6 sheets: 2025 BIT, 2024 BIT, 2025 CIT, 2024 CIT, 2023 CIT, query |
| BIT_PMT_CUM | BIT Payments Cumulative by Account - FY26 First Half Update - Prepared 2025-12-29 | New; 3 sheets: FY26-FY25 Compare, FY26 First Half Only, FY25 First Half Only |
| CAPGAIN_SUM | Capital Gains Summary TY16 - TY24 - Prepared 2025-12-19 | Prior: File 2 |
| CCC_DUE | CCC Due on PIT Returns - Prepared 2025-12-19 | Prior: File 3 |
| PIT_AGI100K_CHG | Change TY24-TY23 in PIT Filers with AGI over $100K - Prepared 2025-12-19 | Prior: File 1 |
| CIG_STAMPS | Cigarette Stamps CY 2025 Totals by Vendor - AS OF 12.18.25 | New; leading indicator for CIG |
| CIT_CARRYFORWARD | CIT Carryforward - Sums by FY - January 2026 update | New; structural headwind/tailwind signal |
| CIT_INFO | CIT Information January 2026 - AS OF 12.18.25 | New; 6 sheets: Pending Refunds, Outstanding Bills FY23–FY26, FY26 Undirected Ext. Payments, Query |
| CIT_PMT_CUM | CIT Payments Cumulative by Account - FY26 First Half Update | 3-sheet structure identical to BIT_PMT_CUM; reconciliation flag pending |
| CIT_BIT_NRW_10K | CIT, BIT, & NRW Payments over $10K, First Half of FY26 - Prepared 2025-12-29 | Prior: Files 5a/5b; per-transaction ≥$10K threshold; 4 sheets: CIT, BIT, NRW, Query |
| CTT_INFO | CTT Info 06-30-2025 Through 12-31-2025 - AS OF 12.18.25 | New; Cigarette/Tobacco Tax |
| EST_EXT_100K | EST Extension & Return Payments Over $100K 07-01-25 Through 12-31-25 - AS OF 12.18.25 | Prior: File 7; 2 sheets: DATA, QUERY |
| ESTATE_FILINGS | Estate Tax Filings 07-01-2025 - 12-31-2025 - Prepared 2025-12-19 | Prior: File 8; 1 sheet: Table |
| MRT_STR_MONTHLY | MRT & STR Monthly totals 01-01-2019 Through 11-30-2025 - Prepared 2025-12-19 | Prior: File 9 |
| NRW_PMT_CUM | NRW Payments Cumulative by Account - FY26 First Half Update - Prepared 2025-12-29 | Prior: File 5b extended |
| CIT_PMT_ANNUAL | Payment Totals by Year - CIT - Prepared 20260106 | Prior: File 10; fully analyzed |
| ANY_PMT_50K | Payments (Any Tax) over $50K, First Half of FY26 - Prepared 2025-12-29 | Prior: File 11 |
| WHT_FILINGS | WHT Quarterly Filings Over $25K CY2025 - AS OF 12.18.25 | Prior: File 12 |
These files are not the forecasting target and do not represent the full population of filers. Every file has at least one filter (dollar threshold, date window, status filter, or residency filter). Their relationship to aggregate totals should be treated as an empirical question.
BIT/source tax revenue mapping: BIT maps to CORP in source tax revenues — confirmed with Tax Department (May 2026).
NRW mapping: NRW maps to CORP in source tax revenues — confirmed with Tax Department (May 2026).
CIT_PMT_CUM reconciliation flag: Unresolved 5.17× discrepancy with CIT_PMT_ANNUAL. Do not use CIT_PMT_CUM YoY signal as standalone nowcast input until threshold definition is confirmed.
Sign convention: Payments stored negative in the
transaction system. CIT_BIT_NRW_10K applies * -1 in SQL
before export. CIT_PMT_CUM requires negation on read.
Excel subtotal rows: EST_EXT_100K requires filtering
to ^EST-[0-9]+$ before analysis.
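A minimal Python sketch of the two read-time fixes described above: negating stored-negative payments and dropping Excel subtotal rows by account ID pattern. The column names (account_id, amount) and the sample rows are hypothetical; only the regular expression comes from this document.

```python
import re

# Pattern quoted in this document for genuine EST_EXT_100K account rows.
EST_ACCOUNT = re.compile(r"^EST-[0-9]+$")

def drop_subtotal_rows(rows, id_key="account_id"):
    """Keep only rows whose account ID is exactly EST-<digits>."""
    return [r for r in rows if EST_ACCOUNT.match(str(r.get(id_key, "")))]

def negate_amounts(rows, amount_key="amount"):
    """Flip the sign of stored-negative payments so inflows read as positive."""
    return [{**r, amount_key: -r[amount_key]} for r in rows]

# Hypothetical EST_EXT_100K-style rows, including Excel subtotal lines.
est_rows = [
    {"account_id": "EST-00000001", "amount": 150000.0},
    {"account_id": "EST-00000001 Total", "amount": 150000.0},
    {"account_id": "Grand Total", "amount": 150000.0},
]
est_clean = drop_subtotal_rows(est_rows)

# Hypothetical CIT_PMT_CUM-style rows, stored negative in the transaction system.
cum_rows = [{"account_id": "A", "amount": -25000.0}]
cum_clean = negate_amounts(cum_rows)
```

Keeping the two helpers separate avoids negating a file such as CIT_BIT_NRW_10K twice, since its amounts already arrive positive.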
Account ID systems: EST_EXT_100K uses EST-XXXXXXXX format; ESTATE_FILINGS uses decedent SSN or V-prefixed ID. Tax Department acknowledged the mismatch and will provide revised files (May 2026). Cross-file join pending receipt of revised files.
Confidentiality: All individual filer files contain
confidential taxpayer information. Use scrub_confidential()
at point of ingestion before any preview or analysis output. See Section
14.4 for the data safety utility.
Personal Income Tax files
PIT_AGI100K_CHG — Individual filer comparison TY24 vs TY23. AGI ≥ $100K only; residents only.
CAPGAIN_SUM — Capital gains by filing period, TY2016–present. Residents only; aggregated.
CCC_DUE — Single aggregate row for TY2024.
Corporate Income Tax / Business Income Tax files
CIT_PMT_ANNUAL, CIT_PMT_CUM, CIT_BIT_NRW_10K, CIT_CARRYFORWARD, CIT_INFO — Fully analyzed; see Sections 6–10.
BIT_CIT_NORETURN, BFT_RTN, BFT_ALLOC — Not yet analyzed.
Estate Tax files — Fully analyzed; see Section 5.
Meals & Rooms Tax files
MRT_STR_MONTHLY — Monthly M&R totals by sub-category from January 2019 through November 2025.
Withholding Tax files
WHT_FILINGS — Large withholding filers (tax due > $25K), quarterly, CY2025.
| CALM Category | Individual Files | Notes |
|---|---|---|
| PINCOME | PIT_AGI100K_CHG, CAPGAIN_SUM, CCC_DUE | Top-level PIT aggregate |
| PIWITH | WHT_FILINGS | Large filers (tax due > $25K); CY2025 |
| PIEST | None | BIT_CIT_NORETURN was provisionally mapped here; BIT is now confirmed to map to CORP, not PIEST (see CORP row) |
| CORP | CIT_PMT_ANNUAL, CIT_PMT_CUM, CIT_BIT_NRW_10K (CIT, BIT, and NRW sheets), CIT_CARRYFORWARD, CIT_INFO, BIT_PMT_CUM, BIT_CIT_NORETURN | BIT and NRW confirmed to map to CORP (May 2026) |
| M&R | MRT_STR_MONTHLY | |
| CIG/BEV | CIG_STAMPS, CTT_INFO | |
| ESTATE | ESTATE_FILINGS, EST_EXT_100K | Non-overlapping; complementary |
| BANK | BFT_RTN, BFT_ALLOC | |
| S&U | None | Relies entirely on macro variables |
| GAS/DIESEL/Transportation | None | Relies entirely on macro variables |
| PARIM | None | Relies entirely on macro variables |
Synthetic data files: synthetic_ESTATE_FILINGS.rds; synthetic_CIT_PMT_ANNUAL_long.rds and synthetic_CIT_PMT_ANNUAL_wide.rds; synthetic_CIT_CARRYFORWARD.rds.
For CORP and ESTATE, the preferred approach decomposes total revenue into: (1) the base component: revenue from the broad population of small and medium filers, trackable by macro models; and (2) the large-filer component: revenue from a small number of large filers requiring individual filer files. Total forecast = Component 1 + Component 2, with separate uncertainty bands.
Base-year cleaning: The structural base year adjusts for idiosyncratic large-filer effects: \(Y^*_{FY2026} = Y_{FY2026}^{actual} - \hat{\epsilon}_{large,FY2026}\). For CORP, use FY2021–FY2024 (post-PTET) as the reference period. The 2022 CIT apportionment overhaul (effective FY2024) is an additional structural break that should be considered when using FY2024 as a reference — accounts whose Vermont apportionment changed materially under the new single-factor rule may have anomalous FY2024 payments that are neither idiosyncratic nor representative of the new normal.
Step-by-step:
1. Run the macro model for the base component forecast for each tax, using appropriate post-break estimation windows (see Section 13).
2. Pull July–December large-payer data and compute the year-over-year growth rate vs. the same period of the prior year.
3. For ESTATE, enumerate the known pipeline from ESTATE_FILINGS ($13.9M committed H2 revenue) plus MPYEXT accounts in EST_EXT_100K.
4. For CORP, use the CIT_PMT_ANNUAL H1 signal ($103M across 861 accounts) with the post-PTET H1 ratio (0.380); adjust for the carryforward regime and the 2022 apportionment change context.
5. Combine base component + large-filer projection + known pipeline.
6. Present the forecast with explicit decomposition.
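The growth-rate and combination steps can be sketched in Python as below. This is an illustration under stated assumptions, not the project's model: the Component class, the function names, and every numeric input are placeholders. Summing the bands end to end is one simple (conservative) way to carry separate uncertainty bands through to the total.

```python
from dataclasses import dataclass

@dataclass
class Component:
    point: float  # central estimate, $M
    low: float    # lower band, $M
    high: float   # upper band, $M

def h1_yoy_growth(current_h1: float, prior_h1: float) -> float:
    """Growth of July-December large-payer cash vs. the same period a year earlier."""
    return current_h1 / prior_h1 - 1

def combine(base: Component, large_filer: Component, pipeline: float = 0.0) -> Component:
    """Total = base component + large-filer projection + known pipeline."""
    return Component(
        point=base.point + large_filer.point + pipeline,
        low=base.low + large_filer.low + pipeline,
        high=base.high + large_filer.high + pipeline,
    )

# All inputs below are placeholders, not project estimates.
growth = h1_yoy_growth(current_h1=55.0, prior_h1=50.0)
prior_large = 120.0  # hypothetical prior-year large-filer revenue, $M
large = Component(
    point=prior_large * (1 + growth),
    low=prior_large,                      # placeholder band: no growth
    high=prior_large * (1 + 2 * growth),  # placeholder band: double the H1 growth
)
base = Component(point=200.0, low=190.0, high=210.0)
total = combine(base, large, pipeline=10.0)
print(total)
```

Keeping the three pieces as separate objects until the final step makes the explicit decomposition in the last step straightforward to report.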
Model development note: Forecasting models are built and validated using synthetic data (see Section 14) before being applied to confidential real data. The synthetic data preserves all statistical properties documented in this brief.
Purpose 1 — Base year cleaning: Remove idiosyncratic effects from FY2026 actuals. Use post-PTET, post-apportionment-change reference period for CORP.
Purpose 2 — Concentration risk characterization: Report top-payer share and stress scenarios to the legislature.
CORP
CALM CORP identity: CALM CORP = CIT + BIT + NRW (confirmed mutually exclusive and collectively exhaustive, Tax Department May 2026). NRW is net carryforward credits applied against cash liability — suppresses CALM net collections below gross inflows. Individual filer files capture gross inflows only; CALM CORP is net of carryforward credits applied. Never sum individual filer gross totals as a direct CALM CORP estimate. CALM CORP H1 FY2026 = $76,107,953 (CIT $73.4M + BIT $24.0M − NRW $21.3M). See Section 15.3 for full reconciliation and H2 NRW scenario framework.
January: Primary signal is CIT_PMT_ANNUAL H1 FY2026 ($103M across 861 accounts, implied full-year ~$271.1M). Cross-check against CIT_BIT_NRW_10K CIT sheet ($76.9M). Note the carryforward regime change (Section 9) — the structural tailwind interpretation is conditional on summer 2026 extension filing data; treat as unresolved until then. Note that the 2022 apportionment overhaul likely caused the FY2024 payment decline (−12.6%) and may still be working through the filer population in FY2026.
July: Use CIT_PMT_ANNUAL for complete FY2026 panel. Apply post-PTET, post-apportionment-change reference window for base-year cleaning. For full-year CALM CORP projection, use pre-NRW structural H1 share (34.63%, FY2021–2023 mean) with explicit NRW scenario overlay (see Section 15.3).
ESTATE
January: Three-tier pipeline construction (Known: $13.9M / Extension: $5.4M collected / Structural base: macro model).
July: Remove outlier estates for structural base. Known large unsettled estates become FY2027 pipeline items.
PIWITH
January: WHT_FILINGS YoY growth rate vs. CY2024 as direct contemporaneous signal.
M&R
January: MRT_STR_MONTHLY sub-category breakdown; flag 2021 meal delivery platform structural shift.
Tier 1 — Point forecast for each of the five years.
Tier 2 — Structural decomposition for near-term years: macro base + large-filer component + known pipeline.
Tier 3 — Concentration risk statement — top-payer share and stress scenario.
Overall: 45 filings. Total adjusted VT estate tax = $26,762,047. Total prior payments = $13,527,359. Total amount due = $13,889,105. Total refunds = $654,417.
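The four totals above are mutually consistent with a simple accounting identity: amount due minus refunds equals adjusted tax minus prior payments. The check below is an inference from the reported figures rather than a documented definition of the ESTATE_FILINGS columns; the variable names are illustrative.

```python
# Totals as reported for ESTATE_FILINGS (45 filings).
total_adjusted_tax = 26_762_047
total_prior_payments = 13_527_359
total_refunds = 654_417
reported_amount_due = 13_889_105

# Filings that overpaid generate refunds; the rest generate amounts due.
# Summed over all filings: due - refunds = adjusted tax - prior payments.
implied_amount_due = total_adjusted_tax - total_prior_payments + total_refunds

print(implied_amount_due, implied_amount_due == reported_amount_due)
```

A failure of this check on a revised file would be a quick signal that a column definition or a filter has changed.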
Monthly distribution:
| Month | Filings | Total Adj Tax | Amount Due |
|---|---|---|---|
| July 2025 | 7 | $10,165,530 | $3,799,485 |
| August 2025 | 11 | $3,552,364 | $3,340,533 |
| September 2025 | 10 | $2,355,115 | $1,312,688 |
| October 2025 | 8 | $8,680,907 | $3,899,268 |
| November 2025 | 5 | $1,413,726 | $962,726 |
| December 2025 | 4 | $594,405 | $574,405 |
Size distribution:
| Band (VT Taxable Estate) | Filings | % of Filings | Adj Tax | % of Revenue |
|---|---|---|---|---|
| Under $1M | 3 | 6.8% | $0 | 0% |
| $1M–$2M | 1 | 2.3% | $0 | 0% |
| $2M–$5M | 5 | 11.4% | $0 | 0% |
| $5M–$10M | 21 | 47.7% | $3,111,216 | 11.6% |
| $10M+ | 14 | 31.8% | $23,650,831 | 88.4% |
The $5M effective exclusion threshold (current since January 1, 2021) is confirmed by the zero-tax bands below $5M. The 2019 estate tax legislation raised the exclusion from $2.75M to $4.25M (effective January 2020) and then to $5.0M (effective January 2021), materially reducing the number of taxable estates relative to the pre-2020 period. Historical ESTATE CALM data before FY2021 is not directly comparable to the current regime without adjustment.
Revenue concentration: top_1 = 24.6%, top_5 = 69.8%, top_10 = 88.2%, HHI = 0.124.
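For reference, the concentration statistics can be computed as in this short Python sketch. It assumes HHI is the sum of squared revenue shares on a 0 to 1 scale, which matches the magnitudes reported here; the function names and sample amounts are hypothetical.

```python
def top_k_share(amounts, k):
    """Share of total revenue contributed by the k largest payers."""
    total = sum(amounts)
    return sum(sorted(amounts, reverse=True)[:k]) / total

def hhi(amounts):
    """Herfindahl-Hirschman index: sum of squared shares, between 1/n and 1."""
    total = sum(amounts)
    return sum((a / total) ** 2 for a in amounts)

# Hypothetical payer amounts for illustration only.
payments = [50.0, 30.0, 20.0]
top_1 = top_k_share(payments, 1)
top_2 = top_k_share(payments, 2)
index = hhi(payments)
print(top_1, top_2, index)
```

Zero-tax filings should be dropped or kept consistently before calling these functions, since they change the payer count but not the shares.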
Death-to-receipt lag: median 434 days, modal band 12–18 months (52.9%), only 32.4% arrive within 9 months.
Death cohort pipeline:
| Death FY | Filings | Adj Tax | Amount Due | Mean Prior Pmt Ratio |
|---|---|---|---|---|
| FY2024 | 8 | $7,438,491 | $78,182 | ~1.06 (overpaid) |
| FY2025 | 17 | $18,651,061 | $13,188,422 | ~0.30 |
Overall: 10 real transactions totaling $13,905,140. Extension (MPYEXT): 2 payments, $5,401,516. Return (MPYRTN): 8 payments, $8,503,624. Concentration: top_1 = 31.4%, top_3 = 62.5%, HHI = 0.177. Zero overlap with ESTATE_FILINGS — structurally coherent.
Note: Tax Department has acknowledged an account identifier mismatch between EST_EXT_100K (EST-XXXXXXXX format) and ESTATE_FILINGS (decedent SSN or V-prefixed ID) and will provide revised files. Cross-referencing between the two files should be revisited upon receipt.
The PTET election introduced in FY2021 caused a one-time, non-cyclical regime shift:
| FY | Accounts | Total Large Payments | YoY Growth | YoY Acct Change |
|---|---|---|---|---|
| 2020 | 358 | $36,794,811 | — | — |
| 2021 | 996 | $150,542,598 | +309.1% | +638 |
| 2022 | 1,212 | $173,110,812 | +15.0% | +216 |
| 2023 | 1,405 | $214,223,857 | +23.7% | +193 |
| 2024 | 1,382 | $187,252,268 | −12.6% | −23 |
| 2025 | 1,596 | $248,980,916 | +33.0% | +214 |
The FY2024 decline of −12.6% is consistent with the 2022 CIT apportionment overhaul (effective January 1, 2023, i.e. TY2023, mapped to FY2024) reducing Vermont apportionment for some multistate corporations. This is a second structural break within the post-PTET window, though smaller in magnitude than the FY2021 PTET shift.
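The growth and account-change columns of the table can be reproduced from the account counts and payment totals alone. The Python sketch below uses the table's own figures; the dictionary and function names are illustrative.

```python
# FY -> (accounts, total large payments in dollars), copied from the table.
cit_large_payments = {
    2020: (358, 36_794_811),
    2021: (996, 150_542_598),
    2022: (1_212, 173_110_812),
    2023: (1_405, 214_223_857),
    2024: (1_382, 187_252_268),
    2025: (1_596, 248_980_916),
}

def yoy_table(table):
    """Return [(fy, pct growth in payments, change in account count), ...]."""
    years = sorted(table)
    rows = []
    for prev, cur in zip(years, years[1:]):
        (acct_prev, pay_prev), (acct_cur, pay_cur) = table[prev], table[cur]
        rows.append((cur, 100 * (pay_cur / pay_prev - 1), acct_cur - acct_prev))
    return rows

rows = yoy_table(cit_large_payments)
for fy, growth, acct_change in rows:
    print(f"FY{fy}: {growth:+.1f}%  accounts {acct_change:+d}")
```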
| Use Case | Recommended Period | Rationale |
|---|---|---|
| Concentration statistics | FY2021–FY2025 | FY2020 pre-PTET regime |
| Coverage ratio | FY2021–FY2025 | FY2020 coverage (33.8%) vs post-PTET mean (82.1%) |
| H1-to-full-year ratio | FY2021–FY2025 | Post-PTET mean 0.380 |
| Base-year cleaning reference | FY2021–FY2024 | FY2020 distorts post-PTET entrants |
Concentration: top_1_mean = 6.2%, top_5_mean = 16.4%, HHI_mean = 0.012.
Coverage ratio: mean = 82.1%, SD = 6.8%, range 76.1%–91.3%.
H1-to-full-year ratio: mean = 0.380, SD = 0.049, range 0.344–0.452.
Payment size distribution (log-normal): log_mean ≈ 10.7, log_sd ≈ 1.25, log_skew ≈ 0.23 (stable across FY2020–FY2026).
Account persistence (FY2020–FY2025): 45.4% of accounts appear in only 1 year; 20.3% in 2 years; 11.8% in 3 years; 9.9% in 4 years; 8.1% in 5 years; 4.6% in all 6 years.
YoY retention rates: FY2020→2021: 73.2%; FY2021→2022: 64.8%; FY2022→2023: 65.3%; FY2023→2024: 61.2%; FY2024→2025: 63.5%.
Base-year cleaning sensitivity (FY2025): post-PTET reference gives structural base $198.3M (~20.4% idiosyncratic); full reference gives $174.3M (~30.0% idiosyncratic).
H1 FY2026 observed (raw data): $103M across 861 accounts. Post-PTET H1 ratio (0.380): implied full-year ~$271.1M.
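The implied full-year figure is the H1 total divided by the H1-to-full-year ratio. The Python sketch below also applies the reported range of the ratio (0.344 to 0.452) as a rough sensitivity band; using the range this way is an assumption made here for illustration, not a documented part of the method.

```python
def implied_full_year(h1_total: float, h1_ratio: float) -> float:
    """Scale an observed H1 total to a full-year figure using an H1 share."""
    return h1_total / h1_ratio

h1_fy2026 = 103.0  # $M, H1 FY2026 observed in CIT_PMT_ANNUAL

central = implied_full_year(h1_fy2026, 0.380)   # post-PTET mean ratio
low = implied_full_year(h1_fy2026, 0.452)       # highest observed ratio
high = implied_full_year(h1_fy2026, 0.344)      # lowest observed ratio

print(f"central {central:.1f}, range {low:.1f} to {high:.1f} ($M)")
```

A higher H1 share implies a lower full-year total, which is why the largest ratio gives the bottom of the band.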
Three sheets. Payments stored negative; negate on read. H1 FY2026: $18.2M (160 accounts). H1 FY2025: $55.4M (235 accounts).
5.17× gap vs. CIT_PMT_ANNUAL ($103M across 861 accounts). Accounts present in both files show nearly identical payment totals, so the gap is driven by accounts present in ANNUAL but absent from CUM. Most likely cause: per-transaction vs. annual cumulative threshold definition. Status: UNRESOLVED.
Dramatic apparent concentration decline (top_1: 0.549 → 0.063) is an artifact of the threshold filter, not a genuine economic signal.
Per-transaction ≥$10K threshold. Date range: July 1–December 31,
2025. SQL applies * -1 — amounts arrive positive. CIT, BIT,
and NRW are distinct account types that all map to CORP in source tax
revenues — confirmed with Tax Department (May 2026).
| Sheet | Rows | Total Amount |
|---|---|---|
| CIT | 955 | $76,901,686 |
| BIT | 145 | $4,253,745 |
| NRW | 265 | $9,808,084 |
| Total | 1,365 | $90,963,515 |
CIT: MPYEST dominates at 83.3% ($64.0M). MPYRTN 9.6%; MPYEXT 4.5%.
BIT: MPYRTN dominates at 68.7%. Pass-through entities tend to settle on filing rather than making quarterly estimated payments.
NRW: Entirely MPYNRW (100%).
CIT: 84.8% attributed to FY2026; 11.2% to FY2025. Small tail back to FY2013.
BIT: FY2025 dominates at 74.6% — modal filing year is FY2025, consistent with calendar year 2024 liabilities settling in H1 FY2026.
September and December each account for roughly 36% of CIT and NRW cash, consistent with quarterly estimated tax deadlines (September 15 and December 15).
CIT sheet ($76.9M) = 81.7% of CIT_PMT_ANNUAL. Remaining 18.3% gap reflects accounts making multiple sub-$10K transactions. BIT and NRW both map to CORP in source tax revenues (confirmed May 2026) and should be benchmarked against CORP alongside CIT.
Carryforward generated (rtncfd): corporation overpays
and carries excess forward as future credit rather than receiving a
refund. Carryforward applied (rtncfc): corporation draws
down prior credits to reduce current cash liability — a silent deduction
that suppresses CALM CORP collections without any cash payment.
Historically offset roughly 32–44% of gross CALM CORP collections
annually.
| FY | CF Generated | CF Applied | CF Net | Cumulative Net |
|---|---|---|---|---|
| 2017 | $33.7M | $31.5M | +$2.2M | $2.2M |
| 2018 | $51.6M | $42.9M | +$8.7M | $10.9M |
| 2019 | $47.3M | $48.7M | −$1.4M | $9.6M |
| 2020 | $57.5M | $47.1M | +$10.5M | $20.0M |
| 2021 | $81.5M | $57.8M | +$23.7M | $43.7M |
| 2022 | $89.3M | $81.4M | +$7.9M | $51.6M |
| 2023 | $101.9M | $89.2M | +$12.7M | $64.3M |
| 2024 | $105.0M | $103.6M | +$1.4M | $65.7M |
| 2025 | $3.7M | $103.1M | −$99.3M | −$33.7M |
| 2026 | $0 | $3.7M | −$3.7M | −$37.4M |
Carryforward generation collapsed from $105M to $3.7M in FY2025, with zero generation in FY2026. This is an abrupt structural discontinuity unrelated to any legislation listed in the JFO Highlights of Recent Tax Legislation.
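The net and cumulative columns follow directly from generated minus applied. The Python sketch below recomputes them from the rounded table values, so results can differ from the printed columns by about $0.1M where the table was evidently built from unrounded data; the variable names are illustrative.

```python
# FY -> (carryforward generated, carryforward applied), $M, from the table.
carryforward = {
    2017: (33.7, 31.5),
    2018: (51.6, 42.9),
    2019: (47.3, 48.7),
    2020: (57.5, 47.1),
    2021: (81.5, 57.8),
    2022: (89.3, 81.4),
    2023: (101.9, 89.2),
    2024: (105.0, 103.6),
    2025: (3.7, 103.1),
    2026: (0.0, 3.7),
}

net = {}
cumulative = {}
running = 0.0
for fy in sorted(carryforward):
    generated, applied = carryforward[fy]
    net[fy] = generated - applied      # positive: credit stock grows
    running += net[fy]
    cumulative[fy] = running           # net stock accumulated since FY2017

for fy in sorted(net):
    print(f"FY{fy}: net {net[fy]:+.1f}, cumulative {cumulative[fy]:+.1f}")
```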
Tax Department meeting (May 2026) identified two competing explanations. First, an extension lag: corporations routinely file returns late — one case of a five-year filing lag has been observed — and FY2025 carryforward generation is not expected to appear in the data until after summer 2026. The near-zero FY2025 figure may therefore be a data artifact rather than an economic signal. Second, a structural suppression effect: even after the lag resolves, carryforward generation may not return to ~$100M because the corporate minimum tax was raised from $750 to $100,000, preventing many corporations from using credits to reduce liability below the high floor. The 2023 shift to Finnigan methodology compounded this by aggregating entire unitary group sales to determine whether a corporation clears the $300M threshold, sweeping more companies into the $100,000 minimum tax bracket than under prior rules and expanding the pool of corporations for whom carryforward credits are effectively stranded assets.
The relative magnitude of the two effects will become clearer after summer 2026, when extension filers are expected to appear. Do not treat the FY2025 collapse as a confirmed structural change until that diagnostic window has passed.
If the collapse is structural and permanent, the cessation of carryforward generation represents a tailwind for future CORP collections — firms will no longer be able to offset tax bills with prior credits, meaning CALM collections will trend upward relative to underlying corporate profitability as the existing stock exhausts. This tailwind is invisible to macro models. However, this interpretation is conditional: it holds only if the legacy stock of accumulated credits continues to be applied while new generation has permanently stopped. The summer 2026 extension filing window is the key diagnostic.
December dominates (mean $43.1M applied vs. September $4.0M next largest), consistent with December 31 filing period end dates. CORP collections in December are systematically suppressed relative to gross liability due to year-end carryforward application.
21 refunds totaling $6.0M requested, $0 posted. All in REVIEW status. TRNHIG approval level = 87.9% ($5.3M). TRNHIG is a high-balance pending review status — confirmed with Tax Department (May 2026). Typical timeline from REVIEW to posting for TRNHIG refunds remains unconfirmed. FY2025 filing periods = 96.9% of total. Concentration: top_1 = 45.6%, HHI = 0.245. Represents a contingent future cash outflow of ~7.9% of H1 FY2026 CORP.
| Sheet | Years Outstanding | Bills | Total Balance |
|---|---|---|---|
| FY26 | 0 | 30 | $74,982 |
| FY25 | 1 | 999 | $3,640,721 |
| FY24 | 2 | 306 | $1,603,530 |
| FY23 | 3 | 149 | $559,969 |
FY23 dominated by APL (appeal) and COL (collections) — lower probability of full recovery. FY25 contains $1.54M under HBREV — a temporary status indicating a high-balance account pending review, confirmed with Tax Department (May 2026). Collection rate for HBREV accounts remains unconfirmed. Total pipeline ($5.9M) approximately offsets pending refunds ($6.0M). Net effect on CORP forecast is near zero.
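The offset between outstanding bills and pending refunds can be checked with a few lines of Python. The pending refund total is taken as the rounded $6.0M quoted in this document, so the net figure is approximate; the variable names are illustrative.

```python
# Outstanding bill balances by sheet, in dollars, from the table.
outstanding_bills = {
    "FY26": 74_982,
    "FY25": 3_640_721,
    "FY24": 1_603_530,
    "FY23": 559_969,
}
pending_refunds = 6_000_000      # rounded figure quoted as $6.0M
calm_corp_h1 = 76_107_953        # CALM CORP H1 FY2026

total_bills = sum(outstanding_bills.values())
net_effect = total_bills - pending_refunds       # bills collected minus refunds paid
refund_share = pending_refunds / calm_corp_h1    # refunds relative to H1 CORP

print(f"bills ${total_bills:,}, net ${net_effect:,}, refunds = {refund_share:.1%} of H1 CORP")
```

The net is small only if both sides are realized in full; it is more negative if the FY23 appeal and collections balances are recovered at a lower rate.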
12 payments totaling $4,430 — negligible relative to CALM CORP.
Six sheets, approximately 80 years of history plus forecasts through
FY2055. Vermont sheets ~90 variables each; US sheets ~300 variables
each. Combined into predictors_tbl (55 rows × 1,065
columns).
Transformation: variables with “%” in Moody’s description are first-differenced; all others converted to YoY % change.
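A minimal Python sketch of this transformation rule, assuming each series is an ordered list of annual values and that the rule keys only on the presence of "%" in the description text. The function name and the two sample descriptions are hypothetical, not actual Moody's labels.

```python
def transform(series, description):
    """First-difference rate-like series; convert all others to YoY % change."""
    pairs = list(zip(series, series[1:]))
    if "%" in description:
        # Already expressed in percent: take the change in percentage points.
        return [cur - prev for prev, cur in pairs]
    # Level series: year-over-year percent change as a fraction.
    return [cur / prev - 1 for prev, cur in pairs]

# Hypothetical examples.
rate_changes = transform([4.0, 4.5, 3.5], "Unemployment rate, (%)")
level_changes = transform([100.0, 110.0, 121.0], "Retail sales, (Mil. $)")
print(rate_changes, level_changes)
```

Both branches return a series one element shorter than the input, so the first year is lost in either case.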
| Tax | Top Predictors | Notes |
|---|---|---|
| PINCOME | US wages — manufacturing durable (r = 0.753); consumer credit delinquencies (r ≈ -0.73) | |
| PIWITH | US wages — retail trade CY lagged (r = 0.753) | |
| PIEST | US wages — manufacturing durable (r = 0.715); S&P 500 (r = 0.669) | |
| PIPAID | S&P 500 CY lagged (r = 0.695) | |
| CORP | US profits tax liability (r = 0.472); US corporate cashflow CY lagged (r = 0.472) | Individual filer data essential; use post-break estimation window |
| M&R | US retail sales — clothing CY lagged (r = 0.830); VT leisure & hospitality employment (r = 0.785) | Most forecastable |
| ESTATE | Max r = 0.442; no macro variable structurally meaningful | Individual filer data is primary tool |
| S&U | US retail sales — building materials (r = 0.689) | Consider post-2019 marketplace facilitator window |
| MVP&U | US motor vehicle PCE (r = 0.781) | |
| INSUR | US legal services employment (r = 0.746) | |
| PROPT | VT consumer credit delinquencies (r = -0.749) | |
| BANK | US minimum wage CY lagged (r = 0.561) | Treat with caution |
| GAS | Time trend (r = -0.420); US electricity retail sales (r = 0.465) | Include explicit time trend |
| PARIM | VT home price median; FHFA index; VT population; 30-year mortgage rate | Use theoretical predictors only; 19 obs |
The 14 key predictors selected for forecasting model development, with their Moody’s variable codes (_fy fiscal year versions):
| Variable | Code | Mean | SD |
|---|---|---|---|
| US wages — manufacturing durable | fypewmfdq_us_fy | 0.021 | 0.039 |
| US consumer credit delinquencies | fccdto_us_fy | −0.014 | 0.770 |
| US wages — retail trade | fypewrtq_us_fy | 0.032 | 0.026 |
| S&P 500 | fsp500q_us_fy | 0.092 | 0.122 |
| US profits tax liability | fztax_us_fy | 0.060 | 0.165 |
| US corporate cashflow | fcashflow_us_fy | 0.063 | 0.112 |
| US retail sales — clothing | frt448_us_fy | 0.029 | 0.054 |
| VT leisure & hospitality employment | re5416q_vt_fy | 0.049 | 0.035 |
| US motor vehicle PCE | fcdmvp_us_fy | 0.052 | 0.076 |
| US legal services employment | re5411q_us_fy | 0.019 | 0.027 |
| VT consumer credit delinquencies | fccdto_vt_fy | −0.010 | 0.362 |
| US retail sales — building materials | frt444_us_fy | 0.040 | 0.053 |
| US minimum wage | fminwage_us_fy | 0.029 | 0.043 |
| US electricity retail sales | fyle_us_fy | 0.056 | 0.028 |
This section documents structural changes to Vermont tax law that affect CALM gross collections and therefore require attention in the statistical modeling framework. Fund allocation changes (e.g., dedications of revenue to the Education Fund or Clean Water Fund) are excluded because CALM captures source revenue before fund splits and is unaffected by allocation changes.
General modeling principle: Running a single macro regression over the full CALM history without accounting for structural breaks will produce biased coefficient estimates. For each affected tax category, the estimation window should be restricted to the post-break regime, or a regime dummy variable should be included.
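The regime-dummy option can be sketched in plain Python as an ordinary least squares fit of y on an intercept, one macro predictor, and a 0/1 dummy for the post-break regime. This is an illustration only: the function names, the single-predictor specification, and the sample data are hypothetical, and the sample is constructed so the true coefficients are known.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_with_regime_dummy(fy, x, y, break_fy):
    """OLS of y on [1, x, dummy]; dummy = 1 for fiscal years >= break_fy."""
    design = [[1.0, xi, 1.0 if f >= break_fy else 0.0] for f, xi in zip(fy, x)]
    k = 3
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)  # [intercept, slope, regime shift]

# Hypothetical data: y = 0.02 + 0.5 * x, plus a 0.10 level shift from FY2021 on.
fy = list(range(2015, 2025))
x = [0.01, 0.03, -0.02, 0.04, 0.00, 0.05, -0.01, 0.02, 0.06, 0.03]
y = [0.02 + 0.5 * xi + (0.10 if f >= 2021 else 0.0) for f, xi in zip(fy, x)]

intercept, slope, shift = fit_with_regime_dummy(fy, x, y, break_fy=2021)
print(intercept, slope, shift)
```

The alternative named above, restricting the window, amounts to dropping rows with fy below the break before fitting and omitting the dummy column; that avoids assuming the slope is the same in both regimes, at the cost of fewer observations.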
| Effective | Change | Direction | Modeling Implication |
|---|---|---|---|
| FY2007 | Mandatory unitary combined reporting for all C-corporations with Vermont income | Ambiguous | First major unitary reporting requirement; structural break in who files |
| FY2007–FY2008 | Double-weighted sales factor replaces equal-weighted three-factor apportionment | Revenue-negative for multistate corps with small VT sales share | Partial structural shift; phased over two years |
| TY2020 (FY2021) | Market-based sourcing for intangibles replaces cost-of-performance | Revenue-positive for Vermont (intangible income sourced to customer location) | Structural upward shift; coincides with PTET and is difficult to isolate |
| TY2021 (FY2022) | PTET election introduced | Structural increase — pass-through entity owners can elect CIT treatment | +309% payment growth and +638 accounts in FY2021; dominant structural break. Use FY2021 as post-PTET base year |
| TY2023 (FY2024) | Single sales factor apportionment; repeal of Throwback Rule; Finnigan method; all US corporations in unitary group | Revenue-negative for some large multistate corporations | Consistent with the −12.6% FY2024 payment decline in CIT_PMT_ANNUAL. Second structural break within post-PTET window. Finnigan methodology also expands the pool of corporations subject to the $100K minimum tax — see Section 9.3 |
| FY2025 | Carryforward generation collapses from $105M to $3.7M | Structural tailwind if permanent — silent offset to CORP collections diminishing | Cause partially explained (extension lag + minimum tax/Finnigan structural suppression); diagnostic window is post-summer 2026. See Section 9.3 |
Recommended CORP estimation window: FY2022–FY2025 for concentration benchmarks and macro model calibration, with sensitivity tests using FY2021–FY2025. FY2024 may warrant further scrutiny as a second transition year. Do not use pre-FY2021 data without explicit regime controls.
| Effective | Change | Direction | Modeling Implication |
|---|---|---|---|
| January 1, 2016 | Restructured to flat 16% rate on value over $2.75M exclusion; includes taxable gifts within two years of death | Structural change to rate and base definition | Prior graduated structure not comparable; pre-FY2016 data requires regime dummy |
| January 1, 2020 | Exclusion raised from $2.75M to $4.25M | Revenue-negative — fewer estates taxable | Structural downward shift in FY2020 collections |
| January 1, 2021 | Exclusion raised from $4.25M to $5.0M (current) | Revenue-negative — further reduction in taxable estates | Current regime; confirmed by ESTATE_FILINGS showing zero tax below ~$5M. FY2022 onward is the cleanest post-reform window |
| Ongoing | Revenue above 125% of prior July forecast dedicated to Higher Education Trust Fund | Mechanical fund diversion in very large ESTATE years | Does not affect CALM gross collections; irrelevant for forecasting |
Recommended ESTATE estimation window: FY2022–FY2025 for the current $5.0M exclusion regime. FY2020 and FY2021 are transition years. Pre-FY2016 data requires a regime dummy at minimum. Given the small number of post-FY2022 observations (4 years), macro model estimation is nearly impossible — the individual filer pipeline (ESTATE_FILINGS and EST_EXT_100K) is and should remain the primary forecasting tool.
| Effective | Change | Direction | Modeling Implication |
|---|---|---|---|
| TY2002 | Shift from percentage-of-federal-liability to bracket-based system with 40% capital gains exclusion | Structural regime change | Pre-FY2003 PINCOME not comparable without regime dummy |
| TY2009 | Capital gains exclusion restricted to farms and timber; flat $2,500 exclusion for other gains | Revenue-positive — reduced exclusion for most capital gains | Structural upward shift in PIEST and PIPAID around FY2010 |
| TY2011 | Two-method capital gains: 40% exclusion for certain business assets held >3 years OR flat $5,000 for stocks, real estate, depreciable personal property | Modest structural change | Affects PIEST and PIPAID |
| TY2015 | 3% minimum tax for taxpayers with AGI > $150,000 | Revenue-positive | Structural upward shift; most visible in PINCOME for high-income years |
| TY2018 | Comprehensive reform: decoupling from federal AGI, new Vermont standard deduction and exemption, four brackets at 3.35%/6.6%/7.6%/8.75%, EITC expansion | Revenue-negative — rate reductions | Major structural break; FY2019 is the first full year under new system. Use FY2019 as preferred start for PINCOME macro model estimation |
| TY2019 | Capital gains exclusion capped at $875K | Revenue-positive — reduced exclusion for large gains | Affects PIEST and PIPAID in high capital gain years |
| TY2022 | New Child Tax Credit ($1,000 refundable per eligible child ≤5); EITC increased to 38% federal | Revenue-negative — increased refundable credits suppress PIREF net | Affects PIREF; PIPAID net of refunds |
| TY2025 | EITC increased to 100% federal for filers without qualifying children; child tax credit age extended to 6; military retirement exclusion expanded | Revenue-negative — further credit expansion | Most recent structural change; partially in FY2026 data |
Recommended PINCOME/PIWITH/PIEST estimation window: FY2019–FY2025 for the current bracket system. FY2019 is the first full year after the 2018 reform. The 2022 credit expansions are a minor secondary break within this window that can be handled with a dummy variable if needed.
| Effective | Change | Direction | Modeling Implication |
|---|---|---|---|
| October 1, 2003 | Rate increase from 5% to 6% | Revenue-positive | Structural upward level shift; use post-FY2004 for rate-consistent history |
| FY2020 (2019 legislation) | Marketplace facilitators (Amazon, eBay, etc.) required to collect and remit S&U on third-party sales | Revenue-positive — large previously uncollected base now taxable | Structural upward shift beginning FY2020; significant and likely persistent. Preferred estimation window: FY2020–FY2025 |
| July 1, 2024 (FY2025) | Prewritten computer software accessed remotely (cloud software) subject to S&U | Revenue-positive: new taxable base | Structural upward shift beginning FY2025 (July 1, 2024 is the first day of FY2025); relevant for FY2026 nowcast as tailwind |
Recommended S&U estimation window: FY2020–FY2025 for marketplace-facilitator-era collections. The cloud software expansion (FY2025) is too recent to estimate from but should be noted as an additional structural tailwind in FY2026 and beyond.
| Effective | Change | Direction | Modeling Implication |
|---|---|---|---|
| FY1998 | Rate increased from 7% to 9% (current general rate); 10% alcohol component maintained | Revenue-positive | Last major rate change; post-FY1998 rate structure is current |
| August 1, 2021 | Meal delivery platform facilitators (DoorDash-type) required to collect and remit meals tax | Revenue-positive — previously uncollected delivery charge now taxable | Structural upward shift beginning FY2022. Analogous to marketplace facilitator rule for S&U. Preferred estimation window: FY2022–FY2025 |
| August 1, 2024 (FY2025) | 3% surcharge on short-term rentals | Revenue-positive — new STR surcharge captured in CALM M&R | STR surcharge is tracked separately in MRT_STR_MONTHLY; relevant for nowcast decomposition. Structural upward shift beginning FY2025 |
Recommended M&R estimation window: FY2022–FY2025 for meal delivery platform-era collections. The STR surcharge (FY2025) adds a further structural tailwind in FY2026 that MRT_STR_MONTHLY can help quantify.
| Effective | Change | Direction | Modeling Implication |
|---|---|---|---|
| July 1, 2015 | Rate increased to $3.08 per pack (current rate); smokeless tobacco to $2.57/oz | Revenue-positive (rate) but offset by consumption decline | Last rate change; FY2016 onward is rate-consistent. Secular consumption decline dominates |
| FY2020 (2019 legislation) | E-cigarettes subject to 92% wholesale price tax | Revenue-positive — new base | Partially offsets cigarette volume decline; relevant for CIG CALM column interpretation |
Recommended CIG estimation window: FY2016–FY2025 for current rate regime. Include explicit time trend to capture secular consumption decline.
| CALM Category | Preferred Window | Primary Structural Break | Notes |
|---|---|---|---|
| CORP | FY2022–FY2025 | PTET (FY2021), apportionment overhaul (FY2024) | Only 4–5 observations; individual filer data essential |
| ESTATE | FY2022–FY2025 | $5.0M exclusion (FY2021) | Only 4 observations; pipeline method primary |
| PINCOME/PIWITH/PIEST | FY2019–FY2025 | 2018 PIT reform (FY2019) | 7 observations; consider 2022 credit expansion dummy |
| S&U | FY2020–FY2025 | Marketplace facilitator (FY2020) | 6 observations; cloud software tailwind from FY2025 |
| M&R | FY2022–FY2025 | Meal delivery platform (FY2022) | 4 observations; STR surcharge tailwind from FY2025 |
| CIG | FY2016–FY2025 | Rate freeze at $3.08 (FY2016) | 10 observations; include time trend |
| GAS | FY2010–FY2025 | No rate changes; secular decline | Include time trend |
| MVP&U | FY2010–FY2025 | No structural breaks | Mechanistic predictor (motor vehicle PCE) preferred |
| INSUR | FY2010–FY2025 | No structural breaks | |
| BANK | FY2020–FY2025 | Monthly filing (FY2017); short series | Treat with caution |
General observation: The preferred estimation windows for the most revenue-significant categories (CORP, ESTATE, M&R) are extremely short — 4 to 5 observations — which means macro regression models will be poorly identified for these categories. This further reinforces the primacy of the individual filer data approach for CORP and ESTATE, and the importance of using the full longer history with regime dummies as a robustness check rather than the primary estimation strategy.
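The trade-off described above (a short post-break window versus a longer history with a regime dummy) can be sketched on simulated data. This is a minimal illustration, not the project's model: the series, the variable names (yoy, macro_x, d_regime), the FY2022 break year, and all coefficient values are assumptions chosen for the example.

```r
# Illustrative only: simulated YoY growth with a level shift at an assumed break.
set.seed(1)
fy       <- 2007:2025
macro_x  <- sin(fy)                    # stand-in for a single macro predictor
d_regime <- as.integer(fy >= 2022)     # 1 in the post-break regime, 0 before
yoy      <- 2 + 0.5 * macro_x + 5 * d_regime + rnorm(length(fy), sd = 0.1)

# Full-history fit: the dummy absorbs the regime shift, so the macro slope
# is estimated from all 19 observations.
fit_full <- lm(yoy ~ macro_x + d_regime)

# Short-window fit on the post-break years only (n = 4), for comparison.
post      <- fy >= 2022
fit_short <- lm(yoy[post] ~ macro_x[post])

coef(fit_full)
coef(fit_short)
```

With four observations the short-window fit has only two residual degrees of freedom, which is the identification problem the paragraph above describes.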
1. BIT/source tax revenue mapping: RESOLVED (May 2026). BIT maps to CORP in source tax revenues.
2. CIT_PMT_CUM threshold definition: SQL definition underlying CIT_PMT_CUM must be confirmed to resolve the discrepancy with CIT_PMT_ANNUAL. Status: UNRESOLVED.
3. CIT_CARRYFORWARD regime change: PARTIALLY RESOLVED (May 2026). Two competing explanations identified: (1) extension lag: FY2025 carryforward generation expected to appear after summer 2026 due to late corporate filing, with one known case of a five-year filing lag; (2) structural suppression: corporate minimum tax increase from $750 to $100,000 and 2023 Finnigan methodology shift have rendered carryforwards stranded assets for a larger pool of corporations. Diagnostic window: post-summer 2026.
4. EST_EXT_100K cross-reference: RESOLVED IN PRINCIPLE (May 2026). Tax Department acknowledged the account identifier mismatch between EST-XXXXXXXX and decedent SSN/V-ID formats and will provide revised files. Cross-referencing with ESTATE_FILINGS pending receipt.
5. MPYRTN accounts in EST_EXT_100K: LIKELY RESOLVED PENDING FILES (May 2026). Expected to be addressed when Tax Department delivers revised EST_EXT_100K files. Verify upon receipt.
6. Remaining files not yet analyzed: BFT_RTN, BFT_ALLOC, WHT_FILINGS, MRT_STR_MONTHLY, CAPGAIN_SUM. BIT_PMT_CUM and BIT_CIT_NORETURN now fully analyzed (May 2026).
7. Short estimation windows: For CORP, ESTATE, and M&R, the preferred post-break estimation windows contain only 4–5 observations, insufficient for reliable macro regression. Consider whether longer windows with explicit regime dummies or a Bayesian approach with informative priors would be preferable. PARTIALLY RESOLVED for CORP (May 2026): H1 signal method using individual filer data bypasses the short-window macro regression problem. See Section 15.
8. NRW/source tax revenue mapping: RESOLVED (May 2026). NRW maps to CORP in source tax revenues.
9. HBREV and TRNHIG collection rates: PARTIALLY RESOLVED (May 2026). HBREV is a temporary high-balance pending review status; TRNHIG is an analogous high-balance pending review status. Typical collection rates and resolution timelines for both remain unconfirmed.
10. CIT/BIT/NRW mutual exclusivity: RESOLVED (May 2026). Tax Department confirmed that CIT, BIT, and NRW are mutually exclusive and collectively exhaustive for CORP in CALM. CIT and BIT versions of payment reports include a clause to exclude NRW payments. The three categories were designed to show major payers within each category separately. Important qualification (May 2026 CALM reconciliation): Summing gross individual filer totals (CIT + BIT + NRW = $118.9M H1) substantially overstates CALM CORP H1 actual ($76.1M) due to (1) filing-period attribution differences (BIT payments 74.6% FY2025-attributed) and (2) CALM netting carryforward credits while individual filer files record gross inflows only. Mutual exclusivity (no double-counting) is confirmed; gross-to-net equality is not. See Section 15.3 for full reconciliation.
11. BIT_CIT_NORETURN CALM mapping correction: RESOLVED (May 2026). Project brief Section 3.4 mapping table previously listed BIT_CIT_NORETURN under PIEST. Both BIT and CIT account types in this file map to CORP. Section 3.4 updated accordingly.
12. CIT_PMT_ANNUAL and CIT no-return overlap: RESOLVED (May 2026). CIT, BIT, and NRW confirmed mutually exclusive by Tax Department, so no cross-type double-counting is possible. No-return CIT exposure ($8.4M CY2025 synthetic) treated as a contingent risk overlay on the CIT component of the CORP H1 signal pending resolution rate data from the Tax Department.
13. Full-year FY2026 CIT_PMT_ANNUAL file pending: Expected post-June 30, 2026 (Vermont fiscal year close). Required to finalize TIER 2 base-year cleaning, confirm idiosyncratic account flags, and produce the definitive structural base for FY2027–FY2031 projections. Until received, all TIER 2 outputs in Section 16.6 use the $220.5M FY2026 CALM nowcast estimate as a placeholder; the preliminary structural base is $165.8M. Update PMT_FILE_JULY and FY2026_ACTUAL (line ~36) in corp_baseyear_cleaning_fy2026_real.R when file arrives. Status: AWAITING FILE.
14. NRW H2 FY2026 scenario resolution: Three scenarios defined in Section 15.3 (Upside: $0; Central: −$21,280,971; Downside: −$31,921,457). Central scenario (H2 NRW = H1 run rate) yields full-year CORP of $219.8M, consistent with the January $220.5M nowcast ($0.7M immaterial difference). Key diagnostic: summer 2026 extension filing window will reveal whether new carryforward credits are generated and immediately applied, determining which H2 NRW scenario materializes. Status: UNRESOLVED, awaiting summer 2026 extension filers.
Because the individual filer files contain confidential taxpayer information, forecasting models are built and validated using synthetic data before being applied to the real data. The synthetic data preserves all statistical properties documented in this brief — distributions, correlations, AR(1) structure, structural breaks, and aggregate totals — without containing any real taxpayer identifiers or payment amounts.
All synthetic datasets were generated using statistical parameters
derived from the real data (means, standard deviations, correlations,
AR(1) coefficients, log-normal parameters, retention rates, etc.) via
skimr::skim(), cor(), and ar()
applied to the real files. No individual-level records from the real
data were used in generating the synthetic data. The workflow for model
development is:
| Object | File | Rows | Cols | Description |
|---|---|---|---|---|
| synthetic_predictors | synthetic_predictors.rds | 61 | 30 | 28 key Moody’s macro predictors (14 _fy + 14 _cy), FY1971–FY2031; empirical params from extract_predictor_params_confidential.R; extended to FY2031 via extend_predictors_confidential.R |
| synthetic_regression_params | synthetic_regression_params.rds | list | 9 keys | Regression params for 8 conditional CALM target models (alpha, beta, resid_sd, resid_ar1, key_predictor per series); 11×11 updated inter-series residual correlation matrix (cor_resid_updated); also stores all_targets (11 series), conditional_targets, unconditional_targets, calm_fy_range, moodys_fy_range; created by extract_regression_params_confidential.R |
| synthetic_targets | synthetic_targets.rds | 47 | 12 | 11 CALM tax revenue YoY % change targets, FY1979–FY2025; mixed generation: 4 unconditional series (CORP, ESTATE, GAS, M&R) use constant-mean AR(1); 7 conditional series (PINCOME, PIWITH, PIEST, S&U, MVP&U, CIG, INSUR) use time-varying fitted mean α + β·x_t from synthetic_regression_params |
| synthetic_full | synthetic_full.rds | 47 | 41 | synthetic_targets joined with synthetic_predictors on FY/Year; primary dataset for macro model estimation |
| synthetic_CIT_PMT_ANNUAL_long | synthetic_CIT_PMT_ANNUAL_long.rds | 7,996 | 5 | Long-format CIT account panel FY2020–FY2026; log-normal payment distribution; realistic account entry/exit dynamics |
| synthetic_CIT_PMT_ANNUAL_wide | synthetic_CIT_PMT_ANNUAL_wide.rds | 3,513 | 8 | Wide-format version (one row per account); FY2020–FY2026 payment columns |
| synthetic_ESTATE_FILINGS | synthetic_ESTATE_FILINGS.rds | 45 | 11 | 45 synthetic estate filings; correct size band distribution, death-to-receipt lag, prior payment ratios, and monthly distribution |
| synthetic_CIT_CARRYFORWARD | synthetic_CIT_CARRYFORWARD.rds | 10 | 5 | FY2017–FY2026 annual carryforward generated/applied series with 1.5% noise; preserves FY2025 collapse and FY2026 zero generation exactly |
All files saved to ../00_Data/Data_Wrangled/.
Macro predictors (synthetic_predictors):
- 28 variables (14 _fy + 14 _cy) with empirical means, SDs, and 28×28 correlation matrix from extract_predictor_params_confidential.R (FY2007–FY2025 estimation window)
- fminwage_us_fy and fminwage_us_cy both floored at zero (non-negativity constraint)
- Linear time trend added as 29th column for GAS modeling
CALM targets (synthetic_targets):
- 11 tax revenue YoY % change series, FY1979–FY2025 (BANK and PROPT removed, not in CALM CORP scope; CIG added)
- Mixed generation architecture driven by synthetic_regression_params:
  - Unconditional (CORP, ESTATE, GAS, M&R): constant-mean AR(1) y_t = μ + φ(y_{t-1} − μ) + ε_t; M&R demoted from conditional due to 4-observation estimation window (FY2022–2025 only)
  - Conditional (PINCOME, PIWITH, PIEST, S&U, MVP&U, CIG, INSUR): time-varying fitted mean y_t = (α + β·x_t) + φ(y_{t-1} − (α + β·x_{t-1})) + ε_t; CIG uses time_trend predictor (secular decline); all others use a single Moody’s macro predictor selected by extract_regression_params_confidential.R
- AR(1) coefficients for unconditional series: ESTATE φ = −0.399; all others φ = 0; conditional series φ from params_tbl (all near zero)
- Per-series start years enforced (PIWITH/PIEST: 1989, GAS/MVP&U: 1981)
- Inter-series residual correlations from cor_resid_updated (11×11) in synthetic_regression_params
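The two AR(1) recursions in the list above can be sketched in base R as follows. This is a minimal single-series sketch, not the project's generator: the function names are hypothetical, the first observation is assumed to start at the mean plus a shock, and the inter-series residual correlation is ignored. Only φ = −0.399 for ESTATE comes from the text; mu and sigma in the usage lines are placeholders.

```r
# Unconditional: y_t = mu + phi * (y_{t-1} - mu) + eps_t
sim_ar1_unconditional <- function(n, mu, phi, sigma,
                                  eps = rnorm(n, sd = sigma)) {
  y <- numeric(n)
  y[1] <- mu + eps[1]                  # assumed start: mean plus first shock
  for (t in seq_len(n)[-1]) {
    y[t] <- mu + phi * (y[t - 1] - mu) + eps[t]
  }
  y
}

# Conditional: y_t = (alpha + beta * x_t)
#                    + phi * (y_{t-1} - (alpha + beta * x_{t-1})) + eps_t
sim_ar1_conditional <- function(x, alpha, beta, phi, sigma,
                                eps = rnorm(length(x), sd = sigma)) {
  m <- alpha + beta * x                # time-varying fitted mean
  y <- numeric(length(x))
  y[1] <- m[1] + eps[1]
  for (t in seq_along(x)[-1]) {
    y[t] <- m[t] + phi * (y[t - 1] - m[t - 1]) + eps[t]
  }
  y
}

set.seed(42)
# 47 fiscal years (FY1979-FY2025); mu and sigma are placeholder values.
estate_yoy <- sim_ar1_unconditional(47, mu = 3, phi = -0.399, sigma = 20)
# x is a placeholder predictor path, alpha/beta/sigma are placeholders too.
cond_yoy   <- sim_ar1_conditional(x = rnorm(47), alpha = 4, beta = 1.5,
                                  phi = 0, sigma = 3)
```

Passing eps explicitly makes the recursion easy to check: with all shocks set to zero, both functions return the mean path exactly.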
CIT_PMT_ANNUAL (synthetic_CIT_PMT_ANNUAL_long/wide):
- Account counts by FY match exactly: 358 / 996 / 1,212 / 1,405 / 1,382 / 1,596 / 864
- Year totals rescaled to match exactly: $36.8M / $150.5M / $173.1M / $214.2M / $187.3M / $249.0M / $103.2M
- Log-normal payment distribution: log_mean ≈ 10.7, log_sd ≈ 1.25
- YoY retention rates: 73.2% / 64.8% / 65.3% / 61.2% / 63.5%
- H1 ratio per account: truncated normal, mean = 0.380, SD = 0.049
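The log-normal draw, the rescaling to a fixed year total, and the per-account H1 ratio can be sketched for a single year as below. This is a simplified sketch under stated assumptions: it reads the second entry of each list above as FY2021 (996 accounts, $150.5M), it ignores retention and entry/exit dynamics, and the truncation bounds of 0 and 1 and the helper name r_trunc_norm are assumptions, since the text does not state how the truncation is done.

```r
set.seed(123)

# Values read from the list above; everything else is assumed.
n_accounts <- 996
year_total <- 150.5e6
log_mean   <- 10.7
log_sd     <- 1.25

# Draw raw log-normal payments, then rescale proportionally so the
# year total matches the target exactly.
raw      <- rlnorm(n_accounts, meanlog = log_mean, sdlog = log_sd)
payments <- raw * year_total / sum(raw)

# Simple rejection sampler for a truncated normal (bounds are an assumption).
r_trunc_norm <- function(n, mean, sd, lower = 0, upper = 1) {
  out <- numeric(0)
  while (length(out) < n) {
    draw <- rnorm(n, mean = mean, sd = sd)
    out  <- c(out, draw[draw >= lower & draw <= upper])
  }
  out[seq_len(n)]
}

h1_ratio <- r_trunc_norm(n_accounts, mean = 0.380, sd = 0.049)
h1_paid  <- payments * h1_ratio        # first-half payment per account
```

Proportional rescaling preserves each account's share of the total, so the shape of the distribution is unchanged while the aggregate matches.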
ESTATE_FILINGS (synthetic_ESTATE_FILINGS):
- 45 filings; aggregate totals match within rounding ($2)
- Size band distribution: 4 Under $1M / 1 $1M–$2M / 5 $2M–$5M / 21 $5M–$10M / 14 $10M+
- Death-to-receipt lag: mean ≈ 394 days, median ≈ 434 days
- Death cohort: 12 FY2024 deaths, 22 FY2025 deaths, 11 no date
- Vermont estate tax computed mechanistically: 16% × (estate − $5M)
- Prior payment ratio: truncated normal, mean = 0.589, SD = 0.561
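The mechanistic estate tax rule in the list above is a one-line function. The sketch below assumes the tax is floored at zero for estates at or below the exclusion, which is consistent with the zero-tax observation below about $5M noted for ESTATE_FILINGS but is not stated as a formula in this brief; the function name is hypothetical.

```r
# Vermont estate tax under the current regime: 16% of the amount above
# the $5M exclusion. The floor at zero is an assumption of this sketch.
vt_estate_tax <- function(estate, exclusion = 5e6, rate = 0.16) {
  rate * pmax(estate - exclusion, 0)
}

# Example estates of $4M, $5M, $7.5M, and $12M
vt_estate_tax(c(4e6, 5e6, 7.5e6, 12e6))
# 0, 0, 400000, 1120000 (dollars)
```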
CIT_CARRYFORWARD (synthetic_CIT_CARRYFORWARD):
- FY2017–FY2026; documented values ±1.5% multiplicative noise
- FY2026 generation = exactly 0 (preserved without noise)
- FY2025 generation = $3.6M (within 3.7% of documented $3.7M)
- FY2024 generation = $104.9M (within 0.1% of documented $105M)
A reusable name-scrubbing utility (scrub_confidential.R)
is sourced at the top of every script that reads real individual filer
data. Located at ../00_Scripts/scrub_confidential.R. Three
functions:
- scrub_confidential(df, extra_patterns): drops columns matching confidential patterns (name, SSN, EIN, address fields) at point of ingestion; patterns use word-boundary matching to avoid false positives on columns like vt_taxable_estate
- safe_preview(df, n, extra_patterns): replacement for head(); scrubs before printing
- safe_glimpse(df, extra_patterns): replacement for glimpse(); scrubs before printing

Usage pattern:
source("../00_Scripts/scrub_confidential.R")
data_raw <- readxl::read_excel(path, sheet = "Data") %>%
janitor::clean_names() %>%
dplyr::rename(key_col = raw_col_name) %>% # rename BEFORE scrubbing
scrub_confidential() # then scrub
safe_preview(data_raw) # instead of head()
safe_glimpse(data_raw) # instead of glimpse()
Important: Always rename key analytical columns
before scrubbing, since scrub_confidential() will drop any
column whose name matches a confidential pattern. For example,
x17_adjusted_vt_estate_tax would be dropped because it
contains “state” — renaming it to adj_vt_estate_tax first
prevents this.
Methodology: 3-year weighted H1-to-full-year ratio (FY2023 weight 1, FY2024 weight 2, FY2025 weight 3).
FY2023 outlier strip: 3 accounts, $1.164M total payments stripped, mechanically neutral on ratio (proportional strip).
Weighted H1 ratio: 0.42
H1 FY2026 observed: $94.2M across 695 large-payer accounts. YoY H1 change: −2.3% vs approximated H1 FY2025.
Pipeline adjustment: −$3.7M net
- Refund risk: −$6.0M (18 unposted FY2025-vintage accounts, none approved)
- Collectible receivables: +$2.3M (tiered by vintage: FY2026 75%, FY2025 50%, FY2024 25%, FY2023 10%)
Carryforward finding: structural H2 tailwind, not headwind. Credit pool nearly exhausted after $103M applied in FY2025 and $88.5M in H1 FY2026. H2 draw rate collapsed from $14.7M/month (H1) to $0.9M/month (Jan–Apr 2026), meaning taxpayers must cover H2 liability in cash.
FY2026 estimates (pipeline-adjusted):
| Scenario | Estimate | Basis |
|---|---|---|
| Central | $220.5M | Weighted ratio 0.42 |
| Upside | $244.1M | Post-PTET floor ratio 0.38 — more H2 to come |
| Downside | $211.6M | 2-year weighted ratio 0.44 — less H2 to come |
| Spread | $32.5M | Upside minus Downside |
FY2025 actual: $272.6M. All scenarios imply material decline vs FY2025.
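The arithmetic behind the scenario table can be sketched as below, assuming the nowcast is the observed H1 divided by the H1-to-full-year ratio, plus the net pipeline adjustment. The brief does not spell out this formula, so treat it as an assumption. The per-year ratios passed to weighted_ratio are hypothetical placeholders (only the weights 1, 2, 3 come from the text), and because the table's ratios are rounded to two decimals, the results only approximate the published estimates.

```r
# Weighted H1-to-full-year ratio (weights 1, 2, 3 for FY2023, FY2024, FY2025).
weighted_ratio <- function(ratios, weights) {
  sum(ratios * weights) / sum(weights)
}

# Assumed nowcast formula: gross up H1 by the ratio, then apply the
# net pipeline adjustment.
nowcast_full_year <- function(h1_observed, h1_ratio, pipeline_adj = 0) {
  h1_observed / h1_ratio + pipeline_adj
}

# Hypothetical per-year ratios, for illustration only.
weighted_ratio(c(FY2023 = 0.40, FY2024 = 0.42, FY2025 = 0.43), weights = 1:3)

h1_fy2026    <- 94.2                   # $M, observed H1 FY2026
pipeline_adj <- -6.0 + 2.3             # $M, refund risk plus collectibles
ratios       <- c(central = 0.42, upside = 0.38, downside = 0.44)

est <- nowcast_full_year(h1_fy2026, ratios, pipeline_adj)
round(est, 1)
```

A lower H1 ratio means more of the year's revenue is still to come in H2, which is why the 0.38 ratio produces the upside case.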
Files:
- Script: 00_Scripts/corp_nowcast_fy2026.R
- Memo: 00_Notes/CORP_nowcast_memo_FY2026.txt
Date completed: May 5, 2026.
PRIMARY MODEL RESULTS (final, 28-predictor pool: 14 _fy + 14 _cy): Predictors selected: fsp500q_us_fy (S&P 500, fiscal year) and fztax_us_cy (US profits tax liability, calendar year). One _cy calendar year variant selected. Adj R-squared: 0.496 (improved from 0.35 with 14-predictor pool).
Validation:
- FY2025 out-of-sample error: −7.5 pp, larger than the 14-predictor run (−0.2 pp). In-sample fit improved markedly (adj R² 0.496 vs 0.35); the larger OOS error reflects synthetic data idiosyncrasy and does not invalidate the model structure.
- FY2026 gap: +42.7 pp. The macro model implies positive growth while the H1 signal implies −19.1%. This is a genuine divergence between macro predictors and individual filer signal, not a data error.
FY2027–2031 projections from $220.5M FY2026 base:
| FY | YoY (primary) | CORP (primary) | YoY (robust) | CORP (robust) |
|---|---|---|---|---|
| FY2027 | +21.5% | $267.9M | +18.3% | $260.9M |
| FY2028 | +17.4% | $314.6M | +19.7% | $312.3M |
| FY2029 | +15.2% | $362.4M | +19.7% | $373.9M |
| FY2030 | +16.3% | $421.4M | +19.7% | $447.6M |
| FY2031 | +15.7% | $487.4M | +19.8% | $536.3M |
Primary: full history FY2007–2025, regime dummies + fsp500q_us_fy + fztax_us_cy. Robust: short window FY2022–2025, d_apportionment + 1 macro predictor (secondary check only; n=4). All predictor values FY2026–2031 are real Moody’s forecasts. No placeholders remain.
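The dollar levels in the projection table follow from compounding the YoY rates on the $220.5M base. The short check below uses the primary-model rates as printed; because those rates are rounded to one decimal, the compounded levels match the table only to within a few tenths of a million dollars.

```r
base_fy2026 <- 220.5                   # $M, FY2026 nowcast base

# Primary-model YoY growth rates from the table (rounded as printed).
yoy_primary <- c(FY2027 = 0.215, FY2028 = 0.174, FY2029 = 0.152,
                 FY2030 = 0.163, FY2031 = 0.157)

# Level in year t = base * product of (1 + growth) through year t.
levels_primary <- base_fy2026 * cumprod(1 + yoy_primary)
round(levels_primary, 1)
```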
Predictor data notes:
- 28-predictor pool (14 _fy + 14 _cy) extracted from Moody’s via extract_predictor_params_confidential.R; FY2007–2025 estimation window, matched to primary model regime
- synthetic_predictors regenerated at 55 rows × 30 columns (Year + time_trend + 28 predictors), then extended to 61 rows × 30 columns (FY1971–2031) by appending predictors_fy2026_2031_extension.rds (real Moody’s FY2026–2031 forecasts for all 28 predictors)
- synthetic_full is 47 rows × 41 columns: the inner join is restricted to years with target data (FY1979–2025); FY2026–2031 predictor rows are used by the projection script directly, not via synthetic_full

July 2026 update instructions:
- Replace FY2026 base $220.5M with FY2026 actual
- Re-run extract_predictor_params_confidential.R to refresh empirical means, SDs, and 28×28 correlation matrix on updated history
- Re-run extract_regression_params_confidential.R to refresh synthetic_regression_params.rds with updated CALM target regression coefficients and residual correlation matrix
- Re-run extend_predictors_confidential.R to refresh Moody’s forecasts for all 28 predictors for FY2027–2032
- Re-run create_synthetic_data.R, then append extension rows
- Re-run corp_projection_fy2027_2031.R
- Model coefficients and predictor selection will re-estimate on updated training data including FY2026 actual

Files:
- Script: 00_Scripts/corp_projection_fy2027_2031.R
- Param extraction: 00_Scripts/extract_predictor_params_confidential.R (confidential session only; gitignored output)
- Extension script: 00_Scripts/extend_predictors_confidential.R (confidential session only; gitignored)
- Memo: 00_Notes/CORP_projection_memo_FY2027_2031.txt
- Param file: 00_Data/Data_Wrangled/synthetic_predictor_params_28.rds (gitignored; Moody’s proprietary)
- Extension data: 00_Data/Data_Wrangled/predictors_fy2026_2031_extension.rds (gitignored; Moody’s proprietary)
Date completed: May 6, 2026.
CALM CORP = CIT + BIT + NRW (confirmed mutually exclusive and collectively exhaustive, Tax Department May 2026). The individual filer files capture gross inflows; CALM CORP is net collections after carryforward credits are applied. The two sources cannot be directly summed: filing-period attribution differences and carryforward netting create a systematic wedge. CALM CORP H1 FY2026 actual ($76,107,953) is $42.7M below the raw individual filer composite ($118,853,982), of which roughly $20M is explained by filing-period attribution (BIT payments 74.6% FY2025-attributed) and the remainder by CALM netting carryforward credits that gross inflow files do not capture.
| Period | CIT | BIT | NRW | CALM CORP Total |
|---|---|---|---|---|
| FY2024 full year | $197,764,751 | $10,889,688 | $30,158,087 | $238,812,527 |
| FY2025 full year | $241,005,317 | $73,045,996 | (−$41,423,565) | $272,627,748 |
| FY2026 H1 (Jul–Dec 2025) | $73,429,273 | $23,959,650 | (−$21,280,971) | $76,107,953 |
NRW was a net tailwind in FY2024 (+$30.2M; legacy credits not yet drawn at scale) and became a persistent headwind in FY2025 (−$41.4M) and H1 FY2026 (−$21.3M) as the accumulated carryforward stock is drawn down faster than new credits are generated (Section 9).
| FY | H1 | H2 | Full Year | H1 Share | H2/H1 |
|---|---|---|---|---|---|
| FY2021 | $59,277,302 | $113,141,924 | $172,419,226 | 34.38% | 1.909 |
| FY2022 | $77,614,412 | $145,647,072 | $223,261,485 | 34.76% | 1.877 |
| FY2023 | $97,777,377 | $183,591,942 | $281,369,318 | 34.75% | 1.878 |
| FY2024 | $97,597,495 | $141,215,032 | $238,812,527 | 40.87% | 1.447 |
| FY2025 | $123,200,771 | $149,426,978 | $272,627,748 | 45.19% | 1.213 |
Pre-NRW structural baseline (FY2021–2023 mean): 34.63% H1 share, H2/H1 ≈ 1.88×. The sharp H1 share increase in FY2024–2025 reflects NRW carryforward credits suppressing H2 cash collections. This effect is modeled explicitly via the NRW scenario overlay below rather than embedded in the H1/H2 ratio. The pre-NRW structural ratio is used as the mechanical H2 projection baseline; NRW scenarios adjust H2 from that baseline.
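The H1 share and H2/H1 columns, and the FY2021–2023 baseline, can be recomputed directly from the H1 and H2 figures in the table. The data frame and column names below are illustrative.

```r
corp <- data.frame(
  fy = 2021:2025,
  h1 = c(59277302, 77614412, 97777377, 97597495, 123200771),
  h2 = c(113141924, 145647072, 183591942, 141215032, 149426978)
)

corp$h1_share <- corp$h1 / (corp$h1 + corp$h2)   # H1 as share of full year
corp$h2_h1    <- corp$h2 / corp$h1               # H2 relative to H1

# Pre-NRW structural baseline: simple mean over FY2021-FY2023.
baseline       <- corp$fy %in% 2021:2023
baseline_share <- mean(corp$h1_share[baseline])
baseline_h2_h1 <- mean(corp$h2_h1[baseline])

round(100 * corp$h1_share, 2)
round(100 * baseline_share, 2)
```

The baseline here is an unweighted mean of the three annual shares, which is the reading that reproduces the 34.63% figure.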
Three scenarios for H2 FY2026 NRW (see Section 13 item 14 for resolution status). H1 fixed at $76,107,953. Full-year NRW = H1 NRW + H2 NRW.
| Scenario | H2 NRW | Full-Year NRW | Rationale |
|---|---|---|---|
| Upside | $0 | (−$21.3M) | Carryforward stock exhausts; taxpayers cover H2 liability in cash |
| Central | (−$21,280,971) | (−$42.6M) | Continuation of H1 run rate; FY2025 drawdown pattern |
| Downside | (−$31,921,457) | (−$53.2M) | Extension filers appear in H2 and generate new credits applied immediately |
The downside H2 NRW = 1.5× H1 run rate. Resolution pending summer 2026 extension filing window (Section 9.3).
Pre-NRW structural H1 share (34.63%) implies mechanical H2 = $76,107,953 / 0.3463 − $76,107,953 = $143,657,779. NRW scenarios adjust H2 relative to central; H1 is fixed.
| Scenario | Full-Year CORP | Delta vs Central |
|---|---|---|
| Upside | $241,046,703 | +$21,280,971 |
| Central | $219,765,732 | — |
| Downside | $209,125,246 | −$10,640,486 |
January nowcast central: $220.5M (Section 15.1). CALM-grounded revised central: $219.8M ($219,765,732). Difference: $0.7M — immaterial. The January individual-filer-based nowcast is consistent with the CALM structural baseline and requires no revision.
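The scenario figures above can be reproduced from the H1 actual, the unrounded FY2021–2023 mean H1 share, and the three H2 NRW assumptions. The sketch below follows the text's convention that the mechanical H2 is the central case and the other scenarios shift H2 by the difference in H2 NRW relative to central; variable names are illustrative.

```r
h1_fy2026 <- 76107953                  # CALM CORP H1 FY2026 actual
h1_nrw    <- -21280971                 # H1 FY2026 NRW

# Unrounded pre-NRW H1 share: mean of FY2021-FY2023 H1 / full-year.
h1_share <- mean(c(59277302 / 172419226,
                   77614412 / 223261485,
                   97777377 / 281369318))

central_full  <- h1_fy2026 / h1_share  # mechanical full-year CORP
h2_mechanical <- central_full - h1_fy2026

# H2 NRW by scenario; downside is 1.5x the H1 run rate.
h2_nrw <- c(upside = 0, central = h1_nrw, downside = 1.5 * h1_nrw)

# Full-year CORP: central plus the H2 NRW difference versus central.
full_year <- central_full + (h2_nrw - h2_nrw[["central"]])
round(full_year)
```

Using the share rounded to 34.63% instead of the unrounded mean moves the central estimate by several thousand dollars, so the unrounded value is needed to match the table to the dollar.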
Date completed: May 2026.
The July update serves two purposes (Section 4.3): (1) clean FY2026 of idiosyncratic large-payer effects to establish the structural base year for FY2027–FY2031 projections; and (2) characterize concentration risk for the legislature. FY2026 is fully observed after the fiscal year closes June 30, 2026. The workflow requires a new full-year CIT_PMT_ANNUAL file from the Tax Department before the real-data steps can run.
| Script | Session | Purpose |
|---|---|---|
| 00_Scripts/corp_baseyear_cleaning_fy2026.R | Synthetic | FY2026 base year cleaning; outputs corp_baseyear_cleaning_fy2026_summary.rds |
| 00_Scripts/corp_baseyear_cleaning_fy2026_real.R | Confidential (gitignored) | Identical logic applied to real CIT_PMT_ANNUAL July file; update PMT_FILE_JULY when file received |
| 00_Scripts/july_update_workflow.R | Synthetic | Orchestration: sources Steps 1–2, prints confidential-session checklist |
july_update_workflow.R runs the two synthetic steps in
sequence and prints a seven-step checklist (Steps A–G) for the separate
confidential-session run.
Per Section 4.1, the structural base year removes idiosyncratic large-payer effects:
\(Y^*_{FY2026} = Y_{FY2026}^{actual} - \hat{\epsilon}_{large,FY2026}\)
Reference period: FY2021–FY2024 (post-PTET; FY2020 excluded as pre-PTET regime; FY2025 excluded as the contiguous prior year). This matches the Section 6.2 recommendation.
Flagging rule: per-account idiosyncratic component =
FY2026 payment − per-account reference mean over FY2021–FY2024. Flag
accounts where |idio| > 2 × SD(idio distribution across all FY2026
accounts). New accounts with no FY2021–2024 history receive reference
mean = 0 (conservative; consistent with January methodology in
clean_CIT_PMT_ANNUAL.qmd).
Cleaned base formula:
cleaned = actual − sum(idio actuals) + sum(idio ref means)
Idiosyncratic accounts’ payments are replaced with their reference means; all other accounts retain their actual payments. The net cleaning adjustment = sum(idio actuals) − sum(idio ref means). Idio payments are not removed entirely — the reference mean is substituted in their place.
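The flagging rule and cleaned-base formula can be sketched on toy data as follows. This is an illustration of the logic as described, not the contents of corp_baseyear_cleaning_fy2026.R: the function name, the use of NA to mark accounts with no FY2021–2024 history, and the ten toy accounts are all assumptions.

```r
clean_base_year <- function(actual, ref_mean, k = 2) {
  ref_mean[is.na(ref_mean)] <- 0       # new accounts: reference mean = 0
  idio    <- actual - ref_mean         # per-account idiosyncratic component
  flagged <- abs(idio) > k * sd(idio)  # flag beyond k SDs of the idio distribution
  idio_actual <- sum(actual[flagged])
  idio_ref    <- sum(ref_mean[flagged])
  list(
    flagged        = flagged,
    cleaned        = sum(actual) - idio_actual + idio_ref,
    net_adjustment = idio_actual - idio_ref
  )
}

# Toy data: account 7 pays far above its reference mean; account 9 is new.
actual   <- c(100, 110, 95, 105, 98, 102, 900, 101, 99, 100)
ref_mean <- c(100, 105, 100, 100, 100, 100, 150, 100, NA, 100)

res <- clean_base_year(actual, ref_mean)
which(res$flagged)
res$cleaned
res$net_adjustment
```

In the toy data only account 7 is flagged, so its 900 payment is replaced by its reference mean of 150 while the new account keeps its actual payment.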
H1 data note (real companion): corp_baseyear_cleaning_fy2026_real.R compares H1 payments to H1-scaled reference means (ref_mean × h1_fraction) to flag candidates. The flagged accounts' H1 actuals are then annualized to a full-year basis by dividing by the implied H1 fraction before the formula is applied. TIER 2 dollar outputs use FY2026_ACTUAL = $220.5M as a placeholder.
Note: Synthetic FY2026 = H1 only ($103.2M, 1,047 accounts). All figures below are from the synthetic data run and are illustrative. Real FY2026 results come from the confidential companion.
Base year cleaning (FY2021–2024 reference, primary):
| Metric | Value |
|---|---|
| FY2026 actual (H1 panel) | $103.2M |
| minus: idio accounts actual payments | $23.1M |
| plus: idio accounts reference means | $19.6M |
| FY2026 structural base (actual − idio + ref) | $99.7M |
| Net cleaning adjustment (idio − ref) | $3.5M |
| Net adj as pct of actual | 3.4% |
| Candidate idiosyncratic accounts | 53 of 1,047 |
Reference period sensitivity:
| Reference period | n_idio | Idio actuals | Ref means | Net adj | Cleaned | Net % |
|---|---|---|---|---|---|---|
| FY2021–2024 (recommended) | 53 | $23.1M | $19.6M | $3.5M | $99.7M | 3.4% |
| FY2022–2024 (post-apportionment) | 53 | $23.3M | $19.4M | $3.9M | $99.4M | 3.8% |
| FY2021–2023 (pre-apportionment) | 49 | $23.9M | $16.9M | $6.9M | $96.3M | 6.7% |
The FY2021–2023 reference strips more because its reference means for the flagged accounts are lower ($16.9M versus $19.6M under FY2021–2024), which makes FY2026 payments look more elevated by comparison. FY2021–2024 is the recommended window (Section 6.2).
FY2026 concentration (synthetic): top_1 = 1.6%, top_5 = 6.5%, top_10 = 10.5%, HHI = 0.0032. Consistent with post-PTET mean (top_1 = 1.6%, top_5 = 6.8%, HHI = 0.0033). Synthetic data has no extreme idiosyncratic outliers; real data concentration may differ materially.
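The concentration measures quoted above can be computed as in the sketch below, assuming top-k is the combined payment share of the k largest accounts and HHI is the sum of squared shares on a 0–1 scale (the scale implied by values such as 0.0032). The function name and the simulated payments are illustrative, and payments are assumed to be positive.

```r
concentration <- function(payments, k = c(1, 5, 10)) {
  shares <- sort(payments, decreasing = TRUE) / sum(payments)
  top_k  <- vapply(k, function(i) sum(head(shares, i)), numeric(1))
  names(top_k) <- paste0("top_", k)
  c(top_k, hhi = sum(shares^2))        # HHI on a 0-1 share scale
}

# Illustrative payments only (log-normal parameters from the synthetic spec).
set.seed(7)
payments <- rlnorm(1000, meanlog = 10.7, sdlog = 1.25)
round(concentration(payments), 4)
```

With n equal-sized accounts the HHI equals 1/n, which gives a useful lower bound when reading the reported values.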
Legislature stress scenarios (synthetic):
| Scenario | Total | Delta |
|---|---|---|
| Actual FY2026 | $103.2M | — |
| Top-1 reverts to reference mean | $101.7M | −1.5% |
| Top-3 revert to reference means | $98.9M | −4.2% |
| Top-5 revert to reference means | $96.7M | −6.3% |
| Top-10 revert to reference means | $92.7M | −10.2% |
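Each stress row replaces the largest k accounts' payments with their reference means and reports the change in the total. A minimal sketch, assuming "top" is ranked by FY2026 payment (the ranking basis is not stated here) and using a hypothetical function name and toy inputs:

```r
stress_top_k <- function(actual, ref_mean, k) {
  top   <- order(actual, decreasing = TRUE)[seq_len(k)]   # k largest payers
  total <- sum(actual) - sum(actual[top]) + sum(ref_mean[top])
  c(total = total, delta_pct = 100 * (total / sum(actual) - 1))
}

# Toy example: three accounts, the largest reverts from 50 to 40.
stress_top_k(actual = c(50, 30, 20), ref_mean = c(40, 30, 20), k = 1)
```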
Run these steps separately in the real-data session after obtaining the full-FY2026 CIT_PMT_ANNUAL file from the Tax Department:
| Step | Action |
|---|---|
| A | Receive full-FY2026 CIT_PMT_ANNUAL file; update PMT_FILE_JULY and FY2026_ACTUAL (line ~36) in corp_baseyear_cleaning_fy2026_real.R |
| B | Run corp_baseyear_cleaning_fy2026_real.R; TIER 1 outputs will validate; TIER 2 outputs update to definitive values; key output: base_year_inputs$fy2026_cleaned |
| C | Run extract_predictor_params_confidential.R; training window now FY2007–FY2026 |
| D | Run extend_predictors_confidential.R; refresh Moody’s forecasts through FY2032 |
| E | Run 00_Notes/create_synthetic_data.R; append new FY2027–2032 extension rows |
| F | Update corp_projection_fy2027_2031.R: BASE_DOLLARS <- [FY2026 actual CORP from CALM]; FY2025_ACTUAL unchanged at 272,627,748 |
| G | Re-run july_update_workflow.R with updated synthetic data and real base |
Files:
- Synthetic script: 00_Scripts/corp_baseyear_cleaning_fy2026.R
- Real companion: 00_Scripts/corp_baseyear_cleaning_fy2026_real.R (gitignored)
- Orchestration: 00_Scripts/july_update_workflow.R
- Summary output: 00_Data/Data_Wrangled/corp_baseyear_cleaning_fy2026_summary.rds
- Real output: 00_Data/Data_Wrangled/corp_baseyear_cleaning_real_FY2026_results.rds (gitignored)
Date completed (synthetic steps): May 7, 2026. H1 real-data run: May 2026 (see Section 16.6). Definitive run awaits full-year FY2026 file.
Executed against
Payment Totals by Year - CIT - Prepared 20260106.xlsx (H1
FY2026 file; x2026 column = Jul–Dec 2025). TIER 1 outputs are reliable
now. TIER 2 outputs are PRELIMINARY and will be updated when the
full-year FY2026 file is received post-June 30, 2026.
| Metric | Value |
|---|---|
| Accounts in CIT_PMT_ANNUAL file | 3,187 |
| Active accounts in FY2026 H1 | 861 |
| H1 panel total | $103.2M |
| FY2025 full-year total | $249.0M |
| H1 ratio vs FY2025 | 41.5% |
| Implied H1 share of $220.5M placeholder | 46.8% |
| Candidate idiosyncratic accounts (H1 flags) | 15 |
FY2026 H1 concentration vs post-PTET norms:
| Measure | FY2026 H1 | Post-PTET mean (FY2021–2025) |
|---|---|---|
| Top-1 share | 11.3% | 6.2% |
| Top-5 share | 22.2% | 16.4% |
| Top-10 share | 30.3% | — |
| HHI | 0.0196 | 0.0121 |
Concentration is materially elevated relative to post-PTET norms, consistent with one or two large payers making anomalously large H1 payments. Synthetic data (top-1 = 1.6%, HHI = 0.0032) has no extreme outliers by design; real H1 data does.
Base year cleaning — FY2021–2024 reference (primary):
| Component | Amount |
|---|---|
| FY2026 actual — CALM placeholder | $220.5M |
| minus: idio accounts annualized actuals | $76.0M |
| plus: idio accounts reference means | $21.3M |
| FY2026 structural base (actual − idio + ref) | $165.8M |
| Net cleaning adjustment | $54.7M (24.8%) |
Important caveat: The $76.0M annualized idio actuals is derived by scaling H1 idio payments (from 15 candidate accounts) by the implied H1 fraction (46.8%). This annualization introduces significant uncertainty. The idiosyncratic account flags and the cleaning amount will change materially when the full-year FY2026 file is received — the preliminary 24.8% net cleaning share should not be treated as a reliable estimate of the final cleaning adjustment.
Reference period sensitivity [TIER 2 — preliminary]:
| Reference period | n_idio | Ann. actuals | Ref means | Net adj | Cleaned [PH] | Net % |
|---|---|---|---|---|---|---|
| FY2021–2024 (recommended) | 15 | $76.0M | $21.3M | $54.7M | $165.8M | 24.8% |
| FY2022–2024 (post-apportionment) | 14 | $76.0M | $20.1M | $55.8M | $164.7M | 25.3% |
| FY2021–2023 (pre-apportionment) | 15 | $76.2M | $13.6M | $62.6M | $157.9M | 28.4% |
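The sensitivity table applies the same replacement formula (actual minus annualized idio actuals plus reference means) under each reference period. The sketch below recomputes the last three columns from the rounded inputs in the table; the function and variable names are hypothetical, not taken from corp_baseyear_cleaning_fy2026_real.R. Since the inputs are rounded to $0.1M, a recomputed value can differ from the table by $0.1M (for example in the FY2022–2024 row).

```python
PLACEHOLDER = 220.5  # FY2026 CALM placeholder, $ millions

# (annualized idio actuals, idio reference means) per reference period,
# copied from the sensitivity table above, $ millions.
reference_periods = {
    "FY2021-2024": (76.0, 21.3),
    "FY2022-2024": (76.0, 20.1),
    "FY2021-2023": (76.2, 13.6),
}


def clean_base(actual, idio_annualized, idio_reference):
    """Apply the replacement formula: actual - idio actuals + reference means.

    Returns (net adjustment, cleaned base, net adjustment as % of actual).
    """
    net_adj = idio_annualized - idio_reference
    cleaned = actual - net_adj
    net_pct = 100.0 * net_adj / actual
    return net_adj, cleaned, net_pct


results = {
    label: clean_base(PLACEHOLDER, ann, ref)
    for label, (ann, ref) in reference_periods.items()
}

for label, (net_adj, cleaned, net_pct) in results.items():
    print(f"{label}: net adj ${net_adj:.1f}M, "
          f"cleaned ${cleaned:.1f}M, net {net_pct:.1f}%")
```

The spread of cleaned values across the three rows comes almost entirely from the reference means, since the annualized actuals barely move between reference periods.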
Legislature stress scenarios [TIER 2 — preliminary, $220.5M placeholder]:
| Scenario | Total | Delta |
|---|---|---|
| FY2026 actual — placeholder | $220.5M | — |
| Top-1 reverts to reference mean | $199.8M | −9.4% |
| Top-3 revert to reference means | $189.4M | −14.1% |
| Top-5 revert to reference means | $181.2M | −17.8% |
| Top-10 revert to reference means | $167.1M | −24.2% |
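The Delta column is the percentage change of each scenario total relative to the $220.5M placeholder. The following sketch recomputes it from the tabulated totals and also reports the dollar reduction each scenario implies; the names are illustrative assumptions, and the dollar reductions are derived here rather than reported in the source table.

```python
BASELINE = 220.5  # FY2026 placeholder, $ millions

# Scenario totals copied from the stress-scenario table above, $ millions.
scenario_totals = {
    "Top-1": 199.8,
    "Top-3": 189.4,
    "Top-5": 181.2,
    "Top-10": 167.1,
}


def delta_pct(total, baseline):
    """Percentage change of a scenario total relative to the baseline."""
    return round(100.0 * (total - baseline) / baseline, 1)


deltas = {name: delta_pct(total, BASELINE)
          for name, total in scenario_totals.items()}

# Dollar reduction implied by each scenario (derived, not a reported figure).
reductions = {name: round(BASELINE - total, 1)
              for name, total in scenario_totals.items()}

for name in scenario_totals:
    print(f"{name} reverts: delta {deltas[name]}%, "
          f"reduction ${reductions[name]}M")
```

All of these values inherit the preliminary status of the placeholder and should be recomputed once the full-year total replaces $220.5M.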
Update instructions when the full-year file is received:
1. Update PMT_FILE_JULY in corp_baseyear_cleaning_fy2026_real.R.
2. Update FY2026_ACTUAL (line ~36) with the full-year CIT_PMT_ANNUAL total.
3. Re-run corp_baseyear_cleaning_fy2026_real.R; TIER 1 outputs validate and TIER 2 outputs update to definitive values.
4. Use base_year_inputs$fy2026_cleaned as BASE_DOLLARS in Step F (projection update).
| Hash | Commit message |
|---|---|
| 670b0ea | Initial commit: CORP nowcast FY2026, synthetic data scripts, project docs, CLAUDE.md guardrails, .gitignore |
| 631dc63 | Update project brief: CORP nowcast FY2026 results, open questions status, synthetic data dimensions |
| bc7f19b | CORP 5-year projection FY2027-2031: primary full-history model with regime dummies, robustness short window |
| dd3c30b | Gitignore: exclude confidential session script and Moody’s extension output |
| 4bc69e9 | CORP projection updated with real Moody’s FY2026-2031 predictor forecasts |
| 02a9216 | CORP projection final: all FY2026-2031 predictors now on real Moody’s values including FY2031 |
| 026bb24 | Update project brief: CORP projection final results, synthetic_predictors extended to FY2031 |
| 21de6e7 | Add confidential param extraction script for 28-predictor synthetic data expansion |
| 9678ef1 | Expand synthetic_predictors to 28 predictors: add _cy calendar year variants via empirical params; update extend_predictors_confidential.R to match |
| f31fff7 | CORP projection final: 28-predictor pool with _cy variants; fsp500q_us_fy + fztax_us_cy selected; adj R2 0.496 |
| a32e850 | Update project brief Section 14.2: synthetic_predictors 61x30 (28 predictors), synthetic_full 47x42 |
| 56456b4 | Update CLAUDE.md synthetic data dims: predictors 61x30, full 47x42 |
| dadf310 | Update project brief Section 14.3: 28 predictors, 28x28 correlation matrix, dual fminwage clip |
| b8f6535 | Add Section 16 revision history with full git log to project brief |
| 9bd88c4 | Gitignore: exclude Quarto render artifacts (*_files/) and rsconnect/ deployment metadata |
| 42a4d1e | Track CIG, CIT, and Estate cleaning scripts: 9 qmd files reviewed and staged |
| 35f18dd | Update Estate qmd files for revised Tax Dept files with shared account_key |
| 44eaffb | Fix date parsing in Estate qmd files: use lubridate::parse_date_time for robustness |
| 4358211 | Regenerate Estate summary files from revised Tax Dept data with account_key and additional extension payment |
| 36849c4 | Track CIG and CIT aggregate summary rds files |
| 4181ade | Add real-data CORP nowcast companion script corp_nowcast_fy2026_real.R; gitignore updated |
| c98de8f | July update workflow: base year cleaning, BIT/CIT noreturn analysis, orchestration script; real-data companion gitignored |
| 611da1c | Update CLAUDE.md: add corp_baseyear_cleaning_fy2026_summary and July update scripts |
| 3feedea | Update CLAUDE.md: add account_name anonymization safety pattern and mark real companion as validated |
| bfe7bc1 | Fix base year cleaning: explicit replacement formula + TIER 1/2 output structure |
| c60eb6d | Update project_brief: Section 16 July update workflow, Section 17 revision history, resolve item 12 |
| f34345e | Update project_brief: H1 real-data results, cleaning formula, TIER structure |
| de1e5c9 | CALM CORP reconciliation: NRW scenario framework, pre-NRW structural baseline, January nowcast validated at $219.8M vs original $220.5M |
This document consolidates the original project brief, all individual filer analysis through CIT_INFO and BIT (BIT_PMT_CUM and BIT_CIT_NORETURN are now fully analyzed), the Vermont Tax Law Change Log derived from the JFO Fiscal Facts 2026 publication, Tax Department meeting findings (May 2026), and the synthetic data stack for model development (May 2026). All five CIT individual filer files are fully analyzed. BIT and NRW were confirmed to map to CORP in source tax revenues (May 2026). The July 2026 update workflow scripts are complete and validated against H1 real data (Section 16). TIER 1 H1 results in Section 16.6 are reliable now; the definitive TIER 2 cleaning awaits the full-year FY2026 CIT_PMT_ANNUAL file (Section 13 item 13). Analysis of WHT_FILINGS, MRT_STR_MONTHLY, BFT_RTN, and CAPGAIN_SUM is still pending. CALM captures gross source revenue before fund allocation, so fund dedication changes are irrelevant to forecasting. Confidentiality guidelines apply throughout: no individual taxpayer identifiers, names, EINs, or record-level data are used in any aggregate output. All model development uses synthetic data; real confidential data is accessed only at the final application stage.