Reproduction of: California Billionaires: Wealth, Taxes, and Wealth Tax Revenue Estimates

R replication of Boll, Saez, and Zucman (2026) — NBER WP 35218

Author

Fernando Hoces de la Guardia

Published

May 27, 2026

Overview

This document is a section-by-section R replication of California Billionaires: Wealth, Taxes, and Wealth Tax Revenue Estimates by Jasper Boll, Emmanuel Saez, and Gabriel Zucman (NBER Working Paper No. 35218, May 2026). Every table and figure embedded below is built from the project’s {targets} pipeline. The upstream targets re-derive each Excel cell of the BSZ supplementary workbook in R and verify the results element-wise against the cached values; see the project README for the full pipeline structure.

The text from the original paper is shown in blockquotes; section commentary outside the blockquotes is the replicator’s own.

Abstract (verbatim from BSZ 2026)

This paper documents the wealth of California’s billionaires and the taxes they pay. California billionaires’ wealth exceeds $2 trillion today, the equivalent of 50% of California’s GDP. It has grown 144% from 2023 to 2025, fueled by the AI boom. Over the longer run, the real wealth of California’s billionaire class — the 0.0002% richest households — has been multiplied by 30 from 1982 to 2025, while average real family income in California has about doubled. California billionaires pay about 0.2% of their wealth in California income tax ($3.2 billion/year), representing 2.4% of total California income tax revenue on average over 2023-2025. Using Securities and Exchange Commission data from Alphabet, Meta, Oracle, and Nvidia since 2004, we estimate the trajectory of wealth, income, and taxes paid by the top 4 California billionaires — Page, Brin, Zuckerberg, Ellison (through 2020), and Huang (since 2021) — focusing on their business wealth. This group alone holds nearly $1 trillion in business wealth, almost half of total California billionaire wealth. For this group, wealth growth (+322% over 2023-2025) and low taxation (0.04% of wealth in annual California income tax) are more pronounced. The proposed one-off California billionaire tax of 5%, payable over 5 years, is both small relative to California billionaires’ wealth gains and large relative to the taxes they currently pay. We estimate that it could raise about $100 billion, with comparatively minor impacts on income tax revenue. Using empirical estimates of mobility responses to wealth taxation, we find that an annual wealth tax on California billionaires could raise substantial additional revenue even after accounting for income tax losses due to mobility.

R re-derivations (computed once, cached, reused below)

Code
data_sec_agg_r           <- compute_data_sec_agg(data_sec_all)
pareto_missing_r         <- compute_pareto_missing(pareto_missing)
pareto_summary           <- compute_pareto_summary(pareto_missing_r)
tab5_r                   <- compute_tab5(pareto_missing_r, tab2, tab3)
fig8_laffer_r            <- compute_fig8_laffer()
billionaires_ca_inctax_r <- compute_billionaires_ca_inctax(
                              data_sec_agg_r, billionaires_ca_inctax, ftb_b4a)
shortrunseries_r         <- compute_shortrunseries(
                              data_sec_agg_r, data_sec_top4,
                              billionaires_ca_inctax_r, shortrunseries)
top4taxes_r              <- compute_top4taxes(data_sec_top4)
# Read-only display. Function bodies live in R/compute_*.R; tar_target() wiring
# is in _targets.R. The report imports each value via tar_read(<name>).

1. California Billionaires Wealth Growth

This section documents the level and growth of California billionaire wealth, both recent (2019-2025, where SEC Form 4 filings give per-individual detail) and long-run (1982-2025, where Forbes 400 lists give the top .0002%).

1.1 Recent and long-term wealth growth

Table 1 Panel A provides basic statistics on California billionaires in recent years. Appendix Table A1 provides the same statistics at the US level (instead of California). […] Larry Ellison is excluded from these series throughout the 2019 to 2025 period for consistency. (BSZ 2026, p. 3)

Methodology. Panel A pulls 2022-2025 totals straight from the Forbes real-time billionaire list (Ellison excluded). For each year, total CA billionaire wealth is the sum across CA residents, \[ W_t \;=\; \sum_{i \in \text{CA}_t} w_{i,t}, \] the public-stock fraction divides the public-worth aggregate by the total, and the top-4 column sums Page + Brin + Zuckerberg + Huang’s individual SEC Form 4-derived wealth. Year-over-year growth is \(g_t = W_t / W_{t-1} - 1\); the 2023-2025 cumulative growth row is \(W_{2025} / W_{2022} - 1\).
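The growth formulas above can be checked against the displayed Table 1 Panel A totals. This is a small sketch, not the pipeline code: the variable names are illustrative, and the inputs are the rounded $B values shown in the table, so results can differ from the displayed growth rates in the last digit.

```r
# Total CA billionaire wealth ($B), as displayed in Table 1 Panel A (rounded)
wealth <- c(`2022` = 843, `2023` = 1155, `2024` = 1567, `2025` = 2052)

# Year-over-year growth: g_t = W_t / W_{t-1} - 1
yoy_growth <- wealth[-1] / wealth[-length(wealth)] - 1

# Cumulative 2023-2025 growth: W_2025 / W_2022 - 1
cum_growth <- unname(wealth["2025"] / wealth["2022"] - 1)

round(100 * yoy_growth, 1)
round(100 * cum_growth, 1)
```

With rounded inputs the cumulative figure comes out near 143%, against the 144% quoted in the abstract, which presumably uses unrounded wealth totals.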

Panel B uses the Forbes 400 historical lists. The “billionaire class” is defined as a fixed fraction of all US (or CA) families — the top .0002% — so the population size is dynamic (45 families in CA in 2025, 24 in 1982); this controls for population growth across the 43-year window. All dollar values are deflated to 2025 dollars to control for inflation. The “annualized growth” row is the geometric average growth rate over the period, \[ \bar{g} \;=\; \left(\frac{X_{2025}}{X_{1982}}\right)^{1/43} - 1, \] applied separately to top-class wealth, top-class wealth per family, # CA families, CA GDP, and CA GDP per family.
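As a worked instance of the annualized-growth formula, the sketch below applies it to the CA GDP-per-family values displayed in Table 1 Panel B. The helper name annualized_growth() is hypothetical and is not one of the project's functions.

```r
# Geometric average growth rate between two dates `years` apart
annualized_growth <- function(x_start, x_end, years) {
  (x_end / x_start)^(1 / years) - 1
}

# CA GDP per family in 2025 dollars: 1982 and 2025 values from Table 1 Panel B
g_gdp_per_family <- annualized_growth(102141, 190451, years = 2025 - 1982)
round(100 * g_gdp_per_family, 1)
```

The result rounds to the 1.5% shown in the table's annualized-growth row.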

Table 1 is rendered by build_tab1() from three upstream tibbles — compute_data_sec_agg() for the year-by-year totals, compute_shortrunseries() for the top-4 wealth column, and extract_longrunseries() for CA GDP.

Code
data_sec_agg_r   <- compute_data_sec_agg(data_sec_all)
shortrunseries_r <- compute_shortrunseries(
                       data_sec_agg_r, data_sec_top4,
                       billionaires_ca_inctax_r, shortrunseries)
build_tab1(data_sec_agg_r, shortrunseries_r, longrunseries)
# Read-only display. _targets.R wires these into the pipeline.
Table 1. Wealth Growth of California Billionaires
Year # Wealth Growth % public Top 4 wealth CA GDP Wealth / GDP
A. Recent nominal wealth growth of CA billionaires
2022 175 843 47.7% 209 3,618 23.3%
2023 186 1,155 37.0% 59.6% 399 3,827 30.2%
2024 196 1,567 35.7% 66.1% 625 4,048 38.7%
2025 239 2,052 30.9% 66.9% 882 4,251 48.3%
Growth during 2023-2025 1 3 0
B. Long-term real wealth growth: top .0002% wealthiest CA families (2025 $)
1982 24 22 94.6% 1,184.1% 1,209 102,141
2025 45 1,266 2,836.5% 2,231.8% 4,251 190,451
Ratio 2025 to 1982 2 57 2,999.8% 188.5% 4 2
Annualized growth 0 0 8.2% 1.5% 0 0
Notes: Panel A illustrates the wealth growth of CA billionaires in 2022-2025. Panel B compares the top .0002% wealthiest CA families’ wealth in 1982 vs 2025, all in 2025 dollars. Source: Forbes RTB snapshots + SEC EDGAR.

Figure 1 plots the trajectory of California’s billionaires’ total wealth (in nominal current dollars) at the end of years 2019 to 2025, the longest time series we have for all California billionaires. It also plots the wealth trajectory of the top 4 (Larry Page, Sergei Brin, Mark Zuckerberg, and Jensen Huang) who are centi-billionaires (wealth above $100 billion). (BSZ 2026, p. 2–3)

Methodology. Two lines, the all-CA total \(W_t\) and the top-4 sum \(W_t^{\text{top4}}\) from Table 1 Panel A, plotted at the end of years 2019-2025. The dashed counterfactuals depict the post-tax wealth \(0.95 \cdot W_{2025}\) to visualize the magnitude of the proposed 5% one-off wealth tax against the 2023-2025 wealth build-up.

Figure 1 is rendered by build_fig1() from the panel produced by compute_shortrunseries().

Code
build_fig1(shortrunseries_r)
# Read-only display. _targets.R wires this into the pipeline.

Figure 2 depicts the full time series. Panel A depicts the trajectory of top .0002% wealth (top 45 families in California today) in 2025 $ billions vs. the trajectory of California GDP per family. (BSZ 2026, p. 4)

Methodology. Panel A overlays two inflation-controlled trajectories at annual frequency 1982-2025: the wealth of the top .0002% CA families (in 2025 $ billions) and CA GDP per family (in 2025 $100,000s). The two series share one y-axis because wealth is expressed in units of \(\$1\text{B}\) and GDP per family in units of \(\$100\text{k}\), so the steeper slope of the wealth series can be read directly off the chart. Panel B divides top-.0002% wealth by GDP to remove the remaining drift from real economic growth: each year’s value is \(W_t^{0.0002\%} / \text{GDP}_t\) for the US and CA separately, where \(W_t^{0.0002\%}\) is the aggregate wealth of the year-specific top .0002% group. Both panels exclude Ellison from CA after 2020.

Figure 2 is rendered by build_fig2() from the positional dump produced by extract_longrunseries().

Code
build_fig2(longrunseries)
# Read-only display. _targets.R wires this into the pipeline.

2. Taxes paid by California Billionaires

Section 2 estimates the California income tax paid by California billionaires each year from 2019 through 2025. Two complementary approaches are used: a top-down extrapolation from California Franchise Tax Board (FTB) tabulated statistics for high-AGI brackets, and a bottom-up calculation from SEC Form 4 filings of the top-4 holders’ realized capital gains, dividends, and exercised stock options.

2.1 Headline estimates for all California billionaires

For all California billionaires, we estimate the California income tax they pay using Franchise Tax Board (the California tax administration) tabulated statistics, a Pareto extrapolation, and an adjustment based on Federal tax data. (BSZ 2026, p. 6)

Methodology. The left half of Table 2 (“All California Billionaires”) is the centerpiece of the paper’s headline-tax estimate and combines (i) Franchise Tax Board bracket-level tabulations, (ii) a Pareto extrapolation to project the income of the top \(N\) taxpayers (where \(N\) is the number of CA billionaires that year), and (iii) two empirical corrections that adjust the projected top-income result into a top-wealth result. The right half (“Top 4”) is mechanically simpler: SEC Form 4 transaction-level data lets us compute Larry Page, Sergey Brin, Mark Zuckerberg, and Jensen Huang’s California income tax cell-by-cell, with no extrapolation.

Pareto extrapolation (left half). A Pareto distribution with parameter \(b\) has the property that the mean of any upper tail equals \(b\) times the tail’s cutoff. Concretely, if FTB’s top bracket has \(n_t\) filers above threshold \(t\) with total AGI \(A_t\), then \[ b_t \;=\; \frac{A_t / n_t}{t}. \] Inverting the tail CDF places exactly \(N\) filers above some projected cutoff \(\hat{t}_N\): \[ \hat{t}_N \;=\; t \cdot \left(\frac{n_t}{N}\right)^{1 - 1/b_t}, \] and the mean AGI of those top \(N\) is then \(\hat{A}_N = b_t \cdot \hat{t}_N\) (so total AGI \(= N \cdot \hat{A}_N = N \cdot b_t \cdot \hat{t}_N\)). Multiplying total AGI by the bracket-level effective tax rate \(\tau_t = T_t / A_t\), where \(T_t\) is the bracket’s total tax, gives an uncorrected projection of the top-\(N\) tax payment, \(\tilde{T}_N = \tau_t \cdot N \cdot \hat{A}_N\).
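The extrapolation steps above can be sketched as a single function. This is an illustration only: pareto_top_n() and its argument names are hypothetical (the pipeline's own implementation is .bci_top_brackets()), and the bracket numbers are made up rather than taken from FTB tabulations. The formula needs \(b_t > 1\).

```r
# Project the cutoff and AGI of the top `n_top` filers from one top bracket
pareto_top_n <- function(threshold, n_filers, total_agi, n_top) {
  b <- (total_agi / n_filers) / threshold               # b = mean above threshold / threshold
  cutoff <- threshold * (n_filers / n_top)^(1 - 1 / b)  # invert the tail CDF
  mean_agi <- b * cutoff                                # mean AGI above the projected cutoff
  list(b = b, cutoff = cutoff, mean_agi = mean_agi,
       total_agi = n_top * mean_agi)
}

# Made-up bracket: 1,000 filers above $10M with $30,000M total AGI (units: $M)
proj <- pareto_top_n(threshold = 10, n_filers = 1000, total_agi = 30000, n_top = 100)

# Sanity check: asking for all 1,000 filers returns the bracket itself
same <- pareto_top_n(threshold = 10, n_filers = 1000, total_agi = 30000, n_top = 1000)
```

Setting n_top equal to n_filers is a useful self-check, since the projected cutoff must then equal the bracket threshold and the projected total AGI must equal the bracket total.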

Three corrections. That projection over-estimates the actual top-\(N\) tax for three reasons documented in the paper.

Overshoot correction (\(1 - \pi\), BSZ p. 47). Calibrating the same Pareto procedure on US-wide IRS data, where the true top .001% is observable, shows that extrapolating from the $10M+ bracket overshoots by roughly 10%. The replication uses a year-specific overshoot factor \(\pi_t\) stored in row 72 of the workbook’s Memo 1 block.

Wealth-versus-income discount (\(\theta = 0.5\)). Top wealth-holders are not top income-earners; the IRS-matched evidence of Balkir et al. (2025) and the Survey of Consumer Finances both find that top wealth-holders’ incomes are about 50% of top income-earners’ incomes.

Charity/federal-rate consistency correction (\(D_{99} \approx 0.92\)). A second cross-check (Memo 2 of the workbook) confirms that top wealth-holders pay roughly 8% lower federal income tax rates than top income-earners because of charitable giving. This produces a scalar \(D_{99}\) recomputed each run as the ratio of Memo 2’s implied federal rate to Memo 1’s 2018-2020 average federal rate.

Headline formula. Putting it together, \[ T_N^{\text{CA-bn}} \;=\; \theta \;\cdot\; D_{99} \;\cdot\; (1 - \pi_t) \;\cdot\; \tau_t \;\cdot\; N \cdot \hat{A}_N, \] where \(N \cdot \hat{A}_N\) is the projected total AGI of the top \(N\) filers. In workbook terms this is row 49 = row 44 × row 46 × row 47. The 2024 and 2025 values use a different identity, because no FTB bracket tabulation exists yet for those years: an average \(\pi_t\) from prior years is extrapolated, and the \(T_N\) value is implied by the row-50 share of the FTB row-18 total CA income tax.
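The chain of corrections can be written as one short function. This is a sketch under stated assumptions, not the body of compute_billionaires_ca_inctax(): headline_ca_tax() is a hypothetical name, the defaults \(\theta = 0.5\) and \(D_{99} = 0.92\) are the approximate values quoted above, and the AGI, tax-rate, and overshoot inputs in the example call are made up.

```r
# Corrected top-N CA income tax from a projected total AGI
headline_ca_tax <- function(total_agi_top_n, tau, pi_overshoot,
                            theta = 0.5, d99 = 0.92) {
  uncorrected <- tau * total_agi_top_n                # Pareto-projected tax, before corrections
  theta * d99 * (1 - pi_overshoot) * uncorrected      # wealth discount, charity, overshoot
}

# Illustrative inputs only: $50B projected AGI, 12% effective rate, 10% overshoot
example_tax <- headline_ca_tax(total_agi_top_n = 50, tau = 0.12, pi_overshoot = 0.10)
example_tax
```

Because the three corrections are multiplicative scalars, their order does not matter; together they scale the uncorrected projection by about 0.41 in this example.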

The Pareto projection lives in .bci_top_brackets() (rows 26-44 of the workbook); the headline pipeline (rows 49-55, including the \(\theta\) and \(D_{99}\) multiplications) is the second half of compute_billionaires_ca_inctax(). Table 2 itself is rendered by build_tab2() from the year-aggregate compute_data_sec_agg(), the headline compute_billionaires_ca_inctax(), and the per-billionaire panel from extract_data_sec_top4().

Code
build_tab2(data_sec_agg_r, billionaires_ca_inctax_r, data_sec_top4)
# Read-only display. _targets.R wires this into the pipeline.
Table 2. California Income Tax Paid by California Billionaires
Column groups: "All" = All California Billionaires; "Top 4" = Top 4 (Page, Brin, Zuckerberg, Huang) on Company Wealth.

| Year | All: Wealth ($B) | All: Est. CA inc. tax ($B) | All: Tax / wealth | Top 4: Company wealth ($B) | Top 4: Est. CA inc. tax ($B) | Top 4: Tax / wealth |
|---|---|---|---|---|---|---|
| 2019 | 821 | 1.99 | 0.242% | 184 | 0.03 | 0.018% |
| 2020 | 967 | 2.77 | 0.286% | 247 | 0.06 | 0.022% |
| 2021 | 1,212 | 4.50 | 0.371% | 365 | 0.77 | 0.210% |
| 2022 | 843 | 2.33 | 0.276% | 188 | 0.27 | 0.142% |
| 2023 | 1,155 | 2.45 | 0.212% | 377 | 0.04 | 0.010% |
| 2024 | 1,567 | 3.04 | 0.194% | 603 | 0.36 | 0.059% |
| 2025 | 2,052 | 4.14 | 0.202% | 856 | 0.36 | 0.042% |
| 2019-2025 average | 1,231 | 3.03 | 0.264% | 403 | 0.27 | 0.067% |
Notes: All amounts in nominal $B. CA income tax for all billionaires comes from the Method I FTB-extrapolation calculation; top 4 figures come from SEC filings. Source: billionaires_ca_inctax_r + data_sec_top4.

Figure 3 depicts the estimated annual California individual income tax paid by all California billionaires and the top 4 (in nominal $ billions) each year from 2019 to 2025. (BSZ 2026, p. 7)

Methodology. A grouped bar chart of the Table 2 columns. The “all CA billionaires” bars are the Pareto-extrapolated \(T_N^{\text{CA-bn}}\) (see Table 2 methodology); the “Top 4” bars are direct SEC-derived totals \(\sum_{i \in \text{top4}} t_{i,t}\), where each \(t_{i,t}\) is the CA income tax implied by individual \(i\)’s SEC Form 4 stock sales, exercised stock options, dividends, and executive compensation that year (full per-billionaire breakdown is in Table 3). Panel B normalizes Panel A by year-end wealth so the visual scale is comparable across years despite the rapid wealth growth.

Figure 3 is rendered by build_fig3() from compute_shortrunseries().

Code
build_fig3(shortrunseries_r)
# Read-only display. _targets.R wires this into the pipeline.

2.2 Per-billionaire detail (Top 4)

Table 3 reports statistics for each of the top 4. (BSZ 2026, p. 7)

Methodology. Per-billionaire CA income tax, computed directly from SEC Form 4 line-item data (no extrapolation). For each individual \(i\) and year \(t\), fiscal income is \[ y_{i,t} \;=\; \underbrace{\text{salary} + \text{stock-option exercises} + \text{other comp}}_{\text{ordinary income}} \;+\; \text{dividends} \;+\; \text{realized cap.\ gains}_{i,t}, \] where realized capital gains are calculated transaction-by-transaction by matching every Form 4 sale to its highest-cost-basis acquisition (the HIFO rule the paper assumes; Appendix C.3.e). The CA tax is then the top-bracket statutory rate applied to taxable income, less the standard deduction. The “Total CA income tax / wealth gain” row at the bottom divides each individual’s 2019-2025 cumulative tax by the same individual’s \(W_{i,2025} - W_{i,2019}\), where wealth is share-count × price-per-share at the relevant year-end.
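The HIFO matching described above can be sketched for a single sale as follows. Everything here is hypothetical: the hifo_gain() function, the lots data frame, and the numbers. The paper's transaction-level accumulation is not part of this replication and may differ in detail, for example in how it treats option exercises or multiple sales within a year.

```r
# Realized gain on one sale, matching highest-cost-basis lots first (HIFO)
hifo_gain <- function(lots, shares_sold, sale_price) {
  lots <- lots[order(-lots$basis), ]   # highest basis first
  remaining <- shares_sold
  gain <- 0
  for (i in seq_len(nrow(lots))) {
    if (remaining <= 0) break
    take <- min(lots$shares[i], remaining)            # shares drawn from this lot
    gain <- gain + take * (sale_price - lots$basis[i])
    lots$shares[i] <- lots$shares[i] - take
    remaining <- remaining - take
  }
  if (remaining > 0) stop("sale exceeds shares held")
  list(gain = gain, lots = lots)
}

# Made-up acquisition lots: shares held and per-share cost basis
lots <- data.frame(shares = c(100, 50, 80), basis = c(10, 40, 25))
res <- hifo_gain(lots, shares_sold = 100, sale_price = 50)
res$gain
```

In this example the sale consumes the 50 shares with basis 40 and then 50 of the shares with basis 25, so the realized gain is 50 × 10 + 50 × 25 = 1,750. Matching the highest-basis lots first minimizes the realized gain for a given sale.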

Table 3 is rendered by build_tab3() from the per-billionaire panel produced upstream by SEC Form 4 ingest in extract_data_sec_top4(). The transaction-level capital-gains accumulation runs outside this R replication (it is in the paper’s separate top-4 pipeline); this report reads the cached year-billionaire-level result.

Code
build_tab3(data_sec_top4)
# Read-only display. _targets.R wires this into the pipeline.
Table 3. California Income Tax Paid by the Top 4 on Company Wealth
Larry Page (Alphabet) Sergey Brin (Alphabet) Mark Zuckerberg (Meta) Jensen Huang (Nvidia) All top 4
CA income tax 2019 0.00 0.00 26.40 6.19 32.59
CA income tax 2020 0.00 0.00 34.89 20.29 55.18
CA income tax 2021 240.55 167.59 298.58 58.88 765.60
CA income tax 2022 120.26 83.93 2.52 60.50 267.21
CA income tax 2023 0.00 0.00 11.26 27.85 39.11
CA income tax 2024 82.92 54.15 165.99 54.85 357.91
CA income tax 2025 87.07 56.25 125.98 89.39 358.69
Average CA income tax 2019-2025 75.83 51.70 95.09 45.42 268.04
Wealth at the beginning of 2019 41,576 40,085 47,768 2,857 132,286
Wealth at end of 2025 244,240 225,401 225,635 160,228 855,503
Total CA income tax / wealth gain 2019-2025 0.262% 0.195% 0.374% 0.202% 0.259%
Notes: All amounts in nominal $M unless noted. CA income tax is computed from each billionaire’s SEC Form 4 filings (target data_sec_top4). Row 11 is the cumulative 2019-2025 CA income tax as a share of the 2019-2025 wealth gain.
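The last row of Table 3 can be recomputed from the rows above it. The sketch below types in the displayed (rounded) values by hand rather than reading data_sec_top4, so the object names are illustrative only.

```r
# CA income tax by year ($M), 2019-2025, as displayed in Table 3
ca_tax <- rbind(
  Page       = c(0.00, 0.00, 240.55, 120.26, 0.00, 82.92, 87.07),
  Brin       = c(0.00, 0.00, 167.59, 83.93, 0.00, 54.15, 56.25),
  Zuckerberg = c(26.40, 34.89, 298.58, 2.52, 11.26, 165.99, 125.98),
  Huang      = c(6.19, 20.29, 58.88, 60.50, 27.85, 54.85, 89.39)
)
colnames(ca_tax) <- 2019:2025

# Wealth ($M) at the beginning of 2019 and at the end of 2025
wealth_start <- c(Page = 41576, Brin = 40085, Zuckerberg = 47768, Huang = 2857)
wealth_end   <- c(Page = 244240, Brin = 225401, Zuckerberg = 225635, Huang = 160228)

# Cumulative tax divided by the wealth gain over the same period
tax_over_gain <- rowSums(ca_tax) / (wealth_end - wealth_start)
round(100 * tax_over_gain, 3)
```

This reproduces the displayed 0.262%, 0.195%, 0.374%, and 0.202% for the four individuals.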

Table 4. First, their companies pay low dividends relative to the corporate net profits they make. […] Second, they realize modest capital gains relative to the enormous wealth gains they are making. (BSZ 2026, p. 10)

Methodology. Table 4 aggregates the Table 3 individual stream into a single “Top 4” total across 2019-2025 and decomposes total taxes \(T^{\text{top4}}\) into individual-income + corporate + other components, so the reader can see why the top-4 effective tax rate ends up so low. Each line is built bottom-up from SEC + Compustat.

Fiscal individual income is the sum of stock-option exercises, dividends, and realized capital gains (per Table 3). Corporate profits are pro-rated by ownership share \(s_{i,t}\): \[ \Pi_{i,t} \;=\; s_{i,t} \cdot \Pi_{\text{firm},t}, \] with corporate taxes (federal + state + foreign) pro-rated the same way and divided by \(\Pi\) to back out the effective corporate rate. Other taxes (property + sales) are imputed per Appendix B: \[ \text{prop tax}_{i,t} \;=\; 0.01 \cdot \text{P\&PE}_{i,t}, \quad \text{sales tax}_{i,t} \;=\; 0.03 \cdot 0.5 \cdot \left(\text{AGI}_{i,t} - \text{tax}_{i,t}^{\text{CA}} - \text{tax}_{i,t}^{\text{Fed}} - 0.25 \cdot \text{AGI}_{i,t}\right). \] The two summary tax-rate rows at the bottom are \(T^{\text{top4}} / Y^{\text{top4}}\) (vs. economic income) and \(T^{\text{top4}} / (W_{2025} - W_{2019})\) (vs. wealth gains).
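The pro-rating and the two imputations above translate directly into code. This is a sketch with hypothetical helper names and made-up inputs; the coefficients 0.01, 0.03, 0.5, and 0.25 are copied from the formulas above, and their interpretation is left to the paper's Appendix B.

```r
# Corporate profit attributed to an owner with ownership share `share`
prorated_profit <- function(share, firm_profit) share * firm_profit

# Imputed property tax: 1% of the P&PE term in the formula above
property_tax <- function(ppe) 0.01 * ppe

# Imputed sales tax: 0.03 * 0.5 * (AGI - CA tax - federal tax - 0.25 * AGI)
sales_tax <- function(agi, tax_ca, tax_fed) {
  0.03 * 0.5 * (agi - tax_ca - tax_fed - 0.25 * agi)
}

# Made-up inputs (any consistent currency unit)
prorated_profit(share = 0.10, firm_profit = 500)
property_tax(ppe = 200)
sales_tax(agi = 100, tax_ca = 10, tax_fed = 30)
```

With these inputs the sales-tax base is 100 − 10 − 30 − 25 = 35, so the imputed sales tax is 0.015 × 35 = 0.525.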

Table 4 is rendered by build_tab4() from extract_data_sec_top4(); the corporate-profit proration and property/sales imputation also live in the compute_top4taxes() downstream panel used by Figures 6 and 7.

Code
build_tab4(data_sec_top4)
# Read-only display. _targets.R wires this into the pipeline.
Table 4. Wealth, Income, and Taxes of the Top 4, 2019-2025
Total 2019-2025 ($B) Annual average ($B)
Wealth in 2019 (beginning of year) 132.29
Wealth in 2025 (end of year) 855.50
Gain in wealth during 2019-2025 723.22 103.32
Wealth (average over 2019-2025) 402.86
Fiscal individual income 18.44 2.63
Stock-options exercise + non-equity comp 2.32 0.33
Dividends 4.00 0.57
Realized capital gains 11.80 1.69
Memo: Appreciated stock donated to charity 17.37 2.48
Memo: Net collateral pledged 3.14 0.45
Federal individual income tax 3.62 0.52
California individual income tax 1.88 0.27
Individual taxes / individual income 29.8% 29.8%
Corporate profits 128.41 18.34
Corporate taxes (federal) 21.32 3.05
Corporate tax rate (effective) 16.6% 16.6%
Notes: All amounts in nominal $B. Source: per-billionaire SEC Form 4 filings aggregated as data_sec_top4’s “Total (excluding Ellison)” rows.

2.3 Total tax burden relative to wealth

Figure 4 depicts the estimated total annual taxes paid by all California billionaires and the composition by tax type from 2019 to 2025 relative to end of year wealth. Federal and California individual income taxes are estimated using Franchise Tax Board tabulated statistics, a simple Pareto extrapolation, and an adjustment based on Federal tax data. (BSZ 2026, p. 8)

Methodology. A stacked-area chart of total annual taxes paid by all CA billionaires, decomposed into four components, each expressed as a percent of end-of-year wealth \(W_t\):

  • \(\tau^{\text{CA-inctax}}_t / W_t\) — from the Table 2 Pareto extrapolation
  • \(\tau^{\text{Fed-inctax}}_t / W_t\) — symmetric Pareto procedure on federal-tax rates
  • \(\tau^{\text{corp}}_t / W_t\) — Compustat-derived effective corporate rate applied to pro-rated profits, scaled to all CA billionaires using the public-stock share
  • \(\tau^{\text{prop+sales}}_t / W_t\) — imputed property + sales taxes per Appendix B (formulas in Table 4 methodology)

The scaling from “top 4 / public-stock wealth” up to “all CA billionaires / all wealth” uses the BSZ Saez-Zucman national-accounts weights to split private wealth into pass-through (\(46.8 + 25\)) and private-C-corp (\(61\)) shares: \[ s^{\text{pass}} \;=\; (1 - s^{\text{pub}} - 0.11 \cdot s^{\text{pub}}) \cdot \frac{46.8 + 25}{46.8 + 25 + 61}, \quad s^{\text{priv-C}} \;=\; (1 - s^{\text{pub}} - 0.11 \cdot s^{\text{pub}}) \cdot \frac{61}{46.8 + 25 + 61}. \] Imputed corporate tax on private-C wealth is then \(\tau^{\text{corp,pub}}_t \cdot (s^{\text{priv-C}}/s^{\text{pub}})\), plus an 11% gross-up on diversified non-public holdings, plus the property-tax-to-corporate-tax ratio applied to the imputed corporate base.
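The share split above can be sketched as follows. The function name split_private_wealth() is hypothetical (the pipeline's version is in .bci_all_taxes()), and the example uses the 2025 public-stock share of 66.9% displayed in Table 1 purely as an illustrative input.

```r
# Split non-public wealth into pass-through and private C-corp shares
split_private_wealth <- function(s_pub, w_pass = 46.8 + 25, w_priv_c = 61,
                                 gross_up = 0.11) {
  residual <- 1 - s_pub - gross_up * s_pub   # wealth share left after public stock and gross-up
  total_w <- w_pass + w_priv_c
  c(pass_through = residual * w_pass / total_w,
    private_c    = residual * w_priv_c / total_w)
}

shares <- split_private_wealth(s_pub = 0.669)
round(shares, 3)
```

By construction the two shares sum to the residual \(1 - 1.11 \cdot s^{\text{pub}}\), which gives a quick consistency check on any year's split.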

Figure 4 is rendered by build_fig4() from the all_taxes block of compute_billionaires_ca_inctax(); the wealth-share decomposition is in .bci_all_taxes(), rows 110-153 of the workbook.

Code
build_fig4(billionaires_ca_inctax_r)
# Read-only display. _targets.R wires this into the pipeline.

Figure 5 illustrates the statistics from Table 4 on income and taxes of the top 4 richest Californians in the period 2019-2025 on their company wealth (which is 97% of their total wealth at the end of 2025). (BSZ 2026, p. 11)

Methodology. A two-panel visualization of Table 4. In the top panel, the bar heights are three different denominators for top-4 tax rates over 2019-2025:

  1. Fiscal individual income \(Y^{\text{fisc}} = \sum_t \sum_i y_{i,t}\)
  2. Economic income \(Y^{\text{econ}} = Y^{\text{fisc}} + \sum_t \sum_i (\Pi_{i,t} - \text{div}_{i,t})\) (fiscal income plus undistributed corporate profits, avoiding the double-counting of distributed dividends)
  3. Wealth gain \(\Delta W = W_{2025} - W_{2019}\)

Tax bars in light blue use the matching numerator: CA + Fed individual tax for the first denominator, total taxes for the latter two. The displayed percentages on the bars are exactly the effective rates \(T / Y^{\text{fisc}}\), \(T^{\text{total}} / Y^{\text{econ}}\), \(T^{\text{total}} / \Delta W\) from Table 4.
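The denominators can be rebuilt from the 2019-2025 totals displayed in Table 4. The sketch below assumes that the table's "Corporate profits" row is already the ownership-pro-rated amount, and it computes only the individual-income-tax rate, because the total-tax numerator needs components not shown in the table. The list and variable names are illustrative.

```r
# 2019-2025 totals for the top 4 ($B), as displayed in Table 4
top4 <- list(
  fiscal_income = 18.44, dividends = 4.00, corp_profits = 128.41,
  fed_inc_tax = 3.62, ca_inc_tax = 1.88,
  wealth_start = 132.29, wealth_end = 855.50
)

# Economic income: fiscal income plus undistributed (pro-rated) corporate profits
econ_income <- top4$fiscal_income + (top4$corp_profits - top4$dividends)

# Wealth gain over the period
wealth_gain <- top4$wealth_end - top4$wealth_start

# Individual income taxes relative to fiscal income
indiv_rate <- (top4$fed_inc_tax + top4$ca_inc_tax) / top4$fiscal_income

c(econ_income = econ_income, wealth_gain = wealth_gain,
  indiv_rate_pct = round(100 * indiv_rate, 1))
```

The individual rate rounds to the 29.8% shown in Table 4, while economic income (about $143B) and the wealth gain (about $723B) are far larger denominators than fiscal income ($18.44B).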

Figure 5 is rendered by build_fig5() from extract_data_sec_top4().

Code
build_fig5(data_sec_top4)
# Read-only display. _targets.R wires this into the pipeline.

Figure 6 depicts the estimated taxes paid by the top 4 each year from 2004 to 2025 relative to their wealth (top panel) and relative to their full economic income (bottom panel). (BSZ 2026, p. 13)

Methodology. A 22-year extension of the Figure 5 top panel back to 2004 (the earliest year SEC Form 4 filings are available in machine-readable format). The numerator is total taxes paid (CA + Fed + corp + prop + sales, all pro-rated to the top 4’s ownership shares); the denominators are wealth (top sub-panel) and economic income (bottom sub-panel) of the top 4. Larry Ellison is in the series through 2020 and Jensen Huang replaces him from 2021 onward — the paper justifies this swap on residence (Ellison moved to Hawaii in late 2020) and concentration (Huang’s wealth only became top-4-sized around 2021).

Figure 6 is rendered by build_fig6() from the 22-year panel produced by compute_top4taxes(); the per-year corporate-tax pro-ration and property/sales imputation use the same Appendix B formulas documented under Table 4.

Code
build_fig6(top4taxes_r)
# Read-only display. _targets.R wires this into the pipeline.

Figure 7 compares taxes paid relative to economic income for the top 4 vs. the average economy wide. […] The macroeconomic average effective tax rate for the US population is computed using the updated Distributional National Accounts series. (BSZ 2026, p. 15)

Methodology. Two-panel comparison of top-4 effective tax rates against two macro benchmarks. Panel A (total taxes / economic income) compares \(T^{\text{top4}}_t / Y^{\text{econ,top4}}_t\) (built as in Table 4 and Figure 6) against the US-wide Distributional National Accounts (DINA) macro average, \[ \bar{\tau}_t^{\text{US-DINA}} \;=\; \frac{T_t^{\text{US, all taxes, all govt}}} {Y_t^{\text{US, econ income}}}, \] where the DINA numerator and denominator are the totals across all tax types and levels of government from Piketty-Saez-Zucman (2018) / Saez-Zucman (2023). Panel B (CA individual income tax / economic income) compares the top-4 \(\tau^{\text{CA,top4}}_t / Y^{\text{econ,top4}}_t\) against a CA-wide average constructed as \[ \bar{\tau}_t^{\text{CA}} \;=\; \frac{T_t^{\text{CA-inctax,resid}}} {\text{AGI}_t^{\text{CA-resid}} \cdot \big(\text{econ-inc}^{\text{US-DINA}}_t / \text{AGI}^{\text{US-IRS}}_t\big)}, \] i.e. CA AGI is rescaled to economic income using the US-wide DINA-to-AGI ratio (the paper notes this assumes the CA economic-to-AGI ratio matches the US-wide ratio, BSZ p. 48).

Figure 7 is rendered by build_fig7() from compute_top4taxes() and the DINA tables loaded by extract_data_dina().

Code
build_fig7(top4taxes_r, data_dina)
# Read-only display. _targets.R wires this into the pipeline.

3. The Economics of Taxing California Billionaires’ Wealth

Section 3 scores two policy options: a one-time 5% wealth tax payable over five years, and a permanent annual wealth tax.

3.1 One-time wealth tax

Table 5 reports a column of this extra CA income tax generated assuming that wealth taxes will be paid one-third by selling assets triggering taxable realized capital gains (with a basis of 20% of selling price consistent with our earlier estimation that California billionaire wealth is 80% unrealized gains) and two-thirds with liquidities or debt. (BSZ 2026, p. 22)

Methodology. Four scoring scenarios for the proposed 5% one-time wealth tax. Each row has the same seven-column structure: number of taxable billionaires, gross wealth \(W\), taxable wealth \(W^{\text{tax}}\) after avoidance, total avoidance/evasion rate \(\alpha\), wealth-tax revenue \(T^{\text{wealth}}\), extra income tax from triggered capital-gains sales \(T^{\text{extra}}\), and the annual income-tax loss \(L\) from leavers.

The benchmark formulas are \[ W^{\text{tax}} \;=\; W \cdot (1 - \alpha), \qquad T^{\text{wealth}} \;=\; 0.05 \cdot W^{\text{tax}} \] (with the benchmark \(\alpha = 10\%\) this is \(0.90 \cdot 0.05 \cdot W\), which matches the Table 5 benchmark row; the linear phase-in between $1B and $1.1B taxable net worth, BSZ fn. 11, only matters for billionaires near the threshold). Extra income tax from sales assumes billionaires fund 1/3 of the tax bill via asset sales with a 20% basis (so 80% of each sale is realized capital gains taxed at the 13.3% top CA rate): \[ T^{\text{extra}} \;=\; \tfrac{1}{3} \cdot T^{\text{wealth}} \cdot 0.80 \cdot 0.133. \] Income-tax loss from leavers assumes half of total avoidance is mobility (\(\alpha^{\text{mob}} = \alpha / 2\)) and that those leavers stop paying the average CA inctax billionaires pay: \[ L \;=\; -\alpha^{\text{mob}} \cdot \bar{T}^{\text{CA-bn}}_{2019\text{-}2025}. \] Scenario 2 adds the Pareto-extrapolated “missing small billionaires” (\(\$1B\)-\(\$4.5B\), where Forbes’ empirical \(b\) rises sharply, see Figure A4) to both the count and wealth: the implied “true” \(b = 4.48\) above \(\$4.5B\) is held constant down to \(\$1B\), yielding \(n^{\text{extra}}\) extra billionaires with \(\$615\)B extra wealth, at a higher 20% evasion rate for those less-visible cases. Scenario 3 removes specific named leavers (Page, Thiel, Hankey, Kalanick + the future-mover income-tax loss from Brin, Zuckerberg, Fang) from the benchmark; scenario 4 stacks 2 + 3.
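The benchmark row of Table 5 can be approximated with a few lines of arithmetic. This is a sketch, not compute_tab5(): the function name and arguments are hypothetical, revenue is computed as 5% of taxable wealth (the rule stated in the Table 5 notes), and the inputs are the displayed benchmark wealth ($2,182B), the 10% avoidance rate, and the 2019-2025 average CA income tax from Table 2 ($3.03B).

```r
# Score a one-time wealth tax for one scenario (all amounts in $B)
score_one_time_tax <- function(wealth, avoidance, avg_ca_inctax,
                               rate = 0.05, sale_share = 1 / 3,
                               gain_share = 0.80, ca_top_rate = 0.133) {
  taxable <- wealth * (1 - avoidance)                           # W_tax
  revenue <- rate * taxable                                     # wealth-tax revenue
  extra   <- sale_share * revenue * gain_share * ca_top_rate    # CA tax on triggered gains
  loss    <- -(avoidance / 2) * avg_ca_inctax                   # half of avoidance is mobility
  c(taxable_wealth = taxable, wealth_tax_revenue = revenue,
    extra_ca_inctax = extra, annual_inctax_loss = loss)
}

benchmark <- score_one_time_tax(wealth = 2182, avoidance = 0.10,
                                avg_ca_inctax = 3.03)
round(benchmark, 1)
```

Rounded to one decimal, this gives 1,963.8, 98.2, 3.5, and −0.2, in line with the benchmark row. The other scenarios involve scenario-specific adjustments (extra small billionaires, named leavers) that this simple function does not model.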

Table 5 is rendered by build_tab5() from the scoring tibble produced by compute_tab5(), which in turn pulls Pareto summary stats from compute_pareto_missing() plus the averages rows of extract_tab2() and extract_tab3().

Code
tab5_r <- compute_tab5(pareto_missing_r, tab2, tab3)
build_tab5(tab5_r)
# Read-only display. _targets.R wires these into the pipeline.
Table 5. Scoring the One-Time 5% California Wealth Tax
| Scenario | # CA billionaires | Wealth ($B) | Taxable wealth ($B) | Avoidance rate | Wealth tax revenue ($B) | Extra CA inctax from sales ($B) | Annual CA inctax loss ($B) |
|---|---|---|---|---|---|---|---|
| 1. Benchmark: Forbes estimates + 10% avoidance | 249.0 | 2,182.0 | 1,963.8 | 10.0% | 98.2 | 3.5 | −0.2 |
| 2. Adding missing small billionaires (Pareto extrapolation) | 617.1 | 2,797.1 | 2,455.9 | 12.2% | 120.9 | 4.3 | −0.2 |
| 3. Aggressive assumptions for pre/post-2026 leavers | 249.0 | 2,182.0 | 1,678.9 | 23.1% | 83.9 | 3.0 | −0.5 |
| 4. Benchmark with both adding small billionaires and aggressive leavers | 617.1 | 2,797.1 | 2,170.9 | 22.4% | 108.5 | 3.8 | −0.6 |
Notes: Scenarios assume a one-time 5% wealth tax. Wealth tax revenue = taxable_wealth × 5%. Extra CA income tax from sales reflects forced asset sales to pay the tax (33% realization × 80% LTCG-taxable × 13.3% CA rate). Annual CA inctax loss is the steady-state revenue forgone from billionaires leaving California. Source: tab5_r.

3.2 Permanent Billionaire Wealth Tax

Figure 8 depicts the revenue collected from billionaires from a permanent annual wealth tax as a function of the tax rate (on the x-axis) in the solid red line. This tax revenue estimate takes into account income and wealth tax revenue lost because of mobility responses calibrated using the empirical literature. (BSZ 2026, p. 25)

Methodology. A Laffer-curve sweep of permanent annual wealth-tax revenue as a function of the tax rate \(\tau\). Three curves on the same panel:

No-behavioral-response benchmark (dashed line): \[ T^{\text{naive}}(\tau) \;=\; \tau \cdot W_0, \] where \(W_0\) is the current CA billionaire wealth base.

With mobility response only (solid line). With constant semi-elasticity \(e\), the tax base shrinks exponentially in the tax rate: \(Z(\tau) = W_0 \cdot \exp(-e \cdot (\tau + \tau_0))\), where \(\tau_0 \approx 0.2\%\) is the existing CA-inctax-as-share-of-wealth that a mover would also avoid by leaving. Total revenue including the existing income tax is therefore \[ T^{\text{mob}}(\tau) \;=\; (\tau + \tau_0) \cdot W_0 \cdot \exp(-e \cdot (\tau + \tau_0)) \] minus the forgone baseline \(\tau_0 \cdot W_0\). The revenue-maximizing rate is \(\tau^* = 1/e - \tau_0\) (the textbook Laffer apex shifted by the pre-existing tax); with \(e = 10\) this gives \(\tau^* = 9.8\%\).

With de-concentration too (third line). Repeated annual taxation mechanically erodes the base over time: \[ Z(\tau) \;=\; W_0 \cdot (1 - \tau)^d \cdot \exp(-e \cdot (\tau + \tau_0)), \] where \(d \approx 15.2\) is the wealth-weighted average # years current billionaires have been on the Forbes list (a proxy for “how long has the tax been compounding for the average billionaire”). The revenue-maximizing rate becomes \(\tau^* = 1/(1 + d + e) \approx 3.8\%\) (BSZ p. 25).
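The revenue curves above can be explored numerically with a small grid sweep. This is a sketch and not compute_fig8_laffer(): the function name, the 0-20% grid of 201 rates, and the use of the 2025 wealth total from Table 1 as \(W_0\) are assumptions made for illustration. Setting d = 0 gives the mobility-only curve.

```r
# Net wealth-tax revenue at rate tau, as written in the formulas above
laffer_revenue <- function(tau, w0, e = 10, tau0 = 0.002, d = 0) {
  base <- w0 * (1 - tau)^d * exp(-e * (tau + tau0))  # eroded tax base Z(tau)
  (tau + tau0) * base - tau0 * w0                    # minus the forgone baseline
}

rates <- seq(0, 0.20, length.out = 201)  # assumed grid: 0% to 20% in 0.1pp steps
w0 <- 2052                               # $B, 2025 total from Table 1 (illustrative)

rev_naive    <- rates * w0
rev_mobility <- laffer_revenue(rates, w0)
rev_deconc   <- laffer_revenue(rates, w0, d = 15.2)

# Grid maximizers, to compare with the closed-form rates quoted in the text
tau_star_mobility <- rates[which.max(rev_mobility)]
tau_star_deconc   <- rates[which.max(rev_deconc)]
c(mobility = tau_star_mobility, deconcentration = tau_star_deconc)
```

The mobility-only maximizer lands on the closed-form \(1/e - \tau_0 = 9.8\%\). The de-concentration maximizer on this grid is close to, though not necessarily identical to, the \(1/(1 + d + e)\) expression quoted above, since a grid search does not rely on that closed form.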

Figure 8 is rendered by build_fig8() from the 201-rate Laffer sweep produced by compute_fig8_laffer().

Code
build_fig8(fig8_laffer_r)
# Read-only display. _targets.R wires this into the pipeline.

Appendix

A.1 US wealth growth comparison

Table A1 provides the same statistics at the US level (instead of California). (BSZ 2026, p. 3)

Methodology. US-wide twin of Table 1. Same formulas as Table 1, applied to the full US Forbes 400 sample rather than the CA subset, with two mechanical differences: (i) Panel A’s “# billionaires” uses the Forbes real-time US-citizen count (not CA-resident), and (ii) Panel B’s “top .0002%” is dynamically resized to the larger US population (about 400 families in 2025 vs. 45 in CA). Annualized growth rates use the same geometric average \(\bar{g} = (X_{2025}/X_{1982})^{1/43} - 1\).

Table A1 is rendered by build_tab_a1() from the positional dumps extract_shortrunseries() (Panel A’s # US-citizen-billionaire count and total wealth) and extract_longrunseries() (Panel B’s 1982 vs 2025 long-run real growth).

Code
build_tab_a1(shortrunseries, longrunseries)
# Read-only display. _targets.R wires this into the pipeline.
Appendix Table A1. Wealth Growth of US Billionaires

A. Recent nominal wealth growth of US billionaires
Year | # billionaires | Wealth | Annual growth
2022 | 717 | 4,358 |
2023 | 746 | 5,247 | 20.4%
2024 | 813 | 6,723 | 28.1%
2025 | 938 | 8,189 | 21.8%
Growth during 3 years (2023-2025) | | 1 |

B. Long-term real wealth growth: top .0002% wealthiest US families (2025 $)
Year | # families | Wealth | Wealth per family | # US families (M) | US GDP | GDP / family ($)
1982 | 221 | 240 | 108.7% | 111 | 9,946 | 89,995
2025 | 386 | 6,537 | 1,693.1% | 193 | 30,644 | 158,726
Ratio 2025 to 1982 | 2 | 27 | 1,558.0% | 2 | 3 | 2
Annualized growth | 0 | 0 | 6.6% | 0 | 0 | 0

Notes: Repeats Table 1 for US billionaires instead of CA. Panel A: nominal wealth of all US citizen billionaires (Forbes). Panel B: 1982 vs 2025 real wealth of top .0002% US families, in 2025 dollars.
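The Panel A growth column can be re-derived from the wealth column with the same lagged-ratio idiom that build_tab1() uses for the California table. The sketch below is a check on the displayed figures only; the variable names are illustrative.

```r
# Panel A wealth of US billionaires by year, as displayed in Table A1.
wealth <- c(`2022` = 4358, `2023` = 5247, `2024` = 6723, `2025` = 8189)
n <- length(wealth)

# Year-over-year growth: each year divided by the previous year, minus 1.
annual_growth <- wealth[-1] / wealth[-n] - 1

# Cumulative growth over the three years 2023-2025 (2025 relative to 2022).
cumulative_growth <- wealth[[n]] / wealth[[1]] - 1

round(100 * annual_growth, 1)       # percent, by year
round(100 * cumulative_growth, 1)   # percent, 2022 to 2025
```

The annual rates should match the 20.4%, 28.1%, and 21.8% shown above. The cumulative figure comes out near 88%, which appears consistent with the value displayed as a rounded 1 in the three-year growth row.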

A.2 Industry composition

Figure A1 depicts the industry composition of California billionaires’ wealth with a breakdown between SEC reported publicly traded stock vs. other for each industry. (BSZ 2026, p. 5)

Methodology. A horizontal stacked bar chart of CA billionaire wealth by industry as of 2026/01/01, decomposed into SEC-reported public-stock wealth vs. other. Industries are sorted by total wealth. Each bar is \[ W^{\text{industry } j} \;=\; \underbrace{W_j^{\text{pub}}}_{\text{from SEC Form 13D/G + 14A}} \;+\; \underbrace{W_j^{\text{other}}}_{\text{Forbes total} - W_j^{\text{pub}}}. \] The Forbes industry tag comes from the source spreadsheet’s rtb_2026_industry sheet — read at figure-render time rather than via the {targets} cache because the data block is small and stable.
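The decomposition behind each bar can be sketched in a few lines of base R. The industry names and dollar amounts below are placeholders chosen for illustration, not values from the rtb_2026_industry sheet.

```r
# Placeholder industry totals ($B); these are NOT workbook values.
industry <- data.frame(
  industry     = c("Finance", "Technology", "Real estate"),
  forbes_total = c(40, 100, 25),   # Forbes total wealth per industry
  w_pub        = c(10, 80, 5)      # SEC-reported public-stock wealth
)

# "Other" wealth is the residual: Forbes total minus SEC-reported public stock.
industry$w_other <- industry$forbes_total - industry$w_pub

# Sort industries by total wealth, largest first, as in the figure.
industry <- industry[order(industry$forbes_total, decreasing = TRUE), ]

# The two stacked segments should add back up to the Forbes total.
stopifnot(all(industry$w_pub + industry$w_other == industry$forbes_total))
industry
```

Because the "other" segment is a residual, any industry where SEC-reported holdings exceed the Forbes total would show up as a negative segment, which is worth checking before plotting.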

Figure A1 is rendered by build_fig_a1(). It reads block 2 of the rtb_2026_industry sheet directly (cols A, G, H, I, rows 22-34) at build time rather than from a pre-extracted target.

Code
build_fig_a1(xlsx_path)
# Read-only display. _targets.R wires this into the pipeline.

A.3 Robustness: alternative method for CA income tax

Appendix Figure A2 displays the California income tax paid by California billionaires as a percent of California wide state individual income tax revenue. (BSZ 2026, p. 7)

Methodology. Two ratios plotted against year, both with the Pareto-extrapolated \(T_N^{\text{CA-bn}}\) in the numerator and total CA income tax in the denominator: \[ \text{headline series}_t \;=\; \frac{T_N^{\text{CA-bn}}(t)}{T_t^{\text{CA-inctax-total}}}, \] where the denominator comes from FTB statewide aggregates (workbook row 18). This is a robustness check showing that the headline estimate sits in a plausible \(2\)-\(3\%\) range across years and is consistent with the upper end of the Balkir et al. (2025) direct IRS-match estimate for the US-wide top .0002%.

Figure A2 is rendered by build_fig_a2() from compute_shortrunseries() (the AA / AF columns express the headline and SEC-top-5 figures as a share of total CA income tax revenue).

Code
build_fig_a2(shortrunseries_r)
# Read-only display. _targets.R wires this into the pipeline.

A.4 Taxes on public-stock wealth

Figure A3-A produces the same statistics for wealth arising from publicly traded stock. (BSZ 2026, p. 36)

Methodology. Figure 4’s tax-burden decomposition restricted to wealth in the form of publicly traded stock — i.e. only the share that can be verified directly against SEC Form 13D/G + 14A filings rather than imputed from national-accounts weights. Public wealth in each year is \(W_t^{\text{pub}} = s^{\text{pub}}_t \cdot W_t\). Tax components are expressed as a share of \(W_t^{\text{pub}}\): \[ \text{rate}^{(k)}_t \;=\; \tau^{(k)}_t / W_t^{\text{pub}}, \quad k \in \{\text{CA inctax}, \text{Fed inctax}, \text{corp}, \text{prop+sales}\}. \] The numerators come from the same all_taxes block as Figure 4, but the private-wealth imputation rows are dropped (the public-restricted figures sit in workbook rows 117-135 versus the full decomposition in rows 144-153). The companion panel (sometimes denoted A3-B) repeats the exercise with economic income \(Y^{\text{econ,pub}}_t\) as denominator instead of wealth.
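The rate calculation reduces to dividing each tax component by public wealth. The sketch below uses placeholder inputs for a single year; none of the numbers come from the workbook, and the variable names are illustrative.

```r
# Placeholder inputs for one year ($B); these are NOT workbook values.
w_total <- 1000   # total wealth W_t
s_pub   <- 0.6    # share of wealth held as publicly traded stock, s_t^pub
w_pub   <- s_pub * w_total

# Tax components tau^(k)_t attributed to public wealth (placeholders).
taxes <- c(ca_inctax = 1.2, fed_inctax = 6, corp = 9, prop_sales = 0.6)

# Each component as a share of public wealth, plus the overall burden.
rates      <- taxes / w_pub
total_rate <- sum(rates)

round(100 * c(rates, total = total_rate), 2)   # percent of public wealth
```

Swapping w_pub for an economic-income denominator would give the companion panel's version of the same ratios.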

Figure A3 is rendered by build_fig_a3() from the all_taxes block of compute_billionaires_ca_inctax(), restricting to the public-asset rows (rows 117-135 of the source sheet).

Code
build_fig_a3(billionaires_ca_inctax_r)
# Read-only display. _targets.R wires this into the pipeline.

A.5 Pareto extrapolation for missing billionaires

Panel A plots the Pareto coefficient b at each wealth threshold (Forbes data + projected). Panel B compares the density of billionaires-per-bracket empirically vs. the Pareto-extrapolation that adds the small billionaires Forbes is likely missing below the $4.5B anchor.

Methodology. Panel A plots the empirical Pareto coefficient \(b\) as a function of wealth threshold, \[ b(t) \;=\; \frac{\text{mean wealth above } t}{t}, \] computed directly from the Forbes real-time list as of 4/15/2026. For thresholds above $4.5B (where Forbes’ coverage is essentially complete) \(b\) is roughly constant at about \(4.48\), the empirical regularity that a Pareto tail predicts. Below $4.5B, observed \(b\) rises sharply, a sign that Forbes is undercounting small billionaires.
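A minimal sketch of the Panel A calculation, assuming a plain vector of individual wealth levels: pareto_b() is a hypothetical helper and the wealth values are made up. It is equivalent to the C / (B * A) expression in compute_pareto_missing(), where C is total wealth above the threshold, B the count above it, and A the threshold.

```r
# Empirical Pareto coefficient: mean wealth at or above the threshold,
# divided by the threshold. pareto_b() is a hypothetical helper.
pareto_b <- function(wealth, threshold) {
  above <- wealth[wealth >= threshold]
  mean(above) / threshold
}

# Made-up individual wealth levels ($B); NOT Forbes data.
wealth     <- c(1.2, 2, 3, 5, 8, 20, 60)
thresholds <- c(1, 2, 5)

b <- vapply(thresholds, function(t) pareto_b(wealth, t), numeric(1))
data.frame(threshold_b = thresholds, pareto_b = b)
```

On real data the diagnostic is the shape of this curve: a flat stretch at high thresholds followed by a rise at low thresholds is the pattern the text attributes to undercounting.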

Panel B compares two bracket-level densities of billionaires: the raw Forbes count by bracket, versus a counterfactual that extends the constant-\(b\) Pareto tail from $4.5B down to $1B. Under the counterfactual, \[ n^{\text{Pareto}}(t) \;=\; n_{4.5} \cdot \left(\frac{t}{4.5}\right)^{-b/(b-1)} \quad \text{for } t < 4.5, \] implying about 368 extra “missing” billionaires (Forbes counts 249, the extrapolation says 617) with about $615B in extra wealth. These figures feed the Table 5 scoring for scenarios 2 and 4.
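The extrapolation formula can be sketched as follows. pareto_count() is a hypothetical helper; \(b = 4.48\) and the $4.5B anchor come from the text, while the anchor count of 89 is an assumption back-solved from the reported total of 617 rather than read from the workbook.

```r
# Projected number of billionaires at or above threshold t (in $B), extending
# a constant-b Pareto tail downward from the anchor threshold.
# pareto_count() is a hypothetical helper, not a package function.
pareto_count <- function(t, n_anchor, b, anchor = 4.5) {
  a <- b / (b - 1)                 # Pareto exponent implied by b
  n_anchor * (t / anchor)^(-a)
}

b <- 4.48

# How many times larger the count at $1B is than the count at $4.5B.
multiplier_at_1b <- pareto_count(1, n_anchor = 1, b = b)

# ASSUMPTION: 89 billionaires at or above $4.5B (back-solved, not from the sheet).
n_anchor    <- 89
n_projected <- pareto_count(1, n_anchor, b)
n_missing   <- n_projected - 249   # 249 = Forbes count quoted in the text

round(c(multiplier = multiplier_at_1b, projected = n_projected,
        missing = n_missing), 2)
```

Under that assumed anchor count, the projection comes out close to the 617 and 368 quoted in the text; a different anchor count scales both figures proportionally.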

Figure A4 is rendered by build_fig_a4() from compute_pareto_missing().

Code
build_fig_a4(pareto_missing_r)
# Read-only display. _targets.R wires this into the pipeline.

Source code

The function bodies listed below live in R/*.R files (one canonical location per function). Each listing is pulled from disk at render time via R’s getSrcref() so the report never drifts from the actual code.
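A minimal sketch of how a listing and its line range can be recovered from a sourced function, assuming the file is loaded with source references kept. The temporary file and the add_one() function are hypothetical stand-ins for the R/*.R files; the report's actual rendering helper may differ.

```r
# Write a tiny hypothetical source file so the example is self-contained.
tmp <- tempfile(fileext = ".R")
writeLines(c(
  "# hypothetical helper file",
  "",
  "add_one <- function(x) {",
  "  x + 1",
  "}"
), tmp)

# Source it into a private environment, keeping source references.
env <- new.env()
source(tmp, local = env, keep.source = TRUE)

# The srcref covers the function expression itself, not the assignment.
ref        <- utils::getSrcref(env$add_one)
listing    <- as.character(ref)
first_line <- utils::getSrcLocation(env$add_one, which = "line", first = TRUE)
last_line  <- utils::getSrcLocation(env$add_one, which = "line", first = FALSE)

cat(listing, sep = "\n")
cat(sprintf("# lines %d-%d of %s\n", first_line, last_line, basename(tmp)))
```

In this sketch the recovered text starts at `function(` rather than at the assigned name, which matches the shape of the listings shown in this section.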

Where each kind of function lives:

  • extract_*(): R/data_sheets.R
  • compute_data_sec_agg(): R/compute_data_sec_agg.R
  • compute_pareto_*(), compute_fig8_laffer(): R/compute_pareto.R
  • compute_tab5(): R/compute_tab5.R
  • compute_billionaires_ca_inctax(): R/compute_billionaires_ca_inctax.R
  • compute_shortrunseries(): R/compute_shortrunseries.R
  • compute_top4taxes(): R/compute_top4taxes.R
  • build_tab*_gt(): R/tables.R
  • build_fig*(): R/figures.R

Each entry’s header below repeats the file path, and the last line of each listing gives the precise line range pulled from getSrcref(). Listings are grouped by stage of the pipeline: extract_* ingestors that read the upstream Excel cells positionally, compute_* re-derivations that turn the extracted inputs into validated tibbles, and build_* renderers that take those tibbles and produce the gt tables / ggplot figures embedded above.

Extract

extract_data_dina()

Code
function(path = xlsx_path_default()) {
  read_sheet("data_dina", path = path)
}

# Read-only display. Edit R/data_sheets.R:148-150.

extract_data_sec_top4()

Code
function(path = xlsx_path_default()) {
  # Data rows 5-120 (per-billionaire-year + per-year totals). A Walczak
  # comparison block sits below at rows 126-135; ignore here.
  ct <- c("text", "text", rep("numeric", 22), "text", rep("numeric", 11))
  out <- read_rectangular(
    "data_sec_top4",
    path = path,
    skip = 3,
    col_types = ct,
    key_cols = c("year", "forbes_id"),
    n_max = 116
  )
  out$year <- as.integer(out$year)
  out$end_cyear <- as.Date(out$end_cyear)
  out
}

# Read-only display. Edit R/data_sheets.R:56-71.

extract_longrunseries()

Code
function(path = xlsx_path_default()) {
  read_sheet("longrunseries", path = path)
}

# Read-only display. Edit R/data_sheets.R:140-142.

extract_shortrunseries()

Code
function(path = xlsx_path_default()) {
  read_sheet("shortrunseries", path = path)
}

# Read-only display. Edit R/data_sheets.R:144-146.

extract_tab2()

Code
function(path = xlsx_path_default()) {
  nm <- c("year", "ca_billionaires_wealth", "ca_inctax_estimated",
          "ca_inctax_per_wealth", "gap_1",
          "top4_company_wealth", "top4_ca_inctax",
          "top4_ca_inctax_per_wealth", "gap_2", "gap_3")
  out <- read_rectangular(
    "Tab2", path = path, skip = 5,
    col_types = c("text", rep("numeric", 9)),
    names = nm,
    n_max = 8  # rows 7-14: 2019..2025 + "2019-2025 average"
  )
  out
}

# Read-only display. Edit R/data_sheets.R:115-127.

extract_tab3()

Code
function(path = xlsx_path_default()) {
  nm <- c("metric", "page", "brin", "zuckerberg", "huang", "all_top4",
          "gap_1", "gap_2", "gap_3")
  read_rectangular(
    "Tab3", path = path, skip = 4,
    col_types = c("text", rep("numeric", 8)),
    names = nm,
    n_max = 11  # rows 6-16: years 2019-2025 + average + wealth begin/end + total tax/wealth
  )
}

# Read-only display. Edit R/data_sheets.R:129-138.

Compute

compute_billionaires_ca_inctax()

Code
function(data_sec_agg_r,
         billionaires_ca_inctax,
         ftb_b4a) {
  bci  <- billionaires_ca_inctax
  cell <- function(addr) xls_cell(bci, addr)
  agg  <- .bci_make_agg(data_sec_agg_r)
  ftb  <- .bci_make_ftb(ftb_b4a)

  # ---- Memo 1 + D99 correction --------------------------------------------
  m1 <- .bci_memo1(bci)
  # D99 = (Memo 2 implied fed-tax rate) / (Memo 1 fed-tax rate, 2018-2020 avg)
  B96 <- cell("B96"); B98 <- cell("B98")
  D99 <- (B98 / B96) / mean(m1$fed_tax_per_agi[1:3])

  # ---- Method I year panel (rows 6..55) -----------------------------------
  yrs <- 2018:2026
  pan_cols <- c("B","C","D","E","F","G","H","I","J")

  # Block A: CA billionaires (rows 8-10).
  n_ca_b     <- xls_cells_row(bci, pan_cols, 8)
  total_w_ca <- c(NA_real_, agg("C"), NA_real_)
  avg_w_ca   <- total_w_ca / n_ca_b

  # Block B: aggregate CA income tax stats (rows 13-21).
  stats <- .bci_aggregate_stats(bci, ftb)

  # Block C: top-bracket Pareto projection (rows 26-44).
  pct_overshoot_yr <- unname(m1$pct_overshoot[c("2018","2019","2020","2021","2022","2023")])
  brk <- .bci_top_brackets(bci, ftb, n_ca_b, pct_overshoot_yr)

  # Rows 46-47: literal correction factors.
  inc_top_w_rel <- c(rep(cell("B46"), 6), NA_real_, NA_real_, NA_real_)
  corr_passthru <- c(rep(D99,         6), NA_real_, NA_real_, NA_real_)

  # Row 49 (CA inctax paid by CA Forbes billionaires) -- MAIN OUTPUT.
  # B-G49: row44 * row46 * row47; H49, I49 = row50 * row15 (computed below).
  ca_inctax_ca_b <- rep(NA_real_, 9)
  ca_inctax_ca_b[1:6] <- brk$proj_tax_top_corr * inc_top_w_rel[1:6] * corr_passthru[1:6]
  # Row 50 = row 49 / row 18. H50 = AVG(B50:G50); I50 = E50 (2021).
  pct_ca_inctax_by_b <- numeric(9)
  pct_ca_inctax_by_b[1:6] <- ca_inctax_ca_b[1:6] / stats$ca_inctax_total_full[1:6]
  pct_ca_inctax_by_b[7] <- mean(pct_ca_inctax_by_b[1:6])
  pct_ca_inctax_by_b[8] <- pct_ca_inctax_by_b[4]
  pct_ca_inctax_by_b[9] <- NA_real_
  ca_inctax_ca_b[7] <- pct_ca_inctax_by_b[7] * stats$H15
  ca_inctax_ca_b[8] <- pct_ca_inctax_by_b[8] * stats$I15

  # Row 51 = row 49 / row 10. Row 53 from data_sec_agg!S. Row 54 = D/C share.
  # Row 55 = row 53 / row 49.
  ca_inctax_b_per_w        <- ca_inctax_ca_b / total_w_ca
  ca_inctax_public_b       <- c(NA_real_, agg("S"), NA_real_)
  public_share_b           <- c(NA_real_, agg("D") / agg("C"), NA_real_)
  ca_inctax_public_per_b_b <- ca_inctax_public_b / ca_inctax_ca_b

  method1 <- tibble::tibble(
    year                          = yrs,
    n_ca_billionaires             = n_ca_b,
    avg_wealth_ca_b               = avg_w_ca,
    total_wealth_ca_b             = total_w_ca,
    n_returns_ca                  = stats$n_returns_ca,
    ca_agi_b                      = stats$ca_agi_b,
    ca_inctax_residents_b         = stats$ca_inctax_resid_b,
    ca_inctax_passthrough_b       = stats$ca_inctax_part16,
    ca_inctax_partyear_nonres_b   = stats$ca_inctax_part17,
    ca_inctax_total_b             = stats$ca_inctax_total_full,
    ca_inctax_fy_b                = stats$ca_inctax_fy_b,
    fy_to_cy_adjustment           = stats$fy_to_cy_adj,
    n_returns_10m                 = c(brk$n_ret_10m,     NA_real_, NA_real_, NA_real_),
    ca_agi_10m_b                  = c(brk$agi_10m_b,     NA_real_, NA_real_, NA_real_),
    ca_taxable_10m_b              = c(brk$taxable_10m_b, NA_real_, NA_real_, NA_real_),
    ca_tax_10m_b                  = c(brk$tax_10m_b,     NA_real_, NA_real_, NA_real_),
    ca_tax_rate_10m               = c(brk$tax_rate_10m,  NA_real_, NA_real_, NA_real_),
    pareto_b_10m_bracket          = c(brk$pareto_b_10m,  NA_real_, NA_real_, NA_real_),
    n_returns_5m                  = c(brk$n_ret_5m,      NA_real_, NA_real_, NA_real_),
    ca_agi_5m_b                   = c(brk$agi_5m_b,      NA_real_, NA_real_, NA_real_),
    ca_taxable_5m_b               = c(brk$taxable_5m_b,  NA_real_, NA_real_, NA_real_),
    ca_tax_5m_b                   = c(brk$tax_5m_b,      NA_real_, NA_real_, NA_real_),
    ca_tax_rate_5m                = c(brk$tax_rate_5m,   NA_real_, NA_real_, NA_real_),
    pareto_b_5m_bracket           = c(brk$pareto_b_5m,   NA_real_, NA_real_, NA_real_),
    proj_cutoff_top_pre_m         = c(brk$proj_cutoff_top,    NA_real_, NA_real_, NA_real_),
    proj_agi_top_pre_b            = c(brk$proj_agi_top,       NA_real_, NA_real_, NA_real_),
    proj_tax_top_pre_b            = c(brk$proj_tax_top,       NA_real_, NA_real_, NA_real_),
    proj_cutoff_top_5m_m          = c(brk$proj_cutoff_top_5m, NA_real_, NA_real_, NA_real_),
    proj_agi_top_5m_b             = c(brk$proj_agi_top_5m,    NA_real_, NA_real_, NA_real_),
    proj_agi_top_corr_b           = c(brk$proj_agi_top_corr,  NA_real_, NA_real_, NA_real_),
    proj_tax_top_corr_b           = c(brk$proj_tax_top_corr,  NA_real_, NA_real_, NA_real_),
    income_top_wealth_relative    = inc_top_w_rel,
    correction_passthrough        = corr_passthru,
    ca_inctax_ca_billionaires_b   = ca_inctax_ca_b,
    pct_ca_inctax_by_billionaires = pct_ca_inctax_by_b,
    ca_inctax_per_wealth          = ca_inctax_b_per_w,
    ca_inctax_public_assets_b     = ca_inctax_public_b,
    public_assets_share           = public_share_b,
    ca_inctax_public_share_of_total = ca_inctax_public_per_b_b
  )

  robustness <- .bci_robustness(bci, D99, ca_inctax_ca_b,
                                 brk$tax_5m_b, brk$agi_5m_b)

  all_taxes <- .bci_all_taxes(
    yrs                = yrs,
    proj_agi_top_corr  = brk$proj_agi_top_corr,
    inc_top_w_rel      = inc_top_w_rel,
    ca_inctax_ca_b     = ca_inctax_ca_b,
    m1_fed_tax_per_agi = m1$fed_tax_per_agi,
    D99                = D99,
    agg                = agg,
    public_share_b     = public_share_b,
    total_w_ca         = total_w_ca
  )

  list(
    method1    = method1,
    memo1      = m1$memo1,
    robustness = robustness,
    all_taxes  = all_taxes
  )
}

# Read-only display. Edit R/compute_billionaires_ca_inctax.R:555-672.

compute_data_sec_agg()

Code
function(data_sec_all,
         exclude_ids = ELLISON_FORBES_ID,
         m_to_b = 1000) {
  # Columns to sum (in panel order) — the 27 metric columns shared with data_sec_all
  sum_cols <- c(
    "forbes_worth", "forbes_public_worth",
    "purchase", "sale",
    "kg", "kg_long", "kg_short",
    "option_profit", "noneq_comp", "ordinary_income",
    "kg_taxable", "dividend", "fiscal_income",
    "donation", "donation_deductible", "income_taxable",
    "ca_income_tax", "fed_ordinary_income_tax", "fed_preferential_tax",
    "fed_income_tax", "fiscal_income_tax",
    "sales_tax", "w_txt", "w_tax_ppent", "w_pi",
    "total_tax", "economic_income"
  )
  filtered <- data_sec_all[!(data_sec_all$forbes_id %in% exclude_ids), ]
  out <- filtered |>
    dplyr::group_by(year) |>
    dplyr::summarise(
      n = dplyr::n(),
      dplyr::across(
        dplyr::all_of(sum_cols),
        \(x) sum(x, na.rm = TRUE) / m_to_b
      ),
      .groups = "drop"
    )
  out$year <- as.integer(out$year)
  tibble::as_tibble(out)
}

# Read-only display. Edit R/compute_data_sec_agg.R:10-39.

compute_fig8_laffer()

Code
function(
  semi_elasticity_mobility   = 10,      # Fig8!B8 - mobility semi-elasticity e
  current_inctax_per_wealth  = 0.002,   # Fig8!C8 - current CA income tax / wealth
  current_wealth_tax_base    = 2000,    # Fig8!D8 - current wealth tax base ($B)
  deconcentration_elasticity = 15,      # Fig8!E8 - deconcentration elasticity d
  rate_step                  = 0.001,
  max_rate                   = 0.20
) {
  # Laffer curve for a permanent annual CA wealth tax under mobility +
  # deconcentration responses. Mirrors Fig8 columns A-E.
  rates  <- seq(0, max_rate, by = rate_step)
  base   <- current_wealth_tax_base *
              exp(-(rates - current_inctax_per_wealth) * semi_elasticity_mobility)
  ref_pow <- (1 - current_inctax_per_wealth)^deconcentration_elasticity
  tibble::tibble(
    tax_rate               = rates,
    mechanical_tax_revenue = rates * current_wealth_tax_base,
    wealth_tax_base        = base,
    actual_tax_revenue     = rates * base,
    long_run_tax_revenue   = rates * base * (1 - rates)^deconcentration_elasticity / ref_pow
  )
}

# Read-only display. Edit R/compute_pareto.R:142-163.

compute_pareto_missing()

Code
function(pareto_inputs,
         anchor_threshold = 4.5) {
  # Inputs (columns A, B, C of the Pareto-missing sheet):
  #   threshold_b           - wealth threshold ($B)
  #   n_above_threshold_emp - count of CA billionaires with wealth >= threshold
  #   wealth_above_threshold- total wealth ($B) above threshold
  # Derives columns D-L = Pareto extrapolation of billionaires Forbes misses
  # below $4.5B, assuming Pareto b averaged over anchor..tail thresholds.
  d <- pareto_inputs
  required <- c("threshold_b", "n_above_threshold_emp", "wealth_above_threshold")
  stopifnot(all(required %in% names(d)))

  A <- d$threshold_b
  B <- d$n_above_threshold_emp
  C <- d$wealth_above_threshold
  n <- length(A)

  # Empirical Pareto b at each threshold
  pareto_b_emp <- C / (B * A)

  # Bracket metrics (Excel treats the cell below the last row as 0 — replicate)
  next_C <- c(C[-1], 0)
  next_B <- c(B[-1], 0)
  next_A <- c(A[-1], NA_real_)
  wealth_in_bracket <- C - next_C
  actual_density   <- B - next_B
  avg_wealth_in_bracket_emp <- wealth_in_bracket / actual_density

  # Anchor row (threshold 4.5) and average-Pareto-b constants
  anchor_i <- which(A == anchor_threshold)
  if (length(anchor_i) != 1L) {
    stop("Anchor threshold ", anchor_threshold, " not present uniquely in pareto_inputs$threshold_b")
  }
  D23 <- mean(pareto_b_emp[anchor_i:n])   # Average Pareto b above anchor
  D24 <- D23 / (D23 - 1)                  # Corresponding Pareto a

  # Projected count: empirical at/above anchor; Pareto-extrapolated below.
  n_above_threshold_proj <- B
  below <- seq_len(anchor_i - 1L)
  n_above_threshold_proj[below] <-
    B[anchor_i] * (A[anchor_i] / A[below])^D24

  next_H <- c(n_above_threshold_proj[-1], 0)
  projected_density <- n_above_threshold_proj - next_H

  # Projected wealth in bracket: empirical bracket at/above anchor;
  # Pareto-formula below.
  projected_wealth_in_bracket <- wealth_in_bracket
  for (i in below) {
    projected_wealth_in_bracket[i] <-
      D23 * n_above_threshold_proj[i] *
        (A[i] - next_A[i] * (A[i] / next_A[i])^D24)
  }

  avg_wealth_in_bracket_proj <- projected_wealth_in_bracket / projected_density

  pareto_b_proj <- pareto_b_emp
  pareto_b_proj[below] <- D23

  tibble::tibble(
    threshold_b                 = A,
    n_above_threshold_emp       = B,
    wealth_above_threshold      = C,
    pareto_b_emp                = pareto_b_emp,
    wealth_in_bracket           = wealth_in_bracket,
    actual_density              = actual_density,
    avg_wealth_in_bracket_emp   = avg_wealth_in_bracket_emp,
    n_above_threshold_proj      = n_above_threshold_proj,
    projected_wealth_in_bracket = projected_wealth_in_bracket,
    projected_density           = projected_density,
    avg_wealth_in_bracket_proj  = avg_wealth_in_bracket_proj,
    pareto_b_proj               = pareto_b_proj
  )
}

# Read-only display. Edit R/compute_pareto.R:7-80.

compute_shortrunseries()

Code
function(data_sec_agg_r,
         data_sec_top4,
         billionaires_ca_inctax_r,
         shortrunseries) {
  yrs  <- 2018:2025
  m1   <- billionaires_ca_inctax_r$method1
  cols <- .srs_panel_columns(
    srs   = shortrunseries,
    agg   = data_sec_agg_r,
    top4  = data_sec_top4,
    m1    = m1,
    yrs   = yrs
  )

  list(
    panel        = .srs_assemble_panel(cols, yrs),
    summary_2025 = .srs_summary_2025(cols, data_sec_top4),
    growth       = .srs_growth(cols, yrs)
  )
}

# Read-only display. Edit R/compute_shortrunseries.R:241-260.

compute_tab5()

Code
function(pareto_missing_r, tab2, tab3,
         baseline_n        = 249,
         baseline_wealth   = 2182,
         avoidance_rate    = 0.10,
         avoidance_small   = 0.20,
         wealth_tax_rate   = 0.05,
         phasein_rate      = 0.025,
         realization_share = 1/3,
         ltcg_taxable      = 0.80,
         ca_ltcg_rate      = 0.133) {
  # Replicates Tab5: 4 scenarios for the one-time 5% CA wealth tax.
  #   1. Benchmark (Forbes 4/15/2026 + 10% avoidance)
  #   2. Benchmark + Pareto extrapolation for missing $1-4.5B billionaires
  #   3. Benchmark + aggressive pre/post-2026 leaver assumption
  #   4. Both 2 and 3 combined
  pareto <- compute_pareto_summary(pareto_missing_r)

  # Inputs from Tab2 (2019-2025 average row, col C):
  #   tab2 row "2019-2025 average" -> ca_inctax_estimated = $3.03B/yr
  avg_row <- tab2[grepl("^2019.*average", tab2$year), ]
  ca_inctax_avg <- avg_row$ca_inctax_estimated         # Tab2!C14

  # Inputs from Tab3 (averages of CA income tax for top 4, in $M):
  avg_metric_row <- tab3[grepl("^Average", tab3$metric), ]
  # Inputs from Tab3 (Wealth at end of 2025, in $M):
  wealth_end_row <- tab3[grepl("^Wealth at end", tab3$metric), ]

  # Scenario 3 leaver block (hard-coded wealth + private-wealth values from
  # Tab5 rows 20-29). Pre-2026 leavers: Page, Thiel, Hankey, Kalanick.
  # Post-2026 leavers: Brin, Zuckerberg, Andy Fang.
  page_avg_tax_M    <- avg_metric_row$page              # Tab3!$B$13
  brin_avg_tax_M    <- avg_metric_row$brin              # Tab3!$C$13
  zuck_avg_tax_M    <- avg_metric_row$zuckerberg        # Tab3!$D$13
  top4_avg_tax_M    <- avg_metric_row$all_top4          # Tab3!F13

  # Top-4 wealth breakouts hardcoded in Tab5 row 30 (in $B):
  page_wealth_B  <- 276;   page_private_B  <- 13.4
  brin_wealth_B  <- 254.6; brin_private_B  <- 13.2
  zuck_wealth_B  <- 230.2; zuck_private_B  <- 2.5
  huang_wealth_B <- 172;   huang_private_B <- 2.84
  top4_wealth_B  <- page_wealth_B + brin_wealth_B + zuck_wealth_B + huang_wealth_B
  top4_private_B <- page_private_B + brin_private_B + zuck_private_B + huang_private_B

  # Wealth and CA income tax denominators used to apportion the leaver CA
  # income tax loss by wealth share (Tab5 rows 17-18):
  total_wealth_all_b        <- baseline_wealth
  avg_ca_inctax_all_b       <- ca_inctax_avg
  wealth_excl_top4_company  <- total_wealth_all_b - (top4_wealth_B - top4_private_B)
  ca_inctax_excl_top4       <- avg_ca_inctax_all_b - top4_avg_tax_M / 1000
  inctax_per_wealth_residual <- ca_inctax_excl_top4 / wealth_excl_top4_company

  # Pre-2026 leavers (Page, Thiel, Hankey, Kalanick) — Tab5 rows 20-23.
  # Page contributes company-tax + private-wealth share; the other three
  # contribute private-wealth share only (apportioned from residual rate).
  thiel_wealth_B    <- 28.9
  hankey_wealth_B   <- 8.15
  kalanick_wealth_B <- 3.56
  ca_inctax_loss_pre2026 <-
      (page_avg_tax_M / 1000) + page_private_B * inctax_per_wealth_residual +
      (thiel_wealth_B + hankey_wealth_B + kalanick_wealth_B) * inctax_per_wealth_residual

  # Post-2026 leavers (Brin, Zuckerberg, Andy Fang) — Tab5 rows 25-27.
  fang_wealth_B <- 1.5
  ca_inctax_loss_post2026 <-
      (brin_avg_tax_M / 1000) + brin_private_B * inctax_per_wealth_residual +
      (zuck_avg_tax_M / 1000) + zuck_private_B * inctax_per_wealth_residual +
      fang_wealth_B * inctax_per_wealth_residual

  total_leaver_inctax_loss <- ca_inctax_loss_pre2026 + ca_inctax_loss_post2026
  wealth_pre2026_leavers   <- page_wealth_B + thiel_wealth_B + hankey_wealth_B + kalanick_wealth_B

  # Scenario engine: given a scenario's wealth + taxable wealth + the
  # baseline-inctax denominator + optional adjustments, compute the 7 result
  # columns. Avoidance rate is derived as (1 - taxable/wealth).
  scenario_revenue <- function(n, wealth, taxable_wealth,
                                baseline_inctax_effective,
                                phasein_deduction = 0,
                                extra_inctax_loss = 0) {
    avoidance        <- 1 - taxable_wealth / wealth
    wealth_tax_rev   <- taxable_wealth * wealth_tax_rate - phasein_deduction
    extra_inctax     <- wealth_tax_rev * realization_share * ltcg_taxable * ca_ltcg_rate
    annual_loss      <- -wealth_tax_rate * baseline_inctax_effective - extra_inctax_loss
    c(n              = n,
      wealth         = wealth,
      taxable_wealth = taxable_wealth,
      avoidance_rate = avoidance,
      wealth_tax_rev = wealth_tax_rev,
      extra_inctax   = extra_inctax,
      annual_loss    = annual_loss)
  }

  # --- Scenario 1: Benchmark (Forbes 4/15/2026 + 10% avoidance) ---
  s1 <- scenario_revenue(
    n                          = baseline_n,
    wealth                     = baseline_wealth,
    taxable_wealth             = baseline_wealth * (1 - avoidance_rate),
    baseline_inctax_effective  = ca_inctax_avg
  )

  # --- Scenario 2: + Pareto-missing small billionaires ---
  wealth_2 <- baseline_wealth * (1 + pareto$pct_wealth_increase)
  s2 <- scenario_revenue(
    n                          = baseline_n * (1 + pareto$pct_count_increase),
    wealth                     = wealth_2,
    taxable_wealth             = baseline_wealth *
        ((1 - avoidance_rate) + (1 - avoidance_small) * pareto$pct_wealth_increase),
    baseline_inctax_effective  = ca_inctax_avg * (1 + pareto$pct_wealth_increase),
    phasein_deduction          = (wealth_2 - baseline_wealth) *
        pareto$fraction_in_phasein * phasein_rate
  )

  # --- Scenario 3: + Aggressive pre/post-2026 leavers ---
  s3 <- scenario_revenue(
    n                          = baseline_n,
    wealth                     = baseline_wealth,
    taxable_wealth             = (1 - avoidance_rate) * (baseline_wealth - wealth_pre2026_leavers),
    baseline_inctax_effective  = ca_inctax_avg,
    extra_inctax_loss          = total_leaver_inctax_loss
  )

  # --- Scenario 4: scenarios 2 and 3 combined ---
  s4 <- scenario_revenue(
    n                          = s2[["n"]],
    wealth                     = s3[["wealth"]] + (s2[["wealth"]] - baseline_wealth),
    taxable_wealth             = s3[["taxable_wealth"]] +
                                  (s2[["taxable_wealth"]] - s1[["taxable_wealth"]]),
    baseline_inctax_effective  = ca_inctax_avg * (1 + pareto$pct_wealth_increase),
    extra_inctax_loss          = total_leaver_inctax_loss
  )

  rows <- rbind(s1, s2, s3, s4)
  tibble::tibble(
    scenario              = paste0("Scenario ", 1:4),
    n_billionaires        = unname(rows[, "n"]),
    wealth                = unname(rows[, "wealth"]),
    taxable_wealth        = unname(rows[, "taxable_wealth"]),
    avoidance_rate        = unname(rows[, "avoidance_rate"]),
    wealth_tax_revenue    = unname(rows[, "wealth_tax_rev"]),
    extra_ca_inctax_sales = unname(rows[, "extra_inctax"]),
    annual_ca_inctax_loss = unname(rows[, "annual_loss"])
  )
}

# Read-only display. Edit R/compute_tab5.R:7-148.
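As a worked example of the scenario engine, the sketch below redoes the Scenario 1 arithmetic by hand using the default arguments of compute_tab5(). The $3.03B average income tax is taken from the comment on Tab2!C14 rather than read from the workbook, so treat it as an assumption; all amounts are in $B.

```r
# Defaults copied from the compute_tab5() signature.
baseline_wealth   <- 2182
avoidance_rate    <- 0.10
wealth_tax_rate   <- 0.05
realization_share <- 1 / 3
ltcg_taxable      <- 0.80
ca_ltcg_rate      <- 0.133

# ASSUMPTION: 2019-2025 average CA income tax, as quoted in the code comment.
ca_inctax_avg <- 3.03

# Scenario 1: Forbes wealth with 10% avoidance, no phase-in or leaver terms.
taxable_wealth <- baseline_wealth * (1 - avoidance_rate)
wealth_tax_rev <- taxable_wealth * wealth_tax_rate
extra_inctax   <- wealth_tax_rev * realization_share * ltcg_taxable * ca_ltcg_rate
annual_loss    <- -wealth_tax_rate * ca_inctax_avg

round(c(taxable_wealth = taxable_wealth, wealth_tax_rev = wealth_tax_rev,
        extra_inctax = extra_inctax, annual_loss = annual_loss), 3)
```

The wealth tax revenue in this benchmark comes out a little under $100 billion, in line with the headline figure in the abstract.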

compute_top4taxes()

Code
function(data_sec_top4) {
  # Re-derives the 429 formula cells of top4taxes: per-year tax rates of the
  # CA top-4 billionaires, 2004..2025, plus 2004-2016 / 2017-2025 averages.
  # The "top 4" composition shifts in three phases (see comment on `agg`).

  d <- data_sec_top4
  pick <- function(id, yr, col) {
    row <- d[d$forbes_id == id & d$year == yr, ]
    if (nrow(row) == 1L) row[[col]] else NA_real_
  }
  # Aggregate the 8 metric columns for the dynamic "top 4" composition.
  metric_cols <- c("ca_income_tax", "fed_income_tax", "sales_tax",
                   "w_txt", "w_tax_ppent", "total_tax", "economic_income",
                   "public_worth_avg")
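  # Three phases, all starting from the "Total (excluding Ellison)" row:
  #   yr <= 2015       : add Ellison.
  #   2016 <= yr <= 2020: add Ellison and subtract Huang.
  #   yr >= 2021       : use the total row as-is.
  agg <- function(yr) {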
    total <- vapply(metric_cols, pick, numeric(1),
                    id = "Total (excluding Ellison)", yr = yr)
    if (yr <= 2015) {
      ell <- vapply(metric_cols, pick, numeric(1), id = "larry-ellison", yr = yr)
      total + ell
    } else if (yr <= 2020) {
      ell <- vapply(metric_cols, pick, numeric(1), id = "larry-ellison", yr = yr)
      hua <- vapply(metric_cols, pick, numeric(1), id = "jensen-huang",  yr = yr)
      total + ell - hua
    } else {
      total
    }
  }

  yrs <- 2004:2025
  mat <- vapply(yrs, agg, numeric(length(metric_cols)))
  rownames(mat) <- metric_cols
  # Columns of `mat` are years; rows are metrics.
  ca_tax  <- mat["ca_income_tax", ]
  fed_tax <- mat["fed_income_tax", ]
  sales_t <- mat["sales_tax", ]
  corp_t  <- mat["w_txt", ]
  prop_t  <- mat["w_tax_ppent", ]
  total_t <- mat["total_tax", ]
  econ_i  <- mat["economic_income", ]      # T
  wealth  <- mat["public_worth_avg", ]     # S

  # Per-income ratios (cols C..H, /T)
  C_total_per_inc  <- total_t / econ_i
  D_ca_per_inc     <- ca_tax  / econ_i
  E_fed_per_inc    <- fed_tax / econ_i
  F_sales_per_inc  <- sales_t / econ_i
  G_corp_per_inc   <- corp_t  / econ_i
  H_prop_per_inc   <- prop_t  / econ_i
  I_check_inc      <- C_total_per_inc -
                       (D_ca_per_inc + E_fed_per_inc + F_sales_per_inc +
                        G_corp_per_inc + H_prop_per_inc)

  # Per-wealth ratios (cols J..O, /S)
  J_total_per_w  <- total_t / wealth
  K_ca_per_w     <- ca_tax  / wealth
  L_fed_per_w    <- fed_tax / wealth
  M_sales_per_w  <- sales_t / wealth
  N_corp_per_w   <- corp_t  / wealth
  O_prop_per_w   <- prop_t  / wealth
  P_check_w      <- J_total_per_w -
                     (K_ca_per_w + L_fed_per_w + M_sales_per_w +
                      N_corp_per_w + O_prop_per_w)

  R_inc_per_w    <- econ_i / wealth

  panel <- tibble::tibble(
    year = yrs,
    total_tax_per_income     = C_total_per_inc,
    ca_inctax_per_income     = D_ca_per_inc,
    fed_inctax_per_income    = E_fed_per_inc,
    sales_tax_per_income     = F_sales_per_inc,
    corp_tax_per_income      = G_corp_per_inc,
    property_tax_per_income  = H_prop_per_inc,
    check_income_decomp      = I_check_inc,
    total_tax_per_wealth     = J_total_per_w,
    ca_inctax_per_wealth     = K_ca_per_w,
    fed_inctax_per_wealth    = L_fed_per_w,
    sales_tax_per_wealth     = M_sales_per_w,
    corp_tax_per_wealth      = N_corp_per_w,
    property_tax_per_wealth  = O_prop_per_w,
    check_wealth_decomp      = P_check_w,
    income_per_wealth        = R_inc_per_w,
    avg_wealth_m             = wealth,
    economic_income_m        = econ_i
  )

  # Sub-period averages (rows 27, 28 of the sheet)
  panel_cols    <- setdiff(names(panel), "year")
  avg_2004_2016 <- .t4t_period_avg(panel, panel_cols, 2004:2016)
  avg_2017_2025 <- .t4t_period_avg(panel, panel_cols, 2017:2025)

  averages <- tibble::tibble(
    period = c("2004-2016", "2017-2025"),
    !!!setNames(
      lapply(panel_cols, \(col) c(avg_2004_2016[[col]], avg_2017_2025[[col]])),
      panel_cols
    )
  )

  list(panel = panel, averages = averages)
}

# Read-only display. Edit R/compute_top4taxes.R:32-133.
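
The `check_income_decomp` and `check_wealth_decomp` columns above should be zero, up to floating-point error, whenever the components add up to the total. Below is a minimal base-R sketch of that residual check and of a sub-period average. All numbers are made up, only two components are used for brevity, and `period_avg()` is a hypothetical stand-in: the real `.t4t_period_avg()` is not shown in this listing and may behave differently.

```r
# Toy panel: every figure here is invented for illustration only.
toy <- data.frame(
  year    = 2004:2007,
  ca_tax  = c(1, 2, 3, 4),
  fed_tax = c(5, 6, 7, 8),
  wealth  = c(100, 200, 300, 400)
)
toy$total_t <- toy$ca_tax + toy$fed_tax

# Per-wealth ratios and the decomposition residual (expected to be ~0).
toy$total_per_w <- toy$total_t / toy$wealth
toy$ca_per_w    <- toy$ca_tax  / toy$wealth
toy$fed_per_w   <- toy$fed_tax / toy$wealth
toy$check_w     <- toy$total_per_w - (toy$ca_per_w + toy$fed_per_w)

# Hypothetical helper: plain mean of each column over a set of years.
period_avg <- function(df, cols, years) {
  keep <- df$year %in% years
  vapply(cols, function(col) mean(df[[col]][keep], na.rm = TRUE), numeric(1))
}

avg_2004_2005 <- period_avg(toy, c("ca_per_w", "fed_per_w"), 2004:2005)
decomp_ok     <- all(abs(toy$check_w) < 1e-12)
```

A tolerance is used rather than an exact comparison with zero because the ratios are computed in floating point.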

Build

build_tab1()

Code
function(data_sec_agg_r, shortrunseries_r, longrunseries) {
  # Table 1: Wealth Growth of California Billionaires.
  # Panel A: 2022-2025 nominal wealth + growth + CA GDP comparison.
  # Panel B: 1982 vs 2025 long-term real-wealth growth of top .0002%.

  # ---- Panel A inputs --------------------------------------------------------
  agg <- data_sec_agg_r[data_sec_agg_r$year %in% 2022:2025, ]
  srs <- shortrunseries_r$panel
  srs_a <- srs[srs$year %in% 2022:2025, c("year", "top5_total_b")]
  # CA GDP: longrunseries col AT, rows 48..51 = years 2022..2025
  ca_gdp <- suppressWarnings(as.numeric(longrunseries$AT[48:51]))

  panel_a <- tibble::tibble(
    year                 = as.character(2022:2025),
    n_billionaires       = agg$n,
    wealth_b             = agg$forbes_worth,
    annual_growth        = c(NA_real_, agg$forbes_worth[-1] / agg$forbes_worth[-4] - 1),
    fraction_public      = agg$forbes_public_worth / agg$forbes_worth,
    top4_wealth_b        = srs_a$top5_total_b,
    ca_gdp_b             = ca_gdp,
    wealth_per_gdp       = agg$forbes_worth / ca_gdp
  )
  # Append "Growth during 2023-2025" row (matches Tab1 row 10).
  growth_row <- tibble::tibble(
    year                 = "Growth during 2023-2025",
    n_billionaires       = NA_integer_,
    wealth_b             = panel_a$wealth_b[4] / panel_a$wealth_b[1] - 1,
    annual_growth        = NA_real_,
    fraction_public      = NA_real_,
    top4_wealth_b        = panel_a$top4_wealth_b[4] / panel_a$top4_wealth_b[1] - 1,
    ca_gdp_b             = panel_a$ca_gdp_b[4] / panel_a$ca_gdp_b[1] - 1,
    wealth_per_gdp       = NA_real_
  )
  panel_a_full <- dplyr::bind_rows(panel_a, growth_row)
  panel_a_full$panel <- "A. Recent nominal wealth growth of CA billionaires"

  # ---- Panel B inputs --------------------------------------------------------
  # longrunseries rows 8 = 1982, 51 = 2025. AL = # families top .0002%,
  # AQ = top .0002% wealth in current $B, AI = # CA families,
  # AT = CA GDP, W = deflator column (the code scales 1982 values by W8 / W51).
  lrs <- longrunseries
  num <- function(col, row) suppressWarnings(as.numeric(lrs[[col]][row]))
  W8  <- num("W", 8);  W51 <- num("W", 51)
  AL_1982 <- num("AL", 8);  AL_2025 <- num("AL", 51)
  AQ_1982 <- num("AQ", 8);  AQ_2025 <- num("AQ", 51)
  AI_1982 <- num("AI", 8);  AI_2025 <- num("AI", 51)
  AT_1982 <- num("AT", 8);  AT_2025 <- num("AT", 51)
  # Conversion to 2025 $: 1982 values are scaled by W8 / W51; the 2025 factor is 1.
  defl_1982 <- W8  / W51
  defl_2025 <- W51 / W51    # = 1

  make_year_row <- function(label, AL, AQ, AI, AT, defl) {
    wealth_b <- AQ * defl
    n_fam_m  <- AI / 1000
    gdp_b    <- AT * defl
    tibble::tibble(
      year                 = label,
      families_top0002_k   = AL,
      wealth_top0002_b     = wealth_b,
      wealth_per_family_b  = wealth_b / AL,
      n_ca_families_m      = n_fam_m,
      ca_gdp_2025dollars_b = gdp_b,
      gdp_per_family_k     = 1000 * gdp_b / n_fam_m
    )
  }
  row_1982 <- make_year_row("1982", AL_1982, AQ_1982, AI_1982, AT_1982, defl_1982)
  row_2025 <- make_year_row("2025", AL_2025, AQ_2025, AI_2025, AT_2025, defl_2025)
  lr <- .long_run_compare(row_1982, row_2025, n_years = 43)
  panel_b <- dplyr::bind_rows(row_1982, row_2025, lr$ratio, lr$annualized)

  # ---- Render with gt -------------------------------------------------------
  # Two stacked panels rendered as one gt; row groups give the panel headers.
  panel_a_render <- tibble::tibble(
    section = "A. Recent nominal wealth growth of CA billionaires",
    year    = panel_a_full$year,
    col1    = panel_a_full$n_billionaires,
    col2    = panel_a_full$wealth_b,
    col3    = panel_a_full$annual_growth,
    col4    = panel_a_full$fraction_public,
    col5    = panel_a_full$top4_wealth_b,
    col6    = panel_a_full$ca_gdp_b,
    col7    = panel_a_full$wealth_per_gdp
  )
  panel_b_render <- tibble::tibble(
    section = "B. Long-term real wealth growth: top .0002% wealthiest CA families (2025 $)",
    year    = panel_b$year,
    col1    = panel_b$families_top0002_k,
    col2    = panel_b$wealth_top0002_b,
    col3    = panel_b$wealth_per_family_b,
    col4    = panel_b$n_ca_families_m,
    col5    = panel_b$ca_gdp_2025dollars_b,
    col6    = panel_b$gdp_per_family_k,
    col7    = NA_real_
  )
  combined <- dplyr::bind_rows(panel_a_render, panel_b_render)

  tab <- gt::gt(combined, groupname_col = "section") |>
    gt::tab_header(title = "Table 1. Wealth Growth of California Billionaires") |>
    gt::cols_label(
      year = "Year",
      col1 = "#",
      col2 = "Wealth",
      col3 = "Growth",
      col4 = "% public",
      col5 = "Top 4 wealth",
      col6 = "CA GDP",
      col7 = "Wealth / GDP"
    ) |>
    gt::fmt_number(columns = c(col1, col2, col5, col6),
                   decimals = 0, use_seps = TRUE) |>
    gt::fmt_percent(columns = c(col3, col4, col7), decimals = 1) |>
    gt::sub_missing(missing_text = "—") |>
    gt::tab_source_note(source_note = gt::md(paste(
      "**Notes:** Panel A illustrates the wealth growth of CA billionaires in 2022-2025.",
      "Panel B compares the top .0002% wealthiest CA families' wealth in 1982 vs 2025,",
      "all in 2025 dollars. Source: Forbes RTB snapshots + SEC EDGAR."
    )))

  # Attach a tidy version for downstream use / tests.
  attr(tab, "panel_a") <- panel_a_full
  attr(tab, "panel_b") <- panel_b
  tab
}

# Read-only display. Edit R/tables.R:416-538.
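
Panel B ends with ratio and annualized-growth rows returned by `.long_run_compare()`, whose body is not shown in this listing. The sketch below illustrates the arithmetic such a comparison typically involves, assuming the annualized rate is the geometric mean growth over `n_years`. The function name `long_run_growth()` and the inputs are hypothetical.

```r
# Hypothetical stand-in: the real .long_run_compare() is not shown here
# and may be implemented differently.
long_run_growth <- function(start, end, n_years) {
  ratio <- end / start
  list(
    ratio      = ratio,                      # end value relative to start value
    annualized = ratio^(1 / n_years) - 1     # constant yearly rate giving that ratio
  )
}

# Illustrative inputs only: a 30-fold real increase over 1982-2025 (43 years),
# the order of magnitude quoted in the abstract.
g <- long_run_growth(start = 1, end = 30, n_years = 43)
annualized_pct <- round(100 * g$annualized, 1)   # roughly 8% per year
```

Compounding the annualized rate over 43 years recovers the 30-fold ratio, which is a quick sanity check on any such row.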

build_tab2()

Code
function(data_sec_agg_r, billionaires_ca_inctax_r, data_sec_top4) {
  # Table 2: California Income Tax Paid by California Billionaires.
  # Two side-by-side sub-panels: all CA billionaires (cols B-D) and the top 4
  # on company wealth (cols F-H), 2019-2025 + a 2019-2025 average row.

  yrs <- 2019:2025
  m1 <- billionaires_ca_inctax_r$method1
  m1_idx <- match(yrs, m1$year)

  agg_y <- data_sec_agg_r[match(yrs, data_sec_agg_r$year), ]
  total_excl <- data_sec_top4[data_sec_top4$forbes_id == "Total (excluding Ellison)" &
                                data_sec_top4$year %in% yrs, ]
  total_excl <- total_excl[order(total_excl$year), ]

  panel <- tibble::tibble(
    year                       = as.character(yrs),
    wealth_b                   = agg_y$forbes_worth,
    ca_inctax_b                = m1$ca_inctax_ca_billionaires_b[m1_idx],
    ca_inctax_per_wealth       = NA_real_,
    top4_company_wealth_b      = total_excl$public_worth   / 1000,
    top4_ca_inctax_b           = total_excl$ca_income_tax  / 1000,
    top4_ca_inctax_per_wealth  = NA_real_
  )
  panel$ca_inctax_per_wealth      <- panel$ca_inctax_b      / panel$wealth_b
  panel$top4_ca_inctax_per_wealth <- panel$top4_ca_inctax_b / panel$top4_company_wealth_b

  # Average row. Excel quirk: D14 = AVERAGE(D7:D12) (6 yrs, 2019-2024 only)
  # while B/C/F/G14 = AVERAGE(*7:*13) (7 yrs). Reproduce as-is for fidelity.
  # The top-4 rate below is the ratio of the two 7-year means, not a mean of
  # yearly ratios.
  avg_row <- tibble::tibble(
    year                      = "2019-2025 average",
    wealth_b                  = mean(panel$wealth_b),
    ca_inctax_b               = mean(panel$ca_inctax_b),
    ca_inctax_per_wealth      = mean(panel$ca_inctax_per_wealth[1:6]),  # 2019-2024
    top4_company_wealth_b     = mean(panel$top4_company_wealth_b),
    top4_ca_inctax_b          = mean(panel$top4_ca_inctax_b),
    top4_ca_inctax_per_wealth = mean(panel$top4_ca_inctax_b) /
                                  mean(panel$top4_company_wealth_b)
  )
  full <- dplyr::bind_rows(panel, avg_row)

  tab <- gt::gt(full) |>
    gt::tab_header(title = "Table 2. California Income Tax Paid by California Billionaires") |>
    gt::tab_spanner(label = "All California Billionaires",
                    columns = c(wealth_b, ca_inctax_b, ca_inctax_per_wealth)) |>
    gt::tab_spanner(label = "Top 4 (Page, Brin, Zuckerberg, Huang) on Company Wealth",
                    columns = c(top4_company_wealth_b, top4_ca_inctax_b,
                                top4_ca_inctax_per_wealth)) |>
    gt::cols_label(
      year                      = "Year",
      wealth_b                  = "Wealth ($B)",
      ca_inctax_b               = "Est. CA inc. tax ($B)",
      ca_inctax_per_wealth      = "Tax / wealth",
      top4_company_wealth_b     = "Company wealth ($B)",
      top4_ca_inctax_b          = "Est. CA inc. tax ($B)",
      top4_ca_inctax_per_wealth = "Tax / wealth"
    ) |>
    gt::fmt_number(columns = c(wealth_b, top4_company_wealth_b),
                   decimals = 0, use_seps = TRUE) |>
    gt::fmt_number(columns = c(ca_inctax_b, top4_ca_inctax_b),
                   decimals = 2) |>
    gt::fmt_percent(columns = c(ca_inctax_per_wealth, top4_ca_inctax_per_wealth),
                    decimals = 3) |>
    gt::tab_source_note(source_note = gt::md(paste(
      "**Notes:** All amounts in nominal $B. CA income tax for all billionaires comes",
      "from the Method I FTB-extrapolation calculation; top 4 figures come from SEC",
      "filings. Source: `billionaires_ca_inctax_r` + `data_sec_top4`."
    )))

  attr(tab, "panel") <- full
  tab
}

# Read-only display. Edit R/tables.R:21-91.
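
The average row of `build_tab2()` mixes two conventions: the all-billionaire rate is a mean of yearly ratios, while the top-4 rate is a ratio of period means. The two generally differ, as this toy example with made-up numbers suggests.

```r
# Made-up figures, for illustration only (not values from the paper).
tax    <- c(1, 2, 6)
wealth <- c(100, 100, 200)

mean_of_ratios <- mean(tax / wealth)         # each year weighted equally
ratio_of_means <- mean(tax) / mean(wealth)   # years implicitly weighted by wealth

c(mean_of_ratios = mean_of_ratios, ratio_of_means = ratio_of_means)
```

The ratio of means gives more weight to high-wealth years, so the gap widens when wealth grows quickly over the period.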

build_tab3()

Code
function(data_sec_top4) {
  # Table 3: CA Income Tax Paid by the Top 4 on Company Wealth.
  # Per-billionaire panel (Page, Brin, Zuckerberg, Huang) + all-top-4 sum.
  # 7 yearly CA income tax rows (2019-2025) + 1 average row +
  # begin-of-2019 wealth + end-of-2025 wealth + tax / wealth-gain.

  ids <- c("larry-page", "sergey-brin", "mark-zuckerberg", "jensen-huang")
  ids_lbl <- c("page", "brin", "zuckerberg", "huang")

  d <- data_sec_top4
  pick <- function(id, yr, col) {
    row <- d[d$forbes_id == id & d$year == yr, ]
    if (nrow(row) == 1L) row[[col]] else NA_real_
  }
  ca_tax_yearly <- function(yr) {
    setNames(vapply(ids, pick, numeric(1), yr = yr, col = "ca_income_tax"), ids_lbl)
  }

  tax_yrs <- lapply(2019:2025, ca_tax_yearly)
  # Per-row: per-billionaire + all_top4 sum
  yearly_rows <- lapply(seq_along(tax_yrs), function(i) {
    row <- as.list(tax_yrs[[i]])
    row$all_top4 <- sum(unlist(row))
    row$metric   <- paste("CA income tax", 2018 + i)
    tibble::as_tibble(row)
  })
  panel <- dplyr::bind_rows(yearly_rows)
  panel <- panel[, c("metric", "page", "brin", "zuckerberg", "huang", "all_top4")]

  # Average row (mean of the 7 yearly rows, per column)
  num_cols <- setdiff(names(panel), "metric")
  avg_values <- setNames(lapply(num_cols, function(col) mean(panel[[col]])), num_cols)
  avg_row <- tibble::as_tibble(c(
    list(metric = "Average CA income tax 2019-2025"),
    avg_values
  ))

  # Wealth rows: begin = end-of-2018 public_worth; end = end-of-2025 public_worth.
  wealth_begin <- setNames(vapply(ids, pick, numeric(1),
                                   yr = 2018, col = "public_worth"), ids_lbl)
  wealth_end   <- setNames(vapply(ids, pick, numeric(1),
                                   yr = 2025, col = "public_worth"), ids_lbl)
  wealth_begin_row <- tibble::as_tibble(c(
    list(metric = "Wealth at the beginning of 2019"),
    as.list(wealth_begin),
    list(all_top4 = sum(wealth_begin))
  ))
  wealth_end_row <- tibble::as_tibble(c(
    list(metric = "Wealth at end of 2025"),
    as.list(wealth_end),
    list(all_top4 = sum(wealth_end))
  ))
  # Total CA income tax / (end_wealth - begin_wealth), per column.
  total_taxes <- vapply(num_cols, function(col) sum(panel[[col]]), numeric(1))
  end_minus_begin <- c(wealth_end, all_top4 = sum(wealth_end)) -
                      c(wealth_begin, all_top4 = sum(wealth_begin))
  ratio_row <- tibble::as_tibble(c(
    list(metric = "Total CA income tax / wealth gain 2019-2025"),
    as.list(total_taxes / end_minus_begin)
  ))

  full <- dplyr::bind_rows(panel, avg_row, wealth_begin_row, wealth_end_row, ratio_row)

  tab <- gt::gt(full) |>
    gt::tab_header(title = "Table 3. California Income Tax Paid by the Top 4 on Company Wealth") |>
    gt::cols_label(
      metric     = "",
      page       = "Larry Page (Alphabet)",
      brin       = "Sergey Brin (Alphabet)",
      zuckerberg = "Mark Zuckerberg (Meta)",
      huang      = "Jensen Huang (Nvidia)",
      all_top4   = "All top 4"
    ) |>
    gt::fmt_number(rows = 1:8, decimals = 2, use_seps = TRUE) |>
    gt::fmt_number(rows = 9:10, decimals = 0, use_seps = TRUE) |>
    gt::fmt_percent(rows = 11, decimals = 3) |>
    gt::tab_source_note(source_note = gt::md(paste(
      "**Notes:** All amounts in nominal $M unless noted. CA income tax is",
      "computed from each billionaire's SEC Form 4 filings (target",
      "`data_sec_top4`). Row 11 is cumulative 2019-2025 CA income tax as a share",
      "of the 2019-2025 wealth gain."
    )))

  attr(tab, "panel") <- full
  tab
}

# Read-only display. Edit R/tables.R:329-414.
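
The `pick()` helper in `build_tab3()` returns `NA_real_` when a billionaire-year row is missing or duplicated, and that `NA` then propagates into the `all_top4` sum. The sketch below reproduces this behaviour on a toy data frame. The ids and values are invented, and `d` only stands in for `data_sec_top4`.

```r
# Toy stand-in for data_sec_top4; ids and values are made up.
d <- data.frame(
  forbes_id     = c("a", "a", "b"),
  year          = c(2019, 2020, 2019),
  ca_income_tax = c(10, 12, 7)
)

# Same guard as in build_tab3(): exactly one matching row, otherwise NA.
pick <- function(id, yr, col) {
  row <- d[d$forbes_id == id & d$year == yr, ]
  if (nrow(row) == 1L) row[[col]] else NA_real_
}

tax_2019 <- vapply(c("a", "b"), pick, numeric(1), yr = 2019, col = "ca_income_tax")
tax_2020 <- vapply(c("a", "b"), pick, numeric(1), yr = 2020, col = "ca_income_tax")

total_2019      <- sum(tax_2019)                 # both rows present
total_2020      <- sum(tax_2020)                 # NA, because "b" has no 2020 row
total_2020_zero <- sum(tax_2020, na.rm = TRUE)   # only if a missing row may count as zero
```

Letting the `NA` surface in the total is arguably the safer default for a replication, since a silently dropped row would otherwise look like a valid sum.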

build_tab4()

Code
function(data_sec_top4) {
  # Table 4: Wealth, Income, and Taxes of the Top 4, 2019-2025.
  # Two columns: "Total 2019-2025" and "Annual average" (= total / 7).

  d <- data_sec_top4
  total_excl <- d[d$forbes_id == "Total (excluding Ellison)" &
                    d$year %in% 2019:2025, ]
  total_excl <- total_excl[order(total_excl$year), ]
  begin <- d[d$forbes_id == "Total (excluding Ellison)" & d$year == 2018, ]
  end   <- d[d$forbes_id == "Total (excluding Ellison)" & d$year == 2025, ]
  # All values converted from $M to $B by dividing by 1000.
  wealth_begin <- begin$public_worth / 1000
  wealth_end   <- end$public_worth   / 1000
  wealth_gain  <- wealth_end - wealth_begin
  wealth_avg   <- mean(total_excl$public_worth) / 1000

  sum_col <- function(col) sum(total_excl[[col]]) / 1000

  fiscal_income      <- sum_col("fiscal_income")
  stock_options      <- sum_col("option_profit") + sum_col("noneq_comp")
  dividends          <- sum_col("dividend")
  realized_gains     <- sum_col("kg_taxable")
  appreciated_stock  <- sum_col("donation")
  net_collateral     <- sum_col("value_borrowed")
  fed_inctax         <- sum_col("fed_income_tax")
  ca_inctax          <- sum_col("ca_income_tax")
  corp_profits       <- sum_col("w_pi")
  corp_taxes         <- sum_col("w_txt")

  rows <- tibble::tibble(
    metric = c(
      "Wealth in 2019 (beginning of year)",
      "Wealth in 2025 (end of year)",
      "Gain in wealth during 2019-2025",
      "Wealth (average over 2019-2025)",
      "Fiscal individual income",
      "        Stock-options exercise + non-equity comp",
      "        Dividends",
      "        Realized capital gains",
      "Memo: Appreciated stock donated to charity",
      "Memo: Net collateral pledged",
      "Federal individual income tax",
      "California individual income tax",
      "Individual taxes / individual income",
      "Corporate profits",
      "Corporate taxes (federal)",
      "Corporate tax rate (effective)"
    ),
    total = c(
      wealth_begin, wealth_end, wealth_gain, NA_real_,
      fiscal_income, stock_options, dividends, realized_gains,
      appreciated_stock, net_collateral,
      fed_inctax, ca_inctax,
      (fed_inctax + ca_inctax) / fiscal_income,
      corp_profits, corp_taxes, corp_taxes / corp_profits
    ),
    annual_avg = c(
      NA_real_, NA_real_, wealth_gain / 7, wealth_avg,
      fiscal_income / 7, stock_options / 7, dividends / 7, realized_gains / 7,
      appreciated_stock / 7, net_collateral / 7,
      fed_inctax / 7, ca_inctax / 7,
      (fed_inctax + ca_inctax) / fiscal_income,
      corp_profits / 7, corp_taxes / 7, corp_taxes / corp_profits
    )
  )

  ratio_rows <- c(13, 16)        # tax-rate rows
  fmt_rows   <- setdiff(seq_len(nrow(rows)), ratio_rows)

  tab <- gt::gt(rows) |>
    gt::tab_header(title = "Table 4. Wealth, Income, and Taxes of the Top 4, 2019-2025") |>
    gt::cols_label(
      metric     = "",
      total      = "Total 2019-2025 ($B)",
      annual_avg = "Annual average ($B)"
    ) |>
    gt::fmt_number(rows = fmt_rows, decimals = 2) |>
    gt::fmt_percent(rows = ratio_rows, decimals = 1) |>
    gt::sub_missing(missing_text = "—") |>
    gt::tab_source_note(source_note = gt::md(paste(
      "**Notes:** All amounts in nominal $B. Source: per-billionaire SEC Form 4",
      "filings aggregated as `data_sec_top4`'s \"Total (excluding Ellison)\" rows."
    )))

  attr(tab, "panel") <- rows
  tab
}

# Read-only display. Edit R/tables.R:241-327.

build_tab5()

Code
function(tab5_r) {
  # Table 5: Scoring the One-Time 5% CA Wealth Tax.
  # 4 scenarios × 7 metric columns. tab5_r is already verified element-wise
  # against the Excel sheet by the upstream R re-derivation.

  long_labels <- c(
    "1. Benchmark: Forbes estimates + 10% avoidance",
    "2. Adding missing small billionaires (Pareto extrapolation)",
    "3. Aggressive assumptions for pre/post-2026 leavers",
    "4. Benchmark with both adding small billionaires and aggressive leavers"
  )
  panel <- tab5_r
  panel$scenario <- long_labels

  tab <- gt::gt(panel) |>
    gt::tab_header(title = "Table 5. Scoring the One-Time 5% California Wealth Tax") |>
    gt::cols_label(
      scenario              = "",
      n_billionaires        = "# CA billionaires",
      wealth                = "Wealth ($B)",
      taxable_wealth        = "Taxable wealth ($B)",
      avoidance_rate        = "Avoidance rate",
      wealth_tax_revenue    = "Wealth tax revenue ($B)",
      extra_ca_inctax_sales = "Extra CA inctax from sales ($B)",
      annual_ca_inctax_loss = "Annual CA inctax loss ($B)"
    ) |>
    gt::fmt_number(columns = c(n_billionaires, wealth, taxable_wealth,
                                wealth_tax_revenue, extra_ca_inctax_sales,
                                annual_ca_inctax_loss),
                   decimals = 1, use_seps = TRUE) |>
    gt::fmt_percent(columns = avoidance_rate, decimals = 1) |>
    gt::tab_source_note(source_note = gt::md(paste(
      "**Notes:** Scenarios assume a one-time 5% wealth tax. Wealth tax revenue =",
      "taxable_wealth × 5%. Extra CA income tax from sales reflects forced",
      "asset sales to pay the tax (33% realization × 80% LTCG-taxable × 13.3%",
      "CA rate). Annual CA inctax loss is the steady-state revenue forgone from",
      "billionaires leaving California. Source: `tab5_r`."
    )))

  attr(tab, "panel") <- panel
  tab
}

# Read-only display. Edit R/tables.R:198-239.
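
The notes to Table 5 describe the revenue arithmetic in words. Here is a minimal sketch of it with illustrative inputs. The taxable-wealth figure is hypothetical and is not taken from `tab5_r`. The base to which the 33% × 80% × 13.3% factors apply is also an assumption made here (the wealth tax revenue itself), since the notes do not spell it out.

```r
# Illustrative input only; NOT a value from tab5_r.
taxable_wealth_b <- 2000   # hypothetical taxable wealth, $B
tax_rate         <- 0.05   # one-time rate stated in the table notes

wealth_tax_revenue_b <- taxable_wealth_b * tax_rate

# Factors quoted in the table notes.
realization_share <- 0.33    # share financed by asset sales
ltcg_taxable      <- 0.80    # share of sale proceeds taxable as long-term gains
ca_top_rate       <- 0.133   # CA income tax rate

# ASSUMPTION: the three factors are applied to the wealth tax revenue.
extra_ca_inctax_b <- wealth_tax_revenue_b *
  realization_share * ltcg_taxable * ca_top_rate
```

Under these assumptions the extra income tax is about 3.5% of the wealth tax revenue, which fits the abstract's description of the income tax effects as comparatively minor.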

build_tab_a1()

Code
function(shortrunseries, longrunseries) {
  # Appendix Table A1: Wealth Growth of US Billionaires (mirrors Tab1 but
  # for the US). Panel A: recent nominal growth 2022-2025. Panel B: 1982 vs
  # 2025 long-term real growth of the top .0002% US families.
  srs <- shortrunseries
  lrs <- longrunseries
  num <- function(df, col, rows) {
    unname(vapply(rows, function(r) suppressWarnings(as.numeric(df[[col]][r])),
                  numeric(1)))
  }

  # ---- Panel A inputs (shortrunseries rows 10..13 = years 2022..2025) ------
  # Col I = # US citizen billionaires (literal); Col Q = US billionaire wealth.
  n_us_b <- num(srs, "I", 10:13)
  w_us_b <- num(srs, "Q", 10:13)

  panel_a <- tibble::tibble(
    year                = as.character(2022:2025),
    n_us_billionaires   = n_us_b,
    wealth_b            = w_us_b,
    annual_growth       = c(NA_real_, w_us_b[-1] / w_us_b[-4] - 1)
  )
  growth_row <- tibble::tibble(
    year                = "Growth during 3 years (2023-2025)",
    n_us_billionaires   = NA_integer_,
    wealth_b            = panel_a$wealth_b[4] / panel_a$wealth_b[1] - 1,
    annual_growth       = NA_real_
  )
  panel_a_full <- dplyr::bind_rows(panel_a, growth_row)

  # ---- Panel B inputs (longrunseries rows 8 = 1982, 51 = 2025) -------------
  num_lr <- function(col, row) suppressWarnings(as.numeric(lrs[[col]][row]))
  W8  <- num_lr("W", 8);  W51 <- num_lr("W", 51)
  defl_1982 <- W8  / W51
  AM_1982 <- num_lr("AM", 8);  AM_2025 <- num_lr("AM", 51)
  AP_1982 <- num_lr("AP", 8);  AP_2025 <- num_lr("AP", 51)
  X_1982  <- num_lr("X",  8);  X_2025  <- num_lr("X",  51)
  AS_1982 <- num_lr("AS", 8);  AS_2025 <- num_lr("AS", 51)

  make_year_row <- function(label, AM, AP, X, AS, defl) {
    wealth_b <- AP * defl
    n_fam_m  <- X / 1000
    gdp_b    <- AS * defl
    tibble::tibble(
      year                  = label,
      families_top0002_k    = AM,
      wealth_top0002_b      = wealth_b,
      wealth_per_family_b   = wealth_b / AM,
      n_us_families_m       = n_fam_m,
      us_gdp_2025dollars_b  = gdp_b,
      gdp_per_family_k      = 1000 * gdp_b / n_fam_m
    )
  }
  # 2025 values are already in 2025 $; pass defl = 1 so no conversion is applied.
  row_1982 <- make_year_row("1982", AM_1982, AP_1982, X_1982, AS_1982, defl_1982)
  row_2025 <- make_year_row("2025", AM_2025, AP_2025, X_2025, AS_2025, 1)
  lr <- .long_run_compare(row_1982, row_2025, n_years = 43)
  panel_b <- dplyr::bind_rows(row_1982, row_2025, lr$ratio, lr$annualized)

  panel_a_render <- tibble::tibble(
    section = "A. Recent nominal wealth growth of US billionaires",
    year    = panel_a_full$year,
    col1    = panel_a_full$n_us_billionaires,
    col2    = panel_a_full$wealth_b,
    col3    = panel_a_full$annual_growth,
    col4    = NA_real_, col5 = NA_real_, col6 = NA_real_
  )
  panel_b_render <- tibble::tibble(
    section = "B. Long-term real wealth growth: top .0002% wealthiest US families (2025 $)",
    year    = panel_b$year,
    col1    = panel_b$families_top0002_k,
    col2    = panel_b$wealth_top0002_b,
    col3    = panel_b$wealth_per_family_b,
    col4    = panel_b$n_us_families_m,
    col5    = panel_b$us_gdp_2025dollars_b,
    col6    = panel_b$gdp_per_family_k
  )
  combined <- dplyr::bind_rows(panel_a_render, panel_b_render)

  tab <- gt::gt(combined, groupname_col = "section") |>
    gt::tab_header(title = "Appendix Table A1. Wealth Growth of US Billionaires") |>
    gt::cols_label(
      year = "Year",
      col1 = "# / families",
      col2 = "Wealth",
      col3 = "Growth / per family",
      col4 = "# US families (M)",
      col5 = "US GDP",
      col6 = "GDP / family ($)"
    ) |>
    gt::fmt_number(columns = c(col1, col2, col4, col5, col6),
                   decimals = 0, use_seps = TRUE) |>
    gt::fmt_percent(columns = col3, decimals = 1) |>
    gt::sub_missing(missing_text = "—") |>
    gt::tab_source_note(source_note = gt::md(paste(
      "**Notes:** Repeats Tab 1 for US billionaires instead of CA. Panel A:",
      "nominal wealth of all US citizen billionaires (Forbes). Panel B: 1982",
      "vs 2025 real wealth of top .0002% US families, in 2025 dollars."
    )))

  attr(tab, "panel_a") <- panel_a_full
  attr(tab, "panel_b") <- panel_b
  tab
}

# Read-only display. Edit R/tables.R:93-196.
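
Both `build_tab1()` and `build_tab_a1()` compute year-over-year growth as `w[-1] / w[-4] - 1`, which relies on the series having exactly four elements (2022-2025). The sketch below, on made-up levels, shows an equivalent form that does not hard-code the length, and checks that compounding the yearly rates recovers the cumulative growth reported in the "Growth during" row.

```r
# Made-up wealth levels for four consecutive year-ends.
w <- c(100, 110, 132, 198)

# Form used in the listing: drop the first element, divide by all but the 4th.
growth_fixed <- c(NA_real_, w[-1] / w[-4] - 1)

# Length-agnostic equivalent: divide by all but the last element.
growth_general <- c(NA_real_, w[-1] / head(w, -1) - 1)

same_result <- identical(growth_fixed, growth_general)

# Cumulative growth from the first to the last observation.
cumulative  <- w[length(w)] / w[1] - 1
compounded  <- prod(1 + growth_general[-1]) - 1
```

The general form would keep working if the year window were ever widened, whereas `w[-4]` would silently drop the wrong element.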

build_fig1()

Code
function(shortrunseries_r) {
  # Figure 1: The Rise of California Billionaires Wealth in Recent Years.
  # Two solid lines (2019-2025): total CA billionaire wealth (excl. Ellison) +
  # top-4 wealth. Two dashed segments 2024->2025 showing the 5% wealth-tax
  # counterfactual (95% of 2025 actual).

  panel <- shortrunseries_r$panel
  yrs <- 2019:2025
  d <- panel[panel$year %in% yrs, ]

  series <- tibble::tibble(
    year   = rep(d$year, 2),
    wealth = c(d$wealth_us_citizens_b, d$top5_total_b),
    series = factor(rep(c("All CA billionaires", "Top 4 (Page, Brin, Zuck, Huang)"),
                          each = nrow(d)),
                     levels = c("All CA billionaires",
                                "Top 4 (Page, Brin, Zuck, Huang)"))
  )

  # Dashed-line segments: from 2024 actual to 2025-after-5%-tax.
  d24 <- d[d$year == 2024, ]
  d25 <- d[d$year == 2025, ]
  dashed <- tibble::tibble(
    x      = c(2024, 2024),
    xend   = c(2025, 2025),
    y      = c(d24$wealth_us_citizens_b, d24$top5_total_b),
    yend   = c(d25$wealth_w_avoid_b,     d25$top5_w_avoid_b),
    series = factor(c("All CA billionaires",
                       "Top 4 (Page, Brin, Zuck, Huang)"),
                     levels = levels(series$series))
  )

  ggplot2::ggplot(series, ggplot2::aes(x = year, y = wealth,
                                        color = series, shape = series)) +
    ggplot2::geom_line(linewidth = 0.9) +
    ggplot2::geom_point(size = 2.5) +
    ggplot2::geom_segment(
      data = dashed,
      ggplot2::aes(x = x, xend = xend, y = y, yend = yend, color = series),
      linetype = "dashed", inherit.aes = FALSE
    ) +
    ggplot2::scale_x_continuous(breaks = yrs) +
    ggplot2::scale_y_continuous(
      labels = function(v) format(v, big.mark = ",", scientific = FALSE),
      limits = c(0, NA)
    ) +
    ggplot2::scale_color_manual(values = c("All CA billionaires" = "#d62728",
                                            "Top 4 (Page, Brin, Zuck, Huang)" = "#1f77b4")) +
    ggplot2::labs(
      title    = "Figure 1: The Rise of California Billionaires' Wealth",
      subtitle = "Wealth of CA billionaires ($B, end of year). Dashed: 5% wealth-tax counterfactual.",
      x        = NULL,
      y        = "Wealth ($B)",
      color    = NULL,
      shape    = NULL
    ) +
    ggplot2::theme_minimal(base_size = 12) +
    ggplot2::theme(
      legend.position    = "bottom",
      plot.title         = ggplot2::element_text(face = "bold"),
      panel.grid.minor.x = ggplot2::element_blank()
    )
}

# Read-only display. Edit R/figures.R:647-709.

build_fig2()

Code
function(longrunseries) {
  # Figure 2: Billionaire Class Wealth Grows Much Faster than the Economy.
  # Two panels placed side by side via patchwork.
  #   A: CA top .0002% wealth (AZ, $B real) vs CA GDP per family (BE, $100K real),
  #      drawn as a line chart on a single shared y-axis.
  #   B: Top .0002% wealth as % of GDP for US (AV) and CA (AW) — two lines.
  # Year range 1982-2025 = longrunseries rows 8..51.
  lrs <- longrunseries
  rows <- 8:51
  num <- function(col) suppressWarnings(as.numeric(lrs[[col]][rows]))
  year <- num("A")
  wealth_b   <- num("AZ")   # CA top .0002% real wealth, $B
  gdp_per_fam <- num("BE")   # CA GDP per family, $100Ks
  us_share   <- num("AV")   # US top 400 wealth / GDP
  ca_share   <- num("AW")   # CA top 45  wealth / GDP

  # Panel A: single y-axis — both series naturally fit in 0..30 (wealth in $B,
  # GDP per family in $100Ks). Wealth is the headline (red, with points);
  # GDP per family is the baseline (black, also with points).
  n_a <- length(year)
  panel_a_data <- tibble::tibble(
    year   = rep(year, 2),
    value  = c(wealth_b, gdp_per_fam),
    series = factor(rep(c("CA top .0002% wealth ($B, 2025 $)",
                            "CA GDP per family ($100K, 2025 $)"),
                          each = n_a),
                     levels = c("CA top .0002% wealth ($B, 2025 $)",
                                "CA GDP per family ($100K, 2025 $)"))
  )
  panel_a <- ggplot2::ggplot(panel_a_data,
                              ggplot2::aes(x = year, y = value, color = series)) +
    ggplot2::geom_line(linewidth = 0.9) +
    ggplot2::geom_point(size = 1.5) +
    ggplot2::scale_color_manual(values = c(
      "CA top .0002% wealth ($B, 2025 $)" = "#a91d22",
      "CA GDP per family ($100K, 2025 $)" = "#222222"
    )) +
    ggplot2::scale_y_continuous(limits = c(0, NA),
                                 breaks = seq(0, 30, by = 5)) +
    ggplot2::labs(
      title = "A. CA Top .0002% wealth vs. GDP per family",
      x = NULL, y = NULL, color = NULL
    ) +
    ggplot2::theme_minimal(base_size = 11) +
    ggplot2::theme(
      legend.position = "bottom",
      plot.title      = ggplot2::element_text(face = "bold"),
      panel.grid.minor = ggplot2::element_blank()
    )

  n_yrs <- length(year)
  panel_b_data <- tibble::tibble(
    year   = rep(year, 2),
    share  = c(us_share, ca_share),
    region = factor(rep(c("US (top 400)", "California (top 45)"), each = n_yrs),
                     levels = c("US (top 400)", "California (top 45)"))
  )
  panel_b <- ggplot2::ggplot(panel_b_data,
                              ggplot2::aes(x = year, y = share, color = region)) +
    ggplot2::geom_line(linewidth = 0.9) +
    ggplot2::scale_y_continuous(labels = scales::percent_format(accuracy = 1)) +
    ggplot2::scale_color_manual(values = c("US (top 400)"       = "#d62728",
                                            "California (top 45)" = "#1f77b4")) +
    ggplot2::labs(
      title = "B. Top .0002% wealth (% of annual GDP)",
      x = NULL, y = NULL, color = NULL
    ) +
    ggplot2::theme_minimal(base_size = 11) +
    ggplot2::theme(
      legend.position = "bottom",
      plot.title      = ggplot2::element_text(face = "bold"),
      panel.grid.minor = ggplot2::element_blank()
    )

  patchwork::wrap_plots(panel_a, panel_b, ncol = 2) +
    patchwork::plot_annotation(
      title = "Figure 2: Billionaire Wealth Grows Much Faster than the Economy"
    )
}

# Read-only display. Edit R/figures.R:567-645.

build_fig3()

Code
function(shortrunseries_r) {
  # Figure 3: The California Income Tax Paid by Billionaires.
  # Panel A: $B income tax for all CA billionaires + top 5 (from SEC),
  # 2019-2025. Panel B: same series as % of wealth.
  panel <- shortrunseries_r$panel
  d <- panel[panel$year %in% 2019:2025, ]
  n_yrs <- nrow(d)

  panel_a_data <- tibble::tibble(
    year   = rep(d$year, 2),
    value  = c(d$ca_inctax_billionaires_b, d$top5_sec_ca_inctax_b),
    series = factor(rep(c("All CA billionaires", "Top 5 (SEC filings)"),
                          each = n_yrs),
                     levels = c("All CA billionaires", "Top 5 (SEC filings)"))
  )
  panel_b_data <- tibble::tibble(
    year   = rep(d$year, 2),
    value  = c(d$ca_inctax_per_wealth, d$top5_sec_tax_rate),
    series = factor(rep(c("All CA billionaires", "Top 5 (SEC filings)"),
                          each = n_yrs),
                     levels = c("All CA billionaires", "Top 5 (SEC filings)"))
  )

  colors <- c("All CA billionaires" = "#d62728", "Top 5 (SEC filings)" = "#1f77b4")

  panel_a <- ggplot2::ggplot(panel_a_data,
                              ggplot2::aes(x = year, y = value, color = series,
                                            shape = series)) +
    ggplot2::geom_line(linewidth = 0.9) +
    ggplot2::geom_point(size = 2.2) +
    ggplot2::scale_x_continuous(breaks = 2019:2025) +
    ggplot2::scale_y_continuous(limits = c(0, NA)) +
    ggplot2::scale_color_manual(values = colors) +
    ggplot2::labs(
      title = "A. CA income tax paid by billionaires ($B)",
      x = NULL, y = NULL, color = NULL, shape = NULL
    ) +
    ggplot2::theme_minimal(base_size = 11) +
    ggplot2::theme(legend.position = "bottom",
                    plot.title = ggplot2::element_text(face = "bold"),
                    panel.grid.minor.x = ggplot2::element_blank())

  panel_b <- ggplot2::ggplot(panel_b_data,
                              ggplot2::aes(x = year, y = value, color = series,
                                            shape = series)) +
    ggplot2::geom_line(linewidth = 0.9) +
    ggplot2::geom_point(size = 2.2) +
    ggplot2::scale_x_continuous(breaks = 2019:2025) +
    ggplot2::scale_y_continuous(labels = scales::percent_format(accuracy = 0.1),
                                 limits = c(0, NA)) +
    ggplot2::scale_color_manual(values = colors) +
    ggplot2::labs(
      title = "B. CA income tax / wealth",
      x = NULL, y = NULL, color = NULL, shape = NULL
    ) +
    ggplot2::theme_minimal(base_size = 11) +
    ggplot2::theme(legend.position = "bottom",
                    plot.title = ggplot2::element_text(face = "bold"),
                    panel.grid.minor.x = ggplot2::element_blank())

  patchwork::wrap_plots(panel_a, panel_b, ncol = 2) +
    patchwork::plot_annotation(
      title = "Figure 3: California Income Tax Paid by Billionaires"
    )
}

# Read-only display. Edit R/figures.R:501-565.

build_fig4()

Code
function(billionaires_ca_inctax_r) {
  # Figure 4: Total Taxes Paid by California Billionaires relative to Wealth.
  # Stacked area chart of 4 tax components (CA inctax, fed inctax, corporate,
  # property+sales) as % of total wealth, 2019-2025.
  at <- billionaires_ca_inctax_r$all_taxes
  yrs <- 2019:2025
  d <- at[at$year %in% yrs, ]

  # geom_area's default stacking plots the FIRST factor level on TOP. To
  # match the BSZ figure (CA income tax at the bottom, property+sales as the
  # thin top sliver), list factor levels in TOP-TO-BOTTOM order.
  stack_levels <- c("Property + sales taxes", "Corporate taxes",
                     "Federal income tax", "CA income tax")
  long <- tibble::tibble(
    year  = rep(d$year, 4),
    share = c(d$ca_inctax_per_total_wealth,
              d$fed_inctax_per_total_wealth,
              d$corp_per_total_wealth,
              d$prop_sales_per_total_wealth),
    tax   = factor(rep(c("CA income tax", "Federal income tax",
                           "Corporate taxes", "Property + sales taxes"),
                        each = nrow(d)),
                    levels = stack_levels)
  )

  ggplot2::ggplot(long, ggplot2::aes(x = year, y = share, fill = tax)) +
    ggplot2::geom_area(alpha = 0.95, color = "white", linewidth = 0.3) +
    ggplot2::scale_x_continuous(breaks = yrs) +
    ggplot2::scale_y_continuous(labels = scales::percent_format(accuracy = 0.1)) +
    ggplot2::scale_fill_manual(values = c(
      "CA income tax"          = "#4a1414",   # darkest, anchor band
      "Federal income tax"     = "#8a2c2c",   # burgundy
      "Corporate taxes"        = "#c47878",   # rose
      "Property + sales taxes" = "#f0d4d4"    # pale pink, top sliver
    )) +
    ggplot2::labs(
      title    = "Figure 4: Total Taxes Paid by California Billionaires (% of wealth)",
      subtitle = "CA income tax + Federal income tax + Corporate + Property/sales, 2019-2025",
      x = NULL, y = NULL, fill = NULL
    ) +
    ggplot2::theme_minimal(base_size = 12) +
    ggplot2::theme(
      legend.position    = "bottom",
      plot.title         = ggplot2::element_text(face = "bold"),
      panel.grid.minor   = ggplot2::element_blank()
    )
}

# Read-only display. Edit R/figures.R:453-499.
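The stacking comment in build_fig4() involves two separate orderings: the order in which values and labels are concatenated (these must match each other) and the order of the factor levels (which only sets stacking and legend order). The base-R sketch below illustrates the distinction with made-up shares; `toy`, its column names, and every number in it are hypothetical and not taken from the workbook.

```r
# Hypothetical two-year input; the numbers are invented for illustration.
toy <- data.frame(
  year       = c(2024, 2025),
  ca_share   = c(0.002, 0.003),
  prop_share = c(0.0001, 0.0002)
)

labels       <- c("CA income tax", "Property + sales taxes")
stack_levels <- rev(labels)  # top-to-bottom order wanted in the chart

long <- data.frame(
  year  = rep(toy$year, length(labels)),
  # Values are concatenated in the same order as `labels`.
  share = c(toy$ca_share, toy$prop_share),
  tax   = factor(rep(labels, each = nrow(toy)), levels = stack_levels)
)

# Level order changes the integer codes, not which label sits on which row.
ca_rows <- long[long$tax == "CA income tax", ]
print(ca_rows)
levels(long$tax)
```

Reversing `stack_levels` leaves the value-to-label pairing intact, so the stack order can be changed without touching the `share` vector.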

build_fig5()

function(data_sec_top4) {
  # Figure 5: Fiscal Income, Economic Income, and Wealth Gains of the Top 4,
  # 2019-2025. Three stacked bars showing the size of each income/wealth
  # concept and the share absorbed by taxes (CA, federal, corporate, and a
  # hypothetical 5% wealth tax).
  total_excl <- data_sec_top4[data_sec_top4$forbes_id == "Total (excluding Ellison)" &
                                data_sec_top4$year %in% 2019:2025, ]
  d <- total_excl[order(total_excl$year), ]
  begin <- data_sec_top4[data_sec_top4$forbes_id == "Total (excluding Ellison)" &
                           data_sec_top4$year == 2018, ]
  end   <- data_sec_top4[data_sec_top4$forbes_id == "Total (excluding Ellison)" &
                           data_sec_top4$year == 2025, ]

  fiscal_income  <- sum(d$fiscal_income) / 1000
  econ_income    <- sum(d$economic_income) / 1000
  wealth_gain    <- (end$public_worth - begin$public_worth) / 1000
  ca_inctax      <- sum(d$ca_income_tax) / 1000
  fed_inctax     <- sum(d$fed_income_tax) / 1000
  corp_tax       <- sum(d$total_tax) / 1000 - ca_inctax - fed_inctax
  wealth_tax_5p  <- 0.05 * end$public_worth / 1000

  # Per-bar breakdown: net income + CA + Fed + (corp only for Econ/Wealth) +
  # (5% wealth tax only for Wealth Gain).
  net_fi <- fiscal_income - ca_inctax - fed_inctax
  net_ei <- econ_income   - ca_inctax - fed_inctax - corp_tax
  # Only the 5% wealth tax is netted out of the gain; CA, federal, and
  # corporate taxes are stacked on top of it in the Wealth Gain bar.
  net_wg <- wealth_gain   - wealth_tax_5p

  bars <- tibble::tibble(
    bar = factor(rep(c("Fiscal Income", "Economic Income", "Wealth Gain"), each = 5),
                  levels = c("Fiscal Income", "Economic Income", "Wealth Gain")),
    component = factor(rep(c("Net of taxes", "CA income tax", "Federal income tax",
                              "Corporate taxes", "5% wealth tax"), times = 3),
                        levels = c("Net of taxes", "CA income tax",
                                   "Federal income tax", "Corporate taxes",
                                   "5% wealth tax")),
    value = c(
      net_fi, ca_inctax, fed_inctax, 0, 0,
      net_ei, ca_inctax, fed_inctax, corp_tax, 0,
      net_wg, ca_inctax, fed_inctax, corp_tax, wealth_tax_5p
    )
  )

  ggplot2::ggplot(bars, ggplot2::aes(x = bar, y = value, fill = component)) +
    ggplot2::geom_col(width = 0.65) +
    ggplot2::scale_y_continuous(
      labels = function(v) format(v, big.mark = ",", scientific = FALSE)
    ) +
    ggplot2::scale_fill_manual(values = c(
      "Net of taxes"        = "#a6cee3",
      "CA income tax"       = "#1f77b4",
      "Federal income tax"  = "#2ca02c",
      "Corporate taxes"     = "#d62728",
      "5% wealth tax"       = "#ff7f0e"
    )) +
    ggplot2::labs(
      title = "Figure 5: Fiscal Income, Economic Income, and Wealth Gains of the Top 4",
      subtitle = "2019-2025 cumulative, $B; bars stack net + tax components",
      x = NULL, y = "$B", fill = NULL
    ) +
    ggplot2::theme_minimal(base_size = 12) +
    ggplot2::theme(
      legend.position = "bottom",
      plot.title      = ggplot2::element_text(face = "bold"),
      panel.grid.major.x = ggplot2::element_blank()
    )
}

# Read-only display. Edit R/figures.R:386-451.
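The per-bar arithmetic in build_fig5() is easier to check with round numbers. The sketch below repeats the same subtractions on hypothetical $B inputs (none of them come from data_sec_top4) and prints the height of each stacked bar.

```r
# All inputs are hypothetical $B figures chosen for round arithmetic.
fiscal_income <- 10
econ_income   <- 100
wealth_gain   <- 500
end_wealth    <- 800   # stands in for end$public_worth / 1000
ca_inctax     <- 1
fed_inctax    <- 2
corp_tax      <- 7
wealth_tax_5p <- 0.05 * end_wealth

# Same net-of-tax definitions as in build_fig5().
net_fi <- fiscal_income - ca_inctax - fed_inctax
net_ei <- econ_income   - ca_inctax - fed_inctax - corp_tax
net_wg <- wealth_gain   - wealth_tax_5p

# Height of each stacked bar = sum of its non-zero components.
bar_totals <- c(
  "Fiscal Income"   = net_fi + ca_inctax + fed_inctax,
  "Economic Income" = net_ei + ca_inctax + fed_inctax + corp_tax,
  "Wealth Gain"     = net_wg + ca_inctax + fed_inctax + corp_tax + wealth_tax_5p
)
print(bar_totals)
```

With these inputs the first two bars reproduce their income concept (10 and 100). The third bar comes to 510, that is, the wealth gain plus the three taxes, because only the wealth tax is subtracted when forming `net_wg`.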

build_fig6()

function(top4taxes_r) {
  # Figure 6: Taxes paid by Top 4 relative to wealth and economic income.
  # 2004-2025 line chart, two panels: A) tax/wealth, B) tax/economic income.
  d <- top4taxes_r$panel
  yrs <- d$year
  n  <- length(yrs)

  panel_a <- tibble::tibble(
    year   = rep(yrs, 2),
    value  = c(d$total_tax_per_wealth, d$ca_inctax_per_wealth),
    series = factor(rep(c("Total taxes / wealth", "CA income tax / wealth"),
                          each = n),
                     levels = c("Total taxes / wealth", "CA income tax / wealth"))
  )
  panel_b <- tibble::tibble(
    year   = rep(yrs, 2),
    value  = c(d$total_tax_per_income, d$ca_inctax_per_income),
    series = factor(rep(c("Total taxes / income", "CA income tax / income"),
                          each = n),
                     levels = c("Total taxes / income", "CA income tax / income"))
  )

  colors_a <- c("Total taxes / wealth" = "#d62728", "CA income tax / wealth" = "#1f77b4")
  colors_b <- c("Total taxes / income" = "#d62728", "CA income tax / income" = "#1f77b4")

  pa <- ggplot2::ggplot(panel_a, ggplot2::aes(x = year, y = value, color = series)) +
    ggplot2::geom_line(linewidth = 0.9) +
    ggplot2::geom_point(size = 1.5) +
    ggplot2::scale_y_continuous(labels = scales::percent_format(accuracy = 0.1),
                                 limits = c(0, NA)) +
    ggplot2::scale_color_manual(values = colors_a) +
    ggplot2::labs(title = "A. Top 4 Total Tax and CA income tax (% of wealth)",
                   x = NULL, y = NULL, color = NULL) +
    ggplot2::theme_minimal(base_size = 11) +
    ggplot2::theme(legend.position = "bottom",
                    plot.title = ggplot2::element_text(face = "bold"),
                    panel.grid.minor = ggplot2::element_blank())

  pb <- ggplot2::ggplot(panel_b, ggplot2::aes(x = year, y = value, color = series)) +
    ggplot2::geom_line(linewidth = 0.9) +
    ggplot2::geom_point(size = 1.5) +
    ggplot2::scale_y_continuous(labels = scales::percent_format(accuracy = 1),
                                 limits = c(0, NA)) +
    ggplot2::scale_color_manual(values = colors_b) +
    ggplot2::labs(title = "B. Top 4 Total Tax and CA income tax (% of economic income)",
                   x = NULL, y = NULL, color = NULL) +
    ggplot2::theme_minimal(base_size = 11) +
    ggplot2::theme(legend.position = "bottom",
                    plot.title = ggplot2::element_text(face = "bold"),
                    panel.grid.minor = ggplot2::element_blank())

  patchwork::wrap_plots(pa, pb, ncol = 2) +
    patchwork::plot_annotation(
      title = "Figure 6: Taxes Paid by the Top 4 Relative to Wealth and Income"
    )
}

# Read-only display. Edit R/figures.R:329-384.

build_fig7()

function(top4taxes_r, data_dina) {
  # Figure 7: Total Taxes / Economic Income, Top 4 vs US-wide average; and
  # CA income tax / Economic Income, Top 4 vs CA average. 2004-2025.
  d <- top4taxes_r$panel
  yrs <- d$year   # 2004:2025
  # data_dina K = US avg total tax/economic income, S = CA avg ca_inctax/economic
  # income. Rows 6..27 = years 2004..2025.
  dina_K <- suppressWarnings(as.numeric(data_dina$K[6:27]))
  dina_S <- suppressWarnings(as.numeric(data_dina$S[6:27]))

  n <- length(yrs)
  panel_a <- tibble::tibble(
    year   = rep(yrs, 2),
    value  = c(d$total_tax_per_income, dina_K),
    series = factor(rep(c("Top 4 (CA billionaires)", "US average"), each = n),
                     levels = c("Top 4 (CA billionaires)", "US average"))
  )
  panel_b <- tibble::tibble(
    year   = rep(yrs, 2),
    value  = c(d$ca_inctax_per_income, dina_S),
    series = factor(rep(c("Top 4 (CA billionaires)", "CA average"), each = n),
                     levels = c("Top 4 (CA billionaires)", "CA average"))
  )

  colors <- c("Top 4 (CA billionaires)" = "#1f77b4",
              "US average"              = "#d62728",
              "CA average"              = "#d62728")

  pa <- ggplot2::ggplot(panel_a,
                         ggplot2::aes(x = year, y = value, color = series)) +
    ggplot2::geom_line(linewidth = 0.9) +
    ggplot2::geom_point(size = 1.5) +
    ggplot2::scale_y_continuous(labels = scales::percent_format(accuracy = 1),
                                 limits = c(0, NA)) +
    ggplot2::scale_color_manual(values = colors) +
    ggplot2::labs(title = "A. Top 4 vs. US average: Total Taxes / Economic Income",
                   x = NULL, y = NULL, color = NULL) +
    ggplot2::theme_minimal(base_size = 11) +
    ggplot2::theme(legend.position = "bottom",
                    plot.title = ggplot2::element_text(face = "bold"),
                    panel.grid.minor = ggplot2::element_blank())

  pb <- ggplot2::ggplot(panel_b,
                         ggplot2::aes(x = year, y = value, color = series)) +
    ggplot2::geom_line(linewidth = 0.9) +
    ggplot2::geom_point(size = 1.5) +
    ggplot2::scale_y_continuous(labels = scales::percent_format(accuracy = 0.1),
                                 limits = c(0, NA)) +
    ggplot2::scale_color_manual(values = colors) +
    ggplot2::labs(title = "B. Top 4 vs. CA average: CA Income Tax / Economic Income",
                   x = NULL, y = NULL, color = NULL) +
    ggplot2::theme_minimal(base_size = 11) +
    ggplot2::theme(legend.position = "bottom",
                    plot.title = ggplot2::element_text(face = "bold"),
                    panel.grid.minor = ggplot2::element_blank())

  patchwork::wrap_plots(pa, pb, ncol = 2) +
    patchwork::plot_annotation(
      title = "Figure 7: Top 4 Effective Tax Rates vs. US/CA Averages"
    )
}

# Read-only display. Edit R/figures.R:267-327.
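build_fig7() relies on the hard-coded slice 6:27 lining up with the years 2004 to 2025. A small sketch of how that mapping could be derived and checked rather than typed in; `first_row`, `first_year`, and `row_for_year()` are hypothetical names, and the one-row-per-year layout is taken from the code comment rather than verified against the sheet.

```r
# Assumption: sheet row 6 holds 2004 and rows run one per year, as the
# comment in build_fig7() states.
first_row  <- 6
first_year <- 2004
years      <- 2004:2025

rows <- first_row + (years - first_year)

# Hypothetical helper: sheet row for a single year under the same layout.
row_for_year <- function(year) first_row + (year - first_year)

range(rows)
length(rows) == length(years)
row_for_year(2019)
```

If the sheet ever gains a header row or an extra year, only `first_row` or `years` would need to change, and a length check against `d$year` would flag any misalignment.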

build_fig8()

function(fig8_laffer_r) {
  # Figure 8: Laffer Curve for a permanent annual CA wealth tax.
  # Three lines: mechanical revenue (rate * base), actual revenue (with
  # mobility response), long-run revenue (mobility + deconcentration).
  d <- fig8_laffer_r
  long <- tibble::tibble(
    rate   = rep(d$tax_rate, 3),
    revenue = c(d$mechanical_tax_revenue, d$actual_tax_revenue, d$long_run_tax_revenue),
    series  = factor(rep(c("Mechanical (no behavior)",
                            "Short-run (with mobility)",
                            "Long-run (mobility + deconcentration)"),
                          each = nrow(d)),
                      levels = c("Mechanical (no behavior)",
                                 "Short-run (with mobility)",
                                 "Long-run (mobility + deconcentration)"))
  )

  ggplot2::ggplot(long, ggplot2::aes(x = rate, y = revenue,
                                       color = series, linetype = series)) +
    ggplot2::geom_line(linewidth = 0.9) +
    ggplot2::scale_x_continuous(labels = scales::percent_format(accuracy = 1),
                                 breaks = seq(0, 0.2, by = 0.05)) +
    ggplot2::scale_y_continuous(
      labels = function(v) format(v, big.mark = ",", scientific = FALSE),
      limits = c(0, NA)
    ) +
    ggplot2::scale_color_manual(values = c(
      "Mechanical (no behavior)"            = "#999999",
      "Short-run (with mobility)"           = "#1f77b4",
      "Long-run (mobility + deconcentration)" = "#d62728"
    )) +
    ggplot2::scale_linetype_manual(values = c(
      "Mechanical (no behavior)"            = "dashed",
      "Short-run (with mobility)"           = "solid",
      "Long-run (mobility + deconcentration)" = "solid"
    )) +
    ggplot2::labs(
      title    = "Figure 8: Laffer Curve for a Permanent California Wealth Tax",
      subtitle = "Annual revenue ($B) vs. tax rate, under mobility (e=10) + deconcentration (d=15) responses",
      x = "Annual wealth-tax rate",
      y = "Annual revenue ($B)",
      color    = NULL, linetype = NULL
    ) +
    ggplot2::theme_minimal(base_size = 12) +
    ggplot2::theme(
      legend.position = "bottom",
      plot.title      = ggplot2::element_text(face = "bold"),
      panel.grid.minor = ggplot2::element_blank()
    )
}

# Read-only display. Edit R/figures.R:216-265.

build_fig_a1()

function(xlsx_path) {
  # Appendix Figure A1: California Billionaires Wealth by Industry.
  # Stacked horizontal bar, 13 industries × 3 components: Public stock held
  # by Top 4 (col I), other public stock (col G), private stock (col H).
  # Block 2 of rtb_2026_industry: rows 22-34, cols A and G-I.
  raw <- read_sheet("rtb_2026_industry", path = xlsx_path,
                    range = cellranger::cell_limits(ul = c(22, 1), lr = c(34, 9)))
  names(raw) <- c("industry", paste0("c", 2:ncol(raw)))
  d <- raw
  d$top4_public    <- suppressWarnings(as.numeric(d$c9))  # col I
  d$other_public   <- suppressWarnings(as.numeric(d$c7))  # col G
  d$private        <- suppressWarnings(as.numeric(d$c8))  # col H
  d$total          <- d$top4_public + d$other_public + d$private
  d <- d[!is.na(d$total) & d$total > 0, ]
  d <- d[order(d$total, decreasing = TRUE), ]
  d$industry <- factor(d$industry, levels = rev(d$industry))

  long <- tibble::tibble(
    industry  = rep(d$industry, 3),
    share     = c(d$top4_public, d$other_public, d$private),
    component = factor(rep(c("Public stock (Top 4)",
                              "Public stock (other)",
                              "Private stock"), each = nrow(d)),
                        levels = c("Public stock (Top 4)",
                                   "Public stock (other)",
                                   "Private stock"))
  )

  ggplot2::ggplot(long, ggplot2::aes(y = industry, x = share, fill = component)) +
    ggplot2::geom_col() +
    ggplot2::scale_x_continuous(labels = scales::percent_format(accuracy = 1)) +
    ggplot2::scale_fill_manual(values = c(
      "Public stock (Top 4)" = "#d62728",
      "Public stock (other)" = "#1f77b4",
      "Private stock"        = "#999999"
    )) +
    ggplot2::labs(
      title    = "Appendix Figure A1: CA Billionaire Wealth by Industry",
      subtitle = "% of total CA billionaire wealth, end of 2025 (Forbes 2026/01/01)",
      x = NULL, y = NULL, fill = NULL
    ) +
    ggplot2::theme_minimal(base_size = 12) +
    ggplot2::theme(
      legend.position    = "bottom",
      plot.title         = ggplot2::element_text(face = "bold"),
      panel.grid.major.y = ggplot2::element_blank()
    )
}

# Read-only display. Edit R/figures.R:167-214.
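The positional names c2 to c9 in build_fig_a1() only match the "col G/H/I" comments if the range really starts at column A and every column is returned. A quick base-R check of that mapping, assuming the project's read_sheet() helper keeps all nine columns (its behavior is not shown in this listing):

```r
# The range runs from column 1 (A) to column 9 (I), so the positional
# names c2..c9 correspond to Excel columns B..I.
col_index   <- 2:9
col_letters <- LETTERS[col_index]
names(col_letters) <- paste0("c", col_index)

col_letters[c("c7", "c8", "c9")]
```

The result pairs c7 with G, c8 with H, and c9 with I, in line with the comments next to each `as.numeric()` call.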

build_fig_a2()

function(shortrunseries_r) {
  # Appendix Figure A2: CA Income Tax paid by Billionaires as % of total CA
  # income tax revenue. Two-series line, 2019-2025: all CA billionaires + top 5.
  panel <- shortrunseries_r$panel
  d <- panel[panel$year %in% 2019:2025, ]
  n <- nrow(d)

  long <- tibble::tibble(
    year   = rep(d$year, 2),
    share  = c(d$ca_inctax_share_total, d$top5_sec_share_of_total),
    series = factor(rep(c("All CA billionaires", "Top 5 (SEC filings)"), each = n),
                     levels = c("All CA billionaires", "Top 5 (SEC filings)"))
  )

  ggplot2::ggplot(long, ggplot2::aes(x = year, y = share, color = series,
                                       shape = series)) +
    ggplot2::geom_line(linewidth = 0.9) +
    ggplot2::geom_point(size = 2.5) +
    ggplot2::scale_x_continuous(breaks = 2019:2025) +
    ggplot2::scale_y_continuous(labels = scales::percent_format(accuracy = 0.1),
                                 limits = c(0, NA)) +
    ggplot2::scale_color_manual(values = c(
      "All CA billionaires" = "#d62728", "Top 5 (SEC filings)" = "#1f77b4"
    )) +
    ggplot2::labs(
      title    = "Appendix Figure A2: CA Income Tax Paid by Billionaires",
      subtitle = "As % of total CA personal income tax revenue, 2019-2025",
      x = NULL, y = NULL, color = NULL, shape = NULL
    ) +
    ggplot2::theme_minimal(base_size = 12) +
    ggplot2::theme(
      legend.position    = "bottom",
      plot.title         = ggplot2::element_text(face = "bold"),
      panel.grid.minor.x = ggplot2::element_blank()
    )
}

# Read-only display. Edit R/figures.R:130-165.
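As a rough sense of scale for the "All CA billionaires" series, the two 2023-2025 averages quoted in the BSZ abstract imply a total for California income tax revenue. The sketch below is back-of-envelope arithmetic on those rounded figures, not a value read from the workbook; the variable names are hypothetical.

```r
# Figures quoted in the BSZ abstract (2023-2025 averages, rounded).
billionaire_ca_inctax_bn <- 3.2    # $B per year
share_of_total           <- 0.024  # 2.4% of total CA income tax revenue

# Implied total CA income tax revenue, $B per year (approximate, because
# both inputs are rounded).
implied_total_bn <- billionaire_ca_inctax_bn / share_of_total
round(implied_total_bn, 1)
```

The implied total is roughly $133 billion per year, which is the order of magnitude the plotted shares are measured against.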

build_fig_a3()

function(billionaires_ca_inctax_r) {
  # Appendix Figure A3: Total Taxes Paid by CA Billionaires on their Public
  # Stock Wealth. Two area panels (% of public-asset wealth; % of public-asset
  # economic income), each stacking CA inctax + Fed inctax + Corp + Property/sales.
  at <- billionaires_ca_inctax_r$all_taxes
  d <- at[at$year %in% 2019:2025, ]
  n <- nrow(d)

  # Tax labels in the natural reading order; factor levels listed in
  # TOP-TO-BOTTOM stack order so geom_area places CA inctax at the bottom
  # (matches the original BSZ chart).
  labels      <- c("CA income tax", "Federal income tax",
                    "Corporate taxes", "Property + sales taxes")
  stack_levels <- rev(labels)
  panel_w <- tibble::tibble(
    year  = rep(d$year, 4),
    share = c(d$ca_inctax_per_public_wealth,
              d$fed_inctax_per_public_wealth,
              d$corp_per_public_wealth,
              d$prop_sales_per_public_wealth),
    tax   = factor(rep(labels, each = n), levels = stack_levels)
  )
  panel_ei <- tibble::tibble(
    year  = rep(d$year, 4),
    share = c(d$ca_inctax_per_econ_income,
              d$fed_inctax_per_econ_income,
              d$corp_per_econ_income,
              d$prop_sales_per_econ_income),
    tax   = factor(rep(labels, each = n), levels = stack_levels)
  )
  fills <- c("CA income tax"          = "#4a1414",
             "Federal income tax"     = "#8a2c2c",
             "Corporate taxes"        = "#c47878",
             "Property + sales taxes" = "#f0d4d4")

  pw <- ggplot2::ggplot(panel_w, ggplot2::aes(x = year, y = share, fill = tax)) +
    ggplot2::geom_area(alpha = 0.85, color = "white", linewidth = 0.3) +
    ggplot2::scale_x_continuous(breaks = 2019:2025) +
    ggplot2::scale_y_continuous(labels = scales::percent_format(accuracy = 0.1)) +
    ggplot2::scale_fill_manual(values = fills) +
    ggplot2::labs(title = "A. Total taxes as % of public-asset wealth",
                   x = NULL, y = NULL, fill = NULL) +
    ggplot2::theme_minimal(base_size = 11) +
    ggplot2::theme(legend.position = "bottom",
                    plot.title = ggplot2::element_text(face = "bold"),
                    panel.grid.minor = ggplot2::element_blank())

  pei <- ggplot2::ggplot(panel_ei, ggplot2::aes(x = year, y = share, fill = tax)) +
    ggplot2::geom_area(alpha = 0.85, color = "white", linewidth = 0.3) +
    ggplot2::scale_x_continuous(breaks = 2019:2025) +
    ggplot2::scale_y_continuous(labels = scales::percent_format(accuracy = 1)) +
    ggplot2::scale_fill_manual(values = fills) +
    ggplot2::labs(title = "B. Total taxes as % of economic income on public assets",
                   x = NULL, y = NULL, fill = NULL) +
    ggplot2::theme_minimal(base_size = 11) +
    ggplot2::theme(legend.position = "bottom",
                    plot.title = ggplot2::element_text(face = "bold"),
                    panel.grid.minor = ggplot2::element_blank())

  patchwork::wrap_plots(pw, pei, ncol = 2) +
    patchwork::plot_annotation(
      title = "Appendix Figure A3: Total Taxes on CA Billionaire Public Stock Wealth"
    )
}

# Read-only display. Edit R/figures.R:65-128.

build_fig_a4()

function(pareto_missing_r) {
  # Appendix Figure A4: Pareto extrapolation for missing CA billionaires.
  # Panel A: Pareto b (empirical vs projected) at each wealth threshold.
  # Panel B: density (empirical vs projected) at each wealth threshold.
  d <- pareto_missing_r
  n <- nrow(d)

  panel_a <- tibble::tibble(
    threshold = rep(d$threshold_b, 2),
    pareto_b  = c(d$pareto_b_emp, d$pareto_b_proj),
    series    = factor(rep(c("Pareto b (Forbes data)", "Pareto b (projected)"),
                             each = n),
                        levels = c("Pareto b (Forbes data)", "Pareto b (projected)"))
  )
  panel_b <- tibble::tibble(
    threshold = rep(d$threshold_b, 2),
    density   = c(d$actual_density, d$projected_density),
    series    = factor(rep(c("Density (Forbes)", "Density (Pareto-projected)"),
                             each = n),
                        levels = c("Density (Forbes)", "Density (Pareto-projected)"))
  )
  colors_a <- c("Pareto b (Forbes data)" = "#d62728",
                "Pareto b (projected)"  = "#1f77b4")
  colors_b <- c("Density (Forbes)"            = "#d62728",
                "Density (Pareto-projected)" = "#1f77b4")

  pa <- ggplot2::ggplot(panel_a,
                         ggplot2::aes(x = threshold, y = pareto_b, color = series,
                                        shape = series)) +
    ggplot2::geom_line(linewidth = 0.9) +
    ggplot2::geom_point(size = 2) +
    ggplot2::labs(title = "A. Pareto coefficient b (actual and projected)",
                   x = "Wealth threshold ($B)", y = "Pareto b", color = NULL, shape = NULL) +
    ggplot2::scale_color_manual(values = colors_a) +
    ggplot2::theme_minimal(base_size = 11) +
    ggplot2::theme(legend.position = "bottom",
                    plot.title = ggplot2::element_text(face = "bold"),
                    panel.grid.minor = ggplot2::element_blank())

  pb <- ggplot2::ggplot(panel_b,
                         ggplot2::aes(x = threshold, y = density, color = series,
                                        shape = series)) +
    ggplot2::geom_line(linewidth = 0.9) +
    ggplot2::geom_point(size = 2) +
    ggplot2::labs(title = "B. Density (Forbes vs. adding missing)",
                   x = "Wealth threshold ($B)", y = "# billionaires in bracket",
                   color = NULL, shape = NULL) +
    ggplot2::scale_color_manual(values = colors_b) +
    ggplot2::theme_minimal(base_size = 11) +
    ggplot2::theme(legend.position = "bottom",
                    plot.title = ggplot2::element_text(face = "bold"),
                    panel.grid.minor = ggplot2::element_blank())

  patchwork::wrap_plots(pa, pb, ncol = 2) +
    patchwork::plot_annotation(
      title = "Appendix Figure A4: Pareto Extrapolation for Missing CA Billionaires"
    )
}

# Read-only display. Edit R/figures.R:6-63.
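For readers unfamiliar with the quantities plotted in Figure A4, the sketch below computes an empirical Pareto b and a bracket count on a made-up wealth vector. It assumes the standard inverted Pareto coefficient (mean wealth above a threshold divided by the threshold); the formula behind `pareto_b_emp` in the workbook is not shown here and may differ. The `wealth` and `thresholds` values are hypothetical, not Forbes data.

```r
# Hypothetical net worths in $B; not Forbes data.
wealth     <- c(1, 1.5, 2, 3, 4, 6, 10, 20, 50, 100)
thresholds <- c(1, 2, 5, 10)

# Inverted Pareto coefficient: mean wealth above the threshold, divided by
# the threshold. Assumed definition; the workbook's formula may differ.
pareto_b <- vapply(
  thresholds,
  function(w) mean(wealth[wealth >= w]) / w,
  numeric(1)
)

# Count of observations in each bracket [w_k, w_{k+1}); the last bracket
# is open-ended.
upper   <- c(thresholds[-1], Inf)
density <- vapply(
  seq_along(thresholds),
  function(k) sum(wealth >= thresholds[k] & wealth < upper[k]),
  integer(1)
)

data.frame(threshold_b = thresholds, pareto_b = pareto_b, density = density)
```

Under an exact Pareto law, b is the same at every threshold, so a b that drifts across thresholds (as it does in this toy vector) indicates departure from a pure Pareto tail.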

Reproducibility

This report is built from the {targets} pipeline in the project root. All upstream inputs come from original-materials/BSZ_MainTablesFigures.xlsx. To re-render:

targets::tar_make()
quarto::quarto_render("report.qmd")

The 506-expectation testthat suite (Rscript tests/testthat.R) verifies every R re-derivation against the Excel cached values within documented tolerances.
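The idea of verifying a re-derivation "within tolerance" can be sketched in base R. This is an illustration only: the project's suite uses testthat, whose `expect_equal()` accepts a `tolerance` argument, and the actual tolerances and values live in the test files. `cached`, `rederived`, and `within_tol()` below are hypothetical.

```r
# Hypothetical values: `cached` stands in for numbers stored in the workbook,
# `rederived` for the same cells recomputed in R.
cached    <- c(a = 0.0412, b = 1532.25, c = 0.2)
rederived <- c(a = 0.0412 + 1e-12, b = 1532.25, c = 0.1 + 0.1)

# Hypothetical helper: TRUE for each element within an absolute tolerance.
within_tol <- function(x, y, tol = 1e-8) {
  stopifnot(length(x) == length(y))
  abs(x - y) <= tol
}

ok <- within_tol(rederived, cached)
print(ok)
all(ok)
```

A tolerance is needed because recomputed values can differ from stored ones by floating-point noise even when the formula is identical.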

To-do: Raw data sources that are NOT yet independently pulled

The Excel workbook combines several public and proprietary inputs. Independent re-pulls from each primary source — to cross-check the workbook’s cached values — are an ongoing strand of this project. Status:

| Raw source | Workbook sheet it feeds | Status |
| --- | --- | --- |
| SEC EDGAR Form 4 (insider transactions) | data_sec_top4, data_sec_all, data_sec_agg | POC done (Huang 2025 only). Sale matches exactly; donation $ requires per-day stock close prices (deferred). Other top-4 billionaires and earlier years not yet pulled. |
| BEA SAGDP / SQGDP macro series (CA + US GDP, deflators) | longrunseries cols AS, AT, W | Not pulled |
| California FTB Personal Income Tax Statistics (B4A bracket detail) | ftb_b4a | Not pulled (workbook ships its own copy; data.ca.gov may have newer release) |
| Saez-Zucman DINA tables (US-wide + CA-wide effective tax rates) | data_dina (cols K, S) | Not pulled |
| IRS SOI Top .001% income statistics | billionairesCAinctax rows 59-72 | Not pulled (literal pass-through from authors’ compilation) |
| Forbes Real-Time Billionaires snapshots | shortrunseries cols B, K, Q | Not pulled (no public historical archive) |
| ProPublica IRS leak | data_sec_propublica | Cannot be re-pulled (restricted-access data) |
| Compustat (corporate financials feeding SEC top-4 columns) | parts of data_sec_top4 | Cannot be re-pulled here (paywalled) |

Interpretation. The R pipeline verifies that R reproduces the Excel cells. The cross-validation work (in progress) verifies that the Excel cells in turn reproduce the public raw data. Until that second layer is complete, the replication is “faithful to the authors’ workbook” but not yet “independently sourced from underlying public data” for most series.
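The deferred step for the Form 4 source, valuing donated shares at the closing price on the day of each gift, could look roughly like the sketch below. Everything in it is hypothetical: the `gifts` and `closes` tables, their column names, and all numbers are invented and do not come from EDGAR, a price feed, or the workbook.

```r
# Hypothetical Form 4 gift rows and closing prices; none of these numbers
# come from EDGAR or from the workbook.
gifts <- data.frame(
  date   = as.Date(c("2025-03-03", "2025-03-04")),
  shares = c(1000, 2500)
)
closes <- data.frame(
  date  = as.Date(c("2025-03-03", "2025-03-04", "2025-03-05")),
  close = c(110, 120, 125)
)

# Left join so a gift with no matching price shows up as NA rather than
# silently dropping out of the total.
valued <- merge(gifts, closes, by = "date", all.x = TRUE)
valued$value_usd <- valued$shares * valued$close

# Fail loudly if any gift date lacks a closing price.
stopifnot(!anyNA(valued$value_usd))
sum(valued$value_usd)
```

Keeping unmatched dates as NA and failing on them is a deliberate choice here: a gift dated on a non-trading day would otherwise vanish from the donation total without warning.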