Overview
This series of posts is intended to get the reader up speed on how to import, format, and use the economic data of Thomas Piketty, Gabriel Zucman, and Emmanuel Saez. Piketty is most known in the US for his seminal 2014 work Capital in the Twenty-First Century, and Saez and Zucman recently released The Triumph of Injustice: How the Rich Dodge Taxes and How to Make Them Pay.
Summary
If there is one thing I learned working with a Big4 accounting firm it was that if the numbers don’t reconcile then any insights are worthless.
This part is code-heavy and if you’ve been following along I would recommend going directly to the bottom and copying & pasting in the full recon script into R and then just follow along as you see how each metric is calculated. I added the TOC to this chapter since I image that it will serve as a reference for field calculations as the reader gets deeper into the data.
Profs’ Piketty, Zucman, and Saez provide several metrics that are aggregates of other variables. In the previous section I have given them the prefix ttl_.
These are the values that we will reconcile with our own calculations based on the documentation.
library(tidyverse)
dina_renamed <- readRDS("temp/dina_renamed.RDS")
sort(names(select(dina_renamed, starts_with("ttl_"))))## [1] "ttl_contributions_social_insurance"
## [2] "ttl_contributions_social_insurance_govt"
## [3] "ttl_income_disposable_extended"
## [4] "ttl_income_factor"
## [5] "ttl_income_factor_capital"
## [6] "ttl_income_factor_labor"
## [7] "ttl_income_fiscal_excl_capgains"
## [8] "ttl_income_fiscal_incl_capgains"
## [9] "ttl_income_national_factor"
## [10] "ttl_income_national_posttax"
## [11] "ttl_income_national_posttax_2"
## [12] "ttl_income_national_pretax"
## [13] "ttl_income_pretax"
## [14] "ttl_income_pretax_capital"
## [15] "ttl_income_pretax_labor"
## [16] "ttl_income_pretax_pension"
## [17] "ttl_income_social_benefits"
## [18] "ttl_income_social_collective"
## [19] "ttl_income_social_collective_2"
## [20] "ttl_income_social_inkind"
## [21] "ttl_income_social_othercash"
## [22] "ttl_income_social_share"
## [23] "ttl_taxes_payments_contributions"
## [24] "ttl_wealth_net"
## [25] "ttl_wealth_personal_net"
Reconcile Core Wealth & Income Series
In this part I manually calculate core totals based on the formulas in the data appendix. My intent was to drill down to a granular level of data whenever possible though there were some aggregations that I couldn’t reconcile as you’ll see below.
Why are they ordered this way?
The reconciliations have to be built in a particular order because some ttl_ fields need the value from other ttl_ fields.
Naming convention
Two new prefixes are introduced in this section:
1. The summ_ prefix refers to a summary statistic that we calculated. In most cases it’s a sum() function but sometimes it’s a mean().
2. The recon_ prefix refers to the result of comparing the ttl_ to the summ_. If all the recons come up zero then we have success!!
Total Personal Factor Labor Income
Each recon is being broken out individually for legibility. I realize this is a nightmare to copy, paste & edit so I’ve added the addendum at the bottom with the full code in one chunk.
df01 <- dina_renamed %>%
mutate(
summ_income_factor_labor =
income_factor_labor_wages +
income_factor_labor_mixed +
income_factor_labor_sales_taxes, #flemp + flmil + flprl
recon_income_factor_labor =
ttl_income_factor_labor - summ_income_factor_labor)
summary(df01$recon_income_factor_labor)## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0 0 0 0 0 0
Total Personal Factor Capital Income
df02 <- df01 %>%
mutate(
summ_income_factor_capital =
income_factor_capital_housing +
income_factor_capital_equity +
income_factor_capital_interest +
income_factor_capital_business +
income_factor_capital_pension_benefits +
payments_interest, #fkhou + fkequ + fkfix + fkbus + fkpen + fkdeb
recon_income_factor_capital =
ttl_income_factor_capital - summ_income_factor_capital
)
summary(df02$recon_income_factor_capital)## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0 0 0 0 0 0
Total Personal Factor Income
df03 <- df02 %>%
mutate(
summ_income_factor =
summ_income_factor_labor +
summ_income_factor_capital, #flinc + fkinc
recon_income_factor =
ttl_income_factor - summ_income_factor
)
summary(df03$recon_income_factor)## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0 0 0 0 0 0
Total Personal Pretax Labor Income
df04 <- df03a %>%
mutate(
summ_income_pretax_labor =
summ_income_factor_labor +
summ_contributions_social_insurance +
income_social_share_labor, #flinc + plcon + plbel
recon_income_pretax_labor =
ttl_income_pretax_labor - summ_income_pretax_labor
)
summary(df04$recon_income_pretax_labor)## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0 0 0 0 0 0
Total Personal Pretax Capital Income
df05 <- df04 %>%
mutate(
summ_income_pretax_capital =
summ_income_factor_capital +
income_investment_payable_pensions +
income_social_share_capital, #fkinc + pkpen + pkbek
recon_income_pretax_capital =
ttl_income_pretax_capital - summ_income_pretax_capital
)
summary(df05$recon_income_pretax_capital)## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0 0 0 0 0 0
Total Personal Pretax Income
df06 <- df05 %>%
mutate(
summ_ttl_income_pretax =
summ_income_pretax_labor +
summ_income_pretax_capital, #plinc + pkinc
recon_ttl_income_pretax =
ttl_income_pretax - summ_ttl_income_pretax
)
summary(df06$recon_ttl_income_pretax)## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.384e-07 0.000e+00 0.000e+00 -5.000e-13 0.000e+00 2.980e-08
Close enough.
Total Fiscal Income Including Capital Gains
df07 <- df06 %>%
mutate(
summ_income_fiscal_incl_capgains =
income_fiscal_wages_pensions +
income_fiscal_business +
income_fiscal_rents +
income_fiscal_interest +
income_fiscal_dividends, #fiwag + fibus + firen + fiint + fidiv
recon_income_fiscal_incl_capgains =
ttl_income_fiscal_incl_capgains - summ_income_fiscal_incl_capgains
)
summary(df07$recon_income_fiscal_incl_capgains)## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0 0 0 0 0 0
Total Fiscal Income Excluding Capital Gains
df07 <- df06 %>%
mutate(
summ_income_fiscal_excl_capgains =
income_fiscal_wages_pensions +
income_fiscal_business +
income_fiscal_rents +
income_fiscal_interest +
income_fiscal_dividends +
income_fiscal_capital_gains, #fiwag + fibus + firen + fiint + fidiv + fikgi
recon_income_fiscal_excl_capgains =
ttl_income_fiscal_excl_capgains - summ_income_fiscal_excl_capgains
)
summary(df07$recon_income_fiscal_excl_capgains)## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0 0 0 0 0 0
Total Personal Disposable Income
df08 <- df07 %>%
mutate(
summ_income_disposable_extended =
income_cash_disposable +
income_social_inkind_transfers +
ttl_income_social_collective, #dicsh + inkindinc + colexp
recon_income_disposable_extended =
ttl_income_disposable_extended - summ_income_disposable_extended
)
summary(df08$recon_income_disposable_extended)## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0 0 0 0 0 0
Total National Factor Income
df09 <- df08 %>%
mutate(
summ_income_national_factor =
summ_income_factor +
income_social_collective_property_paid_by_govt +
income_social_collective_non_profit, #fainc + govin + npinc
recon_income_national_factor =
ttl_income_national_factor - summ_income_national_factor
)
summary(df09$recon_income_national_factor)## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.980e-08 0.000e+00 0.000e+00 2.000e-12 0.000e+00 4.768e-07
Total National Pre-tax Income
df10 <- df09 %>%
mutate(
summ_income_national_pretax =
summ_ttl_income_pretax +
income_social_collective_property_paid_by_govt +
income_social_collective_non_profit +
surplus_primary_public_pension_system +
income_investment_pensions_payable, #ptinc + govin + npinc + prisupen + invpen
recon_income_national_pretax =
ttl_income_national_pretax - summ_income_national_pretax
)
summary(df10$recon_income_national_pretax)## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -2.384e-07 0.000e+00 0.000e+00 -4.500e-13 0.000e+00 2.980e-08
Total National Post-tax Income
df11 <- df10 %>%
mutate(
summ_income_national_posttax =
summ_income_disposable_extended +
income_social_collective_property_paid_by_govt +
income_social_collective_non_profit +
surplus_primary_private_pension_system +
income_investment_pensions_payable +
surplus_primary_government, # diinc + govin + npinc + prisupenprivate + invpen + prisupgov
recon_income_national_posttax =
ttl_income_national_posttax - summ_income_national_posttax
)
summary(df11$recon_income_national_posttax)## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0 0 0 0 0 0
Total Net Personal Wealth
df12 <- df11 %>%
mutate(
summ_wealth_net =
assets_equity +
assets_currency +
assets_housing +
assets_business +
assets_pension_lifeins +
liabilities_household, #hwequ + hwfix + hwhou + hwbus + hwpen + hwdeb
recon_wealth_net =
ttl_wealth_net - summ_wealth_net
)
summary(df12$recon_wealth_net)## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0 0 0 0 0 0
Detailed reconciliations
Taxes
df18 <- df17 %>%
mutate(
summ_taxes_paid =
taxes_capital_sales_excise +
taxes_property_housing +
taxes_property_business +
taxes_income_wealth_current +
taxes_federal_income +
taxes_state_income +
taxes_sales_excise +
taxes_corporate +
taxes_estate,
summ_taxes_effective_tax_rate = summ_taxes_paid / ttl_income_national_pretax,
summ_taxes_cohort_income_prepost_tax_rate = if_else(
summ_income_national_pretax != 0,
1 - (summ_income_national_posttax / summ_income_national_pretax),
0)
)
summary(df18$summ_taxes_effective_tax_rate)## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## -2049.594 0.144 0.317 0.357 0.519 722.578 6200
Calculate total cohort income
df19 <- df18 %>%
mutate(
summ_cohort_national_income_pretax = ttl_income_national_pretax * population_weight,
summ_cohort_national_income_posttax = ttl_income_national_posttax * population_weight,
summ_cohort_wealth_net = ttl_wealth_net * population_weight)
df19 %>%
group_by(year) %>%
summarize(summ_national_income_pretax = sum(summ_cohort_national_income_pretax),
summ_national_income_posttax = sum(summ_cohort_national_income_posttax),
summ_wealth_net = sum(summ_cohort_wealth_net))## # A tibble: 6 x 4
## year summ_national_income_pretax summ_national_income_posttax summ_wealth_net
## <fct> <dbl> <dbl> <dbl>
## 1 1968 832027000000. 832074959120. 2595593111106.
## 2 1978 2029741000000. 2029771071890 5618501388471.
## 3 1988 4492753000000. 4492765000000 15170572206378.
## 4 1998 7856187000000. 7856195323930. 30978943791586.
## 5 2008 12426422999998. 12426475661370. 50280852524672.
## 6 2018 17638286460000 17638376930000 88662475850000.
Get a full summary of reconciliations
df19 %>%
select(starts_with("recon_")) %>%
summary()## recon_income_factor_labor recon_income_factor_capital recon_income_factor
## Min. :0 Min. :0 Min. :0
## 1st Qu.:0 1st Qu.:0 1st Qu.:0
## Median :0 Median :0 Median :0
## Mean :0 Mean :0 Mean :0
## 3rd Qu.:0 3rd Qu.:0 3rd Qu.:0
## Max. :0 Max. :0 Max. :0
## recon_contributions_social_insurance recon_income_pretax_labor
## Min. :0 Min. :0
## 1st Qu.:0 1st Qu.:0
## Median :0 Median :0
## Mean :0 Mean :0
## 3rd Qu.:0 3rd Qu.:0
## Max. :0 Max. :0
## recon_income_pretax_capital recon_ttl_income_pretax
## Min. :0 Min. :-2.384e-07
## 1st Qu.:0 1st Qu.: 0.000e+00
## Median :0 Median : 0.000e+00
## Mean :0 Mean :-5.000e-13
## 3rd Qu.:0 3rd Qu.: 0.000e+00
## Max. :0 Max. : 2.980e-08
## recon_income_fiscal_excl_capgains recon_income_disposable_extended
## Min. :0 Min. :0
## 1st Qu.:0 1st Qu.:0
## Median :0 Median :0
## Mean :0 Mean :0
## 3rd Qu.:0 3rd Qu.:0
## Max. :0 Max. :0
## recon_income_national_factor recon_income_national_pretax
## Min. :-2.980e-08 Min. :-2.384e-07
## 1st Qu.: 0.000e+00 1st Qu.: 0.000e+00
## Median : 0.000e+00 Median : 0.000e+00
## Mean : 2.000e-12 Mean :-4.500e-13
## 3rd Qu.: 0.000e+00 3rd Qu.: 0.000e+00
## Max. : 4.768e-07 Max. : 2.980e-08
## recon_income_national_posttax recon_wealth_net recon_income_social_inkind
## Min. :0 Min. :0 Min. :0
## 1st Qu.:0 1st Qu.:0 1st Qu.:0
## Median :0 Median :0 Median :0
## Mean :0 Mean :0 Mean :0
## 3rd Qu.:0 3rd Qu.:0 3rd Qu.:0
## Max. :0 Max. :0 Max. :0
## recon_income_social_othercash recon_contributions_social_insurance_govt
## Min. :0 Min. :0
## 1st Qu.:0 1st Qu.:0
## Median :0 Median :0
## Mean :0 Mean :0
## 3rd Qu.:0 3rd Qu.:0
## Max. :0 Max. :0
## recon_income_social_collective
## Min. : -812723
## 1st Qu.: 1493
## Median : 5714
## Mean : 68513
## 3rd Qu.: 29597
## Max. :529727066
End notes
If all of your recons are zero or near zero then you should be confident that the totals in the source are accurate.
There is an issue with the social_collective category. I’ll see if I can get some clarification from the economists. We have the total to work with and don’t need the component parts for anything now so feel free to remove that section if you wish.
Next up: Creating Income & Wealth Proportions
Addendum: Full recon code
dina_reconciled <- dina_renamed %>%
mutate(
# Core wealth & income aggregates
summ_income_factor_labor =
income_factor_labor_wages +
income_factor_labor_mixed +
income_factor_labor_sales_taxes, #flemp + flmil + flprl
recon_income_factor_labor =
ttl_income_factor_labor - summ_income_factor_labor,
summ_income_factor_capital =
income_factor_capital_housing +
income_factor_capital_equity +
income_factor_capital_interest +
income_factor_capital_business +
income_factor_capital_pension_benefits +
payments_interest, #fkhou + fkequ + fkfix + fkbus + fkpen + fkdeb
recon_income_factor_capital =
ttl_income_factor_capital - summ_income_factor_capital,
summ_income_factor =
summ_income_factor_labor +
summ_income_factor_capital, #flinc + fkinc
recon_income_factor =
ttl_income_factor - summ_income_factor,
summ_contributions_social_insurance =
contributions_social_insurance_pensions +
contributions_social_insurance_di_ui,
recon_contributions_social_insurance =
ttl_contributions_social_insurance - summ_contributions_social_insurance,
summ_income_pretax_labor =
summ_income_factor_labor +
summ_contributions_social_insurance +
income_social_share_labor, #flinc + plcon + plbel
recon_income_pretax_labor =
ttl_income_pretax_labor - summ_income_pretax_labor,
summ_income_pretax_capital =
summ_income_factor_capital +
income_investment_payable_pensions +
income_social_share_capital, #fkinc + pkpen + pkbek
recon_income_pretax_capital =
ttl_income_pretax_capital - summ_income_pretax_capital,
summ_ttl_income_pretax =
summ_income_pretax_labor +
summ_income_pretax_capital, #plinc + pkinc
recon_ttl_income_pretax =
ttl_income_pretax - summ_ttl_income_pretax,
summ_income_fiscal_incl_capgains =
income_fiscal_wages_pensions +
income_fiscal_business +
income_fiscal_rents +
income_fiscal_interest +
income_fiscal_dividends, #fiwag + fibus + firen + fiint + fidiv
recon_income_fiscal_incl_capgains =
ttl_income_fiscal_incl_capgains - summ_income_fiscal_incl_capgains,
summ_income_fiscal_excl_capgains =
income_fiscal_wages_pensions +
income_fiscal_business +
income_fiscal_rents +
income_fiscal_interest +
income_fiscal_dividends +
income_fiscal_capital_gains, #fiwag + fibus + firen + fiint + fidiv + fikgi
recon_income_fiscal_excl_capgains =
ttl_income_fiscal_excl_capgains - summ_income_fiscal_excl_capgains,
summ_income_disposable_extended =
income_cash_disposable +
income_social_inkind_transfers +
ttl_income_social_collective, #dicsh + inkindinc + colexp
recon_income_disposable_extended =
ttl_income_disposable_extended - summ_income_disposable_extended,
summ_income_national_factor =
summ_income_factor +
income_social_collective_property_paid_by_govt +
income_social_collective_non_profit, #fainc + govin + npinc
recon_income_national_factor =
ttl_income_national_factor - summ_income_national_factor,
summ_income_national_pretax =
summ_ttl_income_pretax +
income_social_collective_property_paid_by_govt +
income_social_collective_non_profit +
surplus_primary_public_pension_system +
income_investment_pensions_payable, #ptinc + govin + npinc + prisupen + invpen
recon_income_national_pretax =
ttl_income_national_pretax - summ_income_national_pretax,
summ_income_national_posttax =
summ_income_disposable_extended +
income_social_collective_property_paid_by_govt +
income_social_collective_non_profit +
surplus_primary_private_pension_system +
income_investment_pensions_payable +
surplus_primary_government, # diinc + govin + npinc + prisupenprivate + invpen + prisupgov
recon_income_national_posttax =
ttl_income_national_posttax - summ_income_national_posttax,
summ_wealth_net =
assets_equity +
assets_currency +
assets_housing +
assets_business +
assets_pension_lifeins +
liabilities_household, #hwequ + hwfix + hwhou + hwbus + hwpen + hwdeb
recon_wealth_net =
ttl_wealth_net - summ_wealth_net,
# Detailed wealth & income aggregates
summ_income_social_inkind =
income_social_inkind_pell +
income_social_inkind_vethealth,
recon_income_social_inkind =
ttl_income_social_inkind - summ_income_social_inkind,
summ_income_social_othercash =
income_social_othercash_tanf +
income_social_othercash_cashlocalstate,
recon_income_social_othercash =
ttl_income_social_othercash - summ_income_social_othercash,
summ_income_social_share =
income_social_share_labor +
income_social_share_capital,
recon_income_social_share =
ttl_income_social_share - summ_income_social_share,
summ_income_social_collective =
income_social_collective_property_paid_by_govt +
income_social_collective_non_profit +
income_social_collective_education,
recon_income_social_collective =
ttl_income_social_collective - summ_income_social_collective,
recon_income_social_collective_2 =
ttl_income_social_collective_2 - summ_income_social_collective,
adj_income_social_collective =
ttl_income_social_collective -
summ_income_social_collective,
recon_adj_income_social_collective =
ttl_income_social_collective -
summ_income_social_collective +
adj_income_social_collective,
summ_contributions_social_insurance_govt =
contributions_social_insurance_govt_pensions_ui_di +
contributions_social_insurance_govt_other,
recon_contributions_social_insurance_govt =
ttl_contributions_social_insurance_govt - summ_contributions_social_insurance_govt,
summ_taxes_paid =
taxes_capital_sales_excise +
taxes_property_housing +
taxes_property_business +
taxes_income_wealth_current +
taxes_federal_income +
taxes_state_income +
taxes_sales_excise +
taxes_corporate +
taxes_estate,
summ_taxes_effective_tax_rate = summ_taxes_paid / ttl_income_national_pretax,
summ_taxes_cohort_income_prepost_tax_rate = if_else(
summ_income_national_pretax != 0,
1 - (summ_income_national_posttax / summ_income_national_pretax),
0),
summ_cohort_national_income_pretax = ttl_income_national_pretax * population_weight,
summ_cohort_national_income_posttax = ttl_income_national_posttax * population_weight,
summ_cohort_wealth_net = ttl_wealth_net * population_weight
)
saveRDS(dina_reconciled, "temp/dina_reconciled.RDS")
Social Insurance Contributions