1 Abstract

This paper investigates the relationship between health expenditure and life expectancy at birth across 37 OECD countries over the period 2000–2022. Rather than estimating public and private expenditure as separate regressors, the empirical specification decomposes health expenditure into total expenditure per capita and the public share of total expenditure. Using a two-way fixed effects panel data model, we find that total health expenditure, GDP per capita, and urbanisation are positively associated with life expectancy. Contrary to our main hypothesis, the public expenditure share carries a negative coefficient within OECD countries. This result is consistent with possible reverse causality or time-varying confounding, since public expenditure may increase during periods of deteriorating population health. A pre-COVID robustness check confirms that the negative association remains present when the sample is restricted to 2000–2019. The findings contribute to the debate on health-system financing while highlighting the limits of causal interpretation in a static panel model.

Keywords: life expectancy, health expenditure, public finance, panel data, fixed effects, OECD


2 Introduction

Health expenditure is an important determinant of population health, but the level of expenditure alone does not fully capture the policy choices faced by governments. The composition of health financing may also matter. In particular, a greater public role in health expenditure may improve access to essential services, reduce financial barriers to treatment, and support more equitable health outcomes. At the same time, a larger public share may reflect country-specific institutional arrangements, fiscal constraints, or responses to health crises. Understanding this relationship is therefore relevant for both economic policy and health-system design.

The topic is especially important for OECD countries. Many of these countries face ageing populations, rising healthcare costs, and increasing pressure on public budgets. The COVID-19 pandemic also highlighted the importance of resilient health systems and the potential role of public financing during periods of exceptional stress. Examining whether changes in the public-private financing composition are associated with changes in life expectancy may provide useful evidence for policymakers considering how healthcare resources should be allocated.

This paper addresses two research questions:

  1. Do changes in the public-private composition of health expenditure have a significant association with life expectancy in OECD countries after controlling for total expenditure?
  2. Does the association between the public expenditure share and life expectancy depend on a country’s level of GDP per capita?

Main hypothesis (H1): A higher share of public health expenditure in total health spending is positively associated with life expectancy at birth in OECD countries, ceteris paribus.

Secondary hypothesis (H2): The association between the public health expenditure share and life expectancy differs across countries depending on their level of GDP per capita.

The empirical analysis uses a fixed-effects panel model with year controls, allowing the estimates to be identified from within-country changes over time while controlling for time-invariant country characteristics and common shocks. The results should be interpreted as conditional associations rather than definitive causal effects, since reverse causality and time-varying omitted factors may still be present.

The remainder of the paper is structured as follows. The next section reviews the relevant literature. The subsequent sections describe the data, explain the econometric methodology, and present the empirical results and diagnostic tests. The final section summarises the main findings, discusses policy implications and limitations, and identifies possible directions for further research.


3 Literature Review

3.1 Health Expenditure and Life Expectancy: Aggregate Evidence

The relationship between health expenditure and population health outcomes has been extensively studied using panel data methods. The general consensus in the literature is that higher health spending is associated with better health outcomes, although the magnitude and direction of the estimates vary across country groups, time periods, and model specifications.

Aytemiz et al. (2024) examine 32 OECD countries over 2005–2021 and find that health expenditure and GDP per capita positively affect life expectancy at birth in OECD member states. Their analysis provides recent evidence that health spending remains a significant determinant of longevity in high-income countries, the context directly relevant to the present study. Nixon and Ulmann (2006) review the relationship between healthcare expenditure and health outcomes and emphasise that observed associations must be interpreted with caution because a causal link is difficult to establish.

3.2 Public vs Private Health Expenditure

The distinction between public and private health expenditure has received comparatively limited empirical attention, yet it is central to health-policy debates about the role of the state in healthcare provision. Musgrove (1996) discusses the complementary public and private roles in health financing and highlights that financing structures must be evaluated in relation to access, equity, and institutional context.

Linden and Ray (2017) analyse 34 OECD countries over 1970–2012 using a panel time-series approach and find a positive relationship between both public and private health expenditure and life expectancy. Ray and Linden (2020) extend the analysis to 195 countries over 1995–2014 using dynamic panel models. They report that public health expenditure is generally more health-promoting than private expenditure, while primary education effects can be larger than health-expenditure effects. Their study also acknowledges that the dynamic estimators do not provide fully robust answers across all specifications.

Evidence outside the OECD context also illustrates the importance of financing composition. Novignon et al. (2012) examine sub-Saharan African countries using panel data and report positive associations between both public and private healthcare expenditure and health status. Earlier OECD-related evidence remains mixed. Cremieux et al. (2005) study Canadian provinces over 1975–1998 and find that private drug spending has a somewhat larger effect on life expectancy than public expenditure. Lichtenberg (2000), using United States time-series data, finds statistically significant public expenditure effects while private effects are less precise in some specifications. Or (2000) analyses mortality across OECD countries and finds that healthcare provision and financing matter for health outcomes. Together, these studies indicate that the relationship is context-dependent.

3.3 Interpretation of the Public Expenditure Share

The public expenditure share should be interpreted as an indicator of the composition of health financing rather than as a direct measure of healthcare quality, efficiency, or accessibility. A higher public share may reflect broader government involvement in financing healthcare services, but it does not necessarily imply that additional resources are allocated efficiently or that they immediately translate into improved health outcomes. The relationship between financing composition and life expectancy therefore depends on institutional arrangements, the timing of expenditure, and the specific purposes for which public funds are used.

This distinction is particularly important in the OECD context. Countries with similar levels of total health expenditure may differ substantially in the balance between public and private financing, the organisation of healthcare delivery, and the extent of insurance coverage. Changes in the public expenditure share may also occur in response to adverse events, including economic downturns, demographic pressures, or health emergencies. Consequently, the coefficient on the public expenditure share should not be interpreted as a simple ranking of public versus private healthcare systems.

The specification used in this paper separates the level effect of total health expenditure from the composition effect of the public expenditure share. Holding total expenditure constant, the coefficient on the public share captures whether changes in the financing structure are associated with changes in life expectancy within countries over time. This interpretation is narrower than a causal claim, but it provides a useful empirical perspective on how financing structures evolve alongside population health outcomes.

3.4 Reverse Causality and Endogeneity

A recurring methodological concern in this literature is the bidirectional relationship between health expenditure and health outcomes. Ray and Linden (2020) note that while public and private health expenditure can affect life expectancy, the reverse direction is also plausible: deteriorating health conditions may lead to higher expenditure. In OECD countries, this concern is especially relevant because the public share of health spending may rise during health emergencies and periods of fiscal intervention. Static panel models, including the fixed-effects estimator employed in the present study, cannot fully resolve this endogeneity. The resulting coefficients should therefore be interpreted as conditional associations.

3.5 Urbanisation, GDP, and Other Determinants

Beyond health expenditure, several macroeconomic and demographic variables have been identified as important determinants of life expectancy. GDP per capita captures the overall standard of living and access to resources. Urbanisation can be associated with better access to healthcare services, although its net effect may vary across country groups. Ray and Linden (2020) also find that primary education effects on life expectancy can be larger than those of health expenditure, motivating the inclusion of education variables in the general model of the present study.

3.6 Research Gap and Contribution

The existing literature provides valuable evidence on the relationship between health expenditure and population health, but the differential role of public and private financing remains difficult to identify. Directly including public and private expenditure as separate regressors can create multicollinearity because both tend to rise with income and healthcare-system development. The present paper addresses this issue by decomposing expenditure into total health expenditure and the public share of total expenditure. This approach distinguishes the level effect of spending from the composition effect of financing.

The paper contributes an updated analysis for 37 OECD countries over 2000–2022, including the COVID-19 period. It also reports a pre-COVID robustness check to assess whether the negative public-share coefficient is driven primarily by the exceptional mortality shock observed during 2020–2022.


4 Data

4.1 Sources and Variable Definitions

All data were obtained from the World Development Indicators database (World Bank, 2024) using the WDI package in R. The sample covers 37 OECD countries over 2000–2022.

Table 1. Variable definitions and expected signs

| Variable | Definition | WDI code | Transformation | Expected sign |
|---|---|---|---|---|
| Life expectancy | Life expectancy at birth | SP.DYN.LE00.IN | Natural logarithm | Dependent variable |
| Total health expenditure | Public plus private health expenditure per capita | Constructed | Natural logarithm | Positive |
| Public expenditure share | Public expenditure divided by total expenditure | Constructed | Natural logarithm | Positive under H1 |
| GDP per capita | GDP per capita in current USD | NY.GDP.PCAP.CD | Natural logarithm | Positive |
| Urban population | Urban population as a percentage of total population | SP.URB.TOTL.IN.ZS | Natural logarithm | Ambiguous |
| Education expenditure | Education expenditure as a percentage of GDP | SE.XPD.TOTL.GD.ZS | Natural logarithm | Positive |
| Primary completion | Primary school completion rate | SE.PRM.CMPT.ZS | Natural logarithm | Positive |
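
# Load the packages used throughout the analysis
library(WDI)      # WDI(): World Bank data download
library(dplyr)    # %>%, filter(), mutate(), summarise()
library(plm)      # pdata.frame(), plm(), phtest(), pFtest(), plmtest()
library(ggplot2)  # figures
library(aod)      # wald.test() (assumed source; its output format matches the tests below)

# Define WDI indicator codes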
indicators <- c(
  life_exp     = "SP.DYN.LE00.IN",    # Life expectancy at birth (years)
  health_pub   = "SH.XPD.GHED.PC.CD", # Public HE per capita (current USD)
  health_priv  = "SH.XPD.PVTD.PC.CD", # Private HE per capita (current USD)
  gdp_pc       = "NY.GDP.PCAP.CD",    # GDP per capita (current USD)
  urban        = "SP.URB.TOTL.IN.ZS", # Urban population (% of total)
  educ         = "SE.XPD.TOTL.GD.ZS", # Education expenditure (% of GDP)
  primary_comp = "SE.PRM.CMPT.ZS"     # Primary school completion rate (%)
)

# 37 OECD country ISO2 codes
oecd_countries <- c("AU","AT","BE","CA","CL","CO","CZ","DK","EE","FI",
                    "FR","DE","GR","HU","IS","IE","IL","IT","JP","KR",
                    "LV","LT","LU","MX","NL","NZ","NO","PL","PT","SK",
                    "SI","ES","SE","CH","TR","GB","US")

# Download data
raw_data <- WDI(
  country   = oecd_countries,
  indicator = indicators,
  start     = 2000,
  end       = 2022,
  extra     = FALSE
)

4.2 Data Cleaning and Transformation

# Remove rows with missing values in key variables
data_clean <- raw_data %>%
  filter(!is.na(life_exp) & !is.na(health_pub) & !is.na(health_priv) &
           !is.na(gdp_pc) & !is.na(urban))

# Log-transform all variables (following Ray & Linden, 2020)
# In a log-log model, coefficients are interpreted as elasticities
data_clean <- data_clean %>%
  mutate(
    ln_life_exp    = log(life_exp),
    ln_health_pub  = log(health_pub),
    ln_health_priv = log(health_priv),
    ln_gdp_pc      = log(gdp_pc),
    ln_urban       = log(urban),
    ln_educ        = log(educ + 0.01),
    ln_primary     = log(primary_comp + 0.01),
    # Decomposition of HE to address multicollinearity (see Section 4.5)
    health_total       = health_pub + health_priv,
    ln_health_tot      = log(health_total),
    share_pub          = health_pub / health_total,
    ln_share_pub       = log(share_pub),
    interact_share_gdp = ln_share_pub * ln_gdp_pc
  )

# Convert to panel data frame
panel_data <- pdata.frame(data_clean, index = c("iso2c", "year"))

cat("Countries:", length(unique(data_clean$iso2c)), "\n")
## Countries: 37
cat("Years: 2000–2022 (T =", length(unique(data_clean$year)), ")\n")
## Years: 2000–2022 (T = 23 )
cat("Total observations (key vars):", nrow(data_clean), "\n")
## Total observations (key vars): 851
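
The cleaning step filters only on the five key variables, so the optional regressors (ln_educ, ln_primary) can still contain missing values. Models that include them are estimated on fewer observations, which is what the later estimation output reports (N = 553 for the model with ln_primary against N = 851 for the final model). The following is a minimal sketch of how the usable sample per specification could be tabulated. The toy data frame and the helper `count_usable()` are hypothetical stand-ins and do not reproduce data_clean.

```r
# Toy stand-in for data_clean (hypothetical values, for illustration only)
toy <- data.frame(
  iso2c       = rep(c("AA", "BB", "CC"), each = 4),
  year        = rep(2000:2003, times = 3),
  ln_life_exp = log(c(78, 78.2, 78.5, 78.7, 80, 80.1, 80.3, 80.6,
                      75, 75.4, 75.9, 76.1)),
  ln_urban    = log(c(70, 70.5, 71, 71.5, 85, 85.2, 85.4, 85.6,
                      60, 60.8, 61.5, 62.3)),
  ln_primary  = log(c(98, NA, 99, 97, NA, NA, NA, NA, 95, 96, NA, 97))
)

# Hypothetical helper: observations (N) and countries (n) with complete
# data for a given set of model variables
count_usable <- function(df, vars, id = "iso2c") {
  ok <- complete.cases(df[, vars, drop = FALSE])
  c(N = sum(ok), n = length(unique(df[[id]][ok])))
}

core     <- c("ln_life_exp", "ln_urban")
extended <- c(core, "ln_primary")

# One row per specification
usable <- rbind(core     = count_usable(toy, core),
                extended = count_usable(toy, extended))
usable
```

Applied to the real panel, a table of this kind would make explicit that the general-to-specific tests and the final model need not share the same estimation sample.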

4.3 Measurement Considerations

The expenditure and GDP variables are measured in current USD. This choice is transparent and consistent across the baseline specification, but it can combine real changes in resources with inflation and exchange-rate movements. The results should therefore be interpreted as associations based on the observed WDI series rather than as estimates constructed from constant-price or purchasing-power-parity measures. A future extension could assess sensitivity to alternative price-adjusted indicators where comparable coverage is available for the full OECD panel.
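
One consequence of the decomposition is worth making explicit: because the public share is a ratio of two series in the same unit, any common conversion factor cancels, whereas the log of total expenditure shifts by the log of that factor. The sketch below illustrates this with hypothetical numbers; the conversion factor `k` is an assumption introduced for illustration and is not taken from the WDI data. A factor common to all countries in a given year would be absorbed by year effects, while country-specific exchange-rate movements would not.

```r
# Hypothetical per-capita spending for one country-year, in current USD
health_pub  <- 2400
health_priv <- 800

# Hypothetical conversion factor (e.g. a deflator or exchange-rate adjustment)
k <- 0.85

# Public share before and after rescaling both components by k
share_usd <- health_pub / (health_pub + health_priv)
share_adj <- (k * health_pub) / (k * health_pub + k * health_priv)

# Log total expenditure before and after rescaling
ln_tot_usd <- log(health_pub + health_priv)
ln_tot_adj <- log(k * (health_pub + health_priv))

share_unchanged <- isTRUE(all.equal(share_usd, share_adj))
total_shifted   <- isTRUE(all.equal(ln_tot_adj - ln_tot_usd, log(k)))
c(share_unchanged = share_unchanged, total_shifted = total_shifted)
```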

4.4 Summary Statistics

summary_table <- data_clean %>%
  summarise(
    `Life expectancy (years)` = list(c(
      Min = min(life_exp, na.rm = TRUE),
      Q1 = quantile(life_exp, 0.25, na.rm = TRUE),
      Median = median(life_exp, na.rm = TRUE),
      Mean = mean(life_exp, na.rm = TRUE),
      Max = max(life_exp, na.rm = TRUE)
    )),
    `Public HE per capita (USD)` = list(c(
      Min = min(health_pub, na.rm = TRUE),
      Q1 = quantile(health_pub, 0.25, na.rm = TRUE),
      Median = median(health_pub, na.rm = TRUE),
      Mean = mean(health_pub, na.rm = TRUE),
      Max = max(health_pub, na.rm = TRUE)
    )),
    `Private HE per capita (USD)` = list(c(
      Min = min(health_priv, na.rm = TRUE),
      Q1 = quantile(health_priv, 0.25, na.rm = TRUE),
      Median = median(health_priv, na.rm = TRUE),
      Mean = mean(health_priv, na.rm = TRUE),
      Max = max(health_priv, na.rm = TRUE)
    )),
    `GDP per capita (USD)` = list(c(
      Min = min(gdp_pc, na.rm = TRUE),
      Q1 = quantile(gdp_pc, 0.25, na.rm = TRUE),
      Median = median(gdp_pc, na.rm = TRUE),
      Mean = mean(gdp_pc, na.rm = TRUE),
      Max = max(gdp_pc, na.rm = TRUE)
    )),
    `Urban population (%)` = list(c(
      Min = min(urban, na.rm = TRUE),
      Q1 = quantile(urban, 0.25, na.rm = TRUE),
      Median = median(urban, na.rm = TRUE),
      Mean = mean(urban, na.rm = TRUE),
      Max = max(urban, na.rm = TRUE)
    ))
  ) %>%
  tidyr::pivot_longer(
    cols = everything(),
    names_to = "Variable",
    values_to = "Statistics"
  ) %>%
  tidyr::unnest_wider(Statistics)

knitr::kable(
  summary_table,
  digits = 2,
  caption = "Table 2. Summary statistics for the main variables"
)
Table 2. Summary statistics for the main variables
| Variable | Min | Q1 (25%) | Median | Mean | Max |
|---|---|---|---|---|---|
| Life expectancy (years) | 69.75 | 77.05 | 79.87 | 79.06 | 84.56 |
| Public HE per capita (USD) | 92.15 | 766.85 | 1926.19 | 2274.96 | 7871.24 |
| Private HE per capita (USD) | 25.18 | 335.29 | 718.28 | 914.75 | 7203.52 |
| GDP per capita (USD) | 2311.94 | 16620.57 | 31695.83 | 35067.01 | 134965.82 |
| Urban population (%) | 49.76 | 68.00 | 78.31 | 76.40 | 94.99 |

4.5 Multicollinearity Check

# Ray & Linden (2020) note strong correlation between pub and priv HE
# in high-income countries (r = 0.707). We verify this.
cor_pub_priv <- cor(data_clean$ln_health_pub, data_clean$ln_health_priv,
                    use = "complete.obs")
cat("Correlation between ln_health_pub and ln_health_priv:", round(cor_pub_priv, 3), "\n")
## Correlation between ln_health_pub and ln_health_priv: 0.823
# r = 0.823 > 0.8 → multicollinearity problem
# Solution: decompose into total HE + share of public HE
cor_new <- cor(data_clean$ln_health_tot, data_clean$ln_share_pub,
               use = "complete.obs")
cat("Correlation between ln_health_tot and ln_share_pub:", round(cor_new, 3), "\n")
## Correlation between ln_health_tot and ln_share_pub: 0.145
# Initial public-private HE collinearity substantially reduced; residual collinearity with GDP remains.

4.6 Data Visualisation

ggplot(data_clean, aes(x = year, y = life_exp, group = country)) +
  geom_line(alpha = 0.4, color = "steelblue") +
  stat_summary(aes(group = 1), fun = mean, geom = "line",
               color = "darkred", linewidth = 1.2) +
  theme_minimal() +
  labs(title = "Life Expectancy at Birth in OECD Countries (2000–2022)",
       subtitle = "Individual countries (blue) and OECD mean (red)",
       x = "Year", y = "Life expectancy (years)")
Figure 1. Life expectancy trends in OECD countries, 2000–2022.


Figure 1 plots life expectancy trends for individual OECD countries alongside the unweighted OECD mean over the period 2000–2022. The general upward trend is interrupted by a sharp decline in 2020–2021, reflecting the mortality impact of the COVID-19 pandemic. The pace of recovery varies across countries, with some returning to or exceeding pre-pandemic levels by 2022 while others continue to lag behind. This pattern motivates the inclusion of year fixed effects to account for common time shocks.

ggplot(data_clean, aes(x = health_pub + health_priv, y = life_exp)) +
  geom_point(alpha = 0.3, color = "steelblue") +
  geom_smooth(method = "lm", color = "darkred", se = TRUE) +
  theme_minimal() +
  labs(title = "Total Health Expenditure vs Life Expectancy (OECD, 2000–2022)",
       x = "Total health expenditure per capita (USD)",
       y = "Life expectancy at birth (years)")
Figure 2. Total health expenditure versus life expectancy in OECD countries, 2000–2022.


Figure 2 shows a positive relationship between total health expenditure per capita and life expectancy in the pooled country-year observations. The relationship appears non-linear, with diminishing returns at higher expenditure levels, which a straight-line fit in levels cannot capture. This pattern is consistent with the log-log specification adopted in the empirical analysis.

ggplot(data_clean, aes(x = health_pub, y = health_priv)) +
  geom_point(alpha = 0.3, color = "steelblue") +
  geom_smooth(method = "lm", color = "darkred") +
  theme_minimal() +
  labs(title = "Public vs Private Health Expenditure per capita (OECD, 2000–2022)",
       subtitle = paste0("Pearson r (log values) = ", round(cor_pub_priv, 3)),
       x = "Public HE per capita (USD)", y = "Private HE per capita (USD)")
Figure 3. Public versus private health expenditure per capita in OECD countries, 2000–2022.


Figure 3 illustrates the strong positive correlation between public and private health expenditure per capita. The Pearson correlation coefficient between the log-transformed series is 0.823 (Section 4.5). This plot directly motivates the decomposition strategy described in Sections 3.6 and 4.5: including public and private expenditure as separate regressors would create a substantial multicollinearity concern. The final specification therefore separates the overall expenditure level from the public financing share.
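
A pairwise correlation can be translated into a variance inflation factor, VIF = 1 / (1 - R²), where R² comes from regressing one regressor on the others. The sketch below does this on simulated data only. The data-generating process, the seed, and the helper `vif_manual()` are assumptions made for illustration and do not reproduce the WDI panel. In the actual model the VIF would be computed with all regressors included, so collinearity with GDP per capita would also enter.

```r
set.seed(123)
n <- 500

# Simulated logs of public and private spending, both driven by income
income  <- rnorm(n)
ln_pub  <- 7 + 0.9 * income + rnorm(n, sd = 0.4)
ln_priv <- 6 + 0.8 * income + rnorm(n, sd = 0.5)

# Decomposition used in the paper: log total and log public share
ln_tot   <- log(exp(ln_pub) + exp(ln_priv))
ln_share <- ln_pub - ln_tot

# Hypothetical helper: VIF of each column from an auxiliary regression
vif_manual <- function(X) {
  X <- as.matrix(X)
  out <- sapply(seq_len(ncol(X)), function(j) {
    r2 <- summary(lm(X[, j] ~ X[, -j]))$r.squared
    1 / (1 - r2)
  })
  names(out) <- colnames(X)
  out
}

vif_separate   <- vif_manual(cbind(ln_pub, ln_priv))
vif_decomposed <- vif_manual(cbind(ln_tot, ln_share))

rbind(separate = vif_separate, decomposed = vif_decomposed)
```

In this simulation the decomposed pair shows lower variance inflation than the separate pair, which is the same direction as the drop in correlation from 0.823 to 0.145 reported in Section 4.5.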


5 Method / Model

5.1 Model Specification

We estimate a two-way fixed effects panel data model:

\[\ln LE_{it} = \alpha_i + \lambda_t + \beta_1 \ln HE^{tot}_{it} + \beta_2 \ln s^{pub}_{it} + \beta_3 \ln GDP_{it} + \beta_4 \ln Urban_{it} + \varepsilon_{it}\]

where \(\alpha_i\) are country fixed effects, \(\lambda_t\) are year fixed effects, and all variables are expressed in natural logarithms. The coefficients can therefore be interpreted as elasticities. In contrast to Ray and Linden (2020), who employ dynamic panel estimators, this paper uses a static fixed-effects specification consistent with the scope of the study.
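
The link between this specification and one with separate public and private regressors can be stated as a pair of identities. The symbols \(HE^{pub}_{it}\) and \(HE^{priv}_{it}\) are introduced here for illustration and denote public and private expenditure per capita, with \(HE^{tot}_{it} = HE^{pub}_{it} + HE^{priv}_{it}\) and \(s^{pub}_{it} = HE^{pub}_{it} / HE^{tot}_{it}\):

\[\ln HE^{pub}_{it} = \ln HE^{tot}_{it} + \ln s^{pub}_{it}, \qquad \ln HE^{priv}_{it} = \ln HE^{tot}_{it} + \ln\left(1 - s^{pub}_{it}\right)\]

The two parameterisations draw on the same underlying series, but because \(\ln(1 - s^{pub}_{it})\) is a non-linear function of \(\ln s^{pub}_{it}\), the model above is not simply a linear re-expression of a regression on \(\ln HE^{pub}_{it}\) and \(\ln HE^{priv}_{it}\).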

5.2 Rationale for the Fixed-Effects Specification

Country fixed effects absorb time-invariant characteristics that may influence both health expenditure and life expectancy, including persistent institutional differences, health-system structures, and geographic conditions. Year effects absorb common shocks and broad trends affecting all countries, including medical innovation, macroeconomic conditions, and the COVID-19 period. The coefficients are identified from changes within countries over time rather than from permanent differences between countries.
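
The statement that the coefficients are identified from within-country variation can be illustrated with a small simulation. The sketch below, which uses simulated data and a hypothetical helper `demean2()`, shows that including country and year dummies gives the same slope as regressing the double-demeaned outcome on the double-demeaned regressor. The demeaning formula as written applies to balanced panels.

```r
set.seed(42)
n_id <- 6   # simulated countries
n_t  <- 5   # simulated years

d <- expand.grid(id = factor(1:n_id), t = factor(1:n_t))
alpha  <- rnorm(n_id)[as.integer(d$id)]   # country effects
lambda <- rnorm(n_t)[as.integer(d$t)]     # year effects

# Regressor correlated with both effects; true slope is 0.5
d$x <- alpha + lambda + rnorm(nrow(d))
d$y <- 0.5 * d$x + alpha + lambda + rnorm(nrow(d), sd = 0.1)

# (a) Least-squares dummy variables: country and year dummies
lsdv <- lm(y ~ x + id + t, data = d)

# (b) Two-way within transformation (balanced panel)
demean2 <- function(v, id, t) v - ave(v, id) - ave(v, t) + mean(v)
d$x_w <- demean2(d$x, d$id, d$t)
d$y_w <- demean2(d$y, d$id, d$t)
within <- lm(y_w ~ x_w - 1, data = d)

slopes <- c(lsdv   = unname(coef(lsdv)["x"]),
            within = unname(coef(within)["x_w"]))
slopes
```

The point estimates coincide. The standard errors from the naive demeaned regression would not, because that regression does not account for the degrees of freedom used by the fixed effects.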

This design improves comparability relative to pooled OLS, but it does not resolve all sources of endogeneity. Time-varying omitted variables and reverse causality may remain. Accordingly, the estimates are interpreted as conditional within-country associations rather than causal effects.

5.3 Interpretation of the Financing-Composition Decomposition

The decomposition into ln_health_tot and ln_share_pub separates two distinct questions. The coefficient on ln_health_tot captures the association between life expectancy and the overall level of health expenditure per capita, holding the financing composition constant. The coefficient on ln_share_pub captures the association between life expectancy and a change in the public financing share, holding total expenditure constant.

This distinction is central to the interpretation of the paper. A negative coefficient on ln_share_pub does not imply that public healthcare is intrinsically inferior to private healthcare. It indicates that, within the observed OECD sample and conditional on the controls, increases in the public share are associated with lower life expectancy. Such a pattern can be consistent with reverse causality or crisis-related expenditure responses.
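
To give the elasticity a sense of scale, the arithmetic can be worked through with the coefficient on ln_share_pub reported in Section 6.1 and the sample mean of life expectancy from Table 2. The 10 per cent relative increase in the public share is a hypothetical scenario chosen for illustration, and the result describes a conditional association, not a causal effect.

```r
beta_share <- -0.0469729   # coefficient on ln_share_pub (Section 6.1)
mean_le    <- 79.06        # sample mean life expectancy in years (Table 2)

rel_increase <- 0.10       # hypothetical 10% relative rise in the public share

# In a log-log model, life expectancy changes by the factor
# (1 + rel_increase)^beta when the share rises by rel_increase
pct_change_le <- ((1 + rel_increase)^beta_share - 1) * 100
years_at_mean <- mean_le * pct_change_le / 100

round(c(pct_change_le = pct_change_le, years_at_mean = years_at_mean), 3)
```

Under these inputs the associated difference is a little under half of one per cent of life expectancy, or roughly a third of a year at the sample mean.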

5.4 Hausman Test and Individual Effects

# General models for testing
fixed_general <- plm(ln_life_exp ~ ln_health_tot + ln_share_pub +
                       ln_gdp_pc + ln_urban + ln_educ + ln_primary +
                       interact_share_gdp,
                     data  = panel_data,
                     index = c("iso2c", "year"),
                     model = "within")

random_general <- plm(ln_life_exp ~ ln_health_tot + ln_share_pub +
                        ln_gdp_pc + ln_urban + ln_educ + ln_primary +
                        interact_share_gdp,   
                      data  = panel_data,
                      index = c("iso2c", "year"),
                      model = "random")

pols_general <- plm(ln_life_exp ~ ln_health_tot + ln_share_pub +
                      ln_gdp_pc + ln_urban + ln_educ + ln_primary +
                      interact_share_gdp,     
                    data  = panel_data,
                    index = c("iso2c", "year"),
                    model = "pooling")

# Hausman test: H0 = RE consistent, H1 = FE preferred
phtest(fixed_general, random_general)
## 
##  Hausman Test
## 
## data:  ln_life_exp ~ ln_health_tot + ln_share_pub + ln_gdp_pc + ln_urban +  ...
## chisq = 69.011, df = 7, p-value = 0.000000000002341
## alternative hypothesis: one model is inconsistent
# Individual effects
pFtest(fixed_general, pols_general)
## 
##  F test for individual effects
## 
## data:  ln_life_exp ~ ln_health_tot + ln_share_pub + ln_gdp_pc + ln_urban +  ...
## F = 68.57, df1 = 30, df2 = 468, p-value < 0.00000000000000022
## alternative hypothesis: significant effects
plmtest(pols_general, type = c("bp"))
## 
##  Lagrange Multiplier Test - (Breusch-Pagan)
## 
## data:  ln_life_exp ~ ln_health_tot + ln_share_pub + ln_gdp_pc + ln_urban +  ...
## chisq = 1714.8, df = 1, p-value < 0.00000000000000022
## alternative hypothesis: significant effects

The Hausman test (\(\chi^2 = 69.011\), df = 7, p < 0.001) rejects the null hypothesis of random-effects consistency, indicating that the fixed-effects estimator is preferred. Both the F-test for individual effects and the Breusch-Pagan Lagrange Multiplier test reject pooled OLS (p < 0.001).
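
For reference, the statistic behind this test compares the fixed-effects and random-effects coefficient vectors on the \(k\) time-varying regressors. The notation \(\hat{\beta}_{FE}\), \(\hat{\beta}_{RE}\) and \(\widehat{V}\) is introduced here and follows the standard textbook form:

\[H = \left(\hat{\beta}_{FE} - \hat{\beta}_{RE}\right)' \left[\widehat{V}(\hat{\beta}_{FE}) - \widehat{V}(\hat{\beta}_{RE})\right]^{-1} \left(\hat{\beta}_{FE} - \hat{\beta}_{RE}\right) \;\sim\; \chi^2_k \quad \text{under } H_0\]

Here \(k = 7\), matching the degrees of freedom reported in the output. A large value of \(H\) indicates that the two estimators differ systematically, which is the basis for preferring fixed effects.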

5.5 Time Effects Test

# Test whether year dummies are jointly significant
fixed_time <- plm(ln_life_exp ~ ln_health_tot + ln_share_pub +
                    ln_gdp_pc + ln_urban + ln_educ + ln_primary +
                    interact_share_gdp + factor(year),
                  data  = panel_data,
                  index = c("iso2c", "year"),
                  model = "within")

pFtest(fixed_time, fixed_general)
## 
##  F test for individual effects
## 
## data:  ln_life_exp ~ ln_health_tot + ln_share_pub + ln_gdp_pc + ln_urban +  ...
## F = 19.866, df1 = 22, df2 = 446, p-value < 0.00000000000000022
## alternative hypothesis: significant effects
plmtest(fixed_general, c("time"), type = ("bp"))
## 
##  Lagrange Multiplier Test - time effects (Breusch-Pagan)
## 
## data:  ln_life_exp ~ ln_health_tot + ln_share_pub + ln_gdp_pc + ln_urban +  ...
## chisq = 39.605, df = 1, p-value = 0.0000000003109
## alternative hypothesis: significant effects
# Time effects significant → re-estimate fixed_general WITH factor(year)
# From this point all models include year dummies
fixed_general <- plm(ln_life_exp ~ ln_health_tot + ln_share_pub +
                       ln_gdp_pc + ln_urban + ln_educ + ln_primary +
                       interact_share_gdp + factor(year),
                     data  = panel_data,
                     index = c("iso2c", "year"),
                     model = "within")

Both tests confirm that time effects are significant (p < 0.001), justifying the inclusion of year dummies in all models.
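
The F-test for the year dummies compares the residual sums of squares of the restricted model (without year dummies) and the unrestricted model (with year dummies). The symbols \(RSS_R\), \(RSS_U\), \(q\) and \(df_U\) are introduced here for illustration:

\[F = \frac{\left(RSS_R - RSS_U\right) / q}{RSS_U / df_U} \;\sim\; F_{q,\, df_U} \quad \text{under } H_0\]

In the output above \(q = 22\), the number of year dummies for 23 years, and \(df_U = 446\), the residual degrees of freedom of the unrestricted model.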

5.6 General-to-Specific Variable Selection

# Step 1: Test ln_educ (least significant in general model, p=0.632)
h <- rbind(c(0, 0, 0, 0, 1, 0, 0, rep(0, 22)))
wald.test(b = coef(fixed_general), Sigma = vcov(fixed_general), L = h)
## Wald test:
## ----------
## 
## Chi-squared test:
## X2 = 0.23, df = 1, P(> X2) = 0.63
# p = 0.63 → remove
# Step 2: Test ln_educ + interact_share_gdp jointly
h <- rbind(c(0, 0, 0, 0, 1, 0, 0, rep(0, 22)),
           c(0, 0, 0, 0, 0, 0, 1, rep(0, 22)))
wald.test(b = coef(fixed_general), Sigma = vcov(fixed_general), L = h)
## Wald test:
## ----------
## 
## Chi-squared test:
## X2 = 1.2, df = 2, P(> X2) = 0.56
# p = 0.56 → remove both
# Intermediate model: remove ln_educ and interact_share_gdp
fixed_inter <- plm(ln_life_exp ~ ln_health_tot + ln_share_pub +
                     ln_gdp_pc + ln_urban + ln_primary + factor(year),
                   data  = panel_data,
                   index = c("iso2c", "year"),
                   model = "within")
summary(fixed_inter)
## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = ln_life_exp ~ ln_health_tot + ln_share_pub + ln_gdp_pc + 
##     ln_urban + ln_primary + factor(year), data = panel_data, 
##     model = "within", index = c("iso2c", "year"))
## 
## Unbalanced Panel: n = 31, T = 1-23, N = 553
## 
## Residuals:
##       Min.    1st Qu.     Median    3rd Qu.       Max. 
## -0.0539694 -0.0031535  0.0003055  0.0041928  0.0291080 
## 
## Coefficients:
##                    Estimate Std. Error t-value          Pr(>|t|)    
## ln_health_tot     0.0172445  0.0041810  4.1245 0.000043574642261 ***
## ln_share_pub     -0.0371279  0.0085203 -4.3576 0.000015998331587 ***
## ln_gdp_pc         0.0089082  0.0050242  1.7731         0.0768341 .  
## ln_urban          0.1067302  0.0230797  4.6244 0.000004800664634 ***
## ln_primary       -0.0182512  0.0125783 -1.4510         0.1474121    
## factor(year)2001  0.0098642  0.0040079  2.4612         0.0141883 *  
## factor(year)2002  0.0092417  0.0039754  2.3247         0.0204912 *  
## factor(year)2003  0.0073260  0.0040619  1.8036         0.0719067 .  
## factor(year)2004  0.0080057  0.0040428  1.9802         0.0482298 *  
## factor(year)2005  0.0068456  0.0041197  1.6617         0.0972080 .  
## factor(year)2006  0.0074795  0.0041900  1.7851         0.0748604 .  
## factor(year)2007  0.0044716  0.0043428  1.0296         0.3036775    
## factor(year)2008  0.0070331  0.0044530  1.5794         0.1148872    
## factor(year)2009  0.0133304  0.0043848  3.0401         0.0024901 ** 
## factor(year)2010  0.0160660  0.0044174  3.6370         0.0003048 ***
## factor(year)2011  0.0193103  0.0044672  4.3226 0.000018647159135 ***
## factor(year)2012  0.0218625  0.0044296  4.9356 0.000001093923692 ***
## factor(year)2013  0.0233758  0.0044646  5.2358 0.000000243262742 ***
## factor(year)2014  0.0266553  0.0045139  5.9052 0.000000006550707 ***
## factor(year)2015  0.0285565  0.0044249  6.4537 0.000000000260570 ***
## factor(year)2016  0.0312939  0.0044665  7.0064 0.000000000008028 ***
## factor(year)2017  0.0307107  0.0045388  6.7663 0.000000000037424 ***
## factor(year)2018  0.0301046  0.0046281  6.5048 0.000000000190646 ***
## factor(year)2019  0.0335430  0.0046491  7.2149 0.000000000002038 ***
## factor(year)2020  0.0240097  0.0047824  5.0204 0.000000720552916 ***
## factor(year)2021  0.0143887  0.0049715  2.8942         0.0039684 ** 
## factor(year)2022  0.0238261  0.0049007  4.8617 0.000001565081553 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    0.18626
## Residual Sum of Squares: 0.029917
## R-Squared:      0.83939
## Adj. R-Squared: 0.82089
## F-statistic: 95.8122 on 27 and 495 DF, p-value: < 0.000000000000000222
# Step 3: Test ln_primary (p=0.147 in intermediate model)
h <- rbind(c(0, 0, 0, 0, 1, rep(0, 22)))
wald.test(b = coef(fixed_inter), Sigma = vcov(fixed_inter), L = h)
## Wald test:
## ----------
## 
## Chi-squared test:
## X2 = 2.1, df = 1, P(> X2) = 0.15
# p = 0.15 → remove
# Joint Wald test: all removed variables vs fixed_general
h <- rbind(c(0, 0, 0, 0, 1, 0, 0, rep(0, 22)),  # ln_educ
           c(0, 0, 0, 0, 0, 0, 1, rep(0, 22)),   # interact_share_gdp
           c(0, 0, 0, 0, 0, 1, 0, rep(0, 22)))   # ln_primary
wald.test(b = coef(fixed_general), Sigma = vcov(fixed_general), L = h)
## Wald test:
## ----------
## 
## Chi-squared test:
## X2 = 4.6, df = 3, P(> X2) = 0.21
# p = 0.21 → all three can be omitted jointly ✓
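
The restriction matrices above are written out by coefficient position, which is easy to get wrong when the set of regressors changes. As an alternative, the matrix can be built from coefficient names. The sketch below does this on a simulated linear model using base R only. The helper `wald_by_name()` is hypothetical and computes the same chi-squared Wald statistic, (Lb)'(LVL')⁻¹(Lb), by hand.

```r
set.seed(7)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- 1 + 0.8 * d$x1 + rnorm(n)   # x2 and x3 are irrelevant by construction
fit <- lm(y ~ x1 + x2 + x3, data = d)

# Hypothetical helper: Wald chi-squared test that the named
# coefficients are jointly zero
wald_by_name <- function(b, V, drop) {
  stopifnot(all(drop %in% names(b)))
  # One row per restriction, with a 1 in the column of the named coefficient
  L <- matrix(0, nrow = length(drop), ncol = length(b),
              dimnames = list(drop, names(b)))
  L[cbind(seq_along(drop), match(drop, names(b)))] <- 1
  Lb   <- L %*% b
  stat <- as.numeric(t(Lb) %*% solve(L %*% V %*% t(L)) %*% Lb)
  c(X2 = stat, df = length(drop),
    p = pchisq(stat, df = length(drop), lower.tail = FALSE))
}

joint  <- wald_by_name(coef(fit), vcov(fit), drop = c("x2", "x3"))
single <- wald_by_name(coef(fit), vcov(fit), drop = "x2")
joint
```

With the panel models above, a matrix constructed in the same way from names(coef(fixed_general)) could be passed to wald.test() through its L argument, so that the rows for ln_educ, ln_primary and interact_share_gdp do not depend on their position in the formula.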

6 Results

6.1 Final Model

# Final model: all remaining variables significant at 5%
fixed_final <- plm(ln_life_exp ~ ln_health_tot + ln_share_pub +
                     ln_gdp_pc + ln_urban + factor(year),
                   data  = panel_data,
                   index = c("iso2c", "year"),
                   model = "within")
summary(fixed_final)
## Oneway (individual) effect Within Model
## 
## Call:
## plm(formula = ln_life_exp ~ ln_health_tot + ln_share_pub + ln_gdp_pc + 
##     ln_urban + factor(year), data = panel_data, model = "within", 
##     index = c("iso2c", "year"))
## 
## Balanced Panel: n = 37, T = 23, N = 851
## 
## Residuals:
##        Min.     1st Qu.      Median     3rd Qu.        Max. 
## -0.05601144 -0.00333204  0.00050824  0.00409284  0.03117265 
## 
## Coefficients:
##                    Estimate Std. Error t-value              Pr(>|t|)    
## ln_health_tot     0.0107785  0.0030745  3.5058             0.0004810 ***
## ln_share_pub     -0.0469729  0.0059519 -7.8921  0.000000000000009937 ***
## ln_gdp_pc         0.0087448  0.0033931  2.5772             0.0101414 *  
## ln_urban          0.0713760  0.0095355  7.4853  0.000000000000191214 ***
## factor(year)2001  0.0041107  0.0017355  2.3686             0.0180949 *  
## factor(year)2002  0.0046280  0.0017546  2.6377             0.0085123 ** 
## factor(year)2003  0.0031656  0.0018313  1.7287             0.0842620 .  
## factor(year)2004  0.0053277  0.0019300  2.7604             0.0059073 ** 
## factor(year)2005  0.0052865  0.0020084  2.6322             0.0086487 ** 
## factor(year)2006  0.0069679  0.0020797  3.3504             0.0008452 ***
## factor(year)2007  0.0060640  0.0022373  2.7104             0.0068663 ** 
## factor(year)2008  0.0086226  0.0023571  3.6581             0.0002710 ***
## factor(year)2009  0.0140652  0.0023089  6.0918  0.000000001744419377 ***
## factor(year)2010  0.0164942  0.0023436  7.0380  0.000000000004244033 ***
## factor(year)2011  0.0195555  0.0024395  8.0163  0.000000000000003926 ***
## factor(year)2012  0.0214193  0.0024133  8.8755 < 0.00000000000000022 ***
## factor(year)2013  0.0233278  0.0024683  9.4510 < 0.00000000000000022 ***
## factor(year)2014  0.0268635  0.0025005 10.7431 < 0.00000000000000022 ***
## factor(year)2015  0.0280709  0.0023833 11.7781 < 0.00000000000000022 ***
## factor(year)2016  0.0304133  0.0024187 12.5741 < 0.00000000000000022 ***
## factor(year)2017  0.0303470  0.0024853 12.2107 < 0.00000000000000022 ***
## factor(year)2018  0.0304854  0.0025651 11.8845 < 0.00000000000000022 ***
## factor(year)2019  0.0339061  0.0025744 13.1707 < 0.00000000000000022 ***
## factor(year)2020  0.0257021  0.0026535  9.6860 < 0.00000000000000022 ***
## factor(year)2021  0.0184280  0.0028229  6.5281  0.000000000119211071 ***
## factor(year)2022  0.0259716  0.0027561  9.4232 < 0.00000000000000022 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Total Sum of Squares:    0.30029
## Residual Sum of Squares: 0.043711
## R-Squared:      0.85444
## Adj. R-Squared: 0.84299
## F-statistic: 177.908 on 26 and 788 DF, p-value: < 0.000000000000000222

6.2 Publication-Quality Table

# White1 HC0 heteroskedasticity-robust standard errors (method = "white1" is not cluster-robust)
robust_se <- list(
  sqrt(diag(vcovHC(fixed_general, method = "white1", type = "HC0", cluster = "group"))),
  sqrt(diag(vcovHC(fixed_inter,   method = "white1", type = "HC0", cluster = "group"))),
  sqrt(diag(vcovHC(fixed_final,   method = "white1", type = "HC0", cluster = "group")))
)

stargazer(fixed_general, fixed_inter, fixed_final,
          title  = "Table 3. Panel Data Models: Health Expenditure and Life Expectancy in OECD Countries (2000–2022)",
          dep.var.labels  = "ln(Life Expectancy at Birth)",
          covariate.labels = c("ln(Total HE per capita)",
                               "ln(Share of public HE)",
                               "ln(GDP per capita)",
                               "ln(Urban population %)",
                               "ln(Education exp. % GDP)",
                               "ln(Primary completion rate)",
                               "ln(Share pub) x ln(GDP pc)"),
          column.labels   = c("General", "Intermediate", "Final"),
          omit            = "year",
          add.lines       = list(c("Year fixed effects",    "Yes", "Yes", "Yes"),
                                 c("Country fixed effects", "Yes", "Yes", "Yes")),
          star.cutoffs    = c(0.10, 0.05, 0.01),
          align           = TRUE,
          se              = robust_se,
          type            = "html",
          notes           = NULL,
          notes.append    = FALSE)
Table 3. Panel Data Models: Health Expenditure and Life Expectancy in OECD Countries (2000–2022)

Dependent variable: ln(Life Expectancy at Birth)

                              General                   Intermediate              Final
                              (1)                       (2)                       (3)
ln(Total HE per capita)       0.007                     0.017***                  0.011***
                              (0.005)                   (0.005)                   (0.004)
ln(Share of public HE)        -0.108**                  -0.037***                 -0.047***
                              (0.047)                   (0.010)                   (0.007)
ln(GDP per capita)            0.019***                  0.009                     0.009**
                              (0.007)                   (0.006)                   (0.004)
ln(Urban population %)        0.101***                  0.107***                  0.071***
                              (0.021)                   (0.021)                   (0.008)
ln(Education exp. % GDP)      -0.002
                              (0.005)
ln(Primary completion rate)   -0.024**                  -0.018
                              (0.011)                   (0.011)
ln(Share pub) x ln(GDP pc)    0.007
                              (0.005)
Year fixed effects            Yes                       Yes                       Yes
Country fixed effects         Yes                       Yes                       Yes
Observations                  506                       553                       851
R2                            0.840                     0.839                     0.854
Adjusted R2                   0.818                     0.821                     0.843
F Statistic                   80.490*** (df = 29; 446)  95.812*** (df = 27; 495)  177.908*** (df = 26; 788)
Note: *p<0.1; **p<0.05; ***p<0.01
cat("<p><em>Note:</em> White1 HC0 heteroskedasticity-robust standard errors are reported in parentheses. * p &lt; 0.10; ** p &lt; 0.05; *** p &lt; 0.01.</p>")

Note: White1 HC0 heteroskedasticity-robust standard errors are reported in parentheses. * p < 0.10; ** p < 0.05; *** p < 0.01.

6.3 Diagnostic Tests

# Breusch-Godfrey: serial autocorrelation
# H0: no serial correlation
pbgtest(fixed_final)
## 
##  Breusch-Godfrey/Wooldridge test for serial correlation in panel models
## 
## data:  ln_life_exp ~ ln_health_tot + ln_share_pub + ln_gdp_pc + ln_urban +  ...
## chisq = 408.43, df = 23, p-value < 0.00000000000000022
## alternative hypothesis: serial correlation in idiosyncratic errors
# Breusch-Pagan: heteroskedasticity
# H0: homoskedasticity
bptest(ln_life_exp ~ ln_health_tot + ln_share_pub +
         ln_gdp_pc + ln_urban + factor(year),
       data = data_clean, studentize = TRUE)
## 
##  studentized Breusch-Pagan test
## 
## data:  ln_life_exp ~ ln_health_tot + ln_share_pub + ln_gdp_pc + ln_urban +     factor(year)
## BP = 124.58, df = 26, p-value = 0.000000000000007797
# White1 HC0 heteroskedasticity-robust standard errors (method = "white1" is not cluster-robust)
coeftest(fixed_final,
         vcov. = vcovHC(fixed_final, method = "white1",
                        type = "HC0", cluster = "group"))
## 
## t test of coefficients:
## 
##                    Estimate Std. Error t value              Pr(>|t|)    
## ln_health_tot     0.0107785  0.0039381  2.7370             0.0063398 ** 
## ln_share_pub     -0.0469729  0.0069225 -6.7856 0.0000000000227259515 ***
## ln_gdp_pc         0.0087448  0.0040515  2.1584             0.0311964 *  
## ln_urban          0.0713760  0.0078972  9.0381 < 0.00000000000000022 ***
## factor(year)2001  0.0041107  0.0019830  2.0729             0.0385030 *  
## factor(year)2002  0.0046280  0.0019008  2.4347             0.0151231 *  
## factor(year)2003  0.0031656  0.0020027  1.5807             0.1143424    
## factor(year)2004  0.0053277  0.0019772  2.6946             0.0071978 ** 
## factor(year)2005  0.0052865  0.0020190  2.6183             0.0090057 ** 
## factor(year)2006  0.0069679  0.0022175  3.1422             0.0017395 ** 
## factor(year)2007  0.0060640  0.0025145  2.4116             0.0161089 *  
## factor(year)2008  0.0086226  0.0024855  3.4692             0.0005503 ***
## factor(year)2009  0.0140652  0.0023607  5.9581 0.0000000038452070293 ***
## factor(year)2010  0.0164942  0.0024069  6.8529 0.0000000000146011103 ***
## factor(year)2011  0.0195555  0.0025726  7.6014 0.0000000000000833403 ***
## factor(year)2012  0.0214193  0.0025369  8.4433 < 0.00000000000000022 ***
## factor(year)2013  0.0233278  0.0026347  8.8542 < 0.00000000000000022 ***
## factor(year)2014  0.0268635  0.0026539 10.1224 < 0.00000000000000022 ***
## factor(year)2015  0.0280709  0.0024741 11.3459 < 0.00000000000000022 ***
## factor(year)2016  0.0304133  0.0025718 11.8258 < 0.00000000000000022 ***
## factor(year)2017  0.0303470  0.0026264 11.5548 < 0.00000000000000022 ***
## factor(year)2018  0.0304854  0.0027229 11.1960 < 0.00000000000000022 ***
## factor(year)2019  0.0339061  0.0027620 12.2760 < 0.00000000000000022 ***
## factor(year)2020  0.0257021  0.0034633  7.4212 0.0000000000003011198 ***
## factor(year)2021  0.0184280  0.0042479  4.3382 0.0000162323000138657 ***
## factor(year)2022  0.0259716  0.0031372  8.2785 0.0000000000000005321 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# VIF — multicollinearity check on OLS equivalent
ols_final <- lm(ln_life_exp ~ ln_health_tot + ln_share_pub +
                  ln_gdp_pc + ln_urban + factor(year),
                data = data_clean)
vif(ols_final)
##                    GVIF Df GVIF^(1/(2*Df))
## ln_health_tot 21.958909  1        4.686033
## ln_share_pub   1.081380  1        1.039894
## ln_gdp_pc     22.682276  1        4.762591
## ln_urban       1.202760  1        1.096704
## factor(year)   1.225924 22        1.004640

The Breusch-Godfrey/Wooldridge test rejects the null of no serial correlation (χ² = 408.43, df = 23, p < 0.001), and the studentised Breusch-Pagan test rejects homoskedasticity (BP = 124.58, df = 26, p < 0.001). The reported White1 HC0 standard errors are robust to heteroskedasticity, but the "white1" estimator assumes no serial correlation within countries, even when it is computed with cluster = "group". Serial correlation therefore remains a limitation of the baseline inference strategy and should be considered when interpreting statistical significance.
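
Given the serial correlation detected above, one option is to compare the White-type errors with Arellano's estimator, which plm's vcovHC() provides through method = "arellano" and which allows for both heteroskedasticity and within-group serial correlation. The sketch below illustrates the comparison on the Grunfeld demo data shipped with plm, not on the paper's panel; fe_demo and the other object names are illustrative assumptions, and the output says nothing about how the paper's own standard errors would change.

```r
library(plm)

# Demo data shipped with plm (not the paper's data): 10 firms observed 1935-1954.
data("Grunfeld", package = "plm")

# Within model with year dummies, mirroring the structure of the final specification.
fe_demo <- plm(inv ~ value + capital + factor(year),
               data  = Grunfeld,
               index = c("firm", "year"),
               model = "within")

# White-type estimator: robust to heteroskedasticity only.
se_white1 <- sqrt(diag(vcovHC(fe_demo, method = "white1",
                              type = "HC0", cluster = "group")))

# Arellano estimator: robust to heteroskedasticity and within-group serial correlation.
se_arellano <- sqrt(diag(vcovHC(fe_demo, method = "arellano",
                                type = "HC0", cluster = "group")))

# Side-by-side comparison for the two substantive regressors.
se_compare <- cbind(white1 = se_white1, arellano = se_arellano)[c("value", "capital"), ]
print(round(se_compare, 4))
```

Applied to the final model, the same call with method = "arellano" could be passed to coeftest() in place of the White1 matrix; whether the significance levels reported here would survive is not something this sketch can answer.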

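The VIF diagnostics, which are computed on a pooled OLS version of the final specification without country effects, also indicate residual collinearity between total health expenditure and GDP per capita (GVIF of roughly 22 for each, against values close to 1 for the other regressors). The decomposition substantially reduces the initial correlation between public and private expenditure, but it does not eliminate all collinearity among the retained regressors.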
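
To make the size of this collinearity concrete, the arithmetic below translates the reported GVIF values into the share of each regressor's variance explained by the other regressors. For a term with one degree of freedom the GVIF coincides with the ordinary VIF, and its square root is the factor by which the standard error is inflated relative to uncorrelated regressors. The R² values are implied by the GVIF output above rather than estimated separately.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2}
\quad\Longrightarrow\quad
R_j^2 = 1 - \frac{1}{\mathrm{VIF}_j}

\texttt{ln\_health\_tot}:\quad R_j^2 \approx 1 - \frac{1}{21.96} \approx 0.954,
\qquad \sqrt{21.96} \approx 4.69

\texttt{ln\_gdp\_pc}:\quad R_j^2 \approx 1 - \frac{1}{22.68} \approx 0.956,
\qquad \sqrt{22.68} \approx 4.76
```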

6.4 Robustness Check: Excluding the COVID-19 Period

To assess whether the negative coefficient on the public expenditure share is driven primarily by the COVID-19 period, the final fixed-effects specification is re-estimated using the pre-COVID sample covering 2000–2019.

# Keep pre-COVID observations only: 2000–2019
# year is stored as a factor in panel_data, so convert it safely before filtering
panel_precovid <- subset(
  panel_data,
  as.numeric(as.character(year)) <= 2019
)

# Re-estimate the final specification on the pre-COVID sample
fixed_precovid <- plm(
  ln_life_exp ~ ln_health_tot + ln_share_pub +
    ln_gdp_pc + ln_urban + factor(year),
  data  = panel_precovid,
  index = c("iso2c", "year"),
  model = "within"
)

# Robust standard errors for the baseline and pre-COVID models
baseline_robust_se <- sqrt(
  diag(
    vcovHC(
      fixed_final,
      method  = "white1",
      type    = "HC0",
      cluster = "group"
    )
  )
)

precovid_robust_se <- sqrt(
  diag(
    vcovHC(
      fixed_precovid,
      method  = "white1",
      type    = "HC0",
      cluster = "group"
    )
  )
)

# Table 4
stargazer(
  fixed_final,
  fixed_precovid,
  type = "html",
  se = list(
    baseline_robust_se,
    precovid_robust_se
  ),
  title = "Table 4. Robustness Check Excluding the COVID-19 Period",
  dep.var.labels = "ln(Life Expectancy at Birth)",
  column.labels = c(
    "Baseline: 2000-2022",
    "Pre-COVID: 2000-2019"
  ),
  covariate.labels = c(
    "ln(Total HE per capita)",
    "ln(Share of public HE)",
    "ln(GDP per capita)",
    "ln(Urban population %)"
  ),
  omit = "factor\\(year\\)",
  add.lines = list(
    c("Country fixed effects", "Yes", "Yes"),
    c("Year fixed effects", "Yes", "Yes")
  ),
  star.cutoffs = c(0.10, 0.05, 0.01),
  notes = NULL,
  notes.append = FALSE,
  align = TRUE
)
Table 4. Robustness Check Excluding the COVID-19 Period

Dependent variable: ln(Life Expectancy at Birth)

                              Baseline: 2000-2022         Pre-COVID: 2000-2019
                              (1)                         (2)
ln(Total HE per capita)       0.011***                    0.010***
                              (0.004)                     (0.004)
ln(Share of public HE)        -0.047***                   -0.036***
                              (0.007)                     (0.006)
ln(GDP per capita)            0.009**                     0.007*
                              (0.004)                     (0.004)
ln(Urban population %)        0.071***                    0.059***
                              (0.008)                     (0.007)
Country fixed effects         Yes                         Yes
Year fixed effects            Yes                         Yes
Observations                  851                         740
R2                            0.854                       0.901
Adjusted R2                   0.843                       0.892
F Statistic                   177.908*** (df = 26; 788)   269.103*** (df = 23; 680)
Note: *p<0.1; **p<0.05; ***p<0.01
cat("<p><em>Note:</em> White1 HC0 heteroskedasticity-robust standard errors are reported in parentheses. * p &lt; 0.10; ** p &lt; 0.05; *** p &lt; 0.01.</p>")

Note: White1 HC0 heteroskedasticity-robust standard errors are reported in parentheses. * p < 0.10; ** p < 0.05; *** p < 0.01.

The coefficient on the public expenditure share remains negative and statistically significant at the 1% level. It shrinks in absolute value, from −0.047 in the full sample to −0.036 in the pre-COVID sample. This suggests that the pandemic period may have amplified the negative association but does not fully explain it: the negative relationship is already present before 2020.

The coefficients on total health expenditure and urbanisation also remain positive and statistically significant at the 1% level. GDP per capita remains positively associated with life expectancy, although it is significant only at the 10% level in the pre-COVID specification, compared with the 5% level in the baseline. As in the baseline model, the estimates should be interpreted as conditional associations rather than causal effects.

6.5 Hypothesis Verification

H1 (main hypothesis): Rejected. The share of public health expenditure has a statistically significant but negative association with life expectancy (β = −0.047, p<0.01), contrary to the hypothesised direction. See Section 7 for discussion.

H2 (secondary hypothesis): Rejected. The interaction term ln_share_pub × ln_gdp_pc was not statistically significant (p=0.298) and was removed during the general-to-specific procedure.


7 Findings and Policy Implications

The results indicate that total health expenditure, GDP per capita, and urbanisation are positively associated with life expectancy in OECD countries, consistent with the broader literature (Ray and Linden, 2020; Aytemiz et al., 2024). Urbanisation has the largest estimated elasticity in the final model (\(\beta = 0.071\)), while total health expenditure and GDP per capita have smaller but statistically significant elasticities.
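
Because both sides of the model are in logarithms, each coefficient can be read as an elasticity. A minimal base-R sketch of that reading is given below: it converts the final-model point estimates reported in Section 6.1 into the percentage change in life expectancy associated with a 10% increase in each regressor. The helper name loglog_effect() is a hypothetical label, and the figures describe conditional associations, not causal effects.

```r
# Hypothetical helper: percentage change in y associated with a given
# percentage change in x when the model is log(y) ~ beta * log(x).
loglog_effect <- function(beta, pct_change_x) {
  100 * ((1 + pct_change_x / 100)^beta - 1)
}

# Point estimates copied from the final model output in Section 6.1.
betas <- c(ln_health_tot = 0.0107785,
           ln_share_pub  = -0.0469729,
           ln_gdp_pc     = 0.0087448,
           ln_urban      = 0.0713760)

# Associated percentage change in life expectancy for a 10% increase in each regressor.
effects_10pct <- loglog_effect(betas, 10)
print(round(effects_10pct, 3))
```

For total health expenditure, for example, a 10% increase corresponds to a rise in life expectancy of roughly 0.1%, which illustrates how small these elasticities are in absolute terms.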

The negative coefficient on the public expenditure share is unexpected but interpretable. Within OECD countries, increases in the public share over time can occur during periods of economic stress or health emergencies, when population health outcomes may also deteriorate. This pattern is consistent with possible reverse causality or time-varying confounding, although a static fixed-effects model cannot establish the mechanism definitively.
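
One simple way to probe the timing argument is to replace contemporaneous regressors with their one-year lags, so that the financing variables are measured before the outcome. The sketch below shows the mechanics with plm's panel-aware lag() on the Grunfeld demo data shipped with plm, not on the paper's panel; the object names are illustrative assumptions. Lagging does not identify a causal effect and is not a substitute for instrumental-variable or dynamic-panel methods; it is only a sensitivity check.

```r
library(plm)

# Demo data shipped with plm (not the paper's data).
data("Grunfeld", package = "plm")
pgrun <- pdata.frame(Grunfeld, index = c("firm", "year"))

# Contemporaneous two-way fixed-effects specification.
fe_contemp <- plm(inv ~ value + capital,
                  data = pgrun, model = "within", effect = "twoways")

# Same specification with regressors lagged one period within each firm.
fe_lagged <- plm(inv ~ lag(value, 1) + lag(capital, 1),
                 data = pgrun, model = "within", effect = "twoways")

print(coef(fe_contemp))
print(coef(fe_lagged))

# The first year of each firm is lost when lagging.
print(c(contemporaneous = nobs(fe_contemp), lagged = nobs(fe_lagged)))
```

In the paper's setting the analogous change would be to enter lag(ln_share_pub, 1) in place of ln_share_pub, under the assumption that panel_data is indexed by country and year as in the models above.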

The pre-COVID robustness check indicates that this pattern is not specific to the pandemic. After excluding 2020–2022, the public-share coefficient remains negative and statistically significant, while it shrinks in absolute value from −0.047 to −0.036. The pandemic period may therefore have amplified the association, but it does not fully account for the result.

From a policy perspective, the findings do not support a simple conclusion that public financing is less effective than private financing. The estimated coefficient refers to changes in the public share while holding total expenditure constant. Decisions about health financing should therefore consider not only the public-private composition of spending, but also expenditure levels, institutional design, the allocation of resources, and the timing of fiscal responses. A higher public share may be particularly important during crises even when the contemporaneous association with life expectancy is negative.

Limitations:

  • Endogeneity of health expenditure variables: a static fixed-effects model cannot fully resolve reverse causality.
  • Residual multicollinearity between total health expenditure and GDP per capita (GVIF approximately 22), which can inflate standard errors.
  • White1 HC0 standard errors address heteroskedasticity but do not fully resolve the serial correlation detected in the diagnostics.
  • The static model does not capture dynamic adjustment paths.
  • Models including education variables use reduced estimation samples because of missing observations (506 and 553 observations in the general and intermediate models, versus 851 in the final model).

Directions for future research:

  • Apply instrumental-variable or dynamic-panel methods to address endogeneity.
  • Examine price-adjusted or purchasing-power-parity expenditure measures where adequate coverage is available.
  • Disaggregate public expenditure by type, such as preventive and curative care.
  • Extend the analysis to non-OECD countries with an inference strategy appropriate for greater heterogeneity and cross-sectional dependence.
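
For the last direction listed above, plm already offers two relevant tools: pcdtest() for Pesaran's test of cross-sectional dependence and vcovSCC() for Driscoll-Kraay standard errors, which are robust to cross-sectional and serial correlation. The sketch below shows both calls on the Grunfeld demo data shipped with plm, not on the paper's panel; the object names are illustrative assumptions, and nothing here is a result of this study.

```r
library(plm)

# Demo data shipped with plm (not the paper's data).
data("Grunfeld", package = "plm")

fe_demo <- plm(inv ~ value + capital,
               data  = Grunfeld,
               index = c("firm", "year"),
               model = "within", effect = "twoways")

# Pesaran CD test; H0: no cross-sectional dependence in the residuals.
cd_result <- pcdtest(fe_demo, test = "cd")
print(cd_result)

# Driscoll-Kraay standard errors.
se_scc <- sqrt(diag(vcovSCC(fe_demo, type = "HC0")))
print(round(se_scc, 4))
```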

8 Bibliography

Aytemiz, L., Sart, G., Bayar, Y., Danilina, M., & Sezgin, F. H. (2024). The long-term effect of social, educational, and health expenditures on indicators of life expectancy: An empirical analysis for the OECD countries. Frontiers in Public Health, 12, 1497794.

Cremieux, P.-Y., Meilleur, M.-C., Ouellette, P., Petit, P., Zelder, M., & Potvin, K. (2005). Public and private pharmaceutical spending as determinants of health outcomes in Canada. Health Economics, 14, 107–116.

Lichtenberg, F. R. (2000). Sources of U.S. longevity increase, 1960–1997. CESifo Working Paper Series No. 405.

Linden, M., & Ray, D. (2017). Life expectancy effects of public and private health expenditures in OECD countries 1970–2012: Panel time series approach. Economic Analysis and Policy, 56, 101–113.

Musgrove, P. (1996). Public and private roles in health: Theory and financing patterns. World Bank Discussion Paper No. 339. Washington, DC: World Bank.

Nixon, J., & Ulmann, P. (2006). The relationship between health care expenditure and health outcomes: Evidence and caveats for a causal link. European Journal of Health Economics, 7(1), 7–18.

Novignon, J., Olakojo, S. A., & Nonvignon, J. (2012). The effects of public and private health care expenditure on health status in sub-Saharan Africa: New evidence from panel data analysis. Health Economics Review, 2(1), 1–8.

Or, Z. (2000). Exploring the effects of health care on mortality across OECD countries. Labour Market and Social Policy Occasional Papers No. 46. Paris: OECD.

Ray, D., & Linden, M. (2020). Health expenditure, longevity, and child mortality: Dynamic panel data approach with global data. International Journal of Health Economics and Management, 20, 99–119. https://doi.org/10.1007/s10754-019-09272-z

World Bank. (2024). World Development Indicators. https://databank.worldbank.org


9 Appendix

The complete annotated R script is submitted separately as AE_Project_HealthExp_FINAL.R.