Identifying heterogeneous impacts of agri-environmental schemes using generalized random forests


Christian Stetter

Agricultural Economics and Policy Group, ETH Zurich

2023-03-28

Today’s talk

Link to the paper

Farming and the environment in the European Union

  • Like other regions in the world, farming in the EU is faced with several environmental challenges:

    - Biodiversity

    - Land degradation

    - Water pollution

    - GHG emissions


  • There has been lively debate on how the Common Agricultural Policy of the EU can tackle these challenges.


  • Consensus about key objectives of the CAP in terms of the environment:

    - Environmental care

    - Climate change action

    - Preservation of landscapes

    - Biodiversity

What are agri-environmental schemes?

  • As a means to reduce the pressure of agriculture on the environment, the EU introduced agri-environmental schemes

  • Not part of direct payments system

  • Voluntary

  • How they work (action-based):

    • Farmers commit themselves to adopt environmentally-friendly farming techniques that go beyond legal obligations.

    • In return, they receive payments that are supposed to provide compensation for additional costs and income foregone (resulting from applying those environmentally friendly farming practices)

  • Examples:
    • Renunciation of pesticides
    • Five-part crop rotation





…but are they actually effective?

What does the literature say?



  • Average treatment effects of AES on, e.g.:

    • Fertilizer and pesticide use [0/-] [1]–[3]
    • Crop diversity [0/+] [4]
    • Grassland (share/diversity) [+] [1], [3], [5]
    • Biodiversity conservation area [+] [6]
  • Usual estimation approaches:

    • Matching

    • Difference-in-differences (DID)

    • DID Matching


but:

  • AES might have different effects on different farms
  • These methods are not built for assessing heterogeneous effects

AES might have different effects on different farms

Stylized example: assessing heterogeneous effects

Imagine the following situation:

The government has set up a program to minimize greenhouse gas emissions from farming. It now wants to evaluate the success of this program and see whether the money spent is worth it.

Research Objectives



  1. Find a suitable approach to estimate heterogeneous effects of AES participation and demonstrate its usefulness.



  2. Assess the impact heterogeneity of AES within the climate, biodiversity, soil and water health domains.



  3. Use contextual knowledge to improve targeting for an increased effectiveness of AES.

Theoretical Framework: Production possibilities

Theoretical Framework:
Conditional average treatment effect

Farm A

Farm B

Farm C

  • Production possibilities depend on farm-specific contextual variables \(X\)

  • Effect of interest: \[\tau(x) = \mathbb{E}\big[ Q^1 - Q^0 \; \vert \; X = x \big]\]

\(X\) domains:

  • Resource bundle and input intensities
  • Output bundle
  • Farm and farmer characteristics
  • Biophysical environment
  • Institutional and market environment

Potential outcomes model


Suppose we observe an i.i.d. sample of farm households \(i = 1, \dots, n\) with data \(\left( X_i, Q_i, D_i \right)\): \(Q_i\) is the observed outcome, \(D_i \in \{0, 1\}\) the AES participation indicator, and \(X_i \in \mathbb{R}^p\) a vector of \(p\) features describing the individual farming context. \(X_i\) is assumed to contain all determinants of the potential outcomes \(Q_i^0\) and \(Q_i^1\) as well as the determinants of the participation decision.


Given the potential outcomes \(Q_i^0\) and \(Q_i^1\), for each farm \(i\) that is (uniquely) characterized by its feature vector \(x\), we wish to estimate the conditional average treatment effect (CATE):

\[\tau(x) = \mathbb{E}\big[ Q_i^1 - Q_i^0 \; \vert \; X_i = x \big]\]


➡️ It is impossible to observe both potential outcomes for the same farm: each farm either participates or does not.

➡️ Hence, we only observe the realization \(Q_i = D_i Q_i^1 + (1 - D_i) Q_i^0\).

➡️ Without further assumptions, it is impossible to identify the CATE \(\tau(x)\).
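The identification problem can be illustrated with a small simulation. This is a minimal sketch under invented assumptions: the data-generating process, the coefficients, and the function name `simulate_farm` are all hypothetical and are not taken from the paper. Because the simulation generates both potential outcomes, the true average effect can be compared with the naive difference in observed means, which is biased when participation depends on the context \(x\).

```python
import random

random.seed(0)

def simulate_farm():
    """Draw one hypothetical farm: context x, both potential outcomes, participation."""
    x = random.random()                      # contextual variable in [0, 1)
    q0 = 100 + 50 * x + random.gauss(0, 5)   # potential outcome without AES
    q1 = q0 - 20 * x                         # potential outcome with AES (effect: -20 * x)
    d = 1 if random.random() < 0.2 + 0.6 * x else 0   # participation depends on x
    q_obs = q1 if d == 1 else q0             # only one potential outcome is observed
    return x, q0, q1, d, q_obs

farms = [simulate_farm() for _ in range(20000)]

# True average effect: computable only because the simulation reveals both outcomes.
true_ate = sum(q1 - q0 for _, q0, q1, _, _ in farms) / len(farms)

# Naive comparison of observed means between participants and non-participants.
treated = [q for _, _, _, d, q in farms if d == 1]
untreated = [q for _, _, _, d, q in farms if d == 0]
naive = sum(treated) / len(treated) - sum(untreated) / len(untreated)

print(f"true average effect: {true_ate:.2f}, naive difference: {naive:.2f}")
```

In this setup, farms with high \(x\) participate more often and also have higher baseline outcomes, so the naive comparison understates the reduction achieved by the program.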

Identifying assumptions

  1. Conditional independence assumption: conditional on \(X_i\), participation \(D_i\) is independent of the potential outcomes, i.e. there are no unobserved confounders:

    \[\{Q_i^1, Q_i^0\} \perp \!\!\! \perp D_i \mid X_i\]


  2. Common support (overlap): program participation is not perfectly predictable, i.e. farms with the same \(X\) have a positive probability of being participants and a positive probability of being non-participants:

    \[0 < P(D_i = 1 \mid X_i = x) < 1\]


  3. Stable unit treatment value assumption (SUTVA):

         - No interference between units.

         - Homogeneous treatment (no hidden versions of the treatment).
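
As a short worked step, the three assumptions can be combined to express the CATE in terms of observable quantities only. This is the standard selection-on-observables argument, written in the notation of the potential outcomes model:

\[
\begin{aligned}
\tau(x) &= \mathbb{E}\big[ Q_i^1 \mid X_i = x \big] - \mathbb{E}\big[ Q_i^0 \mid X_i = x \big] \\
        &= \mathbb{E}\big[ Q_i^1 \mid D_i = 1, X_i = x \big] - \mathbb{E}\big[ Q_i^0 \mid D_i = 0, X_i = x \big] \\
        &= \mathbb{E}\big[ Q_i \mid D_i = 1, X_i = x \big] - \mathbb{E}\big[ Q_i \mid D_i = 0, X_i = x \big]
\end{aligned}
\]

The second line uses conditional independence, overlap guarantees that both conditional expectations are well defined, and the third line uses SUTVA, under which the observed outcome of a participant equals \(Q_i^1\) and that of a non-participant equals \(Q_i^0\).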

Estimation strategy:
Generalized random forest [7]

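\[\hat{\tau}(x) = \arg\min_{\tau} \sum_{i=1}^{n} \color{red}{\alpha_i(x)} \times (Y_i - (\text{const.}+\tau W_i ))^2\]

Here \(Y_i\) denotes the observed outcome (\(Q_i\) above), \(W_i\) the participation indicator (\(D_i\) above), and \(\alpha_i(x)\) the forest weights.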

Generalized random forest

  • Uses the basic ideas of regression trees and random forests:

    • Recursive partitioning, subsampling and random split selection of input variables
    • Subsample is recursively divided into binary nodes such that the prediction error is minimized
  • Adjusted splitting rule:

    • Does not minimize the prediction error
    • Instead maximizes homogeneity within each subgroup and the difference in treatment effects across subgroups
  • Allows statistical uncertainty to be assessed

  • Allows for (higher-order) interactions and non-linearities

  • De facto no limit on the number of predictors

GRF: Finding causal weights

\[\hat{\tau}(x) = \arg\min_{\tau} \sum_{i=1}^{n} \boxed{ \color{red}{\alpha_i(x)}} \times \color{navy}{(Y_i - (\text{const.}+\tau W_i ))^2}\]
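
For given weights, the weighted least-squares problem above has a closed-form solution: the weighted covariance of \(W\) and \(Y\) divided by the weighted variance of \(W\). The sketch below illustrates only this second step. As a loudly labeled assumption, a Gaussian kernel in a single feature stands in for the forest weights \(\alpha_i(x)\); a generalized random forest learns these weights from the data instead. The function names and the simulated data are hypothetical.

```python
import math
import random

def weighted_tau(alpha, y, w):
    """Solve min over (const, tau) of sum_i alpha_i * (y_i - const - tau * w_i)^2."""
    total = sum(alpha)
    w_bar = sum(a * wi for a, wi in zip(alpha, w)) / total
    y_bar = sum(a * yi for a, yi in zip(alpha, y)) / total
    cov = sum(a * (wi - w_bar) * (yi - y_bar) for a, wi, yi in zip(alpha, w, y))
    var = sum(a * (wi - w_bar) ** 2 for a, wi in zip(alpha, w))
    return cov / var

def kernel_weights(xs, x0, bandwidth=0.1):
    """Assumed stand-in for forest weights alpha_i(x): Gaussian kernel around x0."""
    return [math.exp(-0.5 * ((xi - x0) / bandwidth) ** 2) for xi in xs]

random.seed(1)
n = 5000
xs = [random.random() for _ in range(n)]
ws = [random.randint(0, 1) for _ in range(n)]   # randomized treatment, for simplicity
# Simulated outcome with a true effect of tau(x) = 2 - 6 * x.
ys = [10 + 5 * x + (2 - 6 * x) * w + random.gauss(0, 1) for x, w in zip(xs, ws)]

tau_low = weighted_tau(kernel_weights(xs, 0.1), ys, ws)    # true tau(0.1) = 1.4
tau_high = weighted_tau(kernel_weights(xs, 0.9), ys, ws)   # true tau(0.9) = -3.4
print(f"tau_hat(0.1) = {tau_low:.2f}, tau_hat(0.9) = {tau_high:.2f}")
```

Observations close to the target point \(x\) receive large weights and dominate the local estimate, which is how the same regression yields different effects for different farms.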

Case study description: Bavaria

Data


  • Farm bookkeeping data (FADN)

  • Sample of 2758 farms for 2014

Environmental indicators (Y)

| Domain | Indicator | Treated | Untreated | Entire sample |
|---|---|---|---|---|
| Soil/water | Fertilizer intensity (Euro/ha) | 187 | 205 | 194 |
| Soil/water | Pesticide intensity (Euro/ha) | 120 | 121 | 120 |
| Biodiversity | Gini-Simpson index (0-100) | 67 | 64 | 66 |
| Climate | GHG emissions (t CO2eq) | 469 | 411 | 445 |
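
For orientation, the Gini-Simpson index is one minus the sum of squared shares. The sketch below assumes that the shares are crop area shares and that the result is scaled to 0-100, as the table's unit suggests; the exact construction used in the study may differ, and the function name and example areas are hypothetical.

```python
def gini_simpson(areas):
    """Gini-Simpson index on a 0-100 scale: 100 * (1 - sum of squared area shares)."""
    total = sum(areas)
    if total <= 0:
        raise ValueError("total area must be positive")
    shares = [a / total for a in areas]
    return 100 * (1 - sum(s ** 2 for s in shares))

# A single crop gives the minimum; equal shares of more crops give higher values.
monoculture = gini_simpson([50])
three_equal = gini_simpson([10, 10, 10])
print(monoculture, round(three_equal, 2))
```

A higher value indicates more diverse land use, so a positive treatment effect on this indicator is the desirable direction.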

Treatment (W)

  • Binary participation indicator (Treated: 1677 Untreated: 1081)

Contextual variables (X, ca. 130)

| Domain | Description |
|---|---|
| Resource bundle and input intensities | Land use, labor, materials & capital, cultivation plan, livestock count and composition |
| Output bundle | Crop and livestock outputs |
| Farm and farmer characteristics | Farm type, whole farm value added, farmer's age, yields etc. |
| Biophysical environment | Soil quality, altitude etc. |
| Institutional and market environment | Administrative district, local unemployment rate, local land rental price etc. |

Results: Heterogeneous impact predictions


Results: Heterogeneous impacts and statistical significance

Figure panels, by indicator (GHG emissions, Fertilizer intensity, Pesticide intensity, Land use diversity): share of AES participation effects that are a significant increase, a significant decrease, or insignificant, and distributions of the significant treatment effects. All effects are evaluated at the 95% confidence level.

Mean significant effects:

| Indicator | Mean(+) | Mean(-) |
|---|---|---|
| GHG emissions | 12 t | -10.8 t |
| Fertilizer intensity | / | -14.3 EUR/ha |
| Pesticide intensity | 6.6 EUR/ha | -10.3 EUR/ha |
| Land use diversity | 1.6 | -0.9 |

Results: Additionality, adversity and windfall effects




  • We found additionality in approx. 70% of the observations

  • We found adverse effects in approx. 6% of the observations

  • We found windfall effects in approx. 30% of the observations
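
One way such a classification could be operationalized is sketched below. This rests on assumptions that are not confirmed by the slides: an effect counts as a windfall when its 95% confidence interval contains zero, as additionality when it is significant in the environmentally desirable direction, and as adverse when it is significant in the opposite direction. The function name and the example numbers are hypothetical.

```python
def classify_effect(estimate, std_error, lower_is_better=True, z=1.96):
    """Label one farm-level effect estimate using a 95% confidence interval."""
    lower = estimate - z * std_error
    upper = estimate + z * std_error
    if lower <= 0 <= upper:
        return "windfall"          # effect not distinguishable from zero
    improved = (estimate < 0) if lower_is_better else (estimate > 0)
    return "additionality" if improved else "adverse"

# Hypothetical estimates for an indicator where lower values are better.
labels = [
    classify_effect(-15.0, 4.0),   # significant decrease
    classify_effect(12.0, 5.0),    # significant increase
    classify_effect(-3.0, 4.0),    # interval contains zero
]
print(labels)
```

For indicators where higher values are desirable, such as land use diversity, the same function would be called with `lower_is_better=False`.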

How can we make use of the heterogeneity? Making sense of the black box



GRF in practice

Interpreting ML models

  • Trade-off between complexity and interpretability

  • We need a mechanism to trace back where the heterogeneity comes from

  • Explainable machine learning

  • Model-agnostic methods

Shapley values


  • Concept stems from cooperative game theory

  • Shapley values explain the contributions of different variables to an individual prediction.

  • The sum of Shapley values yields the difference of actual and average prediction.

  • A Shapley value is NOT the change in the prediction that would result from removing the feature from the model.



  • A simplistic low-dimensional regression analogy in our context: \[\text{GHG} = \alpha + \beta_{1} \text{Treatment} + \color{red}{\beta_{12}}(\text{Treatment} \times \color{red}{\text{farm size}}) + \dots + \text{error}\]
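
The additivity property of Shapley values can be checked with a small exact computation. The sketch below is a generic brute-force implementation that enumerates all feature coalitions and replaces absent features with values from a background sample. It is not the procedure used in the study, and the effect model, feature names, and data are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, background):
    """Exact Shapley values for one prediction; absent features come from `background`."""
    p = len(x)

    def value(subset):
        # Average prediction when features in `subset` are fixed to x and the rest vary.
        total = 0.0
        for z in background:
            mixed = [x[j] if j in subset else z[j] for j in range(p)]
            total += predict(mixed)
        return total / len(background)

    phi = []
    for j in range(p):
        others = [k for k in range(p) if k != j]
        contribution = 0.0
        for size in range(p):
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for subset in combinations(others, size):
                s = set(subset)
                contribution += weight * (value(s | {j}) - value(s))
        phi.append(contribution)
    return phi

# Hypothetical effect model: the effect depends on farm size and altitude, not on age.
def effect_model(features):
    farm_size, altitude, age = features
    return -0.1 * farm_size + 0.02 * altitude - 0.001 * farm_size * altitude

background = [[20, 300, 45], [80, 500, 52], [150, 400, 38], [40, 700, 60]]
farm = [120, 600, 50]

phi = shapley_values(effect_model, farm, background)
average_prediction = sum(effect_model(z) for z in background) / len(background)
print(phi, effect_model(farm) - average_prediction)
```

The Shapley values sum to the difference between this farm's prediction and the average prediction, and the unused feature (age) receives a value of zero. Exact enumeration scales exponentially in the number of features, so with around 130 contextual variables an approximation would be needed in practice.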

Results: Shapley values

Exemplary contextual feature: location

  • How do contextual variables drive AES impact?

  • E.g., location drives AES impact

  • Targeting regions with high Shapley values could improve effectiveness

Endogeneity concerns


Endogeneity concerns were addressed in multiple ways:

  • Through estimation method (selection-on-observables)

    • Propensity score adjustment
    • Doubly robust estimator
    • Many control variables
  • Through contextual variable adjustment (bad controls problem)

    • Cross-sectional analysis
    • 2007-2013 average values for contextual variables: pre-treatment
  • Through algorithmic complexity (unobserved confounding)

    • Unobserved heterogeneity usually not contained in observables
    • Complex interactions and nonlinear combination of observables might contain latent variables

We ran multiple sensitivity analyses, robustness checks, and simulations of potential unobserved confounders.

Major limitations of the study


  • Indicator choice


  • Effect heterogeneity vs. treatment heterogeneity


  • Cross-sectional nature of the study & the possibility of unobserved confounding

Conclusions


  • Generalized random forest appears to be a good addition to the agricultural subsidies impact assessment toolbox


  • We find heterogeneous impacts of AES participation


  • AES are not very effective in terms of the chosen environmental indicators


  • Ignoring impact heterogeneity yields an incomplete picture of the AES effectiveness


Our analysis should be seen as a first step in the direction of a more complete assessment of agri-environmental schemes.










THANK YOU








cstetter@ethz.ch | Twitter | Google Scholar

References

[1]
R. Uehleke, M. Petrick, and S. Hüttel, “Evaluations of agri-environmental schemes based on observational farm data: The importance of covariate selection,” Land Use Policy, vol. 114, p. 105950, 2022, doi: 10.1016/j.landusepol.2021.105950.
[2]
A. Pufahl and C. R. Weiss, “Evaluating the effects of farm programmes: Results from propensity score matching,” European Review of Agricultural Economics, vol. 36, no. 1, pp. 79–101, 2009, doi: 10.1093/erae/jbp001.
[3]
L. Arata and P. Sckokai, “The impact of agri-environmental schemes on farm performance in five E.U. member States: A DID-matching approach,” Land Economics, vol. 92, no. 1, pp. 167–186, 2016, doi: 10.3368/le.92.1.167.
[4]
D. Bertoni, D. Curzi, G. Aletti, and A. Olper, “Estimating the effects of agri-environmental measures using difference-in-difference coarsened exact matching,” Food Policy, vol. 90, p. 101790, Jan. 2020, doi: 10.1016/j.foodpol.2019.101790.
[5]
S. Chabé-Ferret and J. Subervie, “How much green for the buck? Estimating additional and windfall effects of French agro-environmental schemes by DID-matching,” Journal of Environmental Economics and Management, vol. 65, no. 1, pp. 12–27, Jan. 2013, doi: 10.1016/j.jeem.2012.09.003.
[6]
D. Wuepper and R. Huber, “Comparing effectiveness and return on investment of action‐ and results‐based agri‐environmental payments in Switzerland,” American Journal of Agricultural Economics, vol. 104, no. 5, pp. 1585–1604, Oct. 2022, doi: 10.1111/ajae.12284.
[7]
S. Athey, J. Tibshirani, and S. Wager, “Generalized random forests,” Annals of Statistics, vol. 47, no. 2, pp. 1179–1203, 2019, doi: 10.1214/18-AOS1709.

Appendix

DAG: Selection-on-observables

Directed acyclic graph (DAG) without unobserved confounders, i.e. the unconfoundedness assumption is fulfilled. The effect of the treatment variable \(D\) (participation in AES) on an outcome \(Y\) (environmental indicator) is identified if our model controls for all observed confounders \(X_1\) through \(X_K\) (contextual variables), so that all backdoor paths are closed. Connections among confounders are not shown for brevity.

DAG: Latent confounders

Directed acyclic graph (DAG) in an exemplary situation where two unobserved confounders (\(U_1\), \(U_2\)) are present. The effect of the treatment variable \(D\) (participation in AES) on an outcome \(Y\) (environmental indicator) is not correctly identified if our model does not control for all observed confounders \(X_1\) through \(X_K\) and the unobserved confounders (\(U_1\), \(U_2\)). Since \(U_1\) and \(U_2\) are not observable, there is no way to control for them directly. Yet, under the assumption that observed and unobserved confounders are associated (arrows from \(U\) to \(X\)) and that the unobserved confounders are reflected in the complex, nonlinear, and high-dimensional combination of the large number of observed confounders (latent confounder space), it might be possible to capture (most of) the variation coming from the unobserved confounders, provided the causal forest maps that latent confounder space accurately. If this is the case, the backdoor path \(D \leftarrow U_{1,2} \rightarrow Y\) can be closed and the treatment effect of AES participation on environmental performance is identified. Connections among confounders are not shown for brevity.

Robustness: Random placebo treatment

The original treatment variable (D) is replaced by a random placebo treatment variable (\(D_{random}\)). If the model is correctly specified, we expect no effect of the treatment on the environmental outcome.
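
The logic of this check can be sketched in a few lines. As a labeled simplification, a plain difference in means stands in for the causal forest, and the simulated data and the true effect of -15 are hypothetical. Shuffling the treatment labels produces a placebo that is unrelated to the outcome by construction.

```python
import random

def mean_difference(outcomes, treatment):
    """Difference in mean outcomes between treated (1) and untreated (0) units."""
    treated = [y for y, d in zip(outcomes, treatment) if d == 1]
    control = [y for y, d in zip(outcomes, treatment) if d == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

random.seed(2)
n = 4000
d = [random.randint(0, 1) for _ in range(n)]
y = [200 - 15 * di + random.gauss(0, 30) for di in d]   # true effect of -15 by construction

estimated = mean_difference(y, d)

# Placebo: shuffle the treatment labels so that they carry no information about y.
d_random = d[:]
random.shuffle(d_random)
placebo = mean_difference(y, d_random)

print(f"original treatment: {estimated:.2f}, placebo treatment: {placebo:.2f}")
```

An estimate close to zero under the placebo treatment is what the check expects; a clearly non-zero placebo estimate would point to a specification problem.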

Robustness: Random outcome

The original outcome variable (Y) is replaced by a random placebo outcome variable (\(Y_{random}\)). If the model is correctly specified, we expect no effect of the treatment on the placebo outcome.

Robustness: Extra random covariate

We add an extra random variable (\(U_{random}\)) as a potential confounder to our model. As this variable is random, we expect the estimated treatment effects to remain unchanged relative to the baseline model.

Robustness: Leave out most important confounder(s)

We leave out the (three) most important confounder(s) and re-estimate the model. If the (nonlinear) correlation structure of the other observed confounders properly reflects the left-out variable(s), we expect no change relative to our baseline model, which includes all observed confounders. This would point towards the stability of our model against unobserved confounders that are not included in the model, provided they are associated with the set of observed confounders.

Robustness: Leave out confounder(s) based on PCA

We leave out various observed confounders more systematically. We use principal component analysis and the resulting loadings to detect systematic groups of confounders and leave these out, so that the remaining observed confounders are less able to compensate for the exclusion of these variables. The resulting model behavior indicates how strongly the model reacts to confounders that are left out completely and are not buffered by the combination of observed covariates.

Robustness: Results (i)

Robustness: Results (ii)

Sensitivity: Simulate unobserved confounder (i)

In this robustness check, we simulate a completely left-out unobserved covariate (\(U_{target}\)) under various correlation structures between \(U_{target}\), D and Y. This indicates how strong the omitted variable bias in our model could be for different correlation structures between \(U_{target}\), D and Y.
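
The idea behind such a simulation can be sketched as follows. This is a simplified stand-in, not the study's procedure: a difference in means replaces the causal forest, the confounder enters linearly, and the function name, the strength grid, and the true effect of -10 are hypothetical. The bias only appears when the unobserved variable affects both participation and the outcome.

```python
import random

def naive_bias(strength_d, strength_y, n=20000, true_effect=-10.0, seed=3):
    """Bias of a difference in means when an unobserved U drives both D and Y."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        u = rng.gauss(0, 1)                               # unobserved confounder
        p = min(max(0.5 + strength_d * u, 0.05), 0.95)    # participation probability
        d = 1 if rng.random() < p else 0
        y = true_effect * d + strength_y * u + rng.gauss(0, 1)
        (treated if d else control).append(y)
    estimate = sum(treated) / len(treated) - sum(control) / len(control)
    return estimate - true_effect

# Grid over the strength of U's link to participation (D) and to the outcome (Y).
grid = [(sd, sy) for sd in (0.0, 0.1, 0.2) for sy in (0.0, 5.0)]
biases = {pair: naive_bias(*pair) for pair in grid}

for (sd, sy), bias in biases.items():
    print(f"U->D strength {sd}, U->Y strength {sy}: bias {bias:.2f}")
```

Reading the grid row by row shows how the bias grows with the strength of both links, which is the kind of pattern the sensitivity plots summarize for each indicator.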

Sensitivity: Simulate unobserved confounder (ii)

Greenhouse gas emissions

Fertilizer intensity

Sensitivity: Simulate unobserved confounder (iii)

Pesticide intensity

Land use diversity