Instrument-free methods
Department of Biostatistics, University of Michigan
5/24/23
Challenge: a central obstacle to drawing valid causal inferences from these data is the presence of endogenous regressors, i.e., regressors that are correlated with the structural error in the population regression model representing the causal relationship of interest.
Instrumental variables (IV): classical method to deal with endogeneity.
Park and Gupta (2012) present an instrument-free method using a Gaussian copula approach.
We consider the following linear structural regression model: \[Y_i = \mu + \alpha P_i + \beta^T W_i + \epsilon_i\] where \(P_i\) is an endogenous regressor, \(W_i\) is a vector of exogenous regressors, and \(\epsilon_i\) is the structural error term.
Key idea: use a copula to model the joint distribution of \(P_i\) and \(\epsilon_i\), and hence their correlation. The copula places no restriction on the marginals: the marginal distributions of the endogenous regressor and of the error term are first obtained separately, using the information contained in the observed data, and the copula then ties them together.
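With a normally distributed error, \(\epsilon_i \sim N(0, \sigma_\epsilon^2)\), the structural error can be expressed as:
\[\epsilon_i = \sigma_\epsilon \epsilon_i^\star = \sigma_\epsilon (\rho P_i^\star + \sqrt{1-\rho^2}w_i)\] where \(P_i^\star = \Phi^{-1}\{F_P(P_i)\}\) and \(\epsilon_i^\star = \Phi^{-1}\{F_\epsilon(\epsilon_i)\}\) are the normal scores of \(P_i\) and \(\epsilon_i\), \(\rho\) is their correlation under the Gaussian copula, and \(w_i \sim N(0, 1)\) is independent of \(P_i^\star\). The structural regression model can then be re-written as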
\[\begin{equation} \begin{aligned} Y_i &= \mu + \alpha P_i + \beta^T W_i + \sigma_\epsilon (\rho P_i^\star + \sqrt{1-\rho^2}w_i)\\ Y_i &= \mu + \alpha P_i + \sigma_\epsilon\rho P_i^\star + \beta^T W_i + \sigma_\epsilon \sqrt{1 - \rho^2} w_i. \end{aligned} \end{equation}\]
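The re-written model suggests a simple estimation recipe: add the normal score \(P_i^\star\) as an extra regressor, so that the remaining error is uncorrelated with \(P_i\). The following is a minimal sketch on simulated data, not a reference implementation. It assumes an Exponential(1) marginal for \(P_i\), a single exogenous \(W_i\) that is independent of \(P_i\), and illustrative parameter values; the helpers `normal_scores` and `ols` are hypothetical names written for this example, and \(F_P\) is estimated by the empirical CDF rescaled by \(n + 1\).

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # Phi via .cdf, Phi^{-1} via .inv_cdf


def normal_scores(x):
    """Return Phi^{-1}(F_hat(x_i)), with F_hat the empirical CDF rescaled by n + 1."""
    n = len(x)
    scores = [0.0] * n
    for rank, i in enumerate(sorted(range(n), key=lambda j: x[j]), start=1):
        scores[i] = STD_NORMAL.inv_cdf(rank / (n + 1))  # argument stays inside (0, 1)
    return scores


def ols(columns, y):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(row[r] * row[c] for row in X) for c in range(k)]
         + [sum(row[r] * yi for row, yi in zip(X, y))] for r in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[r][k] / A[r][r] for r in range(k)]


# Illustrative (assumed) parameter values.
MU, ALPHA, BETA, SIGMA, RHO = 1.0, 1.0, -1.0, 1.0, 0.5

rng = random.Random(0)
P, W, Y = [], [], []
for _ in range(5000):
    p_star = rng.gauss(0.0, 1.0)
    w_i = rng.gauss(0.0, 1.0)
    eps = SIGMA * (RHO * p_star + math.sqrt(1.0 - RHO**2) * w_i)
    p = -math.log(STD_NORMAL.cdf(-p_star))  # Exponential(1) marginal for P
    w = rng.gauss(0.0, 1.0)                 # exogenous regressor, independent of P
    P.append(p)
    W.append(w)
    Y.append(MU + ALPHA * p + BETA * w + eps)

# Naive OLS ignores the correlation between P and the structural error.
alpha_ols = ols([P, W], Y)[1]

# Augmented regression: Y on (1, P, W, P_star_hat).
coef = ols([P, W, normal_scores(P)], Y)
alpha_copula, sigma_rho_hat = coef[1], coef[3]

print("true alpha:", ALPHA)
print("naive OLS alpha:", round(alpha_ols, 3))
print("copula-corrected alpha:", round(alpha_copula, 3))
print("estimate of sigma_eps * rho:", round(sigma_rho_hat, 3))
```

The coefficient on the added regressor estimates \(\sigma_\epsilon\rho\), so it can also serve as an informal check for endogeneity. The sketch relies on \(P_i\) being non-normal: if \(P_i\) were normal, \(P_i^\star\) would be a linear function of \(P_i\) and the two regressors could not be separated.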
When an exogenous regressor is correlated with \(P_i\), jointly model the endogenous regressor \(P_i\), the correlated exogenous variable \(W_i\), and the structural error term \(\epsilon_i\) using the Gaussian copula model: \[\begin{equation} \left(\begin{array}{c} P_i^* \\ W_i^* \\ \epsilon_i^* \end{array}\right) = \left(\begin{array}{c} \Phi^{-1}\{F_P(P_i) \} \\ \Phi^{-1}\{F_W(W_i) \} \\ \Phi^{-1}\{F_\epsilon(\epsilon_i) \} \end{array}\right) \sim N\left(\left[\begin{array}{l} 0 \\ 0 \\ 0 \end{array}\right],\left[\begin{array}{ccc} 1 & \rho_{p w} & \rho_{p \epsilon} \\ \rho_{p w} & 1 & 0 \\ \rho_{p \epsilon} & 0 & 1 \end{array}\right]\right) \end{equation}\] The zero correlation between \(W_i^*\) and \(\epsilon_i^*\) encodes the exogeneity of \(W_i\).
The model above can be re-written as
\[\begin{equation} \left(\begin{array}{c} P_i^* \\ W_i^* \\ \epsilon_i^* \end{array}\right)=\left(\begin{array}{ccc} 1 & 0 & 0 \\ \rho_{p w} & \sqrt{1-\rho_{p w}^2} & 0 \\ \rho_{p \epsilon} & \frac{-\rho_{p w} \rho_{p \epsilon} }{\sqrt{1-\rho_{p w}^2}} & \sqrt{1-\rho_{p \epsilon}^2-\frac{\rho_{p w}^2 \rho_{p \epsilon}^2}{1-\rho_{p w}^2}} \end{array}\right) \cdot\left(\begin{array}{c} w_{1, i} \\ w_{2, i} \\ w_{3, i} \end{array}\right) \end{equation}\] where \(w_{1, i}, w_{2, i}, w_{3, i} \overset{i.i.d.}{\sim} N(0, 1)\).
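The lower-triangular matrix above is the Cholesky factor of the copula correlation matrix, which can be checked numerically. The sketch below multiplies the factor by its transpose for one assumed pair of correlations (\(\rho_{pw} = 0.5\), \(\rho_{p\epsilon} = 0.4\), chosen only for illustration); the function names are hypothetical.

```python
import math


def copula_factor(rho_pw, rho_pe):
    """Lower-triangular factor L of the 3x3 copula correlation matrix."""
    s = math.sqrt(1.0 - rho_pw**2)
    l33 = math.sqrt(1.0 - rho_pe**2 - (rho_pw**2 * rho_pe**2) / (1.0 - rho_pw**2))
    return [
        [1.0, 0.0, 0.0],
        [rho_pw, s, 0.0],
        [rho_pe, -rho_pw * rho_pe / s, l33],
    ]


def times_transpose(L):
    """Return the matrix product L L^T."""
    n = len(L)
    return [[sum(L[i][k] * L[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


L = copula_factor(0.5, 0.4)
Sigma = times_transpose(L)
for row in Sigma:
    print([round(v, 12) for v in row])
```

The last diagonal entry simplifies to \(\sqrt{1 - \rho_{p\epsilon}^2 / (1 - \rho_{pw}^2)}\), so the factor is real only when \(\rho_{pw}^2 + \rho_{p\epsilon}^2 < 1\), which is the condition for the correlation matrix to be positive definite.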
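Regressing \(P_i^\star\) on \(W_i^\star\) gives a first-stage equation whose residual, written \(\eta_i\) here to keep it distinct from the structural error \(\epsilon_i\), carries the part of \(P_i^\star\) not explained by \(W_i^\star\). Substituting it into the structural model gives the second-stage equation:
\[\begin{equation} \begin{aligned} P_i^\star &= \rho_{pw}W_i^\star + \eta_i, \qquad \eta_i = (1 - \rho_{pw}^2)\, w_{1, i} - \rho_{pw}\sqrt{1 - \rho_{pw}^2}\, w_{2, i} \sim N(0, 1 - \rho_{pw}^2) \quad(1) \\ Y_i &= \mu+ \alpha P_i + \beta^T W_i + \frac{\sigma_{\epsilon} \rho_{p \epsilon}}{1-\rho_{p w}^2} \eta_i+\sigma_{\epsilon} \sqrt{1-\rho_{p \epsilon}^2-\frac{\rho_{p w}^2 \rho_{p \epsilon}^2}{1-\rho_{p w}^2}} \cdot w_{3, i} \quad(2) \end{aligned} \end{equation}\]
Here \(\eta_i\) is independent of \(W_i^\star\), and \(w_{3, i}\) is independent of both \(P_i\) and \(W_i\), so the error term in (2) is uncorrelated with all regressors.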
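Equations (1) and (2) can be read as a two-stage procedure: regress the normal scores of \(P_i\) on those of \(W_i\), keep the first-stage residual, and add it as a regressor in the outcome equation. The following is a rough sketch on simulated data under stated assumptions, not a definitive implementation: a scalar \(W_i\) with a Uniform(0, 1) marginal, an Exponential(1) marginal for \(P_i\), illustrative parameter values, and empirical CDFs rescaled by \(n + 1\). The helpers `normal_scores`, `ols`, and `simulate` are hypothetical names written for this example.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # Phi via .cdf, Phi^{-1} via .inv_cdf


def normal_scores(x):
    """Return Phi^{-1}(F_hat(x_i)), with F_hat the empirical CDF rescaled by n + 1."""
    n = len(x)
    scores = [0.0] * n
    for rank, i in enumerate(sorted(range(n), key=lambda j: x[j]), start=1):
        scores[i] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def ols(columns, y):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    X = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    A = [[sum(row[r] * row[c] for row in X) for c in range(k)]
         + [sum(row[r] * yi for row, yi in zip(X, y))] for r in range(k)]
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[r][k] / A[r][r] for r in range(k)]


def simulate(n, mu, alpha, beta, sigma, rho_pw, rho_pe, seed):
    """Draw (P, W, Y) from the Gaussian copula model via its triangular factor."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho_pw**2)
    l32 = -rho_pw * rho_pe / s
    l33 = math.sqrt(1.0 - rho_pe**2 - l32**2)
    P, W, Y = [], [], []
    for _ in range(n):
        w1, w2, w3 = (rng.gauss(0.0, 1.0) for _ in range(3))
        p_star = w1
        w_star = rho_pw * w1 + s * w2
        eps = sigma * (rho_pe * w1 + l32 * w2 + l33 * w3)
        p = -math.log(STD_NORMAL.cdf(-p_star))  # Exponential(1) marginal for P
        w = STD_NORMAL.cdf(w_star)              # Uniform(0, 1) marginal for W
        P.append(p)
        W.append(w)
        Y.append(mu + alpha * p + beta * w + eps)
    return P, W, Y


# Illustrative (assumed) parameter values.
ALPHA, RHO_PW = 1.0, 0.5
P, W, Y = simulate(n=5000, mu=1.0, alpha=ALPHA, beta=-1.0, sigma=1.0,
                   rho_pw=RHO_PW, rho_pe=0.5, seed=1)

# Stage 1: regress the normal scores of P on those of W and keep the residual.
P_star, W_star = normal_scores(P), normal_scores(W)
a0, rho_pw_hat = ols([W_star], P_star)
resid = [p - a0 - rho_pw_hat * w for p, w in zip(P_star, W_star)]

# Stage 2: add the first-stage residual to the outcome regression.
alpha_two_stage = ols([P, W, resid], Y)[1]
alpha_ols = ols([P, W], Y)[1]

print("true alpha:", ALPHA)
print("naive OLS alpha:", round(alpha_ols, 3))
print("two-stage alpha:", round(alpha_two_stage, 3))
print("first-stage slope (estimates rho_pw):", round(rho_pw_hat, 3))
```

Standard errors from the second-stage regression would not account for the estimation of the first-stage residual; a resampling scheme such as the bootstrap is one common way to handle generated regressors.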