This analysis examines the effect of a job training program on participants’ earnings, using Gamma regression and interpreting the results with the `clarify` package for simulation-based inference. The analysis follows best practices outlined in course materials and research articles.
The dataset used for this analysis is the `lalonde` dataset from the `MatchIt` package in R. This dataset is commonly used in econometric studies to evaluate the impact of job training programs.
| Variable | Description |
|---|---|
| `treat` | Binary (1 = received training, 0 = did not) |
| `age` | Age of participant |
| `educ` | Years of education |
| `race` | Categorical (Black, Hispanic, White) |
| `married` | Binary (1 = married, 0 = unmarried) |
| `nodegree` | Binary (1 = no high school diploma, 0 = has diploma) |
| `re74`, `re75` | Real earnings in 1974 and 1975 (pre-treatment) |
| `re78` | Dependent variable: real earnings in 1978 (post-treatment) |
Does participation in the National Supported Work (NSW) program significantly impact participants’ earnings in 1978, after accounting for demographic and pre-treatment income variables?
This research question is important for both theoretical and policy-driven reasons:

1. Economic Theory: Understanding labor market interventions helps in designing effective employment policies.
2. Policy Relevance: Governments and NGOs invest heavily in job training programs. Evaluating their impact ensures efficient resource allocation.
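Given that `re78` (earnings in 1978) is continuous, non-negative, and right-skewed, Gamma regression is an appropriate model choice; it is widely used for right-skewed outcomes such as income. The Gamma family requires a strictly positive response, however, and `re78` includes zero earnings, so those observations have to be excluded or handled separately before fitting.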
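Before fitting, it can help to confirm that the outcome meets the Gamma family's positivity requirement. The sketch below uses a small made-up data frame (`toy`, with a hypothetical `earnings` column, not taken from `lalonde`) to show how zeros might be counted and filtered out; the same check could be applied to `lalonde$re78`.

```r
# Hypothetical toy data: earnings with some zeros (not the lalonde data)
toy <- data.frame(
  treat    = c(1, 0, 1, 0, 1, 0, 1, 0),
  earnings = c(0, 1200, 5400, 0, 8300, 2500, 15000, 700)
)

# Count outcomes that violate the Gamma family's positivity requirement
n_nonpositive <- sum(toy$earnings <= 0)

# Keep only strictly positive outcomes for the Gamma fit
toy_pos <- subset(toy, earnings > 0)

# The Gamma GLM fits on the filtered data
fit_toy <- glm(earnings ~ treat, data = toy_pos, family = Gamma(link = "log"))

# On the unfiltered data, glm() stops with an error because of the zeros
full_fit_fails <- inherits(
  try(glm(earnings ~ treat, data = toy, family = Gamma(link = "log")),
      silent = TRUE),
  "try-error"
)
```

Dropping zeros changes the population the estimate refers to (participants with positive earnings), so this is one possible choice rather than the only one.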
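We estimate the following model with a log link:
\[ \log E(\text{re78} \mid X) = \beta_0 + \beta_1 \text{treat} + \beta_2 \text{age} + \beta_3 \text{educ} + \beta_4 \text{race} + \beta_5 \text{married} + \beta_6 \text{nodegree} + \beta_7 \text{re74} + \beta_8 \text{re75} \]
where \(\beta_1\) measures the effect of job training on the log of expected earnings. Because `race` is categorical, it enters the model as a set of indicator variables, so \(\beta_4\) stands for more than one coefficient.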
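Because the model is fitted with a log link (`Gamma(link = "log")`), the coefficient on `treat` acts multiplicatively on expected earnings. As a short worked step, holding the other covariates \(X\) fixed and writing \(\eta(X)\) as shorthand (introduced here) for the remaining terms of the linear predictor:

\[ \frac{E(\text{re78} \mid \text{treat} = 1, X)}{E(\text{re78} \mid \text{treat} = 0, X)} = \frac{\exp(\beta_0 + \beta_1 + \eta(X))}{\exp(\beta_0 + \eta(X))} = \exp(\beta_1) \]

As a purely illustrative number, \(\beta_1 = 0.10\) would correspond to \(\exp(0.10) \approx 1.105\), that is, roughly 10.5% higher expected earnings for participants.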
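```r
# Load required libraries
library(MatchIt)
library(clarify)

# Load the dataset
data("lalonde", package = "MatchIt")

# The Gamma family requires a strictly positive response, and re78 contains
# zeros, so the model is fit to participants with positive 1978 earnings
lalonde_pos <- subset(lalonde, re78 > 0)

# Fit a Gamma regression model with a log link
fit <- glm(re78 ~ treat + age + educ + race + married + nodegree + re74 + re75,
           data = lalonde_pos, family = Gamma(link = "log"))

# Simulate coefficient distributions
set.seed(123)
sim_fit <- sim(fit, n = 1000)

# Compute the average marginal effect of treatment as a difference in means
sim_ame_results <- sim_ame(sim_fit, var = "treat", contrast = "diff")

# Display results
summary(sim_ame_results)
```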
The `clarify` package aids model interpretation by simulating coefficient distributions, which carries the uncertainty in the estimated coefficients through to derived quantities such as average marginal effects.
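The idea behind simulation-based inference can be sketched in base R. The example below is only an illustration on simulated data (`dat`, `y`, and all parameter values are made up), and `clarify`'s own implementation may differ in its details: coefficient vectors are drawn from a multivariate normal centred on the estimates, the average marginal effect is recomputed for each draw, and the draws are then summarised.

```r
set.seed(1)

# Hypothetical simulated data standing in for a real dataset
n   <- 500
dat <- data.frame(treat = rbinom(n, 1, 0.5), educ = rnorm(n, 10, 2))
mu  <- exp(8 + 0.3 * dat$treat + 0.05 * dat$educ)
dat$y <- rgamma(n, shape = 2, rate = 2 / mu)  # mean mu, strictly positive

fit_sim <- glm(y ~ treat + educ, data = dat, family = Gamma(link = "log"))

# Step 1: draw coefficient vectors from N(beta_hat, V_hat) via a Cholesky factor
n_draws  <- 1000
beta_hat <- coef(fit_sim)
chol_low <- t(chol(vcov(fit_sim)))
z        <- matrix(rnorm(length(beta_hat) * n_draws), ncol = n_draws)
draws    <- beta_hat + chol_low %*% z  # one column per simulated vector

# Step 2: for each draw, average the predicted outcome with treat set to 1
# for everyone, minus the average with treat set to 0 for everyone
X1 <- model.matrix(~ treat + educ, transform(dat, treat = 1))
X0 <- model.matrix(~ treat + educ, transform(dat, treat = 0))
ame_draws <- colMeans(exp(X1 %*% draws)) - colMeans(exp(X0 %*% draws))

# Step 3: summarise the simulated distribution with a 95% interval
ame_ci <- quantile(ame_draws, c(0.025, 0.975))
```

The interval in `ame_ci` reflects only the uncertainty in the coefficients, which is the kind of uncertainty this simulation approach is designed to capture.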
- The `treat` variable has a positive coefficient, suggesting that program participants had higher earnings in 1978.
- `clarify` generates distributions of estimates, providing better uncertainty quantification.

This analysis suggests a positive effect of the NSW job training program on earnings. Gamma regression accommodates the right-skewed outcome, and `clarify`’s simulation-based inference makes the uncertainty around the estimated effect transparent.