Part 1: Paper using randomized data: Impact of Class Size on Learning

Krueger (1999), "Experimental Estimates of Education Production Functions," QJE 114(2): 497-532.

1.1 Briefly answer these questions:

a. What is the causal link the paper is trying to reveal? The paper aims to uncover the causal effect of school inputs (chiefly class size, alongside teacher characteristics) on educational outcomes such as test scores. To do this, Krueger uses data from a randomized experiment to estimate the education production function, which lets him give the relationship between inputs and outputs a cause-and-effect interpretation.

b. What would be the ideal experiment to test this causal link? The ideal experiment to determine how educational inputs affect educational outputs would be a randomized controlled trial (RCT). In an RCT, participants are randomly divided into two groups: a treatment group that receives the intervention being evaluated and a control group that does not. Random assignment balances, in expectation, both the measured and the unmeasured factors that could affect the outcome, so the difference in outcomes between the groups is a credible estimate of the effect of the intervention.
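The logic of random assignment can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are simulated, and every name and number (`ability`, `small_class`, `true_effect`, the score equation) is hypothetical rather than taken from Krueger (1999).

```r
# Hypothetical simulation: all names and numbers are assumptions,
# not data or results from Krueger (1999).
set.seed(8610)
n <- 5000

# Unobserved student ability that also drives test scores
ability <- rnorm(n)

# Random assignment: a fair coin decides who is placed in a small class
small_class <- rbinom(n, size = 1, prob = 0.5)

# Assumed true effect of a small class on the test score
true_effect <- 5
score <- 50 + true_effect * small_class + 10 * ability + rnorm(n, sd = 5)

# Assignment is independent of ability, so the simple difference
# in mean scores estimates true_effect without bias
rct_estimate <- mean(score[small_class == 1]) - mean(score[small_class == 0])

# Balance check: mean ability should be close to equal across the two arms
balance_gap <- mean(ability[small_class == 1]) - mean(ability[small_class == 0])

print(c(rct_estimate = rct_estimate, balance_gap = balance_gap))
```

Because `ability` never enters the estimate, the example shows why randomization removes the need to measure such confounders.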

c. What is the identification strategy? The primary identification strategy of this paper is a randomized controlled trial (RCT). In some models, however, the author combines the experimental design with a two-stage least squares (2SLS) approach.

d. What are the assumptions / threats to this identification strategy? The identification strategy in Alan Krueger’s study “Experimental Estimates of Education Production Functions” rests on the assumption that teachers and students are assigned to classrooms at random. Under that assumption, the design establishes a credible causal link between class size and student outcomes.

However, several threats to this identification strategy can limit the validity of the results:
- Non-compliance
- Treatment spillover (spillover effects)
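To see why non-compliance matters and how an instrumental-variables approach such as 2SLS can address it, consider the following sketch. It is an illustration under assumptions, not Krueger's specification: the data are simulated, the compliance rule is invented, and all variable names are hypothetical. The randomized initial assignment serves as the instrument for the class type actually attended.

```r
# Hypothetical simulation of one-sided non-compliance; every name and
# parameter is an assumption made for illustration only.
set.seed(1999)
n <- 20000
ability <- rnorm(n)

# Randomized initial assignment (the instrument)
assigned_small <- rbinom(n, size = 1, prob = 0.5)

# Non-compliance: some students assigned to a regular class move into a
# small class, and higher-ability students are more likely to do so
switch_in <- rbinom(n, size = 1, prob = plogis(-2 + ability))
attended_small <- ifelse(assigned_small == 1, 1, switch_in)

true_effect <- 5
score <- 50 + true_effect * attended_small + 10 * ability + rnorm(n, sd = 5)

# Naive OLS on the class actually attended is contaminated by selection
ols_estimate <- unname(coef(lm(score ~ attended_small))[2])

# IV (Wald) estimate: reduced form divided by first stage
iv_estimate <- cov(score, assigned_small) / cov(attended_small, assigned_small)

print(c(ols = ols_estimate, iv = iv_estimate))
```

In this setup the OLS estimate overstates the effect because switchers are positively selected on ability, while the IV estimate relies only on the variation created by random assignment.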

Part 2: Paper using Twins for Identification: Economic Returns to Schooling

2.1 Briefly answer these questions:

a. What is the causal link the paper is trying to reveal? The causal relationship that Orley Ashenfelter and Alan Krueger aim to identify in their study, “Estimates of the Economic Return to Schooling from a New Sample of Twins,” is the effect of schooling on earnings. The authors use data from a sample of twins to estimate this relationship.

b. What would be the ideal experiment to test this causal link? The ideal experiment to test the causal link between education and earnings would be a randomized controlled trial (RCT). However, it is impractical to assign schooling at random in real-world circumstances, so the authors instead estimate the relationship using a sample of twins. By comparing the earnings of twins with different levels of education, they can control for unmeasured ability and family traits that influence both educational attainment and earnings, and so estimate the causal effect of education on earnings.

c. What is the identification strategy? The identification strategy rests on the assumption that twins are comparable in their genes and early family background, which allows the authors to control for unmeasured ability and family variables that might affect both educational attainment and earnings. By comparing the earnings of twins within a pair who have different levels of schooling, the authors can estimate the causal effect of education on earnings while holding these shared factors fixed.
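The within-pair comparison can be written as a first-difference derivation. The notation below is an assumption introduced for illustration, not copied from the paper: $y_{ji}$ is the log wage of twin $j$ in pair $i$, $S_{ji}$ is that twin's schooling, $\mu_i$ is the unobserved ability and family component shared by both twins, and $\varepsilon_{ji}$ is an idiosyncratic error.

```latex
\begin{aligned}
y_{1i} &= \alpha + \beta S_{1i} + \mu_i + \varepsilon_{1i} \\
y_{2i} &= \alpha + \beta S_{2i} + \mu_i + \varepsilon_{2i} \\
y_{1i} - y_{2i} &= \beta\,(S_{1i} - S_{2i}) + (\varepsilon_{1i} - \varepsilon_{2i})
\end{aligned}
```

Subtracting the second equation from the first removes $\mu_i$, so a regression of the wage difference on the schooling difference recovers $\beta$ as long as the remaining error difference is unrelated to the schooling difference.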

d. What are the assumptions / threats to this identification strategy? The identification strategy is based on several assumptions, notably:

- Twin similarity: the first assumption is that twins have comparable genes and a common family background. This assumption is crucial because it enables the authors to control for unmeasured ability and family traits that could influence educational attainment and earnings.
- Similar environments: the second assumption is that twins grow up in similar surroundings. This assumption is crucial because it helps control for environmental influences that could affect both earnings and educational attainment.
- No selection bias: the third assumption is that the sample of twin pairs contains no selection bias. This means that there is no systematic difference between the twin pairs included in the sample and those that are not, so the sample is representative of the population of twin pairs.

Threats to the identification strategy include measurement error in the earnings and education data and omitted variable bias from unmeasured factors that differ within twin pairs and affect both educational attainment and earnings.
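The measurement-error threat can be made concrete with a simulation. Classical measurement error in reported schooling biases the first-difference coefficient toward zero, and differencing within pairs tends to make this worse because it removes much of the true variation in schooling while keeping the noise. The sketch below uses simulated data only; all variable names and parameter values are hypothetical and are not estimates from Ashenfelter and Krueger (1994).

```r
# Hypothetical simulation: all parameters are assumptions for illustration.
set.seed(1994)
n_pairs <- 20000
beta <- 0.10                                  # assumed true return to schooling

family <- rnorm(n_pairs)                      # component shared within a pair
educ1 <- 13 + 2 * family + rnorm(n_pairs)     # true schooling, twin 1
educ2 <- 13 + 2 * family + rnorm(n_pairs)     # true schooling, twin 2

lwage1 <- 1 + beta * educ1 + 0.3 * family + rnorm(n_pairs, sd = 0.3)
lwage2 <- 1 + beta * educ2 + 0.3 * family + rnorm(n_pairs, sd = 0.3)

# Reported schooling = true schooling + classical measurement error
rep1 <- educ1 + rnorm(n_pairs, sd = 0.7)
rep2 <- educ2 + rnorm(n_pairs, sd = 0.7)

# Within-pair differences
d_lwage <- lwage1 - lwage2
d_educ_true <- educ1 - educ2
d_educ_rep <- rep1 - rep2

# First-difference slope with true versus error-ridden schooling
fd_true <- unname(coef(lm(d_lwage ~ d_educ_true))[2])
fd_reported <- unname(coef(lm(d_lwage ~ d_educ_rep))[2])

print(c(fd_true = fd_true, fd_reported = fd_reported))
```

The slope based on reported schooling falls below the slope based on true schooling, which is the attenuation that the measurement-error threat refers to.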

2.2 Replication Analysis:

a. Load Ashenfelter and Krueger AER 1994 data.

# Load the foreign package
library(foreign)

# Load Stata data into R
My_Data <- read.dta("/Users/godwinnutsugah/Dropbox/AAEE-UGA/AAEC 8610/HW/HW04/AshenfelterKrueger1994_twins.dta")

b. Reproduce the result from table 3 column 5.

# Load the required library
library(dplyr)

# Create within-pair first differences of log wage and own education
My_Data <- My_Data %>%
  mutate(df_Lwage = lwage1 - lwage2,
         df_own_educ = educ1 - educ2)

# Run a linear regression model with the first difference variables
model <- lm(df_Lwage ~ df_own_educ, data = My_Data)

# load the stargazer package
library(stargazer)

stargazer(model, type="text",
          no.space=TRUE, keep.stat = c("n","rsq"),
          title = "First Difference Estimate of Log Wage Equation for Twins",
          covariate.labels = "Own education", dep.var.labels = "Log Wage")

## 
## First Difference Estimate of Log Wage Equation for Twins
## =========================================
##                   Dependent variable:    
##               ---------------------------
##                        Log Wage          
## -----------------------------------------
## Own education          0.092***          
##                         (0.024)          
## Constant                -0.079*          
##                         (0.045)          
## -----------------------------------------
## Observations              149            
## R2                       0.092           
## =========================================
## Note:         *p<0.1; **p<0.05; ***p<0.01

c. Explain how this coefficient should be interpreted. All else equal, an additional year of own education is associated, on average, with approximately 9.2% higher wages.
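Reading a log-wage coefficient directly as a percentage is an approximation that works well for small coefficients. A quick check of the approximation against the exact conversion, using the 0.092 coefficient reported in the table above (the variable names are chosen here for illustration):

```r
# Coefficient on "Own education" from the first-difference regression above
b_educ <- 0.092

approx_pct <- 100 * b_educ              # log-point approximation
exact_pct  <- 100 * (exp(b_educ) - 1)   # exact percentage change in wages

round(c(approx = approx_pct, exact = exact_pct), 2)
```

The exact figure is about 9.6%, so the 9.2% reading slightly understates the implied percentage change.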

e. Explain how the coefficient on education should be interpreted. All else equal, an additional year of own education is associated, on average, with approximately 8.4% higher wages.

f. Explain how the coefficient on education should be interpreted. The estimates on age and age squared suggest that expected wages increase with age at a diminishing rate. In other words, age has a non-linear partial effect on expected wages, all else equal.
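The "increasing at a diminishing rate" reading follows from the quadratic specification. In the sketch below, $\beta_1$ and $\beta_2$ are assumed labels for the coefficients on age and age squared; they are introduced here for illustration and do not refer to specific reported values.

```latex
\begin{aligned}
\ln w &= \dots + \beta_1\,\text{age} + \beta_2\,\text{age}^2 + \dots \\
\frac{\partial \ln w}{\partial\,\text{age}} &= \beta_1 + 2\beta_2\,\text{age} \\
\text{age}^{*} &= -\frac{\beta_1}{2\beta_2} \quad (\beta_1 > 0,\ \beta_2 < 0)
\end{aligned}
```

With $\beta_1 > 0$ and $\beta_2 < 0$, the marginal effect of age is positive but shrinks as age rises, and it reaches zero at $\text{age}^{*}$, where expected log wages peak.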

On average, male twins earn approximately 20% more than their female counterparts. Whites earn approximately 40% less than non-whites, all else equal.

HW04

Godwin Nutsugah

2023-02-07
