“POLI 30 D” “Axel Chavez”
india <- read.csv('https://raw.githubusercontent.com/umbertomig/POLI30Dpublic/main/data/india.csv')
We will estimate the average causal effect of having a female politician on two policy outcomes. For this purpose, we will analyze data from an experiment conducted in India, where villages were randomly assigned to have a female council head. The dataset we will use is in a file called “india.csv”. The table below shows the names and descriptions of the variables in this dataset, where the unit of observation is the village.
| Variable | Description |
|---|---|
| village | village identifier (“Gram Panchayat number _ village number”) |
| female | whether the village was assigned a female politician: 1=yes, 0=no |
| water | number of new (or repaired) drinking water facilities in the village since random assignment |
| irrigation | number of new (or repaired) irrigation facilities in the village since random assignment |
Because the data set comes from a randomized experiment, we can estimate the average causal effect of having a female politician on the number of new (or repaired) drinking water facilities by comparing the treatment and control groups. The difference-in-means estimator gives this gap: the mean number of new (or repaired) drinking water facilities in the treatment group minus the mean in the control group. Here, the treatment group consists of villages assigned a female politician, and the control group consists of villages without one.
mean(india$water[india$female == 1])
mean(india$water[india$female == 0])
mean(india$water[india$female == 1]) - mean(india$water[india$female == 0])
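The same calculation can be wrapped in a small helper so it is easy to reuse for another outcome. This is only a sketch: the function name `diff_in_means` and the `toy` data frame are made up for illustration and are not part of the assignment or of india.csv.

```r
# Hypothetical toy data (NOT values from india.csv): six villages
toy <- data.frame(
  female = c(1, 1, 1, 0, 0, 0),   # 1 = treatment, 0 = control
  water  = c(10, 20, 30, 5, 10, 15)
)

# Difference-in-means estimator (illustrative helper):
# mean outcome among treated units minus mean outcome among control units
diff_in_means <- function(outcome, treatment) {
  mean(outcome[treatment == 1]) - mean(outcome[treatment == 0])
}

# Treated mean is 20, control mean is 10, so the estimate is 10
diff_in_means(toy$water, toy$female)
```

With the real data, the equivalent call would be `diff_in_means(india$water, india$female)`, assuming `india` was loaded as above.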
library(ggplot2)
ggplot(india, aes(x = water)) + geom_histogram()
## Coding answers here
Answers: Answers here.
(Hint: a scatter plot is the graphical representation of the relationship between two variables. The function in R to create a scatter plot with a fitted line is:
ggplot(data = dataset, aes(x = var_x, y = var_y)) +
geom_point() +
geom_smooth(formula = 'y ~ x', method = 'lm', se = FALSE)
It requires three arguments: (1) the name under which you saved the dataset; (2) the code identifying the variable to be plotted along the x-axis; and (3) the code identifying the variable to be plotted along the y-axis.)
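As a rough illustration of how the template in the hint is filled in, the sketch below applies it to a small made-up data frame. The data frame `toy` and its columns `var_x` and `var_y` are hypothetical and stand in for whatever dataset and variables the question asks about; the sketch assumes the ggplot2 package is installed.

```r
library(ggplot2)  # provides ggplot(), geom_point(), geom_smooth()

# Hypothetical toy data, for illustration only
toy <- data.frame(
  var_x = c(1, 2, 3, 4, 5),
  var_y = c(2, 4, 5, 4, 6)
)

# Scatter plot of var_y against var_x with a fitted straight line
p <- ggplot(data = toy, aes(x = var_x, y = var_y)) +
  geom_point() +                                              # one point per row
  geom_smooth(formula = 'y ~ x', method = 'lm', se = FALSE)   # linear fit, no confidence band

p  # printing the object draws the plot
```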
## Coding answers here
Answers: Answers here.
(Hint: the function in R to compute a correlation coefficient is cor(). It requires two arguments (separated by a comma) and in no particular order: the code identifying each of the two variables.)
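A minimal sketch of the two-argument call described in the hint, assuming the function in question is base R's `cor()`. The vectors `x` and `y` are made-up values chosen so the result is easy to verify by eye; they are not taken from the assignment data.

```r
# Hypothetical toy vectors: y is exactly 2 * x, a perfect positive linear relationship
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)

r_xy <- cor(x, y)  # correlation of x with y
r_yx <- cor(y, x)  # same pair, arguments swapped

r_xy  # 1, since the points lie exactly on an upward-sloping line
r_yx  # identical to r_xy: the order of the arguments does not matter
```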
Answer: Answers here.
Answer: Answers here.