Problem Set 01 - Do Women Promote Different Policies than Men?

(Based on DSS Materials and on Chattopadhyay, Raghabendra, and Esther Duflo. 2004. “Women as Policy Makers: Evidence from a Randomized Policy Experiment in India.” Econometrica, 72 (5): 1409–43.)

We will estimate the average causal effect of having a female politician on two policy outcomes. For this purpose, we will analyze data from an experiment conducted in India, where villages were randomly assigned to have a female council head. The dataset we will use is in a file called “india.csv”. The table below shows the names and descriptions of the variables in this dataset, where the unit of observation is the village.

Variable     Description
village      village identifier (“Gram Panchayat number _ village number”)
female       whether village was assigned a female politician: 1=yes, 0=no
water        number of new (or repaired) drinking water facilities in the village since random assignment
irrigation   number of new (or repaired) irrigation facilities in the village since random assignment

In this problem set, we will practice loading data, making sense of it, and understanding the basics of causal inference. We will also learn how to use R Markdown.


  1. Use the function read.csv() to read the CSV file “india.csv”. You can find it on the GitHub page: https://raw.githubusercontent.com/umbertomig/POLI30Dpublic/main/data/india.csv. Use the assignment operator <- to store the data in an object called india. Provide the R code you used. (1 point)
# Your coding answers here.
india <- read.csv("https://raw.githubusercontent.com/umbertomig/POLI30Dpublic/main/data/india.csv")

  1. Use the function head() to view the first few observations of the dataset. Provide the R code you used. (1 point)
# Your coding answers here.
head(india)
##        village female water irrigation
## 1 GP1_village2      1    10          0
## 2 GP1_village1      1     0          5
## 3 GP2_village2      1     2          2
## 4 GP2_village1      1    31          4
## 5 GP3_village2      0     0          0
## 6 GP3_village1      0     0          0

  1. What does each observation in this dataset represent? Please substantively interpret the first observation in the dataset. (1 point)

Answer: Each observation in this dataset represents one village in the experiment. For each village, the dataset records whether it was assigned a female politician, the number of new or repaired drinking water facilities since random assignment, and the number of new or repaired irrigation facilities since random assignment. The first observation is village 2 in Gram Panchayat 1 (GP1_village2): it was assigned a female politician, and since random assignment it had 10 new or repaired drinking water facilities and 0 new or repaired irrigation facilities.


  1. For each variable in the dataset, please identify the type of variable (character vs. numeric binary vs. numeric non-binary) (1 point)

Answer: The ‘village’ variable is character, the ‘female’ variable is numeric binary, the ‘water’ variable is numeric non-binary, and the ‘irrigation’ variable is numeric non-binary.
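One way to check these types is to ask R directly. The sketch below rebuilds only the six rows printed by head() above as a small data frame, so it runs without downloading the file. The name india_head is an assumption made for illustration and is not part of the problem set. The code reports the class of each column and whether the column takes only the values 0 and 1.

```r
# Illustrative only: the six rows printed by head(india) above,
# typed in by hand (india_head is a hypothetical name).
india_head <- data.frame(
  village = c("GP1_village2", "GP1_village1", "GP2_village2",
              "GP2_village1", "GP3_village2", "GP3_village1"),
  female = c(1, 1, 1, 1, 0, 0),
  water = c(10, 0, 2, 31, 0, 0),
  irrigation = c(0, 5, 2, 4, 0, 0),
  stringsAsFactors = FALSE  # keep village as character in older R versions
)

# Class of each column
var_classes <- sapply(india_head, class)
print(var_classes)

# A variable counts as numeric binary here if it is numeric
# and only takes the values 0 and 1
is_binary <- sapply(india_head, function(x) is.numeric(x) && all(x %in% c(0, 1)))
print(is_binary)
```

When the full file is loaded with read.csv(), whole-number columns are typically reported as "integer" rather than "numeric"; both are numeric types, so the classification above is unchanged. Six rows are not enough to prove that a variable is binary in the full dataset, so the same check should be run on india itself.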


  1. How many observations are in the dataset? In other words, how many villages were part of this experiment? (Hint: the function dim() might be helpful here.) (1 point)
# Your coding answers here.
dim(india)
## [1] 322   4

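Answer: The dataset has 322 observations (rows) and 4 variables (columns), so 322 villages were part of this experiment.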

  1. Use the function mean() to calculate the average of the variable female. Please provide a full substantive interpretation of what this average means. (1 point)
# Your coding answers here.
mean(india$female)
## [1] 0.3354037

Answer: Because ‘female’ is a binary variable, its mean is the proportion of observations coded 1. About 33.5% of the villages in the experiment were assigned a female council head.
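The mean of a 0/1 variable equals the share of observations coded 1, which is why the average of female can be read as a proportion. A minimal sketch, using only the six female values printed by head() above (the vector name female_head is assumed for illustration; the result describes those six rows, not the full dataset):

```r
# Illustrative only: the six values of female shown by head(india)
female_head <- c(1, 1, 1, 1, 0, 0)

# Mean of the 0/1 variable
mean_female <- mean(female_head)

# Share of observations coded 1, computed by counting
share_female <- sum(female_head == 1) / length(female_head)

print(mean_female)
print(share_female)

# The two quantities agree (up to floating-point tolerance)
print(isTRUE(all.equal(mean_female, share_female)))
```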


  1. Use the function mean() to calculate the average of the variable water. Please provide a full substantive interpretation of what this average means. Make sure to provide the unit of measurement. (1 point)
# Your coding answers here.
mean(india$water)
## [1] 17.84161

Answer: On average, villages in the experiment had about 17.8 new or repaired drinking water facilities since random assignment.


  1. If we wanted to estimate the average causal effect of having a female politician on the number of new (and repaired) drinking water facilities: (1 point)
    1. What would be the treatment variable? Please provide the name of the variable
    2. What would be the outcome variable? Please provide the name of the variable

Answer: The treatment variable is ‘female’, which indicates whether the village was assigned a female politician. The outcome variable is ‘water’, the number of new or repaired drinking water facilities in the village since random assignment.


  1. If we wanted to estimate the average causal effect of having a female politician on the number of new (and repaired) irrigation facilities: (1 point)
    1. What would be the treatment variable? Please provide the name of the variable
    2. What would be the outcome variable? Please provide the name of the variable

Answer: The treatment variable is ‘female’, which indicates whether the village was assigned a female politician. The outcome variable is ‘irrigation’, the number of new or repaired irrigation facilities in the village since random assignment.


  1. In both analyses above: (1 point)
    1. What would be the treatment group?
    2. What would be the control group?

Answer: The treatment group is the set of villages that were assigned a female politician (female = 1). The control group is the set of villages that were not assigned a female politician (female = 0).
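The two groups can be formed in R by subsetting on the treatment variable. The sketch below does this for the six rows printed by head() above and then compares group averages with a difference in means, a common estimator of the average causal effect in a randomized experiment. The questions above do not ask for this step, so the estimator choice is an assumption here. The names india_head, treated, and control are also assumed for illustration, and the numbers describe six hand-typed rows, not the experiment's results.

```r
# Illustrative only: the six rows printed by head(india) above
# (india_head is a hypothetical name; this is NOT the full dataset).
india_head <- data.frame(
  female = c(1, 1, 1, 1, 0, 0),
  water = c(10, 0, 2, 31, 0, 0),
  irrigation = c(0, 5, 2, 4, 0, 0)
)

# Treatment group: villages assigned a female politician
treated <- india_head[india_head$female == 1, ]
# Control group: villages not assigned a female politician
control <- india_head[india_head$female == 0, ]

n_treated <- nrow(treated)
n_control <- nrow(control)
print(c(treated = n_treated, control = n_control))

# Difference in means: average outcome among treated villages
# minus average outcome among control villages
diff_water <- mean(treated$water) - mean(control$water)
diff_irrigation <- mean(treated$irrigation) - mean(control$irrigation)
print(diff_water)
print(diff_irrigation)
```

Replacing india_head with the full india data frame would apply the same steps to all 322 villages.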