Recall “Causal effect of intergroup contact on exclusionary attitudes” by Ryan Enos, PNAS, March 11, 2014, 111 (10), 3699-3704 (pdf). We will explore these data using the techniques from Chapters 19 and 20.
Prompt: Instead of focusing on the change in attitude, which is what Enos does, let’s start by looking at the effect of treatment on att_end, each person’s attitude toward immigration in the final survey, after the experiment is complete. Use stan_glm() to estimate and interpret a model, called model_1, in which att_end is the dependent variable and treatment is the explanatory variable. Provide some intuition about:
model_1 <- stan_glm(att_end ~ treatment, data = train, refresh = 0)
model_1
## stan_glm
## family: gaussian [identity]
## formula: att_end ~ treatment
## observations: 115
## predictors: 2
## ------
## Median MAD_SD
## (Intercept) 8.5 0.4
## treatmentTreated 1.5 0.5
##
## Auxiliary parameter(s):
## Median MAD_SD
## sigma 2.8 0.2
##
## ------
## * For help interpreting the printed output see ?print.stanreg
## * For info on the priors used see ?prior_summary.stanreg
Why is the intercept 8.5? It is the average ending attitude (att_end) for people in the control group, i.e., those who did not receive the treatment.
Why is the treatment effect 1.5? Treated people ended up more exclusionary: on average, the treated group’s ending attitude is about 1.5 points higher (more exclusionary) than the control group’s.
Why is sigma 2.8? Sigma is the residual standard deviation, the variability in att_end that the model does not explain, so there is still a lot of uncertainty in any individual prediction.
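A quick check on those two numbers (a sketch, assuming dplyr is loaded): with a single binary predictor, the intercept is the control-group mean and the coefficient is the difference in group means.
# Control mean should be near 8.5; treated mean near 8.5 + 1.5 = 10.
train %>%
  group_by(treatment) %>%
  summarize(avg_att_end = mean(att_end))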
Also, provide a sentence about the 90% confidence interval for the treatment effect with a Bayesian interpretation. Given our model and data, there is a 90% probability that the true treatment effect lies within this interval.
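One way to pull that interval straight from the fit (a sketch; the exact endpoints depend on the posterior draws):
# 90% posterior (credible) interval for the treatment coefficient.
posterior_interval(model_1, prob = 0.90, pars = "treatmentTreated")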
Prompt: Create a new model, model_2, which is just like model_1 but which includes att_start as an additional regressor. Interpret the associated coefficients.
model_2 <- stan_glm(att_end ~ treatment + att_start, data = train, refresh = 0)
model_2
## stan_glm
## family: gaussian [identity]
## formula: att_end ~ treatment + att_start
## observations: 115
## predictors: 3
## ------
## Median MAD_SD
## (Intercept) 1.4 0.4
## treatmentTreated 0.9 0.2
## att_start 0.8 0.0
##
## Auxiliary parameter(s):
## Median MAD_SD
## sigma 1.3 0.1
##
## ------
## * For help interpreting the printed output see ?print.stanreg
## * For info on the priors used see ?prior_summary.stanreg
The intercept is now 1.4. Provide an interpretation. The intercept is the expected att_end for an untreated person with att_start = 0, i.e., someone who started with no exclusionary bias at all (a very inclusionary starting attitude).
Sigma is now 1.3. Why? What does that mean? Sigma is the residual standard deviation, the variation the model cannot explain. Adding att_start explains much of the variation in att_end, so sigma drops from 2.8 to 1.3 and the model’s predictions are better.
How do the inferences you would draw from model_1 differ from those you would draw from model_2? Whether you are treated or not, your starting attitude matters. Once att_start is included, the estimated treatment effect shrinks from about 1.5 to about 0.9, which suggests the treatment did not create the attitude change on its own; people’s starting attitudes feed into att_end as well.
Which model is the truth? Neither is the truth, but model_2 is closer to it: in the real world people’s existing biases matter, and its smaller sigma shows it leaves less of the variation in att_end unexplained.
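If we want more than an eyeballed comparison of the two sigmas, one option (a sketch, assuming both models are fit as above) is approximate leave-one-out cross-validation with rstanarm’s loo() tools:
# Higher elpd means better expected out-of-sample prediction.
loo_1 <- loo(model_1)
loo_2 <- loo(model_2)
loo_compare(loo_1, loo_2)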
Prompt: Let’s consider interactions. Create a new model, model_3, which is just like model_1 but which includes att_start, male, treatment, and the interaction between male and treatment as regressors. Interpret the associated coefficients. Is the treatment effect different for men?
model_3 <- stan_glm(att_end ~ att_start + male * treatment, data = train, refresh = 0)
train
## # A tibble: 115 x 9
## male liberal republican age income treatment att_start att_end att_chg
## <int> <int> <int> <dbl> <dbl> <chr> <dbl> <dbl> <dbl>
## 1 0 0 0 31 135000 Treated 11 11 0
## 2 0 0 1 34 105000 Treated 9 10 1
## 3 1 1 0 63 135000 Treated 3 5 2
## 4 1 0 0 45 300000 Treated 11 11 0
## 5 1 1 0 55 135000 Control 8 5 -3
## 6 0 0 0 37 87500 Treated 13 13 0
## 7 0 0 1 53 87500 Control 13 13 0
## 8 1 0 0 36 135000 Treated 10 11 1
## 9 0 0 0 54 105000 Control 12 12 0
## 10 1 0 1 42 135000 Treated 9 10 1
## # … with 105 more rows
# Two hypothetical treated people with att_start = 9: a man (male = 1) and a woman (male = 0).
x <- data.frame(att_start = c(9, 9),
                male = c(1, 0),
                treatment = c("Treated", "Treated"))
res_mean <- posterior_linpred(model_3, newdata = x, transform = TRUE)
# Uncertainty in each person's expected att_end:
quantile(res_mean[,1], probs = c(0.025, 0.5, 0.975))
## 2.5% 50% 97.5%
## 8.692380 9.169312 9.663389
quantile(res_mean[,2], probs = c(0.025, 0.5, 0.975))
## 2.5% 50% 97.5%
## 9.366377 9.886288 10.401815
# posterior_linpred() gives the expected value of att_end: it plugs the posterior
# draws of the coefficients into the model, so its interval reflects uncertainty
# about the parameters but leaves out the residual error (sigma).
# posterior_predict() adds that residual error back in, capturing ALL the
# uncertainty, which is what you want when forecasting an individual's att_end.
From model_3, the estimated treatment effect is about 1.3 for women and about 0.6 for men (1.3 minus the 0.7 interaction term), so the treatment effect looks smaller for men.
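To check the 0.6 figure for men directly, we can combine the relevant posterior draws (a sketch; the interaction column name "male:treatmentTreated" is assumed from R's default term naming):
# Matrix of posterior draws for all coefficients in model_3.
draws <- as.matrix(model_3)
# Treatment effect for men = main treatment effect + male interaction.
male_effect <- draws[, "treatmentTreated"] + draws[, "male:treatmentTreated"]
quantile(male_effect, probs = c(0.05, 0.5, 0.95))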
Imagine we have one man and one woman, both with att_start = 9. We are interested in two things.
First, what is the unobservable predictor of the true att_end for each person if given the treatment? Hint: posterior_linpred(). What is the 95% confidence interval?
Second, if we expose them to the treatment, what will their att_end be? Hint: posterior_predict(). What is a 95% confidence interval for this forecast?
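For the second question, a sketch of the forecast, reusing the newdata object x from above; posterior_predict() folds in the residual error (sigma), so these intervals should be noticeably wider than the posterior_linpred() ones:
# Predicted att_end for the same two hypothetical treated people.
res_pred <- posterior_predict(model_3, newdata = x)
quantile(res_pred[, 1], probs = c(0.025, 0.5, 0.975))   # the man (male = 1)
quantile(res_pred[, 2], probs = c(0.025, 0.5, 0.975))   # the woman (male = 0)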
Prompt: Enos does not estimate this model. Instead, he uses a model with att_chg as the outcome variable. Use stan_glm() to estimate and interpret a model, called model_4, in which att_chg is the dependent variable and treatment is the explanatory variable.
How does the estimated treatment effect differ between model_1 and model_4? What causes that difference? Which one is correct?
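A sketch of how that model might be fit (same call pattern as model_1, with the outcome swapped):
# Attitude change as the outcome, treatment as the only predictor.
model_4 <- stan_glm(att_chg ~ treatment, data = train, refresh = 0)
model_4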
Prompt: Create a tibble, called scene_5, which fits the same model as in Scene 1, but for four sub-groups separately: combinations of male/female and Republican/Non-Republican. Before running the regression, what do you predict you will find? Will the treatment effect vary across these groupings? Why?
Hints: You want to create a new variable which defines your four blocks, then nest() the data by that variable. The Primer provides some useful examples; a sketch of one approach appears just below.
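One way this might look (a sketch, assuming dplyr, tidyr, purrr, and rstanarm are loaded; the block labels are just illustrative):
scene_5 <- train %>%
  # Define the four blocks from the male and republican indicators.
  mutate(block = case_when(male == 1 & republican == 1 ~ "Male Republican",
                           male == 1 & republican == 0 ~ "Male Non-Republican",
                           male == 0 & republican == 1 ~ "Female Republican",
                           TRUE ~ "Female Non-Republican")) %>%
  group_by(block) %>%
  nest() %>%
  # Fit the Scene 1 model within each block.
  mutate(model = map(data, ~ stan_glm(att_end ~ treatment, data = .x, refresh = 0)))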
After running the analysis, interpret the intercept and coefficient estimates across the models. Do they match your predictions? Is there evidence of varying treatment effects?
It seems that the treatment only has an effect on two of the four sub-groups. Tell me a story about why that might be the case.
Make a cool animation with the train data, using the d3rain package. Start with each person’s starting attitude; then they either get treatment or control; and then they end up with their ending attitude. Animate the people as dots, moving (on a train!?) from where they start to where they finish.
remotes::install_github("daranzolin/d3rain")
## Skipping install of 'd3rain' from a github remote, the SHA1 (e0b06577) has not changed since last install.
## Use `force = TRUE` to force installation
library(dplyr)
library(d3rain)
# Animate each person's change in attitude, colored by treatment group.
train %>%
  mutate(treat = as.factor(treatment)) %>%
  arrange(att_chg, att_start) %>%
  d3rain(att_chg, treat, toolTip = treat, title = "Change in Attitude") %>%
  drip_settings(dripSequence = 'iterate',
                ease = 'bounce',
                jitterWidth = 20,
                dripSpeed = 1000,
                dripFill = 'blue') %>%
  chart_settings(fontFamily = 'times',
                 yAxisTickLocation = 'left')
Prompt: Go to the joint repo for final projects: https://github.com/GOV-1006-Spring-2020/papers. We will spend 20 minutes on this. Each person gets 20/N minutes. Allow everyone to read your abstract. Each person must then make a comment or suggestion on the abstract. Exact word choice matters. Refer to our guidance. (Version 2 distributed at the start of class.) Then, open your PDF. Give a brief tour. Talk about your extension. Get some feedback.