Today’s Data

Load the data set boston.csv

boston <- read.csv("boston.csv")

1. R Cheat sheet

Set the working directory

Operators

Functions - Explore the data

Modify your data

Functions - descriptives and visualizations

Functions - Regression Analysis

- lm() fits a linear model. It requires a formula of the type Y ~ X, where Y identifies the outcome variable and X identifies the predictor: lm(data$y_var ~ data$x_var) or lm(y_var ~ x_var, data = data)
- summary(lm()) shows a summary of the linear model
- abline() adds a straight line to a graph. To add the fitted line, pass the object that contains the output of lm() as the main argument: fit <- lm(Y ~ X); abline(fit)
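To make these three functions concrete, here is a minimal sketch on simulated data. The data frame `sim`, its columns `x_var` and `y_var`, and the chosen intercept and slope are assumptions made up for illustration; they are not part of boston.csv.

```r
set.seed(1)                      # make the simulated data reproducible
x <- 1:20
y <- 2 + 3 * x + rnorm(20)       # assumed true intercept 2 and slope 3, plus noise

# Hypothetical data frame, for illustration only
sim <- data.frame(x_var = x, y_var = y)

# Fit the linear model with the formula interface
fit <- lm(y_var ~ x_var, data = sim)
summary(fit)                     # coefficients, standard errors, R-squared

# When run as a script, draw to a null device instead of writing a file
if (!interactive()) pdf(NULL)
plot(sim$x_var, sim$y_var)       # scatter plot of the raw data
abline(fit)                      # add the fitted regression line
```

The estimated slope should land close to 3, since that is the value used to simulate the data.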

2. Case - Causal effect

The names and descriptions of variables in the data set

Name Description
age Age of individual at time of experiment
male Sex of individual, male (1) or female (0)
income Income group in dollars (not exact income)
white Indicator variable for whether individual identifies as white (1) or not (0)
college Indicator variable for whether individual attended college (1) or not (0)
usborn Indicator variable for whether individual is born in the US (1) or not (0)
treatment Indicator variable for whether an individual was treated (1) or not (0)
ideology Self-placement on ideology spectrum from Very Liberal (1) through Moderate (3) to Very Conservative (5)
numberim.pre Policy opinion on question about increasing the number of immigrants allowed in the country from Increased (1) to Decreased (5)
numberim.post Same question as above, asked later
remain.pre Policy opinion on question about allowing the children of undocumented immigrants to remain in the country from Allow (1) to Not Allow (5)
remain.post Same question as above, asked later
english.pre Policy opinion on question about passing a law establishing English as the official language from Not Favor (1) to Favor (5)
english.post Same question as above, asked later

Always start by exploring the data

First, let’s get a sense of this data. Use head() to take a quick look at it. What are its dimensions? Calculate the mean of a variable.

head(boston)
##   age male income white college usborn treatment ideology numberim.pre
## 1  31    0 135000     1       1      1         1        3            5
## 2  34    0 105000     1       1      0         1        4            1
## 3  63    1 135000     1       1      1         1        2            1
## 4  45    1 300000     1       1      1         1        4            3
## 5  55    1 135000     1       1      1         0        2            3
## 6  37    0  87500     1       1      1         1        5            3
##   numberim.post remain.pre remain.post english.pre english.post
## 1             4          2           3           4            4
## 2             2          5           5           3            3
## 3             3          1           1           1            1
## 4             3          4           4           4            4
## 5             2          1           1           4            2
## 6             3          5           5           5            5
dim(boston)
## [1] 115  14
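The prompt above also asks for the mean of a variable. A minimal sketch, assuming we only have the six rows printed by head(): `boston_head` is a hypothetical stand-in built from that output so the example runs without the CSV file. With the real data, the same calls apply to `boston`.

```r
# Illustrative data: three columns from the six rows shown by head(boston)
boston_head <- data.frame(
  age       = c(31, 34, 63, 45, 55, 37),
  male      = c(0, 0, 1, 1, 1, 0),
  treatment = c(1, 1, 1, 1, 0, 1)
)

dim(boston_head)                     # number of rows, number of columns

# Mean of a numeric variable, selected with $
mean_age <- mean(boston_head$age)
mean_age

# The mean of a 0/1 indicator is the share of ones,
# here the proportion of respondents who were treated
mean(boston_head$treatment)
```

If a variable had missing values, mean(x, na.rm = TRUE) would drop them before averaging.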

Average Treatment Effect

Our goal is to calculate the average treatment effect on the change in attitudes about the number of immigrants allowed in the country (numberim.pre and numberim.post).

  1. Change: post - pre

  2. Average change of the treatment group

  3. Average change of the control group

  4. Difference of means: treatment group’s change - control group’s change

  1. Calculate the change in attitudes (post - pre)

boston$change <- 
  boston$numberim.post - boston$numberim.pre
  2. Compute the change in attitude for the treatment group (treatment == 1)
treat.change <- 
  mean(boston$change[boston$treatment == 1])
treat.change
## [1] 0.1176471

The treatment group became more exclusionary: on average, its answers moved about 0.12 points toward Decreased (5).

  3. Compute the change in attitude for the control group (same, but now treatment == 0)
ctrl.change <- 
  mean(boston$change[boston$treatment == 0])
ctrl.change
## [1] -0.1875
  4. Finally, compute the difference in mean change between the treatment group and the control group
treat.change - ctrl.change
## [1] 0.3051471

The change in the treatment group was more exclusionary than the change in the control group by about 0.31 points. We interpret this difference as the causal effect: exposure to simulated demographic changes caused this increase in exclusionary attitudes.
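The four steps can also be wrapped in a small function so they are easy to reuse for other outcome variables. This is only a sketch: `ate_change` is a hypothetical helper, not part of the original handout, and `boston_head` holds just the six rows printed by head() so the example runs without the CSV file. Six rows are far too few for a meaningful estimate.

```r
# Hypothetical helper: difference in mean pre-to-post change
# between the treatment group and the control group
ate_change <- function(data, pre, post, treat = "treatment") {
  change     <- data[[post]] - data[[pre]]         # 1. change post - pre
  treat_mean <- mean(change[data[[treat]] == 1])   # 2. treatment group
  ctrl_mean  <- mean(change[data[[treat]] == 0])   # 3. control group
  treat_mean - ctrl_mean                           # 4. difference of means
}

# Illustrative data: the six rows shown by head(boston)
boston_head <- data.frame(
  treatment     = c(1, 1, 1, 1, 0, 1),
  numberim.pre  = c(5, 1, 1, 3, 3, 3),
  numberim.post = c(4, 2, 3, 3, 2, 3)
)

ate_six <- ate_change(boston_head, "numberim.pre", "numberim.post")
ate_six
```

On the full data set, ate_change(boston, "numberim.pre", "numberim.post") should reproduce the 0.3051471 computed step by step above.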

3. Let’s Practice

Calculate the ATE of the experiment, but instead of the number of immigrants, use as the dependent variable the policy opinion on allowing the children of undocumented immigrants to remain in the country (remain.pre and remain.post), from Allow (1) to Not Allow (5).

Follow the steps:

# 1. Calculate the change in attitudes (post-pre)
boston$change <- 
  boston$remain.post - boston$remain.pre
# 2. Compute change in attitude for the treatment group (`treatment == 1`)
treat.change <- 
  mean(boston$change[boston$treatment == 1])
treat.change
## [1] 0.2156863
# 3. Compute change in attitude for the control group (same, but now `treatment == 0`)
ctrl.change <- 
  mean(boston$change[boston$treatment == 0])
ctrl.change
## [1] -0.109375
# 4. Finally, compute the difference of the mean change between treatment group and control group
treat.change - ctrl.change
## [1] 0.3250613
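As a cross-check, the difference of means can also be read off a regression: when the only predictor is a 0/1 indicator, the lm() slope equals the difference between the two group means. The sketch below assumes only the six rows printed by head(); `boston_head` is a hypothetical stand-in for the full data, so its numbers will not match the results above.

```r
# Illustrative data: the six rows shown by head(boston)
boston_head <- data.frame(
  treatment   = c(1, 1, 1, 1, 0, 1),
  remain.pre  = c(2, 5, 1, 4, 1, 5),
  remain.post = c(3, 5, 1, 4, 1, 5)
)
boston_head$change <- boston_head$remain.post - boston_head$remain.pre

# Difference of means, computed by hand as in the steps above
diff_means <- mean(boston_head$change[boston_head$treatment == 1]) -
  mean(boston_head$change[boston_head$treatment == 0])

# The same quantity as the slope on the treatment indicator
fit   <- lm(change ~ treatment, data = boston_head)
slope <- unname(coef(fit)["treatment"])

all.equal(diff_means, slope)   # the two approaches agree
```

With the real data, replacing `boston_head` with `boston` should give a slope equal to the 0.3250613 obtained in step 4.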