Friday May 3, 2013: Stats 155

Causation and Experiment

Last session we looked at hypothetical causal networks. There were pretty simple rules for including or excluding a covariate in order to open or block a pathway. But:

There's no guarantee that our hypothesis corresponds to reality.

We need a reliable and authoritative way to establish the existence and strength of a causal link, that works regardless of the correspondence between our hypothesis and reality.

Consider the mediator network:

fetchData("simulate.r")

## [1] TRUE

cnet.mediator

## Causal Network with  3  vars:  B, C, A 
## ===============================================
## B is exogenous
## C <== B 
## A <== C

We can see what the structure of the network is, but suppose we want to discover the causal structure without being able to see the network. Here's how:

Pick two variables you're interested in seeing the connection between, say A and B.
Create an intervention:

myintervention = rnorm(100)

Apply the intervention to one of the variables:

toA = run.sim(cnet.mediator, 100, A = myintervention)

Look for a correlation between your intervention and the target variable

summary(lm(B ~ A, data = toA))

## 
## Call:
## lm(formula = B ~ A, data = toA)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.2225 -0.5493  0.0122  0.5152  2.1900 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.00357    0.08725    0.04     0.97
## A           -0.10015    0.09188   -1.09     0.28
## 
## Residual standard error: 0.872 on 98 degrees of freedom
## Multiple R-squared: 0.012,   Adjusted R-squared: 0.0019 
## F-statistic: 1.19 on 1 and 98 DF,  p-value: 0.278

Now go the other way:

toB = run.sim(cnet.mediator, 100, B = myintervention)
summary(lm(A ~ B, data = toB))

## 
## Call:
## lm(formula = A ~ B, data = toB)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -4.037 -0.908  0.155  1.027  3.695 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   0.0369     0.1532    0.24     0.81    
## B            -1.0478     0.1613   -6.50  3.4e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## Residual standard error: 1.53 on 98 degrees of freedom
## Multiple R-squared: 0.301,   Adjusted R-squared: 0.294 
## F-statistic: 42.2 on 1 and 98 DF,  p-value: 3.44e-09

Setting a categorical intervention

A couple of ways:

myvals = resample(c("N", "Y"), 100)
myvals2 = rep(c("N", "Y"), 50)

Activity

Use this technique to map out the causal connections on these networks campaign.spending, cnet.common.cause, cnet.witness, cnet.chain, aspirin, jock,heights

How to set the intervention?

We want to maximize the variance but not go above a certain limit (as in the dose of aspirin). An efficient way to do this is to make an intervention that's of the form 0, 1, 0, 1

An experiment with heart rate

We're going to do an experiment to see if there is a relationship between posture (standing or sitting) and heart rate.

Ask every second student to stand up.
Put your finger against your carotid artery and press gently until you feel your pulse.
At the signal, start counting heart beats. Count until the next signal (after 15 seconds).
Your heart rate in beats per second is 4 times the count

We've just done an experiment, a procedure in which we impose certain conditions and measure the result. In this case, the condition we imposed was sitting versus standing.

We do experiments in order to measure with some precision the relationships involved. Let's tally up the results of this experiment:

Enter the data into this spreadsheet. There is one column for each of the three meeting times for the course. Note that the posture data has been pre-entered, so enter the sitting people first and the standing people after.

Now we'll analyze the data:

s = fetchGoogle("https://docs.google.com/spreadsheet/pub?key=0Am13enSalO74dE1COTRrYWVHN3dNR09TZHNLTkJnbHc&single=true&gid=0&output=csv")

## Loading required package: RCurl

## Warning: package 'RCurl' was built under R version 2.15.2

## Loading required package: bitops

Plot out the density broken down by posture and a bwplot.

Now, the model:

coef(summary(lm(HR1050 ~ Posture, data = s)))

##                 Estimate Std. Error t value  Pr(>|t|)
## (Intercept)       66.333      4.089  16.222 3.452e-12
## Posturestanding    8.212      5.514   1.489 1.537e-01

Improving the Experiment

Could we improve the experiment. Several ways:

Measure HR more precisely.
Specify posture more precisely.
Collect more data from more people.

But our focus here is on how to improve the experiment by a better design.

To explore this, here is a program that will generate simulated HR data:

fetchData("hr.R")

## Retrieving from http://www.mosaic-web.org/go/datasets/hr.R

## [1] TRUE

fetchData("simulate.r")

## Retrieving from http://www.mosaic-web.org/go/datasets/simulate.r

## [1] TRUE

To run it, you specify the subject ID numbers and the posture of each patient, which must be one of “sitting”, “standing”, “lying”. Here's an example of running it on 10 people, half of whom are standing and half sitting. NOTE the expression 1:10 is creating the numbers 1, 2, 3, …, 10. That is, 1:10 makes 10 different people for the experiment. The levels of the posture variable are recycled.

s1 = heart.rate(who = 1:10, posture = c("sitting", "standing"))
s1

##    who  posture hr sex weight
## 1   S1  sitting 52   F  75.27
## 2   S2 standing 54   M  79.02
## 3   S3  sitting 53   M  63.18
## 4   S4 standing 62   M  52.30
## 5   S5  sitting 51   F  66.33
## 6   S6 standing 66   M  84.19
## 7   S7  sitting 70   F  69.76
## 8   S8 standing 61   F  66.96
## 9   S9  sitting 50   F  46.71
## 10 S10 standing 64   F  50.95

Can we see the difference between the two different postures?

coef(summary(lm(hr ~ posture, data = s1)))

##                 Estimate Std. Error t value  Pr(>|t|)
## (Intercept)         55.2      3.008  18.349 8.009e-08
## posturestanding      6.2      4.254   1.457 1.831e-01

No significant difference.

ACTIVITY:

Run the simulation several times. What's the power for a significance level of 0.05? For a significance level of 0.01?
Assign different groups of students to run experiments of size $ n=20 $, $ n=30 $, $ n=50 $, and so on. What sample size will give a power of 80% at a significance level of 0.01?

This may seem odd at first, but let's add who as an explanatory variable:

summary(lm(hr ~ posture + who, data = s1))

## 
## Call:
## lm(formula = hr ~ posture + who, data = s1)
## 
## Residuals:
## ALL 10 residuals are 0: no residual degrees of freedom!
## 
## Coefficients: (1 not defined because of singularities)
##                 Estimate Std. Error t value Pr(>|t|)
## (Intercept)           52         NA      NA       NA
## posturestanding        9         NA      NA       NA
## whoS10                 3         NA      NA       NA
## whoS2                 -7         NA      NA       NA
## whoS3                  1         NA      NA       NA
## whoS4                  1         NA      NA       NA
## whoS5                 -1         NA      NA       NA
## whoS6                  5         NA      NA       NA
## whoS7                 18         NA      NA       NA
## whoS8                 NA         NA      NA       NA
## whoS9                 -2         NA      NA       NA
## 
## Residual standard error: NaN on 0 degrees of freedom
## Multiple R-squared:    1,    Adjusted R-squared:  NaN 
## F-statistic:  NaN on 9 and 0 DF,  p-value: NA

Why did the p-values disappear?

Now a slight variation on the experiment:

s2 = heart.rate(who = rep(1:5, each = 2), posture = c("sitting", "standing"))
s2

##    who  posture hr sex weight
## 1   S1  sitting 51   F  75.27
## 2   S1 standing 55   F  75.27
## 3   S2  sitting 52   M  79.02
## 4   S2 standing 54   M  79.02
## 5   S3  sitting 53   M  63.18
## 6   S3 standing 55   M  63.18
## 7   S4  sitting 60   M  52.30
## 8   S4 standing 62   M  52.30
## 9   S5  sitting 52   F  66.33
## 10  S5 standing 54   F  66.33

Can we see the difference between the two different postures in this new experiment?

coef(summary(lm(hr ~ posture, data = s2)))

##                 Estimate Std. Error t value  Pr(>|t|)
## (Intercept)         53.6      1.575  34.036 6.067e-10
## posturestanding      2.4      2.227   1.078 3.126e-01

coef(summary(lm(hr ~ posture + who, data = s2)))

##                   Estimate Std. Error    t value  Pr(>|t|)
## (Intercept)      5.180e+01     0.4899  1.057e+02 4.797e-08
## posturestanding  2.400e+00     0.4000  6.000e+00 3.883e-03
## whoS2           -8.119e-15     0.6325 -1.284e-14 1.000e+00
## whoS3            1.000e+00     0.6325  1.581e+00 1.890e-01
## whoS4            8.000e+00     0.6325  1.265e+01 2.249e-04
## whoS5           -9.936e-15     0.6325 -1.571e-14 1.000e+00

Why is the estimate from the second model much better than from the first?

We've introduced a new explanatory variable, who that accounts for a lot of the case-by-case variability.
We couldn't do this in the original experiment, because who would have provided a perfect fit.

Principles of Experimental Design

Replicate! You need to have many subjects. Large $ n $ gives higher precision. Remember, the standard error is proportional to $ 1/\sqrt{n-m} $, where $ m $ is the number of model terms. You need to have some residual in order to know what your error is.
Impose controlled variation! That's why we had half the people stand up and half sit down. To create variation.
Minimize uncontrolled variation. Make your experimental conditions as similar as possible (outside of the controlled variation).
Make your controlled variation perpendicular to your uncontrolled variation.
When you have multiple variables that you impose, make them mutually orthogonal. Indeed, if you expect to be fitting a model with interaction terms, etc., arrange your intervention so that all the terms are mutually orthogonal. That will give you better estimation properties.

Why the Second Experiment was Better

A major source of uncontrolled variation in the heart rate is the person-to-person variability. There's nothing we can do about that directly. However, we can try to arrange the experimental treatment to be orthogonal to the uncontrolled variation.

You can measure the angle between the imposed treatment and the uncontrolled variation with R^2.

r.squared(lm(as.numeric(posture) ~ who, data = s1))

## [1] 1

r.squared(lm(as.numeric(posture) ~ who, data = s2))

## [1] 3.747e-31

In the second experiment, we had each person participate twice. Intuitively, you can think of this as each person serving as his or her own control. Mathematically, it's a matter of using the who variable to eat up variance, and ensuring that there is little or no overlap between what's explained by the who variable and what's explained by the posture variable.

This gives us a much better experiment: smaller $ n $ for a given power, better precision in the measurement. This is called a cross-over experiment.

Review orthogonality in terms of dot product, R²

When you can't do a cross-over

There are many settings when you only have one opportunity for each experimental unit. For example, suppose instead of heart rate we were studying survival. How should we arrange posture?

Now it will be impossible to impose a posture that's orthogonal to who. But we can come close. Assign the posture randomly. posture = resample(c("sitting","standing"), 1000)

Show that this is approximately orthogonal to both sex and weight.

Blocking

We don't mean blocking as in football or as in blocking a path in a hypothetical causal network. The “blocking” here refers to arranging your experimental units in groups. Within each group, the units are similar to one another.

Blocking according to sex. Break into two groups, one for males and one for females. Assign the treatment randomly within each group.
Blocking according to weight. Break into groups of, say, two. The people with the closest weights. Then randomly assign one person to “sitting” and one person to “standing” within each group.
Blocking according to both sex and weight: Divide into males and females. Then subdivide each of those groups into small groups according to weight. Randomly assign “sitting” and “standing” within each group.

Randomization is a Procedure

You should be able to document your randomization procedure. It's not really enough to say, “We did it at random.”

Experiment for Causation

Often, especially in the social sciences, experiments are done to probe causal paths.

The basic idea is that by imposing a treatment, you alter the network. If you assign posture, then all the other factors that might have been connected causally to posture are disconnected — only the downstream elements remain.

Observation versus experiment with the campaign spending simulation

When the study is done observationally, the results depend on what covariates you include in the model. So you are looking at the consequences of your assumptions, rather than purely at the system under study.

But when you impose a treatment, you have reconfigured the system in a known way. All of the backdoor links that involve an incoming paths to the treatment have been blocked.

Let's do an experiment where every other candidate gets $10M to spend, while the others get nothing:

s = run.sim(campaign.spending, 5000, spending = c(0, 10))
coef(summary(lm(vote ~ spending, data = s)))

##             Estimate Std. Error t value  Pr(>|t|)
## (Intercept)   37.414    0.31824 117.564 0.000e+00
## spending       0.315    0.04501   6.999 2.924e-12

coef(summary(lm(vote ~ spending + polls, data = s)))

##             Estimate Std. Error t value  Pr(>|t|)
## (Intercept)   1.0291   0.216459   4.754 2.046e-06
## spending      0.2240   0.015388  14.554 4.973e-47
## polls         0.7323   0.003767 194.413 0.000e+00

The results don't depend on the covariates; within the margin of error, they are the same for the two models.

Notice that I used 5000 units. One reason for this is that I didn't make the variation in spending very large. Here's a bigger variation with only 1/10th as many experimental units:

s = run.sim(campaign.spending, 500, spending = c(0, 100))
coef(summary(lm(vote ~ spending, data = s)))

##             Estimate Std. Error t value   Pr(>|t|)
## (Intercept)  36.0453    0.99446   36.25 9.203e-142
## spending      0.2495    0.01406   17.74  6.307e-55

coef(summary(lm(vote ~ spending + polls, data = s)))

##             Estimate Std. Error t value   Pr(>|t|)
## (Intercept)   1.2178   0.687078   1.772  7.693e-02
## spending      0.2484   0.004975  49.932 8.418e-196
## polls         0.7301   0.012373  59.011 1.185e-226

In-Class Activity

An experiment with aspirin.

For reference: An observational study

obs = run.sim(aspirin, 500)

Construct a logistic equation model of stroke:

mod = lm(stroke == "Y" ~ mgPerDay, data = obs)
coef(summary(mod))

##               Estimate Std. Error t value  Pr(>|t|)
## (Intercept)  0.3060503  0.0353370  8.6609 6.554e-17
## mgPerDay    -0.0002174  0.0006233 -0.3488 7.274e-01

mod2 = lm(stroke == "Y" ~ mgPerDay + sick, data = obs)
coef(summary(mod2))

##              Estimate Std. Error t value  Pr(>|t|)
## (Intercept)  0.196663  0.0245781   8.002 8.728e-15
## mgPerDay    -0.003238  0.0004443  -7.287 1.252e-12
## sick         0.098743  0.0041377  23.864 1.925e-84

The Experimental Study

(Have the students do this.)

Create an experimental intervention:

exper = run.sim(aspirin, 500, mgPerDay = c(0, 100))

Now the analysis:

emod = lm(stroke == "Y" ~ mgPerDay, data = exper)
coef(summary(emod))

##             Estimate Std. Error t value  Pr(>|t|)
## (Intercept)  0.40400  0.0276437  14.615 1.653e-40
## mgPerDay    -0.00236  0.0003909  -6.037 3.076e-09

emod2 = lm(stroke == "Y" ~ mgPerDay + sick, data = exper)
coef(summary(emod2))

##              Estimate Std. Error t value  Pr(>|t|)
## (Intercept)  0.229622  0.0220955  10.392 5.040e-23
## mgPerDay    -0.002835  0.0002892  -9.803 7.384e-21
## sick         0.085480  0.0041774  20.462 5.890e-68

The two models give the same result. The second model, depending on measuring and using the covariate sick, doesn't change the result, but it does eat up deviance (the logistic equivalent of variance)

Creating Orthogonality (Still in Draft)

[Need to write some blocking software that can create an intervention that's orthogonal to a set of covariates]

It would be better to make spending orthogonal to polls. There's no good way to do that in this simulation, except by setting the popularity and polls from one simulation when creating the next simulation.

# Get the poll data.  This involve
prelim = run.sim(campaign.spending, 100)