We want to ask questions about cause and effect.
- Does drug XYZ reduce patients' risk of cancer?
- Did the Fed's choice of interest rate grow the economy?
Wednesday, September 30, 2015
But there's a problem: any phenomenon has many causes.
To answer these questions we need to:
You've already had a taste of the problems economists face in collecting evidence:
Now we need to consider interpreting data.
Has Obamacare been a success?
This seemingly simple question is actually very difficult.
To make our task simpler, we need to start with simpler questions. For example:
Does having health insurance lead to better health outcomes?
This makes our job simpler, but it doesn't make our job simple. It's tempting to try to answer this question by comparing the average health of insured and uninsured people.
But is that enough to answer our question?
This raises an important question: Are insured people healthier because of the insurance, or because of some other factor?
What we're really concerned with isn't whether wealthier people are also healthier.
We're interested in whether health insurance actually helps people be healthier (and if so, how much healthier?).
To answer this question, we need to consider the counterfactual. What would the outcome have been if the same person had made the other choice?
With perfect knowledge we would know both what actually happens and what happens in the parallel universe where poor people are insured (and the other parallel universe where rich people are uninsured).
To make our lives easier we're going to make up some symbols to describe the situation.
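This lets us determine the treatment effect for person \(i\): \(Y_{1i} - Y_{0i}\), the difference between their health outcome with insurance (\(Y_{1i}\)) and without it (\(Y_{0i}\)).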
At the end of the day, we want to know if insurance will lead to better outcomes than an alternative we never actually see.
We don't want to compare my health with yours; we want to compare my health with that of a version of myself who doesn't have insurance.
In the real world we can only know \(Y_{0i}\) for uninsured people and \(Y_{1i}\) for insured people.
But we want to ask if giving person \(i\) insurance leads to:
| Table 1.2 | Khuzdar Khalat | Maria Moreño |
|---|---|---|
| Potential outcome without insurance: \(Y_{0i}\) | 3 | 5 |
| Potential outcome with insurance: \(Y_{1i}\) | 4 | 5 |
| Treatment (insurance status chosen): \(D_i\) | 1 | 0 |
| Actual health outcome: \(Y_i\) | 4 | 5 |
| Treatment effect: \(Y_{1i} - Y_{0i}\) | 1 | 0 |
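The numbers in Table 1.2 can be worked through in a few lines of Python. This is only a sketch: the dictionary layout and function names are made up for illustration, and the code assumes (as the table does) that we somehow know both potential outcomes for each person. It shows that comparing the two observed outcomes gives the wrong sign, because Khuzdar and Maria differ in health even without insurance.

```python
# Potential outcomes copied from Table 1.2.
# y0 = health without insurance, y1 = health with insurance,
# d = 1 if the person chose insurance, 0 otherwise.
people = {
    "Khuzdar": {"y0": 3, "y1": 4, "d": 1},
    "Maria": {"y0": 5, "y1": 5, "d": 0},
}


def observed_outcome(person):
    """Y_i: we see Y_1i if the person chose insurance, otherwise Y_0i."""
    return person["y1"] if person["d"] == 1 else person["y0"]


def treatment_effect(person):
    """Y_1i - Y_0i: knowable only with both potential outcomes in hand."""
    return person["y1"] - person["y0"]


khuzdar, maria = people["Khuzdar"], people["Maria"]

# What the data show: insured Khuzdar versus uninsured Maria.
naive_difference = observed_outcome(khuzdar) - observed_outcome(maria)

# Why that comparison misleads: the two differ even without insurance.
selection_bias = khuzdar["y0"] - maria["y0"]

print(naive_difference)           # -1
print(treatment_effect(khuzdar))  # 1
print(selection_bias)             # -2
```

The naive comparison (-1) equals Khuzdar's true treatment effect (1) plus the gap in their no-insurance outcomes (-2), which is why it suggests insurance hurts when it actually helps Khuzdar.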
Table 1.2 (if it reflects the truth) tells us that:
How do we make an apples-to-apples comparison? How do we hold all else equal?
Ideally, we would set up a randomized trial.
If we randomly select large groups of people, those groups should, on average, have similar characteristics.
Randomized trials, when done correctly, allow us to make inferences about the relationships between variables. Comparisons are valid when we're effectively holding all else constant.
But doing so perfectly is impossible. Our groups' average features (e.g. initial health, income, wealth, knowledge) will always be slightly different.
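A small simulation can illustrate both points: random assignment makes groups similar on average, but never identical. The sketch below is an assumption-laden toy, not real data: incomes are drawn from a normal distribution with a made-up mean of 50,000 and standard deviation of 15,000, and `income_gap` is a hypothetical helper name.

```python
import random


def income_gap(n_people, seed=0):
    """Randomly split n_people into two groups; return the gap in mean income."""
    rng = random.Random(seed)
    # Hypothetical incomes: mean 50,000, standard deviation 15,000.
    incomes = [rng.gauss(50_000, 15_000) for _ in range(n_people)]
    rng.shuffle(incomes)  # random assignment to treatment or control
    half = n_people // 2
    treatment, control = incomes[:half], incomes[half:]
    return sum(treatment) / len(treatment) - sum(control) / len(control)


# The gap between group averages tends to shrink as the groups grow,
# but it is essentially never exactly zero.
for n in (20, 2_000, 200_000):
    print(n, round(income_gap(n), 2))
```

The exact numbers depend on the seed, so none are shown here. What matters is the pattern: small groups can differ by thousands, while very large groups typically differ by a tiny fraction of that.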
We can check our work by comparing the characteristics of our groups. Is the average age of one group different from that of the other? If our sample isn't already balanced we can rebalance it by ignoring some of the data.
If our treatment group (those we're providing insurance for) is older than the control group, we might randomly throw out some of the observations of older patients from the treatment group.
However, we don't want to throw out too much data because the purpose of randomized trials is to take advantage of the Law of Large Numbers. If we throw out too much data we won't have "large numbers" of observations to deal with.
Consider a simple trial:
Even if our groups look similar on average, there are unobserved (and unobservable) differences between everyone.
It's possible that participants in the treatment group are more likely to have some particular gene that affects the effectiveness of the treatment.
But if our groups are large and randomly selected, it's unlikely that such a gene will be much more common in one group than in the other.
So how similar is similar enough? Does it matter if the treatment group's average income is $200 greater than the control group's? This raises the issue of significance.
There are two parts of this issue:
Differences in what we observe may occur by random chance.
Flip a coin 10 times and get 6 heads. Flip another coin 10 times and get 4 heads. Is either coin unfair? Probably not. Flip both coins 1,000,000 times and you should get much closer to 50% heads.
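The coin example is easy to try out. The sketch below simulates two fair coins using Python's standard `random` module; the function name `share_of_heads` and the seeds are arbitrary choices for illustration.

```python
import random


def share_of_heads(n_flips, seed):
    """Flip a simulated fair coin n_flips times; return the fraction of heads."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips


# Two different seeds stand in for two different coins.
for n in (10, 1_000, 1_000_000):
    print(n, share_of_heads(n, seed=1), share_of_heads(n, seed=2))
```

With 10 flips the two shares can easily differ by 0.2 or more even though both coins are fair. With 1,000,000 flips both should sit very close to 0.5.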
Some test subjects will get insurance and also cancer. Some subjects will get no insurance but also get healthier. These unpredictable outcomes mean groups' average outcomes are variable.
A test of statistical significance starts by recognizing that the treatment might have no effect, and the average outcomes might be different just because of random chance. It then asks: if the treatment truly had no effect, how likely is it that we would observe a difference at least as large as the one we actually saw?
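One way to make this logic concrete is a permutation test, sketched below. It is one of several possible significance tests, chosen here because it follows the reasoning above directly; it is not necessarily the test used elsewhere in the course. If the treatment has no effect, the group labels are arbitrary, so we can shuffle them many times and count how often chance alone produces a gap as large as the observed one. The health scores are invented for illustration.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(treated, control, n_shuffles=5_000, seed=0):
    """Share of random relabelings with a gap at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # under "no effect", labels are interchangeable
        gap = abs(mean(pooled[:n_treated]) - mean(pooled[n_treated:]))
        if gap >= observed:
            at_least_as_large += 1
    # Add one to numerator and denominator so the estimate is never exactly 0.
    return (at_least_as_large + 1) / (n_shuffles + 1)


# Hypothetical health scores, made up for illustration.
insured = [4, 5, 5, 6, 4, 5, 6, 5]
uninsured = [3, 5, 4, 4, 5, 3, 4, 4]
print(permutation_p_value(insured, uninsured))
```

A small result means a gap this large would rarely arise by chance if insurance did nothing. A large result means chance alone is a plausible explanation.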
Maybe there is a statistically significant effect, but does it actually matter? If insurance raises health outcomes by 0.01%, that effect could be statistically significant in a large enough sample and still not be very important in practical terms.
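A rough calculation shows how a tiny effect can still clear the bar for statistical significance. The sketch below assumes two equal-sized groups, a made-up effect of 0.01 units, and a made-up standard deviation of 1 unit. It computes the difference in means divided by its standard error, where values above roughly 1.96 are conventionally called significant at the 5% level.

```python
import math


def z_statistic(effect, sd, n_per_group):
    """Difference in means divided by its standard error (equal-sized groups)."""
    standard_error = sd * math.sqrt(2 / n_per_group)
    return effect / standard_error


# Same tiny effect, two very different sample sizes.
for n in (1_000, 1_000_000):
    print(n, round(z_statistic(effect=0.01, sd=1.0, n_per_group=n), 2))
# 1000 0.22
# 1000000 7.07
```

The effect is identical in both rows. Only the sample size changed, which is why statistical significance on its own says little about whether an effect matters.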