Wednesday, September 30, 2015

Introduction

We want to ask questions about cause and effect.

  • Does drug XYZ reduce patients' risk of cancer?
  • Did the Fed's choice of interest rate grow the economy?

Introduction

But there's a problem: any phenomenon has many causes.

  • Are patients who take XYZ also healthier to begin with?
  • Do int'l capital markets undo Fed policy?

Introduction

To answer these questions we need to:

  • Collect appropriate evidence.
  • Deal with that evidence appropriately.

Introduction

You've already had a taste of the problems economists face in collecting evidence:

  • There is a lot of data out there related to many possible questions.
  • Some of this data is easier to use than the rest.

Now we need to consider interpreting data.

Example

Has Obamacare been a success?

This seemingly simple question is actually very difficult.

  • There are different aspects to that legislation, and
  • the definition of success isn't always obvious.

To make our task simpler, we need to start with simpler questions. For example:

Does having health insurance lead to better health outcomes?

Example

Does having health insurance lead to better health outcomes?

This makes our job simpler, but it doesn't make our job simple. It's tempting to try to answer this question by comparing the average health of insured and uninsured people.

But is that enough to answer our question?

Ceteris Paribus

  • Are poor people less healthy on average than rich people?
  • Are poor people less likely to have health insurance?

This raises an important question: Are insured people healthier because of the insurance, or because of some other factor?

What we're after

What we're really concerned with isn't whether wealthier people are also healthier.

We're interested in whether health insurance actually helps people be healthier (and if so, how much healthier?).

To answer this question, we need to consider the counterfactual: what would the outcome have been if the same person had made the other insurance choice?

Some theory…

With perfect knowledge we would know what actually happens and also what happens in the parallel universe where poor people are insured (and in the other parallel universe where rich people are uninsured).

Some theory…

To make our lives easier we're going to make up some symbols to describe the situation.

  • \(Y\) represents an outcome (in this case a self-reported health score from 1-5)
  • \(Y_i\) represents the outcome of person \(i\)
  • \(Y_{0i}\) represents person \(i\)'s outcome if she didn't buy insurance
  • \(Y_{1i}\) represents person \(i\)'s outcome if she bought insurance
  • \(D_i\) represents the treatment condition of person \(i\) (e.g. \(D_i = 0\) means "person \(i\) doesn't have insurance" and \(D_i = 1\) means "person \(i\) has insurance")

What we want to know

  • What someone's outcome would be under other circumstances (e.g. \(Y_{0i}\) for someone with insurance)

This lets us determine the treatment effect for person \(i\): \(Y_{1i} - Y_{0i}\).

What we can see

  • Whether someone is insured (\(D_i\))
  • What someone's actual health outcome is (either \(Y_{0i}\) or \(Y_{1i}\))

What we want to know, and what we can see in real life:

At the end of the day, we want to know if insurance will lead to better outcomes than an alternative we never actually see.

We don't want to compare my health with yours. We want to compare my health with that of a version of myself who doesn't have insurance.

What we want to know, and what we can see in real life:

In the real world we can only know \(Y_{0i}\) for uninsured people and \(Y_{1i}\) for insured people.

But we want to ask whether giving person \(i\) insurance would lead to:

  • a better outcome (\(Y_{1i} > Y_{0i}\)),
  • a worse outcome (\(Y_{1i} < Y_{0i}\)), or
  • no real difference (\(Y_{1i} = Y_{0i}\)).

If we knew everything:

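Table 1.2

|                                                  | Khuzdar Khalat | Maria Moreño |
|--------------------------------------------------|----------------|--------------|
| Potential outcome without insurance: \(Y_{0i}\)  | 3              | 5            |
| Potential outcome with insurance: \(Y_{1i}\)     | 4              | 5            |
| Treatment (insurance status chosen): \(D_i\)     | 1              | 0            |
| Actual health outcome: \(Y_i\)                   | 4              | 5            |
| Treatment effect: \(Y_{1i} - Y_{0i}\)            | 1              | 0            |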

If we knew everything:

Table 1.2 (if it reflects the truth) tells us that:

  • In a world where Khuzdar is insured, he is healthier (a score of 4) than the Khuzdar who didn't get insurance (a score of 3).
  • For Maria, getting insurance wouldn't affect her outcome (a score of 5 either way).
    • Maybe she sees the doctor more often but eats less healthy food to save enough money to pay her premiums.
    • Maybe she sees the doctor more often but gets an infection at the hospital.
    • Maybe she is healthier, but this 1-5 scale doesn't allow us to see that she's gone from "very healthy" to "even healthier than that".
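The bookkeeping behind Table 1.2 can be sketched in a few lines of Python. The numbers are the table's; the dictionary layout and function names are made up for illustration. The last line shows what a naive comparison of the two people's actual outcomes would give, which is not the same as either person's treatment effect.

```python
# Potential outcomes from Table 1.2 (the data layout is illustrative).
people = {
    "Khuzdar Khalat": {"y0": 3, "y1": 4, "d": 1},
    "Maria Moreño": {"y0": 5, "y1": 5, "d": 0},
}

def observed_outcome(p):
    # In real data we only see Y1 for the insured (d == 1)
    # and Y0 for the uninsured (d == 0).
    return p["y1"] if p["d"] == 1 else p["y0"]

def treatment_effect(p):
    # Needs both potential outcomes, so it is only computable
    # "if we knew everything".
    return p["y1"] - p["y0"]

for name, p in people.items():
    print(name, "| observed:", observed_outcome(p), "| effect:", treatment_effect(p))

# Naive comparison: insured person's actual health minus uninsured person's.
naive_difference = (
    observed_outcome(people["Khuzdar Khalat"])
    - observed_outcome(people["Maria Moreño"])
)
print("naive difference:", naive_difference)
```

Here the naive difference is negative even though insurance helps Khuzdar and leaves Maria unchanged, because Maria was healthier to begin with.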

So how do we get to there from here?

How do we make an apples to apples comparison? How do we hold all else equal?

Ideally, we would set up a randomized trial.

If we randomly select large groups of people, those groups should, on average, have similar characteristics.

Randomized Trials

  • Apply a "treatment" to one group (e.g. giving half of them some new drug)
  • Treat a separate group exactly (or at least roughly) the same, but don't give them the treatment (e.g. give them a placebo)
  • Compare the average outcomes
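The three steps above can be sketched as a small simulation. Everything here is hypothetical: the true effect, the number of participants, and the distribution of baseline outcomes are invented so that we can check whether comparing averages recovers the effect we built in.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

TRUE_EFFECT = 0.5  # hypothetical effect of the treatment on the outcome
N = 10_000         # hypothetical number of participants

# Each person's outcome without treatment (never observed for the treated).
baseline = [random.gauss(3.0, 1.0) for _ in range(N)]

# Random assignment: shuffle the participants and treat the first half.
order = list(range(N))
random.shuffle(order)
treated = set(order[: N // 2])

treatment_outcomes = [baseline[i] + TRUE_EFFECT for i in range(N) if i in treated]
control_outcomes = [baseline[i] for i in range(N) if i not in treated]

# Compare the average outcomes of the two groups.
estimate = mean(treatment_outcomes) - mean(control_outcomes)
print(f"estimated effect: {estimate:.3f} (true effect: {TRUE_EFFECT})")
```

The estimate will not equal the true effect exactly, but with groups this large it should land close to it.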

Careful…

Randomized trials, when done correctly, allow us to make inferences about the relationships between variables. Comparisons are valid when we're effectively holding all else constant.

But doing so perfectly is impossible. Our groups' average features (e.g. initial health, income, wealth, knowledge, etc.) will always be slightly different.

It's a Balancing Act

We can check our work by comparing the characteristics of our groups. Is the average age of one group different from that of the other? If our sample isn't already balanced we can rebalance it by ignoring some of the data.

If our treatment group (those we're providing insurance for) is older than the control group, we might randomly throw out some of the observations of older patients from the treatment group.
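A rough sketch of that rebalancing idea, using made-up ages. The age ranges, the cutoff for "older", and the tolerance are all assumptions chosen for illustration, not values from any real study.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical ages: the treatment group skews older than the control group.
treatment_ages = [random.randint(30, 70) for _ in range(500)]
control_ages = [random.randint(25, 65) for _ in range(500)]

def age_gap(treatment, control):
    # Balance check: difference in average age between the groups.
    return mean(treatment) - mean(control)

print(f"gap before trimming: {age_gap(treatment_ages, control_ages):.2f} years")

TOLERANCE = 0.5  # hypothetical: largest gap (in years) we will accept
CUTOFF = 55      # hypothetical: who counts as an "older" patient

# Randomly drop older treatment patients until the gap is small enough.
trimmed = list(treatment_ages)
while age_gap(trimmed, control_ages) > TOLERANCE:
    older = [i for i, age in enumerate(trimmed) if age > CUTOFF]
    if not older:
        break  # no older patients left to drop
    trimmed.pop(random.choice(older))

print(f"gap after trimming:  {age_gap(trimmed, control_ages):.2f} years")
print(f"observations dropped: {len(treatment_ages) - len(trimmed)}")
```

The last line is the cost of rebalancing: every dropped observation makes the treatment group smaller.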

It's a Balancing Act

However, we don't want to throw out too much data because the purpose of randomized trials is to take advantage of the Law of Large Numbers. If we throw out too much data we won't have "large numbers" of observations to deal with.

Law of Large Numbers

Consider a simple trial:

  • Flip a (fair) coin. There's a 50% chance you'll flip heads.
  • Now flip that coin twice. There's a 25% chance you'll flip heads both times.
  • Now flip the coin 100 times. It's possible, but highly unlikely, that you'll flip heads every time.
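The probabilities in this example follow from multiplying 1/2 by itself once per flip. A short check, with a hypothetical helper name:

```python
from fractions import Fraction

def prob_all_heads(n_flips):
    # Each flip of a fair coin is heads with probability 1/2,
    # and the flips are independent, so the probabilities multiply.
    return Fraction(1, 2) ** n_flips

for n in (1, 2, 100):
    p = prob_all_heads(n)
    print(f"{n:>3} flips: probability of all heads = {float(p):.3g}")
```

For 100 flips the probability is 1 in 2^100, which is on the order of 10^-30.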

Law of Large Numbers

Even if our groups look similar on average, there are unobserved (and unobservable) differences between everyone.

It's possible that participants in the treatment group are more likely to have some particular gene that affects the effectiveness of the treatment.

But if our groups are large and randomly selected, it's unlikely that they will differ much in this way.
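To see why group size matters, here is a sketch that randomly splits people into two groups and measures how different the groups are in the share carrying a gene we cannot observe. The 10% carrier rate, the group sizes, and the number of repetitions are all made-up assumptions.

```python
import random

random.seed(2)  # fixed seed so the sketch is reproducible

GENE_RATE = 0.10  # hypothetical share of people carrying the gene

def carrier_share_gap(n_people):
    # Draw a population, assign half to treatment at random, and return
    # the absolute difference in carrier share between the two groups.
    carriers = [random.random() < GENE_RATE for _ in range(n_people)]
    random.shuffle(carriers)
    half = n_people // 2
    treatment, control = carriers[:half], carriers[half:]
    return abs(sum(treatment) / len(treatment) - sum(control) / len(control))

def average_gap(n_people, repetitions):
    # Repeat the experiment to see the typical gap, not one lucky draw.
    return sum(carrier_share_gap(n_people) for _ in range(repetitions)) / repetitions

small_gap = average_gap(20, repetitions=200)
large_gap = average_gap(20_000, repetitions=20)
print(f"typical gap with 20 people:     {small_gap:.3f}")
print(f"typical gap with 20,000 people: {large_gap:.3f}")
```

We never used the gene to assign anyone, yet the large groups end up with nearly the same share of carriers.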

How careful do we need to be?

So how similar is similar enough? Does it matter if the treatment group's average income is $200 greater than the control group's? This raises the issue of significance.

There are two parts of this issue:

  • statistical significance and
  • practical significance (in this class we'll usually refer to the latter as "economic significance").

Statistical significance

Differences in what we observe may occur by random chance.

Flip a coin 10 times and get 6 heads. Flip another coin 10 times and get 4 heads. Is either coin unfair? Probably not. Flip both coins 1,000,000 times and you should get much closer to 50% heads.
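A quick simulation of this thought experiment, using Python's pseudo-random number generator as a stand-in for a fair coin:

```python
import random

random.seed(3)  # fixed seed so the sketch is reproducible

def share_of_heads(n_flips):
    # Simulate n_flips of a fair coin and return the fraction that were heads.
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

small = share_of_heads(10)
large = share_of_heads(1_000_000)
print(f"10 flips:        {small:.1%} heads")
print(f"1,000,000 flips: {large:.2%} heads")
```

With only 10 flips, results like 40% or 60% heads are common for a fair coin. With a million flips the share should sit very close to 50%.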

Statistical significance

Some test subjects will get insurance and also cancer. Some subjects will get no insurance but also get healthier. These unpredictable outcomes mean groups' average outcomes are variable.

A test of statistical significance starts by supposing that the treatment has no effect, so that the average outcomes differ only because of random chance. It then asks how likely it is that we would observe a difference at least as large as the one we actually saw.
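One way to make this logic concrete is a permutation test, sketched below. This is only one of several possible tests, and the health scores are invented for illustration. If the treatment had no effect, the group labels would be arbitrary, so we reshuffle them many times and count how often chance alone produces a gap at least as large as the one observed.

```python
import random

random.seed(4)  # fixed seed so the sketch is reproducible

# Hypothetical self-reported health scores (1-5) for two small groups.
treatment = [4, 5, 3, 4, 5, 4, 3, 5, 4, 4]
control = [3, 4, 3, 2, 4, 3, 3, 4, 2, 3]

def mean_difference(a, b):
    return sum(a) / len(a) - sum(b) / len(b)

def permutation_p_value(a, b, n_shuffles=10_000):
    # Share of random relabelings whose gap is at least as large
    # (in absolute value) as the gap we actually observed.
    observed = abs(mean_difference(a, b))
    pooled = a + b  # new list, so the inputs are not modified
    hits = 0
    for _ in range(n_shuffles):
        random.shuffle(pooled)
        shuffled_gap = abs(mean_difference(pooled[: len(a)], pooled[len(a):]))
        if shuffled_gap >= observed - 1e-12:  # small slack for float rounding
            hits += 1
    return hits / n_shuffles

p_value = permutation_p_value(treatment, control)
print(f"observed difference: {mean_difference(treatment, control):.2f}")
print(f"share of shuffles with a gap at least that large: {p_value:.4f}")
```

A small share means a gap this large would rarely arise by chance alone.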

Practical significance

Maybe there is a statistically significant effect, but does it actually matter? Suppose insurance raises health outcomes by 0.01%. With a large enough sample that effect could be statistically significant, yet it would not be very important in practical terms.
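A back-of-the-envelope sketch of how a tiny effect becomes statistically significant. It uses the standard z statistic for a difference in means between two equally sized groups with the same spread. The effect size, standard deviation, and sample sizes are made-up assumptions.

```python
import math

def z_statistic(effect, sd, n_per_group):
    # Difference in group means divided by its standard error,
    # assuming equal group sizes and a common standard deviation.
    standard_error = sd * math.sqrt(2 / n_per_group)
    return effect / standard_error

TINY_EFFECT = 0.0003  # hypothetical: 0.01% of a health score of 3
SD = 1.0              # hypothetical spread of health scores

for n in (1_000, 1_000_000, 1_000_000_000):
    z = z_statistic(TINY_EFFECT, SD, n)
    print(f"{n:>13,} per group: z = {z:.3f}")
```

A common rule of thumb treats |z| above about 1.96 as statistically significant. Under these assumptions the same tiny effect clears that bar only with about a billion people per group, and it is no more important in practical terms when it does.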