Causal Inference, Hypothesis Testing, Z-scores
POLS 3316: Statistics for Political Scientists

Tom Hanna

2023-10-28

Where we were

Standard Errors - distance between sample and population data
Z-scores - probability that sample represents the true population data
```
          + Z- Score tables
```

Today

Look at Z-tables and formula
Talk about statistics and causal inference

Statistics: Cause and effect

What is a cause?
Causes are complicated!
The Fundamental Problem
Hypothesis Testing
So what does a hypothesis test tell us?
Bayes Rule Again

What is a cause?

the thing that Y would not happen without

What is a cause?

the thing that Y would not happen without
If X does not exist, Y will not happen

What is a cause?

the thing that Y would not happen without
If X does not exist, Y will not happen

Except…

Causes are complicated!

Multiple causes

Causes are complicated!

Multiple causes
Causes may be necessary but not sufficient (conditional)

Causes are complicated!

Multiple causes
Causes may be necessary but not sufficient (conditional)
Causes may be sufficient but not necessary (more than one possible sufficient cause)

Causes are complicated!

Multiple causes
Causes may be necessary but not sufficient (conditional)
Causes may be sufficient but not necessary (more than one possible cause)
Some causes may be neither

Causes are complicated!

Multiple causes
Causes may be necessary but not sufficient (conditional)
Causes may be sufficient but not necessary (more than one possible cause)
Some causes may be neither
The randomness factor (stochastic factor)

Causes are complicated!

Multiple causes
Causes may be necessary but not sufficient (conditional)
Causes may be sufficient but not necessary (more than one possible cause)
Some causes may be neither
The randomness factor (stochastic factor) - - Intermediate steps (mediating or moderating variables)

Causes are complicated!

Multiple causes
Causes may be necessary but not sufficient (conditional)
Causes may be sufficient but not necessary (more than one possible cause)
Some causes may be neither
The randomness factor (stochastic factor) - - Intermediate steps (mediating or moderating variables)
Reverse causation (endogeneity)

These are easy compared to the big issue…

The Fundamental Problem of Causal Inferance

The Fundamental Problem

We can’t observe “what if”?

The Fundamental Problem

We can’t observe the “what if”?
Technical term: counterfactual

The Fundamental Problem

We can’t observe the “what if”?
Technical term: counterfactual
We don’t know if Y would have happened without X because X did happen

The Fundamental Problem

We can’t observe the “what if”?
Technical term: counterfactual
We don’t know if Y would have happened without X because X happened
Huge problem for observational studies

The Fundamental Problem

We can’t observe the “what if”?
Technical term: counterfactual
We don’t know if Y would have happened without X because X happened
Huge problem for observational studies
Experimental design manipulates the data generation: partial solution

The Fundamental Problem

We can’t observe the “what if”?
Technical term: counterfactual
We don’t know if Y would have happened without X because X happened
Huge problem for observational studies
Experimental design manipulates the data generation: partial solution
Observational studies rely on our treatment of the data: partial solution

Approach hypothesis testing

Our approach to hypothesis testing is part of the solution to the fundamental problem
Our interpretation of hypothesis testing is driven by the fundamental problem

Hypothesis Testing

Null hypothesis: effect on Y is due to random chance

Hypothesis Testing

Null hypothesis: effect on Y is only due to random chance
Null hypothesis: as if X didn’t exist

Hypothesis Testing

Null hypothesis: effect on Y is only due to random chance
Null hypothesis: as if X didn’t exist
design the model so that: null hypothesis ~ the counterfactual

Hypothesis Testing

Null hypothesis: effect on Y is only due to random chance
Null hypothesis: as if X didn’t exist
design the model so that: null hypothesis ~ the counterfactual

That’s aspirational

Standard Errors, Z-Scores, and Z-Tables

Standard error: Standard deviation of the sampling distribution of the mean

SE = \(\frac{\sigma}{\sqrt{n}}\)

Z-score: number of standard errors from the mean

\(Z = \frac{\bar{x}-\mu}{SE}\) or Z = \(\frac{\bar{x} - \my}{\sigma_{\bar{x}}}\)

Z-Tables

Z-table

Example:

Mean height of UH students is 5’10”
Standard deviation of height is 3”
Sample of 100 students
Mean height of sample is 5’9”

Is the sample mean height shorter than the population mean height?

Is this a one-tailed or two-tailed test?

A one-tailed test is directional
For example, is the sample mean greater than the population mean?
A two-tailed test is non-directional
For example, is the sample mean different from the population mean?

What probability are we looking for?

Our required confidence level is 95%
Our required significance level is 5%
We are looking for a probabiliyt of 5% or less
Also called a p-value of 0.05 or less
Also called an \(\alpha\) (alpha) of 0.05 or less

p < .05

What is our Standard Error?

SE = \(\frac{\sigma}{\sqrt{n}}\)

What is our Standard Error?

SE = \(\frac{\sigma}{\sqrt{n}}\)
\(\frac{3}{\sqrt{100}} = 0.3\)

What is our Z-score?

\(Z = \frac{\bar{x}-\mu}{SE}\) or Z = \(\frac{\bar{x} - \my}{\sigma_{\bar{x}}}\)

What is our Z-score?

\(Z = \frac{\bar{x}-\mu}{SE}\) or Z = \(\frac{\bar{x} - \my}{\sigma_{\bar{x}}}\)
\(Z = \frac{5'9"-5'10"}{0.3} = -3.33\)

Z-Table

So what does a hypothesis test tell us?

Critical z-Values for a 95% confidence interval:

Z < 1.96 (or Z > -1.96) for a two-tailed
Z < 1.65 (or Z > 1.65) for a one-tailed test

So what does a hypothesis test tell us?

Z < 1.96: “the null hypothesis is retained”

So what does a hypothesis test tell us?

Z < 1.96: “the null hypothesis is retained”
```
  - The Theory is Wrong
```

So what does a hypothesis test tell us?

Z < 1.96: “the null hypothesis is retained”
```
  - The Theory is Wrong
  - As written
```

So what does a hypothesis test tell us?

Z < 1.96: “the null hypothesis is retained”

  - The Theory is Wrong
  - As written
  - In some way

So what does a hypothesis test tell us?

Possible: “the null hypothesis is retained”

  - The Theory is Wrong
  - As written
  - In some way

Z > 1.96:

So what does a hypothesis test tell us?

Possible: “the null hypothesis is retained”

  - The Theory is Wrong
  - As written
  - In some way

Z > 1.96: “the null hypothesis is rejected”

So what does a hypothesis test tell us?

Possible: “the null hypothesis is retained”

  - The Theory is Wrong
  - As written
  - In some way

Z > 1.96: “the null hypothesis is rejected”
```
  - The Theory is Right??
```

So what does a hypothesis test tell us?

Possible: “the null hypothesis is retained”

  - The Theory is Wrong
  - As written
  - In some way

Z > 1.96: “the null hypothesis is rejected”
```
  - The Theory is Right??
```

NO!!!!!!

So what does a hypothesis test tell us?

Z > 1.96: “the null hypothesis is rejected”

The evidence supports the hypothesis.

So what does a hypothesis test tell us?

Z > 1.96: “the null hypothesis is rejected”

The evidence supports the hypothesis.

The evidence is consistent with the theory.

So what does a hypothesis test tell us?

Z > 1.96: “the null hypothesis is rejected”

The evidence supports the hypothesis.

The evidence is consistent with the theory.

The null hypothesis is rejected and the evidence is consistent with the hypothesized effect.

So what does a hypothesis test tell us?

Z > 1.96: “the null hypothesis is rejected”

The evidence supports the hypothesis.

The evidence is consistent with the theory.

The null hypothesis is rejected and the evidence is consistent with the hypothesized effect.

What about certainty and proof?

Back to Bayes Rule

Bayes Rule

What does this tell us?

We need to be precise about what we mean by a cause

We need to understand what statistics can tell us about causation and what it can’t

  - Correlation does not *prove* causation 
  - but correlation can *help establish* causation

  - We need to understand the limits of data and statistics
  - We also need to understand the capabilities of data and statistics

Everything after here is draft notes for your reading. Beware of typos, etc.

Some of the things in these notes are from courses I took, some are from assorted books, some are from these two sources which are at least somewhat readable and free:

https://egap.org/resource/10-things-to-know-about-hypothesis-testing/

https://egap.org/resource/10-things-to-know-about-causal-inference/

1 - Correlation \(\notequal\) causation.

Correlation does imply a relationship
Relationship may involve some cause and effect somewhere
The relationship could go either direction
The relationship could involve other variables
Lack of correlation doesn’t necessarily mean anything - correlation is linear and causal effects are not always linear

2 - A cause is a claim about something that did not happen

If we say X caused Y, we mean: If X did not happen, Y would not happen, everything else being held the same.

If we say X caused Y, we mean: If X didn’t happen, Y would not happen, everything else being held the same.

The Fundamental Problem of Causal Inference

Our proposed cause, which did happen, is the factual
The thing that didn’t happen is called the counterfactual
We can’t actually observe the thing that didn’t happen
The inability to observe the counterfactual is the fundamental problem of causal inference
Experiments are a potential way around this

4. Causes have to involve a possible manipulation of circumstances so that the counterfactual occurred

5. Statistics looks for average causal effects

Statistics are about average causal effects, not single data points or individual effects. The average effects may conflict with anecdotal evidence. This is partially because…

6. There can be multiple causes.

The technical phrase here is: Causes are non-rival.

7. Causes can be…

necessary
sufficient
neither -or both

and still be causes

8. Measuring effects is easy

It’s a lot easier to measure effects than to find causes.

What can statistics do for us?

The Null Hypothesis and counterfactuals

          + We can measure the probability an effect is due to random chance (the null hypothesis)
          + Formal hypothesis tests give us this value, the *p-value*
          + Theory provides an *alternative hypothesis* which we believe to be true based on the theory
          + Well designed hypotheses can help with the unobserved counterfactual
          + When we reject the null, we can determine that "the evidence is consistent with the alternative hypothesis" and the theory

Authorship, License, Credits

Author: Tom Hanna
Website: tomhanna.me
License: This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.</>

Z-Table image from: https://byjus.com/maths/z-score-table/

Full Z-Table from unknown course I took sometime in the last 8 years

Other images referenced in previous lectures

Causal Inference, Hypothesis Testing, Z-scores POLS 3316: Statistics for Political Scientists

Where we were

Today

Statistics: Cause and effect

What is a cause?

What is a cause?

What is a cause?

What is a cause?

Causes are complicated!

Causes are complicated!

Causes are complicated!

Causes are complicated!

Causes are complicated!

Causes are complicated!

Causes are complicated!

Causes are complicated!

The Fundamental Problem of Causal Inferance

The Fundamental Problem

The Fundamental Problem

The Fundamental Problem

The Fundamental Problem

The Fundamental Problem

The Fundamental Problem

Approach hypothesis testing

Hypothesis Testing

Hypothesis Testing

Hypothesis Testing

Hypothesis Testing

Standard Errors, Z-Scores, and Z-Tables

Z-Tables

Example:

Is this a one-tailed or two-tailed test?

What probability are we looking for?

What is our Standard Error?

What is our Standard Error?

What is our Z-score?

What is our Z-score?

Z-Table

So what does a hypothesis test tell us?

So what does a hypothesis test tell us?

So what does a hypothesis test tell us?

So what does a hypothesis test tell us?

So what does a hypothesis test tell us?

So what does a hypothesis test tell us?

So what does a hypothesis test tell us?

So what does a hypothesis test tell us?

So what does a hypothesis test tell us?

So what does a hypothesis test tell us?

So what does a hypothesis test tell us?

So what does a hypothesis test tell us?

So what does a hypothesis test tell us?

Back to Bayes Rule

What does this tell us?

Everything after here is draft notes for your reading. Beware of typos, etc.

1 - Correlation \(\notequal\) causation.

2 - A cause is a claim about something that did not happen

4. Causes have to involve a possible manipulation of circumstances so that the counterfactual occurred

5. Statistics looks for average causal effects

6. There can be multiple causes.

7. Causes can be…

8. Measuring effects is easy

What can statistics do for us?

Authorship, License, Credits

Causal Inference, Hypothesis Testing, Z-scores
POLS 3316: Statistics for Political Scientists