August 31, 2016

What is Econometrics?

  • Use of statistical methods to analyze economic data
  • Dealing with nonexperimental data
  • Trying to make inferences from incomplete data
    • i.e. trying to estimate population-wide relationships based on sample data

Uses?

  • Estimate relationships between variables
  • Test theories and simple hypotheses
  • Forecast variables ("crystal balls")
  • Evaluate government policy

Steps

  1. Economic model (this can get complicated for difficult questions)
  2. Econometric model

Examples: estimating demand curve, evaluating effects of minimum wage increase

A Quick Note on Forecasting:

Example: Economics of Crime

  • Gary Becker, 1968
  • Assume criminals are utility maximizers
  • We could model criminal behavior as a function of the opportunity costs of crime and its benefits.

Example: Economics of Crime

\[y = f(w_1,w_2,p_1,p_2,s,a)\]

| Symbol | Variable |
| --- | --- |
| \(y\) | amount of criminal activity ("criminess") |
| \(w_1, w_2\) | wages from crime (1) and legal wages (2) |
| \(p_1, p_2\) | probabilities of getting caught (1) and convicted (2) |
| \(s\) | likely sentence |
| \(a\) | age |

Example: Economics of Crime

  • What functional form do we use?
  • It depends…
    • Simple first step: throw together a bunch of variables and estimate a linear model.
    • Consider possible interactions (e.g. does schooling affect men differently than women?)
    • Some variables might need to be modified (e.g. \(p_1 \times p_2 \times s = E(s)\))
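As a minimal sketch of the last two bullets, the Python below builds one row of regressors for a linear crime model. The function names, the choice of an age-by-legal-wage interaction, and all numbers are hypothetical illustrations, not part of Becker's model.

```python
def expected_sentence(p_caught, p_convicted, sentence):
    """E(s) = p1 * p2 * s: the sentence weighted by the chance of serving it."""
    return p_caught * p_convicted * sentence


def design_row(w_crime, w_legal, p_caught, p_convicted, sentence, age):
    """Return one row of regressors, intercept first.

    The age * w_legal interaction is a hypothetical choice for illustration.
    """
    return [
        1.0,                                                 # intercept
        w_crime,                                             # w1
        w_legal,                                             # w2
        expected_sentence(p_caught, p_convicted, sentence),  # E(s) replaces p1, p2, s
        age,                                                 # a
        age * w_legal,                                       # interaction term
    ]


# Made-up values: 50% chance of arrest, 50% chance of conviction, 8-year sentence.
row = design_row(w_crime=20.0, w_legal=15.0, p_caught=0.5,
                 p_convicted=0.5, sentence=8.0, age=30.0)
print(row)
```

Collapsing \(p_1\), \(p_2\) and \(s\) into one regressor imposes the assumption that only the expected sentence matters, which is itself a modeling choice to justify.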

Example: Model of Job Training and Worker Productivity

  • Does extra training increase productivity?
  • What factors are important to hold constant?
    • Education
    • Work Experience
    • Training
    • Others?

Example: Model of Job Training and Worker Productivity

  • How do we measure these?
    • Years of schooling
      • Does quality matter?
      • Subject?
      • Earnestness?
    • Years since leaving school
    • Weeks of training
  • The data we have is always imperfect
    • Schooling \(\neq\) Education

Example: Economics of Crime

  • Econometric model:

\[ \text{crime} = \beta_0 + \beta_1 \text{wage} + \dots + u \]

  • This is a linear model

Linear models

  • We're looking at how the \(y\) variable changes as the \(x\) variables change

\[ y = \beta_0 + \beta_1 x_1 + \dots + u \]

  • With multiple \(x\) variables, we're seeing how \(y\) changes as \(x_i\) changes, holding \(x_j\) constant.
  • A lot of interesting things are happening in the \(u\) term
    • Factors that aren't included in the model affect this "error term"
    • We hope that the effects of these other factors balance out
    • We also hope their effects aren't too big
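The role of \(u\) can be sketched with simulated data. The Python below is a rough illustration, assuming one \(x\) variable, made-up coefficients (2.0 and 0.5), and an error term that averages to zero and is unrelated to \(x\); the function name `ols_simple` is hypothetical.

```python
import random


def ols_simple(x, y):
    """Ordinary least squares for y = b0 + b1*x + u with a single regressor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


rng = random.Random(0)          # fixed seed so the run is repeatable
beta0, beta1 = 2.0, 0.5         # made-up "true" coefficients

x = [rng.uniform(0, 10) for _ in range(500)]
u = [rng.gauss(0, 1) for _ in range(500)]   # everything left out of the model
y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]

b0_hat, b1_hat = ols_simple(x, y)
print(f"estimated intercept {b0_hat:.3f}, estimated slope {b1_hat:.3f}")
```

Because the simulated errors balance out and are independent of \(x\), the estimates land near the coefficients used to generate the data. Real data offer no such guarantee.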

The error term (made up data)

When the effects of all the things not in our model are small, it's relatively easy to see the relationship between our variables.

The error term (made up data)

When the data is noisier (i.e. there's a lot of variation in \(y\) that isn't associated with \(x\)), the confidence interval (dark grey) expands and we're less certain that the relationship holds up.
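The same point can be sketched numerically. The Python below uses made-up data and reports an approximate 95% interval for the slope, which is related to (but not the same as) the band drawn around the fitted line in the figure. The function name `slope_and_se` and all parameter values are assumptions for illustration.

```python
import math
import random


def slope_and_se(x, y):
    """OLS slope for y = b0 + b1*x + u and its conventional standard error."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    b0 = mean_y - b1 * mean_x
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(ssr / (n - 2) / sxx)   # residual variance divided by Sxx
    return b1, se


rng = random.Random(1)
x = [rng.uniform(0, 10) for _ in range(100)]
quiet = [1.0 + 0.5 * xi + rng.gauss(0, 0.5) for xi in x]   # small error term
noisy = [1.0 + 0.5 * xi + rng.gauss(0, 5.0) for xi in x]   # large error term

b1_quiet, se_quiet = slope_and_se(x, quiet)
b1_noisy, se_noisy = slope_and_se(x, noisy)

for label, b1, se in [("quiet", b1_quiet, se_quiet), ("noisy", b1_noisy, se_noisy)]:
    # Approximate 95% interval using the normal critical value 1.96.
    print(f"{label}: slope={b1:.3f}, CI=({b1 - 1.96 * se:.3f}, {b1 + 1.96 * se:.3f})")
```

Both samples share the same \(x\) values and the same underlying slope; only the size of the error term differs, and that alone widens the interval.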

Real example: stopping distance as a function of speed

Types of data

  • Cross-sectional
  • Time series
  • Panel/Longitudinal
    • Pooled cross-sections
  • Different types of data call for different approaches

Cross-sectional data

  • Looking at different individuals at one point in time
    • e.g. countries, regions, people, firms, etc.
  • Assumes observations are essentially independent

Cross-sectional data

| obs | city | crime rate | poverty rate |
| --- | --- | --- | --- |
| 1 | 03441 | 0.25 | 0.23 |
| 2 | 03442 | 0.15 | 0.03 |
| 3 | 03456 | 0.23 | 0.14 |
| 4 | 03458 | 0.35 | 0.12 |
| 5 | 03460 | 0.34 | 0.27 |

Time Series

  • Looking at one or more variables over time
    • e.g. GDP, stock prices, CPI, homicide rates
  • Usually "serially correlated"
    • \(GDP_{2016} = GDP_{2015} + u\)
    • This will raise problems with statistical inference if we don't do something.
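To see what serial correlation looks like, the Python sketch below simulates a series built the same way as the GDP equation above (this period equals last period plus a shock) and compares it with the shocks themselves. The data are made up and the function name `lag1_autocorr` is hypothetical.

```python
import random


def lag1_autocorr(series):
    """Sample correlation between a series and its own one-period lag."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in series)
    return num / den


rng = random.Random(2)
shocks = [rng.gauss(0, 1) for _ in range(500)]   # independent draws

# Random walk: each value is last period's value plus a new shock.
walk = []
level = 0.0
for u in shocks:
    level += u
    walk.append(level)

print("lag-1 autocorrelation, shocks:", round(lag1_autocorr(shocks), 3))
print("lag-1 autocorrelation, walk:  ", round(lag1_autocorr(walk), 3))
```

The independent shocks show almost no correlation with their own past, while the accumulated series is very strongly correlated with it. Treating such observations as independent overstates how much information the sample contains.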

Time Series

| obs | year | crime rate | poverty rate |
| --- | --- | --- | --- |
| 1 | 1999 | 0.25 | 0.23 |
| 2 | 2000 | 0.25 | 0.23 |
| 3 | 2001 | 0.23 | 0.24 |
| 4 | 2002 | 0.25 | 0.22 |
| 5 | 2003 | 0.24 | 0.27 |

Panel/Pooled Cross-section

  • Looking at cross-sectional variation and variation over time
    • Pooled: different (independent) cross-sections
      • e.g. random people from different states at different time periods
    • Panel (longitudinal): following the same people/firms/countries/etc. over different time periods
      • can account for unobservable differences ("fixed effects")
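One way panel data can account for unobservable differences is the "within" (demeaning) approach, sketched below in Python on simulated data. Everything here is an assumption for illustration: the unit count, the true slope of 1.0, and a setup where each unit's unobserved trait `a_i` also pushes up its \(x\).

```python
import random


def slope(x, y):
    """OLS slope from a regression of y on x with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx


def demean(values):
    """Subtract the unit's own average from each of its observations."""
    m = sum(values) / len(values)
    return [v - m for v in values]


rng = random.Random(3)
true_beta, n_units, n_periods = 1.0, 200, 5

pooled_x, pooled_y, within_x, within_y = [], [], [], []
for _ in range(n_units):
    a_i = rng.gauss(0, 2)                                    # unobserved, constant over time
    x_i = [a_i + rng.gauss(0, 1) for _ in range(n_periods)]  # x is correlated with a_i
    y_i = [true_beta * x + a_i + rng.gauss(0, 1) for x in x_i]
    pooled_x += x_i
    pooled_y += y_i
    within_x += demean(x_i)      # demeaning removes anything fixed within a unit
    within_y += demean(y_i)

beta_pooled = slope(pooled_x, pooled_y)   # ignores a_i, so it absorbs its effect
beta_within = slope(within_x, within_y)   # uses only variation within each unit

print(f"pooled slope {beta_pooled:.3f}, within slope {beta_within:.3f}")
```

Demeaning only works because the same units are observed repeatedly. With independent pooled cross-sections there is no unit-level average to subtract.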

Panel Data

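| obs | year | city | crime rate | poverty rate |
| --- | --- | --- | --- | --- |
| 1 | 1999 | Baltimore | 0.25 | 0.23 |
| 2 | 2000 | Baltimore | 0.25 | 0.23 |
| 3 | 2001 | Baltimore | 0.23 | 0.24 |
| 4 | 1999 | Dist Col. | 0.15 | 0.22 |
| 5 | 2000 | Dist Col. | 0.14 | 0.27 |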

Ceteris Paribus means…

Ceteris Paribus

  • What is the effect of \(x_1\) on \(y\)?
    • what if \(x_2\) changes at the same time as \(x_1\)?
      • We might face the problem of collinearity.
    • to the extent we can, we want to isolate the effect of interest.
  • We need to be careful to include the right variables to avoid "model specification error"
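The sketch below illustrates, on made-up data, what goes wrong when \(x_2\) moves with \(x_1\) but is left out, and one way to "hold \(x_2\) constant". It relies on the Frisch-Waugh-Lovell result: the coefficient on \(x_1\) in a regression that also includes \(x_2\) equals the slope obtained after removing the part of \(y\) and of \(x_1\) that \(x_2\) explains. The coefficients (2.0 and 3.0) and the helper names are assumptions for illustration.

```python
import random


def fit(x, y):
    """Return (intercept, slope) from an OLS regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return mean_y - b1 * mean_x, b1


def residuals(x, y):
    """The part of y that a linear function of x does not explain."""
    b0, b1 = fit(x, y)
    return [yi - b0 - b1 * xi for xi, yi in zip(x, y)]


rng = random.Random(4)
n = 1000
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [0.8 * a + rng.gauss(0, 0.6) for a in x1]          # x2 moves with x1
y = [1.0 + 2.0 * a + 3.0 * b + rng.gauss(0, 1) for a, b in zip(x1, x2)]

_, naive = fit(x1, y)                  # x2 omitted: slope picks up x2's effect too

y_res = residuals(x2, y)               # strip x2 out of y
x1_res = residuals(x2, x1)             # strip x2 out of x1
_, partial = fit(x1_res, y_res)        # effect of x1 holding x2 constant

print(f"x2 omitted: {naive:.3f}   x2 held constant: {partial:.3f}")
```

The more closely \(x_2\) tracks \(x_1\), the less independent variation is left in `x1_res`, and the less precisely the effect of \(x_1\) can be pinned down.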

Ceteris Paribus

What about when we think \(x_1\) and \(x_2\) work together to affect \(y\)? We can use interaction terms to understand the effect of both of them together.
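As a sketch, with hypothetical coefficient names \(\beta_1\), \(\beta_2\) and \(\beta_3\), a model with an interaction term can be written as

\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + u \]

and the effect of \(x_1\), holding \(x_2\) and \(u\) fixed, is

\[ \frac{\partial y}{\partial x_1} = \beta_1 + \beta_3 x_2 \]

so the effect of \(x_1\) on \(y\) depends on the level of \(x_2\) whenever \(\beta_3 \neq 0\).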

Causality

https://www.xkcd.com/552/

Correlation doesn't imply causation, but it does waggle its eyebrows suggestively and gesture furtively while mouthing "look over there".

Causality

  • The best way to determine causality is with an experiment.

  • Most economists aren't able to run experiments, so figuring out causality is much harder.

  • Good advice: imagine an experiment that would help you figure out causality.

Examples: Crop Yield

  • Effect of fertilizer on crop yield
    • how many extra pounds of soybeans can I get from an extra pound of fertilizer?
    • implicit assumption: other factors (e.g. weather, parasites, etc.) are held fixed

Examples: Crop Yield

  • Experiment
    • divide fields into a grid (e.g. square-meter plots)
    • randomly put different amounts of fertilizer on different squares (why randomly?)
    • carefully keep track of everything (make sure computers/GPS are working well)
    • compare results, taking account of other variables
      • e.g. when comparing to last harvest, take account of differences in weather
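The "why randomly?" question can be explored with a small simulation. The Python below is a sketch under made-up assumptions: each plot has an unobserved soil quality that raises yield, the true effect of fertilizer is 2.0 per unit, and in the non-random case better soil receives more fertilizer. All names and numbers are hypothetical.

```python
import random


def slope(x, y):
    """OLS slope from a regression of y on x with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx


rng = random.Random(5)
n_plots = 1000
true_effect = 2.0
soil = [rng.gauss(0, 1) for _ in range(n_plots)]   # unobserved by the analyst


def harvest(fertilizer, soil_quality):
    """Made-up yield: fertilizer and soil both matter, plus other noise."""
    return 50.0 + true_effect * fertilizer + 5.0 * soil_quality + rng.gauss(0, 1)


# Experiment: fertilizer amounts are drawn without regard to soil quality.
random_fert = [rng.uniform(0, 10) for _ in range(n_plots)]
# No experiment: better soil systematically gets more fertilizer.
chosen_fert = [10.0 + 2.0 * s + rng.gauss(0, 1) for s in soil]

yield_random = [harvest(f, s) for f, s in zip(random_fert, soil)]
yield_chosen = [harvest(f, s) for f, s in zip(chosen_fert, soil)]

effect_random = slope(random_fert, yield_random)
effect_chosen = slope(chosen_fert, yield_chosen)

print(f"randomized: {effect_random:.3f}   non-randomized: {effect_chosen:.3f}")
```

Random assignment makes fertilizer unrelated to soil quality, so soil differences end up as noise rather than bias. When fertilizer follows soil quality, the estimate credits fertilizer with part of what the soil did.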

Examples: Returns to Schooling

  • Measuring returns to schooling
    • what would happen to someone's wages if they'd spent an extra year in school?
    • implicit assumption: other factors (e.g. motivation, IQ, etc.) are held fixed

Examples: Returns to Schooling

  • Experiment
    • pry babies from their mothers' arms
    • randomly assign them to different amounts of schooling at different schools
    • compare wage outcomes

Examples: Returns to Schooling

  • Econometrics
    • collect information on people, including demographics, wages, years of schooling, etc.
    • make and justify assumptions about what variables are included
    • look for natural experiments when possible

Testing hypotheses

  • Theories might be:
    • Causal: "good institutions cause economic growth"
    • Correlational: "short-term and long-term bonds should be roughly equivalent"