Due Date: March 31, 2025 (by noon)
Submission Format: PDF report + GRETL file (.gretl)
This assignment applies multiple regression techniques to analyze how distance to the nearest college affects years of completed education, while controlling for student characteristics, family background, and local labor market conditions.
You will:
✔ Compare simple vs. multiple regression results
✔ Test for omitted variable bias
✔ Conduct t-tests and F-tests for statistical significance
✔ Make predictions using regression results
All exercises must be performed in GRETL.
The College Distance dataset comes from the High School and Beyond Survey (1980) and includes variables on educational attainment, demographics, family background, and local economic conditions.
Variable | Description |
---|---|
ed | Years of education completed (dependent variable) |
dist | Distance to nearest 4-year college (in 10s of miles) |
bytest | Base year composite test score (standardized) |
female | 1 = Female, 0 = Male |
black | 1 = Black, 0 = Not Black |
hispanic | 1 = Hispanic, 0 = Not Hispanic |
incomehi | 1 = Family income > $25,000, 0 = ≤ $25,000 |
ownhome | 1 = Family owns home, 0 = Does not own home |
dadcoll | 1 = Father is college graduate, 0 = Not college graduate |
cue80 | County unemployment rate (1980) |
stwmfg80 | State hourly manufacturing wage (1980) |
Run a simple regression of ed on dist:
Test the significance of the coefficient on dist (state H₀, t-statistic, p-value, and conclusion at α = 0.05).
Run a multiple regression of ed on: dist, bytest, female, black, hispanic, incomehi, ownhome, dadcoll, cue80, stwmfg80.
Compare the coefficient on dist from the multiple regression to the simple regression:
For each coefficient in your multiple regression (except the intercept):
dadcoll (Father is a college graduate)
ed = β₀ + β₁dist + β₂bytest + u
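As a reminder of why the coefficient on dist can change when bytest is added, the standard omitted variable bias result for the two-regressor model above can be sketched as follows. This assumes the model above is the true population regression and that u is uncorrelated with both dist and bytest; the "short" superscript is notation introduced here for the slope from regressing ed on dist alone.

$$
\hat{\beta}_1^{\text{short}} \;\xrightarrow{\;p\;}\; \beta_1 + \beta_2 \,\frac{\operatorname{Cov}(\text{dist},\,\text{bytest})}{\operatorname{Var}(\text{dist})}
$$

Under these assumptions the sign of the bias is the sign of β₂ times the sign of the covariance between dist and bytest, and the bias vanishes if either β₂ = 0 or the two regressors are uncorrelated.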
dadcoll and momcoll):
dadcoll and momcoll.
bytest=58, incomehi=1, ownhome=1, dadcoll=0, cue80=7.5, stwmfg80=9.75, dist=2 (20 miles).
dist=4 (40 miles).
✔ GRETL file (.gretl) containing all regression models.
✔ PDF report with:
- Regression tables
- Hypothesis test results (t-tests & F-tests)
- Interpretations of coefficients and predictions
✔ Due: March 31, 2025 (noon).
Late submissions will not be accepted.
Good luck!