Homework Assignment 2: Applied Regression
Analysis
Objective: Perform regression analysis on two
datasets (Income-Consumption and Olympic 100m) to calculate key
statistics, test hypotheses, and compare models. Submit one
Excel file with two sheets and a PDF report
with interpretations.
Part 1: Income-Consumption Dataset (Manual
Calculations)
Dataset: Income vs. Consumption.
Tasks (Excel Sheet 1)
Calculate R-squared (R²)
- Compute \(\bar{y}\) (mean
consumption).
- For each observation:
- \((e_i)^2 = (y_i -
\hat{y}_i)^2\)
- \((y_i - \bar{y})^2\)
- Sum to get \(SSR = \sum (e_i)^2\)
and \(SST = \sum (y_i -
\bar{y})^2\).
- Calculate:
\[
R^2 = 1 - \frac{SSR}{SST}
\]
Estimate Standard Error (s)
\[
s = \sqrt{\frac{SSR}{n-2}}
\]
Standard Error of Slope (\(s_{\hat{\beta}_2}\))
\[
s_{\hat{\beta}_2} = \sqrt{\frac{s^2}{\sum (x_i - \bar{x})^2}}
\]
Hypothesis Test for Slope
- \(t =
\frac{\hat{\beta}_2}{s_{\hat{\beta}_2}}\)
- Reject \(H_0\) if \(|t| > 2\).
95% CI for Slope
\[
\hat{\beta}_2 \pm 2 \cdot s_{\hat{\beta}_2}
\]
Prediction Interval
For \(x_0 = \$10,000\):
\[
\hat{y}_0 \pm 2 \cdot s
\]
Interpretation (PDF Report)
Part 2: Olympic 100m Dataset (Linear vs. Exponential
Models)
Dataset: Winning times (s) for men/women
(1948–2004).
Tasks (Excel Sheet 2)
- Linear Model
- Fit: \(\text{Time} = \beta_1 + \beta_2
\cdot \text{Year}\).
- Predict equal-performance year by solving:
\[
\beta_1^{men} + \beta_2^{men} \cdot \text{Year} = \beta_1^{women} +
\beta_2^{women} \cdot \text{Year}
\]
- Exponential Model
- Transform: \(\ln(\text{Time}) =
\ln(\beta_1) + \beta_2 \cdot \text{Year}\).
- Use
LINEST()
to estimate \(\ln(\beta_1)\) and \(\beta_2\).
- Convert back: \(\text{Time} =
e^{\ln(\beta_1) + \beta_2 \cdot \text{Year}}\).
- Predict equal-performance year.
Interpretation (PDF Report)
- Linear vs. Exponential**:
- Limitations:
Submission Guidelines
- Excel File:
- Sheet 1: Income-Consumption manual
calculations.
- Sheet 2: Olympic 100m
LINEST()
outputs
and predictions.
- PDF Report:
- 1–2 pages.
- Interpret R², slope test, CI, and model
comparisons.
- Due: The midnight on March 27th.
- Grading:
- Excel (50%): Correct calculations and
organization.
- PDF (50%): Clear interpretations and
reasoning.
Key Notes for Students
- Use absolute cell references (e.g.,
$A$2
) in Excel.
- Label all columns/rows clearly.
- For the Olympic data, highlight how exponential decay reflects
physiological limits.