Teaching Notes for Applied Econometrics and Economic Modelling Lab Session

Duration: 2 sessions of 45 minutes each
Tools: Google Sheets (manual calculations, with the LINEST function for verification) and GRETL (to verify the regression results).
Objective: Students will calculate confidence intervals for the slope using the Advertising-Sales dataset, and then analyze the Olympic 100m dataset using linear and non-linear models. They will also learn how to log-linearize non-linear models.
Focus: Hands-on calculations, intuitive explanations, and forecasting using both linear and non-linear models.

$\text{Sales}_i = \beta_1 + \beta_2 \times \text{Advertising}_i + \epsilon_i$

$\beta_1 + \beta_2 \times \text{Advertising}_i$ is the population regression line.
Create a table showing the values of $\hat{\beta}_2$, $s_{\hat{\beta}_2}$, and the $t$-statistic.

Objective: Recap the concepts of confidence intervals.
Key Points:
- Confidence Interval for the Slope:
  - What is it?: A range of values within which the true slope coefficient is expected to lie with a certain level of confidence (e.g., 95%).
  - Why is it important?: It provides a measure of the uncertainty around the slope estimate.

Objective: Calculate the confidence interval for the slope manually using Google Sheets.
Exercise: Use the Advertising-Sales dataset.
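- Step 1: Compute the standard error of the slope coefficient ($s_{\hat{\beta}_2}$).
  - Instruction: Use the formula: $s_{\hat{\beta}_2} = \sqrt{\frac{s^2}{\sum (x_i - \bar{x})^2}}$, where $s^2 = \frac{1}{n-2} \sum_{i=1}^{n} e_i^2$ is the residual variance and $x_i$ is the advertising level of observation $i$.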
  - Why?: This measures the variability of the slope estimate.
- Step 2: Calculate the 95% confidence interval for the slope.
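  - Instruction: Use the formula: $\text{CI} = \hat{\beta}_2 \pm 2 \times s_{\hat{\beta}_2}$, where 2 is a rule-of-thumb approximation to the 97.5% critical value of the $t$-distribution with $n-2$ degrees of freedom (about 1.96 in large samples).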
  - Why?: This gives a range within which the true slope is likely to lie.
- Discussion: What does the confidence interval tell us about the slope? How does it help in interpreting the regression results?
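
The two steps above can be cross-checked outside the spreadsheet with a short Python sketch. It is only an illustration under stated assumptions: the function name `slope_confidence_interval` is hypothetical, the five data points are made up (they are not the Advertising-Sales dataset), and the critical value defaults to the rule-of-thumb 2 from Step 2.

```python
import math


def slope_confidence_interval(x, y, critical_value=2.0):
    """Return (slope, standard error, (lower, upper)) for a simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squared deviations of x and cross-deviations of x and y
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b2 = sxy / sxx                 # slope estimate
    b1 = y_bar - b2 * x_bar        # intercept estimate
    residuals = [yi - (b1 + b2 * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in residuals) / (n - 2)   # residual variance
    se_b2 = math.sqrt(s2 / sxx)    # standard error of the slope
    half_width = critical_value * se_b2
    return b2, se_b2, (b2 - half_width, b2 + half_width)


# Made-up illustration data, NOT the Advertising-Sales dataset
advertising = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

b2, se_b2, ci = slope_confidence_interval(advertising, sales)
print(f"slope = {b2:.3f}, se = {se_b2:.3f}, CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

Passing a different `critical_value` (for example an exact $t$ critical value looked up for $n-2$ degrees of freedom) shows students how much the rule of thumb matters in small samples.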

Objective: Fit a linear trend model to the Olympic 100m dataset for men and women.
Exercise: Use the Olympic 100m dataset.
- Step 1: Fit a linear trend model to the data.
  - Instruction: Use the formula: $W_i = \beta_1 + \beta_2 G_i + \epsilon_i$ where $W_i$ is the winning time (in seconds) and $G_i = i$ is the game index for $i = 1, \dots, 15$ (from 1 = 1948 to 15 = 2004). Fit the model separately for men and for women.
    - Find $\hat{\beta}_2 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$ and $\hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x}$, with $x_i = G_i$ and $y_i = W_i$.
  - Why?: This provides a baseline forecast based on a linear trend.
- Step 2: Calculate the outcomes of the linear model.
  - Instruction: Compute $s^2 = \frac{1}{n-2} \sum_{i=1}^{n} e_i^2$, where $e_i = W_i - \hat{\beta}_1 - \hat{\beta}_2 G_i$ are the residuals, and $s_{\hat{\beta}_2}^2 = \frac{s^2}{\sum (x_i - \bar{x})^2}$.
  - Create a table summarizing the results.
  - Construct a 95% CI for the slope coefficients.
  - A 95% two-sided confidence interval for the slope coefficient is an interval that contains the true value of the slope coefficient with a 95% probability; that is, it contains the true value of the slope coefficient in 95% of all possible randomly drawn samples.
  - Women seem to have made the most progress.
  - The model assumes a fixed gain (in seconds per game).
  - A negative slope means that winning times decrease over time (improvement).
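
The summary table for the linear trend can be sketched in Python as follows. This is a minimal sketch, not the course's reference solution: `fit_linear_trend` is a hypothetical helper, the regressor is assumed to be the game index $G_i = 1, \dots, n$, and the five winning times are invented for illustration rather than taken from the Olympic 100m dataset.

```python
import math


def fit_linear_trend(times):
    """OLS of winning time on the game index G = 1, ..., n; returns a summary dict."""
    n = len(times)
    games = list(range(1, n + 1))
    g_bar = sum(games) / n
    w_bar = sum(times) / n
    sxx = sum((g - g_bar) ** 2 for g in games)
    sxy = sum((g - g_bar) * (w - w_bar) for g, w in zip(games, times))
    b2 = sxy / sxx                  # gain in seconds per game
    b1 = w_bar - b2 * g_bar
    residuals = [w - (b1 + b2 * g) for g, w in zip(games, times)]
    s2 = sum(e ** 2 for e in residuals) / (n - 2)
    se_b2 = math.sqrt(s2 / sxx)
    return {"b1": b1, "b2": b2, "s2": s2, "se_b2": se_b2, "t_stat": b2 / se_b2}


# Invented winning times (seconds), for illustration only
example_times = [10.4, 10.3, 10.3, 10.1, 10.0]

summary = fit_linear_trend(example_times)
for name, value in summary.items():
    print(f"{name:>7}: {value: .4f}")
```

Running the helper once on the men's series and once on the women's series gives the two columns of the summary table; the more negative slope identifies the group with the larger estimated gain per game.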

Objective: Fit a non-linear (exponential) trend model to the data and transform it into a log-linear form.
Exercise: Use the Olympic 100m dataset.
- Step 1: Fit a non-linear (exponential) trend model to the data.
  - Instruction: Use the formula: $W_i = e^{\beta_1 + \beta_2 G_i + \epsilon_i}$
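  - Why?: This model captures an exponential trend: winning times change by a constant percentage per game rather than by a fixed number of seconds, and predicted times always stay positive, which may give a more realistic forecast.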
- Step 2: Transform the non-linear model into a log-linear form.
  - Instruction: Take the natural logarithm of both sides of the non-linear model: $\ln(W_i) = \beta_1 + \beta_2 G_i + \epsilon_i$
  - Why?: This allows us to use linear regression techniques on the transformed model.
- Step 3: Calculate the outcomes of the non-linear model.
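  - Instruction: Estimate $\beta_1$ and $\beta_2$ by regressing $\ln(W_i)$ on $G_i$, then compute the predicted winning times for men and women as $\hat{W}_i = e^{\hat{\beta}_1 + \hat{\beta}_2 G_i}$.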
  - Why?: This gives the forecasted winning times based on the non-linear trend.
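
The log-linearization in Steps 2 and 3 can be sketched in Python as shown here. The sketch makes several assumptions: `fit_log_linear_trend` is a hypothetical helper, the regressor is the game index $G_i = 1, \dots, n$, and the example times are generated from an exact exponential curve (with made-up parameters 2.4 and -0.01) so that the fit visibly recovers them. Real data would include noise.

```python
import math


def fit_log_linear_trend(times):
    """Regress ln(W) on the game index G = 1, ..., n and return (b1, b2, fitted times)."""
    n = len(times)
    games = list(range(1, n + 1))
    log_times = [math.log(w) for w in times]     # log-linearize the model
    g_bar = sum(games) / n
    l_bar = sum(log_times) / n
    sxx = sum((g - g_bar) ** 2 for g in games)
    sxy = sum((g - g_bar) * (l - l_bar) for g, l in zip(games, log_times))
    b2 = sxy / sxx                               # approximate relative change per game
    b1 = l_bar - b2 * g_bar
    # Exponentiate to get predictions back in seconds
    fitted = [math.exp(b1 + b2 * g) for g in games]
    return b1, b2, fitted


# Synthetic noise-free data from W = exp(2.4 - 0.01 * G); parameters are made up
example_times = [math.exp(2.4 - 0.01 * g) for g in range(1, 6)]

b1, b2, fitted = fit_log_linear_trend(example_times)
print(f"b1 = {b1:.4f}, b2 = {b2:.4f}")
```

In the log-linear model a slope of -0.01 corresponds to winning times falling by roughly 1% per game, which contrasts with the fixed gain in seconds implied by the linear model.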

Objective: Use the fitted models to forecast winning times for men and women in the Olympic games of 2008 and 2012.
Exercise: Use the Olympic 100m dataset.
- Step 1: Forecast winning times for 2008 and 2012 using the linear model.
  - Instruction: Plug the game indices for 2008 ($G = 16$) and 2012 ($G = 17$) into the fitted linear equation: $\hat{W}_i = \hat{\beta}_1 + \hat{\beta}_2 G_i$
  - Why?: This provides a forecast based on the linear trend.
- Step 2: Forecast winning times for 2008 and 2012 using the non-linear model.
  - Instruction: Plug the game indices for 2008 ($G = 16$) and 2012 ($G = 17$) into the fitted log-linear equation and then exponentiate the result: $\hat{W}_i = e^{\hat{\beta}_1 + \hat{\beta}_2 G_i}$
  - Why?: This provides a forecast based on the non-linear trend.
- Discussion: How do the forecasts differ between the linear and non-linear models? Which model is more realistic?
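
The forecasting steps can be sketched in Python as follows. Everything numeric here is an assumption for illustration: the coefficient values are hypothetical placeholders, not estimates from the Olympic 100m dataset, and the helper names are invented. The only detail taken from the notes is the indexing, where game 1 is 1948 and game 15 is 2004, so 2008 and 2012 correspond to $G = 16$ and $G = 17$.

```python
import math

FIRST_YEAR = 1948  # game index 1 in these notes


def game_index(year):
    """Map an Olympic year to the game index used as the regressor."""
    offset = year - FIRST_YEAR
    if offset < 0 or offset % 4 != 0:
        raise ValueError(f"{year} is not a game year on the 1948-based four-year grid")
    return offset // 4 + 1


def forecast_linear(b1, b2, year):
    """Forecast from the linear trend: W = b1 + b2 * G."""
    return b1 + b2 * game_index(year)


def forecast_log_linear(b1, b2, year):
    """Forecast from the log-linear trend: W = exp(b1 + b2 * G)."""
    return math.exp(b1 + b2 * game_index(year))


# Hypothetical placeholder coefficients; replace with the students' own estimates
lin_b1, lin_b2 = 10.5, -0.03
log_b1, log_b2 = math.log(10.5), -0.003

for year in (2008, 2012):
    linear = forecast_linear(lin_b1, lin_b2, year)
    log_linear = forecast_log_linear(log_b1, log_b2, year)
    print(f"{year}: linear = {linear:.3f} s, log-linear = {log_linear:.3f} s")
```

Extending the loop to far-future years is a useful prompt for the discussion: the linear forecast eventually turns negative, while the exponential forecast stays positive.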

Objective: Introduce GRETL and verify the results of the linear and non-linear models.
Exercise: Use the Olympic 100m dataset.
- Step 1: Load the dataset into GRETL.
  - Instruction: Use the GRETL interface to load the dataset.
  - Why?: This allows us to perform regression analysis using GRETL.
- Step 2: Perform linear regression in GRETL.
  - Instruction: Use GRETL to run the linear regression and compare the results with the manual calculations.
  - Why?: This verifies the accuracy of the manual calculations.
- Step 3: Perform non-linear regression in GRETL.
  - Instruction: Use GRETL to estimate the log-linear form of the non-linear model (regress $\ln(W_i)$ on $G_i$) and compare the results with the manual calculations.
  - Why?: This verifies the accuracy of the manual calculations.
- Discussion: How do the GRETL results compare with the manual calculations? What are the benefits of using GRETL for regression analysis?

Assignment: Students should use GRETL to perform regression analysis on the Olympic 100m dataset and submit their results, including confidence intervals, optimal sales level, and forecasts using both linear and non-linear models.

Teaching Notes for Applied Econometrics and Economic Modelling Lab Session - Day 3