
Session Overview


Session 1: Hypothesis Testing and Confidence Intervals (45 minutes)

1.1 Recap of the Exercise Results

  • Linear regression model (population):

$\text{Sales}_i = \beta_1 + \beta_2 \times \text{Advertising}_i + \epsilon_i$

  • $\beta_1 + \beta_2 \times \text{Advertising}_i$ is the population regression line.

  • Create a table showing the values of $\hat{\beta}_2$, $s_{\hat{\beta}_2}$, and the t-statistic.

$\hat{\beta}_2$        0.017
$s_{\hat{\beta}_2}$    0.00136
t-stat                 12.4

  • $R^2 = 0.75$

  • $s$ (the standard error of the regression) = 8.74

  • Recap the concepts of hypothesis testing.
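As a quick check on the recap table, the t-statistic can be recomputed from the two reported values. This is a minimal sketch that assumes the null hypothesis is $\beta_2 = 0$ and uses 2 as a rule-of-thumb critical value; neither is stated in the outline. With the rounded inputs the ratio comes out at about 12.5, so the small gap to the reported 12.4 presumably reflects rounding of the displayed values.

```python
# Values reported in the recap table.
beta2_hat = 0.017      # estimated slope
se_beta2 = 0.00136     # standard error of the slope

# Assumed null hypothesis H0: beta2 = 0 (not stated in the outline).
beta2_null = 0.0
t_stat = (beta2_hat - beta2_null) / se_beta2

# Rule-of-thumb two-sided 5% critical value; the exact t value depends on n - 2.
critical_value = 2.0
reject_null = abs(t_stat) > critical_value

print(f"t-stat = {t_stat:.2f}, reject H0 at roughly the 5% level: {reject_null}")
```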

1.2 Confidence Intervals

  • Objective: Recap the concepts of confidence intervals.
  • Key Points:
    • Confidence Interval for the Slope:
      • What is it?: A range of values within which the true slope coefficient is expected to lie with a certain level of confidence (e.g., 95%).
      • Why is it important?: It provides a measure of the uncertainty around the slope estimate.

1.3 Hands-On Calculation of Confidence Interval for the Slope

  • Objective: Calculate the confidence interval for the slope manually using Google Sheets.
  • Exercise: Use the Advertising-Sales dataset.
    • Step 1: Compute the standard error of the slope coefficient ($s_{\hat{\beta}_2}$).
      • Instruction: Use the formula: $s_{\hat{\beta}_2} = \sqrt{\dfrac{s^2}{\sum (x_i - \bar{x})^2}}$
      • Why?: This measures the variability of the slope estimate.
    • Step 2: Calculate the 95% confidence interval for the slope.
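      • Instruction: Use the formula: $CI = \hat{\beta}_2 \pm 2 \times s_{\hat{\beta}_2}$ (the multiplier 2 approximates the two-sided 5% critical value, about 1.96 in large samples).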
      • Why?: This gives a range within which the true slope is likely to lie.
    • Discussion: What does the confidence interval tell us about the slope? How does it help in interpreting the regression results?
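The two steps above can be sketched in a few lines, using the slope and standard error reported in the 1.1 recap. The function name `slope_confidence_interval` and the `multiplier` parameter are illustrative choices, not part of the course material.

```python
def slope_confidence_interval(beta_hat, se, multiplier=2.0):
    """Return (lower, upper) for beta_hat +/- multiplier * se."""
    margin = multiplier * se
    return beta_hat - margin, beta_hat + margin


# Slope estimate and standard error from the Advertising-Sales recap.
lower, upper = slope_confidence_interval(0.017, 0.00136)

# If zero lies outside the interval, the slope differs from zero
# at (approximately) the 5% significance level.
contains_zero = lower <= 0.0 <= upper

print(f"approx. 95% CI for the slope: [{lower:.5f}, {upper:.5f}]")
print(f"interval contains zero: {contains_zero}")
```

The same arithmetic maps directly onto two spreadsheet cells, one for each bound.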

Session 2: Forecasting with Linear and Non-Linear Models (45 minutes)

2.1 Linear Model for Men and Women

  • Objective: Fit a linear trend model to the Olympic 100m dataset for men and women.
  • Exercise: Use the Olympic 100m dataset.
    • Step 1: Fit a linear trend model to the data.
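      • Instruction: Use the formula: $W_i = \beta_1 + \beta_2 G_i + \epsilon_i$, where $W_i$ is the winning time (seconds) and $G_i$ indexes the Games, $i = 1, \dots, 15$ (from $G = 1$ for 1948 to $G = 15$ for 2004).
        • Find $\hat{\beta}_2 = \dfrac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$ and $\hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x}$, with $x_i = G_i$ and $y_i = W_i$.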
      • Why?: This provides a baseline forecast based on a linear trend.
    • Step 2: Calculate the outcomes of the linear model.
      • Instruction: Compute $s^2 = \dfrac{1}{n-2} \sum_{i=1}^{n} e_i^2$ and $\sigma^2_{\hat{\beta}_2} = \dfrac{s^2}{\sum (x_i - \bar{x})^2}$

      • Create a table summarizing the results.

      • Construct a 95% CI for the slope coefficients.

      • A 95% two-sided confidence interval for the slope coefficient is an interval that contains the true value of the slope coefficient with a 95% probability; that is, it contains the true value of the slope coefficient in 95% of all possible randomly drawn samples.

      • Women seem to have made the most progress.

      • Model assumes fixed gain (in seconds per game).

      • Negative slope means winning times decrease over time (improvement).
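The manual calculations in Steps 1 and 2 can be sketched as a single function. The data below are small placeholder numbers chosen so the arithmetic is easy to follow by hand; they are not the Olympic 100m results, and the function name `fit_simple_ols` is illustrative.

```python
from math import sqrt


def fit_simple_ols(x, y):
    """Simple OLS of y on x using the textbook sums-of-deviations formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    beta2 = sxy / sxx                      # slope
    beta1 = y_bar - beta2 * x_bar          # intercept

    residuals = [yi - (beta1 + beta2 * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in residuals) / (n - 2)   # regression variance
    se_beta2 = sqrt(s2 / sxx)                       # standard error of slope

    return {
        "beta1": beta1,
        "beta2": beta2,
        "s2": s2,
        "se_beta2": se_beta2,
        # Approximate 95% interval using the +/- 2 standard errors rule.
        "ci_beta2": (beta2 - 2 * se_beta2, beta2 + 2 * se_beta2),
    }


# Placeholder data (NOT the Olympic dataset).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
result = fit_simple_ols(x, y)
for name, value in result.items():
    print(name, value)
```

Running the function once for the men's series and once for the women's series would give the two rows of the summary table.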

2.2 Non-Linear (Exponential) Model for Men and Women

  • Objective: Fit a non-linear (exponential) trend model to the data and transform it into a log-linear form.
  • Exercise: Use the Olympic 100m dataset.
    • Step 1: Fit a non-linear (exponential) trend model to the data.
      • Instruction: Use the formula: $W_i = e^{\beta_1 + \beta_2 G_i + \epsilon_i}$
      • Why?: This model implies a constant percentage change per game rather than a constant change in seconds, which may give a more realistic forecast.
    • Step 2: Transform the non-linear model into a log-linear form.
      • Instruction: Take the natural logarithm of both sides of the non-linear model: $\ln(W_i) = \beta_1 + \beta_2 G_i + \epsilon_i$
      • Why?: This allows us to use linear regression techniques on the transformed model.
    • Step 3: Calculate the outcomes of the non-linear model.
      • Instruction: Compute the predicted winning times for men and women using the non-linear model.
      • Why?: This gives the forecasted winning times based on the non-linear trend.
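The three steps above can be sketched as follows: take logs of the winning times, run the same OLS formulas on the transformed values, and exponentiate the fitted line. The data are generated from a known exponential curve purely so the result can be checked; they are placeholders, not Olympic results, and the function names are illustrative. The back-transformation simply sets the error term to zero.

```python
from math import exp, log


def fit_log_linear(g, w):
    """OLS of ln(w) on g; returns (beta1_hat, beta2_hat)."""
    y = [log(wi) for wi in w]
    n = len(g)
    g_bar = sum(g) / n
    y_bar = sum(y) / n
    sxx = sum((gi - g_bar) ** 2 for gi in g)
    sxy = sum((gi - g_bar) * (yi - y_bar) for gi, yi in zip(g, y))
    beta2 = sxy / sxx
    beta1 = y_bar - beta2 * g_bar
    return beta1, beta2


def predict_exponential(beta1, beta2, g):
    """Predicted level: exp(beta1 + beta2 * g), with the error term set to zero."""
    return exp(beta1 + beta2 * g)


# Placeholder data from a known curve exp(2.4 - 0.005 * g), NOT Olympic data.
games = [1, 2, 3, 4, 5]
times = [exp(2.4 - 0.005 * g) for g in games]

b1, b2 = fit_log_linear(games, times)
fitted = [predict_exponential(b1, b2, g) for g in games]
print(f"beta1_hat = {b1:.4f}, beta2_hat = {b2:.4f}")
```

Because the slope is estimated on the log scale, it reads as an approximate proportional change in winning time per game.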

2.3 Forecasting Winning Times for 2008 and 2012

  • Objective: Use the fitted models to forecast winning times for men and women in the Olympic games of 2008 and 2012.
  • Exercise: Use the Olympic 100m dataset.
    • Step 1: Forecast winning times for 2008 and 2012 using the linear model.
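      • Instruction: Plug the 2008 and 2012 games into the estimated linear equation, $\hat{W}_i = \hat{\beta}_1 + \hat{\beta}_2 G_i$. With the coding 1 = 1948, ..., 15 = 2004, these correspond to $G = 16$ and $G = 17$.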
      • Why?: This provides a forecast based on the linear trend.
    • Step 2: Forecast winning times for 2008 and 2012 using the non-linear model.
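      • Instruction: Plug the 2008 and 2012 games ($G = 16$ and $G = 17$ under the coding 1 = 1948, ..., 15 = 2004) into the estimated log-linear equation and then exponentiate the result: $\hat{W}_i = e^{\hat{\beta}_1 + \hat{\beta}_2 G_i}$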
      • Why?: This provides a forecast based on the non-linear trend.
    • Discussion: How do the forecasts differ between the linear and non-linear models? Which model is more realistic?
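A minimal sketch of the two forecasting steps is given below. The coefficient values are hypothetical placeholders chosen only to make the code run; they are not estimates from the Olympic dataset and should be replaced with the values obtained in 2.1 and 2.2. The helper `game_index` assumes the coding 1 = 1948 with one game every four years.

```python
from math import exp, log


def game_index(year, first_year=1948, step=4):
    """Map an Olympic year to its game index (1948 -> 1, 2004 -> 15)."""
    return (year - first_year) // step + 1


def linear_forecast(beta1, beta2, g):
    """Forecast from the linear trend model."""
    return beta1 + beta2 * g


def exponential_forecast(beta1, beta2, g):
    """Forecast from the log-linear model, exponentiated back to seconds."""
    return exp(beta1 + beta2 * g)


# HYPOTHETICAL coefficients for illustration only (not fitted values).
lin_b1, lin_b2 = 10.5, -0.03
log_b1, log_b2 = log(10.5), -0.003

forecasts = {}
for year in (2008, 2012):
    g = game_index(year)
    forecasts[year] = (
        linear_forecast(lin_b1, lin_b2, g),
        exponential_forecast(log_b1, log_b2, g),
    )
    print(year, g, forecasts[year])
```

The linear forecast keeps falling by the same number of seconds per game, while the exponential forecast falls by the same proportion, which is the contrast the discussion question points at.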

2.4 Introduction to GRETL and Verification of Results

  • Objective: Introduce GRETL and verify the results of the linear and non-linear models.
  • Exercise: Use the Olympic 100m dataset.
    • Step 1: Load the dataset into GRETL.
      • Instruction: Use the GRETL interface to load the dataset.
      • Why?: This allows us to perform regression analysis using GRETL.
    • Step 2: Perform linear regression in GRETL.
      • Instruction: Use GRETL to run the linear regression and compare the results with the manual calculations.
      • Why?: This verifies the accuracy of the manual calculations.
    • Step 3: Estimate the non-linear model in GRETL.
      • Instruction: Use GRETL to regress $\ln(W_i)$ on $G_i$ (the log-linear form of the non-linear model) and compare the results with the manual calculations.
      • Why?: This verifies the accuracy of the manual calculations.
    • Discussion: How do the GRETL results compare with the manual calculations? What are the benefits of using GRETL for regression analysis?

Homework/Follow-Up