Processing math: 100%

Session Overview


Session 1: Data Entry and Visualization (45 minutes)

1.1 Introduction to the Lab Session (5 minutes)

  • Objective: Set the stage for the lab session and explain the importance of manual calculations.
  • Key Points:
    • Today’s focus is on manual calculations using Google Sheets to understand the mechanics of regression analysis.
    • In future sessions, we will transition to using GRETL for more advanced analyses.
    • Task 1: Sign in to your Google account and go to Google Drive.
    • Task 2: Create a new folder in Google Drive named Applied Econometrics.
      • Why?: This folder will store all the files and datasets used in these lab sessions for easy access and organization.
    • Task 3: Save all files and datasets studied in these lab sessions in the Applied Econometrics folder.

1.2 Loading the Advertising-Sales Dataset (5 minutes)

  • Objective: Load the dataset into Google Sheets and explain its structure.
  • Exercise: Use the Advertising-Sales dataset.
    • Step 1: Download the Advertising-Sales dataset (provided below).
    • Step 2: Import the dataset into Google Sheets.
      • Instruction: Go to Google Sheets, click on File > Import, and upload the dataset.
      • Why?: Importing the dataset allows us to organize and analyze the data in a structured format.
    • Step 3: Explain the dataset.
      • Dataset Description:
        • The dataset contains 52 weekly observations of Advertising Expenditure (in thousands of dollars) and Weekly Sales (in thousands of units) for a retail store.
        • Variables:
          • Week: The week number (1 to 52).
          • Advertising Expenditure (x): The amount spent on advertising each week (in $1,000s).
          • Weekly Sales (y): The number of units sold each week (in 1,000s).
      • Discussion: What does this dataset represent? How can we use it to analyze the relationship between advertising spending and sales?

1.3 Data Exploration with Weekly Dataset (25 minutes)

  • Objective: Familiarize students with the dataset and perform basic data exploration.
  • Exercise: Use the weekly advertising and sales dataset.
    • Step 1: Load the dataset into Google Sheets.
      • Why?: To visualize and analyze the data, we need to organize it in a structured format.
    • Step 2: Create a line chart to visualize advertising spending over the 52 weeks.
      • Instruction: Plot Week on the horizontal axis and Advertising Expenditure on the vertical axis.
      • Why?: A line chart helps us observe trends in advertising spending over time.
      • Discussion: What trends do you observe? Is advertising spending increasing, decreasing, or stable over time?
    • Step 3: Create a line chart to visualize weekly sales over the 52 weeks.
      • Instruction: Plot Week on the horizontal axis and Weekly Sales on the vertical axis.
      • Why?: A line chart helps us observe trends in sales over time.
      • Discussion: What trends do you observe in sales? Are sales increasing, decreasing, or stable over time? Do sales trends align with advertising spending trends?
  • Step 4: Create a scatter plot to visualize weekly sales against advertising expenditures.
    • Instruction: Plot Weekly advertising expenditures on the horizontal axis and Weekly Sales on the vertical axis.
    • Why?: A scatter plot helps us observe the relationship between numerical variables.
    • Discussion: What kind of relationship do you observe between advertising expenditures and sales?

Session 2: Simple Linear Regression – Intuition and Model Interpretation (45 minutes)

2.1 Refresher on Simple Linear Regression (10 minutes)

  • Objective: Recap the intuition behind simple linear regression and its components.
  • Key Points:
    • Regression Equation: yi=ˆβ1+ˆβ2xi+ei
      • yi: Dependent variable (e.g., Weekly Sales)
      • xi: Independent variable (e.g., Advertising Expenditure)
      • ˆβ1: Intercept (value of y when x=0)
      • ˆβ2: Slope (change in y for a unit change in x)
      • ei: Residual (error term, the difference between actual and predicted values)
    • Intuition: Regression helps us understand how changes in advertising spending affect sales. ˆβ2 tells us the direction and strength of this relationship.

2.2 Hands-On Calculation of ˆβ1 and ˆβ2 (35 minutes)

  • Objective: Calculate the slope (ˆβ2) and intercept (ˆβ1) manually using Google Sheets.
  • Exercise: Use the weekly advertising and sales dataset.
    • Step 1: Compute (xiˉx) and (yiˉy) for each observation.
      • Instruction: Create new columns in Google Sheets for these calculations.
      • Why?: These deviations from the mean help us measure how far each observation is from the average, which is essential for calculating ˆβ2.
    • Step 2: Compute (xiˉx)(yiˉy) and (xiˉx)2.
      • Instruction: Multiply and square the deviations in new columns.
      • Why?: These calculations are part of the formula for ˆβ2. The numerator captures the covariance between x and y, while the denominator captures the variance in x.
    • Step 3: Sum the columns to get (xiˉx)(yiˉy) and (xiˉx)2.
      • Instruction: Use the SUM function in Google Sheets.
      • Why?: These sums are used to calculate ˆβ2.
    • Step 4: Calculate ˆβ2 using the formula. ˆβ2=(xiˉx)(yiˉy)(xiˉx)2
      • Instruction: Divide the two sums to get ˆβ2.
      • Why?: ˆβ2 tells us how much y changes for a unit change in x. For example, if ˆβ2=2.5, it means that for every $1,000 increase in advertising spending, sales increase by 2,500 units.
    • Step 5: Calculate ˆβ1 using the formula ˆβ1=ˉyˆβ2ˉx.
      • Instruction: Use the previously computed means and ˆβ2 to find ˆβ1.
      • Why?: The intercept (ˆβ1) tells us the value of y when x=0. It helps us anchor the regression line.
    • Step 6: Write the regression equation. ˆy=ˆβ1+ˆβ2x
      • Instruction: Substitute the values of ˆβ1 and ˆβ2 into the equation.
      • Why?: This equation allows us to predict y for any given x.
    • Step 7: Predict sales for a week with $8,000 spent on advertising.
      • Instruction: Use the regression equation to make predictions.
      • Why?: Predictions help us understand how the model can be applied in real-world scenarios. For example, a store manager can use this model to predict sales for different levels of advertising spending.
      • Discussion: What do ˆβ1 and ˆβ2 tell us about the relationship between advertising spending and sales? How accurate are the predictions?
    • Step 8: Objective: Demonstrate and explain the properties of residuals in the context of the least squares regression method. Verification of residual sums and averages.
      • Instruction: Compute the sum of residuals (ei) and the average of residuals (ˉe). Also, verify that the average of the observed dependent variable (ˉy) equals the average of the predicted values (ˉˆy).
      • Calculation:
        • Calculate ei using the formula: ei=(yiˆyi)
        • Compute ˉe using the formula: ˉe=1nei
        • Verify ˉy=ˉˆy by comparing ˉy and 1nˆyi
      • Why?:
        • ei=0: This result emerges because the least squares method minimizes the sum of the squared residuals, naturally leading to the residuals summing to zero. This property ensures that the regression line correctly balances the overestimations and underestimations of the dependent variable, providing an unbiased estimate.
        • ˉe=0: The average of the residuals being zero further confirms that the model does not systematically overpredict or underpredict the dependent variable, which aligns with the unbiased nature of the least squares estimators.
        • ˉy=ˉˆy: This equality holds because the least squares regression line passes through the centroid of the data points (ˉx,ˉy), ensuring that the average of the observed values equals the average of the predicted values. This is a crucial validation of the model’s accuracy in capturing the central tendency of the dependent variable.

Homework/Follow-Up