Processing math: 100%
Session Overview
- Duration: 2 sessions of 45 minutes each
- Tools: Google Sheets (for initial manual
calculations and visualization), with an introduction to GRETL planned
for future sessions
- Objective: Guide students through the process of
manually calculating key regression estimators such as the slope (ˆβ2), intercept (ˆβ1), and residuals using
Google Sheets. This hands-on approach is designed to solidify their
understanding of the underlying principles of regression analysis,
enabling them to interpret the results effectively and apply these
concepts to real-world scenarios.
- Focus: The sessions emphasize practical engagement
with the weekly advertising expenditures and sales
dataset, including data entry, visualization of trends through
line and scatter plots, and manual calculation of regression parameters.
The aim is to ensure that students grasp both the computational and
interpretative aspects of econometric modeling.
Session 1: Data Entry and Visualization (45
minutes)
1.1 Introduction to the Lab Session (5
minutes)
- Objective: Set the stage for the lab session and
explain the importance of manual calculations.
- Key Points:
- Today’s focus is on manual calculations using
Google Sheets to understand the mechanics of regression analysis.
- In future sessions, we will transition to using
GRETL for more advanced analyses.
- Task 1: Sign in to your Google account and go to
Google Drive.
- Task 2: Create a new folder in Google Drive named
Applied Econometrics.
- Why?: This folder will store all the files and
datasets used in these lab sessions for easy access and
organization.
- Task 3: Save all files and datasets studied in
these lab sessions in the Applied Econometrics
folder.
1.2 Loading the Advertising-Sales Dataset (5
minutes)
- Objective: Load the dataset into Google Sheets and
explain its structure.
- Exercise: Use the Advertising-Sales
dataset.
- Step 1: Download the Advertising-Sales
dataset (provided below).
- Step 2: Import the dataset into Google Sheets.
- Instruction: Go to Google Sheets, click on
File > Import, and upload the dataset.
- Why?: Importing the dataset allows us to organize
and analyze the data in a structured format.
- Step 3: Explain the dataset.
- Dataset Description:
- The dataset contains 52 weekly observations of
Advertising Expenditure (in thousands of dollars) and
Weekly Sales (in thousands of units) for a retail
store.
- Variables:
- Week: The week number (1 to 52).
- Advertising Expenditure (x): The amount spent on
advertising each week (in $1,000s).
- Weekly Sales (y): The number of units sold each
week (in 1,000s).
- Discussion: What does this dataset represent? How
can we use it to analyze the relationship between advertising spending
and sales?
1.3 Data Exploration with Weekly Dataset (25
minutes)
- Objective: Familiarize students with the dataset
and perform basic data exploration.
- Exercise: Use the weekly
advertising and sales dataset.
- Step 1: Load the dataset into Google Sheets.
- Why?: To visualize and analyze the data, we need to
organize it in a structured format.
- Step 2: Create a line chart to
visualize advertising spending over the 52 weeks.
- Instruction: Plot Week on the horizontal axis and
Advertising Expenditure on the vertical axis.
- Why?: A line chart helps us observe trends in
advertising spending over time.
- Discussion: What trends do you observe? Is
advertising spending increasing, decreasing, or stable over time?
- Step 3: Create a line chart to
visualize weekly sales over the 52 weeks.
- Instruction: Plot Week on the horizontal axis and
Weekly Sales on the vertical axis.
- Why?: A line chart helps us observe trends in sales
over time.
- Discussion: What trends do you observe in sales?
Are sales increasing, decreasing, or stable over time? Do sales trends
align with advertising spending trends?
- Step 4: Create a scatter plot to
visualize weekly sales against advertising
expenditures.
- Instruction: Plot Weekly advertising
expenditures on the horizontal axis and Weekly
Sales on the vertical axis.
- Why?: A scatter plot helps us observe the
relationship between numerical variables.
- Discussion: What kind of relationship do you
observe between advertising expenditures and sales?
Session 2: Simple Linear Regression – Intuition and Model
Interpretation (45 minutes)
2.1 Refresher on Simple Linear Regression (10
minutes)
- Objective: Recap the intuition behind simple linear
regression and its components.
- Key Points:
- Regression Equation: yi=ˆβ1+ˆβ2xi+ei
- yi: Dependent variable (e.g.,
Weekly Sales)
- xi: Independent variable
(e.g., Advertising Expenditure)
- ˆβ1: Intercept (value
of y when x=0)
- ˆβ2: Slope (change in
y for a unit change in x)
- ei: Residual (error term, the
difference between actual and predicted values)
- Intuition: Regression helps us understand how
changes in advertising spending affect sales. ˆβ2 tells us the direction and
strength of this relationship.
2.2 Hands-On Calculation of ˆβ1 and ˆβ2 (35 minutes)
- Objective: Calculate the slope (ˆβ2) and intercept (ˆβ1) manually using Google
Sheets.
- Exercise: Use the weekly advertising and
sales dataset.
- Step 1: Compute (xi−ˉx) and (yi−ˉy) for each observation.
- Instruction: Create new columns in Google Sheets
for these calculations.
- Why?: These deviations from the mean help us
measure how far each observation is from the average, which is essential
for calculating ˆβ2.
- Step 2: Compute (xi−ˉx)(yi−ˉy) and (xi−ˉx)2.
- Instruction: Multiply and square the deviations in
new columns.
- Why?: These calculations are part of the formula
for ˆβ2. The numerator
captures the covariance between x
and y, while the denominator
captures the variance in x.
- Step 3: Sum the columns to get ∑(xi−ˉx)(yi−ˉy) and
∑(xi−ˉx)2.
- Instruction: Use the
SUM
function in
Google Sheets.
- Why?: These sums are used to calculate ˆβ2.
- Step 4: Calculate ˆβ2 using the formula. ˆβ2=∑(xi−ˉx)(yi−ˉy)∑(xi−ˉx)2
- Instruction: Divide the two sums to get ˆβ2.
- Why?: ˆβ2 tells us how much y changes for a unit change in x. For example, if ˆβ2=2.5, it means that for
every $1,000 increase in advertising spending, sales increase by 2,500
units.
- Step 5: Calculate ˆβ1 using the formula ˆβ1=ˉy−ˆβ2ˉx.
- Instruction: Use the previously computed means and
ˆβ2 to find ˆβ1.
- Why?: The intercept (ˆβ1) tells us the value of
y when x=0. It helps us anchor the regression
line.
- Step 6: Write the regression equation. ˆy=ˆβ1+ˆβ2x
- Instruction: Substitute the values of ˆβ1 and ˆβ2 into the equation.
- Why?: This equation allows us to predict y for any given x.
- Step 7: Predict sales for a week with
$8,000 spent on advertising.
- Instruction: Use the regression equation to make
predictions.
- Why?: Predictions help us understand how the model
can be applied in real-world scenarios. For example, a store manager can
use this model to predict sales for different levels of advertising
spending.
- Discussion: What do ˆβ1 and ˆβ2 tell us about the
relationship between advertising spending and sales? How accurate are
the predictions?
- Step 8: Objective: Demonstrate and
explain the properties of residuals in the context of the least squares
regression method. Verification of residual sums and averages.
- Instruction: Compute the sum of residuals (∑ei) and the average of residuals
(ˉe). Also, verify that the
average of the observed dependent variable (ˉy) equals the average of the
predicted values (ˉˆy).
- Calculation:
- Calculate ∑ei using the
formula: ∑ei=∑(yi−ˆyi)
- Compute ˉe using the
formula: ˉe=1n∑ei
- Verify ˉy=ˉˆy
by comparing ˉy and 1n∑ˆyi
- Why?:
- ∑ei=0:
This result emerges because the least squares method minimizes the sum
of the squared residuals, naturally leading to the residuals summing to
zero. This property ensures that the regression line correctly balances
the overestimations and underestimations of the dependent variable,
providing an unbiased estimate.
- ˉe=0:
The average of the residuals being zero further confirms that the model
does not systematically overpredict or underpredict the dependent
variable, which aligns with the unbiased nature of the least squares
estimators.
- ˉy=ˉˆy: This equality holds because the least
squares regression line passes through the centroid of the data points
(ˉx,ˉy), ensuring that
the average of the observed values equals the average of the predicted
values. This is a crucial validation of the model’s accuracy in
capturing the central tendency of the dependent variable.
Homework/Follow-Up
- Assignment: Students are required to replicate the
steps performed on the provided dataset for another set of data on
“Consumption and Income,” accessible here.
Submit all calculations in Excel format and interpretations of the
discussion questions in PDF format. The submission deadline is midnight
on March 23rd.
- Reading: Review the lecture slides on least squares
assumptions and properties for the next session.