The goal of the least squares method is to select the parameters \(\hat{\beta}_1\) and \(\hat{\beta}_2\) such that the sum of squared residuals (SSR) is minimized. The residuals (\(e_i\)) are defined as:
\[ e_i = y_i - \hat{y}_i = y_i - (\hat{\beta}_1 + \hat{\beta}_2 x_i) \]
The sum of squared residuals (SSR) is:
\[ SSR = \sum_{i=1}^n e_i^2 = \sum_{i=1}^n \left( y_i - (\hat{\beta}_1 + \hat{\beta}_2 x_i) \right)^2 \]
To minimize \(SSR\), we take the partial derivatives of \(SSR\) with respect to \(\hat{\beta}_1\) and \(\hat{\beta}_2\), set them to zero, and solve for \(\hat{\beta}_1\) and \(\hat{\beta}_2\). This gives us the normal equations:
\[ \frac{\partial SSR}{\partial \hat{\beta}_1} = -2 \sum_{i=1}^n (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i) = 0 \] \[ \frac{\partial SSR}{\partial \hat{\beta}_2} = -2 \sum_{i=1}^n x_i (y_i - \hat{\beta}_1 - \hat{\beta}_2 x_i) = 0 \]
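As an intermediate step, sketched here to make the algebra explicit, dividing each normal equation by \(-2\) and separating the sums gives a linear system in \(\hat{\beta}_1\) and \(\hat{\beta}_2\):

\[ \sum_{i=1}^n y_i = n \hat{\beta}_1 + \hat{\beta}_2 \sum_{i=1}^n x_i \]
\[ \sum_{i=1}^n x_i y_i = \hat{\beta}_1 \sum_{i=1}^n x_i + \hat{\beta}_2 \sum_{i=1}^n x_i^2 \]

Dividing the first equation by \(n\) gives \(\bar{y} = \hat{\beta}_1 + \hat{\beta}_2 \bar{x}\), so the fitted line passes through \((\bar{x}, \bar{y})\). Substituting \(\hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x}\) into the second equation leaves a single equation in \(\hat{\beta}_2\).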
Solving these equations yields the least squares estimators:
\[ \hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x} \] \[ \hat{\beta}_2 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} \]
Here:
- \(\bar{y}\) is the mean of \(y\),
- \(\bar{x}\) is the mean of \(x\),
- \(\hat{\beta}_1\) is the intercept,
- \(\hat{\beta}_2\) is the slope.
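The closed-form estimators can be checked numerically. The following is a minimal sketch in plain Python; the function name `fit_simple_ols` and the four data points are made up for illustration (they lie exactly on \(y = 2 + 3x\)) and are not from the course dataset.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) from the closed-form least squares formulas."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of the slope formula
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data lying exactly on y = 2 + 3x
x = [1.0, 2.0, 3.0, 4.0]
y = [5.0, 8.0, 11.0, 14.0]
beta1_hat, beta2_hat = fit_simple_ols(x, y)
print(beta1_hat, beta2_hat)
```

Because the points fall exactly on a line, the estimates should recover the intercept 2 and slope 3.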
Total variation is the sum of squared deviations of \(y_i\) from the mean \(\bar{y}\):
\[ \text{Total Variation} = \sum_{i=1}^n (y_i - \bar{y})^2 \]
Unexplained variation is the sum of squared residuals (\(e_i\)):
\[ \text{Unexplained Variation} = \sum_{i=1}^n e_i^2 = \sum_{i=1}^n \left( y_i - \hat{y}_i \right)^2 \]
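The two quantities above can be computed directly from the data once the line is fitted. This is a small sketch under stated assumptions: the helper name `variation_components` and the data values are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def variation_components(x, y):
    """Return (total_variation, unexplained_variation) for a simple OLS fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]
    # Squared deviations of y from its mean
    total = sum((yi - y_bar) ** 2 for yi in y)
    # Squared residuals (y minus fitted value)
    unexplained = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return total, unexplained


# Hypothetical data (not from the course dataset)
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 8.0]
total, unexplained = variation_components(x, y)
print(round(total, 4), round(unexplained, 4))
```

For these numbers the total variation is 18.75 and the unexplained variation is about 0.70, so the fitted line leaves only a small share of the variation in \(y\) unexplained.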
Objective: Test whether advertising spending has a
statistically significant effect on sales.
Rationale: Even if \(\hat{\beta}_2 = 3\) (each additional $1,000 spent on
advertising goes with $3,000 more in sales), the estimate could be due to random
sampling error. Hypothesis testing tells us whether the data provide enough
evidence that the true population slope \(\beta_2\) is different from zero.
Step 0: State the Null and Alternative Hypotheses
Null hypothesis (\(H_0\)): \(\beta_2 = 0\)
Meaning: There is no linear relationship
between advertising and sales. Any observed slope is purely by
chance.
Alternative hypothesis (\(H_1\)): \(\beta_2 \neq 0\) (two‑tailed test)
Meaning: There is a linear relationship
(positive or negative).
Why two‑tailed? We don’t assume direction – advertising could
theoretically hurt sales (negative slope) or help (positive
slope).
Real‑world interpretation:
- \(H_0\): “Our advertising campaign has no impact on sales.”
- \(H_1\): “Advertising does affect sales.”
Step 1: Compute the Standard Error of the Slope
From the variance formula:
\[ \sigma_{\hat{\beta}_2}^2 = \frac{s^2}{\sum (x_i - \bar{x})^2} \] \[ s_{\hat{\beta}_2} = \sqrt{\sigma_{\hat{\beta}_2}^2} \]
\(s_{\hat{\beta}_2}\) measures
the typical sampling variability of \(\hat{\beta}_2\).
A smaller \(s_{\hat{\beta}_2}\) means the estimate is more precise.
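The standard error can be computed from the data as in the sketch below. This section does not define \(s^2\), so the code assumes the usual estimator \(s^2 = SSR / (n - 2)\); the function name and the data are hypothetical.

```python
import math


def slope_standard_error(x, y):
    """Standard error of the OLS slope, assuming s^2 = SSR / (n - 2)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    ssr = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s_squared = ssr / (n - 2)  # assumed definition of s^2
    return math.sqrt(s_squared / sxx)


# Hypothetical data (not from the course dataset)
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 8.0]
se_beta2 = slope_standard_error(x, y)
print(round(se_beta2, 4))  # approximately 0.2646
```

Note that the denominator \(\sum (x_i - \bar{x})^2\) grows as the \(x\) values spread out, which is why more varied advertising budgets give a more precise slope estimate.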
Step 2: Calculate the t‑Statistic
\[ t_{\hat{\beta}_2} = \frac{\hat{\beta}_2 - 0}{s_{\hat{\beta}_2}} = \frac{\hat{\beta}_2}{s_{\hat{\beta}_2}} \]
This tells us how many standard errors \(\hat{\beta}_2\) is away from the null
hypothesis value (zero).
Large absolute t‑statistic → evidence against \(H_0\).
Step 3: Compare to Critical Value or Use Rule‑of‑Thumb
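Degrees of freedom (df) = \(n - 2\) (we estimated two parameters: intercept and slope).
Reject \(H_0\) if \(|t_{\hat{\beta}_2}| > t_{\alpha/2, df}\). As a rule of thumb, the critical value is about 2 when \(\alpha = 0.05\) and the sample is not very small.
The corresponding \(100(1-\alpha)\%\) confidence interval for the slope is:
\[ \hat{\beta}_2 \pm t_{\alpha/2, df} \cdot s_{\hat{\beta}_2} \]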
Given:
- \(\hat{\beta}_2 = 3.0\)
- \(s_{\hat{\beta}_2} = 0.5\)
- \(df = 23\)
t‑statistic:
\[
t = \frac{3.0}{0.5} = 6.0
\]
Critical value (from t‑table, \(\alpha=0.05\), two‑tailed, df=23): about 2.069.
Since \(|t| = 6.0 > 2.069\), we reject \(H_0\).
Conclusion: There is strong statistical evidence that advertising spending affects sales (the true slope is not zero).
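The decision rule in this example can be written as a short function. This is only a sketch: the name `slope_t_test` is hypothetical, and the critical value 2.069 is taken from the t‑table figure quoted above rather than computed.

```python
def slope_t_test(beta2_hat, se_beta2, t_crit):
    """Two-tailed t-test of H0: beta2 = 0. Returns (t_statistic, reject_h0)."""
    if se_beta2 <= 0:
        raise ValueError("standard error must be positive")
    t_stat = beta2_hat / se_beta2
    # Two-tailed: compare the absolute value with the critical value
    reject_h0 = abs(t_stat) > t_crit
    return t_stat, reject_h0


# Numbers from the worked example; 2.069 is the table value for df = 23
t_stat, reject_h0 = slope_t_test(beta2_hat=3.0, se_beta2=0.5, t_crit=2.069)
print(t_stat, reject_h0)  # 6.0 True
```

Using `abs(t_stat)` means a strongly negative slope would be flagged as significant too, which matches the two‑tailed alternative \(\beta_2 \neq 0\).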
95% Confidence Interval:
\[
3.0 \pm 2.069 \times 0.5 = 3.0 \pm 1.0345 \quad \Rightarrow \quad
(1.9655,\; 4.0345)
\]
We are 95% confident that each additional $1,000 on advertising
increases sales by between $1,965 and $4,035.
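The interval arithmetic can be reproduced in a few lines. This is a sketch with a hypothetical function name; the critical value is again the table figure from the example, supplied by hand.

```python
def slope_confidence_interval(beta2_hat, se_beta2, t_crit):
    """Return (lower, upper) bounds of beta2_hat +/- t_crit * se_beta2."""
    margin = t_crit * se_beta2
    return beta2_hat - margin, beta2_hat + margin


# Numbers from the worked example (df = 23, alpha = 0.05, two-tailed)
lower, upper = slope_confidence_interval(beta2_hat=3.0, se_beta2=0.5, t_crit=2.069)
print(round(lower, 4), round(upper, 4))  # 1.9655 4.0345
```

The interval does not contain zero, which agrees with rejecting \(H_0\) at the same \(\alpha\): a two‑tailed test at level \(\alpha\) rejects exactly when zero lies outside the \(100(1-\alpha)\%\) interval.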
What if the t‑statistic were 1.5 with \(n=25\)?
→ With \(df = 23\) the critical value is about 2.069, and \(1.5 < 2.069\),
so we fail to reject \(H_0\); we cannot
conclude advertising matters (the observed slope might be due to
chance).
Why use a two‑tailed test?
→ Because we don’t know whether advertising could backfire (negative
slope). A one‑tailed test would only look for a positive effect, which
is less conservative.
How does the sample size affect the test?
Practical significance vs. statistical significance
| Step | Action | Formula / Tool |
|---|---|---|
| 1 | State \(H_0\) and \(H_1\) | \(H_0: \beta_2 = 0\); \(H_1: \beta_2 \neq 0\) |
| 2 | Choose significance level (\(\alpha\)) | Usually 0.05 (95% confidence) |
| 3 | Compute \(s_{\hat{\beta}_2}\) | \(s_{\hat{\beta}_2} = \sqrt{ s^2 / \sum (x_i - \bar{x})^2 }\) |
| 4 | Calculate t‑statistic | \(t = \hat{\beta}_2 / s_{\hat{\beta}_2}\) |
| 5 | Find critical value \(t_{\alpha/2, n-2}\) | t‑table or rule‑of‑thumb (\(\approx 2\)) |
| 6 | Compare | Reject \(H_0\) if \(\lvert t \rvert > t_{\text{crit}}\) |
| 7 | (Optional) Build confidence interval | \(\hat{\beta}_2 \pm t_{\text{crit}} \cdot s_{\hat{\beta}_2}\) |
| 8 | Interpret in context | “Advertising has a significant positive effect on sales.” |
Project Reminder: For your dataset, apply these same
hypothesis testing steps.
- Write the null and alternative hypotheses in words and symbols.
- Compute the t‑statistic and state whether you reject \(H_0\) at \(\alpha = 0.05\).
- Provide a business interpretation.