The goal of the least squares method is to select the parameters ˆβ1 and ˆβ2 such that the sum of squared residuals (SSR) is minimized. The residuals (ei) are defined as:
ei=yi−ˆyi=yi−(ˆβ1+ˆβ2xi)
The sum of squared residuals (SSR) is:
SSR=n∑i=1e2i=n∑i=1(yi−(ˆβ1+ˆβ2xi))2
To minimize SSR, we take the partial derivatives of SSR with respect to ˆβ1 and ˆβ2, set them to zero, and solve for ˆβ1 and ˆβ2. This gives us the normal equations:
∂SSR∂ˆβ1=−2n∑i=1(yi−ˆβ1−ˆβ2xi)=0 ∂SSR∂ˆβ2=−2n∑i=1xi(yi−ˆβ1−ˆβ2xi)=0
Solving these equations yields the least squares estimators:
ˆβ1=ˉy−ˆβ2ˉx ˆβ2=∑(xi−ˉx)(yi−ˉy)∑(xi−ˉx)2
Here: - ˉy is the mean of y, - ˉx is the mean of x, - ˆβ1 is the intercept, - ˆβ2 is the slope.
Total variation is the sum of squared deviations of yi from the mean ˉy:
Total Variation=n∑i=1(yi−ˉy)2
Unexplained variation is the sum of squared residuals (ei):
Unexplained Variation=n∑i=1e2i=n∑i=1(yi−ˆyi)2
SUM
function in
Google Sheets.SUM
function in
Google Sheets.Instruction: Use a t-distribution table to find the critical value for a 95% confidence level with n−2 degrees of freedom.
Why?: If the t-statistic exceeds the critical value, we reject the null hypothesis (H0:β=0).
Rule-of-thumb for large n : reject H0 if tˆβ2<−2 or tˆβ2>2.