To better understand the model \(Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i,\) \(i=1,\cdots,n,\) we will now work through an example with some data.
Below are some given data for height (in cm) and weight (in kg):
height = c(180.7, 171.9, 173.4, 188, 178.5, 185, 186.5)
weight = c(80.4, 73.4, 63.4, 88.5, 81.4, 74.6, 87.8)
You should be familiar with the basic equation of a line, \(y=mx+b\); however, you are likely unfamiliar with the \(\varepsilon\) (error) term.
In Simple Linear Regression (SLR), there are three important assumptions, which we will cover now:
1. \(\varepsilon_1, \varepsilon_2,\cdots, \varepsilon_n\) are independent.
2. \(E[\varepsilon_i]=0\)
3. \(Var(\varepsilon_i)=\sigma^2\) (an unknown variance)
Note that these three assumptions by themselves say nothing about the shape of the error distribution. When we additionally assume Normality, the error terms each follow a Normal distribution with mean 0 and variance \(\sigma^2\), and they are then described as i.i.d. (independent and identically distributed): \(\varepsilon_i \stackrel{i.i.d.}{\sim} N(0, \sigma^2)\). Sometimes we only require the first two assumptions.
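To build intuition for i.i.d. Normal errors, here is a minimal R sketch (not part of the example data; the sample size 7 matches the data above, but sigma = 5 is a hypothetical choice for illustration):

set.seed(42) # for reproducibility
eps <- rnorm(7, mean = 0, sd = 5) # draw epsilon_i i.i.d. from N(0, 5^2)
mean(eps) # sample mean; close to E[epsilon_i] = 0
var(eps) # sample variance; estimates sigma^2 = 25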
Given the heights and weights above, we are also given this equation:
\(weight = -67 + 0.8 \times height + \varepsilon\)
If \(E(\varepsilon_i)=0\), then it also follows that \(E(Y_i) = E(\beta_0 + \beta_1 X_i + \varepsilon_i) = \beta_0 + \beta_1 X_i + E(\varepsilon_i) = \beta_0 + \beta_1 X_i\), since \(\beta_0\), \(\beta_1\), and \(X_i\) are all constants and the expected value of a constant is itself.
We can then solve for \(\varepsilon\) and \(E(Y_i)\) with the following equations:
\(\varepsilon = weight + 67 - 0.8 \times height\) and,
\(E(Y_i) = -67 + 0.8 \times height\)
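For example, the first subject has height 180.7 and weight 80.4, so \(\varepsilon_1 = 80.4 + 67 - 0.8 \times 180.7 = 2.84\) and \(E(Y_1) = -67 + 0.8 \times 180.7 = 77.56\). The function below repeats this calculation for every subject: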
slr = function(weight, height) {
  error <- c()          # initialize vectors to store results
  expected_value <- c()
  for (i in 1:length(weight)) {
    error_i <- weight[i] + 67 - 0.8 * height[i]  # solve for epsilon_i
    error <- c(error, error_i)
    expected_value_i <- -67 + 0.8 * height[i]    # solve for E(Y_i)
    expected_value <- c(expected_value,
                        expected_value_i)
  }
  return(list(error, expected_value))
}
Below are the error terms, \(\varepsilon_i\), for \(i=1,\cdots,7\):
slr(weight, height)[1]
## [[1]]
## [1] 2.84 2.88 -8.32 5.10 5.60 -6.40 5.60
Below are the expected values of \(Y_i\), \(E(Y_i)\), for \(i=1,\cdots,7\):
slr(weight, height)[2]
## [[1]]
## [1] 77.56 70.52 71.72 83.40 75.80 81.00 82.20
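As an aside, because R arithmetic is vectorized, the same two quantities can be computed without a loop; this sketch is an equivalent restatement of slr() above:

error <- weight + 67 - 0.8 * height # epsilon_i for every subject at once
expected_value <- -67 + 0.8 * height # E(Y_i) for every subject at once
round(error, 2)
round(expected_value, 2)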
Below is a plot of the data, along with the line whose equation was given above:
plot(height, weight, main = 'Weight (kg) vs. Height (cm) (with regression line)', xlab = 'Height (cm)', ylab = 'Weight (kg)')
abline(-67, 0.8)
The \(\varepsilon_i\)'s can then be understood as the vertical difference between each subject's observed weight, \(Y_i\), and the value of the line \(weight = -67 + 0.8 \times height\) at that subject's height, \(X_i\).
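One way to see this is to draw each \(\varepsilon_i\) as a dashed vertical segment; this sketch assumes it is run immediately after the plot() and abline() calls above, while that plot is still active:

yhat <- -67 + 0.8 * height # the line's value at each subject's height
segments(height, weight, height, yhat, lty = 2) # the vertical gaps are the epsilon_i's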
We are now going to quickly go over the assumptions again; however, this time they will be stated slightly differently.