To better understand the model \(Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i,\) \(i=1,\cdots,n,\) we will now work through an example with some data.

Below are some given data for height (cm) and weight (kg):

height = c(180.7, 171.9, 173.4, 188, 178.5, 185, 186.5)
weight = c(80.4, 73.4, 63.4, 88.5, 81.4, 74.6, 87.8)

You should be familiar with the basic equation for a line, \(y=mx+b\); however, you are likely unfamiliar with the \(\varepsilon\) (error) term.

In Simple Linear Regression (SLR), there are three important assumptions, which we will cover now:

  1. \(\varepsilon_1, \varepsilon_2,\cdots, \varepsilon_n\) are independent.

  2. \(E(\varepsilon_i)=0\).

  3. \(Var(\varepsilon_i)=\sigma^2\) (an unknown variance, common to all errors).

When we additionally assume that each error term follows a Normal distribution with mean 0 and variance \(\sigma^2\), the errors are described as i.i.d. (independent and identically distributed): \(\varepsilon_i \stackrel{i.i.d.}{\sim} N(0, \sigma^2)\). Sometimes we only require the first two assumptions.
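To build intuition for this error model, here is a minimal simulation sketch in R; the value sigma = 5 is an arbitrary assumption, since \(\sigma^2\) is unknown in practice:

set.seed(1)                           # for reproducibility
sigma <- 5                            # assumed value; sigma^2 is unknown in practice
eps <- rnorm(7, mean = 0, sd = sigma) # seven i.i.d. draws from N(0, sigma^2)
mean(eps)                             # sample mean should be near 0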


Given the heights and weights above, we are also given this equation:

\(weight = -67 + 0.8 \times height + \varepsilon\)


If \(E(\varepsilon_i)=0\), then it also follows that \(E(Y_i) = \beta_0 + \beta_1 X_i\) (\(\beta_0\), \(\beta_1\), and \(X_i\) are all constants, and the expected value of a constant is itself).
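Written out step by step, using linearity of expectation:

\(E(Y_i) = E(\beta_0 + \beta_1 X_i + \varepsilon_i) = \beta_0 + \beta_1 X_i + E(\varepsilon_i) = \beta_0 + \beta_1 X_i + 0 = \beta_0 + \beta_1 X_i\)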

We can then solve for \(\varepsilon\) and \(E(Y_i)\) with the following equations:

\(\varepsilon = weight + 67 - 0.8 \times height\) and,

\(E(Y_i) = -67 + 0.8 \times height\)

slr = function(weight, height) {
  error <- c()          # initialize vector of epsilon_i values
  expected_value <- c() # initialize vector of E(Y_i) values
  
  for(i in 1:length(weight)) {
    error_i <- weight[i] + 67 - 0.8*height[i] # solve for epsilon
    error <- c(error, error_i)
    
    expected_value_i <- -67 + 0.8*height[i]   # solve for E(Y)
    expected_value <- c(expected_value,
                        expected_value_i)
  }
  return(list(error, expected_value))
}
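Since R arithmetic is vectorized, the same two quantities can also be computed without a loop; this is an equivalent sketch that operates on the whole vectors at once:

error <- weight + 67 - 0.8*height  # all epsilon_i at once
expected_value <- -67 + 0.8*height # all E(Y_i) at once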

Below are the error terms, \(\varepsilon_i,\) for \(i=1,\cdots,7\):

slr(weight, height)[1]
## [[1]]
## [1]  2.84  2.88 -8.32  5.10  5.60 -6.40  5.60

Below are the expected values of \(Y_i\), \(E(Y_i),\) for \(i=1,\cdots,7\):

slr(weight, height)[2]
## [[1]]
## [1] 77.56 70.52 71.72 83.40 75.80 81.00 82.20
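As a sanity check, the model decomposition \(Y_i = E(Y_i) + \varepsilon_i\) means the two vectors above should add back up to the observed weights; a minimal check using the slr() function defined above:

results <- slr(weight, height)
results[[1]] + results[[2]] # should reproduce weight: 80.4 73.4 63.4 88.5 81.4 74.6 87.8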

Below is a plot of the data, along with the line from the equation given above:

plot(height, weight,
     main = 'Weight (kg) vs. Height (cm) (with regression line)',
     xlab = 'Height (cm)', ylab = 'Weight (kg)')
abline(-67, 0.8)

The \(\varepsilon_i\)'s can then be understood as the vertical distance between each subject's observed weight, \(Y_i\), and the line \(weight = -67 + 0.8 \times height\).
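To visualize this, the vertical distances can be added to the plot with base R's segments(); a brief sketch, assuming the plot() and abline() calls above have already been run:

fitted <- -67 + 0.8*height                         # points on the given line
segments(height, weight, height, fitted, lty = 2)  # dashed verticals are the epsilon_i's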


We are now going to go over the assumptions again; this time, they will be stated slightly differently.

  1. Linearity: \(E(Y_i)=\beta_0+\beta_1X_i\), and the observations are independent.
  2. Equal variance: \(Var(\varepsilon_i)=\sigma^2\); this is important for the least squares formulation.
  3. Normality: \(\varepsilon_i \sim N(0, \sigma^2)\); that is, in addition to having mean \(E(\varepsilon_i)=0\) and common variance \(\sigma^2\), each \(\varepsilon_i\) is Normally distributed.
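One informal way to assess the Normality assumption is a Normal Q-Q plot of the errors computed earlier; with only seven observations this is a rough sketch rather than a formal test:

errors <- slr(weight, height)[[1]]
qqnorm(errors) # points near a straight line are consistent with Normality
qqline(errors)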