2025-04-13

How Are Regression Coefficients Computed?

The data is put in matrix form:

\[ \tiny\begin{bmatrix} y_{1}\\ y_{2}\\ y_{3} \end{bmatrix}_{Y}=\begin{bmatrix} 1 & x_{1} \\ 1 & x_{2} \\ 1 & x_{3} \end{bmatrix}_{X}\begin{bmatrix} \beta_{0} \\ \beta_{1} \end{bmatrix}_{\beta}+\begin{bmatrix} e_{1} \\ e_{2} \\ e_{3} \end{bmatrix}_{e} \]

where \(\beta\) contains the coefficients and \(e\) contains the errors between the observed points and the model. The coefficients that minimize the sum of squared errors \(e^{T}e\) can be found by:

\[ \small\beta=(X^{T}X)^{-1}X^{T}Y \]

The equation for the fit is given by:

\[ \small y=\beta_{0}+\beta_{1}x \]
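As a quick check of the formula above, the following minimal sketch applies it to a tiny data set and compares the result with `lm()`. The five data points are made up for illustration and are not part of the original analysis.

```r
# Tiny illustrative data set (values are made up for this example)
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Design matrix: a column of ones for the intercept, then x
X <- cbind(1, x)
Y <- matrix(y, ncol = 1)

# Normal equations: beta = (X^T X)^{-1} X^T Y
beta <- solve(t(X) %*% X) %*% t(X) %*% Y

# Reference coefficients from lm() for comparison
ref <- coef(lm(y ~ x))

print(drop(beta))
print(ref)
```

Both printouts should show the same intercept and slope, up to floating-point rounding.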

Generating Sample Data

x = 1:1000
y = 117 + (3.4 * x) + rnorm(1000,mean=0,sd=250)
data = data.frame(x,y)

Timing lm() Function

# Initialize timing vector
times = rep(0, 1000)

# Time execution 1000 times and take mean (in seconds)
for (i in seq_along(times)) {
  startTime = Sys.time()

  model = lm(y ~ x, data = data)

  endTime = Sys.time()
  # Force the difference into seconds; rounding to 3 digits gives millisecond resolution
  totalTime = round(as.numeric(difftime(endTime, startTime, units = "secs")), 3)
  times[i] = totalTime
}
mean(times)
## [1] 0.000154
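Because a single call finishes in well under a millisecond, per-call timestamps sit close to the clock's resolution. One alternative, sketched here as an assumption rather than the method used above, is to time a whole batch with `system.time()` and divide by the number of repetitions. The seed and the repetition count `reps` are illustrative choices.

```r
# Recreate sample data of the same shape (seed is an assumption for reproducibility)
set.seed(1)
x <- 1:1000
y <- 117 + 3.4 * x + rnorm(1000, mean = 0, sd = 250)
data <- data.frame(x, y)

# Time a batch of fits as one unit, then average per call
reps <- 200
elapsed <- system.time(
  for (i in seq_len(reps)) lm(y ~ x, data = data)
)[["elapsed"]]

perCall <- elapsed / reps  # mean seconds per lm() call
print(perCall)
```

The value depends on the machine, so no expected output is shown.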

Residuals
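A minimal sketch of how the residuals of the fitted line could be computed by hand, assuming sample data generated as above (the seed is an assumption added for reproducibility). Each residual is the observed value minus the fitted value. For a least-squares fit with an intercept, the residuals sum to zero and are orthogonal to `x`, up to rounding.

```r
# Sample data of the same shape as above (seed is an assumption)
set.seed(1)
x <- 1:1000
y <- 117 + 3.4 * x + rnorm(1000, mean = 0, sd = 250)

model <- lm(y ~ x)
b <- unname(coef(model))   # b[1] = intercept, b[2] = slope

# Residual = observed value minus fitted value
res <- y - (b[1] + b[2] * x)

# Relative size of the two sums that least squares drives to zero
sumCheck  <- abs(sum(res)) / sum(abs(res))
orthCheck <- abs(sum(res * x)) / sum(abs(res * x))
print(c(sumCheck, orthCheck))
```

Both printed ratios should be very close to zero.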

Creating Manual Linear Regression

linRegression = function(x,y) {
  #create matrices
  Y = matrix(y, ncol = 1)
  X = cbind(rep(1, length(x)), x)
  
  #calculate coefficients
  B = solve(t(X) %*% X) %*% t(X) %*% Y
  
  return(B)
}
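The function above forms the inverse of \(X^{T}X\) explicitly. A common alternative, shown here as a sketch and not as the method used in this post, is to solve the linear system \((X^{T}X)\beta=X^{T}Y\) directly with `solve(A, b)` and to build the products with `crossprod()`. The name `linRegressionSolve` and the seed are hypothetical choices for this illustration.

```r
# Hypothetical variant: solve (X^T X) beta = X^T y without forming the inverse
linRegressionSolve <- function(x, y) {
  X <- cbind(1, x)                       # intercept column plus predictor
  solve(crossprod(X), crossprod(X, y))   # crossprod(X) is t(X) %*% X
}

# Sample data of the same shape as above (seed is an assumption)
set.seed(1)
x <- 1:1000
y <- 117 + 3.4 * x + rnorm(1000, mean = 0, sd = 250)

B <- linRegressionSolve(x, y)
ref <- coef(lm(y ~ x))

print(drop(B))
print(ref)
```

The two sets of coefficients should agree up to floating-point rounding.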

Timing linRegression() Function

# Initialize timing vector
times = rep(0, 1000)

# Time execution 1000 times and take mean (in seconds)
for (i in seq_along(times)) {
  startTime = Sys.time()

  newModel = linRegression(x, y)

  endTime = Sys.time()
  # Force the difference into seconds; rounding to 3 digits gives millisecond resolution
  totalTime = round(as.numeric(difftime(endTime, startTime, units = "secs")), 3)
  times[i] = totalTime
}
mean(times)
## [1] 1e-05

How Are Coefficients Computed in Higher Dimensions?

We can simply add columns to \(X\) for new independent variables and replace \(Y\) with our dependent variable. For a function \(z=f(x,y)\) our matrices become:

\[ \tiny\begin{bmatrix} z_{1}\\ z_{2}\\ z_{3} \end{bmatrix}_{Z}=\begin{bmatrix} 1 & x_{1} & y_{1}\\ 1 & x_{2} & y_{2}\\ 1 & x_{3} & y_{3} \end{bmatrix}_{X}\begin{bmatrix} \beta_{0} \\ \beta_{1} \\ \beta_{2} \end{bmatrix}_{\beta}+\begin{bmatrix} e_{1} \\ e_{2} \\ e_{3} \end{bmatrix}_{e} \]

with \(\beta\) found by:

\[ \small\beta=(X^{T}X)^{-1}X^{T}Z \]

and a fit equation of the form:

\[ \small z=\beta_{0}+\beta_{1}x+\beta_{2}y \]
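To illustrate the two-predictor case, the sketch below builds the three-column design matrix, applies the same formula, and compares the result with `lm(z ~ x + y)`. The seed, the sample size `n`, and the generating coefficients are assumptions chosen for this example.

```r
# Illustrative data with two predictors (seed and coefficients are assumptions)
set.seed(1)
n <- 1000
x <- sample(1:1000, size = n)
y <- sample(1:1000, size = n)
z <- 138 + 5.6 * x - 2.4 * y + rnorm(n, mean = 0, sd = 250)

# Design matrix: intercept column, then one column per predictor
X <- cbind(1, x, y)
Z <- matrix(z, ncol = 1)

# Same normal-equations formula, now yielding three coefficients
beta <- solve(t(X) %*% X) %*% t(X) %*% Z

# Reference coefficients from lm() for comparison
ref <- coef(lm(z ~ x + y))

print(drop(beta))
print(ref)
```

The only change from the single-predictor case is the extra column in `X`; the formula itself is unchanged.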

Creating 3D Manual Linear Regression

linRegression3D = function(x,y,z) {
  #create matrices
  Z = matrix(z, ncol = 1)
  X = cbind(rep(1, length(x)), x, y)
  
  #calculate coefficients
  B = solve(t(X) %*% X) %*% t(X) %*% Z
  
  return(B)
}

Generating Sample Data

x = sample(c(1:1000), size = 1000)
y = sample(c(1:1000), size = 1000)
z = 138 + 5.6 * x - 2.4 * y + rnorm(1000,mean=0,sd=250)

Timing linRegression3D() Function

# Initialize timing vector
times = rep(0, 1000)

# Time execution 1000 times and take mean (in seconds)
for (i in seq_along(times)) {
  startTime = Sys.time()

  newModel3D = linRegression3D(x, y, z)

  endTime = Sys.time()
  # Force the difference into seconds; rounding to 5 digits gives 10-microsecond resolution
  totalTime = round(as.numeric(difftime(endTime, startTime, units = "secs")), 5)
  times[i] = totalTime
}
mean(times)
## [1] 7.699e-05