- An approach for predicting a response with only one feature.
- We assume that the two variables (dependent and independent) are LINEARLY related.
2024-10-18
\(h(x_i) = \beta_0 + \beta_1(x_i)\)
We want to find the \(\beta\) values (intercept \(\beta_0\) and slope \(\beta_1\)) that best fit the data.
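As a small illustration of the hypothesis function, here is a minimal sketch in R. The function name `h` mirrors the notation above; the coefficient values `beta0 = 1` and `beta1 = 2` and the inputs in `x_new` are hypothetical and chosen only for the example.

```r
# Hypothesis function for simple linear regression: h(x) = beta0 + beta1 * x.
# beta0 (intercept) and beta1 (slope) below are hypothetical values.
h <- function(x, beta0, beta1) {
  beta0 + beta1 * x
}

# Works on a whole vector of inputs at once
x_new <- c(0, 1, 2.5)
preds <- h(x_new, beta0 = 1, beta1 = 2)
print(preds)  # 1 3 6
```

Because R arithmetic is vectorized, the same function serves for a single \(x_i\) or for the full column of observations.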
In least squares, we form a design matrix \(X\) from the data and solve for the coefficients in closed form.
In other words, our goal is to calculate \(\beta = (X^TX)^{-1}X^Ty\), where:
\(X\) is the design matrix (with a column of 1s for the intercept and a column for the independent variable).
\(y\) is the vector of observed values (dependent variable).
\(\beta\) is the vector of coefficients (intercept and slope).
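For reference, this formula can be obtained by minimizing the sum of squared residuals. The symbol \(S(\beta)\) is introduced here only for the derivation; the rest uses the notation above.

\[
S(\beta) = \lVert y - X\beta \rVert^2 = (y - X\beta)^T (y - X\beta)
\]

\[
\nabla_\beta S(\beta) = -2X^T(y - X\beta) = 0
\quad\Longrightarrow\quad
X^TX\beta = X^Ty
\quad\Longrightarrow\quad
\beta = (X^TX)^{-1}X^Ty
\]

The last step requires \(X^TX\) to be invertible, which in the one-feature case holds as long as the \(x\) values are not all identical.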
# Step 1: Construct the matrix X (with a column of 1s for the intercept)
X <- cbind(1, x)  # (1st column: intercept, 2nd column: x values)

# Step 2: Compute X^T * X
XtX <- t(X) %*% X

# Step 3: Compute X^T * y
Xty <- t(X) %*% y

# Step 4: Compute (X^T * X)^(-1)
XtX_inv <- solve(XtX)

# Step 5: Compute the coefficients beta = (X^T * X)^(-1) * X^T * y
beta <- XtX_inv %*% Xty
## Coefficients (Intercept and Slope):
##        [,1]
##   0.8487934
## x 1.7401009
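One way to sanity-check the normal-equation result is to compare it with R's built-in `lm()`. The sketch below uses synthetic data (an assumption, since the original `x` and `y` are not shown here), so its coefficients will not match the output above. It also shows `solve(A, b)`, which solves the linear system directly and is usually preferred numerically over forming the inverse.

```r
# Synthetic data (an assumption; not the data used above)
set.seed(42)
x <- runif(50, min = 0, max = 10)
y <- 1 + 2 * x + rnorm(50, sd = 2)

# Normal-equation estimate, following the same steps as above
X <- cbind(1, x)
beta <- solve(t(X) %*% X) %*% t(X) %*% y

# Same estimate without forming an explicit inverse:
# solve(A, b) solves A %*% beta = b directly
beta_solve <- solve(crossprod(X), crossprod(X, y))

# Reference fit from R's built-in linear model
fit <- lm(y ~ x)

# All three estimates should agree up to floating-point error
agree_lm <- isTRUE(all.equal(as.vector(beta), unname(coef(fit))))
agree_solve <- isTRUE(all.equal(as.vector(beta), as.vector(beta_solve)))
print(c(agree_lm = agree_lm, agree_solve = agree_solve))
```

Here `crossprod(X)` computes \(X^TX\) and `crossprod(X, y)` computes \(X^Ty\).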
# calculates MSE
calculate_mse <- function(actual, predicted) {
  mean((actual - predicted)^2)
}

# calculates MAE
calculate_mae <- function(actual, predicted) {
  mean(abs(actual - predicted))
}

mse <- calculate_mse(df$y, df$predicted)
mae <- calculate_mae(df$y, df$predicted)
## [1] "Mean Squared Error (MSE): 4.2103"
## [1] "Mean Absolute Error (MAE): 1.7082"
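To show how the predictions feed into these error metrics end to end, here is a self-contained sketch. The data frame `df` below is synthetic and the way `df$predicted` is filled in is an assumption, since the original construction of `df` is not shown; the numbers it produces are unrelated to the MSE and MAE reported above.

```r
# Synthetic data frame (an assumption; the original df is not shown)
set.seed(1)
df <- data.frame(x = 1:20)
df$y <- 1 + 2 * df$x + rnorm(20)

# Fit by the normal equation, then predict: y_hat = X %*% beta
X <- cbind(1, df$x)
beta <- solve(crossprod(X), crossprod(X, df$y))
df$predicted <- as.vector(X %*% beta)

# Error metrics computed from the residuals
residuals <- df$y - df$predicted
mse <- mean(residuals^2)
mae <- mean(abs(residuals))
rmse <- sqrt(mse)  # same units as y, unlike MSE

print(paste("Mean Squared Error (MSE):", round(mse, 4)))
print(paste("Mean Absolute Error (MAE):", round(mae, 4)))
```

MSE penalizes large errors more heavily because residuals are squared, while MAE weights all errors linearly; MAE is never larger than the square root of MSE.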