1 Introduction

Multiple linear regression is one of the statistical inference methods used to model the linear relationship between one response (dependent) variable and two or more predictor (independent) variables. This method is widely used in various fields, including economics, to explain how an economic variable is influenced by several factors simultaneously, as well as to perform prediction.

In general, the multiple linear regression model with \(p\) predictor variables is written as:

\[Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \varepsilon_i, \quad i = 1, 2, \dots, n\]

where:

\(Y_i\) : the value of the response variable for observation \(i\)
\(X_{1i}, X_{2i}, X_{3i}\) : the values of the predictor variables for observation \(i\)
\(\beta_0, \beta_1, \beta_2, \beta_3\) : regression parameters to be estimated
\(\varepsilon_i\) : random error, assumed \(\varepsilon_i \sim N(0, \sigma^2)\) and mutually independent

In matrix notation, the model above can be written as:

\[\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}\]

where \(\mathbf{Y}\) has dimension \(n \times 1\), \(\mathbf{X}\) has dimension \(n \times (p+1)\) (the design matrix containing a column of 1’s for the intercept and \(p\) predictor variables), \(\boldsymbol{\beta}\) has dimension \((p+1) \times 1\), and \(\boldsymbol{\varepsilon}\) has dimension \(n \times 1\).

In this analysis, the parameter \(\boldsymbol{\beta}\) will be estimated manually using basic matrix operations in R (without using lm() or additional packages such as car, lmtest, or MASS), using two approaches: Ordinary Least Squares (OLS) and Maximum Likelihood Estimation (MLE). All functions used are base R functions such as solve(), t(), %*%, optim(), and functions from the built-in stats package.

2 Theoretical Background

2.1 Classical Linear Regression Assumptions

The classical linear regression model requires several assumptions for the resulting estimators to be BLUE (Best Linear Unbiased Estimator):

The relationship between \(Y\) and \(X\) is linear in the parameters.
The error has zero mean, \(E(\varepsilon_i) = 0\).
The error is homoscedastic, \(\text{Var}(\varepsilon_i) = \sigma^2\) (constant).
The errors are uncorrelated (no autocorrelation), \(\text{Cov}(\varepsilon_i, \varepsilon_j) = 0\) for \(i \neq j\).
The error is normally distributed, \(\varepsilon_i \sim N(0, \sigma^2)\) (required for inference/hypothesis testing).
There is no perfect multicollinearity among the predictor variables.

2.2 Ordinary Least Squares (OLS) Approach

The principle of OLS is to minimize the sum of squared errors (SSE):

\[SSE = \sum_{i=1}^{n} \varepsilon_i^2 = \boldsymbol{\varepsilon}^T \boldsymbol{\varepsilon} = (\mathbf{Y} - \mathbf{X}\boldsymbol{\beta})^T(\mathbf{Y} - \mathbf{X}\boldsymbol{\beta})\]

By differentiating \(SSE\) with respect to \(\boldsymbol{\beta}\) and setting it equal to zero, we obtain the normal equations:

\[\mathbf{X}^T\mathbf{X}\boldsymbol{\beta} = \mathbf{X}^T\mathbf{Y}\]

so that the OLS estimator is:

\[\hat{\boldsymbol{\beta}}_{OLS} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{Y}\]

The estimated error variance:

\[\hat{\sigma}^2_{OLS} = \frac{SSE}{n-p-1} = \frac{(\mathbf{Y}-\mathbf{X}\hat{\boldsymbol{\beta}})^T(\mathbf{Y}-\mathbf{X}\hat{\boldsymbol{\beta}})}{n-p-1}\]

with the variance-covariance matrix of the coefficient estimates:

\[\widehat{\text{Var}}(\hat{\boldsymbol{\beta}}) = \hat{\sigma}^2 (\mathbf{X}^T\mathbf{X})^{-1}\]

2.3 Maximum Likelihood Estimation (MLE) Approach

The MLE approach assumes the error is normally distributed, \(\boldsymbol{\varepsilon} \sim N(0, \sigma^2 \mathbf{I})\), so that \(\mathbf{Y} \sim N(\mathbf{X}\boldsymbol{\beta}, \sigma^2\mathbf{I})\). The likelihood function is:

\[L(\boldsymbol{\beta}, \sigma^2 \mid \mathbf{Y}) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(Y_i - \mathbf{x}_i^T\boldsymbol{\beta})^2}{2\sigma^2}\right)\]

Its log-likelihood function is:

\[\ell(\boldsymbol{\beta}, \sigma^2) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln(\sigma^2) - \frac{1}{2\sigma^2}(\mathbf{Y}-\mathbf{X}\boldsymbol{\beta})^T(\mathbf{Y}-\mathbf{X}\boldsymbol{\beta})\]

Maximizing \(\ell\) with respect to \(\boldsymbol{\beta}\) is equivalent to minimizing \((\mathbf{Y}-\mathbf{X}\boldsymbol{\beta})^T(\mathbf{Y}-\mathbf{X}\boldsymbol{\beta})\), so that:

\[\hat{\boldsymbol{\beta}}_{MLE} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{Y} = \hat{\boldsymbol{\beta}}_{OLS}\]

However, the MLE estimator of the error variance differs from OLS, since it is derived from \(\partial \ell / \partial \sigma^2 = 0\):

\[\hat{\sigma}^2_{MLE} = \frac{SSE}{n} \neq \hat{\sigma}^2_{OLS} = \frac{SSE}{n-p-1}\]

The estimator \(\hat{\sigma}^2_{MLE}\) is biased (it underestimates the true variance) because it does not account for the degrees of freedom lost due to estimating \(\hat{\boldsymbol{\beta}}\), whereas \(\hat{\sigma}^2_{OLS}\) is unbiased. In the practical section below, \(\hat{\boldsymbol{\beta}}_{MLE}\) will be obtained in two ways: (1) analytically through the closed-form matrix formula above, and (2) numerically by optimizing the log-likelihood function using optim(), as proof that the two are consistent.

3 Data Simulation and Description

The data used is simulated data set in a household economics context, consisting of one response variable and three predictor variables:

Variable	Description	Unit
\(Y\)	Monthly Household Consumption Expenditure	Million IDR
\(X_1\)	Monthly Household Income	Million IDR
\(X_2\)	Consumer Lending Interest Rate	Percent (%)
\(X_3\)	Inflation Rate (Consumer Price Index, YoY)	Percent (%)

Based on economic theory, income (\(X_1\)) is expected to have a positive effect on consumption, while the interest rate (\(X_2\)) and inflation (\(X_3\)) are expected to have a negative effect on real household consumption.

set.seed(2024)
n <- 100

# Predictor variables
X1 <- round(rnorm(n, mean = 8, sd = 2), 2)      # Income (million IDR), mean 8 million
X2 <- round(runif(n, min = 4, max = 12), 2)     # Lending interest rate (%), 4%-12%
X3 <- round(runif(n, min = 1, max = 8), 2)      # Inflation rate (%), 1%-8%

# Population (true) parameters set by the researcher
beta0_true <- 1.5
beta1_true <- 0.65    # income up by 1 million -> consumption up by 0.65 million
beta2_true <- -0.18   # interest rate up by 1% -> consumption down by 0.18 million
beta3_true <- -0.12   # inflation up by 1% -> consumption down by 0.12 million

# Random error, normally distributed
epsilon <- rnorm(n, mean = 0, sd = 0.6)

# Response variable
Y <- beta0_true + beta1_true * X1 + beta2_true * X2 + beta3_true * X3 + epsilon
Y <- round(Y, 2)

data <- data.frame(Consumption = Y,
                             Income = X1,
                             InterestRate = X2,
                             Inflation = X3)

head(data, 10)

##    Consumption Income InterestRate Inflation
## 1         5.61   9.96        11.89      1.89
## 2         6.18   8.94         4.24      5.42
## 3         3.88   7.78         9.36      4.38
## 4         4.94   7.57         4.96      5.92
## 5         5.92  10.32        10.81      7.94
## 6         6.80  10.58         4.30      1.64
## 7         4.56   9.07         8.40      7.25
## 8         4.69   7.75         7.25      5.21
## 9         4.46   5.55         4.82      2.55
## 10        2.75   5.76        10.70      6.86

summary(data)

##   Consumption        Income        InterestRate      Inflation    
##  Min.   :0.410   Min.   : 1.450   Min.   : 4.030   Min.   :1.010  
##  1st Qu.:3.732   1st Qu.: 6.560   1st Qu.: 5.492   1st Qu.:3.393  
##  Median :4.655   Median : 7.965   Median : 7.705   Median :5.260  
##  Mean   :4.632   Mean   : 7.830   Mean   : 7.857   Mean   :4.928  
##  3rd Qu.:5.612   3rd Qu.: 9.348   3rd Qu.:10.193   3rd Qu.:6.763  
##  Max.   :8.070   Max.   :11.950   Max.   :11.900   Max.   :7.950

sapply(data, sd)

##  Consumption       Income InterestRate    Inflation 
##     1.479771     2.044882     2.475760     2.051868

4 Multicollinearity Test

Multicollinearity occurs when predictor variables are highly correlated with each other, which can make coefficient estimates unstable. One commonly used indicator is the Variance Inflation Factor (VIF):

\[VIF_j = \frac{1}{1 - R_j^2}\]

where \(R_j^2\) is the coefficient of determination from regressing \(X_j\) on all the other predictors. As a general rule of thumb, \(VIF_j > 10\) (or in some references \(> 5\)) indicates serious multicollinearity.

cor(data[, c("Income", "InterestRate", "Inflation")])

##                   Income InterestRate  Inflation
## Income        1.00000000  -0.07310745 0.07372015
## InterestRate -0.07310745   1.00000000 0.04284133
## Inflation     0.07372015   0.04284133 1.00000000

# Manual VIF function, no package required
calculate_VIF <- function(X) {
  p <- ncol(X)
  vif_values <- numeric(p)
  names(vif_values) <- colnames(X)
  
  for (j in 1:p) {
    y_aux <- X[, j]
    X_aux <- cbind(1, X[, -j])              # other predictors + intercept
    beta_aux <- solve(t(X_aux) %*% X_aux) %*% t(X_aux) %*% y_aux
    y_fit <- X_aux %*% beta_aux
    SSR <- sum((y_fit - mean(y_aux))^2)
    SST <- sum((y_aux - mean(y_aux))^2)
    R2_aux <- SSR / SST
    vif_values[j] <- 1 / (1 - R2_aux)
  }
  return(vif_values)
}

predictor_matrix <- as.matrix(data[, c("Income", "InterestRate", "Inflation")])
VIF_result <- calculate_VIF(predictor_matrix)
print(round(VIF_result, 3))

##       Income InterestRate    Inflation 
##        1.011        1.008        1.008

Interpretation: If all VIF values are below 10 (ideally close to 1), it can be concluded that there is no indication of serious multicollinearity among the predictor variables, so the regression coefficient estimates can be considered stable.

5 Linearity Test

The linearity test is performed to ensure that the relationship between each predictor variable and the response variable is linear, which is a key requirement for using a linear regression model. The check is done visually through scatter plots equipped with a simple regression line obtained manually.

# Manual simple regression function for the trend line in the scatter plots
simple_regression <- function(x, y) {
  b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
  b0 <- mean(y) - b1 * mean(x)
  return(c(intercept = b0, slope = b1))
}

par(mfrow = c(1, 3))

for (var in c("Income", "InterestRate", "Inflation")) {
  x <- data[[var]]
  y <- data$Consumption
  coef <- simple_regression(x, y)
  r <- cor(x, y)
  
  plot(x, y, pch = 19, col = "steelblue4",
       xlab = var, ylab = "Consumption",
       main = paste0(var, " vs Consumption\n(r = ", round(r, 3), ")"))
  abline(a = coef["intercept"], b = coef["slope"], col = "firebrick", lwd = 2)
}

par(mfrow = c(1, 1))

Interpretation: A pattern of points in each scatter plot that follows a straight line (either with a positive or negative slope) indicates that the assumption of a linear relationship between the predictor variable and the response variable is satisfied.

6 Parameter Estimation

6.1 Preparing the Design Matrix

Y_vec <- matrix(data$Consumption, ncol = 1)
X_mat <- cbind(1, as.matrix(data[, c("Income", "InterestRate", "Inflation")]))
colnames(X_mat) <- c("Intercept", "Income", "InterestRate", "Inflation")

n_obs <- nrow(X_mat)
p_par <- ncol(X_mat) - 1   # number of predictors (excluding intercept)

6.2 Estimation via OLS Approach

# OLS beta estimator: (X'X)^-1 X'Y
XtX <- t(X_mat) %*% X_mat
XtY <- t(X_mat) %*% Y_vec
beta_OLS <- solve(XtX) %*% XtY

# Fitted values and residuals
Y_hat <- X_mat %*% beta_OLS
e_OLS <- Y_vec - Y_hat

# Unbiased estimate of error variance (OLS)
SSE <- sum(e_OLS^2)
sigma2_OLS <- SSE / (n_obs - p_par - 1)

# Variance-covariance matrix of beta and standard errors
varcov_beta <- as.numeric(sigma2_OLS) * solve(XtX)
se_beta <- sqrt(diag(varcov_beta))

# t-statistic and p-value for each coefficient
t_stat <- beta_OLS / se_beta
p_value <- 2 * pt(abs(t_stat), df = n_obs - p_par - 1, lower.tail = FALSE)

table_OLS <- data.frame(
  Coefficient = round(beta_OLS, 5),
  SE = round(se_beta, 5),
  t_value = round(t_stat, 4),
  p_value = round(p_value, 5)
)
print(table_OLS)

##              Coefficient      SE t_value p_value
## Intercept        1.54192 0.30394  5.0731  0.0000
## Income           0.62602 0.02664 23.4979  0.0000
## InterestRate    -0.17235 0.02197 -7.8467  0.0000
## Inflation       -0.09288 0.02650 -3.5044  0.0007

SST <- sum((Y_vec - mean(Y_vec))^2)
SSR <- SST - SSE
R2 <- SSR / SST
R2_adj <- 1 - (1 - R2) * (n_obs - 1) / (n_obs - p_par - 1)

cat("SSE               :", round(SSE, 4), "\n")

## SSE               : 27.8895

cat("R-squared          :", round(R2, 5), "\n")

## R-squared          : 0.87135

cat("Adjusted R-squared :", round(R2_adj, 5), "\n")

## Adjusted R-squared : 0.86733

cat("Sigma^2 (OLS)      :", round(sigma2_OLS, 5), "\n")

## Sigma^2 (OLS)      : 0.29052

6.3 Estimation via MLE Approach

6.3.1 (a) Analytical Solution

According to the theoretical derivation, \(\hat{\boldsymbol{\beta}}_{MLE}\) is identical to \(\hat{\boldsymbol{\beta}}_{OLS}\), while \(\hat{\sigma}^2_{MLE} = SSE/n\).

beta_MLE_analytic <- beta_OLS                  # algebraically identical
sigma2_MLE_analytic <- SSE / n_obs              # divisor is n, not n-p-1

cat("Sigma^2 (MLE, analytic):", round(sigma2_MLE_analytic, 5), "\n")

## Sigma^2 (MLE, analytic): 0.27889

6.3.2 (b) Numerical Solution (Log-Likelihood Optimization)

As verification, direct optimization of the negative log-likelihood function is performed using optim() (a base R function), without using the closed-form OLS formula.

neg_log_likelihood <- function(par, X, Y) {
  k <- ncol(X)
  beta <- par[1:k]
  sigma2 <- par[k + 1]
  
  if (sigma2 <= 0) return(1e10)   # constraint to keep variance positive
  
  n <- length(Y)
  resid <- Y - X %*% beta
  ll <- -(n/2) * log(2 * pi) - (n/2) * log(sigma2) - (1 / (2 * sigma2)) * sum(resid^2)
  return(-ll)   # multiplied by -1 because optim() performs minimization
}

# Starting values
start_par <- c(rep(0, ncol(X_mat)), var(Y_vec))

optim_result <- optim(par = start_par,
                       fn = neg_log_likelihood,
                       X = X_mat, Y = Y_vec,
                       method = "BFGS",
                       hessian = TRUE)

beta_MLE_numeric <- optim_result$par[1:ncol(X_mat)]
sigma2_MLE_numeric <- optim_result$par[ncol(X_mat) + 1]

names(beta_MLE_numeric) <- colnames(X_mat)
cat("Optim convergence (0 = successful):", optim_result$convergence, "\n\n")

## Optim convergence (0 = successful): 0

cat("Beta coefficients (MLE - numeric):\n")

## Beta coefficients (MLE - numeric):

print(round(beta_MLE_numeric, 5))

##    Intercept       Income InterestRate    Inflation 
##      1.54192      0.62602     -0.17235     -0.09288

cat("\nSigma^2 (MLE - numeric):", round(sigma2_MLE_numeric, 5), "\n")

## 
## Sigma^2 (MLE - numeric): 0.2789

6.3.3 Comparison Table: OLS vs MLE

comparison_table <- data.frame(
  Parameter = colnames(X_mat),
  OLS = round(as.numeric(beta_OLS), 5),
  MLE_Analytic = round(as.numeric(beta_MLE_analytic), 5),
  MLE_Numeric = round(as.numeric(beta_MLE_numeric), 5)
)
print(comparison_table)

##      Parameter      OLS MLE_Analytic MLE_Numeric
## 1    Intercept  1.54192      1.54192     1.54192
## 2       Income  0.62602      0.62602     0.62602
## 3 InterestRate -0.17235     -0.17235    -0.17235
## 4    Inflation -0.09288     -0.09288    -0.09288

cat("\nComparison of error variance estimates:\n")

## 
## Comparison of error variance estimates:

cat("Sigma^2 OLS (divisor n-p-1):", round(sigma2_OLS, 5), "\n")

## Sigma^2 OLS (divisor n-p-1): 0.29052

cat("Sigma^2 MLE (divisor n)    :", round(sigma2_MLE_analytic, 5), "\n")

## Sigma^2 MLE (divisor n)    : 0.27889

Interpretation: All three approaches produce (nearly) identical estimates of the regression coefficients \(\boldsymbol{\beta}\), which empirically confirms that the OLS and MLE estimators for the regression coefficients are equivalent. The only difference occurs in the estimated error variance \(\hat\sigma^2\), where MLE produces a slightly smaller (biased) value compared to OLS, due to the difference in the degrees-of-freedom divisor.

7 Residual (Error) Assumption Tests

The residuals used in all of the following assumption tests are the OLS model residuals (\(\hat{e} = \mathbf{Y} - \mathbf{X}\hat{\boldsymbol{\beta}}_{OLS}\)).

residuals_vec <- as.numeric(e_OLS)
fitted_values <- as.numeric(Y_hat)

7.1 Normality of Residuals Test

The normality assumption of the residuals is required for valid inference (t-test, F-test) on the regression coefficients. The check is conducted through:

Visual exploration: histogram and QQ-plot of the residuals.
Jarque-Bera test (manual), based on the skewness (\(S\)) and kurtosis (\(K\)) coefficients of the residuals:

\[JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right) \sim \chi^2_{(2)}\]

Shapiro-Wilk test using the shapiro.test() function from the (built-in) stats package, as a comparison.

par(mfrow = c(1, 2))
hist(residuals_vec, breaks = 12, col = "lightblue", border = "white",
     main = "Histogram of Residuals", xlab = "Residual", freq = FALSE)
curve(dnorm(x, mean(residuals_vec), sd(residuals_vec)), add = TRUE, col = "firebrick", lwd = 2)

qqnorm(residuals_vec, main = "QQ-Plot of Residuals", pch = 19, col = "steelblue4")
qqline(residuals_vec, col = "firebrick", lwd = 2)

par(mfrow = c(1, 1))

jarque_bera_test <- function(e) {
  n <- length(e)
  m2 <- mean((e - mean(e))^2)
  m3 <- mean((e - mean(e))^3)
  m4 <- mean((e - mean(e))^4)
  
  S <- m3 / (m2^(3/2))     # skewness
  K <- m4 / (m2^2)         # kurtosis
  
  JB <- (n / 6) * (S^2 + ((K - 3)^2) / 4)
  p_val <- pchisq(JB, df = 2, lower.tail = FALSE)
  
  return(list(skewness = S, kurtosis = K, JB_stat = JB, p_value = p_val))
}

JB_result <- jarque_bera_test(residuals_vec)
cat("Skewness     :", round(JB_result$skewness, 4), "\n")

## Skewness     : -0.1951

cat("Kurtosis     :", round(JB_result$kurtosis, 4), "\n")

## Kurtosis     : 3.7901

cat("JB statistic :", round(JB_result$JB_stat, 4), "\n")

## JB statistic : 3.2357

cat("p-value      :", round(JB_result$p_value, 4), "\n")

## p-value      : 0.1983

shapiro.test(residuals_vec)

## 
##  Shapiro-Wilk normality test
## 
## data:  residuals_vec
## W = 0.98921, p-value = 0.6011

Hypotheses: \(H_0\): the errors are normally distributed vs \(H_1\): the errors are not normally distributed. If p-value > \(\alpha\) (commonly 0.05), \(H_0\) fails to be rejected, so the normality assumption of the residuals is satisfied.

7.2 Homoscedasticity Test

The homoscedasticity assumption requires the error variance to be constant across all values of the predictors. Violation of this assumption (heteroscedasticity) causes the OLS estimator to remain unbiased but no longer efficient.

plot(fitted_values, residuals_vec, pch = 19, col = "steelblue4",
     xlab = "Fitted Value", ylab = "Residual",
     main = "Residuals vs Fitted Values")
abline(h = 0, col = "firebrick", lwd = 2, lty = 2)

Breusch-Pagan Test (manual). This is performed by regressing the squared residuals on all predictor variables, then computing the statistic:

\[LM = n \times R^2_{aux} \sim \chi^2_{(p)}\]

where \(R^2_{aux}\) is the coefficient of determination from the auxiliary regression of \(\hat{e}^2\) on \(\mathbf{X}\).

breusch_pagan_test <- function(X, e) {
  n <- nrow(X)
  e2 <- e^2
  
  beta_aux <- solve(t(X) %*% X) %*% t(X) %*% e2
  e2_hat <- X %*% beta_aux
  
  SSR_aux <- sum((e2_hat - mean(e2))^2)
  SST_aux <- sum((e2 - mean(e2))^2)
  R2_aux <- SSR_aux / SST_aux
  
  LM <- n * R2_aux
  df_bp <- ncol(X) - 1   # number of predictors (excluding intercept)
  p_val <- pchisq(LM, df = df_bp, lower.tail = FALSE)
  
  return(list(LM_stat = LM, df = df_bp, p_value = p_val))
}

BP_result <- breusch_pagan_test(X_mat, residuals_vec)
cat("LM statistic (Breusch-Pagan):", round(BP_result$LM_stat, 4), "\n")

## LM statistic (Breusch-Pagan): 5.3909

cat("Degrees of freedom          :", BP_result$df, "\n")

## Degrees of freedom          : 3

cat("p-value                     :", round(BP_result$p_value, 4), "\n")

## p-value                     : 0.1453

Hypotheses: \(H_0\): the error variance is homogeneous (homoscedastic) vs \(H_1\): the error variance is not homogeneous (heteroscedastic). If p-value > \(\alpha\), \(H_0\) fails to be rejected, so the homoscedasticity assumption is satisfied.

7.3 Autocorrelation Test

The no-autocorrelation assumption is especially important for data with an inherent ordering structure (e.g., time series data). A commonly used test is the Durbin-Watson test:

\[DW = \frac{\sum_{i=2}^{n}(\hat{e}_i - \hat{e}_{i-1})^2}{\sum_{i=1}^{n}\hat{e}_i^2}\]

The \(DW\) statistic ranges between 0 and 4, with a value close to 2 indicating no autocorrelation; a value close to 0 indicating positive autocorrelation; and a value close to 4 indicating negative autocorrelation.

durbin_watson_test <- function(e) {
  n <- length(e)
  diff_e <- diff(e)                     # e_t - e_(t-1)
  DW <- sum(diff_e^2) / sum(e^2)
  return(DW)
}

DW_stat <- durbin_watson_test(residuals_vec)
cat("Durbin-Watson statistic:", round(DW_stat, 4), "\n")

## Durbin-Watson statistic: 1.9662

Interpretation: A \(DW\) value close to 2 (generally in the range of 1.5 - 2.5) indicates no meaningful autocorrelation in the residuals, so the assumption of error independence is satisfied. More precise critical bounds \(d_L\) and \(d_U\) can be referenced from the Durbin-Watson table according to \(n\) and the number of predictors.

8 Final Model Summary

summary_table <- data.frame(
  Parameter = colnames(X_mat),
  Coefficient_OLS = round(as.numeric(beta_OLS), 4),
  SE = round(se_beta, 4),
  t_value = round(as.numeric(t_stat), 4),
  p_value = round(as.numeric(p_value), 4)
)
print(summary_table)

##                 Parameter Coefficient_OLS     SE t_value p_value
## Intercept       Intercept          1.5419 0.3039  5.0731  0.0000
## Income             Income          0.6260 0.0266 23.4979  0.0000
## InterestRate InterestRate         -0.1724 0.0220 -7.8467  0.0000
## Inflation       Inflation         -0.0929 0.0265 -3.5044  0.0007

cat("\nR-squared              :", round(R2, 4))

## 
## R-squared              : 0.8713

cat("\nAdjusted R-squared     :", round(R2_adj, 4))

## 
## Adjusted R-squared     : 0.8673

cat("\nSigma^2 (OLS)          :", round(sigma2_OLS, 4))

## 
## Sigma^2 (OLS)          : 0.2905

cat("\nDurbin-Watson statistic:", round(DW_stat, 4))

## 
## Durbin-Watson statistic: 1.9662

The resulting multiple linear regression model is:

\[\hat{Y} = \hat\beta_0 + \hat\beta_1 \, X_1 + \hat\beta_2 \, X_2 + \hat\beta_3 \, X_3\]

where \(\hat\beta_0, \hat\beta_1, \hat\beta_2, \hat\beta_3\) are as listed in the summary table above (Income, Interest Rate, and Inflation as predictors of Household Consumption).

9 Conclusion

A multiple linear regression model was successfully built using simulated data on household consumption expenditure influenced by income, lending interest rate, and inflation.
Parameter estimation using the OLS and MLE approaches (both analytically and numerically through log-likelihood optimization) produced identical coefficient estimates \(\boldsymbol{\beta}\), consistent with the theory that the two are equivalent in a linear regression model with the normal error assumption. The only difference is found in the estimated error variance \(\hat\sigma^2\).
The multicollinearity test (VIF) shows no high correlation among predictors, so the model is relatively stable.
The linearity test shows a fairly clear linear relationship between each predictor and the response variable.
The residual assumption tests (normality, homoscedasticity, and no-autocorrelation) are generally satisfied in this simulated data, so the resulting OLS model is BLUE (Best Linear Unbiased Estimator).

10 References

Draper, N. R., & Smith, H. (1998). Applied Regression Analysis (3rd ed.). John Wiley & Sons.
Gujarati, D. N., & Porter, D. C. (2009). Basic Econometrics (5th ed.). McGraw-Hill.
Montgomery, D. C., Peck, E. A., & Vining, G. G. (2012). Introduction to Linear Regression Analysis (5th ed.). John Wiley & Sons.

Multiple Linear Regression Analysis: OLS and MLE Approaches (Manual Implementation)

Muhammad Aji Alrasyid