Types of regressions

OLS (ordinary least squares)
  Process: Minimizes the sum of squared residuals to estimate parameters.
  Manual calculation:
    1. Specify the regression equation Y = Xβ + ε.
    2. Compute β-hat = (X'X)^(-1)X'Y.
      • Form X'X, invert it, and multiply the inverse by X'Y.
    3. Take the resulting β-hat as the estimate of β.
  Assumptions: Linearity, no perfect multicollinearity, homoskedasticity, uncorrelated errors, normally distributed errors (for inference).
  Advantages: Simple, easy to interpret; BLUE under Gauss-Markov assumptions.
  Disadvantages: Sensitive to outliers and multicollinearity; inefficient under heteroskedasticity or autocorrelation.
  Parameter testing: Use regress. Coefficients are interpreted as the effect of a one-unit change in the regressor. Test with t-tests and assess significance with p-values.

GLS (generalized least squares)
  Process: Adjusts for heteroskedasticity/autocorrelation by transforming the data or using weights.
  Manual calculation:
    1. Estimate the error covariance structure (Ω).
    2. Transform the model using Ω^(-1/2):
      • Y* = Ω^(-1/2)Y, X* = Ω^(-1/2)X.
    3. Apply OLS to the transformed model.
  Assumptions: Correctly specified structure of heteroskedasticity or autocorrelation.
  Advantages: Efficient under heteroskedasticity or autocorrelation.
  Disadvantages: Requires knowledge or estimation of the error structure; if the structure is misspecified, the efficiency gain is lost and the reported standard errors are unreliable.
  Parameter testing: Use prais or xtgls. Test as in OLS, but interpret results within the transformed model.

2SLS (two-stage least squares)
  Process: Stage 1: Regress endogenous variables on instruments. Stage 2: Use fitted values in the main regression.
  Manual calculation:
    1. Stage 1: Regress endogenous variables (Z) on instruments (W): Z = Wπ + ν.
    2. Obtain predicted values (Z-hat).
    3. Replace Z in the main equation: Y = Z-hat β + ε.
    4. Apply OLS.
  Assumptions: Instruments must be relevant (correlated with endogenous variables) and exogenous (uncorrelated with errors).
  Advantages: Corrects endogeneity bias; consistent under valid instruments.
  Disadvantages: Sensitive to weak instruments; relies on instrument validity.
  Parameter testing: Use ivregress 2sls. Test instrument relevance with first-stage F-statistics (estat firststage) and test overidentifying restrictions with a Sargan or Hansen J-test (estat overid).

GMM (generalized method of moments)
  Process: Minimizes a weighted quadratic form in the sample moment conditions derived from data and model.
  Manual calculation:
    1. Define moment conditions E[g(Z, θ)] = 0.
    2. Choose weighting matrix W.
    3. Minimize J(θ) = g(Z, θ)'Wg(Z, θ), where g(Z, θ) here denotes the sample average of the moments.
    4. Solve for θ (parameters).
  Assumptions: Valid moment conditions; a positive definite weighting matrix (an optimal choice is needed for efficiency).
  Advantages: Flexible under heteroskedasticity or autocorrelation; handles overidentified models.
  Disadvantages: Computationally intensive; sensitive to weight matrix choice.
  Parameter testing: Use gmm or xtabond2. Test overidentifying restrictions with the Hansen J-test. Coefficient interpretation follows from the model behind the moment conditions.

GME (generalized maximum entropy)
  Process: Maximizes entropy subject to constraints (data and prior information).
  Manual calculation:
    1. Define entropy H = -∑p_i ln(p_i).
    2. Specify constraints (e.g., Xβ = Y).
    3. Maximize H subject to constraints.
    4. Solve for β and p_i (entropy weights).
  Assumptions: Requires carefully chosen constraints; intended for small or ill-posed data problems.
  Advantages: Handles multicollinearity and small samples; incorporates prior information.
  Disadvantages: Relatively uncommon; computationally demanding; interpretation depends on entropy weights.
  Parameter testing: Limited support in Stata; often requires external packages or manual programming. Coefficients depend on entropy constraints and weights.

MLE (maximum likelihood estimation)
  Process: Maximizes the likelihood function of the data given the model.
  Manual calculation:
    1. Define the likelihood L(θ) = Πf(y_i | X_i, θ).
    2. Take the log: ln(L(θ)).
    3. Differentiate ln(L(θ)) w.r.t. θ and set the derivative to zero.
    4. Solve for θ (MLE estimates).
  Assumptions: Correct specification of the likelihood function; errors are i.i.d.
  Advantages: Asymptotically efficient and consistent; flexible for non-linear models.
  Disadvantages: Sensitive to a misspecified likelihood; computationally intensive.
  Parameter testing: Use ml, logit, probit. Compare nested models with likelihood-ratio (LR) tests. Coefficient interpretation depends on the model being estimated (e.g., logit coefficients are effects on the log-odds).
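The OLS and 2SLS manual calculations above can be sketched numerically. The following is a minimal illustration, not a replacement for regress or ivregress: it assumes one regressor plus an intercept, one instrument, and simulated data. The helper names (ols, two_sls), the data-generating process, and the true slope of 2 are all assumptions made for the example.

```python
import random

def ols(x, y):
    """OLS for y = b0 + b1*x via the normal equations beta = (X'X)^(-1) X'y."""
    n = len(x)
    # Entries of X'X and X'y for the design matrix X = [1, x].
    sx = sum(x)
    sxx = sum(v * v for v in x)
    sy = sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx            # determinant of the 2x2 matrix X'X
    b0 = (sxx * sy - sx * sxy) / det   # first row of (X'X)^(-1) X'y
    b1 = (n * sxy - sx * sy) / det     # second row of (X'X)^(-1) X'y
    return b0, b1

def two_sls(z, w, y):
    """2SLS with one endogenous regressor z and one instrument w."""
    p0, p1 = ols(w, z)                    # stage 1: z = p0 + p1*w + v
    z_hat = [p0 + p1 * wi for wi in w]    # fitted values Z-hat
    return ols(z_hat, y)                  # stage 2: regress y on Z-hat

# Simulated data (assumed for illustration): u drives both z and the error in y,
# so z is endogenous; w affects y only through z.
random.seed(0)
n = 5000
w = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]
z = [wi + ui for wi, ui in zip(w, u)]
y = [1 + 2 * zi + 0.8 * ui + random.gauss(0, 1) for zi, ui in zip(z, u)]

ols_b0, ols_b1 = ols(z, y)
iv_b0, iv_b1 = two_sls(z, w, y)
print(f"OLS slope (biased upward here): {ols_b1:.3f}")
print(f"2SLS slope (true value is 2):   {iv_b1:.3f}")
```

Running the second stage by hand reproduces the 2SLS coefficients, but the standard errors from that manual OLS step are not valid; a dedicated routine such as ivregress 2sls computes them from the original Z rather than Z-hat.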

    Common Problems in Regression Analysis

    Endogeneity
      Meaning: An explanatory variable is correlated with the error term, often due to reverse causality, omitted variables, or measurement error.
      Consequences: Biased and inconsistent coefficient estimates; incorrect inference and policy recommendations.
      Solution: Use instrumental variables (IV) or two-stage least squares (2SLS); include omitted variables; improve data quality.

    Multicollinearity
      Meaning: Two or more independent variables are highly correlated, making it hard to estimate their individual effects.
      Consequences: Inflated standard errors, leading to low statistical significance and difficulty in determining the effect of each variable.
      Solution: Drop one of the correlated variables, center variables (useful mainly when the collinearity comes from interaction or polynomial terms), or use ridge regression or principal component analysis (PCA).

    Omitted Variable Bias
      Meaning: A relevant variable that is correlated with the included regressors is excluded from the model.
      Consequences: Biased and inconsistent coefficient estimates; results cannot reliably reflect the true relationship between variables.
      Solution: Include the omitted variable if data are available; use proxy variables; apply sensitivity analysis.

    Heteroskedasticity
      Meaning: The variance of the error term is not constant across observations.
      Consequences: Inefficient estimates, incorrect standard errors, and therefore invalid hypothesis tests.
      Solution: Use robust standard errors (e.g., White's robust estimator) or generalized least squares (GLS).

    Autocorrelation
      Meaning: Error terms are correlated across observations, often in time-series data.
      Consequences: Biased standard errors, leading to invalid hypothesis tests, and inefficient estimates.
      Solution: Use Newey-West standard errors; model the autocorrelation structure (e.g., ARMA or Prais-Winsten regression).

    Measurement Error
      Meaning: The observed variables are recorded with error.
      Consequences: Biased and inconsistent parameter estimates (classical error in a regressor pulls its coefficient toward zero); loss of reliability in results.
      Solution: Use instrumental variables (IV); improve data collection methods.

    Non-Linearity
      Meaning: The relationship between the dependent and independent variables is not linear, violating the linearity assumption.
      Consequences: The misspecified model gives biased estimates and poor predictive accuracy.
      Solution: Apply non-linear specifications such as polynomial regression, log-transformation, or generalized additive models (GAM).
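As an illustration of the measurement-error entry above, the sketch below simulates a regressor observed with classical noise and compares the OLS slope on the true and the noisy regressor. The data-generating process (true slope 2, unit variances) and the helper name slope are assumptions chosen for the example; under these assumptions the noisy slope should shrink toward roughly 2 · 1/(1 + 1) = 1.

```python
import random

def slope(x, y):
    """OLS slope of y on x (with intercept): cov(x, y) / var(x)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

random.seed(1)
n = 20000
x_true = [random.gauss(0, 1) for _ in range(n)]          # regressor without error
y = [2 * xt + random.gauss(0, 1) for xt in x_true]       # true slope is 2
x_obs = [xt + random.gauss(0, 1) for xt in x_true]       # regressor measured with noise

slope_true = slope(x_true, y)
slope_noisy = slope(x_obs, y)
print(f"slope using true regressor:  {slope_true:.3f}")
print(f"slope using noisy regressor: {slope_noisy:.3f}")
```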

    Testing for the presence of regression problems

    Endogeneity
      Test: Durbin-Wu-Hausman test.
      Intuition/process: Compares OLS and IV estimates. If the IV estimates differ significantly from OLS, endogeneity is likely present.
      Stata command: hausman after fitting both models with ivregress and regress; estat endogenous after ivregress.
      Statistic and interpretation: The test returns a chi-square statistic.
        • Null: OLS is consistent.
        • Rejecting the null (small p-value) suggests endogeneity.

    Multicollinearity
      Test: Variance Inflation Factor (VIF).
      Intuition/process: Checks whether independent variables are highly correlated. A high VIF indicates multicollinearity.
      Stata command: estat vif after regress.
      Statistic and interpretation: By a common rule of thumb, VIF > 10 indicates high multicollinearity. Inspect the VIF of each independent variable.

    Omitted Variable Bias
      Test: No direct test; look for model misfit and theoretical relevance.
      Intuition/process: Cannot be tested directly, but can be suspected when model fit is poor, residuals are large, or theoretically relevant variables are missing.
      Stata command: No specific command; examine model fit and theoretical relevance.
      Statistic and interpretation: No direct statistic; look for patterns in residual plots, model misfit, or theoretical gaps.

    Heteroskedasticity
      Test: Breusch-Pagan test, White test.
      Intuition/process: Detects non-constant variance in the residuals. Breusch-Pagan tests the variance as a function of the independent variables; White's test checks for heteroskedasticity without specifying a form.
      Stata command: estat hettest for Breusch-Pagan; estat imtest, white for the White test.
      Statistic and interpretation: Breusch-Pagan: a high chi-square value (small p-value) suggests heteroskedasticity. White: same chi-square interpretation, but robust to the form of heteroskedasticity.

    Autocorrelation
      Test: Durbin-Watson test, Breusch-Godfrey LM test.
      Intuition/process: Tests whether error terms are serially correlated. Durbin-Watson focuses on adjacent residuals; Breusch-Godfrey handles higher-order autocorrelation.
      Stata command: estat dwatson; estat bgodfrey (both require tsset data).
      Statistic and interpretation: A Durbin-Watson statistic near 2 suggests no autocorrelation.
        • Values below 2 suggest positive autocorrelation.
        • Values above 2 suggest negative autocorrelation.
        Breusch-Godfrey returns a chi-square statistic; a small p-value indicates autocorrelation.

    Measurement Error
      Test: No direct test; look for inconsistent results or discrepancies in estimates.
      Intuition/process: Assessment is mostly qualitative; look for issues in data collection or unexpected inconsistencies in results.
      Stata command: No specific command; address by improving data quality or using IV methods.
      Statistic and interpretation: No formal statistic. Look for bias and inconsistencies in coefficients across models.

    Non-Linearity
      Test: Ramsey RESET test.
      Intuition/process: Checks whether higher-order terms improve the fit of the model, using powers of the fitted values to test for specification errors.
      Stata command: estat ovtest.
      Statistic and interpretation: A high F-statistic (small p-value) suggests non-linearity or omitted-variable problems.
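To make the Durbin-Watson interpretation concrete, the sketch below computes the statistic d = Σ(e_t − e_{t−1})² / Σ e_t² by hand. For simplicity it is applied to two simulated series standing in for regression residuals (an assumption of this example): white noise, where d should be near 2, and an AR(1) series with coefficient 0.7, where d should be near 2(1 − 0.7) = 0.6. The function name durbin_watson is illustrative.

```python
import random

def durbin_watson(resid):
    """Durbin-Watson statistic: sum of squared first differences over sum of squares."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den

random.seed(2)
n = 2000

# Serially uncorrelated "residuals".
white = [random.gauss(0, 1) for _ in range(n)]

# Positively autocorrelated "residuals": e_t = 0.7 * e_{t-1} + noise.
ar1 = [random.gauss(0, 1)]
for _ in range(1, n):
    ar1.append(0.7 * ar1[-1] + random.gauss(0, 1))

dw_white = durbin_watson(white)
dw_ar1 = durbin_watson(ar1)
print(f"DW, white noise: {dw_white:.2f}")
print(f"DW, AR(1) 0.7:   {dw_ar1:.2f}")
```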

    Maximum Likelihood Estimation (MLE)

    Summary of Process

    1. Likelihood Function:
      • Define \(L(\mu) = \prod_{i=1}^n f(y_i; \mu)\).
      • For the exponential distribution: \(f(y_i; \mu) = \mu e^{-\mu y_i}\).
    2. Log-Likelihood:
      • Take the natural log: \(\ln L(\mu) = n \ln \mu - \mu \sum_{i=1}^n y_i\).
    3. First-Order Condition:
      • Differentiate \(\ln L(\mu)\) with respect to \(\mu\) and set it to zero: \[ \frac{\partial \ln L(\mu)}{\partial \mu} = \frac{n}{\mu} - \sum_{i=1}^n y_i = 0 \]
      • Solve for \(\hat{\mu}\): \(\hat{\mu} = \frac{n}{\sum_{i=1}^n y_i} = \frac{1}{\bar{y}}\).
    4. Fisher Information Matrix:
      • Compute: \(I(\mu) = -E\left[\frac{\partial^2 \ln L(\mu)}{\partial \mu^2}\right] = \frac{n}{\mu^2}\).
    5. Variance of the Estimator:
      • The asymptotic variance of \(\hat{\mu}\): \(\text{Var}(\hat{\mu}) \approx \frac{1}{I(\mu)} = \frac{\mu^2}{n}\).
    6. Asymptotic Distribution:
      • The MLE converges in distribution: \[ \sqrt{n}(\hat{\mu} - \mu) \xrightarrow{d} N(0, \mu^2) \]
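The six steps above can be checked numerically. The sketch below draws exponential data with an assumed true rate of 2, computes \(\hat{\mu} = 1/\bar{y}\), the plug-in standard error \(\hat{\mu}/\sqrt{n}\) implied by the Fisher information, and an approximate 95% interval. The sample size, seed, and function name log_lik are assumptions of the example.

```python
import math
import random

def log_lik(mu, ys):
    """Exponential log-likelihood: n*ln(mu) - mu*sum(y)."""
    return len(ys) * math.log(mu) - mu * sum(ys)

random.seed(3)
true_mu = 2.0
ys = [random.expovariate(true_mu) for _ in range(10000)]  # expovariate takes the rate
n = len(ys)

mu_hat = n / sum(ys)                 # first-order condition: mu_hat = 1 / y_bar
se = mu_hat / math.sqrt(n)           # sqrt of mu^2 / n, evaluated at mu_hat
ci = (mu_hat - 1.96 * se, mu_hat + 1.96 * se)

print(f"mu_hat = {mu_hat:.4f}, se = {se:.4f}")
print(f"approx. 95% CI: ({ci[0]:.4f}, {ci[1]:.4f})")

# The log-likelihood should be lower on either side of mu_hat.
print(log_lik(mu_hat, ys) >= log_lik(0.95 * mu_hat, ys))
print(log_lik(mu_hat, ys) >= log_lik(1.05 * mu_hat, ys))
```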

    Step-by-Step Summary

    1. Write the likelihood function \(L(\mu)\).
    2. Take the log-likelihood \(\ln L(\mu)\).
    3. Differentiate \(\ln L(\mu)\) and solve for \(\hat{\mu}\).
    4. Compute the Fisher Information \(I(\mu)\).
    5. The asymptotic variance of \(\hat{\mu}\) is \(I(\mu)^{-1} = \mu^2 / n\).
    6. For large \(n\), \(\hat{\mu}\) is approximately \(N(\mu, \mu^2 / n)\).

    Method of Moments Estimation (MME)

    Summary of Process

    1. Moment Condition:
      • Use \(E[y_i] = \frac{1}{\mu}\).
    2. Set the Sample Analogue:
      • Replace \(E[y_i]\) with the sample mean, \(\bar{y} = \frac{1}{\mu}\), and solve for \(\mu\): \(\hat{\mu}_{MME} = \frac{1}{\bar{y}}\).
    3. Variance of the Estimator:
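      • By the delta method, with \(\text{Var}(\bar{y}) = \frac{1}{n\mu^2}\) and \(\frac{d}{d\bar{y}}\frac{1}{\bar{y}} = -\mu^2\) at \(\bar{y} = \frac{1}{\mu}\): \[ \text{Var}(\hat{\mu}_{MME}) \approx \mu^4 \cdot \frac{1}{n\mu^2} = \frac{\mu^2}{n} \]
    4. Asymptotic Distribution:
      • For large \(n\), the MME estimator is approximately: \[ \hat{\mu}_{MME} \sim N\left(\mu, \frac{\mu^2}{n}\right) \]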

    Step-by-Step Summary

    1. Take the moment condition \(E[y_i] = \frac{1}{\mu}\).
    2. Solve for \(\hat{\mu}_{MME} = \frac{1}{\bar{y}}\).
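    3. Compute the variance ingredients for \(m(y_i, \mu) = y_i - \frac{1}{\mu}\):
      • \(m_\mu = \frac{\partial m(y_i, \mu)}{\partial \mu} = \frac{1}{\mu^2}\).
      • \(\text{Var}(m(y_i, \mu)) = \text{Var}(y_i) = \frac{1}{\mu^2}\).
    4. Use \(\text{Var}(\hat{\mu}) \approx \frac{\text{Var}(m(y_i, \mu))}{n \, m_\mu^2} = \mu^2 / n\).
    5. For large \(n\), \(\hat{\mu}_{MME}\) is approximately \(N(\mu, \mu^2 / n)\).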
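A quick Monte Carlo experiment can indicate whether the approximation \(\text{Var}(\hat{\mu}) \approx \mu^2/n\) is reasonable. The sketch below repeatedly draws exponential samples, computes \(1/\bar{y}\) each time, and compares the empirical variance of the estimates with \(\mu^2/n\). The true rate, sample size, and number of replications are assumptions chosen for the example; the match is only approximate in finite samples.

```python
import random
import statistics

random.seed(4)
true_mu = 2.0
n = 400          # observations per sample
reps = 2000      # number of simulated samples

estimates = []
for _ in range(reps):
    sample = [random.expovariate(true_mu) for _ in range(n)]
    y_bar = sum(sample) / n
    estimates.append(1.0 / y_bar)      # method-of-moments estimate for this sample

emp_mean = statistics.mean(estimates)
emp_var = statistics.variance(estimates)
theory_var = true_mu ** 2 / n          # asymptotic variance mu^2 / n

print(f"mean of estimates:  {emp_mean:.4f}")
print(f"empirical variance: {emp_var:.5f}")
print(f"mu^2 / n:           {theory_var:.5f}")
```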

    Generalized Method of Moments (GMM)

    Summary of Process

    1. Moment Conditions:
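      • Define: \(E[m(y_i, \mu)] = E\left[y_i - \frac{1}{\mu}\right] = 0\), which matches \(E[y_i] = \frac{1}{\mu}\) for the exponential model.
    2. Minimization Problem:
      • Solve: \[ \min_\mu Q(\mu) = m_n(\mu)' W_n^{-1} m_n(\mu) \]
      • \(m_n(\mu) = \frac{1}{n} \sum_{i=1}^n m(y_i, \mu)\), and \(W_n\) is a positive definite matrix whose inverse weights the moments.
    3. Optimal Weights:
      • Set \(W_n = V_n\), where \(V_n = \text{Cov}[m(y_i, \mu)]\), so that the weighting matrix is \(W_n^{-1} = V_n^{-1}\).
    4. Variance of the Estimator:
      • With the optimal weights, compute: \[ \text{Var}(\hat{\mu}_{GMM}) \approx \frac{1}{n}(m_\mu' W_n^{-1} m_\mu)^{-1} \] where \(m_\mu = E\left[\frac{\partial m(y_i, \mu)}{\partial \mu}\right]\).
    5. Asymptotic Distribution:
      • For large \(n\), the GMM estimator is approximately: \[ \hat{\mu}_{GMM} \sim N(\mu, \text{Var}(\hat{\mu}_{GMM})) \]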
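The GMM steps can be sketched as a two-step procedure. With a single moment and a single parameter the model is exactly identified and the weights do not matter, so this illustration adds a second moment condition, \(E[y_i^2 - 2/\mu^2] = 0\), which holds for the exponential distribution but is an assumption of this example rather than part of the notes above. The first moment is taken as \(E[y_i - 1/\mu] = 0\), matching \(E[y_i] = 1/\mu\). All helper names, the identity first-step weights, and the search bracket are hypothetical choices.

```python
import random

def sample_moments(mu, y_bar, y2_bar):
    """Sample moment vector m_n(mu) for E[y - 1/mu] = 0 and E[y^2 - 2/mu^2] = 0."""
    return (y_bar - 1.0 / mu, y2_bar - 2.0 / mu ** 2)

def quad_form(g, a):
    """g' A g for a 2-vector g and a symmetric 2x2 matrix A."""
    return a[0][0] * g[0] ** 2 + 2 * a[0][1] * g[0] * g[1] + a[1][1] * g[1] ** 2

def inv2(v):
    """Inverse of a 2x2 matrix."""
    det = v[0][0] * v[1][1] - v[0][1] * v[1][0]
    return [[v[1][1] / det, -v[0][1] / det], [-v[1][0] / det, v[0][0] / det]]

def golden_min(f, lo, hi, tol=1e-8):
    """Golden-section search; assumes f has a single minimum on [lo, hi]."""
    r = (5 ** 0.5 - 1) / 2
    while hi - lo > tol:
        c, d = hi - r * (hi - lo), lo + r * (hi - lo)
        if f(c) < f(d):
            hi = d
        else:
            lo = c
    return (lo + hi) / 2

def gmm_exponential(ys, lo=0.1, hi=20.0):
    n = len(ys)
    y_bar = sum(ys) / n
    y2_bar = sum(y * y for y in ys) / n

    # Step 1: identity weights give a consistent first-round estimate.
    identity = [[1.0, 0.0], [0.0, 1.0]]
    mu1 = golden_min(lambda mu: quad_form(sample_moments(mu, y_bar, y2_bar), identity), lo, hi)

    # Step 2: estimate V_n (covariance of the moments) at mu1 and weight by its inverse.
    m = [(y - 1.0 / mu1, y * y - 2.0 / mu1 ** 2) for y in ys]
    v11 = sum(a * a for a, _ in m) / n
    v12 = sum(a * b for a, b in m) / n
    v22 = sum(b * b for _, b in m) / n
    a_opt = inv2([[v11, v12], [v12, v22]])
    mu2 = golden_min(lambda mu: quad_form(sample_moments(mu, y_bar, y2_bar), a_opt), lo, hi)

    # Variance: (1/n) * (m_mu' V^{-1} m_mu)^{-1}, with m_mu the derivative of the moments.
    m_mu = (1.0 / mu2 ** 2, 4.0 / mu2 ** 3)
    var = 1.0 / (n * quad_form(m_mu, a_opt))
    return mu2, var

random.seed(5)
ys = [random.expovariate(2.0) for _ in range(5000)]   # assumed true rate of 2
mu_gmm, var_gmm = gmm_exponential(ys)
print(f"GMM estimate: {mu_gmm:.4f}, standard error: {var_gmm ** 0.5:.4f}")
```

For a scalar parameter a bracketed line search is enough; problems with several parameters would normally use a general-purpose optimizer instead.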

    Step-by-Step Summary

    1. Define the moment condition vector \(m(y_i, \mu)\).
    2. Set the minimization problem \(\min_\mu Q(\mu)\).
    3. Form the weighting matrix \(A = W_n^{-1}\).
    4. Use the optimal choice \(W_n = V_n\), so that \(A = V_n^{-1}\).
    5. Compute \(\text{Var}(\hat{\mu}) \approx \frac{1}{n}(m_\mu' W_n^{-1} m_\mu)^{-1}\).
    6. For large \(n\), \(\hat{\mu}_{GMM}\) is approximately \(N(\mu, \text{Var}(\hat{\mu}))\).

    Key Comparisons

    MLE
      Estimator: \(\hat{\mu}_{MLE} = \frac{1}{\bar{y}}\)
      Variance: \(\frac{\mu^2}{n}\)
      Key assumptions: Correctly specified likelihood function

    MME
      Estimator: \(\hat{\mu}_{MME} = \frac{1}{\bar{y}}\)
      Variance: \(\frac{\mu^2}{n}\)
      Key assumptions: Validity of the moment condition

    GMM
      Estimator: \(\hat{\mu}_{GMM} = \arg \min_\mu Q(\mu)\)
      Variance: \(\frac{1}{n}(m_\mu' W_n^{-1} m_\mu)^{-1}\), with \(W_n\) equal to the covariance matrix of the moments
      Key assumptions: Correctly specified moment conditions