Heteroskedasticity occurs when the variance of the error term in a regression model is not constant across observations — it shifts depending on the values of the regressors. This is purely a problem with the spread of errors, not the coefficients themselves: OLS estimates remain unbiased, but the standard errors become incorrect, making t-stats and p-values unreliable for inference.
Null and Alternative Hypothesis in White’s Test
Both the Breusch-Pagan and White’s tests share the same hypotheses: the null is homoskedasticity (error variance is constant, i.e., all slope coefficients in the auxiliary regression are jointly zero), and the alternative is heteroskedasticity (variance depends on the regressors). The test logic is intuitive, if the squared residuals from the main regression can be systematically explained by the regressors, their squares, and cross-products, that’s direct evidence the error variance is not constant.
Main Regression
I’m using the built-in mtcars dataset. The outcome is fuel efficiency (mpg), and the three predictors are horsepower (hp), weight (wt), and engine displacement (disp).
library(lmtest)
Loading required package: zoo
Attaching package: 'zoo'
The following objects are masked from 'package:base':
as.Date, as.Date.numeric
model <-lm(mpg ~ hp + wt + disp, data = mtcars)summary(model)
Call:
lm(formula = mpg ~ hp + wt + disp, data = mtcars)
Residuals:
Min 1Q Median 3Q Max
-3.891 -1.640 -0.172 1.061 5.861
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 37.105505 2.110815 17.579 < 2e-16 ***
hp -0.031157 0.011436 -2.724 0.01097 *
wt -3.800891 1.066191 -3.565 0.00133 **
disp -0.000937 0.010350 -0.091 0.92851
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 2.639 on 28 degrees of freedom
Multiple R-squared: 0.8268, Adjusted R-squared: 0.8083
F-statistic: 44.57 on 3 and 28 DF, p-value: 8.65e-11
The model fits well overall — the F-statistic is highly significant (p < 0.001) and R² of 0.83 means the three predictors explain about 83% of the variation in fuel efficiency.wt and hp are both statistically significant with negative coefficients, meaning heavier and more powerful cars get fewer miles per gallon. disp is not significant (p = 0.93) once hp and wt are already in the model — its effect is likely absorbed by those two since all three are correlated with engine size.
Residual Plots
par(mfrow =c(2, 2))plot(model)
par(mfrow =c(1, 1))
The Residuals vs. Fitted plot shows a slight U-shaped curve in the red line, suggesting some mild nonlinearity rather than a clear fan/funnel pattern. The spread of residuals looks fairly similar across fitted values, so no obvious heteroskedasticity visually.
The Q-Q plot shows residuals tracking the diagonal closely with only minor deviation at the tails — normality looks reasonable.
The Scale-Location plot has a very gently upward-sloping red line, hinting that variance may increase slightly at higher fitted values, but the trend is mild.
The Residuals vs. Leverage plot flags the Maserati Bora as a high-leverage point, though it stays within Cook’s distance bounds, so it’s not unduly influencing the fit. Toyota Corolla and Chrysler Imperial appear as mild outliers by residual size.
Overall, the plots don’t give strong visual evidence of heteroskedasticity, but the Scale-Location trend is worth investigating formally with White’s test.
White’s Test
We use bptest() from the lmtest package, specifying the full White’s test formula — original regressors, their squares, and all cross-products.
library(lmtest)white_test <-bptest(model, ~ hp + wt + disp +I(hp^2) +I(wt^2) +I(disp^2) + hp:wt + hp:disp + wt:disp, data = mtcars)white_test
studentized Breusch-Pagan test
data: model
BP = 6.7408, df = 9, p-value = 0.6641
The test statistic is BP = 6.74 with 9 degrees of freedom and a p-value of 0.664. Since p = 0.664 >> 0.05, we fail to reject the null of homoskedasticity. There is no statistically significant evidence of heteroskedasticity in this model — consistent with what the residual plots suggested visually.
Auxiliary Regression
To manually replicate White’s test, we take the squared residuals from the main regression as the dependent variable, and regress them on the original regressors, their squares, and all cross-products.
# Squared residuals from main modele2 <-resid(model)^2# Auxiliary regressionaux <-lm(e2 ~ hp + wt + disp +I(hp^2) +I(wt^2) +I(disp^2) + hp:wt + hp:disp + wt:disp, data = mtcars)summary(aux)
Call:
lm(formula = e2 ~ hp + wt + disp + I(hp^2) + I(wt^2) + I(disp^2) +
hp:wt + hp:disp + wt:disp, data = mtcars)
Residuals:
Min 1Q Median 3Q Max
-15.454 -3.826 -2.096 2.042 20.085
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.212e+01 3.952e+01 0.813 0.425
hp -3.883e-01 3.573e-01 -1.087 0.289
wt 5.759e+00 4.060e+01 0.142 0.888
disp -7.838e-02 3.643e-01 -0.215 0.832
I(hp^2) -1.688e-04 6.340e-04 -0.266 0.793
I(wt^2) -4.277e+00 1.079e+01 -0.396 0.696
I(disp^2) 6.324e-05 9.061e-04 0.070 0.945
hp:wt 1.334e-01 1.572e-01 0.849 0.405
hp:disp -4.766e-07 1.198e-03 0.000 1.000
wt:disp 1.199e-02 1.862e-01 0.064 0.949
Residual standard error: 9.925 on 22 degrees of freedom
Multiple R-squared: 0.2106, Adjusted R-squared: -0.1123
F-statistic: 0.6523 on 9 and 22 DF, p-value: 0.7414
The R² of the auxiliary regression tells us how well the regressors explain the variation in squared residuals. A high R² would mean the error variance is systematically related to the regressors — a red flag for heteroskedasticity. Here, R² = 0.211, which is fairly low, suggesting the regressors have limited ability to predict the size of the errors. This is consistent with homoskedasticity.
Replicating the Test Statistic Manually
n <-nrow(mtcars)r2 <-summary(aux)$r.squareddf <-length(coef(aux)) -1# 9 slope coefficients# Test statistic: n * R2LM <- n * r2cat("Test statistic (n * R2):", LM, "\n")
Test statistic (n * R2): 6.740797
# Critical value at 5% significancecrit <-qchisq(0.95, df = df)cat("Critical value (chi-sq, df=9, alpha=5%):", crit, "\n")
The test statistic is n × R² = 32 × 0.2106 = 6.74, which matches the BP statistic from bptest() exactly. The critical value at 5% significance with 9 degrees of freedom is 16.92. Since 6.74 < 16.92, we fail to reject the null. The p-value of 0.664 confirms this, well above the 0.05 threshold. We conclude there is no statistically significant heteroskedasticity in this model.