Although heteroskedasticity does not bias the OLS coefficient estimates, it does bias the usual estimator of their variance-covariance matrix. This means that standard testing procedures such as t tests or F tests can no longer be relied on. This post provides an intuitive illustration of heteroskedasticity and covers the calculation of standard errors that are robust to it.

  • Panel A: If every student, regardless of how many hours they studied, has the same degree of randomness in their exam performance, this would result in homoskedasticity (see the simulation sketch after this list).

  • Panel B: Consider the relationship between income (X) and expenditure on luxury goods (Y). At lower incomes, expenditures might vary little because individuals spend mostly on necessities. At higher incomes, expenditures vary more due to personal preferences and discretionary spending.

  • Panel C: Predicting fuel efficiency based on engine size, where smaller engines exhibit more variation due to differing vehicle types, while larger engines (often found in trucks) are more consistent. A similar pattern: employee productivity (Y) based on years of experience (X), where productivity varies widely at lower experience levels because employees are still learning, and stabilizes and varies less at higher experience levels.

  • Panel D: Modeling student performance based on school funding, where variance is lower for mid-funded schools but higher for underfunded or overfunded schools due to other external factors.
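
To build intuition, the following minimal sketch simulates data in the spirit of Panels A and B (all variable names and parameter values are made up for illustration):

# Simulate homoskedastic (Panel A) and heteroskedastic (Panel B) data
set.seed(42)
n <- 200
x <- runif(n, 1, 10)
y_homo   <- 2 + 0.5 * x + rnorm(n, sd = 1)        # constant error variance
y_hetero <- 2 + 0.5 * x + rnorm(n, sd = 0.4 * x)  # variance grows with x
par(mfrow = c(1, 2))
plot(x, y_homo,   main = "Homoskedastic")
plot(x, y_hetero, main = "Heteroskedastic")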

1 Data

A popular illustration of heteroskedasticity is the relationship between saving and income, which is shown in the following graph. The dataset is contained in the wooldridge package.

# Clear the workspace
rm(list = ls())  # Clear environment
cat("\f")        # Clear the console
# Load packages
library(dplyr)
library(ggplot2)
library(wooldridge)
library(stargazer)
  
# Load the sample
data("saving")
stargazer(saving, type="text")
## 
## ================================================
## Statistic  N    Mean    St. Dev.    Min    Max  
## ------------------------------------------------
## sav       100 1,582.510 3,284.902 -5,577  25,405
## inc       100 9,941.240 5,583.998   750   32,080
## size      100   4.350     1.493      2      10  
## educ      100  11.580     3.435      2      20  
## age       100  38.770     7.399     26      54  
## black     100   0.070     0.256      0      1   
## cons      100 8,358.730 5,729.535 -13,055 30,280
## ------------------------------------------------

Observations where inc is larger than 20,000, or where sav is zero, negative, or larger than inc, are dropped from the sample.

# Only use positive values of saving, which are smaller than income
saving <- saving %>%
  filter(sav > 0,
         inc < 20000,
         sav < inc)

# Plot
ggplot(data = saving, 
       mapping = aes(x = inc, y = sav)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  labs(x = "Annual income", y = "Annual savings")
## `geom_smooth()` using formula = 'y ~ x'

The regression line in the graph shows a clear positive relationship between saving and income. However, as income increases, the differences between the observations and the regression line become larger. This means that there is higher uncertainty about the estimated relationship between the two variables at higher income levels. This is an example of heteroskedasticity.

Since standard testing procedures rely on the assumption that the variance of the errors is constant and unrelated to the independent variables, the usual standard errors are unreliable in the presence of heteroskedasticity. Fortunately, calculating heteroskedasticity-robust standard errors mitigates this problem.
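
Formally, the homoskedasticity assumption states that the error variance does not depend on the regressors,

\[ Var(\epsilon_i \mid x_i) = \sigma^2 \quad \text{for all } i , \]

whereas under heteroskedasticity this variance changes with the value of the regressors.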

2 Robust standard errors

The regression line above was derived from the model

\[ sav_i=\beta_0 + \beta_1 \ inc_i+ \epsilon_i , \]

for which the following code produces the standard R output:

# Estimate the model
model <- lm(formula = sav ~ inc, 
            data = saving
            )

# Print estimates and standard test statistics
summary(model)
## 
## Call:
## lm(formula = sav ~ inc, data = saving)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2667.8  -874.5  -302.7   431.1  4606.6 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)   
## (Intercept) 316.19835  462.06882   0.684  0.49595   
## inc           0.14052    0.04672   3.007  0.00361 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1413 on 73 degrees of freedom
## Multiple R-squared:  0.1102, Adjusted R-squared:  0.09805 
## F-statistic: 9.044 on 1 and 73 DF,  p-value: 0.003613
# Diagnostic plots; the Scale-Location plot (bottom left) is the most useful
# for spotting heteroskedasticity: an upward trend indicates non-constant variance
par(mfrow = c(2, 2))
plot(model)

2.0.1 t test

Since we already know that the model above suffers from heteroskedasticity, we want to obtain heteroskedasticity robust standard errors and their corresponding t values. In R the function coeftest from the lmtest package can be used in combination with the function vcovHC from the sandwich package to do this.

The first argument of the coeftest function is the output of the lm function; the t tests are then calculated based on the variance-covariance matrix passed to the vcov argument.

  • The vcovHC function produces that matrix and offers several heteroskedasticity-robust versions of it. In our case we obtain simple White standard errors, indicated by type = "HC0", whose formula is given below. More sophisticated types, which are tailored to account for the effect of leverage points in the design matrix, are described in the function's documentation (?vcovHC).
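
For reference, the HC0 ("White") estimator is the classic sandwich formula, in which the squared OLS residuals stand in for the unknown error variances:

\[ \widehat{Var}_{HC0}(\hat{\beta}) = (X'X)^{-1} \left( \sum_{i=1}^{n} \hat{\epsilon}_i^2 \, x_i x_i' \right) (X'X)^{-1} . \]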
# Load libraries
library("lmtest")
## Loading required package: zoo
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
?lmtest::coeftest

library("sandwich")

# Robust t test with classic White (HC0) standard errors
model0 <- coeftest(model, vcov = vcovHC(model, type = "HC0"))
# Robust t test with HC1 (degrees-of-freedom-corrected) standard errors
model1 <- coeftest(model, vcov = vcovHC(model, type = "HC1"))
# Robust t test with HC2 (leverage-adjusted) standard errors
model2 <- coeftest(model, vcov = vcovHC(model, type = "HC2"))
# Robust t test with HC3 (jackknife-type, more conservative) standard errors
model3 <- coeftest(model, vcov = vcovHC(model, type = "HC3"))
library(stargazer)
stargazer(model, model1, model2, model3, type = "text")
## 
## =======================================================================
##                                     Dependent variable:                
##                     ---------------------------------------------------
##                              sav                                       
##                              OLS                   coefficient         
##                                                       test             
##                              (1)             (2)       (3)       (4)   
## -----------------------------------------------------------------------
## inc                       0.141***        0.141***  0.141***  0.141*** 
##                            (0.047)         (0.049)   (0.051)   (0.052) 
##                                                                        
## Constant                   316.198         316.198   316.198   316.198 
##                           (462.069)       (420.371) (428.596) (443.298)
##                                                                        
## -----------------------------------------------------------------------
## Observations                 75                                        
## R2                          0.110                                      
## Adjusted R2                 0.098                                      
## Residual Std. Error  1,413.472 (df = 73)                               
## F Statistic         9.044*** (df = 1; 73)                              
## =======================================================================
## Note:                                       *p<0.1; **p<0.05; ***p<0.01
  • As you can see, choosing a different heteroskedasticity-consistent covariance type (HC0 to HC3) only affects the standard errors, not the point estimates. The sketch below verifies by hand what vcovHC computes.
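
To verify what vcovHC computes, the following minimal sketch implements the HC0 sandwich formula from above by hand; the square roots of its diagonal should match the HC0 standard errors produced by coeftest.

# Manual White (HC0) sandwich estimator
X <- model.matrix(model)            # design matrix
u <- residuals(model)               # OLS residuals
bread <- solve(t(X) %*% X)          # (X'X)^(-1)
meat  <- t(X) %*% diag(u^2) %*% X   # X' diag(e^2) X
V_hc0 <- bread %*% meat %*% bread   # sandwich product
sqrt(diag(V_hc0))                   # HC0 robust standard errors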

3 Breusch-Pagan Test for detecting heteroskedasticity

The Breusch-Pagan test checks whether the variance of the residuals depends on the values of the independent variables.

Therefore, the hypotheses of the Breusch-Pagan test are:

  • NULL - Homoskedasticity (residuals are distributed with equal variance)

  • ALTERNATIVE - Heteroskedasticity (residuals are not distributed with equal variance)

In R, you can perform the Breusch-Pagan test in different ways, for instance with:

  1. The bptest function from the lmtest package,

  2. The ncvTest function from the car package,

  3. The plmtest function from the plm package, or

  4. The breusch_pagan function from the skedastic package.

In this example, we use the bptest function from the lmtest package, which requires just a fitted lm object (i.e., a linear regression model) as its argument.

See also: https://www.codingprof.com/3-easy-ways-to-test-for-heteroscedasticity-in-r-examples/

library(lmtest)
lmtest::bptest(model)
## 
##  studentized Breusch-Pagan test
## 
## data:  model
## BP = 4.0869, df = 1, p-value = 0.04322
  • Example 1: Reject the null of homoskedasticity at an alpha of 5%, i.e., there is evidence of heteroskedasticity.
library(lmtest)
model_cars <- lm(mpg ~ hp, data = mtcars)
plot(y = mtcars$mpg, 
     x = mtcars$hp)

lmtest::bptest(model_cars)
## 
##  studentized Breusch-Pagan test
## 
## data:  model_cars
## BP = 0.049298, df = 1, p-value = 0.8243
  • Example 2: Fail to reject the null of homoskedasticity at an alpha of 5%, i.e., no evidence of heteroskedasticity.
library(lmtest)

model_trees <- lm(Volume ~ Height, data = trees)
plot(y = trees$Volume, x = trees$Height)

lmtest::bptest(model_trees)
## 
##  studentized Breusch-Pagan test
## 
## data:  model_trees
## BP = 12.207, df = 1, p-value = 0.0004762
  • Example 3: Reject the null of homoskedasticity at an alpha of 5%, i.e., there is evidence of heteroskedasticity.
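
To see what bptest computes under the hood, here is a minimal manual version of the studentized (Koenker) BP statistic for the saving model; it should reproduce the BP = 4.0869 from Example 1.

# Manual studentized Breusch-Pagan test:
# regress the squared residuals on the regressors and compute n * R^2
u2  <- residuals(model)^2
aux <- lm(u2 ~ inc, data = saving)            # auxiliary regression
bp  <- nrow(saving) * summary(aux)$r.squared  # LM statistic
c(BP = bp, p.value = pchisq(bp, df = 1, lower.tail = FALSE))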

4 White Test for detecting heteroskedasticity

See also: https://rdocumentation.org/packages/skedastic/versions/1.0.4/topics/white_lm

library(skedastic)
skedastic::white(model)
## # A tibble: 1 × 5
##   statistic p.value parameter method       alternative
##       <dbl>   <dbl>     <dbl> <chr>        <chr>      
## 1      4.31   0.116         2 White's Test greater
  • Example 1: Fail to reject the null of homoskedasticity at an alpha of 5%, i.e., no evidence of heteroskedasticity. This is a different conclusion than the BP test, so explore further.
skedastic::white(model_cars)
## # A tibble: 1 × 5
##   statistic p.value parameter method       alternative
##       <dbl>   <dbl>     <dbl> <chr>        <chr>      
## 1      6.44  0.0400         2 White's Test greater
  • Example 2: Reject the null of homoskedasticity at an alpha of 5%. This is a different conclusion than the BP test, so explore further.
skedastic::white(model_trees)
## # A tibble: 1 × 5
##   statistic  p.value parameter method       alternative
##       <dbl>    <dbl>     <dbl> <chr>        <chr>      
## 1      17.1 0.000197         2 White's Test greater
  • Example 3: Reject the null of homoskedasticity at an alpha of 5%. This is the same conclusion as the BP test, so there is strong evidence of heteroskedasticity. A manual version of White's statistic is sketched below.
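
White's test can be computed as a BP-type test whose auxiliary regression also includes the squared regressors (and, optionally, their cross-products). A minimal manual version for the saving model, which should reproduce the statistic of 4.31 from Example 1:

# Manual White test: auxiliary regression on the regressor and its square
u2  <- residuals(model)^2
aux <- lm(u2 ~ inc + I(inc^2), data = saving)
w   <- nrow(saving) * summary(aux)$r.squared  # LM statistic = n * R^2
c(statistic = w, p.value = pchisq(w, df = 2, lower.tail = FALSE))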

5 Summary

5.0.1 Breusch-Pagan (BP) Test

  • Purpose: The BP test specifically tests for heteroskedasticity by regressing the squared residuals from the original model on the independent variables. It is designed to test if the variance of the errors is related to the independent variables.

  • Null Hypothesis: The null hypothesis is that there is no heteroskedasticity (i.e., the variance of the residuals is constant).

  • Alternative Hypothesis: The alternative hypothesis is that there is heteroskedasticity (i.e., the variance of the residuals is related to the independent variables).

5.0.2 White Test

  • Purpose: The White test is a more general test for heteroskedasticity that does not assume any specific form of the relationship between the variance of errors and the independent variables. It can also test for model specification errors.

  • Null Hypothesis: The null hypothesis is that there is no heteroskedasticity (i.e., the variance of the residuals is constant).

  • Alternative Hypothesis: The alternative hypothesis is that there is heteroskedasticity (i.e., the variance of the residuals is related to the independent variables in a non-specific way).

5.0.3 Comparison

  • Focus:

    • BP Test: Checks for heteroskedasticity under the assumption that the error variance is a function of the explanatory variables in the model.

    • White Test: More general; checks for heteroskedasticity and possible model specification errors without assuming a specific functional form, by also including squares (and optionally cross-products) of the regressors in the auxiliary regression.

  • Similarities:

    • Both tests ultimately test the same null hypothesis of homoskedasticity.

    • Both can provide similar conclusions regarding the presence of heteroskedasticity.

  • Differences:

    • BP Test: Assumes a specific (typically linear) relationship between the error variance and the independent variables.

    • White Test: More flexible and robust to different functional forms of heteroskedasticity.

5.0.4 Interpreting Results

  • If both tests indicate heteroskedasticity, it suggests that the model’s residuals are not homoskedastic.

  • If they yield different results, you might need to investigate further or consider using additional diagnostic tests to confirm the presence and type of heteroskedasticity.

In practice, both tests are used to check for heteroskedasticity, and while they might occasionally show slight differences in results, they often provide similar insights about the presence of heteroskedasticity in your regression model.

6 Advanced: F test

As discussed above, heteroskedasticity biases the usual variance-covariance estimator, so the standard F test is just as unreliable as the t test.

The F test assesses the joint significance of multiple regressors.

  • The following example adds three new regressors (household size, education, and age) to the above model and calculates the corresponding (non-robust) F test using the anova function.
# Estimate the unrestricted model (includes all variables)
model_unres <- lm(sav ~ inc + size + educ + age, data = saving)

# F test
anova(model, model_unres)
## Analysis of Variance Table
## 
## Model 1: sav ~ inc
## Model 2: sav ~ inc + size + educ + age
##   Res.Df       RSS Df Sum of Sq      F Pr(>F)
## 1     73 145846877                           
## 2     70 144286605  3   1560272 0.2523 0.8594
  • For a heteroskedasticity-robust F test we perform a Wald test using the waldtest function, which is also contained in the lmtest package. It is used much like the anova function: it takes the output of the restricted and the unrestricted model, plus a robust variance-covariance matrix supplied via the vcov argument. Based on the variance-covariance matrix of the unrestricted model we again calculate White (HC0) robust standard errors.
waldtest(model,        # restricted model (excludes size, educ and age)
         model_unres,  # unrestricted model
         vcov = vcovHC(model_unres,
                       type = "HC0")  # robust variance-covariance matrix
         )
## Wald test
## 
## Model 1: sav ~ inc
## Model 2: sav ~ inc + size + educ + age
##   Res.Df Df      F Pr(>F)
## 1     73                 
## 2     70  3 0.3625 0.7803
  • The Wald test allows you to test hypotheses about the coefficients while accounting for potential heteroskedasticity in the errors, providing a more reliable test statistic when the variance is not constant across observations.

    • waldtest() evaluates whether the more complex (unrestricted) model significantly improves the fit compared to the simpler (restricted) model, using robust standard errors. A manual version of this robust F test is sketched below.
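
As a sanity check, the robust F statistic can also be computed by hand from the robust variance-covariance matrix of the unrestricted model; this sketch should reproduce the F = 0.3625 reported by waldtest above.

# Manual robust Wald/F test of H0: the coefficients on size, educ and age are all zero
restr  <- c("size", "educ", "age")
b      <- coef(model_unres)[restr]
V      <- vcovHC(model_unres, type = "HC0")[restr, restr]
F_stat <- as.numeric(t(b) %*% solve(V) %*% b) / length(restr)  # Wald statistic / q
c(F = F_stat,
  p.value = pf(F_stat, df1 = length(restr),
               df2 = df.residual(model_unres), lower.tail = FALSE))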