\[Y_i=\beta_0+\beta_1x_i+\epsilon_i\]
The \(\epsilon_i\), called errors, are assumed to be i.i.d. (independently and identically distributed). Independence implies that the errors are uncorrelated with one another, and identical distribution implies that they all have the same variance. The errors satisfy this assumption by construction, because they exist only in the model and are never observed. The practical question is whether their sample counterparts, the residuals, satisfy the i.i.d. property, and this is what one needs to test. Residuals are what we get in practice when we run a regression: the difference between the observed and the fitted values of the dependent variable,
\(e_i=Y_i-\hat{Y}_i\)
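To make the distinction between errors and residuals concrete, here is a minimal sketch on simulated data. The coefficients, sample size and seed are arbitrary illustrative choices and have nothing to do with the growth data used later.

```r
# Simulated data; all numbers here are illustrative assumptions.
set.seed(1)
n   <- 100
x   <- runif(n, 0, 10)
eps <- rnorm(n)                 # errors: known here only because we simulate them
y   <- 2 + 0.5 * x + eps

fit   <- lm(y ~ x)
y_hat <- fitted(fit)            # fitted values
e     <- y - y_hat              # residuals e_i = Y_i - Y_hat_i

# The hand-computed residuals agree with R's residuals()
all.equal(unname(e), unname(residuals(fit)))

# With an intercept in the model, OLS residuals sum to (numerically) zero
sum(e)
```

In real data only `e` is available; `eps` is never observed, which is why all diagnostic checks are run on the residuals.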
Homoscedasticity means that all observations share the same error variance, as if the data were generated from a single population. Textbooks cover many tests for detecting heteroscedasticity, along with a number of theoretical procedures for dealing with it when it is present. The Breusch-Pagan and White (strictly, it should be Eicker-White) tests are the most useful for detecting heteroscedasticity. Heteroscedasticity is mainly a problem in cross-sectional data, while autocorrelation is mainly an issue in time-series data.
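For readers who want to see what such a test does, the following is a rough sketch of the studentized (Koenker) form of the Breusch-Pagan test written in base R on simulated data. The data-generating process is my own assumption, chosen so that the error variance grows with the regressor. In practice `bptest()` from the `lmtest` package performs this kind of test.

```r
# Simulated heteroscedastic data; all numbers are illustrative assumptions.
set.seed(42)
n <- 500
x <- runif(n, 1, 10)
y <- 1 + 2 * x + rnorm(n, sd = 0.5 * x)   # error s.d. grows with x

fit <- lm(y ~ x)
u2  <- residuals(fit)^2                   # squared residuals

# Auxiliary regression of the squared residuals on the regressor
aux <- lm(u2 ~ x)

# LM statistic = n * R^2, chi-squared with 1 degree of freedom (one regressor)
bp_stat <- n * summary(aux)$r.squared
bp_pval <- pchisq(bp_stat, df = 1, lower.tail = FALSE)

c(statistic = bp_stat, p.value = bp_pval)
```

A small p-value leads to rejecting the null hypothesis of homoscedasticity.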
Here I show how to handle heteroscedasticity in practice in a way that requires neither detecting it nor formally removing it.
As we know, the presence of heteroscedasticity affects the standard errors, so the standard errors need to be corrected for it. The corrected SEs are called heteroscedasticity-corrected (or robust) standard errors. This option is available in all major statistical software, so one should always use heteroscedasticity-corrected standard errors when running an econometric model. If there is no heteroscedasticity, the correction makes little difference and the corrected standard errors stay close to the conventional ones; if there is heteroscedasticity, the standard errors are corrected accordingly.
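I am going to demonstrate here how to do it in R, STATA and EVIEWS. You may explore other software on your own.

For the slope of the simple regression above, with \(\hat{u}_i\) denoting the residuals (the \(e_i\) above) and \(X_i\) the regressor, the heteroscedasticity-robust standard error is
\[ SE(\hat{\beta}_1) = \sqrt{ \frac{1}{n} \cdot \frac{ \frac{1}{n} \sum_{i=1}^n (X_i - \overline{X})^2 \hat{u}_i^2 }{ \left[ \frac{1}{n} \sum_{i=1}^n (X_i - \overline{X})^2 \right]^2} } \tag{1}\]
The HC1 version applies a degrees-of-freedom correction by replacing \(\frac{1}{n}\) in the numerator with \(\frac{1}{n-2}\):
\[ SE(\hat{\beta}_1)_{HC1} = \sqrt{ \frac{1}{n} \cdot \frac{ \frac{1}{n-2} \sum_{i=1}^n (X_i - \overline{X})^2 \hat{u}_i^2 }{ \left[ \frac{1}{n} \sum_{i=1}^n (X_i - \overline{X})^2 \right]^2}} \tag{2}\]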
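Equations (1) and (2) can be computed by hand in a few lines. The sketch below uses simulated data (my own assumption, not the growth data) and cross-checks equation (1) against the usual matrix "sandwich" form \((X'X)^{-1} X' \mathrm{diag}(\hat{u}_i^2) X (X'X)^{-1}\).

```r
# Simulated heteroscedastic data; all numbers are illustrative assumptions.
set.seed(123)
n <- 200
x <- runif(n, 0, 5)
y <- 1 + 2 * x + rnorm(n, sd = 1 + x)

fit   <- lm(y ~ x)
u_hat <- residuals(fit)
dx2   <- (x - mean(x))^2

# Equation (1): robust standard error of the slope without correction
se_hc0 <- sqrt((1 / n) * mean(dx2 * u_hat^2) / mean(dx2)^2)

# Equation (2): HC1, with 1/(n - 2) in place of 1/n in the numerator
se_hc1 <- sqrt((1 / n) * (sum(dx2 * u_hat^2) / (n - 2)) / mean(dx2)^2)

# Cross-check of equation (1) using the matrix sandwich form
X     <- model.matrix(fit)
bread <- solve(crossprod(X))          # (X'X)^{-1}
meat  <- crossprod(X * u_hat)         # X' diag(u_hat^2) X
se_hc0_matrix <- sqrt((bread %*% meat %*% bread)[2, 2])

all.equal(se_hc0, se_hc0_matrix)
all.equal(se_hc1, se_hc0 * sqrt(n / (n - 2)))
```

The second check shows that HC1 is simply the uncorrected robust standard error scaled by \(\sqrt{n/(n-2)}\), so the two differ noticeably only in small samples.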
library(readxl)
library(dplyr)
library(moderndive)
library(sandwich)
library(lmtest)
library(broom)
library(car)
library(knitr)
Growth <- read_excel("C:/Users/hp/Dropbox/Applied Economics/Growth.xlsx")  # path on the author's machine
growth_malta <- Growth %>% filter(country_name != "Malta")  # sample excluding Malta
mod.lm1 <- lm(growth ~ tradeshare + yearsschool, data = growth_malta)  # OLS fit
kable(tidy(mod.lm1), caption = "Regular standard errors in the 'growth' equation")
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | -0.1222362 | 0.6626687 | -0.1844605 | 0.8542641 |
tradeshare | 1.8978230 | 0.9360473 | 2.0274862 | 0.0469879 |
yearsschool | 0.2429752 | 0.0837020 | 2.9028597 | 0.0051402 |
cov1 <- hccm(mod.lm1, type = "hc1")  # HC1 covariance matrix; needs package 'car'
growth.HC1 <- coeftest(mod.lm1, vcov. = cov1)  # coefficient tests with robust SEs; needs package 'lmtest'
kable(tidy(growth.HC1), caption = "Robust (HC1) standard errors in the 'growth' equation")
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | -0.1222362 | 0.6911650 | -0.1768553 | 0.8602079 |
tradeshare | 1.8978230 | 0.8655411 | 2.1926434 | 0.0321573 |
yearsschool | 0.2429752 | 0.0758919 | 3.2015973 | 0.0021717 |
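
The advice to always use corrected standard errors rests on the idea that the correction is harmless when there is no heteroscedasticity. A small simulation can illustrate this. This is a sketch under my own assumptions: the helper `se_compare()` is a hypothetical name, the data are simulated, and HC1 is computed in base R from the sandwich formula rather than with `hccm()`.

```r
# Hypothetical helper: conventional and HC1 standard errors of the slope
# (second coefficient) of a fitted lm object.
se_compare <- function(fit) {
  X <- model.matrix(fit)
  u <- residuals(fit)
  n <- nrow(X)
  k <- ncol(X)
  bread <- solve(crossprod(X))                          # (X'X)^{-1}
  V_hc1 <- (n / (n - k)) * bread %*% crossprod(X * u) %*% bread
  c(conventional = sqrt(vcov(fit)[2, 2]),
    hc1          = sqrt(V_hc1[2, 2]))
}

# Simulated data; all numbers are illustrative assumptions.
set.seed(2024)
n <- 5000
x <- runif(n, 0, 10)
y_homo   <- 1 + 2 * x + rnorm(n, sd = 3)           # constant error variance
y_hetero <- 1 + 2 * x + rnorm(n, sd = 0.1 * x^2)   # variance rises with x

se_homo   <- se_compare(lm(y_homo ~ x))
se_hetero <- se_compare(lm(y_hetero ~ x))

# Ratio of robust to conventional standard error in each case
ratio_homo   <- unname(se_homo["hc1"] / se_homo["conventional"])
ratio_hetero <- unname(se_hetero["hc1"] / se_hetero["conventional"])
c(homoscedastic = ratio_homo, heteroscedastic = ratio_hetero)
```

Under homoscedasticity the ratio should be close to 1, so little is lost by using the robust version. In this particular heteroscedastic design the ratio should be clearly above 1, meaning the conventional standard error understates the uncertainty. In other designs, as in the growth tables above, the robust standard errors can also turn out smaller than the conventional ones.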