Multicollinearity

Zahid Asghar, School of Economics, Quaid-i-Azam University

11/18/2020

Concept of Multicollinearity

CONSEQUENCES

VARIANCE INFLATION FACTOR

For the following regression model: \(Y_{i}=\beta_0+\beta_1 X_{1i}+\beta_2 X_{2i}+e_{i}\)

It can be shown that:

\(\mathrm{Var}(\hat{\beta}_1) = \sigma^2 \left( \frac{1}{1 - r_{12}^2} \right) \frac{1}{S_{x_1 x_1}}\)

\(\mathrm{Var}(\hat{\beta}_2) = \sigma^2 \left( \frac{1}{1 - r_{21}^2} \right) \frac{1}{S_{x_2 x_2}}\)

where \(S_{x_j x_j} = \sum_i (x_{ij}-\bar{x}_j)^2\) for \(j = 1, 2\), and \(r_{12} = r_{21}\) is the sample correlation between \(X_1\) and \(X_2\).

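The Variance Inflation Factor (VIF) of \(\hat{\beta}_1\) is \(\frac{1}{1 - r_{12}^2}\) and that of \(\hat{\beta}_2\) is \(\frac{1}{1 - r_{21}^2}\). Because \(r_{12} = r_{21}\), the two are equal in this two-regressor model. The VIF is the factor by which the variance of a coefficient is inflated relative to the case of uncorrelated regressors (\(r_{12} = 0\), VIF = 1).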
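The two-regressor VIF can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not the code behind the results in these slides: the function names `pearson_r` and `vif_two_regressors` and the data vectors are hypothetical.

```python
import math

def pearson_r(x, y):
    """Sample correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_regressors(x1, x2):
    """VIF = 1 / (1 - r12^2) for a model with exactly two regressors."""
    r12 = pearson_r(x1, x2)
    return 1.0 / (1.0 - r12 ** 2)

# Hypothetical data: x2 is almost a copy of x1, so the VIF is large.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2]
print("VIF (nearly collinear):", vif_two_regressors(x1, x2))

# Hypothetical data: these two vectors are uncorrelated, so the VIF is 1.
u1 = [-1.0, 0.0, 1.0]
u2 = [1.0, -2.0, 1.0]
print("VIF (uncorrelated):", vif_two_regressors(u1, u2))
```

The formula applies only to the two-regressor case; with more regressors each VIF depends on how well one regressor is explained by all the others jointly.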

Detection of Multicollinearity

Longley Models with full data and without 1962

We use two versions of the Longley data: one with all 16 years and one with the year 1962 omitted (15 observations). Adding the single observation for 1962 changes the coefficient magnitudes considerably; for example, the coefficient on price moves from -181.12 to -19.77. We have also calculated the VIF for both models.

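Longley Data for Multicollinearity

| | md62 (1962 omitted) | full (all 16 years) |
|---|---|---|
| (Intercept) | 1459415.07 | 1169087.53 |
| | (714182.87) | (835902.44) |
| year | -721.76 | -576.46 |
| | (369.98) | (433.49) |
| price | -181.12 | -19.77 |
| | (135.52) | (138.89) |
| gnp | 0.09 ** | 0.06 ** |
| | (0.02) | (0.02) |
| armed | -0.07 | -0.01 |
| | (0.26) | (0.31) |
| N | 15 | 16 |
| \(R^2\) | 0.98 | 0.97 |
| logLik | -113.20 | -123.76 |
| AIC | 238.41 | 259.52 |

*** p < 0.001; ** p < 0.01; * p < 0.05.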
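The comparison in the table (refit the same model with one observation left out and compare coefficients) can be sketched in plain Python for a two-regressor model. This is an illustrative sketch only: the function name `fit_two_regressors` and all data values are hypothetical and are not the Longley data.

```python
def fit_two_regressors(y, x1, x2):
    """OLS for y = b0 + b1*x1 + b2*x2 via deviation-form normal equations."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    d1 = [a - m1 for a in x1]
    d2 = [a - m2 for a in x2]
    dy = [a - my for a in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * b for a, b in zip(d1, dy))
    s2y = sum(a * b for a, b in zip(d2, dy))
    # The determinant is close to zero when x1 and x2 are nearly collinear,
    # which is what makes the estimates sensitive to small data changes.
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2

# Hypothetical, nearly collinear regressors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [1.1, 2.0, 2.9, 4.2, 5.1, 5.9, 7.0, 8.3]
y = [3.1, 4.9, 7.2, 9.1, 10.8, 13.2, 15.1, 16.8]

full = fit_two_regressors(y, x1, x2)
dropped = fit_two_regressors(y[:-1], x1[:-1], x2[:-1])  # omit last observation
print("all observations: ", full)
print("one observation out:", dropped)
```

Comparing the two printed coefficient tuples shows how much the estimates move when a single observation is removed.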
##       year      price        gnp      armed 
## 121.533754  87.346665 154.075397   1.559474
##       year      price        gnp      armed 
## 143.463545  75.670734 132.463801   1.553191
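
With more than two regressors, as in the output above, the pairwise-correlation formula no longer applies directly. The standard generalization replaces \(r_{12}^2\) with \(R_j^2\), the \(R^2\) from regressing \(X_j\) on all the other regressors (the symbol \(R_j^2\) is introduced here for illustration):

\(VIF_j = \frac{1}{1 - R_j^2} \quad\Longleftrightarrow\quad R_j^2 = 1 - \frac{1}{VIF_j}\)

Applied to the first set of values printed above:

\(R_{gnp}^2 = 1 - \frac{1}{154.075397} \approx 0.9935, \qquad R_{armed}^2 = 1 - \frac{1}{1.559474} \approx 0.359\)

So gnp is almost entirely explained by the other regressors, while armed is only weakly related to them.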

Strategies to Solve Multicollinearity