Identify the typical signs of multicollinearity in a regression.
The easiest way to detect multicollinearity is to examine the correlation between each pair of explanatory variables. If two of the variables are highly correlated, this may be the source of multicollinearity. However, high pairwise correlation between explanatory variables is a sufficient but not a necessary condition for multicollinearity: it can also arise from a linear relationship among three or more regressors even when no single pair is highly correlated. The second easy way to detect multicollinearity is to estimate the multiple regression and then examine the output carefully. The rule of thumb is to suspect multicollinearity when \(R^2\) is very high but most of the coefficients are not significant according to their p-values. However, this is not an acid test for multicollinearity; it only gives a first indication that it may be present.
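As a rough illustration of the two checks described above, the sketch below looks at pairwise correlations and then computes variance inflation factors by hand from auxiliary regressions. The data are simulated and the names x1, x2, x3 and vif_by_hand are hypothetical; nothing here uses the wage data.

```r
# Simulated regressors (hypothetical data, not the wage data set)
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.2)   # built to be strongly related to x1
x3 <- rnorm(n)                  # unrelated to the other two
X  <- data.frame(x1, x2, x3)

# First check: pairwise correlations between the regressors
round(cor(X), 2)

# Second check: VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from
# regressing x_j on all the other regressors
vif_by_hand <- sapply(names(X), function(v) {
  aux <- lm(reformulate(setdiff(names(X), v), response = v), data = X)
  1 / (1 - summary(aux)$r.squared)
})
vif_by_hand
```

Unlike the pairwise check, the auxiliary regressions also pick up a regressor that is a combination of several others.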
Explain what might give rise to multicollinearity and provide real-life examples of such situations.
High or perfect correlation between regressors. For example, if we want to check how height might explain high grades and we also include income as a regressor, the two regressors overlap, because height is correlated with income across the globe.
Correct for multicollinearity in a regression.
# Install the packages this script uses if you don't have them already
# install.packages("ggplot2")
# install.packages("readxl")
# install.packages("GGally")
# install.packages("corpcor")
# install.packages("mctest")
# install.packages("ppcor")
# Clear workspace
rm(list=ls())
# Call packages you use
library(ggplot2)
library(readxl)
library(GGally)
library(corpcor)
library(mctest)
library(ppcor)
Loading required package: MASS
# Choose the path and name of the file "6_1_data"
wagesmicrodata <- read.csv(file="/Users/oba2311/Desktop/Minerva/Junior/SS154/6/6-1-data.csv")
# To view the dataframe
# View(wagesmicrodata)
# The database is attached to the R search path.
# This means that the database is searched by R when evaluating a variable, so objects in the database can be accessed by simply giving their names.
attach(wagesmicrodata)
# Assuming no multicollinearity, estimate the full model:
fit1 <- lm(log(WAGE)~OCCUPATION+SECTOR+UNION+EDUCATION+EXPERIENCE+AGE+SEX+MARRSTAT+RACE+SOUTH)
# To get model summary
summary(fit1)
Call:
lm(formula = log(WAGE) ~ OCCUPATION + SECTOR + UNION + EDUCATION +
EXPERIENCE + AGE + SEX + MARRSTAT + RACE + SOUTH)
Residuals:
Min 1Q Median 3Q Max
-2.16246 -0.29163 -0.00469 0.29981 1.98248
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.078596 0.687514 1.569 0.117291
OCCUPATION -0.007417 0.013109 -0.566 0.571761
SECTOR 0.091458 0.038736 2.361 0.018589 *
UNION 0.200483 0.052475 3.821 0.000149 ***
EDUCATION 0.179366 0.110756 1.619 0.105949
EXPERIENCE 0.095822 0.110799 0.865 0.387531
AGE -0.085444 0.110730 -0.772 0.440671
SEX -0.221997 0.039907 -5.563 4.24e-08 ***
MARRSTAT 0.076611 0.041931 1.827 0.068259 .
RACE 0.050406 0.028531 1.767 0.077865 .
SOUTH -0.102360 0.042823 -2.390 0.017187 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.4398 on 523 degrees of freedom
Multiple R-squared: 0.3185, Adjusted R-squared: 0.3054
F-statistic: 24.44 on 10 and 523 DF, p-value: < 2.2e-16
# Make plots of residuals
par(mfrow=c(2,2))
plot(fit1)
not plotting observations with leverage one:
444
# Create X with explanatory variables
X<-wagesmicrodata[,2:11]
# pair-wise correlation of explanatory variables
ggpairs(X)
# partial correlation coefficient matrix
cor2pcor(cov(X))
[,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,] 1.000000000 -0.031750193 0.051510483 -0.99756187 -0.007479144 0.99726160 0.017230877
[2,] -0.031750193 1.000000000 -0.030152499 -0.02231360 -0.097548621 0.02152507 -0.111197596
[3,] 0.051510483 -0.030152499 1.000000000 0.05497703 -0.120087577 -0.05369785 0.020017315
[4,] -0.997561873 -0.022313605 0.054977034 1.00000000 -0.010244447 0.99987574 0.010888486
[5,] -0.007479144 -0.097548621 -0.120087577 -0.01024445 1.000000000 0.01223890 -0.107706183
[6,] 0.997261601 0.021525073 -0.053697851 0.99987574 0.012238897 1.00000000 -0.010803310
[7,] 0.017230877 -0.111197596 0.020017315 0.01088849 -0.107706183 -0.01080331 1.000000000
[8,] 0.029436911 0.008430595 -0.142750864 0.04205856 0.212996388 -0.04414029 0.057539374
[9,] -0.021253493 -0.021518760 -0.112146760 -0.01326166 -0.013531482 0.01456575 0.006412099
[10,] -0.040302967 0.030418218 0.004163264 -0.04097664 0.068918496 0.04509033 0.055645964
[,8] [,9] [,10]
[1,] 0.029436911 -0.021253493 -0.040302967
[2,] 0.008430595 -0.021518760 0.030418218
[3,] -0.142750864 -0.112146760 0.004163264
[4,] 0.042058560 -0.013261665 -0.040976643
[5,] 0.212996388 -0.013531482 0.068918496
[6,] -0.044140293 0.014565751 0.045090327
[7,] 0.057539374 0.006412099 0.055645964
[8,] 1.000000000 0.314746868 -0.018580965
[9,] 0.314746868 1.000000000 0.036495494
[10,] -0.018580965 0.036495494 1.000000000
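To make the matrix above easier to read, here is a minimal sketch of what a partial correlation is, using base R only. It relies on the standard identity that the partial correlation between two variables given all the others can be read off the inverse covariance matrix. The variables a, b and z are simulated and hypothetical.

```r
# Simulated data: a and b are correlated only because both depend on z
set.seed(2)
n <- 400
z <- rnorm(n)
a <- z + rnorm(n)
b <- z + rnorm(n)
X <- cbind(a = a, b = b, z = z)

# Partial correlations from the inverse covariance (precision) matrix P:
# pcor_ij = -P_ij / sqrt(P_ii * P_jj)
P  <- solve(cov(X))
pc <- -P / sqrt(outer(diag(P), diag(P)))
diag(pc) <- 1
dimnames(pc) <- list(colnames(X), colnames(X))
round(pc, 3)

# Cross-check: correlate what is left of a and b after removing z
res_a <- resid(lm(a ~ z))
res_b <- resid(lm(b ~ z))
cor(res_a, res_b)   # same value as pc["a", "b"]
cor(a, b)           # the raw correlation is much larger
```

Read this way, the entries near plus or minus 1 in rows 1, 4 and 6 say that each of those three variables is almost fully determined by the other two once the remaining regressors are held fixed.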
# Individual Multicollinearity Diagnostics Result
imcdiag(X,WAGE)
Call:
imcdiag(x = X, y = WAGE)
All Individual Multicollinearity Diagnostics Result
VIF TOL Wi Fi Leamer CVIF Klein
EDUCATION 231.1956 0.0043 13402.4982 15106.5849 0.0658 236.4725 1
SOUTH 1.0468 0.9553 2.7264 3.0731 0.9774 1.0707 0
SEX 1.0916 0.9161 5.3351 6.0135 0.9571 1.1165 0
EXPERIENCE 5184.0939 0.0002 301771.2445 340140.5368 0.0139 5302.4188 1
UNION 1.1209 0.8922 7.0368 7.9315 0.9445 1.1464 0
AGE 4645.6650 0.0002 270422.7164 304806.1391 0.0147 4751.7005 1
RACE 1.0371 0.9642 2.1622 2.4372 0.9819 1.0608 0
OCCUPATION 1.2982 0.7703 17.3637 19.5715 0.8777 1.3279 0
SECTOR 1.1987 0.8343 11.5670 13.0378 0.9134 1.2260 0
MARRSTAT 1.0961 0.9123 5.5969 6.3085 0.9551 1.1211 0
1 --> COLLINEARITY is detected by the test
0 --> COLLINEARITY is not detected by the test
EDUCATION , SOUTH , EXPERIENCE , AGE , RACE , OCCUPATION , SECTOR , MARRSTAT , coefficient(s) are non-significant may be due to multicollinearity
R-square of y on all x: 0.2805
* use method argument to check which regressors may be the reason of collinearity
===================================
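The VIF and TOL columns above are linked by TOL = 1/VIF = 1 - R_j^2, where R_j^2 is the R-squared from regressing regressor j on all the others. The short sketch below converts a few of the reported VIF values back into that auxiliary R-squared. The numbers are copied from the table; the vector names are hypothetical.

```r
# VIF values copied from the imcdiag output above
vif <- c(EDUCATION = 231.1956, EXPERIENCE = 5184.0939,
         AGE = 4645.6650, OCCUPATION = 1.2982)

tol    <- 1 / vif   # tolerance, the TOL column
aux_r2 <- 1 - tol   # R-squared of each regressor on all the others

round(cbind(VIF = vif, TOL = tol, aux_R2 = aux_r2), 4)
```

For EDUCATION, EXPERIENCE and AGE the implied auxiliary R-squared is above 0.99, which is why the test flags exactly these three.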
# Partial correlations with their t-statistics and p-values (ppcor)
pcor(X, method = "pearson")
$estimate
EDUCATION SOUTH SEX EXPERIENCE UNION AGE
EDUCATION 1.000000000 -0.031750193 0.051510483 -0.99756187 -0.007479144 0.99726160
SOUTH -0.031750193 1.000000000 -0.030152499 -0.02231360 -0.097548621 0.02152507
SEX 0.051510483 -0.030152499 1.000000000 0.05497703 -0.120087577 -0.05369785
EXPERIENCE -0.997561873 -0.022313605 0.054977034 1.00000000 -0.010244447 0.99987574
UNION -0.007479144 -0.097548621 -0.120087577 -0.01024445 1.000000000 0.01223890
AGE 0.997261601 0.021525073 -0.053697851 0.99987574 0.012238897 1.00000000
RACE 0.017230877 -0.111197596 0.020017315 0.01088849 -0.107706183 -0.01080331
OCCUPATION 0.029436911 0.008430595 -0.142750864 0.04205856 0.212996388 -0.04414029
SECTOR -0.021253493 -0.021518760 -0.112146760 -0.01326166 -0.013531482 0.01456575
MARRSTAT -0.040302967 0.030418218 0.004163264 -0.04097664 0.068918496 0.04509033
RACE OCCUPATION SECTOR MARRSTAT
EDUCATION 0.017230877 0.029436911 -0.021253493 -0.040302967
SOUTH -0.111197596 0.008430595 -0.021518760 0.030418218
SEX 0.020017315 -0.142750864 -0.112146760 0.004163264
EXPERIENCE 0.010888486 0.042058560 -0.013261665 -0.040976643
UNION -0.107706183 0.212996388 -0.013531482 0.068918496
AGE -0.010803310 -0.044140293 0.014565751 0.045090327
RACE 1.000000000 0.057539374 0.006412099 0.055645964
OCCUPATION 0.057539374 1.000000000 0.314746868 -0.018580965
SECTOR 0.006412099 0.314746868 1.000000000 0.036495494
MARRSTAT 0.055645964 -0.018580965 0.036495494 1.000000000
$p.value
EDUCATION SOUTH SEX EXPERIENCE UNION AGE RACE
EDUCATION 0.0000000 0.46745162 0.238259049 0.0000000 8.641246e-01 0.0000000 0.69337880
SOUTH 0.4674516 0.00000000 0.490162786 0.6096300 2.526916e-02 0.6223281 0.01070652
SEX 0.2382590 0.49016279 0.000000000 0.2080904 5.822656e-03 0.2188841 0.64692038
EXPERIENCE 0.0000000 0.60962999 0.208090393 0.0000000 8.146741e-01 0.0000000 0.80325456
UNION 0.8641246 0.02526916 0.005822656 0.8146741 0.000000e+00 0.7794483 0.01345383
AGE 0.0000000 0.62232811 0.218884070 0.0000000 7.794483e-01 0.0000000 0.80476248
RACE 0.6933788 0.01070652 0.646920379 0.8032546 1.345383e-02 0.8047625 0.00000000
OCCUPATION 0.5005235 0.84704000 0.001027137 0.3356824 8.220095e-07 0.3122902 0.18763758
SECTOR 0.6267278 0.62243025 0.010051378 0.7615531 7.568528e-01 0.7389200 0.88336002
MARRSTAT 0.3562616 0.48634504 0.924111163 0.3482728 1.143954e-01 0.3019796 0.20260170
OCCUPATION SECTOR MARRSTAT
EDUCATION 5.005235e-01 6.267278e-01 0.3562616
SOUTH 8.470400e-01 6.224302e-01 0.4863450
SEX 1.027137e-03 1.005138e-02 0.9241112
EXPERIENCE 3.356824e-01 7.615531e-01 0.3482728
UNION 8.220095e-07 7.568528e-01 0.1143954
AGE 3.122902e-01 7.389200e-01 0.3019796
RACE 1.876376e-01 8.833600e-01 0.2026017
OCCUPATION 0.000000e+00 1.467261e-13 0.6707116
SECTOR 1.467261e-13 0.000000e+00 0.4035489
MARRSTAT 6.707116e-01 4.035489e-01 0.0000000
$statistic
EDUCATION SOUTH SEX EXPERIENCE UNION AGE RACE
EDUCATION 0.0000000 -0.7271618 1.18069629 -327.2105031 -0.1712102 308.6803174 0.3944914
SOUTH -0.7271618 0.0000000 -0.69053623 -0.5109090 -2.2436907 0.4928456 -2.5613138
SEX 1.1806963 -0.6905362 0.00000000 1.2603880 -2.7689685 -1.2309760 0.4583091
EXPERIENCE -327.2105031 -0.5109090 1.26038801 0.0000000 -0.2345184 1451.9092015 0.2492636
UNION -0.1712102 -2.2436907 -2.76896848 -0.2345184 0.0000000 0.2801822 -2.4799336
AGE 308.6803174 0.4928456 -1.23097601 1451.9092015 0.2801822 0.0000000 -0.2473135
RACE 0.3944914 -2.5613138 0.45830912 0.2492636 -2.4799336 -0.2473135 0.0000000
OCCUPATION 0.6741338 0.1929920 -3.30152873 0.9636171 4.9902208 -1.0114033 1.3193223
SECTOR -0.4866246 -0.4927010 -2.58345399 -0.3036001 -0.3097781 0.3334607 0.1467827
MARRSTAT -0.9233273 0.6966272 0.09530228 -0.9387867 1.5813765 1.0332156 1.2757711
OCCUPATION SECTOR MARRSTAT
EDUCATION 0.6741338 -0.4866246 -0.92332727
SOUTH 0.1929920 -0.4927010 0.69662719
SEX -3.3015287 -2.5834540 0.09530228
EXPERIENCE 0.9636171 -0.3036001 -0.93878671
UNION 4.9902208 -0.3097781 1.58137652
AGE -1.0114033 0.3334607 1.03321563
RACE 1.3193223 0.1467827 1.27577106
OCCUPATION 0.0000000 7.5906763 -0.42541117
SECTOR 7.5906763 0.0000000 0.83597695
MARRSTAT -0.4254112 0.8359769 0.00000000
$n
[1] 534
$gp
[1] 8
$method
[1] "pearson"
# dropping "EXPERIENCE" and fitting a new model
fit2<- lm(log(WAGE)~OCCUPATION+SECTOR+UNION+EDUCATION+AGE+SEX+MARRSTAT+RACE+SOUTH)
summary(fit2)
Call:
lm(formula = log(WAGE) ~ OCCUPATION + SECTOR + UNION + EDUCATION +
AGE + SEX + MARRSTAT + RACE + SOUTH)
Residuals:
Min 1Q Median 3Q Max
-2.16018 -0.29085 -0.00513 0.29985 1.97932
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.501358 0.164794 3.042 0.002465 **
OCCUPATION -0.006941 0.013095 -0.530 0.596309
SECTOR 0.091013 0.038723 2.350 0.019125 *
UNION 0.200018 0.052459 3.813 0.000154 ***
EDUCATION 0.083815 0.007728 10.846 < 2e-16 ***
AGE 0.010305 0.001745 5.905 6.34e-09 ***
SEX -0.220100 0.039837 -5.525 5.20e-08 ***
MARRSTAT 0.075125 0.041886 1.794 0.073458 .
RACE 0.050674 0.028523 1.777 0.076210 .
SOUTH -0.103186 0.042802 -2.411 0.016261 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.4397 on 524 degrees of freedom
Multiple R-squared: 0.3175, Adjusted R-squared: 0.3058
F-statistic: 27.09 on 9 and 524 DF, p-value: < 2.2e-16
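One way to see what dropping EXPERIENCE bought is to compare the standard errors reported in the two summaries. The sketch below only does arithmetic on numbers copied from the fit1 and fit2 output above; the vector names are hypothetical.

```r
# Standard errors copied from summary(fit1) and summary(fit2)
se_fit1 <- c(EDUCATION = 0.110756, AGE = 0.110730)
se_fit2 <- c(EDUCATION = 0.007728, AGE = 0.001745)

# How many times larger the standard errors were with EXPERIENCE included
round(se_fit1 / se_fit2, 1)

# The fit itself barely changes (multiple R-squared from the two summaries)
c(fit1 = 0.3185, fit2 = 0.3175)
```

The standard errors shrink by an order of magnitude or more while the R-squared is nearly unchanged, which suggests EXPERIENCE carried almost no information beyond AGE and EDUCATION.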
Please bring a short paragraph, open in a browser tab, in which you explain which method for identifying multicollinearity you found the most useful and why.
I liked the workflow of the blog: it starts with visual inspection of pairwise correlations and then goes into depth with the full Farrar-Glauber (FG) workflow of three tests. I find the chi-square test useful. I would choose to go through the three tests, starting with visual inspection.
Simulate 1,000 data points in R from each of the following four models and save the 1,000 data points \((Y_i, X_i, W_i)\) for each model. Then use the simulated data to estimate \((b0, b1, b2)\) for each model.
\(Y_i = b0 + b1\cdot X_i + b2 \cdot W_i + e_i\)
Y<- runif(1000)
Where \((b0, b1, b2) = (1, 2, 3)\). Note that the Y drawn above with runif is random and independent of \(X_i\) and \(W_i\), so regressing it on them cannot recover these coefficients.
Where \(X_i \sim N(1,2)\)
norm_x <- rnorm(1000,1,2)
norm_w <- rnorm(1000,3,2)
model1 <- lm(Y~norm_x+norm_w)
summary(model1)
Call:
lm(formula = Y ~ norm_x + norm_w)
Residuals:
Min 1Q Median 3Q Max
-0.50429 -0.25404 0.00223 0.25235 0.51092
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.5137616 0.0172251 29.826 <2e-16 ***
norm_x -0.0008997 0.0046467 -0.194 0.847
norm_w -0.0055364 0.0045966 -1.204 0.229
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.2954 on 997 degrees of freedom
Multiple R-squared: 0.001486, Adjusted R-squared: -0.0005166
F-statistic: 0.7421 on 2 and 997 DF, p-value: 0.4764
Where \(e_i \sim N(0,1)\)
norm_e <-rnorm(1000,0,1)
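For comparison, here is a minimal sketch of how the first model could be simulated so that Y is actually built from \(b0 + b1\cdot X_i + b2\cdot W_i + e_i\). It assumes, as the script above does, that the second parameter of N(1,2) is passed to rnorm as a standard deviation, and that W is drawn independently with mean 3 and standard deviation 2. The seed and the names x, w, e, y and sim1 are hypothetical.

```r
set.seed(154)                      # arbitrary seed, for reproducibility only
n <- 1000

x <- rnorm(n, mean = 1, sd = 2)    # X_i, sd = 2 as in the script above
w <- rnorm(n, mean = 3, sd = 2)    # W_i, assumed independent of X_i
e <- rnorm(n, mean = 0, sd = 1)    # e_i ~ N(0, 1)

# Build Y from the model with (b0, b1, b2) = (1, 2, 3)
y <- 1 + 2 * x + 3 * w + e

# Save the simulated triples and estimate the coefficients
sim1 <- data.frame(y, x, w)
fit_sim1 <- lm(y ~ x + w, data = sim1)
coef(fit_sim1)                     # estimates should land near 1, 2 and 3
```

With Y generated this way, the estimates can be compared directly against the true values, which is not possible when Y comes from runif.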
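Where \(W_i\) comes from: Let \(W_i = 2\cdot X_i\). What happens when you try to estimate the model? (The regressors are perfectly collinear, so \(X'X\) is not invertible and the two coefficients cannot be estimated separately. R's lm does not raise an error, though: it drops the redundant regressor and reports its coefficient as NA.)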
norm_w<- 2*norm_x
model2 <- lm(Y~norm_x+norm_w)
summary(model2)
Call:
lm(formula = Y ~ norm_x + norm_w)
Residuals:
Min 1Q Median 3Q Max
-0.49677 -0.25782 0.00199 0.25514 0.50677
Coefficients: (1 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.4974106 0.0106048 46.904 <2e-16 ***
norm_x -0.0008503 0.0046476 -0.183 0.855
norm_w NA NA NA NA
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.2955 on 998 degrees of freedom
Multiple R-squared: 3.354e-05, Adjusted R-squared: -0.0009684
F-statistic: 0.03347 on 1 and 998 DF, p-value: 0.8549
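The NA row above is how lm handles perfect collinearity. The sketch below repeats the experiment with Y generated from the stated model, so that it is visible where the effect of the dropped regressor goes. The seed and variable names are hypothetical, and rnorm is again given 2 as a standard deviation.

```r
set.seed(154)
n <- 1000

x <- rnorm(n, mean = 1, sd = 2)
w <- 2 * x                          # exact linear function of x
y <- 1 + 2 * x + 3 * w + rnorm(n)   # true (b0, b1, b2) = (1, 2, 3)

fit_perfect <- lm(y ~ x + w)
coef(fit_perfect)    # the coefficient on w is NA
fit_perfect$rank     # 2: only two of the three columns are independent

# The coefficient on x estimates b1 + 2 * b2 = 8, the only combination
# of b1 and b2 that the data can identify
coef(fit_perfect)[["x"]]
```

So the model is still estimated, but only the combined effect is identified; b1 and b2 cannot be separated.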
# W is a nonlinear (log) function of X
norm_w <- log(10 + norm_x)
model3 <- lm(Y~norm_x+norm_w)
summary(model3)
Call:
lm(formula = Y ~ norm_x + norm_w)
Residuals:
Min 1Q Median 3Q Max
-0.50536 -0.25742 0.00025 0.25593 0.50581
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.06498 0.88008 1.210 0.227
norm_x 0.02213 0.03593 0.616 0.538
norm_w -0.24807 0.38463 -0.645 0.519
Residual standard error: 0.2955 on 997 degrees of freedom
Multiple R-squared: 0.0004506, Adjusted R-squared: -0.001555
F-statistic: 0.2247 on 2 and 997 DF, p-value: 0.7988
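Although log(10 + X) is not a linear function of X, over the range of values X takes here it is close to one, which would explain the larger standard errors in this fit. A small sketch, with a hypothetical seed and variable names, checks how strong the linear relationship is:

```r
set.seed(154)
x <- rnorm(1000, mean = 1, sd = 2)
w <- log(10 + x)          # defined as long as 10 + x stays positive

r <- cor(x, w)
r                         # close to 1: the log is almost linear on this range

# With only two regressors, the VIF of each is 1 / (1 - r^2)
vif_log <- 1 / (1 - r^2)
vif_log
```

The model can still be estimated because the relationship is not exact, but the two coefficients are estimated imprecisely.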
# W is X plus a small noise term u_i (mean 1, sd 0.1), so it is nearly collinear with X
u_i <- rnorm(1000, 1, 0.1)
norm_w <- norm_x + u_i
model4 <- lm(Y~norm_x+norm_w)
summary(model4)
Call:
lm(formula = Y ~ norm_x + norm_w)
Residuals:
Min 1Q Median 3Q Max
-0.49725 -0.25819 0.00227 0.25665 0.50467
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.46289 0.09604 4.820 1.66e-06 ***
norm_x -0.03532 0.09542 -0.370 0.711
norm_w 0.03450 0.09538 0.362 0.718
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.2956 on 997 degrees of freedom
Multiple R-squared: 0.0001647, Adjusted R-squared: -0.001841
F-statistic: 0.08213 on 2 and 997 DF, p-value: 0.9212
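The standard errors on norm_x and norm_w in this fit are roughly twenty times those in the fits with an independent W. The sketch below reproduces that comparison with Y generated from the stated model. The seed, the helper se_x and the other names are hypothetical.

```r
set.seed(154)
n <- 1000

x <- rnorm(n, mean = 1, sd = 2)
e <- rnorm(n)

w_near <- x + rnorm(n, mean = 1, sd = 0.1)   # nearly collinear with x
w_ind  <- rnorm(n, mean = -1, sd = 2)        # drawn independently of x

# Standard error of the coefficient on x for a given choice of w
se_x <- function(w) {
  y <- 1 + 2 * x + 3 * w + e
  summary(lm(y ~ x + w))$coefficients["x", "Std. Error"]
}

se_near <- se_x(w_near)
se_ind  <- se_x(w_ind)
c(near = se_near, independent = se_ind, ratio = se_near / se_ind)
```

The estimates remain unbiased under near collinearity, but they are far less precise, so individual t-tests lose power.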
# W is drawn independently of X
norm_w <- rnorm(1000, -1, 2)
model5 <- lm(Y~norm_x+norm_w)
summary(model5)
Call:
lm(formula = Y ~ norm_x + norm_w)
Residuals:
Min 1Q Median 3Q Max
-0.49651 -0.25762 0.00172 0.25580 0.51046
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.4959764 0.0114634 43.266 <2e-16 ***
norm_x -0.0008748 0.0046503 -0.188 0.851
norm_w -0.0015144 0.0045839 -0.330 0.741
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.2956 on 997 degrees of freedom
Multiple R-squared: 0.000143, Adjusted R-squared: -0.001863
F-statistic: 0.07129 on 2 and 997 DF, p-value: 0.9312