### Define Parameters ###
set.seed(1001)
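# Three predictors, each 100 draws from {1, 2, 3, 4} with replacement
x1 = sample(1:4, size = 100, replace = TRUE)
x2 = sample(1:4, size = 100, replace = TRUE)
x3 = sample(1:4, size = 100, replace = TRUE)
# Response: the expected value is 0.3*x1 + 0.9*x2 + 0.2*x3, so x2 dominates
y = rbinom(100, x1, .3) + rbinom(100, x2, .9) + rpois(100, x3/5)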
### Full model with all parameters ###
linreg = lm(y ~ x1 + x2 + x3)
summary(linreg)
##
## Call:
## lm(formula = y ~ x1 + x2 + x3)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -2.0334 -0.8558 -0.0677  0.4114  4.0576
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  0.85818    0.40328   2.128   0.0359 *
## x1           0.09936    0.09540   1.042   0.3003
## x2           0.89235    0.08823  10.114   <2e-16 ***
## x3           0.03083    0.09088   0.339   0.7352
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.048 on 96 degrees of freedom
## Multiple R-squared: 0.5253, Adjusted R-squared: 0.5104
## F-statistic: 35.41 on 3 and 96 DF, p-value: 1.699e-15
In the full model, x2 is significant (p-value < 0.05), while x1 and x3 are insignificant (p-values of 0.3003 and 0.7352, both > 0.05).
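The p-values read off the summary table can also be pulled out programmatically. The following is a minimal sketch, assuming a significance level of 0.05; the names `coef_table`, `p_values`, `alpha`, `significant`, and `not_significant` are illustrative and not part of the original analysis.

```r
# Re-create the simulated data and the full model
set.seed(1001)
x1 = sample(1:4, size = 100, replace = TRUE)
x2 = sample(1:4, size = 100, replace = TRUE)
x3 = sample(1:4, size = 100, replace = TRUE)
y = rbinom(100, x1, .3) + rbinom(100, x2, .9) + rpois(100, x3/5)
linreg = lm(y ~ x1 + x2 + x3)

# Coefficient table as a matrix; the "Pr(>|t|)" column holds the p-values
coef_table = coef(summary(linreg))
p_values = coef_table[-1, "Pr(>|t|)"]  # drop the intercept row

alpha = 0.05  # assumed significance level
significant = names(p_values)[p_values < alpha]
not_significant = names(p_values)[p_values >= alpha]

print(significant)
print(not_significant)
```

Splitting the predictors this way gives the variable sets used for the reduced models, without copying values from the printed output by hand.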
### Model with significant parameters ###
linreg2 = lm(y ~ x2)
summary(linreg2)
##
## Call:
## lm(formula = y ~ x2)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -1.9607 -0.8593  0.0393  0.2421  3.9379
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  1.16350    0.24184   4.811 5.44e-06 ***
## x2           0.89860    0.08727  10.297  < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.044 on 98 degrees of freedom
## Multiple R-squared: 0.5197, Adjusted R-squared: 0.5148
## F-statistic: 106 on 1 and 98 DF, p-value: < 2.2e-16
### Model with insignificant parameters ###
linreg3 = lm(y ~ x1 + x3)
summary(linreg3)
##
## Call:
## lm(formula = y ~ x1 + x3)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -2.5589 -1.1847 -0.2686  1.0688  4.1651
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   2.7308     0.5122   5.331 6.36e-07 ***
## x1            0.1452     0.1362   1.066    0.289
## x3            0.1309     0.1292   1.013    0.314
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.499 on 97 degrees of freedom
## Multiple R-squared: 0.01944, Adjusted R-squared: -0.0007822
## F-statistic: 0.9613 on 2 and 97 DF, p-value: 0.386
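The model with x2 alone confirms that x2 is significant (p-value < 2e-16), and it retains almost all of the full model's explanatory power (R-squared 0.5197 versus 0.5253). The model with only x1 and x3 is insignificant overall (F-statistic p-value 0.386), and neither of its coefficients is significant.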
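To view the three fits side by side, the key statistics can be collected into one table. This is a sketch only; the list `models` and the data frame `fit_stats` are illustrative names, and the choice of adjusted R-squared and residual standard error as the summary statistics is an assumption about what is most useful to compare.

```r
# Re-create the simulated data
set.seed(1001)
x1 = sample(1:4, size = 100, replace = TRUE)
x2 = sample(1:4, size = 100, replace = TRUE)
x3 = sample(1:4, size = 100, replace = TRUE)
y = rbinom(100, x1, .3) + rbinom(100, x2, .9) + rpois(100, x3/5)

# The three models fitted in this section
models = list(
  full    = lm(y ~ x1 + x2 + x3),
  x2_only = lm(y ~ x2),
  x1_x3   = lm(y ~ x1 + x3)
)

# One row per model: adjusted R-squared and residual standard error
fit_stats = data.frame(
  model         = names(models),
  adj_r_squared = sapply(models, function(m) summary(m)$adj.r.squared),
  residual_se   = sapply(models, function(m) summary(m)$sigma),
  row.names     = NULL
)
print(fit_stats)
```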
### Comparison of the full model with the model containing only the insignificant parameters (x1, x3) ###
anova(linreg, linreg3)
## Analysis of Variance Table
##
## Model 1: y ~ x1 + x2 + x3
## Model 2: y ~ x1 + x3
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)
## 1     96 105.48
## 2     97 217.87 -1   -112.39 102.29 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The full model was compared with the model containing only the insignificant parameters (x1 and x3). The partial F-test is highly significant (F = 102.29, p-value < 2.2e-16), so dropping x2 significantly worsens the fit: x2 is a significant coefficient.
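The F statistic reported by `anova()` can be reproduced by hand from the residual sums of squares of the two nested models. The sketch below uses illustrative names (`full`, `reduced`, `rss_full`, `rss_reduced`, `f_stat`, and so on) that are not in the original code. It also checks a standard identity: when exactly one coefficient is dropped, the partial F statistic equals the squared t statistic of that coefficient in the full model.

```r
# Re-create the simulated data
set.seed(1001)
x1 = sample(1:4, size = 100, replace = TRUE)
x2 = sample(1:4, size = 100, replace = TRUE)
x3 = sample(1:4, size = 100, replace = TRUE)
y = rbinom(100, x1, .3) + rbinom(100, x2, .9) + rpois(100, x3/5)

full    = lm(y ~ x1 + x2 + x3)
reduced = lm(y ~ x1 + x3)  # x2 dropped

# Residual sums of squares and degrees of freedom
rss_full    = sum(resid(full)^2)
rss_reduced = sum(resid(reduced)^2)
df_full     = df.residual(full)
df_diff     = df.residual(reduced) - df_full  # number of dropped coefficients

# Partial F statistic and its upper-tail p-value
f_stat  = ((rss_reduced - rss_full) / df_diff) / (rss_full / df_full)
p_value = pf(f_stat, df_diff, df_full, lower.tail = FALSE)

# With one dropped coefficient, F equals the squared t statistic of x2
t_x2 = coef(summary(full))["x2", "t value"]
print(c(F = f_stat, t_squared = t_x2^2, p = p_value))
```

This identity is consistent with the output above: the t value of x2 in the full model is 10.114, and 10.114^2 is approximately 102.29.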
### Comparison of the full model with the model containing only the significant parameter (x2) ###
anova(linreg, linreg2)
## Analysis of Variance Table
##
## Model 1: y ~ x1 + x2 + x3
## Model 2: y ~ x2
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1     96 105.48
## 2     98 106.72 -2   -1.2388 0.5637 0.5709
The full model was compared with the model containing only the significant parameter (x2). The partial F-test is not significant (F = 0.5637, p-value = 0.5709), so x1 and x3 are jointly insignificant coefficients. They do not significantly reduce the residual error of the model (RSS falls only from 106.72 to 105.48) and are not needed for the analysis.
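As a check on the arithmetic, the F value in this last table follows from the two residual sums of squares and their degrees of freedom. The symbols below are introduced here for illustration only, and the numbers are taken from the table:

$$
F = \frac{(\mathrm{RSS}_{\text{reduced}} - \mathrm{RSS}_{\text{full}}) / (\mathrm{df}_{\text{reduced}} - \mathrm{df}_{\text{full}})}{\mathrm{RSS}_{\text{full}} / \mathrm{df}_{\text{full}}}
  = \frac{(106.72 - 105.48) / (98 - 96)}{105.48 / 96}
  \approx \frac{1.2388 / 2}{1.0988}
  \approx 0.5637
$$

The numerator uses the unrounded difference 1.2388 from the "Sum of Sq" column. Dropping x1 and x3 together raises the residual sum of squares by only about 1.2 percent, which is why the test does not reject the smaller model.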