bailsofhay — Dec 3, 2013, 10:58 PM
data <- read.table("http://www.stat.lsu.edu/exstweb/statlab/datasets/KNNLData/APPENC03.txt")
names(data) <- c("ID", "y", "x1", "x2", "x3", "x4", "x5", "x6")
#### Part a ####
fit <- lm(y ~ x1 + x2 + I(x3 == 1) + I(x4 == 1) + I(x6 == 1999) + I(x6 == 2001) + I(x6 == 2002), data = data)
fit
Call:
lm(formula = y ~ x1 + x2 + I(x3 == 1) + I(x4 == 1) + I(x6 ==
1999) + I(x6 == 2001) + I(x6 == 2002), data = data)
Coefficients:
      (Intercept)                 x1                 x2
         3.02e+00          -2.47e-01          -9.65e-05
   I(x3 == 1)TRUE     I(x4 == 1)TRUE  I(x6 == 1999)TRUE
         4.09e-01           1.24e-01           1.32e-02
I(x6 == 2001)TRUE  I(x6 == 2002)TRUE
        -1.09e-01          -8.31e-02
library(MASS)
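# k = log(n) with n = 36 observations makes this BIC-based selection;
# the output below still labels the criterion "AIC".
stepAIC(fit, k = log(nrow(data)))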
Start: AIC=-115.6
y ~ x1 + x2 + I(x3 == 1) + I(x4 == 1) + I(x6 == 1999) + I(x6 ==
2001) + I(x6 == 2002)
                  Df Sum of Sq   RSS    AIC
- I(x6 == 1999)    1     0.000 0.655 -119.2
- x2               1     0.006 0.660 -118.9
- I(x6 == 2002)    1     0.022 0.676 -118.0
- x1               1     0.036 0.691 -117.3
- I(x6 == 2001)    1     0.054 0.709 -116.3
<none>                         0.654 -115.6
- I(x4 == 1)       1     0.119 0.774 -113.2
- I(x3 == 1)       1     1.350 2.004  -78.9
Step: AIC=-119.2
y ~ x1 + x2 + I(x3 == 1) + I(x4 == 1) + I(x6 == 2001) + I(x6 ==
2002)
                  Df Sum of Sq   RSS    AIC
- x2               1     0.005 0.660 -122.5
- I(x6 == 2002)    1     0.025 0.679 -121.4
- x1               1     0.036 0.691 -120.8
- I(x6 == 2001)    1     0.063 0.717 -119.5
<none>                         0.655 -119.2
- I(x4 == 1)       1     0.123 0.778 -116.6
- I(x3 == 1)       1     1.350 2.005  -82.5
Step: AIC=-122.5
y ~ x1 + I(x3 == 1) + I(x4 == 1) + I(x6 == 2001) + I(x6 == 2002)
                  Df Sum of Sq   RSS    AIC
- I(x6 == 2002)    1     0.020 0.680 -125.0
- x1               1     0.032 0.692 -124.4
- I(x6 == 2001)    1     0.057 0.717 -123.0
<none>                         0.660 -122.5
- I(x4 == 1)       1     0.118 0.778 -120.1
- I(x3 == 1)       1     1.362 2.023  -85.7
Step: AIC=-125
y ~ x1 + I(x3 == 1) + I(x4 == 1) + I(x6 == 2001)
                  Df Sum of Sq   RSS  AIC
- I(x6 == 2001)    1     0.038 0.718 -127
<none>                         0.680 -125
- x1               1     0.087 0.768 -124
- I(x4 == 1)       1     0.119 0.799 -123
- I(x3 == 1)       1     1.361 2.041  -89
Step: AIC=-126.6
y ~ x1 + I(x3 == 1) + I(x4 == 1)
               Df Sum of Sq   RSS    AIC
<none>                      0.718 -126.6
- x1            1     0.113 0.831 -124.9
- I(x4 == 1)    1     0.118 0.836 -124.7
- I(x3 == 1)    1     1.361 2.079  -91.9
Call:
lm(formula = y ~ x1 + I(x3 == 1) + I(x4 == 1), data = data)
Coefficients:
   (Intercept)              x1  I(x3 == 1)TRUE  I(x4 == 1)TRUE
         3.185          -0.353           0.399           0.118
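As a sanity check on the criterion reported in the last step, here is a minimal sketch that recomputes it by hand. It assumes the formula that `extractAIC` uses for `lm` fits, n * log(RSS / n) + k * edf, and plugs in the rounded RSS printed above, so it only reproduces the value to about one decimal place. The variable names are illustrative.

```r
# Values copied from the final stepAIC step above (RSS is rounded to 3 decimals).
n   <- 36      # number of observations, matching k = log(36)
rss <- 0.718   # residual sum of squares of y ~ x1 + I(x3 == 1) + I(x4 == 1)
edf <- 4       # estimated coefficients: intercept + 3 slopes

# Criterion for lm fits: n * log(RSS / n) + k * edf
k   <- log(n)  # k = log(n) replaces the AIC penalty (k = 2) with the BIC penalty
bic <- n * log(rss / n) + k * edf
round(bic, 1)  # -126.6, matching "Step: AIC=-126.6"
```

The same arithmetic with RSS = 0.654 and edf = 8 gives roughly -115.6, the starting value for the full model.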
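#### Part b ####
# The best model selected above is y ~ x1 + x3 + x4, with x3 and x4 entering as indicators (x3 == 1, x4 == 1).
# Yes, this agrees with part (c) of Problem 8.42: the test there supported dropping x2 and x6,
# and the stepwise search removes the same two predictors.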
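The comparison with 8.42 can be illustrated with a partial F test of the full model against the selected model. This is a rough sketch under two assumptions: that the test in part (c) is this general linear test for dropping x2 and the x6 indicators together (the test itself is not shown here), and that the rounded RSS values from the stepAIC output are accurate enough. The variable names are illustrative.

```r
# Values copied from the stepAIC output above (rounded, so F is approximate).
n        <- 36
rss_full <- 0.654; p_full <- 8   # full model: intercept + 7 slopes
rss_red  <- 0.718; p_red  <- 4   # reduced model: x2 and the three x6 indicators dropped

df_num <- p_full - p_red         # 4 coefficients tested jointly
df_den <- n - p_full             # 28 residual degrees of freedom in the full model

# Partial F statistic: extra RSS per dropped term over the full-model MSE
f_stat <- ((rss_red - rss_full) / df_num) / (rss_full / df_den)
p_val  <- pf(f_stat, df_num, df_den, lower.tail = FALSE)
c(F = f_stat, p = p_val)
```

The F statistic comes out below 1, so the dropped terms do not significantly improve the fit, which is consistent with the stepwise result. With the data loaded, calling `anova()` on the reduced and full `lm` fits should give the unrounded version of the same test.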