Analyzing using R
The “diabetes.csv” dataset consists of data related to relative weight and results of different tests to diagonise diabetes of 144 persons.
relwt : Relative weight.
glufast : Fasting Plasma Glucose (FPG).
glutest : Test Plasma Glucose.
sspg : Steady State Plasma Glucose.
instest : Plasma Insulin during Test.
group : Clinical group.
relwt glufast glutest sspg instest group
1 0.81 80 356 124 55 0
2 0.95 97 289 117 76 0
3 0.94 105 319 143 105 0
4 1.04 90 356 199 108 0
5 1.00 90 323 240 143 0
Structure of the dataset.
'data.frame': 144 obs. of 6 variables:
$ relwt : num 0.81 0.95 0.94 1.04 1 0.76 0.91 1.1 0.99 0.78 ...
$ glufast: int 80 97 105 90 90 86 100 85 97 97 ...
$ glutest: int 356 289 319 356 323 381 350 301 379 296 ...
$ sspg : int 124 117 143 199 240 157 221 186 142 131 ...
$ instest: int 55 76 105 108 143 165 119 105 98 94 ...
$ group : Factor w/ 3 levels "0","2","1": 1 1 1 1 1 1 1 1 1 1 ...
Let us quickly see some basic features of the data.
Let us look into the boxplot with IQR.
\(y_{i}=\beta_{0}+\beta_{1}x_{1i}+\beta_{2}x_{2i}+\beta_{3}x_{3i}+\beta_{4}x_{4i}+\beta_{5}z_{5i}+\beta_{6}z_{6i}+\epsilon_{i}\:i=1\left(1\right)n.\)
assuming, \(\epsilon_{i}\)’s are iid \(\mathcal{N}\left(0,\sigma^{2}\right),\sigma^{2}\) is unknown.
• \(\boldsymbol{y}\) stands for Relative Weight.
•\(\boldsymbol{x_{1}}\) is glutest.
• \(\boldsymbol{x_{2}}\) is glufast.
• \(\boldsymbol{x_{3}}\) is sspg.
• \(\boldsymbol{x}_{4}\) is Instest.
• \(\boldsymbol{z_{5}}\) is the indicator whether the person is chem. diabetic.
• \(\boldsymbol{z_{6}}\) is the indicator whether the person is over diabetic.
Fitting the full model.
Call:
lm(formula = relwt ~ ., data = dbts)
Residuals:
Min 1Q Median 3Q Max
-0.214399 -0.065387 0.004717 0.072101 0.293801
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 9.398e-01 3.288e-02 28.585 < 2e-16 ***
glufast 7.723e-04 6.331e-04 1.220 0.22460
glutest -4.704e-04 1.541e-04 -3.053 0.00272 **
sspg -1.156e-04 9.384e-05 -1.232 0.22024
instest 9.793e-04 1.559e-04 6.280 4.19e-09 ***
group2 6.838e-02 5.188e-02 1.318 0.18969
group1 1.004e-01 3.075e-02 3.264 0.00139 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1041 on 137 degrees of freedom
Multiple R-squared: 0.3673, Adjusted R-squared: 0.3396
F-statistic: 13.26 on 6 and 137 DF, p-value: 8.367e-12
VIF for the covariates of Diabetes dataset.1.DFFITS
ggdffits=ggplot(data.frame(y=dffits(lmodel1)),mapping=aes(y=y,x=1:length(y)))+labs(title="Measuring DFFIT",x="Index",y="DFFITS")
plot1=ggdffits+geom_point()+geom_hline(yintercept=2*sqrt(p/n),col="grey",lwd=1)
plot1=ggdffits+geom_point()+geom_hline(yintercept=2*sqrt(p/n),col="grey",lwd=1)+geom_label(aes(label=1:length(y)))
plot12.Leverages
p=7;n=nrow(dbts)
gghat=ggplot(data.frame(y=hatvalues(lmodel1)),
mapping=aes(y=y,x=1:length(y)))+labs(title="Identifying High
Leverage Points",x="Index",y="Hat Matrix Diagonals")
plot2=gghat+geom_point()+geom_hline(yintercept=2*p/n,col="grey",lwd=1)
gghat+geom_point()+geom_hline(yintercept=2*p/n,col="grey",
lwd=1)+geom_label(aes(label=1:length(y))) 3.DFBETAS
lmodel1.dfbetas=data.frame(dfbetas(lmodel1))
db1=ggplot(data.frame(y=lmodel1.dfbetas[,2]),mapping=aes(y=y,x=1:length(y)))+geom_point()+geom_hline(yintercept=2/sqrt(n))+labs(title="glufast",x="Index",y="dfbetas")+geom_label(aes(label=1:length(y)))
db2=ggplot(data.frame(y=lmodel1.dfbetas[,3]),mapping=aes(y=y,x=1:length(y)))+geom_point()+geom_hline(yintercept=2/sqrt(n))+labs(title="glutest",x="Index",y="dfbetas")+geom_label(aes(label=1:length(y)))
db3=ggplot(data.frame(y=lmodel1.dfbetas[,4]),mapping=aes(y=y,x=1:length(y)))+geom_point()+geom_hline(yintercept=2/sqrt(n))+labs(title="sspg",x="Index",y="dfbetas")+geom_label(aes(label=1:length(y)))
db4=ggplot(data.frame(y=lmodel1.dfbetas[,5]),mapping=aes(y=y,x=1:length(y)))+geom_point()+geom_hline(yintercept=2/sqrt(n))+labs(title="instest",x="Index",y="instest")+geom_label(aes(label=1:length(y)))
grid.arrange(db1,db2,db3,db4,ncol=2,nrow=2)Influence measures of
lm(formula = relwt ~ ., data = dbts) :
dfb.1_ dfb.glfs dfb.glts dfb.sspg dfb.inst dfb.grp2 dfb.grp1
1 -5.20e-02 0.042638 -0.046618 1.16e-02 4.02e-02 0.022870 3.09e-02
2 6.78e-03 0.012117 -0.012286 -3.42e-03 -3.60e-03 0.005742 5.21e-03
3 -6.28e-03 -0.020874 0.020419 2.88e-03 1.85e-03 -0.006245 -5.63e-03
4 3.20e-02 -0.012966 0.020287 3.03e-02 -1.95e-02 -0.035553 -6.39e-02
5 3.31e-03 0.008398 -0.009792 1.23e-02 7.93e-03 -0.004187 -1.57e-02
6 -1.23e-01 0.176477 -0.137012 8.55e-02 -1.70e-01 0.183067 2.15e-01
7 -3.07e-04 -0.018486 0.014601 -1.50e-02 4.97e-03 0.002015 9.94e-03
8 7.43e-02 0.045181 -0.055814 1.13e-02 -8.34e-03 0.011194 -3.48e-02
9 3.24e-02 -0.001633 0.008468 -8.84e-03 -1.67e-02 -0.024239 -2.68e-02
10 -1.01e-01 -0.182625 0.190489 5.06e-02 2.33e-02 -0.073692 -5.96e-02
11 3.57e-03 0.003417 0.005189 2.22e-02 -3.12e-02 -0.001268 -1.01e-02
12 -8.30e-02 -0.074439 0.061164 -3.97e-02 1.11e-01 -0.039460 6.52e-03
13 -2.39e-02 0.002729 0.006371 1.65e-02 -1.93e-02 0.003963 1.02e-02
14 -1.57e-02 0.017154 -0.025965 -2.29e-02 2.58e-02 0.024029 3.87e-02
15 -5.17e-02 -0.049389 0.030248 -7.78e-02 1.12e-01 -0.023693 3.09e-02
16 3.40e-02 -0.102568 0.109796 2.66e-02 -2.09e-02 -0.074203 -9.96e-02
17 1.04e-01 -0.039927 0.063371 4.98e-03 -1.04e-01 -0.056028 -8.09e-02
18 -2.05e-03 -0.006321 0.002007 -6.96e-03 1.53e-02 -0.001726 5.73e-05
19 -6.80e-02 -0.031348 0.033897 2.58e-02 3.97e-02 -0.021999 -1.19e-02
20 3.20e-02 0.003758 0.002354 -5.69e-03 -3.70e-02 -0.000493 -3.57e-03
21 4.98e-02 -0.029179 0.047865 4.37e-03 -7.96e-02 -0.025593 -3.15e-02
22 -1.08e-01 -0.079131 0.068891 1.25e-02 1.16e-01 -0.049670 -2.71e-02
23 8.85e-03 0.010295 -0.006532 7.66e-03 -1.28e-02 -0.001685 -7.52e-03
24 5.33e-03 0.003372 -0.004120 1.31e-03 1.95e-03 -0.001739 -5.61e-03
25 3.44e-02 0.036752 -0.092758 -1.82e-01 1.49e-01 0.057519 1.19e-01
26 3.73e-02 -0.119127 0.083641 -1.91e-02 1.45e-01 -0.104635 -1.41e-01
27 9.99e-02 0.022085 -0.049678 -9.18e-02 1.45e-01 -0.090803 -1.02e-01
28 4.51e-02 0.072172 -0.076786 -1.40e-02 4.99e-02 -0.027921 -4.64e-02
29 1.26e-02 -0.001104 -0.003228 -4.13e-03 1.29e-02 -0.006896 -1.33e-02
30 -8.48e-03 -0.133842 0.113866 -8.19e-02 6.10e-02 -0.035819 8.24e-03
31 9.81e-02 -0.143801 0.133083 -3.46e-02 -7.63e-03 -0.079068 -1.02e-01
32 2.47e-02 0.035242 -0.034465 -1.02e-02 -4.59e-03 0.004027 6.17e-04
33 -8.85e-02 0.113036 -0.136507 -4.10e-02 1.45e-01 0.046079 8.37e-02
34 6.56e-03 0.028709 -0.020592 -1.15e-01 -1.02e-01 0.105145 1.95e-01
35 1.74e-03 -0.002230 0.002691 7.10e-04 -1.94e-03 -0.001766 -2.53e-03
36 -9.80e-03 -0.003628 0.000301 1.96e-03 1.05e-02 0.004188 4.25e-03
37 5.08e-02 0.131764 -0.038468 1.41e-01 -3.05e-01 0.008002 -2.96e-02
38 -4.19e-02 0.089794 -0.060260 -2.62e-02 -1.11e-01 0.088574 1.51e-01
39 2.43e-02 0.022587 -0.006956 -5.51e-03 -3.86e-02 -0.014715 -9.16e-03
40 -5.97e-03 -0.053033 0.063415 -8.50e-04 -4.05e-02 -0.012196 1.53e-03
41 1.56e-02 0.017121 -0.011100 7.95e-04 -3.79e-03 -0.017180 -2.09e-02
42 1.04e-02 0.014488 -0.016039 -5.41e-02 -2.43e-02 0.038665 7.23e-02
43 3.76e-02 0.043588 -0.045977 -2.40e-02 -1.49e-02 0.021917 2.14e-02
44 -9.93e-02 -0.055307 0.054623 1.89e-02 9.70e-02 -0.056840 -3.53e-02
45 -1.42e-01 -0.090396 0.082173 8.32e-02 1.12e-01 -0.049638 -5.89e-02
46 1.30e-02 0.000489 -0.000096 -5.29e-03 -8.80e-03 0.000322 -6.66e-04
47 -3.36e-03 0.000930 -0.000192 2.37e-03 -1.48e-03 0.001252 1.65e-03
48 2.80e-02 0.010663 -0.006710 -1.28e-02 -2.06e-02 -0.002713 -2.07e-03
49 3.48e-02 0.031113 -0.040843 -1.53e-02 1.59e-02 0.009082 -3.20e-03
50 9.56e-03 0.017879 -0.025403 -5.74e-02 8.82e-03 0.027631 5.72e-02
51 -6.51e-02 -0.089469 0.174513 8.38e-02 -3.52e-01 0.065264 1.20e-01
52 3.56e-02 0.136412 -0.145868 5.74e-02 3.28e-02 0.023845 -4.12e-02
53 -9.95e-02 -0.002409 0.033115 6.42e-02 -7.64e-02 0.029381 5.39e-02
54 -7.39e-02 0.026234 -0.023625 3.91e-02 1.92e-02 0.024304 2.91e-02
55 -3.98e-02 -0.044101 0.045285 1.36e-02 1.66e-03 -0.002241 6.95e-03
56 4.67e-02 0.004893 -0.008003 -7.68e-03 -1.09e-02 -0.005848 -2.17e-02
57 -1.85e-02 -0.004834 0.004844 1.14e-02 1.03e-02 -0.002687 -3.36e-03
58 -1.21e-01 0.053157 -0.059456 5.86e-02 8.56e-02 0.025124 2.52e-02
59 -2.21e-03 -0.000556 0.000400 2.40e-03 2.50e-03 -0.001250 -5.07e-03
60 6.91e-02 -0.034821 0.014077 -8.82e-02 1.23e-01 -0.097565 -9.07e-02
61 5.36e-02 0.013927 0.010308 -5.17e-02 -4.64e-03 -0.087063 -6.09e-02
62 -6.73e-02 0.010546 0.029981 1.01e-01 -9.24e-02 -0.007017 -1.05e-01
63 5.59e-02 0.091741 -0.112180 -8.38e-02 1.54e-02 0.069302 1.89e-01
64 8.15e-02 -0.051277 0.027710 -5.51e-02 5.51e-02 -0.044876 -6.10e-02
65 7.36e-02 -0.044816 0.005990 -1.21e-01 1.01e-01 -0.021827 7.36e-02
66 -1.64e-03 -0.000381 0.000782 2.03e-03 4.93e-05 -0.000800 -2.89e-03
67 1.19e-01 -0.163453 0.149857 -1.10e-01 1.32e-01 -0.212112 -2.07e-01
68 -2.71e-02 0.028765 -0.008405 4.52e-02 -9.52e-02 0.050734 5.21e-02
69 6.83e-02 0.172693 -0.140032 -1.36e-01 -2.47e-01 0.198371 3.02e-01
70 -3.60e-02 0.075464 -0.083699 1.66e-03 2.76e-02 0.054453 6.06e-02
71 1.55e-03 -0.016947 0.029847 -1.27e-03 -3.72e-02 -0.011714 -3.80e-02
72 6.45e-03 -0.001637 -0.001618 -3.23e-03 3.14e-02 -0.024534 -3.00e-02
73 5.88e-02 -0.067583 0.071135 -2.69e-02 -2.09e-02 -0.052668 -5.50e-02
74 1.01e-02 -0.008349 0.013448 1.22e-02 -1.04e-02 -0.017974 -2.66e-02
75 -1.68e-02 0.003138 -0.010386 1.31e-02 7.86e-02 -0.044297 -5.76e-02
76 -8.89e-02 0.078494 -0.016673 1.76e-01 -8.25e-02 -0.052667 -9.19e-02
77 1.03e-02 0.010703 -0.013638 -1.16e-02 -1.82e-03 0.011228 3.07e-02
78 -1.33e-02 0.004943 -0.000867 9.89e-03 -1.89e-02 0.014469 1.76e-02
79 -1.26e-02 0.058195 -0.089647 -8.86e-02 6.32e-02 0.086207 1.27e-01
80 -9.27e-03 0.000538 -0.009033 -1.90e-02 3.25e-02 0.004444 1.34e-02
81 -2.68e-02 0.008100 -0.008082 2.59e-02 -2.96e-03 0.018263 1.24e-02
82 -1.20e-01 -0.042921 0.050314 1.98e-01 9.94e-02 -0.100372 -1.88e-01
83 1.11e-01 -0.008567 -0.034810 -1.48e-01 6.20e-02 0.027238 1.45e-01
84 1.26e-01 -0.002219 -0.067750 -1.76e-01 2.30e-01 -0.054679 -5.15e-02
85 2.59e-02 0.027899 -0.011805 6.26e-04 -1.03e-01 0.046299 1.32e-01
86 -1.98e-01 0.032576 0.054045 3.36e-01 -2.09e-01 0.011653 4.70e-03
87 1.05e-02 -0.010502 0.011831 -4.08e-03 -1.75e-02 0.002331 -3.13e-02
88 -2.00e-02 -0.015250 0.014511 6.05e-02 -1.27e-02 0.008456 -9.70e-02
89 -1.22e-01 0.042380 -0.023640 1.81e-01 -4.91e-03 0.008191 1.60e-02
90 -1.86e-02 0.063780 -0.061556 6.83e-03 7.81e-03 0.024436 1.25e-01
91 5.40e-03 -0.037779 0.026496 -6.79e-02 9.42e-02 -0.063293 5.72e-02
92 -3.99e-02 0.001219 0.009654 5.78e-02 -1.20e-02 -0.007090 1.07e-02
93 2.95e-01 0.023940 -0.091991 -4.35e-01 5.25e-02 0.066332 4.35e-02
94 -2.93e-03 -0.018076 0.025680 9.54e-03 -2.25e-02 -0.009568 -6.42e-02
95 7.49e-05 -0.029028 0.027950 -8.03e-03 1.67e-02 -0.024135 -3.86e-04
96 7.17e-02 0.113420 -0.079662 -7.06e-02 -1.95e-01 0.103884 2.61e-01
97 -2.82e-03 0.023481 -0.039435 -1.86e-02 6.55e-02 0.001339 6.62e-02
98 4.21e-03 -0.011624 0.016469 -5.88e-03 -1.46e-02 -0.007327 -4.01e-02
99 -1.43e-01 -0.030629 0.075984 2.47e-01 -9.12e-02 -0.021852 1.04e-04
100 1.38e-02 -0.009639 0.010095 -1.15e-02 -1.44e-02 0.001758 -1.04e-02
101 5.97e-02 -0.025588 0.040647 -3.48e-03 -1.47e-01 0.047730 -1.06e-01
102 -4.58e-02 -0.008066 0.014657 8.23e-02 -8.70e-03 -0.002513 3.33e-02
103 -2.28e-02 -0.017357 0.029155 4.72e-02 -3.29e-02 -0.006780 -5.82e-02
104 -2.54e-03 -0.022873 0.018497 1.39e-02 1.02e-02 -0.007791 3.57e-03
105 -1.19e-01 -0.019477 0.044766 1.53e-01 7.44e-03 -0.047941 -1.86e-01
106 -1.12e-03 -0.044215 0.039030 -2.80e-03 2.98e-02 -0.031303 1.77e-03
107 -8.44e-02 0.157794 -0.161086 1.05e-01 3.68e-02 0.077429 -5.73e-02
108 -1.56e-02 0.007761 -0.004134 1.26e-03 1.51e-02 -0.012521 -1.15e-01
109 6.34e-02 -0.028205 0.028171 -1.36e-01 1.30e-02 -0.038295 1.39e-01
110 -6.24e-02 -0.027646 0.040196 5.37e-02 3.43e-02 -0.054681 -1.51e-01
111 1.23e-03 -0.004285 0.006589 -1.61e-03 -6.88e-03 -0.002755 4.03e-03
112 -1.24e-01 0.046210 -0.059293 1.69e-01 1.05e-01 0.004104 -2.26e-01
113 1.05e-01 0.068463 -0.123684 3.19e-02 -7.01e-02 0.145395 1.00e-01
114 3.29e-04 0.000156 -0.000427 -5.86e-05 2.34e-04 0.000333 1.90e-04
115 -1.89e-01 0.186474 -0.051713 -7.57e-02 -2.94e-02 -0.273881 4.69e-02
116 9.48e-02 0.153671 -0.222867 -7.07e-03 8.15e-03 0.163507 1.34e-01
117 -9.72e-03 0.185518 -0.171100 7.29e-02 -1.32e-01 0.053901 1.30e-01
118 3.70e-03 -0.005216 0.002270 -4.26e-03 7.10e-03 0.002225 -2.68e-03
119 1.27e-01 -0.134344 0.082789 -5.15e-02 -3.34e-03 0.089059 -1.75e-02
120 2.17e-02 -0.013505 -0.001093 -2.88e-03 9.66e-03 0.006596 -5.92e-04
121 1.37e-01 0.014114 -0.105371 -7.88e-02 9.28e-02 0.218865 5.67e-02
122 2.04e-04 0.000240 -0.000745 -6.66e-05 1.16e-03 0.001776 -3.92e-05
123 1.61e-02 0.025683 -0.061744 3.46e-03 7.09e-02 0.106831 4.88e-03
124 3.83e-02 0.029568 -0.052851 -8.68e-03 1.30e-03 0.076637 3.56e-02
125 -7.53e-02 0.114470 -0.065477 1.97e-02 -5.17e-02 -0.051420 4.61e-02
126 3.48e-03 0.002815 -0.006086 -1.03e-03 3.82e-03 0.003407 2.44e-03
127 5.37e-02 -0.084826 0.051711 -3.41e-02 5.05e-02 -0.048518 -3.54e-02
128 -3.90e-02 0.060777 -0.036129 3.44e-03 -1.89e-02 -0.031114 2.38e-02
129 -1.91e-02 -0.108854 0.095040 -6.96e-02 1.64e-01 -0.080768 -1.05e-01
130 1.09e-01 -0.144548 0.077247 -6.20e-02 9.01e-02 0.048241 -5.31e-02
131 -2.99e-03 -0.062282 0.103350 -9.92e-02 -1.53e-02 -0.146861 -2.19e-02
132 -2.22e-02 -0.016955 0.021007 2.92e-02 5.80e-03 -0.053145 -2.54e-02
133 -1.45e-02 0.004174 0.007323 3.80e-03 -1.25e-02 -0.008022 -1.31e-03
134 -3.23e-01 -0.189288 0.329725 -8.83e-06 2.13e-01 -0.627542 -3.09e-01
135 6.00e-02 -0.003970 -0.017524 -2.03e-02 -2.63e-02 0.073870 3.31e-02
136 2.42e-02 -0.004997 -0.013783 4.27e-02 -2.56e-02 0.075623 8.08e-03
137 2.29e-01 -0.131530 0.023173 2.07e-01 -2.55e-01 0.494459 5.50e-02
138 2.88e-03 -0.003699 0.001591 -2.44e-03 4.08e-03 0.002740 -1.64e-03
139 2.61e-02 -0.042963 0.031683 6.39e-03 -1.06e-02 -0.001257 -1.28e-02
140 -1.50e-02 -0.003108 0.009351 -2.35e-03 1.16e-02 -0.011159 -1.10e-02
141 6.43e-02 -0.080092 0.053322 4.34e-03 -2.44e-02 0.010315 -1.48e-02
142 4.60e-02 -0.021786 0.030931 1.83e-02 -1.25e-01 0.088746 3.50e-02
143 3.24e-03 0.001235 0.000909 -1.10e-03 -1.18e-02 0.008732 5.19e-03
144 -4.18e-02 0.075091 -0.060682 8.54e-03 9.72e-03 0.017035 2.35e-02
dffit cov.r cook.d hat inf
1 -0.098757 1.059 1.40e-03 0.0250
2 0.018281 1.087 4.81e-05 0.0318
3 -0.029052 1.083 1.21e-04 0.0297
4 0.134091 1.002 2.56e-03 0.0142
5 0.039666 1.067 2.26e-04 0.0180
6 -0.335546 0.869 1.57e-02 0.0256
7 -0.043730 1.066 2.75e-04 0.0181
8 0.184905 0.952 4.84e-03 0.0149
9 0.086661 1.040 1.08e-03 0.0145
10 -0.283143 0.936 1.13e-02 0.0273
11 0.047288 1.076 3.22e-04 0.0263
12 -0.240915 0.922 8.17e-03 0.0190
13 -0.038400 1.072 2.12e-04 0.0213
14 -0.079521 1.051 9.08e-04 0.0169
15 -0.223583 0.945 7.06e-03 0.0196
16 0.150406 1.047 3.24e-03 0.0307
17 0.262700 0.873 9.65e-03 0.0168
18 -0.021101 1.085 6.41e-05 0.0307
19 -0.114976 1.041 1.89e-03 0.0207
20 0.068593 1.063 6.76e-04 0.0211
21 0.136460 1.039 2.67e-03 0.0247
22 -0.234630 0.954 7.79e-03 0.0226
23 0.039251 1.064 2.22e-04 0.0157
24 0.017768 1.067 4.54e-05 0.0143
25 -0.275044 0.976 1.07e-02 0.0340
26 0.204538 1.085 5.99e-03 0.0611
27 0.294532 0.850 1.21e-02 0.0185
28 0.204652 0.944 5.92e-03 0.0167
29 0.031356 1.067 1.41e-04 0.0161
30 -0.234283 0.958 7.77e-03 0.0232
31 0.190428 1.035 5.18e-03 0.0346
32 0.073051 1.056 7.66e-04 0.0182
33 -0.249965 0.999 8.88e-03 0.0350
34 -0.311297 0.859 1.35e-02 0.0214
35 0.005323 1.075 4.08e-06 0.0208
36 -0.027923 1.068 1.12e-04 0.0167
37 0.468356 0.702 2.97e-02 0.0248 *
38 -0.217853 0.988 6.74e-03 0.0263
39 0.094590 1.047 1.28e-03 0.0186
40 -0.088920 1.093 1.14e-03 0.0454
41 0.073844 1.047 7.83e-04 0.0142
42 -0.110826 1.050 1.76e-03 0.0233
43 0.077798 1.074 8.70e-04 0.0296
44 -0.180850 1.022 4.67e-03 0.0280
45 -0.261962 0.964 9.71e-03 0.0290
46 0.021557 1.073 6.69e-05 0.0202
47 -0.005276 1.071 4.01e-06 0.0173
48 0.059243 1.062 5.04e-04 0.0182
49 0.078095 1.056 8.76e-04 0.0191
50 -0.091753 1.061 1.21e-03 0.0249
51 -0.448683 0.904 2.82e-02 0.0483
52 0.281088 0.887 1.11e-02 0.0204
53 -0.186409 0.968 4.93e-03 0.0173
54 -0.117464 1.030 1.97e-03 0.0176
55 -0.108758 1.032 1.69e-03 0.0164
56 0.091654 1.035 1.20e-03 0.0139
57 -0.029222 1.077 1.23e-04 0.0245
58 -0.196723 1.008 5.51e-03 0.0273
59 -0.006631 1.106 6.33e-06 0.0486
60 0.203300 0.991 5.88e-03 0.0244
61 0.206171 0.940 6.00e-03 0.0165
62 -0.202045 1.051 5.84e-03 0.0435
63 0.226565 1.040 7.33e-03 0.0439
64 0.135059 1.027 2.61e-03 0.0202
65 0.194112 1.080 5.40e-03 0.0567
66 -0.003741 1.102 2.01e-06 0.0452
67 0.344286 0.835 1.64e-02 0.0229 *
68 -0.111644 1.104 1.79e-03 0.0566
69 -0.410923 1.049 2.40e-02 0.0853
70 -0.111682 1.071 1.79e-03 0.0347
71 -0.082964 1.091 9.89e-04 0.0430
72 0.056030 1.066 4.51e-04 0.0205
73 0.121749 1.047 2.12e-03 0.0247
74 0.052436 1.059 3.95e-04 0.0150
75 0.108171 1.092 1.68e-03 0.0476
76 0.263293 0.965 9.81e-03 0.0294
77 0.036650 1.092 1.93e-04 0.0378
78 -0.036889 1.068 1.96e-04 0.0182
79 -0.222343 0.944 6.98e-03 0.0192
80 -0.053253 1.070 4.08e-04 0.0221
81 -0.049737 1.067 3.56e-04 0.0200
82 0.292727 1.126 1.23e-02 0.1010
83 0.229103 1.054 7.50e-03 0.0504
84 0.287819 1.050 1.18e-02 0.0608
85 0.178833 1.066 4.58e-03 0.0458
86 0.370002 1.423 1.96e-02 0.2738 *
87 -0.065519 1.080 6.17e-04 0.0322
88 -0.147504 1.058 3.12e-03 0.0352
89 0.228672 1.144 7.50e-03 0.1016
90 0.184267 1.031 4.85e-03 0.0319
91 0.196842 1.048 5.54e-03 0.0410
92 0.082332 1.110 9.75e-04 0.0570
93 -0.523366 1.062 3.88e-02 0.1124
94 -0.104739 1.067 1.57e-03 0.0308
95 0.049475 1.110 3.52e-04 0.0535
96 0.321572 1.087 1.48e-02 0.0853
97 0.139300 1.068 2.78e-03 0.0386
98 -0.073807 1.077 7.83e-04 0.0313
99 0.302320 1.110 1.31e-02 0.0939
100 -0.037724 1.107 2.05e-04 0.0506
101 -0.318349 0.961 1.43e-02 0.0383
102 0.147387 1.072 3.11e-03 0.0425
103 -0.089421 1.094 1.15e-03 0.0458
104 0.051455 1.099 3.81e-04 0.0447
105 -0.246339 1.054 8.67e-03 0.0540
106 0.082443 1.096 9.77e-04 0.0467
107 -0.267046 1.085 1.02e-02 0.0738
108 -0.194358 1.013 5.38e-03 0.0282
109 0.250828 1.025 8.96e-03 0.0433
110 -0.186071 1.043 4.95e-03 0.0368
111 0.012704 1.146 2.32e-05 0.0813
112 -0.367214 0.981 1.91e-02 0.0525
113 -0.256381 1.152 9.42e-03 0.1104
114 -0.000776 1.196 8.67e-08 0.1201 *
115 -0.487851 0.993 3.36e-02 0.0792
116 -0.290419 1.186 1.21e-02 0.1368 *
117 -0.371635 0.972 1.95e-02 0.0507
118 0.014011 1.109 2.82e-05 0.0507
119 0.238913 1.080 8.17e-03 0.0652
120 -0.042591 1.143 2.61e-04 0.0800
121 0.335410 0.991 1.59e-02 0.0492
122 0.004131 1.092 2.46e-06 0.0359
123 0.210910 1.043 6.35e-03 0.0418
124 0.083794 1.171 1.01e-03 0.1036 *
125 -0.192263 1.099 5.30e-03 0.0673
126 -0.010417 1.161 1.56e-05 0.0936 *
127 -0.142793 1.105 2.93e-03 0.0621
128 -0.102747 1.118 1.52e-03 0.0660
129 0.260756 1.099 9.73e-03 0.0801
130 0.232762 1.113 7.76e-03 0.0826
131 -0.204646 1.274 6.02e-03 0.1817 *
132 -0.108854 1.074 1.70e-03 0.0360
133 0.028442 1.225 1.16e-04 0.1406 *
134 -0.657575 1.041 6.09e-02 0.1290
135 0.091005 1.156 1.19e-03 0.0936 *
136 0.100438 1.169 1.45e-03 0.1037 *
137 0.709716 0.834 6.93e-02 0.0753 *
138 0.010717 1.100 1.65e-05 0.0432
139 -0.059546 1.267 5.10e-04 0.1701 *
140 0.039086 1.112 2.20e-04 0.0548
141 -0.120572 1.280 2.09e-03 0.1803 *
142 0.176127 1.126 4.45e-03 0.0818
143 0.019659 1.114 5.56e-05 0.0558
144 0.088997 1.326 1.14e-03 0.2076 *
| Measure | Labels | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| Leverages | 86,144,131,141,139,133,116 etc. | ||||||||||
| DFFIT | 137,37 | ||||||||||
| DFBETA |
|
From all the and the table above, we the index of the observations which may affect the regression the most are 37,86,134,137,144. Hence, we fit a different model discarding these.
Call:
lm(formula = relwt ~ ., data = dbts[-c(37, 86, 134, 137, 144),
])
Residuals:
Min 1Q Median 3Q Max
-0.21114 -0.06052 0.01099 0.07530 0.22765
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.9499692 0.0344679 27.561 < 2e-16 ***
glufast 0.0008085 0.0006640 1.218 0.225552
glutest -0.0005159 0.0001589 -3.247 0.001479 **
sspg -0.0001918 0.0001033 -1.858 0.065430 .
instest 0.0010806 0.0001609 6.716 5.06e-10 ***
group2 0.0694496 0.0542013 1.281 0.202325
group1 0.1069546 0.0300919 3.554 0.000527 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.09928 on 132 degrees of freedom
Multiple R-squared: 0.414, Adjusted R-squared: 0.3873
F-statistic: 15.54 on 6 and 132 DF, p-value: 1.961e-13
The Principal Components Regression approach involves constructing the first M principal components,and then using these components as the predictors in a linear regression model that is fit using least squares. We fit a PCR model storing into the object ‘model.pc’ .
Data: X dimension: 144 6
Y dimension: 144 1
Fit method: svdpc
Number of components considered: 6
VALIDATION: RMSEP
Cross-validated using 10 random segments.
(Intercept) 1 comps 2 comps 3 comps 4 comps 5 comps 6 comps
CV 0.1286 0.1289 0.1167 0.1179 0.1081 0.1077 0.1057
adjCV 0.1286 0.1288 0.1166 0.1178 0.1058 0.1075 0.1055
TRAINING: % variance explained
1 comps 2 comps 3 comps 4 comps 5 comps 6 comps
X 59.7904 84.75 93.51 96.74 99.66 100.00
relwt 0.6653 19.22 19.40 33.75 33.99 36.73
Start: AIC=-644.64
relwt ~ glufast + glutest + sspg + instest + group
Call:
lm(formula = relwt ~ glufast + glutest + sspg + instest + group,
data = dbts)
Coefficients:
(Intercept) glufast glutest sspg instest group2
0.9397650 0.0007723 -0.0004704 -0.0001156 0.0009793 0.0683806
group1
0.1003792
Start: AIC=-644.64
relwt ~ glufast + glutest + sspg + instest + group
Df Sum of Sq RSS AIC
- glufast 1 0.01614 1.5019 -645.08
- sspg 1 0.01645 1.5022 -645.05
<none> 1.4858 -644.64
- group 2 0.12115 1.6069 -637.35
- glutest 1 0.10111 1.5869 -637.16
- instest 1 0.42770 1.9135 -610.21
Step: AIC=-645.08
relwt ~ glutest + sspg + instest + group
Df Sum of Sq RSS AIC
- sspg 1 0.02006 1.5220 -645.17
<none> 1.5019 -645.08
- group 2 0.10674 1.6087 -639.19
- glutest 1 0.19932 1.7012 -629.14
- instest 1 0.45926 1.9612 -608.66
Step: AIC=-645.17
relwt ~ glutest + instest + group
Df Sum of Sq RSS AIC
<none> 1.5220 -645.17
- group 2 0.08698 1.6090 -641.17
- glutest 1 0.17974 1.7017 -631.10
- instest 1 0.44898 1.9710 -609.95
Call:
lm(formula = relwt ~ glutest + instest + group, data = dbts)
Coefficients:
(Intercept) glutest instest group2 group1
0.9249126 -0.0002686 0.0009326 0.0432644 0.0686934
Start: AIC=-644.64
relwt ~ glufast + glutest + sspg + instest + group
Df Sum of Sq RSS AIC
- glufast 1 0.01614 1.5019 -645.08
- sspg 1 0.01645 1.5022 -645.05
<none> 1.4858 -644.64
- group 2 0.12115 1.6069 -637.35
- glutest 1 0.10111 1.5869 -637.16
- instest 1 0.42770 1.9135 -610.21
Step: AIC=-645.08
relwt ~ glutest + sspg + instest + group
Df Sum of Sq RSS AIC
- sspg 1 0.02006 1.5220 -645.17
<none> 1.5019 -645.08
+ glufast 1 0.01614 1.4858 -644.64
- group 2 0.10674 1.6087 -639.19
- glutest 1 0.19932 1.7012 -629.14
- instest 1 0.45926 1.9612 -608.66
Step: AIC=-645.17
relwt ~ glutest + instest + group
Df Sum of Sq RSS AIC
<none> 1.5220 -645.17
+ sspg 1 0.02006 1.5019 -645.08
+ glufast 1 0.01975 1.5022 -645.05
- group 2 0.08698 1.6090 -641.17
- glutest 1 0.17974 1.7017 -631.10
- instest 1 0.44898 1.9710 -609.95
Call:
lm(formula = relwt ~ glutest + instest + group, data = dbts)
Coefficients:
(Intercept) glutest instest group2 group1
0.9249126 -0.0002686 0.0009326 0.0432644 0.0686934
| Method | Selected Model |
|---|---|
| Forward | Full Model |
| Backward | glutest+instest+group |
| Stepwise | glutest+instest+group |
par(mfrow=c(2,2))
reg.sub<-regsubsets(relwt~.,data=dbts)
reg.sum<-summary(reg.sub)
# rss
plot(reg.sum$rss,type='b',ylab='RSS',xlab='Number of Variables');box(lwd=2)
# adj R2
plot(reg.sum$adjr2,type='b',ylab='Adjusted Rsq',xlab='Number of Variables');box(lwd=2)
max<-which.max(reg.sum$adjr2)
points(max,reg.sum$adjr2[max],col='red',cex=2,pch=16)
# Cp
plot(reg.sum$cp,type='b',ylab='Cp',xlab='Number of Variables');box(lwd=2)
min<-which.min(reg.sum$cp)
points(min,reg.sum$cp[min],col='red',cex=2,pch=16)
# bic
plot(reg.sum$bic,type='b',ylab='BIC',xlab='Number of Variables');box(lwd=2)
min<-which.min(reg.sum$bic)
points(min,reg.sum$bic[min],col='red',cex=2,pch=16)\(y_{i}=\beta_{0}+\beta_{1}x_{1i}+\beta_{4}x_{4i}+\beta_{5}z_{5i}+\beta_{6}z_{6i}+\epsilon_{i}\:i=1\left(1\right)n.\)
assuming, \(\epsilon_{i}\)’s are iid \(\mathcal{N}\left(0,\sigma^{2}\right),\sigma^{2}\) is unknown.
Call:
lm(formula = relwt ~ glutest + instest + group, data = dbts)
Residuals:
Min 1Q Median 3Q Max
-0.223985 -0.073355 0.005761 0.067120 0.293289
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 9.249e-01 2.537e-02 36.454 < 2e-16 ***
glutest -2.686e-04 6.628e-05 -4.052 8.42e-05 ***
instest 9.326e-04 1.456e-04 6.403 2.18e-09 ***
group2 4.326e-02 4.803e-02 0.901 0.36928
group1 6.869e-02 2.496e-02 2.752 0.00671 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1046 on 139 degrees of freedom
Multiple R-squared: 0.3519, Adjusted R-squared: 0.3332
F-statistic: 18.87 on 4 and 139 DF, p-value: 2.067e-12
VIF for the covariates of Diabetes dataset in our final model.
Influential observation : 37,137.
Call:
lm(formula = relwt ~ glutest + instest + group, data = dbts[c(-37,
-137), ])
Residuals:
Min 1Q Median 3Q Max
-0.223346 -0.070933 0.008438 0.070699 0.226741
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 9.102e-01 2.472e-02 36.816 < 2e-16 ***
glutest -2.589e-04 6.408e-05 -4.041 8.84e-05 ***
instest 9.971e-04 1.407e-04 7.088 6.54e-11 ***
group2 1.973e-02 4.733e-02 0.417 0.6774
group1 6.514e-02 2.404e-02 2.710 0.0076 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1003 on 137 degrees of freedom
Multiple R-squared: 0.3916, Adjusted R-squared: 0.3739
F-statistic: 22.05 on 4 and 137 DF, p-value: 4.578e-14
\(\boldsymbol{\hat{\beta}_{\lambda}^{ridge}}=\left(X'X+\lambda I_{p}\right)^{-1}X'\boldsymbol{y}\)
Length Class Mode
a0 100 -none- numeric
beta 300 dgCMatrix S4
df 100 -none- numeric
dim 2 -none- numeric
lambda 100 -none- numeric
dev.ratio 100 -none- numeric
nulldev 1 -none- numeric
npasses 1 -none- numeric
jerr 1 -none- numeric
offset 1 -none- logical
call 5 -none- call
nobs 1 -none- numeric
Length Class Mode
a0 100 -none- numeric
beta 300 dgCMatrix S4
df 100 -none- numeric
dim 2 -none- numeric
lambda 100 -none- numeric
dev.ratio 100 -none- numeric
nulldev 1 -none- numeric
npasses 1 -none- numeric
jerr 1 -none- numeric
offset 1 -none- logical
call 5 -none- call
nobs 1 -none- numeric
Call:
lm(formula = res.bc ~ glutest + instest + group, data = dbts)
Residuals:
Min 1Q Median 3Q Max
-0.217303 -0.072383 0.004163 0.064085 0.293849
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0713167 0.0251096 -2.840 0.00519 **
glutest -0.0002673 0.0000656 -4.075 7.71e-05 ***
instest 0.0009245 0.0001441 6.415 2.06e-09 ***
group2 0.0427601 0.0475343 0.900 0.36991
group1 0.0679716 0.0247008 2.752 0.00672 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1036 on 139 degrees of freedom
Multiple R-squared: 0.3522, Adjusted R-squared: 0.3336
F-statistic: 18.9 on 4 and 139 DF, p-value: 1.995e-12
Generalized least squares fit by REML
Model: relwt ~ .
Data: dbts
AIC BIC logLik
-149.649 -123.3692 83.82452
Correlation Structure: AR(1)
Formula: ~1
Parameter estimate(s):
Phi
0.05761797
Coefficients:
Value Std.Error t-value p-value
(Intercept) 0.9367570 0.03322435 28.194896 0.0000
glufast 0.0006766 0.00063339 1.068208 0.2873
glutest -0.0004289 0.00015397 -2.785816 0.0061
sspg -0.0001052 0.00009390 -1.120579 0.2644
instest 0.0009436 0.00015707 6.007647 0.0000
group2 0.0597484 0.05270192 1.133704 0.2589
group1 0.0955520 0.03117120 3.065393 0.0026
Correlation:
(Intr) glufst glutst sspg instst group2
glufast -0.268
glutest -0.068 -0.888
sspg -0.597 0.105 0.071
instest 0.028 -0.147 -0.087 -0.352
group2 0.355 0.385 -0.608 0.037 -0.268
group1 0.130 0.494 -0.539 -0.241 -0.265 0.570
Standardized residuals:
Min Q1 Med Q3 Max
-2.0539249 -0.6534057 0.0257434 0.7033226 2.7915264
Residual standard error: 0.1043158
Degrees of freedom: 144 total; 137 residual
Here we will again fit the final linear model but this time using Huber’s Loss functon
Call: rlm(formula = relwt ~ ., data = dbts)
Residuals:
Min 1Q Median 3Q Max
-0.213342 -0.064421 0.006464 0.075124 0.299906
Coefficients:
Value Std. Error t value
(Intercept) 0.9364 0.0361 25.9688
glufast 0.0008 0.0007 1.0905
glutest -0.0005 0.0002 -2.8161
sspg -0.0001 0.0001 -1.2300
instest 0.0010 0.0002 6.0267
group2 0.0634 0.0569 1.1146
group1 0.1014 0.0337 3.0076
Residual standard error: 0.1059 on 137 degrees of freedom
Now we will measure the accuracy
| Different models | MSE |
|---|---|
| BoxCox | 0.0105 |
| Ridge | 0.0147 |
| PCR | 0.012 |
| LASSO | 0.0122 |
| M-estimation | 0.0103 |
| Final Linear Model | 0.011 |
Linear Regression Analysis - George AF Seber, Alan J. Lee. - Wiley.
Introduction to Linear Regression Analysis - Douglas C. Montgomery - Wiley.
R Documentation