SLR on Physicians
SLR on Literacy Rate
Plot, significance of each predictor, confidence and prediction intervals, partial F-test, full F-test, t-test, correlation, residuals
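# plot() is called for its side effect; assigning and printing its result only yields NULL
plot(LIFEEXP ~ LITERATE, xlab = "Literacy Rate", ylab = "Life Expectancy", main = "Expected Life Based on Literacy Rate", data = Life)
# abline(a, b) draws the line with intercept a and slope b
abline(35.9664, 0.3802)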
Prediction
By hand, for an illiteracy rate of 14.2% (literacy rate 85.8%):
73-.3802*14.2
[1] 67.60116
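The hand calculation above can be cross-checked directly from the intercept and slope passed to abline(). This is a minimal sketch that assumes those two numbers are the fitted literacy coefficients; the variable names are illustrative only. At 100% literacy the line gives 35.9664 + 0.3802*100 = 73.9864, so using 73 as the starting value shifts the result down by about one year.

```r
# Coefficients of the line drawn with abline() above (assumed to be the fitted values)
b0 <- 35.9664   # intercept
b1 <- 0.3802    # slope per percentage point of literacy

illiteracy <- 14.2                # percent
literacy   <- 100 - illiteracy    # 85.8 percent

# Fitted value on the literacy scale
pred <- b0 + b1 * literacy

# Same value written in terms of illiteracy: value at 100% literacy minus slope * illiteracy
pred_alt <- (b0 + b1 * 100) - b1 * illiteracy

c(pred = pred, pred_alt = pred_alt)   # both about 68.59
```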
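# predict() looks up new data by the predictor name used in the model formula,
# so the column must be LITERATE (100 - illiteracy), not ILLITERATE.
# litmod is a hypothetical name for the literacy fit; it is not defined elsewhere in these notes.
litmod <- lm(LIFEEXP ~ LITERATE, data = Life)
newdata <- data.frame(LITERATE = 100 - 14.2)
(distpredict <- predict(litmod, newdata, interval = "prediction"))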
Actual = 73.9. The linear model predicts a much lower life expectancy.
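To illustrate the difference between a confidence interval and a prediction interval at a single literacy value, here is a small sketch. The data frame toy is synthetic and only stands in for Life (its values are made up for illustration), and toy_mod, cint and pint are hypothetical names.

```r
set.seed(1)

# Synthetic stand-in for the Life data (NOT the real values)
toy <- data.frame(LITERATE = runif(50, 20, 100))
toy$LIFEEXP <- 36 + 0.38 * toy$LITERATE + rnorm(50, sd = 6)

toy_mod <- lm(LIFEEXP ~ LITERATE, data = toy)

# The new-data column must carry the predictor's name from the formula
newdata <- data.frame(LITERATE = 100 - 14.2)

# Interval for the mean response at this literacy rate
cint <- predict(toy_mod, newdata, interval = "confidence")
# Interval for a single new country at this literacy rate
pint <- predict(toy_mod, newdata, interval = "prediction")

rbind(confidence = cint[1, ], prediction = pint[1, ])
```

Both intervals share the same fitted value; the prediction interval is wider because it also accounts for the scatter of an individual observation around the line.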
cresid1<-lmod$residuals
cresid1
1 2 3 4 5 6 7
-17.6288023 9.8854749 6.3153232 -18.2605617 13.4745141 -0.2964257 -6.3926035
8 9 10 11 12 13 14
8.5931192 2.3922195 -10.7859706 10.0219562 2.2095900 -14.3517944 -3.9418450
15 16 17 18 19 20 21
10.9285891 -4.3539288 4.8944130 -1.1496009 8.0305002 -13.5136253 6.2120067
22 23 24 25 26 27 28
11.9352221 -5.2376288 -8.4055870 -11.2022705 -2.3738276 -10.7288023 9.6978411
29 30 31 32 33 34 35
8.9214505 -16.2605617 -9.3539288 13.1219562 7.4769309 5.7788420 -6.5804606
36 37 38 39 40 41 42
-14.3155364 12.1338167 -12.7671947 3.1480939 7.3646763 -1.7793376 3.2168402
43 44 45 46 47 48 49
13.4697923 2.2409553 7.5072849 8.3631593 5.3470826 -10.6970430 -11.4901868
50 51 52 53 54 55 56
-7.9022705 3.0287008 3.2438777 -4.8453847 -9.9755155 2.1438777 -1.2221694
57 58 59 60 61 62 63
-3.2736044 6.0697923 5.5034627 -5.3155364 6.9081846 -3.8494893 3.2524218
64 65 66 67 68 69 70
1.0532099 9.4811471 6.1584374 -5.2567396 4.4400556 1.0192570 -0.9437561
71 72 73 74 75 76 77
8.2617539 12.5243729 1.8660817 -11.9343123 -8.1705111 -9.7428564 10.2423607
78 79 80 81 82 83 84
9.8489937 -6.9135137 -3.0964257 -17.2055870 -15.0022705 7.1887914 -7.5556166
85 86 87 88 89 90 91
5.1116127 2.9395499 -2.6453847 -13.3506123 10.5366275 2.7001462 -6.8605617
92 93 94 95 96 97 98
3.1253843 3.0844636 7.3769309 5.8243729 5.3532099 -4.7850708 -7.2334126
99 100 101 102 103 104 105
3.9111070 8.2181340 -16.9022705 -0.6069924 -9.4970430 1.9678812 3.3803590
106 107 108 109 110 111 112
8.0097016 10.4413494 -3.8506123 -14.4937265 4.0836755 8.6338167 1.2299945
113 114 115 116 117 118 119
7.8039684 -2.9055870 6.0186397 5.1086903 8.4565264 2.8931192 0.4855865
120 121 122 123 124 125 126
3.9845752 2.5376388 -16.5020473 -14.6055870 4.3053738 -13.1546052 7.0584374
127 128 129 130 131 132 133
7.6366275 2.8214505 5.5755255 2.4427548 3.4111070 5.3523101 -17.9022705
134 135 136 137 138 139 140
12.6205508 -1.7746157 6.2296005 2.7811471 -12.7249802 3.9054854 9.2115011
141 142 143 144 145 146 147
-3.2837771 7.7280835 -19.4738276 4.0088019 3.1040800 6.8205508 -3.7339183
148 149 150 151 152 153 154
-8.6506123 8.1413494 -0.3638782 -1.9539288 11.4963241 5.5717033 7.0305002
155 156 157 158 159 160 161
4.8788420 -18.5404396 -10.2605617 -7.0864762 8.3177400 7.5713093 5.1281951
162 163 164 165 166 167 168
-2.5025529 -6.9016532 9.1844636 3.6310059 11.4148175 0.2479823 -19.6671947
169
-19.4738276
hist(cresid1,xlab="residual", main="Plot of Residuals")
qqnorm(cresid1)
qqline(cresid1)
There is a relationship, but a straight line may not be the best way to describe it.
Correlation
summary(lmod)
Call:
lm(formula = LIFEEXP ~ PHYSICIAN, data = Life)
Residuals:
Min 1Q Median 3Q Max
-19.667 -6.580 2.781 6.908 13.475
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 59.547296 0.993539 59.94 <2e-16 ***
PHYSICIAN 0.051658 0.004927 10.48 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 8.666 on 167 degrees of freedom
Multiple R-squared: 0.397, Adjusted R-squared: 0.3934
F-statistic: 109.9 on 1 and 167 DF, p-value: < 2.2e-16
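In simple linear regression the slope t-test, the overall F-test and the correlation are tied together, which can be checked from the printed summary alone. The sketch below only re-uses numbers shown in the output above; the variable names are illustrative.

```r
# Values copied from the summary(lmod) output above
est    <- 0.051658   # PHYSICIAN slope estimate
se     <- 0.004927   # its standard error
r2     <- 0.397      # Multiple R-squared
df_res <- 167        # residual degrees of freedom

t_val  <- est / se                 # t statistic for the slope
f_val  <- t_val^2                  # with one predictor, F equals t squared
r      <- sign(est) * sqrt(r2)     # the correlation takes the sign of the slope
n      <- df_res + 2               # observations = residual df + 2 estimated coefficients
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_res

round(c(t = t_val, F = f_val, r = r, n = n, adj_r2 = adj_r2), 4)
```

Up to rounding, these reproduce the reported t value, F-statistic and adjusted R-squared, and give a correlation of roughly 0.63 between PHYSICIAN and LIFEEXP across 169 observations.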
The two variables look like they have a logarithmic rather than a linear relationship. Because the curvature is already clear from the plot, we do not need to run the other tests to decide whether a linear model is the best fit.
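One way to follow up on the curvature is to fit the same model on log(PHYSICIAN) and compare it with the straight-line fit. This is a sketch under stated assumptions: toy is synthetic data generated from a logarithmic curve (not the real Life values), and lin_mod and log_mod are hypothetical names. On the real data the same two lm() calls would be run with data = Life.

```r
set.seed(42)

# Synthetic stand-in for the Life data (NOT the real values),
# generated so that LIFEEXP rises with log(PHYSICIAN)
toy <- data.frame(PHYSICIAN = exp(runif(169, 0, 6)))
toy$LIFEEXP <- 45 + 5 * log(toy$PHYSICIAN) + rnorm(169, sd = 4)

lin_mod <- lm(LIFEEXP ~ PHYSICIAN, data = toy)        # straight-line fit
log_mod <- lm(LIFEEXP ~ log(PHYSICIAN), data = toy)   # fit on the log scale

r2_lin <- summary(lin_mod)$r.squared
r2_log <- summary(log_mod)$r.squared

c(linear = r2_lin, log = r2_log)   # share of variance explained by each fit
AIC(lin_mod, log_mod)              # lower AIC indicates the better-fitting model
```

Since both models have the same response and the same number of coefficients, R-squared and AIC can be compared directly between them.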
?stepAIC