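# Load the WHO data and plot life expectancy against total expenditure
WhoStart <- read.csv("who.csv")
plot(WhoStart$TotExp, WhoStart$LifeExp)
# Simple linear regression of life expectancy on total expenditure
WhoStartLM <- lm(LifeExp ~ TotExp, data = WhoStart)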
summary(WhoStartLM)
##
## Call:
## lm(formula = LifeExp ~ TotExp, data = WhoStart)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -24.764  -4.778   3.154   7.116  13.292
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.475e+01  7.535e-01  85.933  < 2e-16 ***
## TotExp      6.297e-05  7.795e-06   8.079 7.71e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 9.371 on 188 degrees of freedom
## Multiple R-squared: 0.2577, Adjusted R-squared: 0.2537
## F-statistic: 65.26 on 1 and 188 DF, p-value: 7.714e-14
abline(WhoStartLM)
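The \(p\)-values and \(F\)-statistic indicate a statistically significant relationship between total expenditure and life expectancy: the \(F\)-statistic is large (65.26), its \(p\)-value is tiny (7.714e-14), and both the intercept and the slope are individually significant. Statistical significance is not the same as a good fit, however. The residual standard error is 9.371 years, and with \(R^2 = 0.2577\) (adjusted \(R^2 = 0.2537\)) the model explains only about a quarter of the variance in life expectancy. The residuals are also skewed (median 3.154, minimum -24.764), which suggests that a straight line does not capture the shape of the data.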
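The link between the reported \(R^2\) and the \(F\)-statistic can be checked by hand. The sketch below is a minimal illustration that uses only numbers copied from the summary output above; the variable names are illustrative and not part of the original analysis.

```r
# Values copied from the summary(WhoStartLM) output above.
r_squared    <- 0.2577  # Multiple R-squared
resid_df     <- 188     # residual degrees of freedom
n_predictors <- 1       # one predictor: TotExp

# For least squares with an intercept:
# F = (R^2 / k) / ((1 - R^2) / residual df)
f_stat <- (r_squared / n_predictors) / ((1 - r_squared) / resid_df)
f_stat

# Upper-tail p-value from the F distribution
p_value <- pf(f_stat, n_predictors, resid_df, lower.tail = FALSE)
p_value
```

The result agrees with the reported 65.26 up to rounding of \(R^2\). Because the ratio \(R^2 / (1 - R^2)\) is multiplied by the residual degrees of freedom, a large \(F\) can coexist with a modest \(R^2\) when there are many observations.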
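# Copy the data, then transform both variables before refitting
WhoNu <- WhoStart
WhoNu$LifeExp <- WhoNu$LifeExp^4.6  # response: LifeExp^4.6
WhoNu$TotExp <- WhoNu$TotExp^.06    # predictor: TotExp^0.06
WhoNuLM <- lm(LifeExp ~ TotExp, data = WhoNu)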
summary(WhoNuLM)
##
## Call:
## lm(formula = LifeExp ~ TotExp, data = WhoNu)
##
## Residuals:
##        Min         1Q     Median         3Q        Max
## -308616089  -53978977   13697187   59139231  211951764
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) -736527910   46817945  -15.73   <2e-16 ***
## TotExp       620060216   27518940   22.53   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 90490000 on 188 degrees of freedom
## Multiple R-squared: 0.7298, Adjusted R-squared: 0.7283
## F-statistic: 507.7 on 1 and 188 DF, p-value: < 2.2e-16
plot(WhoNu$TotExp,WhoNu$LifeExp)
abline(WhoNuLM)
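This is a better model. The \(F\)-statistic is nearly an order of magnitude larger (507.7 versus 65.26), and the \(R^2\) and adjusted \(R^2\) values have almost tripled (0.7298 and 0.7283 versus 0.2577 and 0.2537). The \(p\)-values are as low as before, if not lower. The residuals and residual standard error are numerically much larger, but they are now measured on the scale of \(\text{LifeExp}^{4.6}\), so they cannot be compared directly with those of the first model. The higher adjusted \(R^2\) makes this model more useful for predicting values.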
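Because the transformed response is in units of years raised to the power 4.6, the residual standard error of 90490000 is hard to read. One rough way to translate it back to years is a first-order (delta-method) approximation, sketched below. This is an illustrative assumption rather than something the original analysis does, and the life expectancies of 60, 70 and 80 are arbitrary reference points.

```r
# Residual standard error copied from the summary(WhoNuLM) output above
rse_transformed <- 90490000

# Arbitrary reference life expectancies (assumption, for illustration only)
life_exp <- c(60, 70, 80)

# The derivative of y^4.6 with respect to y is 4.6 * y^3.6, so an error of
# size e on the transformed scale is roughly e / (4.6 * y^3.6) years.
slope <- 4.6 * life_exp^3.6
approx_rse_years <- rse_transformed / slope

data.frame(life_exp, transformed = life_exp^4.6, approx_rse_years)
```

The approximation is crude, especially at lower life expectancies where the error is large relative to the transformed value, but it gives a sense of scale that the raw figure does not.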
Tot1.5<-predict(WhoNuLM, data.frame(TotExp=1.5))
Tot1.5^(1/4.6)
## 1
## 63.31153
Tot2.5<-predict(WhoNuLM, data.frame(TotExp=2.5))
Tot2.5^(1/4.6)
## 1
## 86.50645
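As a sanity check, predicted life expectancies of about 63.3 and 86.5 years are plausible. Note that the inputs 1.5 and 2.5 are values of \(\text{TotExp}^{0.06}\), not raw expenditures, and that the predictions are raised to the power \(1/4.6\) to return them to years.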
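The two predictions can be reproduced directly from the fitted coefficients, which makes the back-transformation explicit. The sketch below uses the rounded coefficients printed in the summary output; `back_transform` is a hypothetical helper name introduced here for illustration.

```r
# Coefficients copied from the summary(WhoNuLM) output above
b0 <- -736527910  # intercept
b1 <-  620060216  # slope on TotExp^0.06

# Predict LifeExp^4.6, then undo the power transformation to get years.
back_transform <- function(tot_exp_06) {
  (b0 + b1 * tot_exp_06)^(1 / 4.6)
}

predicted_years <- back_transform(c(1.5, 2.5))
predicted_years

# Raw expenditure values corresponding to the transformed inputs
raw_tot_exp <- c(1.5, 2.5)^(1 / 0.06)
raw_tot_exp
```

The last two lines convert the transformed inputs back to the original expenditure scale, which is useful for judging whether a prediction falls inside the range of the observed data.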
WhoStartLM2<-lm(LifeExp~PropMD+TotExp+PropMD*TotExp, data=WhoStart)
summary(WhoStartLM2)
##
## Call:
## lm(formula = LifeExp ~ PropMD + TotExp + PropMD * TotExp, data = WhoStart)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -27.320  -4.132   2.098   6.540  13.074
##
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)
## (Intercept)    6.277e+01  7.956e-01  78.899  < 2e-16 ***
## PropMD         1.497e+03  2.788e+02   5.371 2.32e-07 ***
## TotExp         7.233e-05  8.982e-06   8.053 9.39e-14 ***
## PropMD:TotExp -6.026e-03  1.472e-03  -4.093 6.35e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 8.765 on 186 degrees of freedom
## Multiple R-squared: 0.3574, Adjusted R-squared: 0.3471
## F-statistic: 34.49 on 3 and 186 DF, p-value: < 2.2e-16
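A note on the formula used above: in R's formula syntax, `PropMD * TotExp` already expands to both main effects plus their interaction, so writing the main effects separately is harmless but redundant. The short check below illustrates this without needing the data; the names `f_long` and `f_short` are illustrative.

```r
# The formula as written in the model above
f_long <- LifeExp ~ PropMD + TotExp + PropMD * TotExp
# The shorter equivalent
f_short <- LifeExp ~ PropMD * TotExp

# terms() expands a formula into its individual model terms
terms_long <- attr(terms(f_long), "term.labels")
terms_short <- attr(terms(f_short), "term.labels")

terms_long
terms_short
identical(terms_long, terms_short)
```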
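All of the coefficients are significant, and the overall \(F\)-test has a very small \(p\)-value. Once again, however, the adjusted \(R^2\) is mediocre: at 0.3471, the model explains only about a third of the variance in life expectancy, so it is only weakly predictive.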
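The adjusted \(R^2\) penalizes \(R^2\) for the number of predictors. As a minimal check, the reported value can be recomputed from the numbers in the summary output; the sample size of 190 is inferred here from the 186 residual degrees of freedom plus the 4 estimated coefficients.

```r
# Values copied from the summary(WhoStartLM2) output above
r_squared    <- 0.3574
n_predictors <- 3          # PropMD, TotExp, PropMD:TotExp
n_obs        <- 186 + 4    # residual df + estimated coefficients (inferred)

# Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)
adj_r_squared <- 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)
adj_r_squared
```

With 190 observations and only three predictors the penalty is small, which is why the adjusted value sits so close to the unadjusted one.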
predict(WhoStartLM2,data.frame(PropMD=.03,TotExp=14))
## 1
## 107.696
This is unreasonable. Any country with an average life expectancy of 108 would have people routinely living far past the records for the longest-lived individuals. In Japan, the country with the second-highest life expectancy, the value is 85. Japan also has the greatest proportion of centenarians in the world, with .048% of its population over the age of 100. If the country with the most centenarians (both in absolute terms and proportionally) has a life expectancy of only 85, no real country could plausibly reach 108.
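To see where the implausible value comes from, the prediction can be broken into the contribution of each term. The sketch below uses the rounded coefficients printed in the summary output, so the total differs slightly from the 107.696 returned by `predict()`; the variable names are illustrative.

```r
# Coefficients copied from the summary(WhoStartLM2) output above
coefs <- c(intercept   = 6.277e+01,
           PropMD      = 1.497e+03,
           TotExp      = 7.233e-05,
           interaction = -6.026e-03)

# Inputs used in the predict() call above
prop_md <- 0.03
tot_exp <- 14

# Contribution of each term to the predicted life expectancy
contributions <- c(
  intercept   = coefs[["intercept"]],
  PropMD      = coefs[["PropMD"]] * prop_md,
  TotExp      = coefs[["TotExp"]] * tot_exp,
  interaction = coefs[["interaction"]] * prop_md * tot_exp
)

contributions
total_prediction <- sum(contributions)
total_prediction
```

Nearly all of the increase over the intercept comes from the linear `PropMD` term, while the `TotExp` and interaction terms are negligible at these inputs. This suggests the model is extrapolating a straight-line effect of `PropMD` beyond what is realistic.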