Textbook: Linear Models with R, 2nd, Julian J. Faraway
data(fat,package="faraway")
Description
Age, weight, height, and 10 body circumference measurements are recorded for 252 men. Each man’s percentage of body fat was accurately estimated by an underwater weighing technique.
head(fat)
## brozek siri density age weight height adipos free neck chest abdom hip
## 1 12.6 12.3 1.0708 23 154.25 67.75 23.7 134.9 36.2 93.1 85.2 94.5
## 2 6.9 6.1 1.0853 22 173.25 72.25 23.4 161.3 38.5 93.6 83.0 98.7
## 3 24.6 25.3 1.0414 22 154.00 66.25 24.7 116.0 34.0 95.8 87.9 99.2
## 4 10.9 10.4 1.0751 26 184.75 72.25 24.9 164.7 37.4 101.8 86.4 101.2
## 5 27.8 28.7 1.0340 24 184.25 71.25 25.6 133.1 34.4 97.3 100.0 101.9
## 6 20.6 20.9 1.0502 24 210.25 74.75 26.5 167.0 39.0 104.5 94.4 107.8
## thigh knee ankle biceps forearm wrist
## 1 59.0 37.3 21.9 32.0 27.4 17.1
## 2 58.7 37.3 23.4 30.5 28.9 18.2
## 3 59.6 38.9 24.0 28.8 25.2 16.6
## 4 60.1 37.3 22.8 32.4 29.4 18.2
## 5 63.2 42.2 24.0 32.2 27.7 17.7
## 6 66.0 42.0 25.6 35.7 30.6 18.8
dim(fat) #252 observations on 18 variables.
## [1] 252 18
Fit a linear regression model.
y = brozek: “Percent body fat using Brozek’s equation, 457/Density - 414.2.”
\[ Y=X\beta +\varepsilon, \varepsilon \sim N(0,\sigma^2 I)\]
lmod<-lm(brozek~age+weight+height+neck+chest+abdom+hip+thigh+knee+ankle+biceps+forearm+wrist,data=fat)
summary(lmod)
##
## Call:
## lm(formula = brozek ~ age + weight + height + neck + chest +
## abdom + hip + thigh + knee + ankle + biceps + forearm + wrist,
## data = fat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -10.264 -2.572 -0.097 2.898 9.327
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -15.29255 16.06992 -0.952 0.34225
## age 0.05679 0.02996 1.895 0.05929 .
## weight -0.08031 0.04958 -1.620 0.10660
## height -0.06460 0.08893 -0.726 0.46830
## neck -0.43754 0.21533 -2.032 0.04327 *
## chest -0.02360 0.09184 -0.257 0.79740
## abdom 0.88543 0.08008 11.057 < 2e-16 ***
## hip -0.19842 0.13516 -1.468 0.14341
## thigh 0.23190 0.13372 1.734 0.08418 .
## knee -0.01168 0.22414 -0.052 0.95850
## ankle 0.16354 0.20514 0.797 0.42614
## biceps 0.15280 0.15851 0.964 0.33605
## forearm 0.43049 0.18445 2.334 0.02044 *
## wrist -1.47654 0.49552 -2.980 0.00318 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.988 on 238 degrees of freedom
## Multiple R-squared: 0.749, Adjusted R-squared: 0.7353
## F-statistic: 54.63 on 13 and 238 DF, p-value: < 2.2e-16
For individual \(i\), \(y_i = \beta_0 + \beta_1 x_{i1} + ... + \beta_p x_{i(p-1)} + \varepsilon_i = x_i^T \beta + \varepsilon_i\), where \(\varepsilon_i \sim N(0,\sigma^2)\)
new point: \(x_0 = (x_{01},...,x_{0(p-1)})\) where \(x_{0j}=median(x_{1j},...,x_{nj})\)
x<-model.matrix(lmod)
(x0<-apply(x,2,median))
## (Intercept) age weight height neck chest
## 1.00 43.00 176.50 70.00 38.00 99.65
## abdom hip thigh knee ankle biceps
## 90.95 99.30 59.00 38.50 22.80 32.05
## forearm wrist
## 28.70 18.30
(y0<-sum(x0*coef(lmod))) # point estimate
## [1] 17.49322
predict(lmod,new=data.frame(t(x0)))
## 1
## 17.49322
s.lmod<-summary(lmod)
# 95% CI for the prediction
predict.lower<-y0-qt(0.975,s.lmod$df[2])*s.lmod$sigma*sqrt(1+t(x0)%*%s.lmod$cov.unscaled%*%as.matrix(x0)) #lower bound
predict.upper<-y0+qt(0.975,s.lmod$df[2])*s.lmod$sigma*sqrt(1+t(x0)%*%s.lmod$cov.unscaled%*%as.matrix(x0)) #upper bound
c(predict.lower, predict.upper)
## [1] 9.61783 25.36861
predict(lmod,new=data.frame(t(x0)),interval="prediction") # CI for prediction
## fit lwr upr
## 1 17.49322 9.61783 25.36861
# 95% CI for the prediction
p.mean.lower<-y0-qt(0.975,s.lmod$df[2])*s.lmod$sigma*sqrt(t(x0)%*%s.lmod$cov.unscaled%*%as.matrix(x0)) #lower bound
p.mean.upper<-y0+qt(0.975,s.lmod$df[2])*s.lmod$sigma*sqrt(t(x0)%*%s.lmod$cov.unscaled%*%as.matrix(x0)) #upper bound
c(p.mean.lower, p.mean.upper)
## [1] 16.94426 18.04219
predict(lmod,new=data.frame(t(x0)),interval="confidence") # CI for mean response
## fit lwr upr
## 1 17.49322 16.94426 18.04219