Body Fat Predicted from Body Measurements (All Parameters Version)

s188026 Yifei Liu

bodyfat: http://garthtarr.github.io/mplot/reference/bodyfat.html

(Interpretation and comments accompany every step; the conclusion is at the end.)

extract data

#install.packages("mplot")
library(mplot)
data(bodyfat)
Fat    = bodyfat$Bodyfat
Neck   = bodyfat$Neck
Chest  = bodyfat$Chest
Abdo   = bodyfat$Abdo
Hip    = bodyfat$Hip
Thigh  = bodyfat$Thigh
Knee   = bodyfat$Knee
Ankle  = bodyfat$Ankle
Bic    = bodyfat$Bic
Fore   = bodyfat$Fore
Wrist  = bodyfat$Wrist
Age    = bodyfat$Age
Height = bodyfat$Height
Weight = bodyfat$Weight
Bodyfat<-data.frame(Fat, Age, Height, Weight, Neck, Chest, Abdo, Hip, Thigh, Knee, Ankle, Bic, Fore, Wrist)

cor()

corResult<-cor(Bodyfat)
corResult[,1]
##       Fat       Age    Height    Weight      Neck     Chest      Abdo       Hip 
## 1.0000000 0.2543748 0.1513761 0.6615254 0.4647812 0.7379384 0.8382541 0.6273400 
##     Thigh      Knee     Ankle       Bic      Fore     Wrist 
## 0.5225635 0.4876076 0.3595300 0.4915120 0.3403827 0.4200907

This is the correlation between body fat and each of the other parameters. Abdo (0.84) and Chest (0.74) are the most strongly correlated with Fat; Height (0.15) and Age (0.25) are the weakest.
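
The ranking is easier to read when the correlations are sorted. The sketch below re-enters the rounded values printed above by hand (the vector name `cor_with_fat` is only illustrative); with the data loaded, `sort(corResult[-1, 1], decreasing = TRUE)` should give the same ordering.

```r
# Correlations with Fat, copied (rounded to 3 decimals) from the cor() output above
cor_with_fat <- c(
  Age = 0.254, Height = 0.151, Weight = 0.662, Neck = 0.465,
  Chest = 0.738, Abdo = 0.838, Hip = 0.627, Thigh = 0.523,
  Knee = 0.488, Ankle = 0.360, Bic = 0.492, Fore = 0.340, Wrist = 0.420
)

# Sort from the strongest to the weakest correlation
ranked <- sort(cor_with_fat, decreasing = TRUE)
print(ranked)

strongest <- names(ranked)[1:2]            # two most correlated parameters
weakest   <- names(ranked)[length(ranked)] # least correlated parameter
```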

build the model

Model<-lm(Fat ~ Age+Height+Weight+Neck+Chest+Abdo+Hip+Thigh+Knee+Ankle+Bic+Fore+Wrist)
summary(Model)
## 
## Call:
## lm(formula = Fat ~ Age + Height + Weight + Neck + Chest + Abdo + 
##     Hip + Thigh + Knee + Ankle + Bic + Fore + Wrist)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.3767 -2.5514 -0.1723  2.6391  9.1393 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -52.553646  40.062856  -1.312   0.1922    
## Age           0.009288   0.043470   0.214   0.8312    
## Height        0.258388   0.320810   0.805   0.4223    
## Weight       -0.271016   0.243569  -1.113   0.2682    
## Neck         -0.592669   0.322125  -1.840   0.0684 .  
## Chest         0.090883   0.164738   0.552   0.5822    
## Abdo          0.995184   0.123072   8.086 7.29e-13 ***
## Hip          -0.141981   0.204533  -0.694   0.4890    
## Thigh         0.101272   0.200714   0.505   0.6148    
## Knee         -0.096682   0.325889  -0.297   0.7673    
## Ankle        -0.048017   0.507695  -0.095   0.9248    
## Bic           0.075332   0.244105   0.309   0.7582    
## Fore          0.412107   0.272144   1.514   0.1327    
## Wrist        -0.263067   0.745145  -0.353   0.7247    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.081 on 114 degrees of freedom
## Multiple R-squared:  0.7519, Adjusted R-squared:  0.7236 
## F-statistic: 26.57 on 13 and 114 DF,  p-value: < 2.2e-16

Abdo's coefficient is significantly different from 0 at the P < 0.05 level. Neck (p = 0.0684) is only marginally significant, so it should be kept under consideration too.

Multiple R-squared: 0.7519 shows that the predictor variables explain 75.19% of the variance in body fat.

Residual standard error: 4.081 shows that the typical prediction error for body fat is about 4.081 percentage points in this model.

The model still needs to be optimized; this is done in the later steps.
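
As a cross-check on the summary, the adjusted R-squared can be recomputed from the multiple R-squared with the standard formula 1 - (1 - R2)(n - 1)/(n - p - 1). This is a small sketch; the helper name `adj_r2` is hypothetical, and n = 128 is inferred from the 114 residual degrees of freedom plus 13 predictors and the intercept.

```r
# Adjusted R-squared from R-squared, sample size n and number of predictors p
adj_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

# 114 residual df + 13 predictors + 1 intercept = 128 observations
n <- 114 + 13 + 1

# Full model: Multiple R-squared 0.7519 with 13 predictors
full_adj <- adj_r2(0.7519, n = n, p = 13)
print(round(full_adj, 4))
```

The result agrees with the Adjusted R-squared of 0.7236 reported by summary(Model).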

diagnose

library(car)
## Loading required package: carData
plot(Model)

qqPlot(Model,id.method='identify',simulate = TRUE,labels=row.names(Bodyfat),main='Q-Q plot')

## [1] 60 73

Residuals vs Fitted: the residuals scatter around 0 with no clear pattern -> OK

Normal Q-Q: combined with qqPlot, the points lie within the confidence envelope -> OK

Scale-Location: the variance is roughly constant -> OK

Residuals vs Leverage: every point lies inside the Cook's distance 0.5 contour -> OK
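
The Cook's distance check can also be done numerically instead of reading it off the plot. The sketch below uses a stand-in model on the built-in mtcars data so that it runs on its own; it is not the bodyfat model. Replacing `fit` with `Model` would apply the same check here.

```r
# Stand-in fit on the built-in mtcars data (NOT the bodyfat model)
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# One Cook's distance per observation
cooks <- cooks.distance(fit)

# Observations beyond the 0.5 contour drawn in the Residuals vs Leverage plot
influential <- names(cooks)[cooks > 0.5]
print(influential)

# Largest Cook's distance and the observation it belongs to
print(cooks[which.max(cooks)])
```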

independence

durbinWatsonTest(Model)
##  lag Autocorrelation D-W Statistic p-value
##    1     -0.07255836      2.121203    0.55
##  Alternative hypothesis: rho != 0

p = 0.55 > 0.05, so there is no evidence of autocorrelation in the residuals.
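
For reference, the D-W statistic is the sum of squared differences between consecutive residuals divided by the sum of squared residuals, and it is roughly 2 * (1 - lag-1 autocorrelation). In the output above, 2 * (1 + 0.0726) is about 2.15, close to the reported 2.12. The sketch below checks this relation on simulated data; the data and variable names are assumptions for illustration, not the bodyfat data.

```r
# Simulated stand-in data with independent errors (NOT the bodyfat data)
set.seed(1)
n <- 128
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)
fit <- lm(y ~ x)
e <- residuals(fit)

# Durbin-Watson statistic computed by hand
dw <- sum(diff(e)^2) / sum(e^2)

# Lag-1 autocorrelation of the residuals
r1 <- sum(e[-1] * e[-n]) / sum(e^2)

# dw is close to 2 * (1 - r1); values near 2 indicate no autocorrelation
print(c(dw = dw, approx = 2 * (1 - r1)))
```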

homoscedasticity

ncvTest(Model)
## Non-constant Variance Score Test 
## Variance formula: ~ fitted.values 
## Chisquare = 0.8365218, Df = 1, p = 0.36039

p = 0.36 > 0.05, so there is no evidence against constant variance.

VIF multicollinearity

vif(Model)
##       Age    Height    Weight      Neck     Chest      Abdo       Hip     Thigh 
##  2.256602  4.323426 64.038005  3.849251 13.233597 10.759846 12.318911  7.432930 
##      Knee     Ankle       Bic      Fore     Wrist 
##  4.506660  3.382574  4.086372  2.344915  3.662724

The VIF of Weight (64.04) is much bigger than 10 -> delete it

Model<-lm(Fat ~ Age+Height+Neck+Chest+Abdo+Hip+Thigh+Knee+Ankle+Bic+Fore+Wrist)
summary(Model)
## 
## Call:
## lm(formula = Fat ~ Age + Height + Neck + Chest + Abdo + Hip + 
##     Thigh + Knee + Ankle + Bic + Fore + Wrist)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.3267 -2.4316 -0.1254  2.7091  9.3941 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -10.31143   12.80995  -0.805   0.4225    
## Age           0.01850    0.04272   0.433   0.6658    
## Height       -0.01202    0.20965  -0.057   0.9544    
## Neck         -0.67730    0.31334  -2.162   0.0327 *  
## Chest        -0.03454    0.12026  -0.287   0.7745    
## Abdo          0.94212    0.11358   8.295 2.33e-13 ***
## Hip          -0.25301    0.17872  -1.416   0.1596    
## Thigh         0.06148    0.19771   0.311   0.7564    
## Knee         -0.14806    0.32293  -0.458   0.6475    
## Ankle        -0.21031    0.48680  -0.432   0.6665    
## Bic           0.03261    0.24132   0.135   0.8927    
## Fore          0.34778    0.26621   1.306   0.1940    
## Wrist        -0.39793    0.73598  -0.541   0.5898    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.085 on 115 degrees of freedom
## Multiple R-squared:  0.7492, Adjusted R-squared:  0.723 
## F-statistic: 28.62 on 12 and 115 DF,  p-value: < 2.2e-16
vif(Model)
##      Age   Height     Neck    Chest     Abdo      Hip    Thigh     Knee 
## 2.174732 1.842510 3.634663 7.037885 9.144421 9.386768 7.196930 4.416175 
##    Ankle      Bic     Fore    Wrist 
## 3.103387 3.985286 2.239099 3.565813

All VIFs are now smaller than 10 -> OK
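
The VIF of a predictor is 1 / (1 - R2), where R2 comes from regressing that predictor on all the other predictors. The sketch below computes it by hand on stand-in predictors from the built-in mtcars data (an assumption so the code runs on its own); for the bodyfat model, vif(Model) from car is the convenient route.

```r
# Stand-in predictors from the built-in mtcars data (NOT the bodyfat data)
X <- mtcars[, c("wt", "hp", "disp", "qsec")]

manual_vif <- sapply(names(X), function(v) {
  # Regress predictor v on all the other predictors
  others <- setdiff(names(X), v)
  r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
  # A high R2 means v is nearly a linear combination of the others
  1 / (1 - r2)
})
print(round(manual_vif, 2))
```

The same values appear on the diagonal of the inverse of the predictors' correlation matrix, `diag(solve(cor(X)))`.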

Computing best subsets regression

1. regression

2. plot

3. Model selection criteria: Adjusted R2, Cp and BIC

library(leaps)
leaps1<-regsubsets(Fat ~ Age+Height+Neck+Chest+Abdo+Hip+Thigh+Knee+Ankle+Bic+Fore+Wrist, data = Bodyfat, nvmax = 12)
leaps2<-regsubsets(Fat ~ Age+Height+Neck+Chest+Abdo+Hip+Thigh+Knee+Ankle+Bic+Fore+Wrist, data = Bodyfat, nbest = 12)
res.sum <- summary(leaps1)
res.sum
## Subset selection object
## Call: regsubsets.formula(Fat ~ Age + Height + Neck + Chest + Abdo + 
##     Hip + Thigh + Knee + Ankle + Bic + Fore + Wrist, data = Bodyfat, 
##     nvmax = 12)
## 12 Variables  (and intercept)
##        Forced in Forced out
## Age        FALSE      FALSE
## Height     FALSE      FALSE
## Neck       FALSE      FALSE
## Chest      FALSE      FALSE
## Abdo       FALSE      FALSE
## Hip        FALSE      FALSE
## Thigh      FALSE      FALSE
## Knee       FALSE      FALSE
## Ankle      FALSE      FALSE
## Bic        FALSE      FALSE
## Fore       FALSE      FALSE
## Wrist      FALSE      FALSE
## 1 subsets of each size up to 12
## Selection Algorithm: exhaustive
##           Age Height Neck Chest Abdo Hip Thigh Knee Ankle Bic Fore Wrist
## 1  ( 1 )  " " " "    " "  " "   "*"  " " " "   " "  " "   " " " "  " "  
## 2  ( 1 )  " " " "    "*"  " "   "*"  " " " "   " "  " "   " " " "  " "  
## 3  ( 1 )  " " " "    "*"  " "   "*"  "*" " "   " "  " "   " " " "  " "  
## 4  ( 1 )  " " " "    "*"  " "   "*"  "*" " "   " "  " "   " " "*"  " "  
## 5  ( 1 )  " " " "    "*"  " "   "*"  "*" " "   " "  "*"   " " "*"  " "  
## 6  ( 1 )  " " " "    "*"  " "   "*"  "*" " "   " "  "*"   " " "*"  "*"  
## 7  ( 1 )  " " " "    "*"  " "   "*"  "*" " "   "*"  "*"   " " "*"  "*"  
## 8  ( 1 )  "*" " "    "*"  " "   "*"  "*" " "   "*"  "*"   " " "*"  "*"  
## 9  ( 1 )  "*" " "    "*"  " "   "*"  "*" "*"   "*"  "*"   " " "*"  "*"  
## 10  ( 1 ) "*" " "    "*"  "*"   "*"  "*" "*"   "*"  "*"   " " "*"  "*"  
## 11  ( 1 ) "*" " "    "*"  "*"   "*"  "*" "*"   "*"  "*"   "*" "*"  "*"  
## 12  ( 1 ) "*" "*"    "*"  "*"   "*"  "*" "*"   "*"  "*"   "*" "*"  "*"
data.frame(
  Adj.R2 = which.max(res.sum$adjr2),
  CP = which.min(res.sum$cp),
  BIC = which.min(res.sum$bic)
)
##   Adj.R2 CP BIC
## 1      4  3   3
plot(leaps1,scale = 'adjr2')

leaps1 sets nvmax, the maximum subset size to examine. It selects the 4-variable model by adjusted R2 and the 3-variable model by Cp and BIC.

4 is Neck+Abdo+Hip+Fore

3 is Neck+Abdo+Hip (the same model without Fore)

res.sum <- summary(leaps2)
res.sum
## Subset selection object
## Call: regsubsets.formula(Fat ~ Age + Height + Neck + Chest + Abdo + 
##     Hip + Thigh + Knee + Ankle + Bic + Fore + Wrist, data = Bodyfat, 
##     nbest = 12)
## 12 Variables  (and intercept)
##        Forced in Forced out
## Age        FALSE      FALSE
## Height     FALSE      FALSE
## Neck       FALSE      FALSE
## Chest      FALSE      FALSE
## Abdo       FALSE      FALSE
## Hip        FALSE      FALSE
## Thigh      FALSE      FALSE
## Knee       FALSE      FALSE
## Ankle      FALSE      FALSE
## Bic        FALSE      FALSE
## Fore       FALSE      FALSE
## Wrist      FALSE      FALSE
## 12 subsets of each size up to 8
## Selection Algorithm: exhaustive
##           Age Height Neck Chest Abdo Hip Thigh Knee Ankle Bic Fore Wrist
## 1  ( 1 )  " " " "    " "  " "   "*"  " " " "   " "  " "   " " " "  " "  
## 1  ( 2 )  " " " "    " "  "*"   " "  " " " "   " "  " "   " " " "  " "  
## 1  ( 3 )  " " " "    " "  " "   " "  "*" " "   " "  " "   " " " "  " "  
## 1  ( 4 )  " " " "    " "  " "   " "  " " "*"   " "  " "   " " " "  " "  
## 1  ( 5 )  " " " "    " "  " "   " "  " " " "   " "  " "   "*" " "  " "  
## 1  ( 6 )  " " " "    " "  " "   " "  " " " "   "*"  " "   " " " "  " "  
## 1  ( 7 )  " " " "    "*"  " "   " "  " " " "   " "  " "   " " " "  " "  
## 1  ( 8 )  " " " "    " "  " "   " "  " " " "   " "  " "   " " " "  "*"  
## 1  ( 9 )  " " " "    " "  " "   " "  " " " "   " "  "*"   " " " "  " "  
## 1  ( 10 ) " " " "    " "  " "   " "  " " " "   " "  " "   " " "*"  " "  
## 1  ( 11 ) "*" " "    " "  " "   " "  " " " "   " "  " "   " " " "  " "  
## 1  ( 12 ) " " "*"    " "  " "   " "  " " " "   " "  " "   " " " "  " "  
## 2  ( 1 )  " " " "    "*"  " "   "*"  " " " "   " "  " "   " " " "  " "  
## 2  ( 2 )  " " " "    " "  " "   "*"  "*" " "   " "  " "   " " " "  " "  
## 2  ( 3 )  " " " "    " "  " "   "*"  " " " "   " "  " "   " " " "  "*"  
## 2  ( 4 )  " " " "    " "  " "   "*"  " " " "   " "  "*"   " " " "  " "  
## 2  ( 5 )  " " " "    " "  " "   "*"  " " " "   "*"  " "   " " " "  " "  
## 2  ( 6 )  " " " "    " "  " "   "*"  " " "*"   " "  " "   " " " "  " "  
## 2  ( 7 )  " " "*"    " "  " "   "*"  " " " "   " "  " "   " " " "  " "  
## 2  ( 8 )  " " " "    " "  " "   "*"  " " " "   " "  " "   "*" " "  " "  
## 2  ( 9 )  " " " "    " "  "*"   "*"  " " " "   " "  " "   " " " "  " "  
## 2  ( 10 ) "*" " "    " "  " "   "*"  " " " "   " "  " "   " " " "  " "  
## 2  ( 11 ) " " " "    " "  " "   "*"  " " " "   " "  " "   " " "*"  " "  
## 2  ( 12 ) " " " "    "*"  "*"   " "  " " " "   " "  " "   " " " "  " "  
## 3  ( 1 )  " " " "    "*"  " "   "*"  "*" " "   " "  " "   " " " "  " "  
## 3  ( 2 )  " " " "    " "  " "   "*"  "*" " "   " "  " "   " " " "  "*"  
## 3  ( 3 )  " " " "    "*"  " "   "*"  " " " "   " "  "*"   " " " "  " "  
## 3  ( 4 )  " " " "    "*"  " "   "*"  " " " "   "*"  " "   " " " "  " "  
## 3  ( 5 )  " " " "    "*"  " "   "*"  " " "*"   " "  " "   " " " "  " "  
## 3  ( 6 )  " " "*"    "*"  " "   "*"  " " " "   " "  " "   " " " "  " "  
## 3  ( 7 )  "*" " "    "*"  " "   "*"  " " " "   " "  " "   " " " "  " "  
## 3  ( 8 )  " " " "    "*"  " "   "*"  " " " "   " "  " "   " " " "  "*"  
## 3  ( 9 )  " " " "    "*"  " "   "*"  " " " "   " "  " "   " " "*"  " "  
## 3  ( 10 ) " " " "    " "  " "   "*"  "*" " "   " "  "*"   " " " "  " "  
## 3  ( 11 ) " " " "    " "  " "   "*"  "*" " "   "*"  " "   " " " "  " "  
## 3  ( 12 ) " " " "    " "  " "   "*"  " " "*"   " "  " "   " " " "  "*"  
## 4  ( 1 )  " " " "    "*"  " "   "*"  "*" " "   " "  " "   " " "*"  " "  
## 4  ( 2 )  " " " "    "*"  " "   "*"  "*" " "   " "  " "   "*" " "  " "  
## 4  ( 3 )  " " " "    "*"  " "   "*"  "*" " "   "*"  " "   " " " "  " "  
## 4  ( 4 )  " " " "    "*"  " "   "*"  "*" " "   " "  "*"   " " " "  " "  
## 4  ( 5 )  " " " "    "*"  " "   "*"  "*" " "   " "  " "   " " " "  "*"  
## 4  ( 6 )  " " " "    "*"  " "   "*"  "*" "*"   " "  " "   " " " "  " "  
## 4  ( 7 )  " " "*"    "*"  " "   "*"  "*" " "   " "  " "   " " " "  " "  
## 4  ( 8 )  "*" " "    "*"  " "   "*"  "*" " "   " "  " "   " " " "  " "  
## 4  ( 9 )  " " " "    "*"  "*"   "*"  "*" " "   " "  " "   " " " "  " "  
## 4  ( 10 ) " " " "    "*"  " "   "*"  " " " "   " "  "*"   " " "*"  " "  
## 4  ( 11 ) " " " "    "*"  " "   "*"  " " " "   "*"  " "   " " "*"  " "  
## 4  ( 12 ) " " " "    "*"  " "   "*"  " " " "   "*"  "*"   " " " "  " "  
## 5  ( 1 )  " " " "    "*"  " "   "*"  "*" " "   " "  "*"   " " "*"  " "  
## 5  ( 2 )  " " " "    "*"  " "   "*"  "*" " "   "*"  " "   " " "*"  " "  
## 5  ( 3 )  " " " "    "*"  " "   "*"  "*" " "   " "  " "   " " "*"  "*"  
## 5  ( 4 )  " " "*"    "*"  " "   "*"  "*" " "   " "  " "   " " "*"  " "  
## 5  ( 5 )  " " " "    "*"  "*"   "*"  "*" " "   " "  " "   " " "*"  " "  
## 5  ( 6 )  " " " "    "*"  " "   "*"  "*" " "   " "  " "   "*" "*"  " "  
## 5  ( 7 )  " " " "    "*"  " "   "*"  "*" "*"   " "  " "   " " "*"  " "  
## 5  ( 8 )  "*" " "    "*"  " "   "*"  "*" " "   " "  " "   " " "*"  " "  
## 5  ( 9 )  " " " "    "*"  " "   "*"  "*" " "   "*"  " "   "*" " "  " "  
## 5  ( 10 ) " " " "    "*"  " "   "*"  "*" " "   " "  "*"   "*" " "  " "  
## 5  ( 11 ) " " " "    "*"  " "   "*"  "*" " "   " "  " "   "*" " "  "*"  
## 5  ( 12 ) " " " "    "*"  " "   "*"  "*" " "   "*"  " "   " " " "  "*"  
## 6  ( 1 )  " " " "    "*"  " "   "*"  "*" " "   " "  "*"   " " "*"  "*"  
## 6  ( 2 )  " " " "    "*"  " "   "*"  "*" " "   "*"  "*"   " " "*"  " "  
## 6  ( 3 )  " " " "    "*"  " "   "*"  "*" " "   "*"  " "   " " "*"  "*"  
## 6  ( 4 )  " " " "    "*"  " "   "*"  "*" "*"   " "  "*"   " " "*"  " "  
## 6  ( 5 )  " " " "    "*"  "*"   "*"  "*" " "   " "  "*"   " " "*"  " "  
## 6  ( 6 )  " " "*"    "*"  " "   "*"  "*" " "   " "  "*"   " " "*"  " "  
## 6  ( 7 )  " " " "    "*"  " "   "*"  "*" " "   " "  "*"   "*" "*"  " "  
## 6  ( 8 )  "*" " "    "*"  " "   "*"  "*" " "   " "  "*"   " " "*"  " "  
## 6  ( 9 )  " " "*"    "*"  " "   "*"  "*" " "   " "  " "   " " "*"  "*"  
## 6  ( 10 ) " " " "    "*"  " "   "*"  "*" "*"   "*"  " "   " " "*"  " "  
## 6  ( 11 ) "*" " "    "*"  " "   "*"  "*" " "   " "  " "   " " "*"  "*"  
## 6  ( 12 ) " " " "    "*"  "*"   "*"  "*" " "   "*"  " "   " " "*"  " "  
## 7  ( 1 )  " " " "    "*"  " "   "*"  "*" " "   "*"  "*"   " " "*"  "*"  
## 7  ( 2 )  " " "*"    "*"  " "   "*"  "*" " "   " "  "*"   " " "*"  "*"  
## 7  ( 3 )  " " " "    "*"  " "   "*"  "*" "*"   "*"  "*"   " " "*"  " "  
## 7  ( 4 )  " " " "    "*"  "*"   "*"  "*" " "   " "  "*"   " " "*"  "*"  
## 7  ( 5 )  "*" " "    "*"  " "   "*"  "*" " "   "*"  " "   " " "*"  "*"  
## 7  ( 6 )  " " " "    "*"  " "   "*"  "*" "*"   " "  "*"   " " "*"  "*"  
## 7  ( 7 )  "*" " "    "*"  " "   "*"  "*" " "   " "  "*"   " " "*"  "*"  
## 7  ( 8 )  " " " "    "*"  "*"   "*"  "*" " "   "*"  "*"   " " "*"  " "  
## 7  ( 9 )  " " " "    "*"  " "   "*"  "*" " "   " "  "*"   "*" "*"  "*"  
## 7  ( 10 ) " " " "    "*"  " "   "*"  "*" " "   "*"  "*"   "*" "*"  " "  
## 7  ( 11 ) " " " "    "*"  "*"   "*"  "*" " "   "*"  " "   " " "*"  "*"  
## 7  ( 12 ) " " "*"    "*"  " "   "*"  "*" " "   "*"  "*"   " " "*"  " "  
## 8  ( 1 )  "*" " "    "*"  " "   "*"  "*" " "   "*"  "*"   " " "*"  "*"  
## 8  ( 2 )  " " " "    "*"  " "   "*"  "*" "*"   "*"  "*"   " " "*"  "*"  
## 8  ( 3 )  " " " "    "*"  "*"   "*"  "*" " "   "*"  "*"   " " "*"  "*"  
## 8  ( 4 )  "*" " "    "*"  " "   "*"  "*" "*"   "*"  " "   " " "*"  "*"  
## 8  ( 5 )  " " "*"    "*"  "*"   "*"  "*" " "   " "  "*"   " " "*"  "*"  
## 8  ( 6 )  " " "*"    "*"  " "   "*"  "*" " "   "*"  "*"   " " "*"  "*"  
## 8  ( 7 )  " " " "    "*"  " "   "*"  "*" " "   "*"  "*"   "*" "*"  "*"  
## 8  ( 8 )  "*" "*"    "*"  " "   "*"  "*" " "   " "  "*"   " " "*"  "*"  
## 8  ( 9 )  " " " "    "*"  "*"   "*"  "*" "*"   "*"  "*"   " " "*"  " "  
## 8  ( 10 ) "*" " "    "*"  " "   "*"  "*" "*"   "*"  "*"   " " "*"  " "  
## 8  ( 11 ) "*" " "    "*"  " "   "*"  "*" "*"   " "  "*"   " " "*"  "*"  
## 8  ( 12 ) "*" " "    "*"  " "   "*"  "*" " "   "*"  " "   "*" "*"  "*"
data.frame(
  Adj.R2 = which.max(res.sum$adjr2),
  CP = which.min(res.sum$cp),
  BIC = which.min(res.sum$bic)
)
##   Adj.R2 CP BIC
## 1     37 25  25
plot(leaps2,scale = 'adjr2')

leaps2 sets nbest, the number of subsets of each size to record. It selects row 37 by adjusted R2 and row 25 by Cp and BIC.

37 is Neck+Abdo+Hip+Fore, listed as 4 ( 1 )

25 is Neck+Abdo+Hip (the same model without Fore), listed as 3 ( 1 )

So both searches agree that the best choice is Neck+Abdo+Hip (plus Fore if adjusted R2 is the criterion).
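
To show what the exhaustive search does, the sketch below fits every subset of a small predictor set with base R and picks the best model by adjusted R2 and by BIC. It is an illustration only: the function `best_subsets` is hypothetical, the data are the built-in mtcars set rather than bodyfat, and regsubsets() uses a far more efficient algorithm than this brute-force loop.

```r
# Brute-force best subsets: fit every non-empty subset of the predictors
best_subsets <- function(data, response, predictors) {
  rows <- list()
  for (k in seq_along(predictors)) {
    for (vars in combn(predictors, k, simplify = FALSE)) {
      fit <- lm(reformulate(vars, response = response), data = data)
      rows[[length(rows) + 1]] <- data.frame(
        size  = k,
        model = paste(vars, collapse = "+"),
        adjr2 = summary(fit)$adj.r.squared,
        bic   = BIC(fit)
      )
    }
  }
  do.call(rbind, rows)
}

# Stand-in data: 4 predictors give 2^4 - 1 = 15 candidate models
res <- best_subsets(mtcars, "mpg", c("wt", "hp", "disp", "qsec"))

# Adjusted R2 is maximized, BIC is minimized
print(res[which.max(res$adjr2), ])
print(res[which.min(res$bic), ])
```

As in the analysis above, the two criteria do not have to agree: BIC penalizes extra variables more heavily than adjusted R2 does.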

stepwise method

AIC | backward

library(MASS)
stepAIC(Model,direction = "backward")
## Start:  AIC=372.58
## Fat ~ Age + Height + Neck + Chest + Abdo + Hip + Thigh + Knee + 
##     Ankle + Bic + Fore + Wrist
## 
##          Df Sum of Sq    RSS    AIC
## - Height  1      0.05 1919.3 370.58
## - Bic     1      0.30 1919.5 370.60
## - Chest   1      1.38 1920.6 370.67
## - Thigh   1      1.61 1920.8 370.69
## - Ankle   1      3.11 1922.3 370.79
## - Age     1      3.13 1922.3 370.79
## - Knee    1      3.51 1922.7 370.81
## - Wrist   1      4.88 1924.1 370.90
## - Fore    1     28.48 1947.7 372.46
## <none>                1919.2 372.58
## - Hip     1     33.45 1952.7 372.79
## - Neck    1     77.97 1997.2 375.68
## - Abdo    1   1148.35 3067.6 430.61
## 
## Step:  AIC=370.58
## Fat ~ Age + Neck + Chest + Abdo + Hip + Thigh + Knee + Ankle + 
##     Bic + Fore + Wrist
## 
##         Df Sum of Sq    RSS    AIC
## - Bic    1      0.32 1919.6 368.60
## - Chest  1      1.34 1920.6 368.67
## - Thigh  1      1.95 1921.2 368.71
## - Ankle  1      3.12 1922.4 368.79
## - Age    1      3.44 1922.7 368.81
## - Wrist  1      4.82 1924.1 368.90
## - Knee   1      4.93 1924.2 368.91
## - Fore   1     28.48 1947.7 370.47
## <none>               1919.3 370.58
## - Hip    1     37.35 1956.6 371.05
## - Neck   1     80.13 1999.4 373.82
## - Abdo   1   1158.13 3077.4 429.02
## 
## Step:  AIC=368.6
## Fat ~ Age + Neck + Chest + Abdo + Hip + Thigh + Knee + Ankle + 
##     Fore + Wrist
## 
##         Df Sum of Sq    RSS    AIC
## - Chest  1      1.16 1920.8 366.68
## - Thigh  1      2.98 1922.6 366.80
## - Age    1      3.38 1923.0 366.83
## - Ankle  1      3.62 1923.2 366.84
## - Wrist  1      4.64 1924.2 366.91
## - Knee   1      4.86 1924.5 366.93
## <none>               1919.6 368.60
## - Fore   1     36.09 1955.7 368.99
## - Hip    1     37.04 1956.6 369.05
## - Neck   1     79.81 1999.4 371.82
## - Abdo   1   1164.23 3083.8 427.28
## 
## Step:  AIC=366.68
## Fat ~ Age + Neck + Abdo + Hip + Thigh + Knee + Ankle + Fore + 
##     Wrist
## 
##         Df Sum of Sq    RSS    AIC
## - Thigh  1      3.26 1924.0 364.90
## - Age    1      3.33 1924.1 364.90
## - Ankle  1      3.60 1924.3 364.92
## - Wrist  1      4.73 1925.5 364.99
## - Knee   1      4.82 1925.6 365.00
## <none>               1920.7 366.68
## - Fore   1     34.94 1955.7 366.99
## - Hip    1     38.40 1959.1 367.21
## - Neck   1     85.07 2005.8 370.23
## - Abdo   1   1924.58 3845.3 453.53
## 
## Step:  AIC=364.9
## Fat ~ Age + Neck + Abdo + Hip + Knee + Ankle + Fore + Wrist
## 
##         Df Sum of Sq    RSS    AIC
## - Age    1      1.52 1925.5 363.00
## - Ankle  1      3.17 1927.2 363.11
## - Knee   1      3.32 1927.3 363.12
## - Wrist  1      5.13 1929.1 363.24
## <none>               1924.0 364.90
## - Hip    1     36.09 1960.1 365.28
## - Fore   1     37.24 1961.2 365.35
## - Neck   1     82.15 2006.2 368.25
## - Abdo   1   1961.69 3885.7 452.87
## 
## Step:  AIC=363
## Fat ~ Neck + Abdo + Hip + Knee + Ankle + Fore + Wrist
## 
##         Df Sum of Sq    RSS    AIC
## - Knee   1      2.62 1928.1 361.17
## - Wrist  1      3.70 1929.2 361.24
## - Ankle  1      4.36 1929.9 361.29
## <none>               1925.5 363.00
## - Fore   1     35.85 1961.4 363.36
## - Hip    1     56.44 1982.0 364.70
## - Neck   1     84.61 2010.1 366.50
## - Abdo   1   2556.02 4481.5 469.13
## 
## Step:  AIC=361.17
## Fat ~ Neck + Abdo + Hip + Ankle + Fore + Wrist
## 
##         Df Sum of Sq    RSS    AIC
## - Wrist  1      3.79 1931.9 359.42
## - Ankle  1      9.03 1937.2 359.77
## <none>               1928.1 361.17
## - Fore   1     36.30 1964.4 361.56
## - Hip    1     67.91 1996.1 363.60
## - Neck   1     87.51 2015.7 364.85
## - Abdo   1   2570.36 4498.5 467.61
## 
## Step:  AIC=359.42
## Fat ~ Neck + Abdo + Hip + Ankle + Fore
## 
##         Df Sum of Sq    RSS    AIC
## - Ankle  1     14.86 1946.8 358.40
## <none>               1931.9 359.42
## - Fore   1     36.06 1968.0 359.79
## - Hip    1     67.31 1999.2 361.81
## - Neck   1    140.04 2072.0 366.38
## - Abdo   1   2574.16 4506.1 465.83
## 
## Step:  AIC=358.4
## Fat ~ Neck + Abdo + Hip + Fore
## 
##        Df Sum of Sq    RSS    AIC
## - Fore  1     28.63 1975.4 358.27
## <none>              1946.8 358.40
## - Hip   1    126.23 2073.0 364.45
## - Neck  1    147.87 2094.7 365.78
## - Abdo  1   2670.93 4617.7 466.96
## 
## Step:  AIC=358.27
## Fat ~ Neck + Abdo + Hip
## 
##        Df Sum of Sq    RSS    AIC
## <none>              1975.4 358.27
## - Hip   1    107.53 2083.0 363.06
## - Neck  1    119.24 2094.7 363.78
## - Abdo  1   2642.74 4618.2 464.97
## 
## Call:
## lm(formula = Fat ~ Neck + Abdo + Hip)
## 
## Coefficients:
## (Intercept)         Neck         Abdo          Hip  
##    -14.2955      -0.6266       0.9290      -0.2863

Lowest AIC: Neck+Abdo+Hip

anova and AIC: test whether Fore is worth adding (and Chest, for interest)

Model1<-lm(Fat ~ Neck+Abdo+Hip)
Model2<-lm(Fat ~ Neck+Abdo+Hip+Fore)
#Model3 is tested for interest,
#because Chest has a high correlation with Fat in cor().
#But all results so far show that these variables can be deleted.
Model3<-lm(Fat ~ Chest+Neck+Abdo+Hip)
anova(Model1,Model2)
## Analysis of Variance Table
## 
## Model 1: Fat ~ Neck + Abdo + Hip
## Model 2: Fat ~ Neck + Abdo + Hip + Fore
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1    124 1975.4                           
## 2    123 1946.8  1    28.626 1.8086 0.1812
AIC(Model1,Model2,Model3)
##        df      AIC
## Model1  5 723.5212
## Model2  6 723.6528
## Model3  6 725.5200

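p = 0.1812 > 0.05 means that adding Fore does not significantly improve the model, so we can delete Fore. AIC gives the same result: Model1 has the lowest AIC (723.52).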
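
The AIC values from stepAIC (358.27 for Neck+Abdo+Hip) and from AIC() (723.52 for the same model) are on different scales. stepAIC relies on extractAIC(), which drops an additive constant and does not count the error variance as a parameter. For an unweighted lm the gap is n * (1 + log(2 * pi)) + 2, so model rankings are unaffected. The sketch below checks this on a stand-in model fitted to the built-in mtcars data (an assumption so the code runs on its own).

```r
# Stand-in fit on the built-in mtcars data (NOT the bodyfat model)
fit <- lm(mpg ~ wt + hp, data = mtcars)
n <- nobs(fit)

aic_full <- AIC(fit)             # scale used by AIC()
aic_step <- extractAIC(fit)[2]   # scale used by step() and stepAIC()

# Constant gap between the two scales for an unweighted linear model
offset <- n * (1 + log(2 * pi)) + 2
print(c(difference = aic_full - aic_step, offset = offset))

# Same gap for the 128 observations of this analysis
print(128 * (1 + log(2 * pi)) + 2)
```

For n = 128 the gap is about 365.25, which matches 723.52 - 358.27 above.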

Result

summary(Model1)
## 
## Call:
## lm(formula = Fat ~ Neck + Abdo + Hip)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.0590 -2.5669 -0.0023  2.6107  8.7289 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -14.29546    7.49107  -1.908  0.05866 .  
## Neck         -0.62659    0.22903  -2.736  0.00713 ** 
## Abdo          0.92901    0.07213  12.880  < 2e-16 ***
## Hip          -0.28631    0.11020  -2.598  0.01051 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.991 on 124 degrees of freedom
## Multiple R-squared:  0.7418, Adjusted R-squared:  0.7356 
## F-statistic: 118.8 on 3 and 124 DF,  p-value: < 2.2e-16

So the regression equation:

bodyfat = -14.29546 - 0.62659 * Neck + 0.92901 * Abdomen - 0.28631 * Hip

This is the same result as when only the circumferences are considered, without age, height and weight. (https://rpubs.com/YifeiLiu/RegressionBodyfat)
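
The regression equation above can be wrapped in a small function for predictions. This is a sketch: the function name `predict_bodyfat` and the input values are hypothetical, and the inputs are assumed to be in the same units as the dataset. With the fitted object available, `predict(Model1, newdata = ...)` is the usual route.

```r
# Coefficients copied from summary(Model1)
predict_bodyfat <- function(neck, abdo, hip) {
  -14.29546 - 0.62659 * neck + 0.92901 * abdo - 0.28631 * hip
}

# Hypothetical measurements, for illustration only
pred <- predict_bodyfat(neck = 38, abdo = 90, hip = 100)
print(round(pred, 2))
```

With a residual standard error of 3.991, a single prediction like this can easily be off by several percentage points.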

Consider weight (same result)

Weight is considered again because common sense says it is an important parameter.

newModel1<-lm(Fat ~ Weight+Neck+Abdo+Hip)
vif(newModel1)
##    Weight      Neck      Abdo       Hip 
## 13.310543  2.631873  4.694637  8.183684
newModel2<-lm(Fat ~ Weight+Neck+Abdo+Fore)
vif(newModel2)
##   Weight     Neck     Abdo     Fore 
## 7.058795 2.774132 4.979725 1.899964
summary(newModel2)
## 
## Call:
## lm(formula = Fat ~ Weight + Neck + Abdo + Fore)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -9.1637 -2.8476  0.0877  2.7522  8.3342 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -41.88325    8.30967  -5.040 1.62e-06 ***
## Weight       -0.22932    0.07868  -2.915  0.00423 ** 
## Neck         -0.61856    0.26606  -2.325  0.02172 *  
## Abdo          0.98046    0.08146  12.036  < 2e-16 ***
## Fore          0.43409    0.23834   1.821  0.07099 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.971 on 123 degrees of freedom
## Multiple R-squared:  0.7466, Adjusted R-squared:  0.7383 
## F-statistic: 90.58 on 4 and 123 DF,  p-value: < 2.2e-16
plot(newModel2)

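Including both Weight and Hip pushes the VIF of Weight above 10 (13.31).

With Weight and Fore, the summary is only marginally different from the model without Weight above (adjusted R-squared 0.7383 vs 0.7356, residual standard error 3.971 vs 3.991), and Fore is not significant at the 0.05 level (p = 0.071), so the simpler model is kept.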

So the regression equation is still:

bodyfat = -14.29546 - 0.62659 * Neck + 0.92901 * Abdomen - 0.28631 * Hip

This is the same result as when only the circumferences are considered, without age, height and weight. (https://rpubs.com/YifeiLiu/RegressionBodyfat)