Introduction
In this report I analyze the ‘bodyfat’ data set, which contains 252 observations of body measurement variables.
The final goal is to fit a multiple regression model for the percentage of body fat.
Setup
library(tidyverse)
library(ggpubr)
library(car)
library(mfp)
library(corrplot)
library(papeR)
library(knitr)
library(kableExtra)
library(xtable)
library(olsrr)
data(bodyfat)
bf <- bodyfat
Data analysis
General summary
The data analysis process begins with understanding the data. The first important observation is that height is measured in inches and weight in pounds, while the other body measurements are in centimeters.
summary(bodyfat)
## case brozek siri density
## Min. : 1.00 Min. : 0.00 Min. : 0.00 Min. :0.995
## 1st Qu.: 63.75 1st Qu.:12.80 1st Qu.:12.47 1st Qu.:1.041
## Median :126.50 Median :19.00 Median :19.20 Median :1.055
## Mean :126.50 Mean :18.94 Mean :19.15 Mean :1.056
## 3rd Qu.:189.25 3rd Qu.:24.60 3rd Qu.:25.30 3rd Qu.:1.070
## Max. :252.00 Max. :45.10 Max. :47.50 Max. :1.109
## age weight height neck
## Min. :22.00 Min. :118.5 Min. :29.50 Min. :31.10
## 1st Qu.:35.75 1st Qu.:159.0 1st Qu.:68.25 1st Qu.:36.40
## Median :43.00 Median :176.5 Median :70.00 Median :38.00
## Mean :44.88 Mean :178.9 Mean :70.15 Mean :37.99
## 3rd Qu.:54.00 3rd Qu.:197.0 3rd Qu.:72.25 3rd Qu.:39.42
## Max. :81.00 Max. :363.1 Max. :77.75 Max. :51.20
## chest abdomen hip thigh
## Min. : 79.30 Min. : 69.40 Min. : 85.0 Min. :47.20
## 1st Qu.: 94.35 1st Qu.: 84.58 1st Qu.: 95.5 1st Qu.:56.00
## Median : 99.65 Median : 90.95 Median : 99.3 Median :59.00
## Mean :100.82 Mean : 92.56 Mean : 99.9 Mean :59.41
## 3rd Qu.:105.38 3rd Qu.: 99.33 3rd Qu.:103.5 3rd Qu.:62.35
## Max. :136.20 Max. :148.10 Max. :147.7 Max. :87.30
## knee ankle biceps forearm wrist
## Min. :33.00 Min. :19.1 Min. :24.80 Min. :21.00 Min. :15.80
## 1st Qu.:36.98 1st Qu.:22.0 1st Qu.:30.20 1st Qu.:27.30 1st Qu.:17.60
## Median :38.50 Median :22.8 Median :32.05 Median :28.70 Median :18.30
## Mean :38.59 Mean :23.1 Mean :32.27 Mean :28.66 Mean :18.23
## 3rd Qu.:39.92 3rd Qu.:24.0 3rd Qu.:34.33 3rd Qu.:30.00 3rd Qu.:18.80
## Max. :49.10 Max. :33.9 Max. :45.00 Max. :34.90 Max. :21.40
Check if there are any NA values:
any(is.na(bodyfat))
## [1] FALSE
Height and weight are converted to centimeters and kilograms, respectively.
bf$height <- bf$height * 2.54
bf$weight <- bf$weight * 0.4536
Adjustments on data set
I drop the ‘brozek’ column, which holds the body fat percentage computed with Brozek’s equation, 457/Density - 414.2.
The preferred body fat percentage is the one obtained from Siri’s equation, 495/Density - 450.
The ‘case’ column can also be dropped since it won’t be used for indexing.
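The two equations above can be written as small helper functions. This is only a minimal sketch: the function names `siri_fat` and `brozek_fat` are hypothetical, and the density value is the median taken from the summary above.

```r
# Hypothetical helpers implementing the two equations quoted above.
# 'density' is the body density as stored in the data set.
siri_fat   <- function(density) 495 / density - 450
brozek_fat <- function(density) 457 / density - 414.2

# Median density reported in the summary above
d <- 1.055
siri_fat(d)
brozek_fat(d)
```

The two results, about 19.19 and 18.98, are close to the medians of the ‘siri’ and ‘brozek’ columns in the summary.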
bf <- subset(bf, select = -c(brozek))
bf <- subset(bf, select = -c(case))
bf <- rename(bf, fat = siri)
Here is a summary of the adjusted data set:
summary(bf)
## fat density age weight
## Min. : 0.00 Min. :0.995 Min. :22.00 Min. : 53.75
## 1st Qu.:12.47 1st Qu.:1.041 1st Qu.:35.75 1st Qu.: 72.12
## Median :19.20 Median :1.055 Median :43.00 Median : 80.06
## Mean :19.15 Mean :1.056 Mean :44.88 Mean : 81.16
## 3rd Qu.:25.30 3rd Qu.:1.070 3rd Qu.:54.00 3rd Qu.: 89.36
## Max. :47.50 Max. :1.109 Max. :81.00 Max. :164.72
## height neck chest abdomen
## Min. : 74.93 Min. :31.10 Min. : 79.30 Min. : 69.40
## 1st Qu.:173.35 1st Qu.:36.40 1st Qu.: 94.35 1st Qu.: 84.58
## Median :177.80 Median :38.00 Median : 99.65 Median : 90.95
## Mean :178.18 Mean :37.99 Mean :100.82 Mean : 92.56
## 3rd Qu.:183.51 3rd Qu.:39.42 3rd Qu.:105.38 3rd Qu.: 99.33
## Max. :197.49 Max. :51.20 Max. :136.20 Max. :148.10
## hip thigh knee ankle biceps
## Min. : 85.0 Min. :47.20 Min. :33.00 Min. :19.1 Min. :24.80
## 1st Qu.: 95.5 1st Qu.:56.00 1st Qu.:36.98 1st Qu.:22.0 1st Qu.:30.20
## Median : 99.3 Median :59.00 Median :38.50 Median :22.8 Median :32.05
## Mean : 99.9 Mean :59.41 Mean :38.59 Mean :23.1 Mean :32.27
## 3rd Qu.:103.5 3rd Qu.:62.35 3rd Qu.:39.92 3rd Qu.:24.0 3rd Qu.:34.33
## Max. :147.7 Max. :87.30 Max. :49.10 Max. :33.9 Max. :45.00
## forearm wrist
## Min. :21.00 Min. :15.80
## 1st Qu.:27.30 1st Qu.:17.60
## Median :28.70 Median :18.30
## Mean :28.66 Mean :18.23
## 3rd Qu.:30.00 3rd Qu.:18.80
## Max. :34.90 Max. :21.40
Certain data corrections are necessary. A height of 75 cm is assumed to be impossible for a 44-year-old man weighing 92 kg. Therefore, the height is corrected by adding 100 cm. After the change, the height of 175 cm is close to the median.
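# Index by column name: after dropping 'brozek' and 'case', column 6 is 'neck', not 'height'
bf[42, "height"] <- bf[42, "height"] + 100
A new variable, BMI (Body Mass Index), is introduced. It is calculated by the formula: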
BMI = Weight(kg) / Height(m)^2
bf <- bf %>% mutate(bmi = weight / (height / 100)^2)
Correlations evaluation
One can guess that the size of certain body parts reflects a person’s silhouette. Initially I expected a strong relationship between the abdomen, hip, and chest measurements and fat percentage. On the other hand, I expected little correlation between age and body fat.
The plots below support this guess, but that alone is not enough to select variables for the regression model.
abd <- ggplot(bf, aes(x=abdomen, y = fat)) + geom_point() + geom_smooth(method = "lm", color="green")
chst <- ggplot(bf, aes(x=chest, y = fat)) + geom_point()+ geom_smooth(method = "lm", color="red")
hp <- ggplot(bf, aes(x=hip, y = fat)) + geom_point()+ geom_smooth(method = "lm", color="blue")
ag <- ggplot(bf, aes(x=age, y = fat)) + geom_point()+ geom_smooth(method = "lm", color="gold")
ggarrange(abd, chst, hp, ag)
The best way to evaluate correlations in the data set is to use a correlation matrix.
It shows how the variables are related to each other.
corrplot(cor(bf), method = "pie", type = "upper", diag = FALSE, title = "Correlation matrix - body fat data")
It can be concluded that chest, abdomen and hip are in fact strongly correlated with the body fat percentage.
However, it is necessary to minimize multicollinearity in the regression model. The plot shows, for example, that chest and abdomen measurements are not only related to the fat level but are also strongly correlated with each other. Possibly only one of them should be included in the model.
Correlation between abdomen and chest:
attach(bf)
cor(abdomen, chest)
## [1] 0.9158277
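A pairwise correlation only compares two predictors at a time. A common complementary check is the variance inflation factor (VIF), which measures how well each predictor is explained by all the others. The sketch below computes it by hand on simulated data (not the bodyfat data); the variable names `x1`, `x2`, `x3` are made up for illustration. For models whose terms each have a single coefficient, `vif()` from the `car` package loaded in the setup reports the same quantity.

```r
# Simulated data (not the bodyfat data): x2 is built to be strongly
# correlated with x1, while x3 is independent of both.
set.seed(1)
n  <- 252
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.4)
x3 <- rnorm(n)
X  <- data.frame(x1, x2, x3)

# VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
# predictor j on all the other predictors.
vif_by_hand <- sapply(names(X), function(v) {
  others <- setdiff(names(X), v)
  r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
  1 / (1 - r2)
})
vif_by_hand
```

In this simulation the two correlated predictors get a clearly inflated VIF, while the independent one stays close to 1.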
Multiple regression model
The first step in model selection is to analyze the general model with all variables as predictors. The table below shows that only the density variable has a significant relationship with fat, which is expected because fat is computed from density through Siri’s equation.
bf.mod <- lm(fat ~ density + age + weight + height + neck + chest + abdomen + hip + thigh + knee + ankle + biceps + forearm + wrist)
bf.mod.pretty <- papeR::prettify(summary(bf.mod))
kable(bf.mod.pretty) %>% kable_styling(bootstrap_options = "bordered")
| | Estimate | Std. Error | t value | Pr(>\|t\|) | |
|---|---|---|---|---|---|
| (Intercept) | 448.6977416 | 11.4767040 | 39.0963941 | <0.001 | *** |
| density | -411.7188339 | 8.1936918 | -50.2482696 | <0.001 | *** |
| age | 0.0121518 | 0.0095772 | 1.2688193 | 0.206 | |
| weight | 0.0120007 | 0.0402611 | 0.2980725 | 0.766 | |
| height | 0.0019652 | 0.0206052 | 0.0953761 | 0.924 | |
| neck | 0.0070514 | 0.0245608 | 0.2870983 | 0.774 | |
| chest | 0.0287861 | 0.0304640 | 0.9449225 | 0.346 | |
| abdomen | 0.0193305 | 0.0323866 | 0.5968684 | 0.551 | |
| hip | 0.0240600 | 0.0430406 | 0.5590072 | 0.577 | |
| thigh | -0.0170527 | 0.0431916 | -0.3948148 | 0.693 | |
| knee | -0.0041492 | 0.0721130 | -0.0575373 | 0.954 | |
| ankle | -0.0811411 | 0.0658685 | -1.2318655 | 0.219 | |
| biceps | -0.0554998 | 0.0509266 | -1.0897992 | 0.277 | |
| forearm | 0.0297893 | 0.0591214 | 0.5038664 | 0.615 | |
| wrist | -0.0108506 | 0.1564040 | -0.0693753 | 0.945 |
summary(bf.mod)
##
## Call:
## lm(formula = fat ~ density + age + weight + height + neck + chest +
## abdomen + hip + thigh + knee + ankle + biceps + forearm +
## wrist)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.4081 -0.3803 -0.1240 0.2062 15.1557
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.487e+02 1.148e+01 39.096 <2e-16 ***
## density -4.117e+02 8.194e+00 -50.248 <2e-16 ***
## age 1.215e-02 9.577e-03 1.269 0.206
## weight 1.200e-02 4.026e-02 0.298 0.766
## height 1.965e-03 2.060e-02 0.095 0.924
## neck 7.051e-03 2.456e-02 0.287 0.774
## chest 2.879e-02 3.046e-02 0.945 0.346
## abdomen 1.933e-02 3.239e-02 0.597 0.551
## hip 2.406e-02 4.304e-02 0.559 0.577
## thigh -1.705e-02 4.319e-02 -0.395 0.693
## knee -4.149e-03 7.211e-02 -0.058 0.954
## ankle -8.114e-02 6.587e-02 -1.232 0.219
## biceps -5.550e-02 5.093e-02 -1.090 0.277
## forearm 2.979e-02 5.912e-02 0.504 0.615
## wrist -1.085e-02 1.564e-01 -0.069 0.945
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.275 on 237 degrees of freedom
## Multiple R-squared: 0.9781, Adjusted R-squared: 0.9768
## F-statistic: 756 on 14 and 237 DF, p-value: < 2.2e-16
Model selection using Akaike Information Criterion (AIC)
The second general model adds the bmi variable.
bf.mod2 <- lm(fat ~ density + age + weight + +height + neck + chest + abdomen + hip + thigh + knee + ankle + biceps + forearm + wrist + bmi)
summary(bf.mod2)
##
## Call:
## lm(formula = fat ~ density + age + weight + +height + neck +
## chest + abdomen + hip + thigh + knee + ankle + biceps + forearm +
## wrist + bmi)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.4273 -0.3765 -0.1143 0.2050 15.1315
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.476e+02 1.206e+01 37.122 <2e-16 ***
## density -4.113e+02 8.319e+00 -49.445 <2e-16 ***
## age 1.251e-02 9.669e-03 1.293 0.197
## weight 8.483e-03 4.203e-02 0.202 0.840
## height 7.917e-03 2.871e-02 0.276 0.783
## neck -1.230e-02 6.937e-02 -0.177 0.859
## chest 2.906e-02 3.054e-02 0.952 0.342
## abdomen 2.051e-02 3.269e-02 0.627 0.531
## hip 2.134e-02 4.408e-02 0.484 0.629
## thigh -1.508e-02 4.378e-02 -0.345 0.731
## knee -7.513e-03 7.313e-02 -0.103 0.918
## ankle -8.288e-02 6.625e-02 -1.251 0.212
## biceps -5.370e-02 5.138e-02 -1.045 0.297
## forearm 3.143e-02 5.949e-02 0.528 0.598
## wrist 1.406e-04 1.610e-01 0.001 0.999
## bmi 1.906e-02 6.389e-02 0.298 0.766
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.277 on 236 degrees of freedom
## Multiple R-squared: 0.9781, Adjusted R-squared: 0.9767
## F-statistic: 702.9 on 15 and 236 DF, p-value: < 2.2e-16
From the general model we can extract which variables to choose for the model. An optimized model can be obtained by backward selection of predictors.
The step function evaluates the Akaike Information Criterion and selects the predictors accordingly.
# 'lower' must be a named list element holding a one-sided formula
backward2 <- step(bf.mod2, scope = list(lower = ~ density), trace = 0)
backward2
##
## Call:
## lm(formula = fat ~ density + age + chest + hip + ankle)
##
## Coefficients:
## (Intercept) density age chest hip ankle
## 449.47601 -413.56974 0.01517 0.04054 0.03339 -0.08120
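To make the selection step less of a black box, the sketch below shows on simulated data (not the bodyfat data) how the AIC is built from the log-likelihood and how a backward search uses it. The data frame `d` and its columns are made up for illustration.

```r
# Simulated data (not the bodyfat data): y depends on x1 and x2 only.
set.seed(2)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
d$y <- 1 + 2 * d$x1 - 1.5 * d$x2 + rnorm(n)

full <- lm(y ~ x1 + x2 + x3 + x4, data = d)

# AIC by hand: -2 * log-likelihood + 2 * (number of estimated parameters).
# For lm, logLik() counts the coefficients plus the residual variance.
ll <- logLik(full)
aic_by_hand <- -2 * as.numeric(ll) + 2 * attr(ll, "df")
all.equal(aic_by_hand, AIC(full))

# One backward step: drop1() reports an AIC for each single-term deletion
# (on the extractAIC() scale, which differs from AIC() by a constant).
drop1(full)

# Full backward search; 'lower' keeps x1 in every candidate model.
reduced <- step(full, scope = list(lower = ~ x1), direction = "backward", trace = 0)
formula(reduced)
```

The search removes one term at a time as long as doing so lowers the AIC, and never removes a term listed in `lower`.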
The selected model clearly agrees very well with the general model, but it still uses many predictors, which may lead to overfitting.
plot(fitted(bf.mod2) ~ fitted(backward2))
abline(0,1)
cor(fitted(bf.mod2), fitted(backward2))
## [1] 0.999874
Third model
From the second model summary it can be concluded that the density variable has a large t-value and standard error. Moreover, it is strongly related to chest, and the response itself is computed from density through Siri’s equation. For that reason I will not use it as a predictor in the third model. Ankle won’t be taken into consideration for now.
The proposed model has age, chest and hip as predictors.
bf.mod3 <- lm(fat ~ age + chest + hip)
summary(bf.mod3)
##
## Call:
## lm(formula = fat ~ age + chest + hip)
##
## Residuals:
## Min 1Q Median 3Q Max
## -14.8710 -4.4784 0.0518 4.1996 15.1019
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -63.42655 5.25875 -12.061 < 2e-16 ***
## age 0.15266 0.03098 4.928 1.52e-06 ***
## chest 0.42596 0.08278 5.145 5.43e-07 ***
## hip 0.32809 0.09601 3.417 0.000739 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.685 on 248 degrees of freedom
## Multiple R-squared: 0.5441, Adjusted R-squared: 0.5386
## F-statistic: 98.66 on 3 and 248 DF, p-value: < 2.2e-16
AIC of proposed model:
AIC(bf.mod3)
## [1] 1596.955
Linear model coefficients:
# 'lower' must be a named list element holding a one-sided formula
backward3 <- step(bf.mod3, scope = list(lower = ~ age), trace = 0)
backward3
##
## Call:
## lm(formula = fat ~ age + chest + hip)
##
## Coefficients:
## (Intercept) age chest hip
## -63.4265 0.1527 0.4260 0.3281
The fit of the full and subset models is compared by looking at the corresponding fitted values.
plot(fitted(bf.mod3) ~ fitted(backward3))
abline(0,1)
qqPlot(bf.mod3)
## [1] 39 216
The model and its subset produce the same fitted values.
cor(fitted(backward3), fitted(bf.mod3))
## [1] 1
For now, the fitted regression line is described by the equation:
fat = -63.42655 + 0.15266 * age + 0.42596 * chest + 0.32809 * hip
Residuals normality evaluation: the distribution seems to be roughly symmetrical and normal. On the Q-Q plot the residuals are also very close to the normal line, where they are expected to be.
hist(bf.mod3$residuals)
qqnorm(bf.mod3$residuals)
qqline(bf.mod3$residuals)
The p-value of the Shapiro-Wilk test is greater than 0.05, which implies that the distribution of the residuals is not significantly different from a normal distribution. In other words, we can assume normality.
shapiro.test(bf.mod3$residuals)
##
## Shapiro-Wilk normality test
##
## data: bf.mod3$residuals
## W = 0.9942, p-value = 0.4471
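As a point of comparison for reading this output, the sketch below runs the same test on two simulated samples (not the model residuals): one drawn from a normal distribution and one that is strongly skewed. The names `normal_res` and `skewed_res` are made up for illustration.

```r
set.seed(3)
normal_res <- rnorm(250)      # what well-behaved residuals roughly look like
skewed_res <- rexp(250) - 1   # strongly right-skewed, centered near zero

# A large p-value means no evidence against normality;
# a small one means normality is rejected.
shapiro.test(normal_res)$p.value
shapiro.test(skewed_res)$p.value
```

A p-value above 0.05, as obtained for the model residuals, does not prove normality; it only means the test found no significant departure from it.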
Global Validation of Linear Models Assumptions:
library(gvlma)
gvmodel3 <- gvlma(bf.mod3)
summary(gvmodel3)
##
## Call:
## lm(formula = fat ~ age + chest + hip)
##
## Residuals:
## Min 1Q Median 3Q Max
## -14.8710 -4.4784 0.0518 4.1996 15.1019
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -63.42655 5.25875 -12.061 < 2e-16 ***
## age 0.15266 0.03098 4.928 1.52e-06 ***
## chest 0.42596 0.08278 5.145 5.43e-07 ***
## hip 0.32809 0.09601 3.417 0.000739 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.685 on 248 degrees of freedom
## Multiple R-squared: 0.5441, Adjusted R-squared: 0.5386
## F-statistic: 98.66 on 3 and 248 DF, p-value: < 2.2e-16
##
##
## ASSESSMENT OF THE LINEAR MODEL ASSUMPTIONS
## USING THE GLOBAL TEST ON 4 DEGREES-OF-FREEDOM:
## Level of Significance = 0.05
##
## Call:
## gvlma(x = bf.mod3)
##
## Value p-value Decision
## Global Stat 9.2525 0.05509 Assumptions acceptable.
## Skewness 0.3161 0.57394 Assumptions acceptable.
## Kurtosis 1.7417 0.18692 Assumptions acceptable.
## Link Function 6.2044 0.01274 Assumptions NOT satisfied!
## Heteroscedasticity 0.9902 0.31970 Assumptions acceptable.
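As we can see, the model has nearly normal residuals (skewness and kurtosis) and constant variance (homoscedasticity). One assumption is still to be met: the link function test, which checks whether the linear form of the model is adequate, is not satisfied.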
Outliers, leverage, influential observations
Below I examine which observations have a significant influence on the fitted regression line. In order to do that, I start with a method that evaluates the difference in fits.
ols_plot_dffits(bf.mod3)
On the plot it can be seen that observation number 39 is extremely influential. Further investigation of this observation is performed:
kable(bf[which(dffits(bf.mod3) < -0.5),]) %>% kable_styling(bootstrap_options = c("bordered", "striped"))
| | fat | density | age | weight | height | neck | chest | abdomen | hip | thigh | knee | ankle | biceps | forearm | wrist | bmi |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 39 | 35.2 | 1.0202 | 46 | 164.7248 | 183.515 | 51.2 | 136.2 | 148.1 | 147.7 | 87.3 | 49.1 | 29.6 | 45 | 29 | 21.4 | 48.91206 |
outlierTest(bf.mod3)
## No Studentized residuals with Bonferroni p < 0.05
## Largest |rstudent|:
## rstudent unadjusted p-value Bonferroni p
## 39 -2.995523 0.0030185 0.76067
Cook’s distance indicates that deleting observation #39 would substantially change the fitted regression line.
ols_plot_cooksd_bar(bf.mod3)
The studentized residuals vs leverage plot shows which observations are influential and which are outliers.
ols_plot_resid_lev(bf.mod3)
In the table we can see three flagged observations, but only one has an extremely large Cook’s distance.
influencePlot(bf.mod3)
## StudRes Hat CookD
## 39 -2.995523 0.21284208 0.58767636
## 59 1.035629 0.06166681 0.01761635
## 216 2.723931 0.02422004 0.04488034
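The quantities shown in these plots and tables can also be obtained with base R functions. The sketch below uses simulated data (not the bodyfat data) with one planted high-leverage point. The cutoffs are common rules of thumb and are an assumption here; they are not necessarily the ones used by the plotting functions above.

```r
# Simulated data (not the bodyfat data) with one planted influential point.
set.seed(4)
n <- 100
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)
x[n] <- 8   # far from the other x values: high leverage
y[n] <- 0   # far from the line 1 + 2 * x: large residual

fit <- lm(y ~ x)

infl <- data.frame(
  rstudent = rstudent(fit),       # studentized residuals
  hat      = hatvalues(fit),      # leverage
  dffits   = dffits(fit),         # scaled change in the fitted value when the point is left out
  cooksd   = cooks.distance(fit)  # overall influence on the coefficients
)

# Rule-of-thumb cutoffs (assumptions; other conventions exist)
p <- length(coef(fit))
flagged <- which(abs(infl$dffits) > 2 * sqrt(p / n) | infl$cooksd > 4 / n)
infl[flagged, ]
```

The planted point combines high leverage with a large residual, which is the same pattern that observation #39 shows in the table above.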
Further residuals plots confirm that observation #39 should be adjusted or removed.
ols_plot_resid_stud(bf.mod3)
ols_plot_resid_stand(bf.mod3)
Since there is only one extremely high-leverage outlier, the data set won’t be transformed.
Instead, observation #39 is removed from the data set.
bf_adj <- bf[-c(39),]
Repeat the model analysis
Now the model is recalibrated. I repeat the steps to obtain a new regression line.
bf.mod3.adj <- lm(bf_adj$fat ~ bf_adj$age + bf_adj$chest + bf_adj$hip)
summary(bf.mod3.adj)
##
## Call:
## lm(formula = bf_adj$fat ~ bf_adj$age + bf_adj$chest + bf_adj$hip)
##
## Residuals:
## Min 1Q Median 3Q Max
## -14.2756 -4.4269 -0.2851 4.2276 14.3620
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -70.71043 5.71888 -12.364 < 2e-16 ***
## bf_adj$age 0.16227 0.03066 5.293 2.66e-07 ***
## bf_adj$chest 0.37766 0.08306 4.547 8.55e-06 ***
## bf_adj$hip 0.44618 0.10240 4.357 1.93e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.595 on 247 degrees of freedom
## Multiple R-squared: 0.5535, Adjusted R-squared: 0.5481
## F-statistic: 102.1 on 3 and 247 DF, p-value: < 2.2e-16
# 'lower' must be a named list element holding a one-sided formula
backward3.adj <- step(bf.mod3.adj, scope = list(lower = ~ bf_adj$age), trace = 0)
backward3.adj
##
## Call:
## lm(formula = bf_adj$fat ~ bf_adj$age + bf_adj$chest + bf_adj$hip)
##
## Coefficients:
## (Intercept) bf_adj$age bf_adj$chest bf_adj$hip
## -70.7104 0.1623 0.3777 0.4462
plot(fitted(bf.mod3.adj) ~ fitted(backward3.adj))
abline(0,1)
qqPlot(bf.mod3.adj)
## [1] 203 215
A repeated Global Validation of Linear Models Assumptions indicates that all assumptions are met this time.
gvmodel3.adj <- gvlma(bf.mod3.adj)
summary(gvmodel3.adj)
##
## Call:
## lm(formula = bf_adj$fat ~ bf_adj$age + bf_adj$chest + bf_adj$hip)
##
## Residuals:
## Min 1Q Median 3Q Max
## -14.2756 -4.4269 -0.2851 4.2276 14.3620
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -70.71043 5.71888 -12.364 < 2e-16 ***
## bf_adj$age 0.16227 0.03066 5.293 2.66e-07 ***
## bf_adj$chest 0.37766 0.08306 4.547 8.55e-06 ***
## bf_adj$hip 0.44618 0.10240 4.357 1.93e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.595 on 247 degrees of freedom
## Multiple R-squared: 0.5535, Adjusted R-squared: 0.5481
## F-statistic: 102.1 on 3 and 247 DF, p-value: < 2.2e-16
##
##
## ASSESSMENT OF THE LINEAR MODEL ASSUMPTIONS
## USING THE GLOBAL TEST ON 4 DEGREES-OF-FREEDOM:
## Level of Significance = 0.05
##
## Call:
## gvlma(x = bf.mod3.adj)
##
## Value p-value Decision
## Global Stat 3.9986 0.4062 Assumptions acceptable.
## Skewness 0.4934 0.4824 Assumptions acceptable.
## Kurtosis 2.5063 0.1134 Assumptions acceptable.
## Link Function 0.4039 0.5251 Assumptions acceptable.
## Heteroscedasticity 0.5950 0.4405 Assumptions acceptable.
The difference in fits plot improved considerably once the outlier was removed. No observation passes the threshold of -0.5 used in the previous model evaluation.
ols_plot_dffits(bf.mod3.adj)
# The model was fitted on bf_adj, so its rows are the ones to index
bf_adj[which(dffits(bf.mod3.adj) < -0.5),]
## [1] fat density age weight height neck chest abdomen hip
## [10] thigh knee ankle biceps forearm wrist bmi
## <0 rows> (or 0-length row.names)
Cook’s distances also look much better. Removing observation #216 could possibly be considered, but even without removing it the results seem satisfactory.
ols_plot_cooksd_bar(bf.mod3.adj)
ols_plot_resid_lev(bf.mod3.adj)
All studentized residuals are within the normal range:
ols_plot_resid_stud(bf.mod3.adj)
AIC has also improved:
AIC(bf.mod3.adj)
## [1] 1582.699
avPlots(bf.mod3.adj)
The obtained regression line:
fat = -70.71043 + 0.16227 * age + 0.37766 * chest + 0.44618 * hip
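The equation can be applied directly to new measurements. The sketch below wraps it in a hypothetical helper, `predict_fat`, with the coefficients copied from the summary of `bf.mod3.adj`. The example person is made up and is not an observation from the data set.

```r
# Hypothetical helper wrapping the fitted equation above.
# Inputs: age in years, chest and hip circumference in centimeters.
predict_fat <- function(age, chest, hip) {
  -70.71043 + 0.16227 * age + 0.37766 * chest + 0.44618 * hip
}

# Made-up example person (not taken from the data set)
predict_fat(age = 40, chest = 100, hip = 100)   # 18.16437
```

The equation is written out explicitly because the model was fitted with terms such as `bf_adj$age`, so `predict()` with a `newdata` argument would not pick up new values. Refitting with `data = bf_adj` and plain column names would be one way to make `predict()` usable.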