Analyze measures of body fat in NHANES. The following types of analysis might be explored:
- Correspondence of different measures of body fat with each other.
- Correspondence of different measures of body fat with other measures (e.g. lab test results).
- Correspondence of different measures of body fat with outcomes (e.g. cardiovascular disease).
The "Classification of obesity" wiki article provides a helpful overview.
The National Health and Nutrition Examination Survey (NHANES) provides a variety of data that allow body fat to be estimated.
These include the following variables (using 1999-2000 names):
- BIDPFAT - Estimated percent body fat - from Bioelectrical Impedance Analysis (BIX)
- DXDTOPF - Total percent fat - from Dual Energy X-ray Absorptiometry (DXA)
- BMXBMI - Body Mass Index (kg/m**2) - from Body Measures (BMX) (computed from Height and Weight)
- BMXWT - Weight (kg)
- BMXHT - Standing Height (cm)
- BMXWAIST - Waist Circumference (cm)
- RIDAGEYR - Age at Screening
- RIAGENDR - Gender - 1 for male, 2 for female
This document focuses on the 1999-2000 survey, which is representative of the recurring surveys from 1999 to 2014. The data collected vary by survey year.
The following measures are commonly used to evaluate body composition.
NHANES provides body mass index (BMI) data for all surveys, and BMI is easily calculated from height and weight (weight in kg divided by the square of height in m). BMI is currently the most commonly used measure of body composition.
1999-2000 NHANES BMI Details
Using the variable:
- BMXBMI - Body Mass Index (kg/m**2) - from Body Measures (BMX) (computed from Height and Weight)
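From "Body mass index as a measure of body fatness: age- and sex-specific prediction formulas" (Deurenberg et al., 1991):
Child Body Fat % = (1.51 x BMI) - (0.70 x Age) - (3.6 x gender) + 1.4
Adult Body Fat % = (1.20 x BMI) + (0.23 x Age) - (10.8 x gender) - 5.4
Here gender is 1 for male and 0 for female (unlike RIAGENDR, which codes female as 2), Age is in years, and the child formula is applied at age 15 and under.
For additional detail see http://www.halls.md/bmi/heritage.htm
Note that in reality the %BF vs. BMI curve is not linear: the slope decreases as BMI increases, which is most noticeable above a BMI of 35.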
x <- transform(x,
  BMIPFEST = ifelse(RIDAGEYR <= 15,
    1.51 * BMXBMI - 0.70 * RIDAGEYR - 3.6 * (RIAGENDR == 1) + 1.4,  # child
    1.20 * BMXBMI + 0.23 * RIDAGEYR - 10.8 * (RIAGENDR == 1) - 5.4  # adult
  ))
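As a quick sanity check of the formulas above, the following sketch wraps them in a standalone function and evaluates a few cases by hand. The function name `bmi_percent_fat` and its arguments are hypothetical (they are not part of NHANES or of this analysis); `male` is 1 for men and 0 for women, and the child formula is assumed to apply at age 15 and under, as in the code above.

```r
# Hypothetical helper: percent body fat predicted from BMI, age and sex
# using the two prediction formulas quoted above.
# male: 1 for male, 0 for female (NHANES RIAGENDR == 1 maps to male = 1).
bmi_percent_fat <- function(bmi, age, male) {
  ifelse(age <= 15,
         1.51 * bmi - 0.70 * age - 3.6 * male + 1.4,   # child formula
         1.20 * bmi + 0.23 * age - 10.8 * male - 5.4)  # adult formula
}

# A 40-year-old with a BMI of 25:
bmi_percent_fat(25, 40, male = 1)  # 30 + 9.2 - 10.8 - 5.4 = 23.0
bmi_percent_fat(25, 40, male = 0)  # 30 + 9.2 - 5.4        = 33.8

# A 10-year-old boy with a BMI of 18:
bmi_percent_fat(18, 10, male = 1)  # 27.18 - 7.0 - 3.6 + 1.4 = 17.98
```

The 10.8-point gap between the two adult results is the fixed sex offset built into the adult formula.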
NHANES provides waist circumference (WC) data for all surveys. It can be used either as a standalone measure or as a component of other measures (e.g. see WHtR and WHR below).
Using the variable:
- BMXWAIST - Waist Circumference (cm)
NHANES provides waist circumference and height data for all surveys. This allows calculation of the waist-to-height ratio (WHtR).
Using the variables:
- BMXHT - Standing Height (cm)
- BMXWAIST - Waist Circumference (cm)
Note that the acronym is WHtR to avoid confusion with the waist-to-hip ratio (WHR).
Paper: http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0103483
Review article: http://journals.cambridge.org/action/displayAbstract?aid=7929311
x$WHtR <- x$BMXWAIST / x$BMXHT
x$WHtR2 <- x$BMXWAIST / sqrt(x$BMXHT) # Try square root of height given residual variation with height
NHANES does not provide hip circumference data, so WHR cannot be calculated from the data.
NHANES provides bioelectrical impedance analysis (BIA) data for the 1999-2004 surveys.
1999-2000 NHANES BIA Details
Using the variable:
- BIDPFAT - Estimated percent body fat - from Bioelectrical Impedance Analysis (BIX)
NHANES provides DXA data for the 1999-2006 surveys.
NHANES DXA Overview
1999-2000 NHANES DXA Details (2MB PDF)
At the moment I am averaging the 5 DXDTOPF values provided for each participant (the DXA data are multiply imputed). This is discouraged by the documentation, and I view it as useful only for making some initial assessments.
- DXDTOPF - Total percent fat - from Dual Energy X-ray Absorptiometry (DXA)
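The averaging step described above could look roughly like the following. This is only a sketch under assumptions: it supposes the DXA file has been read into a data frame with one row per participant per value, identified by the NHANES respondent number `SEQN`; the toy data frame `dxa` and its numbers are made up for illustration.

```r
# Toy stand-in for the DXA data: 2 participants x 5 rows each (values invented).
dxa <- data.frame(
  SEQN    = rep(c(1, 2), each = 5),
  DXDTOPF = c(24.8, 25.1, 25.0, 24.9, 25.2,
              33.0, 32.6, 33.4, 32.9, 33.1)
)

# Collapse to one row per participant by averaging the 5 DXDTOPF values.
# Note: this discards the variability between the 5 values.
dxa_mean <- aggregate(DXDTOPF ~ SEQN, data = dxa, FUN = mean)
dxa_mean
```

The averaged column can then be merged back onto the other measures by `SEQN`. Averaging hides the spread among the separate values, which is presumably why it is suitable only for a first look.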
NHANES does not provide Hydrostatic (aka Underwater) Weighing data.
Examine how the different measures vary with age. Further analysis will be done only with subjects aged 20 or older.
scatterplot(BMXBMI ~ RIDAGEYR, data=x[x$RIAGENDR == 1,], main="BMI vs Age for Men")
scatterplot(BMXBMI ~ RIDAGEYR, data=x[x$RIAGENDR == 2,], main="BMI vs Age for Women")
scatterplot(DXDTOPF ~ RIDAGEYR, data=x[x$RIAGENDR == 1,], main="DXA % Fat vs Age for Men")
scatterplot(DXDTOPF ~ RIDAGEYR, data=x[x$RIAGENDR == 2,], main="DXA % Fat vs Age for Women")
The two main things to notice here are the dramatic differences before age 20 and the gradual increase with age after 20.
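The later code refers to a data frame named `adults`, whose construction is not shown. A minimal sketch of the age restriction described above, assuming it is a plain subset of `x` on `RIDAGEYR` (the toy `x` below is invented for illustration):

```r
# Toy stand-in for the merged NHANES data frame `x` (values invented).
x <- data.frame(
  RIDAGEYR = c(8, 15, 19, 20, 34, 61),
  RIAGENDR = c(1, 2, 1, 2, 1, 2),
  BMXBMI   = c(16.2, 21.0, 23.5, 24.1, 27.8, 30.3)
)

# Keep only subjects aged 20 or older.
adults <- subset(x, RIDAGEYR >= 20)
nrow(adults)  # 3 of the 6 toy rows remain
```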
First look at correlations for all of the metrics in adults.
dataCor <- cor(adults[,c("BIDPFAT", "DXDTOPF", "BMIPFEST", "BMXBMI", "BMXWAIST",
"WHtR", "WHtR2")],
use="pairwise.complete.obs")
corrplot(dataCor, title="Metric Correlations for Adults")
dataCor
##          BIDPFAT DXDTOPF BMIPFEST BMXBMI BMXWAIST   WHtR  WHtR2
## BIDPFAT   1.0000  0.8757   0.7535 0.4410   0.3093 0.4865 0.4055
## DXDTOPF   0.8757  1.0000   0.8674 0.6112   0.4516 0.6467 0.5602
## BMIPFEST  0.7535  0.8674   1.0000 0.7697   0.6196 0.7862 0.7170
## BMXBMI    0.4410  0.6112   0.7697 1.0000   0.8696 0.8880 0.8965
## BMXWAIST  0.3093  0.4516   0.6196 0.8696   1.0000 0.9219 0.9804
## WHtR      0.4865  0.6467   0.7862 0.8880   0.9219 1.0000 0.9802
## WHtR2     0.4055  0.5602   0.7170 0.8965   0.9804 0.9802 1.0000
The metrics fall into two groups: percent fat estimates and height/weight/waist estimates.
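The two-group reading can be checked directly from the correlation matrix. The sketch below re-enters the adult correlations printed above and applies hierarchical clustering with 1 - r as the distance. The choice of complete linkage (the `hclust` default) and of cutting the tree at two clusters are assumptions made for illustration, not part of the original analysis.

```r
vars <- c("BIDPFAT", "DXDTOPF", "BMIPFEST", "BMXBMI", "BMXWAIST", "WHtR", "WHtR2")

# Adult correlation matrix, copied from the output above.
r <- matrix(c(
  1.0000, 0.8757, 0.7535, 0.4410, 0.3093, 0.4865, 0.4055,
  0.8757, 1.0000, 0.8674, 0.6112, 0.4516, 0.6467, 0.5602,
  0.7535, 0.8674, 1.0000, 0.7697, 0.6196, 0.7862, 0.7170,
  0.4410, 0.6112, 0.7697, 1.0000, 0.8696, 0.8880, 0.8965,
  0.3093, 0.4516, 0.6196, 0.8696, 1.0000, 0.9219, 0.9804,
  0.4865, 0.6467, 0.7862, 0.8880, 0.9219, 1.0000, 0.9802,
  0.4055, 0.5602, 0.7170, 0.8965, 0.9804, 0.9802, 1.0000),
  nrow = 7, byrow = TRUE, dimnames = list(vars, vars))

# Treat highly correlated metrics as "close" and cluster them.
tree   <- hclust(as.dist(1 - r))   # complete linkage by default
groups <- cutree(tree, k = 2)      # named vector: metric -> cluster id
split(names(groups), groups)
```

With these inputs the cut separates BIDPFAT, DXDTOPF and BMIPFEST from BMXBMI, BMXWAIST, WHtR and WHtR2, which matches the grouping described above.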
Check how gender affects the correlations (should also check age).
dataCor <- cor(adults[adults$RIAGENDR == 1,c("BIDPFAT", "DXDTOPF", "BMIPFEST", "BMXBMI", "BMXWAIST", "WHtR", "WHtR2")],
use="pairwise.complete.obs")
corrplot(dataCor, title="Metric Correlations for Men")
dataCor
##          BIDPFAT DXDTOPF BMIPFEST BMXBMI BMXWAIST   WHtR  WHtR2
## BIDPFAT   1.0000  0.7213   0.5080 0.4901   0.6016 0.5961 0.6069
## DXDTOPF   0.7213  1.0000   0.7967 0.7422   0.8381 0.8555 0.8579
## BMIPFEST  0.5080  0.7967   1.0000 0.8348   0.8790 0.9031 0.9027
## BMXBMI    0.4901  0.7422   0.8348 1.0000   0.9089 0.8960 0.9143
## BMXWAIST  0.6016  0.8381   0.8790 0.9089   1.0000 0.9486 0.9873
## WHtR      0.5961  0.8555   0.9031 0.8960   0.9486 1.0000 0.9868
## WHtR2     0.6069  0.8579   0.9027 0.9143   0.9873 0.9868 1.0000
dataCor <- cor(adults[adults$RIAGENDR == 2,c("BIDPFAT", "DXDTOPF", "BMIPFEST", "BMXBMI", "BMXWAIST", "WHtR", "WHtR2")],
use="pairwise.complete.obs")
corrplot(dataCor, title="Metric Correlations for Women")
dataCor
##          BIDPFAT DXDTOPF BMIPFEST BMXBMI BMXWAIST   WHtR  WHtR2
## BIDPFAT   1.0000  0.7529   0.5901 0.6004   0.6069 0.5983 0.6078
## DXDTOPF   0.7529  1.0000   0.7925 0.7913   0.7644 0.7804 0.7806
## BMIPFEST  0.5901  0.7925   1.0000 0.8866   0.8384 0.8642 0.8603
## BMXBMI    0.6004  0.7913   0.8866 1.0000   0.8944 0.8833 0.8982
## BMXWAIST  0.6069  0.7644   0.8384 0.8944   1.0000 0.9588 0.9897
## WHtR      0.5983  0.7804   0.8642 0.8833   0.9588 1.0000 0.9896
## WHtR2     0.6078  0.7806   0.8603 0.8982   0.9897 0.9896 1.0000
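Separating by gender makes a substantial difference: for example, the correlation between BMXBMI and DXDTOPF rises from 0.61 for all adults to 0.74 for men and 0.79 for women. I am unsure how to interpret that.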
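One plausible contributor to the gender effect is a fixed offset between the sexes: if women carry more fat than men at the same BMI, pooling the two groups adds scatter that neither group has on its own. The simulation below illustrates only that mechanism. The data are synthetic, and the slope, offset and noise level are assumptions loosely borrowed from the adult BMI formula above, not estimates from NHANES.

```r
set.seed(1)
n <- 500  # synthetic subjects per sex

# Same BMI distribution for both sexes; percent fat follows the same slope,
# but men sit 10.8 points lower (assumed offset), plus random noise.
sim <- data.frame(
  male = rep(c(1, 0), each = n),
  bmi  = rnorm(2 * n, mean = 27, sd = 5)
)
sim$pfat <- 1.2 * sim$bmi - 10.8 * sim$male + rnorm(2 * n, sd = 4)

# Correlation of BMI with percent fat: pooled vs. within each sex.
r_pooled <- cor(sim$bmi, sim$pfat)
r_men    <- with(sim[sim$male == 1, ], cor(bmi, pfat))
r_women  <- with(sim[sim$male == 0, ], cor(bmi, pfat))
round(c(pooled = r_pooled, men = r_men, women = r_women), 2)
```

In this setup the pooled correlation comes out lower than either within-sex correlation even though the relationship between BMI and fat is identical in both groups. Whether this accounts for the NHANES pattern would need checking, for example by including gender as a covariate.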
Examine how the different measures vary with height. The goal here is to check whether any of the measures appear to be biased by height.
scatterplot(WHtR ~ BMXHT, data=adults[adults$RIAGENDR == 1,], main="Waist-to-Height Ratio vs Height for Men")
scatterplot(WHtR ~ BMXHT, data=adults[adults$RIAGENDR == 2,], main="Waist-to-Height Ratio vs Height for Women")
Given the variation of WHtR with height, consider using a different exponent for height. First, look at how waist circumference varies with height.
scatterplot(BMXWAIST ~ BMXHT, data=adults[adults$RIAGENDR == 1,], main="Waist vs Height for Men")
scatterplot(BMXWAIST ~ BMXHT, data=adults[adults$RIAGENDR == 2,], main="Waist vs Height for Women")
Try waist to square root of height.
scatterplot(WHtR2 ~ BMXHT, data=adults[adults$RIAGENDR == 1,], main="Waist-to-sqrt(Height) Ratio vs Height for Men")
scatterplot(WHtR2 ~ BMXHT, data=adults[adults$RIAGENDR == 2,], main="Waist-to-sqrt(Height) Ratio vs Height for Women")
Using the square root, the variation with height is small, though it runs in opposite directions for men and women (why?).
I think this metric is worth further evaluation, but it would be good to have some physical justification for the square-root exponent.
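One empirical way to choose the exponent, offered here as a sketch rather than as the method used in this analysis: if waist scales roughly as height^p, then the slope of log(waist) on log(height) estimates p, and waist / height^p should show little remaining trend with height. The data below are synthetic, generated with a true exponent of 0.5 purely for illustration; with the real data the same regression would be run on `BMXWAIST` and `BMXHT` within each gender.

```r
set.seed(2)
n <- 1000

# Synthetic adults: height in cm, waist generated as c * height^0.5 with
# multiplicative noise (the 0.5 is an assumption of this toy example).
toy <- data.frame(BMXHT = rnorm(n, mean = 170, sd = 8))
toy$BMXWAIST <- 7 * toy$BMXHT^0.5 * exp(rnorm(n, sd = 0.12))

# The slope of the log-log fit estimates the height exponent p.
fit <- lm(log(BMXWAIST) ~ log(BMXHT), data = toy)
p   <- unname(coef(fit)[2])

# Index built with the fitted exponent; its correlation with height
# should be close to zero.
toy$index <- toy$BMXWAIST / toy$BMXHT^p
c(exponent = p, cor_with_height = cor(toy$index, toy$BMXHT))
```

A fitted exponent near 1 would support plain WHtR, while a value near 0.5 would support the square-root variant. Comparing the estimates for men and women might also shed light on the opposite trends noted above.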
scatterplot(BMXBMI ~ BMXHT, data=adults[adults$RIAGENDR == 1,], main="BMI vs Height for Men")
scatterplot(BMXBMI ~ BMXHT, data=adults[adults$RIAGENDR == 2,], main="BMI vs Height for Women")
scatterplot(DXDTOPF ~ BMXHT, data=adults[adults$RIAGENDR == 1,], main="DXA % Fat vs Height for Men")
scatterplot(DXDTOPF ~ BMXHT, data=adults[adults$RIAGENDR == 2,], main="DXA % Fat vs Height for Women")
scatterplot(WHtR ~ BMXBMI, data=adults, main="Waist-to-Height Ratio vs BMI for Adults")
scatterplot(WHtR ~ BMXBMI, data=adults[adults$RIAGENDR == 1,], main="Waist-to-Height Ratio vs BMI for Men")
scatterplot(WHtR ~ BMXBMI, data=adults[adults$RIAGENDR == 2,], main="Waist-to-Height Ratio vs BMI for Women")
Although highly correlated with BMI, WHtR clearly provides additional information (especially for women).
Not run currently.