library(SASxport)
## Warning: package 'SASxport' was built under R version 3.3.3
library(ggplot2)  # provides ggplot() and geom_histogram(), used below
library(psych)    # provides describe(), used below
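# Read the body-measures file (BMX_H) from SAS transport format
experimental = read.xport("C:/Users/Exped/Desktop/607P2/BMX_H.XPT")
bmi = experimental$BMXBMI  # body mass index (kg/m^2)
bhi = experimental$BMXHT   # standing height (cm)
# Keep only the two variables of interest
dataAnalysis = data.frame(bmi, bhi)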
Is a person’s height predictive of BMI classification? The BMI system uses height to estimate a person’s body fat; however, height on its own should not be predictive of body fat.
BMI classifications are as follows.
1. Underweight (BMI < 5th percentile)
2. Normal weight (BMI 5th to < 85th percentiles)
3. Overweight (BMI 85th to < 95th percentiles)
4. Obese (BMI ≥ 95th percentile)
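The cutoffs above can be sketched as a small lookup. This is a minimal illustration, assuming each participant's BMI percentile is already known; `classify_bmi_percentile` and the example percentiles are hypothetical and not part of the original analysis, and the dataset loaded here contains raw BMI values rather than percentiles.

```r
# Hypothetical helper: maps a BMI percentile (0-100) to the four
# categories listed above. The percentile itself is assumed to be given.
classify_bmi_percentile <- function(pct) {
  cut(
    pct,
    breaks = c(-Inf, 5, 85, 95, Inf),
    labels = c("Underweight", "Normal weight", "Overweight", "Obese"),
    right = FALSE  # intervals are [lower, upper), so 85 falls in "Overweight"
  )
}

# Illustrative percentiles, not taken from the dataset
example_pct <- c(3, 5, 84.9, 85, 95)
as.character(classify_bmi_percentile(example_pct))
# "Underweight" "Normal weight" "Normal weight" "Overweight" "Obese"
```

Setting `right = FALSE` makes each lower bound inclusive, which matches the "5th to < 85th" and "85th to < 95th" wording of the list.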
Each case represents an eligible survey participant aged 2-150. Quality assurance for fairness and random sampling is controlled by the NCHS Research Data Center and the CDC. The dataset contains around 9,000 observations with a recorded measurement (9,055 for BMI and 9,067 for height, per the summaries below).
The data were obtained from the Centers for Disease Control and Prevention (CDC) government database and are collected in a joint effort by the CDC and the NCHS Research Data Center.
This is an observational study using data from the 2013-2014 survey cycle.
Data can be found here: https://wwwn.cdc.gov/nchs/nhanes/search/datapage.aspx?Component=Examination&CycleBeginYear=2013
Documentation can be found here: https://wwwn.cdc.gov/Nchs/Nhanes/2013-2014/BMX_H.htm
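The response variable is BMI, recorded for roughly 9,000 participants. It is numerical and continuous (recorded to one decimal place).
The explanatory variable is height, recorded for roughly 9,000 participants. It is also numerical and continuous (recorded to one decimal place).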
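One way the response/explanatory setup could be examined is a simple linear regression of BMI on height. The sketch below is an assumption about how the relationship might be modeled, since this section does not specify a method. The `toy` data frame holds made-up values purely to show the mechanics; with the real data, `dataAnalysis` would take its place.

```r
# Made-up values for illustration only; these are NOT NHANES measurements.
toy <- data.frame(
  bhi = c(110.2, 125.5, 140.1, 155.8, 162.0, 170.4, 181.3, NA),
  bmi = c(15.9, 16.8, 18.5, 22.4, NA, 27.1, 25.3, 30.2)
)

# lm() drops rows with a missing response or predictor by default
# (the default na.action is na.omit)
fit <- lm(bmi ~ bhi, data = toy)

nobs(fit)  # number of complete rows actually used in the fit
coef(fit)  # intercept and slope for height
```

Checking `nobs(fit)` is worthwhile here because the two variables have different numbers of missing values, so the fitted sample is smaller than either column's count.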
describe(dataAnalysis$bmi)
## vars n mean sd median trimmed mad min max range skew kurtosis
## X1 1 9055 25.68 7.96 24.7 24.97 7.71 12.1 82.9 70.8 1.02 2
## se
## X1 0.08
describe(dataAnalysis$bhi)
## vars n mean sd median trimmed mad min max range skew
## X1 1 9067 155.88 23.18 162 159.14 15.12 79.7 202.6 122.9 -1.24
## kurtosis se
## X1 0.96 0.24
ggplot(dataAnalysis,aes(x=bhi)) + geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 746 rows containing non-finite values (stat_bin).
ggplot(dataAnalysis,aes(x=bmi)) + geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 758 rows containing non-finite values (stat_bin).
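The two warnings above report rows dropped because the plotted column was missing. A minimal sketch of how those counts could be checked directly is shown here, using a made-up `toy` data frame in place of `dataAnalysis`; the values are illustrative only.

```r
# Made-up values for illustration only; these are NOT NHANES measurements.
toy <- data.frame(
  bmi = c(15.9, NA, 18.5, 22.4, NA, 27.1),
  bhi = c(110.2, 125.5, NA, 155.8, 162.0, 170.4)
)

# Missing values per column; each NA is a row that a histogram
# of that column would drop
colSums(is.na(toy))

# Number of rows where both measurements are present
sum(complete.cases(toy))
```

Applied to the real data, `colSums(is.na(dataAnalysis))` would be expected to match the counts in the warnings, and `complete.cases()` gives the rows usable when both variables are needed together.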