Load Body Dimensions Data

load("more/bdims.RData")

Make Subsets of the Data for Men and Women

mdims <- subset(bdims, sex == 1)
fdims <- subset(bdims, sex == 0)

Question 1

Histogram of Men’s Heights

hist(mdims$hgt, xlab = "men's height (cm)", main = "Histogram - Men's Heights")

Histogram of Women’s Heights

hist(fdims$hgt, xlab = "Women's height (cm)", main = "Histogram - Women's Heights")

The mean height for men is larger than the mean height for women. The distribution of men’s heights is unimodal and symmetric. The range of the most common heights is narrower for men than for women. The distribution of women’s heights is also unimodal, but it is harder to tell whether it is symmetric or skewed left.
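The visual comparison above can be backed up with a few summary statistics per group. The sketch below uses a simulated stand-in data frame called `toy` (a hypothetical name, with made-up means and spreads), not the actual `bdims` measurements, so its numbers say nothing about the real data.

```r
# Stand-in data: simulated heights, NOT the bdims measurements.
# Group sizes, means, and standard deviations are illustrative assumptions.
set.seed(10)
toy <- data.frame(
  sex = rep(c(1, 0), times = c(247, 260)),
  hgt = c(rnorm(247, mean = 178, sd = 7), rnorm(260, mean = 165, sd = 6.5))
)

# Center and spread for each group: mean, standard deviation, and IQR.
group_stats <- aggregate(
  hgt ~ sex, data = toy,
  FUN = function(v) c(mean = mean(v), sd = sd(v), iqr = IQR(v))
)
group_stats
```

With the real data, the same call on `bdims` would give one row per sex, and the IQR column is one way to check the claim about which group has the narrower range of common heights.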

Mean and Standard Deviation for Women’s Heights

fhgtmean <- mean(fdims$hgt)
fhgtsd   <- sd(fdims$hgt)

Density Histogram for Women’s Heights with Normal Curve

hist(fdims$hgt, probability = TRUE, ylim = c(0, 0.06), main = "Histogram - Density of Women's Heights", xlab = "women's height (cm)")
x <- 140:190
y <- dnorm(x = x, mean = fhgtmean, sd = fhgtsd)
lines(x = x, y = y, col = "blue")

  1. The distribution might be normal, but the spread of the data around the mean looks too wide compared with the curve.
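Judging the fit by eye depends on how the curve is drawn. As a minimal sketch, the overlay can use a finer grid than whole centimeters and a `ylim` computed from the data rather than typed in by hand. The data here are simulated stand-ins, and the variable names are illustrative only.

```r
# Stand-in data: simulated values, NOT the bdims measurements.
set.seed(1)
hgt <- rnorm(260, mean = 165, sd = 6.5)

m <- mean(hgt)
s <- sd(hgt)

# Compute the histogram without drawing it, so the tallest bar is known.
h <- hist(hgt, plot = FALSE)
peak <- max(h$density, dnorm(m, mean = m, sd = s))

# Draw the density histogram with enough headroom for bars and curve.
hist(hgt, probability = TRUE, ylim = c(0, 1.05 * peak),
     main = "Density Histogram with Normal Curve", xlab = "height (cm)")

# A grid of 200 points gives a smoother curve than integer steps.
x <- seq(min(hgt), max(hgt), length.out = 200)
lines(x, dnorm(x, mean = m, sd = s), col = "blue")
```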

Simulating Data from a Normal Distribution with the Sample’s Mean and Standard Deviation

sim_norm <- rnorm(n = length(fdims$hgt), mean = fhgtmean, sd = fhgtsd)

Question 3

Normal Probability Plot for Simulated Data

qqnorm(sim_norm)
qqline(sim_norm)

Normal Probability Plot for Height for Women

qqnorm(fdims$hgt)
qqline(fdims$hgt)

Not all of the points fall on the line. There are some outliers at the low and high ends. The simulated data conform to the line in a similar manner to the actual data of women’s heights.
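To read these plots it helps to know what is being plotted. The sketch below rebuilds the coordinates that `qqnorm` uses (sorted sample values against standard normal quantiles from `ppoints`) and the line that `qqline` draws by default through the first and third quartiles. It runs on simulated stand-in data, not on `fdims$hgt`.

```r
# Stand-in data: simulated values, NOT the bdims measurements.
set.seed(42)
x <- rnorm(260, mean = 165, sd = 6.5)

n <- length(x)
theo <- qnorm(ppoints(n))  # theoretical standard normal quantiles
samp <- sort(x)            # ordered sample values

# qqnorm returns its points in the original data order, so sort to compare.
qq <- qqnorm(x, plot.it = FALSE)
same_x <- isTRUE(all.equal(sort(qq$x), theo))
same_y <- isTRUE(all.equal(sort(qq$y), samp))

# Default qqline: the line through the 25th and 75th percentile points.
q_samp <- quantile(x, c(0.25, 0.75), names = FALSE)
q_theo <- qnorm(c(0.25, 0.75))
slope <- diff(q_samp) / diff(q_theo)
intercept <- q_samp[1] - slope * q_theo[1]
```

Points that drift away from this line in the tails are what the text above calls outliers at the low and high ends.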

Many Simulated Normal Plots

qqnormsim(fdims$hgt)

Question 4

The normal probability plot for women’s heights looks similar to the plots for the simulated data. The data are not as smooth, but the number of outliers is comparable to the number of outliers in the simulations. I think there is evidence that female heights are nearly normal.
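The comparison above relies on the lab’s `qqnormsim` helper. For readers without that function, here is a rough sketch of the same idea: one normal probability plot of the data next to several plots of samples simulated with the data’s mean and standard deviation. `qq_sim_sketch` is a hypothetical name, and this is an assumption about the approach, not the lab’s actual source code.

```r
# Hypothetical helper: a sketch of the idea, not the lab's qqnormsim source.
qq_sim_sketch <- function(dat, n_sim = 8) {
  n <- length(dat)
  # Each column is one simulated sample sharing the data's mean and sd.
  sims <- replicate(n_sim, rnorm(n, mean = mean(dat), sd = sd(dat)))

  # 3 x 3 grid: the data first, then the simulated samples.
  old <- par(mfrow = c(3, 3))
  on.exit(par(old))

  qqnorm(dat, main = "Normal Q-Q Plot (Data)")
  qqline(dat)
  for (i in seq_len(n_sim)) {
    qqnorm(sims[, i], main = "Normal Q-Q Plot (Sim)")
    qqline(sims[, i])
  }
  invisible(sims)
}

# Stand-in data: simulated values, NOT the bdims measurements.
set.seed(2)
x <- rnorm(260, mean = 165, sd = 6.5)
sims <- qq_sim_sketch(x)
```

If the data panel cannot be picked out from the simulated panels, that supports treating the data as nearly normal.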

Question 5

Analyzing the Distribution of Female Weights

Mean and Standard Deviation of Female Weights

fwgtmean <- mean(fdims$wgt)
fwgtsd   <- sd(fdims$wgt)

Density Histogram for Women’s Weights with Normal Curve

hist(fdims$wgt, probability = TRUE, main = "Histogram - Density of Women's Weights", xlab = "women's weight (kg)")
x <- 40:120
y <- dnorm(x = x, mean = fwgtmean, sd = fwgtsd)
lines(x = x, y = y, col = "blue")

The distribution of women’s weights is unimodal and skews significantly to the right. When comparing the distribution of the data to the normal curve, the peaks do not quite overlap and the asymmetry in the data is also noticeable.
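Right skew can also be checked numerically. In a right-skewed sample the mean usually sits above the median, and the moment-based sample skewness is positive. The sketch below defines a small hypothetical helper `skew` and applies it to simulated, deliberately skewed stand-in data, not to `fdims$wgt`.

```r
# Moment-based sample skewness: third central moment over sd cubed.
# Hypothetical helper; base R has no built-in skewness function.
skew <- function(v) {
  d <- v - mean(v)
  mean(d^3) / mean(d^2)^1.5
}

# Stand-in data: simulated log-normal values, NOT the bdims measurements.
set.seed(7)
wgt <- rlnorm(260, meanlog = log(60), sdlog = 0.15)

# Compare center measures and skewness in one named vector.
summary_stats <- c(mean = mean(wgt), median = median(wgt), skewness = skew(wgt))
summary_stats
```

A skewness near zero is consistent with symmetry, while a clearly positive value points to a longer right tail.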

Normal Probability Plot for Weight for Women

qqnorm(fdims$wgt)
qqline(fdims$wgt)

There are significant deviations between the line on the Q-Q plot and the data. There are many outliers at the high end and some outliers at the low end. The outliers at the high end lie especially far from the line.

Many Simulated Normal Plots

qqnormsim(fdims$wgt)

Comparing the fit between the data and the line on the qq plot to the fit between the simulated data and the line on the qq plot, it is apparent that the women’s weights do not conform to a normal distribution. The simulated data follows the line much more closely than the data of women’s weights.

Question 6

What is the probability a woman is shorter than 5 feet 1 inch (61 inches ≈ 155 cm)?

Theoretical Probability

pnorm(q=155, mean = fhgtmean, sd = fhgtsd)
## [1] 0.06571769

Empirical Probability

sum(fdims$hgt < 155)/length(fdims$hgt)
## [1] 0.06153846

What is the probability a woman weighs less than 46 kg?

Theoretical Probability

pnorm(q=46, mean = fwgtmean, sd = fwgtsd)
## [1] 0.064458

Empirical Probability

sum(fdims$wgt < 46)/length(fdims$wgt)
## [1] 0.02692308

Height showed closer agreement between the two methods (about 0.066 vs. 0.062) than weight did (about 0.064 vs. 0.027). This is expected because the height data followed a normal distribution more closely than the weight data.
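Both calculations can be wrapped in one small function so the two methods are always computed with the same cutoff. `compare_prob` is a hypothetical helper name. The theoretical value assumes a normal model with the sample’s mean and standard deviation, and the empirical value is the share of observations below the cutoff.

```r
# Hypothetical helper: P(X < cutoff) under a fitted normal model
# versus the observed proportion of values below the cutoff.
compare_prob <- function(v, cutoff) {
  c(
    theoretical = pnorm(cutoff, mean = mean(v), sd = sd(v)),
    empirical   = mean(v < cutoff)  # same as sum(v < cutoff) / length(v)
  )
}

# Tiny worked example on made-up numbers (not the bdims data).
v <- c(1, 2, 3, 4, 5)
res <- compare_prob(v, cutoff = 3)
res
```

With the lab data loaded, calls such as `compare_prob(fdims$hgt, 155)` and `compare_prob(fdims$wgt, 46)` would put each pair of probabilities side by side.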

Histogram Match

Normal Probability Plot for Elbow Diameter

qqnorm(fdims$elb.di)
qqline(fdims$elb.di)

Normal Probability Plot for Pelvic Breadth

qqnorm(fdims$bii.di)
qqline(fdims$bii.di)

Normal Probability Plot for Age

qqnorm(bdims$age)
qqline(bdims$age)

Normal Probability Plot for Chest Depth

qqnorm(fdims$che.de)
qqline(fdims$che.de)

  1. The histogram for female biiliac (pelvic) diameter (bii.di) belongs to normal probability plot letter B.

  2. The histogram for female elbow diameter (elb.di) belongs to normal probability plot letter C.

  3. The histogram for general age (age) belongs to normal probability plot letter D.

  4. The histogram for female chest depth (che.de) belongs to normal probability plot letter A.

Plots C and D have a slight step-wise pattern because the data are discretized. Ages are recorded in whole years, so there are no values in between. Elbow diameter shows a step-wise pattern for the same reason: it is recorded to a fixed precision, so no values fall between the recorded increments.
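The step-wise effect is easy to reproduce. In the sketch below, simulated normal values are rounded to one decimal place, which collapses many distinct values onto a few repeated ones and produces flat steps in the normal probability plot. The data are stand-ins and the rounding precision is an arbitrary assumption chosen for illustration.

```r
# Stand-in data: simulated values, NOT the bdims measurements.
set.seed(3)
raw <- rnorm(260, mean = 13, sd = 0.9)

# Discretize by rounding; the precision (1 decimal) is an assumption.
rounded <- round(raw, 1)

# Rounding can only reduce the number of distinct values.
n_raw <- length(unique(raw))
n_rounded <- length(unique(rounded))

# Side-by-side normal probability plots: continuous vs. discretized.
old <- par(mfrow = c(1, 2))
qqnorm(raw, main = "Continuous")
qqline(raw)
qqnorm(rounded, main = "Rounded")
qqline(rounded)
par(old)
```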

Normal Probability Plot for Female Knee Diameter

qqnorm(fdims$kne.di)
qqline(fdims$kne.di)

Because most of the points are concentrated at the lower end and the points at the upper end curve upward away from the line, the distribution is right skewed.

Histogram for Female Knee Diameter

hist(fdims$kne.di, xlab = "Women's knee diameter (cm)", main = "Histogram - Women's Knee Diameter")

The histogram also shows that the distribution is right skewed.