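# Fix the random seed so any simulations are reproducible
set.seed(1112)
# Load the body dimensions data and split it by sex (1 = male, 0 = female)
load("more/bdims.RData")
mdims <- subset(bdims, sex == 1)
fdims <- subset(bdims, sex == 0)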
  1. Make a histogram of men’s heights and a histogram of women’s heights. How would you compare the various aspects of the two distributions?
hist(mdims$hgt)

hist(fdims$hgt)

Both distributions are unimodal and roughly bell shaped. With the default breaks, the female height histogram looks slightly skewed to the right; with more breaks, the mode might appear more centrally located. Either way, it appears to be approximately normal.
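The remark about the number of breaks can be checked directly. This is a minimal sketch, assuming the `fdims` data frame from the setup code; the `hgt` vector falls back to simulated stand-in values (the mean and standard deviation are hypothetical) only so the snippet runs on its own. Note that `breaks = 20` is a suggestion that `hist()` rounds to convenient cut points.

```r
# Use the lab data when it is loaded; otherwise simulate stand-in heights (assumed values).
hgt <- if (exists("fdims")) fdims$hgt else {
  set.seed(1112); round(rnorm(260, mean = 165, sd = 6.5), 1)
}

# Default binning versus a finer binning of the same data.
h_default <- hist(hgt, main = "Default breaks", xlab = "Height (cm)")
h_fine    <- hist(hgt, breaks = 20, main = "About 20 breaks", xlab = "Height (cm)")
```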

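# Mean and standard deviation of female heights, used to parameterize the normal curve
fhgtmean <- mean(fdims$hgt)
fhgtsd   <- sd(fdims$hgt)
# Density-scaled histogram so the normal curve shares the same vertical axis
hist(fdims$hgt, probability = TRUE)
# Evaluate the normal density over the plotted height range and overlay it
x <- 140:190
y <- dnorm(x = x, mean = fhgtmean, sd = fhgtsd)
lines(x = x, y = y, col = "blue")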

  1. Based on this plot, does it appear that the data follow a nearly normal distribution?

Yes. The histogram of female heights follows the overlaid normal density curve closely, so the data appear to be nearly normally distributed.

Evaluating the normal distribution

Eyeballing the shape of the histogram is one way to determine if the data appear to be nearly normally distributed, but it can be frustrating to decide just how close the histogram is to the curve. An alternative approach involves constructing a normal probability plot, also called a normal Q-Q plot for “quantile-quantile”.
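A normal probability plot for the female heights can be drawn with base R's `qqnorm()` and `qqline()`. This is a minimal sketch, assuming the `fdims` data frame from the setup code; the simulated fallback (hypothetical mean and standard deviation) is there only so the snippet runs without the data file.

```r
# Use the lab data when it is loaded; otherwise simulate stand-in heights (assumed values).
hgt <- if (exists("fdims")) fdims$hgt else {
  set.seed(1112); round(rnorm(260, mean = 165, sd = 6.5), 1)
}

# Sample quantiles plotted against theoretical normal quantiles.
qq <- qqnorm(hgt, main = "Normal Q-Q plot: female heights")
# Reference line through the first and third quartiles.
qqline(hgt, col = "blue")
```

Points that stay close to the reference line suggest the data are nearly normal; systematic curvature at the ends suggests skew or heavy tails.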

  1. Make a normal probability plot of sim_norm. Do all of the points fall on the line? How does this plot compare to the probability plot for the real data?

Nearly all the points fall on the line; the few exceptions at the tails are still fairly close to it. Compared to the actual data, the simulated data look less jagged, but that is most likely because the measurements were recorded to fewer decimal places than the simulated values.
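`sim_norm` is not defined in the text above. One plausible construction, assuming it is a normal sample with the same size, mean, and standard deviation as the female heights, is sketched here next to the plot for the observed data. The fallback values for `hgt` are hypothetical and only keep the snippet runnable without the data file.

```r
# Use the lab data when it is loaded; otherwise simulate stand-in heights (assumed values).
hgt <- if (exists("fdims")) fdims$hgt else {
  set.seed(1112); round(rnorm(260, mean = 165, sd = 6.5), 1)
}

# Simulate as many values as there are observations, using the sample mean and sd.
sim_norm <- rnorm(n = length(hgt), mean = mean(hgt), sd = sd(hgt))

# Side-by-side Q-Q plots of the observed and simulated data.
op <- par(mfrow = c(1, 2))
qqnorm(hgt, main = "Observed heights"); qqline(hgt)
qqnorm(sim_norm, main = "sim_norm");    qqline(sim_norm)
par(op)
```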

  1. Does the normal probability plot for fdims$hgt look similar to the plots created for the simulated data? That is, do the plots provide evidence that the female heights are nearly normal?

Yes. The simulated plots all look very similar to fdims$hgt. We can conclude that female heights are nearly normal.

  1. Using the same technique, determine whether or not female weights appear to come from a normal distribution.

At first glance, the observed female weight data appear to be skewed to the right, because the ends of the Q-Q plot curve upwards. But after comparing this plot with the 10 simulated Q-Q plots based on the observed data, which look similar, female weight appears to be normally distributed.
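The comparison against simulated Q-Q plots can be reproduced in base R. The sketch below uses a hypothetical helper, `qq_sim_grid()`, which is not part of the lab materials; it draws the observed plot followed by 8 simulations so that everything fits a 3 x 3 grid. It assumes the weight column is named `wgt`, and falls back to made-up stand-in values when `fdims` is not loaded.

```r
# Use the lab data when it is loaded; otherwise simulate stand-in weights (assumed values).
wgt <- if (exists("fdims") && !is.null(fdims$wgt)) fdims$wgt else {
  set.seed(1112); round(rlnorm(260, meanlog = log(60), sdlog = 0.15), 1)
}

# Hypothetical helper: Q-Q plot of the data plus n_sim normal simulations
# with the same length, mean, and standard deviation as the data.
qq_sim_grid <- function(x, n_sim = 8) {
  op <- par(mfrow = c(3, 3), mar = c(2, 2, 2, 1))
  on.exit(par(op))
  qqnorm(x, main = "Observed"); qqline(x)
  for (i in seq_len(n_sim)) {
    sim <- rnorm(length(x), mean = mean(x), sd = sd(x))
    qqnorm(sim, main = paste("Simulation", i)); qqline(sim)
  }
  invisible(n_sim + 1)  # number of panels drawn
}

panels <- qq_sim_grid(wgt)
```

If the observed panel bends away from its line more than any of the simulated panels do, that is evidence against normality; if it blends in with them, the departure is within what sampling variation alone would produce.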

  1. a. The histogram for female biiliac (pelvic) diameter (bii.di) belongs to normal probability plot letter ____.

    B

    b. The histogram for female elbow diameter (elb.di) belongs to normal probability plot letter ____.

    C

    c. The histogram for general age (age) belongs to normal probability plot letter ____.

    D

    d. The histogram for female chest depth (che.de) belongs to normal probability plot letter ____.

    A

  2. Note that normal probability plots C and D have a slight stepwise pattern.
    Why do you think this is the case?

For plot C, elbow diameters are small values recorded with limited precision, so many observations share the same value and the plot jumps from one recorded value to the next, which creates the stepwise pattern. For plot D, age is measured in whole years rather than months or days, so it has the same lack of precision.
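The effect of coarse recording can be illustrated with simulated data. This sketch does not use the lab data: it draws hypothetical continuous measurements, rounds them to the nearest unit, and compares the two normal probability plots.

```r
set.seed(1112)
fine   <- rnorm(260, mean = 13, sd = 1)  # hypothetical continuous measurements
coarse <- round(fine)                    # same values recorded to the nearest unit

op <- par(mfrow = c(1, 2))
qqnorm(fine,   main = "Fine precision");   qqline(fine)
qqnorm(coarse, main = "Rounded to units"); qqline(coarse)
par(op)

# Rounding collapses many observations onto a few distinct values,
# and each distinct value shows up as one horizontal step.
n_distinct <- c(fine = length(unique(fine)), coarse = length(unique(coarse)))
```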

  1. As you can see, normal probability plots can be used both to assess normality and visualize skewness. Make a normal probability plot for female knee diameter (kne.di). Based on this normal probability plot, is this variable left skewed, symmetric, or right skewed? Use a histogram to confirm your findings.

The upward curve of observations at the beginning and end of the line tells us that knee diameter is skewed to the right. This is confirmed by the histogram.
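The two plots behind this answer can be produced as follows. This is a minimal sketch, assuming `fdims` is loaded and contains the `kne.di` column named in the question; the fallback values are a made-up right-skewed sample used only so the snippet runs on its own.

```r
# Use the lab data when it is loaded; otherwise simulate a right-skewed stand-in (assumed values).
kne <- if (exists("fdims") && !is.null(fdims$kne.di)) fdims$kne.di else {
  set.seed(1112); round(16 + rexp(260, rate = 0.5), 1)
}

# Normal probability plot next to the histogram used to confirm the skew.
op <- par(mfrow = c(1, 2))
qq <- qqnorm(kne, main = "Normal Q-Q plot: knee diameter"); qqline(kne)
hist(kne, main = "Knee diameter", xlab = "kne.di")
par(op)
```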