On Your Own

  1. Now let’s consider some of the other variables in the body dimensions data set. Using the figures at the end of the exercises, match the histogram to its normal probability plot. All of the variables have been standardized (first subtract the mean, then divide by the standard deviation), so the units won’t be of any help. If you are uncertain based on these figures, generate the plots in R to check.

    a. The histogram for female biiliac (pelvic) diameter (bii.di) belongs to normal probability plot letter B.

    b. The histogram for female elbow diameter (elb.di) belongs to normal probability plot letter C.

    c. The histogram for general age (age) belongs to normal probability plot letter D.

    d. The histogram for female chest depth (che.de) belongs to normal probability plot letter A.

download.file("http://www.openintro.org/stat/data/bdims.RData", destfile = "bdims.RData")
load("bdims.RData")

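# Note: bdims holds both sexes, so these vectors include every row,
# not only the female measurements the exercise names.

# Biiliac (pelvic) diameter
f_pelvic <- bdims$bii.di

mean(f_pelvic)
## [1] 27.82998
# Standardize: subtract the mean, then divide by the standard deviation
pelvic_st <- (f_pelvic - mean(f_pelvic)) / sd(f_pelvic)

# Elbow diameter
f_elbows <- bdims$elb.di

mean(f_elbows)
## [1] 13.38521
elbows_st <- (f_elbows - mean(f_elbows)) / sd(f_elbows)

# Age
g_age <- bdims$age

mean(g_age)
## [1] 30.18146
age_st <- (g_age - mean(g_age)) / sd(g_age)

# Chest depth
f_chest <- bdims$che.de

mean(f_chest)
## [1] 19.22604
chest_st <- (f_chest - mean(f_chest)) / sd(f_chest)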
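
The exercise suggests generating the plots in R when the figures alone are not conclusive. The following is a minimal sketch of that check, not the lab's own code. The `standardize()` helper is a hypothetical name introduced here, and the data are simulated so that the block runs without `bdims.RData`. The commented lines at the end assume `sex` is coded 0 for female, as in the lab handout.

```r
# Hypothetical helper (not part of the original lab): standardize a numeric vector.
standardize <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

# Simulated stand-in for a body measurement (assumption: not the bdims data).
set.seed(1)
x <- rnorm(260, mean = 27, sd = 2)
x_st <- standardize(x)

# Histogram and normal probability plot side by side for visual matching.
op <- par(mfrow = c(1, 2))
hist(x_st, main = "Histogram", xlab = "standardized value")
qqnorm(x_st, main = "Normal probability plot")
qqline(x_st)
par(op)

# With the lab data loaded, the female rows could be selected first:
# fdims <- subset(bdims, sex == 0)   # assumes 0 = female
# x_st  <- standardize(fdims$bii.di)
```

Standardizing changes only the axis scale, so the shape of both plots is the same as for the raw variable.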
  1. Note that normal probability plots C and D have a slight stepwise pattern.
    Why do you think this is the case?

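    The stepwise pattern comes from discreteness in the recorded data. By the matching above, plots C and D correspond to elbow diameter and age. Age is recorded in whole years, and elbow diameter appears to be measured to limited precision, so each variable takes only a finite set of distinct values and many observations are tied. The y-axis of the normal probability plot shows the sample quantiles, so tied observations sit at the same height and form flat steps. The x-axis shows the theoretical quantiles of the normal distribution, which is continuous, so the points within each step are spread out horizontally.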
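
    The effect of discreteness can be reproduced with simulated data. This is a rough sketch under stated assumptions: the values below are random draws, not the `bdims` measurements, and rounding to whole numbers stands in for a variable recorded in whole units.

```r
set.seed(42)
# Simulated continuous values (assumption: illustration only, not lab data).
x_cont <- rnorm(200, mean = 30, sd = 10)
# Rounding to whole numbers mimics a variable recorded in whole units.
x_round <- round(x_cont)

# Rounding creates ties: far fewer distinct values than observations.
n_cont  <- length(unique(x_cont))
n_round <- length(unique(x_round))

op <- par(mfrow = c(1, 2))
qqnorm(x_cont, main = "Continuous values")
qqline(x_cont)
qqnorm(x_round, main = "Rounded values")  # tied points form horizontal steps
qqline(x_round)
par(op)
```

    Comparing `n_cont` with `n_round` shows how many observations share a value after rounding, which is what produces the flat runs in the second plot.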

  2. As you can see, normal probability plots can be used both to assess normality and visualize skewness. Make a normal probability plot for female knee diameter (kne.di). Based on this normal probability plot, is this variable left skewed, symmetric, or right skewed? Use a histogram to confirm your findings.

    Knee diameter (kne.di) is right skewed: the normal probability plot and the histogram both point to a longer right tail.

# Knee diameter
f_knee <- bdims$kne.di

mean(f_knee)
## [1] 18.81065
# Standardize: subtract the mean, then divide by the standard deviation
knees_st <- (f_knee - mean(f_knee)) / sd(f_knee)
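
The plots themselves could be produced along the following lines. This is a sketch rather than the code used for the answer above. It runs on a simulated right-skewed sample, an assumption made so the block is self-contained. With the lab data loaded, `knees_st` would take the place of `y`. The moment-based skewness is an extra numeric cross-check that the exercise does not ask for.

```r
set.seed(7)
# Simulated right-skewed sample (log-normal); a stand-in, not the kne.di data.
y <- rlnorm(500, meanlog = 0, sdlog = 0.8)

op <- par(mfrow = c(1, 2))
qqnorm(y, main = "Normal probability plot")
qqline(y)                      # right skew: upper points rise above the line
hist(y, main = "Histogram")    # right skew: long tail to the right
par(op)

# Numeric cross-checks for right skew.
skew <- mean((y - mean(y))^3) / sd(y)^3   # positive for right-skewed data
mean_above_median <- mean(y) > median(y)
```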