library(openintro)
download.file("http://www.openintro.org/stat/data/bdims.RData", destfile = "bdims.RData")
load("bdims.RData")
head(bdims)
##   bia.di bii.di bit.di che.de che.di elb.di wri.di kne.di ank.di sho.gi che.gi
## 1   42.9   26.0   31.5   17.7   28.0   13.1   10.4   18.8   14.1  106.2   89.5
## 2   43.7   28.5   33.5   16.9   30.8   14.0   11.8   20.6   15.1  110.5   97.0
## 3   40.1   28.2   33.3   20.9   31.7   13.9   10.9   19.7   14.1  115.1   97.5
## 4   44.3   29.9   34.0   18.4   28.2   13.9   11.2   20.9   15.0  104.5   97.0
## 5   42.5   29.9   34.0   21.5   29.4   15.2   11.6   20.7   14.9  107.5   97.5
## 6   43.3   27.0   31.5   19.6   31.3   14.0   11.5   18.8   13.9  119.8   99.9
##   wai.gi nav.gi hip.gi thi.gi bic.gi for.gi kne.gi cal.gi ank.gi wri.gi age
## 1   71.5   74.5   93.5   51.5   32.5   26.0   34.5   36.5   23.5   16.5  21
## 2   79.0   86.5   94.8   51.5   34.4   28.0   36.5   37.5   24.5   17.0  23
## 3   83.2   82.9   95.0   57.3   33.4   28.8   37.0   37.3   21.9   16.9  28
## 4   77.8   78.8   94.0   53.0   31.0   26.2   37.0   34.8   23.0   16.6  23
## 5   80.0   82.5   98.5   55.4   32.0   28.4   37.7   38.6   24.4   18.0  22
## 6   82.5   80.1   95.3   57.5   33.0   28.0   36.6   36.1   23.5   16.9  21
##    wgt   hgt sex
## 1 65.6 174.0   1
## 2 71.8 175.3   1
## 3 80.7 193.5   1
## 4 72.6 186.5   1
## 5 78.8 187.2   1
## 6 74.8 181.5   1
# Use == to test equality; `sex = 1` would be passed as an unused argument and keep every row
mdims <- subset(bdims, sex == 1)
fdims <- subset(bdims, sex == 0)
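When filtering with subset(), the condition has to be a logical comparison written with `==`. The sketch below uses a small made-up data frame (`toy`, a hypothetical stand-in for bdims) to show how the two spellings behave; the values are invented for illustration only.

```r
# Toy data frame standing in for bdims (values are made up for illustration)
toy <- data.frame(hgt = c(174.0, 160.2, 181.5, 165.1),
                  sex = c(1, 0, 1, 0))

# `sex = 1` is treated as a named argument, not a condition,
# so no filtering happens and every row is returned
all_rows <- subset(toy, sex = 1)

# `sex == 1` is a logical comparison evaluated row by row
men   <- subset(toy, sex == 1)
women <- subset(toy, sex == 0)

nrow(all_rows)  # 4
nrow(men)       # 2
nrow(women)     # 2
```

A quick nrow() check after subsetting is a cheap way to confirm that the two groups really differ in size from the full data set.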
Exercise 1
Make a histogram of men’s heights and a histogram of women’s heights. How would you compare the various aspects of the two distributions?
hist(mdims$hgt,main="Male Heights",xlab="Height")

hist(fdims$hgt,main="Female Heights",xlab="Height")

The two distributions are very similar.
fhgtmean <- mean(fdims$hgt)
fhgtsd <- sd(fdims$hgt)
hist(fdims$hgt, probability = TRUE)
x <- 140:190
y <- dnorm(x = x, mean = fhgtmean, sd = fhgtsd)
lines(x = x, y = y, col = "blue")
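The overlay works because `probability = TRUE` rescales the bars so that their total area is 1, which puts them on the same scale as a density curve. Here is a minimal sketch of that idea on simulated data; the vector `hgt`, its mean and its standard deviation are assumptions chosen only to resemble heights in centimeters.

```r
set.seed(1)
# Simulated stand-in for a height variable; mean and sd are assumptions
hgt <- rnorm(260, mean = 165, sd = 6.5)

h <- hist(hgt, probability = TRUE,
          main = "Simulated heights", xlab = "Height (cm)")

# Evaluate the normal density on a fine grid spanning the data
x <- seq(min(hgt), max(hgt), length.out = 200)
y <- dnorm(x, mean = mean(hgt), sd = sd(hgt))
lines(x, y, col = "blue")

# With probability = TRUE the bar areas sum to 1, like the area under the curve
bar_area <- sum(h$density * diff(h$breaks))
bar_area
```

Building the grid from the range of the data, rather than a fixed range such as 140:190, keeps the curve aligned with the histogram when a different variable is plotted.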

Exercise 2
Based on this plot, does it appear that the data follow a nearly normal distribution?
Yes, the data generally follow a normal distribution. The central values are slightly lower, but the histogram slopes downward on either side of the center. …
qqnorm(fdims$hgt)
qqline(fdims$hgt)
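A normal probability plot pairs each sorted observation with the matching quantile of the standard normal distribution, and qqline() adds a line through the first and third quartiles. The sketch below rebuilds the coordinates that qqnorm() plots by hand; the data are simulated, so the numbers are illustrative only.

```r
set.seed(2)
y <- rnorm(50, mean = 165, sd = 6.5)   # simulated stand-in data

# qqnorm() returns the plotted coordinates when plot.it = FALSE
qq <- qqnorm(y, plot.it = FALSE)

# Rebuild them by hand: theoretical quantiles at evenly spread
# probabilities, paired with the sorted sample
theoretical <- qnorm(ppoints(length(y)))
manual_ok <- isTRUE(all.equal(sort(qq$x), theoretical)) &&
             isTRUE(all.equal(sort(qq$y), sort(y)))
manual_ok
```

If the sample is close to normal, the sorted values increase roughly linearly with the theoretical quantiles, which is why the points hug a straight line.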

sim_norm <- rnorm(n = length(fdims$hgt), mean = fhgtmean, sd = fhgtsd)
Exercise 3
Make a normal probability plot of sim_norm. Do all of the points fall on the line? How does this plot compare to the probability plot for the real data?
qqnorm(sim_norm, main="Probability plot of Sim Norm")
qqline(sim_norm)
The probability plot of sim_norm matches the normal Q-Q plot for the real data fairly well; the points at either end actually fall closer to the line than they do in the plot for the real data.
Exercise 4
Does the normal probability plot for fdims$hgt look similar to the plots created for the simulated data? That is, do plots provide evidence that the female heights are nearly normal?
Yes, the points generally follow a straight line, which is the pattern a nearly normal distribution produces.
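One simulated plot gives only a rough sense of how much wobble to expect from truly normal data. A possible way to calibrate the eye is to draw several simulations next to the observed plot. This is a sketch with simulated stand-in data (`obs` is hypothetical; in the lab it would be the height vector), not the lab's own helper function.

```r
set.seed(3)
# Stand-in for the real sample; mean and sd are assumptions
obs <- rnorm(260, mean = 165, sd = 6.5)

# Eight normal samples with the same size, mean and sd as the observed data
sims <- replicate(8, rnorm(length(obs), mean = mean(obs), sd = sd(obs)))

op <- par(mfrow = c(3, 3))            # 3 x 3 grid of panels
qqnorm(obs, main = "Observed"); qqline(obs)
for (i in seq_len(ncol(sims))) {
  qqnorm(sims[, i], main = paste("Simulation", i))
  qqline(sims[, i])
}
par(op)                               # restore the previous layout
```

If the observed panel does not stand out from the simulated ones, the deviations from the line are within what sampling variability alone would produce.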
Exercise 5
Using the same technique, determine whether or not female weights appear to come from a normal distribution.
Yes.
1 - pnorm(q = 182, mean = fhgtmean, sd = fhgtsd)
## [1] 0.1242436
sum(fdims$hgt > 182) / length(fdims$hgt)
## [1] 0.1301775
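The first line is the theoretical upper-tail probability under a normal model, and the second is the empirical proportion of observations above the cutoff. The sketch below shows equivalent ways to write each one, using simulated data and an arbitrary cutoff as assumptions.

```r
set.seed(5)
hgt <- rnorm(260, mean = 165, sd = 6.5)   # simulated stand-in data
m <- mean(hgt)
s <- sd(hgt)
cutoff <- 170                              # arbitrary cutoff for illustration

# Theoretical upper-tail probability, written two equivalent ways
p_theory  <- 1 - pnorm(cutoff, mean = m, sd = s)
p_theory2 <- pnorm(cutoff, mean = m, sd = s, lower.tail = FALSE)

# Empirical proportion: mean() of a logical vector is the share of TRUE values
p_emp  <- mean(hgt > cutoff)
p_emp2 <- sum(hgt > cutoff) / length(hgt)

c(theoretical = p_theory, empirical = p_emp)
```

The `lower.tail = FALSE` form avoids the subtraction, which is slightly more accurate when the tail probability is very small.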
Exercise 6
Write out two probability questions that you would like to answer; one regarding female heights and one regarding female weights. Calculate those probabilities using both the theoretical normal distribution as well as the empirical distribution (four probabilities in all). Which variable, height or weight, had a closer agreement between the two methods?
What is the probability that a woman is taller than 170 centimeters? What is the probability that a woman weighs less than 52 kilograms?
1 - pnorm(q = 170, mean = fhgtmean, sd = fhgtsd)
## [1] 0.5483867
sum(fdims$hgt > 170) / length(fdims$hgt)
## [1] 0.5325444
fwgtmean <- mean(fdims$wgt)
fwgtsd <- sd(fdims$wgt)
pnorm(q = 52, mean = fwgtmean, sd = fwgtsd)
## [1] 0.09941932
sum(fdims$wgt < 52) / length(fdims$wgt)
## [1] 0.07495069
Height shows the closer agreement: the theoretical and empirical probabilities differ by about 0.016 for height and about 0.024 for weight.
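The comparison can be made explicit by computing the gap between the two methods for each variable. The sketch below copies the four probabilities from the output above; the relative difference is an extra measure added here for illustration, not something the lab asks for.

```r
# Values copied from the output above
hgt_theory <- 0.5483867
hgt_emp    <- 0.5325444
wgt_theory <- 0.09941932
wgt_emp    <- 0.07495069

# Absolute gap between theoretical and empirical probabilities
abs_diff <- c(height = abs(hgt_theory - hgt_emp),
              weight = abs(wgt_theory - wgt_emp))

# Gap relative to the empirical value
rel_diff <- abs_diff / c(hgt_emp, wgt_emp)

round(abs_diff, 4)
round(rel_diff, 3)
```

Both measures point the same way here, and the relative gap is far larger for weight because its probabilities are much smaller.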
On your own - 1
- The histogram for female biiliac (pelvic) diameter (bii.di) belongs to normal probability plot letter B
- The histogram for female elbow diameter (elb.di) belongs to normal probability plot letter C
- The histogram for general age (age) belongs to normal probability plot letter D
- The histogram for female chest depth (che.de) belongs to normal probability plot letter A
On your own - 2
Note that normal probability plots C and D have a slight stepwise pattern. Why do you think this is the case?
According to the matches above, plots C and D show elbow diameter and age. The stepwise pattern most likely comes from rounding: these values are recorded at a coarse resolution (age in whole years, for example), so many observations share exactly the same value.
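The effect of rounding on a normal probability plot can be reproduced with simulated data. This is a sketch under assumed values (the mean, standard deviation and whole-unit rounding are chosen only for illustration): the same sample is plotted before and after rounding.

```r
set.seed(6)
fine   <- rnorm(260, mean = 30, sd = 9)  # simulated continuous values
coarse <- round(fine)                     # same values recorded in whole units

n_fine   <- length(unique(fine))    # essentially every value is distinct
n_coarse <- length(unique(coarse))  # far fewer distinct values, so many ties

op <- par(mfrow = c(1, 2))
qqnorm(fine,   main = "Unrounded"); qqline(fine)
qqnorm(coarse, main = "Rounded");   qqline(coarse)
par(op)
```

Tied observations share one sample quantile but receive different theoretical quantiles, so they appear as short horizontal runs, which gives the plot its staircase look.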
On your own - 3
As you can see, normal probability plots can be used both to assess normality and visualize skewness. Make a normal probability plot for female knee diameter (kne.di). Based on this normal probability plot, is this variable left skewed, symmetric, or right skewed? Use a histogram to confirm your findings.
qqnorm(fdims$kne.di)
qqline(fdims$kne.di)
The variable appears to be right skewed: the points curve upward relative to the line, in a slight "smile" shape.
hist(fdims$kne.di)
The histogram confirms that it is right skewed.
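The link between an upward-curving normal probability plot and right skew can be checked on data whose shape is known. The sketch below uses a simulated log-normal sample (an assumption for illustration, not the knee diameter data) and compares the mean with the median.

```r
set.seed(7)
# Simulated right-skewed sample; parameters are assumptions
right_skewed <- rlnorm(1000, meanlog = 3, sdlog = 0.6)

op <- par(mfrow = c(1, 2))
qqnorm(right_skewed); qqline(right_skewed)   # points bend upward at the right end
hist(right_skewed, main = "Histogram", xlab = "Value")
par(op)

# A long right tail pulls the mean above the median
centers <- c(mean = mean(right_skewed), median = median(right_skewed))
centers
```

A left-skewed sample would show the mirror image: points bending below the line at the low end and a mean below the median.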