load("more/bdims.RData")
mdims <- subset(bdims, sex == 1)
fdims <- subset(bdims, sex == 0)
hist(mdims$hgt, xlab = "Men's height (cm)", main = "Histogram - Men's Heights")
hist(fdims$hgt, xlab = "Women's height (cm)", main = "Histogram - Women's Heights")
The mean height for men is larger than the mean height for women. The distribution of men’s heights is unimodal and symmetric. The most common heights for men span a narrower range than the most common heights for women. The distribution of women’s heights is also unimodal, but it is harder to tell whether it is symmetric or skewed left.
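The visual comparison can be backed up with a few summary statistics. The sketch below is only illustrative: `summarize_sample` is a hypothetical helper, not part of the lab, and the two vectors are simulated stand-ins with made-up means and standard deviations so that the code runs without `bdims.RData`.

```r
# Hypothetical helper (not part of the lab): numeric summary of one sample.
summarize_sample <- function(x) {
  c(mean = mean(x), median = median(x), sd = sd(x), iqr = IQR(x))
}

# Simulated stand-ins; the parameters are made-up illustration values,
# not estimates taken from bdims.
set.seed(1)
men_demo   <- rnorm(250, mean = 178, sd = 7)
women_demo <- rnorm(250, mean = 165, sd = 6.5)

# One row per group; a mean close to the median is consistent with symmetry.
print(rbind(men = summarize_sample(men_demo),
            women = summarize_sample(women_demo)))
```

With the lab data loaded, the same call would take `mdims$hgt` and `fdims$hgt` in place of the simulated vectors.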
fhgtmean <- mean(fdims$hgt)
fhgtsd <- sd(fdims$hgt)
hist(fdims$hgt, probability = TRUE, ylim = c(0, 0.06), main = "Histogram - Density of Women's Heights", xlab = "women's height (cm)")
x <- 140:190
y <- dnorm(x = x, mean = fhgtmean, sd = fhgtsd)
lines(x = x, y = y, col = "blue")
sim_norm <- rnorm(n = length(fdims$hgt), mean = fhgtmean, sd = fhgtsd)
qqnorm(sim_norm)
qqline(sim_norm)
qqnorm(fdims$hgt)
qqline(fdims$hgt)
Not all of the points fall on the line. There are some outliers at the low and high ends. The simulated data follows the line about as closely as the actual data of women’s heights.
qqnormsim(fdims$hgt)
The normal probability plot for women’s height looks similar to the plots for the simulated data. The data is not as smooth, but the number of outliers is comparable to the number of outliers in the simulations. I think there is evidence that female heights are nearly normal.
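The idea behind this comparison can be sketched in a few lines: draw several normal samples with the same size, mean, and standard deviation as the data, and plot each next to the observed normal probability plot. `qq_sim_sketch` below is a hypothetical stand-in written for illustration; the lab's own `qqnormsim` may be implemented differently. The demo vector is simulated with made-up parameters.

```r
# Hypothetical stand-in for the lab's qqnormsim(); the real function may differ.
qq_sim_sketch <- function(x, n_sims = 8) {
  # 3 x 3 grid: one observed panel plus up to eight simulated panels.
  old_par <- par(mfrow = c(3, 3), mar = c(2, 2, 2, 1))
  on.exit(par(old_par))

  # Panel 1: the observed data.
  qqnorm(x, main = "Observed")
  qqline(x)

  # Remaining panels: normal samples matching the data's n, mean, and sd.
  sims <- replicate(n_sims,
                    rnorm(length(x), mean = mean(x), sd = sd(x)),
                    simplify = FALSE)
  for (i in seq_along(sims)) {
    qqnorm(sims[[i]], main = paste("Simulation", i))
    qqline(sims[[i]])
  }
  invisible(sims)  # return the simulated samples for further inspection
}

# Demo on simulated data (made-up parameters, not the bdims values).
set.seed(42)
demo_hgt <- rnorm(260, mean = 165, sd = 6.5)
qq_sim_sketch(demo_hgt)
```

If the observed panel cannot be told apart from the simulated ones, that supports treating the data as nearly normal.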
fwgtmean <- mean(fdims$wgt)
fwgtsd <- sd(fdims$wgt)
hist(fdims$wgt, probability = TRUE, main = "Histogram - Density of Women's Weights", xlab = "women's weight (kg)")
x <- 40:120
y <- dnorm(x = x, mean = fwgtmean, sd = fwgtsd)
lines(x = x, y = y, col = "blue")
The distribution of women’s weights is unimodal and skews strongly to the right. When the distribution of the data is compared to the normal curve, the peaks do not quite overlap and the asymmetry in the data is also noticeable.
qqnorm(fdims$wgt)
qqline(fdims$wgt)
There are substantial deviations between the line on the Q-Q plot and the data. There are many outliers at the high end and some outliers at the low end. The outliers at the high end are especially far from the line.
qqnormsim(fdims$wgt)
Comparing how well the data fits the line on the Q-Q plot with how well the simulated data fits it, it is apparent that the women’s weights do not conform to a normal distribution. The simulated data follows the line much more closely than the data of women’s weights.
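The right skew seen in the plots can also be given a number. The sketch below computes the moment-based sample skewness coefficient, which is a technique swapped in here for illustration and not something the lab uses. `sample_skewness` is a hypothetical helper, and the demo vectors are simulated rather than taken from `bdims`.

```r
# Hypothetical helper: moment-based sample skewness (third central moment
# divided by the second central moment raised to the power 3/2).
sample_skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^(3 / 2)
}

# Simulated illustration data with made-up parameters.
set.seed(3)
symmetric_demo <- rnorm(260, mean = 165, sd = 6.5)   # roughly symmetric
skewed_demo    <- rlnorm(260, meanlog = 4.1, sdlog = 0.15)  # right skewed

# Values near 0 suggest symmetry; clearly positive values suggest right skew.
print(c(symmetric = sample_skewness(symmetric_demo),
        skewed    = sample_skewness(skewed_demo)))
```

With the lab data loaded, `sample_skewness(fdims$wgt)` and `sample_skewness(fdims$hgt)` could be compared in the same way.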
pnorm(q = 155, mean = fhgtmean, sd = fhgtsd)
## [1] 0.06571769
sum(fdims$hgt < 155)/length(fdims$hgt)
## [1] 0.06153846
pnorm(q = 46, mean = fwgtmean, sd = fwgtsd)
## [1] 0.064458
sum(fdims$wgt < 46)/length(fdims$wgt)
## [1] 0.02692308
Height showed closer agreement between the two methods (0.0657 theoretical vs. 0.0615 empirical) than weight did (0.0645 vs. 0.0269). This is expected because the height data followed a normal distribution more closely than the weight data.
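The two calculations can be wrapped into one function so that the theoretical and empirical probabilities are always computed from the same sample and cutoff. `compare_prob` is a hypothetical helper introduced for illustration; the small vector is a toy example, not lab data.

```r
# Hypothetical helper: probability of falling below q under a fitted normal
# model versus the observed proportion in the sample.
compare_prob <- function(x, q) {
  c(theoretical = pnorm(q, mean = mean(x), sd = sd(x)),
    empirical   = mean(x < q))  # same as sum(x < q) / length(x)
}

# Toy example: the cutoff equals the sample mean, so the normal model gives 0.5,
# while 2 of the 5 values lie strictly below 3.
toy <- c(1, 2, 3, 4, 5)
print(compare_prob(toy, q = 3))
```

With the lab data loaded, `compare_prob(fdims$hgt, 155)` and `compare_prob(fdims$wgt, 46)` would correspond to the two comparisons above.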
qqnorm(fdims$elb.di)
qqline(fdims$elb.di)
qqnorm(fdims$bii.di)
qqline(fdims$bii.di)
qqnorm(bdims$age)
qqline(bdims$age)
qqnorm(fdims$che.de)
qqline(fdims$che.de)
a. The histogram for female biiliac (pelvic) diameter (bii.di) belongs to normal probability plot letter B.
b. The histogram for female elbow diameter (elb.di) belongs to normal probability plot letter C.
c. The histogram for general age (age) belongs to normal probability plot letter D.
d. The histogram for female chest depth (che.de) belongs to normal probability plot letter A.
Plots C and D have a slight step-wise pattern because the data is discretized. Ages are recorded in whole years, so there are no values in between. Elbow diameter is measured in centimeters and shows a step-wise pattern for the same reason: it is recorded to a fixed precision, so only a limited set of values occurs.
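The effect of discretization on a normal probability plot can be reproduced with simulated data. The sketch below is illustrative only: the values are simulated with made-up parameters and then rounded to whole units, mimicking a variable such as age in years.

```r
# Simulated continuous values (made-up parameters, not the bdims ages).
set.seed(7)
smooth_demo  <- rnorm(200, mean = 30, sd = 9)
rounded_demo <- round(smooth_demo)  # recorded to whole units only

# Rounding collapses many distinct values onto a few repeated ones.
print(c(distinct_smooth  = length(unique(smooth_demo)),
        distinct_rounded = length(unique(rounded_demo))))

# Repeated values share the same height on the plot, producing flat steps.
qqnorm(rounded_demo, main = "Rounded values")
qqline(rounded_demo)
```

Each horizontal run of points in the resulting plot corresponds to one repeated recorded value.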
qqnorm(fdims$kne.di)
qqline(fdims$kne.di)
Because much of the data is concentrated at the lower end and the points trend upward away from the line at the high end, the distribution is right skewed.
hist(fdims$kne.di, xlab = "Women's knee diameter (cm)", main = "Histogram - Women's Knee Diameter")
The histogram also shows that the distribution is right skewed.