Student Details
Student Name: Nischay Bikram Thapa
Student No: s3819491
Problem Statement
Does the chest diameter of men or women follow a bell curve?
Males and females tend to have different body dimensions. The dataset provided describes 26 different body measurements from 247 men and 260 women, primarily in their twenties and early thirties, with a scattering of older men and women, all physically active (several hours of exercise a week). This report investigates one particular variable to determine whether it fits a normal distribution separately in males and females. The variable of interest is ‘che.di’, the respondent’s chest diameter in centimetres, measured at nipple level, mid-expiration.
The investigation begins by importing the dataset (bdims.csv) into R and converting the sex variable into a factor with the levels Male and Female. Summary statistics for the variable of interest are then calculated separately for men and women. Next, a histogram is generated for each of the two levels. Finally, the outputs are compared to reach a clear conclusion on whether the chest diameter of men and women fits a normal distribution curve.
Load Packages
# Load necessary packages
#For data manipulation
library(dplyr)
package 'dplyr' was built under R version 3.6.3
Registered S3 method overwritten by 'dplyr':
method from
print.rowwise_df
Attaching package: 'dplyr'
The following objects are masked from 'package:stats':
filter, lag
The following objects are masked from 'package:base':
intersect, setdiff, setequal, union
# For reading csv file
library(readr)
Data
When the dataset is imported, all variables are interpreted as numeric, including the sex column. Sex, however, is a nominal variable with two distinct categories. For better representation, it has therefore been converted into a factor with two levels, where 0 denotes ‘Female’ and 1 denotes ‘Male’.
# Read the csv file with read_csv function
data <- read_csv('bdims.csv')
Parsed with column specification:
cols(
.default = col_double()
)
See spec(...) for full column specifications.
# Convert numeric to factors with labels for sex variable
data$sex <- factor(data$sex,levels = c('1','0'),labels=c('Male','Female'))
#Initial view of the data
head(data)
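The recoding step can be checked on a small example. The sketch below uses a made-up vector of codes (`sex_codes` is a hypothetical name, not a column from bdims.csv) to show how `factor()` pairs each level with its label and how `table()` confirms the result.

```r
# Toy vector standing in for the numeric sex column (values are illustrative)
sex_codes <- c(1, 0, 0, 1, 0)

# Map 1 to 'Male' and 0 to 'Female'; labels are matched to levels by position
sex_factor <- factor(sex_codes, levels = c(1, 0), labels = c("Male", "Female"))

# Count the observations in each category
print(table(sex_factor))
```

Running `table(data$sex)` on the imported dataset in the same way would be expected to show the 247 men and 260 women described in the problem statement.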
Summary Statistics
# Group the data by sex column and calculate descriptive statistics
data %>% group_by(sex) %>% summarise(mean = mean(che.di,na.rm = T) %>% round(2),
median = median(che.di,na.rm=T),
s.d. = sd(che.di,na.rm=T) %>% round(2),
first_quartile = quantile(che.di,0.25,na.rm=T),
third_quartile = quantile(che.di,0.75,na.rm=T),
IQR = IQR(che.di,na.rm=T),
min = min(che.di,na.rm=T),
max = max(che.di,na.rm=T))
The summary statistics above describe the chest diameter measurements of males and females separately. On average, male chest diameter is 29.95 cm, whereas female chest diameter is 26.10 cm. The female measurements also tend to lie closer to their mean than the male measurements do.
Distribution Fitting
# Subset data for male and females separately
male <- subset(data$che.di,data$sex=="Male")
female <- subset(data$che.di,data$sex=="Female")
#Plot histogram with normal distribution curve
hist(male,breaks = 15,xlim = c(20,40),probability = TRUE,col="dodgerblue3",xlab="Chest diameter (cm)", ylab = "Density",main="Histogram of Male Chest Diameter (in cm)")
# Add a density curve
lines(density(male), col = "lightblue4", lwd = 2)
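# Simulate a normal sample with the same size, mean and standard deviation as the data
male_norm <- rnorm(length(male),mean(male),sd(male))
# Overlay the smoothed density of the simulated sample as an approximate normal curve
lines(density(male_norm,adjust = 2), col = "red", lwd = 2)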

#Plot histogram with normal distribution curve
hist(female,breaks = 15,xlim=c(20,35),probability = TRUE,col="palegoldenrod",xlab="Chest diameter (cm)", ylab = "Density",main="Histogram of Female Chest Diameter (in cm)")
#Add a density curve
lines(density(female), col = "red", lwd = 2)
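# Simulate a normal sample with the same size, mean and standard deviation as the data
female_norm <- rnorm(length(female),mean(female),sd(female))
# Overlay the smoothed density of the simulated sample as an approximate normal curve
lines(density(female_norm,adjust = 2), col = "midnightblue", lwd = 2)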
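The normal curves above are drawn from random draws, so their shape changes slightly on every run. As an alternative sketch (not the method used in this report), the theoretical normal density can be evaluated directly with `dnorm()`. The data here are simulated stand-ins using the male mean and standard deviation reported in this document, and the names `chest`, `grid` and `normal_density` are illustrative only.

```r
# Simulated stand-in for the male chest diameters (not the bdims.csv values);
# the mean and standard deviation are the ones reported in this report
set.seed(42)
chest <- rnorm(247, mean = 29.95, sd = 2.08)

# Histogram on the density scale so it is comparable with a density curve
hist(chest, breaks = 15, probability = TRUE, col = "dodgerblue3",
     xlab = "Chest diameter (cm)", ylab = "Density",
     main = "Histogram with exact normal curve")

# Evaluate the normal density on a grid using the sample mean and sd
grid <- seq(min(chest), max(chest), length.out = 200)
normal_density <- dnorm(grid, mean = mean(chest), sd = sd(chest))

# Overlay the theoretical curve; it is identical on every run for the same data
lines(grid, normal_density, col = "red", lwd = 2)
```

With the real data, `chest` would simply be replaced by the `male` or `female` vector created earlier.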

Interpretation
A normal distribution has a bell-shaped density curve described by its mean and standard deviation. The density curve is symmetrical, centred about its mean, with its spread determined by its standard deviation. To speak specifically of any normal distribution, two quantities have to be specified: the mean, where the peak of the density occurs, and the standard deviation, which indicates the spread or girth of the bell curve.
The empirical rule for a normal density curve states that about 68% of the observations fall within 1 standard deviation of the mean, about 95% within 2 standard deviations and about 99.7% within 3 standard deviations. The male chest diameter is consistent with these theoretical attributes: most of its values cluster around the mean of 29.95 cm, which equals the median and so indicates symmetry, with a standard deviation of 2.08 cm. The female chest diameter, however, has data points extending towards the right, so its mean (26.10 cm) is greater than its median (25.9 cm).
Moreover, the histograms above, with their normal curve overlays, suggest that male chest diameter fits a normal distribution, whereas female chest diameter is skewed to the right.
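The empirical rule can also be checked numerically by counting how many observations fall within 1, 2 and 3 standard deviations of the mean. The following is a minimal sketch on simulated data; `within_k_sd` is a hypothetical helper written for illustration, not a function from any package, and the sample is not taken from bdims.csv.

```r
# Proportion of values lying within k standard deviations of the mean
within_k_sd <- function(x, k) {
  mean(abs(x - mean(x)) <= k * sd(x))
}

# Simulated stand-in data (not the bdims.csv measurements)
set.seed(1)
x <- rnorm(10000, mean = 29.95, sd = 2.08)

# Compare these proportions with the theoretical 68%, 95% and 99.7%
coverage <- sapply(1:3, function(k) within_k_sd(x, k))
names(coverage) <- c("1 sd", "2 sd", "3 sd")
print(round(coverage, 3))
```

On the report's data, calls such as `within_k_sd(male, 1)` and `within_k_sd(female, 1)` would give the corresponding proportions for each sex.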
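A graphical judgement of this kind can be supplemented with a normal Q-Q plot and a Shapiro-Wilk test, both available in base R. The sketch below is not part of the original analysis; it uses two simulated samples (`symmetric` and `skewed` are illustrative names with made-up parameters) to show how a symmetric and a right-skewed variable behave under these checks.

```r
# Simulated stand-ins (not the bdims.csv values): one symmetric, one right-skewed
set.seed(7)
symmetric <- rnorm(250, mean = 30, sd = 2)
skewed <- 24 + rexp(250, rate = 0.5)

# Q-Q plots: points close to the reference line suggest approximate normality
qqnorm(symmetric, main = "Q-Q plot: symmetric sample")
qqline(symmetric)
qqnorm(skewed, main = "Q-Q plot: right-skewed sample")
qqline(skewed)

# Shapiro-Wilk test: a small p-value is evidence against normality
p_symmetric <- shapiro.test(symmetric)$p.value
p_skewed <- shapiro.test(skewed)$p.value
print(c(symmetric = p_symmetric, skewed = p_skewed))

# A mean above the median is a simple indication of right skew
print(mean(skewed) > median(skewed))
```

Applied to the report's data, the same calls would take the `male` and `female` vectors in place of the simulated samples.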