The dataset (bdims.csv) provides body girth measurements and skeletal diameter measurements, together with age, weight, height, and gender, for 507 physically active individuals: 247 men and 260 women.
This analysis uses the variable “bia.di”, the respondent’s biacromial diameter in centimeters (the width of the shoulders), to examine whether the measurement follows a normal distribution.
Load packages
library(readr)
library(readxl)
library(tidyr)
library(dplyr)
library(ggformula)
library(mosaic)
library(car)
library(Hmisc)
library(outliers)
##Import Data
a<-read_excel("bdims.csv (1).xlsx")
a
## # A tibble: 507 x 25
## bia.di bii.di bit.di che.de che.di elb.di wri.di kne.di ank.di sho.gi
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 42.9 26 31.5 17.7 28 13.1 10.4 18.8 14.1 106.
## 2 43.7 28.5 33.5 16.9 30.8 14 11.8 20.6 15.1 110.
## 3 40.1 28.2 33.3 20.9 31.7 13.9 10.9 19.7 14.1 115.
## 4 44.3 29.9 34 18.4 28.2 13.9 11.2 20.9 15 104.
## 5 42.5 29.9 34 21.5 29.4 15.2 11.6 20.7 14.9 108.
## 6 43.3 27 31.5 19.6 31.3 14 11.5 18.8 13.9 120.
## 7 43.5 30 34 21.9 31.7 16.1 12.5 20.8 15.6 124.
## 8 44.4 29.8 33.2 21.8 28.8 15.1 11.9 21 14.6 120.
## 9 43.5 26.5 32.1 15.5 27.5 14.1 11.2 18.9 13.2 111
## 10 42 28 34 22.5 28 15.6 12 21.1 15 120.
## # ... with 497 more rows, and 15 more variables: che.gi <dbl>,
## # wai.gi <dbl>, nav.gi <dbl>, hip.gi <dbl>, thi.gi <dbl>, bic.gi <dbl>,
## # for.gi <dbl>, kne.gi <dbl>, cal.gi <dbl>, ank.gi <dbl>, wri.gi <dbl>,
## # age <dbl>, wgt <dbl>, hgt <dbl>, sex <dbl>
##Create Subset
Create a subset by selecting one measurement (“bia.di”, the respondent’s biacromial diameter in centimeters) and sex.
ab<- a[,c(1,25)]
ab
## # A tibble: 507 x 2
## bia.di sex
## <dbl> <dbl>
## 1 42.9 1
## 2 43.7 1
## 3 40.1 1
## 4 44.3 1
## 5 42.5 1
## 6 43.3 1
## 7 43.5 1
## 8 44.4 1
## 9 43.5 1
## 10 42 1
## # ... with 497 more rows
Tidy the data by labeling the sex codes: 1 as Male and 0 as Female.
ab$sex <- factor(ab$sex, levels = c(1,0), labels = c("Male","Female"))
ab
## # A tibble: 507 x 2
## bia.di sex
## <dbl> <fct>
## 1 42.9 Male
## 2 43.7 Male
## 3 40.1 Male
## 4 44.3 Male
## 5 42.5 Male
## 6 43.3 Male
## 7 43.5 Male
## 8 44.4 Male
## 9 43.5 Male
## 10 42 Male
## # ... with 497 more rows
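The `levels`/`labels` pairing in `factor()` is positional: the first level (1) receives the first label ("Male") and the second level (0) receives the second ("Female"). A minimal sketch on a hypothetical toy vector (the names `codes` and `sex_labeled` are illustrative and not part of the bdims data) shows the mapping:

```r
# Hypothetical sex codes, used only to illustrate the recoding
codes <- c(1, 0, 0, 1)

# Level 1 is paired with "Male", level 0 with "Female"
sex_labeled <- factor(codes, levels = c(1, 0), labels = c("Male", "Female"))

print(sex_labeled)
print(table(sex_labeled))
```

Listing the levels as `c(1, 0)` also fixes the display order, so Male appears before Female in tables and summaries.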
Create a subset of the male respondents with their bia.di values.
m <- subset(ab,subset=sex =="Male")
m
## # A tibble: 247 x 2
## bia.di sex
## <dbl> <fct>
## 1 42.9 Male
## 2 43.7 Male
## 3 40.1 Male
## 4 44.3 Male
## 5 42.5 Male
## 6 43.3 Male
## 7 43.5 Male
## 8 44.4 Male
## 9 43.5 Male
## 10 42 Male
## # ... with 237 more rows
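Check whether the male measurements fit a normal distribution using a normal Q-Q plot. The two numbers returned by qqPlot() are the row indices of the most extreme points it labels on the plot.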
male<-qqPlot(m$bia.di, dist="norm")
male
## [1] 78 13
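To make clearer what a normal Q-Q plot compares, the sketch below builds one by hand in base R. It is only an illustration under stated assumptions: the data are simulated with `rnorm()` as a stand-in for the real measurements, the names `sim`, `theo`, `samp`, and `qq_cor` are hypothetical, and the reference line drawn by `qqline()` is the base R one, not the confidence envelope that `car::qqPlot()` adds.

```r
set.seed(42)
# Simulated stand-in values, NOT the bdims measurements
sim <- rnorm(247, mean = 41, sd = 2)

n    <- length(sim)
theo <- qnorm(ppoints(n))  # theoretical standard-normal quantiles
samp <- sort(sim)          # ordered sample values

# Points close to a straight line suggest approximate normality
plot(theo, samp,
     xlab = "Theoretical normal quantiles",
     ylab = "Sample quantiles",
     main = "Normal Q-Q plot (simulated data)")
qqline(sim, col = "red")   # reference line through the quartiles

# Correlation between the two sets of quantiles; values near 1
# indicate a nearly linear Q-Q pattern
qq_cor <- cor(theo, samp)
print(qq_cor)
```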
Create a subset of the female respondents with their bia.di values.
f <- subset(ab,subset=sex =="Female")
f
## # A tibble: 260 x 2
## bia.di sex
## <dbl> <fct>
## 1 37.6 Female
## 2 36.7 Female
## 3 34.8 Female
## 4 36.6 Female
## 5 35.5 Female
## 6 37 Female
## 7 35.5 Female
## 8 37.4 Female
## 9 37.8 Female
## 10 38.6 Female
## # ... with 250 more rows
Check whether the female measurements fit a normal distribution using a normal Q-Q plot.
female<-qqPlot(f$bia.di, dist="norm")
female
## [1] 24 233
Summary statistics for bia.di by sex
favstats(~bia.di | sex,data =ab)
## sex min Q1 median Q3 max mean sd n missing
## 1 Male 34.1 40.000 41.2 42.6 47.4 41.24130 2.087164 247 0
## 2 Female 32.4 35.175 36.4 37.8 42.6 36.50308 1.779221 260 0
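For readers without the mosaic package, a similar per-group summary can be sketched in base R. The example is illustrative only: it uses the first ten male and first ten female bia.di values printed in the tibbles above rather than all 507 rows, so its results differ from the favstats() output, and the names `toy`, `groups`, `group_mean`, `group_sd`, and `group_n` are hypothetical.

```r
# First ten male and first ten female bia.di values from the printed
# tibbles (a small illustrative sample, not the full dataset)
toy <- data.frame(
  bia.di = c(42.9, 43.7, 40.1, 44.3, 42.5, 43.3, 43.5, 44.4, 43.5, 42.0,
             37.6, 36.7, 34.8, 36.6, 35.5, 37.0, 35.5, 37.4, 37.8, 38.6),
  sex = factor(rep(c("Male", "Female"), each = 10),
               levels = c("Male", "Female"))
)

# Split the measurement by sex, then summarise each group
groups     <- split(toy$bia.di, toy$sex)
group_mean <- sapply(groups, mean)
group_sd   <- sapply(groups, sd)
group_n    <- sapply(groups, length)

print(cbind(mean = group_mean, sd = group_sd, n = group_n))
```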
##Empirical Distribution of Body Measurement
#Distribution Calculation for male
M=mean(m$bia.di)
S=sd(m$bia.di)
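Plot the histogram of the male measurements and overlay the normal curve
# Histogram on the density scale so the normal curve is comparable
hist(m$bia.di, breaks = 20, prob = TRUE,
     xlab = "Biacromial diameter (cm)", ylim = c(0, 0.4), col = "lightblue",
     main = "Normal Curve Histogram")
# Normal density with the sample mean and SD, evaluated over the data range
x <- seq(min(m$bia.di), max(m$bia.di), 0.3)
y <- dnorm(x, M, S)
lines(x, y, col = "orange", lwd = 3)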
#Distribution Calculation for female
Me=mean(f$bia.di)
Sd=sd(f$bia.di)
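Plot the histogram of the female measurements and overlay the normal curve
# Histogram on the density scale so the normal curve is comparable
hist(f$bia.di, breaks = 20, prob = TRUE,
     xlab = "Biacromial diameter (cm)", ylim = c(0, 0.4), col = "orange",
     main = "Normal Curve Histogram")
# Normal density with the sample mean and SD, evaluated over the data range
xx <- seq(min(f$bia.di), max(f$bia.di), 0.3)
yy <- dnorm(xx, Me, Sd)
lines(xx, yy, col = "brown", lwd = 3)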
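Because the male and female plots repeat the same steps, they could be wrapped in one small helper. The function below is a sketch, not part of the original analysis: the name `plot_normal_fit` and its arguments are hypothetical, and it uses a finer evaluation grid than the step of 0.3 used above.

```r
# Hypothetical helper: histogram on the density scale with a normal
# curve based on the sample mean and standard deviation
plot_normal_fit <- function(x, col = "lightblue", curve_col = "orange",
                            xlab = "x", main = "Normal Curve Histogram",
                            ylim = c(0, 0.4)) {
  mu    <- mean(x)
  sigma <- sd(x)
  hist(x, breaks = 20, prob = TRUE, xlab = xlab, ylim = ylim,
       col = col, main = main)
  # Evaluate the fitted normal density across the observed range
  grid <- seq(min(x), max(x), length.out = 200)
  lines(grid, dnorm(grid, mean = mu, sd = sigma), col = curve_col, lwd = 3)
  # Return the fitted parameters without printing them
  invisible(list(mean = mu, sd = sigma))
}
```

Assuming the subsets `m` and `f` created earlier, a call such as `plot_normal_fit(m$bia.di, xlab = "Biacromial diameter (cm)")` would draw the male plot, and `plot_normal_fit(f$bia.di, col = "orange", curve_col = "brown")` the female one.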
From the histograms above, we can conclude that the variable “bia.di” (shoulder width) is reasonably well described by a normal distribution: for both males and females, the fitted normal curve closely follows the empirical histogram.
Comparing the two groups, males have a higher mean (41.24130) than females (36.50308). The standard deviation is 2.087164 for males and 1.779221 for females. The minimum and maximum values are 34.1 and 47.4 for males, and 32.4 and 42.6 for females.
Considering these parameters, we can conclude that males tend to have wider shoulders than females.
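The comparison can be made concrete with a short calculation from the summary statistics reported above. This is a sketch that takes the fitted normal model at face value; the variable names are hypothetical, and the probability is a model-based estimate, not a count from the data.

```r
# Summary statistics copied from the favstats() output
male_mean   <- 41.24130
female_mean <- 36.50308
female_sd   <- 1.779221

# Difference between the group means, in centimeters
diff_means <- male_mean - female_mean
print(diff_means)

# Under a normal model for females, the estimated share of women whose
# biacromial diameter exceeds the male mean
p_female_above_male_mean <- pnorm(male_mean, mean = female_mean,
                                  sd = female_sd, lower.tail = FALSE)
print(p_female_above_male_mean)
```

A small probability here would support the conclusion that the two distributions are clearly separated, although they still overlap.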
————————————————————————-END——————————————————————————-