Problem Statement

The purpose of this report is to assess whether biiliac diameter follows a normal distribution in males and in females. Biiliac diameter is the variable under study, and the analysis is carried out separately for each gender.

Load Packages

library(ggplot2)
library(magrittr)
library(knitr)
library(readr) # Useful for importing data
library(dplyr)

Data

The bdims.csv file is imported into R. The factor() function converts the numeric gender codes (0 and 1) into the labels "Female" and "Male". Two data frames are then created, one for each gender group.

bdims <- read_csv("bdims.csv.csv")
## Parsed with column specification:
## cols(
##   .default = col_double()
## )
## See spec(...) for full column specifications.
bdims$sex <- factor(bdims$sex, levels = c(0, 1), labels = c("Female", "Male"))
#Creating data frames according to gender
bdims.male <- bdims %>% filter(sex == "Male")
bdims.female <- bdims %>% filter(sex == "Female")
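
As a small illustration of the recoding step, the sketch below applies factor() to a toy vector of 0/1 codes and then splits a toy data frame by the resulting labels. The vector `codes` and the data frame `toy` are made-up stand-ins, not values from bdims, and base R subset() is used as an equivalent of the filter() calls above so that the example needs no packages.

```r
# Toy 0/1 codes standing in for the bdims `sex` column (illustrative values only)
codes <- c(0, 1, 1, 0)

# Same recoding as in the report: 0 -> "Female", 1 -> "Male"
sex <- factor(codes, levels = c(0, 1), labels = c("Female", "Male"))
print(sex)

# Toy data frame with made-up diameters, split by the new labels
toy <- data.frame(sex = sex, bii.di = c(27.1, 28.4, 29.0, 26.5))
toy.male   <- subset(toy, sex == "Male")
toy.female <- subset(toy, sex == "Female")
print(nrow(toy.male))
print(nrow(toy.female))
```

Passing `levels` and `labels` together fixes the mapping explicitly, so it does not depend on the order in which the codes happen to appear in the data.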

Summary Statistics

Descriptive statistics of biiliac diameter are calculated for each gender.

#Quartile, Minimum, Maximum and Mean values of biiliac diameter in female
summary(bdims.female$bii.di)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   18.70   26.20   27.80   27.58   29.20   33.30
#Quartile, Minimum, Maximum and Mean values of biiliac diameter in male
summary(bdims.male$bii.di)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   19.40   26.80   28.00   28.09   29.45   34.70
#Interquartile range of biiliac diameter in female
IQR(bdims.female$bii.di)
## [1] 3
#Interquartile range of biiliac diameter in male
IQR(bdims.male$bii.di)
## [1] 2.65
#Standard deviation in biiliac diameter in female
sd(bdims.female$bii.di)
## [1] 2.307476
#Standard deviation in biiliac diameter in male
sd(bdims.male$bii.di)
## [1] 2.067098
#Variance in biiliac diameter in female
var(bdims.female$bii.di)
## [1] 5.324446
#Variance in biiliac diameter in male
var(bdims.male$bii.di)
## [1] 4.272895
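
The separate calls above could also be collected into one table. The following is a minimal base R sketch of that idea. The helper `describe_by_group` and the data frame `toy` are hypothetical names introduced for illustration, and the toy values are made up rather than taken from bdims.

```r
# Hypothetical helper (not part of the original report): descriptive
# statistics of a numeric vector `x` within each level of the grouping `g`.
describe_by_group <- function(x, g) {
  sapply(split(x, g), function(v) {
    c(
      n      = length(v),
      min    = min(v),
      q1     = unname(quantile(v, 0.25)),
      median = median(v),
      mean   = mean(v),
      q3     = unname(quantile(v, 0.75)),
      max    = max(v),
      iqr    = IQR(v),
      sd     = sd(v),
      var    = var(v)
    )
  })
}

# Made-up values for illustration only; these are NOT the bdims measurements
toy <- data.frame(
  sex    = factor(rep(c("Female", "Male"), each = 4)),
  bii.di = c(26, 27, 28, 29, 27, 28, 29, 30)
)
toy_stats <- describe_by_group(toy$bii.di, toy$sex)
print(round(toy_stats, 2))
```

With the report's data, the equivalent call would be describe_by_group(bdims$bii.di, bdims$sex), which returns one column per gender and one row per statistic.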

Distribution Fitting

A histogram and an empirical distribution plot are drawn for both males and females. In each histogram, the blue line marks the mean biiliac diameter and the solid red curve shows the normal density with the sample mean and standard deviation. These plots are used to judge visually how well biiliac diameter fits a normal distribution.

# Histogram for biiliac diameter in female
his_female <- ggplot(bdims.female, aes(x = bii.di)) +
  geom_histogram(aes(y = ..density..), binwidth = 1, color = "black", fill = "yellow") +
  # Blue vertical line: sample mean
  geom_vline(xintercept = mean(bdims.female$bii.di), color = "blue", linetype = "solid") +
  # Red curve: normal density with the sample mean and standard deviation
  stat_function(fun = dnorm, color = "red",
                args = list(mean = mean(bdims.female$bii.di), sd = sd(bdims.female$bii.di))) +
  ggtitle("Distribution of biiliac diameter in female") +
  xlab("Biiliac diameter (cm)") +
  ylab("Density")
his_female

# Histogram for biiliac diameter in male
his_male <- ggplot(bdims.male, aes(x = bii.di)) +
  geom_histogram(aes(y = ..density..), binwidth = 1, color = "black", fill = "yellow") +
  # Blue vertical line: sample mean
  geom_vline(xintercept = mean(bdims.male$bii.di), color = "blue", linetype = "solid") +
  # Red curve: normal density with the sample mean and standard deviation
  stat_function(fun = dnorm, color = "red",
                args = list(mean = mean(bdims.male$bii.di), sd = sd(bdims.male$bii.di))) +
  ggtitle("Distribution of biiliac diameter in male") +
  xlab("Biiliac diameter (cm)") +
  ylab("Density")
his_male

#Empirical distribution of female
# The y-axis of an ECDF is a cumulative proportion, not a density
plot(ecdf(bdims.female$bii.di),
     main = "Empirical distribution of biiliac diameter in female",
     xlab = "Biiliac diameter (cm)", ylab = "Cumulative proportion", col = "blue")

#Empirical distribution of male
# The y-axis of an ECDF is a cumulative proportion, not a density
plot(ecdf(bdims.male$bii.di),
     main = "Empirical distribution of biiliac diameter in male",
     xlab = "Biiliac diameter (cm)", ylab = "Cumulative proportion", col = "blue")
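
One way to make the empirical distribution plots easier to read against the normal model is to overlay the normal cumulative distribution function with the sample mean and standard deviation. The sketch below does this on simulated data. The helper `ecdf_normal_gap` and the sample `sim` are hypothetical and not part of the original analysis, and the simulated values are not the bdims measurements.

```r
# Hypothetical helper: largest vertical gap, at the observed values, between
# the ECDF of `x` and a normal CDF with the sample mean and standard deviation.
ecdf_normal_gap <- function(x) {
  fitted <- pnorm(x, mean = mean(x), sd = sd(x))
  max(abs(ecdf(x)(x) - fitted))
}

# Simulated stand-in sample (NOT the bdims measurements)
set.seed(1)
sim <- rnorm(200, mean = 28, sd = 2)

# ECDF in blue with the fitted normal CDF overlaid in red
plot(ecdf(sim), main = "ECDF with fitted normal CDF (simulated data)",
     xlab = "Simulated diameter (cm)", ylab = "Cumulative proportion", col = "blue")
curve(pnorm(x, mean = mean(sim), sd = sd(sim)), add = TRUE, col = "red", lwd = 2)

gap <- ecdf_normal_gap(sim)
print(gap)
```

For the report's data, `sim` would be replaced by bdims.female$bii.di or bdims.male$bii.di. A small gap means the blue steps stay close to the red curve across the whole range.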

Interpretation

The mean biiliac diameters of males (28.09 cm) and females (27.58 cm) are similar, with males slightly higher. The histogram for females is left skewed, while the histogram for males is roughly symmetric. The red curve in each histogram is the normal density with the sample mean and standard deviation, and in both histograms most of the yellow bars follow this bell curve closely. This suggests that the biiliac diameter of both males and females fits a normal distribution reasonably well. Moreover, the blue curves in the empirical distribution plots rise from 0 to 1, as any cumulative distribution function does, and their shape appears consistent with the cumulative distribution function of a normal distribution (one with the sample mean and standard deviation, rather than the standard normal). In a nutshell, the biiliac diameter of both males and females approximately fits the normal distribution.
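
A normal Q-Q plot is one complementary visual check that could support this kind of interpretation. It was not used in the original report, so the following is only a sketch on simulated data. The sample `sim` is a hypothetical stand-in and not the bdims measurements.

```r
# Simulated stand-in sample (NOT the bdims measurements)
set.seed(2)
sim <- rnorm(200, mean = 28, sd = 2)

# Normal Q-Q plot: sample quantiles against theoretical normal quantiles
qq <- qqnorm(sim, main = "Normal Q-Q plot (simulated data)")
qqline(sim, col = "red")  # reference line through the first and third quartiles

# Correlation between theoretical and sample quantiles; values near 1
# indicate that the points lie close to a straight line
qq_cor <- cor(qq$x, qq$y)
print(qq_cor)
```

For the report's data, the same calls would take bdims.female$bii.di or bdims.male$bii.di in place of `sim`. Points bending away from the red line in the lower tail would correspond to the left skew described for the female histogram.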