Is there a gender gap when it comes to salary?

setwd("~/Downloads/Prof Sameer Mathur")
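# stringsAsFactors = TRUE keeps Gender a factor so that summary() below returns counts
# (R >= 4.0 reads strings as character by default)
d_dilemma <- read.csv("Data - Deans Dilemma.csv", stringsAsFactors = TRUE)
# Keep only placed students (Placement_B == 1) and columns 1-23 and 26
placed <- d_dilemma[d_dilemma$Placement_B == 1, c(1:23, 26)]
# Mean salary by gender
sal_genderwise <- aggregate(placed$Salary, list(placed$Gender), mean)
colnames(sal_genderwise) <- c("Gender", "Average salary")
sal_genderwise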
##   Gender Average salary
## 1      F       253068.0
## 2      M       284241.9
summary(placed$Gender)
##   F   M 
##  97 215

The average salary of placed men (284,241.9) is about 12% higher than that of placed women (253,068.0). The placement counts show a gap as well: 97 women were placed against 215 men, so women make up under a third of the placed students, which suggests that fewer women than men are entering the workplace from this cohort.
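
The 12% figure follows directly from the two group means printed above. A minimal sketch of the arithmetic, with the means and counts typed in from the output (the variable names are illustrative and not part of the original script):

```r
# Group means and counts copied from the output above
mean_f <- 253068.0
mean_m <- 284241.9
n_f <- 97
n_m <- 215

# Relative gap: how much higher the male mean is, as a percentage of the female mean
gap_pct <- (mean_m - mean_f) / mean_f * 100

# Share of women among the placed students, in percent
share_f_pct <- n_f / (n_f + n_m) * 100

round(gap_pct, 1)
round(share_f_pct, 1)
```

Note that the gap is expressed relative to the female mean here; dividing by the male mean instead would give a slightly smaller percentage.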

Boxplot and t-test

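# Store the log salary as a column instead of attach()-ing the data frame
placed$log.transformed.salary <- log(placed$Salary)
sal_mean_gender <- aggregate(Salary ~ Gender, data = placed, FUN = mean)
options(scipen = 999)  # print salaries without scientific notation
boxplot(Salary ~ Gender, data = placed, main = "Gender-wise salaries", xlab = "Gender", ylab = "salary", col = c("green", "blue"), log = "y")

# Welch two-sample t-test (unequal variances, the t.test default) on log salaries.
# The original call passed mean.equal = TRUE, which is not a t.test argument and was ignored.
t.test(log.transformed.salary ~ Gender, data = placed)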
## 
##  Welch Two Sample t-test
## 
## data:  log.transformed.salary by Gender
## t = -2.9732, df = 212.24, p-value = 0.003287
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.1711001 -0.0346748
## sample estimates:
## mean in group F mean in group M 
##        12.40435        12.50723

Interpretations:

1. The average salary of men (284241) is around 12% higher than that of women (253068).

2. The boxplot shows that the median male and female salaries are close to each other, but the male box has many outliers in the higher salary range.

3. The Welch t-test on the log-transformed salaries gives a p-value of 0.003287, which is below the usual 0.05 threshold. We therefore reject the null hypothesis that the mean log salaries of men and women are equal. Since the group mean for men (12.50723) is higher than that for women (12.40435), the test indicates that men's salaries are significantly higher on average.

4. The boxplot gives a similar picture: the men have many outliers above the upper whisker, which suggests that the top salaries are skewed in favor of men.
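
Because the t-test was run on log salaries, its estimates and confidence interval are on the log scale. One way to read them in salary terms is to exponentiate: a mean of logs becomes a geometric mean, and a difference in log means becomes a ratio of geometric means. The sketch below does this with the numbers typed in from the t-test output; the variable names are illustrative and not part of the original script.

```r
# Values copied from the Welch t-test output above (log scale)
log_mean_f <- 12.40435
log_mean_m <- 12.50723
ci_log <- c(-0.1711001, -0.0346748)  # 95% CI for mean(log F) - mean(log M)

# Exponentiating a mean of logs gives a geometric mean
geo_mean_f <- exp(log_mean_f)
geo_mean_m <- exp(log_mean_m)

# Exponentiating a difference of log means gives a ratio of geometric means (F / M)
ratio_fm <- exp(log_mean_f - log_mean_m)
ratio_ci <- exp(ci_log)

round(c(F = geo_mean_f, M = geo_mean_m))
round(c(estimate = ratio_fm, lower = ratio_ci[1], upper = ratio_ci[2]), 3)
```

A ratio interval that lies entirely below 1 is the salary-scale counterpart of the log-scale interval lying entirely below 0.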

```