dilemma.df<-read.csv("Data - Deans Dilemma.csv")
Use R to create a table showing the average salary of males and females who were placed. Review whether there is a gender gap in the data. In other words, observe whether the average salary of males is higher than the average salary of females in this dataset.
Answer 1.
placed.df<-dilemma.df[which(dilemma.df$Placement_B==1),]
aggregate(placed.df$Salary~placed.df$Gender, FUN = mean)
## placed.df$Gender placed.df$Salary
## 1 F 253068.0
## 2 M 284241.9
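The same kind of table can be produced by naming the columns in the formula and passing the data frame through `data`, which gives cleaner column headers. This is a minimal sketch on a small made-up data frame: `toy.df`, `toy.placed`, `avg.salary` and all the values in them are hypothetical and are not taken from the dataset.

```r
# Toy data standing in for dilemma.df; values are invented for illustration only
toy.df <- data.frame(
  Gender      = c("F", "F", "M", "M", "M", "F"),
  Placement_B = c(1, 1, 1, 1, 0, 0),
  Salary      = c(200000, 300000, 250000, 350000, 0, 0)
)

# Keep only the placed students (logical indexing, no which() needed here)
toy.placed <- toy.df[toy.df$Placement_B == 1, ]

# Mean salary by gender; column names in the formula give tidy headers
avg.salary <- aggregate(Salary ~ Gender, data = toy.placed, FUN = mean)
print(avg.salary)
```

With this form the result columns are named `Gender` and `Salary` rather than `placed.df$Gender` and `placed.df$Salary`.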
Answer 2.
The average salary of male MBAs who were placed is 284241.9.
Answer 3.
The average salary of female MBAs who were placed is 253068.0.
Use R to run a t-test to test the following hypothesis: H1: The average salary of the male MBAs is higher than the average salary of female MBAs.
Answer 4.
t.test(placed.df$Salary~placed.df$Gender,data = placed.df)
##
## Welch Two Sample t-test
##
## data: placed.df$Salary by placed.df$Gender
## t = -3.0757, df = 243.03, p-value = 0.00234
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -51138.42 -11209.22
## sample estimates:
## mean in group F mean in group M
## 253068.0 284241.9
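The call above runs a two-sided test, while H1 is directional. A one-sided version could be sketched as follows, assuming the factor levels are ordered F then M, so that `alternative = "less"` tests whether the female mean minus the male mean is below zero. The data frame `toy.placed` and its salaries are invented for illustration and are not the assignment data.

```r
# Toy salaries, invented for illustration only
toy.placed <- data.frame(
  Gender = factor(rep(c("F", "M"), each = 5)),
  Salary = c(200000, 220000, 240000, 260000, 280000,
             250000, 270000, 290000, 310000, 330000)
)

# Two-sided Welch test, the default
two.sided <- t.test(Salary ~ Gender, data = toy.placed)

# One-sided Welch test. Levels are ordered F, M, so "less" means
# mean(F) - mean(M) < 0, i.e. H1: males earn more on average
one.sided <- t.test(Salary ~ Gender, data = toy.placed, alternative = "less")

# The htest object stores the p-value, so it can be read off directly
print(two.sided$p.value)
print(one.sided$p.value)
```

When the sample difference lies in the direction of H1, the one-sided p-value is half the two-sided one.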
Answer 5.
The two-sided p-value reported by the t-test is 0.00234.
Answer 6.
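The p-value of 0.00234 is below 0.05, so we reject the null hypothesis that the mean salaries of placed male and female MBAs are equal. The test reported above is two-sided. Because the sample difference (female mean minus male mean) is negative, which is the direction stated in H1, the one-sided p-value is half of the two-sided value, about 0.00117, and it also leads to rejection. The data therefore support H1: among placed MBAs, the average salary of males is higher than the average salary of females.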
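As a quick check on the reported numbers, the p-values can be recomputed from the t statistic and degrees of freedom printed in the output, using the t distribution function `pt()`. The variable names below are only illustrative.

```r
# Values copied from the Welch t-test output above
t.stat <- -3.0757
df     <- 243.03

# Lower-tail probability: one-sided p-value for H1 mean(F) < mean(M)
p.one.sided <- pt(t.stat, df = df)

# Two-sided p-value doubles the tail area beyond |t|
p.two.sided <- 2 * pt(-abs(t.stat), df = df)

print(p.two.sided)  # should agree with the reported 0.00234 up to rounding
print(p.one.sided)
```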