Synopsis

From among the applicants seeking admission to their MBA programs, B-schools had to identify the students who could successfully complete the course and eventually be placed. Selecting such students was a critical decision, so it became imperative to identify the factors that differentiate two categories of students: those who could be placed easily and those who would struggle to get placed. Each year, the admissions committee of every B-school faced the tough task of screening applicants and selecting those who would eventually succeed in the MBA program. MBA admissions called for more analytical reasoning that took multiple criteria into consideration. The admissions team wanted to understand whether a student's academic record had any bearing on placement status.

Exercise

1. Use R to create a table showing the average salary of males and females who were placed. Review whether there is a gender gap in the data. In other words, observe whether the average salary of males is higher than the average salary of females in this dataset.

# Read the data and keep only the students who were placed
dilemma.df <- read.csv("Data - Deans Dilemma.csv")
placed.df <- dilemma.df[which(dilemma.df$Placement == "Placed"), ]
# Average salary by gender
aggregate(Salary ~ Gender, data = placed.df, FUN = mean)
##   Gender   Salary
## 1      F 253068.0
## 2      M 284241.9

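In this dataset, the average salary of placed males (Rs. 284241.9) is higher than that of placed females (Rs. 253068.0), a difference of about Rs. 31174, so the sample shows a gender gap. Whether this difference is statistically significant is tested in question 4.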
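The size of the gap can be computed directly from the table of means. The following is a minimal sketch that assumes a small made-up data frame (`toy.df` and all of its values are hypothetical, not taken from the dataset) so that it runs without the CSV file; the variable names `means`, `counts`, `gap` and `gap.pct` are also illustrative.

```r
# Hypothetical toy data standing in for placed.df; values are illustrative only.
toy.df <- data.frame(
  Gender = c("F", "F", "F", "M", "M", "M"),
  Salary = c(200000, 250000, 300000, 240000, 280000, 320000)
)

# Mean salary per gender, using the same aggregate() call as in the report.
means <- aggregate(Salary ~ Gender, data = toy.df, FUN = mean)

# Number of students behind each mean.
counts <- aggregate(Salary ~ Gender, data = toy.df, FUN = length)

# Absolute gap (male minus female) and the gap as a share of the female mean.
female.mean <- means$Salary[means$Gender == "F"]
male.mean <- means$Salary[means$Gender == "M"]
gap <- male.mean - female.mean
gap.pct <- 100 * gap / female.mean

print(means)
print(counts)
cat("Gap (M - F):", gap, "\n")
cat("Gap as % of female mean:", round(gap.pct, 1), "\n")
```

Applied to the figures in the table above, the same subtraction gives 284241.9 - 253068.0 = 31173.9, or roughly 12.3% of the female average.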

2. What is the average salary of male MBAs who were placed?

From the table above, the average salary of male MBAs who were placed is Rs. 284241.9.

3. What is the average salary of female MBAs who were placed?

From the table above, the average salary of female MBAs who were placed is Rs. 253068.0.

4. Use R to run a t-test to test the following hypothesis:

H1: The average salary of the male MBAs is higher than the average salary of female MBAs.

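# Read the data and keep only the students who were placed
dilemma.df <- read.csv("Data - Deans Dilemma.csv")
placed.df <- dilemma.df[which(dilemma.df$Placement == "Placed"), ]
# Two-sample t-test on log salaries, assuming equal variances;
# the alternative is two-sided because no 'alternative' argument is given
t.test(log(placed.df$Salary)~placed.df$Gender, var.equal=TRUE)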
## 
##  Two Sample t-test
## 
## data:  log(placed.df$Salary) by placed.df$Gender
## t = -2.8142, df = 310, p-value = 0.005203
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.17482594 -0.03094897
## sample estimates:
## mean in group F mean in group M 
##        12.40435        12.50723

5. What is the p-value based on the t-test?

The p-value based on the t-test is 0.005203. This is the two-sided p-value for the difference in mean log salaries between the two groups.

6. Please interpret the meaning of the t-test, as applied to the average salaries of male and female MBAs.

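Since the p-value (0.005203) is less than 0.05, the null hypothesis of no difference is rejected at the 5% level. The test was run on log salaries with a two-sided alternative, so it shows a statistically significant difference between male and female MBAs. Because the mean log salary is higher for males (12.50723) than for females (12.40435), the difference is in the direction stated in H1, and the corresponding one-sided p-value is half the two-sided value, about 0.0026. The result therefore supports the hypothesis, although a t-test provides evidence rather than proof.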
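Because H1 is directional, the test could also be run with a one-sided alternative. The sketch below is one possible way to do this, not the analysis used in this report. It assumes simulated data (`toy.df` is hypothetical, generated with `rlnorm()` purely for illustration) so that it runs without the CSV file. With the formula interface, `t.test()` takes the difference in factor-level order (F minus M), so "the male mean is higher" corresponds to `alternative = "less"`.

```r
# Hypothetical simulated data standing in for placed.df; illustrative only.
set.seed(1)
toy.df <- data.frame(
  Gender = factor(rep(c("F", "M"), each = 30)),
  Salary = c(rlnorm(30, meanlog = 12.40, sdlog = 0.3),
             rlnorm(30, meanlog = 12.50, sdlog = 0.3))
)

# Two-sided test with pooled variance, the same form as the call in the report.
two.sided <- t.test(log(Salary) ~ Gender, data = toy.df, var.equal = TRUE)

# One-sided test: the difference is F minus M, so H1 (M > F) is "less".
one.sided <- t.test(log(Salary) ~ Gender, data = toy.df,
                    var.equal = TRUE, alternative = "less")

# Welch's test (the t.test default) drops the equal-variance assumption.
welch <- t.test(log(Salary) ~ Gender, data = toy.df, alternative = "less")

print(two.sided$p.value)
print(one.sided$p.value)
print(welch$p.value)
```

When the t statistic is negative (the male mean is higher), the one-sided p-value is exactly half the two-sided one. Comparing the pooled and Welch results is a quick check on how much the conclusion depends on the equal-variance assumption.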

Conclusion

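The independent two-sample t-test supports the hypothesis that the average salary of male MBAs is higher than that of female MBAs: the difference is statistically significant at the 5% level (p = 0.005203). This is consistent with a gender gap among the placed students in this dataset, although the test does not control for other factors such as academic record.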