The analysis is based on the dataset Data - Deans Dilemma.csv.

This file explains the meaning of each column in the given dataset.

dilemma.df<-read.csv("Data - Deans Dilemma.csv")

Task 3b

Use R to create a table showing the average salary of males and females who were placed. Review whether there is a gender gap in the data. In other words, observe whether the average salary of males is higher than the average salary of females in this dataset.

placed.df<-dilemma.df[which(dilemma.df$Placement_B==1),]
aggregate(placed.df$Salary~placed.df$Gender, FUN = mean)
##   placed.df$Gender placed.df$Salary
## 1                F         253068.0
## 2                M         284241.9

In this sample there is a gender gap in the mean salaries: placed males average 284241.9 and placed females average 253068.0, a difference of 31173.9.
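
The size of the gap follows directly from the two group means (284241.9 - 253068.0 = 31173.9 for the figures above). As a minimal sketch of the same arithmetic in R, the snippet below uses a small made-up data frame; `toy.df` and its values are illustrative assumptions, not rows from the dataset, and only the `Gender` and `Salary` column names are borrowed from it.

```r
# Toy data standing in for placed.df; the values are made up for illustration.
toy.df <- data.frame(
  Gender = c("F", "F", "F", "M", "M", "M"),
  Salary = c(240000, 250000, 260000, 270000, 280000, 290000)
)

# Mean salary per gender, indexed by the gender label.
mean.salary <- tapply(toy.df$Salary, toy.df$Gender, mean)

# Absolute gap (male minus female) and the gap relative to the female mean.
gap <- mean.salary[["M"]] - mean.salary[["F"]]
gap.pct <- 100 * gap / mean.salary[["F"]]

print(mean.salary)
print(gap)
print(gap.pct)
```

Reporting the gap relative to the female mean as well as in absolute terms makes it easier to compare across datasets with different salary scales.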

Task 3c

Use R to run a t-test to test the following hypothesis: H1: The average salary of the male MBAs is higher than the average salary of female MBAs.

t.test(placed.df$Salary~placed.df$Gender,data = placed.df)
## 
##  Welch Two Sample t-test
## 
## data:  placed.df$Salary by placed.df$Gender
## t = -3.0757, df = 243.03, p-value = 0.00234
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -51138.42 -11209.22
## sample estimates:
## mean in group F mean in group M 
##        253068.0        284241.9
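
The negative sign of `t` in the output above comes from the group order: R sorts the groups alphabetically (F, then M), so the tested difference is mean(F) - mean(M). To make the reported `t` and `df` less of a black box, here is a hedged sketch that recomputes the Welch statistic and the Welch-Satterthwaite degrees of freedom by hand and compares them with `t.test`. The helper `welch()` and the salary vectors are hypothetical and made up for illustration; they are not part of the original analysis.

```r
# Welch t statistic and Welch-Satterthwaite degrees of freedom for x versus y.
welch <- function(x, y) {
  vx <- var(x) / length(x)   # squared standard error of mean(x)
  vy <- var(y) / length(y)   # squared standard error of mean(y)
  t.stat <- (mean(x) - mean(y)) / sqrt(vx + vy)
  df <- (vx + vy)^2 / (vx^2 / (length(x) - 1) + vy^2 / (length(y) - 1))
  c(t = t.stat, df = df)
}

# Made-up salaries for illustration only (not taken from the dataset).
female <- c(240000, 250000, 265000, 255000)
male   <- c(270000, 285000, 300000, 260000, 290000)

manual  <- welch(female, male)
builtin <- t.test(female, male)   # Welch is the default (var.equal = FALSE)

print(manual)
print(c(t = unname(builtin$statistic), df = unname(builtin$parameter)))
```

The two printed vectors should agree, which confirms that the default `t.test` does not assume equal variances in the two groups.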

Task 3d

  1. Submit your R code that creates a table showing the mean salary of males and females who were placed.
placed.df<-dilemma.df[which(dilemma.df$Placement_B==1),]
aggregate(placed.df$Salary~placed.df$Gender, FUN = mean)
##   placed.df$Gender placed.df$Salary
## 1                F         253068.0
## 2                M         284241.9
  2. What is the average salary of male MBAs who were placed?

284241.9

  3. What is the average salary of female MBAs who were placed?

253068.0

  4. Submit R code to run a t-test for the Hypothesis “The average salary of the male MBAs is higher than the average salary of female MBAs.”
t.test(placed.df$Salary~placed.df$Gender,data = placed.df)
## 
##  Welch Two Sample t-test
## 
## data:  placed.df$Salary by placed.df$Gender
## t = -3.0757, df = 243.03, p-value = 0.00234
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -51138.42 -11209.22
## sample estimates:
## mean in group F mean in group M 
##        253068.0        284241.9
  5. What is the p-value based on the t-test?

p-value = 0.00234 (two-sided)

  6. Please interpret the meaning of the t-test, as applied to the average salaries of male and female MBAs.

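Since the p-value (0.00234) is below 0.05, we reject the null hypothesis that the mean salaries of placed male and female MBAs are equal. The 95 percent confidence interval for the difference in means (female minus male), -51138.42 to -11209.22, lies entirely below zero. The test as run is two-sided; because the sample mean for males (284241.9) exceeds that for females (253068.0), the corresponding one-sided p-value is half the two-sided value, about 0.00117. The data therefore support the hypothesis that the average salary of male MBAs is higher than the average salary of female MBAs.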
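
The hypothesis in Task 3c is directional, while `t.test` uses a two-sided alternative by default. As a minimal sketch of how the one-sided version could be requested, the example below passes `alternative = "less"`. The data frame `toy.df` and its values are made up for illustration and are not taken from the dataset.

```r
# Made-up salaries for illustration only (not taken from the dataset).
toy.df <- data.frame(
  Gender = rep(c("F", "M"), times = c(4, 5)),
  Salary = c(240000, 250000, 265000, 255000,
             270000, 285000, 300000, 260000, 290000)
)

# Groups are ordered alphabetically (F, then M), so the tested difference is
# mean(F) - mean(M). "Males earn more" therefore corresponds to "less".
two.sided <- t.test(Salary ~ Gender, data = toy.df)
one.sided <- t.test(Salary ~ Gender, data = toy.df, alternative = "less")

print(two.sided$p.value)
print(one.sided$p.value)   # half the two-sided value, since t is negative here
```

Had the groups been ordered with males first, the matching alternative would be `"greater"`, so it is worth checking the order shown under `sample estimates` before choosing the direction.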