getwd()
## [1] "C:/Users/parvp/Desktop/data analytics internship"
df <- read.csv("Data - Deans Dilemma.csv")

Task 3b

placed<-df[which(df$Placement_B==1),]
# Use bare column names so that data = placed is honoured (df$Salary would ignore it)
aggregate(Salary ~ Gender, data = placed, FUN = mean)
##   Gender   Salary
## 1      F 253068.0
## 2      M 284241.9

Average salary of male MBAs who were placed: 284241.9

Average salary of female MBAs who were placed: 253068.0

boxplot(Salary ~ Gender, data = placed, main = "Salary of Placed Male and Female MBAs", horizontal = TRUE, xlab = "Salary", ylab = "Gender (F/M)", col = c("yellow", "skyblue"))

In this dataset, the average salary of placed male MBAs is higher than that of placed female MBAs, which suggests a gender gap. Whether the gap is statistically significant is tested in Task 3c.
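The comparison above can be sketched on a small made-up data frame. The `toy` data frame and all of its values are hypothetical and are not taken from the dataset. The sketch assumes that unplaced students are recorded with a salary of 0, and it illustrates why the group means should be computed on the placed subset only.

```r
# Hypothetical toy data; every value here is invented for illustration.
toy <- data.frame(
  Gender      = c("F", "F", "F", "M", "M", "M"),
  Placement_B = c(1, 1, 0, 1, 1, 0),
  Salary      = c(200000, 260000, 0, 250000, 310000, 0)
)

# Keep only placed students before averaging.
toy_placed <- toy[toy$Placement_B == 1, ]

# Group means on the placed subset (bare column names, so `data` is used).
means_placed <- aggregate(Salary ~ Gender, data = toy_placed, FUN = mean)

# Group means over all students, for contrast (assumed zeros pull these down).
means_all <- aggregate(Salary ~ Gender, data = toy, FUN = mean)

# Male-minus-female gap among placed students (rows are ordered F, then M).
gap <- diff(means_placed$Salary)

print(means_placed)
print(means_all)
print(gap)
```

On the real data, the same pattern would use `placed` in place of `toy_placed`.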

Task 3c

Let us consider the following null hypothesis: there is no difference between the average salaries of placed male MBAs and placed female MBAs.

t.test(Salary ~ Gender , data = placed)
## 
##  Welch Two Sample t-test
## 
## data:  Salary by Gender
## t = -3.0757, df = 243.03, p-value = 0.00234
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -51138.42 -11209.22
## sample estimates:
## mean in group F mean in group M 
##        253068.0        284241.9

The p-value is 0.0023, which is below 0.05, so we reject the null hypothesis at the 5% significance level. The difference between the average salaries of placed male and female MBAs is statistically significant. The 95% confidence interval for the female-minus-male difference, -51138.42 to -11209.22, lies entirely below zero, so the average salary of male MBAs is higher than that of female MBAs.
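To make the test above less of a black box, here is a minimal sketch of how the Welch statistic, degrees of freedom, and two-sided p-value can be computed by hand and checked against `t.test`. The vectors `f` and `m` are hypothetical salaries invented for illustration; they are not from the dataset.

```r
# Hypothetical salaries; invented for illustration only.
f <- c(210000, 240000, 255000, 230000, 270000)
m <- c(250000, 300000, 265000, 320000, 280000, 295000)

# Per-group variance of the sample mean.
vf <- var(f) / length(f)
vm <- var(m) / length(m)

# Welch t statistic: difference in means over the unpooled standard error.
t_manual <- (mean(f) - mean(m)) / sqrt(vf + vm)

# Welch-Satterthwaite approximation to the degrees of freedom.
df_manual <- (vf + vm)^2 / (vf^2 / (length(f) - 1) + vm^2 / (length(m) - 1))

# Two-sided p-value from the t distribution.
p_manual <- 2 * pt(-abs(t_manual), df_manual)

# R's built-in test; var.equal = FALSE (Welch) is the default.
res <- t.test(f, m)

print(c(manual = t_manual, builtin = unname(res$statistic)))
print(c(manual = df_manual, builtin = unname(res$parameter)))
print(c(manual = p_manual, builtin = res$p.value))
```

The sign of the statistic follows the order of the groups (first minus second), which is why the output above is negative when female salaries come first.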