First, import the Dean’s Dilemma.csv file into an R data frame:
setwd("F:/Data Analytics for Managerial Applications")
mba.df <- read.csv("Data - Deans Dilemma.csv")
View(mba.df)
To create a dataset for the placed students only:
placed <- mba.df[which(mba.df$Placement == 'Placed'),]
View(placed)
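The filtering step can be sanity-checked on a small made-up data frame. This is only a sketch: it assumes the column names `Placement`, `Gender` and `Salary` used above, and the rows in `toy.df` (including the 'Not Placed' label and zero salaries) are hypothetical, not taken from the dataset.

```r
# Hypothetical rows, for illustration only (not from the Dean's Dilemma data)
toy.df <- data.frame(
  Gender    = c("M", "F", "M", "F", "M"),
  Placement = c("Placed", "Not Placed", "Placed", "Placed", "Not Placed"),
  Salary    = c(300000, 0, 250000, 270000, 0),
  stringsAsFactors = FALSE
)

# Same filter as above: keep only rows where Placement is 'Placed'
toy.placed <- toy.df[which(toy.df$Placement == "Placed"), ]

# subset() is an equivalent, slightly shorter form
toy.placed2 <- subset(toy.df, Placement == "Placed")

nrow(toy.placed)              # number of placed students in the toy data
table(toy.placed$Placement)   # should list only 'Placed'
```

Running the same two checks, `nrow(placed)` and `table(placed$Placement)`, on the real data frame is a quick way to confirm that only placed students remain.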
3d. 1) To create a table of the average salaries of males and females who were placed:
aggregate(placed$Salary, by = list(sex = placed$Gender), mean)
## sex x
## 1 F 253068.0
## 2 M 284241.9
3d. 2) Average salary of male MBAs who were placed = 284241.9/-
3d. 3) Average salary of female MBAs who were placed = 253068.0/-
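If the two averages are needed as values rather than read off a printed table, `tapply()` returns them as a named vector. The sketch below assumes the same `Gender` and `Salary` columns; the salaries in `toy.placed` are hypothetical and chosen only to make the arithmetic easy to follow.

```r
# Hypothetical placed students, for illustration only
toy.placed <- data.frame(
  Gender = c("F", "F", "M", "M", "M"),
  Salary = c(240000, 260000, 270000, 290000, 310000)
)

# Mean salary per gender, returned as a vector named by gender
avg.salary <- tapply(toy.placed$Salary, toy.placed$Gender, mean)

avg.salary["M"]                     # average salary of males
avg.salary["F"]                     # average salary of females
avg.salary["M"] - avg.salary["F"]   # difference in means (M minus F)
```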
3d. 4) To run a t-test of the hypothesis that the average salary of male MBAs is higher than the average salary of female MBAs. The null hypothesis is that there is no difference between the average salaries of male MBAs and female MBAs. Note that t.test() performs a two-sided Welch test by default:
t.test(Salary ~ Gender, data = placed)
##
## Welch Two Sample t-test
##
## data: Salary by Gender
## t = -3.0757, df = 243.03, p-value = 0.00234
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -51138.42 -11209.22
## sample estimates:
## mean in group F mean in group M
## 253068.0 284241.9
3d. 5) The p-value from the t-test is 0.00234.
3d. 6) Since the p-value of 0.00234 is less than 0.05, we reject the null hypothesis that there is no difference between the average salaries of male MBAs and female MBAs. The sample means (284241.9 for males versus 253068.0 for females) indicate that placed male MBAs earn more on average.
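The hypothesis in 3d. 4) is directional (males earn more), whereas the test reported above is two-sided. A one-sided version could be sketched as follows. The data here are simulated and purely hypothetical; only the `alternative` argument of `t.test()` is the point of the example.

```r
set.seed(1)  # make the simulated toy data reproducible

# Simulated salaries, for illustration only (not the real dataset)
toy.placed <- data.frame(
  Gender = factor(rep(c("F", "M"), each = 30)),
  Salary = c(rnorm(30, mean = 250000, sd = 40000),
             rnorm(30, mean = 285000, sd = 40000))
)

# Gender levels sort as F, M, so the formula tests mean(F) - mean(M).
# "Males earn more" therefore corresponds to alternative = "less".
one.sided <- t.test(Salary ~ Gender, data = toy.placed, alternative = "less")
two.sided <- t.test(Salary ~ Gender, data = toy.placed)

one.sided$p.value   # one-sided p-value
two.sided$p.value   # two-sided p-value
```

When the observed difference lies in the hypothesized direction (a negative t statistic here), the one-sided p-value is half the two-sided one. Applied to the result above, that would give roughly 0.00234 / 2 = 0.00117, so the conclusion would not change.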