library(vcd) #library initialization
library(psych)
2.b Reading the dataset
Load Data - Dean’s Dilemma data file
de1.df <- read.csv("Data - Deans Dilemma.csv")
3.b To create a table showing the average salary of males and females who were placed.
# Keep only the students who were placed; attach() is not needed because the calls below pass data = placed
placed <- de1.df[which(de1.df$Placement == 'Placed'), ]
aggregate(Salary~Gender,data=placed,FUN = mean)
## Gender Salary
## 1 F 253068.0
## 2 M 284241.9
3.c to run a t-test to test the following hypothesis:
H1: The average salary of the male MBAs is higher than the average salary of female MBAs.
t.test(Salary~Gender, data = placed)
##
## Welch Two Sample t-test
##
## data: Salary by Gender
## t = -3.0757, df = 243.03, p-value = 0.00234
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -51138.42 -11209.22
## sample estimates:
## mean in group F mean in group M
## 253068.0 284241.9
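Based on the above output of the t-test, we can reject the null hypothesis that the average salaries of females and males are equal (p = 0.00234 < 0.005). The test as run is two-sided. Because the female mean is lower than the male mean (t is negative), the one-sided p-value for H1 is half the two-sided value, about 0.00117, so the data support H1.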
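H1 is directional, so it can also be tested with a one-sided alternative. The sketch below is a minimal illustration on hypothetical toy data (the `toy` data frame and its values are made up and are not the Dean's Dilemma file). With `Gender` levels ordered F then M, `alternative = "less"` tests whether the F mean minus the M mean is below zero.

```r
# Hypothetical toy data for illustration only; replace `toy` with `placed`
toy <- data.frame(
  Gender = rep(c("F", "M"), each = 5),
  Salary = c(220000, 240000, 250000, 260000, 280000,   # F
             260000, 280000, 290000, 300000, 320000)   # M
)

# Two-sided Welch test (the default), as used in the report
two_sided <- t.test(Salary ~ Gender, data = toy)

# One-sided Welch test: H1 is mean(F) - mean(M) < 0, i.e. males earn more
one_sided <- t.test(Salary ~ Gender, data = toy, alternative = "less")

# When the observed difference is in the hypothesized direction,
# the one-sided p-value is half the two-sided p-value
print(c(two_sided = two_sided$p.value, one_sided = one_sided$p.value))
```

Running the same call on `placed` would give the one-sided result for the real data directly, assuming `Gender` is coded as F and M as in the output above.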
3.d.1 To create a table showing the mean salary of males and females who were placed.
aggregate(Salary~Gender,data=placed,FUN = mean)
## Gender Salary
## 1 F 253068.0
## 2 M 284241.9
3.d.2,3 From the above output, the average salary of placed males is 284241.9 and the average salary of placed females is 253068.0, a difference of 31173.9 in favour of males.
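The gap between the two group means can also be computed in code rather than read off the table. This is a small sketch on hypothetical toy data (the `toy` values are invented for illustration); the same two lines would work on `placed`, assuming it has the `Salary` and `Gender` columns shown above.

```r
# Hypothetical toy data for illustration only
toy <- data.frame(
  Gender = c("F", "F", "F", "M", "M", "M"),
  Salary = c(200000, 250000, 300000, 260000, 280000, 300000)
)

# Mean salary per gender as a named vector
means <- tapply(toy$Salary, toy$Gender, mean)

# Difference: male mean minus female mean
gap <- means[["M"]] - means[["F"]]
print(means)
print(gap)
```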
3.d.4 R code to run a t-test for the hypothesis “The average salary of the male MBAs is higher than the average salary of female MBAs.”
t.test(Salary~Gender, data = placed)
##
## Welch Two Sample t-test
##
## data: Salary by Gender
## t = -3.0757, df = 243.03, p-value = 0.00234
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -51138.42 -11209.22
## sample estimates:
## mean in group F mean in group M
## 253068.0 284241.9
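3.d.5,6 The p-value from the above output is 0.00234. This is a two-sided p-value; since the female mean is lower than the male mean, the one-sided p-value for the stated hypothesis is half of it, about 0.00117.
Therefore the data support the given hypothesis at the 1% level, as the p-value is less than 0.01.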
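As a check on the reported p-value, it can be recomputed from the t statistic and the Welch degrees of freedom printed in the output above. This is a sketch that uses only those two rounded numbers, so the results agree with the output only up to rounding; the variable names are illustrative.

```r
# Values copied from the t-test output above
t_stat   <- -3.0757
df_welch <- 243.03

# Two-sided p-value: probability in both tails of the t distribution
p_two_sided <- 2 * pt(-abs(t_stat), df = df_welch)

# One-sided p-value for H1 (female mean minus male mean is below zero)
p_one_sided <- pt(t_stat, df = df_welch)

print(signif(p_two_sided, 3))  # approximately 0.00234, matching the output
print(signif(p_one_sided, 3))  # approximately half of the two-sided value
```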