# Load the Dean's Dilemma data set.
dean.df <- read.csv("Data - Deans Dilemma.csv")
View(dean.df)
# Gender.B codes gender as 0 = male, 1 = female.
placedmale.df <- dean.df[which(dean.df$Gender.B == 0), ]
View(placedmale.df)
placedfemale.df <- dean.df[which(dean.df$Gender.B == 1), ]
View(placedfemale.df)
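x <- mean(placedfemale.df$Salary)  # mean salary, female (Gender.B == 1)
y <- mean(placedmale.df$Salary)    # mean salary, male (Gender.B == 0)
table(x, y)  # prints the two means as row and column labels; the cell count of 1 is not meaningful
##                   y
## x                  231484.848484848
##   193288.188976378                1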
H0: The average salary of male MBAs is equal to the average salary of female MBAs.
H1: The average salary of male MBAs differs from the average salary of female MBAs (two-sided test).
library(MASS)
library(psych)
describe(placedfemale.df$Salary)
## vars n mean sd median trimmed mad min max range
## X1 1 127 193288.2 125857.6 220000 192598.1 99334.2 0 650000 650000
## skew kurtosis se
## X1 -0.12 0.17 11168.06
describe(placedmale.df$Salary)
## vars n mean sd median trimmed mad min max range skew
## X1 1 264 231484.9 142489.8 250000 229273.6 74130 0 940000 940000 0.31
## kurtosis se
## X1 2.05 8769.64
par(mfrow = c(1, 2))
boxplot(placedfemale.df$Salary, main = "Female", ylab = "Salary")
boxplot(placedmale.df$Salary, main = "Male", ylab = "Salary")
Q. What is the average salary of male MBAs who were placed?
Q. What is the average salary of female MBAs who were placed?
aggregate(dean.df$Salary, by=list(dean.df$Gender.B), FUN= mean)
## Group.1 x
## 1 0 231484.8
## 2 1 193288.2
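The minimum salary of 0 in the describe() output suggests that students without a placement may be included in these averages. A minimal sketch of restricting the averages to placed students follows. It uses a small toy data frame rather than dean.df, and the column name Placement_B (1 = placed, 0 = not placed) is an assumption that should be checked against the actual file.

```r
# Toy data standing in for dean.df. Placement_B is a hypothetical column name
# (1 = placed, 0 = not placed); it is not confirmed by the output above.
toy.df <- data.frame(
  Gender.B    = c(0, 0, 0, 1, 1, 1),
  Placement_B = c(1, 1, 0, 1, 0, 1),
  Salary      = c(250000, 300000, 0, 220000, 0, 240000)
)

# Keep only placed students, then average Salary within each gender code.
placed.df <- toy.df[toy.df$Placement_B == 1, ]
placed.means <- aggregate(Salary ~ Gender.B, data = placed.df, FUN = mean)
print(placed.means)
```

If the real data has such a column, the same two lines applied to dean.df would give averages over placed students only.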
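# Caution: this call pairs each Salary with the 0/1 code in Gender.B, so it tests
# whether mean(Salary - Gender.B) is zero, not whether salaries differ by gender.
t.test(dean.df$Salary, dean.df$Gender.B, paired = TRUE)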
##
## Paired t-test
##
## data: dean.df$Salary and dean.df$Gender.B
## t = 31.32, df = 390, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 205325.9 232830.0
## sample estimates:
## mean of the differences
## 219077.9
Q. What is the p-value based on the t-test?
(t.test(dean.df$Salary,dean.df$Gender.B,paired = TRUE))$p.value
## [1] 1.642345e-108
Please interpret the meaning of the t-test, as applied to the average salaries of male and female MBAs.
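The sample means show that male MBAs earn more on average (231484.8) than female MBAs (193288.2). However, the paired t-test above compares Salary with the 0/1 gender code, so its p-value (1.64e-108) only shows that the mean of Salary minus Gender.B is not zero. It does not test whether average salaries differ between male and female MBAs; an independent two-sample t-test of Salary grouped by Gender.B is needed for that.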
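As a cross-check, a Welch two-sample t statistic can be computed directly from the n, mean and sd values in the describe() output. This is a sketch from rounded summary statistics, not a rerun on the raw data; on the full data frame the usual call would be t.test(Salary ~ Gender.B, data = dean.df). The variable names below are illustrative.

```r
# Summary statistics copied from the describe() output (rounded as printed).
n.f <- 127; mean.f <- 193288.2; sd.f <- 125857.6   # female, Gender.B == 1
n.m <- 264; mean.m <- 231484.9; sd.m <- 142489.8   # male,   Gender.B == 0

# Squared standard error of each group mean.
v.f <- sd.f^2 / n.f
v.m <- sd.m^2 / n.m

# Welch t statistic and Welch-Satterthwaite degrees of freedom.
t.stat <- (mean.m - mean.f) / sqrt(v.f + v.m)
df.welch <- (v.f + v.m)^2 / (v.f^2 / (n.f - 1) + v.m^2 / (n.m - 1))

# Two-sided p-value for H0: equal mean salaries.
p.value <- 2 * pt(-abs(t.stat), df = df.welch)
print(c(t = t.stat, df = df.welch, p = p.value))
```

With these inputs the statistic comes out near 2.7 with a two-sided p-value below 0.01, which would reject H0 at the 5% level. The exact figures should be taken from a two-sample t.test on the data itself.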