# Load the data and keep only the students who were placed
dd <- read.csv("Data - Deans Dilemma.csv")
placed <- dd[which(dd$Placement_B == '1'), ]
View(placed)
2) Table showing the mean salary of placed students (male and female)
aggregate(placed$Salary, by=list(Gender=placed$Gender), mean)
## Gender x
## 1 F 253068.0
## 2 M 284241.9
# Placed male students (Gender.B == 0)
placed_m <- placed[which(placed$Gender.B == '0'), ]
library(psych)
describe(placed_m$Salary)
## vars n mean sd median trimmed mad min max range
## X1 1 215 284241.9 99430.42 265000 273317.9 51891 120000 940000 820000
## skew kurtosis se
## X1 2.25 9.91 6781.1
# Placed female students (Gender.B == 1); psych is already loaded above
placed_f <- placed[which(placed$Gender.B == '1'), ]
describe(placed_f$Salary)
## vars n mean sd median trimmed mad min max range skew
## X1 1 97 253068 74190.54 240000 246329.1 59304 120000 650000 530000 1.81
## kurtosis se
## X1 7.03 7532.91
t.test(placed$Salary~placed$Gender)
##
## Welch Two Sample t-test
##
## data: placed$Salary by placed$Gender
## t = -3.0757, df = 243.03, p-value = 0.00234
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -51138.42 -11209.22
## sample estimates:
## mean in group F mean in group M
## 253068.0 284241.9
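The p-value is 0.00234, which is less than 0.05, so we reject the null hypothesis that the mean salaries of placed male and female students are equal. The 95 percent confidence interval for the difference (female minus male) lies entirely below zero, so the average salary of male students is higher than the average salary of female students.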
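As a cross-check, the Welch statistic can be recomputed by hand from the group sizes, means, and standard deviations reported by describe() above. This is only a sketch: the variable names are illustrative, and the inputs are the rounded values printed in the output, so the results should agree with t.test() only up to rounding (t of about -3.0757 on about 243.03 degrees of freedom).

```r
# Summary statistics copied from the describe() output above (rounded values)
n_f <- 97;  mean_f <- 253068.0; sd_f <- 74190.54   # placed female students
n_m <- 215; mean_m <- 284241.9; sd_m <- 99430.42   # placed male students

# Squared standard error of each group mean
v_f <- sd_f^2 / n_f
v_m <- sd_m^2 / n_m

# Welch t statistic for the difference (female minus male)
t_stat <- (mean_f - mean_m) / sqrt(v_f + v_m)

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (v_f + v_m)^2 / (v_f^2 / (n_f - 1) + v_m^2 / (n_m - 1))

# Two-sided p-value, matching the default alternative used by t.test()
p_two_sided <- 2 * pt(-abs(t_stat), df_welch)

# One-sided p-value for the alternative "female mean is less than male mean"
p_one_sided <- pt(t_stat, df_welch)

print(round(c(t = t_stat, df = df_welch, p = p_two_sided), 5))
```

Because the observed difference is negative, the one-sided p-value is half the two-sided one, which supports the directional reading that male salaries are higher on average.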