w2d1

dean.df <- read.csv("abcd.csv")
View(dean.df)

The table below shows the average salary of the placed students by gender: row F for females and row M for males.

# Keep only the students who were placed (Placement_B == 1).
placed.df <- dean.df[which(dean.df$Placement_B == 1), ]
# Mean salary for each gender.
aggregate(placed.df$Salary, by = list(placed.df$Gender), FUN = mean)

##   Group.1        x
## 1       F 253068.0
## 2       M 284241.9
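
The same group means can also be computed with the formula interface of aggregate(), which labels the output columns Gender and Salary instead of Group.1 and x. This is only a sketch: toy.df is a hypothetical stand-in for placed.df with invented values, so its numbers are illustrative and are not taken from abcd.csv.

```r
# Hypothetical stand-in for placed.df; the values are invented for illustration.
toy.df <- data.frame(
  Gender = c("F", "F", "F", "M", "M", "M"),
  Salary = c(240000, 250000, 260000, 270000, 280000, 290000)
)

# Formula interface: mean Salary for each Gender, with named columns.
means.df <- aggregate(Salary ~ Gender, data = toy.df, FUN = mean)
print(means.df)
```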

We can observe that the average salary of males is higher than that of females, but to check whether the difference is statistically significant we have to use a t-test.

# t.test() is part of the stats package, which R attaches by default.
library("MASS")
t.test(Salary ~ Gender, data = placed.df)

## 
##  Welch Two Sample t-test
## 
## data:  Salary by Gender
## t = -3.0757, df = 243.03, p-value = 0.00234
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -51138.42 -11209.22
## sample estimates:
## mean in group F mean in group M 
##        253068.0        284241.9
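
To show where the t and df values in a Welch test come from, the sketch below recomputes them by hand and compares them with t.test(). It assumes nothing about abcd.csv: toy.df is a hypothetical data frame with invented salaries, so the printed values will not match the output above.

```r
# Hypothetical stand-in for placed.df; the values are invented for illustration.
toy.df <- data.frame(
  Gender = c("F", "F", "F", "F", "M", "M", "M", "M", "M"),
  Salary = c(230000, 250000, 255000, 265000,
             260000, 275000, 290000, 300000, 320000)
)

f <- toy.df$Salary[toy.df$Gender == "F"]
m <- toy.df$Salary[toy.df$Gender == "M"]

# Squared standard error of each group mean.
se2.f <- var(f) / length(f)
se2.m <- var(m) / length(m)

# Welch t statistic: difference in means (F minus M) over the combined standard error.
t.manual <- (mean(f) - mean(m)) / sqrt(se2.f + se2.m)

# Welch-Satterthwaite approximation to the degrees of freedom.
df.manual <- (se2.f + se2.m)^2 /
  (se2.f^2 / (length(f) - 1) + se2.m^2 / (length(m) - 1))

# Compare with R's built-in Welch test.
res <- t.test(Salary ~ Gender, data = toy.df)
print(c(t.manual = t.manual, t.builtin = unname(res$statistic)))
print(c(df.manual = df.manual, df.builtin = unname(res$parameter)))
```

The statistic is negative whenever the mean for F is below the mean for M, because the formula interface subtracts the groups in factor-level order (F first, then M).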

The t-test reports a p-value of 0.00234, which is below the conventional 0.05 significance level.

So the average salary of males is significantly greater than the average salary of females. We reject the null hypothesis of equal mean salaries, and the data support the given hypothesis H1.
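
Since H1 is directional (males earn more), a one-sided test is one possible follow-up to the two-sided test used here. A minimal sketch, assuming the default alphabetical level order F then M, so that alternative = "less" tests whether the mean for F is below the mean for M. As before, toy.df is hypothetical data with invented values, not the contents of abcd.csv.

```r
# Hypothetical stand-in for placed.df; the values are invented for illustration.
toy.df <- data.frame(
  Gender = c("F", "F", "F", "F", "M", "M", "M", "M", "M"),
  Salary = c(230000, 250000, 255000, 265000,
             260000, 275000, 290000, 300000, 320000)
)

# Two-sided Welch test, as in the report.
two.sided <- t.test(Salary ~ Gender, data = toy.df)

# One-sided Welch test: alternative is mean(F) - mean(M) < 0.
one.sided <- t.test(Salary ~ Gender, data = toy.df, alternative = "less")

print(two.sided$p.value)
print(one.sided$p.value)
```

When the observed difference lies in the hypothesized direction, the one-sided p-value is half the two-sided one.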

Nihir Gulati

December 10, 2017