This is an assignment given on Week 2, Day 1 of the Data Analytics Internship under Prof. Sameer Mathur, IIML.

TASK 2a : Download and review the Data - Dean’s Dilemma.csv data file associated with this case. Read the data set into RStudio.

setwd("C:/Users/Krushna/Downloads/UDEMY/T Test")
ttest.df <- read.csv("Data - Deans Dilemma.csv")
View(ttest.df)
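
After loading, it can help to confirm that the columns used in the later tasks (Gender, Placement_B, Salary) were actually read in. The sketch below is only an illustration: it writes a tiny made-up CSV to a temporary file and reads it back, so the values, `toy`, `toy_path`, and `toy.df` are all hypothetical stand-ins for the real case file.

```r
# Toy stand-in for the case file; every value below is made up for illustration.
toy <- data.frame(
  Gender      = c("M", "F", "M", "F"),
  Placement_B = c(1, 1, 0, 1),
  Salary      = c(300000, 250000, 0, 270000)
)
toy_path <- tempfile(fileext = ".csv")
write.csv(toy, toy_path, row.names = FALSE)

# Read it back the same way the real file is read.
toy.df <- read.csv(toy_path)

# Check that the columns needed by the later tasks are present.
required_cols <- c("Gender", "Placement_B", "Salary")
missing_cols <- setdiff(required_cols, names(toy.df))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}

# Inspect column types and the first few values.
str(toy.df)
```

The same check can be run on ttest.df by replacing toy.df with it.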

TASK 3b : Use R to create a table showing the average salary of males and females who were placed. Review whether there is a gender gap in the data.

# Keep only the students who were placed (Placement_B == 1)
placed <- ttest.df[which(ttest.df$Placement_B == 1), ]
# Mean salary within each gender
by(placed$Salary, placed$Gender, mean)

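In table form:

# Named salary.table so that it does not mask the base function table()
salary.table <- aggregate(placed$Salary, by = list(Gender = placed$Gender), FUN = mean)
salary.table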

Thus, the average salary of placed males is higher than the average salary of placed females in this dataset.
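
As an alternative way to build the same kind of table, aggregate() also accepts a formula, which keeps the column names in the output. The sketch below uses a small made-up data frame (`toy_placed` and its salaries are hypothetical, not taken from the case data) and also computes the gap between the two group means.

```r
# Made-up salaries for illustration only; not taken from the case data.
toy_placed <- data.frame(
  Gender = c("M", "M", "F", "F"),
  Salary = c(300000, 320000, 250000, 270000)
)

# Formula interface: mean Salary for each level of Gender.
avg_salary <- aggregate(Salary ~ Gender, data = toy_placed, FUN = mean)
avg_salary

# Difference between the male and female group means.
gap <- avg_salary$Salary[avg_salary$Gender == "M"] -
  avg_salary$Salary[avg_salary$Gender == "F"]
gap
```

With the real data, the same call would be aggregate(Salary ~ Gender, data = placed, FUN = mean).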

TASK 3c : Use R to run a t-test to test the following hypothesis:

H1: The average salary of the male MBAs is higher than the average salary of female MBAs.

The corresponding null hypothesis is H0: There is no difference between the average salary of male MBAs and the average salary of female MBAs.

t.test(Salary ~ Gender, data = placed)

The p-value is 0.00234.

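As the p-value is less than 0.05, the conventional significance level, we reject the null hypothesis. Note that t.test() runs a two-sided Welch test by default, so this p-value is for a difference in either direction.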
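
Because H1 is directional (male average higher than female average), a one-sided test matches it more closely than the default two-sided one. The sketch below shows one way to request it, using made-up data (`toy_placed` and its salaries are hypothetical). It assumes Gender is coded "F" and "M": with the formula interface, t.test() compares the first factor level with the second, so with levels "F", "M" the hypothesis "male mean is higher" corresponds to alternative = "less". The level order should be checked with levels() on the real data before choosing the direction.

```r
# Made-up salaries for illustration only; not taken from the case data.
toy_placed <- data.frame(
  Gender = factor(c("M", "M", "M", "M", "F", "F", "F", "F")),
  Salary = c(300000, 320000, 310000, 330000,
             250000, 270000, 260000, 280000)
)

# Levels are sorted alphabetically, so "F" is the first group and "M" the second.
levels(toy_placed$Gender)

# Default: two-sided Welch test (difference in either direction).
two_sided <- t.test(Salary ~ Gender, data = toy_placed)

# One-sided: H1 is mean(F) - mean(M) < 0, i.e. the male mean is higher.
one_sided <- t.test(Salary ~ Gender, data = toy_placed, alternative = "less")

two_sided$p.value
one_sided$p.value
```

When the observed difference lies in the hypothesised direction, the one-sided p-value is half the two-sided one, so a two-sided rejection at a given level implies a one-sided rejection as well.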