T-Test Analysis based on “A Dean’s Dilemma: Selection of Students for the MBA Program”

This report presents a t-test analysis of the salaries of placed male and female MBAs, based on the Dean’s Dilemma dataset.

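# Set the working directory to the folder that holds the dataset
setwd("D:/desktop/Data Analytics internship-sameer mathur/work/datasets")
# Read the Dean's Dilemma dataset into a data frame
mba.df <- read.csv("Data - Deans Dilemma.csv")
# Open the data in the spreadsheet-style viewer (interactive sessions only)
View(mba.df)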

3b: Table showing the average salary of placed males and females

# Keep only the students who were placed (Placement_B == 1)
placed.df <- subset(mba.df, Placement_B == 1)
# Mean salary within each gender
placed.salary.mean <- aggregate(placed.df$Salary, list(placed.df$Gender), mean)
colnames(placed.salary.mean) <- c("Gender", "Mean Salary")
placed.salary.mean
##   Gender Mean Salary
## 1      F    253068.0
## 2      M    284241.9
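
The same kind of per-gender table can also be built with the formula interface of `aggregate`. This is a minimal sketch on a made-up four-row data frame: `toy.df`, `toy.mean`, and the salary values are hypothetical and are not taken from the dataset.

```r
# Hypothetical toy data, for illustration only (not the Dean's Dilemma data)
toy.df <- data.frame(
  Gender = c("F", "F", "M", "M"),
  Salary = c(200000, 300000, 250000, 350000)
)

# Formula interface: mean of Salary within each level of Gender
toy.mean <- aggregate(Salary ~ Gender, data = toy.df, FUN = mean)
colnames(toy.mean)[2] <- "Mean Salary"
print(toy.mean)
```

With the formula interface the grouping column keeps its own name, so only the value column needs renaming.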

3c: t-Test:

H1: The average salary of the male MBAs is higher than the average salary of female MBAs.

  1. Inspecting the structure of the data
head(placed.df)
##   SlNo Gender Gender.B Percent_SSC Board_SSC Board_CBSE Board_ICSE
## 1    1      M        0       62.00    Others          0          0
## 2    2      M        0       76.33      ICSE          0          1
## 3    3      M        0       72.00    Others          0          0
## 4    4      M        0       60.00      CBSE          1          0
## 5    5      M        0       61.00      CBSE          1          0
## 6    6      M        0       55.00      ICSE          0          1
##   Percent_HSC Board_HSC Stream_HSC Percent_Degree         Course_Degree
## 1       88.00    Others   Commerce          52.00               Science
## 2       75.33    Others    Science          75.48 Computer Applications
## 3       78.00    Others   Commerce          66.63           Engineering
## 4       63.00      CBSE       Arts          58.00            Management
## 5       55.00       ISC    Science          54.00           Engineering
## 6       64.00      CBSE   Commerce          50.00              Commerce
##   Degree_Engg Experience_Yrs Entrance_Test S.TEST Percentile_ET
## 1           0              0           MAT      1          55.0
## 2           0              1           MAT      1          86.5
## 3           1              0          None      0           0.0
## 4           0              0           MAT      1          75.0
## 5           1              1           MAT      1          66.0
## 6           0              0          None      0           0.0
##   S.TEST.SCORE Percent_MBA  Specialization_MBA Marks_Communication
## 1         55.0       58.80      Marketing & HR                  50
## 2         86.5       66.28 Marketing & Finance                  69
## 3          0.0       52.91 Marketing & Finance                  50
## 4         75.0       57.80 Marketing & Finance                  54
## 5         66.0       59.43      Marketing & HR                  52
## 6          0.0       56.81 Marketing & Finance                  53
##   Marks_Projectwork Marks_BOCA Placement Placement_B Salary
## 1                65         74    Placed           1 270000
## 2                70         75    Placed           1 200000
## 3                61         59    Placed           1 240000
## 4                66         62    Placed           1 250000
## 5                65         67    Placed           1 180000
## 6                70         53    Placed           1 300000
  2. Summary statistics
table(placed.df$Gender)
## 
##   F   M 
##  97 215
summary(placed.df$Salary)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  120000  220000  260000  274600  300000  940000
  3. Average salaries of placed female and male MBAs
placed.salary.mean
##   Gender Mean Salary
## 1      F    253068.0
## 2      M    284241.9
  4. Graphical representation
# Gender is a factor with levels F, M, so box 1 is Females and box 2 is Males
boxplot(Salary ~ Gender, data=placed.df, yaxt="n", horizontal = TRUE, ylab="Gender", 
        xlab="Salary", main="Comparison of salaries of Males and Females", col=c("pink", "lightblue"))
axis(side=2, at=c(1,2), labels=c("Females", "Males"))

# Natural log of salary; the t-test is run on this log scale
log.transformed.salary <- log(placed.df$Salary)
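
The report does not say why the salaries are log-transformed. One common motivation is right skew, which the summary statistics hint at (mean 274600 above the median 260000, with a maximum of 940000). The sketch below uses synthetic log-normal quantiles, not the MBA data, to show how a log transform removes that kind of skew. The name `sim.salary` and the parameters `meanlog = 12.45` and `sdlog = 0.3` are assumptions chosen only for illustration.

```r
# Synthetic right-skewed values: evenly spaced log-normal quantiles
# (meanlog and sdlog are illustrative assumptions, not estimates from the data)
sim.salary <- qlnorm(ppoints(312), meanlog = 12.45, sdlog = 0.3)

# On the raw scale the long right tail pulls the mean above the median
raw.gap <- mean(sim.salary) - median(sim.salary)

# On the log scale the values are symmetric, so mean and median coincide
log.gap <- mean(log(sim.salary)) - median(log(sim.salary))

print(c(raw.gap = raw.gap, log.gap = log.gap))
```

A positive gap on the raw scale and a gap of essentially zero on the log scale is the pattern that makes a t-test on log values attractive for skewed salaries.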

Hypothesis 1: The average salary of the male MBAs is higher than the average salary of female MBAs.

Null hypothesis: there is no difference between the average salary of male MBAs and the average salary of female MBAs.

t.test(log.transformed.salary~placed.df$Gender,var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  log.transformed.salary by placed.df$Gender
## t = -2.8142, df = 310, p-value = 0.005203
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.17482594 -0.03094897
## sample estimates:
## mean in group F mean in group M 
##        12.40435        12.50723
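
Because the test is run on log salaries, its estimates are on the log scale. Exponentiating them gives geometric mean salaries and a ratio of geometric means, which may be easier to read. The sketch below only re-uses the numbers printed in the output above; the variable names are illustrative and not part of the original analysis.

```r
# Values copied from the t-test output (log scale)
log.mean.F <- 12.40435
log.mean.M <- 12.50723
log.ci     <- c(-0.17482594, -0.03094897)  # 95% CI for mean(F) - mean(M)

# exp() of a mean of logs is a geometric mean on the original salary scale
geo.mean.F <- exp(log.mean.F)
geo.mean.M <- exp(log.mean.M)

# exp() of a difference in log means is a ratio of geometric means (F / M)
ratio.FM    <- exp(log.mean.F - log.mean.M)
ratio.FM.ci <- exp(log.ci)

print(round(c(geo.mean.F = geo.mean.F, geo.mean.M = geo.mean.M)))
print(round(c(ratio = ratio.FM, lower = ratio.FM.ci[1], upper = ratio.FM.ci[2]), 3))
```

A ratio interval lying entirely below 1 corresponds to the log-scale interval lying entirely below 0.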

3d: (1)

placed.salary.mean
##   Gender Mean Salary
## 1      F    253068.0
## 2      M    284241.9

3d: (2)

placed.salary.mean[2,2]
## [1] 284241.9

3d: (3)

placed.salary.mean[1,2]
## [1] 253068

3d: (4)

t.test(log.transformed.salary~placed.df$Gender,var.equal = TRUE)
## 
##  Two Sample t-test
## 
## data:  log.transformed.salary by placed.df$Gender
## t = -2.8142, df = 310, p-value = 0.005203
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.17482594 -0.03094897
## sample estimates:
## mean in group F mean in group M 
##        12.40435        12.50723

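The p-value is 0.005203. Because this is below 0.05, the null hypothesis of equal mean log salaries for male and female MBAs is rejected at the 5% level.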
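
Hypothesis 1 is directional (males earn more than females), whereas the reported output uses a two-sided alternative. As a hedged sketch, the one-sided p-value can be recovered from the reported t statistic and degrees of freedom with `pt`. This assumes the same equal-variance test and relies on the F group being listed first, so that a negative t lies in the hypothesised direction. The variable names are illustrative.

```r
# Values copied from the reported t-test output
t.stat   <- -2.8142
deg.free <- 310

# Two-sided p-value: should reproduce the reported 0.005203 up to rounding
p.two.sided <- 2 * pt(-abs(t.stat), deg.free)

# One-sided p-value for H1: mean(F) - mean(M) < 0, i.e. the lower tail
p.one.sided <- pt(t.stat, deg.free)

print(c(two.sided = p.two.sided, one.sided = p.one.sided))
```

On the data itself, the equivalent call would pass `alternative = "less"` to `t.test`, since the difference is computed as F minus M.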

3d: (5)