This is a case study based on the case “Dean’s Dilemma: Selection of Students for the MBA Program”.
Reading and Viewing the Dean’s Dilemma Database
setwd("C:\\Users\\Adithya Nataraj\\Downloads")
mba.df<-read.csv(paste("Data - Deans Dilemma.csv", sep=""))
View(mba.df)
Summary statistics of Dean’s Dilemma database
summary(mba.df)
## SlNo Gender Gender.B Percent_SSC Board_SSC
## Min. : 1.0 F:127 Min. :0.0000 Min. :37.00 CBSE :113
## 1st Qu.: 98.5 M:264 1st Qu.:0.0000 1st Qu.:56.00 ICSE : 77
## Median :196.0 Median :0.0000 Median :64.50 Others:201
## Mean :196.0 Mean :0.3248 Mean :64.65
## 3rd Qu.:293.5 3rd Qu.:1.0000 3rd Qu.:74.00
## Max. :391.0 Max. :1.0000 Max. :87.20
##
## Board_CBSE Board_ICSE Percent_HSC Board_HSC
## Min. :0.000 Min. :0.0000 Min. :40.0 CBSE : 96
## 1st Qu.:0.000 1st Qu.:0.0000 1st Qu.:54.0 ISC : 48
## Median :0.000 Median :0.0000 Median :63.0 Others:247
## Mean :0.289 Mean :0.1969 Mean :63.8
## 3rd Qu.:1.000 3rd Qu.:0.0000 3rd Qu.:72.0
## Max. :1.000 Max. :1.0000 Max. :94.7
##
## Stream_HSC Percent_Degree Course_Degree
## Arts : 18 Min. :35.00 Arts : 13
## Commerce:222 1st Qu.:57.52 Commerce :117
## Science :151 Median :63.00 Computer Applications: 32
## Mean :62.98 Engineering : 37
## 3rd Qu.:69.00 Management :163
## Max. :89.00 Others : 5
## Science : 24
## Degree_Engg Experience_Yrs Entrance_Test S.TEST
## Min. :0.00000 Min. :0.0000 MAT :265 Min. :0.0000
## 1st Qu.:0.00000 1st Qu.:0.0000 None : 67 1st Qu.:1.0000
## Median :0.00000 Median :0.0000 K-MAT : 24 Median :1.0000
## Mean :0.09463 Mean :0.4783 CAT : 22 Mean :0.8286
## 3rd Qu.:0.00000 3rd Qu.:1.0000 PGCET : 8 3rd Qu.:1.0000
## Max. :1.00000 Max. :3.0000 GCET : 2 Max. :1.0000
## (Other): 3
## Percentile_ET S.TEST.SCORE Percent_MBA
## Min. : 0.00 Min. : 0.00 Min. :50.83
## 1st Qu.:41.19 1st Qu.:41.19 1st Qu.:57.20
## Median :62.00 Median :62.00 Median :61.01
## Mean :54.93 Mean :54.93 Mean :61.67
## 3rd Qu.:78.00 3rd Qu.:78.00 3rd Qu.:66.02
## Max. :98.69 Max. :98.69 Max. :77.89
##
## Specialization_MBA Marks_Communication Marks_Projectwork
## Marketing & Finance:222 Min. :50.00 Min. :50.00
## Marketing & HR :156 1st Qu.:53.00 1st Qu.:64.00
## Marketing & IB : 13 Median :58.00 Median :69.00
## Mean :60.54 Mean :68.36
## 3rd Qu.:67.00 3rd Qu.:74.00
## Max. :88.00 Max. :87.00
##
## Marks_BOCA Placement Placement_B Salary
## Min. :50.00 Not Placed: 79 Min. :0.000 Min. : 0
## 1st Qu.:57.00 Placed :312 1st Qu.:1.000 1st Qu.:172800
## Median :63.00 Median :1.000 Median :240000
## Mean :64.38 Mean :0.798 Mean :219078
## 3rd Qu.:72.50 3rd Qu.:1.000 3rd Qu.:300000
## Max. :96.00 Max. :1.000 Max. :940000
##
library(psych)
describe(mba.df)
## vars n mean sd median trimmed
## SlNo 1 391 196.00 113.02 196.00 196.00
## Gender* 2 391 1.68 0.47 2.00 1.72
## Gender.B 3 391 0.32 0.47 0.00 0.28
## Percent_SSC 4 391 64.65 10.96 64.50 64.76
## Board_SSC* 5 391 2.23 0.87 3.00 2.28
## Board_CBSE 6 391 0.29 0.45 0.00 0.24
## Board_ICSE 7 391 0.20 0.40 0.00 0.12
## Percent_HSC 8 391 63.80 11.42 63.00 63.34
## Board_HSC* 9 391 2.39 0.85 3.00 2.48
## Stream_HSC* 10 391 2.34 0.56 2.00 2.36
## Percent_Degree 11 391 62.98 8.92 63.00 62.91
## Course_Degree* 12 391 3.85 1.61 4.00 3.81
## Degree_Engg 13 391 0.09 0.29 0.00 0.00
## Experience_Yrs 14 391 0.48 0.67 0.00 0.36
## Entrance_Test* 15 391 5.85 1.35 6.00 6.08
## S.TEST 16 391 0.83 0.38 1.00 0.91
## Percentile_ET 17 391 54.93 31.17 62.00 56.87
## S.TEST.SCORE 18 391 54.93 31.17 62.00 56.87
## Percent_MBA 19 391 61.67 5.85 61.01 61.45
## Specialization_MBA* 20 391 1.47 0.56 1.00 1.42
## Marks_Communication 21 391 60.54 8.82 58.00 59.68
## Marks_Projectwork 22 391 68.36 7.15 69.00 68.60
## Marks_BOCA 23 391 64.38 9.58 63.00 64.08
## Placement* 24 391 1.80 0.40 2.00 1.87
## Placement_B 25 391 0.80 0.40 1.00 0.87
## Salary 26 391 219078.26 138311.65 240000.00 217011.50
## mad min max range skew kurtosis
## SlNo 145.29 1.00 391.00 390.00 0.00 -1.21
## Gender* 0.00 1.00 2.00 1.00 -0.75 -1.45
## Gender.B 0.00 0.00 1.00 1.00 0.75 -1.45
## Percent_SSC 12.60 37.00 87.20 50.20 -0.06 -0.72
## Board_SSC* 0.00 1.00 3.00 2.00 -0.45 -1.53
## Board_CBSE 0.00 0.00 1.00 1.00 0.93 -1.14
## Board_ICSE 0.00 0.00 1.00 1.00 1.52 0.31
## Percent_HSC 13.34 40.00 94.70 54.70 0.29 -0.67
## Board_HSC* 0.00 1.00 3.00 2.00 -0.83 -1.13
## Stream_HSC* 0.00 1.00 3.00 2.00 -0.12 -0.72
## Percent_Degree 8.90 35.00 89.00 54.00 0.05 0.24
## Course_Degree* 1.48 1.00 7.00 6.00 0.00 -1.08
## Degree_Engg 0.00 0.00 1.00 1.00 2.76 5.63
## Experience_Yrs 0.00 0.00 3.00 3.00 1.27 1.17
## Entrance_Test* 0.00 1.00 9.00 8.00 -2.52 7.04
## S.TEST 0.00 0.00 1.00 1.00 -1.74 1.02
## Percentile_ET 25.20 0.00 98.69 98.69 -0.74 -0.69
## S.TEST.SCORE 25.20 0.00 98.69 98.69 -0.74 -0.69
## Percent_MBA 6.39 50.83 77.89 27.06 0.34 -0.52
## Specialization_MBA* 0.00 1.00 3.00 2.00 0.70 -0.56
## Marks_Communication 8.90 50.00 88.00 38.00 0.74 -0.25
## Marks_Projectwork 7.41 50.00 87.00 37.00 -0.26 -0.27
## Marks_BOCA 11.86 50.00 96.00 46.00 0.29 -0.85
## Placement* 0.00 1.00 2.00 1.00 -1.48 0.19
## Placement_B 0.00 0.00 1.00 1.00 -1.48 0.19
## Salary 88956.00 0.00 940000.00 940000.00 0.24 1.74
## se
## SlNo 5.72
## Gender* 0.02
## Gender.B 0.02
## Percent_SSC 0.55
## Board_SSC* 0.04
## Board_CBSE 0.02
## Board_ICSE 0.02
## Percent_HSC 0.58
## Board_HSC* 0.04
## Stream_HSC* 0.03
## Percent_Degree 0.45
## Course_Degree* 0.08
## Degree_Engg 0.01
## Experience_Yrs 0.03
## Entrance_Test* 0.07
## S.TEST 0.02
## Percentile_ET 1.58
## S.TEST.SCORE 1.58
## Percent_MBA 0.30
## Specialization_MBA* 0.03
## Marks_Communication 0.45
## Marks_Projectwork 0.36
## Marks_BOCA 0.48
## Placement* 0.02
## Placement_B 0.02
## Salary 6994.72
There are 390 entries in total. The average salary of a student is 219078.26 rupees.
Answer to 3a: median salary of all the students
median(mba.df$Salary)
## [1] 240000
The median is 240000 rupees.
Answer to 3b: the percentage of students who were placed
mytable <- xtabs(~ Placement_B, data=mba.df)
mytable
## Placement_B
## 0 1
## 79 312
format(round((prop.table(mytable)*100), 2), nsmall = 2)
## Placement_B
## 0 1
## "20.20" "79.80"
79.80% of the students were placed.
Answer to 3c: Creating and viewing a dataframe called ‘placed’, that contains a subset of only those students who were successfully placed
placed.df <- subset(mba.df, mba.df$Placement_B==1 )
View(placed.df)
Answer to 3d: median salary of students who were placed
median(placed.df$Salary)
## [1] 260000
The median salary of students who were placed is 260000 rupees
Answer to 3e: table showing the mean salary of males and females, who were placed
placed.salary.mean <- aggregate(placed.df$Salary, list(placed.df$Gender), mean)
colnames(placed.salary.mean)[2] <- "Mean Salary"
colnames(placed.salary.mean)[1] <- "Gender"
placed.salary.mean
## Gender Mean Salary
## 1 F 253068.0
## 2 M 284241.9
The mean salary of females who were placed is 253068.0 rupees and that of males who were placed is 284241.9 rupees
Answer to 3f: histogram showing MBA performance of the students who were placed
hist(placed.df$Percent_MBA, main="MBA Performance of placed students", ylab="Count", xlab="MBA Percentage", col=c("orange","red","black","green"), breaks=20, xlim=c(50,80), ylim=c(0,30))
From the graph, maximum number of students have got an average percentage of 62-63% during their mba.
Answer to 3g: Creating and viewing a dataframe called ‘notplaced’, that contains a set of only those students who were not placed after their MBA
notplaced.df <- subset(mba.df, mba.df$Placement_B==0 )
View(notplaced.df)
Answer to 3h: comparison of the MBA performance of Placed and Not Placed students
par(mfrow=c(1,2))
hist(placed.df$Percent_MBA, main="MBA Performance of placed students", ylab="Count", xlab="MBA Percentage", col="turquoise", breaks=20, xlim=c(50,80), ylim=c(0,30))
hist(notplaced.df$Percent_MBA, main="MBA Performance of NOT placed students", ylab="Count", xlab="MBA Percentage", col="turquoise", breaks=20, xlim=c(50,80), ylim=c(0,8))
From the graph, it is clear that the performance of the placed MBA students is just a bit better than that of not placed MBA students.
Answer to 3i: comparison of the distribution of salaries of males and females who were placed
boxplot(Salary ~ Gender, data=placed.df, yaxt="n", horizontal = TRUE, ylab="Gender",
xlab="Salary", main="Comparison of salaries of Males and Females", col=c("green", "red"))
axis(side=2, at=c(1,2), labels=c("Males" , "Females"))
From the boxplot, the females who are placed have higher salaries than males. The median salary of females are also higher than that of males who are placed.
Answer to 3j: Creating and viewing a dataframe called ‘placedet’, representing students who were placed after completing MBA and who also gave some MBA entrance test before admission into the MBA program
placedet.df <- subset(mba.df, mba.df$Placement_B==1 & mba.df$S.TEST==1)
View(placedet.df)
Answer to 3k: Scatter Plot Matrix for 3 variables: salary, percentage obtained in mba program and percentile obtained in mba entrance test
library(car)
##
## Attaching package: 'car'
## The following object is masked from 'package:psych':
##
## logit
scatterplotMatrix(formula= ~ Salary+Percent_MBA+Percentile_ET, cex=0.6, data=placedet.df, diagonal="density", main="Scatter Plot Matrix")
From the scatterplot, it is clear that students who have took an entrance test have performed better during their MBA program.