MBA programs are valued for the exposure and placements they offer, so the deans of almost all B-schools face the dilemma of setting qualitative as well as quantitative criteria for admitting students. A lot of information about applicants' academic and co-curricular performance and work experience is gathered to support the selection. The data set used here is a CSV file containing fields such as gender, percentage scored in the 10th and 12th board examinations, undergraduate performance, work experience, MBA entrance test taken, performance in the MBA, and salary after the MBA. The analysis is given below.
setwd("C:/Users/Dell/Desktop/Project/Week 1/Day 6")
dilemma.df=read.csv("Data - Deans Dilemma.csv")
View(dilemma.df)
a.) Percentage of female and male students
prop.table(table(dilemma.df$Gender),margin=NULL)*100
##
## F M
## 32.48082 67.51918
b.) Percentage of students who took an MBA entrance test
prop.table(table(dilemma.df$S.TEST),margin=NULL)*100
##
## 0 1
## 17.13555 82.86445
c.) Summary of SSC percentages of students from the CBSE board
library(psych)
CBSE_SSC=dilemma.df[which(dilemma.df$Board_SSC=='CBSE'),"Percent_SSC"]
summary(CBSE_SSC)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 40.00 55.00 61.00 62.92 70.00 85.80
describe(CBSE_SSC)
## vars n mean sd median trimmed mad min max range skew kurtosis
## X1 1 113 62.92 11.04 61 62.61 11.86 40 85.8 45.8 0.23 -0.74
## se
## X1 1.04
d.) Summary of SSC percentages of students from the ICSE board
ICSE_SSC=dilemma.df[which(dilemma.df$Board_SSC=='ICSE'),"Percent_SSC"]
summary(ICSE_SSC)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 50.0 59.0 64.0 65.4 72.0 87.0
describe(ICSE_SSC)
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 77 65.4 8.78 64 64.93 8.9 50 87 37 0.44 -0.67 1
e.) Summary of SSC percentages of students from other boards
Others_SSC=dilemma.df[which(dilemma.df$Board_SSC=='Others'),"Percent_SSC"]
summary(Others_SSC)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 37.00 56.00 66.30 65.34 75.00 87.20
describe(Others_SSC)
## vars n mean sd median trimmed mad min max range skew kurtosis
## X1 1 201 65.34 11.59 66.3 65.8 14.38 37 87.2 50.2 -0.28 -0.77
## se
## X1 0.82
The summary statistics are close across the three boards: the mean SSC percentages are 62.92 (CBSE), 65.40 (ICSE) and 65.34 (Others). On that basis, applications from each board can be evaluated in the same way, with almost the same cut-offs.
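The board-wise comparison can also be made in a single step instead of reading three separate summaries. The following is only a minimal sketch: the helper name `compare_boards` is hypothetical, and it assumes the data frame has the `Board_SSC` and `Percent_SSC` columns used above.

```r
# Hypothetical helper: mean SSC percentage per board and the gap
# between the highest and lowest board mean.
compare_boards <- function(df) {
  means <- sapply(split(df$Percent_SSC, df$Board_SSC), mean)
  list(means = round(means, 2),
       spread = diff(range(means)))
}

# Runs only if the data frame from read.csv() above is already loaded.
if (exists("dilemma.df")) {
  print(compare_boards(dilemma.df))
}
```

A small `spread` relative to the standard deviations reported by describe() is what supports using similar cut-offs for all boards.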
f.) Summary of HSC percentages of students who took Commerce
Commerce=dilemma.df[which(dilemma.df$Stream_HSC=='Commerce'),"Percent_HSC"]
summary(Commerce)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 40.00 57.00 66.72 66.52 75.15 94.00
g.) Summary of HSC percentages of students who took Science
Science=dilemma.df[which(dilemma.df$Stream_HSC=='Science'),"Percent_HSC"]
summary(Science)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 42.00 52.00 58.00 59.76 67.65 94.70
h.) Summary of HSC percentages of students who took Arts
Arts=dilemma.df[which(dilemma.df$Stream_HSC=='Arts'),"Percent_HSC"]
summary(Arts)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 40.00 59.00 63.00 64.05 72.25 83.00
The summary reports show that the streams performed differently in the HSC board examinations: the median is 66.72 for Commerce, 63.00 for Arts and 58.00 for Science. The cut-offs should therefore vary by stream.
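To see the difference between streams at a glance, the three summaries can be placed side by side in one table. This is a sketch under the assumption that the `Stream_HSC` and `Percent_HSC` columns are as used above; the function name `stream_summary` is made up for illustration.

```r
# Hypothetical helper: one column of summary statistics per HSC stream.
stream_summary <- function(df) {
  sapply(split(df$Percent_HSC, df$Stream_HSC), summary)
}

# Runs only if the data frame from read.csv() above is already loaded.
if (exists("dilemma.df")) {
  print(round(stream_summary(dilemma.df), 2))
}
```

Reading across the "Median" and "Mean" rows of the resulting matrix shows how far apart the streams are.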
i.) Summary of scores in the MBA entrance tests
summary(dilemma.df$S.TEST.SCORE)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00 41.19 62.00 54.93 78.00 98.69
median(dilemma.df$Salary)
## [1] 240000
m=prop.table(xtabs(~Placement, dilemma.df),margin=NULL)
m*100
## Placement
## Not Placed Placed
## 20.2046 79.7954
PLACED.df=dilemma.df[which(dilemma.df$Placement_B==1),]
View(PLACED.df)
median(PLACED.df$Salary)
## [1] 260000
Female_Salary=PLACED.df[which(PLACED.df$Gender.B==1),"Salary"]
Male_Salary=PLACED.df[which(PLACED.df$Gender.B==0),"Salary"]
Mean_Salary=c(mean(Female_Salary),mean(Male_Salary))
Mean_Salary
## [1] 253068.0 284241.9
GENERATING A HISTOGRAM SHOWING A BREAKUP OF THE MBA PERFORMANCE OF THE STUDENTS WHO WERE PLACED
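The plotting code for this step is not shown here, so the following is only a sketch of how such a histogram could be drawn. It assumes that MBA performance is stored in a column named `Percent_MBA`, a name that does not appear in the code above. The function name and axis labels are also illustrative.

```r
# Hypothetical helper: histogram of MBA performance for a data frame.
# Assumption: the MBA percentage column is called "Percent_MBA".
plot_mba_hist <- function(df, col = "Percent_MBA",
                          main = "MBA performance of placed students") {
  stopifnot(col %in% names(df))
  hist(df[[col]], main = main, xlab = "MBA percentage", col = "grey")
}

# Runs only if the PLACED.df subset created above is available.
if (exists("PLACED.df")) {
  plot_mba_hist(PLACED.df)
}
```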
UNPLACED.df, A SUBSET OF ONLY THOSE STUDENTS WHO WERE NOT PLACED
UNPLACED.df=dilemma.df[which(dilemma.df$Placement_B==0),]
View(UNPLACED.df)
DRAWING TWO HISTOGRAMS SIDE-BY-SIDE, VISUALLY COMPARING THE MBA PERFORMANCE OF PLACED AND NOT PLACED STUDENTS
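One possible way to draw the two histograms next to each other is to split the plotting area with par(mfrow = c(1, 2)). This is a sketch, not the original code: the column name `Percent_MBA` and the function name `compare_mba_hist` are assumptions. A shared x-axis range is used so that the two panels are directly comparable.

```r
# Hypothetical helper: side-by-side histograms of MBA performance.
# Assumption: the MBA percentage column is called "Percent_MBA".
compare_mba_hist <- function(placed, unplaced, col = "Percent_MBA") {
  old <- par(mfrow = c(1, 2))   # one row, two panels
  on.exit(par(old))             # restore the previous layout afterwards
  xlim <- range(c(placed[[col]], unplaced[[col]]), na.rm = TRUE)
  h1 <- hist(placed[[col]], xlim = xlim, main = "Placed",
             xlab = "MBA percentage", col = "grey")
  h2 <- hist(unplaced[[col]], xlim = xlim, main = "Not placed",
             xlab = "MBA percentage", col = "grey")
  invisible(list(placed = h1, unplaced = h2))
}

# Runs only if both subsets from the steps above are available.
if (exists("PLACED.df") && exists("UNPLACED.df")) {
  compare_mba_hist(PLACED.df, UNPLACED.df)
}
```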
DRAWING TWO BOXPLOTS, ONE BELOW THE OTHER, COMPARING THE DISTRIBUTION OF SALARIES OF MALES AND FEMALES WHO WERE PLACED
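A sketch of this comparison, assuming the `Salary` and `Gender` columns used earlier: a horizontal boxplot with a formula places one box per gender, one below the other, on a common salary axis. The function name `salary_boxplot` and the plot labels are illustrative only.

```r
# Hypothetical helper: horizontal boxplots of salary, one box per gender.
salary_boxplot <- function(placed) {
  boxplot(Salary ~ Gender, data = placed, horizontal = TRUE,
          main = "Salary of placed students by gender",
          xlab = "Salary", ylab = "Gender")
}

# Runs only if the PLACED.df subset created above is available.
if (exists("PLACED.df")) {
  salary_boxplot(PLACED.df)
}
```

The function returns the statistics computed by boxplot(), so the medians and outliers can also be inspected numerically.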
PlacedET.df, A SUBSET OF STUDENTS WHO WERE PLACED AFTER THE MBA AND WHO ALSO TOOK AN MBA ENTRANCE TEST BEFORE ADMISSION TO THE MBA PROGRAM
PlacedET.df=dilemma.df[which(dilemma.df$S.TEST==1&dilemma.df$Placement_B==1),]
View(PlacedET.df)
##
## Attaching package: 'car'
## The following object is masked from 'package:psych':
##
## logit