1. Synopsis

From among the applicants seeking admission to their MBA programs, B-schools had to identify the students who could successfully complete the course and eventually get placed. Identifying such students was a critical decision, so it became imperative to find the factors that differentiate two categories of students: those who could be placed easily and those who would struggle to get placed. Each year, the admissions committee of every B-school faced the tough task of screening applicants and selecting those who would eventually succeed in the MBA program. MBA admissions needed much more analytical reasoning that took multiple criteria into consideration. The admissions team wanted to understand whether a student’s academic record had any bearing on placement status.

2. Exercise

Read the data set into RStudio.

dilemma.df <- read.csv("Data - Deans Dilemma.csv")
View(dilemma.df)

Create summary statistics (e.g. mean, standard deviation, median, mode) for the important variables in the dataset.

library(psych)
describe(dilemma.df)
##                     vars   n      mean        sd    median   trimmed
## SlNo                   1 391    196.00    113.02    196.00    196.00
## Gender*                2 391      1.68      0.47      2.00      1.72
## Gender.B               3 391      0.32      0.47      0.00      0.28
## Percent_SSC            4 391     64.65     10.96     64.50     64.76
## Board_SSC*             5 391      2.23      0.87      3.00      2.28
## Board_CBSE             6 391      0.29      0.45      0.00      0.24
## Board_ICSE             7 391      0.20      0.40      0.00      0.12
## Percent_HSC            8 391     63.80     11.42     63.00     63.34
## Board_HSC*             9 391      2.39      0.85      3.00      2.48
## Stream_HSC*           10 391      2.34      0.56      2.00      2.36
## Percent_Degree        11 391     62.98      8.92     63.00     62.91
## Course_Degree*        12 391      3.85      1.61      4.00      3.81
## Degree_Engg           13 391      0.09      0.29      0.00      0.00
## Experience_Yrs        14 391      0.48      0.67      0.00      0.36
## Entrance_Test*        15 391      5.85      1.35      6.00      6.08
## S.TEST                16 391      0.83      0.38      1.00      0.91
## Percentile_ET         17 391     54.93     31.17     62.00     56.87
## S.TEST.SCORE          18 391     54.93     31.17     62.00     56.87
## Percent_MBA           19 391     61.67      5.85     61.01     61.45
## Specialization_MBA*   20 391      1.47      0.56      1.00      1.42
## Marks_Communication   21 391     60.54      8.82     58.00     59.68
## Marks_Projectwork     22 391     68.36      7.15     69.00     68.60
## Marks_BOCA            23 391     64.38      9.58     63.00     64.08
## Placement*            24 391      1.80      0.40      2.00      1.87
## Placement_B           25 391      0.80      0.40      1.00      0.87
## Salary                26 391 219078.26 138311.65 240000.00 217011.50
##                          mad   min       max     range  skew kurtosis
## SlNo                  145.29  1.00    391.00    390.00  0.00    -1.21
## Gender*                 0.00  1.00      2.00      1.00 -0.75    -1.45
## Gender.B                0.00  0.00      1.00      1.00  0.75    -1.45
## Percent_SSC            12.60 37.00     87.20     50.20 -0.06    -0.72
## Board_SSC*              0.00  1.00      3.00      2.00 -0.45    -1.53
## Board_CBSE              0.00  0.00      1.00      1.00  0.93    -1.14
## Board_ICSE              0.00  0.00      1.00      1.00  1.52     0.31
## Percent_HSC            13.34 40.00     94.70     54.70  0.29    -0.67
## Board_HSC*              0.00  1.00      3.00      2.00 -0.83    -1.13
## Stream_HSC*             0.00  1.00      3.00      2.00 -0.12    -0.72
## Percent_Degree          8.90 35.00     89.00     54.00  0.05     0.24
## Course_Degree*          1.48  1.00      7.00      6.00  0.00    -1.08
## Degree_Engg             0.00  0.00      1.00      1.00  2.76     5.63
## Experience_Yrs          0.00  0.00      3.00      3.00  1.27     1.17
## Entrance_Test*          0.00  1.00      9.00      8.00 -2.52     7.04
## S.TEST                  0.00  0.00      1.00      1.00 -1.74     1.02
## Percentile_ET          25.20  0.00     98.69     98.69 -0.74    -0.69
## S.TEST.SCORE           25.20  0.00     98.69     98.69 -0.74    -0.69
## Percent_MBA             6.39 50.83     77.89     27.06  0.34    -0.52
## Specialization_MBA*     0.00  1.00      3.00      2.00  0.70    -0.56
## Marks_Communication     8.90 50.00     88.00     38.00  0.74    -0.25
## Marks_Projectwork       7.41 50.00     87.00     37.00 -0.26    -0.27
## Marks_BOCA             11.86 50.00     96.00     46.00  0.29    -0.85
## Placement*              0.00  1.00      2.00      1.00 -1.48     0.19
## Placement_B             0.00  0.00      1.00      1.00 -1.48     0.19
## Salary              88956.00  0.00 940000.00 940000.00  0.24     1.74
##                          se
## SlNo                   5.72
## Gender*                0.02
## Gender.B               0.02
## Percent_SSC            0.55
## Board_SSC*             0.04
## Board_CBSE             0.02
## Board_ICSE             0.02
## Percent_HSC            0.58
## Board_HSC*             0.04
## Stream_HSC*            0.03
## Percent_Degree         0.45
## Course_Degree*         0.08
## Degree_Engg            0.01
## Experience_Yrs         0.03
## Entrance_Test*         0.07
## S.TEST                 0.02
## Percentile_ET          1.58
## S.TEST.SCORE           1.58
## Percent_MBA            0.30
## Specialization_MBA*    0.03
## Marks_Communication    0.45
## Marks_Projectwork      0.36
## Marks_BOCA             0.48
## Placement*             0.02
## Placement_B            0.02
## Salary              6994.72
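The describe() output above covers the mean, standard deviation and median but not the mode, and base R's mode() returns an object's storage type rather than its most frequent value. A minimal sketch of one way to get it, assuming a helper with the hypothetical name stat_mode and a made-up vector standing in for a column such as dilemma.df$Experience_Yrs:

```r
# Hypothetical helper (not part of base R or psych):
# returns the most frequent value(s) of a vector.
stat_mode <- function(x) {
  counts <- table(x)                     # frequency of each distinct value
  names(counts)[counts == max(counts)]   # value(s) with the highest count, as character
}

# Made-up values standing in for a column such as dilemma.df$Experience_Yrs
experience <- c(0, 0, 1, 0, 2, 1, 0)
stat_mode(experience)
```

The helper returns every tied value, so a bimodal column yields a vector of length two. Because table() labels are character strings, wrap the result in as.numeric() when a numeric mode is needed.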

Use R to calculate the median salary of all the students in the data sample.

median(dilemma.df$Salary)
## [1] 240000

Use R to calculate the percentage of students who were placed, correct to 2 decimal places.

mytable<-with(dilemma.df,table(Placement_B))
format(round(prop.table(mytable)*100,2), nsmall = 2)
## Placement_B
##       0       1 
## "20.20" "79.80"
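Since Placement_B is a 0/1 indicator, its mean is the proportion of placed students, which is why describe() reports a mean of 0.80 for it. A minimal sketch of that shortcut, using a made-up vector in place of dilemma.df$Placement_B:

```r
# Made-up 0/1 placement indicator standing in for dilemma.df$Placement_B
placement_b <- c(1, 1, 0, 1, 1)

# The mean of a 0/1 vector is the proportion of ones
pct_placed <- round(100 * mean(placement_b), 2)
pct_placed
```

Unlike format(), this keeps the result numeric, so it can be reused in later calculations.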

Use R to create a dataframe called placed that contains a subset of only those students who were successfully placed. Also use R to find the median salary of students who were placed.

placed.df<-dilemma.df[which(dilemma.df$Placement=="Placed"),]
median(placed.df$Salary)
## [1] 260000

Use R to create a table showing the mean salary of males and females who were placed.

aggregate(placed.df$Salary, by=list(Gender=placed.df$Gender),mean)
##   Gender        x
## 1      F 253068.0
## 2      M 284241.9
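In the table above the salary column is labelled x, because aggregate() was called with a bare vector. A minimal sketch of the formula interface, which keeps the column name, using made-up rows in place of placed.df:

```r
# Made-up rows standing in for placed.df
toy_placed <- data.frame(
  Gender = c("F", "M", "F", "M"),
  Salary = c(200000, 300000, 250000, 260000)
)

# Formula interface: the result column keeps the name Salary instead of x
mean_salary <- aggregate(Salary ~ Gender, data = toy_placed, FUN = mean)
mean_salary
```

On the real data the equivalent call would be aggregate(Salary ~ Gender, data = placed.df, FUN = mean).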

Use R to generate a histogram showing a breakdown of the MBA performance of the students who were placed.

hist(placed.df$Percent_MBA, main = "MBA Performance of placed students",xlab = "MBA Percentage", ylab = "Count", xlim=c(50,80),ylim=c(0,150),breaks = 3, col = "gray")

Create a dataframe called notplaced, that contains a subset of only those students who were NOT placed after their MBA.

notplaced.df<-dilemma.df[which(dilemma.df$Placement=="Not Placed"),]

Draw two histograms side by side, visually comparing the MBA performance of Placed and Not Placed students.

par(mfrow=c(1,2))
hist(placed.df$Percent_MBA, main = "MBA Performance of placed students", xlab = "MBA Percentage", ylab = "Count", xlim = c(50,80), ylim = c(0,150), breaks = 3, col = "gray")
hist(notplaced.df$Percent_MBA, main = "MBA Performance of not placed students", xlab = "MBA Percentage", ylab = "Count", xlim = c(50,80), ylim = c(0,30), breaks = 3, col = "gray")
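The two panels above use different y-axis ranges (0 to 150 and 0 to 30), and breaks = 3 is only a suggestion to hist(), so bar heights cannot be compared directly across them. One possible alternative, sketched here with made-up percentages rather than the real columns, is to give both groups the same bins and plot densities instead of counts:

```r
# Made-up MBA percentages standing in for placed.df$Percent_MBA
# and notplaced.df$Percent_MBA
placed_pct    <- c(55, 58, 61, 63, 64, 66, 70, 74)
notplaced_pct <- c(52, 54, 57, 59, 62)

# Identical bin edges for both groups (assumed range 50 to 80)
shared_breaks <- seq(50, 80, by = 10)

par(mfrow = c(1, 2))
# freq = FALSE plots densities, so groups of different size are comparable
h_placed <- hist(placed_pct, breaks = shared_breaks, freq = FALSE,
                 ylim = c(0, 0.1), main = "Placed",
                 xlab = "MBA Percentage", col = "gray")
h_notplaced <- hist(notplaced_pct, breaks = shared_breaks, freq = FALSE,
                    ylim = c(0, 0.1), main = "Not placed",
                    xlab = "MBA Percentage", col = "gray")
par(mfrow = c(1, 1))
```

The bin width and the shared ylim are illustrative choices. With the real data they would need adjusting to the observed range of Percent_MBA (50.83 to 77.89 in the summary table).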

Use R to draw two boxplots, one below the other, comparing the distribution of salaries of males and females who were placed.

par(mfrow=c(1,1))
boxplot(Salary~Gender,data = placed.df, horizontal = TRUE, yaxt="n",ylab="Gender", xlab="Salary", main="Comparison of Salaries of Males and Females")
axis(side=2,at=c(1,2), labels=c("Females", "Males"))

Create a dataframe called placedET, representing students who were placed after the MBA and who also took an MBA entrance test before admission into the MBA program.

placedET.df<-dilemma.df[which(dilemma.df$Placement=="Placed"&dilemma.df$S.TEST==1),]
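The same two-condition filter can also be written with subset(), which evaluates the condition inside the data frame and so avoids repeating its name. A minimal sketch on made-up rows standing in for dilemma.df:

```r
# Made-up rows standing in for dilemma.df
toy <- data.frame(
  Placement = c("Placed", "Not Placed", "Placed", "Placed"),
  S.TEST    = c(1, 1, 0, 1),
  Salary    = c(250000, 0, 220000, 300000)
)

# Keep rows that satisfy both conditions; like which(), subset() drops
# rows where the condition evaluates to NA
toy_subset <- subset(toy, Placement == "Placed" & S.TEST == 1)
nrow(toy_subset)
```

subset() is convenient for interactive work like this exercise. The bracket form used above is the usual choice inside functions.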

Draw a Scatter Plot Matrix for 3 variables – {Salary, Percent_MBA, Percentile_ET} using the dataframe placedET.

library(car)
## 
## Attaching package: 'car'
## The following object is masked from 'package:psych':
## 
##     logit
scatterplotMatrix(formula=~Salary+Percent_MBA+Percentile_ET, cex=1.2, data=placedET.df,diagonal="density")
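A scatter plot matrix shows the shape of each pairwise relationship but not its strength. A minimal sketch of a companion correlation matrix, computed on made-up rows in place of placedET.df:

```r
# Made-up rows standing in for placedET.df
toy_placedET <- data.frame(
  Salary        = c(200000, 240000, 260000, 300000, 350000),
  Percent_MBA   = c(55, 60, 62, 66, 70),
  Percentile_ET = c(40, 75, 55, 90, 65)
)

# Pairwise Pearson correlations for the three plotted variables
cor_matrix <- round(cor(toy_placedET), 2)
cor_matrix
```

On the real data the equivalent call would be cor(placedET.df[, c("Salary", "Percent_MBA", "Percentile_ET")]). The coefficients it returns describe linear association only, among students who were placed.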

3. Conclusion

A candidate's scores during the MBA, along with the percentile score in the entrance test, play a decisive role in whether or not that candidate is placed. These parameters should therefore be kept in mind when granting admission to a candidate.