The Dean’s Dilemma Case Study

3a) Median salary of all the students in the data sample:

median(dean$Salary)
## [1] 240000

3b) Percentage of students who were placed, correct to 2 decimal places:

format(round(100*mean(dean$Placement_B),2),nsmall=2)
## [1] "79.80"
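The call above relies on Placement_B being a 0/1 indicator, so its mean is the proportion of placed students. A minimal sketch of that equivalence, using a made-up vector (placement_b is hypothetical and is not the case-study data):

```r
# Hypothetical 0/1 placement indicator; NOT the case-study data.
placement_b <- c(1, 1, 0, 1, 0)

# The mean of a 0/1 vector is the share of ones ...
pct_mean <- 100 * mean(placement_b)

# ... which matches counting the ones explicitly.
pct_count <- 100 * sum(placement_b == 1) / length(placement_b)

stopifnot(isTRUE(all.equal(pct_mean, pct_count)))

# nsmall = 2 keeps trailing zeros, so the result is the string "60.00".
format(round(pct_mean, 2), nsmall = 2)
```

Note that format() returns a character string, which is why the output above is printed in quotes.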

3c) Create a dataframe called placed that contains only those students who were successfully placed:

placed<-dean[which(dean$Placement_B==1),]
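One possible reason to wrap the condition in which() is that it drops missing values, whereas plain logical indexing returns an all-NA row for each NA in the condition. A small sketch on a made-up dataframe (toy and its columns are hypothetical; whether the case-study data contain any NA is not stated here):

```r
# Hypothetical data with one missing placement flag; NOT the case-study data.
toy <- data.frame(id = 1:4, Placement_B = c(1, 0, NA, 1))

# which() returns only the positions where the condition is TRUE.
with_which <- toy[which(toy$Placement_B == 1), ]

# Logical indexing keeps a row of NAs for the missing flag.
with_logical <- toy[toy$Placement_B == 1, ]

nrow(with_which)    # 2 rows: ids 1 and 4
nrow(with_logical)  # 3 rows: ids 1 and 4 plus one all-NA row
```

If the indicator has no missing values, both forms select the same rows.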

3d) Median salary of students who were placed:

median(placed$Salary)
## [1] 260000

3e) Table showing the mean salary of males and females who were placed:

aggregate(placed$Salary,by=list(Sex=placed$Gender),mean)
##   Sex        x
## 1   F 253068.0
## 2   M 284241.9
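The result column above is labelled x because the salary vector was passed without a name. As an alternative sketch, the formula interface of aggregate() keeps the original column names; the dataframe toy below is made up for illustration and is not the case-study data:

```r
# Hypothetical salaries; NOT the case-study data.
toy <- data.frame(
  Gender = c("F", "F", "M", "M"),
  Salary = c(200000, 300000, 250000, 350000)
)

# Formula interface: the output columns are named Gender and Salary.
mean_by_gender <- aggregate(Salary ~ Gender, data = toy, FUN = mean)
mean_by_gender

# tapply() gives the same means as a named vector.
tapply(toy$Salary, toy$Gender, mean)
```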

3f) Histogram showing a breakdown of the MBA performance of the students who were placed:

hist(placed$Percent_MBA,main="MBA Performance of placed students",xlab="MBA Percentage",ylab="Count",breaks=2,col="grey")

3g) Create a dataframe called notplaced that contains only those students who were NOT placed after their MBA:

notplaced<-dean[which(dean$Placement_B==0),]

3h) Draw two histograms side by side, visually comparing the MBA performance of placed and not-placed students:

par(mfrow=c(1, 2))
hist(placed$Percent_MBA,main="MBA Performance of placed students",xlab="MBA Percentage",ylab="Count",breaks=2,col="grey")
hist(notplaced$Percent_MBA,main="MBA Performance of not placed students",xlab="MBA Percentage",ylab="Count",breaks=2,col="grey")

par(mfrow=c(1, 1))
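When breaks is a single number, hist() treats it only as a suggestion and picks its own bin edges for each call, so the two panels may not share the same bins. A sketch of one way to make the comparison like-for-like, assuming simulated percentages (placed_pct and notplaced_pct are hypothetical and are not the case-study data):

```r
# Hypothetical MBA percentages; NOT the case-study data.
set.seed(1)
placed_pct    <- runif(40, min = 55, max = 80)
notplaced_pct <- runif(15, min = 50, max = 75)

# One set of bin edges covering both groups, so the panels are comparable.
all_pct <- c(placed_pct, notplaced_pct)
brks <- pretty(range(all_pct), n = 8)

op <- par(mfrow = c(1, 2))  # save the current layout, then switch to 1 x 2
hist(placed_pct, breaks = brks, main = "Placed",
     xlab = "MBA Percentage", ylab = "Count", col = "grey")
hist(notplaced_pct, breaks = brks, main = "Not placed",
     xlab = "MBA Percentage", ylab = "Count", col = "grey")
par(op)                     # restore the saved layout
```

Saving the value returned by par() and passing it back restores whatever layout was active before, rather than assuming it was 1 x 1.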

3i) Draw two boxplots, one below the other, comparing the distribution of salaries of males and females who were placed:

boxplot(Salary ~ Gender,data=placed,horizontal=TRUE, yaxt="n",ylab="Gender", xlab="Salary",main="Comparison of Salaries of Males and Females")
axis(side=2,at=c(1,2),labels=c("Females","Males"))

3j) Create a dataframe called placedET, representing students who were placed after the MBA and who also took an MBA entrance test before admission to the MBA program:

placedET<-placed[which(placed$S.TEST==1),]

3k) Draw a scatter plot matrix for 3 variables {Salary, Percent_MBA, Percentile_ET} using the dataframe placedET:

library(car)  # scatterplotMatrix() is provided by the car package
scatterplotMatrix(placedET[,c("Salary","Percent_MBA","Percentile_ET")],spread=FALSE, smoother.args=list(lty=2),main="Scatter Plot Matrix")
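If the car package is not available, a similar matrix can be sketched with base graphics. The example below uses pairs() with panel.smooth, which is a substitute technique and not the scatterplotMatrix() call used above; the dataframe toy is simulated for illustration and is not the case-study data:

```r
# Hypothetical values for the three variables; NOT the case-study data.
set.seed(42)
toy <- data.frame(
  Salary        = round(rnorm(30, mean = 270000, sd = 40000)),
  Percent_MBA   = runif(30, min = 55, max = 75),
  Percentile_ET = runif(30, min = 40, max = 99)
)

# Base-graphics scatter plot matrix with a smoothed trend line in each panel.
pairs(toy[, c("Salary", "Percent_MBA", "Percentile_ET")],
      panel = panel.smooth, main = "Scatter Plot Matrix")

# Pairwise correlations summarise the same relationships numerically.
cors <- cor(toy)
```

Unlike scatterplotMatrix(), this version does not draw density estimates on the diagonal.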