The Dean’s Dilemma Case Study
3a) Median salary of all the students in the data sample:
median(dean$Salary)
## [1] 240000
3b) Percentage of students who were placed, correct to 2 decimal places:
format(round(100*mean(dean$Placement_B),2),nsmall=2)
## [1] "79.80"
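The result above is quoted because `format()` returns a character string, not a number. A minimal sketch of the same formatting on a stand-alone value (the variable `pct` is a hypothetical stand-in that mirrors the output shown above), with `sprintf()` as an equivalent alternative:

```r
# Stand-in percentage used only for illustration
pct <- 79.8

# round() yields a number; format(nsmall = 2) pads it to two decimals as text
formatted <- format(round(pct, 2), nsmall = 2)

# sprintf() rounds and pads in one step, also returning a string
formatted_alt <- sprintf("%.2f", pct)

print(formatted)
print(formatted_alt)
```

Either form is fine for display; keep the unformatted number if it is needed for further arithmetic.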
3c) Create a dataframe called placed that contains only those students who were successfully placed:
placed<-dean[which(dean$Placement_B==1),]
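Wrapping the condition in `which()` matters only when the indicator column contains missing values. The sketch below illustrates the difference on a hypothetical toy data frame (`toy`, `id`, and `flag` are made-up names, with `flag` standing in for a 0/1 column such as `Placement_B`):

```r
# Hypothetical toy data; 'flag' stands in for a 0/1 indicator column
toy <- data.frame(id = 1:4, flag = c(1, 0, NA, 1))

# Plain logical indexing keeps an all-NA row for the missing flag
by_logical <- toy[toy$flag == 1, ]

# which() returns only the positions that are TRUE, so the NA is dropped
by_which <- toy[which(toy$flag == 1), ]

# subset() also treats NA as "not selected"
by_subset <- subset(toy, flag == 1)

print(nrow(by_logical))
print(nrow(by_which))
print(nrow(by_subset))
```

If the indicator has no missing values, all three approaches select the same rows.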
3d) Median salary of students who were placed:
median(placed$Salary)
## [1] 260000
3e) Table showing the mean salary of males and females who were placed:
aggregate(placed$Salary,by=list(Sex=placed$Gender),mean)
## Sex x
## 1 F 253068.0
## 2 M 284241.9
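The value column in the table above is labelled `x` because `aggregate()` was given a bare vector. As a sketch, the formula interface keeps the original column names; the toy data frame below is hypothetical and its salaries are made up for illustration only:

```r
# Hypothetical toy data standing in for the 'placed' data frame
toy <- data.frame(
  Gender = c("F", "F", "M", "M"),
  Salary = c(200000, 300000, 250000, 350000)
)

# Formula interface: result columns are named Gender and Salary
by_gender <- aggregate(Salary ~ Gender, data = toy, FUN = mean)
print(by_gender)

# tapply() gives the same means as a named vector
means <- tapply(toy$Salary, toy$Gender, mean)
print(means)
```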
3f) Histogram showing a breakdown of the MBA performance of the students who were placed:
hist(placed$Percent_MBA,main="MBA Performance of placed students",xlab="MBA Percentage",ylab="Count",breaks=2,col="grey")
3g) Create a dataframe called notplaced that contains only those students who were NOT placed after their MBA:
notplaced<-dean[which(dean$Placement_B==0),]
3h) Draw two histograms side by side, visually comparing the MBA performance of Placed and Not Placed students:
par(mfrow=c(1, 2))
hist(placed$Percent_MBA, main="MBA Performance of placed students",xlab="MBA Percentage",ylab="Count",breaks=2,col="grey")
hist(notplaced$Percent_MBA,main="MBA Performance of not placed students",xlab="MBA Percentage",ylab="Count",breaks=2,col="grey")
par(mfrow=c(1, 1))
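A single number passed to `breaks` is only a suggestion to `hist()`, so the two panels are not guaranteed to share bin edges. One possible refinement, sketched below on made-up percentages (the vectors `placed_pct` and `notplaced_pct` and the bin edges in `brks` are hypothetical), passes an explicit vector of breaks to both plots and restores the previous graphics settings from the value returned by `par()`:

```r
# Hypothetical MBA percentages, made up for illustration only
placed_pct    <- c(55, 58, 62, 64, 66, 68, 71, 74)
notplaced_pct <- c(52, 54, 57, 59, 61, 63, 69)

# One set of bin edges covering both groups
brks <- seq(50, 80, by = 10)

# par() returns the previous settings, so they can be restored afterwards
op <- par(mfrow = c(1, 2))
h1 <- hist(placed_pct, breaks = brks, main = "Placed",
           xlab = "MBA Percentage", ylab = "Count", col = "grey")
h2 <- hist(notplaced_pct, breaks = brks, main = "Not placed",
           xlab = "MBA Percentage", ylab = "Count", col = "grey")
par(op)
```

With identical bin edges, bar heights in the two panels can be compared bin by bin.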
3i) Draw two boxplots, one below the other, comparing the distribution of salaries of males and females who were placed:
boxplot(Salary ~ Gender,data=placed,horizontal=TRUE, yaxt="n",ylab="Gender", xlab="Salary",main="Comparison of Salaries of Males and Females")
axis(side=2,at=c(1,2),labels=c("Females","Males"))
3j) Create a dataframe called placedET, representing students who were placed after the MBA and who also took an MBA entrance test before admission into the MBA program:
placedET<-placed[which(placed$S.TEST==1),]
3k) Draw a scatter plot matrix for 3 variables {Salary, Percent_MBA, Percentile_ET} using the dataframe placedET:
library(car) # scatterplotMatrix() is provided by the car package
scatterplotMatrix(placedET[,c("Salary","Percent_MBA","Percentile_ET")],spread=FALSE, smoother.args=list(lty=2),main="Scatter Plot Matrix")
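If the car package is not available, a comparable plot can be sketched with base graphics. This swaps in `pairs()` with `panel.smooth`, which is not the method used above, and the toy data frame holds made-up values for illustration only:

```r
# Hypothetical values, made up for illustration only
toy <- data.frame(
  Salary        = c(200000, 240000, 260000, 300000, 320000, 350000),
  Percent_MBA   = c(55, 60, 58, 66, 63, 70),
  Percentile_ET = c(40, 55, 62, 70, 68, 85)
)

# Base-graphics scatter plot matrix; panel.smooth adds a lowess curve per panel
pairs(toy, panel = panel.smooth, main = "Scatter Plot Matrix")

# Pairwise Pearson correlations for the same three variables
cors <- cor(toy)
print(round(cors, 2))
```

The correlation matrix gives a numeric summary of the relationships that the plot shows visually.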