Tanushree
4th October 2017
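# stringsAsFactors = TRUE is needed in R >= 4.0 to reproduce the factor counts in the summary below
dean = read.csv("DeansDilemmaData.csv", stringsAsFactors = TRUE)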
summary(dean)
SlNo           Gender  Percent_SSC    Board_SSC   Board_CBSE
Min.   :  1.0  F:127   Min.   :37.00  CBSE  :113  No :278
1st Qu.: 98.5  M:264   1st Qu.:56.00  ICSE  : 77  Yes:113
Median :196.0          Median :64.50  Others:201
Mean   :196.0          Mean   :64.65
3rd Qu.:293.5          3rd Qu.:74.00
Max.   :391.0          Max.   :87.20
Board_ICSE  Percent_HSC   Board_HSC   Stream_HSC    Percent_Degree
No :314     Min.   :40.0  CBSE  : 96  Arts    : 18  Min.   :35.00
Yes: 77     1st Qu.:54.0  ISC   : 48  Commerce:222  1st Qu.:57.52
            Median :63.0  Others:247  Science :151  Median :63.00
            Mean   :63.8                            Mean   :62.98
            3rd Qu.:72.0                            3rd Qu.:69.00
            Max.   :94.7                            Max.   :89.00
Course_Degree              Degree_Engg  Experience_Yrs  Entrance_Test
Arts                 : 13  No :354      Min.   :0.0000  MAT    :265
Commerce             :117  Yes: 37      1st Qu.:0.0000  None   : 67
Computer Applications: 32               Median :0.0000  K-MAT  : 24
Engineering          : 37               Mean   :0.4783  CAT    : 22
Management           :163               3rd Qu.:1.0000  PGCET  :  8
Others               :  5               Max.   :3.0000  GCET   :  2
Science              : 24                               (Other):  3
S.TEST          Percentile_ET  S.TEST.SCORE   Percent_MBA
Min.   :0.0000  Min.   : 0.00  Min.   : 0.00  Min.   :50.83
1st Qu.:1.0000  1st Qu.:41.19  1st Qu.:41.19  1st Qu.:57.20
Median :1.0000  Median :62.00  Median :62.00  Median :61.01
Mean   :0.8286  Mean   :54.93  Mean   :54.93  Mean   :61.67
3rd Qu.:1.0000  3rd Qu.:78.00  3rd Qu.:78.00  3rd Qu.:66.02
Max.   :1.0000  Max.   :98.69  Max.   :98.69  Max.   :77.89
Specialization_MBA       Marks_Communication  Marks_Projectwork
Marketing & Finance:222  Min.   :50.00        Min.   :50.00
Marketing & HR     :156  1st Qu.:53.00        1st Qu.:64.00
Marketing & IB     : 13  Median :58.00        Median :69.00
                         Mean   :60.54        Mean   :68.36
                         3rd Qu.:67.00        3rd Qu.:74.00
                         Max.   :88.00        Max.   :87.00
Marks_BOCA     Placement       Salary
Min.   :50.00  Not Placed: 79  Min.   :     0
1st Qu.:57.00  Placed    :312  1st Qu.:172800
Median :63.00                  Median :240000
Mean   :64.38                  Mean   :219078
3rd Qu.:72.50                  3rd Qu.:300000
Max.   :96.00                  Max.   :940000
median(dean$Salary)
[1] 240000
round(prop.table(table(dean$Placement)) * 100, 2)[2]
Placed
79.8
placed = subset(dean, Placement == 'Placed')
median(placed$Salary)
[1] 260000
aggregate(Salary ~ Gender, data = placed , mean)
Gender Salary
1 F 253068.0
2 M 284241.9
hist(placed$Percent_MBA, freq = FALSE)
notplaced = subset(dean, Placement == 'Not Placed')
library(lattice)
histogram(~ Percent_MBA | Placement, data=dean)
boxplot(Salary ~ Gender, data = placed, horizontal = TRUE, col = c("lightpink", "skyblue"))
placedET = subset(placed,Entrance_Test != 'None')
library(car)
attach(placedET)
# scatterplot.matrix() is no longer available in current versions of car; scatterplotMatrix() replaces it
scatterplotMatrix(~ Salary + Percent_MBA + Percentile_ET, data = placedET,
main = "Three variable plot {Salary, Percent_MBA, Percentile_ET}")
mean(placed$Salary)
[1] 274550
Shapiro Test
shapiro.test(dean$Salary)
Shapiro-Wilk normality test
data: dean$Salary
W = 0.89789, p-value = 1.619e-15
Since the p-value (1.619e-15) is less than 0.05, the null hypothesis of normality is rejected: there is evidence that the salary data are not normally distributed.
QQ Plot
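# Salary resolved to placedET$Salary through attach(placedET); the reference is now explicit.
# Note that this is the placed, entrance-tested subset, not the full dean$Salary tested above.
qqnorm(placedET$Salary)
qqline(placedET$Salary)
Histogram and Density Curve
hist(placedET$Salary, freq = FALSE)
lines(density(placedET$Salary), lwd = 2)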
aggregate(Salary ~ Gender, data = placed , mean)
Gender Salary
1 F 253068.0
2 M 284241.9
library(gplots)
plotmeans(Salary ~ Gender, data = placed, frame = TRUE)
aggregate(Salary ~ Gender, data = placed , var)
Gender Salary
1 F 5504236572
2 M 9886408520
var.test(Salary ~ Gender, data = placed, alternative = "two.sided")
F test to compare two variances
data: Salary by Gender
F = 0.55675, num df = 96, denom df = 214, p-value = 0.00135
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.3999212 0.7927360
sample estimates:
ratio of variances
0.5567478
Since the p-value (0.00135) is less than 0.05, we reject the null hypothesis of equal variances: the variance of salaries differs significantly between female and male placed students.
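As a cross-check, the F statistic and its two-sided p-value can be recomputed from the numbers already printed above. This is only a minimal sketch: the variances are copied from the aggregate() output and the degrees of freedom from the var.test() output, and the variable names (var_f, var_m, df_f, df_m and so on) are illustrative and not part of the original analysis.

```r
# Group variances as printed by aggregate(Salary ~ Gender, data = placed, var)
var_f <- 5504236572   # female placed students
var_m <- 9886408520   # male placed students

# Degrees of freedom as printed by var.test() (group size minus one)
df_f <- 96
df_m <- 214

# F statistic: ratio of the two sample variances
f_stat <- var_f / var_m

# Two-sided p-value: double the smaller tail probability of the F distribution
p_lower <- pf(f_stat, df_f, df_m)
p_value <- 2 * min(p_lower, 1 - p_lower)

print(round(f_stat, 4))
print(signif(p_value, 3))
```

The two printed values should agree with the F statistic and p-value reported by var.test(), up to the rounding of the printed variances.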
t.test(Salary ~ Gender, data = placed)
Welch Two Sample t-test
data: Salary by Gender
t = -3.0757, df = 243.03, p-value = 0.00234
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-51138.42 -11209.22
sample estimates:
mean in group F mean in group M
253068.0 284241.9
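Since the p-value (0.00234) is less than 0.05, we reject the null hypothesis of equal means: the average salaries of female and male placed students differ significantly. The Welch test used here does not assume equal variances, which is consistent with the F test result above.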
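The Welch statistic can be reproduced by hand from the summary figures already shown, which makes it clearer where the t value and the fractional degrees of freedom come from. The sketch below assumes group sizes of 97 and 215 (the var.test() degrees of freedom plus one) and uses the printed, rounded means and variances; all variable names are illustrative.

```r
# Summary statistics taken from the outputs above (placed students only)
mean_f <- 253068.0
mean_m <- 284241.9
var_f  <- 5504236572
var_m  <- 9886408520
n_f    <- 97    # assumed: var.test() numerator df (96) plus one
n_m    <- 215   # assumed: var.test() denominator df (214) plus one

# Squared standard error contributed by each group
se2_f <- var_f / n_f
se2_m <- var_m / n_m

# Welch t statistic: difference in means over the combined standard error
t_stat <- (mean_f - mean_m) / sqrt(se2_f + se2_m)

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (se2_f + se2_m)^2 /
  (se2_f^2 / (n_f - 1) + se2_m^2 / (n_m - 1))

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df_welch)

print(round(t_stat, 4))
print(round(df_welch, 2))
print(signif(p_value, 3))
```

Because the inputs are rounded, the results may differ from the t.test() output in the last printed digit.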