##
##
## Cell Contents
## |-------------------------|
## | N |
## | Chi-square contribution |
## | N / Row Total |
## | N / Col Total |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 4510
##
##
## | val$predicted
## val$left | 0 | 1 | Row Total |
## -------------|-----------|-----------|-----------|
## 0 | 3421 | 5 | 3426 |
## | 241.144 | 790.697 | |
## | 0.999 | 0.001 | 0.760 |
## | 0.990 | 0.005 | |
## | 0.759 | 0.001 | |
## -------------|-----------|-----------|-----------|
## 1 | 35 | 1049 | 1084 |
## | 762.141 | 2499.012 | |
## | 0.032 | 0.968 | 0.240 |
## | 0.010 | 0.995 | |
## | 0.008 | 0.233 | |
## -------------|-----------|-----------|-----------|
## Column Total | 3456 | 1054 | 4510 |
## | 0.766 | 0.234 | |
## -------------|-----------|-----------|-----------|
##
##
## accounting hr IT management marketing product_mng
## 767 739 1227 630 858 902
## RandD sales support technical
## 787 4140 2229 2720
## CP nsplit rel error xerror xstd
## 1 0.26491366 0 1.0000000 1.0000000 0.017268873
## 2 0.18269231 1 0.7350863 0.7350863 0.015413205
## 3 0.07397959 3 0.3697017 0.3697017 0.011498380
## 4 0.04984301 5 0.2217425 0.2217425 0.009076995
## 5 0.02825746 6 0.1718995 0.1762166 0.008138311
## 6 0.01687598 7 0.1436421 0.1503140 0.007540782
## 7 0.01000000 8 0.1267661 0.1326531 0.007099516
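A complexity-parameter table like the one above is commonly used to decide how far to prune a tree. As a minimal sketch, the printed values can be re-entered by hand and the row with the lowest cross-validated error, or the simplest row within one standard error of it, can be picked. In a live session the table would presumably come from the fitted object's `cptable` component; the data frame name `cp_table` below is an assumption made for illustration.

```r
# CP table values copied from the printed output above.
cp_table <- data.frame(
  CP     = c(0.26491366, 0.18269231, 0.07397959, 0.04984301,
             0.02825746, 0.01687598, 0.01000000),
  nsplit = c(0, 1, 3, 5, 6, 7, 8),
  xerror = c(1.0000000, 0.7350863, 0.3697017, 0.2217425,
             0.1762166, 0.1503140, 0.1326531),
  xstd   = c(0.017268873, 0.015413205, 0.011498380, 0.009076995,
             0.008138311, 0.007540782, 0.007099516)
)

# Row with the lowest cross-validated error.
best <- which.min(cp_table$xerror)

# One-standard-error rule: the simplest tree whose xerror lies
# within one xstd of the minimum.
threshold <- cp_table$xerror[best] + cp_table$xstd[best]
one_se <- min(which(cp_table$xerror <= threshold))

cp_min <- cp_table$CP[best]
cp_1se <- cp_table$CP[one_se]
print(c(cp_min = cp_min, cp_1se = cp_1se))
```

With these values both criteria select the last row (CP = 0.01, 8 splits), so the cross-validation results give no reason to prune the tree below its current size.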
## ct1.variable.importance
## satisfaction_level 2205.051676
## average_montly_hours 1171.008521
## number_project 1136.764011
## last_evaluation 929.554234
## time_spend_company 829.388045
## Work_accident 30.133103
## promotion_last_5years 4.703823
## satisfaction_level last_evaluation number_project average_montly_hours
## Min. :0.0900 Min. :0.3600 Min. :2.000 Min. : 96.0
## 1st Qu.:0.4400 1st Qu.:0.5600 1st Qu.:3.000 1st Qu.:156.0
## Median :0.6500 Median :0.7200 Median :4.000 Median :202.0
## Mean :0.6153 Mean :0.7181 Mean :3.801 Mean :201.6
## 3rd Qu.:0.8200 3rd Qu.:0.8700 3rd Qu.:5.000 3rd Qu.:245.0
## Max. :1.0000 Max. :1.0000 Max. :7.000 Max. :310.0
##
## time_spend_company Work_accident left promotion_last_5years
## Min. : 2.000 Min. :0.0000 0:3366 Min. :0.00000
## 1st Qu.: 3.000 1st Qu.:0.0000 1:1023 1st Qu.:0.00000
## Median : 3.000 Median :0.0000 Median :0.00000
## Mean : 3.508 Mean :0.1408 Mean :0.02028
## 3rd Qu.: 4.000 3rd Qu.:0.0000 3rd Qu.:0.00000
## Max. :10.000 Max. :1.0000 Max. :1.00000
##
## sales salary left_predicted
## sales :1198 high : 379 0:3418
## technical : 783 low :2140 1: 971
## support : 668 medium:1870
## IT : 355
## product_mng: 268
## marketing : 253
## (Other) : 864
##
##
## Cell Contents
## |-------------------------|
## | N |
## | Chi-square contribution |
## | N / Row Total |
## | N / Col Total |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 4389
##
##
## | validation$left_predicted
## validation$left | 0 | 1 | Row Total |
## ----------------|-----------|-----------|-----------|
## 0 | 3314 | 52 | 3366 |
## | 183.038 | 644.308 | |
## | 0.985 | 0.015 | 0.767 |
## | 0.970 | 0.054 | |
## | 0.755 | 0.012 | |
## ----------------|-----------|-----------|-----------|
## 1 | 104 | 919 | 1023 |
## | 602.253 | 2119.980 | |
## | 0.102 | 0.898 | 0.233 |
## | 0.030 | 0.946 | |
## | 0.024 | 0.209 | |
## ----------------|-----------|-----------|-----------|
## Column Total | 3418 | 971 | 4389 |
## | 0.779 | 0.221 | |
## ----------------|-----------|-----------|-----------|
##
##
1. For 11% of the data, satisfaction level is below 0.46 and last evaluation is below 0.45; these employees are likely to leave the company.
2. For 58% of the data, satisfaction level is at least 0.47 and time with the company is under 5 years; these employees are likely to stay with the company.
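The two rules above can be written as a small function to make the thresholds explicit. This is only a sketch of those two branches, not the full fitted tree; the function name `apply_rules`, the example rows, and the `NA` result for employees covered by neither rule are assumptions made for illustration.

```r
# Sketch of the two tree branches described above (not the full tree).
# Column names follow the data set; apply_rules is a hypothetical helper.
apply_rules <- function(df) {
  leave <- df$satisfaction_level < 0.46 & df$last_evaluation < 0.45
  stay  <- df$satisfaction_level >= 0.47 & df$time_spend_company < 5
  # Employees matched by neither rule get NA.
  ifelse(leave, "leave", ifelse(stay, "stay", NA_character_))
}

# Three made-up employees, one for each possible outcome.
example <- data.frame(
  satisfaction_level = c(0.40, 0.80, 0.80),
  last_evaluation    = c(0.40, 0.90, 0.90),
  time_spend_company = c(3, 3, 6)
)
result <- apply_rules(example)
print(result)
```

The third employee is satisfied but has been with the company for 6 years, so neither rule applies and the remaining branches of the tree would decide the prediction.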
##
## Call:
## glm(formula = left ~ ., family = binomial(link = "logit"), data = trainlr)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.3927 -0.6771 -0.4306 -0.1523 3.1369
##
## Coefficients: (1 not defined because of singularities)
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.2297893 0.1399177 1.642 0.101
## satisfaction_level -4.1541678 0.1156513 -35.920 < 2e-16 ***
## last_evaluation 0.8140587 0.1744926 4.665 3.08e-06 ***
## number_project -0.3252356 0.0249471 -13.037 < 2e-16 ***
## average_montly_hours 0.0042840 0.0006043 7.089 1.35e-12 ***
## time_spend_company 0.2394629 0.0179363 13.351 < 2e-16 ***
## Work_accident -1.5533337 0.1066776 -14.561 < 2e-16 ***
## promotion_last_5years -1.7907012 0.3050679 -5.870 4.36e-09 ***
## salary NA NA NA NA
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 11569.0 on 10582 degrees of freedom
## Residual deviance: 9336.2 on 10575 degrees of freedom
## AIC: 9352.2
##
## Number of Fisher Scoring iterations: 5
## Analysis of Deviance Table
##
## Model: binomial, link: logit
##
## Response: left
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 10582 11569.0
## satisfaction_level 1 1595.97 10581 9973.0 < 2.2e-16 ***
## last_evaluation 1 12.41 10580 9960.6 0.000426 ***
## number_project 1 93.35 10579 9867.2 < 2.2e-16 ***
## average_montly_hours 1 56.30 10578 9810.9 6.226e-14 ***
## time_spend_company 1 135.68 10577 9675.2 < 2.2e-16 ***
## Work_accident 1 289.22 10576 9386.0 < 2.2e-16 ***
## promotion_last_5years 1 49.82 10575 9336.2 1.683e-12 ***
## salary 0 0.00 10575 9336.2
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Call:
## glm(formula = left ~ satisfaction_level + last_evaluation + average_montly_hours +
## promotion_last_5years, family = binomial(link = "logit"),
## data = trainlr)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.4859 -0.7023 -0.4980 -0.3267 2.7014
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.4469563 0.1320889 3.384 0.000715 ***
## satisfaction_level -3.8272903 0.1053896 -36.316 < 2e-16 ***
## last_evaluation 0.3094267 0.1580656 1.958 0.050279 .
## average_montly_hours 0.0016123 0.0005289 3.048 0.002302 **
## promotion_last_5years -1.4525982 0.2865873 -5.069 4.01e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 11569.0 on 10582 degrees of freedom
## Residual deviance: 9914.2 on 10578 degrees of freedom
## AIC: 9924.2
##
## Number of Fisher Scoring iterations: 5
## Analysis of Deviance Table
##
## Model: binomial, link: logit
##
## Response: left
##
## Terms added sequentially (first to last)
##
##
## Df Deviance Resid. Df Resid. Dev Pr(>Chi)
## NULL 10582 11569.0
## satisfaction_level 1 1595.97 10581 9973.0 < 2.2e-16 ***
## last_evaluation 1 12.41 10580 9960.6 0.000426 ***
## average_montly_hours 1 9.43 10579 9951.1 0.002129 **
## promotion_last_5years 1 36.96 10578 9914.2 1.205e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
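Because the second model uses a subset of the first model's predictors, the two fits can be compared directly from the figures printed in the summaries above. The following sketch re-enters those deviances and degrees of freedom by hand and runs a likelihood-ratio test; all variable names are made up for illustration.

```r
# Figures copied from the two glm summaries above.
null_dev    <- 11569.0
full_dev    <- 9336.2
full_df     <- 10575
reduced_dev <- 9914.2
reduced_df  <- 10578

# Likelihood-ratio test of the reduced model against the full model.
lr_stat <- reduced_dev - full_dev
lr_df   <- reduced_df - full_df
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# Share of the null deviance explained by each model.
full_r2    <- 1 - full_dev / null_dev
reduced_r2 <- 1 - reduced_dev / null_dev

print(c(lr_stat = lr_stat, lr_df = lr_df, p_value = p_value))
print(round(c(full = full_r2, reduced = reduced_r2), 3))
```

The increase of 578 in residual deviance on 3 degrees of freedom, together with the higher AIC (9924.2 versus 9352.2), suggests that dropping `number_project`, `time_spend_company` and `Work_accident` costs a noticeable amount of fit.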
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.01476 0.10711 0.18567 0.24017 0.31129 0.70888
## [1] 1072
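To illustrate how fitted coefficients translate into a predicted probability of leaving, the reduced model's linear predictor can be evaluated by hand for a single employee. The coefficients below are copied from the second summary; the employee's values are hypothetical and chosen only for illustration.

```r
# Coefficients copied from the second (reduced) glm summary above.
coefs <- c(
  "(Intercept)"         = 0.4469563,
  satisfaction_level    = -3.8272903,
  last_evaluation       = 0.3094267,
  average_montly_hours  = 0.0016123,
  promotion_last_5years = -1.4525982
)

# Hypothetical employee; the leading 1 multiplies the intercept.
employee <- c(1, 0.40, 0.70, 200, 0)

# Linear predictor (log-odds), then the inverse logit.
log_odds   <- sum(coefs * employee)
prob_leave <- plogis(log_odds)

# Odds ratios for a one-unit increase in each predictor.
odds_ratios <- exp(coefs[-1])

print(round(prob_leave, 3))
print(round(odds_ratios, 3))
```

For this made-up employee the predicted probability of leaving is about 0.37. Whether that counts as a predicted leaver depends on the cut-off applied to the probabilities, which is not shown in this output.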
##
##
## Cell Contents
## |-------------------------|
## | N |
## | Chi-square contribution |
## | N / Row Total |
## | N / Col Total |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 4416
##
##
## | vallr$predicted_left01
## vallr$left | 0 | 1 | Row Total |
## -------------|-----------|-----------|-----------|
## 0 | 2861 | 483 | 3344 |
## | 42.685 | 133.152 | |
## | 0.856 | 0.144 | 0.757 |
## | 0.856 | 0.451 | |
## | 0.648 | 0.109 | |
## -------------|-----------|-----------|-----------|
## 1 | 483 | 589 | 1072 |
## | 133.152 | 415.354 | |
## | 0.451 | 0.549 | 0.243 |
## | 0.144 | 0.549 | |
## | 0.109 | 0.133 | |
## -------------|-----------|-----------|-----------|
## Column Total | 3344 | 1072 | 4416 |
## | 0.757 | 0.243 | |
## -------------|-----------|-----------|-----------|
##
##
| Particulars | Logistic Regression | CART | RandomForest |
|---|---|---|---|
| Predicted to stay vs actually stayed | 84.8% | 98.9% | 99.8% |
| Predicted to leave vs actually stayed | 15.2% | 1.1% | 0.2% |
| Predicted to stay vs actually left | 45.9% | 9.9% | 3.1% |
| Predicted to leave vs actually left | 54.1% | 90.1% | 96.9% |
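Percentages of this kind are row-wise proportions of a confusion matrix: each cell is divided by the number of employees who actually stayed or actually left. As a sketch, the three matrices printed in this section can be re-entered by hand and converted with `prop.table`. The helper names `make_cm` and `row_rates` are assumptions for illustration, and the first matrix (`val$predicted`) is only assumed to belong to the RandomForest model, since the output does not label it.

```r
# Counts copied from the three CrossTable outputs above.
# Rows: actual left (0/1); columns: predicted left (0/1).
make_cm <- function(counts) {
  matrix(counts, nrow = 2, byrow = TRUE,
         dimnames = list(actual = c("0", "1"), predicted = c("0", "1")))
}
cms <- list(
  RandomForest       = make_cm(c(3421, 5, 35, 1049)),  # assumed label
  CART               = make_cm(c(3314, 52, 104, 919)),
  LogisticRegression = make_cm(c(2861, 483, 483, 589))
)

# Row-wise proportions: each row sums to 1.
row_rates <- function(cm) round(prop.table(cm, margin = 1), 3)
rates <- lapply(cms, row_rates)

# Overall accuracy: correct predictions over all observations.
accuracy <- sapply(cms, function(cm) sum(diag(cm)) / sum(cm))

print(rates)
print(round(accuracy, 3))
```

On the printed matrices the logistic regression classifies about 78% of the validation employees correctly, against roughly 96% for CART and 99% for the first matrix, which points the same way as the table.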