Introduction

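This analysis uses Welch two-sample t-tests and boxplots to examine differences between employees who left the company and those who stayed. Four numeric variables were tested: satisfaction level, last evaluation, number of projects, and average monthly hours. Each result is judged against a significance level of 0.01. In the test output, group 0 is employees who stayed and group 1 is employees who left.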
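R's t.test() runs the Welch version of the two-sample test by default. It does not assume equal variances in the two groups, which is why the degrees of freedom in the output are not whole numbers. As a minimal sketch of what the function computes, the following rebuilds the statistic by hand on simulated data. The `sim` data frame and its `grp` and `score` columns are hypothetical stand-ins, not the hr data.

```r
# Simulated data only: `sim`, `grp`, and `score` are hypothetical, not the hr data
set.seed(1)
sim <- data.frame(
  grp   = rep(c(0, 1), times = c(300, 100)),
  score = c(rnorm(300, mean = 0.67, sd = 0.22),
            rnorm(100, mean = 0.44, sd = 0.26))
)

x0 <- sim$score[sim$grp == 0]
x1 <- sim$score[sim$grp == 1]

# Squared standard error contributed by each group
se0 <- var(x0) / length(x0)
se1 <- var(x1) / length(x1)

# Welch t statistic: mean of group 0 minus mean of group 1, over the combined standard error
t_manual <- (mean(x0) - mean(x1)) / sqrt(se0 + se1)

# Welch-Satterthwaite degrees of freedom (usually not a whole number)
df_manual <- (se0 + se1)^2 /
  (se0^2 / (length(x0) - 1) + se1^2 / (length(x1) - 1))

# Built-in test for comparison (var.equal = FALSE is the default)
fit <- t.test(score ~ grp, data = sim)
c(manual_t = t_manual, builtin_t = unname(fit$statistic))
c(manual_df = df_manual, builtin_df = unname(fit$parameter))
```

The hand-computed values should agree with the built-in ones, which also confirms that the reported difference is group 0 minus group 1.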

T-Test 1: Satisfaction Level

t1 <- t.test(satisfaction_level ~ left, data = hr)
t1
## 
##  Welch Two Sample t-test
## 
## data:  satisfaction_level by left
## t = 46.636, df = 5167, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  0.2171815 0.2362417
## sample estimates:
## mean in group 0 mean in group 1 
##       0.6668096       0.4400980
Technical Interpretation:
  • The p-value (< 2.2e-16) is far below 0.01, so the difference in mean satisfaction between employees who stayed (0.667) and those who left (0.440) is statistically significant. We reject the null hypothesis of equal means.
Non-Technical Interpretation:
  • Employees who left the company had noticeably lower satisfaction than those who stayed.
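To put a size on the gap, the reported group means can be compared directly. This is a small sketch that uses only the numbers printed in the output above; the variable names are illustrative.

```r
# Values copied from the t-test output above
mean_stayed <- 0.6668096   # group 0
mean_left   <- 0.4400980   # group 1
ci_95       <- c(0.2171815, 0.2362417)

# Absolute and relative gap in mean satisfaction
diff_means <- mean_stayed - mean_left
rel_gap    <- diff_means / mean_stayed

# The point estimate should sit inside the reported interval
inside_ci <- diff_means > ci_95[1] && diff_means < ci_95[2]

round(c(difference = diff_means, relative_gap = rel_gap), 3)
inside_ci
```

The difference works out to about 0.23, roughly a third of the average satisfaction among employees who stayed.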
Boxplot:
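library(dplyr)   # provides %>% and mutate()
library(plotly)  # provides plot_ly() and layout()

# Label the 0/1 `left` indicator for plotting (0 = stayed, 1 = left)
plot_data <- hr %>%
  mutate(Status = factor(left, levels = c(0, 1), labels = c("Stayed", "Left")))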

plot_ly(plot_data,
        x = ~Status,
        y = ~satisfaction_level,
        type = "box") %>%
  layout(title = "Employees Who Left Had Lower Satisfaction Levels")

T-Test 2: Last Evaluation

t2 <- t.test(last_evaluation ~ left, data = hr)
t2
## 
##  Welch Two Sample t-test
## 
## data:  last_evaluation by left
## t = -0.72534, df = 5154.9, p-value = 0.4683
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -0.009772224  0.004493874
## sample estimates:
## mean in group 0 mean in group 1 
##       0.7154734       0.7181126
Technical Interpretation:
  • The t-test comparing last evaluation scores between employees who left and those who stayed resulted in a p-value of 0.4683, which is greater than 0.01. This indicates no statistically significant difference in average evaluation scores between the two groups.
Non-Technical Interpretation:
  • Employees who left and employees who stayed had very similar performance evaluations on average. Average evaluation score therefore does not distinguish the two groups, although a comparison of means cannot rule out differences in how the scores are spread.
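The same conclusion can be read from the confidence interval: an interval for the difference in means that straddles zero is consistent with no difference. A minimal check using the values printed above (variable names are illustrative):

```r
# Values copied from the t-test output above
ci_95   <- c(-0.009772224, 0.004493874)
p_value <- 0.4683
alpha   <- 0.01

# An interval that straddles zero is consistent with "no difference"
spans_zero <- ci_95[1] < 0 && ci_95[2] > 0

# Decision at the 0.01 level used throughout this report
reject_null <- p_value < alpha

c(spans_zero = spans_zero, reject_null = reject_null)
```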
Boxplot:
plot_ly(plot_data,
        x = ~Status,
        y = ~last_evaluation,
        type = "box") %>%
  layout(title = "Employees Who Left Had Similar Evaluation Scores as Those Who Stayed")

T-Test 3: Number of Projects

t3 <- t.test(number_project ~ left, data = hr)
t3
## 
##  Welch Two Sample t-test
## 
## data:  number_project by left
## t = -2.1663, df = 4236.5, p-value = 0.03034
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -0.131136535 -0.006540119
## sample estimates:
## mean in group 0 mean in group 1 
##        3.786664        3.855503
Technical Interpretation:
  • The t-test comparing the average number of projects between employees who left and those who stayed yielded a p-value of 0.03034, which is greater than 0.01.
  • At the 0.01 level, this is not a statistically significant difference, so we fail to reject the null hypothesis. The result would be significant at the more lenient 0.05 level, and the gap itself is small (3.86 versus 3.79 projects).
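Because this p-value falls between 0.01 and 0.05, the verdict depends on the threshold. As a rough sketch, the standard error can be recovered from the printed estimate and t statistic and used to rebuild the interval at both confidence levels. Only the rounded values from the output above are used, so the results are approximate, and the variable names are illustrative.

```r
# Values copied from the t-test output above
mean_stayed <- 3.786664   # group 0
mean_left   <- 3.855503   # group 1
t_stat      <- -2.1663
df          <- 4236.5

# Recover the standard error from the estimate and the t statistic
diff_means <- mean_stayed - mean_left
se         <- diff_means / t_stat

# Two-sided p-value and confidence intervals at two levels
p_value <- 2 * pt(-abs(t_stat), df)
ci_95   <- diff_means + c(-1, 1) * qt(0.975, df) * se
ci_99   <- diff_means + c(-1, 1) * qt(0.995, df) * se

round(p_value, 4)
round(rbind(ci_95, ci_99), 4)
```

The 95% interval excludes zero while the 99% interval includes it, which matches a result that is significant at 0.05 but not at 0.01.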
Non-Technical Interpretation:
  • Employees who left the company worked on roughly the same number of projects as those who stayed.
  • This suggests that workload alone was not a strong factor contributing to employee attrition.
Boxplot:
plot_ly(plot_data,
        x = ~Status,
        y = ~number_project,
        type = "box") %>%
  layout(title = "Employees Who Left Handled a Similar Number of Projects as Those Who Stayed")

T-Test 4: Average Monthly Hours

t4 <- t.test(average_montly_hours ~ left, data = hr)
t4
## 
##  Welch Two Sample t-test
## 
## data:  average_montly_hours by left
## t = -7.5323, df = 4875.1, p-value = 5.907e-14
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
##  -10.534631  -6.183384
## sample estimates:
## mean in group 0 mean in group 1 
##        199.0602        207.4192
Technical Interpretation:
  • The t-test comparing average monthly hours between employees who left and those who stayed produced a t-value of -7.53 and a p-value of 5.907e-14, which is far below the 0.01 threshold.
  • This indicates a highly significant difference in working hours between the two groups, so we reject the null hypothesis. Employees who left worked about 8.4 more hours per month on average than those who stayed (95% confidence interval: roughly 6.2 to 10.5 hours).
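The interval in the output is negative because R reports group 0 minus group 1 (stayed minus left). A short sketch that flips the sign so the numbers read as extra hours worked by those who left, using only the values printed above (variable names are illustrative):

```r
# Values copied from the t-test output above (stayed minus left)
mean_stayed <- 199.0602
mean_left   <- 207.4192
ci_95       <- c(-10.534631, -6.183384)

# Flip the sign so the numbers read as "left minus stayed"
extra_hours    <- mean_left - mean_stayed
extra_hours_ci <- rev(-ci_95)

round(c(estimate = extra_hours,
        lower    = extra_hours_ci[1],
        upper    = extra_hours_ci[2]), 2)
```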
Non-Technical Interpretation:
  • Employees who left the company were putting in more monthly hours on average than those who remained.
  • This pattern suggests that overwork or lack of balance might have played a key role in employee turnover.
Boxplot:
plot_ly(plot_data,
        x = ~Status,
        y = ~average_montly_hours,
        type = "box") %>%
  layout(title = "Employees Who Left Worked Significantly Longer Hours Per Month")

Conclusion

Overall, the results suggest that satisfaction and working hours are associated with employee attrition. Employees who left had lower average satisfaction and logged longer monthly hours, and both differences were significant at the 0.01 level. They also averaged slightly more projects, but that difference (p = 0.03) did not meet the 0.01 threshold. Performance evaluations were similar between the two groups, suggesting that, on average, leaving the company is less about performance and more about satisfaction and hours worked.