This analysis uses Welch two-sample t-tests and box plots to compare employees who left the company with those who stayed. Four numeric variables were tested: satisfaction level, last evaluation score, number of projects, and average monthly hours. In the test output, group 0 is employees who stayed and group 1 is employees who left.
t1 <- t.test(satisfaction_level ~ left, data = hr)
t1
##
## Welch Two Sample t-test
##
## data: satisfaction_level by left
## t = 46.636, df = 5167, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## 0.2171815 0.2362417
## sample estimates:
## mean in group 0 mean in group 1
## 0.6668096 0.4400980
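The Welch statistic and its fractional degrees of freedom can be reproduced by hand, which shows where a value like df = 5167 comes from. The sketch below is only an illustration on simulated data: `sim`, the group sizes, and the means and standard deviations passed to `rnorm()` are assumptions, not the `hr` data.

```r
# Simulated stand-in for the hr data (hypothetical values, not the real dataset)
set.seed(1)
sim <- data.frame(
  left = rep(c(0, 1), times = c(300, 100)),
  satisfaction_level = c(rnorm(300, mean = 0.67, sd = 0.22),
                         rnorm(100, mean = 0.44, sd = 0.26))
)

x0 <- sim$satisfaction_level[sim$left == 0]  # stayed
x1 <- sim$satisfaction_level[sim$left == 1]  # left

# Squared standard error of the mean for each group
se0 <- var(x0) / length(x0)
se1 <- var(x1) / length(x1)

# Welch t statistic: difference in means over the unpooled standard error
t_manual <- (mean(x0) - mean(x1)) / sqrt(se0 + se1)

# Welch-Satterthwaite approximation to the degrees of freedom
df_manual <- (se0 + se1)^2 /
  (se0^2 / (length(x0) - 1) + se1^2 / (length(x1) - 1))

# Compare with the built-in test (var.equal = FALSE is the default)
fit <- t.test(satisfaction_level ~ left, data = sim)
c(manual_t = t_manual, builtin_t = unname(fit$statistic))
c(manual_df = df_manual, builtin_df = unname(fit$parameter))
```

Because the variances are not pooled, the degrees of freedom depend on both group variances and are generally not a whole number.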
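library(dplyr)   # provides %>% and mutate()
library(plotly)  # provides plot_ly() and layout()

# Label the 0/1 attrition flag for plotting
plot_data <- hr %>%
  mutate(Status = factor(left, labels = c("Stayed", "Left")))
plot_ly(plot_data,
        x = ~Status,
        y = ~satisfaction_level,
        type = "box") %>%
  layout(title = "Employees Who Left Had Lower Satisfaction Levels")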
t2 <- t.test(last_evaluation ~ left, data = hr)
t2
##
## Welch Two Sample t-test
##
## data: last_evaluation by left
## t = -0.72534, df = 5154.9, p-value = 0.4683
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -0.009772224 0.004493874
## sample estimates:
## mean in group 0 mean in group 1
## 0.7154734 0.7181126
plot_ly(plot_data,
        x = ~Status,
        y = ~last_evaluation,
        type = "box") %>%
  layout(title = "Employees Who Left Had Evaluation Scores Similar to Those Who Stayed")
t3 <- t.test(number_project ~ left, data = hr)
t3
##
## Welch Two Sample t-test
##
## data: number_project by left
## t = -2.1663, df = 4236.5, p-value = 0.03034
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -0.131136535 -0.006540119
## sample estimates:
## mean in group 0 mean in group 1
## 3.786664 3.855503
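A p-value near 0.03 with degrees of freedom in the thousands says little about how large the difference is, so a standardized effect size can help put it in context. The following is a minimal sketch, not part of the original analysis: `cohens_d()` is a hypothetical helper using a pooled standard deviation, and `sim` holds simulated counts rather than the `hr` data.

```r
# Hypothetical helper: Cohen's d with a pooled standard deviation
cohens_d <- function(x, g) {
  groups <- split(x, g)
  stopifnot(length(groups) == 2)
  n <- lengths(groups)
  v <- vapply(groups, var, numeric(1))
  m <- vapply(groups, mean, numeric(1))
  pooled_sd <- sqrt(((n[1] - 1) * v[1] + (n[2] - 1) * v[2]) / (sum(n) - 2))
  # First group minus second group, the same direction t.test() uses
  unname((m[1] - m[2]) / pooled_sd)
}

# Simulated project counts (assumed values, not the real hr data)
set.seed(2)
sim <- data.frame(
  left = rep(c(0, 1), times = c(3000, 1000)),
  number_project = c(rpois(3000, lambda = 3.8), rpois(1000, lambda = 3.9))
)

d <- cohens_d(sim$number_project, sim$left)
d
```

Applied to the real data, the call would be `cohens_d(hr$number_project, hr$left)`. A value close to zero would indicate that the difference, although statistically detectable, is small relative to the spread within each group.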
plot_ly(plot_data,
        x = ~Status,
        y = ~number_project,
        type = "box") %>%
  layout(title = "Employees Who Left Handled Slightly More Projects on Average")
t4 <- t.test(average_montly_hours ~ left, data = hr)
t4
##
## Welch Two Sample t-test
##
## data: average_montly_hours by left
## t = -7.5323, df = 4875.1, p-value = 5.907e-14
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -10.534631 -6.183384
## sample estimates:
## mean in group 0 mean in group 1
## 199.0602 207.4192
plot_ly(plot_data,
        x = ~Status,
        y = ~average_montly_hours,
        type = "box") %>%
  layout(title = "Employees Who Left Worked Significantly Longer Hours Per Month")
Overall, the results suggest that attrition is associated with satisfaction and, to a lesser degree, with workload. Employees who left reported much lower satisfaction (mean 0.44 versus 0.67) and logged about 8 more hours per month (207.4 versus 199.1). They also handled slightly more projects (3.86 versus 3.79); that difference is significant at the 5% level (p = 0.03) but small in size. Evaluation scores were nearly identical between the two groups (0.718 versus 0.715, p = 0.47), which suggests that leaving is related less to performance than to satisfaction and workload.
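The four tests can also be collected into a single table for side-by-side comparison. This is a sketch under stated assumptions: `sim` is a simulated stand-in that borrows the report's column names, and `results` is a hypothetical name for the summary table.

```r
# Simulated stand-in for hr (hypothetical values; column names follow the report)
set.seed(3)
n <- 400
sim <- data.frame(
  left = rbinom(n, 1, 0.25),
  satisfaction_level = runif(n),
  last_evaluation = runif(n),
  number_project = rpois(n, 3.8),
  average_montly_hours = rnorm(n, 200, 50)
)

vars <- c("satisfaction_level", "last_evaluation",
          "number_project", "average_montly_hours")

# Run one Welch t-test per variable and keep the key numbers
results <- do.call(rbind, lapply(vars, function(v) {
  fit <- t.test(reformulate("left", response = v), data = sim)
  data.frame(
    variable    = v,
    mean_stayed = unname(fit$estimate[1]),
    mean_left   = unname(fit$estimate[2]),
    t           = unname(fit$statistic),
    df          = unname(fit$parameter),
    p_value     = fit$p.value
  )
}))
results
```

Replacing `sim` with `hr` should give one row per variable with the same means, t statistics, and p-values printed in the individual test outputs.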