t1 <- t.test(hr$satisfaction_level ~ hr$left)
t1
##
## Welch Two Sample t-test
##
## data: hr$satisfaction_level by hr$left
## t = 46.636, df = 5167, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## 0.2171815 0.2362417
## sample estimates:
## mean in group 0 mean in group 1
## 0.6668096 0.4400980
We reject the null hypothesis (H0) because the p-value (< 2.2e-16) is below alpha (0.001), meaning there is a statistically significant difference in mean satisfaction_level between employees who stayed and those who left.
The mean satisfaction level for employees who stayed (about 0.67) is higher than for those who left (about 0.44), confirming that job satisfaction is meaningfully lower among those who left.
Employees who left reported lower satisfaction on average.
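As a quick sanity check on the satisfaction_level output, the reported group means and confidence interval can be compared by hand. This is a minimal sketch that only uses the numbers printed above. The variable names (`mean_stayed`, `mean_left`, `ci`) are illustrative and not part of the original analysis.

```r
# Values copied from the t.test output above (group 0 = stayed, group 1 = left)
mean_stayed <- 0.6668096
mean_left   <- 0.4400980
ci          <- c(0.2171815, 0.2362417)

# Point estimate of the difference in means (stayed minus left)
diff_means <- mean_stayed - mean_left

# The interval is symmetric around the point estimate,
# so its midpoint should match the difference in means
ci_midpoint <- mean(ci)

# Half-width of the interval (the margin of error)
margin <- diff(ci) / 2

print(round(c(diff = diff_means, midpoint = ci_midpoint, margin = margin), 4))
```

The difference of roughly 0.23 on what appears to be a 0 to 1 scale sits well away from zero relative to the margin of about 0.01, which is why the test statistic is so large.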
plot_data1 <- hr %>% mutate(Left = as.factor(left))
plot_ly(plot_data1,
x = ~Left,
y = ~satisfaction_level,
type = 'box',
boxmean = TRUE) %>%
layout(title = "Employees who left were less satisfied than those who stayed",
xaxis = list(title = "Left (0 = Stayed, 1 = Left)"),
yaxis = list(title = "Satisfaction Level"))
t2 <- t.test(hr$last_evaluation ~ hr$left)
t2
##
## Welch Two Sample t-test
##
## data: hr$last_evaluation by hr$left
## t = -0.72534, df = 5154.9, p-value = 0.4683
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -0.009772224 0.004493874
## sample estimates:
## mean in group 0 mean in group 1
## 0.7154734 0.7181126
The p-value (0.4683) is greater than 0.05, so we fail to reject the null hypothesis: there is no statistically significant difference in last evaluation scores between employees who left and those who stayed.
Employees who left had marginally higher evaluations on average (about 0.718 vs. 0.715), but the 95 percent confidence interval for the difference includes 0, so a gap this small is consistent with random variation.
Employees who left and those who stayed had similar performance evaluations.
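The decision for last_evaluation can be checked directly against the reported numbers. The sketch below assumes the conventional alpha of 0.05 and copies the p-value and confidence interval from the output above. The variable names are illustrative.

```r
# Values copied from the t.test output for last_evaluation
p_value <- 0.4683
ci      <- c(-0.009772224, 0.004493874)
alpha   <- 0.05  # assumed significance level

# Reject the null only when the p-value falls below alpha
reject_null <- p_value < alpha

# Equivalent check: a 95% CI that spans zero means "no difference" is plausible
contains_zero <- ci[1] <= 0 && ci[2] >= 0

print(c(reject_null = reject_null, contains_zero = contains_zero))
```

Both checks agree: the null hypothesis is not rejected, and the interval includes zero.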
plot_ly(plot_data1,
x = ~Left,
y = ~last_evaluation,
type = 'box',
boxmean = TRUE) %>%
layout(title = "Employees who left and those who stayed had similar performance evaluations",
xaxis = list(title = "Left (0 = Stayed, 1 = Left)"),
yaxis = list(title = "Last Evaluation Score"))
t3 <- t.test(hr$average_montly_hours ~ hr$left)
t3
##
## Welch Two Sample t-test
##
## data: hr$average_montly_hours by hr$left
## t = -7.5323, df = 4875.1, p-value = 5.907e-14
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -10.534631 -6.183384
## sample estimates:
## mean in group 0 mean in group 1
## 199.0602 207.4192
The p-value (5.907e-14) is far below 0.05, so the difference in average monthly hours between employees who left and those who stayed is statistically significant.
Those who left worked more hours per month on average than those who stayed (about 207 vs. 199 hours), a gap of roughly 8 hours per month that is unlikely to be due to chance.
Employees who left worked more hours per month on average.
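Statistical significance does not say how large the gap is in practical terms, so it may help to express the difference in hours in a few ways. This sketch uses only the two group means reported above. The yearly figure simply assumes 12 months at the same monthly average.

```r
# Group means copied from the t.test output for average_montly_hours
hours_stayed <- 199.0602
hours_left   <- 207.4192

# Absolute gap in hours per month (left minus stayed)
diff_hours <- hours_left - hours_stayed

# Gap relative to the baseline of employees who stayed, in percent
pct_more <- 100 * diff_hours / hours_stayed

# Rough yearly equivalent, assuming the monthly gap holds for 12 months
diff_hours_year <- 12 * diff_hours

print(round(c(per_month = diff_hours, percent = pct_more, per_year = diff_hours_year), 1))
```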
plot_ly(plot_data1,
x = ~Left,
y = ~average_montly_hours,
type = 'box',
boxmean = TRUE) %>%
layout(title = "Employees who left worked more hours per month on average",
xaxis = list(title = "Left (0 = Stayed, 1 = Left)"),
yaxis = list(title = "Average Monthly Hours"))
t4 <- t.test(hr$time_spend_company ~ hr$left)
t4
##
## Welch Two Sample t-test
##
## data: hr$time_spend_company by hr$left
## t = -22.631, df = 9625.6, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -0.5394767 -0.4534706
## sample estimates:
## mean in group 0 mean in group 1
## 3.380032 3.876505
The p-value (< 2.2e-16) is far below 0.05, indicating that the difference in time spent at the company between employees who left and those who stayed is statistically significant.
On average, employees who left had been at the company longer than those who stayed (about 3.9 vs. 3.4 years), and a difference this large is unlikely to be due to random variation alone.
Employees who left had been with the company longer on average.
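The reported t statistic can be approximately reconstructed from the other numbers in the output, which shows how the pieces fit together: the confidence interval half-width equals the critical t value times the standard error, and the t statistic is the difference in means divided by that standard error. This is a sketch using the printed, rounded values, so the result only matches to a few decimals.

```r
# Values copied from the t.test output for time_spend_company
mean_stayed <- 3.380032
mean_left   <- 3.876505
ci          <- c(-0.5394767, -0.4534706)
df          <- 9625.6

# Difference in means (group 0 minus group 1, matching the sign of the CI)
diff_means <- mean_stayed - mean_left

# Half-width of the 95% CI = critical value * standard error
margin <- diff(ci) / 2
se     <- margin / qt(0.975, df)

# t statistic = difference in means / standard error
t_stat <- diff_means / se

print(round(c(diff = diff_means, se = se, t = t_stat), 3))
```

The reconstructed value should land close to the reported t = -22.631, with any small gap coming from rounding in the printed output.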
plot_ly(plot_data1,
x = ~Left,
y = ~time_spend_company,
type = 'box',
boxmean = TRUE) %>%
layout(title = "Employees who left had been with the company longer on average",
xaxis = list(title = "Left (0 = Stayed, 1 = Left)"),
yaxis = list(title = "Years at Company"))
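The four tests above follow the same pattern, so they could also be run in a loop and collected into one table. The sketch below is one possible way to do that, not the approach used in this analysis. It runs on `hr_demo`, a small simulated stand-in with made-up values, so the numbers it prints are not the results reported above. With the real data, `hr_demo` would be replaced by `hr` and `vars` extended to all four columns.

```r
# Hypothetical stand-in for the `hr` data frame: simulated values, not the real data
set.seed(42)
n <- 200
hr_demo <- data.frame(
  left = rep(c(0, 1), each = n / 2),
  satisfaction_level = c(rnorm(n / 2, 0.67, 0.20), rnorm(n / 2, 0.44, 0.25)),
  last_evaluation    = c(rnorm(n / 2, 0.72, 0.16), rnorm(n / 2, 0.72, 0.20))
)

vars <- c("satisfaction_level", "last_evaluation")

# Run one Welch t-test per variable, building each formula as `<variable> ~ left`
results <- lapply(vars, function(v) {
  t.test(reformulate("left", response = v), data = hr_demo)
})
names(results) <- vars

# Collect the key numbers from each test into a single summary table
summary_tbl <- data.frame(
  variable = vars,
  t        = sapply(results, function(r) unname(r$statistic)),
  df       = sapply(results, function(r) unname(r$parameter)),
  p_value  = sapply(results, function(r) r$p.value),
  row.names = NULL
)
print(summary_tbl)
```

Using the `data =` argument instead of `hr$column ~ hr$left` keeps the formulas short and makes it easy to add or remove variables in one place.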