- T-Test Number 1
hr$left <- as.factor(hr$left)
t_test_result <- t.test(satisfaction_level ~ left, data = hr)
print(t_test_result)
##
## Welch Two Sample t-test
##
## data: satisfaction_level by left
## t = 46.636, df = 5167, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## 0.2171815 0.2362417
## sample estimates:
## mean in group 0 mean in group 1
## 0.6668096 0.4400980
- P-value Interpretation: The p-value is very small (< 2.2e-16, far
below .05), so the difference in mean satisfaction level between
employees who stayed and employees who left is statistically significant.
- T-test interpretation: The range of 0.2171815 to 0.2362417 is the
95% confidence interval for the difference in mean satisfaction (group 0
minus group 1). Because the range does not include zero, it reinforces
that the difference in satisfaction levels is significant.
- Non-technical interpretation: Employees with higher satisfaction
are more likely to stay.
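The quantities quoted in the interpretation above can also be read directly from the object that `t.test()` returns, rather than copied from the printed output. The sketch below is only illustrative: it uses a small synthetic data frame (`demo`, with made-up satisfaction values) in place of `hr`, so the numbers it produces are not the ones reported above.

```r
# Synthetic stand-in for the hr data (values are made up for illustration).
set.seed(1)
demo <- data.frame(
  satisfaction_level = c(rnorm(200, mean = 0.67, sd = 0.2),
                         rnorm(100, mean = 0.44, sd = 0.2)),
  left = factor(rep(c(0, 1), times = c(200, 100)))
)

res <- t.test(satisfaction_level ~ left, data = demo)

p_value   <- res$p.value    # two-sided p-value
ci        <- res$conf.int   # 95% CI for mean(group 0) - mean(group 1)
mean_diff <- unname(res$estimate[1] - res$estimate[2])

# Significant at the 5% level when p < .05, which is equivalent
# to the 95% confidence interval excluding zero.
significant  <- p_value < 0.05
ci_excl_zero <- ci[1] > 0 || ci[2] < 0
```

Reading `p.value`, `conf.int`, and `estimate` from the result keeps the written interpretation in sync with the test if the data change.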
hr <- hr %>%
mutate(left = factor(left, levels = c(0, 1), labels = c("Stayed", "Left")))
plot_ly(hr, y = ~satisfaction_level, color = ~left, type = "box",
colors = c("Stayed" = "green", "Left" = "red")) %>%
layout(
title = "More Satisfied Employees are More likely to stay",
yaxis = list(title = "Level of Satisfaction"),
xaxis = list(title = "Employment Status")
)
- T-Test Number 2
t_test_evaluation <- t.test(last_evaluation ~ left, data = hr)
print(t_test_evaluation)
##
## Welch Two Sample t-test
##
## data: last_evaluation by left
## t = -0.72534, df = 5154.9, p-value = 0.4683
## alternative hypothesis: true difference in means between group Stayed and group Left is not equal to 0
## 95 percent confidence interval:
## -0.009772224 0.004493874
## sample estimates:
## mean in group Stayed mean in group Left
## 0.7154734 0.7181126
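- P-value Interpretation: The p-value is large (0.4683, above .05), so
the difference in mean last evaluation between employees who stayed and
employees who left is not statistically significant.
- T-test interpretation: The p-value of .4683 means that, if there
were truly no difference in mean evaluation between the two groups, there
would be a 46.83% chance of observing a difference as or more extreme
than this one through random chance alone. The 95% confidence interval
(-0.009772224 to 0.004493874) includes zero, which is consistent with
this.
- Non-technical interpretation: An employee's last evaluation score
is not meaningfully related to whether they stay or leave.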
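One way to see what "as or more extreme through random chance" means is a permutation test: shuffle the group labels many times, so that any link between group and score is broken, and count how often the shuffled difference in means is at least as large as the observed one. The sketch below is an illustration on synthetic data (the `score` and `group` vectors are made up, not taken from `hr`), and it is a different procedure from the Welch t-test used above.

```r
set.seed(42)
# Synthetic scores for two groups drawn from the same distribution,
# i.e. there is no true difference between them.
score <- runif(600, min = 0.36, max = 1)
group <- factor(rep(c("Stayed", "Left"), times = c(450, 150)),
                levels = c("Stayed", "Left"))

# Difference in group means: Stayed minus Left.
mean_diff <- function(x, g) mean(x[g == "Stayed"]) - mean(x[g == "Left"])
observed <- mean_diff(score, group)

# Shuffle the labels to simulate "random chance" with no real difference.
shuffled <- replicate(2000, mean_diff(score, sample(group)))

# Share of shuffles with a difference as or more extreme than observed.
p_perm <- mean(abs(shuffled) >= abs(observed))
```

A large `p_perm` says the observed gap is the kind of gap that shuffled, meaningless labels produce routinely, which is the same reading given to the p-value above.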
plot_ly(hr, y = ~last_evaluation, color = ~left, type = "box",
colors = c("Stayed" = "green", "Left" = "red")) %>%
layout(
title = "Last Evaluation doesn't have an effect on staying or leaving",
yaxis = list(title = "Last Evaluation Score"),
xaxis = list(title = "Employment Status")
)
- T-Test Number 3
t_test_hours <- t.test(average_montly_hours ~ left, data = hr)
print(t_test_hours)
##
## Welch Two Sample t-test
##
## data: average_montly_hours by left
## t = -7.5323, df = 4875.1, p-value = 5.907e-14
## alternative hypothesis: true difference in means between group Stayed and group Left is not equal to 0
## 95 percent confidence interval:
## -10.534631 -6.183384
## sample estimates:
## mean in group Stayed mean in group Left
## 199.0602 207.4192
- P-value Interpretation: The p-value is very small (5.907e-14, far
below .05), so the difference in average monthly hours between employees
who stayed and employees who left is statistically significant.
- T-test interpretation: The range of -10.534631 to -6.183384 is the
95% confidence interval for the difference in mean monthly hours (Stayed
minus Left). Because the range does not include zero, it reinforces
that the difference in average monthly hours is significant.
- Non-technical interpretation: Employees with higher average
monthly hours are more likely to leave.
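As a quick arithmetic check on the output above, the Welch confidence interval is centered on the difference between the two sample means (Stayed minus Left). The snippet below only re-uses the numbers printed by `t.test()`; the variable names are illustrative.

```r
# Values copied from the printed t-test output above.
mean_stayed <- 199.0602
mean_left   <- 207.4192
ci          <- c(-10.534631, -6.183384)

diff_means  <- mean_stayed - mean_left  # Stayed minus Left
ci_midpoint <- mean(ci)                 # center of the 95% interval
half_width  <- diff(ci) / 2             # margin of error

# diff_means and ci_midpoint are both about -8.36 hours: employees who
# left worked roughly 8.4 more hours per month on average, give or take
# about 2.2 hours.
```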
plot_ly(hr, y = ~average_montly_hours, color = ~left, type = "box",
colors = c("Stayed" = "green", "Left" = "red")) %>%
layout(
title = "Employees with higher Average Monthly Hours are more likely to leave",
yaxis = list(title = "Average Monthly Hours"),
xaxis = list(title = "Employment Status")
)
- T-Test Number 4
t_test_time_spent <- t.test(time_spend_company ~ left, data = hr)
print(t_test_time_spent)
##
## Welch Two Sample t-test
##
## data: time_spend_company by left
## t = -22.631, df = 9625.6, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group Stayed and group Left is not equal to 0
## 95 percent confidence interval:
## -0.5394767 -0.4534706
## sample estimates:
## mean in group Stayed mean in group Left
## 3.380032 3.876505
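- P-value Interpretation: The p-value is very small (< 2.2e-16, far
below .05), so the difference in mean time spent at the company between
employees who stayed and employees who left is statistically significant.
- T-test interpretation: The 95% confidence interval for the
difference in mean time at the company (Stayed minus Left) runs from
-0.5394767 to -0.4534706 years. It does not include zero, so employees
who left had spent at least about 0.45 more years at the company on
average, with 95% confidence.
- Non-technical interpretation: Employees who have spent more time
at the company are more likely to leave.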
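The negative sign of the interval comes from the order of the factor levels: with the formula interface, `t.test()` reports the first level minus the second, here Stayed minus Left. A minimal sketch on synthetic data (the `years` values are made up, and `left_rev` is a hypothetical helper column) suggests that reversing the level order flips the sign of the interval without changing the p-value.

```r
set.seed(7)
# Synthetic stand-in for the hr data (values are made up for illustration).
demo <- data.frame(
  years = c(rnorm(300, mean = 3.4, sd = 1.5), rnorm(100, mean = 3.9, sd = 1)),
  left  = factor(rep(c("Stayed", "Left"), times = c(300, 100)),
                 levels = c("Stayed", "Left"))
)

res_a <- t.test(years ~ left, data = demo)      # Stayed minus Left

# Same data with the level order reversed: Left minus Stayed.
demo$left_rev <- factor(demo$left, levels = c("Left", "Stayed"))
res_b <- t.test(years ~ left_rev, data = demo)

ci_a <- as.numeric(res_a$conf.int)
ci_b <- as.numeric(res_b$conf.int)
```

Here `ci_a` is the mirror image of `ci_b`, so the direction of the difference should always be stated alongside the interval.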
plot_ly(hr, y = ~time_spend_company, color = ~left, type = "box",
colors = c("Stayed" = "green", "Left" = "red")) %>%
layout(
title = "Employees who stayed longer at the company are more likely to leave",
yaxis = list(title = "Time Spent at Company (Years)"),
xaxis = list(title = "Employment Status")
)