library(tidyverse) # read_csv(), mutate(), %>%
library(plotly)    # plot_ly(), layout()
hr <- read_csv('https://raw.githubusercontent.com/aiplanethub/Datasets/refs/heads/master/HR_comma_sep.csv')
## Rows: 14999 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): Department, salary
## dbl (8): satisfaction_level, last_evaluation, number_project, average_montly...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
t_test_satisfaction <- t.test(satisfaction_level ~ left, data = hr)
t_test_satisfaction
##
## Welch Two Sample t-test
##
## data: satisfaction_level by left
## t = 46.636, df = 5167, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## 0.2171815 0.2362417
## sample estimates:
## mean in group 0 mean in group 1
## 0.6668096 0.4400980
P-value interpretation: The p-value is below 2.2e-16, far smaller than 0.05, indicating that the difference in satisfaction levels between employees who stayed and those who left is statistically significant.
T-test interpretation: Mean satisfaction is 0.667 for employees who stayed and 0.440 for those who left, a difference of about 0.227. The 95% confidence interval places the true difference between 0.217 and 0.236.
Non-technical interpretation: Employees who left the company had significantly lower satisfaction levels than those who stayed, suggesting that lower satisfaction is associated with a higher likelihood of leaving.
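The figures quoted in the interpretation above can be read directly from the object that `t.test()` returns, rather than copied from the printout. The sketch below assumes a small simulated data frame (`demo`, with made-up values) in place of `hr`, so that it runs without a download; the component names `estimate`, `conf.int`, and `p.value` are standard for this kind of result.

```r
# Simulated stand-in for the hr data; the values are illustrative only
set.seed(42)
demo <- data.frame(
  left = rep(c(0, 1), times = c(300, 100)),
  satisfaction_level = c(rnorm(300, mean = 0.67, sd = 0.22),
                         rnorm(100, mean = 0.44, sd = 0.26))
)

res <- t.test(satisfaction_level ~ left, data = demo)

# Group means, in the order of the grouping levels (0, then 1)
res$estimate

# Difference in means: group 0 minus group 1
mean_diff <- unname(res$estimate[1] - res$estimate[2])

# 95% confidence interval for that difference, as a plain numeric vector
ci <- as.numeric(res$conf.int)

# Exact p-value, even when the printout only shows "< 2.2e-16"
p_value <- res$p.value
```

With the real data, replacing `demo` with `hr` should give the same numbers as the printed output.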
plot_data <- hr %>%
  mutate(Status = ifelse(left == 1, "Left", "Stayed"))
plot_ly(plot_data, x = ~Status, y = ~satisfaction_level, type = 'box') %>%
  layout(title = "Satisfaction Levels: Employees Who Stayed vs. Left",
         xaxis = list(title = "Employee Status"),
         yaxis = list(title = "Satisfaction Level"))
t_test_evaluation <- t.test(last_evaluation ~ left, data = hr)
t_test_evaluation
##
## Welch Two Sample t-test
##
## data: last_evaluation by left
## t = -0.72534, df = 5154.9, p-value = 0.4683
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -0.009772224 0.004493874
## sample estimates:
## mean in group 0 mean in group 1
## 0.7154734 0.7181126
P-value interpretation: The p-value is 0.4683, which is greater than 0.05. This indicates that there is no statistically significant difference in the last evaluation scores between employees who stayed and those who left.
T-test interpretation: The difference in mean last evaluation scores between employees who stayed and those who left is not significant. The 95% confidence interval runs from -0.0098 to 0.0045 and includes zero, so the observed difference is consistent with random chance.
Non-technical interpretation: Average last evaluation scores are nearly identical for employees who stayed and those who left (0.715 vs. 0.718), suggesting that performance evaluation scores alone may not be a factor in employee retention or attrition.
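For a two-sided test, a 95% confidence interval that contains zero and a p-value above 0.05 are two views of the same conclusion. A minimal check, using the numbers copied from the printed output above (the variable names are illustrative):

```r
# Values copied from the printed output for last_evaluation
ci <- c(-0.009772224, 0.004493874)
p_value <- 0.4683
alpha <- 0.05

# Zero lies inside the interval ...
ci_contains_zero <- ci[1] <= 0 && ci[2] >= 0

# ... and, equivalently, the p-value exceeds the significance level
not_significant <- p_value > alpha

ci_contains_zero
not_significant
```

Both checks evaluate to `TRUE` here.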
plot_data <- hr %>%
  mutate(Status = ifelse(left == 1, "Left", "Stayed"))
plot_ly(plot_data, x = ~Status, y = ~last_evaluation, type = 'box') %>%
  layout(title = "Last Evaluation Scores: Employees Who Stayed vs. Left",
         xaxis = list(title = "Employee Status"),
         yaxis = list(title = "Last Evaluation Score"))
t_test_hours <- t.test(average_montly_hours ~ left, data = hr)
t_test_hours
##
## Welch Two Sample t-test
##
## data: average_montly_hours by left
## t = -7.5323, df = 4875.1, p-value = 5.907e-14
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -10.534631 -6.183384
## sample estimates:
## mean in group 0 mean in group 1
## 199.0602 207.4192
P-value interpretation: The p-value is 5.907e-14, which is far smaller than 0.05. This indicates a statistically significant difference in average monthly hours between employees who stayed and those who left.
T-test interpretation: The difference in mean average monthly hours between employees who stayed and those who left is significant, with the confidence interval suggesting that employees who left worked between 6.18 and 10.53 more hours per month than those who stayed.
Non-technical interpretation: Employees who left the company worked significantly more hours per month, on average, than those who stayed. This suggests that higher monthly hours may be associated with a higher likelihood of leaving the company.
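The negative signs in the output come from the direction of the subtraction: `t.test()` reports group 0 (stayed) minus group 1 (left). As a rough worked example, the interval can be rebuilt from the printed means, t statistic, and degrees of freedom. The inputs are the rounded values shown in the output, so the result should agree with the reported interval only up to rounding; the variable names are illustrative.

```r
# Values copied from the printed output for average_montly_hours
mean_stayed <- 199.0602
mean_left   <- 207.4192
t_stat      <- -7.5323
deg_free    <- 4875.1

# Difference in means: group 0 minus group 1 (negative: leavers worked more)
diff_means <- mean_stayed - mean_left

# Standard error recovered from t = difference / standard error
std_error <- diff_means / t_stat

# Half-width of the 95% interval, using the t quantile for these degrees of freedom
margin <- qt(0.975, deg_free) * std_error

ci <- c(diff_means - margin, diff_means + margin)
ci
```

Flipping the sign of both bounds gives the "left minus stayed" reading used in the interpretation, roughly 6.18 to 10.53 more hours per month.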
plot_data <- hr %>%
  mutate(Status = ifelse(left == 1, "Left", "Stayed"))
plot_ly(plot_data, x = ~Status, y = ~average_montly_hours, type = 'box') %>%
  layout(title = "Average Monthly Hours: Employees Who Stayed vs. Left",
         xaxis = list(title = "Employee Status"),
         yaxis = list(title = "Average Monthly Hours"))
t_test_time <- t.test(time_spend_company ~ left, data = hr)
t_test_time
##
## Welch Two Sample t-test
##
## data: time_spend_company by left
## t = -22.631, df = 9625.6, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
## 95 percent confidence interval:
## -0.5394767 -0.4534706
## sample estimates:
## mean in group 0 mean in group 1
## 3.380032 3.876505
P-value interpretation: The p-value is below 2.2e-16, far smaller than 0.05, indicating that the difference in time spent at the company between employees who stayed and those who left is statistically significant.
T-test interpretation: The difference in mean time spent at the company between employees who stayed and those who left is significant, with the confidence interval showing that employees who left had spent between 0.453 and 0.539 more years at the company than those who stayed.
Non-technical interpretation: Employees who left the company had been with the company significantly longer, on average, than those who stayed. This suggests that longer tenure might be associated with an increased likelihood of leaving.
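All four tests follow the same pattern: a numeric variable compared across the two levels of `left`. One possible way to run them in a single pass and collect the key numbers in a table is sketched below. It is not part of the original analysis; the helper `summarise_test()` and the simulated `demo` data frame are hypothetical and stand in for `hr` so the example runs on its own.

```r
# Simulated stand-in for the hr data; the values are illustrative only
set.seed(1)
demo <- data.frame(
  left = rep(c(0, 1), times = c(200, 80)),
  satisfaction_level = runif(280),
  time_spend_company = sample(2:10, 280, replace = TRUE)
)

vars <- c("satisfaction_level", "time_spend_company")

# Hypothetical helper: run a Welch t-test of `var` by `left`
# and return the main results as a one-row data frame
summarise_test <- function(var, data) {
  # reformulate() builds the formula `var ~ left` from strings
  res <- t.test(reformulate("left", response = var), data = data)
  data.frame(
    variable    = var,
    mean_stayed = unname(res$estimate[1]),
    mean_left   = unname(res$estimate[2]),
    ci_lower    = res$conf.int[1],
    ci_upper    = res$conf.int[2],
    p_value     = res$p.value
  )
}

results <- do.call(rbind, lapply(vars, summarise_test, data = demo))
results
```

Applied to the real data, `vars` would list the four columns tested above and `data = hr` would replace `data = demo`.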
plot_data <- hr %>%
  mutate(Status = ifelse(left == 1, "Left", "Stayed"))
plot_ly(plot_data, x = ~Status, y = ~time_spend_company, type = 'box') %>%
  layout(title = "Time Spent at Company: Employees Who Stayed vs. Left",
         xaxis = list(title = "Employee Status"),
         yaxis = list(title = "Time Spent at Company (Years)"))