Chi-Square Test 1: Salary Level vs. Attrition

1. Perform the Chi-Square Test

chisq.test(hr$salary, hr$left)
## 
##  Pearson's Chi-squared test
## 
## data:  hr$salary and hr$left
## X-squared = 381.23, df = 2, p-value < 2.2e-16

2. Technical Interpretation

The p-value (below 2.2e-16) is far below the 0.05 significance threshold, so we reject the null hypothesis that salary level and employee attrition are independent. If the two variables were truly independent, the probability of observing an association at least this strong would be essentially zero. This indicates a statistically significant relationship between salary level and whether an employee left the company.
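To make the test statistic less of a black box, the sketch below shows how it is built from observed and expected counts. The counts in `toy` are hypothetical and chosen only for illustration; they are not taken from the `hr` data.

```r
# Hypothetical counts for illustration only; NOT taken from the hr data.
toy <- matrix(c(50, 30,
                70, 20,
                40,  5),
              nrow = 3, byrow = TRUE,
              dimnames = list(salary = c("low", "medium", "high"),
                              left   = c("0", "1")))

res <- chisq.test(toy)

# Counts we would expect in each cell if salary and attrition were independent:
# (row total * column total) / grand total
res$expected

# The statistic is the sum over all cells of (observed - expected)^2 / expected
manual_stat <- sum((toy - res$expected)^2 / res$expected)

# Degrees of freedom: (rows - 1) * (columns - 1)
manual_df <- (nrow(toy) - 1) * (ncol(toy) - 1)
```

On the real data, the same `$expected` component is available from `chisq.test(hr$salary, hr$left)`, assuming the result is stored in an object first.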

3. Non-Technical Interpretation

Employees with low salaries are far more likely to leave the company than those earning medium or high salaries.

4. Visualization

prop_salary <- hr %>%
  group_by(salary) %>%
  summarise(
    Stayed = sum(left == 0) / n(),
    Left   = sum(left == 1) / n()
  ) %>%
  mutate(salary = factor(salary, levels = c("low", "medium", "high")))

plot_ly(prop_salary) %>%
  add_bars(x = ~salary, y = ~Stayed, name = "Stayed",
           marker = list(color = "#2196F3")) %>%
  add_bars(x = ~salary, y = ~Left, name = "Left",
           marker = list(color = "#F44336")) %>%
  layout(
    barmode = "stack",
    xaxis   = list(title = "Salary Level"),
    yaxis   = list(title = "Proportion", tickformat = ",.0%"),
    title   = "Employees with low salaries are far more likely to leave the company",
    legend  = list(orientation = "h", x = 0.3, y = -0.2)
  )

Chi-Square Test 2: Work Accident vs. Attrition

1. Perform the Chi-Square Test

chisq.test(hr$Work_accident, hr$left)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  hr$Work_accident and hr$left
## X-squared = 357.56, df = 1, p-value < 2.2e-16

2. Technical Interpretation

The p-value is far below 0.05, so we reject the null hypothesis that work accidents and employee attrition are independent. There is a statistically significant association between whether an employee experienced a workplace accident and whether they left the company.
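The output above mentions Yates' continuity correction, which `chisq.test()` applies by default to 2x2 tables. A minimal sketch of what the correction does, using hypothetical counts (not the `hr` data), compares the default call with `correct = FALSE`:

```r
# Hypothetical 2x2 counts for illustration only; NOT taken from the hr data.
toy2 <- matrix(c(90, 40,
                 30,  5),
               nrow = 2, byrow = TRUE,
               dimnames = list(Work_accident = c("0", "1"),
                               left          = c("0", "1")))

with_yates    <- chisq.test(toy2)                   # default: correct = TRUE
without_yates <- chisq.test(toy2, correct = FALSE)  # plain Pearson statistic

# The correction shrinks |observed - expected| by up to 0.5 in each cell,
# so the corrected statistic is never larger than the uncorrected one.
c(corrected   = unname(with_yates$statistic),
  uncorrected = unname(without_yates$statistic))
```

The correction makes the test slightly more conservative; its effect is most noticeable when cell counts are small.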

3. Non-Technical Interpretation

Employees who have been involved in a workplace accident are actually less likely to leave the company.

4. Visualization

prop_accident <- hr %>%
  group_by(Work_accident) %>%
  summarise(
    Stayed = sum(left == 0) / n(),
    Left   = sum(left == 1) / n()
  ) %>%
  mutate(Work_accident = factor(Work_accident, levels = c(0, 1),
                                labels = c("No Accident", "Had Accident")))

plot_ly(prop_accident) %>%
  add_bars(x = ~Work_accident, y = ~Stayed, name = "Stayed",
           marker = list(color = "#2196F3")) %>%
  add_bars(x = ~Work_accident, y = ~Left, name = "Left",
           marker = list(color = "#F44336")) %>%
  layout(
    barmode = "stack",
    xaxis   = list(title = "Work Accident Status"),
    yaxis   = list(title = "Proportion", tickformat = ",.0%"),
    title   = "Employees who had a workplace accident are less likely to leave",
    legend  = list(orientation = "h", x = 0.3, y = -0.2)
  )

Chi-Square Test 3: Promotion in Last 5 Years vs. Attrition

1. Perform the Chi-Square Test

chisq.test(hr$promotion_last_5years, hr$left)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  hr$promotion_last_5years and hr$left
## X-squared = 56.262, df = 1, p-value = 6.344e-14

2. Technical Interpretation

The p-value (6.344e-14) is below the 0.05 significance threshold, so we reject the null hypothesis that promotions and employee attrition are independent. The data provide strong evidence that whether an employee received a promotion within the last five years is associated with whether they left the company.
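A p-value shows that an association exists but not how strong it is. One common effect-size measure for two categorical variables is Cramér's V, which rescales the chi-square statistic to the range 0 to 1. The helper `cramers_v()` below is a hypothetical function written for this sketch, not part of the original analysis, and the example vectors are made up.

```r
# Hypothetical helper (not part of the original analysis):
# Cramér's V = sqrt( chi^2 / (n * (min(rows, cols) - 1)) )
cramers_v <- function(x, y) {
  tab  <- table(x, y)
  # Use the uncorrected statistic so that V reaches 1 for a perfect association
  chi2 <- unname(chisq.test(tab, correct = FALSE)$statistic)
  sqrt(chi2 / (sum(tab) * (min(dim(tab)) - 1)))
}

# Made-up vectors for illustration only
x <- rep(c("a", "b"), each = 20)
y_same  <- x                             # perfectly associated with x
y_indep <- rep(c("u", "v"), times = 20)  # evenly split within each level of x

v_perfect <- cramers_v(x, y_same)
v_none    <- cramers_v(x, y_indep)
```

Under the same assumptions, `cramers_v(hr$promotion_last_5years, hr$left)` would give the effect size for this test; values near 0 indicate a weak association even when the p-value is very small.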

3. Non-Technical Interpretation

Employees who were not promoted in the last 5 years are significantly more likely to leave the company.

4. Visualization

prop_promo <- hr %>%
  group_by(promotion_last_5years) %>%
  summarise(
    Stayed = sum(left == 0) / n(),
    Left   = sum(left == 1) / n()
  ) %>%
  mutate(promotion_last_5years = factor(promotion_last_5years, levels = c(0, 1),
                                        labels = c("No Promotion", "Promoted")))

plot_ly(prop_promo) %>%
  add_bars(x = ~promotion_last_5years, y = ~Stayed, name = "Stayed",
           marker = list(color = "#2196F3")) %>%
  add_bars(x = ~promotion_last_5years, y = ~Left, name = "Left",
           marker = list(color = "#F44336")) %>%
  layout(
    barmode = "stack",
    xaxis   = list(title = "Promotion in Last 5 Years"),
    yaxis   = list(title = "Proportion", tickformat = ",.0%"),
    title   = "Employees not promoted in the last 5 years are more likely to leave",
    legend  = list(orientation = "h", x = 0.3, y = -0.2)
  )

Chi-Square Test 4: Department vs. Attrition

1. Perform the Chi-Square Test

chisq.test(hr$Department, hr$left)
## 
##  Pearson's Chi-squared test
## 
## data:  hr$Department and hr$left
## X-squared = 86.825, df = 9, p-value = 7.042e-15

2. Technical Interpretation

The p-value is below 0.05, so we reject the null hypothesis that department and attrition are independent. There is a statistically significant relationship between the department an employee works in and whether they leave the company. Attrition rates differ across departments by more than chance alone would explain.
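With ten departments, the overall test does not say which departments drive the result. One way to check, sketched here with hypothetical department names and counts (not the `hr` data), is to look at the standardized residuals that `chisq.test()` returns:

```r
# Hypothetical counts for illustration only; NOT taken from the hr data.
toy_dept <- matrix(c(80, 40,
                     90, 30,
                     95, 25,
                     70, 10),
                   nrow = 4, byrow = TRUE,
                   dimnames = list(Department = c("dept_A", "dept_B",
                                                  "dept_C", "dept_D"),
                                   left       = c("0", "1")))

res_dept <- chisq.test(toy_dept)

# Standardized residuals for the "left" column: positive values mean more
# leavers than expected under independence, negative values mean fewer.
stdres_left <- res_dept$stdres[, "1"]
sort(stdres_left, decreasing = TRUE)

# Rough rule of thumb: |residual| > 2 marks a cell that departs notably
# from what independence would predict.
flagged <- names(stdres_left)[abs(stdres_left) > 2]
```

Applied to the real table, this would show whether the departments named in the non-technical interpretation are also the ones with the largest residuals.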

3. Non-Technical Interpretation

The HR and Accounting departments have the highest employee turnover, while Management has the lowest.

4. Visualization

prop_dept <- hr %>%
  group_by(Department) %>%
  summarise(
    Stayed   = sum(left == 0) / n(),
    Left     = sum(left == 1) / n(),
    attrition_rate = sum(left == 1) / n()  # same value as Left; used only to order the bars
  ) %>%
  arrange(desc(attrition_rate))

plot_ly(prop_dept) %>%
  add_bars(x = ~reorder(Department, -attrition_rate), y = ~Stayed,
           name = "Stayed", marker = list(color = "#2196F3")) %>%
  add_bars(x = ~reorder(Department, -attrition_rate), y = ~Left,
           name = "Left", marker = list(color = "#F44336")) %>%
  layout(
    barmode = "stack",
    xaxis   = list(title = "Department", tickangle = -30),
    yaxis   = list(title = "Proportion", tickformat = ",.0%"),
    title   = "HR and Accounting have the highest turnover; Management retains employees best",
    legend  = list(orientation = "h", x = 0.3, y = -0.3)
  )