R Markdown

library(readr)
library(plotly)
library(dplyr)
library(ggplot2)

hr <- read_csv('https://raw.githubusercontent.com/aiplanethub/Datasets/refs/heads/master/HR_comma_sep.csv')

1. Department vs. Attrition

chisq.test(hr$Department , hr$left)
## 
##  Pearson's Chi-squared test
## 
## data:  hr$Department and hr$left
## X-squared = 86.825, df = 9, p-value = 7.042e-15

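- Technical: The p-value of 7.042e-15 is far below any conventional significance level, so we reject the null hypothesis that department and attrition are independent. If the two were unrelated, a chi-squared statistic this large (86.825 on 9 degrees of freedom) would almost never occur. The test shows a statistically significant association between department and leaving; it does not by itself measure how strong that association is, nor does it show that department causes attrition.

- Non-technical: Whether employees left depended in part on the department they worked in.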
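
A significant p-value on a large sample says little about how strong the association is. One common way to gauge strength is Cramér's V, which rescales the chi-squared statistic to the range 0 to 1. The sketch below is an illustration rather than part of the original analysis: `cramers_v` is a hypothetical helper, and `toy` is made-up data, not the HR dataset.

```r
# Cramér's V: effect size for a chi-squared test of independence.
# cramers_v() is a hypothetical helper written for this illustration.
cramers_v <- function(x, y) {
  tab  <- table(x, y)
  # Use the uncorrected statistic so the formula also holds for 2x2 tables
  chi2 <- unname(chisq.test(tab, correct = FALSE)$statistic)
  k    <- min(nrow(tab), ncol(tab)) - 1
  sqrt(chi2 / (sum(tab) * k))
}

# Toy data (made up): group "a" has 10 of 50 leavers, group "b" has 20 of 50
toy <- data.frame(
  group = rep(c("a", "b"), each = 50),
  left  = c(rep(c(0, 1), times = c(40, 10)),
            rep(c(0, 1), times = c(30, 20)))
)

v <- cramers_v(toy$group, toy$left)
print(round(v, 3))

# With the data loaded above, the same call would be:
# cramers_v(hr$Department, hr$left)
```

Values near 0 indicate a weak association and values near 1 a strong one, which helps separate "statistically significant" from "practically large".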

dept_data <- hr %>%
  group_by(Department) %>%
  summarise(
    stayed = sum(left == 0) / n(),
    left = sum(left == 1) / n()
  )
plot_ly(dept_data) %>%
  add_bars(x = ~Department, y = ~stayed, name = "Stayed", 
           marker = list(color = "#1f77b4")) %>%
  add_bars(x = ~Department, y = ~left, name = "Left", 
           marker = list(color = "#ff7f0e")) %>%
  layout(
    barmode = "stack",
    xaxis = list(
      title = "Department",
      tickvals = unique(dept_data$Department),  # Ensure each department has a tick value
      ticktext = unique(dept_data$Department)   # Use the department names as labels
    ),
    yaxis = list(title = "Proportion", tickformat = ",.0%"),
    title = "Department Has an Effect on Attrition"
  )

2. Salary vs. Attrition

chisq.test(hr$salary, hr$left)
## 
##  Pearson's Chi-squared test
## 
## data:  hr$salary and hr$left
## X-squared = 381.23, df = 2, p-value < 2.2e-16

- Technical: The p-value is below 2.2e-16, the smallest value R reports, so a result like this would be extremely unlikely if salary and attrition were unrelated. The chi-squared statistic of 381.23 on 2 degrees of freedom reflects a large gap between the observed frequencies and those expected under independence, so we conclude that the variables salary and left are not independent.

- Non-technical: Employees with certain salary levels are more likely to leave, while others are more likely to stay.
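
The chi-squared test reports that the variables are related, but not which salary levels drive the result. The standardized residuals returned by `chisq.test()` can show this: a large positive residual marks a cell with more cases than independence would predict. The counts below are made up for illustration and are not taken from the HR dataset.

```r
# Toy counts (made up): rows are salary levels, columns are left = 0 / 1
tab <- as.table(matrix(
  c(60, 40,
    75, 25,
    90, 10),
  nrow = 3, byrow = TRUE,
  dimnames = list(salary = c("low", "medium", "high"),
                  left   = c("0", "1"))
))

res <- chisq.test(tab)

# Counts expected if salary and leaving were independent
print(round(res$expected, 1))

# Standardized residuals: positive means more cases than expected
print(round(res$stdres, 2))

# With the data loaded above, the table would be:
# table(hr$salary, hr$left)
```

In this toy table the low-salary row has a positive residual in the `left = 1` column and the high-salary row a negative one, which is the pattern to look for in the real output.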

salary_data <- hr %>%
  group_by(salary) %>%
  summarise(
    stayed = sum(left == 0) / n(),
    left = sum(left == 1) / n()
  )
plot_ly(salary_data) %>%
  add_bars(x = ~salary, y = ~stayed, name = "Stayed", 
           marker = list(color = "#1f77b4")) %>%
  add_bars(x = ~salary, y = ~left, name = "Left", 
           marker = list(color = "#ff7f0e")) %>%
  layout(
    barmode = "stack",
    xaxis = list(
      title = "Salary",
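      # salary is a text column ("low", "medium", "high"), so order and
      # label the categories by name rather than by numeric position
      categoryorder = "array",
      categoryarray = c("low", "medium", "high"),
      tickvals = c("low", "medium", "high"),
      ticktext = c("Low Salary", "Medium Salary", "High Salary")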
    ),
    yaxis = list(title = "Proportion", tickformat = ",.0%"),
    title = "Low Salary Employees Leave More Than Others"
  )

3. Promotions vs. Attrition

chisq.test(hr$promotion_last_5years, hr$left)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  hr$promotion_last_5years and hr$left
## X-squared = 56.262, df = 1, p-value = 6.344e-14

- Technical: The p-value of 6.344e-14 is very small, so a result like this would be extremely unlikely if promotion and attrition were unrelated. The chi-squared statistic of 56.262 on 1 degree of freedom (computed with Yates' continuity correction, which R applies to 2x2 tables by default) indicates a statistically significant relationship between promotion and attrition.

- Non-technical: Employees who were promoted in the last 5 years are more likely to stay with the company, and those who were not promoted are more likely to leave.
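
The chi-squared statistic has no sign, so the direction of the relationship has to be read from the group rates. A minimal sketch, using made-up data rather than the HR dataset, computes the share of leavers within each promotion group with `prop.table()`:

```r
# Toy data (made up): 80 employees not promoted, 20 promoted
toy <- data.frame(
  promoted = rep(c(0, 1), times = c(80, 20)),
  left     = c(rep(c(0, 1), times = c(60, 20)),   # not promoted: 20 of 80 left
               rep(c(0, 1), times = c(18, 2)))    # promoted: 2 of 20 left
)

tab <- table(promoted = toy$promoted, left = toy$left)

# margin = 1 makes each row sum to 1, giving rates within each group
rates <- prop.table(tab, margin = 1)
print(round(rates, 2))

# With the data loaded above, the table would be:
# table(hr$promotion_last_5years, hr$left)
```

Comparing the `left = 1` column across rows shows which group leaves more often.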

promotion_data <- hr %>%
  group_by(promotion_last_5years) %>%
  summarise(
    stayed = sum(left == 0) / n(),
    left = sum(left == 1) / n()
  )
plot_ly(promotion_data) %>%
  add_bars(x = ~promotion_last_5years, y = ~stayed, name = "Stayed", 
           marker = list(color = "#1f77b4")) %>%
  add_bars(x = ~promotion_last_5years, y = ~left, name = "Left", 
           marker = list(color = "#ff7f0e")) %>%
  layout(
    barmode = "stack",
    xaxis = list(
      title = "Recent Promotion",
      tickvals = c(0, 1),  
      ticktext = c("No Promotion", "Promotion")  
    ),
    yaxis = list(title = "Proportion", tickformat = ",.0%"),
    title = "Recently Promoted Employees Leave Less Often"
  )

4. Work Accident vs. Attrition

chisq.test(hr$Work_accident, hr$left)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  hr$Work_accident and hr$left
## X-squared = 357.56, df = 1, p-value < 2.2e-16

- Technical: The p-value is below 2.2e-16, so a result like this would be extremely unlikely if work accidents and attrition were unrelated. The chi-squared statistic of 357.56 on 1 degree of freedom indicates a statistically significant association between having had a work accident and leaving the company. The test does not show the direction of the association or that accidents cause attrition.

- Non-technical: Whether an employee has had a work accident is related to whether they leave the company; the attrition rates in the chart below show which group leaves more often.
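
Reading the direction correctly depends on knowing what the 0/1 codes mean. One way to reduce the risk of mislabeling is to attach the labels to the values with `factor()` before tabulating or plotting. The sketch below uses made-up rows and assumes that `Work_accident = 1` means the employee had an accident, which should be checked against the dataset's documentation.

```r
# Toy rows (made up); assumes 1 = had a work accident, 0 = no accident
toy <- data.frame(
  Work_accident = c(0, 0, 1, 0, 1),
  left          = c(1, 0, 0, 1, 0)
)

# Bind each label to its code explicitly instead of relying on position
toy$accident_label <- factor(
  toy$Work_accident,
  levels = c(0, 1),
  labels = c("No Accident", "Accident")
)

print(table(accident = toy$accident_label, left = toy$left))
```

Because each label is tied to its code in one place, the same labeled column can feed both the summary table and the chart.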

accident_data <- hr %>%
  group_by(Work_accident) %>%
  summarise(
    stayed = sum(left == 0) / n(),
    left = sum(left == 1) / n()
  )
plot_ly(accident_data) %>%
  add_bars(x = ~Work_accident, y = ~stayed, name = "Stayed", 
           marker = list(color = "#1f77b4")) %>%
  add_bars(x = ~Work_accident, y = ~left, name = "Left", 
           marker = list(color = "#ff7f0e")) %>%
  layout(
    barmode = "stack", 
    xaxis = list(
      title = "Accidents at Work",
      tickvals = c(0, 1),  
      ticktext = c("No Accident", "Accident")  # 0 = no accident, 1 = accident
    ),
    yaxis = list(title = "Proportion", tickformat = ",.0%"),
    title = "Work Accidents Are Associated with Attrition"
  )