##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
## Loading required package: ggpubr
##
## Attaching package: 'survminer'
## The following object is masked from 'package:survival':
##
## myeloma
people_analytics <- read.csv("WA_Fn-UseC_-HR-Employee-Attrition.csv")
str(people_analytics)
## 'data.frame': 1470 obs. of 35 variables:
## $ Age : int 41 49 37 33 27 32 59 30 38 36 ...
## $ Attrition : chr "Yes" "No" "Yes" "No" ...
## $ BusinessTravel : chr "Travel_Rarely" "Travel_Frequently" "Travel_Rarely" "Travel_Frequently" ...
## $ DailyRate : int 1102 279 1373 1392 591 1005 1324 1358 216 1299 ...
## $ Department : chr "Sales" "Research & Development" "Research & Development" "Research & Development" ...
## $ DistanceFromHome : int 1 8 2 3 2 2 3 24 23 27 ...
## $ Education : int 2 1 2 4 1 2 3 1 3 3 ...
## $ EducationField : chr "Life Sciences" "Life Sciences" "Other" "Life Sciences" ...
## $ EmployeeCount : int 1 1 1 1 1 1 1 1 1 1 ...
## $ EmployeeNumber : int 1 2 4 5 7 8 10 11 12 13 ...
## $ EnvironmentSatisfaction : int 2 3 4 4 1 4 3 4 4 3 ...
## $ Gender : chr "Female" "Male" "Male" "Female" ...
## $ HourlyRate : int 94 61 92 56 40 79 81 67 44 94 ...
## $ JobInvolvement : int 3 2 2 3 3 3 4 3 2 3 ...
## $ JobLevel : int 2 2 1 1 1 1 1 1 3 2 ...
## $ JobRole : chr "Sales Executive" "Research Scientist" "Laboratory Technician" "Research Scientist" ...
## $ JobSatisfaction : int 4 2 3 3 2 4 1 3 3 3 ...
## $ MaritalStatus : chr "Single" "Married" "Single" "Married" ...
## $ MonthlyIncome : int 5993 5130 2090 2909 3468 3068 2670 2693 9526 5237 ...
## $ MonthlyRate : int 19479 24907 2396 23159 16632 11864 9964 13335 8787 16577 ...
## $ NumCompaniesWorked : int 8 1 6 1 9 0 4 1 0 6 ...
## $ Over18 : chr "Y" "Y" "Y" "Y" ...
## $ OverTime : chr "Yes" "No" "Yes" "Yes" ...
## $ PercentSalaryHike : int 11 23 15 11 12 13 20 22 21 13 ...
## $ PerformanceRating : int 3 4 3 3 3 3 4 4 4 3 ...
## $ RelationshipSatisfaction: int 1 4 2 3 4 3 1 2 2 2 ...
## $ StandardHours : int 80 80 80 80 80 80 80 80 80 80 ...
## $ StockOptionLevel : int 0 1 0 0 1 0 3 1 0 2 ...
## $ TotalWorkingYears : int 8 10 7 8 6 8 12 1 10 17 ...
## $ TrainingTimesLastYear : int 0 3 3 3 3 2 3 2 2 3 ...
## $ WorkLifeBalance : int 1 3 3 3 3 2 2 3 3 2 ...
## $ YearsAtCompany : int 6 10 0 8 2 7 1 1 9 7 ...
## $ YearsInCurrentRole : int 4 7 0 7 2 7 0 0 7 7 ...
## $ YearsSinceLastPromotion : int 0 1 0 3 2 3 0 0 1 7 ...
## $ YearsWithCurrManager : int 5 7 0 0 2 6 0 0 8 7 ...
Courtesy of IBM, for research purposes: the IBM HR Analytics Employee Attrition & Performance dataset describes employee attributes and workplace conditions within a single organisation. It includes information on age, gender, job roles, satisfaction levels, and performance indicators, which makes it possible to examine the factors associated with employee attrition, satisfaction, and performance. This information can support HR strategies aimed at improving staff engagement, retention, and overall organisational performance. The dataset contains 1470 observations across 35 variables.
The goal of analysing this dataset is to identify patterns and trends in employee attrition, performance, and satisfaction. By investigating the relationships between variables such as age, gender, job role, and satisfaction levels, we hope to find factors that influence employee turnover and engagement. Ultimately, we aim to gain actionable insights that inform HR decision-making and support focused interventions to increase employee well-being and organisational effectiveness.
summary(people_analytics)
## Age Attrition BusinessTravel DailyRate
## Min. :18.00 Length:1470 Length:1470 Min. : 102.0
## 1st Qu.:30.00 Class :character Class :character 1st Qu.: 465.0
## Median :36.00 Mode :character Mode :character Median : 802.0
## Mean :36.92 Mean : 802.5
## 3rd Qu.:43.00 3rd Qu.:1157.0
## Max. :60.00 Max. :1499.0
## Department DistanceFromHome Education EducationField
## Length:1470 Min. : 1.000 Min. :1.000 Length:1470
## Class :character 1st Qu.: 2.000 1st Qu.:2.000 Class :character
## Mode :character Median : 7.000 Median :3.000 Mode :character
## Mean : 9.193 Mean :2.913
## 3rd Qu.:14.000 3rd Qu.:4.000
## Max. :29.000 Max. :5.000
## EmployeeCount EmployeeNumber EnvironmentSatisfaction Gender
## Min. :1 Min. : 1.0 Min. :1.000 Length:1470
## 1st Qu.:1 1st Qu.: 491.2 1st Qu.:2.000 Class :character
## Median :1 Median :1020.5 Median :3.000 Mode :character
## Mean :1 Mean :1024.9 Mean :2.722
## 3rd Qu.:1 3rd Qu.:1555.8 3rd Qu.:4.000
## Max. :1 Max. :2068.0 Max. :4.000
## HourlyRate JobInvolvement JobLevel JobRole
## Min. : 30.00 Min. :1.00 Min. :1.000 Length:1470
## 1st Qu.: 48.00 1st Qu.:2.00 1st Qu.:1.000 Class :character
## Median : 66.00 Median :3.00 Median :2.000 Mode :character
## Mean : 65.89 Mean :2.73 Mean :2.064
## 3rd Qu.: 83.75 3rd Qu.:3.00 3rd Qu.:3.000
## Max. :100.00 Max. :4.00 Max. :5.000
## JobSatisfaction MaritalStatus MonthlyIncome MonthlyRate
## Min. :1.000 Length:1470 Min. : 1009 Min. : 2094
## 1st Qu.:2.000 Class :character 1st Qu.: 2911 1st Qu.: 8047
## Median :3.000 Mode :character Median : 4919 Median :14236
## Mean :2.729 Mean : 6503 Mean :14313
## 3rd Qu.:4.000 3rd Qu.: 8379 3rd Qu.:20462
## Max. :4.000 Max. :19999 Max. :26999
## NumCompaniesWorked Over18 OverTime PercentSalaryHike
## Min. :0.000 Length:1470 Length:1470 Min. :11.00
## 1st Qu.:1.000 Class :character Class :character 1st Qu.:12.00
## Median :2.000 Mode :character Mode :character Median :14.00
## Mean :2.693 Mean :15.21
## 3rd Qu.:4.000 3rd Qu.:18.00
## Max. :9.000 Max. :25.00
## PerformanceRating RelationshipSatisfaction StandardHours StockOptionLevel
## Min. :3.000 Min. :1.000 Min. :80 Min. :0.0000
## 1st Qu.:3.000 1st Qu.:2.000 1st Qu.:80 1st Qu.:0.0000
## Median :3.000 Median :3.000 Median :80 Median :1.0000
## Mean :3.154 Mean :2.712 Mean :80 Mean :0.7939
## 3rd Qu.:3.000 3rd Qu.:4.000 3rd Qu.:80 3rd Qu.:1.0000
## Max. :4.000 Max. :4.000 Max. :80 Max. :3.0000
## TotalWorkingYears TrainingTimesLastYear WorkLifeBalance YearsAtCompany
## Min. : 0.00 Min. :0.000 Min. :1.000 Min. : 0.000
## 1st Qu.: 6.00 1st Qu.:2.000 1st Qu.:2.000 1st Qu.: 3.000
## Median :10.00 Median :3.000 Median :3.000 Median : 5.000
## Mean :11.28 Mean :2.799 Mean :2.761 Mean : 7.008
## 3rd Qu.:15.00 3rd Qu.:3.000 3rd Qu.:3.000 3rd Qu.: 9.000
## Max. :40.00 Max. :6.000 Max. :4.000 Max. :40.000
## YearsInCurrentRole YearsSinceLastPromotion YearsWithCurrManager
## Min. : 0.000 Min. : 0.000 Min. : 0.000
## 1st Qu.: 2.000 1st Qu.: 0.000 1st Qu.: 2.000
## Median : 3.000 Median : 1.000 Median : 3.000
## Mean : 4.229 Mean : 2.188 Mean : 4.123
## 3rd Qu.: 7.000 3rd Qu.: 3.000 3rd Qu.: 7.000
## Max. :18.000 Max. :15.000 Max. :17.000
sapply(people_analytics, function(x) sum(is.na(x)))
## Age Attrition BusinessTravel
## 0 0 0
## DailyRate Department DistanceFromHome
## 0 0 0
## Education EducationField EmployeeCount
## 0 0 0
## EmployeeNumber EnvironmentSatisfaction Gender
## 0 0 0
## HourlyRate JobInvolvement JobLevel
## 0 0 0
## JobRole JobSatisfaction MaritalStatus
## 0 0 0
## MonthlyIncome MonthlyRate NumCompaniesWorked
## 0 0 0
## Over18 OverTime PercentSalaryHike
## 0 0 0
## PerformanceRating RelationshipSatisfaction StandardHours
## 0 0 0
## StockOptionLevel TotalWorkingYears TrainingTimesLastYear
## 0 0 0
## WorkLifeBalance YearsAtCompany YearsInCurrentRole
## 0 0 0
## YearsSinceLastPromotion YearsWithCurrManager
## 0 0
# Convert the categorical (character) columns to factors in one step
categorical_cols <- c("Attrition", "BusinessTravel", "Department",
                      "EducationField", "Gender", "JobRole",
                      "MaritalStatus", "Over18", "OverTime")
people_analytics[categorical_cols] <- lapply(people_analytics[categorical_cols], factor)
This chart shows a clear association between job satisfaction and attrition: as job satisfaction rises from level 1 to level 4, the percentage of employees who leave falls, although levels 2 and 3 are almost identical.
Employees with job satisfaction level 1 have an attrition rate of 22.8%.
At level 2, the attrition rate falls to 16.6%.
At level 3, it is 16.5% (83.5% did not leave).
Employees with job satisfaction level 4 have the lowest attrition rate, at 11.3% (88.7% did not leave).
Higher job satisfaction is therefore linked to lower attrition, which underlines the need to address the drivers of job satisfaction in order to reduce employee turnover.
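The percentages above are row-wise proportions of a satisfaction-by-attrition cross-tabulation. The chart code is not reproduced in this write-up, so the following is only a minimal sketch of that calculation on a small made-up data frame (`toy` and its values are invented for illustration, not taken from the IBM data); the column names mirror those in `people_analytics`.

```r
# Toy stand-in for people_analytics; the values are invented for illustration.
toy <- data.frame(
  JobSatisfaction = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 4, 4, 4),
  Attrition = factor(c("Yes", "Yes", "No", "No",
                       "Yes", "No", "No", "No",
                       "No", "Yes",
                       "No", "No", "No", "Yes"))
)

# Cross-tabulate satisfaction level against attrition status.
counts <- table(toy$JobSatisfaction, toy$Attrition)

# margin = 1 normalises each row, giving the share of "No"/"Yes" per level.
rates <- round(100 * prop.table(counts, margin = 1), 1)
print(rates)
```

Applied to `people_analytics$JobSatisfaction` and `people_analytics$Attrition`, the same two calls would be expected to give the per-level attrition percentages discussed above.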
This information is useful for identifying departments with higher-than-average attrition rates, highlighting potential areas for additional research or action. Furthermore, it aids in identifying departments with reduced attrition rates, which may serve as models for best practices or areas where retention measures are particularly effective.
## Call:
## coxph(formula = SurvivalData ~ Age + Gender + Department + JobSatisfaction +
## Education + DistanceFromHome + EnvironmentSatisfaction, data = people_analytics)
##
## n= 1470, number of events= 237
##
## coef exp(coef) se(coef) z Pr(>|z|)
## Age -0.092092 0.912021 0.010354 -8.894 < 2e-16
## GenderMale 0.136691 1.146474 0.136033 1.005 0.314977
## DepartmentResearch & Development -0.331178 0.718077 0.303638 -1.091 0.275404
## DepartmentSales 0.037113 1.037810 0.308860 0.120 0.904355
## JobSatisfaction -0.237065 0.788940 0.057515 -4.122 3.76e-05
## Education -0.003332 0.996674 0.064957 -0.051 0.959094
## DistanceFromHome 0.020144 1.020348 0.007564 2.663 0.007741
## EnvironmentSatisfaction -0.224178 0.799173 0.060131 -3.728 0.000193
##
## Age ***
## GenderMale
## DepartmentResearch & Development
## DepartmentSales
## JobSatisfaction ***
## Education
## DistanceFromHome **
## EnvironmentSatisfaction ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## exp(coef) exp(-coef) lower .95 upper .95
## Age 0.9120 1.0965 0.8937 0.9307
## GenderMale 1.1465 0.8722 0.8782 1.4968
## DepartmentResearch & Development 0.7181 1.3926 0.3960 1.3021
## DepartmentSales 1.0378 0.9636 0.5665 1.9012
## JobSatisfaction 0.7889 1.2675 0.7048 0.8831
## Education 0.9967 1.0033 0.8775 1.1320
## DistanceFromHome 1.0203 0.9801 1.0053 1.0356
## EnvironmentSatisfaction 0.7992 1.2513 0.7103 0.8991
##
## Concordance= 0.737 (se = 0.018 )
## Likelihood ratio test= 147.8 on 8 df, p=<2e-16
## Wald test = 129.9 on 8 df, p=<2e-16
## Score (logrank) test = 134.9 on 8 df, p=<2e-16
The Cox proportional hazards model is useful here because it handles censored observations, that is, employees who have not experienced the event of interest (leaving the company) by the end of the observation period, and because it estimates the effect of several covariates at once. Its output is also easy to interpret: each hazard ratio shows how strongly a characteristic is associated with the risk of leaving.
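The code that built `SurvivalData` and `cox_model` is not shown in this write-up. A minimal sketch of how a model of this kind can be fitted with the `survival` package follows, on simulated data. Using years at the company as the time variable and `Attrition == "Yes"` as the event indicator is an assumption, as are the object names `sim` and `fit` and the simulated effect sizes. Employees still with the company at the end of follow-up are treated as right-censored rather than dropped.

```r
library(survival)

set.seed(42)
n <- 200

# Simulated stand-in for the HR data (values are invented for illustration).
sim <- data.frame(
  Age = sample(18:60, n, replace = TRUE),
  JobSatisfaction = sample(1:4, n, replace = TRUE)
)

# Assumed data-generating process: higher satisfaction lowers the leaving hazard.
time_to_leave <- rexp(n, rate = 0.2 * exp(-0.3 * sim$JobSatisfaction))

# Observation stops after 10 years; anyone still employed then is censored.
sim$YearsAtCompany <- pmin(time_to_leave, 10)
sim$Attrition <- factor(ifelse(time_to_leave <= 10, "Yes", "No"))

# Surv() pairs each follow-up time with an event flag
# (TRUE = left the company, FALSE = censored).
fit <- coxph(Surv(YearsAtCompany, Attrition == "Yes") ~ Age + JobSatisfaction,
             data = sim)

# exp(coef) in the summary are the hazard ratios.
print(summary(fit))
```

Building the `Surv()` object inside the formula keeps the response tied to the `data` argument; creating it beforehand, as the `SurvivalData` name in the output suggests, is an equivalent alternative.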
This analysis employing the Cox proportional hazards regression model provides insight into important predictors of attrition inside an organisation.
Here’s what it means for businesses.
Age: The coefficient for age is negative (-0.092092; hazard ratio 0.912, p < 0.001), so each additional year of age is associated with a roughly 9% lower hazard of attrition. In other words, older employees are less likely to leave the organisation than younger ones.
Gender: The coefficient for male employees is positive (0.136691) but not statistically significant (p = 0.315), so this model provides no evidence that gender affects attrition.
Department: Relative to the reference department (Human Resources), employees in Research & Development have a lower estimated risk of attrition (hazard ratio 0.718), but the difference is not statistically significant (p = 0.275). Employees in Sales have a marginally higher estimated risk (hazard ratio 1.038), which is also not statistically significant (p = 0.904).
Job satisfaction: Higher job satisfaction is associated with a substantially lower risk of attrition. The coefficient is negative (-0.237065) and highly significant (p < 0.001): each one-point increase in job satisfaction corresponds to a roughly 21% lower hazard (hazard ratio 0.789).
Education: Education level shows no detectable effect on attrition in this model (hazard ratio 0.997, p = 0.959).
Distance from home: Employees who live farther from their workplace have a slightly higher risk of attrition. The positive coefficient (0.020144) corresponds to a roughly 2% higher hazard for each additional unit of distance, and the effect is statistically significant (p = 0.008).
Environment satisfaction: Like job satisfaction, environment satisfaction is strongly and negatively associated with turnover. Each one-point increase corresponds to a roughly 20% lower hazard (coefficient -0.224178, hazard ratio 0.799, p < 0.001).
Overall, the model’s concordance statistic (c-index) of 0.737 indicates reasonably good discrimination: in about 74% of comparable pairs of employees, the model assigns the higher risk to the one who leaves earlier. The likelihood ratio, Wald, and score (logrank) tests all reject the hypothesis that every coefficient is zero (p < 2e-16), so the predictors jointly carry information about attrition.
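The hazard ratios and confidence limits in the summary follow directly from the coefficients and standard errors. As a quick check, the sketch below recomputes them for the four statistically significant predictors, using values copied from the output above; the names `coefs`, `ses`, and `hr_table` are illustrative.

```r
# Coefficients and standard errors copied from the coxph summary above.
coefs <- c(Age = -0.092092, JobSatisfaction = -0.237065,
           DistanceFromHome = 0.020144, EnvironmentSatisfaction = -0.224178)
ses <- c(Age = 0.010354, JobSatisfaction = 0.057515,
         DistanceFromHome = 0.007564, EnvironmentSatisfaction = 0.060131)

# Hazard ratio = exp(coef); Wald 95% limits = exp(coef -/+ z * se).
z <- qnorm(0.975)
hr_table <- data.frame(
  hazard_ratio = exp(coefs),
  lower_95 = exp(coefs - z * ses),
  upper_95 = exp(coefs + z * ses),
  # Percentage change in hazard per one-unit increase in the predictor.
  pct_change = 100 * (exp(coefs) - 1)
)
print(round(hr_table, 4))
```

Read this way, the age row corresponds to a hazard about 8.8% lower for each additional year of age, holding the other predictors fixed.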
ggforest(cox_model, main = "Cox Proportional Hazards Regression")
The key takeaway here is that attrition rates vary considerably across job roles within the firm. For example:
Healthcare representatives have a comparatively low attrition rate of 6.9%.
HR professionals have a higher attrition rate of 23.1%.
Laboratory technicians likewise have a high attrition rate (23.9%).
Managers have a comparatively low attrition rate of 4.9%.
Sales representatives have the highest attrition rate of all job roles, at 39.8%.
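To rank roles by attrition rate, one option is to average a TRUE/FALSE leaver indicator within each role. The sketch below does this on an invented data frame (`toy_roles`, its rows, and the helper column `left` are illustrative only, not the IBM data).

```r
# Toy data: a few employees per role, values invented for illustration.
toy_roles <- data.frame(
  JobRole = c("Manager", "Manager", "Manager", "Manager",
              "Sales Representative", "Sales Representative",
              "Sales Representative", "Sales Representative",
              "Laboratory Technician", "Laboratory Technician"),
  Attrition = c("No", "No", "No", "Yes",
                "Yes", "Yes", "No", "Yes",
                "Yes", "No")
)

# The mean of a logical vector is the proportion of TRUE values.
toy_roles$left <- toy_roles$Attrition == "Yes"
role_rates <- aggregate(left ~ JobRole, data = toy_roles, FUN = mean)

# Express as a percentage and sort from highest to lowest attrition.
role_rates$attrition_pct <- 100 * role_rates$left
role_rates <- role_rates[order(-role_rates$attrition_pct), ]
print(role_rates)
```

Sorting in descending order puts the roles that most need attention at the top of the table.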
Understanding attrition rates by job role can help businesses in several ways:
Identifying Risk Areas: It shows which job roles have higher turnover rates, allowing management to focus retention efforts where they are most needed.
Retention Plans: Armed with this knowledge, firms can create focused retention plans for specific job roles, lowering turnover and retaining key people.
Succession Planning: High attrition rates in key job roles, such as sales representatives, may indicate a need for effective succession planning to ensure continuity and smooth transitions when people depart.
Resource Allocation: It enables firms to allocate resources more efficiently by directing more of them to roles with high attrition rates in order to address the underlying causes of turnover.
Overall, this analysis sheds light on workforce dynamics and can help guide strategic decisions aimed at increasing staff retention and organisational stability.
This data reveals that turnover rates vary across departments within the organisation:
Human Resources: The department has a 19% attrition rate.
Research & Development: The department has the lowest attrition rate of the three, at 13.8%.
Sales: The department has the highest attrition rate of the three, at 20.6%.
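One way to make "higher than average" concrete is to compare each department's rate with the overall rate. The sketch below uses the departmental percentages quoted above and takes the overall rate as 237 leavers out of 1470 employees, on the assumption that the 237 events in the Cox summary correspond to attrition; the names `dept_rates`, `overall_rate`, and `gap` are illustrative.

```r
# Departmental attrition rates (%) as quoted in the text.
dept_rates <- c("Human Resources" = 19.0,
                "Research & Development" = 13.8,
                "Sales" = 20.6)

# Overall attrition rate, assuming 237 leavers among 1470 employees.
overall_rate <- 100 * 237 / 1470

# Positive values mean the department is above the company-wide rate.
gap <- round(dept_rates - overall_rate, 1)
print(round(overall_rate, 1))
print(gap)

# Departments with above-average attrition.
print(names(gap)[gap > 0])
```

On these figures, Human Resources and Sales sit above the company-wide rate and Research & Development sits below it.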
Understanding the departmental attrition rates provides useful insights for business management.
Identifying Areas of Concern: Departments with higher attrition rates, such as Sales and Human Resources, may warrant management’s attention to better understand the underlying causes of turnover and develop retention initiatives.
Focused Interventions: Armed with department-specific attrition rates, firms may create focused interventions and policies that address the particular challenges and dynamics of each department, reducing turnover and retaining valued talent.
Resource Allocation: The analysis aids resource optimisation by directing more resources, such as training programmes or employee engagement efforts, to departments with higher attrition rates in order to manage retention difficulties effectively.
Strategic Planning: Understanding departmental attrition rates can help with long-term strategic planning, such as succession planning and talent management programmes, ensuring the organisation’s continuity and stability.
Understanding and resolving departmental disparities in attrition rates allows firms to improve employee satisfaction, minimise turnover costs, and maintain a stable and productive workforce.