A large company, XYZ, employs around 4000 employees at any given point in time.
However, every year, around 15% of its employees leave the company.
Since the attrition level is too high, the management wants to use predictive modelling to bring it down.
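To put these figures in perspective, the short sketch below turns the two approximate numbers quoted above into an expected yearly headcount loss. The variable names are illustrative only.

```python
# Both figures come from the text above and are approximate ("around").
headcount = 4000
annual_attrition_rate = 0.15

# Expected number of employees leaving per year at that rate.
leavers_per_year = round(headcount * annual_attrition_rate)
print(leavers_per_year)
```

At this rate the company has to replace several hundred people every year, which is why even a modest reduction in attrition matters.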
Hence, the objectives of the analysis are to:
Help company XYZ identify current employees that are very likely to leave
Recommend ways for company XYZ to decrease its attrition level in the future
CRISP-DM is a data mining process model that describes commonly used approaches that data mining experts
use to tackle problems (https://en.wikipedia.org/wiki/Cross-industry_standard_process_for_data_mining).
The analysis is divided into three parts:
Data Understanding – Source of data, patterns in the data
Predictive modelling of attrition
Recommending ways for company XYZ to decrease its level of attrition
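The text does not say which model was used for the predictive modelling step. Because the findings later refer to significant coefficients, a regression-type classifier such as logistic regression is one plausible choice; that is an assumption, not a statement of the author's method. The sketch below fits a one-feature logistic regression by gradient descent on made-up toy data, using only the standard library.

```python
import math

def sigmoid(z):
    """Logistic function: maps a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit p(y=1 | x) = sigmoid(w*x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b

# Made-up toy data for illustration only (not the company's records).
ages = [25, 28, 30, 31, 35, 38, 41, 45]
left = [1, 1, 1, 0, 1, 0, 0, 0]  # hypothetical coding: 1 = left, 0 = stayed

# Standardize the feature so gradient descent behaves well.
mean_age = sum(ages) / len(ages)
std_age = math.sqrt(sum((a - mean_age) ** 2 for a in ages) / len(ages))
scaled = [(a - mean_age) / std_age for a in ages]

w, b = fit_logistic(scaled, left)

def leave_probability(age):
    """Predicted probability of leaving for a given age under the toy model."""
    return sigmoid(w * (age - mean_age) / std_age + b)

print(f"w = {w:.2f}, b = {b:.2f}")
print(f"p(leave | age 26) = {leave_probability(26):.2f}")
print(f"p(leave | age 44) = {leave_probability(44):.2f}")
```

A negative weight on age means the predicted probability of leaving falls as age rises. A real analysis would use all features, a held-out test set, and a statistics library that also reports significance tests for the coefficients.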
EDA (https://en.wikipedia.org/wiki/Exploratory_data_analysis) is the first step in every data analysis workflow.
It is crucial for understanding the features and for choosing suitable analysis techniques, methods and algorithms.
Features are the variables that describe data in a data set (its “properties”). The response variable, or label, of a data set describes the output of interest. Here, for every employee we have a label that tells us whether the employee left the company or stayed.
High-level exploration of the features and how they relate to the response variable can give us an intuitive understanding of major patterns, influencing factors and phenomena. Visualization can further help to get a feel for the data and to communicate its main characteristics. Here, I used correlation analysis and distribution plots (bar and box plots).
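As a minimal sketch of the correlation step, the snippet below computes the Pearson correlation between one numeric feature and a binary label. The data is made up for illustration, and the 1/0 coding of the label is an assumption.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up toy records for illustration only (not the company's data).
ages = [25, 28, 30, 31, 35, 38, 41, 45]
left = [1, 1, 1, 0, 1, 0, 0, 0]  # hypothetical coding: 1 = left, 0 = stayed

r = pearson(ages, left)
print(f"correlation(age, left) = {r:.2f}")
```

A negative value indicates that, in this toy sample, older employees leave less often. Correlation only captures linear association, so it is best read alongside the distribution plots.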
The data received for the analysis can be divided into 4 broad categories -
Data Looks Like
Description
Age
-Employees aged 36 years and above are more likely to stay
-Employees aged 32 years and below are more likely to leave
Experience
-Employees that have worked for a total of 10 years or more are more likely to stay
-Employees that have worked for a total of 7 years or less are more likely to leave
Among employees who left, median age = 32 years and median experience = 7 years
Among employees who stayed, median age = 36 years and median experience = 10 years
Coefficients of the variables Age and TotalWorkingYears are significant.
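The group medians quoted above can be computed along the following lines. The records are made-up toy data, deliberately chosen so that the medians match the reported values; the "Attrition" column name and its "Yes"/"No" coding are assumptions.

```python
from statistics import median

# Made-up toy records, chosen so the group medians match the values in the text.
employees = [
    {"Age": 29, "TotalWorkingYears": 5, "Attrition": "Yes"},
    {"Age": 32, "TotalWorkingYears": 7, "Attrition": "Yes"},
    {"Age": 34, "TotalWorkingYears": 9, "Attrition": "Yes"},
    {"Age": 33, "TotalWorkingYears": 8, "Attrition": "No"},
    {"Age": 36, "TotalWorkingYears": 10, "Attrition": "No"},
    {"Age": 40, "TotalWorkingYears": 15, "Attrition": "No"},
]

def median_by_group(rows, column, group_column="Attrition"):
    """Return {group value: median of `column` within that group}."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_column], []).append(row[column])
    return {group: median(values) for group, values in groups.items()}

age_medians = median_by_group(employees, "Age")
experience_medians = median_by_group(employees, "TotalWorkingYears")
print(age_medians)
print(experience_medians)
```

Medians are used rather than means because they are less affected by a few very senior or very junior employees.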
Description
Training
-Employees that got 3 or more training sessions last year are more likely to stay
-Employees that got 2 or fewer training sessions last year are more likely to leave
Years with Current Manager
-Employees that have spent 3 years or more under the same manager are more likely to stay
-Employees that have spent 2 years or less under the same manager are more likely to leave
-Coefficients of the variables TrainingTimesLastYear and YearsWithCurrManager are significant.
-The rest of the findings are based on means and medians.
Description
Job Satisfaction
-Employees that have medium, high or very high levels of job satisfaction are more likely to stay
-Employees that have low levels of job satisfaction are more likely to leave
Environment Satisfaction
-Employees that have medium, high or very high levels of environment satisfaction are more likely to stay
-Employees that have low levels of environment satisfaction are more likely to leave
-Coefficients of the variables JobSatisfaction and EnvironmentSatisfaction are significant.
-Employees were asked to report their job satisfaction and work environment satisfaction levels in a survey.
Description
Average Work Hours
-Employees that work 7.3 hours or less on average are more likely to stay
-Employees that work 8.2 hours or more on average are more likely to leave
Work Life Balance
-Employees that rated their work life balance as good, better or best are more likely to stay
-Employees that rated their work life balance as bad are more likely to leave
-Coefficients of the variables AverageWorkTime and WorkLifeBalance are significant.
-The average work hours data is based on means and medians.
-Employees were asked to report their level of work life balance in a survey.
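One simple way to act on the findings above is to count how many of the reported "more likely to leave" conditions apply to an employee. The sketch below is an illustrative rule-of-thumb, not the predictive model used in the analysis. The thresholds come from the text; the dictionary keys loosely follow the variable names in the text, and the string codes ("Low", "Bad") and the sample employee are assumptions.

```python
# Each rule mirrors one "more likely to leave" finding reported above.
RISK_RULES = {
    "Age": lambda v: v <= 32,
    "TotalWorkingYears": lambda v: v <= 7,
    "TrainingTimesLastYear": lambda v: v <= 2,
    "YearsWithCurrManager": lambda v: v <= 2,
    "JobSatisfaction": lambda v: v == "Low",          # assumed coding
    "EnvironmentSatisfaction": lambda v: v == "Low",  # assumed coding
    "AverageWorkTime": lambda v: v >= 8.2,
    "WorkLifeBalance": lambda v: v == "Bad",          # assumed coding
}

def risk_factors(employee):
    """Return the names of the rules that flag this employee."""
    return [name for name, rule in RISK_RULES.items() if rule(employee[name])]

# Hypothetical employee record, for illustration only.
example = {
    "Age": 29,
    "TotalWorkingYears": 5,
    "TrainingTimesLastYear": 3,
    "YearsWithCurrManager": 1,
    "JobSatisfaction": "High",
    "EnvironmentSatisfaction": "Low",
    "AverageWorkTime": 8.5,
    "WorkLifeBalance": "Good",
}

flags = risk_factors(example)
print(len(flags), flags)
```

Values that fall between the reported thresholds (for example an age of 34) are not flagged either way. The count treats every factor as equally important, which a fitted model would not do, so it is better suited to a first screening than to a final decision.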
Current employees:
Work life balance should be improved
Work environment should be improved
The manager of an employee should not be changed very often
Employees, especially younger ones, should be provided with relevant training regularly
Future employees (changes in hiring process):
The company should follow one of the strategies given below:
Hire older people with decent work experience
Hire young people and train them appropriately
It could also opt for a combination of the two