IBM (International Business Machines Corporation) is an American multinational technology company headquartered in Armonk, New York, United States, with operations in over 170 countries. IBM has a large and diverse portfolio of products and services. As of 2016, these offerings fall into the categories of cloud computing, cognitive computing, commerce, data and analytics, Internet of Things, IT infrastructure, mobile, and security.
IBM Cloud includes infrastructure as a service (IaaS), software as a service (SaaS) and platform as a service (PaaS) offered through public, private and hybrid cloud delivery models. For instance, the IBM Bluemix PaaS enables developers to quickly create complex websites on a pay-as-you-go model. IBM SoftLayer is a dedicated server, managed hosting and cloud computing provider, which in 2011 reported hosting more than 81,000 servers for more than 26,000 customers. IBM also provides Cloud Data Encryption Services (ICDES), using cryptographic splitting to secure customer data.
Hardware designed by IBM for these categories includes IBM’s POWER microprocessors, which are employed inside many console gaming systems, including Xbox 360, PlayStation 3, and Nintendo’s Wii U. IBM Secure Blue is encryption hardware that can be built into microprocessors, and in 2014, the company revealed it was investing $3 billion over the following five years to design a neural chip that mimics the human brain, with 10 billion neurons and 100 trillion synapses, but that uses just 1 kilowatt of power. In 2016, the company launched all-flash arrays designed for small and midsized companies, which include software for data compression, provisioning, and snapshots across various systems.
IBM headquarters in Armonk, New York
IBM has one of the largest workforces in the world, and employees at Big Blue are referred to as “IBMers”. The company was among the first corporations to provide group life insurance, survivor benefits, training for women, paid vacations, and training for disabled people. IBM has several leadership development and recognition programs to recognize employee potential and achievements. For early-career high-potential employees, IBM sponsors leadership development programs by discipline (e.g., general management, human resources, finance). Each year, the company also selects 500 IBMers for the IBM Corporate Service Corps, which has been described as the corporate equivalent of the Peace Corps and gives top employees a month to do humanitarian work abroad. For certain interns, IBM also has a program called Extreme Blue that partners top business and technical students to develop high-value technology and compete to present their business case to the company’s CEO at internship’s end.
Employees
The company also has various designations for exceptional individual contributors such as Senior Technical Staff Member, Research Staff Member, Distinguished Engineer, and Distinguished Designer. The company’s most prestigious designation is that of IBM Fellow.
This dataset describes factors that lead to employee attrition and helps us answer questions such as “How does distance from home affect an employee’s job involvement?”, “What role does the job environment play in determining job satisfaction?”, and “How are hourly rate and income related?”.
The data come from a survey and are laid out as a dataset of rows and columns.
Some of the relevant columns are described below:
1. Age: The employee’s age in years.
2. Attrition: ‘Yes’ if the employee has left the company, ‘No’ otherwise.
3. BusinessTravel: How frequently the employee travels on business.
4. DailyRate: The employee’s daily rate.
5. Department: The department of the company in which the employee works.
6. DistanceFromHome: How far the employee lives from the workplace.
7. Education: 1 ‘Below College’, 2 ‘College’, 3 ‘Bachelor’, 4 ‘Master’, 5 ‘Doctor’
8. EducationField: The field in which the employee was educated.
9. EnvironmentSatisfaction: 1 ‘Low’, 2 ‘Medium’, 3 ‘High’, 4 ‘Very High’
10. Gender: ‘Male’ or ‘Female’
11. JobInvolvement: 1 ‘Low’, 2 ‘Medium’, 3 ‘High’, 4 ‘Very High’
12. JobSatisfaction: 1 ‘Low’, 2 ‘Medium’, 3 ‘High’, 4 ‘Very High’
13. PerformanceRating: 1 ‘Low’, 2 ‘Good’, 3 ‘Excellent’, 4 ‘Outstanding’
14. RelationshipSatisfaction: 1 ‘Low’, 2 ‘Medium’, 3 ‘High’, 4 ‘Very High’
15. WorkLifeBalance: 1 ‘Bad’, 2 ‘Good’, 3 ‘Better’, 4 ‘Best’
16. MaritalStatus: ‘Married’, ‘Single’, or ‘Divorced’
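Formula: MonthlyIncome ~ Age + DistanceFromHome + RelationshipSatisfaction + EnvironmentSatisfaction + JobLevel + JobInvolvement + NumCompaniesWorked + WorkLifeBalance. The Age column appears as ï..Age in the code and output because the CSV file starts with a byte-order mark, which read.csv folds into the first column name.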
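# Read the data
pdata <- read.csv(file="IBM-HR-Employee-Attrition.csv")
# Drop the categorical, constant, and identifier columns by position
# (Attrition, BusinessTravel, Department, EducationField, EmployeeCount,
# EmployeeNumber, Gender, JobRole, MaritalStatus, Over18, OverTime)
MyData <- pdata[-c(2,3,5,8,9,10,12,16,18,22,23)]
# Make the columns of MyData available by name (the ANOVA later relies on this)
attach(MyData)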
# Model 1
M1 <- lm(MonthlyIncome~ï..Age
+DistanceFromHome
+RelationshipSatisfaction
+EnvironmentSatisfaction
+JobLevel
+JobInvolvement
+NumCompaniesWorked
+WorkLifeBalance,
data=MyData)
summary(M1)
##
## Call:
## lm(formula = MonthlyIncome ~ ï..Age + DistanceFromHome + RelationshipSatisfaction +
## EnvironmentSatisfaction + JobLevel + JobInvolvement + NumCompaniesWorked +
## WorkLifeBalance, data = MyData)
##
## Residuals:
## Min 1Q Median 3Q Max
## -5276.3 -946.0 98.8 809.2 4013.4
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -1773.325 293.459 -6.043 1.92e-09 ***
## ï..Age 7.659 5.050 1.517 0.12953
## DistanceFromHome -12.742 4.712 -2.704 0.00693 **
## RelationshipSatisfaction 20.083 35.399 0.567 0.57057
## EnvironmentSatisfaction -34.289 34.933 -0.982 0.32647
## JobLevel 4004.090 40.156 99.714 < 2e-16 ***
## JobInvolvement -27.004 53.716 -0.503 0.61524
## NumCompaniesWorked 19.109 16.033 1.192 0.23349
## WorkLifeBalance -33.512 54.170 -0.619 0.53624
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1463 on 1461 degrees of freedom
## Multiple R-squared: 0.904, Adjusted R-squared: 0.9035
## F-statistic: 1720 on 8 and 1461 DF, p-value: < 2.2e-16
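The regression above models MonthlyIncome as a linear function of Age, DistanceFromHome, RelationshipSatisfaction, EnvironmentSatisfaction, JobLevel, JobInvolvement, NumCompaniesWorked, and WorkLifeBalance, estimated by ordinary least squares. Only JobLevel (p < 2e-16) and DistanceFromHome (p = 0.00693) are significant at the 5% level. The model explains about 90% of the variance in monthly income (R-squared = 0.904), and the very large t value for JobLevel suggests that most of this fit comes from job level.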
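The significant predictors can also be picked out of a fitted model programmatically rather than read off the printed table. The following is a minimal sketch on synthetic data (the data frame `toy` and its values are made up for illustration and are not the IBM file); it filters the coefficient matrix returned by `coef(summary(...))` on its `Pr(>|t|)` column.

```r
set.seed(42)  # synthetic data only; NOT the IBM attrition file
n <- 300
toy <- data.frame(
  JobLevel = sample(1:5, n, replace = TRUE),
  WorkLifeBalance = sample(1:4, n, replace = TRUE)
)
# Income is built to depend on JobLevel only; WorkLifeBalance is pure noise.
toy$MonthlyIncome <- 4000 * toy$JobLevel + rnorm(n, sd = 1500)

fit <- lm(MonthlyIncome ~ JobLevel + WorkLifeBalance, data = toy)

# coef(summary(fit)) is a matrix with columns
# Estimate, Std. Error, t value, Pr(>|t|)
coefs <- coef(summary(fit))

# Keep only the rows whose p-value is below the 5% threshold.
significant <- coefs[coefs[, "Pr(>|t|)"] < 0.05, , drop = FALSE]
print(round(significant, 4))
```

Applied to the model in this report, the same filter would be `coef(summary(M1))`, assuming `M1` has been fitted as shown.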
Formula: WorkLifeBalance ~ TotalWorkingYears + MonthlyIncome + MonthlyRate + RelationshipSatisfaction + JobLevel + PerformanceRating + PercentSalaryHike
# Model 2
M2 <- lm(WorkLifeBalance~TotalWorkingYears
+MonthlyIncome
+MonthlyRate
+RelationshipSatisfaction
+JobLevel
+PerformanceRating
+PercentSalaryHike,
data=MyData)
summary(M2)
##
## Call:
## lm(formula = WorkLifeBalance ~ TotalWorkingYears + MonthlyIncome +
## MonthlyRate + RelationshipSatisfaction + JobLevel + PerformanceRating +
## PercentSalaryHike, data = MyData)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.8341 -0.7189 0.2198 0.2690 1.3540
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2.601e+00 1.942e-01 13.391 <2e-16 ***
## TotalWorkingYears -6.548e-03 3.853e-03 -1.699 0.0895 .
## MonthlyIncome -4.931e-06 1.273e-05 -0.387 0.6986
## MonthlyRate 6.186e-07 2.593e-06 0.239 0.8115
## RelationshipSatisfaction 1.274e-02 1.707e-02 0.746 0.4557
## JobLevel 7.957e-02 5.518e-02 1.442 0.1495
## PerformanceRating 3.024e-02 8.073e-02 0.375 0.7080
## PercentSalaryHike -2.403e-03 7.961e-03 -0.302 0.7628
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7067 on 1462 degrees of freedom
## Multiple R-squared: 0.004147, Adjusted R-squared: -0.0006214
## F-statistic: 0.8697 on 7 and 1462 DF, p-value: 0.5298
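The regression above models WorkLifeBalance as a linear function of TotalWorkingYears, MonthlyIncome, MonthlyRate, RelationshipSatisfaction, JobLevel, PerformanceRating, and PercentSalaryHike, estimated by ordinary least squares. None of the predictors is significant at the 5% level, and the overall F-test (p = 0.5298) gives no evidence that these variables explain work-life balance.
The overall p-value is larger in Model 2 (0.5298 versus < 2.2e-16), whereas R-squared is larger in Model 1 (0.904 versus 0.004), so Model 1 fits its response far better than Model 2 fits its own.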
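To put the fit statistics of two models side by side, R-squared and the overall F-test p-value can be pulled from each `summary()` object. This is a sketch under stated assumptions: the data are synthetic, and `fit_summary` is a hypothetical helper written for this illustration, not part of any package.

```r
set.seed(1)  # synthetic data only; NOT the IBM attrition file
n <- 500
level <- sample(1:5, n, replace = TRUE)
income <- 4000 * level + rnorm(n, sd = 1400)  # strongly driven by level
balance <- sample(1:4, n, replace = TRUE)     # unrelated to level

# Hypothetical helper: collect R-squared, adjusted R-squared and the
# p-value of the overall F-test from a fitted lm object.
fit_summary <- function(model) {
  s <- summary(model)
  f <- s$fstatistic  # named vector: value, numdf, dendf
  c(r_squared = s$r.squared,
    adj_r_squared = s$adj.r.squared,
    f_p_value = unname(pf(f["value"], f["numdf"], f["dendf"],
                          lower.tail = FALSE)))
}

strong <- fit_summary(lm(income ~ level))
weak <- fit_summary(lm(balance ~ level))
print(rbind(strong, weak))
```

Note that the two models in this report have different response variables, so their R-squared values describe how well each model explains its own response and do not rank the models against each other.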
# Summarize the Data
library(psych)
describe(pdata)
## vars n mean sd median trimmed
## ï..Age 1 1470 36.92 9.14 36.0 36.47
## Attrition* 2 1470 1.16 0.37 1.0 1.08
## BusinessTravel* 3 1470 2.61 0.67 3.0 2.76
## DailyRate 4 1470 802.49 403.51 802.0 803.83
## Department* 5 1470 2.26 0.53 2.0 2.25
## DistanceFromHome 6 1470 9.19 8.11 7.0 8.08
## Education 7 1470 2.91 1.02 3.0 2.98
## EducationField* 8 1470 3.25 1.33 3.0 3.10
## EmployeeCount 9 1470 1.00 0.00 1.0 1.00
## EmployeeNumber 10 1470 1024.87 602.02 1020.5 1023.40
## EnvironmentSatisfaction 11 1470 2.72 1.09 3.0 2.78
## Gender* 12 1470 1.60 0.49 2.0 1.62
## HourlyRate 13 1470 65.89 20.33 66.0 66.02
## JobInvolvement 14 1470 2.73 0.71 3.0 2.74
## JobLevel 15 1470 2.06 1.11 2.0 1.90
## JobRole* 16 1470 5.46 2.46 6.0 5.61
## JobSatisfaction 17 1470 2.73 1.10 3.0 2.79
## MaritalStatus* 18 1470 2.10 0.73 2.0 2.12
## MonthlyIncome 19 1470 6502.93 4707.96 4919.0 5667.24
## MonthlyRate 20 1470 14313.10 7117.79 14235.5 14286.48
## NumCompaniesWorked 21 1470 2.69 2.50 2.0 2.36
## Over18* 22 1470 1.00 0.00 1.0 1.00
## OverTime* 23 1470 1.28 0.45 1.0 1.23
## PercentSalaryHike 24 1470 15.21 3.66 14.0 14.80
## PerformanceRating 25 1470 3.15 0.36 3.0 3.07
## RelationshipSatisfaction 26 1470 2.71 1.08 3.0 2.77
## StandardHours 27 1470 80.00 0.00 80.0 80.00
## StockOptionLevel 28 1470 0.79 0.85 1.0 0.67
## TotalWorkingYears 29 1470 11.28 7.78 10.0 10.37
## TrainingTimesLastYear 30 1470 2.80 1.29 3.0 2.72
## WorkLifeBalance 31 1470 2.76 0.71 3.0 2.77
## YearsAtCompany 32 1470 7.01 6.13 5.0 5.99
## YearsInCurrentRole 33 1470 4.23 3.62 3.0 3.85
## YearsSinceLastPromotion 34 1470 2.19 3.22 1.0 1.48
## YearsWithCurrManager 35 1470 4.12 3.57 3.0 3.77
## mad min max range skew kurtosis se
## ï..Age 8.90 18 60 42 0.41 -0.41 0.24
## Attrition* 0.00 1 2 1 1.84 1.39 0.01
## BusinessTravel* 0.00 1 3 2 -1.44 0.69 0.02
## DailyRate 510.01 102 1499 1397 0.00 -1.21 10.52
## Department* 0.00 1 3 2 0.17 -0.40 0.01
## DistanceFromHome 7.41 1 29 28 0.96 -0.23 0.21
## Education 1.48 1 5 4 -0.29 -0.56 0.03
## EducationField* 1.48 1 6 5 0.55 -0.69 0.03
## EmployeeCount 0.00 1 1 0 NaN NaN 0.00
## EmployeeNumber 790.97 1 2068 2067 0.02 -1.23 15.70
## EnvironmentSatisfaction 1.48 1 4 3 -0.32 -1.20 0.03
## Gender* 0.00 1 2 1 -0.41 -1.83 0.01
## HourlyRate 26.69 30 100 70 -0.03 -1.20 0.53
## JobInvolvement 0.00 1 4 3 -0.50 0.26 0.02
## JobLevel 1.48 1 5 4 1.02 0.39 0.03
## JobRole* 2.97 1 9 8 -0.36 -1.20 0.06
## JobSatisfaction 1.48 1 4 3 -0.33 -1.22 0.03
## MaritalStatus* 1.48 1 3 2 -0.15 -1.12 0.02
## MonthlyIncome 3260.24 1009 19999 18990 1.37 0.99 122.79
## MonthlyRate 9201.76 2094 26999 24905 0.02 -1.22 185.65
## NumCompaniesWorked 1.48 0 9 9 1.02 0.00 0.07
## Over18* 0.00 1 1 0 NaN NaN 0.00
## OverTime* 0.00 1 2 1 0.96 -1.07 0.01
## PercentSalaryHike 2.97 11 25 14 0.82 -0.31 0.10
## PerformanceRating 0.00 3 4 1 1.92 1.68 0.01
## RelationshipSatisfaction 1.48 1 4 3 -0.30 -1.19 0.03
## StandardHours 0.00 80 80 0 NaN NaN 0.00
## StockOptionLevel 1.48 0 3 3 0.97 0.35 0.02
## TotalWorkingYears 5.93 0 40 40 1.11 0.91 0.20
## TrainingTimesLastYear 1.48 0 6 6 0.55 0.48 0.03
## WorkLifeBalance 0.00 1 4 3 -0.55 0.41 0.02
## YearsAtCompany 4.45 0 40 40 1.76 3.91 0.16
## YearsInCurrentRole 4.45 0 18 18 0.92 0.47 0.09
## YearsSinceLastPromotion 1.48 0 15 15 1.98 3.59 0.08
## YearsWithCurrManager 4.45 0 17 17 0.83 0.16 0.09
table1 <- xtabs(~ Gender + Attrition, data = pdata)
table1
## Attrition
## Gender No Yes
## Female 501 87
## Male 732 150
About 17% of male employees (150 of 882) show attrition, compared with about 15% of female employees (87 of 588).
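The attrition rate within each gender is a row percentage of the table above. A minimal sketch that recomputes it from the printed counts with `prop.table` (the matrix is typed in by hand here so the example runs without the CSV file):

```r
# Counts copied from the Gender x Attrition table printed above.
counts <- matrix(c(501,  87,
                   732, 150),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(Gender = c("Female", "Male"),
                                 Attrition = c("No", "Yes")))

# margin = 1 divides each cell by its row total,
# giving the share of "No" and "Yes" within each gender.
rates <- prop.table(counts, margin = 1)
print(round(100 * rates, 1))
```

With the data loaded, the same result should be available directly as `prop.table(table1, margin = 1)`.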
table2 <- xtabs(~ WorkLifeBalance+ Attrition, data = pdata)
table2
## Attrition
## WorkLifeBalance No Yes
## 1 55 25
## 2 286 58
## 3 766 127
## 4 126 27
Employees with a work-life balance rating of 3 (‘Better’) have the lowest attrition rate (127 of 893, about 14%), while those with a rating of 1 (‘Bad’) have the highest (25 of 80, about 31%).
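Whether attrition rates really differ across work-life balance levels can be checked with a chi-squared test of independence. The sketch below is one possible check, not part of the original analysis; it uses the counts printed above and the base R function `chisq.test`.

```r
# Counts copied from the WorkLifeBalance x Attrition table printed above.
counts <- matrix(c( 55,  25,
                   286,  58,
                   766, 127,
                   126,  27),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(WorkLifeBalance = as.character(1:4),
                                 Attrition = c("No", "Yes")))

# Pearson chi-squared test of independence between the two factors.
# For a 4 x 2 table the test has (4 - 1) * (2 - 1) = 3 degrees of freedom.
test <- chisq.test(counts)
print(test)

# Expected counts under independence, useful for checking that no cell is too small.
print(round(test$expected, 1))
```

A small p-value here would indicate an association between the two variables, but it does not show which variable influences the other.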
table3 <- xtabs(~ JobSatisfaction+ Attrition, data = pdata)
table3
## Attrition
## JobSatisfaction No Yes
## 1 223 66
## 2 234 46
## 3 369 73
## 4 407 52
Groups with higher job satisfaction have a lower attrition rate: about 23% at level 1 (66 of 289) versus about 11% at level 4 (52 of 459).
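Because job satisfaction is an ordered rating, a test for a trend in proportions is a natural follow-up to the table above. The following sketch is an illustration rather than a step the original analysis performed; it applies `prop.trend.test` from base R to the printed counts, with the scores 1 to 4 assumed to be equally spaced.

```r
# Counts copied from the JobSatisfaction x Attrition table printed above.
left   <- c(66, 46, 73, 52)      # Attrition = "Yes" at satisfaction levels 1..4
stayed <- c(223, 234, 369, 407)  # Attrition = "No"  at satisfaction levels 1..4
total  <- left + stayed

# Attrition rate (in percent) at each satisfaction level.
print(round(100 * left / total, 1))

# Chi-squared test for a linear trend in the attrition proportion
# across the ordered levels (1 degree of freedom).
trend <- prop.trend.test(x = left, n = total, score = 1:4)
print(trend)
```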
table4 <- xtabs(~ RelationshipSatisfaction+ Attrition, data = pdata)
table4
## Attrition
## RelationshipSatisfaction No Yes
## 1 219 57
## 2 258 45
## 3 388 71
## 4 368 64
Employees with the lowest relationship satisfaction (level 1) have the highest attrition rate (57 of 276, about 21%); at levels 2 to 4 the rate is close to 15%.
boxplot(MonthlyIncome~WorkLifeBalance, data=pdata, horizontal=TRUE,
xlab="Monthly income", las=1,
col=c("red","blue","green","yellow"),
main="Boxplot of monthly income by work-life balance")
boxplot(PercentSalaryHike~JobSatisfaction, data=pdata, horizontal=TRUE,
xlab="Percent salary Hike", las=1,
col=c("red","blue","green","yellow"),
main="boxplot of Percent Salary hike and Job Satisfaction")
1) Work Life Balance
hist(pdata$WorkLifeBalance,
main="Histogram of Work Life balance",
col=c("blue"),
xlab="work life balance" )
The histogram shows that the largest group of IBM employees rates its work-life balance as 3 (‘Better’).
2) Job Satisfaction
hist(pdata$JobSatisfaction,
main="Histogram of Job Satisfaction",
col=c("yellow"),
xlab="Job Satisfaction level" )
The histogram shows that most employees report a job satisfaction level of 3 or 4 (901 of 1,470).
3) Monthly Income
hist(pdata$MonthlyIncome,
main="Histogram of Monthly Income",
col=c("green"),
xlab="Monthly income in rupees" )
The majority of employees have a monthly income below 10,000 rupees.
Parameters considered: work-life balance, percent salary hike, and job satisfaction.
WorkLifeBalance <- factor(WorkLifeBalance)
fit <- aov(PercentSalaryHike~WorkLifeBalance*JobSatisfaction)
summary(fit)
## Df Sum Sq Mean Sq F value Pr(>F)
## WorkLifeBalance 3 44 14.813 1.106 0.346
## JobSatisfaction 1 10 9.966 0.744 0.389
## WorkLifeBalance:JobSatisfaction 3 34 11.458 0.855 0.464
## Residuals 1462 19589 13.399
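None of the three terms in the table above is significant at the 5% level. JobSatisfaction also appears with a single degree of freedom, which suggests it entered the model as a numeric variable and not as a four-level factor. The following is a minimal sketch, on synthetic data (the data frame `toy` is made up and is not the IBM file), of how both ratings could be declared as factors and passed through `data=` instead of relying on `attach`.

```r
set.seed(7)  # synthetic data only; NOT the IBM attrition file
n <- 400
toy <- data.frame(
  # Both ratings are declared as factors so each gets 4 - 1 = 3 degrees of freedom.
  WorkLifeBalance = factor(sample(1:4, n, replace = TRUE)),
  JobSatisfaction = factor(sample(1:4, n, replace = TRUE)),
  PercentSalaryHike = sample(11:25, n, replace = TRUE)
)

# Two-way ANOVA with interaction; the variables are looked up in `toy`.
fit <- aov(PercentSalaryHike ~ WorkLifeBalance * JobSatisfaction, data = toy)

# summary() returns a list whose first element is the ANOVA table.
tab <- summary(fit)[[1]]
print(tab)
```

With two four-level factors the interaction term has 3 * 3 = 9 degrees of freedom, compared with 3 in the table above.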
Box plots, contingency tables, and histograms let us explore how variables such as work-life balance, job involvement, monthly income, and job satisfaction relate to employee attrition.
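From this analysis we can conclude that job level and distance from home are significantly associated with monthly income (Model 1), while none of the predictors considered explains work-life balance (Model 2), and the ANOVA finds no significant effect of work-life balance or job satisfaction on percent salary hike. The contingency tables suggest that attrition is more common among employees who report low work-life balance, low job satisfaction, or low relationship satisfaction. These patterns are descriptive and would need formal tests before any causal claim about attrition is made.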