The HR Dashboard


Data description: The data set is an HR employee performance data set. It includes 35 variables covering employees’ demographics, their evaluations of the company, and the attrition outcome. It has 1,470 records.

Variables: Age, Attrition, BusinessTravel, DailyRate, Department, DistanceFromHome, Education, EducationField, EmployeeCount, EmployeeNumber, EnvironmentSatisfaction, Gender, HourlyRate, JobInvolvement, JobLevel, JobRole, JobSatisfaction, MaritalStatus, MonthlyIncome, MonthlyRate, NumCompaniesWorked, Over18, OverTime, PercentSalaryHike, PerformanceRating, RelationshipSatisfaction, StandardHours, StockOptionLevel, TotalWorkingYears, TrainingTimesLastYear, WorkLifeBalance, YearsAtCompany, YearsInCurrentRole, YearsSinceLastPromotion, YearsWithCurrManager

The four major objectives are:
1. Provide summary statistics about employees
2. Understand how the company’s employees feel about their work
3. Explore relationships between employee attributes and monthly income
4. Investigate possible factors that affect attrition

Summary Statistics

      Age        Attrition            BusinessTravel   DailyRate     
 Min.   :18.00   No :1233   Non-Travel       : 150   Min.   : 102.0  
 1st Qu.:30.00   Yes: 237   Travel_Frequently: 277   1st Qu.: 465.0  
 Median :36.00              Travel_Rarely    :1043   Median : 802.0  
 Mean   :36.92                                       Mean   : 802.5  
 3rd Qu.:43.00                                       3rd Qu.:1157.0  
 Max.   :60.00                                       Max.   :1499.0  
                                                                     
                  Department  DistanceFromHome  Education        
 Human Resources       : 63   Min.   : 1.000   Length:1470       
 Research & Development:961   1st Qu.: 2.000   Class :character  
 Sales                 :446   Median : 7.000   Mode  :character  
                              Mean   : 9.193                     
                              3rd Qu.:14.000                     
                              Max.   :29.000                     
                                                                 
          EducationField EmployeeCount EmployeeNumber  
 Human Resources : 27    Min.   :1     Min.   :   1.0  
 Life Sciences   :606    1st Qu.:1     1st Qu.: 491.2  
 Marketing       :159    Median :1     Median :1020.5  
 Medical         :464    Mean   :1     Mean   :1024.9  
 Other           : 82    3rd Qu.:1     3rd Qu.:1555.8  
 Technical Degree:132    Max.   :1     Max.   :2068.0  
                                                       
 EnvironmentSatisfaction    Gender      HourlyRate     JobInvolvement    
 Length:1470             Female:588   Min.   : 30.00   Length:1470       
 Class :character        Male  :882   1st Qu.: 48.00   Class :character  
 Mode  :character                     Median : 66.00   Mode  :character  
                                      Mean   : 65.89                     
                                      3rd Qu.: 83.75                     
                                      Max.   :100.00                     
                                                                         
    JobLevel                          JobRole    JobSatisfaction   
 Min.   :1.000   Sales Executive          :326   Length:1470       
 1st Qu.:1.000   Research Scientist       :292   Class :character  
 Median :2.000   Laboratory Technician    :259   Mode  :character  
 Mean   :2.064   Manufacturing Director   :145                     
 3rd Qu.:3.000   Healthcare Representative:131                     
 Max.   :5.000   Manager                  :102                     
                 (Other)                  :215                     
  MaritalStatus MonthlyIncome    MonthlyRate    NumCompaniesWorked
 Divorced:327   Min.   : 1009   Min.   : 2094   Min.   :0.000     
 Married :673   1st Qu.: 2911   1st Qu.: 8047   1st Qu.:1.000     
 Single  :470   Median : 4919   Median :14236   Median :2.000     
                Mean   : 6503   Mean   :14313   Mean   :2.693     
                3rd Qu.: 8379   3rd Qu.:20462   3rd Qu.:4.000     
                Max.   :19999   Max.   :26999   Max.   :9.000     
                                                                  
 Over18   OverTime   PercentSalaryHike PerformanceRating 
 Y:1470   No :1054   Min.   :11.00     Length:1470       
          Yes: 416   1st Qu.:12.00     Class :character  
                     Median :14.00     Mode  :character  
                     Mean   :15.21                       
                     3rd Qu.:18.00                       
                     Max.   :25.00                       
                                                         
 RelationshipSatisfaction StandardHours StockOptionLevel TotalWorkingYears
 Length:1470              Min.   :80    Min.   :0.0000   Min.   : 0.00    
 Class :character         1st Qu.:80    1st Qu.:0.0000   1st Qu.: 6.00    
 Mode  :character         Median :80    Median :1.0000   Median :10.00    
                          Mean   :80    Mean   :0.7939   Mean   :11.28    
                          3rd Qu.:80    3rd Qu.:1.0000   3rd Qu.:15.00    
                          Max.   :80    Max.   :3.0000   Max.   :40.00    
                                                                          
 TrainingTimesLastYear WorkLifeBalance    YearsAtCompany  
 Min.   :0.000         Length:1470        Min.   : 0.000  
 1st Qu.:2.000         Class :character   1st Qu.: 3.000  
 Median :3.000         Mode  :character   Median : 5.000  
 Mean   :2.799                            Mean   : 7.008  
 3rd Qu.:3.000                            3rd Qu.: 9.000  
 Max.   :6.000                            Max.   :40.000  
                                                          
 YearsInCurrentRole YearsSinceLastPromotion YearsWithCurrManager
 Min.   : 0.000     Min.   : 0.000          Min.   : 0.000      
 1st Qu.: 2.000     1st Qu.: 0.000          1st Qu.: 2.000      
 Median : 3.000     Median : 1.000          Median : 3.000      
 Mean   : 4.229     Mean   : 2.188          Mean   : 4.123      
 3rd Qu.: 7.000     3rd Qu.: 3.000          3rd Qu.: 7.000      
 Max.   :18.000     Max.   :15.000          Max.   :17.000      
                                                                
Status  Counts
Yes     237
No      1233

Department              Count
Human Resources         63
Research & Development  961
Sales                   446

In the original data set, the variables “Education”, “EnvironmentSatisfaction”, “JobInvolvement”, “JobSatisfaction”, “PerformanceRating”, “RelationshipSatisfaction”, and “WorkLifeBalance” are stored as integer codes on a 1-5 or 1-4 scale. To make the output and charts easier to read, these codes are converted back to their category labels, following the explanations given by the data provider.

To begin the analysis, it is important to get some summary statistics about the 1,470 employees.
Of the 1,470 employee records, 237 employees left the company, which is 16.1% of the total.

Employees by Department


Q1. Which departments do these employees come from?

961 employees are from the Research & Development department, which accounts for the majority (about 65%) of the data set. The other two departments in the data set are Sales and Human Resources.

Employee Income Distribution


Q2. How are employees’ incomes distributed?

Since the data set doesn’t explain how “rate” and “income” differ, the analysis assumes that “hourly rate” and “daily rate” are equal to “hourly income” and “daily income”. Most employees who left have a relatively low monthly income. Monthly income is also positively skewed: most employees earn less than $10,000 per month.
The distributions of hourly rate and daily rate don’t provide much insight into employees. For employees who chose to leave, the hourly and daily rates do not look noticeably different from those of employees who stayed.

Work Life Balance


Q3. How long have employees worked, in total and at the company?

In the company, the majority of employees have 0-12 years of total work experience. Looking at the experience gained at this company, 0-10 years is the most common tenure in all three departments. Most employees who left did so after working for the company for 0-5 years, or after 0-10 years of total work experience. Few employees have worked for the company for more than 20 years, whether they left or stayed.

Job Satisfaction


Q4. What are the job satisfaction levels?

Job satisfaction increases as the age and experience of the employee increase. We see a decline in satisfaction after age 50 and above a monthly income of $14,000. Satisfaction levels among employees in the Sales department are mixed.

Possible Factors Affecting Attrition


Q5. What are the possible factors that affect attrition?

It is interesting to notice that the two largest groups of employees who left the company are those with “low” and “high” job satisfaction. For employees who are highly satisfied with their job, it is especially important for HR to know what caused them to leave.

---
title: "PallaviSaitu_Project"
author: "Pallavi Saitu"
date: "4/23/2019"
output: 
  flexdashboard::flex_dashboard:
    storyboard: true
    source_code: embed
---

### The HR Dashboard {data-commentary-width=400}

```{r}
library(tidyverse)
library(knitr)
library(gridExtra)
library(ggpubr)
knitr::include_graphics("/Users/pallavisaitu/Downloads/HR.jpg")
```

*** 
Data description:
The data set is an HR employee performance data set. It includes 35 variables covering employees’ demographics, their evaluations of the company, and the attrition outcome. It has 1,470 records.

Variables:
Age, Attrition, BusinessTravel, DailyRate, Department, DistanceFromHome, Education, EducationField, EmployeeCount, EmployeeNumber, EnvironmentSatisfaction, Gender, HourlyRate, JobInvolvement, JobLevel, JobRole, JobSatisfaction, MaritalStatus, MonthlyIncome, MonthlyRate, NumCompaniesWorked, Over18, OverTime, PercentSalaryHike, PerformanceRating, RelationshipSatisfaction, StandardHours, StockOptionLevel, TotalWorkingYears, TrainingTimesLastYear, WorkLifeBalance, YearsAtCompany, YearsInCurrentRole, YearsSinceLastPromotion, YearsWithCurrManager

The four major objectives are:

1. Provide summary statistics about employees
2. Understand how the company’s employees feel about their work
3. Explore relationships between employee attributes and monthly income
4. Investigate possible factors that affect attrition

### Summary Statistics

```{r}
HR <- read.csv("/Users/pallavisaitu/Downloads/WA_Fn-UseC_-HR-Employee-Attrition.csv")
names(HR)[1] <- "Age"   # Rename the first column to "Age" for consistency.

# Employees' education 
HR$Education[HR$Education=="1"] <- "Below College"
HR$Education[HR$Education=="2"] <- "College"
HR$Education[HR$Education=="3"] <- "Bachelor"
HR$Education[HR$Education=="4"] <- "Master"
HR$Education[HR$Education=="5"] <- "Doctor"


# Employees' environment satisfaction
HR$EnvironmentSatisfaction[HR$EnvironmentSatisfaction=="1"] <- "Low"
HR$EnvironmentSatisfaction[HR$EnvironmentSatisfaction=="2"] <- "Medium"
HR$EnvironmentSatisfaction[HR$EnvironmentSatisfaction=="3"] <- "High"
HR$EnvironmentSatisfaction[HR$EnvironmentSatisfaction=="4"] <- "Very High"


# Employees' job involvement
HR$JobInvolvement[HR$JobInvolvement=="1"] <- "Low"
HR$JobInvolvement[HR$JobInvolvement=="2"] <- "Medium"
HR$JobInvolvement[HR$JobInvolvement=="3"] <- "High"
HR$JobInvolvement[HR$JobInvolvement=="4"] <- "Very High"


# Employees' job satisfaction
HR$JobSatisfaction[HR$JobSatisfaction=="1"] <- "Low"
HR$JobSatisfaction[HR$JobSatisfaction=="2"] <- "Medium"
HR$JobSatisfaction[HR$JobSatisfaction=="3"] <- "High"
HR$JobSatisfaction[HR$JobSatisfaction=="4"] <- "Very High"

# Employees' performance rating
HR$PerformanceRating[HR$PerformanceRating=="1"] <- "Low"
HR$PerformanceRating[HR$PerformanceRating=="2"] <- "Good"
HR$PerformanceRating[HR$PerformanceRating=="3"] <- "Excellent"
HR$PerformanceRating[HR$PerformanceRating=="4"] <- "Outstanding"


# Employees' relationship satisfaction
HR$RelationshipSatisfaction[HR$RelationshipSatisfaction=="1"] <- "Low"
HR$RelationshipSatisfaction[HR$RelationshipSatisfaction=="2"] <- "Medium"
HR$RelationshipSatisfaction[HR$RelationshipSatisfaction=="3"] <- "High"
HR$RelationshipSatisfaction[HR$RelationshipSatisfaction=="4"] <- "Very High"


# Employees' work-life balance
HR$WorkLifeBalance[HR$WorkLifeBalance=="1"] <- "Bad"
HR$WorkLifeBalance[HR$WorkLifeBalance=="2"] <- "Good"
HR$WorkLifeBalance[HR$WorkLifeBalance=="3"] <- "Better"
HR$WorkLifeBalance[HR$WorkLifeBalance=="4"] <- "Best"

summary(HR)

HR$Attrition <- factor(HR$Attrition,levels=c("Yes","No"))
attrition <- data.frame(table(HR$Attrition))
names(attrition)[1] <- "Status"
names(attrition)[2] <- "Counts"
kable(attrition)

department <-data.frame(table(HR$Department))
kable(department,col.names = c("Department","Count"))

```

*** 
In the original data set, the variables "Education", "EnvironmentSatisfaction", "JobInvolvement", "JobSatisfaction", "PerformanceRating", "RelationshipSatisfaction", and "WorkLifeBalance" are stored as integer codes on a 1-5 or 1-4 scale. To make the output and charts easier to read, these codes are converted back to their category labels, following the explanations given by the data provider.
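
The same kind of recoding can also be expressed with `factor()`, which attaches the labels and records their order in one step. This is a minimal sketch on a short made-up vector, not the dashboard's own code: the vector `codes` is hypothetical, and the labels follow the 1-4 satisfaction scale used in this data set.

```r
# Hypothetical integer codes on the 1-4 satisfaction scale (illustration only)
codes <- c(1, 3, 4, 2, 3)

# Map each code to its label and keep the Low < ... < Very High order
satisfaction <- factor(
  codes,
  levels  = 1:4,
  labels  = c("Low", "Medium", "High", "Very High"),
  ordered = TRUE
)

print(satisfaction)
print(table(satisfaction))
```

Because the result is an ordered factor, tables and plots list the categories in scale order rather than alphabetically.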

To begin the analysis, it is important to get some summary statistics about the 1,470 employees.  
Of the 1,470 employee records, 237 employees left the company, which is 16.1% of the total.
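
As a quick check of the attrition percentage, the counts from the attrition table can be turned into shares with `prop.table()`. This is a small sketch; the vector name `attrition_counts` is introduced here for illustration only.

```r
# Attrition counts as reported in the attrition table
attrition_counts <- c(Yes = 237, No = 1233)

# Share of each status, as a percentage rounded to one decimal place
attrition_pct <- round(100 * prop.table(attrition_counts), 1)
print(attrition_pct)
```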

### Employees by Department

```{r}
# bins = 30 is the ggplot2 default; stating it avoids the stat_bin() message
inc_1 <- ggplot(HR, aes(x = MonthlyIncome, fill = Attrition)) + 
            geom_histogram(position = "dodge", bins = 30) + labs(x="Monthly Income", y="Number of employees")
inc_2 <- ggplot(HR, aes(x = HourlyRate, fill = Attrition)) + 
            geom_histogram(position = "dodge", bins = 30) + labs(x="Hourly Rate", y="Number of employees")
inc_3 <- ggplot(HR, aes(x = DailyRate, fill = Attrition)) + 
            geom_histogram(position = "dodge", bins = 30) + labs(x="Daily Rate", y="Number of employees")
grid.arrange(inc_1,inc_2,inc_3, ncol = 2, nrow = 2, top = "Income Distribution in company", bottom = "IBM HR Analytics")
```

*** 

Q1. Which departments do these employees come from?

961 employees are from the Research & Development department, which accounts for the majority (about 65%) of the data set. The other two departments in the data set are Sales and Human Resources.
   

### Employee Income Distribution

```{r}
ggplot(HR) + 
  geom_histogram(mapping=(aes(TotalWorkingYears)),fill="skyblue",col="white",binwidth = 1) + 
  labs(x="Total Working Years", y="Number of employees",caption="IBM HR Analytics", title="Total Working Years") + theme(legend.position="none")

ggplot(HR, aes(x= Department, y=TotalWorkingYears, group = Department, fill = Department)) +
  geom_violin() + theme(legend.position="none") +
  coord_flip() + 
  labs(x="Department",y="Total Working Years",caption="IBM HR Analytics", title="Total Working Years by Attrition") +
  facet_wrap(~ Attrition)

ggplot(HR) + 
  geom_histogram(mapping=(aes(YearsAtCompany)),fill="skyblue",col="white",binwidth = 1) + 
  labs(x="Working Years at the company", y="Number of employees",caption="IBM HR Analytics", title="Working Years at Company") + theme(legend.position="none")

ggplot(HR, aes(x= Department, y=YearsAtCompany, group = Department, fill = Department)) +
  geom_violin() + theme(legend.position="none") +
  coord_flip() + 
  labs(x="Department",y="Working Years at the company",caption="IBM HR Analytics", title="Working Years at Company by Attrition") +
  facet_wrap(~ Attrition)
```

*** 
Q2. How are employees' incomes distributed?

Since the data set doesn't explain how "rate" and "income" differ, the analysis assumes that "hourly rate" and "daily rate" are equal to "hourly income" and "daily income". Most employees who left have a relatively low monthly income. Monthly income is also positively skewed: most employees earn less than $10,000 per month.  
The distributions of hourly rate and daily rate don't provide much insight into employees. For employees who chose to leave, the hourly and daily rates do not look noticeably different from those of employees who stayed.
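
The positive skew can be checked from the `summary()` output alone. The sketch below copies the MonthlyIncome quartiles and mean from that output and computes the Bowley (quartile) skewness, which is positive when the upper half of the distribution is more spread out than the lower half. The variable names are introduced here for illustration, and the quartile-based measure is only one of several ways to quantify skew.

```r
# Quartiles and mean of MonthlyIncome, copied from the summary() output
q1  <- 2911
med <- 4919
q3  <- 8379
avg <- 6503

# Bowley (quartile) skewness: values above 0 indicate a longer right tail
bowley <- (q3 + q1 - 2 * med) / (q3 - q1)

cat("Mean minus median:", avg - med, "\n")
cat("Bowley skewness:", round(bowley, 3), "\n")
```

A mean above the median together with a positive quartile skewness is consistent with a right-skewed income distribution.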

### Work Life Balance 


```{r}
# Filter by department
worklife_sales <-data.frame(table(filter(HR,Department=="Sales")$WorkLifeBalance))
names(worklife_sales)[1] <- "Status"
names(worklife_sales)[2] <- "Counts"
worklife_RD <-data.frame(table(filter(HR,Department=="Research & Development")$WorkLifeBalance))
names(worklife_RD)[1] <- "Status"
names(worklife_RD)[2] <- "Counts"
worklife_HR <-data.frame(table(filter(HR,Department=="Human Resources")$WorkLifeBalance))
names(worklife_HR)[1] <- "Status"
names(worklife_HR)[2] <- "Counts"

# Labels sit outside the pie (lab.pos = "out"), so use a dark font to keep them visible
a <- ggpie(worklife_HR,"Counts",fill="Status",color="white", label="Counts",lab.pos = "out",lab.font = "black") +
  ggtitle("Human Resources") + theme(legend.position = "right") 
b <- ggpie(worklife_RD,"Counts",fill="Status",color="white", label="Counts",lab.pos = "out",lab.font = "black") +
  ggtitle("Research & Development") + theme(legend.position = "right") 
c <- ggpie(worklife_sales,"Counts",fill="Status",color="white", label="Counts",lab.pos = "out",lab.font = "black") +
  ggtitle("Sales") + theme(legend.position = "right")
grid.arrange(a,b,c,ncol=2,nrow=2,newpage = FALSE)

```

*** 
Q3. How long have employees worked, in total and at the company?

In the company, the majority of employees have 0-12 years of total work experience. Looking at the experience gained at this company, 0-10 years is the most common tenure in all three departments. Most employees who left did so after working for the company for 0-5 years, or after 0-10 years of total work experience. Few employees have worked for the company for more than 20 years, whether they left or stayed.
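
Tenure ranges like these can be tabulated directly by binning years of service with `cut()` and cross-tabulating against attrition. The sketch below uses a small hypothetical data frame `toy` rather than the HR data; only the column names `YearsAtCompany` and `Attrition` come from the data set, and the band boundaries are assumptions chosen for illustration.

```r
# Hypothetical toy records (illustration only, not the HR data)
toy <- data.frame(
  YearsAtCompany = c(0, 1, 3, 7, 12, 25, 2, 4),
  Attrition      = c("Yes", "Yes", "No", "No", "No", "No", "Yes", "No")
)

# Bin tenure into bands; include.lowest keeps 0 years in the first band
toy$TenureBand <- cut(
  toy$YearsAtCompany,
  breaks = c(0, 5, 10, 20, 40),
  labels = c("0-5", "6-10", "11-20", "21-40"),
  include.lowest = TRUE
)

# Cross-tabulate tenure band against attrition status
tenure_tab <- table(band = toy$TenureBand, attrition = toy$Attrition)
print(tenure_tab)
```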


### Job Satisfaction
```{r}
HR$JobSatisfaction <- factor(HR$JobSatisfaction, 
                                     levels = c("Low", "Medium","High","Very High"))
ggplot(HR,aes(Age,MonthlyIncome)) + geom_point(aes(color=Department)) + 
  geom_smooth(col="black", se=FALSE,method="loess")+ facet_grid(.~Department) + 
  labs(title="Age and Monthly Income", x="Age", y="Monthly Income")

```

*** 

Q4. What are the job satisfaction levels?

Job satisfaction increases as the age and experience of the employee increase. We see a decline in satisfaction after age 50 and above a monthly income of $14,000. Satisfaction levels among employees in the Sales department are mixed.


### Possible Factors Affecting Attrition
```{r}
ggplot(data=HR)+
  geom_bar(position="dodge",mapping=aes(JobSatisfaction,fill=Attrition)) + labs(title="Job Satisfaction and Attrition", x="Job Satisfaction", y="Number of employees", caption="IBM HR Analytics") 
```


*** 

Q5. What are the possible factors that affect attrition?

It is interesting to notice that the two largest groups of employees who left the company are those with "low" and "high" job satisfaction. For employees who are highly satisfied with their job, it is especially important for HR to know what caused them to leave.
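
Counts of leavers depend on how many employees fall in each satisfaction level, so it can help to compare attrition rates within each level as well. The sketch below shows one way to do that with `prop.table()`. The data frame `toy` is hypothetical; only the column names `JobSatisfaction` and `Attrition` come from the data set.

```r
# Hypothetical toy records (illustration only, not the HR data)
toy <- data.frame(
  JobSatisfaction = c("Low", "Low", "High", "High", "High", "High",
                      "Medium", "Very High"),
  Attrition       = c("Yes", "No", "Yes", "No", "No", "No", "No", "No")
)
toy$JobSatisfaction <- factor(
  toy$JobSatisfaction,
  levels = c("Low", "Medium", "High", "Very High")
)

# Rows are satisfaction levels; margin = 1 makes each row sum to 1
counts <- table(toy$JobSatisfaction, toy$Attrition)
rates  <- prop.table(counts, margin = 1)
print(round(rates, 2))
```

In this toy example the "High" group has as many leavers as the "Low" group but a lower attrition rate (0.25 versus 0.5), which is the kind of difference raw counts can hide.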