Data Analysis
### Category Analysis:
### 1. Based on the frequency graph, sales is the largest occupation category (27.6% of records). If sales workers are also the most likely to leave the company, it is probably because of salary and commission, or because sales staff have more connections and so better opportunities elsewhere. The other job categories (marketing, accounting, IT, product management, etc.) fall in the middle. About 92% of all workers are in the low (48.78%) or medium (42.98%) salary band.
### 2. Management has the smallest share (4.20%), which suggests that higher-level workers are more stable in their jobs. Moreover, high-salary workers are only a small portion (8.25%), which corresponds to the management occupations.
library(dplyr)
library(funModeling)
Employee_Leave_reason1 <- rename(Employee_Leave_reason, Occupation = sales)
freq(Employee_Leave_reason1)

## Occupation frequency percentage cumulative_perc
## 1 sales 4140 27.60 27.60
## 2 technical 2720 18.13 45.73
## 3 support 2229 14.86 60.59
## 4 IT 1227 8.18 68.77
## 5 product_mng 902 6.01 74.78
## 6 marketing 858 5.72 80.50
## 7 RandD 787 5.25 85.75
## 8 accounting 767 5.11 90.86
## 9 hr 739 4.93 95.79
## 10 management 630 4.20 100.00

## salary frequency percentage cumulative_perc
## 1 low 7316 48.78 48.78
## 2 medium 6446 42.98 91.76
## 3 high 1237 8.25 100.00
## [1] "Variables processed: Occupation, salary"
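The percentage column above is each frequency divided by the total number of records. As a minimal base-R sketch, with the counts typed in by hand from the salary table (the variable names `salary_counts`, `salary_pct`, and `low_medium_share` are illustrative, not part of funModeling), the shares can be recomputed like this:

```r
# Counts copied by hand from the salary frequency table above.
salary_counts <- c(low = 7316, medium = 6446, high = 1237)

# Share of each salary band, in percent, rounded to two decimals.
salary_pct <- round(100 * salary_counts / sum(salary_counts), 2)
print(salary_pct)

# Combined share of the low and medium bands, computed from raw counts.
low_medium_share <- 100 * sum(salary_counts[c("low", "medium")]) / sum(salary_counts)
print(round(low_medium_share, 2))
```

Computed from the raw counts, the combined low and medium share rounds to 91.75. The 91.76 in the table equals the sum of the two already-rounded percentages (48.78 + 42.98).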
### Continuous Variables Analysis:
### 1. Satisfaction Level: mildly negatively skewed (-0.48) and platykurtic (kurtosis 2.33), so the data are light-tailed with few outliers. The median (0.64) is above the mean (0.613), which implies that more than half of the workers have a satisfaction level above the average.
### 2. Last_evaluation: fairly symmetrical (skewness -0.03) and platykurtic. Kurtosis is only about 1.8, so there are fewer extreme values than in a normal distribution.
### 3. Number_project: mildly positively skewed and platykurtic (skewness 0.34, below 0.5; kurtosis 2.50, below 3). Most workers handled a number of projects close to the mean of 3.80 (the median is 4), with a thin right tail reaching 6 or 7 projects.
### 4. Average_monthly_hours: fairly symmetrical (skewness 0.05) and platykurtic (kurtosis 1.86), with workers averaging about 201.05 hours per month.
### 5. time_spend_company: highly skewed (1.85) and leptokurtic (kurtosis 7.77). Most workers have spent about 3 to 4 years at the company (mean 3.498, median 3), but a kurtosis greater than 3 implies outliers: some people have stayed 6 or even 10 years.
### 6. Work_accident and promotion_last_5years: both are leptokurtic and highly skewed.
### 7. left: highly skewed (1.23) and platykurtic (kurtosis 2.51).
### Based on the data, Occupation and salary are categorical; the others are numeric.
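The labels used in the list above can be made concrete with a short base-R sketch. It assumes moment-based definitions in which a normal distribution has kurtosis 3, and a common rule of thumb for skewness (below 0.5 in absolute value is fairly symmetrical, 0.5 to 1 is moderately skewed, above 1 is highly skewed). The function names `moment_skewness`, `moment_kurtosis`, and `describe_shape` are hypothetical helpers, not functions from funModeling.

```r
# Moment-based skewness: third central moment over variance^(3/2).
moment_skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Moment-based (non-excess) kurtosis: a normal distribution scores 3.
moment_kurtosis <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2
}

# Hypothetical labelling helper using the rule-of-thumb thresholds
# assumed in the text above.
describe_shape <- function(skew, kurt) {
  skew_label <- if (abs(skew) < 0.5) {
    "fairly symmetrical"
  } else if (abs(skew) <= 1) {
    "moderately skewed"
  } else {
    "highly skewed"
  }
  kurt_label <- if (kurt < 3) {
    "platykurtic"
  } else if (kurt > 3) {
    "leptokurtic"
  } else {
    "mesokurtic"
  }
  paste(skew_label, kurt_label, sep = ", ")
}

# Values for time_spend_company and last_evaluation from the profiling table.
print(describe_shape(1.85313370, 7.771220))
print(describe_shape(-0.02661909, 1.760973))
```

Under these thresholds, variables with skewness between -0.5 and 0.5 (such as Satisfaction Level and Number_project) are labelled fairly symmetrical, so the direction of a mild skew has to be read from the sign of the skewness value itself.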
plot_num(Employee_Leave_reason1)

profiling_num(Employee_Leave_reason1)
## variable mean std_dev variation_coef p_01
## 1 satisfaction_level 0.61283352 0.2486307 0.4057067 0.09
## 2 last_evaluation 0.71610174 0.1711691 0.2390290 0.39
## 3 number_project 3.80305354 1.2325924 0.3241060 2.00
## 4 average_montly_hours 201.05033669 49.9430994 0.2484109 104.00
## 5 time_spend_company 3.49823322 1.4601362 0.4173925 2.00
## 6 Work_accident 0.14460964 0.3517186 2.4321930 0.00
## 7 left 0.23808254 0.4259241 1.7889766 0.00
## 8 promotion_last_5years 0.02126808 0.1442815 6.7839426 0.00
## p_05 p_25 p_50 p_75 p_95 p_99 skewness kurtosis iqr
## 1 0.11 0.44 0.64 0.82 0.96 0.99 -0.47631270 2.328965 0.38
## 2 0.46 0.56 0.72 0.87 0.98 1.00 -0.02661909 1.760973 0.31
## 3 2.00 3.00 4.00 5.00 6.00 7.00 0.33767184 2.504287 2.00
## 4 130.00 156.00 200.00 245.00 275.00 301.00 0.05283670 1.864997 89.00
## 5 2.00 3.00 3.00 4.00 6.00 10.00 1.85313370 7.771220 1.00
## 6 0.00 0.00 0.00 0.00 1.00 1.00 2.02094660 5.084225 0.00
## 7 0.00 0.00 0.00 0.00 1.00 1.00 1.22991957 2.512702 0.00
## 8 0.00 0.00 0.00 0.00 0.00 1.00 6.63630462 45.040539 0.00
## range_98 range_80
## 1 [0.09, 0.99] [0.21, 0.92]
## 2 [0.39, 1] [0.49, 0.95]
## 3 [2, 7] [2, 5]
## 4 [104, 301] [137, 267]
## 5 [2, 10] [2, 5]
## 6 [0, 1] [0, 1]
## 7 [0, 1] [0, 1]
## 8 [0, 1] [0, 0]
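The three 0/1 columns (Work_accident, left, promotion_last_5years) offer a quick cross-check of the table: for a binary variable with mean p, moment-based skewness and non-excess kurtosis depend only on p. The sketch below assumes those moment-based definitions (the output itself does not state which definition is used), and `binary_shape` is a hypothetical helper name.

```r
# For a 0/1 variable with mean p, skewness and kurtosis follow from p alone.
binary_shape <- function(p) {
  v <- p * (1 - p)                      # population variance of a 0/1 variable
  c(skewness = (1 - 2 * p) / sqrt(v),   # third standardized moment
    kurtosis = (1 - 3 * v) / v)         # fourth standardized moment (normal = 3)
}

# Means copied from the profiling table above.
binary_means <- c(Work_accident = 0.14460964,
                  left = 0.23808254,
                  promotion_last_5years = 0.02126808)

shape_check <- t(sapply(binary_means, binary_shape))
print(round(shape_check, 4))
```

If the assumption holds, the printed values should land close to the skewness and kurtosis columns reported for these three rows, which also explains why a rare event such as promotion_last_5years shows such extreme skewness and kurtosis.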