Measure the relationship between two continuous variables

Data Sources: HR_comma_sep

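library(dplyr)  # provides rename()

# Rename the `sales` column to Occupation, then drop columns 9 and 10
# (Occupation and salary) so that only numeric columns remain
Employee_Leave_rate <- rename(HR_comma_sep, Occupation = sales)
Employee_Leave_rate1 <- Employee_Leave_rate[, -c(9:10)]
summary(Employee_Leave_rate1)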
##  satisfaction_level last_evaluation  number_project  average_montly_hours
##  Min.   :0.0900     Min.   :0.3600   Min.   :2.000   Min.   : 96.0       
##  1st Qu.:0.4400     1st Qu.:0.5600   1st Qu.:3.000   1st Qu.:156.0       
##  Median :0.6400     Median :0.7200   Median :4.000   Median :200.0       
##  Mean   :0.6128     Mean   :0.7161   Mean   :3.803   Mean   :201.1       
##  3rd Qu.:0.8200     3rd Qu.:0.8700   3rd Qu.:5.000   3rd Qu.:245.0       
##  Max.   :1.0000     Max.   :1.0000   Max.   :7.000   Max.   :310.0       
##  time_spend_company Work_accident         left       
##  Min.   : 2.000     Min.   :0.0000   Min.   :0.0000  
##  1st Qu.: 3.000     1st Qu.:0.0000   1st Qu.:0.0000  
##  Median : 3.000     Median :0.0000   Median :0.0000  
##  Mean   : 3.498     Mean   :0.1446   Mean   :0.2381  
##  3rd Qu.: 4.000     3rd Qu.:0.0000   3rd Qu.:0.0000  
##  Max.   :10.000     Max.   :1.0000   Max.   :1.0000  
##  promotion_last_5years
##  Min.   :0.00000      
##  1st Qu.:0.00000      
##  Median :0.00000      
##  Mean   :0.02127      
##  3rd Qu.:0.00000      
##  Max.   :1.00000

A quick look at the possible pairwise correlations
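One way to get such an overview in a single step is a correlation matrix over all numeric columns. The sketch below is not part of the original analysis: `cor_overview` is a hypothetical helper name, and the built-in `mtcars` data set is used only as a stand-in so the example runs on its own.

```r
# Sketch: pairwise Pearson correlations for every numeric column.
# `cor_overview` is a hypothetical helper, not taken from the original report.
cor_overview <- function(df, digits = 2) {
  # Keep only numeric columns; cor() fails on character or factor columns
  numeric_cols <- df[, sapply(df, is.numeric), drop = FALSE]
  round(cor(numeric_cols, use = "pairwise.complete.obs"), digits)
}

# Stand-in data (assumption). For this report the call would be
# cor_overview(Employee_Leave_rate1).
overview <- cor_overview(mtcars)
print(overview)
```

The matrix only gives point estimates; the individual tests below add confidence intervals and p-values for selected pairs.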

1 Correlations between different variables

Satisfaction_level and last_evaluation, with alpha = 0.05:

## 
##  Pearson's product-moment correlation
## 
## data:  Employee_Leave_rate1$satisfaction_level and Employee_Leave_rate1$last_evaluation
## t = 12.933, df = 14997, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.08916727 0.12082195
## sample estimates:
##       cor 
## 0.1050212

Explanation:

#1 The p-value (< 2.2e-16) is below 0.05, so the correlation between satisfaction_level and last_evaluation is statistically significant.
#2 Cor is 0.11, so the two variables have a very weak positive linear relationship.
#3 Strictly speaking, the two variables have almost nothing to do with each other: r^2 is about 0.011, so one accounts for roughly 1% of the variance in the other.
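The numbers in the output above can be rechecked by hand from the reported correlation and degrees of freedom. The sketch below uses the standard formulas for the t statistic of a Pearson correlation and for the Fisher z confidence interval; the variable names are chosen here for illustration, and the results should agree with the printed output up to rounding.

```r
# Values copied from the cor.test() output for
# satisfaction_level and last_evaluation.
r  <- 0.1050212
df <- 14997
n  <- df + 2          # cor.test() uses df = n - 2

# t statistic for H0: true correlation equals 0
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# 95% confidence interval via the Fisher z transformation
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

# Share of variance in one variable accounted for by the other
r_squared <- r^2

print(round(t_stat, 3))
print(round(ci, 4))
print(round(r_squared, 4))
```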

2 Satisfaction_level and time_spend_company, with alpha = 0.05:

## 
##  Pearson's product-moment correlation
## 
## data:  Employee_Leave_rate1$satisfaction_level and Employee_Leave_rate1$time_spend_company
## t = -12.416, df = 14997, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.11668153 -0.08499948
## sample estimates:
##        cor 
## -0.1008661

Explanation:

#1 The p-value (< 2.2e-16) is below 0.05, so the correlation between satisfaction_level and time_spend_company is statistically significant.
#2 Cor is -0.10, so the two variables have a very weak negative linear relationship.
#3 The two variables tend to move in opposite directions, but the association is too weak to matter in practice.
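A significant p-value paired with a near-zero correlation is expected at this sample size. As a rough sketch (the variable names are illustrative), the smallest absolute correlation that reaches significance at alpha = 0.05 with df = 14997 can be derived by inverting the t statistic formula:

```r
# Smallest |r| that is significant at alpha = 0.05 for this sample size.
df    <- 14997        # degrees of freedom reported by cor.test()
alpha <- 0.05

# Two-sided critical t value
t_crit <- qt(1 - alpha / 2, df)

# Invert t = r * sqrt(df) / sqrt(1 - r^2) to solve for r
r_crit <- t_crit / sqrt(t_crit^2 + df)

print(round(r_crit, 4))
```

Any correlation whose absolute value exceeds this threshold is flagged as significant, so with samples this large the size of cor is more informative than the p-value.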

3 Average_montly_hours and number_project, with alpha = 0.05:

## 
##  Pearson's product-moment correlation
## 
## data:  Employee_Leave_rate1$average_montly_hours and Employee_Leave_rate1$number_project
## t = 56.219, df = 14997, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.4039037 0.4303411
## sample estimates:
##       cor 
## 0.4172106

Explanation:

#1 The p-value (< 2.2e-16) is below 0.05, so the correlation between average_montly_hours and number_project is statistically significant.
#2 Cor is 0.42, so the two variables have a moderate positive linear relationship.
#3 Employees who work on more projects tend to spend more hours per month at the company.

4 Last_evaluation and number_project, with alpha = 0.05:

## 
##  Pearson's product-moment correlation
## 
## data:  Employee_Leave_rate1$last_evaluation and Employee_Leave_rate1$number_project
## t = 45.656, df = 14997, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.3352028 0.3633053
## sample estimates:
##       cor 
## 0.3493326

Explanation:

#1 The p-value (< 2.2e-16) is below 0.05, so the correlation between last_evaluation and number_project is statistically significant.
#2 Cor is 0.35, so the two variables have a moderate positive linear relationship.
#3 Employees who complete more projects tend to receive a higher last_evaluation.
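The four tests above all follow the same pattern, so they could be run in one pass and collected into a table. The sketch below is an assumption about how that might look, not the code that produced the outputs: `cor_table` is a hypothetical helper, and `mtcars` stands in for the HR data so the example is self-contained.

```r
# Sketch: run cor.test() for several column pairs and tabulate the results.
# `cor_table` is a hypothetical helper, not part of the original report.
cor_table <- function(df, pairs, conf_level = 0.95) {
  rows <- lapply(pairs, function(p) {
    res <- cor.test(df[[p[1]]], df[[p[2]]], conf.level = conf_level)
    data.frame(
      x       = p[1],
      y       = p[2],
      r       = unname(res$estimate),
      ci_low  = res$conf.int[1],
      ci_high = res$conf.int[2],
      p_value = res$p.value
    )
  })
  do.call(rbind, rows)
}

# Stand-in data and pairs (assumption). For this report the pairs would be
#   c("satisfaction_level", "last_evaluation"),
#   c("satisfaction_level", "time_spend_company"),
#   c("average_montly_hours", "number_project"),
#   c("last_evaluation", "number_project")
# applied to Employee_Leave_rate1.
demo_pairs <- list(c("mpg", "wt"), c("hp", "disp"))
result <- cor_table(mtcars, demo_pairs)
print(result)
```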

5 Satisfaction_level and salary, with alpha = 0.05:

library(ggplot2)

# Encode salary as a numeric score: low = 0, medium = 1, high = 2
Employee_Leave_rate2 <- Employee_Leave_rate %>%
  mutate(salary_type = recode(salary, "low" = 0, "medium" = 1, "high" = 2))
# Drop columns 9 and 10 (Occupation and salary); salary_type is kept
Employee_Leave_rate3 <- Employee_Leave_rate2[, -c(9:10)]

# Scatter plot with a fitted linear trend
ggplot(Employee_Leave_rate3, aes(satisfaction_level, salary_type)) +
  geom_point() +
  geom_smooth(method = 'lm')

cor.test(Employee_Leave_rate3$satisfaction_level, Employee_Leave_rate3$salary_type)
## 
##  Pearson's product-moment correlation
## 
## data:  Employee_Leave_rate3$satisfaction_level and Employee_Leave_rate3$salary_type
## t = 6.1335, df = 14997, p-value = 8.81e-10
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.03404593 0.06597347
## sample estimates:
##        cor 
## 0.05002248

Explanation:

#1 The p-value (8.81e-10) is below 0.05, so the correlation between satisfaction_level and salary_type is statistically significant.
#2 Cor is 0.05, so the two variables have a very weak positive linear relationship.
#3 Higher-paid employees are not necessarily more satisfied, and lower-paid employees are not necessarily less satisfied.
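Because salary has only three ordered levels, a rank-based coefficient such as Spearman's rho may be a better fit than Pearson's r, which treats the 0/1/2 codes as evenly spaced measurements. The sketch below is an alternative under that assumption, not the method used above; the short `salary_demo` vector holds illustrative values only, and `mtcars` (mpg against the three-level cyl column) stands in for the HR data.

```r
# Sketch: encode an ordered category as ranks, then test with Spearman's rho.
salary_levels <- c("low", "medium", "high")
salary_demo   <- c("low", "high", "medium", "low")  # illustrative values only
salary_rank   <- as.integer(factor(salary_demo, levels = salary_levels,
                                   ordered = TRUE))
print(salary_rank)

# Stand-in test (assumption): mpg against cyl, which takes three ordered values.
# exact = FALSE requests the asymptotic p-value, which is needed with tied ranks.
res_spearman <- cor.test(mtcars$mpg, mtcars$cyl,
                         method = "spearman", exact = FALSE)
print(res_spearman$estimate)
print(res_spearman$p.value)
```

For this report the equivalent call would pass `Employee_Leave_rate3$satisfaction_level` and `Employee_Leave_rate3$salary_type` with `method = "spearman"`. Spearman's rho depends only on the ordering of the codes, so 0/1/2 and 1/2/3 give the same result.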