library(readr)
library(ggplot2)
hr <- read_csv('https://raw.githubusercontent.com/aiplanethub/Datasets/refs/heads/master/HR_comma_sep.csv')
head(hr)
## # A tibble: 6 × 10
## satisfaction_level last_evaluation number_project average_montly_hours
## <dbl> <dbl> <dbl> <dbl>
## 1 0.38 0.53 2 157
## 2 0.8 0.86 5 262
## 3 0.11 0.88 7 272
## 4 0.72 0.87 5 223
## 5 0.37 0.52 2 159
## 6 0.41 0.5 2 153
## # ℹ 6 more variables: time_spend_company <dbl>, Work_accident <dbl>,
## # left <dbl>, promotion_last_5years <dbl>, Department <chr>, salary <chr>
cor.test(hr$satisfaction_level, hr$last_evaluation)
##
## Pearson's product-moment correlation
##
## data: hr$satisfaction_level and hr$last_evaluation
## t = 12.933, df = 14997, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.08916727 0.12082195
## sample estimates:
## cor
## 0.1050212
Technical Interpretation: The p-value is extremely small (p < 2.2e-16, well below 0.05), so the correlation between satisfaction level and last evaluation score is statistically significant. The correlation estimate is positive but weak (about 0.11, 95% CI 0.09 to 0.12), indicating a small positive relationship: the two scores tend to rise together, but only slightly.
Non-Technical Interpretation: Employees who receive higher performance evaluation scores tend to report slightly higher satisfaction levels, although the link is weak.
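With almost 15,000 rows, even a weak correlation can produce a large t statistic. As a rough check, the t value in the output above can be recomputed from the printed estimate and degrees of freedom using the standard relation t = r * sqrt(df) / sqrt(1 - r^2). This is only a sketch: the variable names are illustrative, and the numbers are copied from the printed output rather than recomputed from the data.

```r
# Values copied from the cor.test() output above
r  <- 0.1050212   # sample correlation
df <- 14997       # degrees of freedom (n - 2)

# t statistic for a Pearson correlation coefficient
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

round(t_stat, 3)   # close to the reported t of 12.933
```

The large t comes almost entirely from the sqrt(df) factor, which is why statistical significance here says little about the size of the effect.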
ggplot(hr, aes(x = last_evaluation, y = satisfaction_level)) +
geom_point(alpha = 0.3, color = "steelblue") +
geom_smooth(method = "lm", se = FALSE, color = "red") +
labs(title = "High Performers Tend to Be Slightly More Satisfied",
x = "Last Evaluation Score",
y = "Satisfaction Level")
## `geom_smooth()` using formula = 'y ~ x'
cor.test(hr$number_project, hr$average_montly_hours)
##
## Pearson's product-moment correlation
##
## data: hr$number_project and hr$average_montly_hours
## t = 56.219, df = 14997, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.4039037 0.4303411
## sample estimates:
## cor
## 0.4172106
Technical Interpretation: The p-value is extremely small (far below 0.05), indicating a statistically significant positive correlation between number of projects and average monthly hours. The correlation estimate is moderate and positive (around 0.42), meaning these two variables tend to increase together.
Non-Technical Interpretation: Employees who are assigned more projects tend to work more hours each month. This makes intuitive sense: a heavier workload naturally demands more time on the job.
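One way to gauge what a correlation of this size means is the coefficient of determination, r squared, which for a simple linear relationship gives the share of variance in one variable that is linearly associated with the other. A minimal sketch using the estimate printed above (the variable names are illustrative):

```r
# Estimate copied from the cor.test() output above
r <- 0.4172106

# Coefficient of determination for a simple linear relationship
r_squared <- r^2

round(100 * r_squared, 1)   # percent of shared variance, roughly 17%
```

By this measure, project count leaves most of the variation in monthly hours unaccounted for, even though the association is clear.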
ggplot(hr, aes(x = number_project, y = average_montly_hours)) +
geom_point(alpha = 0.3, color = "darkorange") +
geom_smooth(method = "lm", se = FALSE, color = "red") +
labs(title = "More Projects Means More Hours Worked per Month",
x = "Number of Projects",
y = "Average Monthly Hours")
## `geom_smooth()` using formula = 'y ~ x'
cor.test(hr$time_spend_company, hr$last_evaluation)
##
## Pearson's product-moment correlation
##
## data: hr$time_spend_company and hr$last_evaluation
## t = 16.256, df = 14997, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1158309 0.1472844
## sample estimates:
## cor
## 0.1315907
Technical Interpretation: The p-value is extremely small (well below 0.05), meaning the correlation is statistically significant. The positive correlation estimate (around 0.13) is weak but real, suggesting that employees with more tenure receive marginally higher evaluation scores.
Non-Technical Interpretation: Employees who have been with the company longer tend to receive slightly higher performance ratings. Experience on the job may give workers a modest edge in how they are evaluated, although a correlation alone cannot show why.
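The 95 percent confidence interval in the output above can be approximately reproduced with the Fisher z-transformation, which is the usual method for a Pearson correlation interval. The sketch below assumes that method and uses only the printed estimate and degrees of freedom; the variable names are illustrative.

```r
# Values copied from the cor.test() output above
r <- 0.1315907   # sample correlation
n <- 14997 + 2   # sample size, recovered from df = n - 2

z  <- atanh(r)          # Fisher z-transform of r
se <- 1 / sqrt(n - 3)   # approximate standard error on the z scale

# Interval on the z scale, back-transformed to the correlation scale
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

round(ci, 4)   # close to the reported interval, 0.1158 to 0.1473
```

The interval is narrow because of the large sample, and it excludes zero while staying well inside the "weak" range.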
ggplot(hr, aes(x = time_spend_company, y = last_evaluation)) +
geom_point(alpha = 0.3, color = "mediumpurple") +
geom_smooth(method = "lm", se = FALSE, color = "red") +
labs(title = "More Experienced Employees Tend to Score Slightly Higher in Evaluations",
x = "Years at Company",
y = "Last Evaluation Score")
## `geom_smooth()` using formula = 'y ~ x'
cor.test(hr$satisfaction_level, hr$average_montly_hours)
##
## Pearson's product-moment correlation
##
## data: hr$satisfaction_level and hr$average_montly_hours
## t = -2.4556, df = 14997, p-value = 0.01408
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.036040356 -0.004045605
## sample estimates:
## cor
## -0.02004811
Technical Interpretation: The p-value (0.014) is below 0.05, so the correlation is statistically significant, though far less decisively than in the previous tests. The negative correlation estimate (around -0.02, 95% CI -0.036 to -0.004) suggests that employees who work more hours each month report marginally lower satisfaction, but the relationship is very weak and of little practical importance.
Non-Technical Interpretation: Employees who work longer hours each month tend to be very slightly less satisfied with their jobs. The link is so weak that monthly hours alone tell us almost nothing about satisfaction, so this result by itself is not strong evidence that overwork drives dissatisfaction or attrition.
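The p-value for this test is many orders of magnitude larger than in the earlier tests. As a sketch, it can be reproduced from the printed t statistic and degrees of freedom with base R's `pt()`; the variable names are illustrative and the inputs are copied from the output above.

```r
# Values copied from the cor.test() output above
t_stat <- -2.4556   # t statistic
df     <- 14997     # degrees of freedom

# Two-sided p-value: probability of a t at least this extreme in either tail
p_value <- 2 * pt(-abs(t_stat), df)

signif(p_value, 4)   # close to the reported p-value of 0.01408
```

A result like this clears the 0.05 threshold mainly because the sample is large, not because the association is strong.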
ggplot(hr, aes(x = average_montly_hours, y = satisfaction_level)) +
geom_point(alpha = 0.3, color = "tomato") +
geom_smooth(method = "lm", se = FALSE, color = "red") +
labs(title = "Working Longer Hours Is Only Very Weakly Linked to Lower Job Satisfaction",
x = "Average Monthly Hours",
y = "Satisfaction Level")
## `geom_smooth()` using formula = 'y ~ x'