library(readr)
hr <- read_csv('https://raw.githubusercontent.com/aiplanethub/Datasets/refs/heads/master/HR_comma_sep.csv')
## Rows: 14999 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): Department, salary
## dbl (8): satisfaction_level, last_evaluation, number_project, average_montly...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
cor.test(hr$satisfaction_level, hr$average_montly_hours)
##
## Pearson's product-moment correlation
##
## data: hr$satisfaction_level and hr$average_montly_hours
## t = -2.4556, df = 14997, p-value = 0.01408
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.036040356 -0.004045605
## sample estimates:
## cor
## -0.02004811
Technical Interpretation: The correlation between satisfaction_level and average_montly_hours is -0.020 (p-value = 0.01408). The correlation is negative but very close to zero, indicating a negligible linear relationship. The p-value is below 0.05, so the correlation is statistically significant, but with 14,999 observations even a very small correlation can reach significance, so this result says little about practical importance. Non-Technical Interpretation: Although there is a slight negative relationship, the connection between satisfaction level and average monthly hours is very weak. In this data, monthly working hours are almost unrelated to employees' satisfaction levels.
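To illustrate why such a small correlation still comes out as significant, the sketch below recomputes the test statistic from the reported values using the standard relation t = r * sqrt(n - 2) / sqrt(1 - r^2). The variable names (`r`, `n`, `t_stat`, `p_value`, `r_squared`) are illustrative and not part of the original analysis; the inputs are copied from the `cor.test()` output above.

```r
# Values copied from the cor.test() output above
r <- -0.02004811   # sample correlation
n <- 14999         # number of rows reported by read_csv()

# t statistic for H0: true correlation = 0, with n - 2 degrees of freedom
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

# Share of variance in one variable linearly associated with the other
r_squared <- r^2

print(c(t = t_stat, p = p_value, r_squared = r_squared))
```

The statistic grows with sqrt(n - 2), so a large sample can push a near-zero correlation past the significance threshold, while the squared correlation stays far below one percent.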
library(ggplot2)
# geom_hex() requires the hexbin package to be installed
ggplot(hr, aes(x = average_montly_hours, y = satisfaction_level)) +
  geom_hex(bins = 30) + # Adjust the bins for more or fewer hexagons
  scale_fill_viridis_c(option = "C", name = "Count") + # geom_hex() fills by observation count per hexagon
  labs(
    title = "Satisfaction Level vs. Average Monthly Hours (Hexbin Plot)",
    x = "Average Monthly Hours",
    y = "Satisfaction Level"
  ) +
  theme_minimal()
cor.test(hr$last_evaluation, hr$number_project)
##
## Pearson's product-moment correlation
##
## data: hr$last_evaluation and hr$number_project
## t = 45.656, df = 14997, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.3352028 0.3633053
## sample estimates:
## cor
## 0.3493326
Technical Interpretation: The correlation between last_evaluation and number_project is 0.349 (p-value < 2.2e-16). This positive correlation is moderate, and the extremely low p-value indicates that it is statistically significant. Non-Technical Interpretation: Employees who have more projects tend to receive higher evaluations. This is a moderate association between the number of projects an employee handles and their performance evaluation; it does not by itself show that taking on more projects causes higher evaluations.
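As a check on the 95 percent confidence interval in the output above, the following sketch rebuilds it with the Fisher z-transformation, which is the approach `cor.test()` documents for Pearson intervals. The variable names are illustrative, and the inputs are copied from the printed output rather than recomputed from the data.

```r
# Values copied from the cor.test() output above
r <- 0.3493326
n <- 14999

z    <- atanh(r)          # Fisher z-transform of the correlation
se   <- 1 / sqrt(n - 3)   # approximate standard error on the z scale
crit <- qnorm(0.975)      # two-sided 95% critical value

# Back-transform the interval endpoints to the correlation scale
ci <- tanh(z + c(-1, 1) * crit * se)
print(ci)
```

The result should closely match the interval reported by `cor.test()`. The interval is narrow because the standard error shrinks with the sample size, not because the association is strong.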
ggplot(hr, aes(x = number_project, y = last_evaluation)) +
  geom_hex(bins = 30) +
  scale_fill_viridis_c(option = "C", name = "Count") + # geom_hex() fills by observation count per hexagon
  labs(
    title = "Last Evaluation vs. Number of Projects (Hexbin Plot)",
    x = "Number of Projects",
    y = "Last Evaluation"
  ) +
  theme_minimal()
cor.test(hr$time_spend_company, hr$satisfaction_level)
##
## Pearson's product-moment correlation
##
## data: hr$time_spend_company and hr$satisfaction_level
## t = -12.416, df = 14997, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.11668153 -0.08499948
## sample estimates:
## cor
## -0.1008661
Technical Interpretation: The correlation between time_spend_company and satisfaction_level is -0.101 (p-value < 2.2e-16). This negative correlation is weak but statistically significant given the very low p-value. Non-Technical Interpretation: Employees who have spent more time at the company tend to have slightly lower satisfaction levels. The relationship is weak, but it indicates that longer tenure is associated with a small decrease in satisfaction.
ggplot(hr, aes(x = time_spend_company, y = satisfaction_level)) +
  geom_hex(bins = 30) +
  scale_fill_viridis_c(option = "C", name = "Count") + # geom_hex() fills by observation count per hexagon
  labs(
    title = "Time Spent in Company vs. Satisfaction Level (Hexbin Plot)",
    x = "Time Spent in Company (Years)",
    y = "Satisfaction Level"
  ) +
  theme_minimal()
cor.test(hr$average_montly_hours, hr$last_evaluation)
##
## Pearson's product-moment correlation
##
## data: hr$average_montly_hours and hr$last_evaluation
## t = 44.237, df = 14997, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.3255078 0.3538218
## sample estimates:
## cor
## 0.3397418
Technical Interpretation: The correlation between average_montly_hours and last_evaluation is 0.340 (p-value < 2.2e-16). This moderate positive correlation is statistically significant given the very low p-value. Non-Technical Interpretation: Employees who work more hours on average tend to receive higher evaluations. This moderate relationship suggests that longer working hours are associated with better performance evaluations, although the correlation alone does not show which one drives the other.
ggplot(hr, aes(x = average_montly_hours, y = last_evaluation)) +
  geom_hex(bins = 30) +
  scale_fill_viridis_c(option = "C", name = "Count") + # geom_hex() fills by observation count per hexagon
  labs(
    title = "Average Monthly Hours vs. Last Evaluation (Hexbin Plot)",
    x = "Average Monthly Hours",
    y = "Last Evaluation"
  ) +
  theme_minimal()
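The four tests above repeat the same steps, so one possible way to collect them in a single table is sketched below. `cor_summary` is a hypothetical helper and not part of the original analysis. To keep the example self-contained it is demonstrated on R's built-in `mtcars` data as a stand-in, so the printed numbers are not results for the HR dataset.

```r
# Hypothetical helper: run cor.test() on each pair of columns and
# gather the estimate, 95% CI, p-value, and r^2 into one data frame.
cor_summary <- function(data, pairs) {
  rows <- lapply(pairs, function(p) {
    ct <- cor.test(data[[p[1]]], data[[p[2]]])
    data.frame(
      x         = p[1],
      y         = p[2],
      r         = unname(ct$estimate),
      ci_low    = ct$conf.int[1],
      ci_high   = ct$conf.int[2],
      p_value   = ct$p.value,
      r_squared = unname(ct$estimate)^2
    )
  })
  do.call(rbind, rows)
}

# Stand-in demonstration on built-in data (not the HR dataset)
demo <- cor_summary(mtcars, list(c("mpg", "wt"), c("hp", "qsec")))
print(demo)

# With the HR data loaded as `hr`, the same call would look like:
# cor_summary(hr, list(
#   c("satisfaction_level", "average_montly_hours"),
#   c("last_evaluation", "number_project"),
#   c("time_spend_company", "satisfaction_level"),
#   c("average_montly_hours", "last_evaluation")
# ))
```

Reporting r squared next to the p-value makes it easier to see which significant correlations are also large enough to matter in practice.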