library(readr)
hr <- read_csv('https://raw.githubusercontent.com/aiplanethub/Datasets/refs/heads/master/HR_comma_sep.csv')
## Rows: 14999 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): Department, salary
## dbl (8): satisfaction_level, last_evaluation, number_project, average_montly...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Install ggplot2 only if it is not already available.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2", repos = "https://cran.rstudio.com/")
}
library(ggplot2)
cor.test(hr$average_montly_hours, hr$satisfaction_level)
##
## Pearson's product-moment correlation
##
## data: hr$average_montly_hours and hr$satisfaction_level
## t = -2.4556, df = 14997, p-value = 0.01408
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.036040356 -0.004045605
## sample estimates:
## cor
## -0.02004811
The p-value is 0.01408, which is below the conventional significance level of 0.05, so there is a statistically significant correlation between workplace satisfaction level and average monthly hours worked.
The correlation is negative but very weak (r = -0.02), so it is significant mainly because the sample is large. It suggests that workplace satisfaction may be slightly influenced by how many hours an employee works: employees who work longer hours may tend to feel a little less satisfied with their job.
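To put the size of this effect in context, the following is a minimal sketch that recomputes the test statistic from the reported `r` and sample size. The values of `r` and `n` are copied from the `cor.test()` output above; the comparison sample size `n_small` is a hypothetical number chosen only for illustration.

```r
# Values copied from the cor.test() output above.
r <- -0.02004811
n <- 14999  # degrees of freedom (14997) plus 2

# Share of variance in one variable linearly associated with the other.
r_squared <- r^2

# t statistic and two-sided p-value implied by r and n.
t_stat <- r * sqrt((n - 2) / (1 - r^2))
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

# Hypothetical: the same r in a sample of 500 observations.
n_small <- 500
t_small <- r * sqrt((n_small - 2) / (1 - r^2))
p_small <- 2 * pt(-abs(t_small), df = n_small - 2)

round(c(r_squared = r_squared, t = t_stat, p = p_value, p_n500 = p_small), 5)
```

The recomputed `t` and `p` match the reported -2.4556 and 0.01408, while `r_squared` shows that well under 1% of the variance is shared. The same correlation in the smaller hypothetical sample would not reach the 0.05 threshold.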
ggplot(hr, aes(x = average_montly_hours, y = satisfaction_level)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(title = "Scatter Plot: Satisfaction level vs Average Monthly Hours Worked",
       x = "Average Monthly Hours",
       y = "Satisfaction Level")
## `geom_smooth()` using formula = 'y ~ x'
cor.test(hr$number_project, hr$time_spend_company)
##
## Pearson's product-moment correlation
##
## data: hr$number_project and hr$time_spend_company
## t = 24.579, df = 14997, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1813532 0.2121217
## sample estimates:
## cor
## 0.1967859
The p-value is less than 2.2e-16, far below the conventional significance level of 0.05, so there is a highly statistically significant correlation between number of projects and time spent at the company.
The correlation is positive but weak (r = 0.20), which implies that employees who have spent a longer time with the company may generally have completed somewhat more projects.
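Both variables here are small integer counts, so a rank-based correlation is one possible robustness check alongside Pearson's. The sketch below runs both tests on simulated data; the vectors `tenure` and `projects` and the way they are generated are hypothetical stand-ins, not the `hr` data.

```r
set.seed(42)

# Hypothetical stand-in data: tenure in years and a project count per employee.
n <- 1000
tenure <- sample(2:10, n, replace = TRUE)
projects <- pmin(7, pmax(2, round(3 + 0.15 * tenure + rnorm(n))))

# Pearson measures linear association; Spearman uses ranks instead.
pearson <- cor.test(projects, tenure)
# exact = FALSE avoids the exact-p-value warning caused by tied ranks.
spearman <- cor.test(projects, tenure, method = "spearman", exact = FALSE)

c(pearson = unname(pearson$estimate), spearman = unname(spearman$estimate))
```

On the real data the same two calls could be applied to `hr$number_project` and `hr$time_spend_company`. If the two estimates are similar, the conclusion does not depend much on the choice of method.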
ggplot(hr, aes(x = factor(time_spend_company), y = number_project)) +
  geom_boxplot(fill = "skyblue") +
  labs(title = "Box Plot: Number of Projects by Time at the Company",
       x = "Time Spent at Company",
       y = "Number of Projects") +
  theme_minimal()
cor.test(hr$satisfaction_level, hr$time_spend_company)
##
## Pearson's product-moment correlation
##
## data: hr$satisfaction_level and hr$time_spend_company
## t = -12.416, df = 14997, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.11668153 -0.08499948
## sample estimates:
## cor
## -0.1008661
The p-value is less than 2.2e-16, far below the conventional significance level of 0.05, so there is a highly statistically significant correlation between satisfaction level and the amount of time spent at the company.
The correlation is negative but weak (r = -0.10), which implies that employees who have spent a longer time at the company tend to report slightly lower satisfaction levels. This could be for a number of reasons, such as greater responsibility with longer tenure leading to stress, or a loss of initial enthusiasm.
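A single correlation coefficient summarises only the linear trend, so a per-group summary can help show whether satisfaction changes steadily with tenure. The sketch below builds such a table on simulated data; the data frame `demo` is a hypothetical stand-in that only borrows the column names from `hr`.

```r
set.seed(1)

# Hypothetical stand-in for hr, using the same column names as the source data.
demo <- data.frame(time_spend_company = sample(2:10, 500, replace = TRUE))
demo$satisfaction_level <- pmin(
  1,
  pmax(0.09, 0.7 - 0.02 * demo$time_spend_company + rnorm(500, sd = 0.2))
)

# Mean satisfaction and group size for each tenure value.
means <- tapply(demo$satisfaction_level, demo$time_spend_company, mean)
counts <- table(demo$time_spend_company)

summary_tbl <- data.frame(
  tenure = as.integer(names(means)),
  mean_satisfaction = round(as.numeric(means), 3),
  n = as.integer(counts)
)
summary_tbl
```

Replacing `demo` with `hr` would give the group means behind the box plot. Small group sizes in the `n` column are a reason to read the corresponding means with caution.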
ggplot(hr, aes(x = factor(time_spend_company), y = satisfaction_level)) +
  geom_boxplot(fill = "skyblue") +
  labs(title = "Box Plot: Satisfaction level by Time Spent at the Company",
       x = "Time Spent at Company",
       y = "Satisfaction level") +
  theme_minimal()
cor.test(hr$promotion_last_5years, hr$satisfaction_level)
##
## Pearson's product-moment correlation
##
## data: hr$promotion_last_5years and hr$satisfaction_level
## t = 3.1367, df = 14997, p-value = 0.001712
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.009605315 0.041591949
## sample estimates:
## cor
## 0.02560519
The p-value is 0.001712, which is below the conventional significance level of 0.05, so there is a statistically significant correlation between satisfaction level and receiving a recent promotion.
The correlation is positive but very weak (r = 0.03), which means that receiving a promotion may be associated with employees reporting slightly higher satisfaction with their job.
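Because `promotion_last_5years` is a 0/1 indicator, Pearson's correlation here is a point-biserial correlation, and its test is equivalent to a pooled-variance two-sample t test on the group means. The sketch below illustrates that equivalence on simulated data; the vectors `promoted` and `satisfaction` are hypothetical stand-ins, not the `hr` data.

```r
set.seed(7)

# Hypothetical stand-in data: a binary promotion flag and a satisfaction score.
promoted <- rbinom(2000, size = 1, prob = 0.05)
satisfaction <- pmin(1, pmax(0, 0.6 + 0.05 * promoted + rnorm(2000, sd = 0.25)))

# Pearson correlation test with a binary variable (point-biserial).
pearson <- cor.test(promoted, satisfaction)

# Pooled-variance two-sample t test comparing the two groups.
pooled <- t.test(satisfaction ~ promoted, var.equal = TRUE)

# The t statistics agree in magnitude (the sign depends on group order).
c(pearson_t = unname(pearson$statistic), pooled_t = unname(pooled$statistic))
c(pearson_p = pearson$p.value, pooled_p = pooled$p.value)
```

Framing the result as a difference in mean satisfaction between promoted and non-promoted employees is often easier to interpret than a correlation with an indicator variable.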
ggplot(hr, aes(x = factor(promotion_last_5years), y = satisfaction_level)) +
  geom_boxplot(fill = "skyblue") +
  labs(title = "Box Plot: Satisfaction level by Recent Promotion",
       x = "Promotion in last 5 Years",
       y = "Satisfaction level") +
  theme_minimal()