library(readr)
library(plotly)
## Loading required package: ggplot2
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
hr <- read_csv('https://raw.githubusercontent.com/aiplanethub/Datasets/refs/heads/master/HR_comma_sep.csv')
## Rows: 14999 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): Department, salary
## dbl (8): satisfaction_level, last_evaluation, number_project, average_montly...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Task 1
cor.test(hr$time_spend_company, hr$satisfaction_level)
##
## Pearson's product-moment correlation
##
## data: hr$time_spend_company and hr$satisfaction_level
## t = -12.416, df = 14997, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.11668153 -0.08499948
## sample estimates:
## cor
## -0.1008661
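As a check on the output above, the t statistic, p-value, and confidence interval can be recomputed from just the reported correlation and the sample size. This is a minimal sketch, not part of the original analysis: the values of `r` and `n` are copied from the printed output, and the variable names are illustrative. It uses the standard t transformation of Pearson's r and the Fisher z interval.

```r
# Values copied from the cor.test() output above (assumed, not recomputed from hr)
r <- -0.1008661   # sample Pearson correlation
n <- 14999        # number of rows read by read_csv()
df <- n - 2       # degrees of freedom for the test

# t statistic for H0: true correlation = 0
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

# 95% confidence interval via the Fisher z transformation
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

round(t_stat, 3)   # matches the reported t = -12.416
round(ci, 4)       # matches the reported interval to four decimals
```

Working from the summary values rather than the raw data keeps the check independent of the download step.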
Task 2
The correlation test shows a statistically significant but weak negative
correlation between time_spend_company and satisfaction_level
(r = -0.10, 95% confidence interval -0.117 to -0.085). Because the
p-value is below 2.2e-16, we reject the null hypothesis that the true
correlation between the two variables is zero.
Task 3
Employees who have spent more time at the company tend to report
slightly lower satisfaction. The relationship is weak: with a
correlation of about -0.10, time at the company accounts for roughly 1%
of the differences in satisfaction between employees. The very low
p-value means that a pattern this strong would be very unlikely to
appear in a sample of this size if there were no real relationship, so
it is probably not just chance. It is therefore worth considering that
satisfaction may decline as employees stay longer, although a
correlation alone cannot show that time at the company is the cause.
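To see why a relationship can be both weak and highly significant, it may help to compare the observed correlation with the smallest correlation that would reach significance at this sample size. The sketch below is illustrative only: `r` and `n` are copied from the Task 1 output, the 5% significance level is an assumption, and the variable names are hypothetical.

```r
# Values copied from the Task 1 output (assumed, not recomputed from hr)
r <- -0.1008661
n <- 14999
df <- n - 2

# Share of the variance in satisfaction associated with tenure
r_squared <- r^2

# Smallest |r| that would be significant at the 5% level (two-sided),
# obtained by inverting t = r * sqrt(df) / sqrt(1 - r^2)
t_crit <- qt(0.975, df)
r_crit <- t_crit / sqrt(t_crit^2 + df)

round(r_squared, 4)   # about 0.01, i.e. roughly 1% of the variance
round(r_crit, 3)      # about 0.016
```

With nearly 15,000 employees, any correlation larger in magnitude than about 0.016 would count as statistically significant, so the p-value says little about how strong the relationship is. The size of r (and r squared) is the better guide to practical importance.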
Task 4
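library(ggplot2)  # already attached by plotly; loaded explicitly for clarity

# Scatter plot of satisfaction against tenure with a fitted linear trend
ggplot(hr, aes(x = time_spend_company, y = satisfaction_level)) +
  geom_point() +
  # Least-squares line without the confidence band
  geom_smooth(method = "lm", color = "blue", se = FALSE) +
  labs(title = "Longer Tenure with the Company Tends to Lower Satisfaction",
       x = "Years at the Company",
       y = "Job Satisfaction Level")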
## `geom_smooth()` using formula = 'y ~ x'
