library(readr)
library(ggplot2)
hr1 <- read_csv('https://raw.githubusercontent.com/aiplanethub/Datasets/refs/heads/master/HR_comma_sep.csv')
## Rows: 14999 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): Department, salary
## dbl (8): satisfaction_level, last_evaluation, number_project, average_montly...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
cor.test(hr1$average_montly_hours, hr1$time_spend_company)
##
## Pearson's product-moment correlation
##
## data: hr1$average_montly_hours and hr1$time_spend_company
## t = 15.774, df = 14997, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1119801 0.1434654
## sample estimates:
## cor
## 0.1277549
p-value interpretation: The p-value is less than 2.2e-16, which is extremely small, so there is strong statistical evidence of a correlation between average monthly hours and time spent at the company.
correlation estimate interpretation: It is a weak positive correlation (r is about 0.13).
non-technical interpretation: Employees who have been at the company longer tend to work slightly more hours per month.
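The verbal labels used in these interpretations can be kept consistent with a small helper. The following is a minimal sketch only: `describe_cor()` is a hypothetical function, and the cut-offs (0.1, 0.3, 0.5) are a common rule of thumb assumed here, not something the data or `cor.test()` prescribes.

```r
# Hypothetical helper: turn a correlation estimate into a verbal label.
# The cut-offs below are an assumed rule of thumb, not a fixed standard.
describe_cor <- function(r) {
  strength <- cut(abs(r),
                  breaks = c(0, 0.1, 0.3, 0.5, 1),
                  labels = c("negligible", "weak", "moderate", "strong"),
                  include.lowest = TRUE, right = FALSE)
  direction <- ifelse(r > 0, "positive", ifelse(r < 0, "negative", "zero"))
  paste(strength, direction)
}

# The four estimates reported by cor.test() in this report
describe_cor(c(0.1277549, -0.02004811, 0.1967859, -0.1429696))
```

Under these assumed cut-offs, the four estimates are labelled weak positive, negligible negative, weak positive, and weak negative.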
ggplot(hr1, aes(x = average_montly_hours, y = time_spend_company)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(title = "Scatter Plot: Average Monthly Hours vs. Time Spent at Company",
       x = "Average Monthly Hours",
       y = "Time Spent at Company")
## `geom_smooth()` using formula = 'y ~ x'
cor.test(hr1$satisfaction_level, hr1$average_montly_hours)
##
## Pearson's product-moment correlation
##
## data: hr1$satisfaction_level and hr1$average_montly_hours
## t = -2.4556, df = 14997, p-value = 0.01408
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.036040356 -0.004045605
## sample estimates:
## cor
## -0.02004811
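p-value interpretation: The p-value is 0.01408, which is below the conventional 0.05 threshold, so there is statistical evidence of a correlation between a person's satisfaction level and average monthly hours. With 14,999 observations, even a very small correlation can reach statistical significance.
correlation estimate interpretation: It is a negligible negative correlation (r is about -0.02); the 95 percent confidence interval excludes 0 but lies very close to it.
non-technical interpretation: Satisfaction level and monthly hours are practically unrelated; knowing one tells you almost nothing about the other.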
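To see how the test statistic and p-value for this pair follow from the correlation estimate and the sample size, they can be recomputed by hand. This is a sketch using only the numbers printed in the `cor.test()` output above; the variable names (`t_stat`, `p_val`, `r_squared`) are chosen here for illustration.

```r
# Values copied from the cor.test() output for
# satisfaction_level vs. average_montly_hours.
r  <- -0.02004811
df <- 14997                      # n - 2, with n = 14999 rows

# t statistic for a Pearson correlation: t = r * sqrt(df) / sqrt(1 - r^2)
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_val <- 2 * pt(-abs(t_stat), df)

# Squared correlation: share of variance the two variables have in common
r_squared <- r^2

round(c(t = t_stat, p = p_val, r_squared = r_squared), 5)
```

The result should agree with the reported t = -2.4556 and p-value = 0.01408, while the squared correlation is about 0.0004. In other words, the two variables share roughly 0.04 percent of their variance, which is the sense in which the relationship is statistically detectable but practically negligible.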
ggplot(hr1, aes(x = satisfaction_level, y = average_montly_hours)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(title = "Scatter Plot: Satisfaction Level vs. Average Monthly Hours",
       x = "Satisfaction Level",
       y = "Average Monthly Hours")
## `geom_smooth()` using formula = 'y ~ x'
cor.test(hr1$number_project, hr1$time_spend_company)
##
## Pearson's product-moment correlation
##
## data: hr1$number_project and hr1$time_spend_company
## t = 24.579, df = 14997, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1813532 0.2121217
## sample estimates:
## cor
## 0.1967859
p-value interpretation: The p-value is less than 2.2e-16, which is extremely small, so there is strong statistical evidence of a correlation between the number of projects and the time spent at the company.
correlation estimate interpretation: It is a weak positive correlation (r is about 0.20).
non-technical interpretation: Employees who have spent more time at the company tend to have slightly more projects.
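Both variables in this test are small whole-number counts, so a rank-based (Spearman) correlation is one possible robustness check alongside Pearson. The sketch below uses simulated data, not the HR data set, purely to show the call; the vectors `years` and `projects` and their generating formula are assumptions made for illustration.

```r
set.seed(1)  # simulated data, NOT the HR data set

# Two discrete, count-like variables with a weak positive association
years    <- sample(2:10, 500, replace = TRUE)
projects <- pmin(7, pmax(2, round(3 + 0.15 * years + rnorm(500))))

# Pearson (linear) and Spearman (rank-based) tests side by side.
# exact = FALSE skips the exact p-value, which cannot be computed with ties.
pearson  <- cor.test(projects, years)
spearman <- cor.test(projects, years, method = "spearman", exact = FALSE)

c(pearson = unname(pearson$estimate), spearman = unname(spearman$estimate))
```

If the two estimates are close, the Pearson result is not being driven by a few extreme values. The same two calls could be applied to `hr1$number_project` and `hr1$time_spend_company`.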
ggplot(hr1, aes(x = number_project, y = time_spend_company)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(title = "Number of Projects vs. Time Spent at Company",
       x = "Number of Projects",
       y = "Time Spent at Company")
## `geom_smooth()` using formula = 'y ~ x'
cor.test(hr1$satisfaction_level, hr1$number_project)
##
## Pearson's product-moment correlation
##
## data: hr1$satisfaction_level and hr1$number_project
## t = -17.69, df = 14997, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.1586105 -0.1272570
## sample estimates:
## cor
## -0.1429696
p-value interpretation: The p-value is less than 2.2e-16, which is extremely small, so there is strong statistical evidence of a correlation between a person's satisfaction level and the number of projects they have worked on.
correlation estimate interpretation: It is a weak negative correlation (r is about -0.14).
non-technical interpretation: Employees with more projects tend to report slightly lower satisfaction.
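The 95 percent confidence interval reported for this correlation can be reproduced from the estimate and the sample size with the Fisher z-transformation. This is a sketch based only on the numbers printed above; the variable names are chosen here for illustration.

```r
# Values copied from the cor.test() output for
# satisfaction_level vs. number_project.
r <- -0.1429696
n <- 14999

z      <- atanh(r)               # Fisher z-transform of r
se     <- 1 / sqrt(n - 3)        # approximate standard error of z
z_crit <- qnorm(0.975)           # about 1.96 for a 95% interval

# Back-transform the interval limits to the correlation scale
ci <- tanh(z + c(-1, 1) * z_crit * se)
round(ci, 4)
```

The limits should match the reported interval of roughly -0.1586 to -0.1273. Because the whole interval sits between -0.16 and -0.13, the data support a correlation that is clearly negative but weak.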
ggplot(hr1, aes(x = satisfaction_level, y = number_project)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(title = "Scatter Plot: Satisfaction Level vs. Number of Projects",
       x = "Satisfaction Level",
       y = "Number of Projects")
## `geom_smooth()` using formula = 'y ~ x'