library(readr)
library(ggplot2)
hr1 <- read_csv('https://raw.githubusercontent.com/aiplanethub/Datasets/refs/heads/master/HR_comma_sep.csv')
## Rows: 14999 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): Department, salary
## dbl (8): satisfaction_level, last_evaluation, number_project, average_montly...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

1. Perform the correlation (.5 point) Choose any two appropriate variables from the data and perform the correlation, displaying the results.

2. Interpret the results in technical terms (.5 point) For each correlation, explain what the test’s p-value means (significance).

3. Interpret the results in non-technical terms (1 point) For each correlation, what do the results mean in non-technical terms.

4. Create a plot that helps visualize the correlation (.5 point) For each correlation, create a graph to help visualize the relationship between the two variables. The title must be the non-technical interpretation.

1.

cor.test(hr1$average_montly_hours, hr1$time_spend_company)
## 
##  Pearson's product-moment correlation
## 
## data:  hr1$average_montly_hours and hr1$time_spend_company
## t = 15.774, df = 14997, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.1119801 0.1434654
## sample estimates:
##       cor 
## 0.1277549

p-value interpretation: The p-value is less than 2.2e-16, which is extremely small, so there is strong statistical evidence of a correlation between average monthly hours and the time spent at the company.

correlation estimate interpretation: The estimate of 0.128 indicates a weak positive correlation.

non-technical interpretation: Employees who have spent more time at the company tend to work slightly more hours per month.
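One way to put a number on how weak this correlation is: square the estimate, which gives the share of variance the two variables have in common under a linear relationship. The sketch below is only an illustration; the value of `r` is copied from the `cor.test()` output above, and the variable names are made up for this example.

```r
# Correlation estimate copied from the cor.test() output above
r <- 0.1277549

# Squared correlation: proportion of variance the two variables share
r_squared <- r^2

# Express it as a percentage, rounded to one decimal place
shared_pct <- round(100 * r_squared, 1)
shared_pct
```

The result is under 2 percent, which fits the reading that the relationship is real but small.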

ggplot(hr1, aes(x = average_montly_hours, y = time_spend_company)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(title = "Employees with more time at the company tend to work slightly more hours",
       x = "Average Monthly Hours",
       y = "Time Spent at Company")
## `geom_smooth()` using formula = 'y ~ x'

2.

cor.test(hr1$satisfaction_level, hr1$average_montly_hours)
## 
##  Pearson's product-moment correlation
## 
## data:  hr1$satisfaction_level and hr1$average_montly_hours
## t = -2.4556, df = 14997, p-value = 0.01408
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.036040356 -0.004045605
## sample estimates:
##         cor 
## -0.02004811

p-value interpretation: The p-value is 0.01408, which is below the conventional 0.05 threshold, so the correlation between a person's satisfaction level and average monthly hours is statistically significant.

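correlation estimate interpretation: The estimate of -0.02 is so close to zero that the negative correlation is negligible. With 14,999 observations, even a very small correlation can be statistically significant.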

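non-technical interpretation: For practical purposes, a person's satisfaction level has almost no relationship with the hours they work each month.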
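The combination of a small p-value and a near-zero estimate comes from the sample size. For a Pearson correlation the test statistic is t = r * sqrt(n - 2) / sqrt(1 - r^2), so a large n inflates even a tiny r. The sketch below recomputes the statistic from the values in the output above; `r` and `n` are copied from that output, and the other names are illustrative.

```r
# Values copied from the cor.test() output above
r <- -0.02004811   # sample correlation
n <- 14999         # number of rows, so df = n - 2 = 14997

# t statistic for testing whether the true correlation is zero
df <- n - 2
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

round(t_stat, 4)
signif(p_value, 4)
```

These values should agree with the `t` and `p-value` lines that `cor.test()` printed, which shows that the significance here is driven by the roughly 15,000 rows rather than by a meaningful effect.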

ggplot(hr1, aes(x = satisfaction_level, y = average_montly_hours)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(title = "Satisfaction level has almost no relationship with monthly hours",
       x = "Satisfaction Level",
       y = "Average Monthly Hours")
## `geom_smooth()` using formula = 'y ~ x'

3.

cor.test(hr1$number_project, hr1$time_spend_company)
## 
##  Pearson's product-moment correlation
## 
## data:  hr1$number_project and hr1$time_spend_company
## t = 24.579, df = 14997, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.1813532 0.2121217
## sample estimates:
##       cor 
## 0.1967859

p-value interpretation: The p-value is less than 2.2e-16, which is extremely small, so there is strong statistical evidence of a correlation between the number of projects and the time spent at the company.

correlation estimate interpretation: The estimate of 0.197 indicates a weak positive correlation.

non-technical interpretation: Employees who have spent more time at the company tend to have slightly more projects.
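The 95 percent confidence interval in the output above excludes zero, which is another way to read the significance of this result. For Pearson's method, `cor.test()` bases that interval on Fisher's z transformation. The sketch below reproduces the calculation by hand, assuming the estimate and sample size shown in the output; the variable names are illustrative.

```r
# Values copied from the cor.test() output above
r <- 0.1967859
n <- 14999

# Fisher's z transformation of the correlation and its standard error
z <- atanh(r)
se <- 1 / sqrt(n - 3)

# 95 percent interval on the z scale, transformed back to the r scale
z_crit <- qnorm(0.975)
ci <- tanh(z + c(-1, 1) * z_crit * se)

round(ci, 4)
```

Both limits should land close to the interval printed by `cor.test()`, and both sit well below 0.3, so the whole plausible range still describes a weak relationship.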

ggplot(hr1, aes(x = number_project, y = time_spend_company)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(title = "Employees with more time at the company tend to have slightly more projects",
       x = "Number of Projects",
       y = "Time Spent at Company")
## `geom_smooth()` using formula = 'y ~ x'

4.

cor.test(hr1$satisfaction_level, hr1$number_project)
## 
##  Pearson's product-moment correlation
## 
## data:  hr1$satisfaction_level and hr1$number_project
## t = -17.69, df = 14997, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.1586105 -0.1272570
## sample estimates:
##        cor 
## -0.1429696

p-value interpretation: The p-value is less than 2.2e-16, which is extremely small, so there is strong statistical evidence of a correlation between a person's satisfaction level and the number of projects they have done.

correlation estimate interpretation: The estimate of -0.143 indicates a weak negative correlation.

non-technical interpretation: Employees who do more projects tend to be slightly less satisfied.

ggplot(hr1, aes(x = satisfaction_level, y = number_project)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  labs(title = "Employees with more projects tend to be slightly less satisfied",
       x = "Satisfaction Level",
       y = "Number of Projects")
## `geom_smooth()` using formula = 'y ~ x'