Justin Kaplan

Assignment 7

library(readr)
hr <- read_csv('https://raw.githubusercontent.com/aiplanethub/Datasets/refs/heads/master/HR_comma_sep.csv')
library(dplyr)
library(ggplot2)

1. Perform the correlation (.5 point) Choose any two appropriate variables from the data and perform the correlation, displaying the results.

cor_test_result <- cor.test(hr$satisfaction_level, hr$left)
cor_test_result
## 
##  Pearson's product-moment correlation
## 
## data:  hr$satisfaction_level and hr$left
## t = -51.613, df = 14997, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.4018809 -0.3747001
## sample estimates:
##       cor 
## -0.388375
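The printed summary above is one view of the object that `cor.test()` returns; the individual numbers can also be pulled out by name for reuse in text or plot titles. The sketch below is only an illustration: it uses a small simulated stand-in data set (the `toy_left` and `toy_satisfaction` vectors are made up, not the HR data), so its values will differ from the output above.

```r
# Simulated stand-in data (hypothetical, NOT the HR data set)
set.seed(1)
toy_left <- rbinom(200, 1, 0.25)                       # 1 = left, 0 = stayed
toy_satisfaction <- 0.65 - 0.2 * toy_left + rnorm(200, sd = 0.2)

# cor.test() returns an "htest" list; its pieces can be accessed with $
toy_result <- cor.test(toy_satisfaction, toy_left)

r_estimate <- unname(toy_result$estimate)   # sample correlation r
p_value    <- toy_result$p.value            # two-sided p-value
ci         <- toy_result$conf.int           # 95 percent confidence interval
t_stat     <- unname(toy_result$statistic)  # t statistic
df         <- unname(toy_result$parameter)  # degrees of freedom (n - 2)

cat("r =", round(r_estimate, 3), " p =", signif(p_value, 3), "\n")
cat("95% CI:", round(ci[1], 3), "to", round(ci[2], 3), "\n")
```

With the real data, the same fields could be read from `cor_test_result` in the same way.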

2. Interpret the results in technical terms (.5 point) For each correlation, explain what the test’s p-value means (significance).

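The p-value (< 2.2e-16) is the probability of observing a correlation at least this far from zero if satisfaction level and leaving were truly uncorrelated. Because it is far below 0.05, we reject the null hypothesis of zero correlation, so the relationship is statistically significant. The p-value does not measure the strength of the relationship. The estimate r = -0.388 (95% CI -0.402 to -0.375) indicates a moderate negative correlation between employee satisfaction and whether the employee has left the company.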
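As a check on where the p-value comes from, the t statistic reported by `cor.test()` can be reproduced from the correlation and the degrees of freedom using t = r * sqrt(df) / sqrt(1 - r^2). The sketch below copies `r` and `df` by hand from the output above; the variable names are only for illustration.

```r
# Values copied from the cor.test() output above
r  <- -0.388375   # sample correlation
df <- 14997       # degrees of freedom (n - 2)

# t statistic for testing H0: true correlation = 0
t_stat <- r * sqrt(df) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df)

round(t_stat, 3)    # about -51.613, matching the reported t
p_value < 2.2e-16   # TRUE: smaller than the threshold R prints
```

A t value this far from zero is what makes the p-value so small. With about 15,000 rows, even a moderate correlation produces a very large t statistic.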

3. Interpret the results in non-technical terms (1 point) For each correlation, what do the results mean in non-technical terms.

Employees who are more satisfied are more likely to stay with the company. Companies can use this information to identify employees who are at risk of leaving and make a better effort to retain them.

4. Create a plot that helps visualize the correlation (.5 point) For each correlation, create a graph to help visualize the relationship between the two variables. The title must be the non-technical interpretation.

# `left` is stored as 0/1; convert it to a labeled factor so ggplot
# draws one box per group instead of treating x as continuous
hr %>%
  mutate(status = factor(left, levels = c(0, 1), labels = c("Stayed", "Left"))) %>%
  ggplot(aes(x = status, y = satisfaction_level, fill = status)) +
  geom_boxplot() +
  # Red diamond marks the mean satisfaction level of each group
  stat_summary(fun = mean, geom = "point", shape = 18, size = 3, color = "red") +
  labs(title = "Employees Who Left Were Less Satisfied Than Those Who Stayed",
       x = "Status with Company",
       y = "Satisfaction Level",
       fill = "Status") +
  theme_minimal()