Assignment 7: Correlations - Employee Attrition Analysis

Load libraries

library(readr)
library(ggplot2)

Load data

hr <- read_csv("https://raw.githubusercontent.com/aiplanethub/Datasets/refs/heads/master/HR_comma_sep.csv")

## Rows: 14999 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): Department, salary
## dbl (8): satisfaction_level, last_evaluation, number_project, average_montly...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

1. Perform the correlation. Choose any two appropriate variables from the data and perform the correlation, displaying the results.

cor.test(hr$satisfaction_level, hr$last_evaluation)

## 
##  Pearson's product-moment correlation
## 
## data:  hr$satisfaction_level and hr$last_evaluation
## t = 12.933, df = 14997, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.08916727 0.12082195
## sample estimates:
##       cor 
## 0.1050212

2. Interpret the results in technical terms. For each correlation, explain what the test's p-value means (significance).

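The p-value (< 2.2e-16) is far below 0.05, so the correlation is statistically significant. We reject the null hypothesis that the true correlation is zero and conclude that there is evidence of a linear relationship between satisfaction level and last evaluation. With 14,999 observations, even a weak correlation can reach this level of significance, so the p-value says nothing about how strong the relationship is.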
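As a rough check on the reported significance, the t statistic and p-value can be recomputed by hand from the numbers printed by `cor.test()` above. This is a minimal sketch: `r` and `n` are copied from that output rather than recomputed from the data, and the variable names are illustrative only.

```r
# Values copied from the cor.test() output above (assumed, not recomputed)
r <- 0.1050212   # sample Pearson correlation
n <- 14999       # number of rows, so df = n - 2 = 14997

# t statistic for the null hypothesis that the true correlation is 0
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution with n - 2 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

round(t_stat, 3)     # should agree with the t value reported by cor.test()
p_value < 2.2e-16    # TRUE: below the threshold R prints as "< 2.2e-16"
```

The large t value comes mostly from the `sqrt(n - 2)` factor, which is why a small correlation is still highly significant in a sample of this size.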

————————————————————————————————————————————-

The correlation coefficient is approximately 0.105 (95% confidence interval 0.089 to 0.121), which indicates a positive but weak linear relationship. As satisfaction increases, last evaluation scores tend to increase slightly, but the two variables share only about 1% of their variance.
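To gauge the strength of the relationship rather than its significance, one option is to look at the squared correlation and at the confidence interval. The sketch below uses the estimate printed by `cor.test()` (r = 0.1050212, n = 14999), hard-coded here as an assumption. It rebuilds the interval with the Fisher z-transformation, the standard approximation for a Pearson correlation.

```r
# Values copied from the cor.test() output (assumed, not recomputed)
r <- 0.1050212
n <- 14999

# Proportion of variance in one variable linearly associated with the other
r_squared <- r^2
round(r_squared, 3)   # about 0.011, i.e. roughly 1%

# Approximate 95% confidence interval via the Fisher z-transformation
z  <- atanh(r)              # transform r to the z scale
se <- 1 / sqrt(n - 3)       # standard error on the z scale
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)  # back-transform the limits
round(ci, 4)          # should be close to the interval reported by cor.test()
```

Both limits of the interval sit well below 0.3, so the data rule out a moderate correlation even though they also rule out zero.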

3. Interpret the results in non-technical terms. For each correlation, what do the results mean in non-technical terms?

Employees who are more satisfied tend to receive slightly better performance evaluations. In other words, happier employees get somewhat higher ratings on average, but the link is weak.

4. Create a plot that helps visualize the correlation. For each correlation, create a graph to help visualize the relationship between the two variables. The title must be the non-technical interpretation.

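# Scatter plot of the two correlated variables with a fitted linear trend line
ggplot(hr, aes(x = last_evaluation, y = satisfaction_level)) +
  geom_point(alpha = 0.2) +                                   # transparency reduces overplotting
  geom_smooth(method = "lm", se = FALSE, color = "blue") +    # least-squares line, no error band
  labs(
    # The assignment requires the title to be the non-technical interpretation
    title = "More satisfied employees tend to receive slightly better performance evaluations",
    x = "Last Evaluation Score",
    y = "Satisfaction Level"
  )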

## `geom_smooth()` using formula = 'y ~ x'

Sofia Nogalo Peter Evans

2025-12-02
