Objective

In this assignment, you will analyze employee attrition data using various visualization techniques in R. You will create a histogram, box plots, and bar plots to gain insights into the factors affecting employee turnover.

Data

The dataset contains information about employees, including their satisfaction levels, last evaluation scores, number of projects, average monthly hours, time spent at the company, work accidents, promotion history, department, and salary.

Tasks

  1. Histogram: Distribution of Employee Satisfaction
     Create a histogram of the satisfaction_level variable. The title should reflect a key takeaway from the distribution.

  2. Box Plot: Last Evaluation Scores
     Create a box plot of the last_evaluation variable. The title should highlight an important insight about the evaluation scores.

  3. Comparative Box Plot: Monthly Hours by Department
     Create a comparative box plot of average_montly_hours grouped by department. The title should emphasize a significant difference or pattern among departments.

  4. Bar Plot of Frequencies: Attrition by Salary Level
     Create a bar plot showing the frequency of employee attrition (the left variable) for each salary category. The title should point out the relationship between salary and attrition.

  5. Bar Plot of Averages: Average Satisfaction by Department
     Create a bar plot displaying the average satisfaction_level for each department. The title should highlight a key observation about departmental satisfaction.
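
Tasks 4 and 5 both plot a summary of the data rather than the raw rows: a count per category and a mean per category. The sketch below shows one way to build those two summaries in base R. It deliberately uses the built-in mtcars data as a stand-in, so the columns cyl, am, and mpg are illustrative assumptions and not columns of the HR dataset.

```r
# Stand-in data: the built-in mtcars data set, NOT the HR data.
# cyl plays the role of the grouping column, am the 0/1 flag,
# and mpg the numeric variable being averaged.

# Frequencies: number of cars with am == 1 in each cyl group
# (the same shape as "employees who left, per salary level").
manual <- subset(mtcars, am == 1)
freq <- as.data.frame(table(cyl = manual$cyl), responseName = "count")

# Averages: mean mpg in each cyl group
# (the same shape as "mean satisfaction, per department").
avg <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

print(freq)
print(avg)
```

Each result is a small data frame with one row per category, which is the form a bar plot expects.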

Requirements

  1. Use R to create all visualizations.
  2. Apply appropriate colors, labels, and themes to enhance readability.
  3. Each plot should have a title that states the insight.
  4. Include brief comments explaining your observations/analysis from each plot.
  5. Publish your R Markdown document on RPubs and submit the link for grading.

Submission

Submit your assignment by providing a link to your published RPubs document containing all the required visualizations and explanations.

Grading Criteria

Total: 12 points

Starter code

Use this code to read the data. Note that you will need additional libraries to build the plots.

library(readr)

hr <- read_csv('https://raw.githubusercontent.com/aiplanethub/Datasets/refs/heads/master/HR_comma_sep.csv')
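
Before plotting, it can help to confirm that the column names you plan to use actually exist in hr, since a mistyped name otherwise only fails later inside the plotting call. Here is a minimal sketch of such a check. The name missing_columns is a hypothetical helper, not part of readr, and the toy data frame stands in for hr so that the block runs without a network connection.

```r
# Hypothetical helper: returns the names in `expected` that `data` lacks.
missing_columns <- function(data, expected) {
  setdiff(expected, names(data))
}

# Toy stand-in for hr (made-up values) so the example runs offline.
toy <- data.frame(satisfaction_level = c(0.4, 0.8),
                  salary = c("low", "high"))

# Any name printed here is absent from the data, for example because
# its spelling or capitalization differs from what was expected.
print(missing_columns(toy, c("satisfaction_level", "salary", "left")))
```

With the real data, pass hr and the column names from the tasks. Calling names(hr) lists what the file actually contains.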

Example of a successful graph

library(plotly)

plot_ly(mtcars, x = ~mpg, type = "histogram") %>%
  layout(title = "Cars most commonly had an MPG between 15 and 20",
         xaxis = list(title = "Miles Per Gallon"),
         yaxis = list(title = "Count"))
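
The same title-as-insight pattern carries over to the other plot types in the tasks. The sketch below shows a comparative box plot and a bar plot of averages with plotly. It again uses mtcars as a stand-in, so the grouping column cyl and the marker color are illustrative choices, not requirements of the assignment.

```r
library(plotly)

# Stand-in data: mtcars, with cyl treated as a categorical grouping column.
cars <- mtcars
cars$cyl <- factor(cars$cyl)

# Comparative box plot: one box of mpg values per cyl group.
box_fig <- plot_ly(cars, x = ~cyl, y = ~mpg, type = "box") %>%
  layout(title = "Cars with fewer cylinders tended to get more miles per gallon",
         xaxis = list(title = "Number of Cylinders"),
         yaxis = list(title = "Miles Per Gallon"))

# Bar plot of averages: summarise first, then plot the summary table.
avg <- aggregate(mpg ~ cyl, data = cars, FUN = mean)
bar_fig <- plot_ly(avg, x = ~cyl, y = ~mpg, type = "bar",
                   marker = list(color = "steelblue")) %>%
  layout(title = "4-cylinder cars averaged the highest MPG",
         xaxis = list(title = "Number of Cylinders"),
         yaxis = list(title = "Average Miles Per Gallon"))

box_fig
bar_fig
```

Computing the averages in a separate step keeps the plotting call simple and lets you check the numbers behind the title before publishing.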

Analysis

Good luck!