library(readr)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(plotly)
## Loading required package: ggplot2
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
hr <- read_csv('https://raw.githubusercontent.com/aiplanethub/Datasets/refs/heads/master/HR_comma_sep.csv')
## Rows: 14999 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): Department, salary
## dbl (8): satisfaction_level, last_evaluation, number_project, average_montly...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Histogram: Distribution of Employee Satisfaction

Create a histogram of the satisfaction_level variable. The title should reflect a key takeaway from the distribution.

plot_ly(hr, x = ~satisfaction_level, type = "histogram") %>%
  layout(title = "Significant Portion of Employees Show Extremely Low Satisfaction",
         xaxis = list(title = "Satisfaction Level"),
         yaxis = list(title = "Count"))

The distribution is roughly bimodal: a narrow spike at the very low end of the satisfaction scale (around 0.1) and a much broader cluster across the mid-to-high range (roughly 0.5 to 0.9).

A large number of employees are highly dissatisfied, which shows up as the spike in counts at the bottom of the satisfaction scale.
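To put a number on how large that low-satisfaction group is, the share of employees below a chosen cutoff can be computed directly. This is a minimal sketch: the helper `share_below`, the 0.2 cutoff, and the toy vector are illustrative assumptions, not part of the original analysis.

```r
# Hypothetical helper: fraction of values strictly below a cutoff.
share_below <- function(x, cutoff) {
  mean(x < cutoff)
}

# Toy values for illustration only (not taken from the HR data).
toy_satisfaction <- c(0.09, 0.10, 0.11, 0.45, 0.60, 0.75, 0.80, 0.90)
share_below(toy_satisfaction, 0.2)
```

In this report the equivalent call would be `share_below(hr$satisfaction_level, 0.2)`.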

Box Plot: Last Evaluation Scores

Create a box plot of the last_evaluation variable. The title should highlight an important insight about the evaluation scores.

plot_ly(hr, y = ~last_evaluation, type = "box") %>%
  layout(title = "Most Employees Receive Mid-to-High Evaluation Scores",
         yaxis = list(title = "Last Evaluation Scores"))

The median evaluation score is about 0.72, so half of the employees score above this value and half score below it.

The IQR spans 0.56 to 0.87, so the middle 50% of employees receive evaluations in this mid-to-high range.

Only the lowest quarter of employees score below 0.56.
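The median and quartiles quoted here can be read from the plot's hover labels, but they can also be computed directly. A minimal sketch using base R's `quantile()` follows; the helper name `quartile_summary` and the toy vector are illustrative assumptions. The box trace in plotly has its own `quartilemethod` setting, so hover values may differ slightly from `quantile()`'s default interpolation.

```r
# Hypothetical helper: quartiles and IQR of a numeric vector.
quartile_summary <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
  c(q1 = q[1], median = q[2], q3 = q[3], iqr = q[3] - q[1])
}

# Toy values for illustration only (not taken from the HR data).
toy_scores <- c(0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 1.00)
quartile_summary(toy_scores)
```

In this report the equivalent call would be `quartile_summary(hr$last_evaluation)`.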

Comparative Box Plot: Monthly Hours by Department

Create a comparative box plot of average_montly_hours grouped by department. The title should emphasize a significant difference or pattern among departments.

plot_ly(hr, x = ~as.factor(Department), y = ~average_montly_hours, type = "box") %>%
  layout(title = "Monthly Hours Worked are Consistent Across Departments",
         xaxis = list(title = "Department"),
         yaxis = list(title = "Average Monthly Hours"))

Median monthly hours for all departments are fairly similar, with most departments showing a median around 200 hours per month.

The IQR for each department is quite broad, spanning roughly 150 to 250 hours.
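The per-department medians and quartiles behind these statements can be tabulated rather than read off the boxes. The sketch below assumes a data frame with `Department` and `average_montly_hours` columns, as in `hr`; the helper name `hours_by_department` and the toy data are illustrative assumptions.

```r
library(dplyr)

# Hypothetical helper: median and quartiles of monthly hours per department.
hours_by_department <- function(data) {
  data %>%
    group_by(Department) %>%
    summarise(
      median_hours = median(average_montly_hours),
      q1 = quantile(average_montly_hours, 0.25, names = FALSE),
      q3 = quantile(average_montly_hours, 0.75, names = FALSE),
      .groups = "drop"
    )
}

# Toy data for illustration only (not taken from the HR data).
toy_hours <- tibble(
  Department = c("sales", "sales", "sales", "IT", "IT", "IT"),
  average_montly_hours = c(150, 200, 250, 160, 210, 260)
)
hours_by_department(toy_hours)
```

In this report the equivalent call would be `hours_by_department(hr)`, which returns one row per department.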

Pie Chart of Frequencies: Attrition by Salary Level

Create a pie chart showing the frequency of employee attrition (left) for each salary category. The title should point out the relationship between salary and attrition.

left_by_salary <- hr %>%
  filter(left == 1) %>%
  count(salary)

plot_ly(left_by_salary, labels = ~salary, values = ~n, type = 'pie') %>%
  layout(title = 'Low-Salary Employees Account for Most Attrition')

Employees in the low salary category account for 60.8% of attrition, so a majority of those leaving are in the lowest pay band.

Employees in the medium salary category account for a further 36.9%, so mid-range earners also contribute substantially to overall attrition.
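The pie chart shows each salary level's share of all leavers, which depends partly on how many employees are in each category. To compare salary levels on an equal footing, the attrition rate within each level can be computed alongside that share. This is a sketch that assumes `salary` and `left` columns as in `hr`; the helper name `attrition_by_salary` and the toy data are illustrative assumptions.

```r
library(dplyr)

# Hypothetical helper: attrition rate within each salary level,
# plus each level's share of all leavers (what the pie chart shows).
attrition_by_salary <- function(data) {
  data %>%
    group_by(salary) %>%
    summarise(
      employees = n(),
      leavers = sum(left == 1),
      attrition_rate = leavers / employees,
      .groups = "drop"
    ) %>%
    mutate(share_of_leavers = leavers / sum(leavers))
}

# Toy data for illustration only (not taken from the HR data).
toy_staff <- tibble(
  salary = c("low", "low", "low", "low", "medium", "medium", "high", "high"),
  left   = c(1, 1, 0, 0, 1, 0, 0, 0)
)
attrition_by_salary(toy_staff)
```

In the toy data the low category holds two thirds of the leavers but has the same attrition rate as the medium category, which is why the two measures are worth reporting separately. In this report the equivalent call would be `attrition_by_salary(hr)`.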

Bar Plot of Averages: Average Satisfaction by Department

Create a bar plot displaying the average satisfaction_level for each department. The title should highlight a key observation about departmental satisfaction.

department_satisfaction <- hr %>%
  group_by(Department) %>%
  summarise(avg_satisfaction = mean(satisfaction_level))

plot_ly(department_satisfaction, x = ~factor(Department), y = ~avg_satisfaction, type = 'bar') %>%
  layout(title = 'Consistent Satisfaction Across Departments',
         xaxis = list(title = 'Department'),
         yaxis = list(title = 'Average Satisfaction'))

Average satisfaction levels across all departments are consistently close to 0.6, showing little variation between different departments.

The accounting department has the lowest average satisfaction, but it is still close to 0.6, so no department falls far below the others.