library(readr)
library(plotly)
## Loading required package: ggplot2
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
hr <- read_csv('https://raw.githubusercontent.com/aiplanethub/Datasets/refs/heads/master/HR_comma_sep.csv')
## Rows: 14999 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): Department, salary
## dbl (8): satisfaction_level, last_evaluation, number_project, average_montly...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Histogram: Distribution of Employee Satisfaction

Create a histogram of the satisfaction_level variable. The title should reflect a key takeaway from the distribution.

plot_ly(hr, x = ~satisfaction_level, type = "histogram") %>%
  layout(title = "Most employees are satisfied",
         xaxis = list(title = "Satisfaction Level"),
         yaxis = list(title = "Frequency"))

About 6 percent of employees are extremely dissatisfied (satisfaction level <= 0.1).

Most employees are satisfied (satisfaction level > 0.5).
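
Shares like the two quoted above can be checked directly, because the mean of a logical vector is the proportion of TRUE values. The following is a minimal sketch, assuming a made-up vector `toy_satisfaction` (hypothetical, not part of the dataset) in place of `hr$satisfaction_level` so that it runs on its own:

```r
# Hypothetical toy values standing in for hr$satisfaction_level
toy_satisfaction <- c(0.09, 0.10, 0.38, 0.45, 0.57, 0.66, 0.72, 0.81, 0.89, 0.92)

# mean() of a logical vector gives the proportion of TRUE values
share_very_low  <- mean(toy_satisfaction <= 0.1)  # extremely dissatisfied
share_satisfied <- mean(toy_satisfaction > 0.5)   # satisfied

# Report both shares as percentages
round(100 * c(very_low = share_very_low, satisfied = share_satisfied), 1)
```

With the real data loaded, the same checks would read `mean(hr$satisfaction_level <= 0.1)` and `mean(hr$satisfaction_level > 0.5)`.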

Box Plot: Last Evaluation Scores

Create a box plot of the last_evaluation variable. The title should highlight an important insight about the evaluation scores.

# Plot last_evaluation (not satisfaction_level) so the chart matches its title
plot_ly(hr, y = ~last_evaluation, type = "box") %>%
  layout(title = "Box Plot of Last Evaluation Scores: Most Employees Cluster Around High Scores",
         yaxis = list(title = "Last Evaluation Score"))

A large portion of employees received evaluation scores between approximately 0.55 and 0.85, which indicates that most were evaluated with moderate to high scores.

Any points outside the whiskers are considered outliers. These represent employees who received either exceptionally low or exceptionally high scores.
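
As a rough illustration of the outlier rule: box plots conventionally extend the whiskers to the most extreme points within 1.5 times the interquartile range of the box, and draw anything beyond that as an outlier. The sketch below applies that convention with base R's `quantile()` to a made-up vector `toy_scores` (hypothetical values). plotly may compute quartiles slightly differently, so its whiskers need not match these fences exactly.

```r
# Hypothetical toy values standing in for hr$last_evaluation
toy_scores <- c(0.10, 0.55, 0.60, 0.65, 0.70, 0.72, 0.75, 0.80, 0.85, 0.88)

# First and third quartiles (R's default quantile method)
q <- quantile(toy_scores, c(0.25, 0.75), names = FALSE)
iqr <- q[2] - q[1]

# Conventional 1.5 * IQR fences
lower_fence <- q[1] - 1.5 * iqr
upper_fence <- q[2] + 1.5 * iqr

# Values beyond either fence are flagged as outliers
outliers <- toy_scores[toy_scores < lower_fence | toy_scores > upper_fence]
outliers
```

Replacing `toy_scores` with `hr$last_evaluation` would list the scores, if any, that fall outside the fences in the real data.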

Comparative Box Plot: Monthly Hours by Department

Create a comparative box plot of average_montly_hours grouped by department. The title should emphasize a significant difference or pattern among departments.

plot_ly(hr, x = ~Department, y = ~average_montly_hours, type = "box") %>%
  layout(title = "Comparative Box Plot of Monthly Hours: Noticeable Variation Across Departments",
         xaxis = list(title = "Department"),
         yaxis = list(title = "Average Monthly Hours"))

Comparing the boxes shows whether certain departments, such as IT or sales, have consistently higher or lower monthly hours, which would indicate a potential workload imbalance.

A department with a higher median (the central line in its box) is one where the typical employee works more hours per month than in other departments.
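
The medians can also be read off numerically rather than by eye. The following is a minimal sketch, assuming a small made-up data frame `toy_hours` (hypothetical rows that reuse the dataset's column names) so that it runs without the download:

```r
library(dplyr)

# Hypothetical toy rows standing in for hr (same column names)
toy_hours <- data.frame(
  Department = c("IT", "IT", "IT", "sales", "sales", "sales"),
  average_montly_hours = c(150, 210, 260, 140, 190, 200)
)

# Median monthly hours per department, highest first
dept_medians <- toy_hours %>%
  group_by(Department) %>%
  summarize(median_hours = median(average_montly_hours)) %>%
  arrange(desc(median_hours))

dept_medians
```

Swapping `toy_hours` for `hr` would give the values of the central lines in the comparative box plot, up to any difference in how plotly computes quantiles.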

Pie Chart of Frequencies: Attrition by Salary Level

Create a pie chart showing the frequency of employee attrition (left) for each salary category. The title should point out the relationship between salary and attrition.

attrition_data <- hr[hr$left == 1, ]
attrition_by_salary <- as.data.frame(table(attrition_data$salary))
colnames(attrition_by_salary) <- c("salary", "count")
plot_ly(attrition_by_salary, labels = ~salary, values = ~count, type = 'pie') %>%
  layout(title = 'Employee Attrition by Salary: Higher Attrition in Lower Salary Categories')

There is a clear relationship between salary level and attrition: employees in lower salary categories account for more of the turnover, while those in higher salary categories tend to stay with the company longer.

If a large portion of the pie is taken up by the “low” salary category, most of the employees who left were in that category. Because the pie counts only leavers, confirming that employees with lower salaries are more likely to leave also requires comparing each slice against the total number of employees in that salary category.
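
To separate "more leavers" from "more likely to leave", the count of leavers in each salary category can be divided by the number of employees in that category. The following is a minimal sketch, assuming a made-up data frame `toy_staff` (hypothetical rows that reuse the `salary` and `left` column names), not the real figures:

```r
library(dplyr)

# Hypothetical toy rows standing in for hr (left: 1 = left, 0 = stayed)
toy_staff <- data.frame(
  salary = c("low", "low", "low", "low", "medium", "medium", "medium", "high", "high"),
  left   = c(1, 1, 0, 0, 1, 0, 0, 0, 0)
)

# Leavers, headcount, and attrition rate per salary category
attrition_rate <- toy_staff %>%
  group_by(salary) %>%
  summarize(
    leavers   = sum(left),   # what the pie chart counts
    employees = n(),         # denominator the pie chart omits
    rate      = mean(left)   # share of the category that left
  )

attrition_rate
```

Running the same pipeline on `hr` would show whether the rate, and not just the count, is higher in the lower salary categories.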

Bar Plot of Averages: Average Satisfaction by Department

Create a bar plot displaying the average satisfaction_level for each department. The title should highlight a key observation about departmental satisfaction.

avg_satisfaction <- hr %>% group_by(Department) %>% summarize(avg_satisfaction = mean(satisfaction_level))
plot_ly(avg_satisfaction, x = ~Department, y = ~avg_satisfaction, type = 'bar') %>%
  layout(title = 'Average Satisfaction by Department: Key Differences in Satisfaction Levels Across Teams',
         xaxis = list(title = 'Department'),
         yaxis = list(title = 'Average Satisfaction Level'))

Management and IT have the highest average satisfaction levels, suggesting that their employees like their jobs the most.

HR and accounting have lower average satisfaction levels.