library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v purrr 0.3.4
## v tibble 3.1.2 v dplyr 1.0.7
## v tidyr 1.1.3 v stringr 1.4.0
## v readr 1.4.0 v forcats 0.5.1
## Warning: package 'ggplot2' was built under R version 4.1.2
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(ggplot2)
setwd("C:/Users/P_Ath/Desktop/DATA 110")
salaries <- read_csv("Employee_Salaries_-_2020.csv")
##
## -- Column specification --------------------------------------------------------
## cols(
## Department = col_character(),
## `Department Name` = col_character(),
## Division = col_character(),
## Gender = col_character(),
## `Base Salary` = col_double(),
## `2020 Overtime Pay` = col_double(),
## `2020 Longevity Pay` = col_double(),
## Grade = col_character()
## )
head(salaries)
## # A tibble: 6 x 8
## Department `Department Name` Division Gender `Base Salary` `2020 Overtime ~
## <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 ABS Alcohol Beverage ~ Wholesale~ F 78902 199.
## 2 ABS Alcohol Beverage ~ Administr~ F 35926 0
## 3 ABS Alcohol Beverage ~ Administr~ M 167345 0
## 4 ABS Alcohol Beverage ~ Wholesale~ F 90848 0
## 5 ABS Alcohol Beverage ~ Administr~ F 78902 205.
## 6 ABS Alcohol Beverage ~ Marketing F 109761 0
## # ... with 2 more variables: 2020 Longevity Pay <dbl>, Grade <chr>
# Histogram of 2020 overtime pay, with bars colored by pay grade.
# The column name needs backticks; in quotes it is read as a constant string.
p1 <- salaries %>%
  ggplot(aes(x = `2020 Overtime Pay`, fill = Grade)) +
  geom_histogram(position = "identity", alpha = 0.5, binwidth = 50, color = "white") +
  # Title the legend after the variable actually mapped to fill (Grade).
  scale_fill_discrete(name = "Grade")
p1
This dataset is from https://data.montgomerycountymd.gov/Human-Resources/Employee-Salaries-2020/he7s-ebwb/data.
The dataset gives annual salary information, including gross pay and
overtime pay, for all active, permanent employees of Montgomery County
paid in calendar year 2020.
The topic of this dataset is employee salaries by division. The
variables are department, department name, division, gender, base
salary, overtime pay, longevity pay, and grade. Base salary, overtime
pay, and longevity pay are quantitative; the other five are categorical.
The dataset did not need cleaning. In the visualization, the fill color
distinguishes the pay grades, and the count is the number of employees.
Overall, the data suggest that more people work in the Marketing
division than in Wholesale Administration. I could not make a plot of
base pay by division because so many people work in each division.
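One possible way around the plotting problem described above is to collapse each division to a single summary row before plotting, so the number of people per division no longer matters. The sketch below is only an illustration: `toy_salaries` is a made-up stand-in for `salaries`, and the names `division_summary`, `employees`, `median_base_salary`, and `p2` are hypothetical. The median is one reasonable choice of summary, not necessarily the best one for this data.

```r
library(dplyr)
library(ggplot2)

# Made-up stand-in for the real salaries data (values are invented).
toy_salaries <- tibble(
  Division = c("Marketing", "Marketing", "Marketing",
               "Wholesale Administration", "Wholesale Administration"),
  `Base Salary` = c(60000, 70000, 80000, 50000, 60000)
)

# Collapse each division to one row: head count and median base salary.
division_summary <- toy_salaries %>%
  group_by(Division) %>%
  summarise(
    employees = n(),
    median_base_salary = median(`Base Salary`),
    .groups = "drop"
  )

# One bar per division, ordered by median base salary.
p2 <- ggplot(division_summary,
             aes(x = reorder(Division, median_base_salary),
                 y = median_base_salary)) +
  geom_col() +
  coord_flip() +
  labs(x = "Division", y = "Median base salary")
```

With the real data, replacing `toy_salaries` with `salaries` should give one bar per division, and the `employees` column would also let the head counts of two divisions be compared directly.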