Employee_Salaries

Load packages

library(tidyverse)

## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --

## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.2     v dplyr   1.0.7
## v tidyr   1.1.3     v stringr 1.4.0
## v readr   1.4.0     v forcats 0.5.1

## Warning: package 'ggplot2' was built under R version 4.1.2

## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

library(ggplot2)

Load the data

salaries <- read_csv("Employee_Salaries_-_2020.csv")

## 
## -- Column specification --------------------------------------------------------
## cols(
##   Department = col_character(),
##   `Department Name` = col_character(),
##   Division = col_character(),
##   Gender = col_character(),
##   `Base Salary` = col_double(),
##   `2020 Overtime Pay` = col_double(),
##   `2020 Longevity Pay` = col_double(),
##   Grade = col_character()
## )

head(salaries)

## # A tibble: 6 x 8
##   Department `Department Name`  Division   Gender `Base Salary` `2020 Overtime ~
##   <chr>      <chr>              <chr>      <chr>          <dbl>            <dbl>
## 1 ABS        Alcohol Beverage ~ Wholesale~ F              78902             199.
## 2 ABS        Alcohol Beverage ~ Administr~ F              35926               0 
## 3 ABS        Alcohol Beverage ~ Administr~ M             167345               0 
## 4 ABS        Alcohol Beverage ~ Wholesale~ F              90848               0 
## 5 ABS        Alcohol Beverage ~ Administr~ F              78902             205.
## 6 ABS        Alcohol Beverage ~ Marketing  F             109761               0 
## # ... with 2 more variables: 2020 Longevity Pay <dbl>, Grade <chr>
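
Before plotting, one way to check whether the loaded data needs cleaning is to count the missing values in each column. The sketch below is only an illustration: `count_missing` is a hypothetical helper name, not part of the original analysis, and it assumes dplyr 1.0 or later for `across()`.

```r
library(dplyr)

# Hypothetical helper: count missing values in every column of a data frame.
# Returns a one-row tibble with one count per column.
count_missing <- function(df) {
  df %>%
    summarise(across(everything(), ~ sum(is.na(.x))))
}

# Usage with the data loaded above:
# count_missing(salaries)
```

A column with a non-zero count would be a candidate for cleaning before it is plotted.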

Make a histogram of 2020 overtime pay by grade

p1 <- salaries %>%
  # Backticks refer to the column; a quoted string would be treated as a constant
  ggplot(aes(x = `2020 Overtime Pay`, fill = Grade)) +
  # The default stat ("bin") is kept so that binwidth takes effect
  geom_histogram(position = "identity", alpha = 0.5, binwidth = 50, color = "white") +
  # The fill aesthetic is Grade, so the legend is titled accordingly
  scale_fill_discrete(name = "Grade")

p1

This dataset is from https://data.montgomerycountymd.gov/Human-Resources/Employee-Salaries-2020/he7s-ebwb/data. It gives annual salary information, including gross pay and overtime pay, for all active, permanent employees of Montgomery County, paid in calendar year 2020.
The topic of this dataset is employee salaries by division. Its variables are department, department name, division, gender, base salary, overtime pay, longevity pay, and grade. Base salary, overtime pay, and longevity pay are quantitative; the others are categorical. The dataset I used did not need cleaning. In the visualization, color delineates the pay grades, and the count is the number of workers paid. Overall, from analyzing this dataset, more people work in the Marketing division than in Wholesale Administration. I could not make a plot of base pay by division because so many people work in each division.
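
The base-pay-by-division plot mentioned above might become manageable if the data is summarised first, so that each division contributes one bar instead of one mark per employee. The following is a minimal sketch under stated assumptions, not the original analysis: the function names `summarise_base_by_division` and `plot_base_by_division` are hypothetical, the choice of the median and of the 10 largest divisions is arbitrary, and the column names `Division` and `Base Salary` are taken from the column specification printed earlier.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: median base salary per division,
# keeping only the n_top divisions with the most employees.
summarise_base_by_division <- function(df, n_top = 10) {
  df %>%
    group_by(Division) %>%
    summarise(
      employees   = n(),
      median_base = median(`Base Salary`, na.rm = TRUE),
      .groups     = "drop"
    ) %>%
    slice_max(employees, n = n_top, with_ties = FALSE)
}

# Hypothetical helper: one bar per division, sorted by median base salary.
plot_base_by_division <- function(summary_df) {
  ggplot(summary_df, aes(x = reorder(Division, median_base), y = median_base)) +
    geom_col() +
    coord_flip() +
    labs(x = "Division", y = "Median base salary")
}

# Usage with the data loaded above:
# salaries %>% summarise_base_by_division() %>% plot_base_by_division()
```

Adding `Gender` to `group_by()` and mapping it to `fill` with `position = "dodge"` would be one way to compare salaries by gender within each division.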

Preethi

2022-03-19
