Introduction to the HR Dataset - Version 14

The HR Dataset was created by Drs. Rich Huebner and Carla Patalano to accompany a case study written for graduate HR students studying HR metrics, measurement, and analytics. The students use Tableau data visualization software to uncover insights about the case. This is a synthetic data set created specifically to go along with the case study (proprietary to the college where we teach).

Every year or so, we update the data set to include additional columns, and to make slight changes to the underlying data. In this version, we add several new features to the data set:

We did not remove any fields for this version.

Data Dictionary

Feature | Description | DataType
Employee Name | Employee’s full name | Text
EmpID | Employee ID is unique to each employee | Text
MarriedID | Is the person married (1 or 0 for yes or no) | Binary
MaritalStatusID | Marital status code that matches the text field MaritalDesc | Integer
EmpStatusID | Employment status code that matches text field EmploymentStatus | Integer
DeptID | Department ID code that matches the department the employee works in | Integer
PerfScoreID | Performance Score code that matches the employee’s most recent performance score | Integer
FromDiversityJobFairID | Was the employee sourced from the Diversity job fair? 1 or 0 for yes or no | Binary
PayRate | The person’s hourly pay rate. All salaries are converted to hourly pay rate | Float
Termd | Has this employee been terminated - 1 or 0 | Binary
PositionID | An integer indicating the person’s position | Integer
Position | The text name/title of the position the person has | Text
State | The state that the person lives in | Text
Zip | The zip code for the employee | Text
DOB | Date of Birth for the employee | Date
Sex | Sex - M or F | Text
MaritalDesc | The marital status of the person (divorced, single, widowed, separated, etc.) | Text
CitizenDesc | Label for whether the person is a Citizen or Eligible NonCitizen | Text
HispanicLatino | Yes or No field for whether the employee is Hispanic/Latino | Text
RaceDesc | Description/text of the race the person identifies with | Text
DateofHire | Date the person was hired | Date
DateofTermination | Date the person was terminated, only populated if, in fact, Termd = 1 | Date
TermReason | A text reason / description for why the person was terminated | Text
EmploymentStatus | A description/category of the person’s employment status. Anyone currently working full time = Active | Text
Department | Name of the department that the person works in | Text
ManagerName | The name of the person’s immediate manager | Text
ManagerID | A unique identifier for each manager | Integer
RecruitmentSource | The name of the recruitment source where the employee was recruited from | Text
PerformanceScore | Performance Score text/category (Fully Meets, Partially Meets, PIP, Exceeds) | Text
EngagementSurvey | Results from the last engagement survey, managed by our external partner | Float
EmpSatisfaction | A basic satisfaction score between 1 and 5, as reported on a recent employee satisfaction survey | Integer
SpecialProjectsCount | The number of special projects that the employee worked on during the last 6 months | Integer
LastPerformanceReviewDate | The most recent date of the person’s last performance review | Date
DaysLateLast30 | The number of times that the employee was late to work during the last 30 days | Integer
Absences | The number of times the employee was absent from work | Integer
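
Several rules in the dictionary can be checked directly against the data: DateofTermination should only be populated when Termd = 1, and each ID code should agree with its text field. The following is a minimal sketch of such a check in R. The three rows are made-up stand-ins, not records from the data set, and the names `has_term_date`, `term_mismatch`, and `married_mismatch` are hypothetical helpers introduced only for illustration.

```r
# Toy rows standing in for the real data; all values are illustrative only.
hr <- data.frame(
  EmpID             = c(1, 2, 3),
  Termd             = c(0L, 1L, 1L),
  DateofTermination = c("", "3/30/2016", ""),
  MarriedID         = c(0L, 1L, 1L),
  MaritalDesc       = c("Single", "Married", "Married"),
  stringsAsFactors  = FALSE
)

# Rule from the dictionary: a termination date is present only if Termd = 1.
has_term_date <- !is.na(hr$DateofTermination) & nzchar(hr$DateofTermination)
term_mismatch <- hr[(hr$Termd == 1) != has_term_date, ]

# Assumed rule: MarriedID = 1 should correspond to MaritalDesc == "Married".
married_mismatch <- hr[(hr$MarriedID == 1) != (hr$MaritalDesc == "Married"), ]

print(term_mismatch$EmpID)     # employees whose flag and date disagree
print(nrow(married_mismatch))  # number of rows where ID and text disagree
```

In the toy data, the third row is flagged because it is marked as terminated but has no termination date. The same pattern could be applied to the other ID and text pairs, such as DeptID and Department.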

Structure of HR Data Set

## 'data.frame':    311 obs. of  36 variables:
##  $ ï..Employee_Name          : Factor w/ 311 levels "Adinolfi, Wilson  K",..: 1 2 3 4 5 6 7 8 9 10 ...
##  $ EmpID                     : int  10026 10084 10196 10088 10069 10002 10194 10062 10114 10250 ...
##  $ MarriedID                 : int  0 1 1 1 0 0 0 0 0 0 ...
##  $ MaritalStatusID           : int  0 1 1 1 2 0 0 4 0 2 ...
##  $ GenderID                  : int  1 1 0 0 0 0 0 1 0 1 ...
##  $ EmpStatusID               : int  1 5 5 1 5 1 1 1 3 1 ...
##  $ DeptID                    : int  5 3 5 5 5 5 4 5 5 3 ...
##  $ PerfScoreID               : int  4 3 3 3 3 4 3 3 3 3 ...
##  $ FromDiversityJobFairID    : int  0 0 0 0 0 0 0 0 1 0 ...
##  $ Salary                    : int  62506 104437 64955 64991 50825 57568 95660 59365 47837 50178 ...
##  $ Termd                     : int  0 1 1 0 1 0 0 0 0 0 ...
##  $ PositionID                : int  19 27 20 19 19 19 24 19 19 14 ...
##  $ Position                  : Factor w/ 32 levels "Accountant I",..: 23 31 24 23 23 23 28 23 23 18 ...
##  $ State                     : Factor w/ 28 levels "AL","AZ","CA",..: 11 11 11 11 11 11 11 11 11 11 ...
##  $ Zip                       : int  1960 2148 1810 1886 2169 1844 2110 2199 1902 1886 ...
##  $ DOB                       : Factor w/ 307 levels "01/02/51","01/04/64",..: 159 95 225 234 216 120 123 31 27 7 ...
##  $ Sex                       : Factor w/ 2 levels "F","M ": 2 2 1 1 1 1 1 2 1 2 ...
##  $ MaritalDesc               : Factor w/ 5 levels "Divorced","Married",..: 4 2 2 2 1 4 4 5 4 1 ...
##  $ CitizenDesc               : Factor w/ 3 levels "Eligible NonCitizen",..: 3 3 3 3 3 3 3 3 3 3 ...
##  $ HispanicLatino            : Factor w/ 4 levels "no","No","yes",..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ RaceDesc                  : Factor w/ 6 levels "American Indian or Alaska Native",..: 6 6 6 6 6 6 6 6 3 6 ...
##  $ DateofHire                : Factor w/ 101 levels "1/10/2011","1/20/2013",..: 76 38 76 10 71 15 22 97 78 7 ...
##  $ DateofTermination         : Factor w/ 97 levels "","1/11/2014",..: 1 57 86 1 96 1 1 1 1 1 ...
##  $ TermReason                : Factor w/ 18 levels "Another position",..: 12 3 6 12 17 12 12 12 12 12 ...
##  $ EmploymentStatus          : Factor w/ 3 levels "Active","Terminated for Cause",..: 1 3 3 1 3 1 1 1 1 1 ...
##  $ Department                : Factor w/ 6 levels "Admin Offices",..: 4 3 4 4 4 4 6 4 4 3 ...
##  $ ManagerName               : Factor w/ 21 levels "Alex Sweetwater",..: 18 20 16 9 21 2 1 15 5 19 ...
##  $ ManagerID                 : int  22 4 20 16 39 11 10 19 12 7 ...
##  $ RecruitmentSource         : Factor w/ 9 levels "CareerBuilder",..: 6 5 6 5 4 6 6 3 2 5 ...
##  $ PerformanceScore          : Factor w/ 4 levels "Exceeds","Fully Meets",..: 1 2 2 2 2 1 2 2 2 2 ...
##  $ EngagementSurvey          : num  4.6 4.96 3.02 4.84 5 5 3.04 5 4.46 5 ...
##  $ EmpSatisfaction           : int  5 3 3 5 4 5 3 4 3 5 ...
##  $ SpecialProjectsCount      : int  0 6 0 0 0 0 4 0 0 6 ...
##  $ LastPerformanceReview_Date: Factor w/ 137 levels "1/10/2013","1/10/2015",..: 14 66 118 29 44 39 19 67 25 58 ...
##  $ DaysLateLast30            : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Absences                  : int  1 17 3 15 2 15 19 19 4 16 ...
## NULL
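
The structure above shows a few import quirks worth cleaning before analysis: the first column name carries a stray `ï..` prefix (typically the sign of a UTF-8 byte-order mark at the start of the CSV file), the Sex level `"M "` has a trailing space, HispanicLatino mixes `"no"` and `"No"`, and the date columns are read as factors. The following is a minimal sketch of one way to handle these in R. It writes a tiny made-up CSV to a temporary file instead of assuming the real file name, so the rows and the `csv_text` and `tmp` names are illustrative assumptions only.

```r
# Write a tiny stand-in CSV that starts with a UTF-8 byte-order mark (BOM).
# The rows are made up; only the column names and quirks mirror the str() output.
csv_text <- paste(
  "Employee_Name,Sex,HispanicLatino,DateofHire,DateofTermination",
  "\"Doe, Jane\",F,No,7/5/2011,",
  "\"Roe, John\",M ,no,1/10/2011,6/16/2016",
  sep = "\n"
)
tmp <- tempfile(fileext = ".csv")
writeBin(c(as.raw(c(0xEF, 0xBB, 0xBF)), charToRaw(paste0(csv_text, "\n"))), tmp)

# "UTF-8-BOM" asks R to drop the BOM so the first column name is read cleanly.
# Reading everything as character avoids factors and accidental type guesses.
hr <- read.csv(tmp, fileEncoding = "UTF-8-BOM", colClasses = "character")

# Remove stray whitespace and normalize inconsistent capitalization.
hr$Sex <- trimws(hr$Sex)
hr$HispanicLatino <- tolower(trimws(hr$HispanicLatino))

# Parse month/day/four-digit-year strings; empty strings become NA.
hr$DateofHire <- as.Date(hr$DateofHire, format = "%m/%d/%Y")
hr$DateofTermination <- as.Date(hr$DateofTermination, format = "%m/%d/%Y")

str(hr)
unlink(tmp)
```

DOB is left out of this sketch on purpose: its values use two-digit years (for example `"01/02/51"`), and R's `%y` format maps 00 to 68 onto 20xx, so birth years would need an explicit century correction.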

Sample Visualizations using ggplot2 and ggthemes