Employee job satisfaction is of utmost importance because it influences productivity. Employers globally are recognizing human resources as a key element in determining overall organizational performance. Since performance is a key factor in realizing organizational goals, human resource managers are increasingly concerned about factors that enhance job satisfaction, and hence productivity. The purpose of this study is to identify factors that influence job satisfaction and to examine their relationships with it.

# The data

The data set contains thirteen input variables, including employee demographic data such as gender and age, together with an employee number. Presumptive factors affecting job satisfaction are also included, namely hourly rate, education level, hours worked in a week, job level, performance rating, the number of years in the current role, etc. This study aims at assessing the relationship between these factors and employee job satisfaction.
library(readxl)
Job_data <- read_excel("C:/Users/Christine/Desktop/Rpubs/Job data.xlsx")
attach(Job_data)
head(Job_data, 3)
# A tibble: 3 x 12
`Employee ID` Age Gender `Education Leve~ `Hourly\r\nRate` `Weekly\r\nHour~
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 41 0 2 94 40
2 2 49 1 1 61 47
3 3 37 1 2 92 40
# ... with 6 more variables: `Job\r\nLevel` <dbl>,
# `Last\r\nPerformance\r\nRating` <dbl>, `Years At Company` <dbl>, `Years
# In\r\nCurrent\r\nRole` <dbl>, `Years Since\r\nLast\r\nPromotion` <dbl>,
# `Job\r\nSatisfaction\r\nRating` <dbl>
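The column names printed above contain embedded carriage-return/line-feed pairs (`\r\n`), which makes them awkward to type in later code. The following is a minimal sketch of how whitespace replacement behaves on such names, using two names copied from the output above. Replacing each whitespace character separately turns `\r\n` into a double underscore, which is the naming used in the rest of this analysis; the `\\s+` pattern is shown only as an alternative that collapses each run into a single underscore.

```r
# Two column names as they appear in the tibble header above
raw_names <- c("Hourly\r\nRate", "Years Since\r\nLast\r\nPromotion")

# One underscore per whitespace character: "\r\n" becomes "__"
per_char <- gsub("\\s", "_", raw_names)
print(per_char)

# Alternative (not used in this analysis): collapse each run of whitespace
collapsed <- gsub("\\s+", "_", raw_names)
print(collapsed)
```

Either convention works as long as the same names are used consistently in every later formula and plot.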
library(tidyr)
library(dplyr)
library(ggplot2)
library(ggpubr)
names(Job_data)<-gsub("\\s","_",names(Job_data))
A<-Job_data%>%
ggplot(aes(x=Age, y=Job__Satisfaction__Rating))+geom_point()+geom_smooth(method=lm)
B<-Job_data%>%
ggplot(aes(x=Education_Level, y= Job__Satisfaction__Rating))+geom_point()+geom_smooth(method=lm)
C<-Job_data%>%
ggplot(aes(x=Hourly__Rate,y= Job__Satisfaction__Rating))+geom_point()+geom_smooth(method=lm)
D<-Job_data%>%
ggplot(aes(x=Weekly__Hours__Worked, y=Job__Satisfaction__Rating))+ geom_point()+geom_smooth(method = lm)
E<-Job_data%>%
ggplot(aes(x=Job__Level, y=Job__Satisfaction__Rating))+ geom_point()+geom_smooth(method = lm)
F<-Job_data%>%
ggplot(aes(x= Years_Since__Last__Promotion, y=Job__Satisfaction__Rating))+ geom_point()+geom_smooth(method = lm)
ggarrange(A, B, C, D, E, F, ncol = 3, nrow = 2)
There does not seem to be any clear relationship between age or education level and job satisfaction. On the other hand, hourly rate and the number of hours worked in a week appear to influence job satisfaction positively. The more years since the last promotion, the lower the job satisfaction. The factors appear to have a linear relationship with employee job satisfaction. In order to fit a linear model, we test the linear regression assumptions.
library(tidyverse)
library(caret)
Job_data<-na.omit(Job_data)
sample_n(Job_data,3)
# A tibble: 3 x 12
Employee_ID Age Gender Education_Level Hourly__Rate Weekly__Hours__~
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 35 24 1 3 61 53
2 439 35 1 3 72 45
3 552 39 0 3 179 40
# ... with 6 more variables: Job__Level <dbl>, Last__Performance__Rating <dbl>,
# Years_At_Company <dbl>, Years_In__Current__Role <dbl>,
# Years_Since__Last__Promotion <dbl>, Job__Satisfaction__Rating <dbl>
set.seed(123)
training.samples<-Job_data$Job__Satisfaction__Rating %>%
createDataPartition(p=0.8,list= FALSE)
train.set<-Job_data[training.samples,]
test.set<-Job_data[-training.samples, ]
The data is split into a training set (80%) and a test set (20%): the training set is used to fit the model and the test set to evaluate its predictions.
model_1 <-lm(Job__Satisfaction__Rating ~ Hourly__Rate+ Weekly__Hours__Worked+ Job__Level+Years_Since__Last__Promotion, train.set)
summary(model_1)
Call:
lm(formula = Job__Satisfaction__Rating ~ Hourly__Rate + Weekly__Hours__Worked +
Job__Level + Years_Since__Last__Promotion, data = train.set)
Residuals:
Min 1Q Median 3Q Max
-3.2660 -0.8131 0.0469 0.8305 2.7144
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.177453 0.321358 12.999 < 2e-16 ***
Hourly__Rate 0.028037 0.001306 21.475 < 2e-16 ***
Weekly__Hours__Worked -0.003364 0.006985 -0.482 0.63
Job__Level 0.115363 0.021105 5.466 7.29e-08 ***
Years_Since__Last__Promotion -0.072722 0.015653 -4.646 4.35e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.182 on 497 degrees of freedom
Multiple R-squared: 0.5594, Adjusted R-squared: 0.5559
F-statistic: 157.8 on 4 and 497 DF, p-value: < 2.2e-16
The data is fitted to a linear regression model with hourly rate, weekly hours worked, job level and years since last promotion as input variables. Hourly rate, job level and years since last promotion are significant predictors of job satisfaction (p-values below 0.05), whereas weekly hours worked is not (p = 0.63). The model explains about 56% of the variance in job satisfaction (adjusted R-squared of 0.5559).
predicted.classes<-predict(model_1, test.set)
head(round(predicted.classes))
1 2 3 4 5 6
9 7 5 6 6 7
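Predictions such as those above can be compared with the observed ratings to quantify out-of-sample accuracy. The following is a minimal sketch under stated assumptions: the data frame `sim` is simulated stand-in data with hypothetical coefficients (not the survey data), and the split uses base R sampling rather than `createDataPartition`. It computes the root mean squared error and the squared correlation between observed and predicted ratings on the held-out rows.

```r
# Simulated stand-in data (hypothetical values, not the survey data)
set.seed(123)
n <- 200
sim <- data.frame(
  Hourly__Rate = runif(n, 30, 200),
  Years_Since__Last__Promotion = sample(0:15, n, replace = TRUE)
)
sim$Job__Satisfaction__Rating <- 4 + 0.03 * sim$Hourly__Rate -
  0.07 * sim$Years_Since__Last__Promotion + rnorm(n, sd = 1.2)

# 80/20 split with base R sampling (an assumption; the analysis uses caret)
train_idx <- sample(seq_len(n), size = round(0.8 * n))
train_sim <- sim[train_idx, ]
test_sim  <- sim[-train_idx, ]

# Fit on the training rows only
fit <- lm(Job__Satisfaction__Rating ~ Hourly__Rate + Years_Since__Last__Promotion,
          data = train_sim)

# Predict on the held-out rows and compare with the observed ratings
pred <- predict(fit, newdata = test_sim)
obs  <- test_sim$Job__Satisfaction__Rating

rmse <- sqrt(mean((obs - pred)^2))  # root mean squared error
r2   <- cor(obs, pred)^2            # squared correlation, observed vs predicted
cat("Test RMSE:", round(rmse, 3), " Test R-squared:", round(r2, 3), "\n")
```

A test RMSE close to the residual standard error of the training fit would suggest that the model generalizes to unseen employees about as well as it fits the training data.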
summary(aov(model_1, test.set))
Df Sum Sq Mean Sq F value Pr(>F)
Hourly__Rate 1 201.55 201.55 181.204 < 2e-16 ***
Weekly__Hours__Worked 1 3.88 3.88 3.489 0.06426 .
Job__Level 1 12.01 12.01 10.795 0.00134 **
Years_Since__Last__Promotion 1 4.67 4.67 4.198 0.04268 *
Residuals 118 131.25 1.11
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Model_fit<-lm (Job__Satisfaction__Rating~Hourly__Rate+ Years_Since__Last__Promotion, Job_data)
summary(Model_fit)
Call:
lm(formula = Job__Satisfaction__Rating ~ Hourly__Rate + Years_Since__Last__Promotion,
data = Job_data)
Residuals:
Min 1Q Median 3Q Max
-3.2534 -0.7495 0.0492 0.7989 2.9829
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 4.34070 0.12651 34.310 < 2e-16 ***
Hourly__Rate 0.02994 0.00116 25.806 < 2e-16 ***
Years_Since__Last__Promotion -0.07517 0.01454 -5.168 3.19e-07 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 1.192 on 622 degrees of freedom
Multiple R-squared: 0.5419, Adjusted R-squared: 0.5405
F-statistic: 368 on 2 and 622 DF, p-value: < 2.2e-16
summary(aov(Model_fit))
Df Sum Sq Mean Sq F value Pr(>F)
Hourly__Rate 1 1008.3 1008.3 709.21 < 2e-16 ***
Years_Since__Last__Promotion 1 38.0 38.0 26.71 3.19e-07 ***
Residuals 622 884.3 1.4
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
From the analysis, hourly rate influences job satisfaction positively (p-value < 0.05) and years since last promotion affects it negatively (p-value < 0.05). In this survey, these two are the main factors that influence job satisfaction.
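Whether a dropped predictor such as job level still adds explanatory power can be checked with a partial F-test, provided both models are fitted to the same rows. The following is a minimal sketch, assuming simulated stand-in data (`sim`, with hypothetical coefficients) rather than the survey data; `reduced` and `full` are illustrative names.

```r
# Simulated stand-in data (hypothetical values, not the survey data)
set.seed(7)
n <- 300
sim <- data.frame(
  Hourly__Rate = runif(n, 30, 200),
  Job__Level = sample(1:5, n, replace = TRUE),
  Years_Since__Last__Promotion = sample(0:15, n, replace = TRUE)
)
sim$Job__Satisfaction__Rating <- 4 + 0.03 * sim$Hourly__Rate +
  0.1 * sim$Job__Level - 0.07 * sim$Years_Since__Last__Promotion +
  rnorm(n, sd = 1.2)

# Two-predictor model, and the same model with Job__Level added
reduced <- lm(Job__Satisfaction__Rating ~ Hourly__Rate + Years_Since__Last__Promotion,
              data = sim)
full <- update(reduced, . ~ . + Job__Level)

# Partial F-test: does the extra term reduce the residual sum of squares
# by more than chance alone would explain?
cmp <- anova(reduced, full)
print(cmp)
```

A small p-value in the second row of the table would indicate that the larger model fits noticeably better; a large one would support keeping the simpler model.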
The diagnostic plots below show that both models satisfy the assumptions of linear regression.
plot(model_1)
plot(Model_fit)
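The diagnostic plots can be complemented with a few numeric checks. The following is a rough sketch, assuming simulated stand-in data (`sim`, with hypothetical coefficients) instead of the survey data. It applies the Shapiro-Wilk test to the residuals, looks at whether the residual spread changes with the fitted values, and checks how strongly the two predictors are correlated with each other. These are informal screens and do not replace inspection of the plots.

```r
# Simulated stand-in data (hypothetical values, not the survey data)
set.seed(42)
n <- 300
sim <- data.frame(
  Hourly__Rate = runif(n, 30, 200),
  Years_Since__Last__Promotion = sample(0:15, n, replace = TRUE)
)
sim$Job__Satisfaction__Rating <- 4 + 0.03 * sim$Hourly__Rate -
  0.07 * sim$Years_Since__Last__Promotion + rnorm(n, sd = 1.2)

fit <- lm(Job__Satisfaction__Rating ~ Hourly__Rate + Years_Since__Last__Promotion,
          data = sim)
res <- residuals(fit)

# Normality of residuals: a large p-value gives no evidence against normality
sw <- shapiro.test(res)
print(sw)

# Constant variance: |residuals| should be roughly uncorrelated with fitted values
spread_cor <- cor(abs(res), fitted(fit))
cat("Correlation of |residuals| with fitted values:", round(spread_cor, 3), "\n")

# Collinearity: strongly correlated predictors make coefficients unstable
pred_cor <- cor(sim$Hourly__Rate, sim$Years_Since__Last__Promotion)
cat("Correlation between predictors:", round(pred_cor, 3), "\n")
```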