The objective of this assignment is to predict job satisfaction from the other variables in the dataset. I will do this by visualizing the data, then classifying satisfaction with KNN and with Naive Bayes.
The dataset includes information on employees from different departments and factors that might affect job satisfaction. It has details like years of experience, communication, recognition, training, working conditions, tools, and work-life balance, along with job satisfaction scores. The data covers departments like Administrative, Maintenance, Management, Production, QC, and SR, making it possible to compare satisfaction across roles. Analyzing this can help find patterns and understand what influences job satisfaction.
Department | Years | Ideas | Communication | Recognition | Training | Conditions | Tools | Balance | Satisfaction |
---|---|---|---|---|---|---|---|---|---|
Administrative | 16 | 2 | 3 | 2 | 2 | 4 | 5 | 2 | 3 |
Administrative | 2 | 4 | 4 | 3 | 4 | 4 | 5 | 3 | 9 |
Administrative | 14 | 4 | 3 | 2 | 2 | 5 | 5 | 5 | 6 |
Maintenance | 17 | 5 | 4 | 3 | 5 | 5 | 5 | 3 | 8 |
Maintenance | 15 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | 9 |
The table above previews the first five rows of the dataset, which mixes categorical and numerical data. The Department column contains text values such as “Administrative,” “Maintenance,” and “Production,” and is treated as a categorical variable. The remaining columns (Years, Ideas, Communication, Recognition, Training, Conditions, Tools, Balance, and Satisfaction) are stored as numbers. Years counts years of experience, while the other columns hold ratings or scores that represent rankings rather than continuous measurements, so they are best regarded as ordinal data.
Department | Years | Ideas | Communication | Recognition | Training | Conditions | Tools | Balance | Satisfaction |
---|---|---|---|---|---|---|---|---|---|
Length:32 | Min. : 1.000 | Min. :2.000 | Min. :2.000 | Min. :1.000 | Min. :2.000 | Min. :2.000 | Min. :3.000 | Min. :2.000 | Min. : 3.000 |
Class :character | 1st Qu.: 2.750 | 1st Qu.:3.000 | 1st Qu.:3.000 | 1st Qu.:2.000 | 1st Qu.:3.000 | 1st Qu.:3.000 | 1st Qu.:4.000 | 1st Qu.:3.000 | 1st Qu.: 5.000 |
Mode :character | Median : 7.000 | Median :4.000 | Median :4.000 | Median :3.000 | Median :4.000 | Median :4.000 | Median :5.000 | Median :3.500 | Median : 7.000 |
NA | Mean : 9.219 | Mean :3.656 | Mean :3.688 | Mean :3.156 | Mean :3.625 | Mean :3.844 | Mean :4.656 | Mean :3.625 | Mean : 6.844 |
NA | 3rd Qu.:15.000 | 3rd Qu.:5.000 | 3rd Qu.:4.000 | 3rd Qu.:4.000 | 3rd Qu.:4.250 | 3rd Qu.:5.000 | 3rd Qu.:5.000 | 3rd Qu.:5.000 | 3rd Qu.: 8.250 |
NA | Max. :32.000 | Max. :5.000 | Max. :5.000 | Max. :5.000 | Max. :5.000 | Max. :5.000 | Max. :5.000 | Max. :5.000 | Max. :10.000 |
The summary of the dataset reveals employee ratings across various work-related factors, such as communication, recognition, and training. The Department column consists of categorical data with 32 entries from different departments. Numerical columns like Years range from 1 to 32, with an average of about 9.22 years of experience. Ratings for work factors, on a scale of 1 to 5, show that most employees report moderately positive experiences. Satisfaction, with a mean of 6.84 and a maximum of 10, indicates generally positive feedback. Other factors, like Tools and Conditions, have higher ratings, with employees feeling well-equipped and satisfied with their working conditions.
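The Years figures in the summary can be checked by hand. The sketch below is a Python cross-check, not the code that produced the report. The `years` list is transcribed from the Years column of the 32-row table that appears in the KNN section, and the `inclusive` quantile method is chosen on the assumption that it matches the interpolation used for the summary.

```python
import statistics

# Years of experience for all 32 employees, transcribed from the full data table.
years = [16, 2, 14, 17, 15, 1, 3, 3, 16, 15, 13, 3, 6, 1, 3, 2,
         3, 2, 2, 15, 5, 8, 17, 15, 5, 1, 11, 21, 8, 32, 2, 18]

# Quartiles using linear interpolation between neighbouring sorted values.
q1, median, q3 = statistics.quantiles(years, n=4, method="inclusive")
mean = statistics.fmean(years)

print(min(years), q1, median, mean, q3, max(years))
```

This gives a minimum of 1, quartiles of 2.75, 7, and 15, a mean of 9.21875, and a maximum of 32, in line with the Years column of the summary.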
The data has no missing values. It includes a good number of predictors, although several factors that likely affect satisfaction are not recorded: health and safety, company values, relationship with managers, pay scale, career development, and challenges.
This histogram displays the distribution of job satisfaction scores on a scale from 1 (least satisfied) to 10 (most satisfied); the observed scores run from 3 to 10. The distribution is left-skewed: most scores cluster between 7 and 8, so a large share of employees are generally satisfied with their jobs, while only a few report very low satisfaction.
The box plot of job satisfaction by department provides insight into how satisfaction levels vary across different roles. It highlights that QC and Maintenance have the highest overall job satisfaction scores, indicating that employees in these departments tend to be more satisfied with their work environment. Additionally, Management has the highest median satisfaction level, suggesting that, on average, management employees report greater satisfaction compared to other departments. This visualization helps identify trends in job satisfaction across various roles, which can be useful for targeted improvements.
As an employee receives more recognition, their satisfaction level goes up, which is expected.
This correlation heatmap illustrates the strength and direction of relationships among different variables. A key takeaway is that there is a positive relationship between satisfaction and recognition, meaning that as recognition increases, so does job satisfaction. On the other hand, the relationship between satisfaction and tools shows little correlation, suggesting that the availability of tools does not have a strong impact on job satisfaction. This visualization helps identify which factors are most closely related to satisfaction.
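Each cell of a correlation heatmap is a single correlation coefficient. As a minimal sketch, assuming the heatmap uses Pearson correlation, the Python function below computes that coefficient. The data are only the five preview rows shown earlier, so the numbers illustrate the calculation and do not reproduce the heatmap; the function name `pearson` is introduced here for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return math.nan  # undefined when one column is constant
    return cov / math.sqrt(var_x * var_y)

# The five preview rows only (not the full dataset).
recognition = [2, 3, 2, 3, 5]
tools = [5, 5, 5, 5, 5]
satisfaction = [3, 9, 6, 8, 9]

r_recognition = pearson(recognition, satisfaction)
r_tools = pearson(tools, satisfaction)
print(r_recognition, r_tools)
```

On these five rows the recognition coefficient is positive. Tools is rated 5 in every preview row, so its coefficient is undefined here and the function returns NaN.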
This scatter plot shows a negative linear relationship between years of experience and job satisfaction. This finding was surprising, as I initially assumed that gaining experience would lead to higher comfort and satisfaction in the workplace. However, the plot suggests the opposite: as employees gain more experience, their satisfaction tends to decrease. This could indicate that with more experience, employees develop higher expectations and may become dissatisfied with what previously satisfied them. This trend challenges the assumption that experience always leads to increased job satisfaction.
The higher the communication level, the more satisfied an employee is, which is expected.
The higher the contribution level, the more satisfied the employee is, which is expected.
As an employee's tools rating increases, so does their satisfaction.
The more training an employee has, the more satisfied they are.
This analysis evaluates a KNN model’s performance in predicting job satisfaction levels (Low, Medium, High) based on various workplace factors. The model’s accuracy, precision, and balanced accuracy are assessed to determine its effectiveness, with a focus on identifying potential weaknesses in classifying Low satisfaction cases.
Department | Years | Ideas | Communication | Recognition | Training | Conditions | Tools | Balance | Satisfaction |
---|---|---|---|---|---|---|---|---|---|
Administrative | 16 | 2 | 3 | 2 | 2 | 4 | 5 | 2 | Low |
Administrative | 2 | 4 | 4 | 3 | 4 | 4 | 5 | 3 | High |
Administrative | 14 | 4 | 3 | 2 | 2 | 5 | 5 | 5 | Medium |
Maintenance | 17 | 5 | 4 | 3 | 5 | 5 | 5 | 3 | High |
Maintenance | 15 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | High |
Management | 1 | 5 | 4 | 4 | 3 | 5 | 3 | 5 | High |
Management | 3 | 3 | 4 | 3 | 3 | 4 | 5 | 5 | High |
Management | 3 | 2 | 2 | 2 | 2 | 3 | 5 | 3 | Low |
Production | 16 | 2 | 3 | 2 | 4 | 4 | 4 | 2 | Medium |
Production | 15 | 2 | 3 | 1 | 4 | 4 | 4 | 2 | Medium |
Production | 13 | 3 | 3 | 3 | 4 | 4 | 4 | 3 | High |
Production | 3 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | High |
Production | 6 | 2 | 2 | 1 | 3 | 3 | 4 | 2 | Medium |
Production | 1 | 5 | 4 | 4 | 3 | 4 | 5 | 5 | High |
Production | 3 | 3 | 4 | 3 | 4 | 5 | 5 | 4 | Medium |
Production | 2 | 4 | 4 | 4 | 4 | 5 | 5 | 5 | High |
Production | 3 | 3 | 4 | 3 | 3 | 2 | 4 | 4 | Medium |
Production | 2 | 4 | 3 | 4 | 3 | 3 | 4 | 4 | Medium |
Production | 2 | 4 | 5 | 4 | 4 | 4 | 4 | 4 | High |
Production | 15 | 5 | 4 | 3 | 4 | 3 | 5 | 3 | Medium |
Production | 5 | 4 | 5 | 3 | 2 | 3 | 5 | 4 | Medium |
Production | 8 | 5 | 5 | 3 | 5 | 3 | 5 | 3 | High |
Production | 17 | 4 | 3 | 4 | 3 | 3 | 5 | 2 | Medium |
Production | 15 | 5 | 3 | 4 | 5 | 5 | 5 | 5 | High |
Production | 5 | 2 | 4 | 2 | 2 | 2 | 5 | 3 | Low |
QC | 1 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | High |
QC | 11 | 3 | 4 | 4 | 4 | 5 | 5 | 2 | Medium |
SR | 21 | 3 | 2 | 2 | 3 | 2 | 4 | 3 | Medium |
SR | 8 | 3 | 2 | 2 | 2 | 2 | 4 | 2 | Medium |
SR | 32 | 2 | 3 | 2 | 4 | 2 | 5 | 3 | Medium |
SR | 2 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | High |
SR | 18 | 4 | 4 | 4 | 5 | 5 | 5 | 5 | High |
The Job dataset for KNN consists of data collected from employees across various departments, with the goal of predicting job satisfaction. The dataset includes columns for different features that may impact satisfaction, such as Years of experience, Ideas, Communication, Recognition, Training, Conditions, Tools, and Work-Life Balance. The Satisfaction column is the target variable, with three levels: Low, Medium, and High.
The dataset covers several departments, including Administrative, Maintenance, Management, Production, QC, and SR, providing a diverse set of roles. It contains 32 rows of employee data, with satisfaction levels distributed across the dataset, reflecting various combinations of job-related factors.
This dataset is used to train a KNN model, with the goal of understanding how these factors influence job satisfaction across different departments.
## k-Nearest Neighbors
##
## 32 samples
## 9 predictor
## 3 classes: 'Low', 'Medium', 'High'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold, repeated 3 times)
## Summary of sample sizes: 29, 29, 28, 29, 28, 30, ...
## Resampling results across tuning parameters:
##
## k Accuracy Kappa
## 5 0.7100000 0.4744781
## 7 0.6544444 0.3883502
## 9 0.5516667 0.1687205
##
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was k = 5.
Cross-validated accuracy was highest at k = 5 (0.71), so that value was used for the final model.
The plot shows the elbow at k = 7.
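To make the role of k concrete, here is a minimal Python sketch of the KNN voting rule. It is not the model fitted above: it uses only eight rows from the table, drops Department, measures unscaled Euclidean distance, and the function name `knn_predict` is hypothetical. The query is another row from the table (Production, 15 years), whose recorded class is Medium.

```python
import math
from collections import Counter

# Eight rows from the 32-row table: (Years, Ideas, Communication, Recognition,
# Training, Conditions, Tools, Balance) and the Satisfaction class.
# Department is left out, which is a simplification made for this sketch.
train = [
    ((16, 2, 3, 2, 2, 4, 5, 2), "Low"),
    ((2, 4, 4, 3, 4, 4, 5, 3), "High"),
    ((14, 4, 3, 2, 2, 5, 5, 5), "Medium"),
    ((17, 5, 4, 3, 5, 5, 5, 3), "High"),
    ((3, 2, 2, 2, 2, 3, 5, 3), "Low"),
    ((16, 2, 3, 2, 4, 4, 4, 2), "Medium"),
    ((3, 5, 5, 5, 5, 5, 5, 5), "High"),
    ((6, 2, 2, 1, 3, 3, 4, 2), "Medium"),
]

def knn_predict(train, query, k):
    """Return the majority class among the k training rows closest to query."""
    by_distance = sorted(train, key=lambda row: math.dist(row[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    # On a tied vote, most_common returns the class that was counted first,
    # which is the class of the nearer neighbour.
    return votes.most_common(1)[0][0]

query = (15, 2, 3, 1, 4, 4, 4, 2)
prediction = knn_predict(train, query, k=3)
print(prediction)
```

The three nearest rows are Medium, Low, and Medium, so the vote is Medium. Most of each distance comes from Years, because Years spans 1 to 32 while the ratings span at most 1 to 5. The model summary above reports "No pre-processing", so the same effect may be present there.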
Department | Years | Ideas | Communication | Recognition | Training | Conditions | Tools | Balance | Satisfaction | prediction |
---|---|---|---|---|---|---|---|---|---|---|
Administrative | 16 | 2 | 3 | 2 | 2 | 4 | 5 | 2 | Low | Medium |
Administrative | 2 | 4 | 4 | 3 | 4 | 4 | 5 | 3 | High | High |
Administrative | 14 | 4 | 3 | 2 | 2 | 5 | 5 | 5 | Medium | Medium |
Maintenance | 17 | 5 | 4 | 3 | 5 | 5 | 5 | 3 | High | High |
Maintenance | 15 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | High | High |
Management | 1 | 5 | 4 | 4 | 3 | 5 | 3 | 5 | High | High |
Management | 3 | 3 | 4 | 3 | 3 | 4 | 5 | 5 | High | High |
Management | 3 | 2 | 2 | 2 | 2 | 3 | 5 | 3 | Low | Medium |
Production | 16 | 2 | 3 | 2 | 4 | 4 | 4 | 2 | Medium | Medium |
Production | 15 | 2 | 3 | 1 | 4 | 4 | 4 | 2 | Medium | Medium |
Production | 13 | 3 | 3 | 3 | 4 | 4 | 4 | 3 | High | Medium |
Production | 3 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | High | High |
Production | 6 | 2 | 2 | 1 | 3 | 3 | 4 | 2 | Medium | Medium |
Production | 1 | 5 | 4 | 4 | 3 | 4 | 5 | 5 | High | High |
Production | 3 | 3 | 4 | 3 | 4 | 5 | 5 | 4 | Medium | High |
Production | 2 | 4 | 4 | 4 | 4 | 5 | 5 | 5 | High | High |
Production | 3 | 3 | 4 | 3 | 3 | 2 | 4 | 4 | Medium | Medium |
Production | 2 | 4 | 3 | 4 | 3 | 3 | 4 | 4 | Medium | High |
Production | 2 | 4 | 5 | 4 | 4 | 4 | 4 | 4 | High | High |
Production | 15 | 5 | 4 | 3 | 4 | 3 | 5 | 3 | Medium | High |
Production | 5 | 4 | 5 | 3 | 2 | 3 | 5 | 4 | Medium | Medium |
Production | 8 | 5 | 5 | 3 | 5 | 3 | 5 | 3 | High | Medium |
Production | 17 | 4 | 3 | 4 | 3 | 3 | 5 | 2 | Medium | Medium |
Production | 15 | 5 | 3 | 4 | 5 | 5 | 5 | 5 | High | High |
Production | 5 | 2 | 4 | 2 | 2 | 2 | 5 | 3 | Low | Medium |
QC | 1 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | High | High |
QC | 11 | 3 | 4 | 4 | 4 | 5 | 5 | 2 | Medium | Medium |
SR | 21 | 3 | 2 | 2 | 3 | 2 | 4 | 3 | Medium | Medium |
SR | 8 | 3 | 2 | 2 | 2 | 2 | 4 | 2 | Medium | Medium |
SR | 32 | 2 | 3 | 2 | 4 | 2 | 5 | 3 | Medium | Medium |
SR | 2 | 5 | 5 | 5 | 5 | 5 | 5 | 5 | High | High |
SR | 18 | 4 | 4 | 4 | 5 | 5 | 5 | 5 | High | High |
The model appears unable to predict Low values, while it predicts Medium and High values more accurately.
## Confusion Matrix and Statistics
##
## Reference
## Prediction Low Medium High
## Low 0 0 0
## Medium 3 11 2
## High 0 3 13
##
## Overall Statistics
##
## Accuracy : 0.75
## 95% CI : (0.566, 0.8854)
## No Information Rate : 0.4688
## P-Value [Acc > NIR] : 0.001154
##
## Kappa : 0.5429
##
## Mcnemar's Test P-Value : NA
##
## Statistics by Class:
##
## Class: Low Class: Medium Class: High
## Sensitivity 0.00000 0.7857 0.8667
## Specificity 1.00000 0.7222 0.8235
## Pos Pred Value NaN 0.6875 0.8125
## Neg Pred Value 0.90625 0.8125 0.8750
## Prevalence 0.09375 0.4375 0.4688
## Detection Rate 0.00000 0.3438 0.4062
## Detection Prevalence 0.00000 0.5000 0.5000
## Balanced Accuracy 0.50000 0.7540 0.8451
The confusion matrix shows that the model is 75% accurate overall and predicts most satisfaction levels correctly. The Kappa of 0.5429 indicates moderate agreement between predicted and actual values.
Sensitivity: The sensitivity for Low is 0, which means the model did not correctly identify any of the three employees in this category. It correctly identified 78.57% of the Medium cases (sensitivity 0.7857) and 86.67% of the High cases (sensitivity 0.8667).
Specificity: The model identified every non-Low case as not Low, but only because it never classified any case as Low. It correctly identified 72.22% of non-Medium cases as not Medium and 82.35% of non-High cases as not High.
Precision: When the model predicts Medium it is correct 68.75% of the time, and when it predicts High it is correct 81.25% of the time. It never predicts Low, so its precision for Low is NaN.
Negative Predictive Value (NPV): When the model predicts that a case is not Low, it is correct 90.63% of the time; for not Medium the figure is 81.25%, and for not High it is 87.5%.
Balanced Accuracy: The model performs best for High (84.51%), followed by Medium (75.4%). Low has a balanced accuracy of 50%, which reflects the model's failure to detect Low satisfaction.
Overall, the model performs well for High and Medium satisfaction but completely fails to predict Low satisfaction, indicating a need for better class balance or feature adjustments.
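The statistics discussed above all follow from the nine counts in the confusion matrix. The Python sketch below recomputes them from those counts as a cross-check. The helper names `overall` and `by_class` are hypothetical, and each class is treated one-vs-rest, which is an assumption about how the per-class figures were produced.

```python
import math

labels = ["Low", "Medium", "High"]
# Rows are predicted classes, columns are reference (actual) classes.
matrix = [
    [0, 0, 0],    # predicted Low
    [3, 11, 2],   # predicted Medium
    [0, 3, 13],   # predicted High
]

def overall(matrix):
    """Accuracy and Cohen's kappa for a square confusion matrix."""
    n = sum(sum(row) for row in matrix)
    size = len(matrix)
    observed = sum(matrix[i][i] for i in range(size)) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(row[j] for row in matrix) for j in range(size)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return observed, (observed - expected) / (1 - expected)

def by_class(matrix, i):
    """One-vs-rest statistics for class index i (class must occur in the data)."""
    n = sum(sum(row) for row in matrix)
    tp = matrix[i][i]
    fp = sum(matrix[i]) - tp
    fn = sum(row[i] for row in matrix) - tp
    tn = n - tp - fp - fn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Precision is undefined when the class is never predicted.
        "precision": tp / (tp + fp) if tp + fp else math.nan,
        "npv": tn / (tn + fn),
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }

accuracy, kappa = overall(matrix)
stats = {label: by_class(matrix, i) for i, label in enumerate(labels)}
print(accuracy, round(kappa, 4), stats["Low"])
```

The NaN precision for Low arises because the Low row of the matrix sums to zero, so there are no Low predictions to divide by.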
Naive Bayes is a probabilistic classifier that predicts job satisfaction by calculating the probability of each satisfaction level (Low, Medium, High) based on features like communication, recognition, and training. It assumes that the features are independent given the satisfaction level, making it a simple yet effective method for classification.
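The independence assumption can be written out as a formula. The notation is introduced here for illustration: $Y$ is the satisfaction level, $c$ is one of Low, Medium, or High, and $x_1, \dots, x_p$ are the values of the $p$ predictors for one employee.

$$
P(Y = c \mid x_1, \dots, x_p) = \frac{P(Y = c) \prod_{j=1}^{p} P(x_j \mid Y = c)}{\sum_{c'} P(Y = c') \prod_{j=1}^{p} P(x_j \mid Y = c')}
$$

The predicted class is the value of $c$ with the largest posterior probability.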
The table below shows the data after every variable has been converted to a categorical variable.
Department | Years | Ideas | Communication | Recognition | Training | Conditions | Tools | Balance | Satisfaction |
---|---|---|---|---|---|---|---|---|---|
Administrative | 16-20 | Low | Medium | Low | Low | High | High | Low | Low |
Administrative | 1-5 | Medium | High | Medium | High | High | High | Medium | High |
Administrative | 11-15 | Medium | Medium | Low | Low | High | High | High | Medium |
Maintenance | 16-20 | High | High | Medium | High | High | High | Medium | High |
Maintenance | 11-15 | High | High | High | High | High | High | High | High |
Management | 1-5 | High | High | High | Medium | High | Medium | High | High |
Management | 1-5 | Medium | High | Medium | Medium | High | High | High | High |
Management | 1-5 | Low | Low | Low | Low | Medium | High | Medium | Low |
Production | 16-20 | Low | Medium | Low | High | High | High | Low | Medium |
Production | 11-15 | Low | Medium | Low | High | High | High | Low | Medium |
Production | 11-15 | Medium | Medium | Medium | High | High | High | Medium | High |
Production | 1-5 | High | High | High | High | High | High | High | High |
Production | 6-10 | Low | Low | Low | Medium | Medium | High | Low | Medium |
Production | 1-5 | High | High | High | Medium | High | High | High | High |
Production | 1-5 | Medium | High | Medium | High | High | High | High | Medium |
Production | 1-5 | Medium | High | High | High | High | High | High | High |
Production | 1-5 | Medium | High | Medium | Medium | Low | High | High | Medium |
Production | 1-5 | Medium | Medium | High | Medium | Medium | High | High | Medium |
Production | 1-5 | Medium | High | High | High | High | High | High | High |
Production | 11-15 | High | High | Medium | High | Medium | High | Medium | Medium |
Production | 1-5 | Medium | High | Medium | Low | Medium | High | High | Medium |
Production | 6-10 | High | High | Medium | High | Medium | High | Medium | High |
Production | 16-20 | Medium | Medium | High | Medium | Medium | High | Low | Medium |
Production | 11-15 | High | Medium | High | High | High | High | High | High |
Production | 1-5 | Low | High | Low | Low | Low | High | Medium | Low |
QC | 1-5 | High | High | High | High | High | High | High | High |
QC | 11-15 | Medium | High | High | High | High | High | Low | Medium |
SR | 21+ | Medium | Low | Low | Medium | Low | High | Medium | Medium |
SR | 6-10 | Medium | Low | Low | Low | Low | High | Low | Medium |
SR | 21+ | Low | Medium | Low | High | Low | High | Medium | Medium |
SR | 1-5 | High | High | High | High | High | High | High | High |
SR | 16-20 | Medium | High | High | High | High | High | High | High |
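The Years bins shown in the table (1-5, 6-10, 11-15, 16-20, 21+) can be reproduced with a small helper. This is a sketch in Python rather than the code used for the report. The function name `years_bin` is hypothetical, and the bin edges are inferred from the table by comparing it with the numeric Years values.

```python
def years_bin(years: int) -> str:
    """Map years of experience to the five-year bins used in the table."""
    if years < 1:
        raise ValueError("years must be at least 1")
    if years >= 21:
        return "21+"
    lower = ((years - 1) // 5) * 5 + 1   # 1, 6, 11, or 16
    return f"{lower}-{lower + 4}"

# Numeric Years values from the first five rows of the dataset.
print([years_bin(y) for y in [16, 2, 14, 17, 15]])
```

For the first five rows this yields 16-20, 1-5, 11-15, 16-20, and 11-15, matching the Years column above.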
##
## Naive Bayes Classifier for Discrete Predictors
##
## Call:
## naiveBayes.default(x = X, y = Y, laplace = laplace)
##
## A-priori probabilities:
## Y
## Low Medium High
## 0.09375 0.43750 0.46875
##
## Conditional probabilities:
## Department
## Y Administrative Maintenance Management Production QC SR
## Low 0.33333333 0.00000000 0.33333333 0.33333333 0.00000000 0.00000000
## Medium 0.07142857 0.00000000 0.00000000 0.64285714 0.07142857 0.21428571
## High 0.06666667 0.13333333 0.13333333 0.46666667 0.06666667 0.13333333
##
## Years
## Y 1-5 6-10 11-15 16-20 21+
## Low 0.66666667 0.00000000 0.00000000 0.33333333 0.00000000
## Medium 0.28571429 0.14285714 0.28571429 0.14285714 0.14285714
## High 0.60000000 0.06666667 0.20000000 0.13333333 0.00000000
##
## Ideas
## Y Low Medium High
## Low 1.00000000 0.00000000 0.00000000
## Medium 0.28571429 0.64285714 0.07142857
## High 0.00000000 0.40000000 0.60000000
##
## Communication
## Y Low Medium High
## Low 0.3333333 0.3333333 0.3333333
## Medium 0.2142857 0.4285714 0.3571429
## High 0.0000000 0.1333333 0.8666667
##
## Recognition
## Y Low Medium High
## Low 1.0000000 0.0000000 0.0000000
## Medium 0.5000000 0.2857143 0.2142857
## High 0.0000000 0.3333333 0.6666667
##
## Training
## Y Low Medium High
## Low 1.0000000 0.0000000 0.0000000
## Medium 0.2142857 0.3571429 0.4285714
## High 0.0000000 0.2000000 0.8000000
##
## Conditions
## Y Low Medium High
## Low 0.33333333 0.33333333 0.33333333
## Medium 0.28571429 0.35714286 0.35714286
## High 0.00000000 0.06666667 0.93333333
##
## Tools
## Y Low Medium High
## Low 0.00000000 0.00000000 1.00000000
## Medium 0.00000000 0.00000000 1.00000000
## High 0.00000000 0.06666667 0.93333333
##
## Balance
## Y Low Medium High
## Low 0.3333333 0.6666667 0.0000000
## Medium 0.4285714 0.2142857 0.3571429
## High 0.0000000 0.2666667 0.7333333
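Now we want to use the NB classifier to classify employees based on their predictors. The classifier is applied to the whole dataset: the first block of output gives the posterior probability of each class for every row, and the second gives the predicted class.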
## Low Medium High
## [1,] 9.772912e-01 2.270882e-02 1.225914e-12
## [2,] 3.658135e-09 4.553691e-02 9.544631e-01
## [3,] 1.032852e-07 9.999936e-01 6.334088e-06
## [4,] 8.623011e-12 5.565793e-05 9.999443e-01
## [5,] 4.703643e-18 1.686669e-05 9.999831e-01
## [6,] 1.951173e-14 2.623751e-07 9.999997e-01
## [7,] 4.178268e-12 6.742242e-04 9.993258e-01
## [8,] 9.999593e-01 4.066221e-05 4.031836e-13
## [9,] 2.385173e-03 9.976148e-01 1.675498e-08
## [10,] 3.586300e-06 9.999964e-01 1.259623e-08
## [11,] 6.576018e-12 5.893853e-01 4.106147e-01
## [12,] 9.944699e-14 1.031607e-03 9.989684e-01
## [13,] 1.721401e-05 9.999828e-01 2.699156e-12
## [14,] 3.968327e-13 3.430432e-03 9.965696e-01
## [15,] 2.879444e-13 3.584366e-02 9.641563e-01
## [16,] 1.472714e-13 1.374939e-02 9.862506e-01
## [17,] 1.192117e-11 9.893079e-01 1.069205e-02
## [18,] 8.947844e-12 8.353788e-01 1.646212e-01
## [19,] 1.472714e-13 1.374939e-02 9.862506e-01
## [20,] 1.931076e-11 1.602551e-01 8.397449e-01
## [21,] 1.598988e-08 9.952196e-01 4.780419e-03
## [22,] 5.363465e-11 2.225502e-01 7.774498e-01
## [23,] 2.975015e-09 9.999005e-01 9.951625e-05
## [24,] 2.843126e-15 2.359438e-02 9.764056e-01
## [25,] 9.663192e-01 3.368080e-02 1.772771e-11
## [26,] 2.088866e-15 8.025448e-04 9.991975e-01
## [27,] 1.164327e-13 9.662455e-01 3.375451e-02
## [28,] 3.442861e-10 1.000000e+00 6.169606e-11
## [29,] 1.434525e-07 9.999999e-01 6.426672e-14
## [30,] 3.227681e-07 9.999997e-01 7.712005e-11
## [31,] 1.044014e-15 1.203334e-03 9.987967e-01
## [32,] 3.403249e-15 3.530338e-02 9.646966e-01
## [1] Low High Medium High High High High Low Medium Medium
## [11] Medium High Medium High High High Medium Medium High High
## [21] Medium High Medium High Low High Medium Medium Medium Medium
## [31] High High
## Levels: Low Medium High
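The posterior probabilities above can be traced back to the prior and conditional probability tables. The Python sketch below does this for the first employee (Administrative, 16-20 years), using values copied from those tables. One assumption is needed: a conditional probability of zero is replaced by 0.001 before multiplying. That value reproduces the reported posteriors and appears to correspond to the default threshold of the prediction function, but it is not stated in the output. The function name `posterior` is hypothetical.

```python
import math

THRESHOLD = 0.001  # assumed replacement for zero conditional probabilities

priors = {"Low": 0.09375, "Medium": 0.4375, "High": 0.46875}

# P(value | class) for the first employee's values, in the order Department,
# Years, Ideas, Communication, Recognition, Training, Conditions, Tools, Balance.
# The fractions equal the decimals printed in the conditional probability tables.
conditionals = {
    "Low":    [1/3, 1/3, 1.0, 1/3, 1.0, 1.0, 1/3, 1.0, 1/3],
    "Medium": [1/14, 2/14, 4/14, 6/14, 7/14, 3/14, 5/14, 1.0, 6/14],
    "High":   [1/15, 2/15, 0.0, 2/15, 0.0, 0.0, 14/15, 14/15, 0.0],
}

def posterior(priors, conditionals, threshold=THRESHOLD):
    """Normalised naive Bayes posterior for one observation."""
    scores = {}
    for label, prior in priors.items():
        probs = [p if p > 0 else threshold for p in conditionals[label]]
        scores[label] = prior * math.prod(probs)
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}

post = posterior(priors, conditionals)
predicted = max(post, key=post.get)
print(post, predicted)
```

The result is about 0.9773 for Low, 0.0227 for Medium, and roughly 1.2e-12 for High, in agreement with the first row of the posterior output. The predicted class is Low.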
Department | Years | Ideas | Communication | Recognition | Training | Conditions | Tools | Balance | Satisfaction | Low | Medium | High | pred.class |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Administrative | 16-20 | Low | Medium | Low | Low | High | High | Low | Low | 0.9772912 | 0.0227088 | 0.0000000 | Low |
Administrative | 1-5 | Medium | High | Medium | High | High | High | Medium | High | 0.0000000 | 0.0455369 | 0.9544631 | High |
Administrative | 11-15 | Medium | Medium | Low | Low | High | High | High | Medium | 0.0000001 | 0.9999936 | 0.0000063 | Medium |
Maintenance | 16-20 | High | High | Medium | High | High | High | Medium | High | 0.0000000 | 0.0000557 | 0.9999443 | High |
Maintenance | 11-15 | High | High | High | High | High | High | High | High | 0.0000000 | 0.0000169 | 0.9999831 | High |
Management | 1-5 | High | High | High | Medium | High | Medium | High | High | 0.0000000 | 0.0000003 | 0.9999997 | High |
Management | 1-5 | Medium | High | Medium | Medium | High | High | High | High | 0.0000000 | 0.0006742 | 0.9993258 | High |
Management | 1-5 | Low | Low | Low | Low | Medium | High | Medium | Low | 0.9999593 | 0.0000407 | 0.0000000 | Low |
Production | 16-20 | Low | Medium | Low | High | High | High | Low | Medium | 0.0023852 | 0.9976148 | 0.0000000 | Medium |
Production | 11-15 | Low | Medium | Low | High | High | High | Low | Medium | 0.0000036 | 0.9999964 | 0.0000000 | Medium |
Production | 11-15 | Medium | Medium | Medium | High | High | High | Medium | High | 0.0000000 | 0.5893853 | 0.4106147 | Medium |
Production | 1-5 | High | High | High | High | High | High | High | High | 0.0000000 | 0.0010316 | 0.9989684 | High |
Production | 6-10 | Low | Low | Low | Medium | Medium | High | Low | Medium | 0.0000172 | 0.9999828 | 0.0000000 | Medium |
Production | 1-5 | High | High | High | Medium | High | High | High | High | 0.0000000 | 0.0034304 | 0.9965696 | High |
Production | 1-5 | Medium | High | Medium | High | High | High | High | Medium | 0.0000000 | 0.0358437 | 0.9641563 | High |
Production | 1-5 | Medium | High | High | High | High | High | High | High | 0.0000000 | 0.0137494 | 0.9862506 | High |
Production | 1-5 | Medium | High | Medium | Medium | Low | High | High | Medium | 0.0000000 | 0.9893079 | 0.0106921 | Medium |
Production | 1-5 | Medium | Medium | High | Medium | Medium | High | High | Medium | 0.0000000 | 0.8353788 | 0.1646212 | Medium |
Production | 1-5 | Medium | High | High | High | High | High | High | High | 0.0000000 | 0.0137494 | 0.9862506 | High |
Production | 11-15 | High | High | Medium | High | Medium | High | Medium | Medium | 0.0000000 | 0.1602551 | 0.8397449 | High |
Production | 1-5 | Medium | High | Medium | Low | Medium | High | High | Medium | 0.0000000 | 0.9952196 | 0.0047804 | Medium |
Production | 6-10 | High | High | Medium | High | Medium | High | Medium | High | 0.0000000 | 0.2225502 | 0.7774498 | High |
Production | 16-20 | Medium | Medium | High | Medium | Medium | High | Low | Medium | 0.0000000 | 0.9999005 | 0.0000995 | Medium |
Production | 11-15 | High | Medium | High | High | High | High | High | High | 0.0000000 | 0.0235944 | 0.9764056 | High |
Production | 1-5 | Low | High | Low | Low | Low | High | Medium | Low | 0.9663192 | 0.0336808 | 0.0000000 | Low |
QC | 1-5 | High | High | High | High | High | High | High | High | 0.0000000 | 0.0008025 | 0.9991975 | High |
QC | 11-15 | Medium | High | High | High | High | High | Low | Medium | 0.0000000 | 0.9662455 | 0.0337545 | Medium |
SR | 21+ | Medium | Low | Low | Medium | Low | High | Medium | Medium | 0.0000000 | 1.0000000 | 0.0000000 | Medium |
SR | 6-10 | Medium | Low | Low | Low | Low | High | Low | Medium | 0.0000001 | 0.9999999 | 0.0000000 | Medium |
SR | 21+ | Low | Medium | Low | High | Low | High | Medium | Medium | 0.0000003 | 0.9999997 | 0.0000000 | Medium |
SR | 1-5 | High | High | High | High | High | High | High | High | 0.0000000 | 0.0012033 | 0.9987967 | High |
SR | 16-20 | Medium | High | High | High | High | High | High | High | 0.0000000 | 0.0353034 | 0.9646966 | High |
## Confusion Matrix and Statistics
##
## Reference
## Prediction Low Medium High
## Low 3 0 0
## Medium 0 12 1
## High 0 2 14
##
## Overall Statistics
##
## Accuracy : 0.9062
## 95% CI : (0.7498, 0.9802)
## No Information Rate : 0.4688
## P-Value [Acc > NIR] : 2.331e-07
##
## Kappa : 0.8381
##
## Mcnemar's Test P-Value : NA
##
## Statistics by Class:
##
## Class: Low Class: Medium Class: High
## Sensitivity 1.00000 0.8571 0.9333
## Specificity 1.00000 0.9444 0.8824
## Pos Pred Value 1.00000 0.9231 0.8750
## Neg Pred Value 1.00000 0.8947 0.9375
## Prevalence 0.09375 0.4375 0.4688
## Detection Rate 0.09375 0.3750 0.4375
## Detection Prevalence 0.09375 0.4062 0.5000
## Balanced Accuracy 1.00000 0.9008 0.9078
The Naive Bayes model achieved 90.62% accuracy, with a strong Kappa score of 0.8381, indicating high agreement between predictions and actual values. Low satisfaction was perfectly classified (100% sensitivity and precision), while Medium (85.71% sensitivity, 92.31% precision) and High (93.33% sensitivity, 87.50% precision) had minor misclassifications. The balanced accuracy remains high across all classes, making this model effective for predicting job satisfaction on this dataset, on which it was both trained and evaluated.