INTRODUCTION

This analysis investigates how different job attributes (Department, Years of Experience, Ideas, Communication, Recognition, Training, Conditions, Tools, Work-Life Balance, and Satisfaction Level) relate to one another, using data visualization to better understand the patterns that influence employee satisfaction. The K-Nearest Neighbors (KNN) and Naive Bayes classification methods are then applied to predict job satisfaction levels, drawing on the strengths of each model in highlighting key satisfaction drivers. Performance is evaluated through accuracy, precision, and recall to gauge the predictive power of these methods, and the findings are discussed along with their implications for increasing employee satisfaction within organizations.

SUMMARY

Job Satisfaction Data Summary
Department      Years  Ideas  Communication  Recognition  Training  Satisfaction  Satisfaction Level
Administrative  3      5      2              4            2         7             Medium
Maintenance     2      5      3              9            8         5             Low
Management      3      4      4              12           9         4             Medium
Production      17     3      5              7            9         4             High
QC              2      2      2              NA           NA        3             Low
SR              5      2      2              8            4         4             High
Other           11     11     5              9            4         6             Medium

BOXPLOTS


HISTOGRAMS


SCATTERPLOTS

HEATMAP

This heatmap shows the pairwise correlations between job attributes. Satisfaction is most strongly correlated with Ideas (0.85), Recognition (0.84), and Balance (0.71). Training has the weakest correlations with the other attributes, the lowest being with Balance (0.29). Overall, Satisfaction tracks Ideas, Recognition, and Balance most closely, while Training is only loosely connected to the other factors.
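As a rough illustration of how the numbers behind such a heatmap are obtained, the sketch below computes Pearson correlations in plain Python from the numeric rows listed later in this report. It assumes the heatmap uses Pearson correlation on those same rows, which the report does not state, so the printed values need not match the figures quoted above. The helper names (pearson, column) are illustrative only.

```python
from math import sqrt

COLUMNS = ["Years", "Ideas", "Communication", "Recognition", "Training",
           "Conditions", "Tools", "Balance", "Satisfaction"]

# Numeric rows copied from the job satisfaction table in this report
# (Department omitted because it is categorical).
RAW = """
16 2 3 2 2 4 5 2 3
2 4 4 3 4 4 5 3 9
14 4 3 2 2 5 5 5 6
17 5 4 3 5 5 5 3 8
15 5 5 5 5 5 5 5 9
1 5 4 4 3 5 3 5 9
3 3 4 3 3 4 5 5 8
3 2 2 2 2 3 5 3 3
16 2 3 2 4 4 4 2 5
15 2 3 1 4 4 4 2 4
13 3 3 3 4 4 4 3 8
3 5 5 5 5 5 5 5 10
6 2 2 1 3 3 4 2 4
1 5 4 4 3 4 5 5 9
3 3 4 3 4 5 5 4 7
2 4 4 4 4 5 5 5 8
3 3 4 3 3 2 4 4 6
2 4 3 4 3 3 4 4 6
2 4 5 4 4 4 4 4 8
15 5 4 3 4 3 5 3 7
5 4 5 3 2 3 5 4 7
8 5 5 3 5 3 5 3 8
17 4 3 4 3 3 5 2 6
15 5 3 4 5 5 5 5 9
5 2 4 2 2 2 5 3 3
1 5 5 5 5 5 5 5 10
11 3 4 4 4 5 5 2 7
21 3 2 2 3 2 4 3 5
8 3 2 2 2 2 4 2 4
32 2 3 2 4 2 5 3 5
2 5 5 5 5 5 5 5 10
18 4 4 4 5 5 5 5 8
"""

data = [[int(v) for v in line.split()] for line in RAW.strip().splitlines()]


def column(name):
    """Return all values of one named column."""
    index = COLUMNS.index(name)
    return [row[index] for row in data]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / sqrt(var_x * var_y)


# Correlation of every attribute with Satisfaction, strongest first.
satisfaction = column("Satisfaction")
ranked = sorted(
    ((name, pearson(column(name), satisfaction)) for name in COLUMNS[:-1]),
    key=lambda pair: abs(pair[1]),
    reverse=True,
)
for name, r in ranked:
    print(f"{name:<14}{r:+.2f}")
```

Calling pearson on every pair of columns gives the full matrix that a heatmap would colour.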

KNN

Job Satisfaction Data
Department Years Ideas Communication Recognition Training Conditions Tools Balance Satisfaction
Administrative 16 2 3 2 2 4 5 2 3
Administrative 2 4 4 3 4 4 5 3 9
Administrative 14 4 3 2 2 5 5 5 6
Maintenance 17 5 4 3 5 5 5 3 8
Maintenance 15 5 5 5 5 5 5 5 9
Management 1 5 4 4 3 5 3 5 9
Management 3 3 4 3 3 4 5 5 8
Management 3 2 2 2 2 3 5 3 3
Production 16 2 3 2 4 4 4 2 5
Production 15 2 3 1 4 4 4 2 4
Production 13 3 3 3 4 4 4 3 8
Production 3 5 5 5 5 5 5 5 10
Production 6 2 2 1 3 3 4 2 4
Production 1 5 4 4 3 4 5 5 9
Production 3 3 4 3 4 5 5 4 7
Production 2 4 4 4 4 5 5 5 8
Production 3 3 4 3 3 2 4 4 6
Production 2 4 3 4 3 3 4 4 6
Production 2 4 5 4 4 4 4 4 8
Production 15 5 4 3 4 3 5 3 7
Production 5 4 5 3 2 3 5 4 7
Production 8 5 5 3 5 3 5 3 8
Production 17 4 3 4 3 3 5 2 6
Production 15 5 3 4 5 5 5 5 9
Production 5 2 4 2 2 2 5 3 3
QC 1 5 5 5 5 5 5 5 10
QC 11 3 4 4 4 5 5 2 7
SR 21 3 2 2 3 2 4 3 5
SR 8 3 2 2 2 2 4 2 4
SR 32 2 3 2 4 2 5 3 5
SR 2 5 5 5 5 5 5 5 10
SR 18 4 4 4 5 5 5 5 8
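
The report does not say how the KNN model was configured, so the following is only a minimal sketch of the technique on the numeric columns above. The choices of k = 3, unscaled Euclidean distance, leave-one-out evaluation, and dropping the categorical Department column are all assumptions made for illustration, and the result will not necessarily reproduce the accuracy reported in the next section.

```python
from collections import Counter
from math import dist

# Years, Ideas, Communication, Recognition, Training, Conditions, Tools,
# Balance, Satisfaction (last value is the class label).
RAW = """
16 2 3 2 2 4 5 2 3
2 4 4 3 4 4 5 3 9
14 4 3 2 2 5 5 5 6
17 5 4 3 5 5 5 3 8
15 5 5 5 5 5 5 5 9
1 5 4 4 3 5 3 5 9
3 3 4 3 3 4 5 5 8
3 2 2 2 2 3 5 3 3
16 2 3 2 4 4 4 2 5
15 2 3 1 4 4 4 2 4
13 3 3 3 4 4 4 3 8
3 5 5 5 5 5 5 5 10
6 2 2 1 3 3 4 2 4
1 5 4 4 3 4 5 5 9
3 3 4 3 4 5 5 4 7
2 4 4 4 4 5 5 5 8
3 3 4 3 3 2 4 4 6
2 4 3 4 3 3 4 4 6
2 4 5 4 4 4 4 4 8
15 5 4 3 4 3 5 3 7
5 4 5 3 2 3 5 4 7
8 5 5 3 5 3 5 3 8
17 4 3 4 3 3 5 2 6
15 5 3 4 5 5 5 5 9
5 2 4 2 2 2 5 3 3
1 5 5 5 5 5 5 5 10
11 3 4 4 4 5 5 2 7
21 3 2 2 3 2 4 3 5
8 3 2 2 2 2 4 2 4
32 2 3 2 4 2 5 3 5
2 5 5 5 5 5 5 5 10
18 4 4 4 5 5 5 5 8
"""

rows = [[int(v) for v in line.split()] for line in RAW.strip().splitlines()]
features = [row[:-1] for row in rows]
labels = [row[-1] for row in rows]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to `query`."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    # On a tied vote, most_common keeps first-seen order, so the label of
    # the closest neighbour wins.
    return votes.most_common(1)[0][0]


# Leave-one-out: predict each row from the other 31 (an assumed protocol).
predictions = []
for i, query in enumerate(features):
    train_x = features[:i] + features[i + 1:]
    train_y = labels[:i] + labels[i + 1:]
    predictions.append(knn_predict(train_x, train_y, query))

accuracy = sum(p == a for p, a in zip(predictions, labels)) / len(labels)
print(f"Leave-one-out accuracy with k=3: {accuracy:.1%}")
```

Because the features are not rescaled here, Years (range 1 to 32) dominates the distance over the 1 to 5 rating columns; standardizing the columns first is a common adjustment.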

CLASSIFICATION

##           Reference
## Prediction 3 4 5 6 7 8 9 10
##         3  1 1 0 0 0 0 0  0
##         4  0 1 1 0 0 0 0  0
##         5  0 0 2 0 0 0 0  0
##         6  2 0 0 2 0 0 0  0
##         7  0 0 0 1 2 2 0  0
##         8  0 1 0 1 2 2 3  0
##         9  0 0 0 0 0 3 2  0
##         10 0 0 0 0 0 0 0  3
## 
## Accuracy of the model: 46.88%

The confusion matrix compares the KNN model's predicted job satisfaction levels (rows) with the reference levels (columns). The model classified 15 of the 32 observations correctly, an accuracy of 46.9%. Most errors fall between adjacent categories: 13 of the 17 misclassifications are off by a single level, and most of those occur among levels 7, 8, and 9. Level 10 was always predicted correctly. The results indicate that the model could benefit from further tuning, such as a different choice of k, or from alternative approaches.
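
The accuracy above, along with the per-level precision and recall mentioned in the introduction, can be recomputed directly from the confusion matrix. The sketch below copies the KNN matrix from this section; the variable names are illustrative.

```python
LEVELS = [3, 4, 5, 6, 7, 8, 9, 10]

# KNN confusion matrix from this section:
# rows are predictions, columns are reference (actual) levels.
MATRIX = [
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 2, 0, 0, 0, 0, 0],
    [2, 0, 0, 2, 0, 0, 0, 0],
    [0, 0, 0, 1, 2, 2, 0, 0],
    [0, 1, 0, 1, 2, 2, 3, 0],
    [0, 0, 0, 0, 0, 3, 2, 0],
    [0, 0, 0, 0, 0, 0, 0, 3],
]

total = sum(sum(row) for row in MATRIX)
correct = sum(MATRIX[i][i] for i in range(len(LEVELS)))
accuracy = correct / total

precision = {}
recall = {}
for i, level in enumerate(LEVELS):
    predicted_as_level = sum(MATRIX[i])               # row total
    actually_level = sum(row[i] for row in MATRIX)    # column total
    # Precision: share of predictions of this level that were right.
    precision[level] = MATRIX[i][i] / predicted_as_level if predicted_as_level else float("nan")
    # Recall: share of observations at this level that were found.
    recall[level] = MATRIX[i][i] / actually_level if actually_level else float("nan")

print(f"Accuracy: {correct}/{total} = {accuracy:.2%}")
for level in LEVELS:
    print(f"Level {level:>2}: precision {precision[level]:.2f}, recall {recall[level]:.2f}")
```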

NAIVE BAYES

Job Dataset for Naive Bayes
Years Ideas Communication Recognition Training Conditions Tools Balance Satisfaction
16 2 3 2 2 4 5 2 3
2 4 4 3 4 4 5 3 9
14 4 3 2 2 5 5 5 6
17 5 4 3 5 5 5 3 8
15 5 5 5 5 5 5 5 9
1 5 4 4 3 5 3 5 9
3 3 4 3 3 4 5 5 8
3 2 2 2 2 3 5 3 3
16 2 3 2 4 4 4 2 5
15 2 3 1 4 4 4 2 4
13 3 3 3 4 4 4 3 8
3 5 5 5 5 5 5 5 10
6 2 2 1 3 3 4 2 4
1 5 4 4 3 4 5 5 9
3 3 4 3 4 5 5 4 7
2 4 4 4 4 5 5 5 8
3 3 4 3 3 2 4 4 6
2 4 3 4 3 3 4 4 6
2 4 5 4 4 4 4 4 8
15 5 4 3 4 3 5 3 7
5 4 5 3 2 3 5 4 7
8 5 5 3 5 3 5 3 8
17 4 3 4 3 3 5 2 6
15 5 3 4 5 5 5 5 9
5 2 4 2 2 2 5 3 3
1 5 5 5 5 5 5 5 10
11 3 4 4 4 5 5 2 7
21 3 2 2 3 2 4 3 5
8 3 2 2 2 2 4 2 4
32 2 3 2 4 2 5 3 5
2 5 5 5 5 5 5 5 10
18 4 4 4 5 5 5 5 8
Confusion Matrix: Predicted (rows) vs Actual (columns) Job Satisfaction
      3  4  5  6  7  8  9  10
3     3  0  1  1  0  0  0  0
4     0  3  1  0  0  0  0  0
5     0  0  1  0  0  0  0  0
6     0  0  0  2  0  0  0  0
7     0  0  0  1  4  5  2  0
8     0  0  0  0  0  2  0  0
9     0  0  0  0  0  0  3  0
10    0  0  0  0  0  0  0  3

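The confusion matrix displays the predicted vs. actual job satisfaction levels for the Naive Bayes model, which classified 21 of the 32 observations correctly, an accuracy of about 66%. Every observation at levels 3, 4, 7, and 10 was classified correctly. The remaining misclassifications are concentrated in the predictions of levels 3 and 7: level 7 in particular was assigned to 5 of the 7 observations actually at level 8 and to 2 of the 5 at level 9, so levels 8 and 9 are recovered less reliably. While the model performs moderately well, and better than the 46.9% obtained with KNN, there is room for further improvement through adjustments to the model or the features.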
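
Because satisfaction levels are ordered, it can help to look at how far off each prediction is rather than only whether it is exactly right. The sketch below tallies the Naive Bayes matrix by the distance between predicted and actual level. It reads rows as predictions and columns as actual levels, which is consistent with the class counts in the dataset. "Within one level" is an illustrative measure, not one used in the original analysis.

```python
from collections import Counter

LEVELS = [3, 4, 5, 6, 7, 8, 9, 10]

# Naive Bayes confusion matrix from this section
# (rows assumed to be predictions, columns actual levels).
NB_MATRIX = [
    [3, 0, 1, 1, 0, 0, 0, 0],
    [0, 3, 1, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 2, 0, 0, 0, 0],
    [0, 0, 0, 1, 4, 5, 2, 0],
    [0, 0, 0, 0, 0, 2, 0, 0],
    [0, 0, 0, 0, 0, 0, 3, 0],
    [0, 0, 0, 0, 0, 0, 0, 3],
]

# Count observations by |predicted level - actual level|.
error_distances = Counter()
for i, predicted in enumerate(LEVELS):
    for j, actual in enumerate(LEVELS):
        count = NB_MATRIX[i][j]
        if count:
            error_distances[abs(predicted - actual)] += count

total = sum(error_distances.values())
exact = error_distances[0] / total
within_one = (error_distances[0] + error_distances[1]) / total

for distance in sorted(error_distances):
    print(f"off by {distance}: {error_distances[distance]} observations")
print(f"Exact: {exact:.1%}, within one level: {within_one:.1%}")
```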