INTRODUCTION

This analysis investigates how job attributes (Department, Years of Experience, Ideas, Communication, Recognition, Training, Conditions, Tools, Work-Life Balance, and Satisfaction Level) relate to one another, using data visualization to identify patterns that influence employee satisfaction. K-Nearest Neighbors (KNN) and Naive Bayes classifiers are then fitted to predict job satisfaction levels, drawing on the strengths of each model to highlight key satisfaction drivers. We evaluate performance through accuracy, precision, and recall to gauge the predictive power of each method, and we discuss the findings and their implications for increasing employee satisfaction within organizations.

SUMMARY

Job Satisfaction Data Summary
Department	Years	Ideas	Communication	Recognition	Training	Satisfaction	Satisfaction Level
Administrative	3	5	2	4	2	7	Medium
Maintenance	2	5	3	9	8	5	Low
Management	3	4	4	12	9	4	Medium
Production	17	3	5	7	9	4	High
QC	2	2	2	NA	NA	3	Low
SR	5	2	2	8	4	4	High
Other	11	11	5	9	4	6	Medium

BOXPLOTS

HISTOGRAMS

SCATTERPLOTS

HEATMAP

This heatmap shows pairwise correlations between job attributes. Satisfaction is most strongly correlated with Ideas (0.85), Recognition (0.84), and Balance (0.71). Training shows the weakest correlations, most notably with Balance (0.29), and appears less connected to the other factors. These are associations only; the correlations alone do not establish which attributes cause higher satisfaction.
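Each cell of a correlation heatmap is typically a Pearson coefficient between two columns. The sketch below shows that calculation on a five-row excerpt (Ideas and Satisfaction) of the Job Satisfaction Data table. It assumes Pearson correlation, which the report does not state, and the function name `pearson` is illustrative; the excerpt will not reproduce the heatmap values, which come from the full data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# First five rows of the Job Satisfaction Data table (excerpt only).
ideas = [2, 4, 4, 5, 5]
satisfaction = [3, 9, 6, 8, 9]

r = pearson(ideas, satisfaction)
print(f"Ideas vs Satisfaction (5-row excerpt): r = {r:.2f}")
```

Repeating the call for every pair of numeric columns would fill the full matrix that the heatmap colors.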

KNN

Job Satisfaction Data
Department	Years	Ideas	Communication	Recognition	Training	Conditions	Tools	Balance	Satisfaction
Administrative	16	2	3	2	2	4	5	2	3
Administrative	2	4	4	3	4	4	5	3	9
Administrative	14	4	3	2	2	5	5	5	6
Maintenance	17	5	4	3	5	5	5	3	8
Maintenance	15	5	5	5	5	5	5	5	9
Management	1	5	4	4	3	5	3	5	9
Management	3	3	4	3	3	4	5	5	8
Management	3	2	2	2	2	3	5	3	3
Production	16	2	3	2	4	4	4	2	5
Production	15	2	3	1	4	4	4	2	4
Production	13	3	3	3	4	4	4	3	8
Production	3	5	5	5	5	5	5	5	10
Production	6	2	2	1	3	3	4	2	4
Production	1	5	4	4	3	4	5	5	9
Production	3	3	4	3	4	5	5	4	7
Production	2	4	4	4	4	5	5	5	8
Production	3	3	4	3	3	2	4	4	6
Production	2	4	3	4	3	3	4	4	6
Production	2	4	5	4	4	4	4	4	8
Production	15	5	4	3	4	3	5	3	7
Production	5	4	5	3	2	3	5	4	7
Production	8	5	5	3	5	3	5	3	8
Production	17	4	3	4	3	3	5	2	6
Production	15	5	3	4	5	5	5	5	9
Production	5	2	4	2	2	2	5	3	3
QC	1	5	5	5	5	5	5	5	10
QC	11	3	4	4	4	5	5	2	7
SR	21	3	2	2	3	2	4	3	5
SR	8	3	2	2	2	2	4	2	4
SR	32	2	3	2	4	2	5	3	5
SR	2	5	5	5	5	5	5	5	10
SR	18	4	4	4	5	5	5	5	8
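For readers unfamiliar with KNN, the following is a minimal sketch of the idea on six rows taken from the table above. The choice of k = 3, Euclidean distance, and the seven rating columns (Ideas through Balance, leaving out Years because of its larger scale) are assumptions made for illustration; the report does not state the settings actually used.

```python
from collections import Counter
from math import dist

def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest rows."""
    # Rank training rows by Euclidean distance to the query point.
    ranked = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Columns: Ideas, Communication, Recognition, Training, Conditions, Tools, Balance.
train_x = [
    (2, 3, 2, 2, 4, 5, 2),
    (2, 2, 2, 2, 3, 5, 3),
    (2, 4, 2, 2, 2, 5, 3),
    (5, 5, 5, 5, 5, 5, 5),
    (5, 4, 4, 3, 5, 3, 5),
    (5, 4, 4, 3, 4, 5, 5),
]
train_y = [3, 3, 3, 9, 9, 9]  # Satisfaction scores for the rows above.

# A Production row from the table whose recorded Satisfaction is 9.
query = (5, 3, 4, 5, 5, 5, 5)
prediction = knn_predict(train_x, train_y, query, k=3)
print("Predicted satisfaction:", prediction)
```

With eight distinct satisfaction levels and only 32 rows, several classes have just three examples, so the vote among neighbours can easily land on an adjacent level.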

CLASSIFICATION

##           Reference
## Prediction 3 4 5 6 7 8 9 10
##         3  1 1 0 0 0 0 0  0
##         4  0 1 1 0 0 0 0  0
##         5  0 0 2 0 0 0 0  0
##         6  2 0 0 2 0 0 0  0
##         7  0 0 0 1 2 2 0  0
##         8  0 1 0 1 2 2 3  0
##         9  0 0 0 0 0 3 2  0
##         10 0 0 0 0 0 0 0  3

Accuracy of the model: 46.88%

The confusion matrix compares the KNN model’s predicted job satisfaction levels with the reference levels. The model classifies 15 of the 32 observations correctly, an accuracy of 46.9%. Most errors fall in adjacent categories: 13 of the 17 misclassifications are off by a single level, with levels 7, 8, and 9 confused most often. The results indicate that the model could benefit from further tuning or from alternative approaches.
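The accuracy figure can be recomputed directly from the KNN confusion matrix, and the same pass can count how many errors miss by exactly one level. The sketch below copies the matrix from the output above (rows are predictions, columns are reference levels); the variable names are illustrative.

```python
levels = [3, 4, 5, 6, 7, 8, 9, 10]

# Rows are predicted levels, columns are reference levels.
knn_matrix = [
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 2, 0, 0, 0, 0, 0],
    [2, 0, 0, 2, 0, 0, 0, 0],
    [0, 0, 0, 1, 2, 2, 0, 0],
    [0, 1, 0, 1, 2, 2, 3, 0],
    [0, 0, 0, 0, 0, 3, 2, 0],
    [0, 0, 0, 0, 0, 0, 0, 3],
]

size = len(levels)
total = sum(sum(row) for row in knn_matrix)
correct = sum(knn_matrix[i][i] for i in range(size))

# Misclassifications where prediction and reference differ by one level.
adjacent = sum(
    knn_matrix[i][j]
    for i in range(size)
    for j in range(size)
    if abs(levels[i] - levels[j]) == 1
)

accuracy = correct / total
within_one = (correct + adjacent) / total
print(f"accuracy = {accuracy:.4f}, within one level = {within_one:.4f}")
```

Because satisfaction is an ordered scale, the within-one-level share is a useful companion to plain accuracy, which treats a miss by one level the same as a miss by five.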

NAIVE BAYES

Job Dataset for Naive Bayes
Years	Ideas	Communication	Recognition	Training	Conditions	Tools	Balance	Satisfaction
16	2	3	2	2	4	5	2	3
2	4	4	3	4	4	5	3	9
14	4	3	2	2	5	5	5	6
17	5	4	3	5	5	5	3	8
15	5	5	5	5	5	5	5	9
1	5	4	4	3	5	3	5	9
3	3	4	3	3	4	5	5	8
3	2	2	2	2	3	5	3	3
16	2	3	2	4	4	4	2	5
15	2	3	1	4	4	4	2	4
13	3	3	3	4	4	4	3	8
3	5	5	5	5	5	5	5	10
6	2	2	1	3	3	4	2	4
1	5	4	4	3	4	5	5	9
3	3	4	3	4	5	5	4	7
2	4	4	4	4	5	5	5	8
3	3	4	3	3	2	4	4	6
2	4	3	4	3	3	4	4	6
2	4	5	4	4	4	4	4	8
15	5	4	3	4	3	5	3	7
5	4	5	3	2	3	5	4	7
8	5	5	3	5	3	5	3	8
17	4	3	4	3	3	5	2	6
15	5	3	4	5	5	5	5	9
5	2	4	2	2	2	5	3	3
1	5	5	5	5	5	5	5	10
11	3	4	4	4	5	5	2	7
21	3	2	2	3	2	4	3	5
8	3	2	2	2	2	4	2	4
32	2	3	2	4	2	5	3	5
2	5	5	5	5	5	5	5	10
18	4	4	4	5	5	5	5	8
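The report does not say which Naive Bayes variant was fitted. As one plausible reading, the sketch below implements a Gaussian Naive Bayes classifier on six rows from the table above: each feature is modelled as an independent normal distribution within each class. The function names, the `var_floor` parameter, and the choice of the seven rating columns are all assumptions made for illustration.

```python
from collections import defaultdict
from math import log, pi

def fit_gaussian_nb(rows, labels, var_floor=1e-2):
    """Estimate a prior and per-feature (mean, variance) pairs for each class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        n = len(members)
        stats = []
        for column in zip(*members):
            mean = sum(column) / n
            var = sum((v - mean) ** 2 for v in column) / n
            # Floor the variance so constant columns do not divide by zero.
            stats.append((mean, max(var, var_floor)))
        model[label] = (n / len(rows), stats)
    return model

def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior for `row`."""
    def log_posterior(label):
        prior, stats = model[label]
        score = log(prior)
        for value, (mean, var) in zip(row, stats):
            score += -0.5 * log(2 * pi * var) - (value - mean) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)

# Columns: Ideas, Communication, Recognition, Training, Conditions, Tools, Balance.
rows = [
    (2, 3, 2, 2, 4, 5, 2),
    (2, 2, 2, 2, 3, 5, 3),
    (2, 4, 2, 2, 2, 5, 3),
    (5, 5, 5, 5, 5, 5, 5),
    (5, 4, 4, 3, 5, 3, 5),
    (5, 4, 4, 3, 4, 5, 5),
]
labels = [3, 3, 3, 9, 9, 9]  # Satisfaction scores for the rows above.

nb_model = fit_gaussian_nb(rows, labels)
nb_prediction = predict_gaussian_nb(nb_model, (5, 3, 4, 5, 5, 5, 5))
print("Predicted satisfaction:", nb_prediction)
```

The variance floor matters here because several classes contain only three rows, and some features take a single value within a class.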

Confusion Matrix: Predicted (rows) vs Actual (columns) Job Satisfaction
	3	4	5	6	7	8	9	10
3	3	0	1	1	0	0	0	0
4	0	3	1	0	0	0	0	0
5	0	0	1	0	0	0	0	0
6	0	0	0	2	0	0	0	0
7	0	0	0	1	4	5	2	0
8	0	0	0	0	0	2	0	0
9	0	0	0	0	0	0	3	0
10	0	0	0	0	0	0	0	3

The confusion matrix displays the predicted vs. actual job satisfaction levels. The model classifies 21 of the 32 observations correctly, an accuracy of 66%, compared with 46.9% for KNN. Every observation at levels 3, 4, 7, and 10 is predicted correctly. The main weakness is over-prediction of level 7: of the 12 observations assigned to level 7, only 4 are actually level 7, while 5 are level 8 and 2 are level 9, so only 2 of the 7 level-8 observations are recovered. Levels 5 and 6 are also partly misclassified. While the model performs moderately well, there is room for further improvement through adjustments to the model or features.
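Per-class precision and recall can be read off the same matrix and show where the Naive Bayes model is weak. The sketch below assumes rows are predictions and columns are actual levels, an orientation chosen because the column totals match the class counts in the data table; the function name `precision_recall` is illustrative.

```python
levels = [3, 4, 5, 6, 7, 8, 9, 10]

# Rows are predicted levels, columns are actual levels (assumed orientation).
nb_matrix = [
    [3, 0, 1, 1, 0, 0, 0, 0],
    [0, 3, 1, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 2, 0, 0, 0, 0],
    [0, 0, 0, 1, 4, 5, 2, 0],
    [0, 0, 0, 0, 0, 2, 0, 0],
    [0, 0, 0, 0, 0, 0, 3, 0],
    [0, 0, 0, 0, 0, 0, 0, 3],
]

def precision_recall(matrix, labels):
    """Per-class (precision, recall) for a rows=predicted, columns=actual matrix."""
    result = {}
    for i, label in enumerate(labels):
        predicted_total = sum(matrix[i])                  # everything predicted as this level
        actual_total = sum(row[i] for row in matrix)      # everything truly at this level
        hit = matrix[i][i]
        precision = hit / predicted_total if predicted_total else float("nan")
        recall = hit / actual_total if actual_total else float("nan")
        result[label] = (precision, recall)
    return result

metrics = precision_recall(nb_matrix, levels)
nb_accuracy = sum(nb_matrix[i][i] for i in range(len(levels))) / sum(map(sum, nb_matrix))

print(f"accuracy = {nb_accuracy:.3f}")
for level, (precision, recall) in metrics.items():
    print(f"level {level}: precision = {precision:.2f}, recall = {recall:.2f}")
```

Low precision for a level means the model predicts it too often, while low recall means observations at that level are being assigned elsewhere.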

Jobs

Gianni Naccarato

2025-03-24