
# Load the required libraries
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.4.3
## Warning: package 'ggplot2' was built under R version 4.4.3
## Warning: package 'tibble' was built under R version 4.4.3
## Warning: package 'tidyr' was built under R version 4.4.3
## Warning: package 'readr' was built under R version 4.4.3
## Warning: package 'purrr' was built under R version 4.4.3
## Warning: package 'dplyr' was built under R version 4.4.3
## Warning: package 'forcats' was built under R version 4.4.3
## Warning: package 'lubridate' was built under R version 4.4.3
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.4     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(patchwork)
## Warning: package 'patchwork' was built under R version 4.4.3
library(caTools)
## Warning: package 'caTools' was built under R version 4.4.3
library(caret)
## Warning: package 'caret' was built under R version 4.4.3
## Loading required package: lattice
## 
## Attaching package: 'caret'
## 
## The following object is masked from 'package:purrr':
## 
##     lift
# dplyr and ggplot2 are already attached by tidyverse; patchwork is loaded above
library(nnet)
## Warning: package 'nnet' was built under R version 4.4.3
# read the data set
data <- read.csv("HR_Analytics.csv")

Preprocessing Steps

  1. Inspect the data structure (str(data))
    Shows the type of each column, the number of observations (1480 rows), and the number of variables (38 columns), which helps identify which columns are numeric (int), character (chr), or factor.

  2. Review the data summary (summary(data))
    Gives descriptive statistics for each column: minimum, maximum, median, mean, quartiles, and the number of entries. This is useful for spotting outliers, understanding distributions, and getting a sense of typical values in the dataset.

  3. View the first few rows (head(data))
    Displays the first 6 rows as a sample of the dataset's contents, to become familiar with the data.

  4. Check for missing values (colSums(is.na(data)))
    Counts the missing values (NA) in each column.
    The result shows that only YearsWithCurrManager has missing values (57 NA).

  5. Fill in missing values
    Replaces the NA values in YearsWithCurrManager with the column median. The median is chosen because it is more robust to outliers than the mean.

  6. Recheck missing values (colSums(is.na(data)))
    Confirms that all missing values have been filled (every count is zero).

  7. Check for duplicates (sum(duplicated(data$EmpID)))
    Checks whether any employee ID (EmpID) appears more than once.
    Result: 10 duplicate entries were found.

  8. Remove duplicate rows
    Drops rows with a repeated EmpID, keeping the first occurrence, so that duplicates do not skew the analysis.
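
Steps 5 and 8 can be illustrated on a toy example. The sketch below uses a made-up data frame (`toy`, with hypothetical IDs and values that are not taken from HR_Analytics.csv) to show why one extreme value moves the mean far more than the median, and how `duplicated()` keeps the first row per ID.

```r
# Toy data: hypothetical values, not from HR_Analytics.csv
toy <- data.frame(
  id    = c("A1", "A2", "A3", "A3", "A4", "A5"),
  years = c(2, 3, NA, NA, 4, 40)   # 40 is a deliberate outlier
)

# The outlier drags the mean upward; the median is barely affected
mean_val   <- mean(toy$years, na.rm = TRUE)    # (2 + 3 + 4 + 40) / 4 = 12.25
median_val <- median(toy$years, na.rm = TRUE)  # middle of 2, 3, 4, 40 = 3.5

# Median imputation, same pattern as used for YearsWithCurrManager
toy$years[is.na(toy$years)] <- median_val

# duplicated() flags the second and later occurrences of an ID,
# so negating it keeps only the first row per ID
toy_unique <- toy[!duplicated(toy$id), ]
nrow(toy_unique)
```

Whether median imputation is appropriate for the real column still depends on why the values are missing, which this toy example does not address.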

# inspect the data structure
str(data)
## 'data.frame':    1480 obs. of  38 variables:
##  $ EmpID                   : chr  "RM297" "RM302" "RM458" "RM728" ...
##  $ Age                     : int  18 18 18 18 18 18 18 18 19 19 ...
##  $ AgeGroup                : chr  "18-25" "18-25" "18-25" "18-25" ...
##  $ Attrition               : chr  "Yes" "No" "Yes" "No" ...
##  $ BusinessTravel          : chr  "Travel_Rarely" "Travel_Rarely" "Travel_Frequently" "Non-Travel" ...
##  $ DailyRate               : int  230 812 1306 287 247 1124 544 1431 528 1181 ...
##  $ Department              : chr  "Research & Development" "Sales" "Sales" "Research & Development" ...
##  $ DistanceFromHome        : int  3 10 5 5 8 1 3 14 22 3 ...
##  $ Education               : int  3 3 3 2 1 3 2 3 1 1 ...
##  $ EducationField          : chr  "Life Sciences" "Medical" "Marketing" "Life Sciences" ...
##  $ EmployeeCount           : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ EmployeeNumber          : int  405 411 614 1012 1156 1368 1624 1839 167 201 ...
##  $ EnvironmentSatisfaction : int  3 4 2 2 3 4 2 2 4 2 ...
##  $ Gender                  : chr  "Male" "Female" "Male" "Male" ...
##  $ HourlyRate              : int  54 69 69 73 80 97 70 33 50 79 ...
##  $ JobInvolvement          : int  3 2 3 3 3 3 3 3 3 3 ...
##  $ JobLevel                : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ JobRole                 : chr  "Laboratory Technician" "Sales Representative" "Sales Representative" "Research Scientist" ...
##  $ JobSatisfaction         : int  3 3 2 4 3 4 4 3 3 2 ...
##  $ MaritalStatus           : chr  "Single" "Single" "Single" "Single" ...
##  $ MonthlyIncome           : int  1420 1200 1878 1051 1904 1611 1569 1514 1675 1483 ...
##  $ SalarySlab              : chr  "Upto 5k" "Upto 5k" "Upto 5k" "Upto 5k" ...
##  $ MonthlyRate             : int  25233 9724 8059 13493 13556 19305 18420 8018 26820 16102 ...
##  $ NumCompaniesWorked      : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ Over18                  : chr  "Y" "Y" "Y" "Y" ...
##  $ OverTime                : chr  "No" "No" "Yes" "No" ...
##  $ PercentSalaryHike       : int  13 12 14 15 12 15 12 16 19 14 ...
##  $ PerformanceRating       : int  3 3 3 3 3 3 3 3 3 3 ...
##  $ RelationshipSatisfaction: int  3 1 4 4 4 3 3 3 4 4 ...
##  $ StandardHours           : int  80 80 80 80 80 80 80 80 80 80 ...
##  $ StockOptionLevel        : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ TotalWorkingYears       : int  0 0 0 0 0 0 0 0 0 1 ...
##  $ TrainingTimesLastYear   : int  2 2 3 2 0 5 2 4 2 3 ...
##  $ WorkLifeBalance         : int  3 3 3 3 3 4 4 1 2 3 ...
##  $ YearsAtCompany          : int  0 0 0 0 0 0 0 0 0 1 ...
##  $ YearsInCurrentRole      : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ YearsSinceLastPromotion : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ YearsWithCurrManager    : int  0 0 0 0 0 0 0 0 0 0 ...
# view the data summary
summary(data)
##     EmpID                Age          AgeGroup          Attrition        
##  Length:1480        Min.   :18.00   Length:1480        Length:1480       
##  Class :character   1st Qu.:30.00   Class :character   Class :character  
##  Mode  :character   Median :36.00   Mode  :character   Mode  :character  
##                     Mean   :36.92                                        
##                     3rd Qu.:43.00                                        
##                     Max.   :60.00                                        
##                                                                          
##  BusinessTravel       DailyRate       Department        DistanceFromHome
##  Length:1480        Min.   : 102.0   Length:1480        Min.   : 1.00   
##  Class :character   1st Qu.: 465.0   Class :character   1st Qu.: 2.00   
##  Mode  :character   Median : 800.0   Mode  :character   Median : 7.00   
##                     Mean   : 801.4                      Mean   : 9.22   
##                     3rd Qu.:1157.0                      3rd Qu.:14.00   
##                     Max.   :1499.0                      Max.   :29.00   
##                                                                         
##    Education     EducationField     EmployeeCount EmployeeNumber  
##  Min.   :1.000   Length:1480        Min.   :1     Min.   :   1.0  
##  1st Qu.:2.000   Class :character   1st Qu.:1     1st Qu.: 493.8  
##  Median :3.000   Mode  :character   Median :1     Median :1027.5  
##  Mean   :2.911                      Mean   :1     Mean   :1031.9  
##  3rd Qu.:4.000                      3rd Qu.:1     3rd Qu.:1568.2  
##  Max.   :5.000                      Max.   :1     Max.   :2068.0  
##                                                                   
##  EnvironmentSatisfaction    Gender            HourlyRate     JobInvolvement
##  Min.   :1.000           Length:1480        Min.   : 30.00   Min.   :1.00  
##  1st Qu.:2.000           Class :character   1st Qu.: 48.00   1st Qu.:2.00  
##  Median :3.000           Mode  :character   Median : 66.00   Median :3.00  
##  Mean   :2.724                              Mean   : 65.85   Mean   :2.73  
##  3rd Qu.:4.000                              3rd Qu.: 83.00   3rd Qu.:3.00  
##  Max.   :4.000                              Max.   :100.00   Max.   :4.00  
##                                                                            
##     JobLevel       JobRole          JobSatisfaction MaritalStatus     
##  Min.   :1.000   Length:1480        Min.   :1.000   Length:1480       
##  1st Qu.:1.000   Class :character   1st Qu.:2.000   Class :character  
##  Median :2.000   Mode  :character   Median :3.000   Mode  :character  
##  Mean   :2.065                      Mean   :2.725                     
##  3rd Qu.:3.000                      3rd Qu.:4.000                     
##  Max.   :5.000                      Max.   :4.000                     
##                                                                       
##  MonthlyIncome    SalarySlab         MonthlyRate    NumCompaniesWorked
##  Min.   : 1009   Length:1480        Min.   : 2094   Min.   :0.000     
##  1st Qu.: 2922   Class :character   1st Qu.: 8051   1st Qu.:1.000     
##  Median : 4933   Mode  :character   Median :14220   Median :2.000     
##  Mean   : 6505                      Mean   :14298   Mean   :2.687     
##  3rd Qu.: 8384                      3rd Qu.:20461   3rd Qu.:4.000     
##  Max.   :19999                      Max.   :26999   Max.   :9.000     
##                                                                       
##     Over18            OverTime         PercentSalaryHike PerformanceRating
##  Length:1480        Length:1480        Min.   :11.00     Min.   :3.000    
##  Class :character   Class :character   1st Qu.:12.00     1st Qu.:3.000    
##  Mode  :character   Mode  :character   Median :14.00     Median :3.000    
##                                        Mean   :15.21     Mean   :3.153    
##                                        3rd Qu.:18.00     3rd Qu.:3.000    
##                                        Max.   :25.00     Max.   :4.000    
##                                                                           
##  RelationshipSatisfaction StandardHours StockOptionLevel TotalWorkingYears
##  Min.   :1.000            Min.   :80    Min.   :0.0000   Min.   : 0.00    
##  1st Qu.:2.000            1st Qu.:80    1st Qu.:0.0000   1st Qu.: 6.00    
##  Median :3.000            Median :80    Median :1.0000   Median :10.00    
##  Mean   :2.709            Mean   :80    Mean   :0.7919   Mean   :11.28    
##  3rd Qu.:4.000            3rd Qu.:80    3rd Qu.:1.0000   3rd Qu.:15.00    
##  Max.   :4.000            Max.   :80    Max.   :3.0000   Max.   :40.00    
##                                                                           
##  TrainingTimesLastYear WorkLifeBalance YearsAtCompany   YearsInCurrentRole
##  Min.   :0.000         Min.   :1.000   Min.   : 0.000   Min.   : 0.000    
##  1st Qu.:2.000         1st Qu.:2.000   1st Qu.: 3.000   1st Qu.: 2.000    
##  Median :3.000         Median :3.000   Median : 5.000   Median : 3.000    
##  Mean   :2.798         Mean   :2.761   Mean   : 7.009   Mean   : 4.228    
##  3rd Qu.:3.000         3rd Qu.:3.000   3rd Qu.: 9.000   3rd Qu.: 7.000    
##  Max.   :6.000         Max.   :4.000   Max.   :40.000   Max.   :18.000    
##                                                                           
##  YearsSinceLastPromotion YearsWithCurrManager
##  Min.   : 0.000          Min.   : 0.000      
##  1st Qu.: 0.000          1st Qu.: 2.000      
##  Median : 1.000          Median : 3.000      
##  Mean   : 2.182          Mean   : 4.118      
##  3rd Qu.: 3.000          3rd Qu.: 7.000      
##  Max.   :15.000          Max.   :17.000      
##                          NA's   :57
# view the first few rows
head(data)
##   EmpID Age AgeGroup Attrition    BusinessTravel DailyRate
## 1 RM297  18    18-25       Yes     Travel_Rarely       230
## 2 RM302  18    18-25        No     Travel_Rarely       812
## 3 RM458  18    18-25       Yes Travel_Frequently      1306
## 4 RM728  18    18-25        No        Non-Travel       287
## 5 RM829  18    18-25       Yes        Non-Travel       247
## 6 RM973  18    18-25        No        Non-Travel      1124
##               Department DistanceFromHome Education EducationField
## 1 Research & Development                3         3  Life Sciences
## 2                  Sales               10         3        Medical
## 3                  Sales                5         3      Marketing
## 4 Research & Development                5         2  Life Sciences
## 5 Research & Development                8         1        Medical
## 6 Research & Development                1         3  Life Sciences
##   EmployeeCount EmployeeNumber EnvironmentSatisfaction Gender HourlyRate
## 1             1            405                       3   Male         54
## 2             1            411                       4 Female         69
## 3             1            614                       2   Male         69
## 4             1           1012                       2   Male         73
## 5             1           1156                       3   Male         80
## 6             1           1368                       4 Female         97
##   JobInvolvement JobLevel               JobRole JobSatisfaction MaritalStatus
## 1              3        1 Laboratory Technician               3        Single
## 2              2        1  Sales Representative               3        Single
## 3              3        1  Sales Representative               2        Single
## 4              3        1    Research Scientist               4        Single
## 5              3        1 Laboratory Technician               3        Single
## 6              3        1 Laboratory Technician               4        Single
##   MonthlyIncome SalarySlab MonthlyRate NumCompaniesWorked Over18 OverTime
## 1          1420    Upto 5k       25233                  1      Y       No
## 2          1200    Upto 5k        9724                  1      Y       No
## 3          1878    Upto 5k        8059                  1      Y      Yes
## 4          1051    Upto 5k       13493                  1      Y       No
## 5          1904    Upto 5k       13556                  1      Y       No
## 6          1611    Upto 5k       19305                  1      Y       No
##   PercentSalaryHike PerformanceRating RelationshipSatisfaction StandardHours
## 1                13                 3                        3            80
## 2                12                 3                        1            80
## 3                14                 3                        4            80
## 4                15                 3                        4            80
## 5                12                 3                        4            80
## 6                15                 3                        3            80
##   StockOptionLevel TotalWorkingYears TrainingTimesLastYear WorkLifeBalance
## 1                0                 0                     2               3
## 2                0                 0                     2               3
## 3                0                 0                     3               3
## 4                0                 0                     2               3
## 5                0                 0                     0               3
## 6                0                 0                     5               4
##   YearsAtCompany YearsInCurrentRole YearsSinceLastPromotion
## 1              0                  0                       0
## 2              0                  0                       0
## 3              0                  0                       0
## 4              0                  0                       0
## 5              0                  0                       0
## 6              0                  0                       0
##   YearsWithCurrManager
## 1                    0
## 2                    0
## 3                    0
## 4                    0
## 5                    0
## 6                    0
# check for missing values
colSums(is.na(data))
##                    EmpID                      Age                 AgeGroup 
##                        0                        0                        0 
##                Attrition           BusinessTravel                DailyRate 
##                        0                        0                        0 
##               Department         DistanceFromHome                Education 
##                        0                        0                        0 
##           EducationField            EmployeeCount           EmployeeNumber 
##                        0                        0                        0 
##  EnvironmentSatisfaction                   Gender               HourlyRate 
##                        0                        0                        0 
##           JobInvolvement                 JobLevel                  JobRole 
##                        0                        0                        0 
##          JobSatisfaction            MaritalStatus            MonthlyIncome 
##                        0                        0                        0 
##               SalarySlab              MonthlyRate       NumCompaniesWorked 
##                        0                        0                        0 
##                   Over18                 OverTime        PercentSalaryHike 
##                        0                        0                        0 
##        PerformanceRating RelationshipSatisfaction            StandardHours 
##                        0                        0                        0 
##         StockOptionLevel        TotalWorkingYears    TrainingTimesLastYear 
##                        0                        0                        0 
##          WorkLifeBalance           YearsAtCompany       YearsInCurrentRole 
##                        0                        0                        0 
##  YearsSinceLastPromotion     YearsWithCurrManager 
##                        0                       57
# fill missing values with the median
median_val <- median(data$YearsWithCurrManager, na.rm = TRUE)
data$YearsWithCurrManager[is.na(data$YearsWithCurrManager)] <- median_val

# recheck to confirm that no missing values remain
colSums(is.na(data))
##                    EmpID                      Age                 AgeGroup 
##                        0                        0                        0 
##                Attrition           BusinessTravel                DailyRate 
##                        0                        0                        0 
##               Department         DistanceFromHome                Education 
##                        0                        0                        0 
##           EducationField            EmployeeCount           EmployeeNumber 
##                        0                        0                        0 
##  EnvironmentSatisfaction                   Gender               HourlyRate 
##                        0                        0                        0 
##           JobInvolvement                 JobLevel                  JobRole 
##                        0                        0                        0 
##          JobSatisfaction            MaritalStatus            MonthlyIncome 
##                        0                        0                        0 
##               SalarySlab              MonthlyRate       NumCompaniesWorked 
##                        0                        0                        0 
##                   Over18                 OverTime        PercentSalaryHike 
##                        0                        0                        0 
##        PerformanceRating RelationshipSatisfaction            StandardHours 
##                        0                        0                        0 
##         StockOptionLevel        TotalWorkingYears    TrainingTimesLastYear 
##                        0                        0                        0 
##          WorkLifeBalance           YearsAtCompany       YearsInCurrentRole 
##                        0                        0                        0 
##  YearsSinceLastPromotion     YearsWithCurrManager 
##                        0                        0
# check for duplicate records
sum(duplicated(data$EmpID))
## [1] 10
# remove rows with a duplicate EmpID (the first occurrence is kept)
data <- data[!duplicated(data$EmpID), ]

# recheck the number of duplicates
sum(duplicated(data$EmpID))
## [1] 0
# check the number of unique values per column
sapply(data, function(x) length(unique(x)))
##                    EmpID                      Age                 AgeGroup 
##                     1470                       43                        5 
##                Attrition           BusinessTravel                DailyRate 
##                        2                        4                      886 
##               Department         DistanceFromHome                Education 
##                        3                       29                        5 
##           EducationField            EmployeeCount           EmployeeNumber 
##                        6                        1                     1470 
##  EnvironmentSatisfaction                   Gender               HourlyRate 
##                        4                        2                       71 
##           JobInvolvement                 JobLevel                  JobRole 
##                        4                        5                        9 
##          JobSatisfaction            MaritalStatus            MonthlyIncome 
##                        4                        3                     1349 
##               SalarySlab              MonthlyRate       NumCompaniesWorked 
##                        4                     1427                       10 
##                   Over18                 OverTime        PercentSalaryHike 
##                        1                        2                       15 
##        PerformanceRating RelationshipSatisfaction            StandardHours 
##                        2                        4                        1 
##         StockOptionLevel        TotalWorkingYears    TrainingTimesLastYear 
##                        4                       40                        7 
##          WorkLifeBalance           YearsAtCompany       YearsInCurrentRole 
##                        4                       37                       19 
##  YearsSinceLastPromotion     YearsWithCurrManager 
##                       16                       18
# drop uninformative variables (identifiers and constant columns)
data <- data %>% select(-c(EmpID, EmployeeCount, Over18, StandardHours, EmployeeNumber))

# convert the categorical columns to factor
categorical_cols <- c("JobRole", "Gender", "MaritalStatus", "BusinessTravel", 
                      "Department", "EducationField", "Attrition", 
                      "AgeGroup", "SalarySlab", "OverTime")

data[categorical_cols] <- lapply(data[categorical_cols], as.factor)
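
The unique-value count above reports 4 distinct values for BusinessTravel, and the encoded column names later in this document include `BusinessTravel.TravelRarely` alongside `BusinessTravel.Travel_Rarely`, which suggests one category may be spelled two ways. If that is the case, the variants could be merged before converting to a factor. The sketch below works on a made-up vector; treating `TravelRarely` as the same category as `Travel_Rarely` is an assumption that should be checked against `table(data$BusinessTravel)` first.

```r
# Hypothetical sample of labels; the real column is data$BusinessTravel
travel <- c("Travel_Rarely", "TravelRarely", "Non-Travel",
            "Travel_Frequently", "Travel_Rarely")

# Inspect the spellings before changing anything
table(travel)

# Map the variant spelling onto the main label (assumed to be the same category)
travel[travel == "TravelRarely"] <- "Travel_Rarely"

# Convert to factor only after cleaning, so no stray level is created
travel <- factor(travel)
levels(travel)
```

Cleaning the labels while the column is still character avoids having to drop an unused factor level afterwards.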

EDA

  1. JobRole distribution

From the distribution plot:

    • The roles with the most employees are Sales Executive, Research Scientist, and Laboratory Technician.
    • The roles with the fewest employees are Healthcare Representative and Manufacturing Director.

This shows that roles are unevenly distributed, with sales, research, and laboratory functions dominating the organizational structure.

  2. Proportion of categorical variables per JobRole

Proportion plots were used to examine each of the following variables:

    • Gender
    • MaritalStatus
    • BusinessTravel
    • Department
    • EducationField
    • Attrition
    • AgeGroup
    • SalarySlab
    • OverTime

  3. Distribution of numeric variables (boxplots per JobRole)

The boxplots were used to examine:

    • Age
    • DistanceFromHome
    • Satisfaction and Involvement
    • Income
    • PercentSalaryHike, StockOptionLevel, PerformanceRating
    • YearsAtCompany, YearsInCurrentRole, YearsWithCurrManager, YearsSinceLastPromotion
    • WorkLifeBalance
# plot the distribution of the target variable JobRole
data %>%
  count(JobRole) %>%
  ggplot(aes(x = reorder(JobRole, n), y = n, fill = JobRole)) +
  geom_col() +  # same as geom_bar(stat = "identity")
  coord_flip() +  # flip the axes so the labels are easier to read
  theme_minimal() +
  labs(title = "Job Role Distribution", x = "Job Role", y = "Count") +
  theme(legend.position = "none")

# Relationship between categorical variables and JobRole (one plot per variable)
categorical_predictors <- c("Gender", "MaritalStatus", "BusinessTravel",
                            "Department", "EducationField", "Attrition",
                            "AgeGroup", "SalarySlab", "OverTime")

# Set the plot size (repr.* options only take effect in Jupyter/IRkernel;
# in R Markdown use the fig.width and fig.height chunk options instead)
options(repr.plot.width = 8, repr.plot.height = 5)

# Draw the plots one at a time
for (var in categorical_predictors) {
  # .data[[var]] replaces the deprecated aes_string()
  p <- ggplot(data, aes(x = JobRole, fill = .data[[var]])) +
    geom_bar(position = "fill") +
    coord_flip() +
    labs(title = paste("Proportion of", var, "per JobRole"),
         y = "Proportion", x = NULL, fill = var) +
    theme_minimal(base_size = 10) +
    theme(legend.position = "bottom")

  print(p)  # print each plot separately
}

# Relationship between numeric variables and JobRole (one plot per variable)

# Select the numeric columns automatically
numeric_vars <- data %>%
  select(where(is.numeric)) %>%
  colnames()

# Set the plot size (repr.* options only take effect in Jupyter/IRkernel)
options(repr.plot.width = 8, repr.plot.height = 5)

# Draw the plots one at a time
for (var in numeric_vars) {
  # .data[[var]] replaces the deprecated aes_string()
  p <- ggplot(data, aes(x = JobRole, y = .data[[var]])) +
    geom_boxplot(fill = "lightblue") +
    coord_flip() +
    theme_minimal(base_size = 10) +
    labs(title = paste("Boxplot of", var, "per JobRole"),
         x = NULL, y = var)

  print(p)  # print each plot
}

Modeling Analysis

A multinomial logistic regression model is built to predict job role (JobRole) from the available features. The model is fitted with multinom(), using all predictor variables in the dataset (~ .).
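
As a rough illustration of how `multinom()` is called with the `~ .` formula, here is a minimal sketch on the built-in `iris` data. The data set, the object names (`fit`, `coef_mat`, `pred_class`, `pred_prob`), and the use of `trace = FALSE` are stand-ins chosen for this example and are not the model fitted in this report.

```r
library(nnet)  # provides multinom()

# Stand-in data: iris has a 3-class target (Species) and 4 numeric predictors
set.seed(123)
fit <- multinom(Species ~ ., data = iris, trace = FALSE)

# One row of coefficients per non-reference class (reference = first level)
coef_mat <- coef(fit)
dim(coef_mat)

# Predicted class labels and class probabilities
pred_class <- predict(fit, newdata = iris, type = "class")
pred_prob  <- predict(fit, newdata = iris, type = "probs")
head(pred_prob, 3)
```

Each coefficient row describes the log-odds of one class relative to the reference class, which is why a 9-class target such as JobRole yields 8 sets of coefficients.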

Coefficient Analysis

  1. Gender
  2. Marital Status
  3. Department

Continuous Variables

  1. Monthly Income
  2. Age
  3. Total Working Years

Model Performance Evaluation

  1. Confusion Matrix

  2. Classification Statistics
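
To make the two evaluation items concrete, the following sketch builds a confusion matrix and derives accuracy and per-class recall in base R. The labels and predictions are invented for illustration and are not results from this model.

```r
# Hypothetical true and predicted labels for a 3-class problem
lv     <- c("Manager", "Sales Executive", "Research Scientist")
actual <- factor(c("Manager", "Manager", "Sales Executive", "Sales Executive",
                   "Research Scientist", "Research Scientist"), levels = lv)
predicted <- factor(c("Manager", "Sales Executive", "Sales Executive",
                      "Sales Executive", "Research Scientist", "Manager"),
                    levels = lv)

# Confusion matrix: rows are predictions, columns are true classes
cm <- table(Predicted = predicted, Actual = actual)
cm

# Overall accuracy: share of cases on the diagonal
accuracy <- sum(diag(cm)) / sum(cm)

# Per-class recall (sensitivity): correct predictions / true cases in that class
recall <- diag(cm) / colSums(cm)
recall
```

caret's `confusionMatrix()` reports the same kind of table together with per-class statistics, so the manual version is mainly useful for seeing where those numbers come from.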

Interpretation of Results

  1. Most influential variables
    Department, monthly income, and total working years are strong predictors, judging by the magnitude of their coefficients.

  2. Prediction patterns

    • Managerial positions → strongly associated with high income
    • Sales positions → closely tied to the Sales department
    • Technical positions → associated with the education variables

  3. Model limitations

    • Classes such as Healthcare Representative and Manufacturing Director → low predictive performance
    • Large standard errors on some coefficients (especially the intercepts) → indicate uncertain estimates

  4. Model fit

    • AIC: 2572.975 → baseline for comparison with alternative models
    • Residual deviance: 2220.975 → reflects how well the model fits the data
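
The two fit figures are related: for a model fitted by maximum likelihood, AIC equals the residual deviance (-2 times the log-likelihood) plus twice the number of estimated parameters. A quick check with the reported values shows how many parameters that implies. The variable names below are only for illustration.

```r
aic      <- 2572.975   # AIC reported above
deviance <- 2220.975   # residual deviance reported above

# AIC = deviance + 2 * k  =>  k = (AIC - deviance) / 2
n_params <- (aic - deviance) / 2
n_params

# With 9 JobRole classes there are 8 non-reference classes
n_params / 8
```

This gives about 176 parameters, or 22 per non-reference class, which would be consistent with an intercept plus 21 predictor terms per class. The penalty of roughly 352 is large relative to the deviance, so AIC comparisons against a smaller model could be informative.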
# split the data into training and test sets
set.seed(123)
split <- sample.split(data$JobRole, SplitRatio = 0.7)
train_data <- subset(data, split == TRUE)
test_data  <- subset(data, split == FALSE)

# separate the predictors (features) from the target (JobRole)
x_train <- train_data %>% select(-JobRole)
y_train <- train_data$JobRole

# oversample to balance the classes
set.seed(123)
balanced_train <- upSample(
  x = x_train,        # features
  y = y_train,        # target
  yname = "JobRole"   # name of the target variable
)

# check the class distribution after oversampling
table(balanced_train$JobRole)
## 
## Healthcare Representative           Human Resources     Laboratory Technician 
##                       228                       228                       228 
##                   Manager    Manufacturing Director         Research Director 
##                       228                       228                       228 
##        Research Scientist           Sales Executive      Sales Representative 
##                       228                       228                       228
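
The balanced counts above come from random oversampling: every class is topped up, by sampling its own rows with replacement, until it matches the largest class. The base-R sketch below shows the idea on a made-up data frame. It is a simplified illustration, not caret's implementation of `upSample()`, and the names `toy`, `n_max`, and `toy_balanced` are hypothetical.

```r
set.seed(123)

# Hypothetical imbalanced training set: 6 "A", 3 "B", 2 "C"
toy <- data.frame(
  x     = 1:11,
  class = factor(rep(c("A", "B", "C"), times = c(6, 3, 2)))
)

# Target size: the count of the largest class
n_max <- max(table(toy$class))

# Keep every original row, then add rows drawn with replacement
# from the same class until the class reaches n_max rows
idx <- unlist(lapply(split(seq_len(nrow(toy)), toy$class), function(rows) {
  extra <- rows[sample.int(length(rows), n_max - length(rows), replace = TRUE)]
  c(rows, extra)
}))
toy_balanced <- toy[idx, ]

table(toy_balanced$class)
```

Because the added rows are exact copies, oversampling is applied to the training split only, as in the code above; the test split keeps its original class distribution.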
# select the important features
selected_features <- c("Gender", "MaritalStatus", "Department", "BusinessTravel", "EducationField",
                       "OverTime", "Age", "MonthlyIncome", "TotalWorkingYears", 
                       "YearsAtCompany", "DistanceFromHome", "TrainingTimesLastYear", 
                       "NumCompaniesWorked")

# extract the final features and target for training
x_balanced <- balanced_train %>% select(all_of(selected_features))
y_balanced <- balanced_train$JobRole

# convert character variables to factors
x_balanced[] <- lapply(x_balanced, function(x) if(is.character(x)) as.factor(x) else x)

# encode the categorical variables as dummy variables
dummies_model <- dummyVars(~ ., data = x_balanced)

# apply the transformation to x_balanced
x_balanced_encoded <- predict(dummies_model, newdata = x_balanced)

# convert to a data frame
x_balanced_encoded <- as.data.frame(x_balanced_encoded)

# inspect the result
head(x_balanced_encoded)
##   Gender.Female Gender.Male MaritalStatus.Divorced MaritalStatus.Married
## 1             0           1                      1                     0
## 2             0           1                      0                     1
## 3             1           0                      0                     0
## 4             0           1                      0                     0
## 5             0           1                      0                     0
## 6             1           0                      0                     1
##   MaritalStatus.Single Department.Human Resources
## 1                    0                          0
## 2                    0                          0
## 3                    1                          0
## 4                    1                          0
## 5                    1                          0
## 6                    0                          0
##   Department.Research & Development Department.Sales BusinessTravel.Non-Travel
## 1                                 1                0                         1
## 2                                 1                0                         0
## 3                                 1                0                         0
## 4                                 1                0                         0
## 5                                 1                0                         1
## 6                                 1                0                         0
##   BusinessTravel.Travel_Frequently BusinessTravel.Travel_Rarely
## 1                                0                            0
## 2                                1                            0
## 3                                0                            1
## 4                                0                            1
## 5                                0                            0
## 6                                0                            1
##   BusinessTravel.TravelRarely EducationField.Human Resources
## 1                           0                              0
## 2                           0                              0
## 3                           0                              0
## 4                           0                              0
## 5                           0                              0
## 6                           0                              0
##   EducationField.Life Sciences EducationField.Marketing EducationField.Medical
## 1                            1                        0                      0
## 2                            0                        0                      1
## 3                            0                        0                      1
## 4                            0                        0                      1
## 5                            0                        0                      0
## 6                            0                        0                      1
##   EducationField.Other EducationField.Technical Degree OverTime.No OverTime.Yes
## 1                    0                               0           1            0
## 2                    0                               0           0            1
## 3                    0                               0           1            0
## 4                    0                               0           1            0
## 5                    0                               1           1            0
## 6                    0                               0           1            0
##   Age MonthlyIncome TotalWorkingYears YearsAtCompany DistanceFromHome
## 1  25          4000                 6              6                5
## 2  26          4741                 5              5               11
## 3  27          6811                 9              7                7
## 4  28          5661                 9              8               16
## 5  28          8722                10             10               24
## 6  29          4335                11              8               27
##   TrainingTimesLastYear NumCompaniesWorked
## 1                     2                  1
## 2                     3                  1
## 3                     2                  8
## 4                     2                  0
## 5                     2                  1
## 6                     3                  4
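The encoded columns above include both `BusinessTravel.Travel_Rarely` and `BusinessTravel.TravelRarely`, which suggests that the raw `BusinessTravel` column holds two spellings of the same level. A minimal sketch of harmonizing the labels before one-hot encoding is shown below. The data frame `raw_toy` is hypothetical; the real pre-encoding object is not shown in this section.

```r
# Hypothetical toy data; `raw_toy` stands in for the data frame used before encoding.
raw_toy <- data.frame(
  BusinessTravel = c("Travel_Rarely", "TravelRarely", "Non-Travel",
                     "Travel_Frequently", "Travel_Rarely"),
  stringsAsFactors = FALSE
)

# Map the variant spelling onto the canonical level, then convert to a factor.
raw_toy$BusinessTravel[raw_toy$BusinessTravel == "TravelRarely"] <- "Travel_Rarely"
raw_toy$BusinessTravel <- factor(raw_toy$BusinessTravel)

# Only three levels remain, so encoding yields three dummy columns instead of four.
levels(raw_toy$BusinessTravel)
```

Whether the two spellings really mean the same category should be confirmed against the raw data before merging them.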
# standardize the numeric features (z-score: mean 0, standard deviation 1)
x_balanced_scaled <- as.data.frame(scale(x_balanced_encoded))

# inspect the result of the standardization
summary(x_balanced_scaled)
##  Gender.Female      Gender.Male      MaritalStatus.Divorced
##  Min.   :-0.8569   Min.   :-1.1665   Min.   :-0.5509       
##  1st Qu.:-0.8569   1st Qu.:-1.1665   1st Qu.:-0.5509       
##  Median :-0.8569   Median : 0.8569   Median :-0.5509       
##  Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.0000       
##  3rd Qu.: 1.1665   3rd Qu.: 0.8569   3rd Qu.:-0.5509       
##  Max.   : 1.1665   Max.   : 0.8569   Max.   : 1.8142       
##  MaritalStatus.Married MaritalStatus.Single Department.Human Resources
##  Min.   :-0.9643       Min.   :-0.6313      Min.   :-0.3715           
##  1st Qu.:-0.9643       1st Qu.:-0.6313      1st Qu.:-0.3715           
##  Median :-0.9643       Median :-0.6313      Median :-0.3715           
##  Mean   : 0.0000       Mean   : 0.0000      Mean   : 0.0000           
##  3rd Qu.: 1.0365       3rd Qu.: 1.5832      3rd Qu.:-0.3715           
##  Max.   : 1.0365       Max.   : 1.5832      Max.   : 2.6902           
##  Department.Research & Development Department.Sales  BusinessTravel.Non-Travel
##  Min.   :-1.2701                   Min.   :-0.5945   Min.   :-0.3128          
##  1st Qu.:-1.2701                   1st Qu.:-0.5945   1st Qu.:-0.3128          
##  Median : 0.7869                   Median :-0.5945   Median :-0.3128          
##  Mean   : 0.0000                   Mean   : 0.0000   Mean   : 0.0000          
##  3rd Qu.: 0.7869                   3rd Qu.: 1.6814   3rd Qu.:-0.3128          
##  Max.   : 0.7869                   Max.   : 1.6814   Max.   : 3.1950          
##  BusinessTravel.Travel_Frequently BusinessTravel.Travel_Rarely
##  Min.   :-0.4604                  Min.   :-1.644              
##  1st Qu.:-0.4604                  1st Qu.:-1.644              
##  Median :-0.4604                  Median : 0.608              
##  Mean   : 0.0000                  Mean   : 0.000              
##  3rd Qu.:-0.4604                  3rd Qu.: 0.608              
##  Max.   : 2.1711                  Max.   : 0.608              
##  BusinessTravel.TravelRarely EducationField.Human Resources
##  Min.   :-0.07668            Min.   :-0.2298               
##  1st Qu.:-0.07668            1st Qu.:-0.2298               
##  Median :-0.07668            Median :-0.2298               
##  Mean   : 0.00000            Mean   : 0.0000               
##  3rd Qu.:-0.07668            3rd Qu.:-0.2298               
##  Max.   :13.03523            Max.   : 4.3489               
##  EducationField.Life Sciences EducationField.Marketing EducationField.Medical
##  Min.   :-0.8611              Min.   :-0.2947          Min.   :-0.651        
##  1st Qu.:-0.8611              1st Qu.:-0.2947          1st Qu.:-0.651        
##  Median :-0.8611              Median :-0.2947          Median :-0.651        
##  Mean   : 0.0000              Mean   : 0.0000          Mean   : 0.000        
##  3rd Qu.: 1.1607              3rd Qu.:-0.2947          3rd Qu.: 1.535        
##  Max.   : 1.1607              Max.   : 3.3921          Max.   : 1.535        
##  EducationField.Other EducationField.Technical Degree  OverTime.No     
##  Min.   :-0.2535      Min.   :-0.3062                 Min.   :-1.5553  
##  1st Qu.:-0.2535      1st Qu.:-0.3062                 1st Qu.:-1.5553  
##  Median :-0.2535      Median :-0.3062                 Median : 0.6427  
##  Mean   : 0.0000      Mean   : 0.0000                 Mean   : 0.0000  
##  3rd Qu.:-0.2535      3rd Qu.:-0.3062                 3rd Qu.: 0.6427  
##  Max.   : 3.9422      Max.   : 3.2640                 Max.   : 0.6427  
##   OverTime.Yes          Age           MonthlyIncome     TotalWorkingYears
##  Min.   :-0.6427   Min.   :-2.11515   Min.   :-1.1735   Min.   :-1.4228  
##  1st Qu.:-0.6427   1st Qu.:-0.73178   1st Qu.:-0.8307   1st Qu.:-0.7407  
##  Median :-0.6427   Median :-0.09329   Median :-0.3950   Median :-0.2859  
##  Mean   : 0.0000   Mean   : 0.00000   Mean   : 0.0000   Mean   : 0.0000  
##  3rd Qu.: 1.5553   3rd Qu.: 0.65160   3rd Qu.: 0.5946   3rd Qu.: 0.7373  
##  Max.   : 1.5553   Max.   : 2.35422   Max.   : 2.2142   Max.   : 3.1247  
##  YearsAtCompany    DistanceFromHome  TrainingTimesLastYear NumCompaniesWorked
##  Min.   :-1.0899   Min.   :-1.0382   Min.   :-2.2068       Min.   :-1.0771   
##  1st Qu.:-0.6637   1st Qu.:-0.9150   1st Qu.:-0.6431       1st Qu.:-0.6890   
##  Median :-0.3796   Median :-0.2993   Median : 0.1387       Median :-0.3009   
##  Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.0000       Mean   : 0.0000   
##  3rd Qu.: 0.3307   3rd Qu.: 0.6859   3rd Qu.: 0.1387       3rd Qu.: 0.4753   
##  Max.   : 4.5927   Max.   : 2.4100   Max.   : 2.4842       Max.   : 2.4157
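As a small illustration of what `scale()` does here, the sketch below standardizes a hypothetical toy data frame (not the report's data) and repeats the computation by hand. It also shows why every dummy column in the summary above takes only two distinct values after scaling.

```r
# Hypothetical toy features: one numeric column and one 0/1 dummy column.
toy <- data.frame(Age = c(25, 26, 27, 28, 28, 29),
                  OverTime.Yes = c(0, 1, 0, 0, 0, 0))

# scale() centers each column on its mean and divides by its sample standard deviation.
scaled_auto <- as.data.frame(scale(toy))

# The same computation written out column by column.
scaled_manual <- as.data.frame(lapply(toy, function(col) (col - mean(col)) / sd(col)))

# Largest absolute difference between the two versions (should be numerically zero).
max(abs(as.matrix(scaled_auto) - as.matrix(scaled_manual)))

# Number of distinct values per column; a 0/1 dummy still has exactly two.
sapply(scaled_auto, function(col) length(unique(col)))
```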
# compute the correlation matrix
cor_matrix <- cor(x_balanced_scaled)

# find feature pairs with high correlation
# note: the `< 1` bound skips the diagonal, but it can also skip pairs with |r| = 1,
# such as complementary dummy columns (e.g. Gender.Female / Gender.Male)
high_cor_pairs <- which(abs(cor_matrix) > 0.9 & abs(cor_matrix) < 1, arr.ind = TRUE)

# remove duplicates (the matrix is symmetric); drop = FALSE keeps a matrix even for a single row
high_cor_pairs <- high_cor_pairs[high_cor_pairs[,1] < high_cor_pairs[,2], , drop = FALSE]

# show the feature pairs with high correlation
if (nrow(high_cor_pairs) > 0) {
  data.frame(
    Fitur_1 = colnames(x_balanced_scaled)[high_cor_pairs[,1]],
    Fitur_2 = colnames(x_balanced_scaled)[high_cor_pairs[,2]],
    Korelasi = cor_matrix[high_cor_pairs]
  )
} else {
  print("No feature pairs with correlation > 0.9")
}
## [1] "No feature pairs with correlation > 0.9"
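Because every level of each factor was encoded as its own column, pairs such as `Gender.Female` / `Gender.Male` are exact complements, and a filter that requires `abs(r) < 1` can miss them. The sketch below is an alternative check, not the report's method. It uses hypothetical toy data and `upper.tri()` to skip the diagonal and the mirrored half of the matrix without excluding |r| = 1.

```r
# Hypothetical toy data with complementary dummy columns, as produced by full one-hot encoding.
is_female <- rep(c(1, 0, 0, 1, 0), times = 10)
toy <- data.frame(Gender.Female = is_female,
                  Gender.Male = 1 - is_female,
                  Age = 21:70)

cor_toy <- cor(toy)

# The upper triangle excludes the diagonal and mirrored entries,
# so pairs with |r| = 1 are kept instead of being filtered out.
idx <- which(abs(cor_toy) > 0.9 & upper.tri(cor_toy), arr.ind = TRUE)

data.frame(Feature_1 = rownames(cor_toy)[idx[, "row"]],
           Feature_2 = colnames(cor_toy)[idx[, "col"]],
           Correlation = cor_toy[idx])
```

One common way to avoid such pairs is to drop one reference level per factor when encoding.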
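# combine the features and the target into a single data frame
rf_data <- x_balanced_scaled
rf_data$JobRole <- y_balanced

# fit the multinomial logistic regression model (nnet::multinom)
# note: by default the optimizer runs for at most maxit = 100 iterations
multinom_model <- multinom(JobRole ~ ., data = rf_data)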
## # weights:  261 (224 variable)
## initial  value 4508.704833 
## iter  10 value 2162.898516
## iter  20 value 1560.075623
## iter  30 value 1255.111597
## iter  40 value 1192.001751
## iter  50 value 1166.385568
## iter  60 value 1149.197785
## iter  70 value 1135.265369
## iter  80 value 1124.385279
## iter  90 value 1113.938609
## iter 100 value 1110.487528
## final  value 1110.487528 
## stopped after 100 iterations
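The trace ends with `stopped after 100 iterations`, which means the optimizer reached its iteration cap rather than its convergence tolerance. The starting value can also be checked by hand: with all weights at zero, each of the K classes gets probability 1/K, so the negative log-likelihood is n * log(K). The sketch below checks this for n = 2052 and K = 9 (the counts in the confusion matrix reported later in this document). It then shows, on hypothetical toy data, how `maxit` can be raised and the convergence flag inspected.

```r
library(nnet)

# Starting value check: n * log(K) for 2052 rows and 9 classes.
n_obs <- 2052
n_classes <- 9
initial_nll <- n_obs * log(n_classes)
print(initial_nll)  # compare with the "initial value" line in the trace

# Hypothetical toy data with three overlapping classes (not the report's data).
set.seed(42)
n <- 300
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- cut(toy$x1 + rnorm(n), breaks = c(-Inf, -0.5, 0.5, Inf),
             labels = c("low", "mid", "high"))

# maxit is passed through to the optimizer; trace = FALSE silences the iteration log.
fit <- multinom(y ~ x1 + x2, data = toy, maxit = 500, trace = FALSE)

# 0 means the optimizer converged, 1 means it hit the maxit cap.
fit$convergence
```

Whether a larger `maxit` changes the report's coefficients noticeably is not known from the output shown here; it would have to be re-run.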
# view the model summary
summary(multinom_model)
## Call:
## multinom(formula = JobRole ~ ., data = rf_data)
## 
## Coefficients:
##                        (Intercept) Gender.Female Gender.Male
## Human Resources          -1.061685    0.10136605 -0.10136605
## Laboratory Technician    -2.831749   -0.13803225  0.13803225
## Manager                  -5.927352    0.27334716 -0.27334716
## Manufacturing Director    2.167369    0.01939807 -0.01939807
## Research Director        -6.189242    0.12929801 -0.12929801
## Research Scientist       -2.619415   -0.06873669  0.06873669
## Sales Executive          -1.898217    0.31232469 -0.31232469
## Sales Representative    -17.373032    0.21351011 -0.21351011
##                        MaritalStatus.Divorced MaritalStatus.Married
## Human Resources                   -0.24445127            0.00192417
## Laboratory Technician              0.03417554           -0.05065538
## Manager                            0.05759020           -0.60750332
## Manufacturing Director             0.11046715           -0.05616205
## Research Director                  0.25390993           -0.75734562
## Research Scientist                 0.00744938           -0.12009636
## Sales Executive                   -0.20267892           -0.41371175
## Sales Representative              -0.47980325           -0.15964199
##                        MaritalStatus.Single `Department.Human Resources`
## Human Resources                  0.22675441                    4.2830480
## Laboratory Technician            0.02406655                    0.1288996
## Manager                          0.61846603                    2.2224973
## Manufacturing Director          -0.04127184                    1.0610813
## Research Director                0.60049481                    1.1863110
## Research Scientist               0.12594847                    0.7687039
## Sales Executive                  0.64767084                    1.0484758
## Sales Representative             0.62594119                    0.5585218
##                        `Department.Research & Development` Department.Sales
## Human Resources                                 -3.6079156        0.8080047
## Laboratory Technician                           -0.8289166        0.8212575
## Manager                                         -4.1369793        2.9249447
## Manufacturing Director                          -1.2293710        0.5714059
## Research Director                               -1.9676342        1.2950973
## Research Scientist                              -1.0135140        0.5499182
## Sales Executive                                 -5.4127090        5.2089967
## Sales Representative                            -5.5457630        5.7203845
##                        `BusinessTravel.Non-Travel`
## Human Resources                        0.036332207
## Laboratory Technician                 -0.190755600
## Manager                                0.375318091
## Manufacturing Director                -0.197406714
## Research Director                      0.288140954
## Research Scientist                    -0.172676849
## Sales Executive                        0.467183241
## Sales Representative                  -0.006820797
##                        BusinessTravel.Travel_Frequently
## Human Resources                             -0.06064750
## Laboratory Technician                        0.03531266
## Manager                                      0.33475455
## Manufacturing Director                       0.05077896
## Research Director                            0.07980227
## Research Scientist                           0.01364786
## Sales Executive                              0.70204478
## Sales Representative                         0.42440102
##                        BusinessTravel.Travel_Rarely BusinessTravel.TravelRarely
## Human Resources                        -0.124094230                   0.8889181
## Laboratory Technician                  -0.100849509                   1.1242559
## Manager                                -0.490239555                  -0.2165021
## Manufacturing Director                  0.148279419                  -0.3784864
## Research Director                      -0.339436496                   0.5016792
## Research Scientist                     -0.002881688                   0.5942185
## Sales Executive                        -0.996428162                   0.5572517
## Sales Representative                   -0.529319146                   0.9927361
##                        `EducationField.Human Resources`
## Human Resources                             0.376945138
## Laboratory Technician                       0.541305285
## Manager                                     1.803316219
## Manufacturing Director                      0.646161525
## Research Director                           2.156712060
## Research Scientist                          0.003569817
## Sales Executive                             0.897902232
## Sales Representative                        0.476872446
##                        `EducationField.Life Sciences` EducationField.Marketing
## Human Resources                            -0.2500912                1.4164515
## Laboratory Technician                      -0.3169493                1.0932912
## Manager                                    -0.5058736                1.0867148
## Manufacturing Director                     -0.2779794                0.9456363
## Research Director                          -0.3521574                1.2510551
## Research Scientist                         -0.2633623                1.1301857
## Sales Executive                            -0.5248653                1.2707762
## Sales Representative                       -0.5060992                0.6556319
##                        EducationField.Medical EducationField.Other
## Human Resources                   -0.33242769         -0.312937691
## Laboratory Technician             -0.39229867         -0.001306596
## Manager                            0.03175927          0.082924610
## Manufacturing Director            -0.36678143         -0.117453920
## Research Director                 -0.23242821         -0.430308336
## Research Scientist                -0.30909279         -0.073925784
## Sales Executive                    0.03135411          0.168848582
## Sales Representative              -0.41392748          0.136990381
##                        `EducationField.Technical Degree` OverTime.No
## Human Resources                              -0.41483906  0.18783906
## Laboratory Technician                        -0.27939787  0.09019187
## Manager                                      -1.68761191  0.36545109
## Manufacturing Director                       -0.22982288  0.11717744
## Research Director                            -1.52562133  0.21882853
## Research Scientist                           -0.06453701 -0.06849352
## Sales Executive                              -1.19878100  0.07152328
## Sales Representative                          0.44631871  0.33651536
##                        OverTime.Yes        Age MonthlyIncome TotalWorkingYears
## Human Resources         -0.18783906 -0.4076193    -0.8478695         0.7426855
## Laboratory Technician   -0.09019187 -0.1400765    -9.5540467         0.6310785
## Manager                 -0.36545109 -0.2728248    17.8125549        -4.2987722
## Manufacturing Director  -0.11717744 -0.1313403    -0.2994945         0.3059534
## Research Director       -0.21882853 -0.1948751    16.5360052        -3.9711549
## Research Scientist       0.06849352 -0.2104517    -9.3061510         0.4066149
## Sales Executive         -0.07152328 -0.8750792     3.3263893        -1.0046587
## Sales Representative    -0.33651536 -1.0790986   -21.1890476         4.2944762
##                        YearsAtCompany DistanceFromHome TrainingTimesLastYear
## Human Resources            -1.7352144      -0.27476949            0.13751423
## Laboratory Technician       0.1180779      -0.04031161            0.07704332
## Manager                    -0.4699736      -0.96191303            0.43099031
## Manufacturing Director     -0.3649694       0.12924888            0.27550769
## Research Director          -0.8010961      -0.68218089            0.08982287
## Research Scientist          0.4727734      -0.01352127           -0.16263993
## Sales Executive             0.8393971      -0.25272838            0.67532489
## Sales Representative       -3.6331234      -0.92952806            0.81893442
##                        NumCompaniesWorked
## Human Resources                -0.4040001
## Laboratory Technician           0.2499453
## Manager                        -0.8959202
## Manufacturing Director         -0.1374286
## Research Director              -0.4226508
## Research Scientist              0.1912795
## Sales Executive                -0.1535980
## Sales Representative           -1.7219733
## 
## Std. Errors:
##                        (Intercept) Gender.Female Gender.Male
## Human Resources           22.72266    0.51830249  0.51830249
## Laboratory Technician     22.71487    0.06922736  0.06922736
## Manager                   22.72324    0.16125453  0.16125453
## Manufacturing Director    22.38426    0.05003849  0.05003849
## Research Director         22.73131    0.15364205  0.15364205
## Research Scientist        22.70713    0.06857740  0.06857740
## Sales Executive           22.73091    0.27840855  0.27840855
## Sales Representative      23.03226    0.32069309  0.32069309
##                        MaritalStatus.Divorced MaritalStatus.Married
## Human Resources                    0.79647880            0.61916303
## Laboratory Technician              0.09918692            0.08286235
## Manager                            0.22680019            0.22466866
## Manufacturing Director             0.06899462            0.06023252
## Research Director                  0.21105757            0.21345691
## Research Scientist                 0.10010336            0.08307996
## Sales Executive                    0.40170698            0.37721323
## Sales Representative               0.49362193            0.43011392
##                        MaritalStatus.Single `Department.Human Resources`
## Human Resources                  0.70866109                     13.81664
## Laboratory Technician            0.09481463                     14.06096
## Manager                          0.25997603                     13.81220
## Manufacturing Director           0.07017209                     13.83810
## Research Director                0.25012008                     13.91120
## Research Scientist               0.09396103                     13.85911
## Sales Executive                  0.45205797                     13.90057
## Sales Representative             0.49830219                     13.91128
##                        `Department.Research & Development` Department.Sales
## Human Resources                                   5.588966         5.872572
## Laboratory Technician                             5.645735         5.911951
## Manager                                           5.550055         5.804298
## Manufacturing Director                            5.549370         5.804041
## Research Director                                 5.601843         5.878475
## Research Scientist                                5.579726         5.866031
## Sales Executive                                   5.596493         5.833327
## Sales Representative                              5.622554         5.858178
##                        `BusinessTravel.Non-Travel`
## Human Resources                          1.5146794
## Laboratory Technician                    1.2897727
## Manager                                  1.3324990
## Manufacturing Director                   0.1926102
## Research Director                        1.2984347
## Research Scientist                       1.2923830
## Sales Executive                          1.3614794
## Sales Representative                     1.3787395
##                        BusinessTravel.Travel_Frequently
## Human Resources                               1.8406023
## Laboratory Technician                         1.7158881
## Manager                                       1.7572128
## Manufacturing Director                        0.2349405
## Research Director                             1.7137198
## Research Scientist                            1.7191772
## Sales Executive                               1.7628996
## Sales Representative                          1.7772248
##                        BusinessTravel.Travel_Rarely BusinessTravel.TravelRarely
## Human Resources                           2.0719116                   24.931826
## Laboratory Technician                     2.0035403                   25.001326
## Manager                                   2.0484754                   25.503782
## Manufacturing Director                    0.2687395                    3.286893
## Research Director                         1.9982559                   24.883985
## Research Scientist                        2.0073552                   25.048710
## Sales Executive                           2.0359968                   25.016376
## Sales Representative                      2.0454745                   25.079538
##                        `EducationField.Human Resources`
## Human Resources                                2.749063
## Laboratory Technician                          3.373668
## Manager                                        3.301233
## Manufacturing Director                         2.913538
## Research Director                              3.570334
## Research Scientist                             2.986541
## Sales Executive                                3.150120
## Sales Representative                           3.455380
##                        `EducationField.Life Sciences` EducationField.Marketing
## Human Resources                              13.96527                 63.08817
## Laboratory Technician                        13.95519                 63.09054
## Manager                                      13.95334                 63.07935
## Manufacturing Director                       13.95213                 63.08965
## Research Director                            13.95673                 63.08799
## Research Scientist                           13.95299                 63.09115
## Sales Executive                              13.95720                 63.07892
## Sales Representative                         13.96181                 63.07955
##                        EducationField.Medical EducationField.Other
## Human Resources                      12.92765             6.781406
## Laboratory Technician                12.90517             6.725455
## Manager                              12.90379             6.730952
## Manufacturing Director               12.90229             6.723685
## Research Director                    12.90688             6.732548
## Research Scientist                   12.90314             6.724446
## Sales Executive                      12.90961             6.768577
## Sales Representative                 12.91536             6.858914
##                        `EducationField.Technical Degree` OverTime.No
## Human Resources                                 7.965800  0.51246828
## Laboratory Technician                           7.903698  0.07095825
## Manager                                         7.907328  0.14187310
## Manufacturing Director                          7.901342  0.05099014
## Research Director                               7.908392  0.13183064
## Research Scientist                              7.902306  0.06805939
## Sales Executive                                 7.912314  0.28381040
## Sales Representative                            7.919252  0.34028587
##                        OverTime.Yes       Age MonthlyIncome TotalWorkingYears
## Human Resources          0.51246828 1.3303039     3.5255546         2.0498909
## Laboratory Technician    0.07095825 0.1850113     0.7574677         0.3798090
## Manager                  0.14187310 0.5458637     2.2462769         0.8547606
## Manufacturing Director   0.05099014 0.1464605     0.2777899         0.2345761
## Research Director        0.13183064 0.4963485     2.2343777         0.8090067
## Research Scientist       0.06805939 0.1850914     0.7488797         0.3906337
## Sales Executive          0.28381040 0.7781735     2.8575307         1.3814736
## Sales Representative     0.34028587 0.8798438     4.6458426         1.8058351
##                        YearsAtCompany DistanceFromHome TrainingTimesLastYear
## Human Resources             1.6630746       1.17540095            1.03689344
## Laboratory Technician       0.3153891       0.13140265            0.13566470
## Manager                     0.3411488       0.32419139            0.29084910
## Manufacturing Director      0.1614502       0.09505565            0.09469771
## Research Director           0.3295810       0.30650082            0.27306036
## Research Scientist          0.3161834       0.13094267            0.13859892
## Sales Executive             0.7499578       0.57736950            0.62692451
## Sales Representative        1.3626403       0.67082037            0.70038892
##                        NumCompaniesWorked
## Human Resources                 1.2029859
## Laboratory Technician           0.1553793
## Manager                         0.3427836
## Manufacturing Director          0.1165053
## Research Director               0.3207304
## Research Scientist              0.1554070
## Sales Executive                 0.5931364
## Sales Representative            0.8708416
## 
## Residual Deviance: 2220.975 
## AIC: 2572.975
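`summary()` for a `multinom` fit reports coefficients and standard errors but no test statistics. A common follow-up is the Wald z-statistic (coefficient divided by its standard error) with a two-sided normal p-value. The sketch below does this on hypothetical toy data and is not part of the original analysis. The last lines apply the same ratio to two coefficient / standard-error pairs copied from the output above.

```r
library(nnet)

# Hypothetical toy data with three overlapping classes.
set.seed(7)
n <- 300
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- cut(toy$x1 + rnorm(n), breaks = c(-Inf, -0.5, 0.5, Inf),
             labels = c("low", "mid", "high"))

fit <- multinom(y ~ x1 + x2, data = toy, trace = FALSE)
s <- summary(fit)

# Wald z-statistics and two-sided p-values, one per class and term.
z <- s$coefficients / s$standard.errors
p <- 2 * pnorm(abs(z), lower.tail = FALSE)
round(p, 4)

# The same ratio for two entries copied from the summary above.
z_manager_income <- 17.8125549 / 2.2462769   # Manager, MonthlyIncome
z_hr_department  <- 4.2830480 / 13.81664     # Human Resources, `Department.Human Resources`
c(z_manager_income, z_hr_department)
```

The large standard errors on the dummy columns (for example about 13.8 for `Department.Human Resources`) are consistent with the redundancy of keeping every factor level as its own column, so individual z-values for those terms should be read with caution.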
# compute the accuracy of the multinomial model
# predict JobRole with the model (on the same data the model was fitted on)
predicted_class <- predict(multinom_model, newdata = rf_data)

# compare with the actual values
actual_class <- rf_data$JobRole

# compute the accuracy
accuracy <- mean(predicted_class == actual_class)

# print the accuracy
print(paste("Model accuracy:", round(accuracy * 100, 2), "%"))
## [1] "Model accuracy: 74.9 %"
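The accuracy above is measured on the same rows used to fit the model, so it is likely an optimistic estimate of performance on new data. A minimal sketch of a hold-out evaluation is shown below on hypothetical toy data. The 70/30 split ratio and all object names are assumptions for illustration, not the report's procedure.

```r
library(nnet)

# Hypothetical toy data with three overlapping classes.
set.seed(123)
n <- 300
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$y <- cut(toy$x1 + rnorm(n), breaks = c(-Inf, -0.5, 0.5, Inf),
             labels = c("low", "mid", "high"))

# Hold out 30% of the rows; the model never sees them during fitting.
test_idx <- sample(seq_len(n), size = round(0.3 * n))
train <- toy[-test_idx, ]
test  <- toy[test_idx, ]

fit <- multinom(y ~ x1 + x2, data = train, trace = FALSE)

# Accuracy on the fitting rows versus accuracy on the held-out rows.
train_acc <- mean(predict(fit, newdata = train) == train$y)
test_acc  <- mean(predict(fit, newdata = test) == test$y)
c(train = train_acc, test = test_acc)
```

In a real pipeline the scaling parameters (and any class balancing) would also be estimated on the training rows only, then applied to the held-out rows.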
# confusion matrix to inspect the distribution of predictions
table(Predicted = predicted_class, Actual = actual_class)
##                            Actual
## Predicted                   Healthcare Representative Human Resources
##   Healthcare Representative                       119               0
##   Human Resources                                   0             228
##   Laboratory Technician                            10               0
##   Manager                                           0               0
##   Manufacturing Director                           85               0
##   Research Director                                 2               0
##   Research Scientist                               12               0
##   Sales Executive                                   0               0
##   Sales Representative                              0               0
##                            Actual
## Predicted                   Laboratory Technician Manager
##   Healthcare Representative                    14       0
##   Human Resources                               0       0
##   Laboratory Technician                       135       0
##   Manager                                       0     175
##   Manufacturing Director                       10       0
##   Research Director                             0      53
##   Research Scientist                           69       0
##   Sales Executive                               0       0
##   Sales Representative                          0       0
##                            Actual
## Predicted                   Manufacturing Director Research Director
##   Healthcare Representative                     75                 5
##   Human Resources                                0                 0
##   Laboratory Technician                          7                 0
##   Manager                                        0                22
##   Manufacturing Director                       112                 2
##   Research Director                              9               199
##   Research Scientist                            25                 0
##   Sales Executive                                0                 0
##   Sales Representative                           0                 0
##                            Actual
## Predicted                   Research Scientist Sales Executive
##   Healthcare Representative                 10               0
##   Human Resources                            0               0
##   Laboratory Technician                     78               0
##   Manager                                    0               0
##   Manufacturing Director                    21               0
##   Research Director                          0               0
##   Research Scientist                       119               0
##   Sales Executive                            0             224
##   Sales Representative                       0               4
##                            Actual
## Predicted                   Sales Representative
##   Healthcare Representative                    0
##   Human Resources                              0
##   Laboratory Technician                        0
##   Manager                                      0
##   Manufacturing Director                       0
##   Research Director                            0
##   Research Scientist                           0
##   Sales Executive                              2
##   Sales Representative                       226
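Per-class precision, recall, and F1 can be read directly off this table: precision divides each diagonal count by its row (Predicted) total, recall by its column (Actual) total. The sketch below re-enters the counts from the table above by hand and computes the three metrics in base R. The object names are illustrative.

```r
classes <- c("Healthcare Representative", "Human Resources", "Laboratory Technician",
             "Manager", "Manufacturing Director", "Research Director",
             "Research Scientist", "Sales Executive", "Sales Representative")

# Counts copied from the table above; each line of nine values is one Actual column.
cm <- matrix(c(
  119,   0,  10,   0,  85,   2,  12,   0,   0,
    0, 228,   0,   0,   0,   0,   0,   0,   0,
   14,   0, 135,   0,  10,   0,  69,   0,   0,
    0,   0,   0, 175,   0,  53,   0,   0,   0,
   75,   0,   7,   0, 112,   9,  25,   0,   0,
    5,   0,   0,  22,   2, 199,   0,   0,   0,
   10,   0,  78,   0,  21,   0, 119,   0,   0,
    0,   0,   0,   0,   0,   0,   0, 224,   4,
    0,   0,   0,   0,   0,   0,   0,   2, 226),
  nrow = 9, dimnames = list(Predicted = classes, Actual = classes))

precision <- diag(cm) / rowSums(cm)   # correct predictions / all predictions of that class
recall    <- diag(cm) / colSums(cm)   # correct predictions / all actual members of that class
f1        <- 2 * precision * recall / (precision + recall)

round(data.frame(precision, recall, f1), 3)

# Overall accuracy from the same table.
sum(diag(cm)) / sum(cm)
```

Most of the errors sit in two groups of roles that the model confuses with each other: Healthcare Representative with Manufacturing Director, and Laboratory Technician with Research Scientist.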
# compute precision, recall, and F1-score per class
confusion <- confusionMatrix(factor(predicted_class), factor(actual_class))
print(confusion)
## Confusion Matrix and Statistics
## 
##                            Reference
## Prediction                  Healthcare Representative Human Resources
##   Healthcare Representative                       119               0
##   Human Resources                                   0             228
##   Laboratory Technician                            10               0
##   Manager                                           0               0
##   Manufacturing Director                           85               0
##   Research Director                                 2               0
##   Research Scientist                               12               0
##   Sales Executive                                   0               0
##   Sales Representative                              0               0
##                            Reference
## Prediction                  Laboratory Technician Manager
##   Healthcare Representative                    14       0
##   Human Resources                               0       0
##   Laboratory Technician                       135       0
##   Manager                                       0     175
##   Manufacturing Director                       10       0
##   Research Director                             0      53
##   Research Scientist                           69       0
##   Sales Executive                               0       0
##   Sales Representative                          0       0
##                            Reference
## Prediction                  Manufacturing Director Research Director
##   Healthcare Representative                     75                 5
##   Human Resources                                0                 0
##   Laboratory Technician                          7                 0
##   Manager                                        0                22
##   Manufacturing Director                       112                 2
##   Research Director                              9               199
##   Research Scientist                            25                 0
##   Sales Executive                                0                 0
##   Sales Representative                           0                 0
##                            Reference
## Prediction                  Research Scientist Sales Executive
##   Healthcare Representative                 10               0
##   Human Resources                            0               0
##   Laboratory Technician                     78               0
##   Manager                                    0               0
##   Manufacturing Director                    21               0
##   Research Director                          0               0
##   Research Scientist                       119               0
##   Sales Executive                            0             224
##   Sales Representative                       0               4
##                            Reference
## Prediction                  Sales Representative
##   Healthcare Representative                    0
##   Human Resources                              0
##   Laboratory Technician                        0
##   Manager                                      0
##   Manufacturing Director                       0
##   Research Director                            0
##   Research Scientist                           0
##   Sales Executive                              2
##   Sales Representative                       226
## 
## Overall Statistics
##                                           
##                Accuracy : 0.749           
##                  95% CI : (0.7297, 0.7677)
##     No Information Rate : 0.1111          
##     P-Value [Acc > NIR] : < 2.2e-16       
##                                           
##                   Kappa : 0.7177          
##                                           
##  Mcnemar's Test P-Value : NA              
## 
## Statistics by Class:
## 
##                      Class: Healthcare Representative Class: Human Resources
## Sensitivity                                   0.52193                 1.0000
## Specificity                                   0.94298                 1.0000
## Pos Pred Value                                0.53363                 1.0000
## Neg Pred Value                                0.94040                 1.0000
## Prevalence                                    0.11111                 0.1111
## Detection Rate                                0.05799                 0.1111
## Detection Prevalence                          0.10867                 0.1111
## Balanced Accuracy                             0.73246                 1.0000
##                      Class: Laboratory Technician Class: Manager
## Sensitivity                               0.59211        0.76754
## Specificity                               0.94792        0.98794
## Pos Pred Value                            0.58696        0.88832
## Neg Pred Value                            0.94896        0.97143
## Prevalence                                0.11111        0.11111
## Detection Rate                            0.06579        0.08528
## Detection Prevalence                      0.11209        0.09600
## Balanced Accuracy                         0.77001        0.87774
##                      Class: Manufacturing Director Class: Research Director
## Sensitivity                                0.49123                  0.87281
## Specificity                                0.93531                  0.96491
## Pos Pred Value                             0.48696                  0.75665
## Neg Pred Value                             0.93633                  0.98379
## Prevalence                                 0.11111                  0.11111
## Detection Rate                             0.05458                  0.09698
## Detection Prevalence                       0.11209                  0.12817
## Balanced Accuracy                          0.71327                  0.91886
##                      Class: Research Scientist Class: Sales Executive
## Sensitivity                            0.52193                 0.9825
## Specificity                            0.94189                 0.9989
## Pos Pred Value                         0.52889                 0.9912
## Neg Pred Value                         0.94034                 0.9978
## Prevalence                             0.11111                 0.1111
## Detection Rate                         0.05799                 0.1092
## Detection Prevalence                   0.10965                 0.1101
## Balanced Accuracy                      0.73191                 0.9907
##                      Class: Sales Representative
## Sensitivity                               0.9912
## Specificity                               0.9978
## Pos Pred Value                            0.9826
## Neg Pred Value                            0.9989
## Prevalence                                0.1111
## Detection Rate                            0.1101
## Detection Prevalence                      0.1121
## Balanced Accuracy                         0.9945
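
The printed statistics report precision under the name Pos Pred Value and recall under the name Sensitivity, but no F1-score appears in this default output. The following is a minimal sketch of how the three metrics could be derived by hand from the confusion counts shown above. The counts are copied from the printed table; the object names (`cm`, `per_class`, `cohen_kappa`, and so on) are illustrative and do not come from the original analysis.

```r
# Confusion counts copied from the printed table
# (rows = predicted class, columns = reference class).
classes <- c("Healthcare Representative", "Human Resources",
             "Laboratory Technician", "Manager", "Manufacturing Director",
             "Research Director", "Research Scientist",
             "Sales Executive", "Sales Representative")

# matrix() fills column by column, so each line below is one reference class.
cm <- matrix(
  c(119,   0,  10,   0,  85,   2,  12,   0,   0,   # Healthcare Representative
      0, 228,   0,   0,   0,   0,   0,   0,   0,   # Human Resources
     14,   0, 135,   0,  10,   0,  69,   0,   0,   # Laboratory Technician
      0,   0,   0, 175,   0,  53,   0,   0,   0,   # Manager
     75,   0,   7,   0, 112,   9,  25,   0,   0,   # Manufacturing Director
      5,   0,   0,  22,   2, 199,   0,   0,   0,   # Research Director
     10,   0,  78,   0,  21,   0, 119,   0,   0,   # Research Scientist
      0,   0,   0,   0,   0,   0,   0, 224,   4,   # Sales Executive
      0,   0,   0,   0,   0,   0,   0,   2, 226),  # Sales Representative
  nrow = 9,
  dimnames = list(Prediction = classes, Reference = classes)
)

tp        <- diag(cm)             # correctly classified cases per class
precision <- tp / rowSums(cm)     # TP / all cases predicted as the class
recall    <- tp / colSums(cm)     # TP / all cases truly in the class
f1        <- 2 * precision * recall / (precision + recall)

per_class <- round(data.frame(precision, recall, f1), 4)
print(per_class)

# Sanity checks against the "Overall Statistics" block.
n           <- sum(cm)
accuracy    <- sum(tp) / n
expected    <- sum(rowSums(cm) * colSums(cm)) / n^2   # chance agreement
cohen_kappa <- (accuracy - expected) / (1 - expected)
macro_f1    <- mean(f1)           # unweighted average over the nine classes
```

The `precision` and `recall` columns should reproduce the Pos Pred Value and Sensitivity rows printed above. For Healthcare Representative, for example, F1 works out to 2 * 119 / (223 + 228) = 238/451, about 0.528. With caret loaded, the same per-class values should also be available from `confusion$byClass[, c("Precision", "Recall", "F1")]`, and `confusionMatrix(..., mode = "prec_recall")` should print them directly.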
summary(cars)
##      speed           dist       
##  Min.   : 4.0   Min.   :  2.00  
##  1st Qu.:12.0   1st Qu.: 26.00  
##  Median :15.0   Median : 36.00  
##  Mean   :15.4   Mean   : 42.98  
##  3rd Qu.:19.0   3rd Qu.: 56.00  
##  Max.   :25.0   Max.   :120.00


Including Plots

You can also embed plots, for example:

Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.