# Load the required libraries
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.4.3
## Warning: package 'ggplot2' was built under R version 4.4.3
## Warning: package 'tibble' was built under R version 4.4.3
## Warning: package 'tidyr' was built under R version 4.4.3
## Warning: package 'readr' was built under R version 4.4.3
## Warning: package 'purrr' was built under R version 4.4.3
## Warning: package 'dplyr' was built under R version 4.4.3
## Warning: package 'forcats' was built under R version 4.4.3
## Warning: package 'lubridate' was built under R version 4.4.3
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.0.4
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(patchwork)
## Warning: package 'patchwork' was built under R version 4.4.3
library(caTools)
## Warning: package 'caTools' was built under R version 4.4.3
library(caret)
## Warning: package 'caret' was built under R version 4.4.3
## Loading required package: lattice
##
## Attaching package: 'caret'
##
## The following object is masked from 'package:purrr':
##
## lift
library(dplyr)
library(ggplot2)
library(patchwork)
library(nnet)
## Warning: package 'nnet' was built under R version 4.4.3
# Read the dataset
data <- read.csv("HR_Analytics.csv")
Inspecting the data structure (str(data))
Shows the data type of each column, the number of observations
(1480 rows), and the number of variables (38 columns), which helps identify
which columns are numeric (int), character (chr), or factor.
Viewing a data summary (summary(data))
Gives descriptive summary statistics for each column: minimum, maximum, median,
mean, quartiles, and the number of values. It is useful
for spotting outliers, assessing distributions, and getting a sense of typical values in the
dataset.
Viewing the first few rows
(head(data))
Shows the first 6 rows as a sample of the dataset's contents,
to become more familiar with the data.
Checking for missing values
(colSums(is.na(data)))
Counts the missing values (NA) in each column.
The result shows that only the YearsWithCurrManager column has
missing values, with 57 NAs.
Filling in missing values
Fills the NAs in YearsWithCurrManager with the median. The median is chosen
because it is more robust to outliers than the mean.
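To illustrate why the median is the more robust choice, here is a minimal sketch on a made-up vector (the name years and its values are hypothetical, not taken from HR_Analytics.csv). A single extreme value pulls the mean upward but leaves the median where it was.

```r
# Hypothetical tenure values with missing entries and one extreme value
years <- c(1, 2, 2, 3, 3, 4, NA, 40, NA)

mean_val   <- mean(years, na.rm = TRUE)   # pulled upward by the value 40
median_val <- median(years, na.rm = TRUE) # not affected by how large the extreme value is

# Fill the NAs with the median, mirroring the approach described above
years_filled <- years
years_filled[is.na(years_filled)] <- median_val

print(c(mean = mean_val, median = median_val))
print(years_filled)
```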
Re-checking for missing values
(colSums(is.na(data)))
Confirms that all missing values have been filled (every count is
zero).
Checking for duplicate data
(sum(duplicated(data$EmpID)))
Checks whether any employee ID (EmpID) appears more than
once.
Result: 10 duplicates were found.
Removing duplicate rows
Removes rows with a repeated EmpID so that they do not affect
the analysis.
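As a small illustration of how this check and removal behave, the sketch below uses a made-up data frame (the IDs and ages are hypothetical). duplicated() flags only the second and later occurrences of a value, so filtering with !duplicated() keeps the first row for each EmpID.

```r
# Hypothetical employee records; "RM002" appears twice
emp <- data.frame(
  EmpID = c("RM001", "RM002", "RM002", "RM003"),
  Age   = c(25, 31, 31, 40)
)

n_dup <- sum(duplicated(emp$EmpID))         # number of repeated IDs
emp_unique <- emp[!duplicated(emp$EmpID), ] # keeps the first occurrence of each ID

print(n_dup)
print(emp_unique)
```

Because the comparison uses only EmpID, two rows that share an ID but differ in other columns are also treated as duplicates, and only the first is kept.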
# Inspect the data structure
str(data)
## 'data.frame': 1480 obs. of 38 variables:
## $ EmpID : chr "RM297" "RM302" "RM458" "RM728" ...
## $ Age : int 18 18 18 18 18 18 18 18 19 19 ...
## $ AgeGroup : chr "18-25" "18-25" "18-25" "18-25" ...
## $ Attrition : chr "Yes" "No" "Yes" "No" ...
## $ BusinessTravel : chr "Travel_Rarely" "Travel_Rarely" "Travel_Frequently" "Non-Travel" ...
## $ DailyRate : int 230 812 1306 287 247 1124 544 1431 528 1181 ...
## $ Department : chr "Research & Development" "Sales" "Sales" "Research & Development" ...
## $ DistanceFromHome : int 3 10 5 5 8 1 3 14 22 3 ...
## $ Education : int 3 3 3 2 1 3 2 3 1 1 ...
## $ EducationField : chr "Life Sciences" "Medical" "Marketing" "Life Sciences" ...
## $ EmployeeCount : int 1 1 1 1 1 1 1 1 1 1 ...
## $ EmployeeNumber : int 405 411 614 1012 1156 1368 1624 1839 167 201 ...
## $ EnvironmentSatisfaction : int 3 4 2 2 3 4 2 2 4 2 ...
## $ Gender : chr "Male" "Female" "Male" "Male" ...
## $ HourlyRate : int 54 69 69 73 80 97 70 33 50 79 ...
## $ JobInvolvement : int 3 2 3 3 3 3 3 3 3 3 ...
## $ JobLevel : int 1 1 1 1 1 1 1 1 1 1 ...
## $ JobRole : chr "Laboratory Technician" "Sales Representative" "Sales Representative" "Research Scientist" ...
## $ JobSatisfaction : int 3 3 2 4 3 4 4 3 3 2 ...
## $ MaritalStatus : chr "Single" "Single" "Single" "Single" ...
## $ MonthlyIncome : int 1420 1200 1878 1051 1904 1611 1569 1514 1675 1483 ...
## $ SalarySlab : chr "Upto 5k" "Upto 5k" "Upto 5k" "Upto 5k" ...
## $ MonthlyRate : int 25233 9724 8059 13493 13556 19305 18420 8018 26820 16102 ...
## $ NumCompaniesWorked : int 1 1 1 1 1 1 1 1 1 1 ...
## $ Over18 : chr "Y" "Y" "Y" "Y" ...
## $ OverTime : chr "No" "No" "Yes" "No" ...
## $ PercentSalaryHike : int 13 12 14 15 12 15 12 16 19 14 ...
## $ PerformanceRating : int 3 3 3 3 3 3 3 3 3 3 ...
## $ RelationshipSatisfaction: int 3 1 4 4 4 3 3 3 4 4 ...
## $ StandardHours : int 80 80 80 80 80 80 80 80 80 80 ...
## $ StockOptionLevel : int 0 0 0 0 0 0 0 0 0 0 ...
## $ TotalWorkingYears : int 0 0 0 0 0 0 0 0 0 1 ...
## $ TrainingTimesLastYear : int 2 2 3 2 0 5 2 4 2 3 ...
## $ WorkLifeBalance : int 3 3 3 3 3 4 4 1 2 3 ...
## $ YearsAtCompany : int 0 0 0 0 0 0 0 0 0 1 ...
## $ YearsInCurrentRole : int 0 0 0 0 0 0 0 0 0 0 ...
## $ YearsSinceLastPromotion : int 0 0 0 0 0 0 0 0 0 0 ...
## $ YearsWithCurrManager : int 0 0 0 0 0 0 0 0 0 0 ...
# View a summary of the data
summary(data)
## EmpID Age AgeGroup Attrition
## Length:1480 Min. :18.00 Length:1480 Length:1480
## Class :character 1st Qu.:30.00 Class :character Class :character
## Mode :character Median :36.00 Mode :character Mode :character
## Mean :36.92
## 3rd Qu.:43.00
## Max. :60.00
##
## BusinessTravel DailyRate Department DistanceFromHome
## Length:1480 Min. : 102.0 Length:1480 Min. : 1.00
## Class :character 1st Qu.: 465.0 Class :character 1st Qu.: 2.00
## Mode :character Median : 800.0 Mode :character Median : 7.00
## Mean : 801.4 Mean : 9.22
## 3rd Qu.:1157.0 3rd Qu.:14.00
## Max. :1499.0 Max. :29.00
##
## Education EducationField EmployeeCount EmployeeNumber
## Min. :1.000 Length:1480 Min. :1 Min. : 1.0
## 1st Qu.:2.000 Class :character 1st Qu.:1 1st Qu.: 493.8
## Median :3.000 Mode :character Median :1 Median :1027.5
## Mean :2.911 Mean :1 Mean :1031.9
## 3rd Qu.:4.000 3rd Qu.:1 3rd Qu.:1568.2
## Max. :5.000 Max. :1 Max. :2068.0
##
## EnvironmentSatisfaction Gender HourlyRate JobInvolvement
## Min. :1.000 Length:1480 Min. : 30.00 Min. :1.00
## 1st Qu.:2.000 Class :character 1st Qu.: 48.00 1st Qu.:2.00
## Median :3.000 Mode :character Median : 66.00 Median :3.00
## Mean :2.724 Mean : 65.85 Mean :2.73
## 3rd Qu.:4.000 3rd Qu.: 83.00 3rd Qu.:3.00
## Max. :4.000 Max. :100.00 Max. :4.00
##
## JobLevel JobRole JobSatisfaction MaritalStatus
## Min. :1.000 Length:1480 Min. :1.000 Length:1480
## 1st Qu.:1.000 Class :character 1st Qu.:2.000 Class :character
## Median :2.000 Mode :character Median :3.000 Mode :character
## Mean :2.065 Mean :2.725
## 3rd Qu.:3.000 3rd Qu.:4.000
## Max. :5.000 Max. :4.000
##
## MonthlyIncome SalarySlab MonthlyRate NumCompaniesWorked
## Min. : 1009 Length:1480 Min. : 2094 Min. :0.000
## 1st Qu.: 2922 Class :character 1st Qu.: 8051 1st Qu.:1.000
## Median : 4933 Mode :character Median :14220 Median :2.000
## Mean : 6505 Mean :14298 Mean :2.687
## 3rd Qu.: 8384 3rd Qu.:20461 3rd Qu.:4.000
## Max. :19999 Max. :26999 Max. :9.000
##
## Over18 OverTime PercentSalaryHike PerformanceRating
## Length:1480 Length:1480 Min. :11.00 Min. :3.000
## Class :character Class :character 1st Qu.:12.00 1st Qu.:3.000
## Mode :character Mode :character Median :14.00 Median :3.000
## Mean :15.21 Mean :3.153
## 3rd Qu.:18.00 3rd Qu.:3.000
## Max. :25.00 Max. :4.000
##
## RelationshipSatisfaction StandardHours StockOptionLevel TotalWorkingYears
## Min. :1.000 Min. :80 Min. :0.0000 Min. : 0.00
## 1st Qu.:2.000 1st Qu.:80 1st Qu.:0.0000 1st Qu.: 6.00
## Median :3.000 Median :80 Median :1.0000 Median :10.00
## Mean :2.709 Mean :80 Mean :0.7919 Mean :11.28
## 3rd Qu.:4.000 3rd Qu.:80 3rd Qu.:1.0000 3rd Qu.:15.00
## Max. :4.000 Max. :80 Max. :3.0000 Max. :40.00
##
## TrainingTimesLastYear WorkLifeBalance YearsAtCompany YearsInCurrentRole
## Min. :0.000 Min. :1.000 Min. : 0.000 Min. : 0.000
## 1st Qu.:2.000 1st Qu.:2.000 1st Qu.: 3.000 1st Qu.: 2.000
## Median :3.000 Median :3.000 Median : 5.000 Median : 3.000
## Mean :2.798 Mean :2.761 Mean : 7.009 Mean : 4.228
## 3rd Qu.:3.000 3rd Qu.:3.000 3rd Qu.: 9.000 3rd Qu.: 7.000
## Max. :6.000 Max. :4.000 Max. :40.000 Max. :18.000
##
## YearsSinceLastPromotion YearsWithCurrManager
## Min. : 0.000 Min. : 0.000
## 1st Qu.: 0.000 1st Qu.: 2.000
## Median : 1.000 Median : 3.000
## Mean : 2.182 Mean : 4.118
## 3rd Qu.: 3.000 3rd Qu.: 7.000
## Max. :15.000 Max. :17.000
## NA's :57
# View the first few rows
head(data)
## EmpID Age AgeGroup Attrition BusinessTravel DailyRate
## 1 RM297 18 18-25 Yes Travel_Rarely 230
## 2 RM302 18 18-25 No Travel_Rarely 812
## 3 RM458 18 18-25 Yes Travel_Frequently 1306
## 4 RM728 18 18-25 No Non-Travel 287
## 5 RM829 18 18-25 Yes Non-Travel 247
## 6 RM973 18 18-25 No Non-Travel 1124
## Department DistanceFromHome Education EducationField
## 1 Research & Development 3 3 Life Sciences
## 2 Sales 10 3 Medical
## 3 Sales 5 3 Marketing
## 4 Research & Development 5 2 Life Sciences
## 5 Research & Development 8 1 Medical
## 6 Research & Development 1 3 Life Sciences
## EmployeeCount EmployeeNumber EnvironmentSatisfaction Gender HourlyRate
## 1 1 405 3 Male 54
## 2 1 411 4 Female 69
## 3 1 614 2 Male 69
## 4 1 1012 2 Male 73
## 5 1 1156 3 Male 80
## 6 1 1368 4 Female 97
## JobInvolvement JobLevel JobRole JobSatisfaction MaritalStatus
## 1 3 1 Laboratory Technician 3 Single
## 2 2 1 Sales Representative 3 Single
## 3 3 1 Sales Representative 2 Single
## 4 3 1 Research Scientist 4 Single
## 5 3 1 Laboratory Technician 3 Single
## 6 3 1 Laboratory Technician 4 Single
## MonthlyIncome SalarySlab MonthlyRate NumCompaniesWorked Over18 OverTime
## 1 1420 Upto 5k 25233 1 Y No
## 2 1200 Upto 5k 9724 1 Y No
## 3 1878 Upto 5k 8059 1 Y Yes
## 4 1051 Upto 5k 13493 1 Y No
## 5 1904 Upto 5k 13556 1 Y No
## 6 1611 Upto 5k 19305 1 Y No
## PercentSalaryHike PerformanceRating RelationshipSatisfaction StandardHours
## 1 13 3 3 80
## 2 12 3 1 80
## 3 14 3 4 80
## 4 15 3 4 80
## 5 12 3 4 80
## 6 15 3 3 80
## StockOptionLevel TotalWorkingYears TrainingTimesLastYear WorkLifeBalance
## 1 0 0 2 3
## 2 0 0 2 3
## 3 0 0 3 3
## 4 0 0 2 3
## 5 0 0 0 3
## 6 0 0 5 4
## YearsAtCompany YearsInCurrentRole YearsSinceLastPromotion
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## YearsWithCurrManager
## 1 0
## 2 0
## 3 0
## 4 0
## 5 0
## 6 0
# Check for missing values
colSums(is.na(data))
## EmpID Age AgeGroup
## 0 0 0
## Attrition BusinessTravel DailyRate
## 0 0 0
## Department DistanceFromHome Education
## 0 0 0
## EducationField EmployeeCount EmployeeNumber
## 0 0 0
## EnvironmentSatisfaction Gender HourlyRate
## 0 0 0
## JobInvolvement JobLevel JobRole
## 0 0 0
## JobSatisfaction MaritalStatus MonthlyIncome
## 0 0 0
## SalarySlab MonthlyRate NumCompaniesWorked
## 0 0 0
## Over18 OverTime PercentSalaryHike
## 0 0 0
## PerformanceRating RelationshipSatisfaction StandardHours
## 0 0 0
## StockOptionLevel TotalWorkingYears TrainingTimesLastYear
## 0 0 0
## WorkLifeBalance YearsAtCompany YearsInCurrentRole
## 0 0 0
## YearsSinceLastPromotion YearsWithCurrManager
## 0 57
# Fill missing values with the median
median_val <- median(data$YearsWithCurrManager, na.rm = TRUE)
data$YearsWithCurrManager[is.na(data$YearsWithCurrManager)] <- median_val
# Re-check to confirm that no missing values remain
colSums(is.na(data))
## EmpID Age AgeGroup
## 0 0 0
## Attrition BusinessTravel DailyRate
## 0 0 0
## Department DistanceFromHome Education
## 0 0 0
## EducationField EmployeeCount EmployeeNumber
## 0 0 0
## EnvironmentSatisfaction Gender HourlyRate
## 0 0 0
## JobInvolvement JobLevel JobRole
## 0 0 0
## JobSatisfaction MaritalStatus MonthlyIncome
## 0 0 0
## SalarySlab MonthlyRate NumCompaniesWorked
## 0 0 0
## Over18 OverTime PercentSalaryHike
## 0 0 0
## PerformanceRating RelationshipSatisfaction StandardHours
## 0 0 0
## StockOptionLevel TotalWorkingYears TrainingTimesLastYear
## 0 0 0
## WorkLifeBalance YearsAtCompany YearsInCurrentRole
## 0 0 0
## YearsSinceLastPromotion YearsWithCurrManager
## 0 0
# Check for duplicate records
sum(duplicated(data$EmpID))
## [1] 10
# Remove rows with a duplicate EmpID
data <- data[!duplicated(data$EmpID), ]
# Re-check the number of duplicates
sum(duplicated(data$EmpID))
## [1] 0
# Count the unique values in each column
sapply(data, function(x) length(unique(x)))
## EmpID Age AgeGroup
## 1470 43 5
## Attrition BusinessTravel DailyRate
## 2 4 886
## Department DistanceFromHome Education
## 3 29 5
## EducationField EmployeeCount EmployeeNumber
## 6 1 1470
## EnvironmentSatisfaction Gender HourlyRate
## 4 2 71
## JobInvolvement JobLevel JobRole
## 4 5 9
## JobSatisfaction MaritalStatus MonthlyIncome
## 4 3 1349
## SalarySlab MonthlyRate NumCompaniesWorked
## 4 1427 10
## Over18 OverTime PercentSalaryHike
## 1 2 15
## PerformanceRating RelationshipSatisfaction StandardHours
## 2 4 1
## StockOptionLevel TotalWorkingYears TrainingTimesLastYear
## 4 40 7
## WorkLifeBalance YearsAtCompany YearsInCurrentRole
## 4 37 19
## YearsSinceLastPromotion YearsWithCurrManager
## 16 18
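The counts above show four distinct values for BusinessTravel, and the dummy-encoded columns later in this analysis include both Travel_Rarely and TravelRarely, which suggests that one category is spelled two ways. One possible cleanup, sketched here on a made-up vector (the name travel is hypothetical, and this step is not part of the original analysis), is to merge the variant spelling before converting the column to a factor.

```r
# Hypothetical values showing the same category spelled two ways
travel <- c("Travel_Rarely", "TravelRarely", "Non-Travel",
            "Travel_Frequently", "Travel_Rarely")

# Merge the variant spelling into the main label
travel[travel == "TravelRarely"] <- "Travel_Rarely"

travel <- as.factor(travel)
print(levels(travel))
print(table(travel))
```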
# Drop uninformative variables
data <- data %>% select(-c(EmpID, EmployeeCount, Over18, StandardHours, EmployeeNumber))
# Convert the categorical columns to factors
categorical_cols <- c("JobRole", "Gender", "MaritalStatus", "BusinessTravel",
                      "Department", "EducationField", "Attrition",
                      "AgeGroup", "SalarySlab", "OverTime")
data[categorical_cols] <- lapply(data[categorical_cols], as.factor)
From the distribution plot:
- The largest numbers of employees are in the Sales Executive, Research Scientist, and Laboratory Technician roles.
- The roles with the fewest employees are Healthcare Representative and Manufacturing Director.
This indicates an uneven distribution of roles, with the sales, research, and laboratory functions dominating the organizational structure.
Using the proportion plots, the insights per variable are as follows:
Based on the boxplots:
# View the distribution of the target, JobRole
data %>%
  count(JobRole) %>%
  ggplot(aes(x = reorder(JobRole, n), y = n, fill = JobRole)) +
  geom_bar(stat = "identity") +
  coord_flip() + # Flip the axes so the labels are easier to read
  theme_minimal() +
  labs(title = "Job Role Distribution", x = "Job Role", y = "Count") +
  theme(legend.position = "none")
# Relationship between the categorical variables and JobRole (one plot per variable)
categorical_predictors <- c("Gender", "MaritalStatus", "BusinessTravel",
                            "Department", "EducationField", "Attrition",
                            "AgeGroup", "SalarySlab", "OverTime")
# Set the size of each plot
options(repr.plot.width = 8, repr.plot.height = 5)
# Display the plots one at a time
for (var in categorical_predictors) {
  p <- ggplot(data, aes_string(x = "JobRole", fill = var)) +
    geom_bar(position = "fill") +
    coord_flip() +
    labs(title = paste("Proportion of", var, "by JobRole"), y = "Proportion", x = NULL) +
    theme_minimal(base_size = 10) +
    theme(legend.position = "bottom")
  print(p) # Display each plot separately
}
## Warning: `aes_string()` was deprecated in ggplot2 3.0.0.
## ℹ Please use tidy evaluation idioms with `aes()`.
## ℹ See also `vignette("ggplot2-in-packages")` for more information.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
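The deprecation warning above can be avoided by mapping columns through the .data pronoun inside aes(). The sketch below shows that form on a small made-up data frame (the data frame toy and its values are hypothetical). It is one possible replacement, not the code that produced the plots in this analysis.

```r
library(ggplot2)

# Hypothetical data standing in for the HR dataset
toy <- data.frame(
  JobRole  = c("Manager", "Manager", "Sales Executive", "Sales Executive"),
  Gender   = c("Female", "Male", "Male", "Female"),
  OverTime = c("No", "Yes", "No", "No")
)

plots <- list()
for (var in c("Gender", "OverTime")) {
  # .data[[var]] looks up the column whose name is stored in the string var
  plots[[var]] <- ggplot(toy, aes(x = JobRole, fill = .data[[var]])) +
    geom_bar(position = "fill") +
    coord_flip() +
    labs(title = paste("Proportion of", var, "by JobRole"),
         y = "Proportion", x = NULL, fill = var)
}

print(plots[["Gender"]])
```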
# Relationship between the numeric variables and JobRole (one plot per variable)
# Select the numeric columns automatically
numeric_vars <- data %>%
  select(where(is.numeric)) %>%
  colnames()
# Set the size of each plot
options(repr.plot.width = 8, repr.plot.height = 5)
# Display the plots one at a time
for (var in numeric_vars) {
  p <- ggplot(data, aes_string(x = "JobRole", y = var)) +
    geom_boxplot(fill = "lightblue") +
    coord_flip() +
    theme_minimal(base_size = 10) +
    labs(title = paste("Boxplot of", var, "by JobRole"),
         x = NULL, y = var)
  print(p) # Display each plot
}
A multinomial logistic regression model is built to predict the job
role (JobRole) from the available features.
The model is fitted with the multinom() function using
all predictor variables in the modeling data (~ .).
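For intuition, a multinomial logit model gives each class a linear score relative to a reference class and converts the scores to probabilities with the softmax function. The sketch below computes this by hand for a single observation. The coefficient values, class names, and feature names are made up for illustration and are not taken from the fitted model.

```r
# Hypothetical coefficients: one row per non-reference class
coefs <- rbind(
  Manager           = c(Intercept = -1.0, MonthlyIncome = 2.0),
  `Sales Executive` = c(Intercept =  0.5, MonthlyIncome = 0.3)
)

# One observation: the intercept term plus a standardized income of 1
x <- c(Intercept = 1, MonthlyIncome = 1)

# Linear scores; the reference class has a score of 0 by construction
scores <- c(Reference = 0, drop(coefs %*% x))

# Softmax turns the scores into probabilities that sum to 1
probs <- exp(scores) / sum(exp(scores))
print(round(probs, 3))
```

In the summary output of this analysis, Healthcare Representative does not appear among the coefficient rows, which is consistent with it serving as the reference class.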
Coefficient Analysis
Continuous Variables
Model Performance Evaluation
Confusion Matrix
Interpretation of Results
Most influential variables: department, monthly income, and total working experience are strong predictors, judging by the magnitude of their coefficients.
Prediction patterns
Model limitations
Model suitability
# Split the data into training and test sets
set.seed(123)
split <- sample.split(data$JobRole, SplitRatio = 0.7)
train_data <- subset(data, split == TRUE)
test_data <- subset(data, split == FALSE)
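sample.split() from caTools splits on the labels so that each class keeps roughly the same share in both subsets. The base R sketch below imitates that idea on made-up labels. The helper name stratified_flags and the label vector are hypothetical, and the function illustrates stratified sampling in general rather than reimplementing caTools.

```r
# Hypothetical helper: returns TRUE for rows assigned to the training set,
# sampling the same fraction from every class
stratified_flags <- function(y, ratio) {
  flags <- logical(length(y))
  for (cls in unique(as.character(y))) {
    idx <- which(y == cls)
    n_train <- round(ratio * length(idx))
    # Sample positions within idx, which is safe even when idx has one element
    chosen <- idx[sample.int(length(idx), n_train)]
    flags[chosen] <- TRUE
  }
  flags
}

set.seed(123)
y <- factor(rep(c("A", "B", "C"), times = c(100, 50, 10)))
in_train <- stratified_flags(y, ratio = 0.7)

print(table(y, in_train))
```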
# Separate the predictors (features) from the target (JobRole)
x_train <- train_data %>% select(-JobRole)
y_train <- train_data$JobRole
# Oversample to balance the classes
set.seed(123)
balanced_train <- upSample(
  x = x_train,      # Features
  y = y_train,      # Target
  yname = "JobRole" # Name of the target variable
)
# Check the class distribution after oversampling
table(balanced_train$JobRole)
##
## Healthcare Representative Human Resources Laboratory Technician
## 228 228 228
## Manager Manufacturing Director Research Director
## 228 228 228
## Research Scientist Sales Executive Sales Representative
## 228 228 228
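The equal counts of 228 per class come from resampling the smaller classes with replacement until each one matches the largest. A base R sketch of that idea on made-up data follows. The data frame toy is hypothetical, and the code illustrates the technique rather than caret's internal implementation.

```r
set.seed(123)

# Hypothetical imbalanced training data
toy <- data.frame(
  x     = seq_len(15),
  class = factor(rep(c("A", "B", "C"), times = c(8, 5, 2)))
)

target_n <- max(table(toy$class)) # size of the largest class

# For every class, keep all original rows and add resampled copies
# until the class reaches target_n rows
rows <- unlist(lapply(levels(toy$class), function(cls) {
  idx <- which(toy$class == cls)
  extra <- idx[sample.int(length(idx), target_n - length(idx), replace = TRUE)]
  c(idx, extra)
}))

balanced <- toy[rows, ]
print(table(balanced$class))
```

Resampling only the training subset, as done in this analysis, keeps the duplicated rows out of the test set.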
# Select the important features
selected_features <- c("Gender", "MaritalStatus", "Department", "BusinessTravel", "EducationField",
                       "OverTime", "Age", "MonthlyIncome", "TotalWorkingYears",
                       "YearsAtCompany", "DistanceFromHome", "TrainingTimesLastYear",
                       "NumCompaniesWorked")
# Take the final features and target for training
x_balanced <- balanced_train %>% select(all_of(selected_features))
y_balanced <- balanced_train$JobRole
# Convert character variables to factors
x_balanced[] <- lapply(x_balanced, function(x) if(is.character(x)) as.factor(x) else x)
# Encode the categorical variables as dummy variables
dummies_model <- dummyVars(~ ., data = x_balanced)
# Apply the transformation to x_balanced
x_balanced_encoded <- predict(dummies_model, newdata = x_balanced)
# Convert to a data frame
x_balanced_encoded <- as.data.frame(x_balanced_encoded)
# Inspect the result
head(x_balanced_encoded)
## Gender.Female Gender.Male MaritalStatus.Divorced MaritalStatus.Married
## 1 0 1 1 0
## 2 0 1 0 1
## 3 1 0 0 0
## 4 0 1 0 0
## 5 0 1 0 0
## 6 1 0 0 1
## MaritalStatus.Single Department.Human Resources
## 1 0 0
## 2 0 0
## 3 1 0
## 4 1 0
## 5 1 0
## 6 0 0
## Department.Research & Development Department.Sales BusinessTravel.Non-Travel
## 1 1 0 1
## 2 1 0 0
## 3 1 0 0
## 4 1 0 0
## 5 1 0 1
## 6 1 0 0
## BusinessTravel.Travel_Frequently BusinessTravel.Travel_Rarely
## 1 0 0
## 2 1 0
## 3 0 1
## 4 0 1
## 5 0 0
## 6 0 1
## BusinessTravel.TravelRarely EducationField.Human Resources
## 1 0 0
## 2 0 0
## 3 0 0
## 4 0 0
## 5 0 0
## 6 0 0
## EducationField.Life Sciences EducationField.Marketing EducationField.Medical
## 1 1 0 0
## 2 0 0 1
## 3 0 0 1
## 4 0 0 1
## 5 0 0 0
## 6 0 0 1
## EducationField.Other EducationField.Technical Degree OverTime.No OverTime.Yes
## 1 0 0 1 0
## 2 0 0 0 1
## 3 0 0 1 0
## 4 0 0 1 0
## 5 0 1 1 0
## 6 0 0 1 0
## Age MonthlyIncome TotalWorkingYears YearsAtCompany DistanceFromHome
## 1 25 4000 6 6 5
## 2 26 4741 5 5 11
## 3 27 6811 9 7 7
## 4 28 5661 9 8 16
## 5 28 8722 10 10 24
## 6 29 4335 11 8 27
## TrainingTimesLastYear NumCompaniesWorked
## 1 2 1
## 2 3 1
## 3 2 8
## 4 2 0
## 5 2 1
## 6 3 4
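The encoding above creates one column for every level, so Gender.Female and Gender.Male always sum to 1 and carry the same information. If a non-redundant encoding is preferred, dropping one level per factor is a common option, and caret's dummyVars() has a fullRank argument for this. The sketch below shows the same idea with base R's model.matrix() on made-up data (the data frame toy is hypothetical).

```r
# Hypothetical data with two factors and one numeric column
toy <- data.frame(
  Gender     = factor(c("Female", "Male", "Male", "Female")),
  Department = factor(c("HR", "Sales", "R&D", "Sales")),
  Age        = c(25, 31, 40, 28)
)

# model.matrix() uses treatment contrasts by default: the first level of each
# factor becomes the baseline and gets no column of its own
encoded <- model.matrix(~ ., data = toy)[, -1] # drop the intercept column

print(colnames(encoded))
print(encoded)
```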
# Standardize all encoded features (z-scores)
x_balanced_scaled <- as.data.frame(scale(x_balanced_encoded))
# Inspect the standardized result
summary(x_balanced_scaled)
## Gender.Female Gender.Male MaritalStatus.Divorced
## Min. :-0.8569 Min. :-1.1665 Min. :-0.5509
## 1st Qu.:-0.8569 1st Qu.:-1.1665 1st Qu.:-0.5509
## Median :-0.8569 Median : 0.8569 Median :-0.5509
## Mean : 0.0000 Mean : 0.0000 Mean : 0.0000
## 3rd Qu.: 1.1665 3rd Qu.: 0.8569 3rd Qu.:-0.5509
## Max. : 1.1665 Max. : 0.8569 Max. : 1.8142
## MaritalStatus.Married MaritalStatus.Single Department.Human Resources
## Min. :-0.9643 Min. :-0.6313 Min. :-0.3715
## 1st Qu.:-0.9643 1st Qu.:-0.6313 1st Qu.:-0.3715
## Median :-0.9643 Median :-0.6313 Median :-0.3715
## Mean : 0.0000 Mean : 0.0000 Mean : 0.0000
## 3rd Qu.: 1.0365 3rd Qu.: 1.5832 3rd Qu.:-0.3715
## Max. : 1.0365 Max. : 1.5832 Max. : 2.6902
## Department.Research & Development Department.Sales BusinessTravel.Non-Travel
## Min. :-1.2701 Min. :-0.5945 Min. :-0.3128
## 1st Qu.:-1.2701 1st Qu.:-0.5945 1st Qu.:-0.3128
## Median : 0.7869 Median :-0.5945 Median :-0.3128
## Mean : 0.0000 Mean : 0.0000 Mean : 0.0000
## 3rd Qu.: 0.7869 3rd Qu.: 1.6814 3rd Qu.:-0.3128
## Max. : 0.7869 Max. : 1.6814 Max. : 3.1950
## BusinessTravel.Travel_Frequently BusinessTravel.Travel_Rarely
## Min. :-0.4604 Min. :-1.644
## 1st Qu.:-0.4604 1st Qu.:-1.644
## Median :-0.4604 Median : 0.608
## Mean : 0.0000 Mean : 0.000
## 3rd Qu.:-0.4604 3rd Qu.: 0.608
## Max. : 2.1711 Max. : 0.608
## BusinessTravel.TravelRarely EducationField.Human Resources
## Min. :-0.07668 Min. :-0.2298
## 1st Qu.:-0.07668 1st Qu.:-0.2298
## Median :-0.07668 Median :-0.2298
## Mean : 0.00000 Mean : 0.0000
## 3rd Qu.:-0.07668 3rd Qu.:-0.2298
## Max. :13.03523 Max. : 4.3489
## EducationField.Life Sciences EducationField.Marketing EducationField.Medical
## Min. :-0.8611 Min. :-0.2947 Min. :-0.651
## 1st Qu.:-0.8611 1st Qu.:-0.2947 1st Qu.:-0.651
## Median :-0.8611 Median :-0.2947 Median :-0.651
## Mean : 0.0000 Mean : 0.0000 Mean : 0.000
## 3rd Qu.: 1.1607 3rd Qu.:-0.2947 3rd Qu.: 1.535
## Max. : 1.1607 Max. : 3.3921 Max. : 1.535
## EducationField.Other EducationField.Technical Degree OverTime.No
## Min. :-0.2535 Min. :-0.3062 Min. :-1.5553
## 1st Qu.:-0.2535 1st Qu.:-0.3062 1st Qu.:-1.5553
## Median :-0.2535 Median :-0.3062 Median : 0.6427
## Mean : 0.0000 Mean : 0.0000 Mean : 0.0000
## 3rd Qu.:-0.2535 3rd Qu.:-0.3062 3rd Qu.: 0.6427
## Max. : 3.9422 Max. : 3.2640 Max. : 0.6427
## OverTime.Yes Age MonthlyIncome TotalWorkingYears
## Min. :-0.6427 Min. :-2.11515 Min. :-1.1735 Min. :-1.4228
## 1st Qu.:-0.6427 1st Qu.:-0.73178 1st Qu.:-0.8307 1st Qu.:-0.7407
## Median :-0.6427 Median :-0.09329 Median :-0.3950 Median :-0.2859
## Mean : 0.0000 Mean : 0.00000 Mean : 0.0000 Mean : 0.0000
## 3rd Qu.: 1.5553 3rd Qu.: 0.65160 3rd Qu.: 0.5946 3rd Qu.: 0.7373
## Max. : 1.5553 Max. : 2.35422 Max. : 2.2142 Max. : 3.1247
## YearsAtCompany DistanceFromHome TrainingTimesLastYear NumCompaniesWorked
## Min. :-1.0899 Min. :-1.0382 Min. :-2.2068 Min. :-1.0771
## 1st Qu.:-0.6637 1st Qu.:-0.9150 1st Qu.:-0.6431 1st Qu.:-0.6890
## Median :-0.3796 Median :-0.2993 Median : 0.1387 Median :-0.3009
## Mean : 0.0000 Mean : 0.0000 Mean : 0.0000 Mean : 0.0000
## 3rd Qu.: 0.3307 3rd Qu.: 0.6859 3rd Qu.: 0.1387 3rd Qu.: 0.4753
## Max. : 4.5927 Max. : 2.4100 Max. : 2.4842 Max. : 2.4157
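When the held-out test set is scored later, it needs the same centering and scaling values that were learned from the training data, not values recomputed on the test set. A minimal base R sketch on made-up matrices follows (the names train_x and test_x and their values are hypothetical).

```r
# Hypothetical numeric features
train_x <- cbind(Age = c(20, 30, 40, 50), MonthlyIncome = c(2000, 4000, 6000, 8000))
test_x  <- cbind(Age = c(30, 60),         MonthlyIncome = c(4000, 10000))

train_scaled <- scale(train_x)

# scale() stores the values it used as attributes on its result
centers <- attr(train_scaled, "scaled:center")
scales  <- attr(train_scaled, "scaled:scale")

# Reuse the training statistics on the test data
test_scaled <- scale(test_x, center = centers, scale = scales)

print(round(test_scaled, 3))
```

In this analysis the scaled result is immediately converted with as.data.frame(), so the attributes are best read directly from the scale() result before that conversion.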
# Compute the correlation matrix
cor_matrix <- cor(x_balanced_scaled)
# Find feature pairs with high correlation
high_cor_pairs <- which(abs(cor_matrix) > 0.9 & abs(cor_matrix) < 1, arr.ind = TRUE)
# Drop mirrored pairs (the matrix is symmetric); drop = FALSE keeps a matrix
# even when a single pair remains, so nrow() below still works
high_cor_pairs <- high_cor_pairs[high_cor_pairs[,1] < high_cor_pairs[,2], , drop = FALSE]
# Show the feature pairs with high correlation
if (nrow(high_cor_pairs) > 0) {
  data.frame(
    Fitur_1 = colnames(x_balanced_scaled)[high_cor_pairs[,1]],
    Fitur_2 = colnames(x_balanced_scaled)[high_cor_pairs[,2]],
    Korelasi = cor_matrix[high_cor_pairs]
  )
} else {
  print("No feature pairs with correlation > 0.9")
}
## [1] "No feature pairs with correlation > 0.9"
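Note that the condition abs(cor_matrix) < 1 is meant to skip the diagonal, but it can also skip pairs that are perfectly correlated, such as the two dummy columns of a binary factor (for example Gender.Female and Gender.Male), whose correlation is -1. An alternative sketch that excludes the diagonal by position instead, using upper.tri() on made-up data (the data frame toy is hypothetical):

```r
# Hypothetical encoded data: Male is exactly 1 - Female
toy <- data.frame(
  Female = c(1, 0, 0, 1, 1, 0),
  Male   = c(0, 1, 1, 0, 0, 1),
  Age    = c(25, 31, 40, 28, 52, 36)
)

cor_matrix <- cor(toy)

# upper.tri() marks each pair once and leaves out the diagonal,
# so perfectly correlated pairs are kept
high <- which(abs(cor_matrix) > 0.9 & upper.tri(cor_matrix), arr.ind = TRUE)

pairs_found <- data.frame(
  Feature_1   = rownames(cor_matrix)[high[, "row"]],
  Feature_2   = colnames(cor_matrix)[high[, "col"]],
  Correlation = cor_matrix[high]
)
print(pairs_found)
```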
# Combine the features and target into a single data frame
rf_data <- x_balanced_scaled
rf_data$JobRole <- y_balanced
# Fit the multinomial model
multinom_model <- multinom(JobRole ~ ., data = rf_data)
## # weights: 261 (224 variable)
## initial value 4508.704833
## iter 10 value 2162.898516
## iter 20 value 1560.075623
## iter 30 value 1255.111597
## iter 40 value 1192.001751
## iter 50 value 1166.385568
## iter 60 value 1149.197785
## iter 70 value 1135.265369
## iter 80 value 1124.385279
## iter 90 value 1113.938609
## iter 100 value 1110.487528
## final value 1110.487528
## stopped after 100 iterations
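The trace ends with "stopped after 100 iterations", which means the optimizer reached its default iteration limit before meeting its convergence criterion, so the reported coefficients may not be final. multinom() passes extra arguments on to nnet(), including maxit. The sketch below raises the limit and inspects the convergence flag. It uses the built-in iris data purely as a stand-in, and the value 500 is an arbitrary choice, not a recommendation for this dataset.

```r
library(nnet)

# Built-in iris data as a stand-in for the HR dataset
fit <- multinom(Species ~ ., data = iris, maxit = 500, trace = FALSE)

# convergence is 0 when the optimizer finished on its own,
# and 1 when it stopped because maxit was reached
print(fit$convergence)
```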
# View the model summary
summary(multinom_model)
## Call:
## multinom(formula = JobRole ~ ., data = rf_data)
##
## Coefficients:
## (Intercept) Gender.Female Gender.Male
## Human Resources -1.061685 0.10136605 -0.10136605
## Laboratory Technician -2.831749 -0.13803225 0.13803225
## Manager -5.927352 0.27334716 -0.27334716
## Manufacturing Director 2.167369 0.01939807 -0.01939807
## Research Director -6.189242 0.12929801 -0.12929801
## Research Scientist -2.619415 -0.06873669 0.06873669
## Sales Executive -1.898217 0.31232469 -0.31232469
## Sales Representative -17.373032 0.21351011 -0.21351011
## MaritalStatus.Divorced MaritalStatus.Married
## Human Resources -0.24445127 0.00192417
## Laboratory Technician 0.03417554 -0.05065538
## Manager 0.05759020 -0.60750332
## Manufacturing Director 0.11046715 -0.05616205
## Research Director 0.25390993 -0.75734562
## Research Scientist 0.00744938 -0.12009636
## Sales Executive -0.20267892 -0.41371175
## Sales Representative -0.47980325 -0.15964199
## MaritalStatus.Single `Department.Human Resources`
## Human Resources 0.22675441 4.2830480
## Laboratory Technician 0.02406655 0.1288996
## Manager 0.61846603 2.2224973
## Manufacturing Director -0.04127184 1.0610813
## Research Director 0.60049481 1.1863110
## Research Scientist 0.12594847 0.7687039
## Sales Executive 0.64767084 1.0484758
## Sales Representative 0.62594119 0.5585218
## `Department.Research & Development` Department.Sales
## Human Resources -3.6079156 0.8080047
## Laboratory Technician -0.8289166 0.8212575
## Manager -4.1369793 2.9249447
## Manufacturing Director -1.2293710 0.5714059
## Research Director -1.9676342 1.2950973
## Research Scientist -1.0135140 0.5499182
## Sales Executive -5.4127090 5.2089967
## Sales Representative -5.5457630 5.7203845
## `BusinessTravel.Non-Travel`
## Human Resources 0.036332207
## Laboratory Technician -0.190755600
## Manager 0.375318091
## Manufacturing Director -0.197406714
## Research Director 0.288140954
## Research Scientist -0.172676849
## Sales Executive 0.467183241
## Sales Representative -0.006820797
## BusinessTravel.Travel_Frequently
## Human Resources -0.06064750
## Laboratory Technician 0.03531266
## Manager 0.33475455
## Manufacturing Director 0.05077896
## Research Director 0.07980227
## Research Scientist 0.01364786
## Sales Executive 0.70204478
## Sales Representative 0.42440102
## BusinessTravel.Travel_Rarely BusinessTravel.TravelRarely
## Human Resources -0.124094230 0.8889181
## Laboratory Technician -0.100849509 1.1242559
## Manager -0.490239555 -0.2165021
## Manufacturing Director 0.148279419 -0.3784864
## Research Director -0.339436496 0.5016792
## Research Scientist -0.002881688 0.5942185
## Sales Executive -0.996428162 0.5572517
## Sales Representative -0.529319146 0.9927361
## `EducationField.Human Resources`
## Human Resources 0.376945138
## Laboratory Technician 0.541305285
## Manager 1.803316219
## Manufacturing Director 0.646161525
## Research Director 2.156712060
## Research Scientist 0.003569817
## Sales Executive 0.897902232
## Sales Representative 0.476872446
## `EducationField.Life Sciences` EducationField.Marketing
## Human Resources -0.2500912 1.4164515
## Laboratory Technician -0.3169493 1.0932912
## Manager -0.5058736 1.0867148
## Manufacturing Director -0.2779794 0.9456363
## Research Director -0.3521574 1.2510551
## Research Scientist -0.2633623 1.1301857
## Sales Executive -0.5248653 1.2707762
## Sales Representative -0.5060992 0.6556319
## EducationField.Medical EducationField.Other
## Human Resources -0.33242769 -0.312937691
## Laboratory Technician -0.39229867 -0.001306596
## Manager 0.03175927 0.082924610
## Manufacturing Director -0.36678143 -0.117453920
## Research Director -0.23242821 -0.430308336
## Research Scientist -0.30909279 -0.073925784
## Sales Executive 0.03135411 0.168848582
## Sales Representative -0.41392748 0.136990381
## `EducationField.Technical Degree` OverTime.No
## Human Resources -0.41483906 0.18783906
## Laboratory Technician -0.27939787 0.09019187
## Manager -1.68761191 0.36545109
## Manufacturing Director -0.22982288 0.11717744
## Research Director -1.52562133 0.21882853
## Research Scientist -0.06453701 -0.06849352
## Sales Executive -1.19878100 0.07152328
## Sales Representative 0.44631871 0.33651536
## OverTime.Yes Age MonthlyIncome TotalWorkingYears
## Human Resources -0.18783906 -0.4076193 -0.8478695 0.7426855
## Laboratory Technician -0.09019187 -0.1400765 -9.5540467 0.6310785
## Manager -0.36545109 -0.2728248 17.8125549 -4.2987722
## Manufacturing Director -0.11717744 -0.1313403 -0.2994945 0.3059534
## Research Director -0.21882853 -0.1948751 16.5360052 -3.9711549
## Research Scientist 0.06849352 -0.2104517 -9.3061510 0.4066149
## Sales Executive -0.07152328 -0.8750792 3.3263893 -1.0046587
## Sales Representative -0.33651536 -1.0790986 -21.1890476 4.2944762
## YearsAtCompany DistanceFromHome TrainingTimesLastYear
## Human Resources -1.7352144 -0.27476949 0.13751423
## Laboratory Technician 0.1180779 -0.04031161 0.07704332
## Manager -0.4699736 -0.96191303 0.43099031
## Manufacturing Director -0.3649694 0.12924888 0.27550769
## Research Director -0.8010961 -0.68218089 0.08982287
## Research Scientist 0.4727734 -0.01352127 -0.16263993
## Sales Executive 0.8393971 -0.25272838 0.67532489
## Sales Representative -3.6331234 -0.92952806 0.81893442
## NumCompaniesWorked
## Human Resources -0.4040001
## Laboratory Technician 0.2499453
## Manager -0.8959202
## Manufacturing Director -0.1374286
## Research Director -0.4226508
## Research Scientist 0.1912795
## Sales Executive -0.1535980
## Sales Representative -1.7219733
##
## Std. Errors:
## (Intercept) Gender.Female Gender.Male
## Human Resources 22.72266 0.51830249 0.51830249
## Laboratory Technician 22.71487 0.06922736 0.06922736
## Manager 22.72324 0.16125453 0.16125453
## Manufacturing Director 22.38426 0.05003849 0.05003849
## Research Director 22.73131 0.15364205 0.15364205
## Research Scientist 22.70713 0.06857740 0.06857740
## Sales Executive 22.73091 0.27840855 0.27840855
## Sales Representative 23.03226 0.32069309 0.32069309
## MaritalStatus.Divorced MaritalStatus.Married
## Human Resources 0.79647880 0.61916303
## Laboratory Technician 0.09918692 0.08286235
## Manager 0.22680019 0.22466866
## Manufacturing Director 0.06899462 0.06023252
## Research Director 0.21105757 0.21345691
## Research Scientist 0.10010336 0.08307996
## Sales Executive 0.40170698 0.37721323
## Sales Representative 0.49362193 0.43011392
## MaritalStatus.Single `Department.Human Resources`
## Human Resources 0.70866109 13.81664
## Laboratory Technician 0.09481463 14.06096
## Manager 0.25997603 13.81220
## Manufacturing Director 0.07017209 13.83810
## Research Director 0.25012008 13.91120
## Research Scientist 0.09396103 13.85911
## Sales Executive 0.45205797 13.90057
## Sales Representative 0.49830219 13.91128
## `Department.Research & Development` Department.Sales
## Human Resources 5.588966 5.872572
## Laboratory Technician 5.645735 5.911951
## Manager 5.550055 5.804298
## Manufacturing Director 5.549370 5.804041
## Research Director 5.601843 5.878475
## Research Scientist 5.579726 5.866031
## Sales Executive 5.596493 5.833327
## Sales Representative 5.622554 5.858178
## `BusinessTravel.Non-Travel`
## Human Resources 1.5146794
## Laboratory Technician 1.2897727
## Manager 1.3324990
## Manufacturing Director 0.1926102
## Research Director 1.2984347
## Research Scientist 1.2923830
## Sales Executive 1.3614794
## Sales Representative 1.3787395
## BusinessTravel.Travel_Frequently
## Human Resources 1.8406023
## Laboratory Technician 1.7158881
## Manager 1.7572128
## Manufacturing Director 0.2349405
## Research Director 1.7137198
## Research Scientist 1.7191772
## Sales Executive 1.7628996
## Sales Representative 1.7772248
## BusinessTravel.Travel_Rarely BusinessTravel.TravelRarely
## Human Resources 2.0719116 24.931826
## Laboratory Technician 2.0035403 25.001326
## Manager 2.0484754 25.503782
## Manufacturing Director 0.2687395 3.286893
## Research Director 1.9982559 24.883985
## Research Scientist 2.0073552 25.048710
## Sales Executive 2.0359968 25.016376
## Sales Representative 2.0454745 25.079538
## `EducationField.Human Resources`
## Human Resources 2.749063
## Laboratory Technician 3.373668
## Manager 3.301233
## Manufacturing Director 2.913538
## Research Director 3.570334
## Research Scientist 2.986541
## Sales Executive 3.150120
## Sales Representative 3.455380
## `EducationField.Life Sciences` EducationField.Marketing
## Human Resources 13.96527 63.08817
## Laboratory Technician 13.95519 63.09054
## Manager 13.95334 63.07935
## Manufacturing Director 13.95213 63.08965
## Research Director 13.95673 63.08799
## Research Scientist 13.95299 63.09115
## Sales Executive 13.95720 63.07892
## Sales Representative 13.96181 63.07955
## EducationField.Medical EducationField.Other
## Human Resources 12.92765 6.781406
## Laboratory Technician 12.90517 6.725455
## Manager 12.90379 6.730952
## Manufacturing Director 12.90229 6.723685
## Research Director 12.90688 6.732548
## Research Scientist 12.90314 6.724446
## Sales Executive 12.90961 6.768577
## Sales Representative 12.91536 6.858914
## `EducationField.Technical Degree` OverTime.No
## Human Resources 7.965800 0.51246828
## Laboratory Technician 7.903698 0.07095825
## Manager 7.907328 0.14187310
## Manufacturing Director 7.901342 0.05099014
## Research Director 7.908392 0.13183064
## Research Scientist 7.902306 0.06805939
## Sales Executive 7.912314 0.28381040
## Sales Representative 7.919252 0.34028587
## OverTime.Yes Age MonthlyIncome TotalWorkingYears
## Human Resources 0.51246828 1.3303039 3.5255546 2.0498909
## Laboratory Technician 0.07095825 0.1850113 0.7574677 0.3798090
## Manager 0.14187310 0.5458637 2.2462769 0.8547606
## Manufacturing Director 0.05099014 0.1464605 0.2777899 0.2345761
## Research Director 0.13183064 0.4963485 2.2343777 0.8090067
## Research Scientist 0.06805939 0.1850914 0.7488797 0.3906337
## Sales Executive 0.28381040 0.7781735 2.8575307 1.3814736
## Sales Representative 0.34028587 0.8798438 4.6458426 1.8058351
## YearsAtCompany DistanceFromHome TrainingTimesLastYear
## Human Resources 1.6630746 1.17540095 1.03689344
## Laboratory Technician 0.3153891 0.13140265 0.13566470
## Manager 0.3411488 0.32419139 0.29084910
## Manufacturing Director 0.1614502 0.09505565 0.09469771
## Research Director 0.3295810 0.30650082 0.27306036
## Research Scientist 0.3161834 0.13094267 0.13859892
## Sales Executive 0.7499578 0.57736950 0.62692451
## Sales Representative 1.3626403 0.67082037 0.70038892
## NumCompaniesWorked
## Human Resources 1.2029859
## Laboratory Technician 0.1553793
## Manager 0.3427836
## Manufacturing Director 0.1165053
## Research Director 0.3207304
## Research Scientist 0.1554070
## Sales Executive 0.5931364
## Sales Representative 0.8708416
##
## Residual Deviance: 2220.975
## AIC: 2572.975
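The summary above lists coefficients and standard errors but no test statistics. A common follow-up is a Wald z-test, which divides each coefficient by its standard error. The sketch below uses small made-up matrices; `wald_table`, `coefs`, and `ses` are hypothetical names and are not part of the original analysis. With the fitted model, the inputs would presumably be `summary(multinom_model)$coefficients` and `summary(multinom_model)$standard.errors`. Because some standard errors above are large (about 22.7 for the intercepts), such tests should be read with caution.

```r
# Wald z-test sketch: z = coefficient / standard error, with a two-sided
# p-value from the standard normal distribution.
# NOTE: `wald_table` and the toy numbers below are illustrative assumptions.
wald_table <- function(coefs, ses) {
  stopifnot(identical(dim(coefs), dim(ses)))
  z <- coefs / ses
  p <- 2 * pnorm(abs(z), lower.tail = FALSE)
  list(z = z, p = p)
}

# Toy matrices (rows = outcome classes, columns = predictors)
coefs <- matrix(c(0.50, -1.20, 0.10, 0.90), nrow = 2,
                dimnames = list(c("ClassA", "ClassB"), c("x1", "x2")))
ses <- matrix(c(0.25, 0.40, 0.20, 0.30), nrow = 2,
              dimnames = list(c("ClassA", "ClassB"), c("x1", "x2")))

wald <- wald_table(coefs, ses)
print(round(wald$z, 3))
print(round(wald$p, 4))
```

Keeping the helper matrix-in, matrix-out means it works unchanged on the class-by-predictor layout that the summary prints.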
# Compute the accuracy of the multinomial model
# Predict JobRole using the model
predicted_class <- predict(multinom_model, newdata = rf_data)
# Compare against the actual values
actual_class <- rf_data$JobRole
# Compute the accuracy
accuracy <- mean(predicted_class == actual_class)
# Print the accuracy
print(paste("Model accuracy:", round(accuracy * 100, 2), "%"))
## [1] "Model accuracy: 74.9 %"
# Confusion matrix to inspect the distribution of predictions
table(Predicted = predicted_class, Actual = actual_class)
## Actual
## Predicted Healthcare Representative Human Resources
## Healthcare Representative 119 0
## Human Resources 0 228
## Laboratory Technician 10 0
## Manager 0 0
## Manufacturing Director 85 0
## Research Director 2 0
## Research Scientist 12 0
## Sales Executive 0 0
## Sales Representative 0 0
## Actual
## Predicted Laboratory Technician Manager
## Healthcare Representative 14 0
## Human Resources 0 0
## Laboratory Technician 135 0
## Manager 0 175
## Manufacturing Director 10 0
## Research Director 0 53
## Research Scientist 69 0
## Sales Executive 0 0
## Sales Representative 0 0
## Actual
## Predicted Manufacturing Director Research Director
## Healthcare Representative 75 5
## Human Resources 0 0
## Laboratory Technician 7 0
## Manager 0 22
## Manufacturing Director 112 2
## Research Director 9 199
## Research Scientist 25 0
## Sales Executive 0 0
## Sales Representative 0 0
## Actual
## Predicted Research Scientist Sales Executive
## Healthcare Representative 10 0
## Human Resources 0 0
## Laboratory Technician 78 0
## Manager 0 0
## Manufacturing Director 21 0
## Research Director 0 0
## Research Scientist 119 0
## Sales Executive 0 224
## Sales Representative 0 4
## Actual
## Predicted Sales Representative
## Healthcare Representative 0
## Human Resources 0
## Laboratory Technician 0
## Manager 0
## Manufacturing Director 0
## Research Director 0
## Research Scientist 0
## Sales Executive 2
## Sales Representative 226
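The diagonal of the table above holds the correctly classified cases, so the reported accuracy can be recovered from the table alone. The sketch below copies the diagonal counts and the column total (228 per actual class) from the output; the variable names are illustrative and not part of the original script.

```r
# Correct predictions per class: the diagonal of the confusion table above
correct <- c(
  "Healthcare Representative" = 119,
  "Human Resources"           = 228,
  "Laboratory Technician"     = 135,
  "Manager"                   = 175,
  "Manufacturing Director"    = 112,
  "Research Director"         = 199,
  "Research Scientist"        = 119,
  "Sales Executive"           = 224,
  "Sales Representative"      = 226
)

# Every actual class (column) sums to 228 rows, and there are 9 classes
n_total <- 228 * length(correct)

# Accuracy = correctly classified cases / all cases
accuracy_from_table <- sum(correct) / n_total
print(round(accuracy_from_table * 100, 2))
```

With the table stored in an object, say `tab`, the same quantity is `sum(diag(tab)) / sum(tab)`.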
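# Compute per-class statistics with caret; in the default printout, recall is
# reported as Sensitivity and precision as Pos Pred Value (F1 is not printed)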
confusion <- confusionMatrix(factor(predicted_class), factor(actual_class))
print(confusion)
## Confusion Matrix and Statistics
##
## Reference
## Prediction Healthcare Representative Human Resources
## Healthcare Representative 119 0
## Human Resources 0 228
## Laboratory Technician 10 0
## Manager 0 0
## Manufacturing Director 85 0
## Research Director 2 0
## Research Scientist 12 0
## Sales Executive 0 0
## Sales Representative 0 0
## Reference
## Prediction Laboratory Technician Manager
## Healthcare Representative 14 0
## Human Resources 0 0
## Laboratory Technician 135 0
## Manager 0 175
## Manufacturing Director 10 0
## Research Director 0 53
## Research Scientist 69 0
## Sales Executive 0 0
## Sales Representative 0 0
## Reference
## Prediction Manufacturing Director Research Director
## Healthcare Representative 75 5
## Human Resources 0 0
## Laboratory Technician 7 0
## Manager 0 22
## Manufacturing Director 112 2
## Research Director 9 199
## Research Scientist 25 0
## Sales Executive 0 0
## Sales Representative 0 0
## Reference
## Prediction Research Scientist Sales Executive
## Healthcare Representative 10 0
## Human Resources 0 0
## Laboratory Technician 78 0
## Manager 0 0
## Manufacturing Director 21 0
## Research Director 0 0
## Research Scientist 119 0
## Sales Executive 0 224
## Sales Representative 0 4
## Reference
## Prediction Sales Representative
## Healthcare Representative 0
## Human Resources 0
## Laboratory Technician 0
## Manager 0
## Manufacturing Director 0
## Research Director 0
## Research Scientist 0
## Sales Executive 2
## Sales Representative 226
##
## Overall Statistics
##
## Accuracy : 0.749
## 95% CI : (0.7297, 0.7677)
## No Information Rate : 0.1111
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.7177
##
## Mcnemar's Test P-Value : NA
##
## Statistics by Class:
##
## Class: Healthcare Representative Class: Human Resources
## Sensitivity 0.52193 1.0000
## Specificity 0.94298 1.0000
## Pos Pred Value 0.53363 1.0000
## Neg Pred Value 0.94040 1.0000
## Prevalence 0.11111 0.1111
## Detection Rate 0.05799 0.1111
## Detection Prevalence 0.10867 0.1111
## Balanced Accuracy 0.73246 1.0000
## Class: Laboratory Technician Class: Manager
## Sensitivity 0.59211 0.76754
## Specificity 0.94792 0.98794
## Pos Pred Value 0.58696 0.88832
## Neg Pred Value 0.94896 0.97143
## Prevalence 0.11111 0.11111
## Detection Rate 0.06579 0.08528
## Detection Prevalence 0.11209 0.09600
## Balanced Accuracy 0.77001 0.87774
## Class: Manufacturing Director Class: Research Director
## Sensitivity 0.49123 0.87281
## Specificity 0.93531 0.96491
## Pos Pred Value 0.48696 0.75665
## Neg Pred Value 0.93633 0.98379
## Prevalence 0.11111 0.11111
## Detection Rate 0.05458 0.09698
## Detection Prevalence 0.11209 0.12817
## Balanced Accuracy 0.71327 0.91886
## Class: Research Scientist Class: Sales Executive
## Sensitivity 0.52193 0.9825
## Specificity 0.94189 0.9989
## Pos Pred Value 0.52889 0.9912
## Neg Pred Value 0.94034 0.9978
## Prevalence 0.11111 0.1111
## Detection Rate 0.05799 0.1092
## Detection Prevalence 0.10965 0.1101
## Balanced Accuracy 0.73191 0.9907
## Class: Sales Representative
## Sensitivity 0.9912
## Specificity 0.9978
## Pos Pred Value 0.9826
## Neg Pred Value 0.9989
## Prevalence 0.1111
## Detection Rate 0.1101
## Detection Prevalence 0.1121
## Balanced Accuracy 0.9945
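The printout above reports recall as Sensitivity and precision as Pos Pred Value, but it does not show an F1-score. As a minimal sketch, F1 can be derived from the counts in the confusion matrix: the diagonal (correct predictions), the row sums (cases predicted as each class), and the column sums (actual cases, 228 per class). The counts are copied from the output above; the variable names are illustrative.

```r
classes <- c("Healthcare Representative", "Human Resources",
             "Laboratory Technician", "Manager", "Manufacturing Director",
             "Research Director", "Research Scientist",
             "Sales Executive", "Sales Representative")

# Diagonal of the confusion matrix: correctly predicted cases per class
correct <- c(119, 228, 135, 175, 112, 199, 119, 224, 226)
# Row sums: how many cases were predicted as each class
predicted_total <- c(223, 228, 230, 197, 230, 263, 225, 226, 230)
# Column sums: how many cases actually belong to each class
actual_total <- rep(228, length(classes))

precision <- correct / predicted_total   # "Pos Pred Value" in caret
recall    <- correct / actual_total      # "Sensitivity" in caret
f1        <- 2 * precision * recall / (precision + recall)

metrics <- data.frame(class = classes,
                      precision = round(precision, 4),
                      recall = round(recall, 4),
                      f1 = round(f1, 4))
print(metrics)

# Unweighted (macro) average of the per-class F1-scores
macro_f1 <- mean(f1)
print(round(macro_f1, 4))
```

The precision and recall values computed this way should agree with the Pos Pred Value and Sensitivity rows printed above, which is a quick check that the counts were copied correctly.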
```