A neural network is a type of machine learning algorithm that is designed to recognize patterns in data. It’s inspired by the structure and function of the human brain, where interconnected neurons work together to process and transmit information. Neural networks consist of layers of interconnected nodes (neurons) that process input data and produce output. These layers are typically organized into three main types:
- Input Layer: This layer receives the raw input data and passes it on to the next layer for processing.
- Hidden Layers: These are one or more layers between the input and output layers. Each neuron in a hidden layer processes the information it receives from the previous layer and passes the output to the next layer. The hidden layers are responsible for learning complex patterns in the data.
- Output Layer: This layer produces the final prediction or output based on the information processed by the hidden layers. The structure of the output layer depends on the type of problem being solved, such as classification or regression.
Neural networks “learn” from data through a process called training. During training, the network adjusts the connections between neurons (weights) based on the input data and the desired output. This adjustment is done iteratively using optimization algorithms that minimize the difference between the predicted output and the actual target.
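To make the idea of iterative weight updates concrete, here is a minimal sketch (not part of the attrition analysis below) that trains a single logistic neuron by gradient descent on a toy data set; all object names (x, y, w, b, alpha) are introduced here purely for illustration.
# A single logistic neuron trained by gradient descent on a toy data set.
set.seed(1)
x <- matrix(rnorm(200), ncol = 2)          # 100 observations, 2 toy features
y <- as.numeric(x[, 1] + x[, 2] > 0)       # toy binary target
w <- c(0, 0); b <- 0; alpha <- 0.1         # initial weights, bias, learning rate
for (step in 1:500) {
  p <- 1 / (1 + exp(-(x %*% w + b)))       # predicted probabilities (sigmoid)
  grad <- as.vector(p) - y                 # gradient of the cross-entropy loss
  w <- w - alpha * colMeans(x * grad)      # move weights against the gradient
  b <- b - alpha * mean(grad)              # move bias against the gradient
}
p <- 1 / (1 + exp(-(x %*% w + b)))
mean((p > 0.5) == y)                       # training accuracy of the toy neuron
Each pass through the loop nudges the weights in the direction that reduces the loss, which is exactly the iterative adjustment described above; real networks do the same thing for many neurons and layers at once.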
Neural networks have gained significant popularity due to their ability to learn from complex and high-dimensional data, making them suitable for tasks such as image and speech recognition, language translation, playing games, autonomous driving, and more. They can capture intricate relationships in data that may be difficult for traditional algorithms to discover.
While neural networks can achieve remarkable accuracy, they also come with challenges. They require large amounts of data for training, and determining the right architecture (number of layers and neurons) and optimization techniques can be complex. Deep learning, a subset of neural networks, involves using multiple layers to create complex models and has been a driving force behind many recent breakthroughs in AI.
In summary, neural networks are a powerful tool in the field of machine learning, capable of learning and representing complex patterns in data. They have enabled significant advancements in various domains and continue to be a focus of research and development.
The neuralnet() function requires all features to be in numeric form (dummy variables for categorical features, normalized numerical features). The model formula in neuralnet() requires dummy variables to be explicitly defined. It is also highly recommended to scale all numerical features before including them in the network model. The objective is to find all feature names (numeric features and all dummy variables) and write them in the model formula in the same way as in glm(): response ~ var_1 + var_2 + ... + var_k.
To explain the modeling process in detail, we will outline major steps in the following subsections.
There are different types of scaling and standardization. The one we use in the following is min-max scaling:
\[ scaled.var = \frac{orig.var - \min(orig.var)}{\max(orig.var)-\min(orig.var)} \]
The scaled numeric feature is unitless and lies in [0, 1] (similar in spirit to the well-known z-score transformation, which is also unitless). A more compact way to apply this transformation is sketched after the code block below.
attrition = read.csv("https://raw.githubusercontent.com/Tenam01/DATASETS/main/cleanedattrition2.csv")
attrition$Age = (attrition$Age-min(attrition$Age))/(max(attrition$Age)-min(attrition$Age))
attrition$DistanceFromHome = (attrition$DistanceFromHome-min(attrition$DistanceFromHome))/(max(attrition$DistanceFromHome)-min(attrition$DistanceFromHome))
attrition$Education = (attrition$Education-min(attrition$Education))/(max(attrition$Education)-min(attrition$Education))
attrition$EnvironmentSatisfaction = (attrition$EnvironmentSatisfaction-min(attrition$EnvironmentSatisfaction))/(max(attrition$EnvironmentSatisfaction)-min(attrition$EnvironmentSatisfaction))
attrition$JobInvolvement = (attrition$JobInvolvement-min(attrition$JobInvolvement))/(max(attrition$JobInvolvement)-min(attrition$JobInvolvement))
attrition$JobLevel = (attrition$JobLevel-min(attrition$JobLevel))/(max(attrition$JobLevel)-min(attrition$JobLevel))
attrition$JobSatisfaction = (attrition$JobSatisfaction-min(attrition$JobSatisfaction))/(max(attrition$JobSatisfaction)-min(attrition$JobSatisfaction))
attrition$NumCompaniesWorked = (attrition$NumCompaniesWorked-min(attrition$NumCompaniesWorked))/(max(attrition$NumCompaniesWorked)-min(attrition$NumCompaniesWorked))
attrition$MonthlyIncome = (attrition$MonthlyIncome-min(attrition$MonthlyIncome))/(max(attrition$MonthlyIncome)-min(attrition$MonthlyIncome))
attrition$PercentSalaryHike = (attrition$PercentSalaryHike-min(attrition$PercentSalaryHike))/(max(attrition$PercentSalaryHike)-min(attrition$PercentSalaryHike))
attrition$PerformanceRating = (attrition$PerformanceRating-min(attrition$PerformanceRating))/(max(attrition$PerformanceRating)-min(attrition$PerformanceRating))
attrition$RelationshipSatisfaction = (attrition$RelationshipSatisfaction-min(attrition$RelationshipSatisfaction))/(max(attrition$RelationshipSatisfaction)-min(attrition$RelationshipSatisfaction))
attrition$StockOptionLevel = (attrition$StockOptionLevel-min(attrition$StockOptionLevel))/(max(attrition$StockOptionLevel)-min(attrition$StockOptionLevel))
attrition$TrainingTimesLastYear = (attrition$TrainingTimesLastYear-min(attrition$TrainingTimesLastYear))/(max(attrition$TrainingTimesLastYear)-min(attrition$TrainingTimesLastYear))
attrition$WorkLifeBalance = (attrition$WorkLifeBalance-min(attrition$WorkLifeBalance))/(max(attrition$WorkLifeBalance)-min(attrition$WorkLifeBalance))
attrition$YearsInCurrentRole = (attrition$YearsInCurrentRole-min(attrition$YearsInCurrentRole))/(max(attrition$YearsInCurrentRole)-min(attrition$YearsInCurrentRole))
attrition$YearsSinceLastPromotion = (attrition$YearsSinceLastPromotion-min(attrition$YearsSinceLastPromotion))/(max(attrition$YearsSinceLastPromotion)-min(attrition$YearsSinceLastPromotion))
attrition$YearsWithCurrManager = (attrition$YearsWithCurrManager-min(attrition$YearsWithCurrManager))/(max(attrition$YearsWithCurrManager)-min(attrition$YearsWithCurrManager))
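Instead of writing one statement per variable as above, the same min-max transformation can be applied in one step with a small helper function. This is a minimal sketch of an alternative (not to be run in addition to the statements above); minMaxScale and numericVars are names introduced here for illustration.
# A compact alternative to the per-variable statements above:
# scale every numeric feature with one helper function.
minMaxScale <- function(x) (x - min(x)) / (max(x) - min(x))
numericVars <- c("Age", "DistanceFromHome", "Education", "EnvironmentSatisfaction",
                 "JobInvolvement", "JobLevel", "JobSatisfaction", "NumCompaniesWorked",
                 "MonthlyIncome", "PercentSalaryHike", "PerformanceRating",
                 "RelationshipSatisfaction", "StockOptionLevel", "TrainingTimesLastYear",
                 "WorkLifeBalance", "YearsInCurrentRole", "YearsSinceLastPromotion",
                 "YearsWithCurrManager")
attrition[numericVars] <- lapply(attrition[numericVars], minMaxScale)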
In practical applications, there may be many categorical features in the model, and each categorical feature could have many categories. It is practically infeasible to write all of the resulting dummy variables explicitly. We can use the R function model.matrix() to extract the variables (including the dummy variables) from a model formula that will be used in the model. Make sure all categorical feature variables are stored in a non-numerical form (i.e., not numerically encoded). We can also use the R function relevel() to change the baseline of an unordered categorical feature variable.
# Convert "Attrition" to a factor variable
attrition$Attrition <- factor(attrition$Attrition, levels = c("No", "Yes"))
# Change baseline for Attrition variable (assuming "No" is the baseline)
attrition$Attrition <- relevel(attrition$Attrition, ref = "No")
# Convert "BusinessTravel" to a factor variable
attrition$BusinessTravel <- factor(attrition$BusinessTravel, levels = c("Travel_Rarely", "Travel_Frequently", "Travel_Non"))
# Change baseline for BusinessTravel variable (assuming "Travel_Rarely" is the baseline)
attrition$BusinessTravel <- relevel(attrition$BusinessTravel, ref = "Travel_Rarely")
# Convert "Department" to a factor variable
attrition$Department <- factor(attrition$Department, levels = c("Human Resources", "Research & Development", "Sales"))
# Change baseline for Department variable (assuming "Sales" is the baseline)
attrition$Department <- relevel(attrition$Department, ref = "Sales")
# Convert "EducationField" to a factor variable
attrition$EducationField <- factor(attrition$EducationField, levels = c("Human Resources", "Life Sciences", "Marketing", "Medical", "Other", "Technical Degree"))
# Change baseline for EducationField variable (assuming "Technical Degree" is the baseline)
attrition$EducationField <- relevel(attrition$EducationField, ref = "Technical Degree")
# Convert "Gender" to a factor variable
attrition$Gender <- factor(attrition$Gender, levels = c("Female", "Male"))
# Change baseline for Gender variable (assuming "Male" is the baseline)
attrition$Gender <- relevel(attrition$Gender, ref = "Male")
# Convert "JobRole" to a factor variable
attrition$JobRole <- factor(attrition$JobRole, levels = c("Healthcare Representative", "Human Resources", "Laboratory Technician", "Manager", "Manufacturing Director", "Research Director", "Research Scientist", "Sales Executive", "Sales Representative"))
# Change baseline for JobRole variable (assuming "Research Director" is the baseline)
attrition$JobRole <- relevel(attrition$JobRole, ref = "Research Director")
# Convert "MaritalStatus" to a factor variable
attrition$MaritalStatus <- factor(attrition$MaritalStatus, levels = c("Divorced", "Married", "Single"))
# Change baseline for MaritalStatus variable (assuming "Single" is the baseline)
attrition$MaritalStatus <- relevel(attrition$MaritalStatus, ref = "Single")
# Convert "OverTime" to a factor variable
attrition$OverTime <- factor(attrition$OverTime, levels = c("No", "Yes"))
# Change baseline for OverTime variable (assuming "No" is the baseline)
attrition$OverTime <- relevel(attrition$OverTime, ref = "No")
Next, we use the R function model.matrix() to extract the names of all feature variables, including the dummy variables that model.matrix() defines implicitly.
# Create a model matrix for modeling
attritionMtx <- model.matrix(~ ., data = attrition)
# Get the column names of the model matrix
colnames(attritionMtx)
## [1] "(Intercept)" "Age"
## [3] "AttritionYes" "BusinessTravelTravel_Frequently"
## [5] "BusinessTravelTravel_Non" "DepartmentHuman Resources"
## [7] "DepartmentResearch & Development" "DistanceFromHome"
## [9] "Education" "EducationFieldHuman Resources"
## [11] "EducationFieldLife Sciences" "EducationFieldMarketing"
## [13] "EducationFieldMedical" "EducationFieldOther"
## [15] "EnvironmentSatisfaction" "GenderFemale"
## [17] "JobInvolvement" "JobLevel"
## [19] "JobRoleHealthcare Representative" "JobRoleHuman Resources"
## [21] "JobRoleLaboratory Technician" "JobRoleManager"
## [23] "JobRoleManufacturing Director" "JobRoleResearch Scientist"
## [25] "JobRoleSales Executive" "JobRoleSales Representative"
## [27] "JobSatisfaction" "MaritalStatusDivorced"
## [29] "MaritalStatusMarried" "MonthlyIncome"
## [31] "NumCompaniesWorked" "OverTimeYes"
## [33] "PercentSalaryHike" "PerformanceRating"
## [35] "RelationshipSatisfaction" "StockOptionLevel"
## [37] "TrainingTimesLastYear" "WorkLifeBalance"
## [39] "YearsInCurrentRole" "YearsSinceLastPromotion"
## [41] "YearsWithCurrManager"
Some of the dummy variable names above contain spaces and special characters. They are fine for regular linear and generalized linear regression models, but they cause problems in the network model formula, so we rename them to remove the special characters before building the neural network model. These issues could have been avoided at the feature-engineering stage had we initially planned to build neural network models. Next, we clean up the variable names before defining the network model formula.
colnames(attritionMtx)[4] <- "BusinessTravelTravelFreq"
colnames(attritionMtx)[5] <- "BusinessTravelTravelNon"
colnames(attritionMtx)[6] <- "DepartmentHumanResources"
colnames(attritionMtx)[7] <- "DepartmentResearchDevelopment"
colnames(attritionMtx)[10] <- "EducationFieldHumanResources"
colnames(attritionMtx)[11] <- "EducationFieldLifeSciences"
colnames(attritionMtx)[19] <- "JobRoleHealthcareRepresentative"
colnames(attritionMtx)[20] <- "JobRoleHumanResources"
colnames(attritionMtx)[21] <- "JobRoleLaboratorytechnician"
colnames(attritionMtx)[23] <- "JobRoleManufacturingDirector"
colnames(attritionMtx)[24] <- "JobRoleResearchScientist"
colnames(attritionMtx)[25] <- "JobRoleSalesExecutive"
colnames(attritionMtx)[26] <- "JobRoleSalesRepresentative"
For convenience, we encourage you to use CamelCase notation when naming feature variables (CamelCase separates the words in a phrase by capitalizing the first letter of each word and removing spaces). An automated alternative to the manual renaming above is sketched below.
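As an illustration only (not run here, since the columns above were already renamed by hand), one gsub() call can strip every space and punctuation character from the column names at once; the resulting names are close to CamelCase as long as the original factor levels are capitalized.
# Hypothetical one-liner: drop all non-alphanumeric characters from the
# column names of the model matrix (an alternative to the manual renaming above).
colnames(attritionMtx) <- gsub("[^[:alnum:]]", "", colnames(attritionMtx))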
# Get the column names of the model matrix (excluding intercept)
columnNames <- colnames(attritionMtx)[-1]
# Replace invalid characters and spaces in column names
columnNames <- gsub("[[:space:]]", "_", columnNames) # Replace spaces with underscores
columnNames <- gsub("[[:punct:]]", "_", columnNames) # Replace punctuations with underscores
# Create a formula string without AttritionYes on the right-hand side
modelFormula <- as.formula(paste("AttritionYes ~",
paste(setdiff(columnNames, "AttritionYes"), collapse = " + ")))
# Display the model formula
modelFormula
## AttritionYes ~ Age + BusinessTravelTravelFreq + BusinessTravelTravelNon +
## DepartmentHumanResources + DepartmentResearchDevelopment +
## DistanceFromHome + Education + EducationFieldHumanResources +
## EducationFieldLifeSciences + EducationFieldMarketing + EducationFieldMedical +
## EducationFieldOther + EnvironmentSatisfaction + GenderFemale +
## JobInvolvement + JobLevel + JobRoleHealthcareRepresentative +
## JobRoleHumanResources + JobRoleLaboratorytechnician + JobRoleManager +
## JobRoleManufacturingDirector + JobRoleResearchScientist +
## JobRoleSalesExecutive + JobRoleSalesRepresentative + JobSatisfaction +
## MaritalStatusDivorced + MaritalStatusMarried + MonthlyIncome +
## NumCompaniesWorked + OverTimeYes + PercentSalaryHike + PerformanceRating +
## RelationshipSatisfaction + StockOptionLevel + TrainingTimesLastYear +
## WorkLifeBalance + YearsInCurrentRole + YearsSinceLastPromotion +
## YearsWithCurrManager
We follow the routine steps for building a neural network model to predict Attrition.
We split the data into 70% for training the neural network and 30% for testing.
n = dim(attritionMtx)[1]
trainID = sample(1:n, round(n*0.7), replace = FALSE) # randomly pick 70% of the rows for training
trainDat = attritionMtx[trainID,]
testDat = attritionMtx[-trainID,] # the remaining 30% are held out for testing
NetworkModel = neuralnet(modelFormula,
data = trainDat,
hidden = 1, # one hidden layer with a single neuron
rep = 1, # number of repetitions (fits) in training the NN
threshold = 0.01, # threshold on the partial derivatives of the error function, used as the stopping criterion
learningrate = 0.1, # learning rate; only used by algorithm = "backprop", ignored by "rprop+"
algorithm = "rprop+"
)
kable(NetworkModel$result.matrix)
|  | Value |
|---|---|
| error | 10.2249312 |
| reached.threshold | 0.0098644 |
| steps | 5295.0000000 |
| Intercept.to.1layhid1 | 34.2345484 |
| Age.to.1layhid1 | -36.2326672 |
| BusinessTravelTravelFreq.to.1layhid1 | 12.7631513 |
| BusinessTravelTravelNon.to.1layhid1 | 0.2239382 |
| DepartmentHumanResources.to.1layhid1 | 3.7016882 |
| DepartmentResearchDevelopment.to.1layhid1 | -0.5630887 |
| DistanceFromHome.to.1layhid1 | 11.5560392 |
| Education.to.1layhid1 | 2.6507646 |
| EducationFieldHumanResources.to.1layhid1 | -11.6188136 |
| EducationFieldLifeSciences.to.1layhid1 | -7.0547274 |
| EducationFieldMarketing.to.1layhid1 | -4.9758298 |
| EducationFieldMedical.to.1layhid1 | -7.8774701 |
| EducationFieldOther.to.1layhid1 | -6.3574417 |
| EnvironmentSatisfaction.to.1layhid1 | -13.0527514 |
| GenderFemale.to.1layhid1 | 1.6887635 |
| JobInvolvement.to.1layhid1 | -26.6435888 |
| JobLevel.to.1layhid1 | 38.2467912 |
| JobRoleHealthcareRepresentative.to.1layhid1 | 4.9651668 |
| JobRoleHumanResources.to.1layhid1 | 14.0401330 |
| JobRoleLaboratorytechnician.to.1layhid1 | -2.1312997 |
| JobRoleManager.to.1layhid1 | -489.2852726 |
| JobRoleManufacturingDirector.to.1layhid1 | -453.1908570 |
| JobRoleResearchScientist.to.1layhid1 | 7.1827604 |
| JobRoleSalesExecutive.to.1layhid1 | 12.8123777 |
| JobRoleSalesRepresentative.to.1layhid1 | 15.1326463 |
| JobSatisfaction.to.1layhid1 | -19.4989237 |
| MaritalStatusDivorced.to.1layhid1 | -9.6420330 |
| MaritalStatusMarried.to.1layhid1 | -2.3013135 |
| MonthlyIncome.to.1layhid1 | -50.2861920 |
| NumCompaniesWorked.to.1layhid1 | 17.7880731 |
| OverTimeYes.to.1layhid1 | 13.6985724 |
| PercentSalaryHike.to.1layhid1 | -6.4494042 |
| PerformanceRating.to.1layhid1 | 7.0671475 |
| RelationshipSatisfaction.to.1layhid1 | -0.0520577 |
| StockOptionLevel.to.1layhid1 | -13.6518023 |
| TrainingTimesLastYear.to.1layhid1 | -27.2224193 |
| WorkLifeBalance.to.1layhid1 | 1.2252818 |
| YearsInCurrentRole.to.1layhid1 | -0.5336293 |
| YearsSinceLastPromotion.to.1layhid1 | -1.6566077 |
| YearsWithCurrManager.to.1layhid1 | -0.7181540 |
| Intercept.to.AttritionYes | 0.0519850 |
| 1layhid1.to.AttritionYes | 0.9355885 |
plot(NetworkModel, rep="best")
Figure 12. Single-layer back-propagation neural network model for employee attrition.
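The model above uses only one hidden neuron. The hidden argument of neuralnet() also accepts a vector with one entry per hidden layer, which is how deeper architectures are specified. The following is a minimal sketch (not fitted or evaluated in this section); DeepModel and the threshold/stepmax values are illustrative choices, and deeper networks may need an even larger stepmax to converge.
# Hedged sketch: a network with two hidden layers (5 and 3 neurons) on the same training data.
DeepModel <- neuralnet(modelFormula,
                       data = trainDat,
                       hidden = c(5, 3),      # two hidden layers: 5 neurons, then 3 neurons
                       threshold = 0.05,
                       stepmax = 5e5,         # allow more iterations for the larger network
                       linear.output = FALSE, # logistic activation on the output for classification
                       algorithm = "rprop+")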
For comparison, we also fit a logistic regression model to the same (scaled) attrition data.
logiModel = glm(factor(Attrition) ~., family = binomial, data = attrition)
pander(summary(logiModel)$coefficients)
|  | Estimate | Std. Error | z value | Pr(>\|z\|) |
|---|---|---|---|---|
| (Intercept) | 0.3545 | 1.608 | 0.2204 | 0.8256 |
| Age | -1.502 | 0.5208 | -2.884 | 0.003921 |
| BusinessTravelTravel_Frequently | 0.8802 | 0.2047 | 4.299 | 1.713e-05 |
| DepartmentHuman Resources | -12.63 | 446 | -0.02832 | 0.9774 |
| DepartmentResearch & Development | 0.2472 | 1.113 | 0.222 | 0.8243 |
| DistanceFromHome | 1.196 | 0.3055 | 3.913 | 9.099e-05 |
| Education | 0.09759 | 0.3507 | 0.2783 | 0.7808 |
| EducationFieldHuman Resources | -0.02841 | 0.8175 | -0.03475 | 0.9723 |
| EducationFieldLife Sciences | -0.8385 | 0.304 | -2.758 | 0.005809 |
| EducationFieldMarketing | -0.4895 | 0.3912 | -1.251 | 0.2109 |
| EducationFieldMedical | -0.903 | 0.3136 | -2.879 | 0.003989 |
| EducationFieldOther | -1.133 | 0.4889 | -2.318 | 0.02043 |
| EnvironmentSatisfaction | -1.07 | 0.2452 | -4.363 | 1.282e-05 |
| GenderFemale | -0.3189 | 0.1842 | -1.731 | 0.08339 |
| JobInvolvement | -1.677 | 0.3706 | -4.525 | 6.042e-06 |
| JobLevel | -0.6883 | 1.183 | -0.5818 | 0.5607 |
| JobRoleHealthcare Representative | 1.016 | 0.9542 | 1.065 | 0.287 |
| JobRoleHuman Resources | 15.17 | 446 | 0.03402 | 0.9729 |
| JobRoleLaboratory Technician | 2.614 | 0.9983 | 2.619 | 0.00883 |
| JobRoleManager | 1.406 | 1.07 | 1.315 | 0.1886 |
| JobRoleManufacturing Director | 1.203 | 0.9399 | 1.28 | 0.2006 |
| JobRoleResearch Scientist | 1.699 | 1.001 | 1.698 | 0.08956 |
| JobRoleSales Executive | 2.377 | 1.422 | 1.672 | 0.0946 |
| JobRoleSales Representative | 3.43 | 1.51 | 2.272 | 0.02311 |
| JobSatisfaction | -1.041 | 0.2406 | -4.329 | 1.501e-05 |
| MaritalStatusDivorced | -1.014 | 0.3419 | -2.967 | 0.003009 |
| MaritalStatusMarried | -0.7231 | 0.248 | -2.916 | 0.003546 |
| MonthlyIncome | 0.7883 | 1.503 | 0.5246 | 0.5999 |
| NumCompaniesWorked | 1.306 | 0.3418 | 3.822 | 0.0001326 |
| OverTimeYes | 1.82 | 0.1904 | 9.558 | 1.197e-21 |
| PercentSalaryHike | -0.3553 | 0.5425 | -0.6549 | 0.5126 |
| PerformanceRating | 0.02365 | 0.396 | 0.05972 | 0.9524 |
| RelationshipSatisfaction | -0.6659 | 0.2483 | -2.681 | 0.00733 |
| StockOptionLevel | -0.6092 | 0.4656 | -1.309 | 0.1907 |
| TrainingTimesLastYear | -0.9523 | 0.4452 | -2.139 | 0.03244 |
| WorkLifeBalance | -0.9719 | 0.3725 | -2.609 | 0.009082 |
| YearsInCurrentRole | -1.111 | 0.7599 | -1.462 | 0.1438 |
| YearsSinceLastPromotion | -0.4548 | 0.7042 | -0.6458 | 0.5184 |
Cross-validation is primarily used for tuning hyperparameters. For example, in the sigmoid perceptron, the optimal cut-off score for the binary decision can be obtained through cross-validation. Another important hyperparameter in a neural network model is the learning rate \(\alpha\) in the backpropagation algorithm, which controls how fast the network learns during training.
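Before tuning the cut-off score below, here is a minimal sketch of how the learning rate itself could be tuned with the same 5-fold scheme. It is not run in this section; alphaGrid, cvAccuracy, and fit are names introduced here, and the classical "backprop" algorithm is used because it is the one in neuralnet() that actually uses the learningrate argument (a larger stepmax may be needed for convergence).
# Hedged sketch: 5-fold cross-validation over a small grid of learning rates.
alphaGrid <- c(0.01, 0.05, 0.1, 0.5)
cvAccuracy <- numeric(length(alphaGrid))
n0 <- floor(dim(trainDat)[1] / 5)                 # fold size
for (k in seq_along(alphaGrid)) {
  acc <- numeric(5)
  for (i in 1:5) {
    valid.id <- ((i - 1) * n0 + 1):(i * n0)
    fit <- neuralnet(modelFormula, data = trainDat[-valid.id, ],
                     hidden = 1, threshold = 0.01,
                     algorithm = "backprop",       # classical backpropagation
                     learningrate = alphaGrid[k],
                     linear.output = FALSE)
    pred <- predict(fit, trainDat[valid.id, ])
    acc[i] <- mean((pred > 0.5) == trainDat[valid.id, "AttritionYes"])
  }
  cvAccuracy[k] <- mean(acc)                       # average validation accuracy for this rate
}
alphaGrid[which.max(cvAccuracy)]                   # learning rate with the best CV accuracy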
n0 <- floor(dim(trainDat)[1] / 5)   # (approximate) fold size
cut.off.score <- seq(0, 1, length = 22)[-c(1, 22)]
pred.accuracy <- matrix(0, ncol = 20, nrow = 5, byrow = TRUE)
for (i in 1:5) {
valid.id <- ((i - 1) * n0 + 1):(i * n0)
valid.data <- trainDat[valid.id,]
train.data <- trainDat[-valid.id,]
train.model <- neuralnet(modelFormula,
data = train.data,
hidden = 1,
rep = 1,
threshold = 0.01,
learningrate = 0.1,
algorithm = "rprop+"
)
pred.nn.score <- predict(train.model, valid.data) # Use train.model for prediction
for (j in 1:20) {
pred.status <- as.numeric(pred.nn.score > cut.off.score[j])
a11 <- sum(pred.status == valid.data[, "AttritionYes"])  # number of correct predictions in this fold
pred.accuracy[i, j] <- a11 / length(pred.nn.score)
}
}
avg.accuracy <- apply(pred.accuracy, 2, mean)
max.id <- which.max(avg.accuracy)
tick.label <- as.character(round(cut.off.score, 2))
plot(1:20, avg.accuracy, type = "b",
xlim = c(1, 20),
ylim = c(0.5, 1),
axes = FALSE,
xlab = "Cut-off Score",
ylab = "Accuracy",
main = "5-fold CV performance"
)
axis(1, at = 1:20, label = tick.label, las = 2)
axis(2)
segments(max.id, 0.5, max.id, avg.accuracy[max.id], col = "red")
text(max.id, avg.accuracy[max.id] + 0.03, as.character(round(avg.accuracy[max.id], 4)), col = "red", cex = 0.8)
# Evaluate the fitted network on the hold-out test data, using the cut-off score suggested by the cross-validation above (about 0.57)
nn.results <- predict(NetworkModel, testDat)
results <- data.frame(actual = testDat[,"AttritionYes"], prediction = nn.results > .57)
confMatrix = table(results$prediction, results$actual) # confusion matrix
accuracy=sum(results$actual == results$prediction)/length(results$prediction)
list(confusion.matrix = confMatrix, accuracy = accuracy)
## $confusion.matrix
##
## 0 1
## FALSE 734 113
## TRUE 32 45
##
## $accuracy
## [1] 0.8430736
Recall that the ROC curve is the plot of sensitivity against (1 - specificity), calculated from the confusion matrices obtained over a sequence of cut-off scores. In terms of the confusion matrix, sensitivity = TP/(TP + FN) is the proportion of actual positives that are correctly identified, and specificity = TN/(TN + FP) is the proportion of actual negatives that are correctly identified; both are illustrated with the test-set confusion matrix in the sketch below.
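As a concrete check, sensitivity and specificity can be read off the test-set confusion matrix computed above. This is a minimal sketch assuming the confMatrix object from the previous chunk, whose rows are the predicted classes (FALSE/TRUE) and columns the actual classes (0/1); TP, FN, TN, and FP are names introduced here for illustration.
# Hedged sketch: sensitivity and specificity from the test-set confusion matrix above.
TP <- confMatrix["TRUE",  "1"];  FN <- confMatrix["FALSE", "1"]   # actual positives
TN <- confMatrix["FALSE", "0"];  FP <- confMatrix["TRUE",  "0"]   # actual negatives
c(sensitivity = TP / (TP + FN), specificity = TN / (TN + FP))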
Next, we construct a ROC for the above NN model based on the training data set.
nn.results = predict(NetworkModel, trainDat) # Keep in mind that trainDat is a matrix!
cut0 = seq(0,1, length = 20)
SenSpe = matrix(0, ncol = length(cut0), nrow = 2, byrow = FALSE)
for (i in 1:length(cut0)){
a = sum(trainDat[,"AttritionYes"] == 1 & (nn.results > cut0[i]))  # true positives
d = sum(trainDat[,"AttritionYes"] == 0 & (nn.results < cut0[i]))  # true negatives
b = sum(trainDat[,"AttritionYes"] == 0 & (nn.results > cut0[i]))  # false positives
c = sum(trainDat[,"AttritionYes"] == 1 & (nn.results < cut0[i]))  # false negatives
sen = a/(a + c)
spe = d/(b + d)
SenSpe[,i] = c(sen, spe)
}
# plotting ROC
plot(1-SenSpe[2,], SenSpe[1,], type ="l", xlim=c(0,1), ylim=c(0,1),
xlab = "1 - specificity", ylab = "Sensitivity", lty = 1,
main = "ROC Curve", col = "blue")
abline(0,1, lty = 2, col = "red")
## Calculate AUC with a crude rectangle (Riemann-sum) approximation
xx = 1-SenSpe[2,]
yy = SenSpe[1,]
width = xx[-length(xx)] - xx[-1]
height = yy[-1]
AUC.approx = sum(width*height) # approximate area under the ROC curve
## A more accurate AUC, computed with the {pROC} package (requires library(pROC))
prediction = as.vector(nn.results)
category = trainDat[,"AttritionYes"] == 1
ROCobj <- roc(category, prediction)
AUC = auc(ROCobj)[1]
text(0.8, 0.3, paste("AUC = ", round(AUC,4)), col = "purple", cex = 0.9)
legend("bottomright", c("ROC of the model", "Random guessing"), lty=c(1,2),
col = c("blue", "red"), bty = "n", cex = 0.8)
Figure 14. ROC curve of the neural network model.
The above ROC curve indicates that the underlying neural network performs better than random guessing, since the area under the curve is substantially greater than 0.5. In general, if the area under the ROC curve is greater than 0.65, we consider the predictive power of the underlying model acceptable.
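For the comparison below, the logistic regression model's AUC can be computed in the same way. This is a minimal sketch assuming the fitted probabilities of logiModel on the data it was trained on (the full attrition data frame); logiPred and logiROC are names introduced here for illustration.
# Hedged sketch: AUC of the logistic regression model fitted earlier.
logiPred <- predict(logiModel, type = "response")       # fitted probabilities
logiROC <- roc(attrition$Attrition == "Yes", logiPred)  # same roc() call pattern as above
auc(logiROC)[1]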
Comparing AUC values: AUC ranges between 0 and 1, and a higher value indicates better discrimination between the two classes. An AUC above 0.5 means a model performs better than random chance, and both models clear that bar. The neural network model has a slightly higher AUC (0.8337) than the logistic regression model (0.8151), suggesting that the neural network is somewhat better at distinguishing between the positive and negative classes under this evaluation.