A neural network is a type of machine learning algorithm that is designed to recognize patterns in data. It’s inspired by the structure and function of the human brain, where interconnected neurons work together to process and transmit information. Neural networks consist of layers of interconnected nodes (neurons) that process input data and produce output. These layers are typically organized into three main types:
- Input Layer: This layer receives the raw input data and passes it on to the next layer for processing.
- Hidden Layers: These are one or more layers between the input and output layers. Each neuron in a hidden layer processes the information it receives from the previous layer and passes the output to the next layer. The hidden layers are responsible for learning complex patterns in the data.
- Output Layer: This layer produces the final prediction or output based on the information processed by the hidden layers. The structure of the output layer depends on the type of problem being solved, such as classification or regression.
Neural networks “learn” from data through a process called training. During training, the network adjusts the connections between neurons (weights) based on the input data and the desired output. This adjustment is done iteratively using optimization algorithms that minimize the difference between the predicted output and the actual target.
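To make the idea of iterative weight updates concrete, here is a minimal sketch (not part of the attrition analysis below) that trains a single logistic neuron by gradient descent on a toy data set; all object names (x, y, w, b, alpha) are introduced here purely for illustration.
# A single logistic neuron trained by gradient descent on a toy data set.
set.seed(1)
x <- matrix(rnorm(200), ncol = 2)          # 100 observations, 2 toy features
y <- as.numeric(x[, 1] + x[, 2] > 0)       # toy binary target
w <- c(0, 0); b <- 0; alpha <- 0.1         # initial weights, bias, learning rate
for (step in 1:500) {
  p <- 1 / (1 + exp(-(x %*% w + b)))       # predicted probabilities (sigmoid)
  grad <- as.vector(p) - y                 # gradient of the cross-entropy loss
  w <- w - alpha * colMeans(x * grad)      # move weights against the gradient
  b <- b - alpha * mean(grad)              # move bias against the gradient
}
p <- 1 / (1 + exp(-(x %*% w + b)))
mean((p > 0.5) == y)                       # training accuracy of the toy neuron
Each pass through the loop nudges the weights in the direction that reduces the loss, which is exactly the iterative adjustment described above; real networks do the same thing for many neurons and layers at once.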
Neural networks have gained significant popularity due to their ability to learn from complex and high-dimensional data, making them suitable for tasks such as image and speech recognition, language translation, playing games, autonomous driving, and more. They can capture intricate relationships in data that may be difficult for traditional algorithms to discover.
While neural networks can achieve remarkable accuracy, they also come with challenges. They require large amounts of data for training, and determining the right architecture (number of layers and neurons) and optimization techniques can be complex. Deep learning, a subset of neural networks, involves using multiple layers to create complex models and has been a driving force behind many recent breakthroughs in AI.
In summary, neural networks are a powerful tool in the field of machine learning, capable of learning and representing complex patterns in data. They have enabled significant advancements in various domains and continue to be a focus of research and development.
The neuralnet() function requires all features to be in numeric form (dummy variables for categorical features, normalized numerical features). The model formula in neuralnet() requires dummy variables to be explicitly defined. It is also highly recommended to scale all numerical features before including them in the network model. The objective is to find all feature names (numeric features and all dummy variables) and write them in the model formula in the same way as in glm(): response ~ var_1 + var_2 + ... + var_k.
To explain the modeling process in detail, we will outline major steps in the following subsections.
There are different types of scaling and standardization. The one we use in the following is min-max scaling:
\[ scaled.var = \frac{orig.var - \min(orig.var)}{\max(orig.var)-\min(orig.var)} \]
The scaled numeric feature is unitless and lies in [0, 1] (similar in spirit to the well-known z-score transformation, which is also unitless). A more compact way to apply this transformation is sketched after the code block below.
attrition = read.csv("https://raw.githubusercontent.com/Tenam01/DATASETS/main/cleanedattrition2.csv")
attrition$Age = (attrition$Age-min(attrition$Age))/(max(attrition$Age)-min(attrition$Age))
attrition$DistanceFromHome = (attrition$DistanceFromHome-min(attrition$DistanceFromHome))/(max(attrition$DistanceFromHome)-min(attrition$DistanceFromHome))
attrition$Education = (attrition$Education-min(attrition$Education))/(max(attrition$Education)-min(attrition$Education))
attrition$EnvironmentSatisfaction = (attrition$EnvironmentSatisfaction-min(attrition$EnvironmentSatisfaction))/(max(attrition$EnvironmentSatisfaction)-min(attrition$EnvironmentSatisfaction))
attrition$JobInvolvement = (attrition$JobInvolvement-min(attrition$JobInvolvement))/(max(attrition$JobInvolvement)-min(attrition$JobInvolvement))
attrition$JobLevel = (attrition$JobLevel-min(attrition$JobLevel))/(max(attrition$JobLevel)-min(attrition$JobLevel))
attrition$JobSatisfaction = (attrition$JobSatisfaction-min(attrition$JobSatisfaction))/(max(attrition$JobSatisfaction)-min(attrition$JobSatisfaction))
attrition$NumCompaniesWorked = (attrition$NumCompaniesWorked-min(attrition$NumCompaniesWorked))/(max(attrition$NumCompaniesWorked)-min(attrition$NumCompaniesWorked))
attrition$MonthlyIncome = (attrition$MonthlyIncome-min(attrition$MonthlyIncome))/(max(attrition$MonthlyIncome)-min(attrition$MonthlyIncome))
attrition$PercentSalaryHike = (attrition$PercentSalaryHike-min(attrition$PercentSalaryHike))/(max(attrition$PercentSalaryHike)-min(attrition$PercentSalaryHike))
attrition$PerformanceRating = (attrition$PerformanceRating-min(attrition$PerformanceRating))/(max(attrition$PerformanceRating)-min(attrition$PerformanceRating))
attrition$RelationshipSatisfaction = (attrition$RelationshipSatisfaction-min(attrition$RelationshipSatisfaction))/(max(attrition$RelationshipSatisfaction)-min(attrition$RelationshipSatisfaction))
attrition$StockOptionLevel = (attrition$StockOptionLevel-min(attrition$StockOptionLevel))/(max(attrition$StockOptionLevel)-min(attrition$StockOptionLevel))
attrition$TrainingTimesLastYear = (attrition$TrainingTimesLastYear-min(attrition$TrainingTimesLastYear))/(max(attrition$TrainingTimesLastYear)-min(attrition$TrainingTimesLastYear))
attrition$WorkLifeBalance = (attrition$WorkLifeBalance-min(attrition$WorkLifeBalance))/(max(attrition$WorkLifeBalance)-min(attrition$WorkLifeBalance))
attrition$YearsInCurrentRole = (attrition$YearsInCurrentRole-min(attrition$YearsInCurrentRole))/(max(attrition$YearsInCurrentRole)-min(attrition$YearsInCurrentRole))
attrition$YearsSinceLastPromotion = (attrition$YearsSinceLastPromotion-min(attrition$YearsSinceLastPromotion))/(max(attrition$YearsSinceLastPromotion)-min(attrition$YearsSinceLastPromotion))
attrition$YearsWithCurrManager = (attrition$YearsWithCurrManager-min(attrition$YearsWithCurrManager))/(max(attrition$YearsWithCurrManager)-min(attrition$YearsWithCurrManager))
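Instead of writing one statement per variable as above, the same min-max transformation can be applied in one step with a small helper function. This is a minimal sketch of an alternative (not to be run in addition to the statements above); minMaxScale and numericVars are names introduced here for illustration.
# A compact alternative to the per-variable statements above:
# scale every numeric feature with one helper function.
minMaxScale <- function(x) (x - min(x)) / (max(x) - min(x))
numericVars <- c("Age", "DistanceFromHome", "Education", "EnvironmentSatisfaction",
                 "JobInvolvement", "JobLevel", "JobSatisfaction", "NumCompaniesWorked",
                 "MonthlyIncome", "PercentSalaryHike", "PerformanceRating",
                 "RelationshipSatisfaction", "StockOptionLevel", "TrainingTimesLastYear",
                 "WorkLifeBalance", "YearsInCurrentRole", "YearsSinceLastPromotion",
                 "YearsWithCurrManager")
attrition[numericVars] <- lapply(attrition[numericVars], minMaxScale)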
In practical applications, there may be many categorical features in the model, and each categorical feature could have many categories. It is practically infeasible to write all of the resulting dummy variables explicitly. We can use the R function model.matrix() to extract the variables (including the dummy variables) from a model formula that will be used in the model. Make sure all categorical feature variables are stored in a non-numerical form (i.e., not numerically encoded). We can also use the R function relevel() to change the baseline of an unordered categorical feature variable.
# Convert "Attrition" to a factor variable
attrition$Attrition <- factor(attrition$Attrition, levels = c("No", "Yes"))
# Change baseline for Attrition variable (assuming "No" is the baseline)
attrition$Attrition <- relevel(attrition$Attrition, ref = "No")
# Convert "BusinessTravel" to a factor variable
attrition$BusinessTravel <- factor(attrition$BusinessTravel, levels = c("Travel_Rarely", "Travel_Frequently", "Travel_Non"))
# Change baseline for BusinessTravel variable (assuming "Travel_Rarely" is the baseline)
attrition$BusinessTravel <- relevel(attrition$BusinessTravel, ref = "Travel_Rarely")
# Convert "Department" to a factor variable
attrition$Department <- factor(attrition$Department, levels = c("Human Resources", "Research & Development", "Sales"))
# Change baseline for Department variable (assuming "Sales" is the baseline)
attrition$Department <- relevel(attrition$Department, ref = "Sales")
# Convert "EducationField" to a factor variable
attrition$EducationField <- factor(attrition$EducationField, levels = c("Human Resources", "Life Sciences", "Marketing", "Medical", "Other", "Technical Degree"))
# Change baseline for EducationField variable (assuming "Technical Degree" is the baseline)
attrition$EducationField <- relevel(attrition$EducationField, ref = "Technical Degree")
# Convert "Gender" to a factor variable
attrition$Gender <- factor(attrition$Gender, levels = c("Female", "Male"))
# Change baseline for Gender variable (assuming "Male" is the baseline)
attrition$Gender <- relevel(attrition$Gender, ref = "Male")
# Convert "JobRole" to a factor variable
attrition$JobRole <- factor(attrition$JobRole, levels = c("Healthcare Representative", "Human Resources", "Laboratory Technician", "Manager", "Manufacturing Director", "Research Director", "Research Scientist", "Sales Executive", "Sales Representative"))
# Change baseline for JobRole variable (assuming "Research Director" is the baseline)
attrition$JobRole <- relevel(attrition$JobRole, ref = "Research Director")
# Convert "MaritalStatus" to a factor variable
attrition$MaritalStatus <- factor(attrition$MaritalStatus, levels = c("Divorced", "Married", "Single"))
# Change baseline for MaritalStatus variable (assuming "Single" is the baseline)
attrition$MaritalStatus <- relevel(attrition$MaritalStatus, ref = "Single")
# Convert "OverTime" to a factor variable
attrition$OverTime <- factor(attrition$OverTime, levels = c("No", "Yes"))
# Change baseline for OverTime variable (assuming "No" is the baseline)
attrition$OverTime <- relevel(attrition$OverTime, ref = "No")
Next, we use the R function model.matrix() to extract the names of all feature variables, including the dummy variables that model.matrix() defines implicitly.
# Create a model matrix for modeling
attritionMtx <- model.matrix(~ ., data = attrition)
# Get the column names of the model matrix
colnames(attritionMtx)
## [1] "(Intercept)" "Age"
## [3] "AttritionYes" "BusinessTravelTravel_Frequently"
## [5] "BusinessTravelTravel_Non" "DepartmentHuman Resources"
## [7] "DepartmentResearch & Development" "DistanceFromHome"
## [9] "Education" "EducationFieldHuman Resources"
## [11] "EducationFieldLife Sciences" "EducationFieldMarketing"
## [13] "EducationFieldMedical" "EducationFieldOther"
## [15] "EnvironmentSatisfaction" "GenderFemale"
## [17] "JobInvolvement" "JobLevel"
## [19] "JobRoleHealthcare Representative" "JobRoleHuman Resources"
## [21] "JobRoleLaboratory Technician" "JobRoleManager"
## [23] "JobRoleManufacturing Director" "JobRoleResearch Scientist"
## [25] "JobRoleSales Executive" "JobRoleSales Representative"
## [27] "JobSatisfaction" "MaritalStatusDivorced"
## [29] "MaritalStatusMarried" "MonthlyIncome"
## [31] "NumCompaniesWorked" "OverTimeYes"
## [33] "PercentSalaryHike" "PerformanceRating"
## [35] "RelationshipSatisfaction" "StockOptionLevel"
## [37] "TrainingTimesLastYear" "WorkLifeBalance"
## [39] "YearsInCurrentRole" "YearsSinceLastPromotion"
## [41] "YearsWithCurrManager"
Some of the dummy variable names above contain spaces and special characters. They are fine for regular linear and generalized linear regression models, but they cause problems in the network model formula, so we rename them to remove the special characters before building the neural network model. These issues could have been avoided at the feature-engineering stage had we initially planned to build neural network models. Next, we clean up the variable names before defining the network model formula.
colnames(attritionMtx)[4] <- "BusinessTravelTravelFreq"
colnames(attritionMtx)[5] <- "BusinessTravelTravelNon"
colnames(attritionMtx)[6] <- "DepartmentHumanResources"
colnames(attritionMtx)[7] <- "DepartmentResearchDevelopment"
colnames(attritionMtx)[10] <- "EducationFieldHumanResources"
colnames(attritionMtx)[11] <- "EducationFieldLifeSciences"
colnames(attritionMtx)[19] <- "JobRoleHealthcareRepresentative"
colnames(attritionMtx)[20] <- "JobRoleHumanResources"
colnames(attritionMtx)[21] <- "JobRoleLaboratorytechnician"
colnames(attritionMtx)[23] <- "JobRoleManufacturingDirector"
colnames(attritionMtx)[24] <- "JobRoleResearchScientist"
colnames(attritionMtx)[25] <- "JobRoleSalesExecutive"
colnames(attritionMtx)[26] <- "JobRoleSalesRepresentative"
For convenience, we encourage you to use CamelCase notation when naming feature variables (CamelCase separates the words in a phrase by capitalizing the first letter of each word and removing spaces). An automated alternative to the manual renaming above is sketched below.
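As an illustration only (not run here, since the columns above were already renamed by hand), one gsub() call can strip every space and punctuation character from the column names at once; the resulting names are close to CamelCase as long as the original factor levels are capitalized.
# Hypothetical one-liner: drop all non-alphanumeric characters from the
# column names of the model matrix (an alternative to the manual renaming above).
colnames(attritionMtx) <- gsub("[^[:alnum:]]", "", colnames(attritionMtx))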
# Get the column names of the model matrix (excluding intercept)
columnNames <- colnames(attritionMtx)[-1]
# Replace invalid characters and spaces in column names
columnNames <- gsub("[[:space:]]", "_", columnNames) # Replace spaces with underscores
columnNames <- gsub("[[:punct:]]", "_", columnNames) # Replace punctuations with underscores
# Create a formula string without AttritionYes on the right-hand side
modelFormula <- as.formula(paste("AttritionYes ~",
paste(setdiff(columnNames, "AttritionYes"), collapse = " + ")))
# Display the model formula
modelFormula
## AttritionYes ~ Age + BusinessTravelTravelFreq + BusinessTravelTravelNon +
## DepartmentHumanResources + DepartmentResearchDevelopment +
## DistanceFromHome + Education + EducationFieldHumanResources +
## EducationFieldLifeSciences + EducationFieldMarketing + EducationFieldMedical +
## EducationFieldOther + EnvironmentSatisfaction + GenderFemale +
## JobInvolvement + JobLevel + JobRoleHealthcareRepresentative +
## JobRoleHumanResources + JobRoleLaboratorytechnician + JobRoleManager +
## JobRoleManufacturingDirector + JobRoleResearchScientist +
## JobRoleSalesExecutive + JobRoleSalesRepresentative + JobSatisfaction +
## MaritalStatusDivorced + MaritalStatusMarried + MonthlyIncome +
## NumCompaniesWorked + OverTimeYes + PercentSalaryHike + PerformanceRating +
## RelationshipSatisfaction + StockOptionLevel + TrainingTimesLastYear +
## WorkLifeBalance + YearsInCurrentRole + YearsSinceLastPromotion +
## YearsWithCurrManager
We follow the routine steps for building a neural network model to predict Attrition.
We split the data into 70% for training the neural network and 30% for testing.
n = dim(attritionMtx)[1]
trainID = sample(1:n, round(n*0.7), replace = FALSE) # randomly pick 70% of the rows for training
trainDat = attritionMtx[trainID,]
testDat = attritionMtx[-trainID,] # the remaining 30% are held out for testing
NetworkModel = neuralnet(modelFormula,
data = trainDat,
hidden = 1, # one hidden layer with a single neuron
rep = 1, # number of repetitions (fits) in training the NN
threshold = 0.01, # threshold on the partial derivatives of the error function, used as the stopping criterion
learningrate = 0.1, # learning rate; only used by algorithm = "backprop", ignored by "rprop+"
algorithm = "rprop+"
)
kable(NetworkModel$result.matrix)
|  | Value |
|---|---|
| error | 10.2249312 |
| reached.threshold | 0.0098644 |
| steps | 5295.0000000 |
| Intercept.to.1layhid1 | 34.2345484 |
| Age.to.1layhid1 | -36.2326672 |
| BusinessTravelTravelFreq.to.1layhid1 | 12.7631513 |
| BusinessTravelTravelNon.to.1layhid1 | 0.2239382 |
| DepartmentHumanResources.to.1layhid1 | 3.7016882 |
| DepartmentResearchDevelopment.to.1layhid1 | -0.5630887 |
| DistanceFromHome.to.1layhid1 | 11.5560392 |
| Education.to.1layhid1 | 2.6507646 |
| EducationFieldHumanResources.to.1layhid1 | -11.6188136 |
| EducationFieldLifeSciences.to.1layhid1 | -7.0547274 |
| EducationFieldMarketing.to.1layhid1 | -4.9758298 |
| EducationFieldMedical.to.1layhid1 | -7.8774701 |
| EducationFieldOther.to.1layhid1 | -6.3574417 |
| EnvironmentSatisfaction.to.1layhid1 | -13.0527514 |
| GenderFemale.to.1layhid1 | 1.6887635 |
| JobInvolvement.to.1layhid1 | -26.6435888 |
| JobLevel.to.1layhid1 | 38.2467912 |
| JobRoleHealthcareRepresentative.to.1layhid1 | 4.9651668 |
| JobRoleHumanResources.to.1layhid1 | 14.0401330 |
| JobRoleLaboratorytechnician.to.1layhid1 | -2.1312997 |
| JobRoleManager.to.1layhid1 | -489.2852726 |
| JobRoleManufacturingDirector.to.1layhid1 | -453.1908570 |
| JobRoleResearchScientist.to.1layhid1 | 7.1827604 |
| JobRoleSalesExecutive.to.1layhid1 | 12.8123777 |
| JobRoleSalesRepresentative.to.1layhid1 | 15.1326463 |
| JobSatisfaction.to.1layhid1 | -19.4989237 |
| MaritalStatusDivorced.to.1layhid1 | -9.6420330 |
| MaritalStatusMarried.to.1layhid1 | -2.3013135 |
| MonthlyIncome.to.1layhid1 | -50.2861920 |
| NumCompaniesWorked.to.1layhid1 | 17.7880731 |
| OverTimeYes.to.1layhid1 | 13.6985724 |
| PercentSalaryHike.to.1layhid1 | -6.4494042 |
| PerformanceRating.to.1layhid1 | 7.0671475 |
| RelationshipSatisfaction.to.1layhid1 | -0.0520577 |
| StockOptionLevel.to.1layhid1 | -13.6518023 |
| TrainingTimesLastYear.to.1layhid1 | -27.2224193 |
| WorkLifeBalance.to.1layhid1 | 1.2252818 |
| YearsInCurrentRole.to.1layhid1 | -0.5336293 |
| YearsSinceLastPromotion.to.1layhid1 | -1.6566077 |
| YearsWithCurrManager.to.1layhid1 | -0.7181540 |
| Intercept.to.AttritionYes | 0.0519850 |
| 1layhid1.to.AttritionYes | 0.9355885 |
plot(NetworkModel, rep="best")
Figure 12. Single-layer back-propagation neural network model for employee attrition.
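The model above uses only one hidden neuron. The hidden argument of neuralnet() also accepts a vector with one entry per hidden layer, which is how deeper architectures are specified. The following is a minimal sketch (not fitted or evaluated in this section); DeepModel and the threshold/stepmax values are illustrative choices, and deeper networks may need an even larger stepmax to converge.
# Hedged sketch: a network with two hidden layers (5 and 3 neurons) on the same training data.
DeepModel <- neuralnet(modelFormula,
                       data = trainDat,
                       hidden = c(5, 3),      # two hidden layers: 5 neurons, then 3 neurons
                       threshold = 0.05,
                       stepmax = 5e5,         # allow more iterations for the larger network
                       linear.output = FALSE, # logistic activation on the output for classification
                       algorithm = "rprop+")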
For comparison, we also fit a logistic regression model to the same (scaled) attrition data.
logiModel = glm(factor(Attrition) ~., family = binomial, data = attrition)
pander(summary(logiModel)$coefficients)
|  | Estimate | Std. Error | z value | Pr(>\|z\|) |
|---|---|---|---|---|
| (Intercept) | 0.3545 | 1.608 | 0.2204 | 0.8256 |
| Age | -1.502 | 0.5208 | -2.884 | 0.003921 |
| BusinessTravelTravel_Frequently | 0.8802 | 0.2047 | 4.299 | 1.713e-05 |
| DepartmentHuman Resources | -12.63 | 446 | -0.02832 | 0.9774 |
| DepartmentResearch & Development | 0.2472 | 1.113 | 0.222 | 0.8243 |
| DistanceFromHome | 1.196 | 0.3055 | 3.913 | 9.099e-05 |
| Education | 0.09759 | 0.3507 | 0.2783 | 0.7808 |
| EducationFieldHuman Resources | -0.02841 | 0.8175 | -0.03475 | 0.9723 |
| EducationFieldLife Sciences | -0.8385 | 0.304 | -2.758 | 0.005809 |
| EducationFieldMarketing | -0.4895 | 0.3912 | -1.251 | 0.2109 |
| EducationFieldMedical | -0.903 | 0.3136 | -2.879 | 0.003989 |
| EducationFieldOther | -1.133 | 0.4889 | -2.318 | 0.02043 |
| EnvironmentSatisfaction | -1.07 | 0.2452 | -4.363 | 1.282e-05 |
| GenderFemale | -0.3189 | 0.1842 | -1.731 | 0.08339 |
| JobInvolvement | -1.677 | 0.3706 | -4.525 | 6.042e-06 |
| JobLevel | -0.6883 | 1.183 | -0.5818 | 0.5607 |
| JobRoleHealthcare Representative | 1.016 | 0.9542 | 1.065 | 0.287 |
| JobRoleHuman Resources | 15.17 | 446 | 0.03402 | 0.9729 |
| JobRoleLaboratory Technician | 2.614 | 0.9983 | 2.619 | 0.00883 |
| JobRoleManager | 1.406 | 1.07 | 1.315 | 0.1886 |
| JobRoleManufacturing Director | 1.203 | 0.9399 | 1.28 | 0.2006 |
| JobRoleResearch Scientist | 1.699 | 1.001 | 1.698 | 0.08956 |
| JobRoleSales Executive | 2.377 | 1.422 | 1.672 | 0.0946 |
| JobRoleSales Representative | 3.43 | 1.51 | 2.272 | 0.02311 |
| JobSatisfaction | -1.041 | 0.2406 | -4.329 | 1.501e-05 |
| MaritalStatusDivorced | -1.014 | 0.3419 | -2.967 | 0.003009 |
| MaritalStatusMarried | -0.7231 | 0.248 | -2.916 | 0.003546 |
| MonthlyIncome | 0.7883 | 1.503 | 0.5246 | 0.5999 |
| NumCompaniesWorked | 1.306 | 0.3418 | 3.822 | 0.0001326 |
| OverTimeYes | 1.82 | 0.1904 | 9.558 | 1.197e-21 |
| PercentSalaryHike | -0.3553 | 0.5425 | -0.6549 | 0.5126 |
| PerformanceRating | 0.02365 | 0.396 | 0.05972 | 0.9524 |
| RelationshipSatisfaction | -0.6659 | 0.2483 | -2.681 | 0.00733 |
| StockOptionLevel | -0.6092 | 0.4656 | -1.309 | 0.1907 |
| TrainingTimesLastYear | -0.9523 | 0.4452 | -2.139 | 0.03244 |
| WorkLifeBalance | -0.9719 | 0.3725 | -2.609 | 0.009082 |
| YearsInCurrentRole | -1.111 | 0.7599 | -1.462 | 0.1438 |
| YearsSinceLastPromotion | -0.4548 | 0.7042 | -0.6458 | 0.5184 |
Cross-validation is primarily used for tuning hyperparameters. For example, in the sigmoid perceptron, the optimal cut-off score for the binary decision can be obtained through cross-validation. Another important hyperparameter in a neural network model is the learning rate \(\alpha\) in the backpropagation algorithm, which controls how fast the network learns during training.
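Before tuning the cut-off score below, here is a minimal sketch of how the learning rate itself could be tuned with the same 5-fold scheme. It is not run in this section; alphaGrid, cvAccuracy, and fit are names introduced here, and the classical "backprop" algorithm is used because it is the one in neuralnet() that actually uses the learningrate argument (a larger stepmax may be needed for convergence).
# Hedged sketch: 5-fold cross-validation over a small grid of learning rates.
alphaGrid <- c(0.01, 0.05, 0.1, 0.5)
cvAccuracy <- numeric(length(alphaGrid))
n0 <- floor(dim(trainDat)[1] / 5)                 # fold size
for (k in seq_along(alphaGrid)) {
  acc <- numeric(5)
  for (i in 1:5) {
    valid.id <- ((i - 1) * n0 + 1):(i * n0)
    fit <- neuralnet(modelFormula, data = trainDat[-valid.id, ],
                     hidden = 1, threshold = 0.01,
                     algorithm = "backprop",       # classical backpropagation
                     learningrate = alphaGrid[k],
                     linear.output = FALSE)
    pred <- predict(fit, trainDat[valid.id, ])
    acc[i] <- mean((pred > 0.5) == trainDat[valid.id, "AttritionYes"])
  }
  cvAccuracy[k] <- mean(acc)                       # average validation accuracy for this rate
}
alphaGrid[which.max(cvAccuracy)]                   # learning rate with the best CV accuracy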
n0 <- floor(dim(trainDat)[1] / 5)   # (approximate) fold size
cut.off.score <- seq(0, 1, length = 22)[-c(1, 22)]
pred.accuracy <- matrix(0, ncol = 20, nrow = 5, byrow = TRUE)
for (i in 1:5) {
valid.id <- ((i - 1) * n0 + 1):(i * n0)
valid.data <- trainDat[valid.id,]
train.data <- trainDat[-valid.id,]
train.model <- neuralnet(modelFormula,
data = train.data,
hidden = 1,
rep = 1,
threshold = 0.01,
learningrate = 0.1,
algorithm = "rprop+"
)
pred.nn.score <- predict(train.model, valid.data) # Use train.model for prediction
for (j in 1:20) {
pred.status <- as.numeric(pred.nn.score > cut.off.score[j])
a11 <- sum(pred.status == valid.data[, "AttritionYes"])  # number of correct predictions in this fold
pred.accuracy[i, j] <- a11 / length(pred.nn.score)
}
}
avg.accuracy <- apply(pred.accuracy, 2, mean)
max.id <- which.max(avg.accuracy)
tick.label <- as.character(round(cut.off.score, 2))
plot(1:20, avg.accuracy, type = "b",
xlim = c(1, 20),
ylim = c(0.5, 1),
axes = FALSE,
xlab = "Cut-off Score",
ylab = "Accuracy",
main = "5-fold CV performance"
)
axis(1, at = 1:20, label = tick.label, las = 2)
axis(2)
segments(max.id, 0.5, max.id, avg.accuracy[max.id], col = "red")
text(max.id, avg.accuracy[max.id] + 0.03, as.character(round(avg.accuracy[max.id], 4)), col = "red", cex = 0.8)
# Evaluate the fitted network on the hold-out test data, using the cut-off score suggested by the cross-validation above (about 0.57)
nn.results <- predict(NetworkModel, testDat)
results <- data.frame(actual = testDat[,"AttritionYes"], prediction = nn.results > .57)
confMatrix = table(results$prediction, results$actual) # confusion matrix
accuracy=sum(results$actual == results$prediction)/length(results$prediction)
list(confusion.matrix = confMatrix, accuracy = accuracy)
## $confusion.matrix
##
## 0 1
## FALSE 734 113
## TRUE 32 45
##
## $accuracy
## [1] 0.8430736
Recall that the ROC curve is the plot of sensitivity against (1 - specificity), calculated from the confusion matrices obtained over a sequence of cut-off scores. In terms of the confusion matrix, sensitivity = TP/(TP + FN) is the proportion of actual positives that are correctly identified, and specificity = TN/(TN + FP) is the proportion of actual negatives that are correctly identified; both are illustrated with the test-set confusion matrix in the sketch below.
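As a concrete check, sensitivity and specificity can be read off the test-set confusion matrix computed above. This is a minimal sketch assuming the confMatrix object from the previous chunk, whose rows are the predicted classes (FALSE/TRUE) and columns the actual classes (0/1); TP, FN, TN, and FP are names introduced here for illustration.
# Hedged sketch: sensitivity and specificity from the test-set confusion matrix above.
TP <- confMatrix["TRUE",  "1"];  FN <- confMatrix["FALSE", "1"]   # actual positives
TN <- confMatrix["FALSE", "0"];  FP <- confMatrix["TRUE",  "0"]   # actual negatives
c(sensitivity = TP / (TP + FN), specificity = TN / (TN + FP))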
Next, we construct a ROC for the above NN model based on the training data set.
nn.results = predict(NetworkModel, trainDat) # Keep in mind that trainDat is a matrix!
cut0 = seq(0,1, length = 20)
SenSpe = matrix(0, ncol = length(cut0), nrow = 2, byrow = FALSE)
for (i in 1:length(cut0)){
a = sum(trainDat[,"AttritionYes"] == 1 & (nn.results > cut0[i]))  # true positives
d = sum(trainDat[,"AttritionYes"] == 0 & (nn.results < cut0[i]))  # true negatives
b = sum(trainDat[,"AttritionYes"] == 0 & (nn.results > cut0[i]))  # false positives
c = sum(trainDat[,"AttritionYes"] == 1 & (nn.results < cut0[i]))  # false negatives
sen = a/(a + c)
spe = d/(b + d)
SenSpe[,i] = c(sen, spe)
}
# plotting ROC
plot(1-SenSpe[2,], SenSpe[1,], type ="l", xlim=c(0,1), ylim=c(0,1),
xlab = "1 - specificity", ylab = "Sensitivity", lty = 1,
main = "ROC Curve", col = "blue")
abline(0,1, lty = 2, col = "red")
## Calculate AUC with a crude rectangle (Riemann-sum) approximation
xx = 1-SenSpe[2,]
yy = SenSpe[1,]
width = xx[-length(xx)] - xx[-1]
height = yy[-1]
AUC.approx = sum(width*height) # approximate area under the ROC curve
## A more accurate AUC, computed with the {pROC} package (requires library(pROC))
prediction = as.vector(nn.results)
category = trainDat[,"AttritionYes"] == 1
ROCobj <- roc(category, prediction)
AUC = auc(ROCobj)[1]
text(0.8, 0.3, paste("AUC = ", round(AUC,4)), col = "purple", cex = 0.9)
legend("bottomright", c("ROC of the model", "Random guessing"), lty=c(1,2),
col = c("blue", "red"), bty = "n", cex = 0.8)
Figure 14. ROC curve of the neural network model.
The above ROC curve indicates that the underlying neural network performs better than random guessing, since the area under the curve is substantially greater than 0.5. In general, if the area under the ROC curve is greater than 0.65, we consider the predictive power of the underlying model acceptable.
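For the comparison below, the logistic regression model's AUC can be computed in the same way. This is a minimal sketch assuming the fitted probabilities of logiModel on the data it was trained on (the full attrition data frame); logiPred and logiROC are names introduced here for illustration.
# Hedged sketch: AUC of the logistic regression model fitted earlier.
logiPred <- predict(logiModel, type = "response")       # fitted probabilities
logiROC <- roc(attrition$Attrition == "Yes", logiPred)  # same roc() call pattern as above
auc(logiROC)[1]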
Comparing AUC values: AUC ranges between 0 and 1, and a higher value indicates better discrimination between the two classes. An AUC above 0.5 means a model performs better than random chance, and both models clear that bar. The neural network model has a slightly higher AUC (0.8337) than the logistic regression model (0.8151), suggesting that the neural network is somewhat better at distinguishing between the positive and negative classes under this evaluation.