In this homework assignment, we will work through various classification metrics. We will create functions in R to carry out the calculations, and we will also investigate functions in existing packages that let us obtain equivalent results. Finally, we will create graphical output that can also be used to evaluate the output of classification models, such as binary logistic regression.
The classification output data set classification-output-data.csv was loaded into the variable data. The data set contains 181 observations of 11 variables. This report will focus on three of them: “class” (the true class for each observation), “scored.class” (the predicted class for each observation), and “scored.probability” (the predicted probability of success for each observation).
data <- read.csv("https://raw.githubusercontent.com/cliftonleesps/data621_group1/main/hw2/classification-output-data.csv")
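The analysis below relies on several packages whose library() calls are not shown here; presumably they were loaded in a setup chunk along these lines (the exact set is an assumption based on the functions used in this report):
# packages used throughout this report (assumed to be loaded in a setup chunk)
library(dplyr)       # select(), filter(), mutate(), %>%
library(caret)       # confusionMatrix(), sensitivity(), specificity()
library(pROC)        # roc(), auc()
library(knitr)       # kable()
library(kableExtra)  # kable_styling()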
The data set has three key columns we will use:
- class: the actual class for the observation
- scored.class: the predicted class for the observation (based on a threshold of 0.5)
- scored.probability: the predicted probability of success for the observation
Using the table() function, we can check the raw confusion matrix for this scored data set.
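The exact call is not shown in the original, but a call along these lines (rows are the predicted class, columns are the actual class) reproduces the matrix shown below:
# raw confusion matrix: rows = predicted class (scored.class), columns = actual class
table(data$scored.class, data$class)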
The confusion matrix below has rows representing the predicted class and columns representing the actual class for the two categories (0 and 1).
- 119 is the number of observations correctly classified as class 0 (TN, true negatives)
- 30 is the number of observations belonging to class 1 that are incorrectly classified as class 0 (FN, false negatives)
- 5 is the number of observations belonging to class 0 that are incorrectly classified as class 1 (FP, false positives)
- 27 is the number of observations correctly classified as class 1 (TP, true positives)
 | actual 0 | actual 1 |
---|---|---|
predicted 0 | 119 | 30 |
predicted 1 | 5 | 27 |
The function classify_predictions() counts the true positive, true negative, false negative, and false positive observations. Then, the function test_accuracy() takes the data set as a data frame, with actual and predicted classifications identified, and returns the accuracy of the predictions based on the formula below. Based on the results of these functions, the accuracy is 0.81.
\[Accuracy = \frac{TP+TN}{TP+FP+TN+FN}\]
classify_predictions <- function(df) {
  # count the four confusion-matrix cells and return them as a named list
  dfSelected <- df %>% select(class, scored.class)
  true_negative  <- dfSelected %>% filter(scored.class == 0 & class == 0)
  true_positive  <- dfSelected %>% filter(scored.class == 1 & class == 1)
  false_negative <- dfSelected %>% filter(scored.class == 0 & class == 1)
  false_positive <- dfSelected %>% filter(scored.class == 1 & class == 0)
  counts <- list("tn" = nrow(true_negative),
                 "tp" = nrow(true_positive),
                 "fn" = nrow(false_negative),
                 "fp" = nrow(false_positive))
  return(counts)
}
test_accuracy <- function(df) {
  m <- classify_predictions(df)
  accuracy <- (m$tp + m$tn) / (m$tp + m$fp + m$tn + m$fn)
  return(accuracy)
}
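The printed results below (here and for the later metrics) presumably come from simple sprintf() wrappers that are not shown in the original; a reconstruction, with the format string assumed, would be:
# reconstructed call (format string assumed) that produces the printed accuracy line
sprintf("The accuracy of the predictions: %.3f", test_accuracy(data))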
## [1] "The accuracy of the predictions: 0.807"
The function test_classification_error_rate() takes the data set as a data frame, with actual and predicted classifications identified, and returns the classification error rate of the predictions. The function is based on the formula below. Based on the results of the functions, the classification error rate of the predictions is 0.19.
\[Classification\: Error\: Rate = \frac{FP+FN}{TP+FP+TN+FN}\]
test_classification_error_rate <- function(df) {
  m <- classify_predictions(df)
  cer <- (m$fp + m$fn) / (m$tp + m$fp + m$tn + m$fn)
  return(cer)
}
## [1] "The classification error rate of the predictions: 0.193"
As we see below, the accuracy and the classification error rate sum to one, which serves as a consistency check on our custom functions.
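The printed check below was presumably produced by a call along these lines (the format string is an assumption):
# sanity check: accuracy + classification error rate should equal 1
sprintf("The sum of the accuracy and classification error: %.1f",
        test_accuracy(data) + test_classification_error_rate(data))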
## [1] "The sum of the accuracy and classification error: 1.0"
The function test_precision() takes the data set as a data frame, with actual and predicted classifications identified, and returns the precision of the predictions. The function is based on the formula below. Based on the results of the functions, the precision of the predictions is 0.84.
\[Precision = \frac{TP}{TP+FP}\]
test_precision <- function(df) {
  m <- classify_predictions(df)
  p <- m$tp / (m$tp + m$fp)
  return(p)
}
## [1] "The precision of the predictions: 0.844"
The test_sensitivity() function takes the data set as a data frame, with actual and predicted classifications identified, and returns the sensitivity of the predictions. The function is based on the formula below. Based on the results of the functions, the sensitivity of the predictions is 0.47.
\[Sensitivity = \frac{TP}{TP+FN}\]
test_sensitivity <- function(df) {
  m <- classify_predictions(df)
  s <- m$tp / (m$tp + m$fn)
  return(s)
}
## [1] "The sensitivity of the predictions: 0.474"
The test_specificity() function takes the data set as a data frame, with actual and predicted classifications identified, and returns the specificity of the predictions. The function is based on the formula below. Based on the results of the functions, the specificity of the predictions is 0.96.
\[Specificity = \frac{TN}{TN+FP}\]
test_specificity <- function(df) {
  m <- classify_predictions(df)
  s <- m$tn / (m$tn + m$fp)
  return(s)
}
## [1] "The specificity of the predictions: 0.960"
The test_f1_score() function takes the data set as a data frame, with actual and predicted classifications identified, and returns the F1 score of the predictions. The function is based on the formula below. Based on the results of the functions, the F1 score of the predictions is 0.61.
\[F1 \: Score = \frac{2 \cdot Precision \cdot Sensitivity}{Precision + Sensitivity}\]
test_f1_score <- function(df) {
  precision <- test_precision(df)
  sensitivity <- test_sensitivity(df)
  score <- (2 * precision * sensitivity) / (precision + sensitivity)
  return(score)
}
## [1] "The f1_score of the predictions: 0.607"
The F1 score is calculated from precision and sensitivity, both of which range from 0 to 1. Based on the rule that if 0<a<1 and 0<b<1, then ab<a and ab<b: \[If \; 0<Precision<1 \; and \; 0<Sensitivity<1, \\ then \; Precision \cdot Sensitivity<Precision \\ and \; Precision \cdot Sensitivity<Sensitivity; \\ adding \; these \; two \; inequalities \; gives \; 2 \cdot Precision \cdot Sensitivity < Precision + Sensitivity\]
The F1 score is the fraction with numerator 2 · Precision · Sensitivity and denominator Precision + Sensitivity, so under these conditions the numerator is less than the denominator. Because precision and sensitivity are both positive, the fraction cannot be negative or greater than 1. The F1 score also cannot equal 0, and if precision or sensitivity is 0, the F1 score is undefined. Thus the only possible range for the F1 score is \[0<F1 \; score<1\]
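As an illustrative check (a sketch, not part of the original assignment), evaluating the F1 formula over a grid of precision and sensitivity values strictly between 0 and 1 confirms the bound numerically:
# evaluate F1 over a grid of precision/sensitivity values in (0, 1)
grid <- expand.grid(precision = seq(0.01, 0.99, by = 0.01),
                    sensitivity = seq(0.01, 0.99, by = 0.01))
f1 <- with(grid, (2 * precision * sensitivity) / (precision + sensitivity))
range(f1)  # every value falls strictly between 0 and 1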
The test_roc() function takes the true classification column and the predicted probability column, and returns a list containing the plot of the ROC curve and the AUC. First, we create a new data frame with the columns class and prob (sorted in descending order of prob) and add the columns TPR and FPR. The simple_auc() helper then calculates the AUC from the TPR and FPR values. The ROC curve is plotted with the plot() function, and a diagonal line (abline(a = 0, b = 1)) represents a random classifier. Based on the results of the functions, the AUC is 0.85.
test_roc <- function(class, prob) {
  # create new data frame with class and prob, sorted by descending probability
  df <- data.frame(class, prob)
  df <- df[order(-df$prob), ]
  # add cumulative true positive and false positive rates
  df <- df %>% mutate(
    TPR = cumsum(class) / sum(class),
    FPR = cumsum(!class) / sum(!class)
  )
  # calculate AUC
  calc_auc <- with(df, simple_auc(TPR, FPR))
  # draw and record the ROC plot
  plot(df$FPR, df$TPR, main = "Manual ROC Curve", xlab = "FPR", ylab = "TPR")
  abline(a = 0, b = 1)
  legend(.5, .5, round(calc_auc, 4), title = "AUC")
  axis(side = 1, at = seq(0, 1, by = 0.1))
  axis(side = 2, at = seq(0, 1, by = 0.1))
  p <- recordPlot()
  return(list(p, calc_auc))
}
# Helper function to support the calculation of AUC
simple_auc <- function(TPR, FPR){
  # widths and heights of the steps between successive points on the curve
  dFPR <- c(diff(FPR), 0)
  dTPR <- c(diff(TPR), 0)
  # rectangles under each step plus the triangular corrections (trapezoidal rule)
  sum(TPR * dFPR) + sum(dTPR * dFPR) / 2
}
ret_lst <- test_roc(data$class, data$scored.probability)
# output plot
ret_lst[1]
## [[1]]
## [1] "The calculated area under the curve:0.8503"
Below are the classification metrics that were computed using the functions defined above.
name <- c("Accuracy", "Classification Error Rate", "Precision", "Sensitivity", "Specificity", "F1 Score")
results <- c(test_accuracy(data), test_classification_error_rate(data),
             test_precision(data), test_sensitivity(data),
             test_specificity(data), test_f1_score(data))
output_data <- cbind('Metrics' = name, "Calculated Results" = round(results, 3))
kable(output_data) %>%
kable_styling(latex_options = "striped", full_width = FALSE)
Metrics | Calculated Results |
---|---|
Accuracy | 0.807 |
Classification Error Rate | 0.193 |
Precision | 0.844 |
Sensitivity | 0.474 |
Specificity | 0.96 |
F1 Score | 0.607 |
caret package
The confusionMatrix() function from the caret package provides access to various classification metrics (accuracy, F1 score, etc.). There are also functions such as sensitivity() and specificity() that calculate these metrics separately. In our case the confusionMatrix() call must set positive = "1"; otherwise caret treats the first factor level ("0" here) as the positive class by default.
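A short illustration of why positive = "1" matters (a sketch, not part of the original analysis): with caret's default, the first factor level is treated as the positive class, so the reported sensitivity would actually correspond to our specificity.
# with the default positive class ("0"), caret's "Sensitivity" equals our specificity
cm_default <- confusionMatrix(factor(data$scored.class), factor(data$class))
cm_default$positive                  # "0"
cm_default$byClass["Sensitivity"]    # ~0.96, i.e. the specificity computed above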
The table below shows the classification metrics computed with the confusionMatrix() function alongside the values calculated with the manually written functions. R's caret functions return the same values as the manual functions above.
conf_matrix <- confusionMatrix(factor(data$scored.class), factor(data$class), positive = "1")
# overall[1] = Accuracy; byClass: [1] Sensitivity, [2] Specificity, [3] Pos Pred Value (precision), [7] F1
results2 <- c(conf_matrix$overall[1], 1 - conf_matrix$overall[1], conf_matrix$byClass[3],
              conf_matrix$byClass[1], conf_matrix$byClass[2], conf_matrix$byClass[7])
df_conf_matrix <- cbind("Metrics" = name, "Caret results" = round(results2, 3),
"Manual results" = round(results, 3))
kable(df_conf_matrix, row.names = FALSE) %>%
kable_styling(latex_options = "striped", full_width = FALSE)
Metrics | Caret results | Manual results |
---|---|---|
Accuracy | 0.807 | 0.807 |
Classification Error Rate | 0.193 | 0.193 |
Precision | 0.844 | 0.844 |
Sensitivity | 0.474 | 0.474 |
Specificity | 0.96 | 0.96 |
F1 Score | 0.607 | 0.607 |
Using caret’s sensitivity() function:
sprintf("The sensitivity using caret package: %.3f", sensitivity(factor(data$scored.class), factor(data$class), positive = "1"))
## [1] "The sensitivity using caret package: 0.474"
Using caret’s specificity() function:
sprintf("The specificity using caret package: %.3f", specificity(as.factor(data$scored.class), as.factor(data$class), negative = "0"))
## [1] "The specificity using caret package: 0.960"
pROC package
The auc() function in the pROC package produces the same value as the manual process in question 10, and the ROC curve generated by the roc() function has a similar profile to the curve from question 10. The main difference is the axes: the test_roc() function we created produces a plot with the x-axis labeled FPR and the y-axis labeled TPR, both ranging from 0 to 1, whereas the pROC plot labels the x-axis Specificity and the y-axis Sensitivity, with the x-axis running from 1 to 0. Apart from the labeling and axis marks, the two plots have the same shape.
roc_obj <- roc(data$class, data$scored.probability)
sprintf("The AUC using pRoc package: %.4f", auc(roc_obj))
## [1] "The AUC using pRoc package: 0.8503"
sprintf("The AUC using manual function: %.4f", round(as.numeric(ret_lst[2]),4))
## [1] "The AUC using manual function: 0.8503"
plot(roc_obj, print.auc=TRUE, main="pROC package ROC Curve")
axis(side=1, at=seq(0, 1, by=0.1))
axis(side=2, at=seq(0, 1, by=0.1))
# re-display the manual ROC curve for comparison
ret_lst[1]
## [[1]]
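If a more direct visual comparison is desired, the pROC curve can also be drawn with a 1 - Specificity (FPR) x-axis running from 0 to 1 via plot()'s legacy.axes argument; the call below is an optional sketch and not part of the original output.
# optional: plot the pROC curve with FPR on the x-axis, matching the manual plot
plot(roc_obj, legacy.axes = TRUE, print.auc = TRUE, main = "pROC ROC Curve (FPR axis)")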