library(readr)
library(dplyr)
library(tidyr)
library(ggplot2)
url <- "https://raw.githubusercontent.com/acatlin/data/master/classification_model_performance.csv"
performance <- read_csv(url, show_col_types = FALSE)
head(performance)
## # A tibble: 6 × 3
##   class scored.class scored.probability
##   <dbl>        <dbl>              <dbl>
## 1     0            0             0.328
## 2     0            0             0.273
## 3     1            0             0.110
## 4     0            0             0.0560
## 5     0            0             0.100
## 6     0            0             0.0552
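Before computing any metrics, it is worth confirming the class balance; the confusion-matrix totals later in this analysis imply 124 actual negatives and 57 actual positives. A quick base-R check:

# distribution of the actual class labels
table(performance$class)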
# assign each case to TP, TN, FP, or FN based on actual vs. predicted class
performance <- performance %>%
  mutate(result = case_when(
    class == 0 & scored.class == 0 ~ "TN",
    class == 1 & scored.class == 1 ~ "TP",
    class == 1 & scored.class == 0 ~ "FN",
    class == 0 & scored.class == 1 ~ "FP"
  ))
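A quick sanity check (a minimal sketch using only base R) confirms that the four labels are exhaustive, i.e. no case was left unassigned:

# every case should land in exactly one of the four confusion-matrix cells
stopifnot(all(performance$result %in% c("TP", "TN", "FP", "FN")))
table(performance$result)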
# calculate the null error rate: how often we would be wrong if we
# always predicted the majority class (here, class 0), i.e. the share
# of actual positives
null.error.rate <- sum(performance$class == 1) / nrow(performance)
# Print the null error rate
cat("The null error rate is:", null.error.rate, "\n")
## The null error rate is: 0.3149171
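As a cross-check, the null error rate should equal the error of a baseline model that always predicts the majority class; a minimal sketch:

# a majority-class baseline predicts 0 everywhere, so its error rate is
# simply the share of actual positives
baseline_error <- mean(performance$class == 1)
stopifnot(isTRUE(all.equal(baseline_error, null.error.rate)))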
# per-class counts and percentages for labeling the bar chart
performance_percents <- performance %>%
  count(class) %>%
  mutate(percent = round(n / sum(n) * 100, 2))
performance_percents %>%
  ggplot(aes(x = factor(class), y = n)) +
  geom_col(fill = c("lightblue", "lightpink")) +
  geom_text(aes(label = paste0(percent, "%")), vjust = -0.5) +
  xlab("Class") +
  ylab("Count") +
  ggtitle("Target",
          "Distribution of the Actual Target Variable") +
  theme(
    plot.title = element_text(hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5)
  )
confusion_matrix <- performance %>%
  mutate(threshold_02 = ifelse(scored.probability >= 0.2, 1, 0),
         threshold_05 = ifelse(scored.probability >= 0.5, 1, 0),
         threshold_08 = ifelse(scored.probability >= 0.8, 1, 0)) %>%
  select(-scored.class, -result) %>%
  pivot_longer(cols = starts_with("threshold_"),
               names_to = "threshold", values_to = "scored.class") %>%
  group_by(threshold) %>%
  summarise(
    TP = sum(class == 1 & scored.class == 1),
    FP = sum(class == 0 & scored.class == 1),
    TN = sum(class == 0 & scored.class == 0),
    FN = sum(class == 1 & scored.class == 0)
  ) %>%
  pivot_longer(cols = c(TP, FP, TN, FN), names_to = "Values", values_to = "threshold_value") %>%
  pivot_wider(names_from = threshold, values_from = threshold_value) %>%
  as.data.frame()
colnames(confusion_matrix) <- c("Values", "0.2 Threshold", "0.5 Threshold", "0.8 Threshold")
confusion_matrix
##   Values 0.2 Threshold 0.5 Threshold 0.8 Threshold
## 1     TP            49            27             9
## 2     FP            51             5             1
## 3     TN            73           119           123
## 4     FN             8            30            48
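Any single column of this table can be cross-checked against a plain contingency table; for example, for the 0.5 threshold (a minimal sketch, base R only):

# rows = actual class, columns = predicted class at the 0.5 cutoff;
# the four cells should match the "0.5 Threshold" column above
with(performance, table(actual = class,
                        predicted = as.integer(scored.probability >= 0.5)))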
# transpose so each threshold is a row and each confusion-matrix cell a column
rotated_confusion_matrix <- as.data.frame(t(confusion_matrix[, -1]))
colnames(rotated_confusion_matrix) <- confusion_matrix$Values
rotated_confusion_matrix %>%
  mutate(
    accuracy  = (TP + TN) / (TP + FP + TN + FN),
    precision = TP / (TP + FP),
    recall    = TP / (TP + FN),
    f.score   = (2 * recall * precision) / (recall + precision)
  ) %>%
  select(-TP, -FP, -TN, -FN)
##                accuracy precision    recall   f.score
## 0.2 Threshold 0.6740331   0.49000 0.8596491 0.6242038
## 0.5 Threshold 0.8066298   0.84375 0.4736842 0.6067416
## 0.8 Threshold 0.7292818   0.90000 0.1578947 0.2686567
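The three cutoffs above are hard-coded, but the same metrics generalize to any threshold. The helper below is a minimal sketch of that generalization (classification_metrics is a hypothetical name, not part of the original analysis):

# compute accuracy, precision, recall, and F-score at an arbitrary cutoff
# (hypothetical helper, not from the original analysis)
classification_metrics <- function(actual, prob, threshold) {
  pred <- as.integer(prob >= threshold)
  TP <- sum(actual == 1 & pred == 1)
  FP <- sum(actual == 0 & pred == 1)
  TN <- sum(actual == 0 & pred == 0)
  FN <- sum(actual == 1 & pred == 0)
  precision <- TP / (TP + FP)
  recall    <- TP / (TP + FN)
  data.frame(
    threshold = threshold,
    accuracy  = (TP + TN) / (TP + FP + TN + FN),
    precision = precision,
    recall    = recall,
    f.score   = 2 * precision * recall / (precision + recall)
  )
}
# this should reproduce the table above for the three thresholds used
do.call(rbind, lapply(c(0.2, 0.5, 0.8), function(t)
  classification_metrics(performance$class, performance$scored.probability, t)))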
Scenario: Early Screening for Diseases
In a model that screens for a rare disease, a 0.2 scored-probability threshold may be preferable: it increases the model's sensitivity (recall rises from 0.47 at the 0.5 threshold to 0.86), capturing more potential cases and minimizing the chance of missing true positives. The consequences of a false positive (an unnecessary follow-up test) are less severe than those of missing a true positive (an undetected disease).
Scenario: Fraud Detection in Financial Transactions
In a model that flags fraudulent financial transactions (e.g. credit card transactions), a 0.8 scored-probability threshold may be preferable because the consequences of false positives are high: a transaction falsely flagged as fraud could inconvenience a user by blocking their card. The higher threshold means the model flags only transactions where it has high confidence that the transaction is fraudulent (precision is 0.90 at the 0.8 threshold, versus 0.49 at 0.2).
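Both trade-offs can be read straight off a threshold sweep. Using the classification_metrics helper sketched above (the cutoff grid and the recall/precision floors are illustrative assumptions):

# sweep a grid of cutoffs and filter by the constraint each scenario cares about
metrics_grid <- do.call(rbind, lapply(seq(0.05, 0.95, by = 0.05), function(t)
  classification_metrics(performance$class, performance$scored.probability, t)))
# screening: cutoffs that still catch most true positives
subset(metrics_grid, recall >= 0.85)
# fraud detection: cutoffs that rarely flag legitimate transactions
subset(metrics_grid, precision >= 0.90)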