Multi-Class Classifier Performance Plots with Driverless AI

Overview

This notebook experiments with and designs visualizations that help assess the performance of multi-class (multinomial) classifiers created with Driverless AI. It will evolve in two directions:

integration with Driverless AI
various types of plots

Driverless AI Experiment Summary

Upon completion of the experiment, download its summary using the Download Experiment Summary command. This downloads a zip archive with a name similar to h2oai_experiments_summary_product.zip. Unzipping the archive creates a folder with the same name that contains multiple files, including ensemble_confusion_matrix_stats_with_validation.json. Save the full path of this file in a variable:

experiment_stats_file = "~/Projects/Playground/data/h2oai_experiments_summary_wigusopu/ensemble_confusion_matrix_stats_with_validation.json"
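If the archive has not been unpacked yet, the path can also be resolved programmatically. The sketch below is only an illustration: locateStatsFile() is a hypothetical helper (not part of Driverless AI), and it assumes the stats file keeps the name given above somewhere under the unzipped folder.

```r
# Hypothetical helper, not part of Driverless AI: optionally unzip the
# experiment summary archive, then search the folder for the stats file.
locateStatsFile <- function(summary_dir, zip_path = NULL,
                            stats_name = "ensemble_confusion_matrix_stats_with_validation.json") {
  summary_dir <- path.expand(summary_dir)
  # unzip only when an archive path was given and the archive exists
  if (!is.null(zip_path) && file.exists(path.expand(zip_path))) {
    utils::unzip(path.expand(zip_path), exdir = summary_dir)
  }
  # search all sub-folders and match on the exact file name
  files <- list.files(summary_dir, recursive = TRUE, full.names = TRUE)
  hits <- files[basename(files) == stats_name]
  if (length(hits) == 0) {
    stop("No file named ", stats_name, " found under ", summary_dir)
  }
  hits[1]
}
```

Calling locateStatsFile() on the unzipped summary folder should return a path equivalent to the one assigned by hand above, provided that folder exists on disk.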

Reading and Parsing Stats from File

Driverless AI stores the classifier's statistics in the following JSON format (the first 35 lines are displayed):

writeLines(readLines(experiment_stats_file, n=35))

## {
##     "Auto debt": {
##         "Threshold (max F1 score)": "argmax",
##         "Population": 105043,
##         "P: Condition positive": 869,
##         "N: Condition negative": 104174,
##         "Test outcome positive": 414,
##         "Test outcome negative": 104629,
##         "TP: True Positive": 205,
##         "TN: True Negative": 103965,
##         "FP: False Positive": 209,
##         "FN: False Negative": 664,
##         "TPR: (Sensitivity, hit rate, recall)": 0.2359033372,
##         "TNR=SPC: (Specificity)": 0.9979937412,
##         "PPV: Pos Pred Value (Precision)": 0.4951690821,
##         "NPV: Neg Pred Value": 0.9936537671,
##         "FPR: False-out": 0.0020062588,
##         "FDR: False Discovery Rate": 0.5048309179,
##         "FNR: Miss Rate": 0.7640966628,
##         "ACC: Accuracy": 0.9916891178,
##         "F1 score": 0.319563523,
##         "MCC: Matthews correlation coefficient": 0.3381334593,
##         "Informedness": 0.2338970784,
##         "Markedness": 0.4888228492,
##         "Prevalence": 0.0082728026,
##         "LR+: Positive likelihood ratio": 117.5837045276,
##         "LR-: Negative likelihood ratio": 0.7656327202,
##         "DOR: Diagnostic odds ratio": 153.5771675218,
##         "FOR: False omission rate": 0.0063462329
##     },
##     "CD (Certificate of Deposit)": {
##         "Threshold (max F1 score)": "argmax",
##         "Population": 105043,
##         "P: Condition positive": 103,
##         "N: Condition negative": 104940,

where the top-level entries are the outcomes (classes) and the next level contains their stats. To parse the JSON and store the classifier stats in R, use the function readConfMatrixJsonIntoDataframe(), which reads the file and translates its content into an R data frame in long format suitable for the visualization functions. It takes the following parameters:

file: full name and location of the stats file
metrics: names of the classifier metrics to keep, as found in ensemble_confusion_matrix_stats_with_validation.json (if NULL, all metrics are kept)

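readConfMatrixJsonIntoDataframe <- function(file, metrics=NULL) {
  # parse the JSON into a named list with one element per class
  cf = jsonlite::fromJSON(txt=file, flatten = TRUE)
  # build one long-format data frame (item, metric, value) per class
  dflist = lapply(names(cf), function(x) {
    df = unlist(cf[[x]])
    data.frame(item=x, metric=names(df), value=df, stringsAsFactors = FALSE)
  })
  data = do.call("rbind", dflist)
  # non-numeric stats (e.g. "argmax") become NA here, with a warning
  data$value = as.numeric(data$value)
  # keep only the requested metrics, if any were given
  if(!is.null(metrics) && length(metrics) > 0) {
    data = data[data$metric %in% metrics,]
  }

  return(data)
}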
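To make the long format concrete, the sketch below applies the same steps to a small inline JSON string instead of a file. It is a toy: it keeps only a handful of entries copied from the JSON listing above, and the names json_txt and long are made up for this illustration.

```r
library(jsonlite)

# Toy input: a few entries copied from the JSON listing, two classes
json_txt <- '{
  "Auto debt": {
    "TP: True Positive": 205,
    "F1 score": 0.319563523
  },
  "CD (Certificate of Deposit)": {
    "P: Condition positive": 103,
    "N: Condition negative": 104940
  }
}'

cf <- fromJSON(json_txt)

# One row per (class, metric) pair: the "long" layout used by the plots
long <- do.call("rbind", lapply(names(cf), function(x) {
  stats <- unlist(cf[[x]])
  data.frame(item = x, metric = names(stats), value = as.numeric(stats),
             stringsAsFactors = FALSE)
}))

print(long)
```

Each class contributes as many rows as it has stats, so a plot can map item, metric, and value directly to aesthetics or facets.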

Read experiment data:

file_product = "~/Projects/Playground/data/h2oai_experiments_summary_wigusopu/ensemble_confusion_matrix_stats_with_validation.json"
data = readConfMatrixJsonIntoDataframe(file_product, c("TPR: (Sensitivity, hit rate, recall)",
                                                          "TNR=SPC: (Specificity)",
                                                          "PPV: Pos Pred Value (Precision)",
                                                          "ACC: Accuracy",
                                                          "FDR: False Discovery Rate"
                                                          ))

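## Warning in readConfMatrixJsonIntoDataframe(file_product, c("TPR:
## (Sensitivity, hit rate, recall)", : NAs introduced by coercion

The warning is expected: each class has a "Threshold (max F1 score)" entry holding the string "argmax", which as.numeric() turns into NA. That metric is not among the requested ones, so its rows are dropped by the filter and the returned data is unaffected.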
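If it is unclear which entries trigger such a coercion warning, they can be listed before converting. A minimal sketch on a hand-made character vector (raw is an assumption standing in for the unlisted stats of one class, with values copied from the JSON listing):

```r
# Stand-in for unlist(cf[[x]]): mixed stats are coerced to character
raw <- c("Threshold (max F1 score)" = "argmax",
         "TP: True Positive"        = "205",
         "F1 score"                 = "0.319563523")

# Convert quietly, then report which metrics could not be parsed as numbers
num <- suppressWarnings(as.numeric(raw))
non_numeric <- names(raw)[is.na(num)]
print(non_numeric)
```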

Displaying Classifier Metrics with Bar Chart

The following function creates a bar chart of classifier metrics, placing the classifier classes on the x-axis and the metric values on the y-axis, with one facet per metric:

library(ggplot2)
library(ggthemes)  # provides tableau_color_pal()

plotClassifierMetricsAsBarchart <- function(data, x, y, title, subtitle=NULL, 
                                            legendPosition = "bottom",
                                            guide=guide_legend(title=NULL, ncol=2, byrow=TRUE)) {
  p = ggplot(data) +
    # one bar per class, one facet (row) per metric
    geom_bar(aes(item, value, fill=item), position = "dodge", stat = "identity") +
    facet_wrap(~metric, ncol=1) +
    # stretch the 8-color Tableau palette to the number of classes
    scale_fill_manual(values = colorRampPalette(tableau_color_pal()(8))(length(unique(data$item))),
                      guide = guide) +
    labs(x=x, y=y, title=title, subtitle = subtitle) +
    theme_minimal(base_family = 'Palatino', base_size = 12) +
    theme(legend.position = legendPosition,
          axis.text.x = element_blank())
  
  return(p)
}

plotClassifierMetricsAsBarchart(data, x="Products", y="Percent", 
                                title="Precision, Specificity, and Recall Metrics by Products",
                                subtitle = "Product Level Classifier",
                                legendPosition = "none")
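The metric values are proportions between 0 and 1, while the y-axis above is labelled Percent. One possible way to make the ticks match the label is to format them with scales::percent. The sketch below is a standalone toy (one class, three values copied from the JSON listing; toy and p_pct are made-up names), not a change to plotClassifierMetricsAsBarchart():

```r
library(ggplot2)

# Toy data: three "Auto debt" metrics copied from the JSON listing
toy <- data.frame(
  item   = "Auto debt",
  metric = c("TPR: (Sensitivity, hit rate, recall)",
             "TNR=SPC: (Specificity)",
             "PPV: Pos Pred Value (Precision)"),
  value  = c(0.2359033372, 0.9979937412, 0.4951690821),
  stringsAsFactors = FALSE
)

# Same layout as the bar chart above, with y ticks shown as percentages
p_pct <- ggplot(toy) +
  geom_col(aes(item, value, fill = item)) +
  facet_wrap(~metric, ncol = 1) +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "Products", y = "Percent")
```

Printing p_pct draws the chart. The same scale_y_continuous() layer could be added to the plot returned by the function above.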

Radar Plots

A radar plot displays multi-dimensional data compactly on a single chart for multiple records. Adapted to a multi-class classifier, it places the performance metrics on the x-axis, the metric values on the y-axis, and the classes as either groups or facets.

Defining Radar Coordinate

Both parallel plots and radar plots operate on the same data. See http://www.cmap.polytechnique.fr/~lepennec/R/Radar/RadarAndParallelPlots.html for details of implementing both with the ggplot2 library. In short, a parallel plot uses the geom_path() geometry, and the transition to a radar plot is made with coord_polar(). This transition creates a couple of problems, which are addressed with the custom coordinate system coord_radar() and the geom_polygon() geometry:

# radar plot coordinate
# see http://www.cmap.polytechnique.fr/~lepennec/R/Radar/RadarAndParallelPlots.html
coord_radar <- function (theta = "x", start = 0, direction = 1) {
  theta = match.arg(theta, c("x", "y"))
  r = if (theta == "x") 
    "y"
  else "x"
  ggproto("CoordRadar", CoordPolar, theta = theta, r = r, start = start, 
          direction = sign(direction),
          is_linear = function(coord) TRUE)
}
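The transition from parallel plot to radar plot can be reproduced on a toy data set. In this sketch (one class, values copied from the JSON listing, metric names shortened, object names made up), the same geom_path() layer is drawn first in Cartesian coordinates as a parallel plot and then wrapped with coord_polar(). With plain coord_polar() the path is typically left open between the last and the first metric and its segments are drawn as arcs, which is what coord_radar() and geom_polygon() are meant to address.

```r
library(ggplot2)

# Toy data: five "Auto debt" metrics from the JSON listing (names shortened)
toy <- data.frame(
  item   = "Auto debt",
  metric = c("TPR", "TNR", "PPV", "ACC", "FDR"),
  value  = c(0.2359033372, 0.9979937412, 0.4951690821,
             0.9916891178, 0.5048309179),
  stringsAsFactors = FALSE
)

# Parallel plot: metrics along x, one path per class
p_parallel <- ggplot(toy, aes(x = metric, y = value,
                              group = item, color = item)) +
  geom_path()

# Same layer in polar coordinates: a first, imperfect radar plot
p_polar <- p_parallel + coord_polar()
```

Printing p_parallel and p_polar side by side shows the difference that motivates the custom coordinate system.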


plotClassifierMetricsAsRadarPlot <- function(data, x, y, title, subtitle=NULL, 
                                            legendPosition = "bottom",
                                            guide=guide_legend(title=NULL, ncol=2, byrow=TRUE),
                                            alpha=0.2) {
  
  p = ggplot(data, aes(x = metric, y = value)) +
    geom_polygon(aes(color = item, fill=NULL, group=item), size = 1, show.legend = FALSE, alpha = alpha) +
    geom_line(aes(color = item, group=item), size = 1, alpha = alpha) +
    scale_x_discrete(drop=FALSE) +
    coord_radar() +
    scale_color_manual(values = colorRampPalette(tableau_color_pal()(8))(length(unique(data$item))),
                    guide = guide) +
    labs(x=x, y=y, title=title, subtitle = subtitle) +
    theme_minimal(base_size = 12, base_family = 'Palatino') +
    theme(legend.position=legendPosition,
        strip.text.x = element_text(size = rel(0.8)),
        axis.text.x = element_text(size = rel(0.8)),
        axis.ticks.y = element_blank(),
        axis.text.y = element_blank())
  
  return(p)
}


plotClassifierMetricsAsRadarPlotFaceted <- function(data, x="", y="", title, subtitle=NULL, 
                                            legendPosition = "bottom", facetNCol = 10,
                                            alpha=0.2) {
  
  p = ggplot(data, aes(x = metric, y = value)) +
    geom_polygon(aes(color = item, fill=NULL, group=1), size = 1, show.legend = FALSE, alpha = alpha) +
    geom_line(aes(color = item, group=1), size = 1, alpha = alpha) +
    facet_wrap(~item, ncol=facetNCol) + 
    scale_x_discrete(drop=FALSE) +
    coord_radar() +
    scale_color_manual(values = colorRampPalette(tableau_color_pal()(8))(length(unique(data$item)))) +
    labs(x=x, y=y, title=title, subtitle = subtitle) +
    theme_minimal(base_size = 12, base_family = 'Palatino') +
    theme(legend.position="none",
        axis.ticks.x = element_blank(),
        axis.text.x = element_blank(),
        axis.ticks.y = element_blank(),
        axis.text.y = element_blank())
  
  return(p)
}

Display radar plots:

plotClassifierMetricsAsRadarPlot(data, x=NULL, y=NULL, 
                                 title="Multi-Class Classifier Metrics by Classes",
                                 subtitle = "Classes = Products",
                                 legendPosition = "none")

plotClassifierMetricsAsRadarPlotFaceted(data, title="Multi-Class Classifier Metrics by Classes",
       subtitle = "Classes = Products")