Description

Summary statistics and plots for the GRAPHS-2015 dataset. All statistics and plotting functions used are from the aslib R package.

library(aslib)
# parseASScenario() takes the path to the scenario directory
dataset = parseASScenario("GRAPHS-2015")
summary(dataset)
##                   Length Class          Mode
## desc              14     ASScenarioDesc list
## feature.runstatus  7     data.frame     list
## feature.costs      7     data.frame     list
## feature.values    37     data.frame     list
## algo.runs          5     data.frame     list
## algo.runstatus     9     data.frame     list
## cv.splits          3     data.frame     list

Summary of features

getFeatureNames(dataset)
##  [1] "cheap.pattern.time"                    
##  [2] "cheap.pattern.vertices"                
##  [3] "cheap.pattern.edges"                   
##  [4] "cheap.pattern.loops"                   
##  [5] "cheap.pattern.meandeg"                 
##  [6] "cheap.pattern.maxdeg"                  
##  [7] "cheap.pattern.degisfixed"              
##  [8] "cheap.pattern.density"                 
##  [9] "cheap.target.time"                     
## [10] "cheap.target.vertices"                 
## [11] "cheap.target.edges"                    
## [12] "cheap.target.loops"                    
## [13] "cheap.target.meandeg"                  
## [14] "cheap.target.maxdeg"                   
## [15] "cheap.target.degisfixed"               
## [16] "cheap.target.density"                  
## [17] "distance.pattern.time"                 
## [18] "distance.pattern.isconnected"          
## [19] "distance.pattern.meandistance"         
## [20] "distance.pattern.maxdistance"          
## [21] "distance.pattern.proportiondistancege2"
## [22] "distance.pattern.proportiondistancege3"
## [23] "distance.pattern.proportiondistancege4"
## [24] "distance.target.time"                  
## [25] "distance.target.isconnected"           
## [26] "distance.target.meandistance"          
## [27] "distance.target.maxdistance"           
## [28] "distance.target.proportiondistancege2" 
## [29] "distance.target.proportiondistancege3" 
## [30] "distance.target.proportiondistancege4" 
## [31] "lad.values.removed"                    
## [32] "lad.values.removed.percent"            
## [33] "lad.values.removed.min"                
## [34] "lad.values.removed.max"                
## [35] "lad.time"
summarizeFeatureValues(dataset)

Summary of algorithm performance

getAlgorithmNames(dataset)
## [1] "lad"             "supplementallad" "vf2"             "glasgow1"       
## [5] "glasgow2"        "glasgow3"        "glasgow4"
summarizeAlgoPerf(dataset)
summarizeAlgoRunstatus(dataset)
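
For readers without the scenario at hand, the kind of per-algorithm summary these calls report can be approximated in base R. The sketch below is not the aslib implementation. The toy data frame, its values, and its column names (`algorithm`, `runtime`, `runstatus`) are assumptions chosen to resemble a long-format table of algorithm runs.

```r
# Toy long-format run table; values and column names are illustrative assumptions.
runs <- data.frame(
  instance_id = rep(c("i1", "i2", "i3"), times = 2),
  algorithm   = rep(c("lad", "vf2"), each = 3),
  runtime     = c(0.5, 2.0, NA, 1.5, NA, NA),
  runstatus   = c("ok", "ok", "timeout", "ok", "timeout", "memout"),
  stringsAsFactors = FALSE
)

# Mean runtime per algorithm, computed over the successful runs only.
ok <- runs[runs$runstatus == "ok", ]
perf.mean <- tapply(ok$runtime, ok$algorithm, mean)

# Number of runs per algorithm and run status.
status.counts <- table(runs$algorithm, runs$runstatus)

print(perf.mean)
print(status.counts)
```

Restricting the mean to successful runs is one possible choice. The plots further down instead impute values for failed runs.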

Algorithm performance plots

Note on some of the following plots: where appropriate, performance values for failed runs were imputed. The imputed value is \(\max + 0.3 \cdot (\max - \min)\) for minimization problems and \(\min - 0.3 \cdot (\max - \min)\) for maximization problems.
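
As a worked illustration of this rule, the following base R sketch imputes failed runs in a small vector. `imputeFailedRuns` is a hypothetical helper written for this example, not an aslib function, and taking the minimum and maximum over the successful runs is an assumption.

```r
# Impute failed runs with max + 0.3 * (max - min) for a minimization measure,
# or min - 0.3 * (max - min) for a maximization measure.
# Hypothetical helper for illustration; not part of aslib.
imputeFailedRuns <- function(perf, failed, minimize = TRUE) {
  # Assumption: min and max are taken over successful, non-missing runs.
  ok.vals <- perf[!failed & !is.na(perf)]
  lo <- min(ok.vals)
  hi <- max(ok.vals)
  fill <- if (minimize) hi + 0.3 * (hi - lo) else lo - 0.3 * (hi - lo)
  perf[failed | is.na(perf)] <- fill
  perf
}

runtime <- c(10, 40, NA, 110, NA)  # made-up runtimes; NA marks a failed run
failed  <- is.na(runtime)

imputed.min <- imputeFailedRuns(runtime, failed, minimize = TRUE)
imputed.max <- imputeFailedRuns(runtime, failed, minimize = FALSE)
print(imputed.min)
print(imputed.max)
```

With a minimum of 10 and a maximum of 110, the range is 100. Failed runs are therefore set to 110 + 30 = 140 in the minimization case and to 10 - 30 = -20 in the maximization case, which places them clearly outside the observed range.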

plotAlgoPerfBoxplots(dataset, impute.zero.vals = TRUE, log = TRUE)

plotAlgoPerfDensities(dataset, impute.zero.vals = TRUE, log = TRUE)

plotAlgoPerfCDFs(dataset, impute.zero.vals = TRUE, log = TRUE)

plotAlgoPerfScatterMatrix(dataset, impute.zero.vals = TRUE, log = TRUE)

Correlation matrix

The figure shows the Spearman correlation coefficients, i.e. the correlations between the ranks of the performance values. Missing values were imputed before the coefficients were computed. The algorithms are ordered so that similar (highly correlated) algorithms are close to each other. By default, this ordering is based on hierarchical clustering using Ward’s method.

plotAlgoCorMatrix(dataset)
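
The ordering described above can be sketched in base R on a toy performance matrix. This is not the code behind `plotAlgoCorMatrix`. The numbers are made up, and both the use of `1 - correlation` as the dissimilarity and the `"ward.D2"` variant of Ward's method are assumptions.

```r
# Toy performance matrix: rows are instances, columns are algorithms.
# The numbers are made up for illustration.
perf <- cbind(
  lad      = c(1, 2, 3, 4, 5, 6),
  vf2      = c(6, 5, 4, 3, 2, 1),
  glasgow1 = c(1, 3, 2, 4, 6, 5),
  glasgow2 = c(5, 6, 4, 2, 3, 1)
)

# Spearman correlation is the Pearson correlation of the ranks.
cor.mat <- cor(perf, method = "spearman")

# Assumption: use 1 - correlation as dissimilarity, then Ward clustering.
d  <- as.dist(1 - cor.mat)
hc <- hclust(d, method = "ward.D2")

# Reorder rows and columns so that similar algorithms sit next to each other.
ord <- hc$order
cor.ordered <- cor.mat[ord, ord]
print(round(cor.ordered, 2))
```

In this toy data, `lad` and `glasgow1` rank the instances almost identically, as do `vf2` and `glasgow2`, so each pair ends up in adjacent rows and columns of the reordered matrix.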

References

Kotthoff, Lars, Ciaran McCreesh, and Christine Solnon. 2016. “Portfolios of Subgraph Isomorphism Algorithms.” In LION 10.