Introduction

This report explains the application of the Friedman Aligned Ranks Test and the Bergmann-Hommel post hoc procedure, following the version described by Garcia and Herrera (2008).

Friedman Aligned Ranks Test

The Friedman test is a non-parametric statistical test, analogous to the parametric repeated-measures ANOVA, used to detect differences between treatments across multiple test attempts. The idea is to verify whether there is a significant difference between the treatments (columns), with the blocks (rows) acting as repeated measurements, using a fast multiple comparison method. The procedure ranks the values within each row (or block), here in decreasing order so that the largest value receives rank 1, and then calculates the average rank of each column (or factor). Once the average rank is found for each factor, the following formula is applied to compare two factors:

\[ z = \frac {\left(R_i - R_j \right)} {\sqrt{\frac {k(k+1)} {6N}}} \]

where \(R_i\) is the average rank computed through the Friedman test for the i-th column (factor), \(k\) is the number of columns (factors) to be compared and \(N\) is the number of blocks used in the comparison. In this report the goal is to compare the accuracy of different classification methods on different datasets, so the columns (factors) are the classifiers and the rows (blocks) are the datasets. All classification methods were applied to each dataset, and the objective is to determine whether one method is better than the others and, if so, which one. The data used in this report comprise 3 classification methods and 7 datasets. The Friedman rank test was applied to each combination, and the resulting ranks are presented in the following table.

As we can see, the costBasedInitialization classifier received the first rank for all datasets, which suggests that it is the best of the three. We can use the formula presented above to compute the z statistic for each pair of average ranks, obtain the corresponding p-value, and compare the methods pairwise. The results are shown below:

Using a significance level of 5%, there is no evidence of a significant difference between kMeansPlusPlusInit and randomInitMethod, whereas costBasedInitialization differs significantly from each of the other two methods.
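The ranking step and the pairwise comparison described in this section can be sketched in Python. The accuracy values below are invented purely for illustration and are not the results of this report; the function names (`rank_block`, `average_ranks`, `pairwise_tests`) are likewise hypothetical. The sketch assumes that the p-value is obtained as the two-sided tail probability of z under the standard normal distribution. With \(k = 3\) and \(N = 7\), the denominator of the formula is \(\sqrt{12/42} \approx 0.535\).

```python
import math
from itertools import combinations


def rank_block(values):
    """Rank one block in decreasing order (largest value gets rank 1).

    Tied values receive the average of the ranks they span.
    """
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        tied_rank = (pos + end) / 2 + 1  # mean of 1-based positions pos+1..end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = tied_rank
        pos = end + 1
    return ranks


def average_ranks(table):
    """Average rank of each column over all blocks (rows)."""
    ranked = [rank_block(row) for row in table]
    n_blocks = len(table)
    return [sum(r[c] for r in ranked) / n_blocks for c in range(len(table[0]))]


def pairwise_tests(avg_ranks, n_blocks):
    """z statistic and two-sided normal p-value for every pair of columns."""
    k = len(avg_ranks)
    std_err = math.sqrt(k * (k + 1) / (6 * n_blocks))
    results = {}
    for i, j in combinations(range(k), 2):
        z = (avg_ranks[i] - avg_ranks[j]) / std_err
        p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
        results[(i, j)] = (z, p)
    return results


# Hypothetical accuracies: 7 datasets (rows) x 3 methods (columns).
# Column order: costBasedInitialization, kMeansPlusPlusInit, randomInitMethod.
accuracies = [
    [0.91, 0.85, 0.83],
    [0.88, 0.80, 0.82],
    [0.95, 0.90, 0.89],
    [0.79, 0.70, 0.74],
    [0.84, 0.81, 0.78],
    [0.90, 0.86, 0.87],
    [0.93, 0.88, 0.85],
]

mean_ranks = average_ranks(accuracies)
results = pairwise_tests(mean_ranks, len(accuracies))
for (i, j), (z, p) in results.items():
    print(f"columns {i} vs {j}: z = {z:.3f}, p = {p:.4f}")
```

Each p-value printed by the loop would then be compared against the chosen significance level, one pair of methods at a time.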

Bergmann-Hommel Post Hoc Procedure

This procedure takes the p-values from the Friedman Aligned Ranks Test and applies a correction based on a list of possible hypotheses for multiple testing. The difference between the two tests lies basically in the number of hypotheses considered. The Bergmann-Hommel procedure only uses exhaustive sets of hypotheses, i.e. hypotheses that can be true at the same time. This increases the power of the test, making the p-values more reliable to analyze. For the data used in this report the Bergmann-Hommel p-values are:

According to the p-values presented in the table above, we can conclude that there is a significant difference between costBasedInitialization and the other two methods. The analysis in this report used only one metric (‘jaccard’), one distance metric (‘euclidean’) and one moment (‘ini’). However, the method can be applied to any combination of metric, distance metric and moment. The results can be explored in the Shiny app, which can be accessed at: http://127.0.0.1:6196.
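The Bergmann-Hommel correction can be illustrated with a minimal Python sketch. It rests on two assumptions that should be checked against Garcia and Herrera (2008): that the exhaustive sets are the sets of pairwise hypotheses induced by each way of partitioning the methods into groups of equal performance, and that the adjusted p-value of hypothesis \(i\) is \(\min\{v, 1\}\) with \(v = \max\{|I| \cdot \min_{j \in I} p_j : I \text{ exhaustive}, i \in I\}\). The raw p-values and the function names are hypothetical, not taken from this report.

```python
from itertools import combinations


def set_partitions(items):
    """Yield every partition of `items` into non-empty groups."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Place `first` into each existing group in turn...
        for idx in range(len(partition)):
            yield partition[:idx] + [[first] + partition[idx]] + partition[idx + 1:]
        # ...or into a new group of its own.
        yield [[first]] + partition


def exhaustive_sets(k):
    """Sets of pairwise hypotheses (i, j), i < j, that can hold simultaneously."""
    sets = set()
    for partition in set_partitions(list(range(k))):
        pairs = frozenset(
            pair for group in partition for pair in combinations(sorted(group), 2)
        )
        if pairs:  # skip the partition in which all methods differ
            sets.add(pairs)
    return sets


def bergmann_hommel_adjust(p_values, k):
    """Adjusted p-values under the definition assumed in the lead-in."""
    sets = exhaustive_sets(k)
    adjusted = {}
    for hyp in p_values:
        v = max(
            len(s) * min(p_values[h] for h in s)
            for s in sets
            if hyp in s
        )
        adjusted[hyp] = min(v, 1.0)
    return adjusted


# Hypothetical raw p-values for the three pairwise comparisons of 3 methods.
raw_p = {(0, 1): 0.0075, (0, 2): 0.0033, (1, 2): 0.789}
adjusted = bergmann_hommel_adjust(raw_p, k=3)
for pair, p in adjusted.items():
    print(pair, round(p, 4))
```

For three methods the enumeration yields four exhaustive sets: the full set of three hypotheses and each single hypothesis on its own. Because the number of partitions grows quickly with the number of methods, this brute-force enumeration is only practical for small comparisons.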