PURPOSE

This file outlines the structure of the subsequent documents. Documents 8.1-12.2 all follow the same structure:

  1. First, the actual labels and the feature set under consideration (F1, F2, F3, F4 or F5) are imported. For a combined feature set, the imported feature set is combined with the machine-learning feature set F1.

  2. To enable one-vs-all classification, the labels are recoded. For example, when the score “2” is considered, all observations with Score = 2 are coded as 1 and all other observations are coded as 0.

  3. Next, the features are converted to numeric values, a prerequisite for SVM classification.

  4. The labels and the features are then split at random into training (80%) and validation (20%) sets.

  5. SVM models were built using the training features and training labels. The option probability = TRUE was set so that the output is a probability rather than a discrete prediction.

  6. Using the probabilities as input, the class with the highest probability was chosen as the predicted class.

  7. Finally, performance measures such as Accuracy, Precision and Recall were calculated.
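
The steps around the SVM itself (2, 4, 6 and 7) can be sketched in a few lines. This is only an illustration under stated assumptions, not the code used in documents 8.1-12.2: it is written in Python with the standard library only, all function and variable names are hypothetical, and the probabilities are made-up values standing in for the SVM output of step 5, which is not reproduced here.

```python
import random

def recode_one_vs_all(scores, target):
    """Step 2: code a score as 1 if it equals the target class, else 0."""
    return [1 if s == target else 0 for s in scores]

def random_split(n_rows, train_fraction=0.8, seed=0):
    """Step 4: shuffle the row indices and cut them into training and validation parts."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    cut = int(round(n_rows * train_fraction))
    return indices[:cut], indices[cut:]

def pick_most_probable(prob_by_class):
    """Step 6: for each row, pick the class with the highest probability."""
    classes = sorted(prob_by_class)
    n_rows = len(prob_by_class[classes[0]])
    return [max(classes, key=lambda c: prob_by_class[c][i]) for i in range(n_rows)]

def evaluate(actual, predicted, target):
    """Step 7: overall accuracy, plus precision and recall for one target class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == target and p == target for a, p in pairs)
    fp = sum(a != target and p == target for a, p in pairs)
    fn = sum(a == target and p != target for a, p in pairs)
    accuracy = sum(a == p for a, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Hypothetical scores for five observations.
scores = [1, 2, 3, 2, 1]
labels_for_2 = recode_one_vs_all(scores, target=2)

# 80/20 split of ten hypothetical row indices.
train_idx, valid_idx = random_split(10)

# Made-up probabilities, one list per class (placeholder for the SVM output).
example_probs = {
    1: [0.7, 0.2, 0.1, 0.3, 0.3],
    2: [0.2, 0.5, 0.3, 0.6, 0.5],
    3: [0.1, 0.3, 0.6, 0.1, 0.2],
}
predicted = pick_most_probable(example_probs)
accuracy, precision, recall = evaluate(scores, predicted, target=2)
```

In this sketch precision and recall are computed for a single class (here the score 2), while accuracy is computed over all classes; whether the original documents report per-class or averaged measures is not stated above.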