This file outlines the structure of the subsequent documents. Documents 8.1-12.2 all follow the same structure:
First, the actual labels and the feature set under consideration (F1, F2, F3, F4 or F5) are imported. For a combined feature set, the imported feature set is merged with the machine-learning feature set F1.
To enable one-vs-all classification, the labels are recoded. For example, when the score “2” is the class under consideration, every label where Score = 2 is TRUE is coded as 1 and all other labels are coded as 0.
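The recoding step can be sketched in a few lines of Python. This is only an illustration: the function name `recode_one_vs_all` and the example scores are hypothetical and do not come from the documents themselves.

```python
def recode_one_vs_all(scores, target):
    """Return 1 for every score equal to the target class and 0 otherwise."""
    return [1 if score == target else 0 for score in scores]


# Hypothetical example scores; the score "2" is the class under consideration.
scores = [0, 2, 1, 2, 3]
binary_labels = recode_one_vs_all(scores, target=2)
print(binary_labels)
```

Repeating the call once per score value would give one binary label vector per class, which is what a one-vs-all setup needs.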
Next, the features are converted to numeric values, a prerequisite for SVM classification.
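A minimal sketch of such a conversion is shown below, assuming every feature value can be parsed directly as a number. The helper `to_numeric` and the sample rows are hypothetical; categorical features would need a separate encoding, which is not described here.

```python
def to_numeric(rows):
    """Convert every feature value in every row to a float."""
    return [[float(value) for value in row] for row in rows]


# Hypothetical raw features mixing strings and integers.
raw_features = [["1", "0.5", 3], ["2", "1.5", 4]]
numeric_features = to_numeric(raw_features)
print(numeric_features)
```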
The labels and the features are then randomly split into a Training set (80%) and a Validation set (20%).
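One way to picture the random 80/20 split is the sketch below. The function name, the fixed seed and the toy data are assumptions made for illustration; the documents do not state whether a seed was set.

```python
import random


def train_validation_split(features, labels, train_fraction=0.8, seed=0):
    """Randomly split features and labels into training and validation parts."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    indices = list(range(len(labels)))
    # Assumption: a fixed seed is used here only to make the sketch reproducible.
    random.Random(seed).shuffle(indices)
    n_train = int(round(train_fraction * len(indices)))
    train_idx, val_idx = indices[:n_train], indices[n_train:]
    x_train = [features[i] for i in train_idx]
    y_train = [labels[i] for i in train_idx]
    x_val = [features[i] for i in val_idx]
    y_val = [labels[i] for i in val_idx]
    return x_train, y_train, x_val, y_val


# Hypothetical toy data: ten samples with two numeric features each.
features = [[float(i), float(i % 3)] for i in range(10)]
labels = [i % 2 for i in range(10)]
x_train, y_train, x_val, y_val = train_validation_split(features, labels)
print(len(x_train), len(x_val))
```

Shuffling indices rather than the data itself keeps each feature row aligned with its label.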
SVM models were built from the training features and training labels. The option probability = TRUE was set so that the models output a probability rather than a discrete prediction.
With these probabilities as input, the class with the highest probability was chosen as the prediction.
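The selection of the most probable class can be sketched as follows, assuming one probability per class and sample is available (for instance, one from each one-vs-all model). The function name `pick_class` and the probability values are made up for illustration.

```python
def pick_class(probabilities_by_class):
    """For each sample, return the class whose probability is highest.

    probabilities_by_class maps a class label to a list holding one
    probability per sample. Ties go to the class listed first.
    """
    classes = list(probabilities_by_class)
    n_samples = len(probabilities_by_class[classes[0]])
    predictions = []
    for i in range(n_samples):
        best = max(classes, key=lambda c: probabilities_by_class[c][i])
        predictions.append(best)
    return predictions


# Hypothetical probabilities for three classes and three samples.
probabilities = {
    0: [0.7, 0.1, 0.2],
    1: [0.2, 0.3, 0.5],
    2: [0.4, 0.8, 0.1],
}
predicted = pick_class(probabilities)
print(predicted)
```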
Finally, performance measures such as Accuracy, Precision and Recall were calculated.
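For a single recoded (binary) class, the three measures can be computed from the counts of true and false positives and negatives. The sketch below uses the standard definitions; the function name and the example vectors are hypothetical.

```python
def binary_metrics(actual, predicted):
    """Compute accuracy, precision and recall for 0/1 labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Return 0.0 when a denominator is zero instead of raising an error.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical validation labels and predictions.
actual_labels = [1, 0, 1, 1, 0, 0]
predicted_labels = [1, 0, 0, 1, 1, 0]
metrics = binary_metrics(actual_labels, predicted_labels)
print(metrics)
```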