Decision Boundary Tool for SVM Classifier

Vinod M
6/21/2015

Introduction

  • Classifiers separate classes by mapping each input to a class score or probability
  • To understand how well a classifier is working, it helps to visualize the decision boundary it creates
  • Changing the parameters of the classifier changes the fit, so understanding the parameters is key to using the classifier successfully
  • The Support Vector Machine (SVM) is known as a large-margin classifier and uses two parameters, C and gamma, to control the fit
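
For reference, the roles of C and gamma can be read off the standard soft-margin SVM formulation. The block below is a sketch under an assumption: it uses the RBF (Gaussian) kernel, which these slides do not name but which is the usual kernel when a gamma parameter is tuned. The symbols w, b, ξ, φ and K are standard textbook notation, not taken from the tool itself.

```latex
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^2 + C\sum_{i=1}^{n}\xi_i
\quad\text{subject to}\quad
y_i\bigl(w^\top\phi(x_i) + b\bigr) \ge 1 - \xi_i,\quad \xi_i \ge 0,
\qquad
K(x, x') = \exp\bigl(-\gamma\,\lVert x - x'\rVert^2\bigr)
```

C weights the total margin violation (the slack terms ξ) against the width of the margin, and gamma sets how quickly the kernel similarity between two points decays with distance.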

Dataset with two classes

Below is a plot of the example dataset used by this tool. Visually, we can see that the two classes are not linearly separable and that the data contains some outliers.

[Figure: plot of the example dataset]
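
The dataset behind the plot is not included in these slides. To experiment with data of a similar shape, the sketch below generates a hypothetical stand-in: one class clustered around the origin, a second class on a surrounding ring, and a few ring-class points dropped inside the cluster as outliers. The function name `make_ring_dataset`, its parameters, and all numeric values are assumptions for illustration, not the author's data.

```python
import math
import random


def make_ring_dataset(n_per_class=100, n_outliers=5, seed=0):
    """Return (points, labels) for a hypothetical two-class toy dataset."""
    rng = random.Random(seed)
    points, labels = [], []
    for _ in range(n_per_class):
        # Class 0: Gaussian cluster around the origin.
        points.append((rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)))
        labels.append(0)
        # Class 1: noisy ring of radius about 2 around the cluster.
        angle = rng.uniform(0.0, 2.0 * math.pi)
        radius = rng.gauss(2.0, 0.2)
        points.append((radius * math.cos(angle), radius * math.sin(angle)))
        labels.append(1)
    for _ in range(n_outliers):
        # Outliers: class-1 points placed inside the class-0 cluster.
        points.append((rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)))
        labels.append(1)
    return points, labels


points, labels = make_ring_dataset()
print(len(points), "points:", labels.count(0), "in class 0,", labels.count(1), "in class 1")
```

No straight line separates the cluster from the ring, so a linear boundary cannot fit this data, and the planted outliers give a high-C fit something to chase.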

Decision boundary tool in action

  • When adjusting the C parameter in the decision boundary tool, we can see that with high C values the SVM classifier attempts to classify all points (including outliers) correctly to reduce the misclassification error, whereas a low C value produces a simple decision boundary with a higher misclassification error. C thus acts as an inverse regularization parameter: lowering it regularizes more strongly and can be used to prevent overfitting the data
  • The parameter gamma determines the radius of influence of a single training point: large values shrink that radius and lead to overfitting, while small values result in a simple decision boundary.
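
Both effects can be checked numerically without fitting a model. The sketch below is an illustration under stated assumptions, not the tool's implementation: it assumes the RBF kernel for gamma and uses the linear soft-margin objective for C. The helper names, the toy points, and the two candidate boundaries are all hypothetical.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF similarity exp(-gamma * ||x - z||^2) between two points."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def soft_margin_objective(w, b, points, labels, C):
    """Linear soft-margin objective: 0.5 * ||w||^2 + C * total hinge loss."""
    margin_term = 0.5 * sum(wi * wi for wi in w)
    hinge = sum(
        max(0.0, 1.0 - y * (sum(wi * xi for wi, xi in zip(w, x)) + b))
        for x, y in zip(points, labels)
    )
    return margin_term + C * hinge


# gamma: similarity between two points one unit apart.
# A larger gamma makes the similarity fall off faster (smaller radius of influence).
origin, neighbour = (0.0, 0.0), (1.0, 0.0)
for gamma in (0.1, 1.0, 10.0):
    print(f"gamma={gamma}: similarity={rbf_kernel(origin, neighbour, gamma):.5f}")

# C: hypothetical toy data with one +1 outlier sitting on the -1 side.
toy_points = [(2.0, 0.0), (3.0, 0.0), (-2.0, 0.0), (-3.0, 0.0), (-0.5, 0.0)]
toy_labels = [1, 1, -1, -1, 1]

wide = ((1.0, 0.0), 0.0)   # wide margin, leaves the outlier misclassified
tight = ((2.0, 0.0), 2.0)  # narrower margin, classifies every point correctly

for C in (0.1, 10.0):
    wide_cost = soft_margin_objective(*wide, toy_points, toy_labels, C)
    tight_cost = soft_margin_objective(*tight, toy_points, toy_labels, C)
    preferred = "wide" if wide_cost < tight_cost else "tight"
    print(f"C={C}: wide={wide_cost:.2f}, tight={tight_cost:.2f}, preferred={preferred}")
```

With the low C the wide-margin boundary has the lower objective even though it misclassifies the outlier; with the high C the penalty on that single point dominates and the boundary that bends to fit it wins.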

Conclusion

  • Learning methods are effective tools in supervised learning, but the user needs to know how the parameters affect the outcome
  • Educational tools like this one provide an intuitive understanding of how the parameters influence the model