In this blog I would like to focus on one primary advantage and one primary disadvantage of each of several classifiers.
KNN classifies an observation according to how its neighbours are classified. It calculates the Euclidean distance between the new observation and the existing observations, identifies the K nearest neighbours, and checks which class each of them belongs to. The observation is then assigned to the majority class among those neighbours. The value of K is critical here. Usually we take K = 3 or 5, but the right value depends on the dataset. Its primary advantage is that it is simple and needs no real training step, since the stored observations themselves act as the model. Its critical disadvantage is that it is not recommended when there is little data or when the data is skewed, because the majority vote then tends to favour the dominant class.
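The distance-and-vote procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a production implementation: the function name `knn_predict` and the toy points are made up for the example, and ties in the vote are not handled specially.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (features, label) pairs; query: a feature tuple."""
    # Keep the k training points closest to the query (Euclidean distance).
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Majority vote among the labels of those k neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): two well-separated groups.
train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
         ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B")]

print(knn_predict(train, (1.1, 1.0), k=3))
print(knn_predict(train, (5.0, 4.8), k=3))
```

Note that every prediction sorts the whole training set, which is why KNN prediction gets expensive as the data grows.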
Naive Bayes is a generative algorithm, which means it uses the existing data to model how the features are distributed within each class, and then applies Bayes' theorem to get the probability of an observation belonging to each class. Naive Bayes is fast, and in real-world applications it is commonly used in Natural Language Processing, where it calculates the probability of words given a class and combines them into a result. Its primary advantage is that, being a simple generative model, it is fast. Its disadvantage is its assumption of independence among features, which does not hold in many cases.
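To make the word-probability idea concrete, here is a small sketch of a word-count Naive Bayes classifier in plain Python. The function names (`train_nb`, `predict_nb`), the add-one smoothing, and the four toy documents are assumptions chosen for illustration only.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """docs: list of (list_of_words, label) pairs."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    for words, label in docs:
        word_counts[label].update(words)
    vocab = {w for words, _ in docs for w in words}
    return class_counts, word_counts, vocab

def predict_nb(model, words):
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        # log P(class) plus the sum of log P(word | class).
        # Treating the words as independent is the "naive" assumption.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for w in words:
            # Add-one smoothing so unseen words do not give zero probability.
            score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Toy corpus (hypothetical).
docs = [("win money now".split(), "spam"),
        ("free money offer".split(), "spam"),
        ("meeting at noon".split(), "ham"),
        ("lunch meeting tomorrow".split(), "ham")]

model = train_nb(docs)
print(predict_nb(model, "free money".split()))
print(predict_nb(model, "meeting tomorrow".split()))
```

Training is just counting, which is where the speed comes from.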
Logistic regression is a probabilistic algorithm: it estimates the probability of an observation belonging to a class and then assigns the classification. If the value of P is greater than 0.5, the observation is assigned to one class, otherwise to the other. If the dependent variable has only 2 classes, it is called binary logistic regression; otherwise it is multinomial logistic regression. In R, we implement logistic regression using the glm function and specify the family as binomial for binary logistic regression. Its primary advantage is that it is very simple and explainable, and its disadvantage is that it is not ideal for multi-class or complex scenarios.
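The probability-then-threshold step can be illustrated with a small sketch in plain Python. It fits a one-feature binary model by simple gradient descent, which is an assumption made to keep the example self-contained and is not how R's glm fits the model. The names `fit` and `predict`, the learning rate, and the toy data are all hypothetical.

```python
import math

def sigmoid(z):
    # Maps any real number to a probability between 0 and 1.
    return 1.0 / (1.0 + math.exp(-z))

def fit(xs, ys, lr=0.1, epochs=5000):
    """Fit weight w and intercept b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(w, b, x, threshold=0.5):
    """Return (class, probability); class is 1 when P exceeds the threshold."""
    p = sigmoid(w * x + b)
    return (1 if p > threshold else 0), p

# Toy data (hypothetical): small x values are class 0, large ones class 1.
xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 1, 1]

w, b = fit(xs, ys)
print(predict(w, b, 1.0))
print(predict(w, b, 3.5))
```

The fitted w and b can be read directly as the effect of the feature on the log-odds, which is what makes the model easy to explain.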
A decision tree generates a tree structure from the data and uses it to classify observations. Once the tree is built, it works like a flowchart: an observation is passed from the root through a series of tests until it reaches a leaf, which gives the predicted class. Decision trees are also a good fit for parallel processing models. Its primary advantage is that it is easily explainable, and its disadvantage is that it is not a good fit for small datasets.
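The flowchart-style traversal can be sketched as follows in plain Python. The tree here is written by hand purely for illustration, not learned from data, and the feature names, thresholds, and labels are hypothetical.

```python
# Internal nodes test one feature against a threshold; leaves are class labels.
# Hand-written example tree (hypothetical), not the output of a training algorithm.
tree = {
    "feature": "income", "threshold": 30000,
    "left": "reject",
    "right": {
        "feature": "age", "threshold": 25,
        "left": "review",
        "right": "approve",
    },
}

def classify(node, observation):
    """Walk from the root to a leaf, choosing a branch at every test."""
    while isinstance(node, dict):
        go_left = observation[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node  # a leaf: the predicted class

print(classify(tree, {"income": 20000, "age": 40}))
print(classify(tree, {"income": 50000, "age": 22}))
print(classify(tree, {"income": 50000, "age": 40}))
```

Each prediction can be explained by listing the tests that were passed on the way to the leaf.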
SVM is one of the most sophisticated algorithms and works in a wide variety of situations. Much of how an SVM behaves depends on which kernel is used. SVM works by projecting the data into a higher-dimensional space so that the classes can be separated by a plane (a hyperplane). Its primary advantage is that it is robust to outliers, and its disadvantage is that it is slower on larger datasets, especially when the dependent variable has a high number of classes.
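The idea of projecting data so that a plane can separate it can be shown with a tiny sketch in plain Python. This is not an SVM implementation: the mapping `phi`, the weights `W`, and the offset `B` are picked by hand for the toy data, whereas a real SVM would learn the separating plane and would normally apply the mapping implicitly through its kernel.

```python
def phi(x):
    # Map a 1-D point into 2-D by adding its square as a second coordinate.
    return (x, x * x)

# Hand-picked separating line in the mapped space (assumption): x^2 = 4.
W = (0.0, 1.0)
B = -4.0

def decide(x):
    z = phi(x)
    score = W[0] * z[0] + W[1] * z[1] + B
    return "outer" if score > 0 else "inner"

# Toy data (hypothetical): on the original line, "inner" points sit between
# the "outer" points, so no single cut-off on x separates the two classes.
inner = [-1.0, 0.0, 1.0]
outer = [-3.0, -2.5, 2.5, 3.0]

print([decide(x) for x in inner])
print([decide(x) for x in outer])
```

After the mapping, one straight line is enough to split the two groups, which is the effect a suitable kernel gives an SVM.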