Linear Regression Classifier & KNN Classifier & Bayesian Classifier Simulation and Visualization

First, simulate data from three classes. The data come from a mixture of three bivariate normal distributions whose means are themselves bivariate normal (the hyperparameters). The hyperparameter means are (1,3), (3,1), and (4,4), as in the code below. All covariance matrices are the same: diag(1,2)*0.2, that is, 0.2 times the 2 x 2 identity matrix.

library(MASS)
library(class)
set.seed(420121)
mu0 = c(0, 0)
sigma0 = diag(1, 2) * 0.2

traino = mvrnorm(300, mu0, sigma0)
# Generate normal means for three classes
mu1 = mvrnorm(100, c(1, 3), sigma0)
mu2 = mvrnorm(100, c(3, 1), sigma0)
mu3 = mvrnorm(100, c(4, 4), sigma0)
muplus = rbind(mu1, mu2, mu3)
traindata = traino + muplus
traincl = rep(c(1, 2, 3), c(100, 100, 100))
# With these settings, the points are simulated from a mixture of three normal
# distributions with normally distributed means. Note that each class has
# weight 1/3.

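# Then generate test data: a 25 x 40 grid over [0, 6] x [0, 6]
x1 = seq(0, 6, length.out = 25)
x2 = seq(0, 6, length.out = 40)
# x1 varies fastest, so predictions on the grid can later be reshaped into a
# length(x1) by length(x2) matrix
tx1 = rep(x1, times = length(x2))
tx2 = rep(x2, each = length(x1))
testdata = cbind(tx1, tx2)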
plot(traindata, pch = traincl)
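
As a quick sanity check on the simulation, the sketch below regenerates the data the same way and compares the per-class sample means with the hyperparameter means. Because each point is the sum of two independent normal draws with covariance sigma0, the within-class covariance should come out near 2 * sigma0. The names centers, class_means, and pooled_cov are introduced here for illustration only.

```r
library(MASS)

set.seed(420121)
sigma0 = diag(1, 2) * 0.2
centers = rbind(c(1, 3), c(3, 1), c(4, 4))

# Same recipe as above: noise first, then one normal mean per observation
noise = mvrnorm(300, c(0, 0), sigma0)
mu1 = mvrnorm(100, centers[1, ], sigma0)
mu2 = mvrnorm(100, centers[2, ], sigma0)
mu3 = mvrnorm(100, centers[3, ], sigma0)
traindata = noise + rbind(mu1, mu2, mu3)
traincl = rep(c(1, 2, 3), c(100, 100, 100))

# Per-class sample means: column sums by class divided by class sizes
class_means = rowsum(traindata, traincl) / as.vector(table(traincl))
print(class_means)

# Pooled within-class covariance, to compare with 2 * sigma0
resid = traindata - class_means[traincl, ]
pooled_cov = crossprod(resid) / (nrow(traindata) - 3)
print(pooled_cov)
print(2 * sigma0)
```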


First, fit a linear regression of the class labels on the two coordinates.

LinearMode = lm(traincl ~ traindata)
b = coef(LinearMode)
# Fitted value of the regression at a point (x1, x2):
yhat = function(x1, x2) {
    b[1] + b[2] * x1 + b[3] * x2
}
# The labels are coded 1, 2, 3, so the class boundaries are the lines where
# the fitted value equals 1.5 and 2.5. Solve yhat = 1.5 (or 2.5) for x2.
X = seq(0, 6, length.out = 1000)
Y1 = (1.5 - b[1] - b[2] * X)/b[3]
Y2 = (2.5 - b[1] - b[2] * X)/b[3]
plot(traindata, pch = traincl, main = "Linear Fitting Classifier", xlab = "x1", 
    ylab = "x2")
lines(X, Y1, col = 2)
lines(X, Y2, col = 3)
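
The two lines drawn above correspond to cutting the fitted values at 1.5 and 2.5. A minimal sketch of turning that rule into class predictions and a training confusion table follows. It regenerates the data with the same recipe, and the names fit, predcl, confusion, and train_error are introduced here for illustration.

```r
library(MASS)

set.seed(420121)
sigma0 = diag(1, 2) * 0.2
noise = mvrnorm(300, c(0, 0), sigma0)
mu1 = mvrnorm(100, c(1, 3), sigma0)
mu2 = mvrnorm(100, c(3, 1), sigma0)
mu3 = mvrnorm(100, c(4, 4), sigma0)
traindata = noise + rbind(mu1, mu2, mu3)
traincl = rep(c(1, 2, 3), c(100, 100, 100))

fit = lm(traincl ~ traindata)

# Cut the fitted values at 1.5 and 2.5 to get a label in {1, 2, 3}
predcl = cut(fitted(fit), breaks = c(-Inf, 1.5, 2.5, Inf), labels = FALSE)

# Rows are the true classes, columns the predicted classes
confusion = table(truth = traincl, predicted = predcl)
print(confusion)

# Share of training points that land on the wrong side of a boundary
train_error = mean(predcl != traincl)
print(train_error)
```

Coding the classes as 1, 2, 3 imposes an ordering on them, so the boundaries depend on which class receives the middle label.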


Second, fit a KNN classifier with k = 15.

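Kmo = knn(traindata, testdata, traincl, k = 15, l = 0, prob = TRUE, use.all = TRUE)

# attr(Kmo, "prob") is the proportion of the k votes won by the predicted class.
# With three classes, 1 - prob is not the probability of any single class, so
# the winning-class proportion is contoured directly.
probk = attr(Kmo, "prob")
probk = matrix(probk, length(x1), length(x2))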
contour(x1, x2, probk, levels = c(0.5, 0.65, 0.95), col = 2, main = "KNN Classifier Plot K=15")
points(traindata, pch = traincl, cex = 0.8, col = rep(c(3, 4, 5), rep(100, 3)))
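
The choice k = 15 is fixed above. One hedged way to compare it with other values is leave-one-out cross-validation using knn.cv from the class package, sketched below. The candidate grid ks and the name loo_error are assumptions made for illustration, and the data are regenerated with the same recipe.

```r
library(MASS)
library(class)

set.seed(420121)
sigma0 = diag(1, 2) * 0.2
noise = mvrnorm(300, c(0, 0), sigma0)
mu1 = mvrnorm(100, c(1, 3), sigma0)
mu2 = mvrnorm(100, c(3, 1), sigma0)
mu3 = mvrnorm(100, c(4, 4), sigma0)
traindata = noise + rbind(mu1, mu2, mu3)
traincl = rep(c(1, 2, 3), c(100, 100, 100))

# Hypothetical grid of neighbourhood sizes to compare
ks = c(1, 5, 15, 25, 51)

# Leave-one-out misclassification rate for each k
loo_error = sapply(ks, function(k) {
    pred = knn.cv(traindata, factor(traincl), k = k)
    mean(as.character(pred) != as.character(traincl))
})
names(loo_error) = ks
print(loo_error)
```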


The last one is the Bayesian classifier. This one will be awkward to work out since my mixture model is complicated. I will continue this later…
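
One possible starting point, offered as a sketch rather than the planned continuation: in the simulation as coded above, each point is the sum of two independent normal draws with covariance sigma0, so each class is marginally normal, centered at its hyperparameter mean, with covariance 2 * sigma0. Assuming equal class weights of 1/3 and known centers, the Bayes rule then reduces to assigning each point to the nearest center. The names centers, d2, bayescl, and bayes_error are introduced here for illustration.

```r
library(MASS)

set.seed(420121)
sigma0 = diag(1, 2) * 0.2
centers = rbind(c(1, 3), c(3, 1), c(4, 4))
noise = mvrnorm(300, c(0, 0), sigma0)
mu1 = mvrnorm(100, centers[1, ], sigma0)
mu2 = mvrnorm(100, centers[2, ], sigma0)
mu3 = mvrnorm(100, centers[3, ], sigma0)
traindata = noise + rbind(mu1, mu2, mu3)
traincl = rep(c(1, 2, 3), c(100, 100, 100))

# Squared Euclidean distance from every point to every class center
# (one column per class)
d2 = sapply(1:3, function(k) rowSums(sweep(traindata, 2, centers[k, ])^2))

# Equal priors and a shared spherical covariance: pick the nearest center
bayescl = apply(d2, 1, which.min)

bayes_error = mean(bayescl != traincl)
print(table(truth = traincl, predicted = bayescl))
print(bayes_error)
```

The same rule can be applied to the test grid by replacing traindata with testdata, which would give decision regions to compare with the linear and KNN plots.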