Pros:
Cons:
Our goal is to build parametric model for conditional distribution \(P(Y = k | X = x)\)
A typical approach is to apply Bayes theorem: \[ Pr(Y = k | X=x) = \frac{Pr(X=x|Y=k)Pr(Y=k)}{\sum_{\ell=1}^K Pr(X=x |Y = \ell) Pr(Y=\ell)}\] \[Pr(Y = k | X=x) = \frac{f_k(x) \pi_k}{\sum_{\ell = 1}^K f_{\ell}(x) \pi_{\ell}}\]
Typically prior probabilities \(\pi_k\) are set in advance.
A common choice for \(f_k(x) = \frac{1}{\sigma_k \sqrt{2 \pi}}e^{-\frac{(x-\mu_k)^2}{\sigma_k^2}}\), a Gaussian distribution
Estimate the parameters (\(\mu_k\),\(\sigma_k^2\)) from the data.
Classify to the class with the highest value of \(P(Y = k | X = x)\)
A range of models use this approach
http://statweb.stanford.edu/~tibs/ElemStatLearn/
\[log \frac{Pr(Y = k | X=x)}{Pr(Y = j | X=x)}\] \[ = log \frac{f_k(x)}{f_j(x)} + log \frac{\pi_k}{\pi_j}\] \[ = log \frac{\pi_k}{\pi_j} - \frac{1}{2}(\mu_k + \mu_j)^T \Sigma^{-1}(\mu_k + \mu_j)\] \[ + x^T \Sigma^{-1} (\mu_k - \mu_j)\]
http://statweb.stanford.edu/~tibs/ElemStatLearn/
\[\delta_k(x) = x^T \Sigma^{-1} \mu_k - \frac{1}{2}\mu_k \Sigma^{-1}\mu_k + log(\mu_k)\]
Suppose we have many predictors, we would want to model: \(P(Y = k | X_1,\ldots,X_m)\)
We could use Bayes Theorem to get:
\[P(Y = k | X_1,\ldots,X_m) = \frac{\pi_k P(X_1,\ldots,X_m| Y=k)}{\sum_{\ell = 1}^K P(X_1,\ldots,X_m | Y=k) \pi_{\ell}}\] \[ \propto \pi_k P(X_1,\ldots,X_m| Y=k)\]
This can be written:
\[P(X_1,\ldots,X_m, Y=k) = \pi_k P(X_1 | Y = k)P(X_2,\ldots,X_m | X_1,Y=k)\] \[ = \pi_k P(X_1 | Y = k) P(X_2 | X_1, Y=k) P(X_3,\ldots,X_m | X_1,X_2, Y=k)\] \[ = \pi_k P(X_1 | Y = k) P(X_2 | X_1, Y=k)\ldots P(X_m|X_1\ldots,X_{m-1},Y=k)\]
We could make an assumption to write this:
\[ \approx \pi_k P(X_1 | Y = k) P(X_2 | Y = k)\ldots P(X_m |,Y=k)\]
data(iris); library(ggplot2)
names(iris)
[1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" "Species"
table(iris$Species)
setosa versicolor virginica
50 50 50
inTrain <- createDataPartition(y=iris$Species,
p=0.7, list=FALSE)
training <- iris[inTrain,]
testing <- iris[-inTrain,]
dim(training); dim(testing)
[1] 45 5
modlda = train(Species ~ .,data=training,method="lda")
modnb = train(Species ~ ., data=training,method="nb")
plda = predict(modlda,testing); pnb = predict(modnb,testing)
table(plda,pnb)
pnb
plda setosa versicolor virginica
setosa 15 0 0
versicolor 0 13 1
virginica 0 0 16
equalPredictions = (plda==pnb)
qplot(Petal.Width,Sepal.Width,colour=equalPredictions,data=testing)