Logistic regression provides a method for predicting class membership based on a set of independent variables. However, in some cases other models, built on Bayes' theorem, may provide better predictive performance. In this section we discuss three alternative classifiers: linear discriminant analysis, quadratic discriminant analysis, and naive Bayes. R code for each method is provided below.
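All three methods estimate the posterior probability of each class through Bayes' theorem: if \(\pi_k\) is the prior probability of the kth class and \(f_k(x)\) is the density of the predictors within that class, then
\[
\Pr(Y = k \mid X = x) = \frac{\pi_k f_k(x)}{\sum_{l=1}^{K} \pi_l f_l(x)},
\]
and an observation is assigned to the class with the largest posterior probability. The methods differ only in how they estimate the densities \(f_k(x)\).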
The linear discriminant analysis (LDA) classifier results from assuming that the observations within each class come from a normal distribution with a class-specific mean vector and a common covariance matrix \(\Sigma\). The word linear stems from the fact that the discriminant functions are linear functions of \(x\), as opposed to more complex functions. We again consider the Default credit data set and apply LDA to classify customers based on the predictors balance and income.
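Concretely, under these assumptions LDA assigns an observation to the class with the largest discriminant function
\[
\delta_k(x) = x^{T}\Sigma^{-1}\mu_k - \tfrac{1}{2}\mu_k^{T}\Sigma^{-1}\mu_k + \log \pi_k ,
\]
where \(\mu_k\) and \(\pi_k\) are the mean vector and prior probability of the kth class. This function is linear in \(x\), which is what gives the method its name.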
library(ISLR2)
library(MASS)
names(Default)
## [1] "default" "student" "balance" "income"
head(Default)
## default student balance income
## 1 No No 729.5265 44361.625
## 2 No Yes 817.1804 12106.135
## 3 No No 1073.5492 31767.139
## 4 No No 529.2506 35704.494
## 5 No No 785.6559 38463.496
## 6 No Yes 919.5885 7491.559
# Rename the Default data to Credit
Credit <- Default
# Use the lda() function from the MASS package to fit the linear discriminant analysis
# lda() uses the same formula syntax as lm()
lda.fit <- lda(default ~ balance + income, data=Credit)
lda.fit
## Call:
## lda(default ~ balance + income, data = Credit)
##
## Prior probabilities of groups:
## No Yes
## 0.9667 0.0333
##
## Group means:
## balance income
## No 803.9438 33566.17
## Yes 1747.8217 32089.15
##
## Coefficients of linear discriminants:
## LD1
## balance 2.230835e-03
## income 7.793355e-06
# The plot() function produces plots of the linear discriminants
plot(lda.fit)
# The predict() function returns a list containing the predicted classes
# and the posterior probabilities for each class
lda.pred <- predict(lda.fit, Credit)
# Posterior probability of default ("Yes") for each customer
lda.prob <- lda.pred$posterior[,2]
# Suppose we want a 0.3 threshold for classifying default
# The predictions can then be assessed using a confusion matrix
lda.class <- ifelse(lda.prob > 0.3, "Yes", "No")
table(lda.class, Credit$default)
##
## lda.class No Yes
## No 9575 182
## Yes 92 151
# Overall correct classification rate (97.26%)
mean(lda.class == Credit$default)
## [1] 0.9726
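As a small aside (output not shown here), the list returned by predict() also contains a class component holding the labels implied by the default 0.5 cutoff, which can be compared with the 0.3-threshold predictions above:
# Classes implied by the default 0.5 cutoff
head(lda.pred$class)
# Overall accuracy under the default cutoff
mean(lda.pred$class == Credit$default)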
Quadratic discriminant analysis (QDA) provides an alternative approach to classifying the outcome variable. Like LDA, the QDA classifier results from assuming that the observations in each class are drawn from a Gaussian distribution. Unlike LDA, however, QDA assumes that each class has its own covariance matrix; that is, it assumes that an observation from the kth class is of the form \(X \sim N(\mu_k, \Sigma_k)\), where \(\Sigma_k\) is the covariance matrix for the kth class. Using QDA, we can again model the probability of default from the Default data.
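Under this assumption, the discriminant function for the kth class becomes
\[
\delta_k(x) = -\tfrac{1}{2}(x - \mu_k)^{T}\Sigma_k^{-1}(x - \mu_k) - \tfrac{1}{2}\log|\Sigma_k| + \log \pi_k ,
\]
which is a quadratic function of \(x\), hence the name.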
# The code is nearly identical to the LDA code, with lda() replaced by qda()
qda.fit <- qda(default ~ balance + income, data=Credit)
qda.fit
## Call:
## qda(default ~ balance + income, data = Credit)
##
## Prior probabilities of groups:
## No Yes
## 0.9667 0.0333
##
## Group means:
## balance income
## No 803.9438 33566.17
## Yes 1747.8217 32089.15
qda.pred <- predict(qda.fit, Credit)
qda.prob <- qda.pred$posterior[,2]
qda.class <- ifelse(qda.prob > 0.3, "Yes", "No")
table(qda.class, Credit$default)
##
## qda.class No Yes
## No 9509 161
## Yes 158 172
mean(qda.class == Credit$default)
## [1] 0.9681
The naive Bayes classifier takes a different approach to estimating \(f_1(x), \ldots, f_K(x)\). Instead of assuming that these functions belong to a particular family of distributions, we make a single assumption: within the kth class, the \(p\) predictors are independent. The naiveBayes() function is found in the e1071 library.
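Under this independence assumption, the within-class density factors as
\[
f_k(x) = f_{k1}(x_1) \times f_{k2}(x_2) \times \cdots \times f_{kp}(x_p),
\]
so each one-dimensional density can be estimated separately. For quantitative predictors such as balance and income, naiveBayes() models each \(f_{kj}\) with a normal density, which is why the conditional probability tables in the output below report a mean and a standard deviation for each predictor within each class.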
library(e1071)
nb.fit <- naiveBayes(default ~ balance + income, data=Credit)
nb.fit
##
## Naive Bayes Classifier for Discrete Predictors
##
## Call:
## naiveBayes.default(x = X, y = Y, laplace = laplace)
##
## A-priori probabilities:
## Y
## No Yes
## 0.9667 0.0333
##
## Conditional probabilities:
## balance
## Y [,1] [,2]
## No 803.9438 456.4762
## Yes 1747.8217 341.2668
##
## income
## Y [,1] [,2]
## No 33566.17 13318.25
## Yes 32089.15 13804.22
# The predict() function needs the type="raw" argument to return the class probabilities
nb.pred <- predict(nb.fit, Credit, type="raw")
nb.prob <- nb.pred[,2]
nb.class <- ifelse(nb.prob > 0.3, "Yes", "No")
table(nb.class, Credit$default)
##
## nb.class No Yes
## No 9493 162
## Yes 174 171
mean(nb.class == Credit$default)
## [1] 0.9664