Classification_Naive Bayes

Types of algorithms used in classification

Naive Bayes
Logistic Regression
K-Nearest Neighbours (KNN)
Natural Language Processing (NLP)
Decision Tree
Random Forest
Support Vector Machine (SVM)
Stochastic Gradient Descent

Requirements of Naive Bayes

If all input features are categorical, Naive Bayes is recommended
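Features are assumed to be independent of each other given the class
For this classification (binary dependent variable), we have to check the class balance of the data. When dividing the data, use the function “createDataPartition” from the caret package: it draws a stratified sample, so the training and testing sets keep roughly the same class proportions as the full dataset.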
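As a quick illustration of the balance check, the sketch below counts the classes of the outcome in R's built-in Titanic table (the same data used in the example further down). The variable names `class_counts` and `class_share` are chosen here for illustration only.

```r
# Built-in frequency table of Titanic passengers, as a data frame
titanic_df <- as.data.frame(Titanic)

# Sum the frequencies for each class of the outcome
class_counts <- xtabs(Freq ~ Survived, data = titanic_df)
print(class_counts)

# Share of each class; shares far from 0.5 indicate an imbalanced outcome
class_share <- prop.table(class_counts)
print(round(class_share, 3))
```

Here roughly two thirds of the passengers fall in the "No" class, so the outcome is moderately imbalanced and a stratified split is a sensible precaution.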

Packages used for Naive Bayes classifier:

#install.packages("e1071")
library(e1071) #naiveBayes
#install.packages("caret")
library(caret) #createDataPartition

Example of dividing into 70/30 for training and testing

#trainIndex = createDataPartition(mydata$program, p = 0.7)$Resample1 #Stratified sample of 70% of the row indices
#train = mydata[trainIndex, ]
#test = mydata[-trainIndex, ]
#print(table(mydata$program)) #Check balance before division
#print(table(train$program)) #Check balance after division. The class proportions should follow the parent dataset because the split is stratified
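The snippet above is commented out because `mydata` is not defined in this document. As a runnable alternative, here is a minimal sketch of the same idea in base R, using the built-in Titanic data. It is not caret's implementation: it simply samples 70% of the rows within each class. The seed value and the names `passengers`, `rows_by_class` and `train_index` are assumptions made for this illustration.

```r
# Rebuild one row per passenger from the built-in Titanic frequency table
titanic_df <- as.data.frame(Titanic)
passengers <- titanic_df[rep(seq_len(nrow(titanic_df)), titanic_df$Freq), 1:4]

set.seed(42)  # arbitrary seed, only so the split is reproducible

# Sample 70% of the row indices separately within each class of the outcome
rows_by_class <- split(seq_len(nrow(passengers)), passengers$Survived)
train_index <- unlist(lapply(rows_by_class, function(idx) {
  sample(idx, size = round(0.7 * length(idx)))
}))

train <- passengers[train_index, ]
test  <- passengers[-train_index, ]

# The class proportions should be nearly identical in all three tables
print(prop.table(table(passengers$Survived)))
print(prop.table(table(train$Survived)))
print(prop.table(table(test$Survived)))
```

Sampling within each class is what keeps the proportions aligned; a plain random sample of rows would only match them approximately.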

Example:

data("Titanic")

Titanic_df = as.data.frame(Titanic)

###Creating data from table (normally, we do not have to do this)
repeating_sequence=rep.int(seq_len(nrow(Titanic_df)), Titanic_df$Freq) #This will repeat each combination equal to the frequency of each combination

Titanic_dataset=Titanic_df[repeating_sequence,] #Create the dataset by row repetition created

Titanic_dataset$Freq=NULL #We no longer need the frequency, drop the feature

Fit the model

Naive_Bayes_Model <- naiveBayes(Survived ~., data = Titanic_dataset)
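To make the independence assumption concrete, the following sketch computes a Naive Bayes posterior by hand in base R: the class prior multiplied by one conditional probability per feature, then normalised. It is meant as an illustration of the calculation, and should correspond to what naiveBayes estimates for categorical features when no Laplace smoothing is applied. The passenger in `new_passenger` is hypothetical, and the data is rebuilt inside the block so that it runs on its own.

```r
# Rebuild one row per passenger from the built-in Titanic frequency table
titanic_df <- as.data.frame(Titanic)
passengers <- titanic_df[rep(seq_len(nrow(titanic_df)), titanic_df$Freq), 1:4]

# Prior: share of each class of the outcome
prior <- prop.table(table(passengers$Survived))

# P(feature value | class) for each predictor; rows are classes and sum to 1
cond <- lapply(passengers[c("Class", "Sex", "Age")], function(feature) {
  prop.table(table(passengers$Survived, feature), margin = 1)
})

# Hypothetical passenger to score
new_passenger <- c(Class = "1st", Sex = "Female", Age = "Adult")

# Multiply the prior by one conditional probability per feature
score <- as.numeric(prior)
names(score) <- names(prior)
for (feature in names(new_passenger)) {
  score <- score * cond[[feature]][names(score), new_passenger[[feature]]]
}

# Normalise so that the class scores sum to 1
posterior <- score / sum(score)
print(round(posterior, 3))
```

The predicted class is the one with the larger posterior.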

Use the fitted model (Naive_Bayes_Model) to predict on the same dataset it was trained on

NB_Predictions=predict(Naive_Bayes_Model,Titanic_dataset)

###Use Confusion matrix to check accuracy
table(NB_Predictions,Titanic_dataset$Survived)

##               
## NB_Predictions   No  Yes
##            No  1364  362
##            Yes  126  349
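The confusion matrix can be turned into summary figures. The sketch below re-enters the four counts printed above (rows are predictions, columns are actual classes) and computes the overall accuracy and the share of each actual class that was predicted correctly. The names `conf_mat`, `accuracy` and `recall_by_class` are chosen here for illustration.

```r
# Counts copied from the confusion matrix above; matrix() fills column by column
conf_mat <- matrix(c(1364, 126, 362, 349), nrow = 2,
                   dimnames = list(Predicted = c("No", "Yes"),
                                   Actual = c("No", "Yes")))

# Accuracy: correctly classified cases (the diagonal) over all cases
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(round(accuracy, 4))

# Share of each actual class that was predicted correctly
recall_by_class <- diag(conf_mat) / colSums(conf_mat)
print(round(recall_by_class, 4))
```

About 78% of the cases are classified correctly, but only about half of the actual survivors are identified. Because the model is evaluated on the data it was trained on, these figures are likely to be optimistic compared with a held-out test set.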

Reference

https://rpubs.com/riazakhan94/naive_bayes_classifier_e1071

https://www.r-bloggers.com/understanding-naive-bayes-classifier-using-r/

Ly Tran
