Types of algorithm used in classification

Requirements of Naive Bayes

Packages used for Naive Bayes classifier:

#install.packages("e1071")
library(e1071)

Example of dividing the data 70/30 into training and testing sets

#library(caret) #createDataPartition() comes from the caret package
#trainIndex = createDataPartition(mydata$program, p = 0.7)$Resample1
#train = mydata[trainIndex, ]
#test = mydata[-trainIndex,]
#print(table(mydata$program)) #Check class balance before the split
#print(table(train$program)) #Check class balance after the split. Proportions should follow the parent dataset because createDataPartition samples within each class
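
The commented lines above depend on a `mydata` object that is not shown here. As a minimal, self-contained sketch of the same 70/30 idea, the code below uses the built-in `iris` data as a stand-in (an assumption, not the original dataset) and base R instead of caret. It samples 70% of the rows within each class, so the class balance of the training set follows the parent dataset.

```r
set.seed(42)  # make the random split reproducible

# Group row numbers by class (iris$Species stands in for mydata$program)
idx_by_class <- split(seq_len(nrow(iris)), iris$Species)

# Sample 70% of the row numbers inside each class
train_idx <- unlist(lapply(idx_by_class, function(idx) {
  sample(idx, size = round(0.7 * length(idx)))
}))

train <- iris[train_idx, ]   # 70% of each class
test  <- iris[-train_idx, ]  # remaining 30%

print(table(iris$Species))   # class balance before the split
print(table(train$Species))  # class balance after the split
```

With real data, `createDataPartition()` from caret does this within-class sampling in one call.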

Example:

data("Titanic")

Titanic_df = as.data.frame(Titanic)

###Expanding the frequency table into one row per passenger (normally the data already comes in this form, so this step is not needed)
repeating_sequence=rep.int(seq_len(nrow(Titanic_df)), Titanic_df$Freq) #Repeat each row index as many times as its Freq count

Titanic_dataset=Titanic_df[repeating_sequence,] #Build the dataset by repeating each row according to that index

Titanic_dataset$Freq=NULL #We no longer need the frequency, drop the feature

Fit the model

Naive_Bayes_Model <- naiveBayes(Survived ~., data = Titanic_dataset)
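
For categorical predictors, fitting a Naive Bayes model comes down to estimating the class priors and one conditional probability table per feature. The following is a rough base R sketch of those quantities, computed directly from the frequency table. It is an illustration, not e1071's internal code; the names `prior` and `cond_sex` are made up for this example, and no Laplace smoothing is applied.

```r
data("Titanic")
Titanic_df <- as.data.frame(Titanic)

# Prior P(Survived): passenger counts per class divided by the total
prior <- xtabs(Freq ~ Survived, data = Titanic_df) / sum(Titanic_df$Freq)

# Conditional P(Sex | Survived): each row (survival class) sums to 1
cond_sex <- prop.table(xtabs(Freq ~ Survived + Sex, data = Titanic_df), margin = 1)

print(prior)
print(cond_sex)
```

Printing the fitted `Naive_Bayes_Model` shows the corresponding a-priori and conditional probability tables for every predictor.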

Use the fitted model (Naive_Bayes_Model) to predict on the same dataset it was trained on

NB_Predictions=predict(Naive_Bayes_Model,Titanic_dataset)

###Use a confusion matrix to check accuracy
table(NB_Predictions,Titanic_dataset$Survived)
##               
## NB_Predictions   No  Yes
##            No  1364  362
##            Yes  126  349
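
The confusion matrix can be turned into a single accuracy figure by dividing the correct predictions (the diagonal) by the total number of passengers. The sketch below re-enters the counts printed above by hand so that it runs on its own; the name `conf_mat` is illustrative.

```r
# Counts copied from the confusion matrix above (rows = predicted, columns = actual)
conf_mat <- matrix(c(1364, 126, 362, 349), nrow = 2,
                   dimnames = list(Predicted = c("No", "Yes"),
                                   Actual = c("No", "Yes")))

# Accuracy = correct predictions (diagonal) / all predictions
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(round(accuracy, 4))  # 0.7783, i.e. 1713 of 2201 passengers
```

Because the predictions were made on the training data itself, this figure is likely to be optimistic compared with accuracy on a held-out test set.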