Caio Miyashiro
22.09.2017
An overview of an analysis of the Titanic dataset.
This is a well-known dataset from Kaggle containing information about the passengers, along with a target variable indicating whether each passenger survived the tragedy.
Objectives
names(titanicDataset)
[1] "PassengerId" "Survived" "Pclass" "Name" "Sex"
[6] "Age" "SibSp" "Parch" "Ticket" "Fare"
[11] "Cabin" "Embarked"
The models achieved ~79% accuracy on the test set. Naive Bayes was identified as the best model for predicting whether a passenger survived, given their attributes.
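To make the Naive Bayes and accuracy figures more concrete, here is a minimal base-R sketch of a categorical Naive Bayes classifier and a test-set accuracy computation. It is not the code from the full analysis: the summary does not say which package or features were used, so the helper names `fit_nb` and `predict_nb` are hypothetical, and the rows below are made-up toy data that only mimic the `Sex`, `Pclass` and `Survived` columns.

```r
# Toy, made-up rows (NOT taken from the Kaggle data); only the column names
# Sex, Pclass and Survived come from the dataset.
train <- data.frame(
  Sex      = c("female", "female", "female", "male", "male", "male", "male", "female"),
  Pclass   = c("1", "2", "3", "3", "3", "1", "2", "1"),
  Survived = c(1, 1, 0, 0, 0, 1, 0, 1),
  stringsAsFactors = FALSE
)
test <- data.frame(
  Sex      = c("female", "male", "male", "female"),
  Pclass   = c("1", "3", "2", "3"),
  Survived = c(1, 0, 0, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: estimate class priors and, for each categorical
# feature, P(value | class) with add-one (Laplace) smoothing.
fit_nb <- function(data, target, features) {
  classes <- sort(unique(data[[target]]))
  priors <- sapply(classes, function(k) mean(data[[target]] == k))
  names(priors) <- as.character(classes)
  likelihoods <- lapply(features, function(f) {
    values <- as.character(data[[f]])
    levels_f <- sort(unique(values))
    tab <- matrix(0, nrow = length(levels_f), ncol = length(classes),
                  dimnames = list(levels_f, as.character(classes)))
    for (k in classes) {
      rows <- data[[target]] == k
      counts <- table(factor(values[rows], levels = levels_f))
      tab[, as.character(k)] <- (as.vector(counts) + 1) / (sum(rows) + length(levels_f))
    }
    tab
  })
  names(likelihoods) <- features
  list(classes = classes, priors = priors, likelihoods = likelihoods)
}

# Hypothetical helper: pick the class with the highest log-posterior.
# Assumes every feature value in newdata was also seen during fitting.
predict_nb <- function(model, newdata) {
  predict_row <- function(i) {
    log_post <- log(model$priors)
    for (f in names(model$likelihoods)) {
      value <- as.character(newdata[[f]][i])
      log_post <- log_post + log(model$likelihoods[[f]][value, ])
    }
    model$classes[which.max(log_post)]
  }
  sapply(seq_len(nrow(newdata)), predict_row)
}

model <- fit_nb(train, "Survived", c("Sex", "Pclass"))
predicted <- predict_nb(model, test)

# Accuracy: share of test rows whose predicted label matches the true label.
accuracy <- mean(predicted == test$Survived)
print(accuracy)
```

On the real data, a packaged implementation would likely be a more practical choice than this hand-rolled version, and continuous columns such as `Age` and `Fare` would need either binning or a Gaussian likelihood.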
Complete data analysis can be found here!