Titanic Survival Prediction

Fernando Melo
November 19, 2017

Introduction

This presentation explains what a user needs to know to get started with the Shiny application developed for the final project of the Data Product course.

The Titanic Prediction application trains a decision tree model on the Kaggle train.csv dataset and predicts whether a new passenger would survive, based on the selections made by the application user.

Instructions

To get the Titanic Prediction for a new passenger:

1- Select the passenger ticket class (1, 2 or 3).

2- Select the passenger sex (male/female).

3- Select the passenger age using the slider (1 to 80).

The application will display the user selections and the prediction.

Prediction = 0: No, the passenger did not survive.

Prediction = 1: Yes, the passenger survived.
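
The three inputs listed above could be declared in a Shiny UI roughly as follows. This is only a sketch, not the application's actual source: the input IDs (pclass, sex, age), the output IDs, the widget types and the slider's starting value are all assumptions. Only the choices and the 1 to 80 age range come from the instructions.

```r
library(shiny)

# Hypothetical UI layout; IDs and widget types are illustrative assumptions.
ui <- fluidPage(
  titlePanel("Titanic Survival Prediction"),
  sidebarLayout(
    sidebarPanel(
      # 1- Ticket class: 1, 2 or 3
      selectInput("pclass", "Ticket class", choices = c(1, 2, 3)),
      # 2- Sex: male or female
      radioButtons("sex", "Sex", choices = c("male", "female")),
      # 3- Age slider from 1 to 80 (the starting value of 30 is arbitrary)
      sliderInput("age", "Age", min = 1, max = 80, value = 30)
    ),
    mainPanel(
      # Echo of the user selections, then the 0/1 prediction
      verbatimTextOutput("selections"),
      verbatimTextOutput("prediction")
    )
  )
)
```

A matching server function would read input$pclass, input$sex and input$age and fill the two text outputs.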

Server calculations

This is the code for the server calculations:

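library(rpart)  # recursive partitioning (decision trees)
# Load the Kaggle training data
titanicTrain <- read.csv("train.csv", header = TRUE)
# Treat ticket class as a categorical variable
titanicTrain$Pclass <- as.factor(titanicTrain$Pclass)
# Fit a classification tree on class, sex and age
model2 <- rpart(Survived ~ Pclass + Sex + Age, data = titanicTrain, method = "class")
# Print the fitted tree
model2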
n= 891 

node), split, n, loss, yval, (yprob)
      * denotes terminal node

 1) root 891 342 0 (0.61616162 0.38383838)  
   2) Sex=male 577 109 0 (0.81109185 0.18890815)  
     4) Age>=6.5 553  93 0 (0.83182640 0.16817360) *
     5) Age< 6.5 24   8 1 (0.33333333 0.66666667) *
   3) Sex=female 314  81 1 (0.25796178 0.74203822)  
     6) Pclass=3 144  72 0 (0.50000000 0.50000000)  
      12) Age>=38.5 12   1 0 (0.91666667 0.08333333) *
      13) Age< 38.5 132  61 1 (0.46212121 0.53787879)  
        26) Age>=5.5 117  57 1 (0.48717949 0.51282051)  
          52) Age< 12 8   0 0 (1.00000000 0.00000000) *
          53) Age>=12 109  49 1 (0.44954128 0.55045872) *
        27) Age< 5.5 15   4 1 (0.26666667 0.73333333) *
     7) Pclass=1,2 170   9 1 (0.05294118 0.94705882) *
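
The printed tree can be read as a set of nested rules, where each terminal node (marked with *) gives the predicted class. As an illustration, the sketch below hand-codes those rules in a hypothetical helper, predict_survival(), which is not part of the application. It only mirrors the splits shown above, so it needs neither train.csv nor rpart.

```r
# Hand-coded version of the printed tree (illustrative helper, not the app's code).
# pclass: 1, 2 or 3; sex: "male" or "female"; age: numeric, in years.
predict_survival <- function(pclass, sex, age) {
  if (sex == "male") {
    # Nodes 4 and 5: only boys younger than 6.5 are predicted to survive
    return(if (age >= 6.5) 0 else 1)
  }
  # From here on the passenger is female (node 3)
  if (pclass %in% c(1, 2)) return(1)  # node 7: first or second class
  if (age >= 38.5) return(0)          # node 12: third class, age 38.5 or more
  if (age < 5.5) return(1)            # node 27: third class, younger than 5.5
  if (age < 12) return(0)             # node 52: third class, 5.5 up to 12
  1                                   # node 53: third class, 12 up to 38.5
}

predict_survival(3, "male", 30)    # 0
predict_survival(1, "female", 50)  # 1
predict_survival(3, "female", 8)   # 0
```

With the fitted model available, the usual route is predict(model2, newdata, type = "class"), where newdata is a data frame with Pclass, Sex and Age columns encoded the same way as in the training data.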

Histogram: Titanic passengers by age

[Figure: histogram of Titanic passengers by age]
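
A histogram of this kind could be drawn with base R's hist(). The code chunk behind the original plot is not shown, so the sketch below is an assumption: the helper name plot_age_histogram(), the 5-year bin width, the axis labels and the colour are all illustrative choices.

```r
# Illustrative helper: histogram of passenger ages in fixed-width bins.
# ages: numeric vector of non-negative ages; missing values are dropped.
plot_age_histogram <- function(ages, bin_width = 5) {
  ages <- ages[!is.na(ages)]
  # Breaks from 0 up to the first multiple of bin_width at or above the maximum age
  breaks <- seq(0, ceiling(max(ages) / bin_width) * bin_width, by = bin_width)
  hist(ages,
       breaks = breaks,
       main = "Titanic passengers by age",
       xlab = "Age (years)",
       ylab = "Number of passengers",
       col = "lightblue")
}

# With the training data loaded as above:
# plot_age_histogram(titanicTrain$Age)
```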