In this data set we will primarily deal with missing values. If the amount of missing data is small relative to the size of the data set, then leaving out the rows with missing values might be the simplest strategy. However, discarding available data rarely gives the best information, so we should look for fixes before leaving out potentially useful data points.
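Before choosing between dropping and filling, it helps to see how much is actually missing. The following is a minimal sketch on a made-up data frame (the name `toy` and its values are illustrative assumptions, not the Titanic data) that counts missing values per column and expresses them as a share of the rows.

```r
# Toy data frame with hypothetical values (not the Titanic data)
toy <- data.frame(
  age      = c(29, NA, 2, 30, NA),
  embarked = c("S", "C", NA, "S", "S"),
  fare     = c(211, 152, 152, 152, 152)
)

# Number of missing values in each column
na_count <- colSums(is.na(toy))

# Share of rows that are missing in each column
na_share <- colMeans(is.na(toy))

print(na_count)
print(round(na_share, 2))
```

A column with only a small share of missing values is a candidate for simple fixes, while a column that is mostly missing may need a different treatment.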

This data set is from Kaggle and is described in detail on its web page. The following is the description of the data set.

VARIABLE DESCRIPTIONS:

survival    Survival (0 = No; 1 = Yes)
pclass      Passenger Class (1 = 1st; 2 = 2nd; 3 = 3rd)
name        Name
sex         Sex
age         Age
sibsp       Number of Siblings/Spouses Aboard
parch       Number of Parents/Children Aboard
ticket      Ticket Number
fare        Passenger Fare
cabin       Cabin
embarked    Port of Embarkation (C = Cherbourg; Q = Queenstown; S = Southampton)

Attaching libraries

library(data.table)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:data.table':
## 
##     between, last
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(tidyr)
library(readr)

0: Load the data in RStudio

Save the data set as a CSV file called titanic_original.csv and load it in RStudio into a data frame

Read the data

titanic_original <- read.csv("C:/Users/6430/Desktop/Project/titanic_original.csv", header = TRUE)

df <- read.csv("C:/Users/6430/Desktop/Project/titanic_original.csv", na.strings = c("", "na"))  # treat empty strings and "na" as missing (NA) while reading


str(df)
## 'data.frame':    1309 obs. of  14 variables:
##  $ pclass   : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ survived : int  1 1 0 0 0 1 1 0 1 0 ...
##  $ name     : Factor w/ 1307 levels "Abbing, Mr. Anthony",..: 22 24 25 26 27 31 46 47 51 55 ...
##  $ sex      : Factor w/ 2 levels "female","male": 1 2 1 2 1 2 1 2 1 2 ...
##  $ age      : num  29 0.917 2 30 25 ...
##  $ sibsp    : int  0 1 1 1 1 0 1 0 2 0 ...
##  $ parch    : int  0 2 2 2 2 0 0 0 0 0 ...
##  $ ticket   : Factor w/ 929 levels "110152","110413",..: 188 50 50 50 50 125 93 16 77 826 ...
##  $ fare     : num  211 152 152 152 152 ...
##  $ cabin    : Factor w/ 186 levels "A10","A11","A14",..: 44 80 80 80 80 150 146 16 62 NA ...
##  $ embarked : Factor w/ 3 levels "C","Q","S": 3 3 3 3 3 3 3 3 3 1 ...
##  $ boat     : Factor w/ 27 levels "1","10","11",..: 12 3 NA NA NA 13 2 NA 27 NA ...
##  $ body     : int  NA NA NA 135 NA NA NA NA NA 22 ...
##  $ home.dest: Factor w/ 369 levels "?Havana, Cuba",..: 309 231 231 231 231 237 162 24 22 229 ...
class(df)
## [1] "data.frame"
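To see what the `na.strings` argument changes, here is a small sketch that reads the same inline CSV text twice. The CSV content and the names `csv_text`, `raw` and `clean` are made up for illustration; only `read.csv` and its `text`, `stringsAsFactors` and `na.strings` arguments come from base R.

```r
# Inline CSV with a blank port, a literal "na" port and a blank age
# (hypothetical rows, not taken from the Titanic file)
csv_text <- "name,embarked,age
A,S,29
B,,40
C,na,
D,C,2"

# Default settings: blanks in a text column stay as empty strings
raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# With na.strings: blanks and "na" become NA in every column
clean <- read.csv(text = csv_text, stringsAsFactors = FALSE,
                  na.strings = c("", "na"))

print(sum(is.na(raw$embarked)))    # missing ports seen with defaults
print(sum(is.na(clean$embarked)))  # missing ports seen with na.strings
```

With the default settings the two problem ports are not reported by `is.na()`, which is why handling them at read time is convenient.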

1: Port of embarkation

The embarked column has some missing values, which are known to correspond to passengers who actually embarked at Southampton. Find the missing values and replace them with S. (Caution: Sometimes a missing value might be read into R as a blank or empty string.)
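If a column has already been loaded and may contain blanks as well as `NA`, both can be caught in one test. This is a sketch on a made-up character vector (the values are assumptions for illustration), using only base R functions.

```r
# Hypothetical port codes: one empty string, one NA, one whitespace-only entry
embarked <- c("S", "", NA, "C", "  ")

# TRUE where the value is NA or contains nothing but whitespace
is_missing <- is.na(embarked) | trimws(embarked) == ""

# Fill every missing entry with "S"
embarked[is_missing] <- "S"

print(which(is_missing))
print(embarked)
```

The `is.na()` check comes first so that `NA` entries give `TRUE` rather than `NA` in the combined condition.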

df[is.na(df$embarked), ]  # only rows 169 and 285 have a missing port
##     pclass survived                                      name    sex age
## 169      1        1                       Icard, Miss. Amelie female  38
## 285      1        1 Stone, Mrs. George Nelson (Martha Evelyn) female  62
##     sibsp parch ticket fare cabin embarked boat body      home.dest
## 169     0     0 113572   80   B28     <NA>    6   NA           <NA>
## 285     0     0 113572   80   B28     <NA>    6   NA Cincinatti, OH
df1 <- df
df1$embarked <- as.character(df1$embarked)  # embarked is a factor, so convert it to character before assigning a string

df1$embarked[is.na(df1$embarked)] <- "S"  # the task asks for upper-case "S" (Southampton)
df2 <- df1
df2[is.na(df2$embarked), ]  # returns zero rows, i.e. the NA values have been replaced by "S" as required
##  [1] pclass    survived  name      sex       age       sibsp     parch    
##  [8] ticket    fare      cabin     embarked  boat      body      home.dest
## <0 rows> (or 0-length row.names)
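Since the column was read as a factor, there is more than one way to fill the gaps. The sketch below, on a made-up factor (the values are assumptions), compares assigning directly into the factor with going through a character vector. Direct assignment works here only because "S" is already one of the levels; assigning a value that is not a level would produce `NA` with a warning.

```r
# Hypothetical factor with two missing ports
embarked <- factor(c("S", "C", NA, "Q", NA))

# Option 1: assign directly, valid because "S" is an existing level
fixed_factor <- embarked
fixed_factor[is.na(fixed_factor)] <- "S"

# Option 2: convert to character, fill, then convert back to a factor
fixed_chr <- as.character(embarked)
fixed_chr[is.na(fixed_chr)] <- "S"
fixed_chr <- factor(fixed_chr)

print(table(fixed_factor))
print(table(fixed_chr))
```

Both options give the same values; the character route is the safer habit when the replacement might not be an existing level.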

2: Age

You’ll notice that a lot of the values in the Age column are missing. While there are many ways to fill these missing values, using the mean or median of the rest of the values is quite common in such cases.

Calculate the mean of the Age column and use that value to populate the missing values

Think about other ways you could have populated the missing values in the age column. Why would you pick any of those over the mean (or not)?
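As a rough illustration of the options, the sketch below fills missing ages on a made-up data frame in three ways: the overall mean, the overall median, and the mean within each passenger class. The data and the names `toy`, `age_mean`, `age_median` and `age_by_class` are assumptions for illustration; the class-wise variant is one possible alternative, not the method required by the exercise.

```r
# Toy data with hypothetical ages (not the Titanic data)
toy <- data.frame(
  pclass = c(1, 1, 2, 2, 3, 3),
  age    = c(40, NA, 30, 34, 20, NA)
)

# 1. Overall mean of the known ages
mean_age <- mean(toy$age, na.rm = TRUE)
age_mean <- ifelse(is.na(toy$age), mean_age, toy$age)

# 2. Overall median, less sensitive to a few very old or very young passengers
median_age <- median(toy$age, na.rm = TRUE)
age_median <- ifelse(is.na(toy$age), median_age, toy$age)

# 3. Mean within each passenger class, using base R's ave()
age_by_class <- ave(toy$age, toy$pclass, FUN = function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
})

print(age_mean)
print(age_median)
print(age_by_class)
```

The overall mean is the simplest choice, but it pulls every filled value towards the centre; a group-wise fill keeps some of the differences between classes at the cost of a little more code.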

df2[is.na(df2$age), ]  # quite a few rows, as suggested above
##      pclass survived
## 16        1        0
## 38        1        1
## 41        1        0
## 47        1        0
## 60        1        1
## 70        1        1
## 71        1        0
## 75        1        0
## 81        1        0
## 107       1        0
## 108       1        1
## 109       1        1
## 119       1        0
## 122       1        1
## 126       1        0
## 135       1        1
## 148       1        0
## 153       1        1
## 158       1        0
## 167       1        0
## 177       1        1
## 180       1        0
## 185       1        0
## 197       1        1
## 205       1        1
## 220       1        1
## 224       1        0
## 236       1        1
## 238       1        0
## 242       1        0
## 255       1        1
## 257       1        1
## 270       1        0
## 278       1        1
## 284       1        0
## 294       1        1
## 298       1        1
## 319       1        0
## 321       1        1
## 364       2        0
## 383       2        0
## 385       2        0
## 411       2        0
## 470       2        1
## 474       2        0
## 478       2        0
## 484       2        1
## 492       2        0
## 496       2        0
## 525       2        1
## 529       2        0
## 532       2        0
## 582       2        0
## 596       2        0
## 598       2        1
## 673       3        0
## 681       3        0
## 682       3        0
## 683       3        0
## 706       3        0
## 707       3        0
## 757       3        0
## 758       3        1
## 768       3        0
## 769       3        0
## 776       3        0
## 790       3        0
## 796       3        0
## 799       3        1
## 801       3        0
## 802       3        0
## 803       3        0
## 805       3        0
## 806       3        1
## 809       3        0
## 813       3        0
## 814       3        0
## 816       3        0
## 817       3        0
## 820       3        1
## 836       3        0
## 843       3        0
## 844       3        0
## 853       3        0
## 855       3        0
## 857       3        1
## 859       3        1
## 866       3        0
## 872       3        0
## 873       3        1
## 875       3        1
## 877       3        0
## 880       3        0
## 883       3        0
## 887       3        1
## 888       3        1
## 901       3        0
## 902       3        0
## 903       3        0
## 904       3        0
## 919       3        0
## 921       3        0
## 922       3        0
## 923       3        1
## 924       3        1
## 927       3        1
## 928       3        0
## 929       3        0
## 930       3        0
## 931       3        0
## 932       3        0
## 941       3        0
## 943       3        0
## 945       3        0
## 946       3        1
## 947       3        0
## 949       3        0
## 955       3        0
## 956       3        0
## 957       3        0
## 958       3        0
## 959       3        0
## 962       3        0
## 963       3        0
## 972       3        0
## 974       3        0
## 977       3        0
## 983       3        0
## 984       3        0
## 985       3        1
## 988       3        0
## 989       3        0
## 990       3        0
## 992       3        1
## 994       3        1
## 995       3        0
## 998       3        1
## 999       3        0
## 1000      3        1
## 1001      3        1
## 1002      3        1
## 1003      3        1
## 1004      3        1
## 1005      3        1
## 1006      3        0
## 1007      3        1
## 1010      3        0
## 1013      3        0
## 1014      3        0
## 1015      3        0
## 1017      3        0
## 1019      3        0
## 1023      3        0
## 1024      3        1
## 1028      3        0
## 1029      3        1
## 1030      3        0
## 1031      3        0
## 1033      3        0
## 1034      3        1
## 1035      3        1
## 1036      3        1
## 1037      3        1
## 1038      3        1
## 1039      3        0
## 1040      3        1
## 1042      3        0
## 1043      3        1
## 1044      3        1
## 1045      3        1
## 1053      3        0
## 1054      3        0
## 1055      3        0
## 1056      3        0
## 1070      3        0
## 1071      3        0
## 1072      3        1
## 1073      3        0
## 1074      3        0
## 1075      3        0
## 1077      3        0
## 1078      3        1
## 1079      3        1
## 1081      3        1
## 1082      3        1
## 1086      3        0
## 1096      3        0
## 1110      3        0
## 1115      3        0
## 1116      3        0
## 1117      3        0
## 1122      3        1
## 1123      3        1
## 1124      3        1
## 1125      3        0
## 1129      3        0
## 1133      3        0
## 1136      3        0
## 1137      3        0
## 1138      3        0
## 1139      3        0
## 1150      3        1
## 1151      3        0
## 1152      3        0
## 1155      3        0
## 1156      3        0
## 1160      3        1
## 1163      3        1
## 1164      3        0
## 1165      3        0
## 1167      3        0
## 1168      3        0
## 1169      3        0
## 1171      3        0
## 1173      3        0
## 1174      3        0
## 1175      3        0
## 1176      3        0
## 1177      3        0
## 1178      3        0
## 1179      3        0
## 1180      3        0
## 1181      3        0
## 1185      3        0
## 1186      3        0
## 1187      3        0
## 1194      3        0
## 1195      3        0
## 1196      3        0
## 1198      3        0
## 1199      3        1
## 1200      3        0
## 1201      3        0
## 1203      3        0
## 1213      3        0
## 1214      3        0
## 1215      3        0
## 1216      3        0
## 1217      3        1
## 1220      3        0
## 1222      3        0
## 1242      3        0
## 1243      3        0
## 1244      3        0
## 1246      3        0
## 1247      3        0
## 1248      3        1
## 1250      3        0
## 1251      3        0
## 1254      3        0
## 1256      3        0
## 1263      3        0
## 1269      3        0
## 1283      3        0
## 1284      3        0
## 1285      3        0
## 1292      3        0
## 1293      3        0
## 1294      3        0
## 1298      3        0
## 1303      3        0
## 1304      3        0
## 1306      3        0
##                                                             name    sex
## 16                                           Baumann, Mr. John D   male
## 38                 Bradley, Mr. George ("George Arthur Brayton")   male
## 41                                     Brewe, Dr. Arthur Jackson   male
## 47                                         Cairns, Mr. Alexander   male
## 60   Cassebeer, Mrs. Henry Arthur Jr (Eleanor Genevieve Fosdick) female
## 70                        Chibnall, Mrs. (Edith Martha Bowerman) female
## 71                         Chisholm, Mr. Roderick Robert Crispin   male
## 75                                   Clifford, Mr. George Quincy   male
## 81                                     Crafton, Mr. John Bertram   male
## 107                                           Farthing, Mr. John   male
## 108                         Flegenheim, Mrs. Alfred (Antoinette) female
## 109                                      Fleming, Miss. Margaret female
## 119                                  Franklin, Mr. Thomas Parham   male
## 122           Frauenthal, Mrs. Henry William (Clara Heinsheimer) female
## 126                                             Fry, Mr. Richard   male
## 135                 Goldenberg, Mrs. Samuel L (Edwiga Grabowska) female
## 148                                    Harrington, Mr. Charles H   male
## 153                                  Hawksford, Mr. Walter James   male
## 158                                  Hilliard, Mr. Herbert Henry   male
## 167                                     Hoyt, Mr. William Fisher   male
## 177                            Kenyon, Mrs. Frederick R (Marion) female
## 180                                           Klaber, Mr. Herman   male
## 185                                            Lewy, Mr. Ervin G   male
## 197                                         Marechal, Mr. Pierre   male
## 205                        Meyer, Mrs. Edgar Joseph (Leila Saks) female
## 220                                    Omont, Mr. Alfred Fernand   male
## 224                                Parr, Mr. William Henry Marsh   male
## 236                          Rheims, Mr. George Alexander Lucien   male
## 238                                          Robbins, Mr. Victor   male
## 242                                        Rood, Mr. Hugh Roscoe   male
## 255                                        Saalfeld, Mr. Adolphe   male
## 257                                       Salomon, Mr. Abraham L   male
## 270                                   Smith, Mr. Richard William   male
## 278               Spencer, Mrs. William Augustus (Marie Eugenie) female
## 284                                        Stewart, Mr. Albert A   male
## 294            Taylor, Mrs. Elmer Zebley (Juliet Cummins Wright) female
## 298                               Thorne, Mrs. Gertrude Maybelle female
## 319                       Williams-Lambert, Mr. Fletcher Fellows   male
## 321                                            Woolner, Mr. Hugh   male
## 364                                        Campbell, Mr. William   male
## 383          Corey, Mrs. Percy C (Mary Phyllis Elizabeth Miller) female
## 385                               Cunningham, Mr. Alfred Fleming   male
## 411                             Frost, Mr. Anthony Wood "Archie"   male
## 470                                          Keane, Miss. Nora A female
## 474                                         Knight, Mr. Robert J   male
## 478                                        Lamb, Mr. John Joseph   male
## 484                                   Leitch, Miss. Jessie Wills female
## 492                                          Malachard, Mr. Noel   male
## 496                            Mangiavacchi, Mr. Serafino Emilio   male
## 525                                   Padro y Manent, Mr. Julian   male
## 529                                  Parkes, Mr. Francis "Frank"   male
## 532                                             Pernot, Mr. Rene   male
## 582                                   Watson, Mr. Ennis Hastings   male
## 596                               Wheeler, Mr. Edwin "Frederick"   male
## 598                                 Williams, Mr. Charles Eugene   male
## 673                                        Betros, Master. Seman   male
## 681                                            Boulos, Mr. Hanna   male
## 682                                Boulos, Mrs. Joseph (Sultana) female
## 683                                           Bourke, Miss. Mary female
## 706                                            Caram, Mr. Joseph   male
## 707                             Caram, Mrs. Joseph (Maria Elias) female
## 757                                    Davison, Mr. Thomas Henry   male
## 758                    Davison, Mrs. Thomas Henry (Mary E Finck) female
## 768                                         Demetri, Mr. Marinko   male
## 769                                           Denkoff, Mr. Mitto   male
## 776                                          Doharr, Mr. Tannous   male
## 790                                              Elias, Mr. Dibo   male
## 796                                      Emir, Mr. Farred Chehab   male
## 799                                            Finoli, Mr. Luigi   male
## 801                                        Fleming, Miss. Honora female
## 802                                             Flynn, Mr. James   male
## 803                                              Flynn, Mr. John   male
## 805                                           Foley, Mr. William   male
## 806                                              Foo, Mr. Choong   male
## 809                                             Ford, Mr. Arthur   male
## 813                                             Fox, Mr. Patrick   male
## 814                       Franklin, Mr. Charles (Charles Fardon)   male
## 816                                           Garfirth, Mr. John   male
## 817                                       Gheorgheff, Mr. Stanio   male
## 820                                     Glynn, Miss. Mary Agatha female
## 836                                            Guest, Mr. Robert   male
## 843                              Hagland, Mr. Ingvald Olai Olsen   male
## 844                         Hagland, Mr. Konrad Mathias Reiersen   male
## 853                                 Harknett, Miss. Alice Phoebe female
## 855                                              Hart, Mr. Henry   male
## 857                                   Healy, Miss. Hanora "Nora" female
## 859                                                Hee, Mr. Ling   male
## 866                                           Henry, Miss. Delia female
## 872                                             Horgan, Mr. John   male
## 873                                  Howard, Miss. May Elizabeth female
## 875                                           Hyman, Mr. Abraham   male
## 877                                             Ilieff, Mr. Ylio   male
## 880                                           Ivanoff, Mr. Kanio   male
## 883                                        Jardin, Mr. Jose Neto   male
## 887                                          Jermyn, Miss. Annie female
## 888                            Johannesen-Bratthammer, Mr. Bernt   male
## 901                    Johnston, Master. William Arthur "Willie"   male
## 902                     Johnston, Miss. Catherine Helen "Carrie" female
## 903                                       Johnston, Mr. Andrew G   male
## 904            Johnston, Mrs. Andrew G (Elizabeth "Lily" Watson) female
## 919                                            Kassem, Mr. Fared   male
## 921                                     Keane, Mr. Andrew "Andy"   male
## 922                                            Keefe, Mr. Arthur   male
## 923                     Kelly, Miss. Anna Katherine "Annie Kate" female
## 924                                            Kelly, Miss. Mary female
## 927                                            Kennedy, Mr. John   male
## 928                                           Khalil, Mr. Betros   male
## 929                    Khalil, Mrs. Betros (Zahie "Maria" Elias) female
## 930                                            Kiernan, Mr. John   male
## 931                                          Kiernan, Mr. Philip   male
## 932                                      Kilgannon, Mr. Thomas J   male
## 941                                          Kraeff, Mr. Theodor   male
## 943                                           Lahoud, Mr. Sarkis   male
## 945                                           Laleff, Mr. Kristo   male
## 946                                                 Lam, Mr. Ali   male
## 947                                                 Lam, Mr. Len   male
## 949                                            Lane, Mr. Patrick   male
## 955                                Lefebre, Master. Henry Forbes   male
## 956                                           Lefebre, Miss. Ida female
## 957                                       Lefebre, Miss. Jeannie female
## 958                                      Lefebre, Miss. Mathilde female
## 959                                Lefebre, Mrs. Frank (Frances) female
## 962                                           Lennon, Miss. Mary female
## 963                                            Lennon, Mr. Denis   male
## 972                                         Linehan, Mr. Michael   male
## 974                                           Lithman, Mr. Simon   male
## 977                                          Lockyer, Mr. Edward   male
## 983                                        Lyntakoff, Mr. Stanko   male
## 984                                   MacKay, Mr. George William   male
## 985                             Madigan, Miss. Margaret "Maggie" female
## 988                                   Mahon, Miss. Bridget Delia female
## 989                                              Mahon, Mr. John   male
## 990                                           Maisner, Mr. Simon   male
## 992                                             Mamee, Mr. Hanna   male
## 994                                     Mannion, Miss. Margareth female
## 995                                      Mardirosian, Mr. Sarkis   male
## 998                                      Masselmani, Mrs. Fatima female
## 999                                         Matinoff, Mr. Nicola   male
## 1000                           McCarthy, Miss. Catherine "Katie" female
## 1001                                McCormack, Mr. Thomas Joseph   male
## 1002                                          McCoy, Miss. Agnes female
## 1003                                         McCoy, Miss. Alicia female
## 1004                                          McCoy, Mr. Bernard   male
## 1005                              McDermott, Miss. Brigdet Delia female
## 1006                                         McEvoy, Mr. Michael   male
## 1007                                        McGovern, Miss. Mary female
## 1010                                         McMahon, Mr. Martin   male
## 1013                                      McNeill, Miss. Bridget female
## 1014                              Meanwell, Miss. (Marion Ogden) female
## 1015                     Meek, Mrs. Thomas (Annie Louise Rowley) female
## 1017                                         Mernagh, Mr. Robert   male
## 1019                                            Miles, Mr. Frank   male
## 1023                                           Mitkoff, Mr. Mito   male
## 1024                           Mockler, Miss. Helen Mary "Ellie" female
## 1028                                  Moore, Mr. Leonard Charles   male
## 1029                                         Moran, Miss. Bertha female
## 1030                                         Moran, Mr. Daniel J   male
## 1031                                            Moran, Mr. James   male
## 1033                                    Morrow, Mr. Thomas Rowan   male
## 1034                                      Moss, Mr. Albert Johan   male
## 1035                                    Moubarek, Master. Gerios   male
## 1036           Moubarek, Master. Halim Gonios ("William George")   male
## 1037            Moubarek, Mrs. George (Omine "Amenia" Alexander) female
## 1038                              Moussa, Mrs. (Mantoura Boulos) female
## 1039                                    Moutal, Mr. Rahamin Haim   male
## 1040                            Mullens, Miss. Katherine "Katie" female
## 1042                                         Murdlin, Mr. Joseph   male
## 1043                              Murphy, Miss. Katherine "Kate" female
## 1044                                 Murphy, Miss. Margaret Jane female
## 1045                                          Murphy, Miss. Nora female
## 1053                                          Nankoff, Mr. Minko   male
## 1054                                           Nasr, Mr. Mustafa   male
## 1055                                      Naughton, Miss. Hannah female
## 1056                                        Nenkoff, Mr. Christo   male
## 1070                                         O'Brien, Mr. Thomas   male
## 1071                                        O'Brien, Mr. Timothy   male
## 1072             O'Brien, Mrs. Thomas (Johanna "Hannah" Godfrey) female
## 1073                                    O'Connell, Mr. Patrick D   male
## 1074                                       O'Connor, Mr. Maurice   male
## 1075                                       O'Connor, Mr. Patrick   male
## 1077                                     O'Donoghue, Ms. Bridget female
## 1078                                   O'Driscoll, Miss. Bridget female
## 1079                               O'Dwyer, Miss. Ellen "Nellie" female
## 1081                                        O'Keefe, Mr. Patrick   male
## 1082                               O'Leary, Miss. Hanora "Norah" female
## 1086                                       Olsen, Mr. Ole Martin   male
## 1096                              O'Sullivan, Miss. Bridget Mary female
## 1110                                         Paulner, Mr. Uscher   male
## 1115                                          Pearce, Mr. Ernest   male
## 1116                                          Pedersen, Mr. Olaf   male
## 1117                                         Peduzzi, Mr. Joseph   male
## 1122                                    Peter, Master. Michael J   male
## 1123                                           Peter, Miss. Anna female
## 1124                      Peter, Mrs. Catherine (Catherine Rizk) female
## 1125                                         Peters, Miss. Katie female
## 1129                            Petroff, Mr. Pastcho ("Pentcho")   male
## 1133                                      Plotcharsky, Mr. Vasil   male
## 1136                                       Radeff, Mr. Alexander   male
## 1137                     Rasmussen, Mrs. (Lena Jacobsen Solvang) female
## 1138                                            Razi, Mr. Raihed   male
## 1139                                      Reed, Mr. James George   male
## 1150                             Riordan, Miss. Johanna "Hannah" female
## 1151                                    Risien, Mr. Samuel Beard   male
## 1152                                  Risien, Mrs. Samuel (Emma) female
## 1155                                    Rogers, Mr. William John   male
## 1156                                  Rommetvedt, Mr. Knud Paust   male
## 1160                                         Roth, Miss. Sarah A female
## 1163                                            Ryan, Mr. Edward   male
## 1164                                           Ryan, Mr. Patrick   male
## 1165                                              Saad, Mr. Amin   male
## 1167                                       Saade, Mr. Jean Nassr   male
## 1168                                        Sadlier, Mr. Matthew   male
## 1169                                         Sadowitz, Mr. Harry   male
## 1171                                  Sage, Master. Thomas Henry   male
## 1173                                             Sage, Miss. Ada female
## 1174                                Sage, Miss. Constance Gladys female
## 1175                           Sage, Miss. Dorothy Edith "Dolly" female
## 1176                                     Sage, Miss. Stella Anna female
## 1177                                    Sage, Mr. Douglas Bullen   male
## 1178                                         Sage, Mr. Frederick   male
## 1179                                    Sage, Mr. George John Jr   male
## 1180                                       Sage, Mr. John George   male
## 1181                              Sage, Mrs. John (Annie Bullen) female
## 1185                                           Samaan, Mr. Elias   male
## 1186                                           Samaan, Mr. Hanna   male
## 1187                                         Samaan, Mr. Youssef   male
## 1194                                          Scanlan, Mr. James   male
## 1195                                          Sdycoff, Mr. Todor   male
## 1196                                    Shaughnessy, Mr. Patrick   male
## 1198                             Shellard, Mr. Frederick William   male
## 1199                                  Shine, Miss. Ellen Natalia female
## 1200                                 Shorney, Mr. Charles Joseph   male
## 1201                                           Simmons, Mr. John   male
## 1203                                         Sirota, Mr. Maurice   male
## 1213                                        Slabenoff, Mr. Petco   male
## 1214                               Slocovski, Mr. Selman Francis   male
## 1215                                         Smiljanic, Mr. Mile   male
## 1216                                           Smith, Mr. Thomas   male
## 1217                                          Smyth, Miss. Julia female
## 1220                                          Spector, Mr. Woolf   male
## 1222                                           Staneff, Mr. Ivan   male
## 1242                                       Thomas, Mr. Charles P   male
## 1243                                            Thomas, Mr. John   male
## 1244                                         Thomas, Mr. Tannous   male
## 1246                             Thomson, Mr. Alexander Morrison   male
## 1247                                  Thorneycroft, Mr. Percival   male
## 1248           Thorneycroft, Mrs. Percival (Florence Kate White) female
## 1250                                            Tobin, Mr. Roger   male
## 1251                                         Todoroff, Mr. Lalio   male
## 1254                                            Torfa, Mr. Assad   male
## 1256                                           Toufik, Mr. Nakli   male
## 1263                         van Billiard, Master. James William   male
## 1269                                 van Melkebeke, Mr. Philemon   male
## 1283                                         Ware, Mr. Frederick   male
## 1284                                 Warren, Mr. Charles William   male
## 1285                                           Webber, Mr. James   male
## 1292                            Willer, Mr. Aaron ("Abi Weller")   male
## 1293                                          Willey, Mr. Edward   male
## 1294                           Williams, Mr. Howard Hugh "Harry"   male
## 1298                                      Wiseman, Mr. Phillippe   male
## 1303                                           Yousif, Mr. Wazli   male
## 1304                                       Yousseff, Mr. Gerious   male
## 1306                                       Zabour, Miss. Thamine female
##      age sibsp parch             ticket     fare cabin embarked  boat body
## 16    NA     0     0           PC 17318  25.9250  <NA>        S  <NA>   NA
## 38    NA     0     0             111427  26.5500  <NA>        S     9   NA
## 41    NA     0     0             112379  39.6000  <NA>        C  <NA>   NA
## 47    NA     0     0             113798  31.0000  <NA>        S  <NA>   NA
## 60    NA     0     0              17770  27.7208  <NA>        C     5   NA
## 70    NA     0     1             113505  55.0000   E33        S     6   NA
## 71    NA     0     0             112051   0.0000  <NA>        S  <NA>   NA
## 75    NA     0     0             110465  52.0000   A14        S  <NA>   NA
## 81    NA     0     0             113791  26.5500  <NA>        S  <NA>   NA
## 107   NA     0     0           PC 17483 221.7792   C95        S  <NA>   NA
## 108   NA     0     0           PC 17598  31.6833  <NA>        S     7   NA
## 109   NA     0     0              17421 110.8833  <NA>        C     4   NA
## 119   NA     0     0             113778  26.5500   D34        S  <NA>   NA
## 122   NA     1     0           PC 17611 133.6500  <NA>        S     5   NA
## 126   NA     0     0             112058   0.0000  B102        S  <NA>   NA
## 135   NA     1     0              17453  89.1042   C92        C     5   NA
## 148   NA     0     0             113796  42.4000  <NA>        S  <NA>   NA
## 153   NA     0     0              16988  30.0000   D45        S     3   NA
## 158   NA     0     0              17463  51.8625   E46        S  <NA>   NA
## 167   NA     0     0           PC 17600  30.6958  <NA>        C    14   NA
## 177   NA     1     0              17464  51.8625   D21        S     8   NA
## 180   NA     0     0             113028  26.5500  C124        S  <NA>   NA
## 185   NA     0     0           PC 17612  27.7208  <NA>        C  <NA>   NA
## 197   NA     0     0              11774  29.7000   C47        C     7   NA
## 205   NA     1     0           PC 17604  82.1708  <NA>        C     6   NA
## 220   NA     0     0         F.C. 12998  25.7417  <NA>        C     7   NA
## 224   NA     0     0             112052   0.0000  <NA>        S  <NA>   NA
## 236   NA     0     0           PC 17607  39.6000  <NA>        S     A   NA
## 238   NA     0     0           PC 17757 227.5250  <NA>        C  <NA>   NA
## 242   NA     0     0             113767  50.0000   A32        S  <NA>   NA
## 255   NA     0     0              19988  30.5000  C106        S     3   NA
## 257   NA     0     0             111163  26.0000  <NA>        S     1   NA
## 270   NA     0     0             113056  26.0000   A19        S  <NA>   NA
## 278   NA     1     0           PC 17569 146.5208   B78        C     6   NA
## 284   NA     0     0           PC 17605  27.7208  <NA>        C  <NA>   NA
## 294   NA     1     0              19996  52.0000  C126        S   5 7   NA
## 298   NA     0     0           PC 17585  79.2000  <NA>        C     D   NA
## 319   NA     0     0             113510  35.0000  C128        S  <NA>   NA
## 321   NA     0     0              19947  35.5000   C52        S     D   NA
## 364   NA     0     0             239853   0.0000  <NA>        S  <NA>   NA
## 383   NA     0     0       F.C.C. 13534  21.0000  <NA>        S  <NA>   NA
## 385   NA     0     0             239853   0.0000  <NA>        S  <NA>   NA
## 411   NA     0     0             239854   0.0000  <NA>        S  <NA>   NA
## 470   NA     0     0             226593  12.3500  E101        Q    10   NA
## 474   NA     0     0             239855   0.0000  <NA>        S  <NA>   NA
## 478   NA     0     0             240261  10.7083  <NA>        Q  <NA>   NA
## 484   NA     0     0             248727  33.0000  <NA>        S    11   NA
## 492   NA     0     0             237735  15.0458     D        C  <NA>   NA
## 496   NA     0     0        SC/A.3 2861  15.5792  <NA>        C  <NA>   NA
## 525   NA     0     0      SC/PARIS 2146  13.8625  <NA>        C     9   NA
## 529   NA     0     0             239853   0.0000  <NA>        S  <NA>   NA
## 532   NA     0     0      SC/PARIS 2131  15.0500  <NA>        C  <NA>   NA
## 582   NA     0     0             239856   0.0000  <NA>        S  <NA>   NA
## 596   NA     0     0      SC/PARIS 2159  12.8750  <NA>        S  <NA>   NA
## 598   NA     0     0             244373  13.0000  <NA>        S    14   NA
## 673   NA     0     0               2622   7.2292  <NA>        C  <NA>   NA
## 681   NA     0     0               2664   7.2250  <NA>        C  <NA>   NA
## 682   NA     0     2               2678  15.2458  <NA>        C  <NA>   NA
## 683   NA     0     2             364848   7.7500  <NA>        Q  <NA>   NA
## 706   NA     1     0               2689  14.4583  <NA>        C  <NA>   NA
## 707   NA     1     0               2689  14.4583  <NA>        C  <NA>   NA
## 757   NA     1     0             386525  16.1000  <NA>        S  <NA>   NA
## 758   NA     1     0             386525  16.1000  <NA>        S    16   NA
## 768   NA     0     0             349238   7.8958  <NA>        S  <NA>   NA
## 769   NA     0     0             349225   7.8958  <NA>        S  <NA>   NA
## 776   NA     0     0               2686   7.2292  <NA>        C  <NA>   NA
## 790   NA     0     0               2674   7.2250  <NA>        C  <NA>   NA
## 796   NA     0     0               2631   7.2250  <NA>        C  <NA>   NA
## 799   NA     0     0 SOTON/O.Q. 3101308   7.0500  <NA>        S    15   NA
## 801   NA     0     0             364859   7.7500  <NA>        Q  <NA>   NA
## 802   NA     0     0             364851   7.7500  <NA>        Q  <NA>   NA
## 803   NA     0     0             368323   6.9500  <NA>        Q  <NA>   NA
## 805   NA     0     0             365235   7.7500  <NA>        Q  <NA>   NA
## 806   NA     0     0               1601  56.4958  <NA>        S    13   NA
## 809   NA     0     0           A/5 1478   8.0500  <NA>        S  <NA>   NA
## 813   NA     0     0             368573   7.7500  <NA>        Q  <NA>   NA
## 814   NA     0     0 SOTON/O.Q. 3101314   7.2500  <NA>        S  <NA>   NA
## 816   NA     0     0             358585  14.5000  <NA>        S  <NA>   NA
## 817   NA     0     0             349254   7.8958  <NA>        C  <NA>   NA
## 820   NA     0     0             335677   7.7500  <NA>        Q    13   NA
## 836   NA     0     0             376563   8.0500  <NA>        S  <NA>   NA
## 843   NA     1     0              65303  19.9667  <NA>        S  <NA>   NA
## 844   NA     1     0              65304  19.9667  <NA>        S  <NA>   NA
## 853   NA     0     0         W./C. 6609   7.5500  <NA>        S  <NA>   NA
## 855   NA     0     0             394140   6.8583  <NA>        Q  <NA>   NA
## 857   NA     0     0             370375   7.7500  <NA>        Q    16   NA
## 859   NA     0     0               1601  56.4958  <NA>        S     C   NA
## 866   NA     0     0             382649   7.7500  <NA>        Q  <NA>   NA
## 872   NA     0     0             370377   7.7500  <NA>        Q  <NA>   NA
## 873   NA     0     0        A. 2. 39186   8.0500  <NA>        S     C   NA
## 875   NA     0     0               3470   7.8875  <NA>        S     C   NA
## 877   NA     0     0             349220   7.8958  <NA>        S  <NA>   NA
## 880   NA     0     0             349201   7.8958  <NA>        S  <NA>   NA
## 883   NA     0     0 SOTON/O.Q. 3101305   7.0500  <NA>        S  <NA>   NA
## 887   NA     0     0              14313   7.7500  <NA>        Q     D   NA
## 888   NA     0     0              65306   8.1125  <NA>        S    13   NA
## 901   NA     1     2         W./C. 6607  23.4500  <NA>        S  <NA>   NA
## 902   NA     1     2         W./C. 6607  23.4500  <NA>        S  <NA>   NA
## 903   NA     1     2         W./C. 6607  23.4500  <NA>        S  <NA>   NA
## 904   NA     1     2         W./C. 6607  23.4500  <NA>        S  <NA>   NA
## 919   NA     0     0               2700   7.2292  <NA>        C  <NA>   NA
## 921   NA     0     0              12460   7.7500  <NA>        Q  <NA>   NA
## 922   NA     0     0             323592   7.2500  <NA>        S     A   NA
## 923   NA     0     0               9234   7.7500  <NA>        Q    16   NA
## 924   NA     0     0              14312   7.7500  <NA>        Q     D   NA
## 927   NA     0     0             368783   7.7500  <NA>        Q  <NA>   NA
## 928   NA     1     0               2660  14.4542  <NA>        C  <NA>   NA
## 929   NA     1     0               2660  14.4542  <NA>        C  <NA>   NA
## 930   NA     1     0             367227   7.7500  <NA>        Q  <NA>   NA
## 931   NA     1     0             367229   7.7500  <NA>        Q  <NA>   NA
## 932   NA     0     0              36865   7.7375  <NA>        Q  <NA>   NA
## 941   NA     0     0             349253   7.8958  <NA>        C  <NA>   NA
## 943   NA     0     0               2624   7.2250  <NA>        C  <NA>   NA
## 945   NA     0     0             349217   7.8958  <NA>        S  <NA>   NA
## 946   NA     0     0               1601  56.4958  <NA>        S     C   NA
## 947   NA     0     0               1601  56.4958  <NA>        S  <NA>   NA
## 949   NA     0     0               7935   7.7500  <NA>        Q  <NA>   NA
## 955   NA     3     1               4133  25.4667  <NA>        S  <NA>   NA
## 956   NA     3     1               4133  25.4667  <NA>        S  <NA>   NA
## 957   NA     3     1               4133  25.4667  <NA>        S  <NA>   NA
## 958   NA     3     1               4133  25.4667  <NA>        S  <NA>   NA
## 959   NA     0     4               4133  25.4667  <NA>        S  <NA>   NA
## 962   NA     1     0             370371  15.5000  <NA>        Q  <NA>   NA
## 963   NA     1     0             370371  15.5000  <NA>        Q  <NA>   NA
## 972   NA     0     0             330971   7.8792  <NA>        Q  <NA>   NA
## 974   NA     0     0      S.O./P.P. 251   7.5500  <NA>        S  <NA>   NA
## 977   NA     0     0               1222   7.8792  <NA>        S  <NA>  153
## 983   NA     0     0             349235   7.8958  <NA>        S  <NA>   NA
## 984   NA     0     0         C.A. 42795   7.5500  <NA>        S  <NA>   NA
## 985   NA     0     0             370370   7.7500  <NA>        Q    15   NA
## 988   NA     0     0             330924   7.8792  <NA>        Q  <NA>   NA
## 989   NA     0     0          AQ/4 3130   7.7500  <NA>        Q  <NA>   NA
## 990   NA     0     0           A/S 2816   8.0500  <NA>        S  <NA>   NA
## 992   NA     0     0               2677   7.2292  <NA>        C    15   NA
## 994   NA     0     0              36866   7.7375  <NA>        Q    16   NA
## 995   NA     0     0               2655   7.2292 F E46        C  <NA>   NA
## 998   NA     0     0               2649   7.2250  <NA>        C     C   NA
## 999   NA     0     0             349255   7.8958  <NA>        C  <NA>   NA
## 1000  NA     0     0             383123   7.7500  <NA>        Q 15 16   NA
## 1001  NA     0     0             367228   7.7500  <NA>        Q  <NA>   NA
## 1002  NA     2     0             367226  23.2500  <NA>        Q    16   NA
## 1003  NA     2     0             367226  23.2500  <NA>        Q    16   NA
## 1004  NA     2     0             367226  23.2500  <NA>        Q    16   NA
## 1005  NA     0     0             330932   7.7875  <NA>        Q    13   NA
## 1006  NA     0     0              36568  15.5000  <NA>        Q  <NA>   NA
## 1007  NA     0     0             330931   7.8792  <NA>        Q    13   NA
## 1010  NA     0     0             370372   7.7500  <NA>        Q  <NA>   NA
## 1013  NA     0     0             370368   7.7500  <NA>        Q  <NA>   NA
## 1014  NA     0     0  SOTON/O.Q. 392087   8.0500  <NA>        S  <NA>   NA
## 1015  NA     0     0             343095   8.0500  <NA>        S  <NA>   NA
## 1017  NA     0     0             368703   7.7500  <NA>        Q  <NA>   NA
## 1019  NA     0     0             359306   8.0500  <NA>        S  <NA>   NA
## 1023  NA     0     0             349221   7.8958  <NA>        S  <NA>   NA
## 1024  NA     0     0             330980   7.8792  <NA>        Q    16   NA
## 1028  NA     0     0          A4. 54510   8.0500  <NA>        S  <NA>   NA
## 1029  NA     1     0             371110  24.1500  <NA>        Q    16   NA
## 1030  NA     1     0             371110  24.1500  <NA>        Q  <NA>   NA
## 1031  NA     0     0             330877   8.4583  <NA>        Q  <NA>   NA
## 1033  NA     0     0             372622   7.7500  <NA>        Q  <NA>   NA
## 1034  NA     0     0             312991   7.7750  <NA>        S     B   NA
## 1035  NA     1     1               2661  15.2458  <NA>        C     C   NA
## 1036  NA     1     1               2661  15.2458  <NA>        C     C   NA
## 1037  NA     0     2               2661  15.2458  <NA>        C     C   NA
## 1038  NA     0     0               2626   7.2292  <NA>        C  <NA>   NA
## 1039  NA     0     0             374746   8.0500  <NA>        S  <NA>   NA
## 1040  NA     0     0              35852   7.7333  <NA>        Q    16   NA
## 1042  NA     0     0         A./5. 3235   8.0500  <NA>        S  <NA>   NA
## 1043  NA     1     0             367230  15.5000  <NA>        Q    16   NA
## 1044  NA     1     0             367230  15.5000  <NA>        Q    16   NA
## 1045  NA     0     0              36568  15.5000  <NA>        Q    16   NA
## 1053  NA     0     0             349218   7.8958  <NA>        S  <NA>   NA
## 1054  NA     0     0               2652   7.2292  <NA>        C  <NA>   NA
## 1055  NA     0     0             365237   7.7500  <NA>        Q  <NA>   NA
## 1056  NA     0     0             349234   7.8958  <NA>        S  <NA>   NA
## 1070  NA     1     0             370365  15.5000  <NA>        Q  <NA>   NA
## 1071  NA     0     0             330979   7.8292  <NA>        Q  <NA>   NA
## 1072  NA     1     0             370365  15.5000  <NA>        Q  <NA>   NA
## 1073  NA     0     0             334912   7.7333  <NA>        Q  <NA>   NA
## 1074  NA     0     0             371060   7.7500  <NA>        Q  <NA>   NA
## 1075  NA     0     0             366713   7.7500  <NA>        Q  <NA>   NA
## 1077  NA     0     0             364856   7.7500  <NA>        Q  <NA>   NA
## 1078  NA     0     0              14311   7.7500  <NA>        Q     D   NA
## 1079  NA     0     0             330959   7.8792  <NA>        Q  <NA>   NA
## 1081  NA     0     0             368402   7.7500  <NA>        Q     B   NA
## 1082  NA     0     0             330919   7.8292  <NA>        Q    13   NA
## 1086  NA     0     0          Fa 265302   7.3125  <NA>        S  <NA>   NA
## 1096  NA     0     0             330909   7.6292  <NA>        Q  <NA>   NA
## 1110  NA     0     0               3411   8.7125  <NA>        C  <NA>   NA
## 1115  NA     0     0             343271   7.0000  <NA>        S  <NA>   NA
## 1116  NA     0     0             345498   7.7750  <NA>        S  <NA>   NA
## 1117  NA     0     0           A/5 2817   8.0500  <NA>        S  <NA>   NA
## 1122  NA     1     1               2668  22.3583  <NA>        C     C   NA
## 1123  NA     1     1               2668  22.3583 F E69        C     D   NA
## 1124  NA     0     2               2668  22.3583  <NA>        C     D   NA
## 1125  NA     0     0             330935   8.1375  <NA>        Q  <NA>   NA
## 1129  NA     0     0             349215   7.8958  <NA>        S  <NA>   NA
## 1133  NA     0     0             349227   7.8958  <NA>        S  <NA>   NA
## 1136  NA     0     0             349223   7.8958  <NA>        S  <NA>   NA
## 1137  NA     0     0              65305   8.1125  <NA>        S  <NA>   NA
## 1138  NA     0     0               2629   7.2292  <NA>        C  <NA>   NA
## 1139  NA     0     0             362316   7.2500  <NA>        S  <NA>   NA
## 1150  NA     0     0             334915   7.7208  <NA>        Q    13   NA
## 1151  NA     0     0             364498  14.5000  <NA>        S  <NA>   NA
## 1152  NA     0     0             364498  14.5000  <NA>        S  <NA>   NA
## 1155  NA     0     0    S.C./A.4. 23567   8.0500  <NA>        S  <NA>   NA
## 1156  NA     0     0             312993   7.7750  <NA>        S  <NA>   NA
## 1160  NA     0     0             342712   8.0500  <NA>        S     C   NA
## 1163  NA     0     0             383162   7.7500  <NA>        Q    14   NA
## 1164  NA     0     0             371110  24.1500  <NA>        Q  <NA>   NA
## 1165  NA     0     0               2671   7.2292  <NA>        C  <NA>   NA
## 1167  NA     0     0               2676   7.2250  <NA>        C  <NA>   NA
## 1168  NA     0     0             367655   7.7292  <NA>        Q  <NA>   NA
## 1169  NA     0     0            LP 1588   7.5750  <NA>        S  <NA>   NA
## 1171  NA     8     2           CA. 2343  69.5500  <NA>        S  <NA>   NA
## 1173  NA     8     2           CA. 2343  69.5500  <NA>        S  <NA>   NA
## 1174  NA     8     2           CA. 2343  69.5500  <NA>        S  <NA>   NA
## 1175  NA     8     2           CA. 2343  69.5500  <NA>        S  <NA>   NA
## 1176  NA     8     2           CA. 2343  69.5500  <NA>        S  <NA>   NA
## 1177  NA     8     2           CA. 2343  69.5500  <NA>        S  <NA>   NA
## 1178  NA     8     2           CA. 2343  69.5500  <NA>        S  <NA>   NA
## 1179  NA     8     2           CA. 2343  69.5500  <NA>        S  <NA>   NA
## 1180  NA     1     9           CA. 2343  69.5500  <NA>        S  <NA>   NA
## 1181  NA     1     9           CA. 2343  69.5500  <NA>        S  <NA>   NA
## 1185  NA     2     0               2662  21.6792  <NA>        C  <NA>   NA
## 1186  NA     2     0               2662  21.6792  <NA>        C  <NA>   NA
## 1187  NA     2     0               2662  21.6792  <NA>        C  <NA>   NA
## 1194  NA     0     0              36209   7.7250  <NA>        Q  <NA>   NA
## 1195  NA     0     0             349222   7.8958  <NA>        S  <NA>   NA
## 1196  NA     0     0             370374   7.7500  <NA>        Q  <NA>   NA
## 1198  NA     0     0          C.A. 6212  15.1000  <NA>        S  <NA>   NA
## 1199  NA     0     0             330968   7.7792  <NA>        Q  <NA>   NA
## 1200  NA     0     0             374910   8.0500  <NA>        S  <NA>   NA
## 1201  NA     0     0    SOTON/OQ 392082   8.0500  <NA>        S  <NA>   NA
## 1203  NA     0     0             392092   8.0500  <NA>        S  <NA>   NA
## 1213  NA     0     0             349214   7.8958  <NA>        S  <NA>   NA
## 1214  NA     0     0    SOTON/OQ 392086   8.0500  <NA>        S  <NA>   NA
## 1215  NA     0     0             315037   8.6625  <NA>        S  <NA>   NA
## 1216  NA     0     0             384461   7.7500  <NA>        Q  <NA>   NA
## 1217  NA     0     0             335432   7.7333  <NA>        Q    13   NA
## 1220  NA     0     0          A.5. 3236   8.0500  <NA>        S  <NA>   NA
## 1222  NA     0     0             349208   7.8958  <NA>        S  <NA>   NA
## 1242  NA     1     0               2621   6.4375  <NA>        C  <NA>   NA
## 1243  NA     0     0               2681   6.4375  <NA>        C  <NA>   NA
## 1244  NA     0     0               2684   7.2250  <NA>        C  <NA>   NA
## 1246  NA     0     0              32302   8.0500  <NA>        S  <NA>   NA
## 1247  NA     1     0             376564  16.1000  <NA>        S  <NA>   NA
## 1248  NA     1     0             376564  16.1000  <NA>        S    10   NA
## 1250  NA     0     0             383121   7.7500   F38        Q  <NA>   NA
## 1251  NA     0     0             349216   7.8958  <NA>        S  <NA>   NA
## 1254  NA     0     0               2673   7.2292  <NA>        C  <NA>   NA
## 1256  NA     0     0               2641   7.2292  <NA>        C  <NA>   NA
## 1263  NA     1     1           A/5. 851  14.5000  <NA>        S  <NA>   NA
## 1269  NA     0     0             345777   9.5000  <NA>        S  <NA>   NA
## 1283  NA     0     0             359309   8.0500  <NA>        S  <NA>   NA
## 1284  NA     0     0         C.A. 49867   7.5500  <NA>        S  <NA>   NA
## 1285  NA     0     0   SOTON/OQ 3101316   8.0500  <NA>        S  <NA>   NA
## 1292  NA     0     0               3410   8.7125  <NA>        S  <NA>   NA
## 1293  NA     0     0      S.O./P.P. 751   7.5500  <NA>        S  <NA>   NA
## 1294  NA     0     0           A/5 2466   8.0500  <NA>        S  <NA>   NA
## 1298  NA     0     0         A/4. 34244   7.2500  <NA>        S  <NA>   NA
## 1303  NA     0     0               2647   7.2250  <NA>        C  <NA>   NA
## 1304  NA     0     0               2627  14.4583  <NA>        C  <NA>   NA
## 1306  NA     1     0               2665  14.4542  <NA>        C  <NA>   NA
##                                  home.dest
## 16                            New York, NY
## 38                         Los Angeles, CA
## 41                        Philadelphia, PA
## 47                                    <NA>
## 60                            New York, NY
## 70        St Leonards-on-Sea, England Ohio
## 71            Liverpool, England / Belfast
## 75                           Stoughton, MA
## 81                           Roachdale, IN
## 107                                   <NA>
## 108                           New York, NY
## 109                                   <NA>
## 119                Westcliff-on-Sea, Essex
## 122                           New York, NY
## 126                                   <NA>
## 135           Paris, France / New York, NY
## 148                                   <NA>
## 153                       Kingston, Surrey
## 158                           Brighton, MA
## 167                           New York, NY
## 177                Southington / Noank, CT
## 180                           Portland, OR
## 185                            Chicago, IL
## 197                          Paris, France
## 205                           New York, NY
## 220                          Paris, France
## 224                                Belfast
## 236                  Paris /  New York, NY
## 238                                   <NA>
## 242                            Seattle, WA
## 255                    Manchester, England
## 257                           New York, NY
## 270                      Streatham, Surrey
## 278                          Paris, France
## 284  Gallipolis, Ohio / ? Paris / New York
## 294              London /  East Orange, NJ
## 298                           New York, NY
## 319                        London, England
## 321                        London, England
## 364                                Belfast
## 383      Upper Burma, India Pittsburgh, PA
## 385                                Belfast
## 411                                Belfast
## 470                         Harrisburg, PA
## 474                                Belfast
## 478                                   <NA>
## 484                   London / Chicago, IL
## 492                                  Paris
## 496                           New York, NY
## 525                   Spain / Havana, Cuba
## 529                                Belfast
## 532                                   <NA>
## 582                                Belfast
## 596                                   <NA>
## 598                        Harrow, England
## 673                                   <NA>
## 681                                  Syria
## 682                         Syria Kent, ON
## 683                    Ireland Chicago, IL
## 706                             Ottawa, ON
## 707                             Ottawa, ON
## 757         Liverpool, England Bedford, OH
## 758         Liverpool, England Bedford, OH
## 768                                   <NA>
## 769               Bulgaria Coon Rapids, IA
## 776                                   <NA>
## 790                                   <NA>
## 796                                   <NA>
## 799                 Italy Philadelphia, PA
## 801                                   <NA>
## 802                                   <NA>
## 803                                   <NA>
## 805                                Ireland
## 806                 Hong Kong New York, NY
## 809          Bridgwater, Somerset, England
## 813                   Ireland New York, NY
## 814                                   <NA>
## 816                                   <NA>
## 817                                   <NA>
## 820       Co Clare, Ireland Washington, DC
## 836                                   <NA>
## 843                                   <NA>
## 844                                   <NA>
## 853                                   <NA>
## 855                                   <NA>
## 857                                   <NA>
## 859                                   <NA>
## 866                                   <NA>
## 872                                   <NA>
## 873                                   <NA>
## 875                                   <NA>
## 877                                   <NA>
## 880                                   <NA>
## 883                                   <NA>
## 887                                   <NA>
## 888                                   <NA>
## 901                                   <NA>
## 902                                   <NA>
## 903                                   <NA>
## 904                                   <NA>
## 919                                   <NA>
## 921                                   <NA>
## 922                                   <NA>
## 923                                   <NA>
## 924                                   <NA>
## 927                                   <NA>
## 928                                   <NA>
## 929                                   <NA>
## 930                                   <NA>
## 931                                   <NA>
## 932                                   <NA>
## 941                                   <NA>
## 943                                   <NA>
## 945                                   <NA>
## 946                                   <NA>
## 947                                   <NA>
## 949                                   <NA>
## 955                                   <NA>
## 956                                   <NA>
## 957                                   <NA>
## 958                                   <NA>
## 959                                   <NA>
## 962                                   <NA>
## 963                                   <NA>
## 972                                   <NA>
## 974                                   <NA>
## 977                                   <NA>
## 983                                   <NA>
## 984                                   <NA>
## 985                                   <NA>
## 988                                   <NA>
## 989                                   <NA>
## 990                                   <NA>
## 992                                   <NA>
## 994                                   <NA>
## 995                                   <NA>
## 998                                   <NA>
## 999                                   <NA>
## 1000                                  <NA>
## 1001                                  <NA>
## 1002                                  <NA>
## 1003                                  <NA>
## 1004                                  <NA>
## 1005                                  <NA>
## 1006                                  <NA>
## 1007                                  <NA>
## 1010                                  <NA>
## 1013                                  <NA>
## 1014                                  <NA>
## 1015                                  <NA>
## 1017                                  <NA>
## 1019                                  <NA>
## 1023                                  <NA>
## 1024                                  <NA>
## 1028                                  <NA>
## 1029                                  <NA>
## 1030                                  <NA>
## 1031                                  <NA>
## 1033                                  <NA>
## 1034                                  <NA>
## 1035                                  <NA>
## 1036                                  <NA>
## 1037                                  <NA>
## 1038                                  <NA>
## 1039                                  <NA>
## 1040                                  <NA>
## 1042                                  <NA>
## 1043                                  <NA>
## 1044                                  <NA>
## 1045                                  <NA>
## 1053                                  <NA>
## 1054                                  <NA>
## 1055                                  <NA>
## 1056                                  <NA>
## 1070                                  <NA>
## 1071                                  <NA>
## 1072                                  <NA>
## 1073                                  <NA>
## 1074                                  <NA>
## 1075                                  <NA>
## 1077                                  <NA>
## 1078                                  <NA>
## 1079                                  <NA>
## 1081                                  <NA>
## 1082                                  <NA>
## 1086                                  <NA>
## 1096                                  <NA>
## 1110                                  <NA>
## 1115                                  <NA>
## 1116                                  <NA>
## 1117                                  <NA>
## 1122                                  <NA>
## 1123                                  <NA>
## 1124                                  <NA>
## 1125                                  <NA>
## 1129                                  <NA>
## 1133                                  <NA>
## 1136                                  <NA>
## 1137                                  <NA>
## 1138                                  <NA>
## 1139                                  <NA>
## 1150                                  <NA>
## 1151                                  <NA>
## 1152                                  <NA>
## 1155                                  <NA>
## 1156                                  <NA>
## 1160                                  <NA>
## 1163                                  <NA>
## 1164                                  <NA>
## 1165                                  <NA>
## 1167                                  <NA>
## 1168                                  <NA>
## 1169                                  <NA>
## 1171                                  <NA>
## 1173                                  <NA>
## 1174                                  <NA>
## 1175                                  <NA>
## 1176                                  <NA>
## 1177                                  <NA>
## 1178                                  <NA>
## 1179                                  <NA>
## 1180                                  <NA>
## 1181                                  <NA>
## 1185                                  <NA>
## 1186                                  <NA>
## 1187                                  <NA>
## 1194                                  <NA>
## 1195                                  <NA>
## 1196                                  <NA>
## 1198                                  <NA>
## 1199                                  <NA>
## 1200                                  <NA>
## 1201                                  <NA>
## 1203                                  <NA>
## 1213                                  <NA>
## 1214                                  <NA>
## 1215                                  <NA>
## 1216                                  <NA>
## 1217                                  <NA>
## 1220                                  <NA>
## 1222                                  <NA>
## 1242                                  <NA>
## 1243                                  <NA>
## 1244                                  <NA>
## 1246                                  <NA>
## 1247                                  <NA>
## 1248                                  <NA>
## 1250                                  <NA>
## 1251                                  <NA>
## 1254                                  <NA>
## 1256                                  <NA>
## 1263                                  <NA>
## 1269                                  <NA>
## 1283                                  <NA>
## 1284                                  <NA>
## 1285                                  <NA>
## 1292                                  <NA>
## 1293                                  <NA>
## 1294                                  <NA>
## 1298                                  <NA>
## 1303                                  <NA>
## 1304                                  <NA>
## 1306                                  <NA>
df2$age[which(is.na(df2$age))] <- mean(df2$age,na.rm = TRUE)

df2[is.na(df2$age), ]  # the NA ages have been replaced by the mean of the remaining ages
##  [1] pclass    survived  name      sex       age       sibsp     parch    
##  [8] ticket    fare      cabin     embarked  boat      body      home.dest
## <0 rows> (or 0-length row.names)
## I don't think the overall mean age is the right way to go. Men and women have different age distributions, so imputing the same mean value for every missing age can be quite misleading.
df3<-df2
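
One possible alternative to a single overall mean is to impute each missing age with the mean of the passenger's own group, for example by sex. The following is only a minimal sketch on a small toy data frame (the values and the names toy, group_mean and age_imputed are made up for illustration and are not taken from the Titanic file); it uses base R's ave() to compute the per-group mean.

```r
# Toy data standing in for df2 (hypothetical values, not from the Titanic file)
toy <- data.frame(
  sex = c("female", "female", "female", "male", "male", "male"),
  age = c(20, 30, NA, 40, NA, 50)
)

# ave() applies FUN within each level of sex and returns a vector
# aligned with the original rows
group_mean <- ave(toy$age, toy$sex, FUN = function(x) mean(x, na.rm = TRUE))

# Replace only the missing ages with the mean of their own group
toy$age_imputed <- ifelse(is.na(toy$age), group_mean, toy$age)
toy
```

The same idea extends to other groupings, such as passenger class, by passing more grouping vectors to ave().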

3: Lifeboat

You’re interested in looking at the distribution of passengers in different lifeboats, but as we know, many passengers did not make it to a boat :-( This means that there are a lot of missing values in the boat column. Fill these empty slots with a dummy value e.g. the string ‘None’ or ‘NA’

This was already handled when we first loaded the data: passing na.strings = c("", "na") to read.csv() turned the empty boat entries into NA.
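
If a literal dummy string is wanted instead of NA, a minimal sketch could look like the following. It uses a small made-up factor (`boat`) rather than the real column, and the label "None" is just the example suggested in the task. Because the column is a factor, it is converted to character first; assigning a new string directly to a factor would produce NA with a warning.

```r
# Toy factor standing in for the boat column (values are made up).
boat <- factor(c("2", NA, "11", NA))

# Convert to character so a new label can be assigned freely.
boat_chr <- as.character(boat)
boat_chr[is.na(boat_chr)] <- "None"

# Convert back to a factor; "None" is now an ordinary level.
boat_filled <- factor(boat_chr)
table(boat_filled)
```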

4: Cabin

You notice that many passengers don’t have a cabin number associated with them.

Does it make sense to fill missing cabin numbers with a value?

What does a missing value here mean?

You have a hunch that the fact that the cabin number is missing might be a useful indicator of survival. Create a new column has_cabin_number which has 1 if there is a cabin number, and 0 otherwise.

Cabins are assigned according to the class of ticket bought. If a cabin was close to the lifeboats, its occupants would have had easier and earlier access to them, so their survival rate would likely have been better.

But since so much of the cabin data is missing, it doesn’t make sense to predict cabin numbers from the little data that remains.

A missing value perhaps means that the passenger didn’t have a specific cabin, because tickets were cheap and many people stayed in a big lobby or room that had no specific number, or that the data is simply missing.

df3$cabin <- lapply(df3$cabin, as.character) # note: lapply() stores cabin as a list of strings
df3$has_cabin_number <- ifelse(is.na(df3$cabin), 0, 1) # missing cabins are NA, not the string "NA"
df4 <- df3
str(df4)
## 'data.frame':    1309 obs. of  15 variables:
##  $ pclass          : int  1 1 1 1 1 1 1 1 1 1 ...
##  $ survived        : int  1 1 0 0 0 1 1 0 1 0 ...
##  $ name            : Factor w/ 1307 levels "Abbing, Mr. Anthony",..: 22 24 25 26 27 31 46 47 51 55 ...
##  $ sex             : Factor w/ 2 levels "female","male": 1 2 1 2 1 2 1 2 1 2 ...
##  $ age             : num  29 0.917 2 30 25 ...
##  $ sibsp           : int  0 1 1 1 1 0 1 0 2 0 ...
##  $ parch           : int  0 2 2 2 2 0 0 0 0 0 ...
##  $ ticket          : Factor w/ 929 levels "110152","110413",..: 188 50 50 50 50 125 93 16 77 826 ...
##  $ fare            : num  211 152 152 152 152 ...
##  $ cabin           :List of 1309
##   ..$ : chr "B5"
##   ..$ : chr "C22 C26"
##   ..$ : chr "C22 C26"
##   ..$ : chr "C22 C26"
##   ..$ : chr "C22 C26"
##   ..$ : chr "E12"
##   ..$ : chr "D7"
##   ..$ : chr "A36"
##   ..$ : chr "C101"
##   ..$ : chr NA
##   ..$ : chr "C62 C64"
##   ..$ : chr "C62 C64"
##   ..$ : chr "B35"
##   ..$ : chr NA
##   ..$ : chr "A23"
##   ..$ : chr NA
##   ..$ : chr "B58 B60"
##   ..$ : chr "B58 B60"
##   ..$ : chr "D15"
##   ..$ : chr "C6"
##   ..$ : chr "D35"
##   ..$ : chr "D35"
##   ..$ : chr "C148"
##   ..$ : chr NA
##   ..$ : chr "C97"
##   ..$ : chr NA
##   ..$ : chr "B49"
##   ..$ : chr "B49"
##   ..$ : chr "C99"
##   ..$ : chr "C52"
##   ..$ : chr "T"
##   ..$ : chr "A31"
##   ..$ : chr "C7"
##   ..$ : chr "C103"
##   ..$ : chr "D22"
##   ..$ : chr NA
##   ..$ : chr "E33"
##   ..$ : chr NA
##   ..$ : chr "A21"
##   ..$ : chr "B10"
##   ..$ : chr NA
##   ..$ : chr "B4"
##   ..$ : chr "C101"
##   ..$ : chr "D15"
##   ..$ : chr "E40"
##   ..$ : chr "B38"
##   ..$ : chr NA
##   ..$ : chr "E24"
##   ..$ : chr NA
##   ..$ : chr "B51 B53 B55"
##   ..$ : chr "B51 B53 B55"
##   ..$ : chr "B51 B53 B55"
##   ..$ : chr NA
##   ..$ : chr NA
##   ..$ : chr "B96 B98"
##   ..$ : chr "B96 B98"
##   ..$ : chr "B96 B98"
##   ..$ : chr "B96 B98"
##   ..$ : chr NA
##   ..$ : chr NA
##   ..$ : chr "C46"
##   ..$ : chr "C46"
##   ..$ : chr "E31"
##   ..$ : chr "E31"
##   ..$ : chr "E8"
##   ..$ : chr "E8"
##   ..$ : chr "B61"
##   ..$ : chr "B77"
##   ..$ : chr "A9"
##   ..$ : chr "E33"
##   ..$ : chr NA
##   ..$ : chr "C89"
##   ..$ : chr "C89"
##   ..$ : chr NA
##   ..$ : chr "A14"
##   ..$ : chr "E58"
##   ..$ : chr "E49"
##   ..$ : chr "E52"
##   ..$ : chr "E45"
##   ..$ : chr "C101"
##   ..$ : chr NA
##   ..$ : chr "B22"
##   ..$ : chr "B22"
##   ..$ : chr "B26"
##   ..$ : chr "C85"
##   ..$ : chr "C85"
##   ..$ : chr "E17"
##   ..$ : chr NA
##   ..$ : chr NA
##   ..$ : chr "B71"
##   ..$ : chr "B71"
##   ..$ : chr "B20"
##   ..$ : chr "B20"
##   ..$ : chr "A34"
##   ..$ : chr "A34"
##   ..$ : chr "A34"
##   ..$ : chr "C86"
##   ..$ : chr "B58 B60"
##   ..$ : chr "C86"
##   .. [list output truncated]
##  $ embarked        :List of 1309
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "C"
##   ..$ : chr "C"
##   ..$ : chr "C"
##   ..$ : chr "C"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "C"
##   ..$ : chr "C"
##   ..$ : chr "C"
##   ..$ : chr "C"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "C"
##   ..$ : chr "C"
##   ..$ : chr "S"
##   ..$ : chr "C"
##   ..$ : chr "C"
##   ..$ : chr "C"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "C"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "C"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "C"
##   ..$ : chr "C"
##   ..$ : chr "C"
##   ..$ : chr "S"
##   ..$ : chr "C"
##   ..$ : chr "C"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "C"
##   ..$ : chr "C"
##   ..$ : chr "C"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "C"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "C"
##   ..$ : chr "S"
##   ..$ : chr "C"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "C"
##   ..$ : chr "C"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "C"
##   ..$ : chr "C"
##   ..$ : chr "C"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "C"
##   ..$ : chr "C"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "S"
##   ..$ : chr "C"
##   ..$ : chr "C"
##   ..$ : chr "C"
##   .. [list output truncated]
##  $ boat            : Factor w/ 27 levels "1","10","11",..: 12 3 NA NA NA 13 2 NA 27 NA ...
##  $ body            : int  NA NA NA 135 NA NA NA NA NA 22 ...
##  $ home.dest       : Factor w/ 369 levels "?Havana, Cuba",..: 309 231 231 231 231 237 162 24 22 229 ...
##  $ has_cabin_number: num  1 1 1 1 1 1 1 1 1 0 ...
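
As the str() output shows, cabin ends up stored as a list, which makes the printout very long. A sketch of a vectorized alternative is given below, using a small made-up data frame (`toy`, with hypothetical values) instead of the real data. It keeps cabin as a plain character vector, builds the flag directly from is.na(), and then compares survival rates between the two groups to check the hunch.

```r
# Toy data standing in for the survived and cabin columns (values are made up).
toy <- data.frame(
  survived = c(1, 1, 0, 0, 0, 1),
  cabin = c("B5", "C22 C26", NA, NA, "E12", NA),
  stringsAsFactors = TRUE
)

# as.character() on the whole column keeps it an atomic vector.
toy$cabin <- as.character(toy$cabin)

# 1 if a cabin number is present, 0 if it is missing.
toy$has_cabin_number <- as.integer(!is.na(toy$cabin))

# Mean of survived within each flag value, i.e. the survival rate per group.
rate <- tapply(toy$survived, toy$has_cabin_number, mean)
rate
```

On the real data, the same tapply() call on df4$survived and df4$has_cabin_number would show whether passengers with a recorded cabin survived at a different rate.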