Andreas Schätti
January 30, 2017
There are 177 missing values for Age in the Titanic dataset. They are imputed by randomly sampling, with replacement, from the observed ages, separately for each sex (only the code for females is shown here):
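# Assumed definitions (not shown in the original): logical masks over the rows
is.na.age <- is.na(titanic.data$Age)
is.female <- titanic.data$Sex == "female"

# Number of females with a missing age
n.samples <- sum(is.na.age & is.female)
# Draw that many ages, with replacement, from the observed female ages
sampled.age.female <- sample(
  titanic.data$Age[!is.na.age & is.female],
  n.samples,
  replace = TRUE)
titanic.data$Age[is.na.age & is.female] <- sampled.age.female
# Check: no missing ages remain among females
sum(is.na(titanic.data$Age) & is.female)
[1] 0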
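The same idea can be applied to both sexes in one step. The following is only a minimal sketch on a made-up toy data frame, not the code used for the Titanic data: the helper name `impute.by.sampling`, the `toy` data, and the seed are all hypothetical. It uses base R's `ave()` to run the sampling within each level of `Sex`.

```r
# Toy data standing in for the Titanic data; the values are made up.
set.seed(42)  # arbitrary seed, only to make the random draws reproducible
toy <- data.frame(
  Sex = c("female", "female", "female", "male", "male", "male", "male"),
  Age = c(22, NA, 35, 40, NA, NA, 19)
)

# Hypothetical helper: replace each NA in `values` with a random draw
# (with replacement) from the observed values of the same vector.
impute.by.sampling <- function(values) {
  missing <- is.na(values)
  observed <- values[!missing]
  if (any(missing) && length(observed) > 0) {
    # Sample indices rather than values: sample(x, ...) treats a single
    # number x as the range 1:x, which would be wrong for one observed age.
    idx <- sample.int(length(observed), sum(missing), replace = TRUE)
    values[missing] <- observed[idx]
  }
  values
}

# ave() applies the helper within each level of Sex and keeps the row order.
toy$Age <- ave(toy$Age, toy$Sex, FUN = impute.by.sampling)

# Count the remaining missing ages (0 here, since each group has observed ages)
sum(is.na(toy$Age))
```

Sampling indices instead of values is a deliberate choice in this sketch: it stays correct even if a group has only a single observed age, a case where calling `sample()` directly on the values would silently draw from `1:x`.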