While thinking about a fun way to show some spurious correlations (which are often a byproduct of confounding variables), I came across this Kaggle post: https://www.kaggle.com/antgoldbloom/lucky-names/comments? which, in response to a machine learning competition to predict survivors of the Titanic tragedy, explores the idea that names could have an effect on the probability of survival.
This notion, of course, while fun, definitely contains confounding variables, as mentioned in many of the comments left under the original post.
Now, I understand this might not be exactly the task (i.e. the dataset isn’t real world), but it was a fun way to demonstrate how looking at the data incorrectly could lead to some funny (wrong) conclusions.
This post is more about thought process than about the technical ability to adjust for confounding variables, and is intended as a basic way to look at the results you get and question their validity. For a more technical view, I highly recommend looking at the many posts on Canvas and CIC Around by other students for more real-world examples, as mine was really just to do some fun things in R to illustrate the issue of confounding variables.
“In statistics, a confounder (also confounding variable, confounding factor, or lurking variable) is a variable that influences both the dependent variable and independent variable, causing a spurious association. Confounding is a causal concept, and as such, cannot be described in terms of correlations or associations.” ~ Wikipedia: Confounding
The original data is available here: http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/titanic3.csv
I must admit I cheated a bit here: I downloaded the CSV and then did the data munging in Excel. It is currently a little beyond my skill level to munge the names data efficiently in R, so in the interest of time the first names were extracted in Excel.
I also added a column to the data called freq_firstName. This is just a count of the first name and age groupings.
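For anyone who would rather stay in R, the Excel step could be approximated along these lines. This is only a sketch, not what was actually done for this post. The sample rows are made up, written in the `Surname, Title. Given names` layout that titanic3.csv uses. The rule "first word after the title" is an assumption, and the count here is by first name only.

```r
# Made-up rows in the "Surname, Title. Given names" layout (illustration only)
passengers <- data.frame(
  name = c("Smith, Mr. John Henry",
           "Brown, Miss. Mary Ann",
           "Jones, Mr. John",
           "Taylor, Master. William Thomas"),
  stringsAsFactors = FALSE
)

# Strip "Surname, Title. " from the front of each name
after_title <- sub("^[^,]*,\\s*[^.]*\\.\\s*", "", passengers$name)

# Keep only the first remaining word as the first name
passengers$firstname <- sub("^([A-Za-z'-]+).*$", "\\1", after_title)

# Count how often each first name occurs and attach the count to every row
name_counts <- table(passengers$firstname)
passengers$freq_firstName <- as.integer(name_counts[passengers$firstname])

passengers[, c("firstname", "freq_firstName")]
```

One limitation to be aware of: married women are often recorded under their husband's name with their own name in parentheses, so this simple rule would need extra handling for those rows.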
library(tidyverse) # provides read_csv(), the dplyr verbs and ggplot2 used below
titanicData <- read_csv("titanicdata.csv")
## Parsed with column specification:
## cols(
## .default = col_character(),
## pclass = col_double(),
## survived = col_double(),
## freq_firstName = col_double(),
## age = col_double(),
## sibsp = col_double(),
## parch = col_double(),
## fare = col_double(),
## body = col_double()
## )
## See spec(...) for full column specifications.
This part is basically a replica of the Kaggle post I referenced at the beginning.
Create a list of popular names (those with a frequency greater than 10) and then a list of lucky names, which are popular names where the person survived.
The plot below shows the frequency of the lucky names that survived… but surely there is more to it!
popularNames <- titanicData %>%
filter(freq_firstName > 10)
luckynames <- popularNames %>%
filter(survived==1)
luckynames$firstname <- factor(luckynames$firstname, levels = c("Elizabeth","Mary","Anna","Margaret","William","John","George","Charles","Edward","Henry","Richard","Karl","Johan","Joseph","Alfred","Thomas","James","Patrick"))
luckynames %>%
ggplot()+
geom_bar(aes(x=firstname))
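A bar chart of survivor counts rewards names that are simply common, so it can help to look at the share of each name that survived instead. The sketch below uses a tiny made-up data frame (not the Titanic data) with the same `firstname` and `survived` column names, purely to show the difference between a count and a rate.

```r
# Made-up rows for illustration only (not the real Titanic data)
toy <- data.frame(
  firstname = c("Mary", "Mary", "Mary", "John", "John", "John", "John", "John"),
  survived  = c(1, 1, 0, 1, 0, 0, 0, 0),
  stringsAsFactors = FALSE
)

# What the bar chart shows: number of survivors per name
survivors <- tapply(toy$survived, toy$firstname, sum)

# What it hides: how many people carried that name in the first place
carriers <- tapply(toy$survived, toy$firstname, length)

# Share of each name that survived
survival_rate <- survivors / carriers
round(survival_rate, 2)
```

Even a rate is only a starting point, since it still says nothing about who tends to carry each name.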
Some of the names seem to be significant, but so is pclass and, to a lesser extent, age. The significant names are also mostly female.
Anna, Elizabeth, Margaret and Mary are among the most significant names, with perhaps Johan and Karl for men, although these are less significant. It also looks like age might be important.
PopNames <- popularNames[,-c(3,4,5,7,8,9,10,11,15,16,17,18,19,20,21,22,23)] #Remove unwanted columns
PopNames <- na.omit(PopNames) #Remove NA
modelPopNames<-glm(survived~., family=binomial(logit),data=PopNames)
summary(modelPopNames)
##
## Call:
## glm(formula = survived ~ ., family = binomial(logit), data = PopNames)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.2385 -0.5886 -0.2279 0.2591 2.9274
##
## Coefficients: (1 not defined because of singularities)
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.75932 1.85205 1.490 0.136259
## pclass -2.20840 0.39356 -5.611 2.01e-08 ***
## firstnameAnna 5.59974 1.53399 3.650 0.000262 ***
## firstnameCharles 1.03245 1.48220 0.697 0.486073
## firstnameEdward 2.67235 1.54868 1.726 0.084426 .
## firstnameElizabeth 6.62290 1.72624 3.837 0.000125 ***
## firstnameGeorge 0.49702 1.51341 0.328 0.742600
## firstnameHenry 1.82000 1.49806 1.215 0.224402
## firstnameJames 1.28169 1.75569 0.730 0.465378
## firstnameJohan 4.16500 1.58286 2.631 0.008505 **
## firstnameJohn 0.38372 1.43990 0.266 0.789860
## firstnameJoseph 1.54034 1.56556 0.984 0.325169
## firstnameKarl 3.31956 1.56795 2.117 0.034248 *
## firstnameMargaret 5.85245 1.75490 3.335 0.000853 ***
## firstnameMary 4.31181 1.52048 2.836 0.004571 **
## firstnamePatrick -11.40540 1123.03377 -0.010 0.991897
## firstnameRichard 0.86522 1.64307 0.527 0.598481
## firstnameThomas 0.25732 1.72218 0.149 0.881225
## firstnameWilliam 1.05845 1.35723 0.780 0.435474
## sexmale NA NA NA NA
## age -0.05854 0.02832 -2.067 0.038702 *
## age_groupBaby 3.80171 1.74301 2.181 0.029175 *
## age_groupChild 0.82002 1.21188 0.677 0.498625
## age_groupSenior 0.67179 1.17103 0.574 0.566190
## age_groupTeen 0.31324 0.84540 0.371 0.710993
## age_groupYoung adult 0.68359 0.73676 0.928 0.353497
## age_groupYoung child 2.78963 1.92208 1.451 0.146680
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 328.62 on 265 degrees of freedom
## Residual deviance: 177.99 on 240 degrees of freedom
## AIC: 229.99
##
## Number of Fisher Scoring iterations: 15
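The `NA` row for `sexmale` ("1 not defined because of singularities") is worth a second look. A likely reading is that, among the popular names, each first name belongs to only one sex, so once the name is in the model the sex column carries no extra information. The sketch below reproduces that behaviour on a small made-up data frame. The names are borrowed from the output above, but the rows are invented.

```r
# Made-up rows for illustration only: each name belongs to exactly one sex
toy <- data.frame(
  survived  = c(1, 0, 1,  1, 0, 0,  1, 0, 0,  1, 0, 1),
  firstname = rep(c("Anna", "Mary", "John", "Karl"), each = 3),
  stringsAsFactors = TRUE
)
toy$sex <- factor(ifelse(toy$firstname %in% c("Anna", "Mary"), "female", "male"))

# Every name sits in a single sex column, so sex is fully determined by name
table(toy$firstname, toy$sex)

fit <- glm(survived ~ firstname + sex, family = binomial(link = "logit"), data = toy)

# The sex coefficient cannot be estimated separately and comes back as NA
coef(fit)
```

In other words, the name coefficients are quietly absorbing the effect of sex, which is exactly the kind of confounding this post is about.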
Age is now a much stronger factor than in the previous model (p = 3.28e-06, against p = 0.039 before)… and while Anna and Elizabeth remain highly significant, Margaret seems to have dropped a bit…
reduced.modelPopNames<-step(modelPopNames, direction="both")
## Start: AIC=229.99
## survived ~ pclass + firstname + sex + age + age_group
##
##
## Step: AIC=229.99
## survived ~ pclass + firstname + age + age_group
##
## Df Deviance AIC
## - age_group 6 187.12 227.12
## <none> 177.99 229.99
## - age 1 182.49 232.49
## - pclass 1 229.86 279.86
## - firstname 17 285.14 303.14
##
## Step: AIC=227.12
## survived ~ pclass + firstname + age
##
## Df Deviance AIC
## <none> 187.12 227.12
## + age_group 6 177.99 229.99
## - age 1 215.53 253.53
## - pclass 1 237.87 275.87
## - firstname 17 290.18 296.18
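The AIC values that `step()` compares can be checked by hand. For a logistic regression on a 0/1 response, AIC works out to the residual deviance plus twice the number of estimated coefficients. The short check below uses the numbers printed in the output above. The helper name `aic_from_deviance` is made up for this example.

```r
# Hypothetical helper: AIC for a 0/1 logistic regression from its residual deviance
aic_from_deviance <- function(deviance, n_params) deviance + 2 * n_params

# Full model: 265 null df - 240 residual df = 25 slopes, plus the intercept = 26
full_aic <- aic_from_deviance(177.99, 26)

# Reduced model (age_group dropped): 265 - 246 = 19 slopes, plus the intercept = 20
reduced_aic <- aic_from_deviance(187.12, 20)

c(full = full_aic, reduced = reduced_aic)
```

Dropping `age_group` raises the deviance by about 9 but saves 6 parameters (a penalty of 12), which is why the smaller model wins on AIC.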
summary(reduced.modelPopNames)
##
## Call:
## glm(formula = survived ~ pclass + firstname + age, family = binomial(logit),
## data = PopNames)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.3398 -0.5936 -0.2469 0.2452 3.0684
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 4.08014 1.48770 2.743 0.006096 **
## pclass -2.08135 0.36756 -5.663 1.49e-08 ***
## firstnameAnna 4.85637 1.38221 3.513 0.000442 ***
## firstnameCharles 0.45298 1.33707 0.339 0.734771
## firstnameEdward 2.31796 1.40527 1.649 0.099051 .
## firstnameElizabeth 6.01841 1.59227 3.780 0.000157 ***
## firstnameGeorge 0.40009 1.33755 0.299 0.764845
## firstnameHenry 1.31833 1.35409 0.974 0.330261
## firstnameJames 0.61743 1.62533 0.380 0.704036
## firstnameJohan 3.35251 1.41355 2.372 0.017707 *
## firstnameJohn 0.02233 1.27627 0.017 0.986038
## firstnameJoseph 0.76563 1.41768 0.540 0.589154
## firstnameKarl 2.64845 1.41798 1.868 0.061795 .
## firstnameMargaret 5.01827 1.61430 3.109 0.001880 **
## firstnameMary 3.86312 1.38022 2.799 0.005128 **
## firstnamePatrick -11.91022 1084.90156 -0.011 0.991241
## firstnameRichard 0.73195 1.44216 0.508 0.611776
## firstnameThomas -0.40347 1.60349 -0.252 0.801333
## firstnameWilliam 0.69438 1.21658 0.571 0.568163
## age -0.08090 0.01739 -4.652 3.28e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 328.62 on 265 degrees of freedom
## Residual deviance: 187.12 on 246 degrees of freedom
## AIC: 227.12
##
## Number of Fisher Scoring iterations: 15
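The estimates in this table are on the log-odds scale, which is hard to read directly. Exponentiating them gives odds ratios. The sketch below copies a few coefficients from the reduced model output by hand. The name effect is relative to the baseline name, which appears to be Alfred, the one popular name missing from the table.

```r
# Coefficients copied by hand from the reduced model summary above
coefs <- c(pclass = -2.08135, age = -0.08090, firstnameElizabeth = 6.01841)

# Odds ratio for a one-unit increase in each variable
odds_ratios <- exp(coefs)
round(odds_ratios, 3)

# Effect of being ten years older, holding class and name fixed
ten_years_older <- exp(10 * coefs[["age"]])
round(ten_years_older, 3)
```

Read this way, each step down in class (for example 1st to 2nd) multiplies the odds of survival by roughly 0.12, and the enormous odds ratio for Elizabeth says more about the small baseline group than about the name itself.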
Well, we can make assumptions about sex from people's names, and it looks like class plays a much bigger role in survival than names alone. The same could be said of sex.
popularNames$firstname <- factor(popularNames$firstname, levels = c("Elizabeth","Mary","Anna","Margaret","William","John","George","Charles","Edward","Henry","Richard","Karl","Johan","Joseph","Alfred","Thomas","James","Patrick"))
ggplot(data = popularNames, aes( x = pclass, fill = as.factor(survived )))+
geom_bar()+
facet_grid(~firstname) #by name
Many more women survived than men, even though there were far fewer women than men on board, and it does seem that class might also be a better indicator of survival than simply a name.
ggplot(data = popularNames, aes( x = pclass, fill = as.factor(survived )))+
geom_bar()+
facet_grid(~sex)
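A simple way to test whether a name effect is really a sex effect is to compare names within each sex. The sketch below uses made-up numbers (not the Titanic data), built so that names look "lucky" overall but make no difference once passengers are split by sex.

```r
# Made-up data for illustration only: 10 passengers per name
toy <- data.frame(
  firstname = rep(c("Anna", "Mary", "John", "Karl"), each = 10),
  sex       = rep(c("female", "male"), each = 20),
  survived  = c(rep(c(1, 0), times = c(7, 3)),  # Anna
                rep(c(1, 0), times = c(7, 3)),  # Mary
                rep(c(1, 0), times = c(2, 8)),  # John
                rep(c(1, 0), times = c(2, 8))), # Karl
  stringsAsFactors = FALSE
)

# Looking at names alone, Anna and Mary appear far luckier than John and Karl
rate_by_name <- tapply(toy$survived, toy$firstname, mean)
rate_by_name

# Within each sex the names have identical survival rates
by_name_sex <- aggregate(survived ~ sex + firstname, data = toy, FUN = mean)
by_name_sex[order(by_name_sex$sex), ]
```

If the real data behaved like this toy example, sex would explain everything the names seemed to explain.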
Babies and young children had the highest survival rates of any age group. To get a better idea of what is happening, we create another set of graphs based on the whole dataset, not just those with popular names.
popularNames$age_group <- factor(popularNames$age_group, levels = c("Baby","Young child","Child","Teen","Young adult","Adult","Senior"))
ggplot(data=popularNames,aes( x = age_group, fill = as.factor(survived )))+
geom_bar()
Using the Titanic dataset we loaded at the start, we create a new model that does not filter to just the popular names. And now names don’t seem to matter much, but pclass still does!
newTitanic <- titanicData[,-c(3,4,5,7,8,9,10,11,15,16,17,18,19,20,21,22,23)] #Remove unwanted columns
newTitanic <- na.omit(newTitanic) #Remove NA
modelTitanic<-glm(survived~., family=binomial(logit),data=newTitanic)
summary(modelTitanic)
##
## Call:
## glm(formula = survived ~ ., family = binomial(logit), data = newTitanic)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.3262 -0.2386 0.0000 0.0000 2.9292
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 4.234e+01 2.507e+04 0.002 0.9987
## pclass -2.023e+00 2.634e-01 -7.679 1.61e-14 ***
## firstnameAchille -2.108e+01 1.773e+04 -0.001 0.9991
## firstnameAda -1.687e+01 2.659e+04 -0.001 0.9995
## firstnameAddie -1.697e+01 3.071e+04 -0.001 0.9996
## firstnameAdele -1.640e+01 2.768e+04 -0.001 0.9995
## firstnameAdola -1.974e+01 1.773e+04 -0.001 0.9991
## firstnameAdolf -2.050e+01 1.211e+04 -0.002 0.9986
## firstnameAgda -5.618e+01 3.071e+04 -0.002 0.9985
## firstnameAgnes -1.609e+01 3.071e+04 -0.001 0.9996
## firstnameAhmed -2.103e+01 1.773e+04 -0.001 0.9991
## firstnameAkar -2.181e+01 1.773e+04 -0.001 0.9990
## firstnameAlbert -1.940e+00 1.786e+00 -1.086 0.2774
## firstnameAlbina -1.891e+01 3.071e+04 -0.001 0.9995
## firstnameAlden 1.615e+01 1.773e+04 0.001 0.9993
## firstnameAlexander -2.299e+01 8.893e+03 -0.003 0.9979
## firstnameAlfons -2.061e+01 1.773e+04 -0.001 0.9991
## firstnameAlfonzo -1.920e+01 1.773e+04 -0.001 0.9991
## firstnameAlfred -4.111e+00 1.873e+00 -2.195 0.0282 *
## firstnameAlfrida -5.582e+01 3.071e+04 -0.002 0.9985
## firstnameAlgernon 1.944e+01 1.773e+04 0.001 0.9991
## firstnameAli -2.106e+01 1.773e+04 -0.001 0.9991
## firstnameAlice -1.785e+01 2.572e+04 -0.001 0.9994
## firstnameAlma -5.608e+01 3.071e+04 -0.002 0.9985
## firstnameAloisia -5.722e+01 3.071e+04 -0.002 0.9985
## firstnameAmalie -1.884e+01 3.071e+04 -0.001 0.9995
## firstnameAmbrose -2.313e+01 1.773e+04 -0.001 0.9990
## firstnameAmelia -1.768e+01 2.691e+04 -0.001 0.9995
## firstnameAmelie -1.876e+01 3.071e+04 -0.001 0.9995
## firstnameAmy -1.680e+01 2.772e+04 -0.001 0.9995
## firstnameAnders -1.966e+01 1.254e+04 -0.002 0.9987
## firstnameAndre 1.615e+01 1.773e+04 0.001 0.9993
## firstnameAnn -5.958e+01 3.071e+04 -0.002 0.9985
## firstnameAnna -3.541e+01 2.507e+04 -0.001 0.9989
## firstnameAnne -1.897e+01 3.071e+04 -0.001 0.9995
## firstnameAnnie -3.704e+01 2.507e+04 -0.001 0.9988
## firstnameAnthony -1.955e+01 1.773e+04 -0.001 0.9991
## firstnameAntoinette -1.811e+01 3.071e+04 -0.001 0.9995
## firstnameAnton 2.124e+01 1.773e+04 0.001 0.9990
## firstnameAntoni -1.994e+01 1.773e+04 -0.001 0.9991
## firstnameAntti -2.061e+01 1.221e+04 -0.002 0.9987
## firstnameApostolos -1.997e+01 1.773e+04 -0.001 0.9991
## firstnameArchibald -2.850e+00 2.115e+00 -1.347 0.1779
## firstnameArgenia -1.811e+01 3.071e+04 -0.001 0.9995
## firstnameArne -2.258e+01 1.773e+04 -0.001 0.9990
## firstnameArthur -4.069e+00 1.901e+00 -2.140 0.0323 *
## firstnameArtur 1.940e+01 1.773e+04 0.001 0.9991
## firstnameAssad 1.816e+01 1.773e+04 0.001 0.9992
## firstnameAsuncion -1.702e+01 3.071e+04 -0.001 0.9996
## firstnameAugust -2.045e+00 1.875e+00 -1.091 0.2754
## firstnameAugusta -3.819e+01 2.507e+04 -0.002 0.9988
## firstnameAurora -1.614e+01 3.071e+04 -0.001 0.9996
## firstnameAusten -2.351e+01 1.773e+04 -0.001 0.9989
## firstnameAustin -1.959e+01 1.773e+04 -0.001 0.9991
## firstnameBanoura -1.580e+01 3.071e+04 -0.001 0.9996
## firstnameBarbara -2.003e+01 3.071e+04 -0.001 0.9995
## firstnameBartol -1.974e+01 1.773e+04 -0.001 0.9991
## firstnameBeatrice -1.801e+01 3.071e+04 -0.001 0.9995
## firstnameBeila -1.500e+01 3.071e+04 0.000 0.9996
## firstnameBengt -1.989e+01 1.773e+04 -0.001 0.9991
## firstnameBenjamin -2.289e+01 7.974e+03 -0.003 0.9977
## firstnameBerk 2.132e+01 1.773e+04 0.001 0.9990
## firstnameBerta -1.561e+01 3.071e+04 -0.001 0.9996
## firstnameBertha -1.749e+01 2.597e+04 -0.001 0.9995
## firstnameBerthe -2.013e+01 3.071e+04 -0.001 0.9995
## firstnameBertram -8.966e-01 2.375e+00 -0.377 0.7058
## firstnameBessie -6.023e+01 3.071e+04 -0.002 0.9984
## firstnameBlanche -1.858e+01 3.071e+04 -0.001 0.9995
## firstnameBranko -2.053e+01 1.773e+04 -0.001 0.9991
## firstnameBridget -1.614e+01 3.071e+04 -0.001 0.9996
## firstnameCamille -1.971e+01 1.773e+04 -0.001 0.9991
## firstnameCarl -5.921e-01 1.785e+00 -0.332 0.7402
## firstnameCarl/Charles -1.932e+01 1.773e+04 -0.001 0.9991
## firstnameCarla -1.559e+01 3.071e+04 -0.001 0.9996
## firstnameCaroline -1.861e+01 2.706e+04 -0.001 0.9995
## firstnameCarrie -1.852e+01 3.071e+04 -0.001 0.9995
## firstnameCatharina -5.701e+01 3.071e+04 -0.002 0.9985
## firstnameCatherine -3.695e+01 2.507e+04 -0.001 0.9988
## firstnameCerin -1.997e+01 1.773e+04 -0.001 0.9991
## firstnameChang 2.132e+01 1.773e+04 0.001 0.9990
## firstnameCharles -3.850e+00 1.684e+00 -2.287 0.0222 *
## firstnameCharlotte -1.767e+01 2.696e+04 -0.001 0.9995
## firstnameChristiana -1.850e+01 3.071e+04 -0.001 0.9995
## firstnameChristopher -2.360e+01 1.773e+04 -0.001 0.9989
## firstnameClaire -5.929e+01 3.071e+04 -0.002 0.9985
## firstnameClara -1.839e+01 3.071e+04 -0.001 0.9995
## firstnameClarence -2.285e+01 1.200e+04 -0.002 0.9985
## firstnameClaus -1.958e+01 1.773e+04 -0.001 0.9991
## firstnameClear -1.681e+01 3.071e+04 -0.001 0.9996
## firstnameClifford -2.261e+01 1.229e+04 -0.002 0.9985
## firstnameConstance -1.942e+01 2.788e+04 -0.001 0.9994
## firstnameCordelia -5.616e+01 3.071e+04 -0.002 0.9985
## firstnameCosmo 1.772e+01 1.773e+04 0.001 0.9992
## firstnameDagmar -1.821e+01 3.071e+04 -0.001 0.9995
## firstnameDaisy -1.889e+01 3.071e+04 -0.001 0.9995
## firstnameDaniel -2.970e+00 2.162e+00 -1.374 0.1695
## firstnameDavid -2.063e+00 1.861e+00 -1.109 0.2675
## firstnameDemetrios -2.056e+01 1.773e+04 -0.001 0.9991
## firstnameDickinson 1.709e+01 1.773e+04 0.001 0.9992
## firstnameDomingos -2.114e+01 1.773e+04 -0.001 0.9990
## firstnameDoolina -5.730e+01 3.071e+04 -0.002 0.9985
## firstnameDorothy -3.840e+01 2.507e+04 -0.002 0.9988
## firstnameEbba -5.799e+01 3.071e+04 -0.002 0.9985
## firstnameEberhard -2.056e+01 1.773e+04 -0.001 0.9991
## firstnameEden 1.940e+01 1.773e+04 0.001 0.9991
## firstnameEdgar -2.360e+01 1.237e+04 -0.002 0.9985
## firstnameEdgardo -2.258e+01 1.773e+04 -0.001 0.9990
## firstnameEdith -3.788e+01 2.507e+04 -0.002 0.9988
## firstnameEdmond 1.618e+01 1.773e+04 0.001 0.9993
## firstnameEdvard -2.076e+01 1.000e+04 -0.002 0.9983
## firstnameEdvin 1.823e+01 1.773e+04 0.001 0.9992
## firstnameEdward -2.183e+00 1.736e+00 -1.258 0.2085
## firstnameEdwin -3.156e+00 2.100e+00 -1.503 0.1328
## firstnameEdwina -1.702e+01 3.071e+04 -0.001 0.9996
## firstnameEdwy -2.173e+01 1.773e+04 -0.001 0.9990
## firstnameEileen -5.672e+01 3.071e+04 -0.002 0.9985
## firstnameEinar -5.435e-01 2.055e+00 -0.264 0.7914
## firstnameEino -1.479e+00 2.180e+00 -0.679 0.4974
## firstnameEiriik 2.132e+01 1.773e+04 0.001 0.9990
## firstnameEleanor -1.823e+01 2.797e+04 -0.001 0.9995
## firstnameElias 2.036e+01 1.773e+04 0.001 0.9991
## firstnameEliezer -2.108e+01 1.773e+04 -0.001 0.9991
## firstnameEliina -1.500e+01 3.071e+04 0.000 0.9996
## firstnameElin -3.705e+01 2.507e+04 -0.001 0.9988
## firstnameElina -5.603e+01 3.071e+04 -0.002 0.9985
## firstnameElisabeth -1.642e+01 2.648e+04 -0.001 0.9995
## firstnameElise -1.824e+01 3.071e+04 -0.001 0.9995
## firstnameElizabeth -3.464e+01 2.507e+04 -0.001 0.9989
## firstnameElla -1.831e+01 3.071e+04 -0.001 0.9995
## firstnameEllen -3.589e+01 2.507e+04 -0.001 0.9989
## firstnameEllis -5.912e+01 3.071e+04 -0.002 0.9985
## firstnameElmer 1.769e+01 1.773e+04 0.001 0.9992
## firstnameElna -5.608e+01 3.071e+04 -0.002 0.9985
## firstnameElsie -1.843e+01 2.758e+04 -0.001 0.9995
## firstnameEmelia -5.603e+01 3.071e+04 -0.002 0.9985
## firstnameEmil -2.279e+01 9.202e+03 -0.002 0.9980
## firstnameEmile -2.308e+01 1.773e+04 -0.001 0.9990
## firstnameEmilie -1.873e+01 3.071e+04 -0.001 0.9995
## firstnameEmilio 1.923e+01 1.254e+04 0.002 0.9988
## firstnameEmily -1.654e+01 2.582e+04 -0.001 0.9995
## firstnameEmma -1.645e+01 2.610e+04 -0.001 0.9995
## firstnameEncarnacion -1.700e+01 3.071e+04 -0.001 0.9996
## firstnameEngelhart -2.208e+01 1.773e+04 -0.001 0.9990
## firstnameEric -2.178e+01 1.773e+04 -0.001 0.9990
## firstnameErik -2.259e+01 9.494e+03 -0.002 0.9981
## firstnameErna -1.564e+01 3.071e+04 -0.001 0.9996
## firstnameErnest -2.161e+01 6.701e+03 -0.003 0.9974
## firstnameErnesti -2.061e+01 1.773e+04 -0.001 0.9991
## firstnameErnst -1.500e+00 1.871e+00 -0.802 0.4228
## firstnameEscott -2.155e+01 1.773e+04 -0.001 0.9990
## firstnameEsther -1.655e+01 3.071e+04 -0.001 0.9996
## firstnameEthel -1.758e+01 2.691e+04 -0.001 0.9995
## firstnameEugene -1.534e+00 2.133e+00 -0.719 0.4721
## firstnameEugenie -1.802e+01 3.071e+04 -0.001 0.9995
## firstnameEva -1.589e+01 2.716e+04 -0.001 0.9995
## firstnameEvan -2.108e+01 1.773e+04 -0.001 0.9991
## firstnameFahim 2.005e+01 1.773e+04 0.001 0.9991
## firstnameFang 2.116e+01 1.773e+04 0.001 0.9990
## firstnameFermina -1.873e+01 3.071e+04 -0.001 0.9995
## firstnameFilip -2.074e+01 1.773e+04 -0.001 0.9991
## firstnameFlorence -1.706e+01 2.649e+04 -0.001 0.9995
## firstnameFlorentina -1.694e+01 3.071e+04 -0.001 0.9996
## firstnameFrancesco -2.103e+01 1.773e+04 -0.001 0.9991
## firstnameFrancis -2.134e+01 1.158e+04 -0.002 0.9985
## firstnameFrancisco -2.396e+01 1.773e+04 -0.001 0.9989
## firstnameFrank -2.474e+00 1.901e+00 -1.301 0.1931
## firstnameFrans -2.383e+01 1.773e+04 -0.001 0.9989
## firstnameFranz -2.331e-01 2.231e+00 -0.104 0.9168
## firstnameFrederic 1.748e+01 1.252e+04 0.001 0.9989
## firstnameFrederick -4.083e+00 1.890e+00 -2.161 0.0307 *
## firstnameFridtjof 2.010e+01 1.773e+04 0.001 0.9991
## firstnameGeorge -3.850e+00 1.688e+00 -2.281 0.0226 *
## firstnameGeorges 1.935e+01 1.773e+04 0.001 0.9991
## firstnameGeorgette -1.974e+01 3.071e+04 -0.001 0.9995
## firstnameGerda -5.727e+01 3.071e+04 -0.002 0.9985
## firstnameGerios -2.111e+01 1.773e+04 -0.001 0.9991
## firstnameGerious -1.946e+01 1.773e+04 -0.001 0.9991
## firstnameGertrud -5.914e+01 3.071e+04 -0.002 0.9985
## firstnameGertrude -1.858e+01 3.071e+04 -0.001 0.9995
## firstnameGilbert -2.863e+00 2.121e+00 -1.350 0.1771
## firstnameGladys -1.897e+01 3.071e+04 -0.001 0.9995
## firstnameGosta -2.293e+01 1.773e+04 -0.001 0.9990
## firstnameGrace -3.709e+01 2.507e+04 -0.001 0.9988
## firstnameGretchen -2.021e+01 3.071e+04 -0.001 0.9995
## firstnameGuentcho -1.997e+01 1.773e+04 -0.001 0.9991
## firstnameGuillaume 2.144e+01 1.773e+04 0.001 0.9990
## firstnameGunnar 2.114e+01 1.773e+04 0.001 0.9990
## firstnameGurshon 2.058e+01 1.773e+04 0.001 0.9991
## firstnameGustaf -2.023e+01 1.242e+04 -0.002 0.9987
## firstnameGustave 1.735e+01 1.773e+04 0.001 0.9992
## firstnameHammad 1.714e+01 1.773e+04 0.001 0.9992
## firstnameHanna 2.000e+01 1.773e+04 0.001 0.9991
## firstnameHannah -1.700e+01 3.071e+04 -0.001 0.9996
## firstnameHanne -1.495e+01 3.071e+04 0.000 0.9996
## firstnameHanora -5.674e+01 3.071e+04 -0.002 0.9985
## firstnameHans -2.126e+01 8.721e+03 -0.002 0.9981
## firstnameHarald -2.288e+01 1.773e+04 -0.001 0.9990
## firstnameHarold -2.113e+00 2.017e+00 -1.047 0.2949
## firstnameHarriet -1.881e+01 3.071e+04 -0.001 0.9995
## firstnameHarry -3.277e+00 1.797e+00 -1.824 0.0682 .
## firstnameHarvey -2.186e+01 1.773e+04 -0.001 0.9990
## firstnameHedwig -1.444e+01 2.674e+04 -0.001 0.9996
## firstnameHelen -3.707e+01 2.507e+04 -0.001 0.9988
## firstnameHelena -5.577e+01 3.071e+04 -0.002 0.9986
## firstnameHelene -1.864e+01 2.690e+04 -0.001 0.9994
## firstnameHelga -1.614e+01 3.071e+04 -0.001 0.9996
## firstnameHelmina -1.503e+01 3.071e+04 0.000 0.9996
## firstnameHenriette -5.924e+01 3.071e+04 -0.002 0.9985
## firstnameHenrik -1.997e+01 1.773e+04 -0.001 0.9991
## firstnameHenry -3.184e+00 1.693e+00 -1.881 0.0600 .
## firstnameHerbert -2.294e+01 1.210e+04 -0.002 0.9985
## firstnameHilda -1.649e+01 2.796e+04 -0.001 0.9995
## firstnameHildur -1.798e+01 3.071e+04 -0.001 0.9995
## firstnameHileni -5.689e+01 3.071e+04 -0.002 0.9985
## firstnameHoussein -2.080e+01 1.773e+04 -0.001 0.9991
## firstnameHoward -2.342e+01 1.773e+04 -0.001 0.9989
## firstnameHudson -4.891e+00 2.463e+00 -1.985 0.0471 *
## firstnameHulda -5.649e+01 2.794e+04 -0.002 0.9984
## firstnameHusein -1.961e+01 1.773e+04 -0.001 0.9991
## firstnameIda -3.826e+01 2.507e+04 -0.002 0.9988
## firstnameIgnjac -1.992e+01 1.773e+04 -0.001 0.9991
## firstnameIisakki -1.958e+01 1.773e+04 -0.001 0.9991
## firstnameIlia -2.053e+01 1.773e+04 -0.001 0.9991
## firstnameIlmari -2.114e+01 1.773e+04 -0.001 0.9990
## firstnameImanita -1.707e+01 3.071e+04 -0.001 0.9996
## firstnameIngeborg -5.791e+01 3.071e+04 -0.002 0.9985
## firstnameIngvar -2.313e+01 1.773e+04 -0.001 0.9990
## firstnameIrene -3.846e+01 2.507e+04 -0.002 0.9988
## firstnameIsaac 1.756e+01 1.773e+04 0.001 0.9992
## firstnameIsidor -2.203e+01 1.773e+04 -0.001 0.9990
## firstnameIsrael -2.199e+01 1.773e+04 -0.001 0.9990
## firstnameIvan -1.002e+00 1.867e+00 -0.537 0.5913
## firstnameJaako -2.072e+01 1.773e+04 -0.001 0.9991
## firstnameJacob -2.076e+01 1.187e+04 -0.002 0.9986
## firstnameJacques -2.373e+01 1.773e+04 -0.001 0.9989
## firstnameJakob -2.241e+01 7.474e+03 -0.003 0.9976
## firstnameJames -3.676e+00 1.909e+00 -1.926 0.0541 .
## firstnameJamila -1.577e+01 3.071e+04 -0.001 0.9996
## firstnameJan 2.124e+01 1.773e+04 0.001 0.9990
## firstnameJane -1.711e+01 2.688e+04 -0.001 0.9995
## firstnameJanko -2.108e+01 1.773e+04 -0.001 0.9991
## firstnameJean -1.971e+01 1.773e+04 -0.001 0.9991
## firstnameJeannie -5.587e+01 3.071e+04 -0.002 0.9985
## firstnameJego -2.056e+01 1.773e+04 -0.001 0.9991
## firstnameJelka -5.724e+01 3.071e+04 -0.002 0.9985
## firstnameJennie -1.511e+01 2.783e+04 -0.001 0.9996
## firstnameJenny -5.611e+01 3.071e+04 -0.002 0.9985
## firstnameJeremiah -2.053e+01 1.773e+04 -0.001 0.9991
## firstnameJeso -2.058e+01 1.773e+04 -0.001 0.9991
## firstnameJessie -3.700e+01 2.507e+04 -0.001 0.9988
## firstnameJoan -1.995e+01 3.071e+04 -0.001 0.9995
## firstnameJohan -6.771e-01 1.649e+00 -0.411 0.6814
## firstnameJohann -1.979e+01 1.773e+04 -0.001 0.9991
## firstnameJohanna -5.579e+01 3.071e+04 -0.002 0.9986
## firstnameJohannes -2.060e+01 1.220e+04 -0.002 0.9987
## firstnameJohn -4.139e+00 1.635e+00 -2.531 0.0114 *
## firstnameJose -2.380e+01 1.135e+04 -0.002 0.9983
## firstnameJosef -2.000e+01 1.773e+04 -0.001 0.9991
## firstnameJosefine -5.674e+01 3.071e+04 -0.002 0.9985
## firstnameJoseph -3.171e+00 1.744e+00 -1.819 0.0690 .
## firstnameJovan -1.955e+01 1.773e+04 -0.001 0.9991
## firstnameJovo -2.058e+01 1.773e+04 -0.001 0.9991
## firstnameJozef -1.979e+01 1.773e+04 -0.001 0.9991
## firstnameJuha -1.394e-01 2.242e+00 -0.062 0.9504
## firstnameJuho 7.392e-01 2.053e+00 0.360 0.7188
## firstnameJulia -3.649e+01 2.507e+04 -0.001 0.9988
## firstnameJulie -1.707e+01 3.071e+04 -0.001 0.9996
## firstnameJuliette -1.816e+01 3.071e+04 -0.001 0.9995
## firstnameJulius -3.097e-03 1.928e+00 -0.002 0.9987
## firstnameJuozas -2.197e+01 1.773e+04 -0.001 0.9990
## firstnameKalle -1.989e+01 1.773e+04 -0.001 0.9991
## firstnameKaren -1.566e+01 3.071e+04 -0.001 0.9996
## firstnameKarl -1.486e+00 1.664e+00 -0.893 0.3719
## firstnameKarolina -1.663e+01 3.071e+04 -0.001 0.9996
## firstnameKate -3.621e+01 2.507e+04 -0.001 0.9988
## firstnameKatherine -3.664e+01 2.507e+04 -0.001 0.9988
## firstnameKatriina -5.732e+01 3.071e+04 -0.002 0.9985
## firstnameKhalil -2.000e+01 1.773e+04 -0.001 0.9991
## firstnameKlas -2.056e+01 1.773e+04 -0.001 0.9991
## firstnameKornelia -1.719e+01 3.071e+04 -0.001 0.9996
## firstnameKristina -5.587e+01 3.071e+04 -0.002 0.9985
## firstnameKurt -2.202e+01 1.773e+04 -0.001 0.9990
## firstnameLaina -1.503e+01 3.071e+04 0.000 0.9996
## firstnameLalio -2.106e+01 1.773e+04 -0.001 0.9991
## firstnameLatifa -1.609e+01 3.071e+04 -0.001 0.9996
## firstnameLaura -1.657e+01 2.736e+04 -0.001 0.9995
## firstnameLawrence -1.323e+00 2.062e+00 -0.642 0.5211
## firstnameLazar -2.111e+01 1.773e+04 -0.001 0.9991
## firstnameLeah -1.561e+01 3.071e+04 -0.001 0.9996
## firstnameLee 7.000e-01 2.053e+00 0.341 0.7331
## firstnameLeo -2.018e+01 1.016e+04 -0.002 0.9984
## firstnameLeon -2.114e+01 1.773e+04 -0.001 0.9990
## firstnameLeonard -2.305e+01 1.773e+04 -0.001 0.9990
## firstnameLeontine -2.013e+01 3.071e+04 -0.001 0.9995
## firstnameLeopold -2.197e+01 1.773e+04 -0.001 0.9990
## firstnameLeslie -1.991e+01 1.773e+04 -0.001 0.9991
## firstnameLewis -2.116e+01 1.184e+04 -0.002 0.9986
## firstnameLilian -3.766e+01 2.507e+04 -0.002 0.9988
## firstnameLillian -3.625e+01 2.507e+04 -0.001 0.9988
## firstnameLily -1.854e+01 2.801e+04 -0.001 0.9995
## firstnameLinhart -1.980e+01 1.773e+04 -0.001 0.9991
## firstnameLionel -1.971e+01 1.773e+04 -0.001 0.9991
## firstnameLiudevit -2.053e+01 1.773e+04 -0.001 0.9991
## firstnameLouise -2.003e+01 3.071e+04 -0.001 0.9995
## firstnameLucien -2.508e+01 1.773e+04 -0.001 0.9989
## firstnameLucile -1.923e+01 2.794e+04 -0.001 0.9995
## firstnameLucy -1.720e+01 2.760e+04 -0.001 0.9995
## firstnameLuise -1.593e+01 2.738e+04 -0.001 0.9995
## firstnameLuka -2.058e+01 1.212e+04 -0.002 0.9986
## firstnameLulu -1.684e+01 3.071e+04 -0.001 0.9996
## firstnameLutie -1.642e+01 3.071e+04 -0.001 0.9996
## firstnameLyyli -1.763e+01 3.071e+04 -0.001 0.9995
## firstnameMabel -3.904e+01 2.507e+04 -0.002 0.9988
## firstnameMadeleine -1.860e+01 2.694e+04 -0.001 0.9994
## firstnameMahala -1.850e+01 3.071e+04 -0.001 0.9995
## firstnameMalake -5.677e+01 3.071e+04 -0.002 0.9985
## firstnameMalkolm -1.979e+01 1.773e+04 -0.001 0.9991
## firstnameMalvina -1.831e+01 3.071e+04 -0.001 0.9995
## firstnameManca -1.793e+01 3.071e+04 -0.001 0.9995
## firstnameManda -5.730e+01 3.071e+04 -0.002 0.9985
## firstnameMansouer -1.991e+01 1.773e+04 -0.001 0.9991
## firstnameMansour -2.104e+01 1.773e+04 -0.001 0.9991
## firstnameManta -5.608e+01 3.071e+04 -0.002 0.9985
## firstnameManuel -2.264e+01 1.058e+04 -0.002 0.9983
## firstnameMapriededer -1.996e+01 1.773e+04 -0.001 0.9991
## firstnameMara -1.489e+01 3.071e+04 0.000 0.9996
## firstnameMargaret -3.541e+01 2.507e+04 -0.001 0.9989
## firstnameMargaretha -1.850e+01 3.071e+04 -0.001 0.9995
## firstnameMargaretta -1.871e+01 3.071e+04 -0.001 0.9995
## firstnameMargit -5.912e+01 3.071e+04 -0.002 0.9985
## firstnameMarguerite -1.793e+01 3.071e+04 -0.001 0.9995
## firstnameMari -5.730e+01 3.071e+04 -0.002 0.9985
## firstnameMaria -3.605e+01 2.507e+04 -0.001 0.9989
## firstnameMariana -1.557e+01 2.717e+04 -0.001 0.9995
## firstnameMarie -1.771e+01 2.692e+04 -0.001 0.9995
## firstnameMarija -5.683e+01 2.790e+04 -0.002 0.9984
## firstnameMarin -1.974e+01 1.773e+04 -0.001 0.9991
## firstnameMarion -1.746e+01 2.669e+04 -0.001 0.9995
## firstnameMarius -2.103e+01 1.773e+04 -0.001 0.9991
## firstnameMarjorie -1.935e+01 2.788e+04 -0.001 0.9994
## firstnameMark -2.211e+01 1.773e+04 -0.001 0.9990
## firstnameMarshall 1.736e+01 1.773e+04 0.001 0.9992
## firstnameMarta -5.877e+01 3.071e+04 -0.002 0.9985
## firstnameMartha -1.769e+01 2.791e+04 -0.001 0.9995
## firstnameMartin -2.220e+01 9.287e+03 -0.002 0.9981
## firstnameMary -3.672e+01 2.507e+04 -0.001 0.9988
## firstnameMasabumi 1.956e+01 1.773e+04 0.001 0.9991
## firstnameMate -2.058e+01 1.773e+04 -0.001 0.9991
## firstnameMathilde -1.697e+01 3.071e+04 -0.001 0.9996
## firstnameMatilda -5.611e+01 3.071e+04 -0.002 0.9985
## firstnameMatti -2.056e+01 1.219e+04 -0.002 0.9987
## firstnameMaude -1.821e+01 3.071e+04 -0.001 0.9995
## firstnameMauritz -1.349e+00 2.682e+00 -0.503 0.6150
## firstnameMax 1.727e+01 1.773e+04 0.001 0.9992
## firstnameMaxmillian 1.892e+01 1.773e+04 0.001 0.9991
## firstnameMeier 1.933e+01 1.773e+04 0.001 0.9991
## firstnameMichael -1.984e+01 1.773e+04 -0.001 0.9991
## firstnameMichel -2.756e+00 2.446e+00 -1.127 0.2598
## firstnameMilan -1.987e+01 1.773e+04 -0.001 0.9991
## firstnameMilton -2.394e+01 1.773e+04 -0.001 0.9989
## firstnameMinko -1.997e+01 1.773e+04 -0.001 0.9991
## firstnameMiriam -1.811e+01 3.071e+04 -0.001 0.9995
## firstnameMirko -2.058e+01 1.773e+04 -0.001 0.9991
## firstnameMohamed -1.961e+01 1.773e+04 -0.001 0.9991
## firstnameMoses -2.308e+01 1.773e+04 -0.001 0.9990
## firstnameMyna -1.847e+01 3.071e+04 -0.001 0.9995
## firstnameNassef 2.116e+01 1.773e+04 0.001 0.9990
## firstnameNathan -1.958e+01 1.773e+04 -0.001 0.9991
## firstnameNeal -2.103e+01 1.773e+04 -0.001 0.9991
## firstnameNedelio -2.053e+01 1.773e+04 -0.001 0.9991
## firstnameNelle -2.016e+01 3.071e+04 -0.001 0.9995
## firstnameNellie -1.685e+01 2.803e+04 -0.001 0.9995
## firstnameNeshan 2.114e+01 1.773e+04 0.001 0.9990
## firstnameNestor -1.992e+01 1.773e+04 -0.001 0.9991
## firstnameNicholas -2.182e+01 1.773e+04 -0.001 0.9990
## firstnameNiels -1.940e+01 1.773e+04 -0.001 0.9991
## firstnameNikola 2.119e+01 1.773e+04 0.001 0.9990
## firstnameNikolai -2.032e+01 1.247e+04 -0.002 0.9987
## firstnameNils -2.050e+01 7.755e+03 -0.003 0.9979
## firstnameNorman 1.714e+01 1.773e+04 0.001 0.9992
## firstnameNourelain -5.791e+01 3.071e+04 -0.002 0.9985
## firstnameOberst 1.790e+01 1.773e+04 0.001 0.9992
## firstnameOlaf -2.061e+01 1.773e+04 -0.001 0.9991
## firstnameOlaus 2.114e+01 1.773e+04 0.001 0.9990
## firstnameOlga -1.611e+01 3.071e+04 -0.001 0.9996
## firstnameOlive -2.016e+01 3.071e+04 -0.001 0.9995
## firstnameOlof -2.108e+01 1.253e+04 -0.002 0.9987
## firstnameOrian -1.904e+01 3.071e+04 -0.001 0.9995
## firstnameOrsen -2.108e+01 1.773e+04 -0.001 0.9991
## firstnameOrtin -1.994e+01 1.773e+04 -0.001 0.9991
## firstnameOscar 2.132e+01 1.773e+04 0.001 0.9990
## firstnameOskar 2.117e+01 1.254e+04 0.002 0.9987
## firstnameOwen -2.084e+01 1.248e+04 -0.002 0.9987
## firstnamePatrick -2.039e+01 8.260e+03 -0.002 0.9980
## firstnamePaul -2.422e+00 2.055e+00 -1.179 0.2384
## firstnamePauline -1.858e+01 3.071e+04 -0.001 0.9995
## firstnamePehr -2.056e+01 1.773e+04 -0.001 0.9991
## firstnamePeju -1.971e+01 1.773e+04 -0.001 0.9991
## firstnamePekka -1.992e+01 1.773e+04 -0.001 0.9991
## firstnamePenko -2.108e+01 1.773e+04 -0.001 0.9991
## firstnamePercival -2.277e+01 1.218e+04 -0.002 0.9985
## firstnamePercy -2.900e+00 1.970e+00 -1.472 0.1411
## firstnamePetar -2.058e+01 1.773e+04 -0.001 0.9991
## firstnamePeter -2.606e+00 1.975e+00 -1.319 0.1871
## firstnamePhilip -1.923e+00 2.109e+00 -0.912 0.3618
## firstnamePhilipp 1.722e+01 1.773e+04 0.001 0.9992
## firstnamePhyllis -2.001e+01 3.071e+04 -0.001 0.9995
## firstnamePieta -5.618e+01 3.071e+04 -0.002 0.9985
## firstnameQuigg -2.508e+01 1.773e+04 -0.001 0.9989
## firstnameRaffull -2.114e+01 1.773e+04 -0.001 0.9990
## firstnameRalph -3.437e+00 2.217e+00 -1.551 0.1210
## firstnameRamon -2.193e+01 1.773e+04 -0.001 0.9990
## firstnameRedjo -2.000e+01 1.773e+04 -0.001 0.9991
## firstnameReginald -2.212e+01 8.819e+03 -0.003 0.9980
## firstnameRene -2.142e+01 1.241e+04 -0.002 0.9986
## firstnameRichard -3.605e+00 1.780e+00 -2.026 0.0428 *
## firstnameRistiu -2.000e+01 1.773e+04 -0.001 0.9991
## firstnameRobert -3.193e+00 1.905e+00 -1.676 0.0938 .
## firstnameRoberta -1.971e+01 3.071e+04 -0.001 0.9995
## firstnameRobina -5.791e+01 3.071e+04 -0.002 0.9985
## firstnameRosa -1.550e+01 2.769e+04 -0.001 0.9996
## firstnameRosalie -3.859e+01 2.507e+04 -0.002 0.9988
## firstnameRossmore -2.061e+01 1.773e+04 -0.001 0.9991
## firstnameRuth -1.844e+01 2.696e+04 -0.001 0.9995
## firstnameSahid 2.000e+01 1.773e+04 0.001 0.9991
## firstnameSaiide -5.674e+01 3.071e+04 -0.002 0.9985
## firstnameSalli -5.912e+01 3.071e+04 -0.002 0.9985
## firstnameSallie -1.852e+01 3.071e+04 -0.001 0.9995
## firstnameSamuel -3.390e+00 1.876e+00 -1.807 0.0707 .
## firstnameSante -2.513e+01 1.773e+04 -0.001 0.9989
## firstnameSara -3.842e+01 2.507e+04 -0.002 0.9988
## firstnameSarah -1.889e+01 3.071e+04 -0.001 0.9995
## firstnameSatio -2.103e+01 1.773e+04 -0.001 0.9991
## firstnameSebastiano -2.191e+01 1.773e+04 -0.001 0.9990
## firstnameSelena -1.816e+01 3.071e+04 -0.001 0.9995
## firstnameSelini -1.569e+01 3.071e+04 -0.001 0.9996
## firstnameSelma -1.471e+01 3.071e+04 0.000 0.9996
## firstnameServando -2.395e+01 1.773e+04 -0.001 0.9989
## firstnameShadrach -2.178e+01 1.773e+04 -0.001 0.9990
## firstnameShawneene -1.471e+01 3.071e+04 0.000 0.9996
## firstnameShedid -2.107e+01 1.773e+04 -0.001 0.9991
## firstnameSidney -2.742e+00 1.982e+00 -1.383 0.1665
## firstnameSigrid -3.765e+01 2.507e+04 -0.002 0.9988
## firstnameSigurd -2.000e+01 1.773e+04 -0.001 0.9991
## firstnameSigvard -2.288e+01 1.773e+04 -0.001 0.9990
## firstnameSimon -1.964e+01 1.773e+04 -0.001 0.9991
## firstnameSimonne -1.998e+01 3.071e+04 -0.001 0.9995
## firstnameSinai -2.178e+01 1.773e+04 -0.001 0.9990
## firstnameSleiman -1.987e+01 1.773e+04 -0.001 0.9991
## firstnameSolomon -2.158e+01 1.773e+04 -0.001 0.9990
## firstnameSophie -1.561e+01 3.071e+04 -0.001 0.9996
## firstnameSpencer 1.735e+01 1.773e+04 0.001 0.9992
## firstnameStanley -2.260e+01 1.216e+04 -0.002 0.9985
## firstnameStefo -1.981e+01 1.773e+04 -0.001 0.9991
## firstnameStephen -2.259e+01 9.679e+03 -0.002 0.9981
## firstnameStina -5.909e+01 3.071e+04 -0.002 0.9985
## firstnameStjepan -1.971e+01 1.773e+04 -0.001 0.9991
## firstnameStoytcho -1.992e+01 1.773e+04 -0.001 0.9991
## firstnameSusan -1.779e+01 2.739e+04 -0.001 0.9995
## firstnameSusanna -5.727e+01 3.071e+04 -0.002 0.9985
## firstnameSvend -2.058e+01 1.773e+04 -0.001 0.9991
## firstnameSylvia -1.816e+01 3.071e+04 -0.001 0.9995
## firstnameTannous -2.091e+01 1.248e+04 -0.002 0.9987
## firstnameTelma -5.912e+01 3.071e+04 -0.002 0.9985
## firstnameThamine -1.566e+01 3.071e+04 -0.001 0.9996
## firstnameTheodore 2.127e+01 1.773e+04 0.001 0.9990
## firstnameThomas -4.461e+00 1.899e+00 -2.349 0.0188 *
## firstnameThomson -2.376e+01 1.773e+04 -0.001 0.9989
## firstnameThor -2.114e+01 1.773e+04 -0.001 0.9990
## firstnameThornton -2.389e+01 1.773e+04 -0.001 0.9989
## firstnameThure 2.132e+01 1.773e+04 0.001 0.9990
## firstnameTido -1.966e+01 1.773e+04 -0.001 0.9991
## firstnameTillie -1.873e+01 3.071e+04 -0.001 0.9995
## firstnameTimothy -2.329e+01 1.773e+04 -0.001 0.9990
## firstnameTome -2.103e+01 1.773e+04 -0.001 0.9991
## firstnameTorborg -5.794e+01 3.071e+04 -0.002 0.9985
## firstnameTreasteall -5.909e+01 3.071e+04 -0.002 0.9985
## firstnameTyrell -2.376e+01 1.773e+04 -0.001 0.9989
## firstnameUrho -2.293e+01 1.773e+04 -0.001 0.9990
## firstnameValtcho -1.953e+01 1.773e+04 -0.001 0.9991
## firstnameVassilios -2.054e+01 1.773e+04 -0.001 0.9991
## firstnameVelin -1.614e+01 3.071e+04 -0.001 0.9996
## firstnameVera -1.968e+01 3.071e+04 -0.001 0.9995
## firstnameVictor -4.310e+00 2.120e+00 -2.033 0.0420 *
## firstnameVictorine -1.881e+01 3.071e+04 -0.001 0.9995
## firstnameViktor -2.056e+01 1.773e+04 -0.001 0.9991
## firstnameViljo 1.615e+01 1.773e+04 0.001 0.9993
## firstnameVincenz -1.997e+01 1.773e+04 -0.001 0.9991
## firstnameVirginia -1.760e+01 2.769e+04 -0.001 0.9995
## firstnameVivian -2.510e+01 1.773e+04 -0.001 0.9989
## firstnameWaika -1.559e+01 3.071e+04 -0.001 0.9996
## firstnameWalter -2.307e+01 7.390e+03 -0.003 0.9975
## firstnameWashington -2.966e+00 2.084e+00 -1.423 0.1548
## firstnameWendla -5.724e+01 3.071e+04 -0.002 0.9985
## firstnameWilhelm -1.961e+01 1.773e+04 -0.001 0.9991
## firstnameWilliam -3.521e+00 1.590e+00 -2.215 0.0268 *
## firstnameWinifred -1.883e+01 3.071e+04 -0.001 0.9995
## firstnameWinnie -1.476e+01 3.071e+04 0.000 0.9996
## firstnameWyckoff -2.219e+01 1.773e+04 -0.001 0.9990
## firstnameYoto -1.994e+01 1.773e+04 -0.001 0.9991
## firstnameYousseff -1.987e+01 1.773e+04 -0.001 0.9991
## sexmale -3.619e+01 2.507e+04 -0.001 0.9988
## age -2.609e-02 2.011e-02 -1.297 0.1945
## age_groupBaby 2.332e+00 9.790e-01 2.382 0.0172 *
## age_groupChild 4.338e-01 9.785e-01 0.443 0.6575
## age_groupSenior -9.141e-01 8.701e-01 -1.051 0.2935
## age_groupTeen 3.777e-01 6.311e-01 0.598 0.5495
## age_groupYoung adult 1.009e+00 5.052e-01 1.997 0.0458 *
## age_groupYoung child 1.314e+00 1.043e+00 1.259 0.2079
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1414.62 on 1045 degrees of freedom
## Residual deviance: 404.46 on 537 degrees of freedom
## AIC: 1422.5
##
## Number of Fisher Scoring iterations: 19
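In this full model almost none of the first-name coefficients are significant: most have standard errors in the thousands, and only a handful (for example Richard, Thomas, Victor and William) reach p < 0.05. With firstname in the model, sex and age are not individually significant either (p = 0.9988 and p = 0.1945). Their effect, along with that of pclass, only becomes clear once firstname is dropped, which is what the stepwise selection below does.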
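The huge standard errors in the summary above (around 1.773e+04) are what logistic regression typically produces when a factor level has very few observations that all share the same outcome, which is likely the case for most first names here. Below is a minimal sketch of that effect. The data frame `toy` and its counts are made up for illustration and are not taken from the Titanic data.

```r
# Hypothetical toy data: two common names and one name that occurs once.
toy <- data.frame(
  firstname = factor(c(rep("John", 10), rep("William", 10), "Nourelain")),
  survived  = c(rep(c(1, 0, 0, 0, 0), 2),  # John: 2 of 10 survive
                rep(c(1, 0), 5),           # William: 5 of 10 survive
                1)                         # Nourelain: single passenger, survives
)

# glm may warn that fitted probabilities are numerically 0 or 1; silenced here.
fit <- suppressWarnings(
  glm(survived ~ firstname, family = binomial, data = toy)
)

# Estimate, standard error, z value and p-value for each name (baseline: John).
coef_table <- summary(fit)$coefficients
print(round(coef_table, 3))
```

The row for the one-off name should show a large estimate, an enormous standard error and a p-value close to 1, just like the singleton names in the real output, while the common name gets an ordinary standard error.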
reduced.modelTitanic <- step(modelTitanic, direction = "both")
## Start: AIC=1422.46
## survived ~ pclass + firstname + sex + age + age_group
##
## Df Deviance AIC
## - firstname 499 974.57 994.57
## - age_group 6 412.77 1418.77
## - sex 1 404.78 1420.78
## - age 1 406.16 1422.16
## <none> 404.46 1422.46
## - pclass 1 492.84 1508.84
##
## Step: AIC=994.57
## survived ~ pclass + sex + age + age_group
##
## Df Deviance AIC
## - age_group 6 983.02 991.02
## <none> 974.57 994.57
## - age 1 981.11 999.11
## - pclass 1 1086.01 1104.01
## - sex 1 1249.34 1267.34
## + firstname 499 404.46 1422.46
##
## Step: AIC=991.02
## survived ~ pclass + sex + age
##
## Df Deviance AIC
## <none> 983.02 991.02
## + age_group 6 974.57 994.57
## - age 1 1013.85 1019.85
## - pclass 1 1101.34 1107.34
## - sex 1 1256.18 1262.18
## + firstname 499 412.77 1418.77
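It can look odd that step() drops firstname when doing so more than doubles the deviance. The reason is the AIC penalty: for this 0/1 logistic model the reported AIC works out to the residual deviance plus two times the number of parameters, so firstname has to pay for its 499 parameters. Here is the arithmetic as a small sketch, using only numbers copied from the output above (the variable names are my own).

```r
# Values copied from the step() output above.
dev_full     <- 404.46   # residual deviance with firstname in the model
dev_reduced  <- 974.57   # residual deviance after dropping firstname
df_firstname <- 499      # parameters used up by firstname
aic_full     <- 1422.46  # AIC of the full model

# Dropping a term changes AIC by
# (increase in deviance) - 2 * (number of parameters removed).
deviance_increase <- dev_reduced - dev_full
aic_change  <- deviance_increase - 2 * df_firstname
aic_reduced <- aic_full + aic_change

print(round(deviance_increase, 2))  # 570.11
print(round(aic_reduced, 2))        # 994.57, matching the second step
```

The fit gets worse by about 570 deviance points, but the penalty saved is 998, so the model without firstname wins on AIC.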
summary(reduced.modelTitanic)
##
## Call:
## glm(formula = survived ~ pclass + sex + age, family = binomial(logit),
## data = newTitanic)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.6159 -0.7162 -0.4321 0.6572 2.4041
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 4.58927 0.40572 11.311 < 2e-16 ***
## pclass -1.13324 0.11173 -10.143 < 2e-16 ***
## sexmale -2.49738 0.16612 -15.034 < 2e-16 ***
## age -0.03388 0.00628 -5.395 6.84e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1414.62 on 1045 degrees of freedom
## Residual deviance: 983.02 on 1042 degrees of freedom
## AIC: 991.02
##
## Number of Fisher Scoring iterations: 4
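The coefficients above are on the log-odds scale, which is hard to read directly. As a rough sketch, they can be turned into odds ratios with exp() and into predicted probabilities with plogis(). The coefficients are copied from the summary above; the helper `predict_survival` and the two example passengers are hypothetical and not rows from the dataset.

```r
# Coefficients copied from summary(reduced.modelTitanic) above.
coefs <- c("(Intercept)" = 4.58927,
           pclass        = -1.13324,
           sexmale       = -2.49738,
           age           = -0.03388)

# Odds ratios: multiplicative change in the odds of survival
# for a one-unit increase in each predictor.
odds_ratios <- exp(coefs[-1])
print(round(odds_ratios, 3))

# Hypothetical helper: predicted survival probability for one passenger.
# pclass is treated as a number (1, 2, 3), as in the fitted model.
predict_survival <- function(pclass, male, age) {
  eta <- coefs["(Intercept)"] +
    coefs["pclass"] * pclass +
    coefs["sexmale"] * male +
    coefs["age"] * age
  unname(plogis(eta))  # inverse logit
}

p_woman_first <- predict_survival(pclass = 1, male = 0, age = 30)
p_man_third   <- predict_survival(pclass = 3, male = 1, age = 30)
print(round(c(woman_first = p_woman_first, man_third = p_man_third), 2))
```

Under this model a 30-year-old woman in first class has a predicted survival probability of about 0.92, against about 0.09 for a 30-year-old man in third class, and no first name is needed to get there.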
# Survival counts by passenger class, split by sex
ggplot(data = titanicData, aes(x = pclass, fill = as.factor(survived))) +
  geom_bar() +
  facet_grid(~sex)
# Order the age groups from youngest to oldest before plotting
titanicData$age_group <- factor(titanicData$age_group, levels = c("Baby", "Young child", "Child", "Teen", "Young adult", "Adult", "Senior"))
# Survival counts by age group
ggplot(data = titanicData, aes(x = age_group, fill = as.factor(survived))) +
  geom_bar()
These names might look lucky because they belonged to many people who survived, but it is much more likely that they were simply common names at the time. Age, class and sex are much better predictors of survival than name alone.
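To make the confounding idea concrete, here is a small sketch with made-up counts (not the Titanic data). A hypothetical "lucky" name is given mostly to women, and survival depends only on sex: 75% for women and 20% for men, whatever the name. The columns `lucky_name`, `survived` and `died` are assumptions for illustration.

```r
# Made-up cell counts: survival depends on sex only, never on the name.
cells <- data.frame(
  lucky_name = c(TRUE, TRUE, FALSE, FALSE),
  sex        = c("female", "male", "female", "male"),
  survived   = c(60, 4, 15, 16),
  died       = c(20, 16, 5, 64)
)

# Naive model: name only. The name looks like a strong predictor.
naive <- glm(cbind(survived, died) ~ lucky_name,
             family = binomial, data = cells)

# Adjusted model: name plus the confounder (sex).
adjusted <- glm(cbind(survived, died) ~ lucky_name + sex,
                family = binomial, data = cells)

name_effect_naive    <- unname(coef(naive)["lucky_nameTRUE"])
name_effect_adjusted <- unname(coef(adjusted)["lucky_nameTRUE"])
print(round(c(naive = name_effect_naive, adjusted = name_effect_adjusted), 3))
```

In the naive model the name has a clearly positive coefficient, but once sex is included the name effect collapses to essentially zero. The name only looked lucky because it was a stand-in for being female.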