One of the issues we have is that we want to include passenger class in our analysis but we don’t want to lose all the data on the crew. Let’s look at the relationship between Passenger Class and Crew as variables using crosstab().
crosstab(MV_Princess_Victoria, col.vars = "Crew", format = "frequency")
## NULL
## 0 1
## N 129 50
crosstab(MV_Princess_Victoria, col.vars = "Nationality_of_Passenger", format = "frequency")
## NULL
## England Ireland Northern Ireland Scotland Wales
## N 21 5 111 40 2
# Add a single variable crosstab for "Passenger Status"
# Add a crosstab of "Crew" and "Nationality of Passenger" with "Nationality of Passenger" as the row variable and "Crew" as the column variable.
crosstab(MV_Princess_Victoria, row.vars= "Nationality_of_Passenger", col.vars = "Crew", title = "Nationality_of_Passenger vs. Crew", format = "frequency")
## Nationality_of_Passenger vs. Crew
## 0 1
## England 21 0
## Ireland 5 0
## Northern Ireland 88 23
## Scotland 13 27
## Wales 2 0
## Total N 129 50
# Run this table to see missing values
table(MV_Princess_Victoria$Crew, MV_Princess_Victoria$`Nationality_of_Passenger`, useNA = "ifany")
##
## England Ireland Northern Ireland Scotland Wales
## 0 21 5 88 13 2
## 1 0 0 23 27 0
The table shows no missing values for Nationality of Passenger, so this variable does not limit my ability to analyze the data: every person has a value recorded for it.
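To see what `useNA = "ifany"` adds, here is a minimal sketch on a small made-up data frame. The `toy` object and its values are hypothetical and are not the ship's records. When a variable has a missing value, `table()` adds an `<NA>` column; when nothing is missing, as in the table above, no such column appears.

```r
# Hypothetical toy data, invented only to illustrate useNA; not the real records.
toy <- data.frame(
  Crew = c(0, 0, 1, 1, 0),
  Nationality = c("England", NA, "Scotland", "Scotland", "Wales"),
  stringsAsFactors = FALSE
)

# Default behaviour silently drops the row with the missing nationality.
without_na <- table(toy$Crew, toy$Nationality)

# useNA = "ifany" adds an <NA> column only when missing values exist.
with_na <- table(toy$Crew, toy$Nationality, useNA = "ifany")

print(with_na)
sum(without_na)  # number of rows counted when NAs are dropped
sum(with_na)     # number of rows counted when NAs are kept
```

Comparing the two sums is a quick way to tell how many rows a plain table would leave out.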
Now we want to create a new variable called “Class.Combined” that combines the two variables by adding “Crew” as a status.
MV_Princess_Victoria$Class.Combined<- ifelse(MV_Princess_Victoria$Crew == 1, "Crew", MV_Princess_Victoria$`Nationality_of_Passenger`)
The code above tells R to combine Nationality of Passenger and Crew using the “ifelse” function. R looks at each person in the dataset and asks whether that person is a crew member. If yes, they become “Crew” in our new variable, “Class.Combined”. If no, the person is a passenger and R uses their “Nationality of Passenger” value in “Class.Combined”.
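The row-by-row logic can be seen on a tiny example. This is only a sketch: the `toy` data frame is hypothetical, and the `as.character()` wrapper is an added precaution (an assumption, not part of the original code) in case the nationality column is stored as a factor.

```r
# Hypothetical toy data, invented for illustration only.
toy <- data.frame(
  Crew = c(1, 0, 0, 1),
  Nationality_of_Passenger = c("Scotland", "England", "Wales", "Northern Ireland"),
  stringsAsFactors = FALSE
)

# ifelse() checks the condition row by row:
#   TRUE  -> "Crew"
#   FALSE -> that row's nationality
# as.character() guards against a factor column, whose integer codes
# ifelse() would otherwise return instead of the labels.
toy$Class.Combined <- ifelse(
  toy$Crew == 1,
  "Crew",
  as.character(toy$Nationality_of_Passenger)
)

print(toy)
```

Rows 1 and 4 are crew, so their nationality is replaced by "Crew"; rows 2 and 3 keep "England" and "Wales".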
crosstab(MV_Princess_Victoria, row.vars ="Crew", col.vars = "Class.Combined", title="Nationality of Passenger vs. Crew", format="frequency")
## Nationality of Passenger vs. Crew
## Crew England Ireland Northern Ireland Scotland Wales
## 0 0 21 5 88 13 2
## 1 50 0 0 0 0 0
## Total N 50 21 5 88 13 2
Add a crosstab of “Class.Combined” with “Survival” as the row variable, first as frequencies and then with format = "col_percent" to get column percents.
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = "Class.Combined", title = "Survival by Crew", format = "col_frequency")
## Survival by Crew
## Crew England Ireland Northern Ireland Scotland Wales
## 0 40 14 5 67 9 0
## 1 10 7 0 21 4 2
## Total N 50 21 5 88 13 2
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = "Class.Combined", title = "Survival by Crew and Nationality of Passenger", format = "col_percent")
## Survival by Crew and Nationality of Passenger
## Crew England Ireland Northern Ireland Scotland Wales
## 0 80 66.7 100 76.1 69.2 0
## 1 20 33.3 0 23.9 30.8 100
## Total N 50 21 5 88 13 2
Passengers from Northern Ireland were the largest group on board, so they account for most of the deaths, but being the majority does not by itself mean a greater chance of dying. The MV Princess Victoria carried 88 passengers from Northern Ireland, 21 from England, 13 from Scotland, 5 from Ireland, and only 2 from Wales, plus 50 crew. Only 23.9% of the Northern Ireland passengers survived, compared with 33.3% of passengers from England, 30.8% from Scotland, 20% of the crew, none of the 5 passengers from Ireland, and both passengers from Wales. Survival therefore differed by nationality, although the Ireland and Wales percentages rest on very few people.
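The column percents can be reproduced from the frequency table with base R, which helps when checking an interpretation. The sketch below types in the counts from the "Survival by Crew" frequency table above and uses `prop.table()`; the object names `counts` and `col_pct` are just illustrative.

```r
# Counts copied from the "Survival by Crew" frequency table above.
# matrix() fills column by column, so each pair is (perished, survived).
counts <- matrix(
  c(40, 10,   # Crew
    14,  7,   # England
     5,  0,   # Ireland
    67, 21,   # Northern Ireland
     9,  4,   # Scotland
     0,  2),  # Wales
  nrow = 2,
  dimnames = list(
    Survival = c("0", "1"),
    Class.Combined = c("Crew", "England", "Ireland",
                       "Northern Ireland", "Scotland", "Wales")
  )
)

# margin = 2 divides each cell by its column total, giving column percents.
col_pct <- round(100 * prop.table(counts, margin = 2), 1)
print(col_pct)
```

Reading along the Survival = 1 row gives each group's survival percentage, which is the comparison the paragraph above relies on.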
We have previously looked at survival by gender. This time let’s look at the combination of Class.Combined and Gender.
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Class.Combined","Gender"), title = "Survival by Crew and Gender", format = "col_percent")
## Survival by Crew and Gender
## Crew 0 England 0 Ireland 0 Northern Ireland 0 Scotland 0 Wales 0
## 0 77.8 53.3 100 71.6 50 0
## 1 22.2 46.7 0 28.4 50 100
## Total N 45 15 4 74 8 2
## Crew 1 England 1 Ireland 1 Northern Ireland 1 Scotland 1 Wales 1
## 0 100 100 100 100 100 NaN
## 1 0 0 0 0 0 NaN
## Total N 5 6 1 14 5 0
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Class.Combined", "Gender"), title = "Survival by Crew and Gender", format = "col_frequency")
## Survival by Crew and Gender
## Crew 0 England 0 Ireland 0 Northern Ireland 0 Scotland 0 Wales 0
## 0 35 8 4 53 4 0
## 1 10 7 0 21 4 2
## Total N 45 15 4 74 8 2
## Crew 1 England 1 Ireland 1 Northern Ireland 1 Scotland 1 Wales 1
## 0 5 6 1 14 5 0
## 1 0 0 0 0 0 0
## Total N 5 6 1 14 5 0
# Do the same table but change the order of the column variables.
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Gender","Class.Combined"), title = "Survival by Crew and Gender", format = "col_percent")
## Survival by Crew and Gender
## 0 Crew 1 Crew 0 England 1 England 0 Ireland 1 Ireland
## 0 77.8 100 53.3 100 100 100
## 1 22.2 0 46.7 0 0 0
## Total N 45 5 15 6 4 1
## 0 Northern Ireland 1 Northern Ireland 0 Scotland 1 Scotland
## 0 71.6 100 50 100
## 1 28.4 0 50 0
## Total N 74 14 8 5
## 0 Wales 1 Wales
## 0 0 NaN
## 1 100 NaN
## Total N 2 0
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Gender", "Class.Combined"), title = "Survival by Crew and Gender", format = "col_frequency")
## Survival by Crew and Gender
## 0 Crew 1 Crew 0 England 1 England 0 Ireland 1 Ireland
## 0 35 5 8 6 4 1
## 1 10 0 7 0 0 0
## Total N 45 5 15 6 4 1
## 0 Northern Ireland 1 Northern Ireland 0 Scotland 1 Scotland
## 0 53 14 4 5
## 1 21 0 4 0
## Total N 74 14 8 5
## 0 Wales 1 Wales
## 0 0 0
## 1 2 0
## Total N 2 0
#Do the same table but create a frequency table.
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Gender", "Nationality_of_Passenger"), title = "Survival by Gender and Nationality of Passenger", format = "col_frequency")
## Survival by Gender and Nationality of Passenger
## 0 England 1 England 0 Ireland 1 Ireland 0 Northern Ireland
## 0 8 6 4 1 71
## 1 7 0 0 0 24
## Total N 15 6 4 1 95
## 1 Northern Ireland 0 Scotland 1 Scotland 0 Wales 1 Wales
## 0 16 21 8 0 0
## 1 0 11 0 2 0
## Total N 16 32 8 2 0
All three tables include crew and nationality of passenger. The three tables are related to each other because Survival is the row variable in all of them. The second table provides more information because it divides the crew and passengers by gender. It also shows who perished by gender, by passenger or crew status, and by nationality.
I analyzed the data and concluded that male crew and passengers had a greater chance of surviving the disaster. No female in any table survived, regardless of nationality, while male survival was above zero in every group except passengers from Ireland.
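Two details of the tables above can be checked with a few lines of base R: the overall survival percentage for each Gender code, and why some cells print as NaN. This is a sketch using counts typed in from the Gender by Class.Combined frequency table above; the object names are illustrative.

```r
groups <- c("Crew", "England", "Ireland", "Northern Ireland", "Scotland", "Wales")

# Counts copied from the Gender by Class.Combined frequency table above.
# Rows are the Gender codes 0 and 1.
perished <- rbind("0" = c(35, 8, 4, 53, 4, 0),
                  "1" = c(5, 6, 1, 14, 5, 0))
survived <- rbind("0" = c(10, 7, 0, 21, 4, 2),
                  "1" = c(0, 0, 0, 0, 0, 0))
colnames(perished) <- groups
colnames(survived) <- groups

# Percent surviving in each Gender x Class.Combined cell.
# A cell with nobody in it is 0 / 0, which R reports as NaN.
pct_survived <- round(100 * survived / (survived + perished), 1)
print(pct_survived)

# Overall percent surviving by Gender code, ignoring nationality.
overall <- round(100 * rowSums(survived) / rowSums(survived + perished), 1)
print(overall)
```

The NaN for Gender 1 in the Wales column simply means there was nobody in that cell, so no percentage can be computed; it is not evidence about survival either way.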
Now let’s add a third variable, Child. Remember that for MV_Princess_Victoria we have to create the Child variable.
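As a minimal sketch of how such a variable can be built, the example below flags anyone aged 15 or under. The `toy` data frame and its ages are hypothetical, and storing Age as text is an assumption made only to show why `as.numeric()` is needed before the comparison.

```r
# Hypothetical ages, invented for illustration; Age is stored as text here
# to mirror a column that needs as.numeric() before it can be compared.
toy <- data.frame(Age = c("8", "15", "16", "42", NA), stringsAsFactors = FALSE)

# TRUE for anyone aged 15 or under, FALSE otherwise.
# A missing age stays NA rather than being counted as FALSE.
toy$Child <- as.numeric(toy$Age) <= 15

print(toy)
table(toy$Child, useNA = "ifany")
```

Rows with a missing Age end up with Child = NA, so it is worth checking how many there are before interpreting a table that uses Child.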
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Gender","Class.Combined", "Child"), title = "Survival by Crew and Gender", format = "col_percent")
## Survival by Crew and Gender
## 0 0 1 0 0 1 1 1
## 0 69.7 100 100 NaN
## 1 30.3 0 0 NaN
## Total N 142 31 5 0
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Class.Combined", "Gender", "Child"), title = "Survival by Crew and Gender", format = "col_percent")
## Survival by Crew and Gender
## Crew 0 England 0 Ireland 0 Northern Ireland 0 Scotland 0 Wales 0
## 0 80 65 100 75.9 69.2 0
## 1 20 35 0 24.1 30.8 100
## Total N 50 20 5 83 13 2
## Crew 1 England 1 Ireland 1 Northern Ireland 1 Scotland 1 Wales 1
## 0 NaN 100 NaN 100 NaN NaN
## 1 NaN 0 NaN 0 NaN NaN
## Total N 0 1 0 4 0 0
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Gender", "Class.Combined", "Child"), title = "Survival by Crew and Gender", format = "col_frequency")
## Survival by Crew and Gender
## 0 0 1 0 0 1 1 1
## 0 99 31 5 0
## 1 43 0 0 0
## Total N 142 31 5 0
MV_Princess_Victoria$Child <- as.numeric(MV_Princess_Victoria$Age) <=15
# Run a cross tab with Class.Combined, Gender and Child as column variables and survival as the row variable.
# You can run several variations if you want.
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Class.Combined", "Gender", "Child"), title = "Survival by Crew,Gender and Child", format = "col_percent")
## Survival by Crew,Gender and Child
## Crew FALSE England FALSE Ireland FALSE Northern Ireland FALSE
## 0 78.3 50 100 65.1
## 1 21.7 50 0 34.9
## Total N 23 8 2 43
## Scotland FALSE Wales FALSE Crew TRUE England TRUE Ireland TRUE
## 0 70 0 NaN 100 NaN
## 1 30 100 NaN 0 NaN
## Total N 10 2 0 1 0
## Northern Ireland TRUE Scotland TRUE Wales TRUE
## 0 100 NaN NaN
## 1 0 NaN NaN
## Total N 4 0 0
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Class.Combined", "Gender", "Child"), title = "Survival by Crew,Gender and Child", format = "col_frequency")
## Survival by Crew,Gender and Child
## Crew FALSE England FALSE Ireland FALSE Northern Ireland FALSE
## 0 18 4 2 28
## 1 5 4 0 15
## Total N 23 8 2 43
## Scotland FALSE Wales FALSE Crew TRUE England TRUE Ireland TRUE
## 0 7 0 0 1 0
## 1 3 2 0 0 0
## Total N 10 2 0 1 0
## Northern Ireland TRUE Scotland TRUE Wales TRUE
## 0 4 0 0
## 1 0 0 0
## Total N 4 0 0
In these tables Child is TRUE for people aged 15 or under and FALSE for everyone older. Among those who were not children, 18 crew members perished, along with 4 passengers from England, 2 from Ireland, 28 from Northern Ireland, and 7 from Scotland; no non-child passengers from Wales perished. Among those who were not children, 5 crew members survived, along with 4 passengers from England, 15 from Northern Ireland, 3 from Scotland, and 2 from Wales; no non-child passengers from Ireland survived. Among children, 1 passenger from England and 4 from Northern Ireland perished; there were no children among the crew or among the passengers from Ireland, Scotland, or Wales. No child survived.
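Adding the counts up across crew and nationality makes the comparison between children and non-children easier to see. The sketch below types in the counts from the frequency table above; the object names are illustrative.

```r
groups <- c("Crew", "England", "Ireland", "Northern Ireland", "Scotland", "Wales")

# Counts copied from the frequency table above (Child = FALSE / TRUE columns).
perished <- rbind("FALSE" = c(18, 4, 2, 28, 7, 0),
                  "TRUE"  = c(0, 1, 0, 4, 0, 0))
survived <- rbind("FALSE" = c(5, 4, 0, 15, 3, 2),
                  "TRUE"  = c(0, 0, 0, 0, 0, 0))
colnames(perished) <- groups
colnames(survived) <- groups

# Collapse across crew/nationality to compare children with non-children.
by_child <- cbind(Perished = rowSums(perished), Survived = rowSums(survived))

# Percent surviving within each Child group.
pct <- round(100 * by_child[, "Survived"] / rowSums(by_child), 1)
by_child <- cbind(by_child, PctSurvived = pct)
print(by_child)
```

Only 5 people in this table are children, so the 0% figure for them rests on a very small group.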
Overall (using all of the tables above as needed):
#### Does being a child increase or decrease someone’s chances of survival compared to not being a child?
Being a child decreases the chances of survival compared to not being a child: none of the children in the tables above survived.
Being female decreases someone’s chances of survival compared to being male.
Unable to compare the data, since the MV Princess Victoria dataset does not have data on passenger class.
Unable to compare the data, since the MV Princess Victoria dataset does not have a passenger class variable.
Unable to compare crew data to passenger class, since the MV Princess Victoria dataset does not have a passenger class variable.
Does this seem to be getting a bit complex? That is one reason why sociologists will often switch to a linear model. These may seem complex at first but compared to reading a complex crosstab they may be easier.
Here’s how to run a linear model called a logistic regression. Just run it.
results<-glm(Survival~Class.Combined + Gender + Child, data=MV_Princess_Victoria, family = binomial(link = logit))
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
coef(results)
## (Intercept) Class.CombinedEngland
## -1.2237754 1.5114575
## Class.CombinedIreland Class.CombinedNorthern Ireland
## -16.2907453 0.6737291
## Class.CombinedScotland Class.CombinedWales
## 1.2237754 19.7898439
## Gender ChildTRUE
## -18.1772766 -18.2151220
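Logistic regression coefficients are on the log-odds scale, so it can help to convert a few of them to probabilities. The sketch below copies four values from the `coef(results)` output above and uses `plogis()`, the inverse logit in base R. The very large negative coefficients, together with the warning about fitted probabilities of 0 or 1, are what one would expect when some group has no survivors at all, so their exact size should not be over-read.

```r
# Coefficients copied from the coef(results) output above.
b <- c(
  "(Intercept)"            = -1.2237754,
  "Class.CombinedScotland" =  1.2237754,
  "Gender"                 = -18.1772766,
  "ChildTRUE"              = -18.2151220
)

# plogis() is the inverse logit: it converts log-odds to a probability.
# Reference group: crew, Gender = 0, Child = FALSE.
p_reference <- plogis(b[["(Intercept)"]])

# Same Gender and Child values, but a passenger from Scotland.
p_scotland <- plogis(b[["(Intercept)"]] + b[["Class.CombinedScotland"]])

# Reference group with Gender = 1: the probability is effectively zero.
p_gender1 <- plogis(b[["(Intercept)"]] + b[["Gender"]])

round(c(reference = p_reference, scotland = p_scotland, gender1 = p_gender1), 4)
```

Each nationality coefficient is a difference from the crew, which is the category left out of the coefficient list, so a positive value means higher predicted odds of survival than the crew and a negative value means lower.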
We are going to discuss this together, but for right now:
Child has a negative coefficient, meaning that being a child lowers the predicted chance of surviving the MV Princess Victoria disaster; all of the children perished.
Gender has a negative coefficient, meaning that being female lowers the predicted chance of surviving the disaster; all of the females perished.
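Crew is the reference category for Class.Combined, so each nationality coefficient compares that group of passengers with the crew. The coefficients for England, Northern Ireland, Scotland, and Wales are positive, so passengers from those places had higher odds of survival than crew members; only passengers from Ireland have a negative coefficient.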
Unable to make a determination, since the MV Princess Victoria dataset does not have a passenger class variable.
Unable to make a determination, since the MV Princess Victoria dataset does not have a passenger class. My Class.Combined variable is based on Nationality of Passenger and crew member information.
The answers above agree with the model Survival ~ Class.Combined + Gender + Child, which gives negative coefficients for female gender and for Child. The coefficient for Northern Ireland is positive (0.67) relative to the crew, so although most of the passengers on board were from Northern Ireland, the model does not show that being from Northern Ireland lowered the chance of survival.