One of the issues we have is that we want to include passenger class in our analysis, but we don’t want to lose all the data on the crew. Let’s look at the relationship between Passenger Class and Crew as variables using crosstab().

crosstab(MV_Princess_Victoria, col.vars = "Crew", format = "frequency")
## NULL
##   0   1 
## N 129 50
crosstab(MV_Princess_Victoria, col.vars = "Nationality_of_Passenger", format = "frequency")
## NULL
##   England Ireland Northern Ireland Scotland Wales
## N 21      5       111              40       2
# Add a single variable crosstab for "Passenger Status"


# Add a crosstab of "Crew" and "Nationality of Passenger". In the call below, "Nationality_of_Passenger" is the row variable and "Crew" is the column variable.
crosstab(MV_Princess_Victoria, row.vars= "Nationality_of_Passenger", col.vars = "Crew", title = "Nationality_of_Passenger vs. Crew", format = "frequency")
## Nationality_of_Passenger vs. Crew
##                  0   1 
## England          21  0 
## Ireland          5   0 
## Northern Ireland 88  23
## Scotland         13  27
## Wales            2   0 
## Total N          129 50
# Run this table to see missing values
table(MV_Princess_Victoria$Crew, MV_Princess_Victoria$`Nationality_of_Passenger`, useNA = "ifany")
##    
##     England Ireland Northern Ireland Scotland Wales
##   0      21       5               88       13     2
##   1       0       0               23       27     0

Explain how the current variable “Nationality of Passenger” limits our ability to analyze the data.

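My current variable Nationality of Passenger does not cause missing data, because every person on board has a nationality recorded, including the crew (all 50 crew members are listed under Northern Ireland or Scotland). What it does limit is the comparison of passengers with crew: nationality by itself does not tell us whether a person was a passenger or a crew member.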
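The effect of `useNA = "ifany"` is easier to see on data that actually has a missing value. The sketch below uses a small made-up data frame (`toy` and its values are hypothetical, not taken from MV_Princess_Victoria) to compare the default `table()` with the version that reports missing values.

```r
# Toy data for illustration only; not the real MV_Princess_Victoria file
toy <- data.frame(
  Crew = c(0, 0, 1, 1, 0),
  Nationality = c("England", NA, "Scotland", "Scotland", "Wales"),
  stringsAsFactors = FALSE
)

# By default table() silently drops the person with the missing nationality
default_counts <- table(toy$Crew, toy$Nationality)

# useNA = "ifany" adds an <NA> column, but only when missing values exist
with_na <- table(toy$Crew, toy$Nationality, useNA = "ifany")

print(default_counts)
print(with_na)
```

In the real output above no `<NA>` column appears, which suggests that neither variable has missing values.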

Data Wrangling

Now we want to create a new variable called “Class.Combined” that combines the two variables by adding “Crew” as a status.

MV_Princess_Victoria$Class.Combined<- ifelse(MV_Princess_Victoria$Crew == 1, "Crew", MV_Princess_Victoria$`Nationality_of_Passenger`)

Explain in words what the above code does.

The code above tells R to combine Nationality of Passenger and Crew using the “ifelse” function. R looks at each person in the dataset and asks whether that person is a crew member. If yes, the person is labeled “Crew” in our new variable, “Class.Combined”. If no, the person is a passenger, and R copies that person’s “Nationality of Passenger” value into “Class.Combined”.
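The row-by-row logic described above can be sketched on a few made-up rows. The data frame `toy` and its values are hypothetical and only illustrate how `ifelse()` chooses between the two options.

```r
# Toy data for illustration only
toy <- data.frame(
  Crew = c(1, 0, 0, 1),
  Nationality = c("Scotland", "England", "Wales", "Northern Ireland"),
  stringsAsFactors = FALSE  # keep Nationality as character, not factor
)

# For each row: if Crew == 1 use the label "Crew", otherwise keep the nationality
toy$Class.Combined <- ifelse(toy$Crew == 1, "Crew", toy$Nationality)

print(toy)
```

One design point: if the nationality column were stored as a factor, `ifelse()` would return the factor's underlying integer codes rather than the labels, so it is worth checking that the column is character first.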

Add a crosstab of “Crew” and “Class.Combined” with “Crew” as the row variable.

crosstab(MV_Princess_Victoria, row.vars ="Crew", col.vars = "Class.Combined", title="Nationality of Passenger vs. Crew", format="frequency")
## Nationality of Passenger vs. Crew
##         Crew England Ireland Northern Ireland Scotland Wales
## 0       0    21      5       88               13       2    
## 1       50   0       0       0                0        0    
## Total N 50   21      5       88               13       2

Add a crosstab of “Class.Combined” with “Survival” as the row variable and format = "col_percent" to get column percents.

crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = "Class.Combined", title = "Survival by Crew", format = "col_frequency")
## Survival by Crew
##         Crew England Ireland Northern Ireland Scotland Wales
## 0       40   14      5       67               9        0    
## 1       10   7       0       21               4        2    
## Total N 50   21      5       88               13       2
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = "Class.Combined", title = "Survival by Crew and Nationality of Passenger", format = "col_percent")
## Survival by Crew and Nationality of Passenger
##         Crew England Ireland Northern Ireland Scotland Wales
## 0       80   66.7    100     76.1             69.2     0    
## 1       20   33.3    0       23.9             30.8     100  
## Total N 50   21      5       88               13       2

Write a paragraph summarizing the relationship between Class.Combined and Survival.

Passengers from Northern Ireland were the largest group on board (88 people), followed by the crew (50), passengers from England (21), Scotland (13), Ireland (5) and Wales (2). Only 23.9% of the Northern Ireland passengers survived, and this group accounts for the largest number of deaths (67), but that is because it was the largest group, not because its survival rate was the lowest. Survival rates differed across groups: both passengers from Wales survived (100%), compared with 33.3% of passengers from England, 30.8% from Scotland, 23.9% from Northern Ireland, 20% of the crew and none of the 5 passengers from Ireland. The Wales and Ireland percentages are based on very small numbers.
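The column percentages can be reproduced from the frequency counts by dividing each cell by its column total. The sketch below retypes the counts from the col_frequency table above into a matrix (the object names `counts` and `col_pct` are just for illustration) and uses base R's `prop.table()`.

```r
# Counts copied from the "Survival by Crew" col_frequency table above
counts <- matrix(
  c(40, 14, 5, 67, 9, 0,   # Survival = 0 (perished)
    10,  7, 0, 21, 4, 2),  # Survival = 1 (survived)
  nrow = 2, byrow = TRUE,
  dimnames = list(
    Survival = c("0", "1"),
    Class.Combined = c("Crew", "England", "Ireland",
                       "Northern Ireland", "Scotland", "Wales")
  )
)

# margin = 2 divides each cell by its column total, giving column percents
col_pct <- round(100 * prop.table(counts, margin = 2), 1)

print(col_pct)
```

Because each column is divided by its own total, groups of very different sizes (88 versus 2 people) can be compared on the same 0 to 100 scale.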

We have previously looked at survival by gender. This time let’s look at the combination of Class.Combined and Gender.

crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Class.Combined","Gender"), title = "Survival by Crew and Gender", format = "col_percent")
## Survival by Crew and Gender
##         Crew 0 England 0 Ireland 0 Northern Ireland 0 Scotland 0 Wales 0
## 0       77.8   53.3      100       71.6               50         0      
## 1       22.2   46.7      0         28.4               50         100    
## Total N 45     15        4         74                 8          2      
##         Crew 1 England 1 Ireland 1 Northern Ireland 1 Scotland 1 Wales 1
## 0       100    100       100       100                100        NaN    
## 1       0      0         0         0                  0          NaN    
## Total N 5      6         1         14                 5          0
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Gender", "Class.Combined"), title = "Survival by Crew and Gender", format = "col_percent")
## Survival by Crew and Gender
##         0 Crew 1 Crew 0 England 1 England 0 Ireland 1 Ireland
## 0       77.8   100    53.3      100       100       100      
## 1       22.2   0      46.7      0         0         0        
## Total N 45     5      15        6         4         1        
##         0 Northern Ireland 1 Northern Ireland 0 Scotland 1 Scotland
## 0       71.6               100                50         100       
## 1       28.4               0                  50         0         
## Total N 74                 14                 8          5         
##         0 Wales 1 Wales
## 0       0       NaN    
## 1       100     NaN    
## Total N 2       0
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Class.Combined", "Gender"), title = "Survival by Crew and Gender", format = "col_frequency")
## Survival by Crew and Gender
##         Crew 0 England 0 Ireland 0 Northern Ireland 0 Scotland 0 Wales 0
## 0       35     8         4         53                 4          0      
## 1       10     7         0         21                 4          2      
## Total N 45     15        4         74                 8          2      
##         Crew 1 England 1 Ireland 1 Northern Ireland 1 Scotland 1 Wales 1
## 0       5      6         1         14                 5          0      
## 1       0      0         0         0                  0          0      
## Total N 5      6         1         14                 5          0
# Do the same table but change the order of the column variables.

crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Gender","Class.Combined"), title = "Survival by Crew and Gender", format = "col_percent")
## Survival by Crew and Gender
##         0 Crew 1 Crew 0 England 1 England 0 Ireland 1 Ireland
## 0       77.8   100    53.3      100       100       100      
## 1       22.2   0      46.7      0         0         0        
## Total N 45     5      15        6         4         1        
##         0 Northern Ireland 1 Northern Ireland 0 Scotland 1 Scotland
## 0       71.6               100                50         100       
## 1       28.4               0                  50         0         
## Total N 74                 14                 8          5         
##         0 Wales 1 Wales
## 0       0       NaN    
## 1       100     NaN    
## Total N 2       0
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Class.Combined", "Gender"), title = "Survival by Crew and Gender", format = "col_percent")
## Survival by Crew and Gender
##         Crew 0 England 0 Ireland 0 Northern Ireland 0 Scotland 0 Wales 0
## 0       77.8   53.3      100       71.6               50         0      
## 1       22.2   46.7      0         28.4               50         100    
## Total N 45     15        4         74                 8          2      
##         Crew 1 England 1 Ireland 1 Northern Ireland 1 Scotland 1 Wales 1
## 0       100    100       100       100                100        NaN    
## 1       0      0         0         0                  0          NaN    
## Total N 5      6         1         14                 5          0
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Gender", "Class.Combined"), title = "Survival by Crew and Gender", format = "col_frequency")
## Survival by Crew and Gender
##         0 Crew 1 Crew 0 England 1 England 0 Ireland 1 Ireland
## 0       35     5      8         6         4         1        
## 1       10     0      7         0         0         0        
## Total N 45     5      15        6         4         1        
##         0 Northern Ireland 1 Northern Ireland 0 Scotland 1 Scotland
## 0       53                 14                 4          5         
## 1       21                 0                  4          0         
## Total N 74                 14                 8          5         
##         0 Wales 1 Wales
## 0       0       0      
## 1       2       0      
## Total N 2       0
#Do the same table but create a frequency table.
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Gender", "Nationality_of_Passenger"), title = "Survival by Gender and Nationality of Passenger", format = "col_frequency")
## Survival by Gender and Nationality of Passenger
##         0 England 1 England 0 Ireland 1 Ireland 0 Northern Ireland
## 0       8         6         4         1         71                
## 1       7         0         0         0         24                
## Total N 15        6         4         1         95                
##         1 Northern Ireland 0 Scotland 1 Scotland 0 Wales 1 Wales
## 0       16                 21         8          0       0      
## 1       0                  11         0          2       0      
## Total N 16                 32         8          2       0

Explain how the three tables relate to each other.

All three tables use Survival as the row variable and break it down by crew status and nationality of passenger, which is how they relate to each other. The second table provides more information because it further divides the crew and passengers by gender. It shows who perished by gender, by passenger or crew status, and by nationality.

Describe the relationship between Gender, Class.Combined and Survival. Remember that survival is the dependent variable.

After analyzing the data, I concluded that male crew and passengers had a greater chance of surviving the disaster than females. In every table the percentage of male survivors is at least as high as the percentage of female survivors, regardless of nationality of passenger, because no female in any group survived.
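The gender comparison can be checked by adding up the frequency table across all Class.Combined groups. The sketch below retypes the counts from the Class.Combined by Gender col_frequency table above; the object names are only for illustration.

```r
groups <- c("Crew", "England", "Ireland", "Northern Ireland", "Scotland", "Wales")

# Rows are Gender (0 = male, 1 = female); columns are Class.Combined groups
perished <- matrix(
  c(35, 8, 4, 53, 4, 0,
     5, 6, 1, 14, 5, 0),
  nrow = 2, byrow = TRUE, dimnames = list(Gender = c("0", "1"), groups)
)
survived <- matrix(
  c(10, 7, 0, 21, 4, 2,
     0, 0, 0,  0, 0, 0),
  nrow = 2, byrow = TRUE, dimnames = list(Gender = c("0", "1"), groups)
)

# Overall survival rate for each gender, ignoring Class.Combined
survival_rate <- rowSums(survived) / (rowSums(survived) + rowSums(perished))

print(round(100 * survival_rate, 1))
```

This collapses the three-way table into a simple two-way comparison, which is useful for checking an overall claim before looking at individual groups.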

Now let’s add a third variable, Child. Remember that for MV_Princess_Victoria we have to create the Child variable.
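Creating a Child variable from Age can be sketched on made-up ages as follows. The data frame `toy` is hypothetical, and the cutoff of 15 is an assumption chosen to match the cutoff used later in this document. The main thing to notice is that a missing age produces NA rather than TRUE or FALSE.

```r
# Toy ages stored as text, with one missing value
toy <- data.frame(Age = c("34", "12", NA, "15", "60"), stringsAsFactors = FALSE)

# Convert to numeric, then flag anyone aged 15 or under as a child
toy$Child <- as.numeric(toy$Age) <= 15

# useNA = "ifany" shows how many people could not be classified
print(table(toy$Child, useNA = "ifany"))
```

People with an NA for Child are typically left out of crosstabs and regression models, so the totals in tables that use Child can be smaller than the full dataset.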

crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Gender","Class.Combined", "Child"), title = "Survival by Crew and Gender", format = "col_percent")
## Survival by Crew and Gender
##         0 0  1 0 0 1 1 1
## 0       69.7 100 100 NaN
## 1       30.3 0   0   NaN
## Total N 142  31  5   0
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Class.Combined", "Gender", "Child"), title = "Survival by Crew and Gender", format = "col_percent")
## Survival by Crew and Gender
##         Crew 0 England 0 Ireland 0 Northern Ireland 0 Scotland 0 Wales 0
## 0       80     65        100       75.9               69.2       0      
## 1       20     35        0         24.1               30.8       100    
## Total N 50     20        5         83                 13         2      
##         Crew 1 England 1 Ireland 1 Northern Ireland 1 Scotland 1 Wales 1
## 0       NaN    100       NaN       100                NaN        NaN    
## 1       NaN    0         NaN       0                  NaN        NaN    
## Total N 0      1         0         4                  0          0
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Gender", "Class.Combined", "Child"), title = "Survival by Crew and Gender", format = "col_frequency")
## Survival by Crew and Gender
##         0 0 1 0 0 1 1 1
## 0       99  31  5   0  
## 1       43  0   0   0  
## Total N 142 31  5   0
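# Child is TRUE for ages 15 and under, FALSE for older ages, and NA where Age is missing or not numeric
MV_Princess_Victoria$Child <- as.numeric(MV_Princess_Victoria$Age) <= 15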


# Run a cross tab with Class.Combined, Gender and Child as column variables and survival as the row variable. 
# You can run several variations if you want.
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Class.Combined", "Gender", "Child"), title = "Survival by Crew,Gender and Child", format = "col_percent")
## Survival by Crew,Gender and Child
##         Crew FALSE England FALSE Ireland FALSE Northern Ireland FALSE
## 0       78.3       50            100           65.1                  
## 1       21.7       50            0             34.9                  
## Total N 23         8             2             43                    
##         Scotland FALSE Wales FALSE Crew TRUE England TRUE Ireland TRUE
## 0       70             0           NaN       100          NaN         
## 1       30             100         NaN       0            NaN         
## Total N 10             2           0         1            0           
##         Northern Ireland TRUE Scotland TRUE Wales TRUE
## 0       100                   NaN           NaN       
## 1       0                     NaN           NaN       
## Total N 4                     0             0
crosstab(MV_Princess_Victoria, row.vars = "Survival", col.vars = c("Class.Combined", "Gender", "Child"), title = "Survival by Crew,Gender and Child", format = "col_frequency")
## Survival by Crew,Gender and Child
##         Crew FALSE England FALSE Ireland FALSE Northern Ireland FALSE
## 0       18         4             2             28                    
## 1       5          4             0             15                    
## Total N 23         8             2             43                    
##         Scotland FALSE Wales FALSE Crew TRUE England TRUE Ireland TRUE
## 0       7              0           0         1            0           
## 1       3              2           0         0            0           
## Total N 10             2           0         1            0           
##         Northern Ireland TRUE Scotland TRUE Wales TRUE
## 0       4                     0             0         
## 1       0                     0             0         
## Total N 4                     0             0

Describe the combined relationships between the variables.

In these tables Child is FALSE for people over 15 and TRUE for children aged 15 or under, and the columns combine Class.Combined with Child. Among those who were not children, 18 crew members perished and 5 survived; 4 passengers from England perished and 4 survived; both passengers from Ireland perished; 28 passengers from Northern Ireland perished and 15 survived; 7 passengers from Scotland perished and 3 survived; and both passengers from Wales survived. Only 5 children appear in the table, 1 from England and 4 from Northern Ireland, and all 5 perished. There were no children among the crew or among the passengers from Ireland, Scotland or Wales, so no child from any group survived.

Overall (using all of the tables above as needed):

Does being a child increase or decrease someone’s chances of survival compared to not being a child?

Being a child decreases the chances of survival compared to not being a child: none of the 5 children in the tables survived.

Does being female (gender = 1 ) increase or decrease someone’s chances of survival compared to being a male?

Being female decreases someone’s chances of survival compared to being male: none of the 31 females survived, while some males survived in most groups.

Does being a crew member increase or decrease someone’s chances of survival compared to being in first class?

I am unable to make this comparison because the MV Princess Victoria data does not include passenger class.

Does being in third class increase or decrease someone’s chances of survival compared to being in first class?

I am unable to make this comparison because the MV Princess Victoria data does not have a passenger class variable.

Does being a crew member increase or decrease someone’s survival compared to being in third class?

I am unable to compare crew with passenger class because the MV Princess Victoria data does not have a passenger class variable.

Does this seem to be getting a bit complex? That is one reason why sociologists will often switch to a linear model. These may seem complex at first but compared to reading a complex crosstab they may be easier.

Here’s how to run a linear model called a logistic regression. Just run it.

results <- glm(Survival ~ Class.Combined + Gender + Child, data = MV_Princess_Victoria, family = binomial(link = logit))
## Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
coef(results)
##                    (Intercept)          Class.CombinedEngland 
##                     -1.2237754                      1.5114575 
##          Class.CombinedIreland Class.CombinedNorthern Ireland 
##                    -16.2907453                      0.6737291 
##         Class.CombinedScotland            Class.CombinedWales 
##                      1.2237754                     19.7898439 
##                         Gender                      ChildTRUE 
##                    -18.1772766                    -18.2151220
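The coefficients above are on the log-odds scale, which is hard to read directly. The sketch below converts a few of them into predicted probabilities with base R's `plogis()`. It assumes that Crew is the reference category (it has no coefficient of its own in the output), that Gender is coded 0 = male and 1 = female, and that ChildTRUE = 0 means not a child. The coefficient values are retyped from the output; the object names are only for illustration.

```r
# Coefficients copied from coef(results) above
b <- c(
  intercept = -1.2237754,
  scotland  =  1.2237754,
  gender    = -18.1772766,
  child     = -18.2151220
)

# plogis() turns log-odds into a probability: 1 / (1 + exp(-x))
p_crew_male_adult   <- plogis(b[["intercept"]])
p_scot_male_adult   <- plogis(b[["intercept"]] + b[["scotland"]])
p_crew_female_adult <- plogis(b[["intercept"]] + b[["gender"]])

# exp() of a coefficient is an odds ratio relative to the reference group
odds_ratio_scotland <- exp(b[["scotland"]])

print(round(c(p_crew_male_adult, p_scot_male_adult, p_crew_female_adult), 4))
print(round(odds_ratio_scotland, 2))
```

The very large negative values for Gender and ChildTRUE push the predicted probability to essentially zero. That fits the warning printed by glm() and the crosstabs in which no females and no children survived.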

We are going to discuss this together, but for right now:

Is the sign of ChildTRUE positive or negative?

The sign of ChildTRUE is negative (about -18.2). Children had lower odds of survival in the MV Princess Victoria disaster because all of the children in the data perished.

Is the sign of Gender (being female in this case) positive or negative?

The sign of Gender is negative (about -18.2). Females had lower odds of surviving the MV Princess Victoria disaster because all of the females in the data perished.

Is the sign of Class.CombinedCrew (being a crew member) positive or negative?

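Class.CombinedCrew does not appear in the output because Crew is the reference category that the other Class.Combined groups are compared against, so it has no sign of its own. The coefficients for England, Northern Ireland, Scotland and Wales are positive, which means passengers in those groups had higher odds of survival than crew members; only the coefficient for Ireland is negative.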

Is the sign of Class.Combined3 (being in 3rd Class) positive or negative?

Unable to make a determination since MV Princess Victoria did not have a passenger class variable.

Is the size (absolute value) of Class.CombinedCrew bigger or smaller than Class.Combined3?

I am unable to make a determination because the MV Princess Victoria data does not have a passenger class variable. My Class.Combined variable is based on Nationality of Passenger and crew member information.

How do these answers compare with the answers about the cross tabulation?

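These answers agree with the cross tabulations. The model Survival ~ Class.Combined + Gender + Child gives negative coefficients for Gender and ChildTRUE, which matches the tables in which no females and no children survived. The coefficient for Northern Ireland is positive (about 0.67) relative to crew, which is consistent with the crosstab showing 23.9% of Northern Ireland passengers surviving compared with 20% of the crew. Northern Ireland passengers still account for the largest number of deaths because they were the largest group on board.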