Now that we have our data sets set up, let’s take a look at them and start to analyze what happened. (And learn some R basics along the way.)

Let’s focus on the MS Estonia.

Start by loading the dataset that you previously saved.

load(paste("~","data/MSEstonia.RData", sep="/"))
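
Because load() restores objects under the names they were saved with, it can help to check which names actually arrived in the workspace. The following is only a minimal sketch: it uses a made-up toy data frame (toy_ship) and a temporary file instead of the real MSEstonia.RData.

```r
# Toy stand-in for the real data set (hypothetical values).
toy_ship <- data.frame(Age = c(62, 22, 21), Survival = c(0, 0, 1))

# Save it to a temporary .RData file.
tmp <- tempfile(fileext = ".RData")
save(toy_ship, file = tmp)

# Load it into a fresh environment so nothing in the workspace is overwritten.
env <- new.env()
loaded_names <- load(tmp, envir = env)  # load() returns the restored object names

print(loaded_names)
print(ls(env))
```

With the real file, the same check shows the exact object name to pass to str() and summary().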

Describing the Data

The data we downloaded contains one row per person on board a given ship, in this case the MS Estonia.

Find some basic information about the MS Estonia using the str() function and the summary() function on the MS Estonia data set. That means, put the name of the data set inside the parentheses.

str(`MS Estonia`)
## Classes 'tbl_df', 'tbl' and 'data.frame':    989 obs. of  20 variables:
##  $ Id_16                      : num  1 2 3 4 5 6 7 8 9 10 ...
##  $ Ship Id                    : num  16 16 16 16 16 16 16 16 16 16 ...
##  $ Year                       : num  1994 1994 1994 1994 1994 ...
##  $ Nationality of the Ship    : chr  "Estonia" "Estonia" "Estonia" "Estonia" ...
##  $ Women and children first   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Quick                      : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Cause                      : chr  "Technical" "Technical" "Technical" "Technical" ...
##  $ No. of passengers          : num  796 796 796 796 796 796 796 796 796 796 ...
##  $ No. of women passengers    : num  377 377 377 377 377 377 377 377 377 377 ...
##  $ Women passengers/passengers: num  0.474 0.474 0.474 0.474 0.474 0.474 0.474 0.474 0.474 0.474 ...
##  $ Ship size                  : num  989 989 989 989 989 989 989 989 989 989 ...
##  $ Length of voyage           : num  2 2 2 2 2 2 2 2 2 2 ...
##  $ Gender                     : num  0 1 1 0 1 1 0 1 0 1 ...
##  $ Age                        : num  62 22 21 53 56 71 60 18 31 63 ...
##  $ Child                      : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ Crew                       : num  0 1 1 1 0 0 0 0 1 0 ...
##  $ Passenger Class            : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ Nationality of Passenger   : chr  "Swedish" "Estonian" "Estonian" "Swedish" ...
##  $ Companionship              : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ Survival                   : num  0 0 0 0 0 0 0 0 0 0 ...
summary(`MS Estonia`) 
##      Id_16        Ship Id        Year      Nationality of the Ship
##  Min.   :  1   Min.   :16   Min.   :1994   Length:989             
##  1st Qu.:248   1st Qu.:16   1st Qu.:1994   Class :character       
##  Median :495   Median :16   Median :1994   Mode  :character       
##  Mean   :495   Mean   :16   Mean   :1994                          
##  3rd Qu.:742   3rd Qu.:16   3rd Qu.:1994                          
##  Max.   :989   Max.   :16   Max.   :1994                          
##                                                                   
##  Women and children first     Quick      Cause           No. of passengers
##  Min.   :0                Min.   :0   Length:989         Min.   :796      
##  1st Qu.:0                1st Qu.:0   Class :character   1st Qu.:796      
##  Median :0                Median :0   Mode  :character   Median :796      
##  Mean   :0                Mean   :0                      Mean   :796      
##  3rd Qu.:0                3rd Qu.:0                      3rd Qu.:796      
##  Max.   :0                Max.   :0                      Max.   :796      
##                                                                           
##  No. of women passengers Women passengers/passengers   Ship size  
##  Min.   :377             Min.   :0.474               Min.   :989  
##  1st Qu.:377             1st Qu.:0.474               1st Qu.:989  
##  Median :377             Median :0.474               Median :989  
##  Mean   :377             Mean   :0.474               Mean   :989  
##  3rd Qu.:377             3rd Qu.:0.474               3rd Qu.:989  
##  Max.   :377             Max.   :0.474               Max.   :989  
##                                                                   
##  Length of voyage     Gender            Age            Child    
##  Min.   :2        Min.   :0.0000   Min.   : 0.00   Min.   : NA  
##  1st Qu.:2        1st Qu.:0.0000   1st Qu.:30.00   1st Qu.: NA  
##  Median :2        Median :0.0000   Median :44.00   Median : NA  
##  Mean   :2        Mean   :0.4904   Mean   :44.73   Mean   :NaN  
##  3rd Qu.:2        3rd Qu.:1.0000   3rd Qu.:59.00   3rd Qu.: NA  
##  Max.   :2        Max.   :1.0000   Max.   :87.00   Max.   : NA  
##                                                    NA's   :989  
##       Crew        Passenger Class Nationality of Passenger Companionship
##  Min.   :0.0000   Min.   : NA     Length:989               Min.   : NA  
##  1st Qu.:0.0000   1st Qu.: NA     Class :character         1st Qu.: NA  
##  Median :0.0000   Median : NA     Mode  :character         Median : NA  
##  Mean   :0.1951   Mean   :NaN                              Mean   :NaN  
##  3rd Qu.:0.0000   3rd Qu.: NA                              3rd Qu.: NA  
##  Max.   :1.0000   Max.   : NA                              Max.   : NA  
##                   NA's   :989                              NA's   :989  
##     Survival     
##  Min.   :0.0000  
##  1st Qu.:0.0000  
##  Median :0.0000  
##  Mean   :0.1385  
##  3rd Qu.:0.0000  
##  Max.   :1.0000  
## 

You should notice some odd things.

Some variables are all set to NA. Which ones are these?

Child, Passenger Class, and Companionship.

We need to pay attention to the fact that there are some variables not available for all of the ships. When a variable is not available, all values will be missing.
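
Columns that are entirely missing can also be found programmatically rather than by reading the summary() output. This is a minimal sketch on a made-up toy data frame (the name toy and its values are assumptions for illustration, not the real data).

```r
# Toy data: Child is missing for every row, Age is missing for only one.
toy <- data.frame(
  Age = c(62, 22, NA),
  Child = c(NA, NA, NA),
  Survival = c(0, 0, 1)
)

# A column is "all missing" when its NA count equals the number of rows.
all_missing <- names(toy)[colSums(is.na(toy)) == nrow(toy)]
print(all_missing)
```

Applied to the ship data, the same expression should list the variables that were not recorded for that ship.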

There are some variables where all the values are the same. Which are these?

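Ship Id, Year, Women and children first, Quick, No. of passengers, No. of women passengers, Women passengers/passengers, Ship size, and Length of voyage have a single value throughout. Judging from the str() output, Nationality of the Ship and Cause are also the same for every row.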

Why do some variables have the same value for every observation? Think about what they refer to.

These variables describe the ship and the voyage rather than the individual person. Everyone in the data set was on the same ship, so they all share the same value.
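
Constant columns can be picked out by counting the distinct values in each column. This is a minimal sketch on a made-up toy data frame; the column names mirror the real ones, but the object toy and its contents are assumptions.

```r
# Toy data: Year and Cause never change, Age does.
toy <- data.frame(
  Year = c(1994, 1994, 1994),
  Cause = c("Technical", "Technical", "Technical"),
  Age = c(62, 22, 21),
  stringsAsFactors = FALSE
)

# TRUE for columns that have exactly one distinct non-missing value.
is_constant <- vapply(
  toy,
  function(col) length(unique(col[!is.na(col)])) == 1,
  logical(1)
)

constant_vars <- names(toy)[is_constant]
print(constant_vars)
```

Columns that are entirely NA have zero distinct values, so this check does not flag them.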

Browse the dataset. There is something interesting about how the Crew and Passenger Class variables relate to each other. What is that?

Passenger Class is not available for this ship (every value is NA), so its relationship to Crew cannot be examined here.
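
For a ship where Passenger Class is recorded, a two-way table that keeps missing values makes the relationship visible. This is a sketch on made-up values, assuming crew members have no passenger class; the column name PassengerClass is a simplified stand-in for the real `Passenger Class` column, which needs backticks because of the space.

```r
# Toy data: crew rows (Crew == 1) have no passenger class.
toy <- data.frame(
  Crew = c(0, 0, 1, 1, 0),
  PassengerClass = c(1, 3, NA, NA, 2)
)

# useNA = "ifany" adds an NA column so missing classes are counted too.
tab <- table(Crew = toy$Crew, PassengerClass = toy$PassengerClass, useNA = "ifany")
print(tab)
```

If the NA column is filled only in the Crew = 1 row, the class variable applies to passengers alone.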

Tables

Now let’s get some descriptive information about the people on the ship using the crosstab function. The variables we are interested in from the MS Estonia data set are: Gender, Passenger Class, Crew, Survival.

#Here is the first. You add the others.
crosstab(`MS Estonia`, row.vars = "Gender")
##       
## Gender  Count Total %
##    0   504.00   50.96
##    1   485.00   49.04
##    Sum 989.00  100.00
#crosstab(`MS Estonia`, row.vars = "Passenger Class")
crosstab(`MS Estonia`, row.vars = "Crew")
##      
## Crew   Count Total %
##   0   796.00   80.49
##   1   193.00   19.51
##   Sum 989.00  100.00
crosstab(`MS Estonia`, row.vars = "Survival")
##         
## Survival  Count Total %
##      0   852.00   86.15
##      1   137.00   13.85
##      Sum 989.00  100.00
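
The crosstab() function is not part of base R; it is presumably defined in a helper script loaded earlier. As a rough base R equivalent, counts and percentages can be built with table() and prop.table(). This sketch uses only the first ten Gender values shown in the str() output, so its numbers will not match the full table.

```r
# First ten Gender values from the str() output above.
gender <- c(0, 1, 1, 0, 1, 1, 0, 1, 0, 1)

counts <- table(gender)                         # frequency of each value
percent <- round(100 * prop.table(counts), 2)   # share of the total, in percent

result <- data.frame(
  Gender = names(counts),
  Count = as.vector(counts),
  Percent = as.vector(percent)
)
print(result)
```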

We would also like to know about the proportion who are 15 and under. The problem is that the Child variable is all missing and age is given in years. Let’s update the Child variable with information from the Age variable and then run the crosstab. (For your other ship you need to see if this is necessary and possible. You may need to do other similar things with other variables.)

MS_Estonia$Child <- as.numeric(MS_Estonia$Age) <= 15

# Add the crosstab below
crosstab(MS_Estonia, row.vars = "Survival")
##         
## Survival  Count Total %
##      0   852.00   86.15
##      1   137.00   13.85
##      Sum 989.00  100.00

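Explain what you think the code MS_Estonia$Child <- as.numeric(MS_Estonia$Age) <= 15 does.

This code converts Age to a number and checks whether it is 15 or less. The result is stored in Child, so each person aged 15 or under gets TRUE and everyone else gets FALSE. The Survival crosstab above shows that most of the people on the ship died, about 86.15%, leaving very few survivors.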
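
To see what the comparison produces, it can be run on a few made-up ages. This is a sketch with hypothetical values, including one missing age; the vector names age and child are illustrative only.

```r
# Hypothetical ages; the last one is missing.
age <- c(62, 22, 15, 8, NA)

# The comparison runs first, then the logical result is assigned.
child <- as.numeric(age) <= 15
print(child)

# Count children, adults, and unknowns.
child_counts <- table(child, useNA = "ifany")
print(child_counts)
```

A missing age stays missing in the result, so rows without an Age cannot be classified as child or adult.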

Summary

Write a paragraph describing the people who were on board the Estonia and what happened to them.

The MS Estonia, a car and passenger ferry, sank in the Baltic Sea in 1994. She was carrying 989 people: 803 passengers and 186 crew (the Crew variable in this data set codes 796 people as passengers and 193 as crew). Most of the passengers were Scandinavian, while most of the crew members were Estonian. 138 were rescued alive, but one died later in hospital, which matches the 137 survivors in the data. Ships rescued 34 and helicopters 104; the ferries played a much smaller part than the planners had intended because it was too dangerous. Most of the survivors were from the crew. Most of the passengers died.