1. Total number of passengers on board

length(Titanic.df$Survived)

## [1] 889

2. The number of passengers who survived the sinking of the Titanic

length(which(Titanic.df$Survived=="1"))

## [1] 340

3. Percentage of passengers who survived the sinking of the Titanic

num <- length(which(Titanic.df$Survived=="1"))
denominator <- length(Titanic.df$Survived)
(num/denominator)*100

## [1] 38.24522
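The same percentage can be obtained more compactly, because the mean of a logical vector is the share of `TRUE` values. The sketch below is only an illustration: it does not load the original data, and the vector `survived` is a hypothetical stand-in rebuilt from the two counts reported above (889 passengers, 340 survivors).

```r
# Hypothetical stand-in for Titanic.df$Survived, rebuilt from the reported counts:
# 889 passengers in total, 340 of whom survived.
survived <- rep(c(0, 1), times = c(889 - 340, 340))

# The mean of a logical vector is the proportion of TRUE values.
pct_survived <- mean(survived == 1) * 100
pct_survived
```

With the real data frame, the equivalent call would presumably be `mean(Titanic.df$Survived == 1) * 100`.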

4. The number of first-class passengers who survived the sinking of the Titanic.

mytable <- xtabs(~ Pclass+Survived, data = Titanic.df)
mytable[1,2]

## [1] 134

5. The percentage of first-class passengers who survived the sinking of the Titanic.

# margin = 1 gives row percentages, i.e. percentages within each passenger class
hello1 <- prop.table(mytable, 1)*100
hello1[1,2]

## [1] 62.61682
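The denominator decides what this percentage means, so it may help to spell it out. The sketch below hard-codes the Pclass-by-Survived counts that follow from the tables in this analysis (the name `class_tab` is an assumption, not part of the original code). It compares dividing the first-class survivors by all passengers with dividing them by first-class passengers only.

```r
# Pclass-by-Survived counts, hard-coded from the tables in this analysis.
class_tab <- as.table(matrix(
  c( 80, 134,
     97,  87,
    372, 119),
  nrow = 3, byrow = TRUE,
  dimnames = list(Pclass = c("1", "2", "3"), Survived = c("0", "1"))
))

# Denominator = all passengers: first-class survivors as a share of everyone on board.
pct_of_all <- class_tab["1", "1"] / sum(class_tab) * 100

# Denominator = first-class passengers: the share of first class that survived.
pct_within_class <- class_tab["1", "1"] / sum(class_tab["1", ]) * 100

pct_of_all
pct_within_class
```

Only the second quantity answers the question "what percentage of first-class passengers survived"; the first answers "what percentage of all passengers were first-class survivors".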

5. The number of females from first class who survived the sinking of the Titanic.

hello2 <- xtabs(~ Pclass+Survived+Sex, data = Titanic.df)
hello2

## , , Sex = female
## 
##       Survived
## Pclass   0   1
##      1   3  89
##      2   6  70
##      3  72  72
## 
## , , Sex = male
## 
##       Survived
## Pclass   0   1
##      1  77  45
##      2  91  17
##      3 300  47

hello2["1", "1", "female"]

## [1] 89
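Indexing a three-way table by dimension names rather than by position makes it harder to pick the wrong cell. As a rough sketch, the array below copies the counts from the table above (the name `counts` is hypothetical), looks up one cell by name, and then collapses the Sex dimension to recover the two-way Pclass-by-Survived counts.

```r
# Counts copied from the three-way table above, stored as Pclass x Survived x Sex.
# R fills arrays with the first dimension (Pclass) varying fastest.
counts <- array(
  c( 3,  6,  72, 89, 70, 72,   # female: Survived = 0 for classes 1-3, then Survived = 1
    77, 91, 300, 45, 17, 47),  # male: same order
  dim = c(3, 2, 2),
  dimnames = list(Pclass   = c("1", "2", "3"),
                  Survived = c("0", "1"),
                  Sex      = c("female", "male"))
)

# First-class females who survived, selected by name.
first_class_female_survivors <- counts["1", "1", "female"]

# Summing over Sex (dimension 3) gives the Pclass-by-Survived table.
by_class <- apply(counts, c(1, 2), sum)

first_class_female_survivors
by_class
```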

6. Percentage of survivors who were female

Total survivors who were female / Total Survivors

hello3 <- xtabs(~ Survived+Sex, data = Titanic.df)
hello3

##         Sex
## Survived female male
##        0     81  468
##        1    231  109

hello3[2,1]/length(which(Titanic.df$Survived=="1"))*100

## [1] 67.94118

7. Percentage of females on board the Titanic who survived

Total survivors who were female / Total Females

hello4 <- xtabs(~ Sex+Survived, data = Titanic.df)
hello4

##         Survived
## Sex        0   1
##   female  81 231
##   male   468 109

hello4[1,2]/length(which(Titanic.df$Sex=="female"))*100

## [1] 74.03846
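Questions 6 and 7 use the same numerator (231 female survivors) but different denominators. One way to see this is through the `margin` argument of `prop.table`. The sketch below rebuilds the Sex-by-Survived table from the counts printed above; the name `sex_tab` is an assumption introduced for illustration.

```r
# Sex-by-Survived counts copied from the table above.
sex_tab <- as.table(matrix(
  c( 81, 231,
    468, 109),
  nrow = 2, byrow = TRUE,
  dimnames = list(Sex = c("female", "male"), Survived = c("0", "1"))
))

# margin = 2: proportions within each Survived column,
# i.e. the percentage of survivors who were female.
pct_survivors_female <- prop.table(sex_tab, margin = 2)["female", "1"] * 100

# margin = 1: proportions within each Sex row,
# i.e. the percentage of females who survived.
pct_females_survived <- prop.table(sex_tab, margin = 1)["female", "1"] * 100

pct_survivors_female
pct_females_survived
```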

8. Hypothesis: The proportion of females onboard who survived the sinking of the Titanic was higher than the proportion of males onboard who survived the sinking of the Titanic.

Running a chi-squared test of independence between survival and sex.

mytable4 <- xtabs(~ Survived+Sex, data=Titanic.df)

chisq.test(mytable4)

## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  mytable4
## X-squared = 258.43, df = 1, p-value < 2.2e-16

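The p-value is far below 0.05, so the null hypothesis that survival and sex are independent is rejected: the two variables are associated. The chi-squared test does not by itself indicate the direction of the association, but the observed proportions do: about 74% of females survived, compared with about 19% of males (109 of 577). The data therefore support the hypothesis that the proportion of females onboard who survived was higher than the proportion of males onboard who survived.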
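Because the hypothesis is directional (females more likely to survive than males), a one-sided two-sample test of proportions is a natural complement to the chi-squared test. The following is a minimal sketch using `prop.test` from base R's `stats` package, with the survivor counts and group sizes taken from the Sex-by-Survived table above. The variable names are illustrative and this is not part of the original analysis.

```r
# Survivors and group sizes, in the order (female, male), from the table above.
survivors <- c(231, 109)
totals    <- c(312, 577)

# One-sided test. H0: p_female <= p_male; H1: p_female > p_male.
result <- prop.test(x = survivors, n = totals, alternative = "greater")

result$estimate  # estimated survival proportions for females and males
result$p.value   # a small value favours H1
```

A p-value below 0.05 here would support the directional claim directly, rather than inferring the direction from the observed proportions after a two-sided test.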

Titanic Analysis

Sharmistha Sutradhar

January 7, 2018