W1D5 Answers to Q2 and Q3 are as follows:
titanic.df <- read.csv("Titanic Data.csv")
3(a)
#View(titanic.df)
nrow(titanic.df)
## [1] 889
3(b)
table(titanic.df$Survived)
##
##   0   1
## 549 340
Another way to answer 3(b): Survived is coded 0/1, so its sum is the number of survivors.
sum(titanic.df$Survived)
## [1] 340
3(c)
prop.table(table(titanic.df$Survived))*100
##
##        0        1
## 61.75478 38.24522
3(d)
mytable11 <- xtabs(~ Survived + Pclass , data = titanic.df)
mytable11
##         Pclass
## Survived   1   2   3
##        0  80  97 372
##        1 134  87 119
3(e)
prop.table(mytable11 , 2)*100
##         Pclass
## Survived        1        2        3
##        0 37.38318 52.71739 75.76375
##        1 62.61682 47.28261 24.23625
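To illustrate why the second argument of prop.table() is 2 in 3(e), the table from 3(d) can be rebuilt by hand from the counts printed above. This is only a minimal sketch: the names counts and col.pct are illustrative and not part of the original answer.

```r
# Rebuild the Survived x Pclass table from the counts printed in 3(d).
# matrix() fills column by column, so each pair is one passenger class.
counts <- matrix(c(80, 134, 97, 87, 372, 119), nrow = 2,
                 dimnames = list(Survived = c("0", "1"),
                                 Pclass = c("1", "2", "3")))

# margin = 2 divides every cell by its column (Pclass) total,
# so each class column sums to 100 percent.
col.pct <- prop.table(counts, 2) * 100
colSums(col.pct)

# Survival percentage within each class (the "1" row of 3(e)).
round(col.pct["1", ], 5)
```

With margin = 1 the division would instead be by row totals, which answers a different question: of those who died (or survived), what share travelled in each class.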
3(f)
mytable22 <- xtabs(~ Pclass + Sex + Survived, data = titanic.df)
ftable(mytable22)
##               Survived   0   1
## Pclass Sex
## 1      female            3  89
##        male             77  45
## 2      female            6  70
##        male             91  17
## 3      female           72  72
##        male            300  47
3(g)
mytable33 <- xtabs(~ Sex + Survived , data = titanic.df)
prop.table(mytable33,2)*100
##         Survived
## Sex             0        1
##   female 14.75410 67.94118
##   male   85.24590 32.05882
3(h)
prop.table(mytable33,1)*100
##         Survived
## Sex             0        1
##   female 25.96154 74.03846
##   male   81.10919 18.89081
3(i)
chisq.test(mytable33)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: mytable33
## X-squared = 258.43, df = 1, p-value < 2.2e-16
The p-value is below 2.2e-16, so we reject independence: Survived and Sex are dependent. From 3(h), the proportion of females who survived (74.04%) is greater than the proportion of males who survived (18.89%), so the data support the hypothesis.
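The conclusion of 3(i) can be checked without the CSV file by entering the Sex by Survived counts directly. This is a minimal sketch: the counts are the 3(f) table summed over Pclass, the names sex.surv, res and surv.rate are illustrative, and the 5% level is an assumed threshold that the original answer does not state.

```r
# Sex x Survived counts, obtained by summing the 3(f) table over Pclass:
# female: 3 + 6 + 72 = 81 died, 89 + 70 + 72 = 231 survived
# male:   77 + 91 + 300 = 468 died, 45 + 17 + 47 = 109 survived
sex.surv <- matrix(c(81, 468, 231, 109), nrow = 2,
                   dimnames = list(Sex = c("female", "male"),
                                   Survived = c("0", "1")))

# chisq.test() applies Yates' continuity correction to 2x2 tables by default.
res <- chisq.test(sex.surv)
res$p.value < 0.05   # TRUE means independence is rejected at the assumed 5% level

# Row percentages give the survival rate within each sex, as in 3(h).
surv.rate <- prop.table(sex.surv, 1)[, "1"] * 100
surv.rate["female"] > surv.rate["male"]
```

The test only shows that the two variables are associated. The direction of the association comes from comparing the row percentages.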