Task 2(b)
titanic.df <- read.csv("Titanic Data.csv")
View(titanic.df)
Task 3(a) The value of n (889) gives the number of passengers recorded in the data set, i.e. the number of non-missing observations for each variable.
library(psych)
describe(titanic.df)
##           vars   n  mean    sd median trimmed   mad min    max  range
## Survived     1 889  0.38  0.49   0.00    0.35  0.00 0.0   1.00   1.00
## Pclass       2 889  2.31  0.83   3.00    2.39  0.00 1.0   3.00   2.00
## Sex*         3 889  1.65  0.48   2.00    1.69  0.00 1.0   2.00   1.00
## Age          4 889 29.65 12.97  29.70   29.22  9.34 0.4  80.00  79.60
## SibSp        5 889  0.52  1.10   0.00    0.27  0.00 0.0   8.00   8.00
## Parch        6 889  0.38  0.81   0.00    0.19  0.00 0.0   6.00   6.00
## Fare         7 889 32.10 49.70  14.45   21.28 10.24 0.0 512.33 512.33
## Embarked*    8 889  2.54  0.79   3.00    2.67  0.00 1.0   3.00   2.00
##            skew kurtosis   se
## Survived   0.48    -1.77 0.02
## Pclass    -0.63    -1.27 0.03
## Sex*      -0.62    -1.61 0.02
## Age        0.43     0.96 0.43
## SibSp      3.68    17.69 0.04
## Parch      2.74     9.66 0.03
## Fare       4.79    33.23 1.67
## Embarked* -1.26    -0.23 0.03
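To illustrate what n counts, here is a minimal base R sketch. It uses a small made-up data frame (`toy`, hypothetical values, not the Titanic data) and shows that the per-variable count of non-missing values can differ from the number of rows when a column contains `NA`.

```r
# Toy data frame with one missing Age value (hypothetical values)
toy <- data.frame(
  Survived = c(0, 1, 1, 0),
  Age      = c(22, NA, 35, 54)
)

# Number of rows in the data frame
n_rows <- nrow(toy)

# Non-missing values per column; this is the quantity n reports per variable
n_valid <- colSums(!is.na(toy))

print(n_rows)
print(n_valid)
```

In the output above every variable has n = 889, so no variable has missing values in this version of the data.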
Task 3(b) The count under 1 (340) gives the number of passengers who survived the sinking.
task <- with(titanic.df, table(Survived))
task
## Survived
##   0   1 
## 549 340
Task 3(c) The value under 1 (38.24522) gives the percentage of passengers who survived the sinking.
prop.table(task)*100
## Survived
##        0        1 
## 61.75478 38.24522
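As a check on the percentages, the same figures can be reproduced by hand from the counts in Task 3(b). This is only a sketch; the vector name `survived_counts` is introduced here for illustration.

```r
# Counts copied from the table(Survived) output in Task 3(b)
survived_counts <- c("0" = 549, "1" = 340)

# Total number of passengers in the data set
total <- sum(survived_counts)

# Each count divided by the total, expressed as a percentage
pct <- survived_counts / total * 100

print(total)
print(round(pct, 2))
```

This is what `prop.table()` does when no margin is given: it divides every cell by the grand total.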
Task 3(d) The value in row 2, column 1 (134) gives the number of first-class passengers who survived.
task1 <- xtabs(~ Survived + Pclass, data=titanic.df)
task1
##         Pclass
## Survived   1   2   3
##        0  80  97 372
##        1 134  87 119
Task 3(e) The value in row 2, column 1 (0.6261682) gives the proportion of first-class passengers who survived, i.e. about 62.6%.
prop.table(task1, 2)
##         Pclass
## Survived         1         2         3
##        0 0.3738318 0.5271739 0.7576375
##        1 0.6261682 0.4728261 0.2423625
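The second argument of `prop.table()` selects the margin that the proportions are computed within. The sketch below rebuilds the Task 3(d) table from its printed counts (the name `class_tab` is an assumption made for this example) and confirms that with `margin = 2` each class column sums to 1.

```r
# Survived x Pclass counts copied from the Task 3(d) output
class_tab <- matrix(
  c(80, 134, 97, 87, 372, 119),   # filled column by column
  nrow = 2,
  dimnames = list(Survived = c("0", "1"), Pclass = c("1", "2", "3"))
)

# Proportions within each passenger class (column-wise)
col_prop <- prop.table(class_tab, margin = 2)

# Every column should sum to 1
print(colSums(col_prop))

# First-class survival rate as a percentage
first_class_pct <- round(col_prop["1", "1"] * 100, 1)
print(first_class_pct)
```

Multiplying by 100, as in Task 3(c), turns the proportion into a percentage.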
Task 3(f) The value in row 2, column 1 of the first table (Sex = female), 89, gives the number of first-class female passengers who survived.
task2 <- xtabs(~ Survived + Pclass + Sex, data=titanic.df)
task2
## , , Sex = female
## 
##         Pclass
## Survived   1   2   3
##        0   3   6  72
##        1  89  70  72
## 
## , , Sex = male
## 
##         Pclass
## Survived   1   2   3
##        0  77  91 300
##        1  45  17  47
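Rather than reading the cell off the printed output, a three-way table can be indexed by its dimension names. The sketch below rebuilds the counts above as a plain array (the name `sex_class` is introduced for illustration) and also checks that summing over Sex recovers the Survived by Pclass table from Task 3(d).

```r
# Survived x Pclass x Sex counts copied from the Task 3(f) output
sex_class <- array(
  c(3, 89, 6, 70, 72, 72,        # female: Pclass 1, 2, 3
    77, 45, 91, 17, 300, 47),    # male:   Pclass 1, 2, 3
  dim = c(2, 3, 2),
  dimnames = list(
    Survived = c("0", "1"),
    Pclass   = c("1", "2", "3"),
    Sex      = c("female", "male")
  )
)

# First-class females who survived, selected by name
first_class_female <- sex_class["1", "1", "female"]
print(first_class_female)

# Summing over Sex collapses the array back to Survived x Pclass
by_class <- apply(sex_class, c(1, 2), sum)
print(by_class)
```

The same name-based indexing should work on the `xtabs` object itself, since it is stored as an array with the same dimension names.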
Task 3(g) The value in row 2, column 1 (67.94118) gives the percentage of survivors who were female.
task3 <- xtabs(~ Survived + Sex, data=titanic.df)
prop.table(task3, 1)*100
##         Sex
## Survived   female     male
##        0 14.75410 85.24590
##        1 67.94118 32.05882
Task 3(h) The value in row 2, column 1 (74.03846) gives the percentage of females who survived.
prop.table(task3, 2)*100
##         Sex
## Survived   female     male
##        0 25.96154 81.10919
##        1 74.03846 18.89081
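Tasks 3(g) and 3(h) use the same table and differ only in the margin: `margin = 1` gives percentages within each survival outcome (rows), while `margin = 2` gives percentages within each sex (columns). A minimal sketch, with the Survived by Sex counts obtained by summing the Task 3(f) tables over class and stored under the illustrative name `sex_tab`:

```r
# Survived x Sex counts (Task 3(f) tables summed over Pclass)
sex_tab <- matrix(
  c(81, 231, 468, 109),   # filled column by column: female, then male
  nrow = 2,
  dimnames = list(Survived = c("0", "1"), Sex = c("female", "male"))
)

# Row percentages: of those who survived, what share was female?
row_pct <- prop.table(sex_tab, margin = 1) * 100

# Column percentages: of all females, what share survived?
col_pct <- prop.table(sex_tab, margin = 2) * 100

print(round(row_pct["1", "female"], 2))
print(round(col_pct["1", "female"], 2))
```

The two numbers answer different questions, so it is worth stating the denominator explicitly whenever one of them is reported.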
Task 3(i) Chi-squared test of independence between sex and survival
chisq.test(task3)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: task3
## X-squared = 258.43, df = 1, p-value < 2.2e-16
Since the p-value is less than 0.01, the null hypothesis that sex and survival are independent can be rejected. There is a statistically significant association between sex and survival.
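To see where the test statistic comes from, the test can be rerun on the counts alone and its components inspected. This is a sketch under the assumption that the Survived by Sex counts are those obtained by summing the Task 3(f) tables over class; `sex_tab` and `result` are names chosen for this example.

```r
# Survived x Sex counts (Task 3(f) tables summed over Pclass)
sex_tab <- matrix(
  c(81, 231, 468, 109),
  nrow = 2,
  dimnames = list(Survived = c("0", "1"), Sex = c("female", "male"))
)

# For a 2 x 2 table chisq.test() applies Yates' continuity correction by default
result <- chisq.test(sex_tab)

# Counts expected in each cell if sex and survival were independent
print(result$expected)

# Test statistic, degrees of freedom, and the decision at the 0.01 level
print(result$statistic)
print(result$parameter)
print(result$p.value < 0.01)
```

Comparing `result$expected` with the observed counts shows the direction of the association: far more females survived, and far fewer males, than independence would predict.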