R Markdown

This is an R Markdown file created as a submission for the analysis of Titanic survival data.

Task 2b - Reading the dataset

Step 1 - Change the working directory to the folder in which your dataset is located.

setwd("C:/Users/srinivas.s.n/Desktop/IIM internship/Internshipdata")

Step 2 - Use the read.csv() function in R to read the data and store it in a dataframe called "titanic.df".

titanic.df <- read.csv("Titanic Data.csv")
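Before moving on, it can help to confirm that the file exists and that the columns were read with the expected types. The following is a minimal sketch of that check. It assumes a tiny stand-in CSV written to a temporary file; the name demo.df and the three rows are illustrative and are not taken from the dataset.

```r
# Stand-in for "Titanic Data.csv": a tiny illustrative file, not the real data.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Survived,Pclass,Sex",
             "1,1,female",
             "0,3,male",
             "1,2,female"), csv_path)

# Fail early with a clear error if the file is missing.
stopifnot(file.exists(csv_path))

# stringsAsFactors = FALSE keeps Sex as character regardless of R version.
demo.df <- read.csv(csv_path, stringsAsFactors = FALSE)

# Quick structural checks before any analysis.
str(demo.df)
dim(demo.df)
```

The same two checks, str() and dim(), can be run on titanic.df as a console alternative to View().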

Step 3 - Use the View() function in R to inspect the dataframe.

View(titanic.df)

Task 3a - Use R to count the total number of passengers on board the Titanic.

nrow(titanic.df)
## [1] 889

Task 3b - Use R to count the number of passengers who survived the sinking of the Titanic.

sum(titanic.df$Survived)
## [1] 340

Task 3c - Use R to measure the percentage of passengers who survived the sinking of the Titanic.

sum(titanic.df$Survived)/nrow(titanic.df)*100
## [1] 38.24522
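Because Survived is coded as 0/1, its mean is the proportion of survivors, so the percentage can also be written with mean(). A minimal sketch on a made-up indicator vector (the values are hypothetical, not the dataset):

```r
# Illustrative 0/1 survival indicator (hypothetical values, not the dataset).
survived <- c(1, 0, 0, 1, 0)

# Two equivalent ways to get the percentage of ones.
pct_by_sum  <- sum(survived) / length(survived) * 100
pct_by_mean <- mean(survived) * 100

c(by_sum = pct_by_sum, by_mean = pct_by_mean)

# Round for reporting, here to one decimal place.
round(pct_by_mean, 1)
```

If the column could contain missing values, mean(survived, na.rm = TRUE) would be the safer form.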

Task 3d - Use R to count the number of first-class passengers who survived the sinking of the Titanic. (Hint: you could use xtabs().)

xtabs(~ Survived + Pclass, data = subset(titanic.df, Survived == 1 & Pclass == 1))
##         Pclass
## Survived   1
##        1 134
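An alternative to subsetting first is to build the full Survived-by-Pclass table once and read individual cells from it. The sketch below assumes cell counts reconstructed from the tables printed in this document, since the CSV itself is not available here. The names counts and class_tab are illustrative.

```r
# Cell counts reconstructed from the tables in this document
# (an assumption for illustration; the original CSV is not read here).
counts <- data.frame(
  Survived = rep(c(0, 1), each = 3),
  Pclass   = rep(1:3, times = 2),
  Freq     = c(80, 97, 372, 134, 87, 119)
)

# With a left-hand side, xtabs() sums Freq instead of counting rows.
class_tab <- xtabs(Freq ~ Survived + Pclass, data = counts)
class_tab

# Pull out one cell by dimension names: survivors ("1") in first class ("1").
class_tab["1", "1"]
```

On the real data the equivalent table would be xtabs(~ Survived + Pclass, data = titanic.df), and the same indexing applies.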

Task 3e - Use R to measure the percentage of first-class passengers who survived the sinking of the Titanic. (Hint: you could use prop.table().)

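# margin = 2 gives proportions within each class (columns sum to 1),
# which is what the task asks for; margin = 1 would give the class mix of survivors.
prop.table(table(titanic.df$Survived, titanic.df$Pclass), 2)
##    
##             1         2         3
##   0 0.3738318 0.5271739 0.7576375
##   1 0.6261682 0.4728261 0.2423625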
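The margin argument of prop.table() decides which total each cell is divided by, and that choice determines which question the table answers. A minimal sketch of the difference follows, assuming counts reconstructed from the tables printed in this document. The names tab, by_row and by_col are illustrative.

```r
# Counts reconstructed from the tables in this document (assumption).
# matrix() fills column by column: class 1, then class 2, then class 3.
tab <- matrix(c(80, 134, 97, 87, 372, 119), nrow = 2,
              dimnames = list(Survived = c("0", "1"),
                              Pclass   = c("1", "2", "3")))

# margin = 1: each row sums to 1 (class mix among non-survivors and survivors).
by_row <- prop.table(tab, margin = 1)

# margin = 2: each column sums to 1 (survival rate within each class).
by_col <- prop.table(tab, margin = 2)

# Survival percentage within each class, rounded for reporting.
round(by_col * 100, 1)
```

The percentage of first-class passengers who survived is the ["1", "1"] cell of by_col, while the same cell of by_row is the share of survivors who travelled first class.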

Task 3f - Use R to count the number of females from first class who survived the sinking of the Titanic.

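# All female survivors, regardless of class
sum(titanic.df$Survived == 1 & titanic.df$Sex == "female")
## [1] 231
# Female survivors from first class only (adds the Pclass condition the task asks for)
sum(titanic.df$Survived == 1 & titanic.df$Sex == "female" & titanic.df$Pclass == 1)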

Task 3g - Use R to measure the percentage of survivors who were female.

sum(titanic.df$Survived == 1 & titanic.df$Sex == "female") / sum(titanic.df$Survived == 1) * 100
## [1] 67.94118

Task 3h - Use R to measure the percentage of females on board the Titanic who survived.

sum(titanic.df$Survived == 1 & titanic.df$Sex == "female") / sum(titanic.df$Sex == "female") * 100
## [1] 74.03846
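Tasks 3g and 3h share a numerator (female survivors) but use different denominators, so they answer different questions. A minimal sketch that makes the two conditionings explicit, using the counts from the Survived-by-Sex table reported under Task 3i. The names sex_tab, pct_survivors_female and pct_females_survived are illustrative.

```r
# Counts taken from the Survived-by-Sex table reported under Task 3i.
sex_tab <- matrix(c(81, 231, 468, 109), nrow = 2,
                  dimnames = list(Survived = c("0", "1"),
                                  Sex      = c("female", "male")))

# Task 3g: of all survivors, what share were female? Divide by the row total.
pct_survivors_female <- sex_tab["1", "female"] / sum(sex_tab["1", ]) * 100

# Task 3h: of all females, what share survived? Divide by the column total.
pct_females_survived <- sex_tab["1", "female"] / sum(sex_tab[, "female"]) * 100

round(c(survivors_female = pct_survivors_female,
        females_survived = pct_females_survived), 2)
```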

Task 3i - Run a Pearson’s Chi-squared test to test the following hypothesis:

Hypothesis: The proportion of females onboard who survived the sinking of the Titanic was higher than the proportion of males onboard who survived the sinking of the Titanic.

mytable <- xtabs(~Survived+Sex, data=titanic.df)
addmargins(mytable)
##         Sex
## Survived female male Sum
##      0       81  468 549
##      1      231  109 340
##      Sum    312  577 889
chisq.test(mytable)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  mytable
## X-squared = 258.43, df = 1, p-value < 2.2e-16
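The chi-squared test above checks for an association between Survived and Sex but does not by itself encode a direction, whereas the stated hypothesis is directional (females higher than males). One possible follow-up, not part of the original analysis, is a one-sided two-sample test of proportions with prop.test(). The sketch below uses the counts from the addmargins() table above. The names survivors, totals and res are illustrative.

```r
# Counts from the addmargins() table above: survivors and totals by sex.
survivors <- c(female = 231, male = 109)
totals    <- c(female = 312, male = 577)

# One-sided two-sample test of proportions.
# Alternative hypothesis: the female survival proportion exceeds the male one.
res <- prop.test(x = survivors, n = totals, alternative = "greater")

res$estimate   # sample survival proportions for females and males
res$statistic  # chi-squared statistic (continuity-corrected by default)
res$p.value    # one-sided p-value
```

A small one-sided p-value together with a higher female sample proportion would be consistent with the hypothesis. The test statistic matches the continuity-corrected chi-squared value reported by chisq.test() for the same 2x2 table.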