R Markdown

Question 1:

Titanic = read.csv("https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/datasets/Titanic.csv", header = TRUE)
titanic <- data.frame(Titanic)
summary(titanic)
       X                                  Name
 Min.   :   1   Carlsson, Mr Frans Olof     :   2
 1st Qu.: 329   Connolly, Miss Kate         :   2
 Median : 657   Kelly, Mr James             :   2
 Mean   : 657   Abbing, Mr Anthony          :   1
 3rd Qu.: 985   Abbott, Master Eugene Joseph:   1
 Max.   :1313   Abbott, Mr Rossmore Edward  :   1
                (Other)                     :1304
 PClass         Age            Sex         Survived
 *  :  1   Min.   : 0.17   female:462   Min.   :0.0000
 1st:322   1st Qu.:21.00   male  :851   1st Qu.:0.0000
 2nd:279   Median :28.00                Median :0.0000
 3rd:711   Mean   :30.40                Mean   :0.3427
           3rd Qu.:39.00                3rd Qu.:1.0000
           Max.   :71.00                Max.   :1.0000
           NA's   :557
    SexCode
 Min.   :0.0000
 1st Qu.:0.0000
 Median :0.0000
 Mean   :0.3519
 3rd Qu.:1.0000
 Max.   :1.0000

mean(titanic$Age, na.rm = TRUE)
[1] 30.39799
median(titanic$Age, na.rm = TRUE)
[1] 28
mean(titanic$Survived, na.rm = TRUE)
[1] 0.3427266
median(titanic$Survived, na.rm = TRUE)
[1] 0
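The `na.rm = TRUE` argument matters for Age because 557 of the ages are missing. A minimal sketch of the difference, using a made-up vector (`ages` is hypothetical and not taken from the Titanic file):

```r
# Hypothetical ages; NA marks a passenger whose age was not recorded.
ages <- c(22, 38, NA, 35, NA, 54)

# Without na.rm, a single missing value makes the whole result NA.
mean_with_na <- mean(ages)

# With na.rm = TRUE, the missing values are dropped before computing.
mean_without_na <- mean(ages, na.rm = TRUE)
median_known    <- median(ages, na.rm = TRUE)

# Count how many values were missing.
n_missing <- sum(is.na(ages))

print(c(mean = mean_without_na, median = median_known, missing = n_missing))
```

Survived has no missing values in the summary above, so `na.rm = TRUE` makes no difference for that column.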

Question 2

newdata <- subset(titanic, Survived == 1, select = c("Age", "Sex", "PClass"))
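As a rough illustration of what `subset()` does here, the sketch below applies the same call to a tiny invented table and compares it with plain bracket indexing. The data frame `toy` and its values are assumptions made up for the example.

```r
# Hypothetical miniature of the passenger table (values invented).
toy <- data.frame(
  Age      = c(29, 2, NA, 25),
  Sex      = c("female", "female", "male", "male"),
  PClass   = c("1st", "1st", "2nd", "3rd"),
  Survived = c(1, 0, 1, 0)
)

# subset() evaluates the condition inside the data frame, keeps the
# matching rows, and returns only the columns named in select.
via_subset <- subset(toy, Survived == 1, select = c("Age", "Sex", "PClass"))

# Bracket form: which() turns the condition into row positions.
via_brackets <- toy[which(toy$Survived == 1), c("Age", "Sex", "PClass")]

print(via_subset)
same_result <- isTRUE(all.equal(via_subset, via_brackets))
print(same_result)
```

Either form keeps a survivor whose Age is NA, which is why the new data set still has missing ages.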

Question 3

colnames(newdata) <- c("age", "gender", "class")
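Assigning a whole vector to `colnames()` depends on the columns being in the order Age, Sex, PClass. One possible alternative, sketched on an invented two-row table, is to rename by the old name so the order does not matter (`toy` and `lookup` are hypothetical names for this example):

```r
# Hypothetical two-row stand-in for the subset from Question 2.
toy <- data.frame(
  Age    = c(29, 2),
  Sex    = c("female", "male"),
  PClass = c("1st", "3rd")
)

# Map old names to new names; the order of entries here does not matter.
lookup <- c(PClass = "class", Age = "age", Sex = "gender")

# Look up each existing column name and replace it with its new name.
names(toy) <- unname(lookup[names(toy)])

print(names(toy))
```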

Question 4

summary(newdata)
      age           gender     class
 Min.   : 0.17   female:308   *  :  0
 1st Qu.:19.00   male  :142   1st:193
 Median :28.00                2nd:119
 Mean   :29.36                3rd:138
 3rd Qu.:39.00
 Max.   :69.00
 NA's   :137

aggregate(age ~ class, newdata, mean)
  class      age
1   1st 36.77640
2   2nd 24.22531
3   3rd 22.46154

mean(newdata$age, na.rm = TRUE)
[1] 29.35958

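Comparison: The new data set contains only the passengers who survived, and the Survived column itself was left out, so there is no mean or median of Survived to compare. For the new data, I took the mean age of each class (above) and the mean of all the ages. The overall mean age of the survivors (29.36) is slightly lower than in the original data (30.40), and the median is the same (28). By class, the 2nd and 3rd class survivors are younger on average (24.23 and 22.46), while the 1st class survivors are older (36.78). The overall mean might stay close to the original because 1st class was the largest group of survivors (193 of 450), which pulls the overall mean up toward the 1st class value.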
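The link between the class means and the overall mean can be sketched as a weighted mean: each class mean counts in proportion to how many known ages that class has. The table below is invented for illustration (`toy` and its values are assumptions), since the number of known ages per class is not shown in the output above.

```r
# Hypothetical survivors: three classes, one missing age.
toy <- data.frame(
  age   = c(40, 30, 50, 20, 24, NA, 10, 30),
  class = c("1st", "1st", "1st", "2nd", "2nd", "2nd", "3rd", "3rd")
)

# Mean age per class; the formula interface drops rows with NA age.
by_class <- aggregate(age ~ class, toy, mean)

# Number of known (non-missing) ages in each class.
counts <- table(toy$class[!is.na(toy$age)])

# Overall mean of the known ages.
overall <- mean(toy$age, na.rm = TRUE)

# The same number, rebuilt from the class means and their counts.
weights  <- as.numeric(counts[as.character(by_class$class)])
weighted <- weighted.mean(by_class$age, weights)

print(by_class)
print(c(overall = overall, weighted = weighted))
```

A class with more known ages moves the overall mean further toward its own mean, which is the effect described in the comparison.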

Question 5

levels(newdata$class)[levels(newdata$class) == "2nd"] <- "second"
levels(newdata$class)[levels(newdata$class) == "3rd"] <- "third"
levels(newdata$class)[levels(newdata$class) == "1st"] <- "first"
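This relabelling only works when the column is a factor. Since R 4.0.0, `read.csv()` returns text columns as character vectors by default, and `levels()` of a character vector is NULL, so on a newer R the column would first need to be converted (for example with `factor()` or `stringsAsFactors = TRUE`). A small sketch on an invented vector, including a one-step alternative with `labels` (the names `cls` and `cls_alt` are hypothetical):

```r
# Hypothetical class column, stored as a factor.
cls <- factor(c("1st", "3rd", "2nd", "3rd"))

# Rename one level at a time, as in the answer above.
levels(cls)[levels(cls) == "2nd"] <- "second"
levels(cls)[levels(cls) == "3rd"] <- "third"
levels(cls)[levels(cls) == "1st"] <- "first"

# Alternative: set all the new labels in one call when building the factor.
cls_alt <- factor(
  c("1st", "3rd", "2nd", "3rd"),
  levels = c("1st", "2nd", "3rd"),
  labels = c("first", "second", "third")
)

print(cls)
print(levels(cls_alt))
```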

Question 6

newdata[c(17, 200, 400), ]
    age gender  class
28   30 female  first
339   4 female second
979  NA female  third
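The labels 28, 339 and 979 in this output are the row names carried over from the original table, while 17, 200 and 400 are positions within the subset. A short sketch of the difference on an invented table (`toy` and `kept` are hypothetical):

```r
# Hypothetical table with six passengers.
toy <- data.frame(id = 1:6, Survived = c(0, 1, 0, 1, 1, 0))

# Keep the survivors; the original row names 2, 4 and 5 are preserved.
kept <- subset(toy, Survived == 1)
print(rownames(kept))

# A number selects by position: the second row of the subset.
by_position <- kept[2, ]

# A character string selects by row name: the row labelled "2".
by_name <- kept["2", ]

print(by_position)
print(by_name)
```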

Bonus

Titanic = read.csv("https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/datasets/Titanic.csv", header = TRUE)