# Data directory "Titanic"

mydata <- read.csv('/Users/pin.lyu/Desktop/titanic/test.csv') 

Q1

Answer: From the chart shown above, we can see that “SibSp” has 283 zeros, “Parch” has 324 zeros, and “Age” has only 86 zeros. Hence, if zeros are counted as missing entries, “Parch” is the variable in this data set with the most missing entries.
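The counts above could be checked with something like the sketch below. It uses a small made-up data frame (`toy`, a hypothetical stand-in, not the real `mydata`) and tallies zeros and `NA` values separately, since R treats them as different things and a zero in “SibSp” or “Parch” may be a real value rather than a missing one.

```r
# Toy data frame standing in for the Titanic data (values are made up).
toy <- data.frame(
  Age   = c(22, NA, 0.5, NA, 35),
  SibSp = c(1, 0, 0, 2, 0),
  Parch = c(0, 0, 0, 1, 0)
)

# Number of zeros in each column, skipping NA values.
zero_counts <- sapply(toy, function(col) sum(col == 0, na.rm = TRUE))

# Number of NA (missing) entries in each column.
na_counts <- colSums(is.na(toy))

print(zero_counts)
print(na_counts)
```

Running the same two lines on `mydata` instead of `toy` would show whether the 86 entries for “Age” are zeros or `NA` values.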

Q2

Q3

Q4

Q5

Answer: What I noticed from this graph is that most individuals of both sexes who survived the shipwreck were between 20 and 40 years old. For both sexes, the median age of survivors is around 28. Additionally, we can tell that many children survived as well. However, the accuracy of this interpretation is unclear because we modified the age variable by replacing its missing values with the median age.
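As a minimal sketch of the median replacement mentioned above, assuming it was done roughly this way, the code below fills missing ages in a made-up vector (`toy_age`, a hypothetical example, not the real data) with the median of the observed ages.

```r
# Made-up ages with two missing values (not the real Titanic data).
toy_age <- c(22, NA, 30, NA, 40, 4)

# Median of the observed (non-missing) ages.
age_median <- median(toy_age, na.rm = TRUE)

# Replace each missing age with that median; keep observed ages as they are.
imputed <- ifelse(is.na(toy_age), age_median, toy_age)

print(age_median)
print(imputed)
```

Every filled-in value equals the median, so the imputed entries pile up at a single age, which is one reason the age pattern in the graph should be read with caution.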