Day 3 Assignment

Author

Moon C

Setup

remove(list = ls())
library(stats)
library(stargazer)

library(ggplot2)

Import Data

test <- read.csv("~/Downloads/test.csv")

Clean the Data

?na.omit
test_clean <- na.omit(test)
remove(test)
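
To make the cleaning step concrete, here is a minimal sketch of what na.omit() does, using a small made-up data frame (toy, with invented Age and Fare values) rather than the real test.csv. It drops every row that has an NA in any column, which is why every variable in the summary table shares the same N.

```r
# Hypothetical toy data standing in for test.csv (values are made up)
toy <- data.frame(
  Age  = c(22, NA, 35, 41),
  Fare = c(7.25, 8.05, NA, 26.00)
)

# Count missing values per column before cleaning
na_counts <- colSums(is.na(toy))

# na.omit() removes every row with at least one NA in any column
toy_clean <- na.omit(toy)

# Number of rows lost to this listwise deletion
rows_dropped <- nrow(toy) - nrow(toy_clean)

print(na_counts)
print(rows_dropped)
```

Running the same two checks (colSums(is.na(...)) and the row difference) on the real data before calling remove(test) would show how many passengers were excluded and which columns caused it.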

Summary Statistics

?stargazer
# Pass the data frame as the first (unnamed) argument to get a summary table
stargazer(test_clean,
          type = "text",
          title = "Titanic Data Summary Statistics",
          digits = 1)

Titanic Data Summary Statistics
==========================================
Statistic    N   Mean   St. Dev. Min  Max 
------------------------------------------
PassengerId 331 1,100.2  122.9   892 1,307
Pclass      331   2.1     0.8     1    3  
Age         331  30.2     14.1   0.2 76.0 
SibSp       331   0.5     0.9     0    8  
Parch       331   0.4     0.8     0    6  
Fare        331  41.0     61.2   0.0 512.3
------------------------------------------

Passenger ages range from infants (minimum 0.2 years) to the elderly (maximum 76 years), with a mean age of about 30 years.
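
The range and mean quoted above can also be checked directly in base R. The sketch below uses a short made-up vector (ages) as a stand-in; with the real data one would presumably pass test_clean$Age instead.

```r
# Hypothetical ages, chosen only for illustration
ages <- c(0.2, 18, 27, 30, 45, 76)

# Minimum and maximum in one call
age_range <- range(ages)

# Mean and standard deviation, rounded to one digit like the table
age_mean <- round(mean(ages), 1)
age_sd   <- round(sd(ages), 1)

print(age_range)
print(age_mean)
print(age_sd)
```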

Boxplot

?boxplot
boxplot(test_clean$Age,
        main = "Boxplot of Passenger Age",
        ylab = "Age",
        col = "skyblue")
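
The numbers behind a boxplot can be read off without the picture. As a rough sketch, boxplot.stats() returns the whisker ends, hinges, and median, plus any points drawn as outliers. The vector below (ages) is made up for illustration and is not the Titanic data.

```r
# Hypothetical ages with one unusually large value
ages <- c(1, 20, 25, 30, 35, 40, 95)

# boxplot.stats() computes the values that boxplot() draws
bs <- boxplot.stats(ages)

# stats: lower whisker, lower hinge, median, upper hinge, upper whisker
print(bs$stats)

# out: points beyond 1.5 times the hinge spread from the box
print(bs$out)
```

Applied to test_clean$Age, the same call would list which ages the boxplot flags as outliers.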

Histogram

?hist
hist(test_clean$Fare,
     breaks = seq(0, 550, by = 50), 
     main = "Histogram of Fare",
     xlab = "Fare",
     col = "lavender",
     border = "black")

Takeaway

  • The higher the fare, the fewer passengers bought tickets at that price: most fares sit in the lowest bins, while a few reach as high as 512.3.
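
The takeaway can be backed with counts rather than read off the plot. The sketch below bins a made-up fare vector (fares) with the same breaks as the histogram and tabulates how many values fall in each bin. The values are hypothetical, so the counts only illustrate the method.

```r
# Hypothetical fares, invented for illustration
fares <- c(0, 7.25, 8.05, 13, 26, 52, 75, 151.55, 263, 512.33)

# Same bin edges as the histogram
breaks <- seq(0, 550, by = 50)

# cut() assigns each fare to a bin; include.lowest keeps a fare of exactly 0
bins <- cut(fares, breaks = breaks, include.lowest = TRUE)
bin_counts <- table(bins)

# hist() with plot = FALSE returns the same counts without drawing
h <- hist(fares, breaks = breaks, plot = FALSE)

print(bin_counts)
print(h$counts)
```

With the real data, replacing fares with test_clean$Fare would give the exact number of passengers in each 50-unit fare band.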