Setup
remove(list = ls())
library(stats)
library(stargazer)
Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
R package version 5.2.3. https://CRAN.R-project.org/package=stargazer
Import Data
test <- read.csv("~/Downloads/test.csv")
Clean the Data
?na.omit
test_clean <- na.omit(test)
remove(test)
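Because na.omit drops every row that has an NA in any column, it can be worth checking where the missing values sit and how many rows the step removes. The following is a minimal sketch on a small made-up data frame (toy and its values are hypothetical, not taken from test.csv):

```r
# Hypothetical toy data standing in for test.csv (all values are made up)
toy <- data.frame(
  Age  = c(22, NA, 35, 58),
  Fare = c(7.25, 71.3, NA, 26.0),
  Sex  = c("male", "female", "female", "male")
)

# Count missing values in each column
na_per_column <- colSums(is.na(toy))
print(na_per_column)

# na.omit() keeps only the rows with no NA in any column
toy_clean <- na.omit(toy)

# Number of rows removed by the cleaning step
rows_dropped <- nrow(toy) - nrow(toy_clean)
print(rows_dropped)
```

The same two checks could be run on the imported data frame before it is removed from the workspace.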
Summary Statistics
?stargazer
stargazer(test_clean,
type = "text",
title = "Titanic Data Summary Statistics",
digits = 1)
Titanic Data Summary Statistics
==========================================
Statistic     N    Mean St. Dev. Min   Max
------------------------------------------
PassengerId 331 1,100.2    122.9 892 1,307
Pclass      331     2.1      0.8   1     3
Age         331    30.2     14.1 0.2  76.0
SibSp       331     0.5      0.9   0     8
Parch       331     0.4      0.8   0     6
Fare        331    41.0     61.2 0.0 512.3
------------------------------------------
Passenger ages range from infants (0.2 years) to the elderly (76 years), with a mean age of about 30.
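The range and average quoted here can also be checked directly with base R. A minimal sketch, assuming a numeric vector of ages (the ages vector below is invented for illustration and is not the Titanic data):

```r
# Hypothetical ages (made up); the real values live in test_clean$Age
ages <- c(0.2, 18, 24, 30, 45, 76)

# Youngest and oldest passenger
age_range <- range(ages)

# Mean and median; a median well below the mean would hint at a
# right-skewed age distribution
age_mean   <- mean(ages)
age_median <- median(ages)

print(age_range)
print(c(mean = age_mean, median = age_median))
```

Swapping ages for test_clean$Age would apply the same checks to the cleaned data.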
Boxplot
?boxplot
boxplot(test_clean$Age,
main = "Boxplot of Passenger Age",
ylab = "Age",
col = "skyblue")
Histogram
?hist
hist(test_clean$Fare,
breaks = seq(0, 550, by = 50),
main = "Histogram of Fare",
xlab = "Fare",
col = "lavender",
border = "black")
Takeaway
- The higher the fare, the fewer passengers bought tickets at that price.
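One way to put numbers behind this takeaway is to ask hist for the bin counts instead of a plot. The sketch below uses the same 50-unit breaks as the histogram in this section, but on a made-up fares vector (hypothetical values, not the Titanic fares):

```r
# Hypothetical fares (made up); the real values live in test_clean$Fare
fares <- c(7.3, 7.9, 8.1, 13.0, 26.0, 31.3, 52.0, 71.3, 151.6, 262.4)

# plot = FALSE returns the histogram object without drawing it
h <- hist(fares, breaks = seq(0, 550, by = 50), plot = FALSE)

# Label each count with the lower and upper edge of its bin
fare_counts <- setNames(
  h$counts,
  paste0(head(h$breaks, -1), "-", tail(h$breaks, -1))
)
print(fare_counts)

# A mean above the median is another sign of a right-skewed distribution
fare_mean   <- mean(fares)
fare_median <- median(fares)
print(c(mean = fare_mean, median = fare_median))
```

If the counts fall off as the bins move right and the mean sits above the median, that is consistent with most passengers buying the cheaper tickets.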