Warning in mean.default(df_clean[, i], na.rm = TRUE): argument is not numeric
or logical: returning NA
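The mean.default warning above is what R emits when mean() receives a column that is neither numeric nor logical. The call in the warning suggests a loop over column indices, but that loop is not shown here, so the following is only a minimal sketch of one way to avoid the warning: restrict the calculation to numeric columns. The data frame df_example and its columns are hypothetical stand-ins, not the actual df_clean.

```r
# Hypothetical stand-in for df_clean: a mix of character and numeric columns.
df_example <- data.frame(
  Name = c("A", "B", "C"),
  Age  = c(22, NA, 35),
  Fare = c(7.25, 71.28, 8.05),
  stringsAsFactors = FALSE
)

# Flag the columns that mean() can handle.
numeric_cols <- sapply(df_example, is.numeric)

# Column means over the numeric columns only, ignoring missing values.
col_means <- sapply(df_example[, numeric_cols, drop = FALSE], mean, na.rm = TRUE)
print(col_means)
```

Filtering first means character columns such as Name are skipped instead of returning NA with a warning.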
Summary Statistics
library(stargazer)
Please cite as:
Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
R package version 5.2.3. https://CRAN.R-project.org/package=stargazer
?stargazer
starting httpd help server ...
done
stargazer(df_clean, type = "text", omit.summary.stat = "N", digits = 2, title = "Titanic Test Summary Stats")
Titanic Test Summary Stats
=============================================
Statistic     Mean    St. Dev.  Min     Max
---------------------------------------------
PassengerId 1,100.50  120.81    892    1,309
Pclass        2.27      0.84     1       3
Age          30.27     12.63    0.17   76.00
SibSp         0.45      0.90     0       8
Parch         0.39      0.98     0       9
Fare         35.63     55.84    0.00   512.33
---------------------------------------------
Observations:
Parch Statistic:
Parch is the number of parents and children each individual in this dataset had aboard. The mean of Parch is 0.39. Because Parch is a non-negative count, a mean below 0.5 implies that most people aboard the Titanic were traveling without parents or children.
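One way to check this reading directly, rather than inferring it from the mean, is to compute the share of passengers whose Parch is zero. The sketch below uses a small hypothetical vector, parch_example, purely for illustration; it is not the Titanic data.

```r
# Hypothetical Parch values, for illustration only (not the Titanic data).
parch_example <- c(0, 0, 0, 1, 0, 2, 0, 0, 1, 0)

# Average number of parents/children per passenger.
mean_parch <- mean(parch_example)

# Proportion of passengers with no parents or children aboard.
share_zero <- mean(parch_example == 0)

# For non-negative whole numbers, the share with at least one
# can never exceed the mean.
share_nonzero <- mean(parch_example >= 1)

print(c(mean = mean_parch, share_zero = share_zero, share_nonzero = share_nonzero))
```

On the real data, the same idea would be applied to df_clean$Parch, assuming that column contains no missing values.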
Fare Statistic:
Fare is the price of the ticket each passenger on the Titanic purchased. Judging only by the Min (0.00) and Max (512.33), one might expect the average to sit near the midpoint, around 256. Instead, the mean is 35.63, which shows that fares are concentrated toward the bottom of the price range, with a small number of very expensive tickets stretching the maximum (a right-skewed distribution).
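The gap between the midpoint of the range and the mean can be illustrated with a short sketch. The vector fare_example is hypothetical and chosen only to mimic many cheap tickets plus one expensive one; it is not the Titanic data.

```r
# Hypothetical fares, illustration only: many cheap tickets and one expensive one.
fare_example <- c(7.25, 7.75, 8.05, 13.00, 26.00, 30.00, 512.33)

# Halfway point between the smallest and largest fare.
midpoint <- (min(fare_example) + max(fare_example)) / 2

# The mean is pulled up by the expensive ticket; the median is not.
fare_mean   <- mean(fare_example)
fare_median <- median(fare_example)

print(c(midpoint = midpoint, mean = fare_mean, median = fare_median))
```

When the median falls below the mean and the mean falls well below the midpoint, most values are clustered at the low end, which is the pattern described for Fare.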
Box Plot
boxplot(df_clean$Fare, horizontal = TRUE, main = "Box Plot of Ticket Fares for the Titanic", xlab = "Fare")
Histogram
hist(df_clean$Age, main = "Histogram of Ages on the Titanic", xlab = "Age", ylab = "Frequency")
Key takeaways:
The formatting and contents of a graph are easy to control in R, which offers functions and arguments for almost any need. In particular, the arguments main =, xlab =, and ylab = set the title and the axis labels of a plot.
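As a small illustration of these arguments, the sketch below draws a histogram and a horizontal box plot from a hypothetical vector, ages_example, which is not the Titanic data. The titles are likewise made up for the example.

```r
# Hypothetical ages, for illustration only (not the Titanic data).
ages_example <- c(2, 18, 21, 24, 27, 30, 33, 40, 52, 61)

# main sets the title; xlab and ylab label the horizontal and vertical axes.
h <- hist(ages_example,
          main = "Histogram of Example Ages",
          xlab = "Age",
          ylab = "Frequency")

# In a horizontal box plot the plotted variable runs along the x-axis,
# so xlab should name that variable.
b <- boxplot(ages_example,
             horizontal = TRUE,
             main = "Box Plot of Example Ages",
             xlab = "Age")
```

Both functions also return the values they plotted (bin counts for hist, the five-number summary for boxplot), which can be handy for checking a figure against the summary table.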