Now let's plot an empirical cumulative distribution function (ECDF). For any given value, the ECDF reports the proportion of individuals that fall at or below that threshold (the proportion above it is one minus that value).
# Simulate weights for 200 females (mean 50) and 200 males (mean 58), reproducibly
set.seed(1234)
wdata <- data.frame(
  sex = factor(rep(c("F", "M"), each = 200)),
  weight = c(rnorm(200, 50), rnorm(200, 58))
)
Now let's look at our data frame.
head(wdata, 5)
##   sex   weight
## 1   F 48.79293
## 2   F 50.27743
## 3   F 51.08444
## 4   F 47.65430
## 5   F 50.42912
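To make the ECDF definition concrete before plotting, here is a minimal sketch using base R's `ecdf()`. The threshold of 50 and the names `f_weight` and `Fn` are illustrative choices, not part of the original walkthrough. The sketch rebuilds the same simulated data so it can run on its own.

```r
# Rebuild the simulated data so this sketch is self-contained
set.seed(1234)
wdata <- data.frame(
  sex = factor(rep(c("F", "M"), each = 200)),
  weight = c(rnorm(200, 50), rnorm(200, 58))
)

# Weights of the female group only
f_weight <- wdata$weight[wdata$sex == "F"]

# ecdf() returns a function; calling it at a value gives the
# proportion of observations at or below that value
Fn <- ecdf(f_weight)
Fn(50)

# The same proportion computed by hand
mean(f_weight <= 50)
```

Both calls should return the same number, which is the height of the female curve at weight 50 in an ECDF plot.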
Now let's load our plotting package and set a default theme.
library(ggplot2)
theme_set(
  theme_classic() +
    theme(legend.position = "top")
)
# One ECDF step curve per sex. `linewidth` replaces the `size` aesthetic
# for lines, which was deprecated in ggplot2 3.4.0.
ggplot(wdata, aes(x = weight)) +
  stat_ecdf(aes(color = sex, linetype = sex),
            geom = "step", linewidth = 1.5) +
  scale_color_manual(values = c("#00AFBB", "#E7B900")) +
  labs(y = "Cumulative proportion")
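An ECDF plot can also be read in the other direction: pick a proportion on the y-axis and find the weight at which each curve reaches it. As a rough sketch, `quantile()` with `type = 1` inverts the empirical distribution function. The names `by_sex`, `medians` and `check` are illustrative, and the data are rebuilt so the sketch runs on its own.

```r
# Rebuild the simulated data so this sketch is self-contained
set.seed(1234)
wdata <- data.frame(
  sex = factor(rep(c("F", "M"), each = 200)),
  weight = c(rnorm(200, 50), rnorm(200, 58))
)

# Split the weights into one numeric vector per sex
by_sex <- split(wdata$weight, wdata$sex)

# type = 1 is the inverse of the ECDF: the smallest observed weight
# at which the cumulative proportion reaches 0.5
medians <- sapply(by_sex, quantile, probs = 0.5, type = 1, names = FALSE)
medians

# Evaluating each group's ECDF at its own median should give at least 0.5
check <- mapply(function(x, m) ecdf(x)(m), by_sex, medians)
check
```

The two values in `medians` are where each step curve crosses the 0.5 line on the plot.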