This RMarkdown document presents R code that draws four samples from a hypothetical population distribution and plots each sample as a histogram with an overlaid density curve.
The plots are intended to demonstrate how samples drawn from the same population differ from one another. The hypothetical distribution represents self-reported anxiety about statistics coursework and reflects a survey item asked of the class prior to the course.
The document was created to supplement lecture notes on samples versus populations.
The first block of code loads the required packages and constructs the hypothetical population: 670 anxiety scores from 1 to 10, symmetric about 5.5, stored in a data frame.
library(ggplot2)
library(dplyr)
library(gridExtra)
anxiety <- c(rep(1,25), rep(2,50), rep(3,70), rep(4,90), rep(5,100), rep(6,100), rep(7,90), rep(8,70), rep(9,50), rep(10,25))
anxiety <- as.data.frame(anxiety)
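As a quick check on the constructed population, the sketch below rebuilds the same scores from their counts and summarizes them. The names `counts` and `pop` are introduced here only for illustration and do not appear in the original code.

```r
# Rebuild the population from the per-level counts used above.
# `counts` and `pop` are hypothetical names chosen for this sketch.
counts <- c(25, 50, 70, 90, 100, 100, 90, 70, 50, 25)
pop <- rep(1:10, times = counts)

length(pop)  # number of scores in the population
table(pop)   # frequency of each anxiety level
mean(pop)    # population mean
sd(pop)      # standard deviation (R's sd() divides by n - 1)
```

Because the counts are symmetric around the middle of the scale, the population mean is 5.5, which gives a fixed benchmark against which each sample can be compared.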
Four samples of size 50 are drawn at random, without replacement, from the constructed population and coerced into data frames. No seed is set, so the samples change each time the document is knit.
sim1 <- sample(anxiety[,1],50)
sim2 <- sample(anxiety[,1],50)
sim3 <- sample(anxiety[,1],50)
sim4 <- sample(anxiety[,1],50)
sim1 <- as.data.frame(sim1)
sim2 <- as.data.frame(sim2)
sim3 <- as.data.frame(sim3)
sim4 <- as.data.frame(sim4)
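The four draws can also be written more compactly and made reproducible. The following is a minimal sketch, not the code used to produce the plots in this document: the seed value, the `pop` vector, the `sims` list, and the `score` column name are all assumptions introduced for illustration.

```r
# Hypothetical reproducible variant of the sampling step.
pop <- rep(1:10, times = c(25, 50, 70, 90, 100, 100, 90, 70, 50, 25))

set.seed(2024)  # arbitrary seed, chosen only so the draws repeat across runs

# Draw four samples of size 50 without replacement (the default for sample()).
sims <- lapply(1:4, function(i) data.frame(score = sample(pop, 50)))
names(sims) <- paste0("sim", 1:4)

# Compare each sample mean with the population mean.
sapply(sims, function(d) mean(d$score))
mean(pop)
```

Printing the sample means next to the population mean puts a number on the point the plots make visually: each sample gives a somewhat different picture of the same population.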
A histogram is constructed for each of the four samples. Each histogram is overlaid with a lightly shaded, smoothed density curve, and the four plots are arranged in a two-by-two grid.
# after_stat(density) replaces the deprecated ..density.. notation (requires ggplot2 >= 3.3.0)
p1 <- ggplot(sim1, aes(x = sim1)) + xlab("Anxiety Level") + ylab("Density") + ggtitle("Sample 1") + theme(plot.title = element_text(hjust = .5)) + scale_x_continuous(breaks = seq(1, 10, 1), limits = c(0, 10.5)) + geom_histogram(aes(y = after_stat(density)), fill = "red", col = "black", binwidth = 1) + geom_density(fill = "black", alpha = .2)
p2 <- ggplot(sim2, aes(x = sim2)) + xlab("Anxiety Level") + ylab("Density") + ggtitle("Sample 2") + theme(plot.title = element_text(hjust = .5)) + scale_x_continuous(breaks = seq(1, 10, 1), limits = c(0, 10.5)) + geom_histogram(aes(y = after_stat(density)), fill = "red", col = "black", binwidth = 1) + geom_density(fill = "black", alpha = .2)
p3 <- ggplot(sim3, aes(x = sim3)) + xlab("Anxiety Level") + ylab("Density") + ggtitle("Sample 3") + theme(plot.title = element_text(hjust = .5)) + scale_x_continuous(breaks = seq(1, 10, 1), limits = c(0, 10.5)) + geom_histogram(aes(y = after_stat(density)), fill = "red", col = "black", binwidth = 1) + geom_density(fill = "black", alpha = .2)
p4 <- ggplot(sim4, aes(x = sim4)) + xlab("Anxiety Level") + ylab("Density") + ggtitle("Sample 4") + theme(plot.title = element_text(hjust = .5)) + scale_x_continuous(breaks = seq(1, 10, 1), limits = c(0, 10.5)) + geom_histogram(aes(y = after_stat(density)), fill = "red", col = "black", binwidth = 1) + geom_density(fill = "black", alpha = .2)
grid.arrange(p1,p2,p3,p4,ncol=2)
## Warning: Removed 1 rows containing missing values (geom_bar).
## Warning: Removed 1 rows containing missing values (geom_bar).
## Warning: Removed 1 rows containing missing values (geom_bar).
## Warning: Removed 1 rows containing missing values (geom_bar).
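Because the four plot definitions differ only in their data and title, they could be generated by a single helper function. The sketch below is an illustration under stated assumptions, not the code that produced the figure: `plot_sample()`, `pop`, the `score` column, and the seed are hypothetical, and `after_stat()` requires ggplot2 3.3.0 or later. The axis limits are kept as in the original plots, so similar warnings may still appear.

```r
library(ggplot2)
library(gridExtra)

# Hypothetical population vector and seed, introduced for this sketch.
pop <- rep(1:10, times = c(25, 50, 70, 90, 100, 100, 90, 70, 50, 25))
set.seed(2024)

# Hypothetical helper: density-scaled histogram with a shaded density curve.
plot_sample <- function(scores, title) {
  ggplot(data.frame(score = scores), aes(x = score)) +
    geom_histogram(aes(y = after_stat(density)),
                   fill = "red", col = "black", binwidth = 1) +
    geom_density(fill = "black", alpha = .2) +
    scale_x_continuous(breaks = seq(1, 10, 1), limits = c(0, 10.5)) +
    xlab("Anxiety Level") + ylab("Density") + ggtitle(title) +
    theme(plot.title = element_text(hjust = .5))
}

# Build one plot per sample, then arrange them in a two-column grid.
plots <- lapply(1:4, function(i) plot_sample(sample(pop, 50), paste("Sample", i)))
grid.arrange(grobs = plots, ncol = 2)
```

Keeping the styling in one function means that a change to colors, bin width, or axis limits applies to every panel at once.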