This is an RMarkdown document displaying R code for generating two versions of animated plots. The plots demonstrate how a histogram takes shape as the sample size grows. The simulated data represent the distribution of IQ scores.
In addition to these plots, animated plots are constructed to visualize how a distribution changes when every value in a dataset is transformed by adding or multiplying by a constant.
This document was created to supplement lecture notes on frequency distributions and transformations.
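The first block of code accomplishes the following: it loads the required packages, simulates five samples of IQ scores (n = 10, 25, 100, 500, and 2000) from a normal distribution with mean 100 and standard deviation 15, and stacks them into a single data frame with a `dataset` indicator column.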
library(ggplot2)
library(dplyr)
library(gganimate)
library(transformr)
z1 <- rnorm(10,100,15)
z2 <- rnorm(25,100,15)
z3 <- rnorm(100,100,15)
z4 <- rnorm(500,100,15)
z5 <- rnorm(2000,100,15)
# Convert each sample to a one-column data frame named "IQ"
# (as.matrix() has no nrows/ncols arguments, so the conversion is done directly)
z1 <- data.frame(IQ = z1)
z2 <- data.frame(IQ = z2)
z3 <- data.frame(IQ = z3)
z4 <- data.frame(IQ = z4)
z5 <- data.frame(IQ = z5)
dataset <- c(rep(1,10),rep(2,25),rep(3,100),rep(4,500),rep(5,2000))
fullmat <- matrix(0,2635,2)
fullmat[,1] <- dataset
partmat <- rbind(z1,z2,z3,z4,z5)
fullmat[,2] <- partmat[,1]
colnames(fullmat) <- c("dataset","IQ")
fullmat <- as.data.frame(fullmat)
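The same stacked data frame can be built in fewer steps. The following is a minimal sketch, not the original approach; the names `sizes` and `iq_long` and the call to `set.seed()` are assumptions added here for illustration and reproducibility.

```r
# Hypothetical compact version of the data-construction steps above
set.seed(1)  # assumption: seed added so the sketch is reproducible

sizes <- c(10, 25, 100, 500, 2000)  # sample sizes of the five datasets

iq_long <- data.frame(
  dataset = rep(seq_along(sizes), times = sizes),   # labels 1..5, one per row
  IQ      = rnorm(sum(sizes), mean = 100, sd = 15)  # all 2635 simulated scores
)

# Rows per dataset should match the intended sample sizes
table(iq_long$dataset)
```

Because the draws are independent, generating all scores in one call and labelling the rows afterwards yields the same structure as simulating each sample separately.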
The next two blocks of code demonstrate two versions of animating the gradual construction of a histogram:
The first plot allows for a flexible axis that changes dynamically as the sample size grows: `view_follow(fixed_x = TRUE)` holds the x-axis fixed and lets the count axis follow the data. This provides a closer view of the shape of the distribution at smaller sample sizes.
The second plot uses a fixed set of axes that do not change as the sample size grows. This gives a better sense of how the histogram changes as the sample size grows.
anim_plot1 <- ggplot(fullmat, aes(IQ)) +
  geom_histogram(col = "black", fill = "blue") +
  transition_states(dataset, transition_length = 2, state_length = 4) +
  view_follow(fixed_x = TRUE)
anim_plot1
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
anim_plot2 <- ggplot(fullmat, aes(IQ)) +
  geom_histogram(col = "black", fill = "blue") +
  transition_states(dataset, transition_length = 2, state_length = 4)
anim_plot2
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
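The `stat_bin()` message appears because no bin width was specified, so the default of 30 bins is used. One way to see what a fixed bin width does, and why the count axis matters, is to tally scores per bin by hand. The sketch below uses base R only; the 5-point width and the names `iq_small`, `iq_large`, `width`, and `breaks` are assumptions chosen for illustration. In `ggplot2`, the comparable setting is the `binwidth` argument of `geom_histogram()`.

```r
set.seed(1)  # assumption: seed added for reproducibility

iq_small <- rnorm(25, mean = 100, sd = 15)    # a small sample
iq_large <- rnorm(2000, mean = 100, sd = 15)  # a large sample

# Shared 5-point-wide bins covering both samples (the width is illustrative)
width  <- 5
all_iq <- c(iq_small, iq_large)
breaks <- seq(floor(min(all_iq) / width) * width,
              ceiling(max(all_iq) / width) * width,
              by = width)

# Count scores per bin without drawing anything
counts_small <- hist(iq_small, breaks = breaks, plot = FALSE)$counts
counts_large <- hist(iq_large, breaks = breaks, plot = FALSE)$counts

# Tallest bar in each sample: on a shared count axis, the small sample's
# bars are dwarfed by those of the large sample
max(counts_small)
max(counts_large)
```

With identical bins, the tallest bar of the large sample is far higher than that of the small sample, which is the trade-off between the flexible-axis and fixed-axis animations.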
The simulation and data-preparation steps above are repeated for a second simulation of IQ scores. An initial sample of 1000 scores is drawn with mean 70 and standard deviation 15. Four transformed versions are then created by repeatedly adding 15, and all five are stacked into one data frame for use in the animation.
t1 <- rnorm(1000,70,15)
t2 <- t1 + 15
t3 <- t2 + 15
t4 <- t3 + 15
t5 <- t4 + 15
# Convert each version to a one-column data frame named "IQ"
# (as.matrix() has no nrows/ncols arguments, so the conversion is done directly)
t1 <- data.frame(IQ = t1)
t2 <- data.frame(IQ = t2)
t3 <- data.frame(IQ = t3)
t4 <- data.frame(IQ = t4)
t5 <- data.frame(IQ = t5)
dataset <- c(rep(1,1000),rep(2,1000),rep(3,1000),rep(4,1000),rep(5,1000))
fullmat <- matrix(0,5000,2)
fullmat[,1] <- dataset
partmat <- rbind(t1,t2,t3,t4,t5)
fullmat[,2] <- partmat[,1]
colnames(fullmat) <- c("dataset","IQ")
fullmat <- as.data.frame(fullmat)
Density curve plots are constructed for the initial and transformed IQ score distributions.
anim_plot3 <- ggplot(fullmat, aes(IQ)) +
  geom_density(col = "black", fill = "blue") +
  transition_states(dataset, transition_length = 2, state_length = 4) +
  shadow_mark(alpha = .3)
anim_plot3
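Adding a constant to every score shifts the mean by that constant and leaves the standard deviation unchanged, which is why the density curves keep their shape while sliding to the right. The sketch below checks this numerically; the names `iq` and `shifted` and the seed are assumptions added for illustration.

```r
set.seed(1)  # assumption: seed added for reproducibility

iq      <- rnorm(1000, mean = 70, sd = 15)  # initial sample, as in t1
shifted <- iq + 15                          # add a constant, as in t2

# The mean moves by exactly the constant; the spread does not change
c(mean_before = mean(iq), mean_after = mean(shifted))
c(sd_before = sd(iq), sd_after = sd(shifted))
```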
The same steps are repeated for a third simulation of IQ scores. An initial sample of 1000 scores is drawn with mean 100 and standard deviation 5. Two transformed versions are then created by repeatedly multiplying by 2, and all three are stacked into one data frame for use in the animation.
v1 <- rnorm(1000,100,5)
v2 <- v1*2
v3 <- v2*2
# Convert each version to a one-column data frame named "IQ"
# (as.matrix() has no nrows/ncols arguments, so the conversion is done directly)
v1 <- data.frame(IQ = v1)
v2 <- data.frame(IQ = v2)
v3 <- data.frame(IQ = v3)
dataset <- c(rep(1,1000),rep(2,1000),rep(3,1000))
fullmat <- matrix(0,3000,2)
fullmat[,1] <- dataset
partmat <- rbind(v1,v2,v3)
fullmat[,2] <- partmat[,1]
colnames(fullmat) <- c("dataset","IQ")
fullmat <- as.data.frame(fullmat)
Density curve plots are constructed for the initial and transformed IQ score distributions.
anim_plot4 <- ggplot(fullmat, aes(IQ)) +
  geom_density(col = "black", fill = "blue") +
  transition_states(dataset, transition_length = 2, state_length = 4) +
  shadow_mark(alpha = .3)
anim_plot4
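Multiplying every score by a positive constant multiplies both the mean and the standard deviation by that constant, so each successive density curve sits further right and is wider and lower. The sketch below checks this numerically; the names `iq` and `scaled` and the seed are assumptions added for illustration.

```r
set.seed(1)  # assumption: seed added for reproducibility

iq     <- rnorm(1000, mean = 100, sd = 5)  # initial sample, as in v1
scaled <- iq * 2                           # multiply by a constant, as in v2

# Both the centre and the spread are doubled
c(mean_before = mean(iq), mean_after = mean(scaled))
c(sd_before = sd(iq), sd_after = sd(scaled))
```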