Introduction

This RMarkdown document presents R code that simulates ability estimate distributions for two groups on three separate tests. It then builds two density plots, each showing the ability distributions as density curves, with vertical lines marking the three cut-scores that separate four proficiency categories.

Initial Cut-score Simulation and Data Frame Preparation

The first block of code does the following:

  1. Load appropriate R packages.
  2. Simulate 3 cut-scores for each test.
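  3. Create a data frame which contains cut-scores and test number.

# ggplot2 for plotting; dplyr supplies the %>% pipe used later
library(ggplot2)
library(dplyr)

# Test identifier, one row per test
Test <- c(1,2,3)
Test <- data.frame(Test)

# One cut-score per test at each boundary: rnorm(n, mean, sd) on the logit scale
Cut_1 <- rnorm(3,-.75,.15)
Cut_2 <- rnorm(3,0,.15)
Cut_3 <- rnorm(3,.75,.15)

# Combine into a 3-row data frame with columns Test, Cut_1, Cut_2, Cut_3
Cut_set <- cbind(Test, Cut_1, Cut_2, Cut_3)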
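
Because the cut-scores are random draws, they change on every run, and nothing in the code guarantees that they come out in increasing order within a test. The sketch below is one possible way to make the step reproducible and to check the ordering. The seed value and the name `cuts_ordered` are assumptions added for illustration and are not part of the original code.

```r
set.seed(2024)  # assumed seed, chosen arbitrarily for reproducibility

# Same cut-score simulation as above, built in a single data.frame() call
Cut_set <- data.frame(
  Test  = 1:3,
  Cut_1 = rnorm(3, -0.75, 0.15),
  Cut_2 = rnorm(3,  0.00, 0.15),
  Cut_3 = rnorm(3,  0.75, 0.15)
)

# TRUE for each test whose cut-scores are strictly increasing
cuts_ordered <- with(Cut_set, Cut_1 < Cut_2 & Cut_2 < Cut_3)
cuts_ordered
```

With means 0.75 logits apart and a standard deviation of 0.15, out-of-order cut-scores are unlikely but not impossible, so a check of this kind may be worth keeping.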

Simulation of Ability Distributions and Data Frame Preparation

The next block of code generates simulated ability estimate distributions for two groups across three tests. Test number is also generated and combined with the ability estimates into a single data frame. Test and group variables are coerced into factors.

The group means are varied to create the following three situations:

Test 1: No difference between groups.
Test 2: Group 1 outperforms Group 2 on average.
Test 3: Group 2 outperforms Group 1 on average.

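# Each test has 4000 examinees (2500 in Group 1, 1500 in Group 2),
# so each test label must repeat 4000 times to line up with the scores
Test <- c(rep(1,4000), rep(2,4000), rep(3,4000))
Test <- data.frame(Test)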

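# Test 1: all 4000 examinees drawn from one distribution (no group difference)
Score_1 <- rnorm(4000, -.25, 1)
# Test 2: Group 1 (n = 2500, mean 0) followed by Group 2 (n = 1500, mean -.75)
Score_2 <- c(rnorm(2500, 0, .9), rnorm(1500, -.75, .8))
# Test 3: Group 1 (n = 2500, mean 0) followed by Group 2 (n = 1500, mean .75)
Score_3 <- c(rnorm(2500, 0, .9), rnorm(1500, .75, .8))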

Score_1 <- data.frame(Score_1)
Score_2 <- data.frame(Score_2)
Score_3 <- data.frame(Score_3)

Group <- rep(c(rep(1, 2500), rep(2, 1500)), 3)
Group <- as.data.frame(Group) 

colnames(Score_1) <- "Logits"
colnames(Score_2) <- "Logits"
colnames(Score_3) <- "Logits"

Score_set <- rbind(Score_1, Score_2, Score_3)
Score_set_full <- cbind(Test, Group, Score_set)

Score_set_full$Test <- as.factor(Score_set_full$Test)
Cut_set$Test <- as.factor(Cut_set$Test)
Score_set_full$Group <- as.factor(Score_set_full$Group)
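
A quick way to confirm that the simulated data match the three intended situations is to tabulate group sizes and group means by test. The following is a minimal sketch that rebuilds an equivalent data frame in one step so it can run on its own. The seed and the names `n1`, `n2`, `group_sizes`, and `group_means` are assumptions introduced here.

```r
set.seed(2024)  # assumed seed, chosen arbitrarily for reproducibility

n1 <- 2500  # Group 1 size per test
n2 <- 1500  # Group 2 size per test

# Same design as above: tests stacked in order, Group 1 before Group 2 within each test
Score_set_full <- data.frame(
  Test   = factor(rep(1:3, each = n1 + n2)),
  Group  = factor(rep(rep(1:2, times = c(n1, n2)), times = 3)),
  Logits = c(rnorm(n1 + n2, -0.25, 1),
             rnorm(n1, 0, 0.9), rnorm(n2, -0.75, 0.8),
             rnorm(n1, 0, 0.9), rnorm(n2,  0.75, 0.8))
)

# Rows are tests, columns are groups
group_sizes <- table(Score_set_full$Test, Score_set_full$Group)
group_means <- with(Score_set_full, tapply(Logits, list(Test, Group), mean))

group_sizes
round(group_means, 2)
```

If the design is set up as described, the two means in the Test 1 row should be close to each other, Group 1 should have the higher mean in the Test 2 row, and Group 2 should have the higher mean in the Test 3 row.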

Generating Density Plots with Cut-score Lines

A plot is then generated that uses density curves to represent the distribution of ability estimates. Vertical lines are added to represent the cut-scores for that particular test. Plots for all three tests are provided in a single image.

dens_plot1 <- Score_set_full %>%
  ggplot(aes(x = Logits)) +
  labs(title = "Score Distribution") +
  geom_density(fill = "red") +
  xlim(-4, 4) +
  geom_vline(data = Cut_set, aes(xintercept = Cut_1), linetype = "longdash") +
  geom_vline(data = Cut_set, aes(xintercept = Cut_2), linetype = "longdash") +
  geom_vline(data = Cut_set, aes(xintercept = Cut_3), linetype = "longdash") +
  theme(plot.title = element_text(face = "bold", hjust = 0.5)) +
  facet_wrap(~Test)

dens_plot1
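
The three `geom_vline()` layers differ only in which cut-score column they read. One alternative, sketched below, is to stack the cut-scores into long format so that a single layer draws every line. This is an illustrative variation rather than the approach used in this document: the seed, the stand-in score data, and the names `Cut_long`, `Cut`, `Cut_score`, and `dens_plot_long` are all assumptions.

```r
library(ggplot2)

set.seed(2024)  # assumed seed, chosen arbitrarily for reproducibility

# Stand-ins for the document's Cut_set and Score_set_full
Cut_set <- data.frame(
  Test  = factor(1:3),
  Cut_1 = rnorm(3, -0.75, 0.15),
  Cut_2 = rnorm(3,  0.00, 0.15),
  Cut_3 = rnorm(3,  0.75, 0.15)
)
Score_set_full <- data.frame(
  Test   = factor(rep(1:3, each = 4000)),
  Logits = rnorm(12000, 0, 1)  # simplified scores, for illustration only
)

# Long format: one row per test-by-cut combination (hypothetical column names)
Cut_long <- data.frame(
  Test      = rep(Cut_set$Test, times = 3),
  Cut       = rep(c("Cut_1", "Cut_2", "Cut_3"), each = 3),
  Cut_score = c(Cut_set$Cut_1, Cut_set$Cut_2, Cut_set$Cut_3)
)

# A single geom_vline() layer now draws all nine lines, three per facet
dens_plot_long <- ggplot(Score_set_full, aes(x = Logits)) +
  geom_density(fill = "red") +
  xlim(-4, 4) +
  geom_vline(data = Cut_long, aes(xintercept = Cut_score), linetype = "longdash") +
  facet_wrap(~Test)
```

The long format also leaves room to map `Cut` to colour or line type if the three cut-scores need to be told apart.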

Generating Density Plots for Different Groups with Cut-score Lines

A plot similar to the previous one is constructed, but this time with a separate, semi-transparent density curve for each group so that the two groups can be compared within each test.

dens_plot2 <- Score_set_full %>%
  ggplot(aes(x = Logits)) +
  labs(title = "Score Distribution") +
  geom_density(aes(fill = Group), alpha = .5) +
  xlim(-4, 4) +
  geom_vline(data = Cut_set, aes(xintercept = Cut_1), linetype = "longdash") +
  geom_vline(data = Cut_set, aes(xintercept = Cut_2), linetype = "longdash") +
  geom_vline(data = Cut_set, aes(xintercept = Cut_3), linetype = "longdash") +
  theme(plot.title = element_text(face = "bold", hjust = 0.5)) +
  facet_wrap(~Test)

dens_plot2
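
The plot shows where each group's distribution falls relative to the cut-scores. The same comparison can be put in numbers by classifying each ability estimate into one of the four proficiency categories and tabulating the proportions by group. The sketch below does this for a single test with the Test 2 score settings. It uses fixed cut-scores of -0.75, 0, and 0.75 (the centres of the distributions from which the cut-scores were drawn) instead of simulated ones. The seed, the "Level 1" to "Level 4" labels, and the variable names are assumptions made for illustration.

```r
set.seed(2024)  # assumed seed, chosen arbitrarily for reproducibility

# Illustrative cut-scores for one test (fixed here, not simulated)
cuts <- c(-0.75, 0, 0.75)

# Test 2 settings: Group 1 (n = 2500) followed by Group 2 (n = 1500)
logits <- c(rnorm(2500, 0, 0.9), rnorm(1500, -0.75, 0.8))
group  <- factor(rep(1:2, times = c(2500, 1500)))

# Three cut-scores split the logit scale into four categories
level <- cut(logits,
             breaks = c(-Inf, cuts, Inf),
             labels = paste("Level", 1:4))  # hypothetical category labels

# Proportion of each group (rows) in each category (columns)
prop_by_group <- prop.table(table(group, level), margin = 1)
round(prop_by_group, 2)
```

Each row of `prop_by_group` sums to 1, so the rows can be compared directly even though the groups differ in size.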