The total number of unique words is:
## [1] 786
library(text2vec)
library(dplyr)
library(purrr)
# Tokenize responses (lowercase, then split into words)
tokens <- students_df$response %>%
  tolower() %>%
  word_tokenizer()
# Build vocab
it <- itoken(tokens, progressbar = FALSE)
vocab <- create_vocabulary(it)
vectorizer <- vocab_vectorizer(vocab)
tcm <- create_tcm(it, vectorizer, skip_grams_window = 5)
# Train GloVe model
glove <- GlobalVectors$new(rank = 50, x_max = 10)
word_main <- glove$fit_transform(tcm, n_iter = 10)
## INFO [11:58:27.592] epoch 1, loss 0.1560
## INFO [11:58:27.610] epoch 2, loss 0.0889
## INFO [11:58:27.621] epoch 3, loss 0.0656
## INFO [11:58:27.629] epoch 4, loss 0.0529
## INFO [11:58:27.637] epoch 5, loss 0.0445
## INFO [11:58:27.645] epoch 6, loss 0.0383
## INFO [11:58:27.654] epoch 7, loss 0.0335
## INFO [11:58:27.662] epoch 8, loss 0.0296
## INFO [11:58:27.670] epoch 9, loss 0.0264
## INFO [11:58:27.679] epoch 10, loss 0.0237
word_context <- glove$components
word_vectors <- word_main + t(word_context)
# Compute Document Embeddings
compute_doc_embedding <- function(words) {
  valid_words <- intersect(words, rownames(word_vectors))
  if (length(valid_words) == 0) {
    rep(0, ncol(word_vectors)) # Return zero vector if no valid words
  } else {
    colMeans(word_vectors[valid_words, , drop = FALSE])
  }
}
students_df <- students_df %>%
  mutate(embedding = map(tokens, compute_doc_embedding))
# Check embeddings
# head(students_df$embedding)
set.seed(123) # Ensure reproducibility
library(cluster)
# Convert embeddings list to matrix
embedding_matrix <- do.call(rbind, students_df$embedding)
# Choose number of clusters (e.g., k=5)
k <- 5
clusters <- kmeans(embedding_matrix, centers = k, nstart = 10)
# Add cluster labels to dataframe
students_df <- students_df %>%
  mutate(cluster = clusters$cluster)
My code first cleans and tokenizes each student response, filtering out common stopwords so that more meaningful terms are used. Next, it builds a vocabulary from these tokens and creates a term co-occurrence matrix (TCM), which counts how often pairs of words appear within five tokens of each other. The TCM is then used to train a GloVe model and generate 50-dimensional word embeddings. A GloVe model learns the relationships between words from how often they appear together, producing numeric vectors in which words used in similar contexts end up close to one another. Each student response is converted into a document-level embedding by averaging the vectors of the words it contains, which gives a rough summary of its semantic content. Finally, k-means clustering is applied to these embeddings, yielding five clusters of responses with similar content.
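The averaging step described above can be illustrated on a toy example. This is only a sketch: the word vectors, the stopword list, and the helper name `embed_response` are made up for illustration, and the code shown earlier does not include a stopword step, so the filter here is an assumption about how that step might look.

```r
# Toy word-vector matrix: 4 words, 3 dimensions (values are made up).
toy_vectors <- matrix(
  c(1, 0, 2,
    3, 2, 0,
    0, 4, 4,
    5, 5, 5),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("tests", "unfair", "anxiety", "the"), NULL)
)

# Hypothetical stopword list; a real analysis would use a fuller one.
toy_stopwords <- c("the", "are", "and")

# Drop stopwords, then average the vectors of the remaining known words.
embed_response <- function(words, vectors, stopwords) {
  kept <- setdiff(words, stopwords)
  known <- intersect(kept, rownames(vectors))
  if (length(known) == 0) {
    return(rep(0, ncol(vectors)))  # No usable words: fall back to a zero vector
  }
  colMeans(vectors[known, , drop = FALSE])
}

toy_tokens <- c("the", "tests", "are", "unfair")
toy_embedding <- embed_response(toy_tokens, toy_vectors, toy_stopwords)
print(toy_embedding)
# [1] 2 1 1
```

Only "tests" and "unfair" survive the filter, so the result is the mean of their two vectors. Because `setdiff()` and `intersect()` return unique values, a word repeated within a response counts once in the average, which is also how `compute_doc_embedding` behaves.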
Below is the number of students in each cluster:
##
## 1 2 3 4 5
## 7 17 16 20 10
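The value k = 5 was chosen by hand. One common way to sanity-check such a choice is to compare the average silhouette width across several candidate values of k. The sketch below is not part of my original analysis; it runs on synthetic data (`toy_embeddings` is a stand-in for `embedding_matrix`) and uses `silhouette()` from the `cluster` package.

```r
library(cluster)  # provides silhouette()

set.seed(123)

# Synthetic stand-in for embedding_matrix: 60 rows, 5 columns, three loose groups.
toy_embeddings <- rbind(
  matrix(rnorm(100, mean = 0), ncol = 5),
  matrix(rnorm(100, mean = 3), ncol = 5),
  matrix(rnorm(100, mean = 6), ncol = 5)
)

candidate_k <- 2:6
toy_dist <- dist(toy_embeddings)  # pairwise Euclidean distances

# Average silhouette width for each candidate k (closer to 1 is better).
avg_sil <- sapply(candidate_k, function(k) {
  fit <- kmeans(toy_embeddings, centers = k, nstart = 10)
  mean(silhouette(fit$cluster, toy_dist)[, "sil_width"])
})

names(avg_sil) <- candidate_k
print(round(avg_sil, 3))
```

On the real data, `toy_embeddings` would be replaced by `embedding_matrix`. A clear peak in the average width suggests a natural number of clusters, while a flat curve suggests that the choice of k is mostly a matter of interpretability.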
The last step is to read the responses in each cluster and determine a rough meaning for each category.
Below are the meanings I assigned to each group after reading and labeling the responses manually:
Cluster | Definition |
---|---|
Cluster 1 | These responses focus on how standardized tests fail to accommodate diverse student needs—such as language barriers, IEPs, learning disabilities, and test anxiety—emphasizing that a one-size-fits-all exam is not equitable for students with different educational and personal backgrounds. |
Cluster 2 | Students describe the systemic unfairness of standardized testing, particularly highlighting how socioeconomic factors (e.g., access to tutors, after-school programs, and quality instruction) create unequal opportunities that the tests themselves do not account for, perpetuating bias and inequality. |
Cluster 3 | These responses strike a more nuanced tone, acknowledging that while standardized tests can provide a common measure of certain skills, they also ignore individual learning styles, differing curriculum coverage, resource disparities, and outside factors that make a single standardized metric incomplete or unfair. |
Cluster 4 | Discussion here revolves around a broader critique of standardized tests—from socioeconomic gaps, cultural bias, and test anxiety to concerns about testing’s narrow focus on certain types of intelligence—though a few participants also note potential benefits of having a universal framework for assessment. |
Cluster 5 | These entries are largely personal anecdotes illustrating the stress, confusion, and lack of preparation or transparency around standardized tests (e.g., not knowing about scholarships, navigating language barriers, or unrecognized learning challenges), underscoring the emotional toll and perceived inequities of the testing process. |
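When labeling clusters by hand, it can help to start with the responses that sit closest to each cluster centroid, since they tend to be the most typical members. The sketch below shows one way to pull them out. It runs on synthetic data, and `toy_embeddings`, `toy_responses`, and `nearest_to_centroid` are hypothetical names rather than objects from my analysis.

```r
set.seed(123)

# Synthetic stand-ins (assumptions): 12 "responses" with 4-dimensional embeddings.
toy_embeddings <- matrix(rnorm(48), nrow = 12)
toy_responses <- paste("response", seq_len(12))

fit <- kmeans(toy_embeddings, centers = 3, nstart = 10)

# For one cluster, return the n responses closest to its centroid.
nearest_to_centroid <- function(cluster_id, n = 2) {
  members <- which(fit$cluster == cluster_id)
  centroid <- fit$centers[cluster_id, ]
  # Euclidean distance from each member's embedding to the centroid.
  diffs <- sweep(toy_embeddings[members, , drop = FALSE], 2, centroid)
  dists <- sqrt(rowSums(diffs^2))
  closest <- members[order(dists)]
  toy_responses[closest[seq_len(min(n, length(closest)))]]
}

representatives <- lapply(seq_len(3), nearest_to_centroid)
names(representatives) <- paste("Cluster", seq_len(3))
print(representatives)
```

With the real data, `students_df$response`, `embedding_matrix`, and `clusters` would take the place of the toy objects. Reading the full cluster is still necessary, but the nearest responses give a quick first impression of each group.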
After analyzing the 70 student responses on the fairness of standardized tests, a few overarching themes emerge. First and foremost is the theme of inequity tied to socioeconomic status. Many responses highlight disparities in access to test preparation resources (tutors, special classes, and better-funded schools), which advantage students from higher-income families while leaving lower-income peers at a disadvantage. Several students also emphasize resource gaps in underfunded schools, which contribute to weaker preparation and, often, lower test scores.
A second prominent theme is the mismatch between standardized formats and individual learning styles and needs. Students with IEPs (for example, myself, with dyslexia), diagnosed or undiagnosed learning disabilities (such as ADHD), and language barriers describe feeling marginalized by a one-size-fits-all test. They argue that these tests do not capture actual growth or reflect their true capabilities. Students also describe significant test anxiety, which leads to poor performance that does not represent their understanding of the material. This calls into question whether a high-pressure exam is the best measure of academic achievement.
Additionally, several students acknowledged the intended goal of standardized tests—to provide a uniform measure of learning across diverse groups—yet critiqued how inconsistent curricula and quality of teaching across regions or schools undermine these tests’ legitimacy.
Some students mentioned that teachers may “teach to the test,” potentially narrowing real learning. Others noted that while standardized tests can offer snapshots of certain skill sets (e.g., reading comprehension, math problem-solving), they do not account for soft skills like creativity, critical thinking, perseverance, or collaboration.
Through quantitative clustering (k-means on text embeddings), I identified five clusters of responses. Each cluster underscored a dimension of test fairness or unfairness, whether focusing on lack of accommodations, the emotional toll of testing, or questions of curricular alignment.
While most respondents leaned toward viewing standardized tests as unfair, a few recognized partial benefits—such as establishing consistent benchmarks or catching learning gaps early.
Visual inspection of word frequency (through word clouds and bar charts) confirmed that terms like fair, resources, anxiety, and inequitable appeared frequently. These high-level trends match the primary student concerns about inequities in resources and the stress involved. Additionally, references to language barriers, pressure, and preparation suggest that students from non-English-speaking or low-income backgrounds feel disproportionately disadvantaged.
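The term counts behind a frequency bar chart can be computed with base R alone. The sketch below uses three made-up responses and a hypothetical stopword list in place of `students_df$response`, so the numbers it prints say nothing about the real data.

```r
# Made-up responses standing in for students_df$response.
toy_responses <- c(
  "Tests are not fair when resources differ.",
  "Test anxiety makes the tests feel unfair.",
  "Fair tests need equal resources."
)

# Hypothetical stopword list.
toy_stopwords <- c("are", "not", "when", "the", "need", "makes", "feel")

# Lowercase, split on anything that is not a letter, drop empty strings and stopwords.
words <- unlist(strsplit(tolower(toy_responses), "[^a-z]+"))
words <- words[nzchar(words) & !words %in% toy_stopwords]

# Count terms and sort them by frequency.
term_counts <- sort(table(words), decreasing = TRUE)
print(head(term_counts, 3))

# barplot(head(term_counts, 3)) would draw the matching bar chart.
```

In this toy input, "tests" appears three times and "fair" and "resources" twice each. Note that "test" and "tests" are counted separately because no stemming is applied.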
Overall, the consensus is that standardized tests do not fully account for students’ varied backgrounds, learning styles, and resources. The voices collected emphasize the pressing need to reevaluate how we measure academic performance, calling for more nuanced, equitable, and flexible approaches that capture the breadth of student learning.
(I loved this assignment)
I chose a clustering approach (k-means on text embeddings) with qualitative thematic analysis, because it balances data-driven insights with the nuances of individual student perspectives. The strength of this approach lies in its ability to group similar responses objectively while still capturing the depth of personal experiences. Visualizations like word clouds and frequency charts helped identify common concerns (e.g., anxiety, inequity, and resources) at a glance, reinforcing thematic findings. The results aligned with expectations—most students viewed standardized tests as unfair—but I was surprised by how frequently language barriers and test anxiety were mentioned as major disadvantages. Additionally, while most responses were negative, a few students acknowledged potential benefits, showing a more nuanced debate than anticipated.