The Exame Nacional do Ensino Médio (ENEM) is one of Brazil’s most consequential large-scale educational assessments, taken annually by millions of students who have completed or are completing secondary education. Originally created in 1998 as a diagnostic tool to evaluate the quality of basic education, the ENEM has since evolved into a high-stakes exam that plays a central role in shaping educational trajectories across the country. Today, ENEM scores are used in multiple national programs, including public university admissions (via Sisu), private university scholarships (Prouni), and government-funded student loans (FIES). Because of its broad range of functions and massive participation, the ENEM is arguably the most influential assessment in contemporary Brazilian education.
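The ENEM is divided into five major components: Natural Sciences (CN), Human Sciences (CH), Languages and Codes (LC), Mathematics (MT), and the essay (redação).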
The first four components consist of standardized multiple-choice tests, while the essay is a written argumentative text evaluated by trained raters. The essay portion holds substantial weight within the overall scoring system and often serves as a decisive factor in university admissions.
ENEM evaluates students on:
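This project explores the following: how mean essay scores in ENEM 2024 vary across Brazil's five regions, how the five essay competencies (C1 to C5) compare within each region, and whether regional gaps are concentrated in particular competencies.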
library(tidyverse)
library(geobr)
library(sf)
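# Load the raw file. No header is requested, so the original column names arrive
# as row 1 and every column is read as character; that row is dropped below.
RESULTADOS_2024 <- read.table("~/Downloads/microdados_enem_2024/DADOS/RESULTADOS_2024.csv",
                              sep = ";", quote = "\"", stringsAsFactors = FALSE)
enem <- RESULTADOS_2024[-1, ]
# Replace the original column names with short English ones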
names(enem) <- c(
"ID","year","school","countyID","county","stateID","state","dpadm",
"locationID","sfe","countytestID","countytest","statetestID","statetest",
"attCN","attCH","attLC","attMT","testIDcn","testIDch","testIDlc","testIDmt",
"scoreCN","scoreCH","scoreLC","scoreMT","answersCN","answersCH","answersLC",
"answersMT","TP_LINGUA","rubricCN","rubricCH","rubricLC","rubricMT",
"status_essay","scoreC1","scoreC2","scoreC3","scoreC4","scoreC5","essay_score"
)
# Assign each state to one of Brazil's five macro-regions
enem_reg <- enem %>%
  mutate(region = case_when(
    state %in% c("AC","AP","AM","PA","RO","RR","TO") ~ "North",
    state %in% c("AL","BA","CE","MA","PB","PE","PI","RN","SE") ~ "Northeast",
    state %in% c("DF","GO","MT","MS") ~ "Central-West",
    state %in% c("ES","MG","RJ","SP") ~ "Southeast",
    state %in% c("PR","RS","SC") ~ "South",
    TRUE ~ NA_character_
  ))
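A quick way to confirm that the region assignment covers all 27 federative units exactly once is sketched below in base R. The names `region_states` and `state_to_region` and the code `"XX"` are illustrative and do not appear in the analysis; the state lists are copied from the `case_when()` call.

```r
# State codes per macro-region, copied from the case_when() call above.
region_states <- list(
  North          = c("AC", "AP", "AM", "PA", "RO", "RR", "TO"),
  Northeast      = c("AL", "BA", "CE", "MA", "PB", "PE", "PI", "RN", "SE"),
  `Central-West` = c("DF", "GO", "MT", "MS"),
  Southeast      = c("ES", "MG", "RJ", "SP"),
  South          = c("PR", "RS", "SC")
)

# Flatten the lists and count codes and repeats.
all_states <- unlist(region_states, use.names = FALSE)
n_states   <- length(all_states)            # 26 states plus the Federal District
n_repeated <- sum(duplicated(all_states))   # 0 when no code is listed twice

# Named lookup vector: state code -> region label.
state_to_region <- setNames(
  rep(names(region_states), lengths(region_states)),
  all_states
)

# "XX" is a made-up code; an unknown code yields NA, like the TRUE ~ NA branch.
lookup <- unname(state_to_region[c("SP", "AM", "XX")])
print(n_states)
print(n_repeated)
print(lookup)
```

A lookup vector like this could replace the `case_when()` call, although the explicit version may be easier to read.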
# Select relevant columns and convert the score columns from character to numeric
enem_sel <- enem_reg %>%
  select(region, state, scoreC1:scoreC5, essay_score) %>%
  mutate(across(scoreC1:essay_score, as.numeric))
# Mean of each competency and of the total essay score, by region
mean_scores <- enem_sel %>%
  drop_na(region) %>%
  group_by(region) %>%
  summarise(across(scoreC1:essay_score, ~ mean(.x, na.rm = TRUE)))
mean_scores
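Because the score columns are read as text, `as.numeric()` turns blank or unparseable entries into `NA`, and `na.rm = TRUE` then drops them from each mean. The sketch below uses made-up values (not ENEM records) to show the effect. Whether the real file has blank scores, for example for absent participants, is an assumption that should be checked against the data.

```r
# Hypothetical raw score strings, including a blank field and a missing value.
raw_scores <- c("560", "720", "", "480.5", NA)

# Blank strings and NA both become NA after conversion.
scores    <- as.numeric(raw_scores)
n_missing <- sum(is.na(scores))

# Without na.rm the mean is NA; with na.rm it uses only the recorded scores.
mean_all     <- mean(scores)
mean_present <- mean(scores, na.rm = TRUE)

print(n_missing)
print(mean_all)
print(mean_present)
```

Under this handling, the regional means describe only participants with a recorded essay score, so reporting the share of missing values per region alongside the means would be a reasonable addition.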
# Region geometries from geobr
regions_br <- read_region(year = 2020)
# Translate the English labels into the region names expected in name_region
region_mapping <- c(
  "North"="Norte","Northeast"="Nordeste","Central-West"="Centro Oeste",
  "Southeast"="Sudeste","South"="Sul"
)
# Attach the regional means to the geometries
map_data <- regions_br %>%
  left_join(mean_scores %>% mutate(region_geobr = unname(region_mapping[region])),
            by = c("name_region" = "region_geobr"))
# One centroid per region, used to position the score labels
centroids <- st_centroid(map_data)
ggplot(map_data) +
  geom_sf(aes(fill = essay_score), color = "black") +
  geom_sf_text(data = centroids, aes(label = round(essay_score,1)), size = 4) +
  scale_fill_viridis_c() +
  labs(title = "ENEM Mean Essay Score by Region",
       fill = "Mean Essay Score") +
  theme_minimal()
The map shows pronounced regional differences in essay performance. The Southeast has the highest mean score (≈ 651.5), followed by the South and Central-West. The Northeast (≈ 594.7) and North (≈ 572.3) score substantially lower, which points to a marked north–south divide. This pattern is consistent with the socioeconomic and infrastructural disparities long documented in Brazilian education.
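To put the regional differences on a common footing, the three means quoted in the paragraph above can be turned into point gaps and percentage gaps relative to the Southeast. This is a minimal sketch using only those reported values. The vector names are illustrative, and the South and Central-West are omitted because their means are not quoted in the text.

```r
# Mean essay scores as reported in the text (rounded to one decimal).
reported <- c(Southeast = 651.5, Northeast = 594.7, North = 572.3)

# Gap of each region to the Southeast, in points and as a share of its mean.
gap_points  <- reported[["Southeast"]] - reported
gap_percent <- 100 * gap_points / reported[["Southeast"]]

print(round(gap_points, 1))
print(round(gap_percent, 1))
```

On the 0 to 1000 essay scale, the North trails the Southeast by about 79 points and the Northeast by about 57 points.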
# Reshape to long format: one row per participant and competency
enem_long <- enem_sel %>%
  drop_na(region) %>%
  pivot_longer(scoreC1:scoreC5, names_to = "competency", values_to = "score")
# Grouped bars: mean score of each competency within each region
ggplot(enem_long, aes(region, score, fill = competency)) +
  stat_summary(fun = "mean", geom = "bar", position = "dodge") +
  labs(title = "Mean of Essay Competencies by Region",
       x = "Region", y = "Mean Score") +
  theme_minimal()
The bar plot shows that the competencies rank in the same order in every region. C2 (topic development) is the highest-scoring competency nationwide, while C5 (proposal for intervention) tends to be the lowest, particularly in northern regions. The Southeast scores highest on every competency, whereas the North scores lowest.
These similar patterns suggest that regions differ mainly in overall proficiency rather than in the structure of their writing strengths and weaknesses.
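The claim that competencies rank in the same order in every region can be checked directly rather than read off the bars. The sketch below does so in base R on a small matrix of hypothetical means. `RegionA`, `RegionB` and their values are invented for illustration and are not ENEM results.

```r
# Hypothetical region-by-competency means (NOT the ENEM results).
toy_means <- rbind(
  RegionA = c(C1 = 130, C2 = 150, C3 = 125, C4 = 135, C5 = 110),
  RegionB = c(C1 = 120, C2 = 138, C3 = 112, C4 = 124, C5 = 95)
)

# For each region, list the competencies from highest to lowest mean.
# The result has one column per region.
ranking <- apply(toy_means, 1, function(x) names(sort(x, decreasing = TRUE)))

# TRUE when every region's ordering matches the first region's ordering.
same_order <- all(ranking == ranking[, 1])

print(ranking)
print(same_order)
```

With the analysis objects, the same check could presumably be run on `as.matrix(mean_scores[, paste0("scoreC", 1:5)])`, with row names set from `mean_scores$region`.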
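# Mean score per region and competency; drop grouping so the result is a plain tibble
line_data <- enem_long %>%
  group_by(region, competency) %>%
  summarise(mean_score = mean(score, na.rm = TRUE), .groups = "drop")
# One line per region across the five competencies
# (linewidth replaces the deprecated size argument for lines in ggplot2 >= 3.4.0)
ggplot(line_data, aes(competency, mean_score, group = region, color = region)) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 2.5) +
  labs(title = "Mean Scores of ENEM Essay Competencies by Region",
       x = "Competency", y = "Mean Score") +
  theme_minimal()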
All regions show a similar shape: a strong peak at C2 and a decline toward C5. The Southeast has the highest means on every competency, whereas the North has the lowest, especially in C3 and C5.
The nearly parallel lines reinforce that regional disparities are largely systemic rather than tied to particular writing skills.
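"Nearly parallel" can be quantified by subtracting each region's own average from its competency means and comparing the centered profiles. If the lines were exactly parallel, the centered profiles would coincide. The sketch below uses hypothetical values (not ENEM results), with one region given an extra drop at C3 and C5 to show what a departure looks like.

```r
# Hypothetical region-by-competency means (NOT the ENEM results).
toy_means <- rbind(
  RegionA = c(C1 = 130, C2 = 150, C3 = 125, C4 = 135, C5 = 110),
  RegionB = c(C1 = 120, C2 = 140, C3 = 115, C4 = 125, C5 = 100),  # RegionA shifted by -10
  RegionC = c(C1 = 115, C2 = 135, C3 = 100, C4 = 120, C5 = 80)    # extra drop at C3 and C5
)

# Remove each region's overall level, leaving only the shape of its profile.
centered <- sweep(toy_means, 1, rowMeans(toy_means))

# Range of the centered values across regions, per competency.
# Zero everywhere would mean perfectly parallel lines.
spread <- apply(centered, 2, function(x) diff(range(x)))

print(centered)
print(spread)
```

Here `RegionA` and `RegionB` have identical centered profiles, while `RegionC` pushes the spread up, most of all at C5. A large spread on one competency in the real means would indicate a region-specific weakness rather than a uniform gap.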
This quantitative analysis shows clear and consistent regional disparities in ENEM 2024 essay scores. Regions differ mainly in overall writing proficiency rather than in specific competencies: all regions follow the same performance pattern, with C2 highest and C5 lowest.
The sharp contrast between the Southeast/South and the North/Northeast mirrors historical socioeconomic inequalities, which makes ENEM essay results a reflection of broader structural disparities in Brazilian education.
Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira (INEP). (n.d.). Matriz de referência do ENEM. INEP. https://download.inep.gov.br/download/enem/matriz_referencia.pdf
Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira (INEP). (2025). Microdados do ENEM 2024 [Dataset]. Portal Brasileiro de Dados Abertos. https://dados.gov.br/dados/conjuntos-dados/inep-microdados-do-enem
Instituto Nacional de Estudos e Pesquisas Educacionais Anísio Teixeira (INEP). (2025). A redação no ENEM 2025: Cartilha do(a) participante [PDF]. Ministério da Educação. https://download.inep.gov.br/publicacoes/institucionais/avaliacoes_e_exames_da_educacao_basica/a_redacao_no_enem_2025_cartilha_do_participante.pdf
Pereira, R. H. M., & Gonçalves, C. N. (2019). geobr: An R package to easily access shapefiles of the Brazilian Institute of Geography and Statistics (R package). GitHub repository.