Data source: NCES Common Core of Data (CCD), 2022–2023 district-level data for North Carolina.
The dataset includes student enrollment counts disaggregated by race, sex, and grade, along with teacher staffing counts.
These variables allow me to compute student–teacher ratios and district-level demographic composition, which are the two indicators used in this analysis.
These data allow me to see which districts serve different groups of students and how many students each district’s teachers serve, which is central to my equity question.
What I Did // Context
I will explore educational equity in North Carolina public schools using district-level data from the NCES Common Core of Data (CCD).
Educational equity refers to whether students across different districts have fair access to educational resources and learning conditions, regardless of income, race, or geographic location.
I will focus on district-level patterns, not individual students.
I use CCD data to see how staffing and who is being served (demographics) differ across districts, as a first step in describing equity in basic learning conditions.
Research Question
How do student–teacher ratios and demographic composition vary across North Carolina school districts, and what do these patterns suggest about equity in access to educational resources?
Why I Chose These Variables
Student–teacher ratio directly reflects staffing equity, showing which districts have fewer instructional adults per student and therefore more limited learning support.
Racial demographic percentages highlight differences in who districts are serving, helping identify where historically marginalized groups are concentrated.
Comparing ratios and demographics together reveals districts where high student need and limited staffing overlap, pointing to areas where resource gaps may be most urgent.
I use these variables to create district-level summaries and visualizations that show how these conditions vary statewide.
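One way to make the "overlap" idea concrete is to flag districts that are high on both indicators at once. The sketch below is only an illustration: the toy table, the column names `st_ratio` and `pct_black`, and the 75th-percentile cutoff are all assumptions, not values or rules taken from the CCD analysis.

```r
library(dplyr)

# Toy district table; every value here is made up for illustration only.
toy_dashboard <- tibble(
  lea_name  = c("District A", "District B", "District C", "District D", "District E"),
  st_ratio  = c(4.1, 4.3, 5.0, 9.8, 12.5),
  pct_black = c(0.05, 0.30, 0.12, 0.35, 0.40)
)

# Assumed rule: a district is flagged when it sits at or above the
# 75th percentile on BOTH the ratio and the demographic share.
flagged <- toy_dashboard |>
  mutate(
    high_ratio = st_ratio  >= quantile(st_ratio, 0.75),
    high_share = pct_black >= quantile(pct_black, 0.75),
    overlap    = high_ratio & high_share
  )

# Districts where limited staffing and a high share coincide
flagged |>
  filter(overlap) |>
  select(lea_name, st_ratio, pct_black)
```

A percentile cutoff is a relative rule, so it always flags some districts; a fixed threshold would be a reasonable alternative if a policy benchmark is available.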
Stakeholders & Use Cases
State and district education leaders
Policymakers
Equity-focused organizations
Families and community advocates
These Groups Could Use the Findings To:
Identify and advocate for under-resourced districts and more equitable policies and funding.
Support communities with higher needs, especially in a digitized society where basic educational conditions still vary.
For example, if we see clusters of districts with high student–teacher ratios and high proportions of historically marginalized students, these groups could prioritize those regions for staffing investments or targeted wraparound services.
Why This Topic?
North Carolina districts differ widely in staffing and student demographics.
These differences can create uneven educational opportunities across the state.
I grew up with strong educational support and resources, and I am interested in identifying where additional attention and resources may be needed.
Understanding how staffing and demographics vary shows where inequitable learning conditions may exist.
My goal is not to tell a personal story, but to use data to highlight where conditions may be less equitable. Because statewide equity is hard to see from one classroom, district-level CCD analysis helps surface patterns that aren’t obvious locally.
Check the Data
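The CCD file includes ~1.58 million rows and 84 variables, including IDs, grades, race/ethnicity, sex, and staff classifications. Each row holds counts (student_count, staff_count) for one combination of district and category, not an individual student.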
To analyze equity at the district level, I need to aggregate these records and create district-level indicators (student–teacher ratios and race/ethnicity percentages).
This preview helps verify that district identifiers, student counts, and staff variables are available to calculate equity-related indicators.
library(dplyr)

# How big is the CCD dataset?
nc_districts |>
  summarise(
    n_rows = n(),
    n_cols = ncol(nc_districts)
  )

# Peek at most relevant variables
nc_districts |>
  select(
    leaid, lea_name, grade, race_ethnicity, sex,
    student_count, staff, staff_count
  ) |>
  head()
# A tibble: 6 × 8
leaid lea_name grade race_ethnicity sex student_count staff staff_count
<chr> <chr> <chr> <chr> <chr> <dbl> <chr> <dbl>
1 3700001 NC Health … Grad… American Indi… Fema… NA All … NA
2 3700001 NC Health … Grad… American Indi… Fema… NA Elem… NA
3 3700001 NC Health … Grad… American Indi… Fema… NA Elem… NA
4 3700001 NC Health … Grad… American Indi… Fema… NA Inst… NA
5 3700001 NC Health … Grad… American Indi… Fema… NA Kind… NA
6 3700001 NC Health … Grad… American Indi… Fema… NA LEA … NA
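The preview above shows NA in both count columns for its first six rows, so it may be worth checking how much of each district's data is missing before aggregating. This is a minimal sketch on a toy stand-in for the long file; the toy values and the name `toy_long` are invented, and only the column names mirror the preview.

```r
library(dplyr)

# Toy stand-in for the long CCD file; values are invented.
toy_long <- tibble(
  leaid         = c("3700001", "3700001", "3700002", "3700002"),
  student_count = c(NA, 120, 85, NA),
  staff_count   = c(NA, NA, 10.5, 4)
)

# Count rows and missing values in each count column, per district.
na_summary <- toy_long |>
  group_by(leaid) |>
  summarise(
    n_rows           = n(),
    na_student_count = sum(is.na(student_count)),
    na_staff_count   = sum(is.na(staff_count)),
    .groups = "drop"
  )

na_summary
```

Districts where every row is NA would end up with a total of zero after `sum(..., na.rm = TRUE)`, so they need to be handled explicitly rather than treated as true zeros.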
This step collapses the long-format CCD file into one row per district, with totals for students and teachers.
I then compute the student–teacher ratio, which is my primary indicator of staffing equity for this project.
I aggregate the student and staff count rows to the district level because my research question focuses on district-wide equity, not individual students or schools. The CCD structure requires collapsing millions of rows to create interpretable district indicators.
I remove missing or invalid values during this step to ensure that each district’s student–teacher ratio is calculated from complete and accurate records, which is essential for making valid comparisons when evaluating district-wide equity.
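The aggregation described above could look roughly like the sketch below. It rests on two loudly stated assumptions: that the long file repeats each student count once per staff category (so counts must be de-duplicated before summing), and that teachers can be isolated by a staff label. The label "Teachers", the toy values, and the names `toy_long` and `toy_dist` are hypothetical; the real staff labels are truncated in the preview and are not guessed here.

```r
library(dplyr)

# Toy long file: each student cell is repeated once per staff category.
# This layout is an ASSUMPTION about how the real file is structured.
toy_long <- tibble(
  leaid          = rep(c("D1", "D2"), each = 4),
  race_ethnicity = rep(c("Group X", "Group X", "Group Y", "Group Y"), times = 2),
  student_count  = c(100, 100, 50, 50, 300, 300, NA, NA),
  staff          = rep(c("Teachers", "Other staff"), times = 4),  # hypothetical labels
  staff_count    = c(10, 5, 10, 5, 20, 8, 20, 8)
)

# Students: keep one row per district x student cell before summing,
# so repeated staff rows do not inflate the total. In the real file the
# cell keys would also include grade and sex.
students <- toy_long |>
  distinct(leaid, race_ethnicity, student_count) |>
  group_by(leaid) |>
  summarise(total_students = sum(student_count, na.rm = TRUE), .groups = "drop")

# Teachers: keep only the (hypothetical) teacher category, one row per district.
teachers <- toy_long |>
  filter(staff == "Teachers") |>
  distinct(leaid, staff, staff_count) |>
  group_by(leaid) |>
  summarise(total_teachers = sum(staff_count, na.rm = TRUE), .groups = "drop")

# Ratio: drop districts with no students or no teachers to avoid 0 or Inf.
toy_dist <- students |>
  inner_join(teachers, by = "leaid") |>
  filter(total_students > 0, total_teachers > 0) |>
  mutate(st_ratio = total_students / total_teachers)

toy_dist
```

If the real file does not repeat counts this way, the `distinct()` calls are unnecessary, which is why checking the row structure first matters.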
Wrangle: Build Equity Dashboard Dataset
library(dplyr)
library(tidyr)
library(stringr)

# 1. Get race % by district from the long student file
nc_demo <- nc_districts |>
  group_by(leaid, lea_name, race_ethnicity) |>
  summarise(
    race_students = sum(student_count, na.rm = TRUE),
    .groups = "drop"
  ) |>
  group_by(leaid, lea_name) |>
  mutate(
    total_students_demo = sum(race_students),
    race_percent = race_students / total_students_demo
  ) |>
  ungroup()

# 2. Keep just a few key race groups and make them wide
demo_wide <- nc_demo |>
  mutate(race_key = case_when(
    race_ethnicity == "Black or African American" ~ "pct_black",
    race_ethnicity == "Hispanic/Latino" ~ "pct_hispanic",
    race_ethnicity == "White" ~ "pct_white",
    TRUE ~ NA_character_
  )) |>
  filter(!is.na(race_key)) |>
  select(leaid, lea_name, race_key, race_percent) |>
  pivot_wider(
    names_from = race_key,
    values_from = race_percent
  )

# 3. Join with the district-level ratio data
dashboard_data <- nc_districts_dist |>
  left_join(demo_wide, by = c("leaid", "lea_name"))
This step creates district-level race and ethnicity percentages and then combines them with the student–teacher ratio to form a single “equity dashboard” dataset. By merging these indicators into one table, I can visualize how staffing levels and demographic composition relate to each other across districts.
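After a left join like this one, it can help to confirm that every district with a ratio also received demographic shares and that the shares fall between 0 and 1. The sketch below uses two small invented tables (`toy_ratio`, `toy_demo`) rather than the real objects, so it only illustrates the kind of check, not its result on the CCD data.

```r
library(dplyr)

# Invented inputs: D3 has a ratio but no demographic row.
toy_ratio <- tibble(leaid = c("D1", "D2", "D3"), st_ratio = c(4.2, 5.1, 9.7))
toy_demo  <- tibble(
  leaid     = c("D1", "D2"),
  pct_black = c(0.10, 0.35),
  pct_white = c(0.60, 0.40)
)

toy_dashboard <- toy_ratio |>
  left_join(toy_demo, by = "leaid")

# Districts that have a ratio but found no match in the demographic table
unmatched <- toy_ratio |>
  anti_join(toy_demo, by = "leaid")

# Shares should lie between 0 and 1 wherever they are present
shares_ok <- toy_dashboard |>
  summarise(ok = all(between(pct_black, 0, 1), na.rm = TRUE)) |>
  pull(ok)

unmatched
shares_ok
```

Unmatched districts stay in the dashboard with NA shares, so they silently drop out of any plot that uses those columns.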
Equity Dashboard: Key Indicators Across NC Districts
These four visualizations summarize the staffing and demographic conditions across North Carolina school districts:
Most districts cluster around the middle of the distribution, indicating that many NC districts have similar student–teacher ratios.
A smaller group of districts fall toward the right tail (higher ratios), which may signal heavier student loads and potentially more strained instructional capacity.
Very few districts appear at the extreme low end of the distribution, suggesting that unusually small student–teacher loads are rare.
This distribution shows whether staffing capacity varies enough to signal inequitable conditions.
Explore: Student–Teacher Ratios Across NC Districts
The spread in this distribution shows how much staffing loads differ across districts. Because student–teacher ratios reflect access to instructional support, a wide range of ratios suggests uneven conditions that may point to inequitable district-level learning environments.
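A histogram of this kind could be drawn along the following lines. This is a sketch using ggplot2 on invented ratios; the column name `st_ratio`, the bin width, and the dashed median line are illustrative choices, not a reproduction of the figure from this analysis.

```r
library(ggplot2)

# Invented ratios for ten toy districts.
toy_dist <- data.frame(
  st_ratio = c(3.8, 4.0, 4.2, 4.3, 4.5, 5.0, 5.6, 6.1, 9.4, 12.0)
)

ratio_hist <- ggplot(toy_dist, aes(x = st_ratio)) +
  geom_histogram(binwidth = 1, boundary = 0) +
  # Dashed line marks the median so the right tail is easy to judge
  geom_vline(xintercept = median(toy_dist$st_ratio), linetype = "dashed") +
  labs(
    x = "Students per teacher",
    y = "Number of districts",
    title = "Distribution of student-teacher ratios (toy data)"
  )

ratio_hist
```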
Key Findings
Student–teacher ratios in NC districts cluster around 4–6 students per teacher, with the dataset showing:
Average ratio: 5.19 students per teacher
Median ratio: 4.26 students per teacher
Lowest ratio: 0.60
Highest ratio: 22.1
These values show that while most districts operate within the typical 4–6 range, a subset of districts experience staffing loads 3–4× the state median, and the highest ratio (22.1) is roughly 5× the median (4.26), indicating meaningful variation in instructional capacity.
% Black and % Hispanic/Latino students vary widely across NC districts, with distributions ranging from near 0% in some districts to 30–40% in others.
This demonstrates significant demographic differences across the state.
Districts with higher percentages of Black or Hispanic/Latino students tend to show higher student–teacher ratios, based on the scatterplot where districts with the largest demographic percentages also appear higher on the y-axis.
This suggests potential racial disparities in staffing levels.
Because poverty data was not available, this analysis focuses solely on staffing and demographic composition, which still reveal measurable differences across districts.
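The summary figures and the "tends to be higher" statement above can both be computed directly. The sketch below shows one way to do so on invented data: summary statistics via `summarise()`, and a Spearman rank correlation as an assumed choice for describing the association (the source does not say which measure, if any, was used). The last line applies the same max-to-median arithmetic to the reported figures.

```r
library(dplyr)

# Invented district table for illustration only.
toy_dashboard <- tibble(
  leaid     = paste0("D", 1:8),
  st_ratio  = c(3.9, 4.1, 4.3, 4.6, 5.2, 6.0, 8.5, 12.0),
  pct_black = c(0.02, 0.10, 0.05, 0.20, 0.15, 0.30, 0.25, 0.40)
)

# Center, range, and how far the highest district sits from the median
ratio_summary <- toy_dashboard |>
  summarise(
    mean_ratio    = mean(st_ratio),
    median_ratio  = median(st_ratio),
    min_ratio     = min(st_ratio),
    max_ratio     = max(st_ratio),
    max_vs_median = max_ratio / median_ratio
  )

# Rank-based association between staffing load and demographic share
rho <- cor(toy_dashboard$st_ratio, toy_dashboard$pct_black, method = "spearman")

ratio_summary
rho

# Same arithmetic with the reported figures: highest ratio over median
reported_max_vs_median <- 22.1 / 4.26  # about 5.2
```

A rank correlation is used here because a few very high ratios would otherwise dominate a Pearson correlation.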
Implications for Equity
Why these patterns matter:
Variations in student–teacher ratios may produce unequal learning conditions, especially in higher-need districts.
Demographic disparities overlap with staffing disparities, a pattern more consistent with systemic inequities than with random differences.
District-level data reveals inequities that may be hidden at the classroom level.
These patterns highlight districts that may need additional support or targeted interventions to promote equitable access to educational resources.
Limitations & Ethical/Equity Considerations
Limitations
This analysis does not include poverty measures (such as FRL eligibility) because those data were not available or complete in my cleaned dataset. Without a reliable poverty indicator, I cannot directly assess how socioeconomic disadvantage interacts with staffing or demographic patterns.
Because the CCD file used here is aggregated to the district level, districtwide averages can mask substantial variation between individual schools. Some schools within a district may have much higher ratios or different demographic compositions than the district average, meaning the patterns I present reflect broad district trends rather than school-level inequities.
These indicators describe association, not causation. Higher student–teacher ratios in certain demographic contexts may reflect structural factors not captured in the CCD.
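The masking effect of district averages can be seen in a small worked example. The school-level table below is entirely hypothetical (the analysis itself uses no school-level data); it shows two districts with the same district-wide ratio but very different spreads across their schools.

```r
library(dplyr)

# Hypothetical schools: both districts have 800 students and 80 teachers.
toy_schools <- tibble(
  leaid    = c("D1", "D1", "D2", "D2"),
  school   = c("S1", "S2", "S3", "S4"),
  students = c(200, 600, 400, 400),
  teachers = c(40, 40, 40, 40)
)

district_view <- toy_schools |>
  group_by(leaid) |>
  summarise(
    district_ratio   = sum(students) / sum(teachers),
    min_school_ratio = min(students / teachers),
    max_school_ratio = max(students / teachers),
    .groups = "drop"
  )

district_view
```

Both toy districts come out at 10 students per teacher overall, yet one of them contains a school at 5 and another at 15, which a district-level indicator cannot show.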
Ethical & Equity Considerations
Findings should highlight system-level inequities and should not be read as deficits in students or communities.
Interpreting demographic patterns requires care to avoid stereotyping or misrepresenting districts.
Aggregated data supports privacy and responsible data use.
Results should inform supportive, equity-focused action, not punitive measures.
Transparency about data limitations promotes ethical communication with educators and policymakers.
Final Data View: Where Equity Pressures Show Up
This pattern matters for my research question because if districts with higher percentages of Black students also tend to have higher student–teacher ratios, it suggests that staffing resources may not be distributed equitably across districts.
Although this does not prove causation, the upward trend highlights potential structural inequities in how staffing capacity aligns with demographic composition.
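A scatterplot with a fitted trend line is one way to show the upward pattern described here. The sketch below uses invented values and a simple linear fit as an assumed choice of trend; the column names `pct_black` and `st_ratio` follow the dashboard naming, but nothing here reproduces the actual figure or its slope.

```r
library(ggplot2)

# Invented values for eight toy districts.
toy_dashboard <- data.frame(
  pct_black = c(0.02, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.40),
  st_ratio  = c(3.9, 4.3, 4.1, 5.2, 4.6, 8.5, 6.0, 12.0)
)

equity_scatter <- ggplot(toy_dashboard, aes(x = pct_black, y = st_ratio)) +
  geom_point() +
  # Straight-line trend; describes association only, not causation
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  scale_x_continuous(labels = function(x) paste0(round(100 * x), "%")) +
  labs(x = "% Black students", y = "Students per teacher")

# Slope of the same line: change in ratio per unit change in share
trend <- lm(st_ratio ~ pct_black, data = toy_dashboard)
slope <- coef(trend)[["pct_black"]]

equity_scatter
slope
```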
Conclusion: What This Analysis Shows
North Carolina districts differ substantially in staffing levels and demographics.
Districts with higher percentages of Black and Hispanic/Latino students often face higher student–teacher ratios.
Together, these findings suggest unequal access to educational resources, indicating that equity challenges persist statewide.
Across North Carolina districts, staffing levels and demographic composition vary enough to suggest meaningful differences in the learning conditions students experience. Districts with higher percentages of Black students tend to exhibit higher student–teacher ratios, indicating that access to instructional support is not evenly distributed. While these patterns do not prove causation, they highlight areas where staffing capacity and demographic composition intersect in ways that may disadvantage certain student populations. Overall, the findings suggest that educational equity pressures are present at the district level, with some districts appearing structurally better resourced than others.