| Variable Name | Description |
|---|---|
| state | U.S. state or territory |
| total_students | Total number of students in the state |
| category | Student group (e.g., race/ethnicity, disability, English learner) |
| number | Number of students disciplined for race-based harassment |
| percent | Percent of total students in that category who were disciplined |
| Black or African American - Percent | % of Black students disciplined |
| White - Percent | % of White students disciplined |
| Asian - Percent | % of Asian students disciplined |
| (similar for other races / identities) | |
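The regression output later in this report refers to a long-format table (`df_long`) with one row per state and category. A minimal Python sketch of that kind of reshaping follows, assuming the wide columns use the `<group> - Percent` naming shown in the table above. The function name and the sample values are hypothetical, and the original analysis appears to have been run in R rather than Python.

```python
def wide_to_long(rows, suffix=" - Percent"):
    """Turn one-row-per-state records into one row per (state, category)."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column.endswith(suffix):
                long_rows.append({
                    "state": row["state"],
                    "category": column[: -len(suffix)],
                    "percent": value,  # None marks a missing value
                })
    return long_rows


# Hypothetical values for illustration only; not taken from the data set.
wide_rows = [
    {"state": "State A", "Black or African American - Percent": 20.0,
     "White - Percent": 50.0, "Asian - Percent": 2.0},
    {"state": "State B", "Black or African American - Percent": 15.0,
     "White - Percent": None, "Asian - Percent": 3.0},
]

long_rows = wide_to_long(wide_rows)
# Rows with a missing percent would be dropped before model fitting.
complete_rows = [r for r in long_rows if r["percent"] is not None]
print(len(long_rows), len(complete_rows))
```

Keeping missing values explicit makes it easier to see how many rows a model will drop.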
I chose this data set because it highlights possible disparities in how students are disciplined for bullying or harassment based on race. While it’s important to support and protect victims, we also need to understand the students who are doing the harm. By looking at patterns in who gets flagged, we can start to uncover underlying issues and work toward solutions that prevent bullying before it happens. This kind of insight can help schools create fairer, more supportive environments for everyone.
Interpretation:
This bar plot displays the percentage of students disciplined for race-based harassment in California, broken down by demographic group. Latino and Black students have the highest discipline rates, and both groups show much higher rates than Asian or Pacific Islander students.
Given that Latino students make up a large share of California’s school population, part of the higher count could stem from population size. However, national studies have found that Latino and Black students are often disciplined more harshly and more frequently than their counterparts, even when engaging in similar behaviors. Factors contributing to this include:
Implicit biases from teachers or administrators
Language and cultural barriers leading to misunderstanding or unfair assumptions
Zero-tolerance policies that penalize minor behaviors, disproportionately affecting students of color
This graph raises concern about systemic discipline inequities in California schools and highlights the need for culturally responsive practices and restorative alternatives.
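The population-size caveat above can be checked by dividing a group's share of disciplined students by its share of enrollment. A minimal Python sketch follows, assuming both shares are available as percentages. The function name and all numbers are hypothetical and are not taken from the California data.

```python
def representation_ratio(disciplined_share, enrollment_share):
    """Ratio above 1 means the group is over-represented among disciplined students."""
    if enrollment_share <= 0:
        raise ValueError("enrollment_share must be positive")
    return disciplined_share / enrollment_share


# Hypothetical (share of disciplined students, share of enrollment), in percent.
shares = {
    "Group A": (55.0, 55.0),
    "Group B": (12.0, 6.0),
    "Group C": (4.0, 10.0),
}

ratios = {group: representation_ratio(d, e) for group, (d, e) in shares.items()}
for group, ratio in ratios.items():
    print(f"{group}: {ratio:.2f}")
```

A large group can have the highest count and still have a ratio near 1, which is the distinction between population size and disproportionate discipline.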
This map shows how discipline rates for Black students vary across states. The darkest areas indicate states where Black students are disciplined most frequently for race-based harassment. This geographic pattern suggests the issue isn’t just local but national—rooted in broader systemic factors. The rates raise concern about bias in reporting, inconsistent school policies, and uneven enforcement across state lines.
Paired t-test
data: t_data$`Black or African American` and t_data$White
t = -7.5898, df = 52, p-value = 5.689e-10
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
-40.92323 -23.80885
sample estimates:
mean difference
-32.36604
I used a paired t-test because I’m comparing discipline rates in the same states between two groups.
This test tells us whether the difference in average percent disciplined between Black and White students is statistically significant. The results are shown in the output above.
R computes the paired difference as the first group minus the second, here Black minus White. The mean difference of -32.37 therefore means that, across states, the average percent for Black students is about 32.4 percentage points lower than for White students (95% CI: -40.92 to -23.81). The difference is statistically significant (p ≈ 5.7e-10), so it is very unlikely to be due to chance alone. A gap of this size shows that the two groups are represented very differently in the disciplinary data. On its own, though, it does not establish over- or under-representation, because that requires comparing each group's percent with its share of enrollment. Where such a comparison shows a disparity, it may point to institutional or systemic inequities, possibly related to how behavior is interpreted, who gets reported, or how school personnel enforce discipline differently depending on race.
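As a sketch of what the paired test computes, the Python snippet below takes per-state differences (first group minus second) and divides their mean by its standard error. It also checks that the reported confidence interval is centered on the reported mean difference. The `paired_t` helper and the five-state sample are hypothetical; only the values labeled as reported come from the output above.

```python
import math
import statistics


def paired_t(x, y):
    """Return (mean_diff, t_stat, df), with differences taken as x - y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, with at least 2 pairs")
    diffs = [a - b for a, b in zip(x, y)]
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, mean_diff / std_err, len(diffs) - 1


# Hypothetical percentages for five states (illustration only).
first_group = [18.0, 22.5, 15.0, 20.0, 25.0]
second_group = [55.0, 48.0, 60.5, 50.0, 45.0]
mean_diff, t_stat, df = paired_t(first_group, second_group)
print(f"mean difference = {mean_diff:.2f}, t = {t_stat:.2f}, df = {df}")

# Values reported in the t-test output above.
reported_mean_diff = -32.36604
reported_t = -7.5898
ci_low, ci_high = -40.92323, -23.80885

ci_midpoint = (ci_low + ci_high) / 2        # should equal the mean difference
std_err = reported_mean_diff / reported_t   # standard error implied by t
implied_critical_t = (ci_high - ci_low) / 2 / std_err
print(f"CI midpoint = {ci_midpoint:.5f}, implied critical t = {implied_critical_t:.3f}")
```

Because differences are taken as first minus second, a negative mean difference means the first group's values are lower on average.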
Interpretation:
This boxplot compares the state-level discipline percentages for Black and White students. Consistent with the negative mean difference in the t-test (Black minus White), the White percentages are higher on average than the Black percentages. The plot also shows how much each group's percentage varies from state to state, and it visualizes the nationwide gap between the two groups in the disciplinary data.
Call:
lm(formula = percent ~ category + state, data = df_long)
Residuals:
Min 1Q Median 3Q Max
-52.719 -4.210 -1.883 3.119 55.056
Coefficients:
Estimate Std. Error t value
(Intercept) 4.281e+00 6.072e+00 0.705
categoryAsian -2.109e+00 2.762e+00 -0.764
categoryBlack or African American 1.537e+01 2.762e+00 5.566
categoryHispanic or Latino of any race 1.140e+01 2.762e+00 4.128
categoryNative Hawaiian or Other Pacific Islander -2.287e+00 2.762e+00 -0.828
categoryWhite 4.774e+01 2.762e+00 17.285
stateAlabama 5.167e-01 8.208e+00 0.063
stateAlaska -8.667e-01 8.208e+00 -0.106
stateArizona 1.667e-01 8.208e+00 0.020
stateArkansas -2.333e-01 8.208e+00 -0.028
stateCalifornia -1.000e-01 8.208e+00 -0.012
stateColorado -8.333e-02 8.208e+00 -0.010
stateConnecticut -6.167e-01 8.208e+00 -0.075
stateDelaware 7.000e-01 8.208e+00 0.085
stateDistrict of Columbia 7.000e-01 8.208e+00 0.085
stateFlorida 2.000e-01 8.208e+00 0.024
stateGeorgia 4.500e-01 8.208e+00 0.055
stateHawaii -1.050e+00 8.208e+00 -0.128
stateIdaho 6.000e-01 8.208e+00 0.073
stateIllinois -3.333e-02 8.208e+00 -0.004
stateIndiana 3.333e-02 8.208e+00 0.004
stateIowa -1.333e-01 8.208e+00 -0.016
stateKansas -1.000e-01 8.208e+00 -0.012
stateKentucky -4.833e-01 8.208e+00 -0.059
stateLouisiana 7.167e-01 8.208e+00 0.087
stateMaine 1.167e-01 8.208e+00 0.014
stateMaryland -2.333e-01 8.208e+00 -0.028
stateMassachusetts 1.000e-01 8.208e+00 0.012
stateMichigan 1.500e-01 8.208e+00 0.018
stateMinnesota -1.667e-01 8.208e+00 -0.020
stateMississippi 5.500e-01 8.208e+00 0.067
stateMissouri 8.972e-16 8.208e+00 0.000
stateMontana 2.000e-01 8.208e+00 0.024
stateNebraska -1.000e-01 8.208e+00 -0.012
stateNevada -6.667e-02 8.208e+00 -0.008
stateNew Hampshire 5.667e-01 8.208e+00 0.069
stateNew Jersey 3.333e-01 8.208e+00 0.041
stateNew Mexico 7.000e-01 8.208e+00 0.085
stateNew York 2.667e-01 8.208e+00 0.032
stateNorth Carolina 2.333e-01 8.208e+00 0.028
stateNorth Dakota 7.000e-01 8.208e+00 0.085
stateOhio -1.167e-01 8.208e+00 -0.014
stateOklahoma -4.500e-01 8.208e+00 -0.055
stateOregon -6.833e-01 8.208e+00 -0.083
statePennsylvania 5.000e-02 8.208e+00 0.006
statePuerto Rico -1.597e+01 8.208e+00 -1.945
stateRhode Island -4.000e-01 8.208e+00 -0.049
stateSouth Carolina 1.833e-01 8.208e+00 0.022
stateSouth Dakota -8.500e-01 8.208e+00 -0.104
stateTennessee 2.000e-01 8.208e+00 0.024
stateTexas 2.500e-01 8.208e+00 0.030
stateUtah 4.000e-01 8.208e+00 0.049
stateVermont 8.333e-02 8.208e+00 0.010
stateVirginia 1.465e-15 8.208e+00 0.000
stateWashington -7.833e-01 8.208e+00 -0.095
stateWest Virginia 1.000e-01 8.208e+00 0.012
stateWisconsin 6.667e-02 8.208e+00 0.008
stateWyoming 3.000e-01 8.208e+00 0.037
Pr(>|t|)
(Intercept) 0.4814
categoryAsian 0.4457
categoryBlack or African American 6.49e-08 ***
categoryHispanic or Latino of any race 4.92e-05 ***
categoryNative Hawaiian or Other Pacific Islander 0.4084
categoryWhite < 2e-16 ***
stateAlabama 0.9499
stateAlaska 0.9160
stateArizona 0.9838
stateArkansas 0.9773
stateCalifornia 0.9903
stateColorado 0.9919
stateConnecticut 0.9402
stateDelaware 0.9321
stateDistrict of Columbia 0.9321
stateFlorida 0.9806
stateGeorgia 0.9563
stateHawaii 0.8983
stateIdaho 0.9418
stateIllinois 0.9968
stateIndiana 0.9968
stateIowa 0.9871
stateKansas 0.9903
stateKentucky 0.9531
stateLouisiana 0.9305
stateMaine 0.9887
stateMaryland 0.9773
stateMassachusetts 0.9903
stateMichigan 0.9854
stateMinnesota 0.9838
stateMississippi 0.9466
stateMissouri 1.0000
stateMontana 0.9806
stateNebraska 0.9903
stateNevada 0.9935
stateNew Hampshire 0.9450
stateNew Jersey 0.9676
stateNew Mexico 0.9321
stateNew York 0.9741
stateNorth Carolina 0.9773
stateNorth Dakota 0.9321
stateOhio 0.9887
stateOklahoma 0.9563
stateOregon 0.9337
statePennsylvania 0.9951
statePuerto Rico 0.0528 .
stateRhode Island 0.9612
stateSouth Carolina 0.9822
stateSouth Dakota 0.9176
stateTennessee 0.9806
stateTexas 0.9757
stateUtah 0.9612
stateVermont 0.9919
stateVirginia 1.0000
stateWashington 0.9240
stateWest Virginia 0.9903
stateWisconsin 0.9935
stateWyoming 0.9709
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 14.22 on 260 degrees of freedom
(159 observations deleted due to missingness)
Multiple R-squared: 0.6529, Adjusted R-squared: 0.5768
F-statistic: 8.58 on 57 and 260 DF, p-value: < 2.2e-16
I ran a linear regression modeling the percent of students disciplined for harassment/bullying, using category (demographic group) and state as predictors.
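The fit statistics in the summary can be cross-checked from R-squared, the number of predictors, and the residual degrees of freedom. A minimal Python sketch follows, using the standard formulas for adjusted R-squared and the overall F-statistic. The variable names are illustrative, and the inputs are copied from the output above.

```python
r_squared = 0.6529
num_predictors = 57   # numerator degrees of freedom of the F-statistic
residual_df = 260     # residual degrees of freedom

# Observations used in the fit: predictors + residual df + 1 for the intercept.
num_obs = num_predictors + residual_df + 1

adjusted_r_squared = 1 - (1 - r_squared) * (num_obs - 1) / residual_df
f_statistic = (r_squared / num_predictors) / ((1 - r_squared) / residual_df)

print(f"observations used: {num_obs}")
print(f"adjusted R-squared: {adjusted_r_squared:.4f}")
print(f"F-statistic: {f_statistic:.2f}")
```

Both values agree with the summary (0.5768 and 8.58). The drop from 0.6529 to 0.5768 reflects the penalty for fitting 57 coefficients, most of which are state terms.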
categoryBlack or African American → +15.37 percentage points
categoryHispanic or Latino of any race → +11.40 percentage points
categoryWhite → +47.74 percentage points (!)
Interpretation:
Each coefficient is the estimated difference in percent from the omitted baseline category, holding state fixed. The model shows that race/ethnicity strongly predicts the discipline percentage: being in the “Black or African American” or “Hispanic or Latino” category raises the expected percentage relative to the baseline, even after accounting for state. The coefficient for “White” is by far the largest, which may reflect how the data is recorded or different reporting practices by race. No individual state coefficient was statistically significant (Puerto Rico came closest, at p = 0.053), suggesting that race has more predictive power than geography in this case.
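To see what the category coefficients mean in percent terms, each one can be added to the intercept, which gives the fitted value for that category in the reference state. A short Python sketch follows, using the estimates from the regression output above. The baseline category's name is not printed in that output, so it is labeled generically here.

```python
intercept = 4.281
category_coefficients = {
    "Asian": -2.109,
    "Black or African American": 15.37,
    "Hispanic or Latino of any race": 11.40,
    "Native Hawaiian or Other Pacific Islander": -2.287,
    "White": 47.74,
}

# Fitted percent for each category in the reference state.
fitted = {"(baseline category)": intercept}
for category, coefficient in category_coefficients.items():
    fitted[category] = intercept + coefficient

for category, value in sorted(fitted.items(), key=lambda item: item[1]):
    print(f"{category:45s} {value:6.2f}")

# Gap between two categories: the intercept cancels out.
black_minus_white = (category_coefficients["Black or African American"]
                     - category_coefficients["White"])
print(f"Black minus White: {black_minus_white:.2f}")
```

The Black minus White gap of about -32.37 agrees with the mean difference reported by the paired t-test.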
This logistic regression model estimates the odds of a student being disciplined for bullying or harassing someone based on race, using race/ethnicity and state of residence as predictors. The model outputs odds ratios (ORs), which show how many times higher (or lower) a group's odds of being disciplined are compared to a baseline group (usually the first or omitted category, like “American Indian or Alaska Native”).
Race/Ethnicity:
Black or African American: ~88× higher odds (significant)
Hispanic or Latino: ~107× higher odds (significant)
White: ~1068× higher odds (significant)
Asian & Native Hawaiian: No significant difference from baseline
State: No state showed a statistically significant difference in odds after adjusting for race/ethnicity
Interpretation:
The logistic model shows that the odds of being disciplined are dramatically higher for certain groups, especially White, Latino, and Black students, compared to the baseline group. For example, Latino students had odds over 100× higher than the reference category. Odds ratios compare odds, not probabilities, so “100× higher odds” does not mean “100× more likely.” Disparities this large are unlikely to be explained by behavior alone. They may reflect systemic issues in how discipline is assigned, possibly due to teacher bias, policy frameworks, and under-resourced support systems.
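Because odds ratios act on odds rather than probabilities, the same odds ratio implies very different changes in probability depending on the baseline. A minimal Python sketch follows, assuming a hypothetical baseline probability of 1% (the report does not state the baseline rate). The odds ratios are the approximate values listed above, and the function name is illustrative.

```python
def apply_odds_ratio(baseline_probability, odds_ratio):
    """Return the probability implied by scaling the baseline odds."""
    if not 0 < baseline_probability < 1:
        raise ValueError("baseline_probability must be strictly between 0 and 1")
    baseline_odds = baseline_probability / (1 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1 + new_odds)


baseline_probability = 0.01  # hypothetical; not reported in the analysis
reported_odds_ratios = {
    "Black or African American": 88,
    "Hispanic or Latino": 107,
    "White": 1068,
}

implied = {
    group: apply_odds_ratio(baseline_probability, odds_ratio)
    for group, odds_ratio in reported_odds_ratios.items()
}
for group, probability in implied.items():
    ratio = probability / baseline_probability
    print(f"{group}: OR {reported_odds_ratios[group]} -> "
          f"probability {probability:.3f} ({ratio:.0f}x the baseline)")
```

Under this assumed baseline, an odds ratio of 88 corresponds to roughly a 47-fold increase in probability, not an 88-fold one, and the gap widens as the odds ratio grows.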
Adopt Restorative Justice Practices:
Shift from zero-tolerance toward approaches that prioritize reflection, accountability, and conflict resolution.
Train Educators on Implicit Bias:
Ensure school staff understand how race/ethnicity can unconsciously influence discipline decisions.
Standardize Reporting:
Implement clearer definitions and reporting protocols to reduce subjectivity and inconsistency.
Support Over Punish:
Increase funding for counselors, behavior specialists, and mental health professionals—especially in high-need districts.
Include Student Voice:
Engage students—especially from overdisciplined groups—in shaping discipline policies that are just and effective.
When implemented thoughtfully, these changes can help create school environments that are fairer, more supportive, and more inclusive for all students.