Graduation rates are a key measure of academic success and provide valuable insight into the effectiveness of educational systems. This analysis examines how high school graduation rates vary across the boroughs of New York City and over time. One hypothesis is that recent cohorts are more likely than earlier cohorts to reach a graduation rate of 70% or above, possibly because of improvements in educational policies and resources over the years. By understanding these trends, we can better assess the factors that contribute to students graduating high school and explore potential areas for future intervention.
How do high school graduation rates vary across the boroughs in New York City, and are recent cohorts more likely to achieve a graduation rate of 70% or higher?
This analysis utilizes graduation rate data provided by the Department of Education (DOE) and distributed by NYC Open Data.
The relevant variables for this analysis are:
Borough: The location of the high school attended by a cohort of students (Bronx, Brooklyn, Manhattan, Queens, Staten Island).
Cohort Year: The year in which a cohort of students entered 9th grade.
% Grads: The percent of students from a cohort who graduated high school.
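# Load the packages used below (tidyverse for read_csv() and the dplyr verbs, modelsummary for the tables) and import the data
library(tidyverse)
library(modelsummary)
DATA <- read_csv("C:/Users/dijan/Documents/DATA 712/graduation_data.csv", show_col_types = FALSE)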
# Data description table
datasummary_skim(DATA)
Variable | Unique | Missing Pct. | Mean | SD | Min | Median | Max | Histogram |
---|---|---|---|---|---|---|---|---|
Cohort Year | 15 | 0 | 2008.1 | 4.0 | 2001.0 | 2008.0 | 2015.0 | |
# Total Cohort | 193 | 0 | 14417.5 | 6413.3 | 13.0 | 15599.0 | 22595.0 | |
# Grads | 318 | 0 | 9928.0 | 4724.2 | 0.0 | 10487.0 | 16843.0 | |
% Grads | 214 | 0 | 67.5 | 14.7 | 0.0 | 71.1 | 84.7 | |
# Total Regents | 316 | 0 | 8430.7 | 4460.2 | 0.0 | 8629.0 | 15672.0 | |
% Total Regents of Cohort | 234 | 0 | 56.8 | 17.3 | 0.0 | 61.1 | 77.5 | |
% Total Regents of Grads | 198 | 1 | 81.9 | 15.6 | 9.9 | 88.9 | 96.6 | |
# Advanced Regents | 237 | 0 | 2497.0 | 1315.3 | 0.0 | 2829.0 | 5061.0 | |
% Advanced Regents of Cohort | 126 | 0 | 17.4 | 6.2 | 0.0 | 18.5 | 30.4 | |
% Advanced Regents of Grads | 147 | 1 | 24.7 | 6.9 | 0.0 | 26.1 | 37.4 | |
# Regents without Advanced | 315 | 0 | 5933.7 | 3273.7 | 0.0 | 6492.0 | 11570.0 | |
% Regents without Advanced of Cohort | 210 | 0 | 39.5 | 12.9 | 0.0 | 44.1 | 55.4 | |
% Regents without Advanced of Grads | 219 | 1 | 57.2 | 13.1 | 9.7 | 60.4 | 75.3 | |
# Local | 300 | 0 | 1499.9 | 1266.1 | 0.0 | 921.0 | 5532.0 | |
% Local of Cohort | 168 | 0 | 10.7 | 7.0 | 0.0 | 8.3 | 30.6 | |
% Local of Grads | 203 | 1 | 18.1 | 15.7 | 3.4 | 11.1 | 90.4 | |
# Still Enrolled | 314 | 0 | 1921.4 | 1551.6 | 3.0 | 1397.0 | 6571.0 | |
% Still Enrolled | 179 | 0 | 13.2 | 8.2 | 1.9 | 10.5 | 34.7 | |
# Dropout | 280 | 0 | 2108.9 | 1235.0 | 10.0 | 2140.0 | 5967.0 | |
% Dropout | 160 | 0 | 16.1 | 11.5 | 5.1 | 13.1 | 76.9 | |
# SACC (IEP Diploma) | 228 | 0 | 261.1 | 193.2 | 0.0 | 230.0 | 802.0 | |
% SACC (IEP Diploma) of Cohort | 48 | 0 | 1.8 | 1.1 | 0.0 | 1.6 | 6.9 | |
# TASC (GED) | 167 | 0 | 175.2 | 122.8 | 0.0 | 151.0 | 667.0 | |
% TASC (GED) of Cohort | 36 | 0 | 1.3 | 0.7 | 0.0 | 1.2 | 5.0 | |

| Variable | Level | N | % |
|---|---|---|---|
| Borough | Bronx | 62 | 19.0 |
| | Brooklyn | 62 | 19.0 |
| | District 79 | 17 | 5.2 |
| | Manhattan | 62 | 19.0 |
| | Queens | 62 | 19.0 |
| | Staten Island | 62 | 19.0 |
| | All Students | 327 | 100.0 |
| Cohort | 4 year August | 57 | 17.4 |
| | 4 year June | 81 | 24.8 |
| | 5 year August | 45 | 13.8 |
| | 5 year June | 75 | 22.9 |
| | 6 year June | 69 | 21.1 |
# Renaming variables
DATA <- DATA %>%
rename(Cohort_year = `Cohort Year`,
Grad_percentage = `% Grads`)
# Removing the rows where the borough is "District 79"
DATA <- DATA %>%
filter(Borough != "District 79")
# Convert Borough from character to factor
DATA <- DATA %>%
mutate(Borough = as.factor(Borough))
# Convert Cohort_year from numeric to factor
DATA <- DATA %>%
mutate(Cohort_year = as.factor(Cohort_year))
# Convert graduation percentage to a binary variable (1 if the graduation percentage is 70 or above, 0 if the graduation percentage is below 70)
DATA <- DATA %>%
mutate(Grad_binary = case_when(
Grad_percentage >= 70 ~ 1,
TRUE ~ 0))
# Model predicting graduation binary outcome by borough
m1 <- glm(Grad_binary ~ Borough, family = binomial, data = DATA)
summary(m1)
##
## Call:
## glm(formula = Grad_binary ~ Borough, family = binomial, data = DATA)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.9095 0.3788 -5.041 4.64e-07 ***
## BoroughBrooklyn 1.8450 0.4562 4.044 5.24e-05 ***
## BoroughManhattan 2.4376 0.4611 5.286 1.25e-07 ***
## BoroughQueens 2.5786 0.4642 5.554 2.79e-08 ***
## BoroughStaten Island 4.3432 0.6009 7.228 4.90e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 424.58 on 309 degrees of freedom
## Residual deviance: 329.49 on 305 degrees of freedom
## AIC: 339.49
##
## Number of Fisher Scoring iterations: 5
# Model predicting graduation binary outcome by borough and cohort year
m2 <- glm(Grad_binary ~ Borough + Cohort_year, family = binomial, data = DATA)
summary(m2)
##
## Call:
## glm(formula = Grad_binary ~ Borough + Cohort_year, family = binomial,
## data = DATA)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -8.361e+00 1.525e+00 -5.483 4.18e-08 ***
## BoroughBrooklyn 4.200e+00 1.098e+00 3.827 0.000130 ***
## BoroughManhattan 5.338e+00 1.129e+00 4.727 2.28e-06 ***
## BoroughQueens 5.603e+00 1.137e+00 4.930 8.23e-07 ***
## BoroughStaten Island 8.542e+00 1.294e+00 6.600 4.10e-11 ***
## Cohort_year2002 2.052e-15 1.354e+00 0.000 1.000000
## Cohort_year2003 8.404e-01 1.310e+00 0.641 0.521232
## Cohort_year2004 1.546e+00 1.286e+00 1.202 0.229392
## Cohort_year2005 1.851e+00 1.218e+00 1.520 0.128574
## Cohort_year2006 3.197e+00 1.185e+00 2.698 0.006985 **
## Cohort_year2007 3.761e+00 1.199e+00 3.138 0.001704 **
## Cohort_year2008 3.475e+00 1.190e+00 2.919 0.003513 **
## Cohort_year2009 3.761e+00 1.199e+00 3.138 0.001704 **
## Cohort_year2010 4.393e+00 1.228e+00 3.576 0.000348 ***
## Cohort_year2011 5.218e+00 1.299e+00 4.018 5.87e-05 ***
## Cohort_year2012 7.336e+00 1.618e+00 4.535 5.76e-06 ***
## Cohort_year2013 8.829e+00 1.729e+00 5.105 3.31e-07 ***
## Cohort_year2014 8.447e+00 1.754e+00 4.817 1.46e-06 ***
## Cohort_year2015 8.447e+00 2.000e+00 4.225 2.39e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 424.58 on 309 degrees of freedom
## Residual deviance: 188.35 on 291 degrees of freedom
## AIC: 226.35
##
## Number of Fisher Scoring iterations: 7
# Model predicting graduation binary outcome by borough and cohort year, including an interaction between borough and cohort year
m3 <- glm(Grad_binary ~ Borough * Cohort_year, family = binomial, data = DATA)
summary(m3)
##
## Call:
## glm(formula = Grad_binary ~ Borough * Cohort_year, family = binomial,
## data = DATA)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.057e+01 1.024e+04 -0.002 0.998
## BoroughBrooklyn 4.395e-06 1.448e+04 0.000 1.000
## BoroughManhattan 6.513e-06 1.448e+04 0.000 1.000
## BoroughQueens 4.940e-06 1.448e+04 0.000 1.000
## BoroughStaten Island 2.126e+01 1.024e+04 0.002 0.998
## Cohort_year2002 5.226e-06 1.448e+04 0.000 1.000
## Cohort_year2003 5.262e-06 1.448e+04 0.000 1.000
## Cohort_year2004 5.257e-06 1.448e+04 0.000 1.000
## Cohort_year2005 5.291e-06 1.354e+04 0.000 1.000
## Cohort_year2006 5.252e-06 1.295e+04 0.000 1.000
## Cohort_year2007 5.356e-06 1.295e+04 0.000 1.000
## Cohort_year2008 5.323e-06 1.295e+04 0.000 1.000
## Cohort_year2009 5.559e-06 1.295e+04 0.000 1.000
## Cohort_year2010 5.244e-06 1.295e+04 0.000 1.000
## Cohort_year2011 5.177e-06 1.295e+04 0.000 1.000
## Cohort_year2012 2.016e+01 1.024e+04 0.002 0.998
## Cohort_year2013 2.097e+01 1.024e+04 0.002 0.998
## Cohort_year2014 2.057e+01 1.024e+04 0.002 0.998
## Cohort_year2015 2.057e+01 1.024e+04 0.002 0.998
## BoroughBrooklyn:Cohort_year2002 -4.398e-06 2.047e+04 0.000 1.000
## BoroughManhattan:Cohort_year2002 -6.516e-06 2.047e+04 0.000 1.000
## BoroughQueens:Cohort_year2002 -4.944e-06 2.047e+04 0.000 1.000
## BoroughStaten Island:Cohort_year2002 -5.226e-06 1.448e+04 0.000 1.000
## BoroughBrooklyn:Cohort_year2003 -4.435e-06 2.047e+04 0.000 1.000
## BoroughManhattan:Cohort_year2003 -6.553e-06 2.047e+04 0.000 1.000
## BoroughQueens:Cohort_year2003 1.987e+01 1.773e+04 0.001 0.999
## BoroughStaten Island:Cohort_year2003 -5.262e-06 1.448e+04 0.000 1.000
## BoroughBrooklyn:Cohort_year2004 -4.429e-06 2.047e+04 0.000 1.000
## BoroughManhattan:Cohort_year2004 1.987e+01 1.773e+04 0.001 0.999
## BoroughQueens:Cohort_year2004 1.987e+01 1.773e+04 0.001 0.999
## BoroughStaten Island:Cohort_year2004 -5.257e-06 1.448e+04 0.000 1.000
## BoroughBrooklyn:Cohort_year2005 -4.464e-06 1.915e+04 0.000 1.000
## BoroughManhattan:Cohort_year2005 2.057e+01 1.698e+04 0.001 0.999
## BoroughQueens:Cohort_year2005 1.947e+01 1.698e+04 0.001 0.999
## BoroughStaten Island:Cohort_year2005 4.055e-01 1.354e+04 0.000 1.000
## BoroughBrooklyn:Cohort_year2006 1.918e+01 1.651e+04 0.001 0.999
## BoroughManhattan:Cohort_year2006 2.097e+01 1.651e+04 0.001 0.999
## BoroughQueens:Cohort_year2006 2.097e+01 1.651e+04 0.001 0.999
## BoroughStaten Island:Cohort_year2006 1.987e+01 1.518e+04 0.001 0.999
## BoroughBrooklyn:Cohort_year2007 2.016e+01 1.651e+04 0.001 0.999
## BoroughManhattan:Cohort_year2007 2.097e+01 1.651e+04 0.001 0.999
## BoroughQueens:Cohort_year2007 2.195e+01 1.651e+04 0.001 0.999
## BoroughStaten Island:Cohort_year2007 1.987e+01 1.518e+04 0.001 0.999
## BoroughBrooklyn:Cohort_year2008 2.016e+01 1.651e+04 0.001 0.999
## BoroughManhattan:Cohort_year2008 2.097e+01 1.651e+04 0.001 0.999
## BoroughQueens:Cohort_year2008 2.097e+01 1.651e+04 0.001 0.999
## BoroughStaten Island:Cohort_year2008 1.987e+01 1.518e+04 0.001 0.999
## BoroughBrooklyn:Cohort_year2009 2.097e+01 1.651e+04 0.001 0.999
## BoroughManhattan:Cohort_year2009 2.097e+01 1.651e+04 0.001 0.999
## BoroughQueens:Cohort_year2009 2.097e+01 1.651e+04 0.001 0.999
## BoroughStaten Island:Cohort_year2009 1.987e+01 1.518e+04 0.001 0.999
## BoroughBrooklyn:Cohort_year2010 2.097e+01 1.651e+04 0.001 0.999
## BoroughManhattan:Cohort_year2010 2.195e+01 1.651e+04 0.001 0.999
## BoroughQueens:Cohort_year2010 2.195e+01 1.651e+04 0.001 0.999
## BoroughStaten Island:Cohort_year2010 1.987e+01 1.518e+04 0.001 0.999
## BoroughBrooklyn:Cohort_year2011 2.195e+01 1.651e+04 0.001 0.999
## BoroughManhattan:Cohort_year2011 2.195e+01 1.651e+04 0.001 0.999
## BoroughQueens:Cohort_year2011 4.113e+01 1.831e+04 0.002 0.998
## BoroughStaten Island:Cohort_year2011 1.987e+01 1.518e+04 0.001 0.999
## BoroughBrooklyn:Cohort_year2012 1.792e+00 1.448e+04 0.000 1.000
## BoroughManhattan:Cohort_year2012 2.097e+01 1.651e+04 0.001 0.999
## BoroughQueens:Cohort_year2012 2.097e+01 1.651e+04 0.001 0.999
## BoroughStaten Island:Cohort_year2012 -2.877e-01 1.295e+04 0.000 1.000
## BoroughBrooklyn:Cohort_year2013 2.016e+01 1.651e+04 0.001 0.999
## BoroughManhattan:Cohort_year2013 2.016e+01 1.651e+04 0.001 0.999
## BoroughQueens:Cohort_year2013 2.016e+01 1.651e+04 0.001 0.999
## BoroughStaten Island:Cohort_year2013 -1.099e+00 1.295e+04 0.000 1.000
## BoroughBrooklyn:Cohort_year2014 2.057e+01 1.698e+04 0.001 0.999
## BoroughManhattan:Cohort_year2014 2.057e+01 1.698e+04 0.001 0.999
## BoroughQueens:Cohort_year2014 2.057e+01 1.698e+04 0.001 0.999
## BoroughStaten Island:Cohort_year2014 -6.932e-01 1.354e+04 0.000 1.000
## BoroughBrooklyn:Cohort_year2015 2.057e+01 1.915e+04 0.001 0.999
## BoroughManhattan:Cohort_year2015 2.057e+01 1.915e+04 0.001 0.999
## BoroughQueens:Cohort_year2015 2.057e+01 1.915e+04 0.001 0.999
## BoroughStaten Island:Cohort_year2015 -6.932e-01 1.619e+04 0.000 1.000
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 424.58 on 309 degrees of freedom
## Residual deviance: 172.11 on 235 degrees of freedom
## AIC: 322.11
##
## Number of Fisher Scoring iterations: 19
# Model results table (reuses the fitted models m1, m2, and m3 from above)
models <- list(
"Model 1" = m1,
"Model 2" = m2,
"Model 3" = m3
)
modelsummary(models)
Model 1 | Model 2 | Model 3 | |
---|---|---|---|
(Intercept) | -1.910 | -8.361 | -20.566 |
(0.379) | (1.525) | (10236.636) | |
BoroughBrooklyn | 1.845 | 4.200 | 0.000 |
(0.456) | (1.098) | (14476.786) | |
BoroughManhattan | 2.438 | 5.338 | 0.000 |
(0.461) | (1.129) | (14476.791) | |
BoroughQueens | 2.579 | 5.603 | 0.000 |
(0.464) | (1.137) | (14476.789) | |
BoroughStaten Island | 4.343 | 8.542 | 21.259 |
(0.601) | (1.294) | (10236.636) | |
Cohort_year2002 | 0.000 | 0.000 | |
(1.354) | (14476.788) | ||
Cohort_year2003 | 0.840 | 0.000 | |
(1.310) | (14476.788) | ||
Cohort_year2004 | 1.546 | 0.000 | |
(1.286) | (14476.788) | ||
Cohort_year2005 | 1.851 | 0.000 | |
(1.218) | (13541.796) | ||
Cohort_year2006 | 3.197 | 0.000 | |
(1.185) | (12948.433) | ||
Cohort_year2007 | 3.761 | 0.000 | |
(1.199) | (12948.433) | ||
Cohort_year2008 | 3.475 | 0.000 | |
(1.190) | (12948.433) | ||
Cohort_year2009 | 3.761 | 0.000 | |
(1.199) | (12948.433) | ||
Cohort_year2010 | 4.393 | 0.000 | |
(1.228) | (12948.433) | ||
Cohort_year2011 | 5.218 | 0.000 | |
(1.299) | (12948.434) | ||
Cohort_year2012 | 7.336 | 20.161 | |
(1.618) | (10236.636) | ||
Cohort_year2013 | 8.829 | 20.972 | |
(1.729) | (10236.636) | ||
Cohort_year2014 | 8.447 | 20.566 | |
(1.754) | (10236.636) | ||
Cohort_year2015 | 8.447 | 20.566 | |
(2.000) | (10236.636) | ||
BoroughBrooklyn × Cohort_year2002 | -0.000 | ||
(20473.267) | |||
BoroughManhattan × Cohort_year2002 | -0.000 | ||
(20473.271) | |||
BoroughQueens × Cohort_year2002 | -0.000 | ||
(20473.270) | |||
BoroughStaten Island × Cohort_year2002 | -0.000 | ||
(14476.788) | |||
BoroughBrooklyn × Cohort_year2003 | -0.000 | ||
(20473.267) | |||
BoroughManhattan × Cohort_year2003 | -0.000 | ||
(20473.271) | |||
BoroughQueens × Cohort_year2003 | 19.873 | ||
(17730.372) | |||
BoroughStaten Island × Cohort_year2003 | -0.000 | ||
(14476.788) | |||
BoroughBrooklyn × Cohort_year2004 | -0.000 | ||
(20473.267) | |||
BoroughManhattan × Cohort_year2004 | 19.873 | ||
(17730.374) | |||
BoroughQueens × Cohort_year2004 | 19.873 | ||
(17730.372) | |||
BoroughStaten Island × Cohort_year2004 | -0.000 | ||
(14476.788) | |||
BoroughBrooklyn × Cohort_year2005 | -0.000 | ||
(19150.988) | |||
BoroughManhattan × Cohort_year2005 | 20.566 | ||
(16975.541) | |||
BoroughQueens × Cohort_year2005 | 19.467 | ||
(16975.539) | |||
BoroughStaten Island × Cohort_year2005 | 0.405 | ||
(13541.796) | |||
BoroughBrooklyn × Cohort_year2006 | 19.180 | ||
(16506.075) | |||
BoroughManhattan × Cohort_year2006 | 20.972 | ||
(16506.080) | |||
BoroughQueens × Cohort_year2006 | 20.972 | ||
(16506.078) | |||
BoroughStaten Island × Cohort_year2006 | 19.873 | ||
(15183.383) | |||
BoroughBrooklyn × Cohort_year2007 | 20.161 | ||
(16506.075) | |||
BoroughManhattan × Cohort_year2007 | 20.972 | ||
(16506.080) | |||
BoroughQueens × Cohort_year2007 | 21.952 | ||
(16506.078) | |||
BoroughStaten Island × Cohort_year2007 | 19.873 | ||
(15183.383) | |||
BoroughBrooklyn × Cohort_year2008 | 20.161 | ||
(16506.075) | |||
BoroughManhattan × Cohort_year2008 | 20.972 | ||
(16506.080) | |||
BoroughQueens × Cohort_year2008 | 20.972 | ||
(16506.078) | |||
BoroughStaten Island × Cohort_year2008 | 19.873 | ||
(15183.383) | |||
BoroughBrooklyn × Cohort_year2009 | 20.972 | ||
(16506.075) | |||
BoroughManhattan × Cohort_year2009 | 20.972 | ||
(16506.080) | |||
BoroughQueens × Cohort_year2009 | 20.972 | ||
(16506.078) | |||
BoroughStaten Island × Cohort_year2009 | 19.873 | ||
(15183.383) | |||
BoroughBrooklyn × Cohort_year2010 | 20.972 | ||
(16506.075) | |||
BoroughManhattan × Cohort_year2010 | 21.952 | ||
(16506.080) | |||
BoroughQueens × Cohort_year2010 | 21.952 | ||
(16506.078) | |||
BoroughStaten Island × Cohort_year2010 | 19.873 | ||
(15183.383) | |||
BoroughBrooklyn × Cohort_year2011 | 21.952 | ||
(16506.076) | |||
BoroughManhattan × Cohort_year2011 | 21.952 | ||
(16506.080) | |||
BoroughQueens × Cohort_year2011 | 41.132 | ||
(18311.850) | |||
BoroughStaten Island × Cohort_year2011 | 19.873 | ||
(15183.384) | |||
BoroughBrooklyn × Cohort_year2012 | 1.792 | ||
(14476.786) | |||
BoroughManhattan × Cohort_year2012 | 20.972 | ||
(16506.080) | |||
BoroughQueens × Cohort_year2012 | 20.972 | ||
(16506.078) | |||
BoroughStaten Island × Cohort_year2012 | -0.288 | ||
(12948.433) | |||
BoroughBrooklyn × Cohort_year2013 | 20.161 | ||
(16506.076) | |||
BoroughManhattan × Cohort_year2013 | 20.161 | ||
(16506.080) | |||
BoroughQueens × Cohort_year2013 | 20.161 | ||
(16506.078) | |||
BoroughStaten Island × Cohort_year2013 | -1.099 | ||
(12948.433) | |||
BoroughBrooklyn × Cohort_year2014 | 20.566 | ||
(16975.536) | |||
BoroughManhattan × Cohort_year2014 | 20.566 | ||
(16975.541) | |||
BoroughQueens × Cohort_year2014 | 20.566 | ||
(16975.539) | |||
BoroughStaten Island × Cohort_year2014 | -0.693 | ||
(13541.796) | |||
BoroughBrooklyn × Cohort_year2015 | 20.566 | ||
(19150.988) | |||
BoroughManhattan × Cohort_year2015 | 20.566 | ||
(19150.992) | |||
BoroughQueens × Cohort_year2015 | 20.566 | ||
(19150.990) | |||
BoroughStaten Island × Cohort_year2015 | -0.693 | ||
(16185.541) | |||
Num.Obs. | 310 | 310 | 310 |
AIC | 339.5 | 226.4 | 322.1 |
BIC | 358.2 | 297.3 | 602.4 |
Log.Lik. | -164.744 | -94.177 | -86.057 |
F | 14.876 | 3.373 | 0.195 |
RMSE | 0.42 | 0.32 | 0.31 |
# Likelihood ratio test
anova(m1, m2, m3, test = "Chisq")
## Analysis of Deviance Table
##
## Model 1: Grad_binary ~ Borough
## Model 2: Grad_binary ~ Borough + Cohort_year
## Model 3: Grad_binary ~ Borough * Cohort_year
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 305 329.49
## 2 291 188.35 14 141.133 <2e-16 ***
## 3 235 172.11 56 16.241 1
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# AIC and BIC for Model 1
AIC_1 <- AIC(m1)
BIC_1 <- BIC(m1)
# AIC and BIC for Model 2
AIC_2 <- AIC(m2)
BIC_2 <- BIC(m2)
# AIC and BIC for Model 3
AIC_3 <- AIC(m3)
BIC_3 <- BIC(m3)
# Display the results
print(paste("AIC and BIC for Model 1 (Borough only):"))
## [1] "AIC and BIC for Model 1 (Borough only):"
print(paste("AIC:", AIC_1))
## [1] "AIC: 339.487418510756"
print(paste("BIC:", BIC_1))
## [1] "BIC: 358.170279998152"
print(paste("AIC and BIC for Model 2 (Borough + Cohort_year):"))
## [1] "AIC and BIC for Model 2 (Borough + Cohort_year):"
print(paste("AIC:", AIC_2))
## [1] "AIC: 226.35437910205"
print(paste("BIC:", BIC_2))
## [1] "BIC: 297.349252754154"
print(paste("AIC and BIC for Model 3 (Borough * Cohort_year):"))
## [1] "AIC and BIC for Model 3 (Borough * Cohort_year):"
print(paste("AIC:", AIC_3))
## [1] "AIC: 322.113587752576"
print(paste("BIC:", BIC_3))
## [1] "BIC: 602.356510063515"
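Based on the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) values, Model 2, which includes both borough and cohort year, is the best model. It has the lowest AIC value (226.35) and the lowest BIC value (297.35) among the three models, indicating the best balance between model fit and complexity. The likelihood ratio test agrees: adding cohort year to Model 1 significantly reduces the deviance (141.13 on 14 degrees of freedom, p < 2e-16), whereas adding the borough-by-cohort-year interactions in Model 3 does not (16.24 on 56 degrees of freedom, p = 1). Model 3's standard errors, which are on the order of 10,000, also indicate that its coefficients are not reliably estimated.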
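As a check on the comparison above, the reported AIC and BIC values can be reproduced by hand from the log-likelihoods in the model results table. The sketch below assumes the standard definitions, AIC = 2k - 2 logLik and BIC = k log(n) - 2 logLik, where k is the number of estimated coefficients and n is the number of observations. The object names (`log_lik`, `n_obs`, `k`, `aic`, `bic`) are illustrative and not part of the original analysis.

```r
# Log-likelihoods and observation count copied from the model results table
log_lik <- c(m1 = -164.744, m2 = -94.177, m3 = -86.057)
n_obs   <- 310

# Number of estimated coefficients: observations minus residual df (305, 291, 235)
k <- n_obs - c(m1 = 305, m2 = 291, m3 = 235)

aic <- 2 * k - 2 * log_lik           # penalty of 2 per coefficient
bic <- log(n_obs) * k - 2 * log_lik  # penalty of log(n) per coefficient

round(rbind(AIC = aic, BIC = bic), 1)
names(which.min(aic))  # model with the lowest AIC
names(which.min(bic))  # model with the lowest BIC
```

Because log(310) is larger than 2, BIC charges more per coefficient than AIC, which is why Model 3, with 75 coefficients, is penalized far more heavily under BIC than under AIC.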
# Model 2
summary(m2)
##
## Call:
## glm(formula = Grad_binary ~ Borough + Cohort_year, family = binomial,
## data = DATA)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -8.361e+00 1.525e+00 -5.483 4.18e-08 ***
## BoroughBrooklyn 4.200e+00 1.098e+00 3.827 0.000130 ***
## BoroughManhattan 5.338e+00 1.129e+00 4.727 2.28e-06 ***
## BoroughQueens 5.603e+00 1.137e+00 4.930 8.23e-07 ***
## BoroughStaten Island 8.542e+00 1.294e+00 6.600 4.10e-11 ***
## Cohort_year2002 2.052e-15 1.354e+00 0.000 1.000000
## Cohort_year2003 8.404e-01 1.310e+00 0.641 0.521232
## Cohort_year2004 1.546e+00 1.286e+00 1.202 0.229392
## Cohort_year2005 1.851e+00 1.218e+00 1.520 0.128574
## Cohort_year2006 3.197e+00 1.185e+00 2.698 0.006985 **
## Cohort_year2007 3.761e+00 1.199e+00 3.138 0.001704 **
## Cohort_year2008 3.475e+00 1.190e+00 2.919 0.003513 **
## Cohort_year2009 3.761e+00 1.199e+00 3.138 0.001704 **
## Cohort_year2010 4.393e+00 1.228e+00 3.576 0.000348 ***
## Cohort_year2011 5.218e+00 1.299e+00 4.018 5.87e-05 ***
## Cohort_year2012 7.336e+00 1.618e+00 4.535 5.76e-06 ***
## Cohort_year2013 8.829e+00 1.729e+00 5.105 3.31e-07 ***
## Cohort_year2014 8.447e+00 1.754e+00 4.817 1.46e-06 ***
## Cohort_year2015 8.447e+00 2.000e+00 4.225 2.39e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 424.58 on 309 degrees of freedom
## Residual deviance: 188.35 on 291 degrees of freedom
## AIC: 226.35
##
## Number of Fisher Scoring iterations: 7
The results of Model 2 suggest that both borough and cohort year are significantly associated with whether a cohort of students achieved a graduation rate of 70% or higher. Compared to the Bronx (the reference borough), cohorts from Brooklyn, Manhattan, Queens, and Staten Island were more likely to achieve a graduation rate of 70% or higher. Staten Island showed the largest increase, with a log-odds coefficient of 8.542. Cohort year, the year in which a cohort entered 9th grade, also showed a positive trend. Relative to the 2001 reference cohort, cohorts from 2006 to 2015 had significantly higher odds of reaching the 70% graduation rate, whereas the differences for the 2002 to 2005 cohorts were not statistically significant. This suggests that graduation rates improved over time, with later cohorts more likely to meet the 70% benchmark than earlier ones. These improvements could have been influenced by various factors such as educational policies, school resources, and community engagement. Further investigation is required to identify the factors behind these changes in graduation rates.
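The Model 2 coefficients are on the log-odds scale, so converting a few of them can make the interpretation above more concrete. The following is a minimal sketch that hard-codes three estimates from the Model 2 summary. The object names are illustrative, and the probabilities are model-based predictions for a single record in the data, not observed graduation rates.

```r
# Coefficients copied from the Model 2 summary (log-odds scale)
intercept <- -8.361  # Bronx, 2001 cohort (the reference categories)
b_staten  <-  8.542  # Staten Island relative to the Bronx
b_2015    <-  8.447  # 2015 cohort relative to the 2001 cohort

# Exponentiating a coefficient gives an odds ratio relative to the reference
odds_ratio_staten <- exp(b_staten)

# plogis() is the inverse logit: it maps log-odds to a probability
p_bronx_2001  <- plogis(intercept)
p_staten_2001 <- plogis(intercept + b_staten)
p_bronx_2015  <- plogis(intercept + b_2015)

odds_ratio_staten
round(c(bronx_2001  = p_bronx_2001,
        staten_2001 = p_staten_2001,
        bronx_2015  = p_bronx_2015), 3)
```

With the fitted model in memory, `exp(coef(m2))` returns the odds ratios for every term at once.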
(Greene and Winters 2001)
(Kemple 2013)