We want NJASK (New Jersey Assessment of Skills and Knowledge) data, year by year and grade by grade.

The measure we want to see is the proficiency (pass) rate.

Some context

Some assessments (e.g., the SAT) have a scale that is constant across time. That is to say, a score of 1000 on the SAT maps to the same level of absolute student performance, whether you took the test in 1999 or 2009.

It would be helpful if all of the NCLB assessments had this feature; unfortunately, they do not. It is common for the test difficulty to drift over time.

This makes interpretation difficult, because a change in school or district performance can indicate any of the following:

- a change in actual school performance
- a change in the difficulty of the test
- random variation (we’ll set that aside for the moment)

Below, we’ll assemble the statewide pass rate (percent of students scoring proficient or advanced) by grade and subject, and graph it.

Pre-processing:

# handle empty string vs zeros: coerce school_code to numeric (blank strings become NA),
# then recode NA to 0. Rows with school_code == 0 are used below as district-level rows.
all_assess_tidy$school_code <- as.numeric(all_assess_tidy$school_code)
all_assess_tidy$school_code <- ifelse(is.na(all_assess_tidy$school_code), 0, all_assess_tidy$school_code)
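
To see what this step does, here is a minimal sketch on a made-up vector. The codes in `codes_toy` are hypothetical and are not taken from the NJASK files; the point is only that `as.numeric()` turns a blank string into `NA`, which is then recoded to 0.

```r
# Hypothetical school codes; the blank string stands in for a row with no school code.
codes_toy <- c('030', '', '110')

# '' is coerced to NA; the other strings become plain numbers.
codes_num <- as.numeric(codes_toy)

# Recode the NA to 0 so later filters can test school_code == 0.
codes_num <- ifelse(is.na(codes_num), 0, codes_num)

codes_num
```

Note that the conversion also drops leading zeros ('030' becomes 30), so later filters should compare against numbers rather than the original strings.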

#tag charters with their home city
cc_slim <- charter_city %>% 
  dplyr::select(school_district_code, charter_city, charter_city_code)

all_assess_tidy <- all_assess_tidy %>%
  dplyr::left_join(cc_slim, by = c('district_code' = 'school_district_code'))

# statewide NJASK rows, total population, excluding science
state_data <- all_assess_tidy %>%
  dplyr::filter(
    county_code == 'ST' & assess_name == 'NJASK' & 
    subgroup == 'total_population' & test_name != 'science'
  )
  
state_pass_rate <- ggplot(
  data = state_data,
  aes(
    x = testing_year, 
    y = proficient + advanced_proficient,
    group = paste0(grade, school_code),
    color = factor(grade)
  )
) +
geom_point() +
geom_line() +
theme_bw() +
theme(panel.grid = element_blank()) +
facet_grid(. ~ test_name) +
labs(
  x = 'Ending Year',
  y = 'Pct Proficient or Advanced'
) + scale_x_continuous(
  limits = c(2010, 2014)
)

Show state data:

state_pass_rate +
labs(title = 'NJ State NJASK Pass Rate by Grade')

We can try to account for the changing difficulty of the assessment instrument itself by showing the gap between a school or district and the state. Tracking that gap over time is known as the ‘difference in differences’ approach in econometrics, and is a common research method. (This method isn’t robust, however, when there are wholesale changes in the assessment, as happened in 2008-2009, when the test got much harder. In that case, scores would need to be re-scaled into a common unit of measurement.)

I’ll present both the unadjusted numbers, and the gap/difference (district minus state) numbers, for comparison.
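
The district-minus-state gap described above can be sketched on toy data as follows. This is only an illustration under stated assumptions: the data frames `state_toy` and `district_toy`, the column `state_passed`, and all of the numbers are made up for the example and do not come from the NJASK files.

```r
library(dplyr)

# Toy statewide pass rates (percent proficient or advanced); numbers are invented.
state_toy <- data.frame(
  testing_year = c(2010, 2011),
  grade = c(3, 3),
  test_name = c('math', 'math'),
  state_passed = c(78, 74)
)

# Toy district pass rates for the same years, grade, and subject.
district_toy <- data.frame(
  testing_year = c(2010, 2011),
  grade = c(3, 3),
  test_name = c('math', 'math'),
  passed = c(60, 58)
)

# Join the state rate onto each district row, then take district minus state.
gap_toy <- district_toy %>%
  dplyr::left_join(state_toy, by = c('testing_year', 'grade', 'test_name')) %>%
  dplyr::mutate(gap = passed - state_passed)

gap_toy
```

In this toy example the raw district rate falls from 60 to 58, yet the gap to the state narrows from -18 to -16, because the state rate fell by more.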

NPS Pass Rate (unadjusted)

Pre-processing

newark_all <- all_assess_tidy %>% 
  dplyr::filter(district_code == '3570' | charter_city_code == '3570')

newark_all$charter_district <- ifelse(newark_all$district_code == '3570', 'NPS', 'Charter')

NPS

nps_all_data <- newark_all %>%
  dplyr::filter(
    district_code == '3570' & assess_name == 'NJASK' & 
    subgroup == 'total_population' & test_name != 'science'
  )

nps_district_only <- nps_all_data %>% dplyr::filter(school_code == 0)
nps_school_only <- nps_all_data %>% dplyr::filter(school_code != 0)

NPS pass rate

nps_district_pass_rate <- state_pass_rate %+% nps_district_only

nps_district_pass_rate

Charter Pass Rate (unadjusted)

newark_charter_only <- newark_all %>%
  dplyr::filter(
    district_code != '3570' & assess_name == 'NJASK' & 
    subgroup == 'total_population' & test_name != 'science' &
    school_code != 0
  )

Build a weighted proficiency average (schools differ in size, so each school is weighted by its number of valid scale scores)

# convert the two percentages into a count of students who passed at each school
newark_charter_only <- newark_charter_only %>%
  dplyr::mutate(
    number_passed = (proficient / 100) * number_valid_scale_scores +
      (advanced_proficient / 100) * number_valid_scale_scores,
    number_passed = round(number_passed, 0)
  )

newark_charter_agg <- newark_charter_only %>%
  dplyr::group_by(
    assess_name, testing_year, grade, county_code, subgroup, test_name, charter_district
  ) %>%
  dplyr::summarize(
    number_enrolled = sum(number_enrolled, na.rm = TRUE),
    # from here on, number_valid_scale_scores is the group total, not the per-school column
    number_valid_scale_scores = sum(number_valid_scale_scores, na.rm = TRUE),
    passed = sum(number_passed, na.rm = TRUE),
    # weighted pass rate in percent: total passed / total valid scores
    passed = round(100 * passed / number_valid_scale_scores, 1)
  ) %>%
  # drop the leftover grouping so the result combines cleanly with the NPS rows
  dplyr::ungroup()
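
Why the weighting matters can be seen in a small worked example. The two schools, their pass rates, and their counts below are invented for illustration. `weighted.mean()` is the base R function and appears here only as a cross-check, not as the method the code above uses.

```r
# Two hypothetical charter schools of very different size (made-up numbers).
pct_passed <- c(90, 50)     # percent proficient or advanced at each school
n_valid <- c(20, 180)       # number of valid scale scores at each school

# Simple average treats both schools equally.
simple_avg <- mean(pct_passed)

# Weighted average: convert percents to counts of passing students,
# sum them, and divide by the total number of valid scores.
number_passed <- round(pct_passed / 100 * n_valid)
weighted_avg <- 100 * sum(number_passed) / sum(n_valid)

simple_avg
weighted_avg
weighted.mean(pct_passed, n_valid)
```

The simple average is 70, but the weighted figure is 54, since the larger school accounts for 180 of the 200 tested students.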

charter_pass_rate <- ggplot(
  data = newark_charter_agg,
  aes(
    x = testing_year, 
    y = passed,
    group = grade,
    color = factor(grade)
  )
) +
geom_point() +
geom_line() +
theme_bw() +
theme(panel.grid = element_blank()) +
facet_grid(. ~ test_name) +
labs(
  x = 'Ending Year',
  y = 'Pct Proficient or Advanced'
) + scale_x_continuous(
  limits = c(2010, 2014)
)

Charter pass rate:

charter_pass_rate +
  labs(title = 'Newark Charter Pass Rate by Grade')

Combined

nps_district_only_for_comparison <- nps_district_only %>%
  dplyr::mutate(
    passed = proficient + advanced_proficient
  ) %>% dplyr::select(
    assess_name, testing_year, grade, county_code, subgroup, 
    test_name, charter_district, number_enrolled, number_valid_scale_scores, passed
  ) 

marcus_comparison <- rbind(nps_district_only_for_comparison, newark_charter_agg)

compare_sectors <- ggplot(
  data = marcus_comparison,
  aes(
    x = testing_year,
    y = passed,
    group = paste0(charter_district, grade),
    color = charter_district,
    label = passed
  )
) +
geom_text(
  size = 4,
  color = 'gray20',
  alpha = 0.8
) +
geom_point() +
geom_line() +
theme_bw() +
theme(panel.grid = element_blank()) +  
facet_grid(
  grade ~ test_name
) +
scale_x_continuous(
  limits = c(2010, 2014)
) +
scale_y_continuous(
  limits = c(0, 100)
)

# row.names = FALSE keeps the row index out of the exported file
write.csv(x = marcus_comparison, file = 'sector_comparison.csv', row.names = FALSE)

NPS vs charter pass rate:

compare_sectors +
  labs(title = 'NPS vs Newark Charter Pass Rate, by Grade and Subject')

(Note that the charter sector is combined into a single series, weighted by each school’s number of valid scale scores.)

NPS and Charter Pass Rate (adjusted)

TK: charter and district pass rates expressed as the difference from the state average