Introduction

This work visualizes public high school graduation rates by U.S. state for the class of 2015. The visualization is a choropleth map: a map in which predefined regions, such as states or voting districts, are shaded in proportion to the statistic being visualized. The choropleth map generated in this work is interactive. Users can click on any U.S. state to display a text box with the data for that state.

The 4-year adjusted cohort graduation rate (ACGR) is the version of high school graduation rate that is studied and presented in this work. The ACGR accounts for student mobility, including school transfers, emigration, and even death during the 4-year academic period. ACGR is considered to be an accurate, if not the most accurate, estimate of 4-year graduation rates. The choropleth map produced here displays the graduation rate for all students in each U.S. state.
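As an illustration of how the adjustment works, the following is a minimal Python sketch of the ACGR calculation. It is not part of the R code used in this work. It assumes the standard definition: graduates with a regular diploma divided by an adjusted cohort, where the cohort starts with first-time 9th graders, adds students who transfer in, and removes students who transfer out, emigrate, or die. The function name, parameter names, and counts are made up for the example.

```python
def adjusted_cohort_graduation_rate(first_time_ninth_graders, transfers_in,
                                    removals, graduates):
    """Return the 4-year ACGR as a percentage.

    removals: students who transferred out, emigrated, or died
    during the 4-year period.
    """
    adjusted_cohort = first_time_ninth_graders + transfers_in - removals
    if adjusted_cohort <= 0:
        raise ValueError("adjusted cohort must be positive")
    return 100.0 * graduates / adjusted_cohort


# Made-up counts for illustration only (not CCD data).
rate = adjusted_cohort_graduation_rate(
    first_time_ninth_graders=1000, transfers_in=80, removals=120,
    graduates=816)
print(f"{rate:.1f}%")  # adjusted cohort is 960, so this prints 85.0%
```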


Methods

Data for this project were downloaded from the Common Core of Data (CCD) website. These data were processed in R using the RStudio development environment. The processed graduation rate data were then converted into a choropleth map of the United States using a shapefile and Leaflet for R. The choropleth map is presented in the Results section of this document.


Data Sources

Adjusted Cohort Graduation Rate data were collected from the Common Core of Data (CCD) website. The CCD is a program of the U.S. Department of Education’s National Center for Education Statistics (NCES). Each year the CCD collects data for all public schools, public school districts, and state education organizations that operate within the United States. All data points reflect the 2014-2015 school year for school districts located in the United States, including the District of Columbia, and are indexed with a unique Federal Information Processing Standard (FIPS) code. To create a choropleth map of the ACGR data, a TIGER/Line Shapefile was downloaded from the U.S. Census website. This shapefile contains a spatial polygon data frame that represents the U.S. states. Web links to the ACGR data and TIGER/Line Shapefiles are provided in Appendix A of this document.


Data Processing

The ACGR data were cleaned and transformed for analysis. After loading into R, the data were automatically represented as factor variables. User-defined functions were then used to transform these factors into the appropriate variable type (numeric variables for graduation rates, character strings for state names) so that the data could be analyzed. Numerical ranks were then assigned to the ACGR values. That is, each reported ACGR value was ranked from 1 to 51 (i.e., 50 states plus the District of Columbia) so that the state with the highest reported ACGR was given the rank of 1. All ties were broken based on the order in which the values were examined, so it is possible for two or more states to have the same ACGR value but slightly different ranks. For example, three states with an ACGR of 89.0% could be ranked 3, 4, and 5. The R code that was developed to complete this work is freely available for others to use. A web link to the source code is included in Appendix B.
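The conversion and ranking steps can be sketched as follows. This is an illustrative Python version, not the R code used in this work. The helper names (`to_rate`, `rank_descending`) and the state labels are hypothetical, and the values are chosen only to reproduce the tie example described above.

```python
def to_rate(text):
    """Convert a graduation rate read as text (e.g. "89.0") to a float."""
    return float(text.strip())


def rank_descending(values):
    """Rank values from highest (rank 1) to lowest.

    Ties are broken by input order: sorted() is stable, so equal values
    keep their original relative positions and get consecutive ranks.
    """
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


# Placeholder labels and values (not CCD data).
names = ["State A", "State B", "State C", "State D", "State E"]
raw = ["89.0", "90.8", "89.0", "89.7", "89.0"]

rates = [to_rate(r) for r in raw]
for name, rate, rank in zip(names, rates, rank_descending(rates)):
    print(name, rate, rank)
```

Here the three entries with 89.0 receive ranks 3, 4, and 5 in the order they appear. In R, one way to get the same tie behavior is `rank(-x, ties.method = "first")`, although the source does not state which function the original code uses.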


Interactive Visualization

The Leaflet for R package was used to overlay a modified shapefile onto a map of the U.S. The original, unmodified shapefile was downloaded from the U.S. Census website. This shapefile contains a spatial polygon data frame representing the 50 U.S. states, the District of Columbia, and additional U.S. territories. To make this shapefile useful for creating a choropleth map of the processed graduation rates, the spatial polygon data frame associated with the shapefile was updated to include the processed ACGR data. This modified spatial polygon data frame was then overlaid onto a map of the U.S., with a color palette assigning a color to each range of graduation rates.
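The two steps described above, attaching ACGR values to the polygon records and mapping ranges of rates to colors, can be sketched in plain Python. This is only an illustration of the logic, not the Leaflet for R code used in this work. The bin edges, the color palette, the gray "no data" color, and the names `attach_rates` and `color_for` are all assumptions made for the example. Each polygon is represented as a simple dictionary keyed by its FIPS code.

```python
from bisect import bisect_right

# Hypothetical bin edges (percent) and one color per resulting range.
BIN_EDGES = [70.0, 75.0, 80.0, 85.0, 90.0]
PALETTE = ["#ffffcc", "#c7e9b4", "#7fcdbb", "#41b6c4", "#2c7fb8", "#253494"]
NO_DATA_COLOR = "#808080"  # polygons without a matching ACGR record


def attach_rates(polygons, rates_by_fips):
    """Return copies of the polygon records with an 'acgr' field added.

    Polygons whose FIPS code has no ACGR record get acgr=None.
    """
    return [dict(p, acgr=rates_by_fips.get(p["fips"])) for p in polygons]


def color_for(rate):
    """Pick the palette color for the range that contains the rate."""
    if rate is None:
        return NO_DATA_COLOR
    return PALETTE[bisect_right(BIN_EDGES, rate)]


# Two rates reported in the Results section, plus a territory that has
# no matching record in this sketch.
polygons = [
    {"fips": "19", "name": "Iowa"},
    {"fips": "35", "name": "New Mexico"},
    {"fips": "72", "name": "Puerto Rico"},
]
rates_by_fips = {"19": 90.8, "35": 68.6}

for p in attach_rates(polygons, rates_by_fips):
    print(p["name"], p["acgr"], color_for(p["acgr"]))
```

Joining on the FIPS code rather than the state name avoids mismatches caused by spelling or formatting differences between the two files. Leaflet for R offers `colorBin()` for building this kind of binned palette.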


Results

An interactive choropleth map has been produced and is presented below. While at first glance the figure appears to show only the 48 contiguous states, the interactive nature of the map allows users to pan in any of the four cardinal directions (i.e. North, South, East, West) to reveal Hawaii and Alaska. It is also possible to click on each state to reveal the corresponding graduation rate and rank among all states.


Figure 1: The graduation rate for all students in the U.S. was 83.2% in 2015. Dragging the map
provides a view of Hawaii and Alaska. Click on any state to view graduation rate and ranking.


The graduation rates for students in each of the 50 U.S. states are displayed in Figure 1. Iowa topped the country at 90.8%, followed closely by New Jersey (89.7%), Alabama (89.3%), and Texas (89.0%). The state with the lowest graduation rate was New Mexico at 68.6%. States that fared slightly better were Nevada (71.3%), Oregon (73.8%), and Mississippi (75.4%).


Appendix A: Data Sources

The following is a list of the data sources used to create the visualization presented in this work.

  1. Adjusted Cohort Graduation Rates for School Year 2014-2015 provided by Common Core of Data:
    https://nces.ed.gov/ccd/tables/xls/ACGR_RE_Characteristics_2014-15.xlsx

  2. TIGER/Line geographic shapefile (i.e. spatial polygon representation of the United States + territories): ftp://ftp2.census.gov/geo/tiger/TIGER2015/STATE/tl_2015_us_state.zip

    More information about TIGER/Line Shapefiles can be found at https://www.census.gov/geo/maps-data/data/tiger-line.html


Appendix B: Code Used to Complete This Project

The code utilized for processing and visualizing the 2015 Adjusted Cohort Graduation Rate data is posted online. It is freely available for others to use.


Note: Special thanks to Zev Ross for having written a tutorial on creating interactive maps with leaflet. I relied on the content from that tutorial for generating the choropleth map presented here.