Evaluating Blight Reduction in New Orleans, 2010-2014

by Peter Yaukey, PhD and Dylan Knaggs

This document describes how we used the HUD-aggregated United States Postal Service administrative data on address vacancies (USPS) and the University of New Orleans Geography Department property condition survey (UNO) to estimate blight reduction in New Orleans from September 2010 to the survey periods of October 2012-April 2013 and October 2014-February 2015.

Estimates of blight reduction in New Orleans using USPS data and property condition surveys from UNO show that, by a conservative estimate, blight was reduced by about 15,000 addresses, and perhaps by as many as 20,000 addresses, between September 2010 and October 2014-February 2015. This represents a 35%-46% decrease.

Background and Approach

USPS Vacancy Data

Through September 2010, the best-known data source for tracking blight in New Orleans was USPS vacancy data. This data reflected the number of “no-stat” addresses, which are vacant addresses that appear to postal service workers as if they have been, or will be, vacant for some time. This data source was not perfect. Most notably, the registry covers addresses rather than properties, meaning that the count of “no-stat” addresses is almost certainly greater than the count of blighted properties. The “no-stat” addresses also include vacant lots and unoccupied houses undergoing renovation. Despite these limitations, USPS was the most reliable source from 2005 to 2010. It had several benefits:

  1. The data was tracked at frequent, regular intervals (quarterly).
  2. It was tracked nationwide, making it comparable across jurisdictions.
  3. It was analyzed and disseminated by the Greater New Orleans Community Data Center (GNOCDC, now the Data Center), giving it verification from a respected independent source.

However, for reasons that remain unclear, the quality of the USPS data greatly diminished between September 2010 and March 2012. The data, which had been released on a quarterly basis before September 2010, became unavailable for 18 months. When it resurfaced, the registry of no-stat addresses in Orleans Parish had fallen from 43,755 addresses in September 2010 to an implausible 2,532 addresses in March 2012. GNOCDC analyzed the compromised data in August 2012 and, by holding constant the total number of properties given in the September 2010 data release, estimated that there were actually roughly 35,700 “no-stat” addresses. However, given the poor quality of the data, GNOCDC made its August 2012 report the last of its kind.

UNO Survey

Without the USPS vacancy data, the most reliable method for estimating blight in New Orleans is a longitudinal survey of blighted properties in New Orleans, maintained by Professor Peter Yaukey of the UNO Geography Department since 2007. The three most recent surveys were performed in September 2010 (aligning with the last reliable version of the USPS data), in the period between October 2012 and April 2013, and in the period between October 2014 and February 2015.

UNO used a two-stage cluster sample, with census block groups as the primary sampling unit and houses within selected block groups as the secondary sampling unit. Block groups were selected at random from the population. Surveyors drew a route through each selected block group without prior knowledge of the survey area, and all houses on this route were sampled. With minimal exceptions, the same sample was used in each iteration of the survey.

There are some drawbacks to using UNO to make citywide estimates of blighted addresses. The survey covers most of Orleans Parish, but excludes areas outside of the flood zone of Hurricane Katrina: the West Bank and much of the area between the Mississippi and St. Charles Ave. Also, due to the sample size and survey design, direct estimates from the UNO data are very imprecise, with confidence intervals too wide to support meaningful conclusions. But as the next section shows, we can use UNO in combination with USPS to provide useful results.

Methodology

Independently, neither USPS nor UNO is reliable enough to provide an estimate of the current level of blight in New Orleans. However, USPS, at least in the past, has served as a reasonable proxy for the total magnitude of blight, while UNO gives a good sense of the change in blight. Therefore, combining the two data sets can offer reliable and precise estimates.

We make these estimates through the statistical technique of ratio estimation. Ratio estimation uses a well-known variable (called the auxiliary variable) to predict the value of a correlated variable of interest for which fewer data are available. In this case, the auxiliary variable is no-stat addresses in 2010, when both USPS and UNO data are available. The variable of interest is no-stat addresses in 2012-2013 and 2014-2015, for which there is only UNO data. By taking the ratio of the level of blight in more recent years to the level of blight in 2010 in UNO, we can estimate how many no-stats there would have been in the USPS data if it were still in good condition. And because patterns in blight should remain relatively constant over the span of 5 years (a high-blight neighborhood in 2010 will generally still be a high-blight neighborhood in 2015), the variables are correlated and ratio estimation should be a suitable technique.

To use ratio estimation, we define the following variables:

  • \(X\): the total number of no-stat addresses in 2010 (43,755, from USPS)

  • \(Y\): the total number of no-stat addresses in subsequent years

  • \(x_i\): the number of sampled houses in block group \(i\) that can be defined as no-stats in 2010

  • \(y_i\): the number of sampled houses in block group \(i\) that can be defined as no-stats in subsequent years

  • \(m_i\): the total number of sampled houses in block group \(i\)

A population total \(\hat{Y}\) can be estimated by: \[ \hat{Y}=\frac{\sum\limits_{i=1}^{n} y_i/m_i}{\sum\limits_{i=1}^{n} x_i/m_i} \cdot X \]

We use a weight of \(1/m_i\) to account for the unequal number of properties sampled in each block group. Note that this represents a slight methodological change from previous iterations of this analysis, where the weights reflected the inverse of selection probability: \(\frac{N}{n}\frac{M_i}{m_i}\), where \(N\) and \(n\) are respectively the number of total and sampled block groups, and \(M_i\) is the number of total addresses in block group \(i\). Mathematically, the only difference is the factor of \(M_i\) (\(N\) and \(n\) are constants), and the differences in estimation of total blight are very small: no more than 3% difference between the two methods and less than 1% for most points.
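As an illustration, the weighted ratio estimator can be sketched in a few lines of Python. This is a minimal sketch, not the code used for the analysis: the function name `ratio_estimate` and the three-block-group data are hypothetical, and only the 2010 USPS total of 43,755 comes from the text. Passing `M` switches to the earlier \(M_i/m_i\) weighting; the constant \(N/n\) is omitted because it appears in both numerator and denominator and cancels.

```python
from typing import Optional, Sequence


def ratio_estimate(
    x: Sequence[float],
    y: Sequence[float],
    m: Sequence[float],
    X: float,
    M: Optional[Sequence[float]] = None,
) -> float:
    """Weighted ratio estimate of the population total Y.

    x[i], y[i]: sampled no-stat counts in block group i (2010, later year).
    m[i]:       number of sampled houses in block group i.
    X:          known 2010 population total of no-stat addresses.
    M[i]:       optional total addresses in block group i; if given, the
                weight is M_i / m_i instead of 1 / m_i.
    """
    if M is None:
        weights = [1.0 / mi for mi in m]
    else:
        weights = [Mi / mi for Mi, mi in zip(M, m)]
    numerator = sum(w * yi for w, yi in zip(weights, y))
    denominator = sum(w * xi for w, xi in zip(weights, x))
    return numerator / denominator * X


# Hypothetical toy data for three block groups (NOT survey results).
x = [10, 20, 5]       # sampled no-stats in 2010
y = [5, 10, 5]        # sampled no-stats in a later survey
m = [50, 100, 25]     # sampled houses per block group
M = [500, 400, 100]   # total addresses per block group (hypothetical)
X_2010 = 43755        # USPS no-stat total, September 2010

estimate_current = ratio_estimate(x, y, m, X_2010)
estimate_previous = ratio_estimate(x, y, m, X_2010, M=M)
print(estimate_current, estimate_previous)
```

With real survey data the two weightings would be expected to agree far more closely than in this toy example, where the values of `M` were chosen only to make the difference visible.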

Another factor to consider is that we are using two different data sources with different definitions of what constitutes a blighted address. To address this, we consider three different models to analyze UNO. In Model A, only properties that are considered blighted by UNO are used. This model is designed to provide the most straightforward estimate of blight reduction. Model B, designed to capture everything that could feasibly be considered a no-stat, uses properties that UNO coded as blighted, properties under renovation, and vacant lots. Model C is designed to line up most closely with the definition of a no-stat. It incorporates blight, addresses under renovation, and vacant lots that had also been coded as vacant lots in the previous survey. This handling of vacant lots is chosen because, as Alison Plyer and Elaine Ortiz note in their work "Benchmarks for Blight: How much blight does New Orleans have?", many vacant lots are counted as no-stat addresses. However, the US Department of Housing and Urban Development’s documentation of the no-stat data also implies that in most cases where demolition occurs, the address is removed from the registry. Therefore, Model C keeps the lots that have been listed as no-stats for a substantial period of time, but removes recent demolitions.
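The three model definitions can be expressed as a small classification rule. The sketch below is illustrative only: the function name `counts_as_no_stat` and the code strings ("blighted", "renovation", "vacant_lot", "occupied") are hypothetical labels, since the actual UNO coding scheme is not given here.

```python
from typing import Optional


def counts_as_no_stat(
    model: str, code: str, previous_code: Optional[str] = None
) -> bool:
    """Return True if a surveyed property counts toward x_i or y_i.

    model:         "A", "B", or "C", as described in the text.
    code:          hypothetical condition code from the current survey.
    previous_code: hypothetical code from the previous survey (Model C only).
    """
    if model == "A":
        # Model A: blighted properties only.
        return code == "blighted"
    if model == "B":
        # Model B: blight, renovations, and all vacant lots.
        return code in {"blighted", "renovation", "vacant_lot"}
    if model == "C":
        # Model C: vacant lots count only if they were already vacant lots
        # in the previous survey, which drops recent demolitions.
        if code == "vacant_lot":
            return previous_code == "vacant_lot"
        return code in {"blighted", "renovation"}
    raise ValueError(f"unknown model: {model!r}")


# A lot cleared since the previous survey counts in Model B but not Model C.
print(counts_as_no_stat("B", "vacant_lot", previous_code="blighted"))
print(counts_as_no_stat("C", "vacant_lot", previous_code="blighted"))
```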

To find the variance and confidence intervals around our estimates, we use jackknife variance estimation. Jackknife estimation takes the following form: \[ \hat{\sigma}^2=\frac{n-1}{n}\sum\limits_{i=1}^{n}(\hat{Y}_{-i}-\hat{Y})^2 \]

where \(\hat{Y}_{-i}\) is the estimate of \(\hat{Y}\) recomputed with the \(i\)th sampled block group excluded. Essentially, the greater the effect each individual block group has on the final estimate, the greater the resulting variance.
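The delete-one jackknife for the ratio estimator can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the toy data are hypothetical, the weights are \(1/m_i\), and the interval uses a normal approximation with a critical value of 1.96, which is an assumption here rather than something specified in the text.

```python
import math
from typing import List, Sequence, Tuple


def ratio_total(
    x: Sequence[float], y: Sequence[float], m: Sequence[float],
    X: float, keep: Sequence[int],
) -> float:
    """Ratio estimate with 1/m_i weights over the block groups in `keep`."""
    numerator = sum(y[i] / m[i] for i in keep)
    denominator = sum(x[i] / m[i] for i in keep)
    return numerator / denominator * X


def jackknife_variance(
    x: Sequence[float], y: Sequence[float], m: Sequence[float], X: float
) -> Tuple[float, float]:
    """Return (Y_hat, jackknife variance) leaving out one block group at a time."""
    n = len(m)
    full = ratio_total(x, y, m, X, range(n))
    leave_one_out: List[float] = [
        ratio_total(x, y, m, X, [j for j in range(n) if j != i])
        for i in range(n)
    ]
    variance = (n - 1) / n * sum((est - full) ** 2 for est in leave_one_out)
    return full, variance


# Hypothetical toy data for three block groups (NOT survey results).
x = [10, 20, 5]
y = [5, 10, 5]
m = [50, 100, 25]
y_hat, var_hat = jackknife_variance(x, y, m, 43755)

# Assumed normal-approximation interval; 1.96 is not taken from the text.
half_width = 1.96 * math.sqrt(var_hat)
print(y_hat, var_hat, (y_hat - half_width, y_hat + half_width))
```

With only three block groups the interval is extremely wide, which mirrors the point that a single influential block group inflates the variance.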