Data

This is a cumulative draft summarizing key insights from database extracts of the CoSA Municipal Court InCode Court case management system. The document will be updated iteratively, including in response to interim feedback.

Distribution of Citations by age

The histogram shows that citations skew notably younger (the distribution is also consistent with insurance companies’ practice of charging elevated premiums for drivers under the age of 25). The mean age is 34.5 and the median is 31; a mean above the median further illustrates the skew (i.e., half of the citations are associated with individuals between 16 and 31 years of age). The variable “age” was cleaned by removing observations with typos resulting in impossible values (e.g., 0 or negative), and by removing observations with possible but implausible values (e.g., ages under 16 or over 90, including values up to 120 currently recorded).

Recommendation: although the number of affected records is not very large, data quality can be improved by implementing an automatic check during data entry that 1) rejects impossible values and 2) issues warnings for possible but implausible values.
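The recommended entry check could look roughly like the sketch below. The plausibility bounds (16 and 90) come from the cleaning rule described above; the upper bound for an impossible age and the function names are assumptions made for illustration only, not part of the court system.

```python
# Bounds 16 and 90 follow the cleaning rule described in the text.
MIN_PLAUSIBLE_AGE = 16
MAX_PLAUSIBLE_AGE = 90
# Assumption for this sketch only: ages above 120 are treated as impossible.
MAX_POSSIBLE_AGE = 120


def check_age(age):
    """Classify an entered age as 'reject', 'warn', or 'ok'."""
    if age <= 0 or age > MAX_POSSIBLE_AGE:
        return "reject"  # impossible value, most likely a typo
    if age < MIN_PLAUSIBLE_AGE or age > MAX_PLAUSIBLE_AGE:
        return "warn"    # possible but implausible, ask for confirmation
    return "ok"


def clean_ages(ages):
    """Keep only ages that pass both checks, mirroring the cleaning step."""
    return [a for a in ages if check_age(a) == "ok"]


# Made-up entries, used only to exercise the three outcomes.
sample = [-3, 0, 12, 16, 31, 90, 95, 120, 130]
for a in sample:
    print(a, check_age(a))
print(clean_ages(sample))
```

Separating "reject" from "warn" keeps legitimate but unusual ages in the data while still prompting the clerk to confirm them.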

Distribution of Citations by race and gender

The chart above summarizes race as recorded in the court database for each citation. Records with unknown or missing race were removed, and the categories “Asian”, “Middle-eastern”, and “Native American” were recoded into “Other”. This distribution must be interpreted carefully because of the unknown degree of overlap between the categories “White” and “Hispanic” (most Hispanic individuals are also white). For example, the population of San Antonio is 71.4% white and 64.5% Hispanic, which implies a fairly small minority of non-Hispanic whites. The proportion of Black individuals in the database (10.5%) is higher than the city-wide share of Black residents (6.8%).
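The recoding step described above can be sketched as follows. The raw labels, including the way missing values are encoded, are hypothetical; the actual codes in the court extract may differ.

```python
from collections import Counter

# Hypothetical raw labels; the real extract may use different codes.
RECODE_TO_OTHER = {"Asian", "Middle-eastern", "Native American"}
MISSING = {None, "", "Unknown"}


def recode_race(value):
    """Return the analysis category, or None if the record is dropped."""
    if value in MISSING:
        return None
    if value in RECODE_TO_OTHER:
        return "Other"
    return value


def race_shares(values):
    """Percent of retained records in each recoded category."""
    kept = [r for r in (recode_race(v) for v in values) if r is not None]
    if not kept:
        return {}
    counts = Counter(kept)
    return {race: 100 * n / len(kept) for race, n in counts.items()}


# Made-up records, used only to show the recoding.
print(race_shares(["White", "Asian", "Unknown", None, "Black", "Native American"]))
```

Note that shares are computed over retained records only, so removing unknown or missing values changes the denominator.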

Acknowledging the implementation difficulties and the dependence on state and other systems, we recommend exploring a transition of the race data to the two-question format used by the Census.

The gender distribution of court clients in this database is 61.2% male and 38.4% female.

OmniBase holds by race

The table below shows the distribution of OmniBase holds by race. There is some minor but notable variation: while the rate of OmniBase holds for the entire data set is 17.4%, the share among Black court clients is 21%, followed by Hispanic (18%), white (16%), and other (8%). In the table, the columns B, H, O, and W denote Black, Hispanic, other, and white, and the row labeled 1 contains the citations with an OmniBase hold.

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Col Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  805104 
## 
##  
##               | Race 
## OmniBase hold |         B |         H |         O |         W | Row Total | 
## --------------|-----------|-----------|-----------|-----------|-----------|
##             0 |     67118 |    359952 |     12842 |    225017 |    664929 | 
##               |    159.99 |      1.16 |    159.34 |     31.63 |           | 
##               |      0.79 |      0.82 |      0.92 |      0.84 |           | 
## --------------|-----------|-----------|-----------|-----------|-----------|
##             1 |     18215 |     76666 |      1069 |     44225 |    140175 | 
##               |    758.90 |      5.51 |    755.84 |    150.05 |           | 
##               |      0.21 |      0.18 |      0.08 |      0.16 |           | 
## --------------|-----------|-----------|-----------|-----------|-----------|
##  Column Total |     85333 |    436618 |     13911 |    269242 |    805104 | 
##               |      0.11 |      0.54 |      0.02 |      0.33 |           | 
## --------------|-----------|-----------|-----------|-----------|-----------|
## 
## 
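As a cross-check, the column proportions and chi-square contributions in the table can be recomputed from the cell counts alone. The sketch below uses only the counts printed above; the variable names are ours.

```python
# Counts copied from the table above.
races = ["B", "H", "O", "W"]
counts = [
    [67118, 359952, 12842, 225017],  # OmniBase hold = 0
    [18215, 76666, 1069, 44225],     # OmniBase hold = 1
]

row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
grand_total = sum(row_totals)

# Share of each race's citations with a hold (N / Col Total, row 1).
hold_rate = {r: counts[1][j] / col_totals[j] for j, r in enumerate(races)}
overall_hold_rate = row_totals[1] / grand_total


def expected(i, j):
    """Expected count under independence of hold status and race."""
    return row_totals[i] * col_totals[j] / grand_total


# Chi-square contribution per cell: (observed - expected)^2 / expected.
contrib = [
    [(counts[i][j] - expected(i, j)) ** 2 / expected(i, j)
     for j in range(len(races))]
    for i in range(len(counts))
]
chi_square = sum(sum(row) for row in contrib)

for r in races:
    print(f"{r}: hold rate {hold_rate[r]:.2f}")
print(f"overall hold rate: {overall_hold_rate:.3f}")
print(f"chi-square statistic: {chi_square:.2f} on 3 degrees of freedom")
```

With more than 800,000 observations, even small differences in proportions produce a very large chi-square statistic, so the size of the differences matters more than the test result itself.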

Distribution of OmniBase holds by geography (zip code)

The first map below shows the rate of OmniBase holds per 10k population in each zip code, against the backdrop of each zip code’s median income. This approach is similar to the analysis in the “Driven by Debt: The Failure of the OmniBase Program” report prepared by Texas Appleseed and the Texas Fair Defense Project.

Similar to their findings, we observe a notable, virtually linear inverse relationship between the rate of OmniBase holds and area median income, seemingly supporting the argument that the program represents an especially problematic burden for the poorest segments of the population. This seems to be further supported by the almost linear relationship between zip code median income and the rate of OmniBase holds on the scatter plot following the map.

We use the word “seemingly” because we find the measure methodologically objectionable. Simply calculating the rate of OmniBase holds per population ignores the fact that the rate of OmniBase holds depends on the overall rate of violations among residents of a given area, and the rate of violations is not uniform across the city. Indeed, the rate of violations is likely related to both income and the rate of OmniBase holds, which can produce a spurious association between the two: heavier traffic and a greater number of traffic violations are much more likely in inner-city areas (which also tend to be poorer) than on quiet subdivision roads (which also tend to be more affluent). Perhaps one reason for the use of such a suboptimal measure of OmniBase burden has been the lack of violation-level court data like the data used in the present analysis.

Accordingly, we propose a more valid measure that avoids the spurious effect of variation in the rate of violations: the percent of violations in each zip code that were reported to OmniBase.
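The difference between the two measures can be illustrated with a minimal sketch. The two areas and all of their numbers are made up for illustration; they are not taken from the court data.

```python
def holds_per_10k(holds, population):
    """Rate of OmniBase holds per 10,000 residents."""
    return 10_000 * holds / population


def percent_of_violations_held(holds, violations):
    """Percent of violations that were reported to OmniBase."""
    return 100 * holds / violations


# Hypothetical areas with equal population but different violation volumes.
areas = {
    "inner-city (hypothetical)": {"population": 40_000, "violations": 20_000, "holds": 3_400},
    "suburban (hypothetical)": {"population": 40_000, "violations": 5_000, "holds": 850},
}

for name, a in areas.items():
    per_10k = holds_per_10k(a["holds"], a["population"])
    pct = percent_of_violations_held(a["holds"], a["violations"])
    print(f"{name}: {per_10k:.1f} holds per 10k, {pct:.1f}% of violations held")
```

In this constructed case the per-population rate differs fourfold between the two areas, while the percent of violations reported to OmniBase is identical. The entire gap comes from the difference in the number of violations.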

Rate of OmniBase holds and median household income by ZIP code (all San Antonio ZIP codes)

Plot of ZIP code median income vs. rate of OmniBase holds

The scatter plot above reinforces the pattern suggested by the zip code map: there is a notable, virtually linear inverse relationship between income and the rate of OmniBase holds.

However, as noted above, this approach is at least partially methodologically questionable because it does not control for the rate of violations. The rate of violations is necessarily related to the rate of OmniBase holds, all else being equal, and it is also likely related to area income: poorer inner-city areas have heavier traffic and will tend to experience a higher rate of violations (and attendant OmniBase holds) than more affluent areas on the outskirts, where traffic is less dense.

To remedy this problem, we repeat the analysis below, replacing “rate of OmniBase holds per population” with “percent of violations subjected to OmniBase holds”, which removes the effect of differences in the rate of violations across areas.

Percent of cases with OmniBase holds and median household income by ZIP code (all San Antonio ZIP codes)

Unlike the previous map showing the rate of OmniBase holds per population, the map below plots the percent of violations with OmniBase holds. Careful examination still shows some degree of connection: at least in the most affluent areas, the percentage of OmniBase cases appears smaller than in all other areas. However, the relationship is not nearly as clear as when the rate of OmniBase holds per population is used (as above). Rather than linear, the pattern appears bifurcated: no apparent trend in the lower range of incomes, but a drop in OmniBase cases in the most affluent areas. To examine this nascent pattern further, we also provide a plot of the two variables (income and percent of cases with OmniBase holds) below.

Plot of ZIP code median income vs. percent of cases with OmniBase holds

The plot reinforces and clarifies the somewhat muted relationship implicit in the zip code map. It shows no discernible relationship between income and the percent of OmniBase holds until an area reaches a median income of about $60,000. Only above that income level does an inverse relationship between the two variables become notable.
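One simple way to check a pattern of this kind is to fit a separate least-squares slope below and above the apparent breakpoint. The sketch below assumes a breakpoint of $60,000, as suggested by the plot, and uses made-up values that mimic the described shape; it is not the analysis behind the plot.

```python
def ols_slope(xs, ys):
    """Least-squares slope of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def slopes_by_segment(incomes, pct_holds, threshold=60_000):
    """Return (slope at or below threshold, slope above threshold)."""
    low = [(x, y) for x, y in zip(incomes, pct_holds) if x <= threshold]
    high = [(x, y) for x, y in zip(incomes, pct_holds) if x > threshold]
    return ols_slope(*zip(*low)), ols_slope(*zip(*high))


# Made-up ZIP-level values: flat below $60,000, declining above it.
incomes = [20_000, 30_000, 40_000, 50_000, 60_000, 70_000, 80_000, 90_000]
pct_holds = [18, 17, 18, 17, 18, 15, 13, 11]

low_slope, high_slope = slopes_by_segment(incomes, pct_holds)
print(f"slope at or below threshold: {low_slope:.6f} points per dollar")
print(f"slope above threshold:       {high_slope:.6f} points per dollar")
```

A slope near zero in the lower segment together with a clearly negative slope in the upper segment would be consistent with the bifurcated pattern described above. A formal analysis would also need to estimate the breakpoint rather than assume it.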

This calls for some further nuance when contemplating the consequences of the OmniBase program. It is generally assumed that its burdens fall most heavily on the poorest and most vulnerable segments of the population. The results presented here suggest this concern might be somewhat exaggerated, insofar as areas with incomes around the area median income (i.e., not impoverished by definition) show very similar percentages of violations referred to OmniBase as poorer areas.

One interpretation is that a traffic violation and the possible attendant fine represent a non-trivial challenge (financial or otherwise, as a disruption) for most individuals or families, including those approximately considered middle class. Whether for financial reasons or because of competing priorities and distractions, in all areas with median incomes from $20,000 to $60,000, a stable 15%-20% of violations remain unattended and are accordingly reported to OmniBase. Only after the zip code median income surpasses ~$60,000 does the percent of violations reported to OmniBase begin to drop. Even so, it should be noted that most of even the most affluent areas show a non-trivial percent (10%+) of violations sent to OmniBase. This suggests that the implications of the program are not only financial: a certain proportion of the population tends not to prioritize case resolution regardless of the degree of financial constraint, although certainly more so in lower-income areas.