Source document. 1966 statistical return submitted by the Severočeský kraj to the Vládní výbor pro otázky cikánského obyvatelstva (VVOCO), the government committee established by Resolution 502/1965.
Dataset.
data_set_kpoco_1966_census_clean.csv — ten districts of the
Severočeský kraj (Česká Lípa, Děčín, Chomutov, Jablonec nad Nisou,
Liberec, Litoměřice, Louny, Most, Teplice, Ústí nad Labem) by forty-nine
variables.
Question. The 1966 return classified each enumerated Roma person into one of four groups (0, I, II, III) according to assessed “level of social adaptability.” The classification carried direct fiscal consequences: per-person funding rates were 20 Kčs for Group I, approximately 40 Kčs for Group II, and 75.23 Kčs for Group III. This report tests whether the resulting classification reflects features of the underlying Roma population, as the form’s design implies, or whether it reflects district-level administrative practice.
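The kraj reported 18,406 enumerated Roma in the total_persons_total column. However, summing the four group counts (Group 0, I, II, III) across the ten districts gives 20,646. The discrepancy of 2,240 persons is, for the most part, not a counting error but a systematic reporting inconsistency: districts disagreed on whether to include Group 0 in their summary total. Four districts excluded it, which accounts for 2,082 persons; the remaining 158 come from Ústí nad Labem, whose group counts sum to 1,968 against a reported total of 1,810. By district: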
| District | Group 0 in summary? | Group 0 count | Reported total | Sum of 4 groups |
|---|---|---|---|---|
| Česká Lípa | No | 156 | 400 | 556 |
| Děčín | Yes | 399 | 2,167 | 2,167 |
| Chomutov | No | 110 | 1,626 | 1,736 |
| Jablonec | No | 122 | 751 | 873 |
| Liberec | Yes | 300 | 2,152 | 2,152 |
| Litoměřice | (G 0 = 0) | 0 | 860 | 860 |
| Louny | No | 1,694 | 2,005 | 3,699 |
| Most | Yes | 380 | 4,038 | 4,038 |
| Teplice | Yes | 754 | 2,597 | 2,597 |
| Ústí nad Labem | (G 0 = 0) | 0 | 1,810 | 1,968 |
This is not a marginal issue. Louny’s reported total of 2,005
excludes its 1,694 Group 0 persons; the consistent enumeration
is 3,699. All analyses that use a district share or rate are
sensitive to which denominator is chosen. This report therefore
uses the consistent denominator (sum of the four group counts, \(N = 20{,}646\)) throughout, except where
explicitly stated otherwise. This differs from the original
cramers_v_full_results.csv analysis, which used the
as-reported total (\(N = 18{,}406\));
that choice partly conflated classification heterogeneity with
reporting-convention heterogeneity. The substantive consequence of the
corrected denominator is reported in §2 and §5 below.
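The reconciliation of the two denominators can be sketched in a few lines of Python. The figures are transcribed from the table above; the variable names are illustrative and are not taken from the original analysis files.

```python
# (district, Group 0 count, reported total, sum of the four groups),
# transcribed from the table above.
DISTRICTS = [
    ("Česká Lípa", 156, 400, 556),
    ("Děčín", 399, 2167, 2167),
    ("Chomutov", 110, 1626, 1736),
    ("Jablonec", 122, 751, 873),
    ("Liberec", 300, 2152, 2152),
    ("Litoměřice", 0, 860, 860),
    ("Louny", 1694, 2005, 3699),
    ("Most", 380, 4038, 4038),
    ("Teplice", 754, 2597, 2597),
    ("Ústí nad Labem", 0, 1810, 1968),
]

reported_total = sum(reported for _, _, reported, _ in DISTRICTS)
consistent_total = sum(groups for _, _, _, groups in DISTRICTS)
total_gap = consistent_total - reported_total

# A district "excluded Group 0" if its gap equals its Group 0 count exactly.
excluded_g0 = {
    name: g0
    for name, g0, reported, groups in DISTRICTS
    if g0 > 0 and groups - reported == g0
}

# Any remaining non-zero gap is not explained by the Group 0 convention.
unexplained = {
    name: groups - reported
    for name, g0, reported, groups in DISTRICTS
    if groups != reported and name not in excluded_g0
}

print(reported_total, consistent_total, total_gap)
print(excluded_g0)
print(unexplained)
```

Run as written, the sketch separates the districts whose gap matches their Group 0 count from any district whose gap has a different origin.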
The kraj’s enumerated population, using the consistent denominator:
| Group | Headcount | Share of kraj |
|---|---|---|
| Group 0 (residual / “not needing care”) | 3,915 | 19.0 % |
| Group I (most adapted) | 5,628 | 27.3 % |
| Group II (partially adapted) | 7,035 | 34.1 % |
| Group III (least adapted) | 4,068 | 19.7 % |
| Total enumerated | 20,646 | 100 % |
If the classification system was applied uniformly across the kraj, every district should distribute its enumerated Roma into the four groups in approximately the same proportions as the kraj as a whole. The chi-square test of independence evaluates whether two categorical variables are independent. For each group, we construct a 10×2 contingency table (district by “in this group” vs. “not in this group”) and compute:
\[\chi^2 = \sum_{i=1}^{10} \sum_{j=1}^{2} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \quad \text{where} \quad E_{ij} = \frac{(\text{row total}_i)(\text{column total}_j)}{N}\]
Cramér’s V rescales the chi-square statistic to a standardised range from 0 (perfect uniformity) to 1 (maximum divergence):
\[V = \sqrt{\frac{\chi^2}{N \cdot \min(r-1, c-1)}}\]
For a 10×2 table this simplifies to \(V = \sqrt{\chi^2/N}\). By convention, \(V < 0.10\) is low, \(V \approx 0.30\) is moderate, \(V > 0.40\) is high.
| Group | \(\chi^2\) | df | \(p\)-value | Cramér’s V |
|---|---|---|---|---|
| Group I | 956.5 | 9 | \(< 10^{-200}\) | 0.215 |
| Group II | 3,938.6 | 9 | \(\approx 0\) | 0.437 |
| Group III | 2,888.5 | 9 | \(\approx 0\) | 0.374 |
| Group 0 | 3,066.1 | 9 | \(\approx 0\) | 0.385 |
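The Group 0 row can be checked directly from the district counts in the §1 table. The following is a minimal sketch in plain Python; the function names are illustrative, and the other three groups are not reproduced because their per-district counts are not listed in this report.

```python
import math

# (district, sum of the four groups, Group 0 count), from the §1 table.
ROWS = [
    ("Česká Lípa", 556, 156),
    ("Děčín", 2167, 399),
    ("Chomutov", 1736, 110),
    ("Jablonec", 873, 122),
    ("Liberec", 2152, 300),
    ("Litoměřice", 860, 0),
    ("Louny", 3699, 1694),
    ("Most", 4038, 380),
    ("Teplice", 2597, 754),
    ("Ústí nad Labem", 1968, 0),
]


def chi_square_district_by_group(rows):
    """Chi-square for a districts x (in group, not in group) table."""
    n_total = sum(n for _, n, _ in rows)
    in_group = sum(o for _, _, o in rows)
    column_totals = (in_group, n_total - in_group)
    chi2 = 0.0
    for _, n, o in rows:
        observed = (o, n - o)
        for obs, col_total in zip(observed, column_totals):
            expected = n * col_total / n_total
            chi2 += (obs - expected) ** 2 / expected
    return chi2, n_total


def cramers_v(chi2, n_total, n_rows, n_cols):
    """Cramér's V as defined above."""
    return math.sqrt(chi2 / (n_total * min(n_rows - 1, n_cols - 1)))


chi2_g0, n_total = chi_square_district_by_group(ROWS)
v_g0 = cramers_v(chi2_g0, n_total, n_rows=len(ROWS), n_cols=2)
print(round(chi2_g0, 1), round(v_g0, 3))
```

Because the table has only two columns, `min(r - 1, c - 1)` is 1 and the V formula reduces to the square root of chi-square over N.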
Comparison with the as-reported denominator. Using \(N = 18{,}406\) produces V values of 0.244, 0.467, 0.351, and 0.586 respectively — most notably, Group 0’s V drops from 0.586 to 0.385 once the denominator inconsistency is fixed. Roughly half of the apparent Group 0 heterogeneity in the original analysis is an artefact of districts disagreeing about whether to count Group 0 in their summary total, not about whether to classify people into Group 0. The corrected V values are the basis for everything that follows.
The null hypothesis of uniform classification is rejected for all four groups, with \(p\)-values that are zero to any practical precision. The classification system was demonstrably not being applied uniformly.
A high V value can be produced in two structurally different ways: by all ten districts varying moderately, or by one or two districts varying extremely while the others sit close to the uniform baseline. The standardised residual distinguishes these cases.
For each district-by-group cell:
\[z_{ij} = \frac{O_{ij} - E_{ij}}{\sqrt{E_{ij}}}\]
Values with \(|z| > 2\) are statistically unusual; values with \(|z| > 5\) are extreme.
Three districts produce nearly all the heterogeneity in the kraj.
The other seven districts have residuals that are smaller in absolute value, generally with \(|z| < 15\), and many at around 5 or below.
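As a worked instance of the residual formula, the sketch below recomputes the two largest positive residuals from counts quoted elsewhere in this report (Louny's Group 0 count and Most's Group III count, with the consistent kraj totals). The function name is illustrative.

```python
import math

N_KRAJ = 20646  # consistent denominator from §1


def standardised_residual(observed, row_total, column_total, n_total=N_KRAJ):
    """(O - E) / sqrt(E), with E = row total * column total / N."""
    expected = row_total * column_total / n_total
    return (observed - expected) / math.sqrt(expected)


# Louny, Group 0: 1,694 of 3,699 persons; kraj Group 0 total 3,915.
z_louny_g0 = standardised_residual(1694, 3699, 3915)

# Most, Group III: 1,892 of 4,038 persons; kraj Group III total 4,068.
z_most_g3 = standardised_residual(1892, 4038, 4068)

print(round(z_louny_g0, 1), round(z_most_g3, 1))
```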
Each cell’s squared residual is its contribution to the total chi-square statistic. The contributions decompose as follows:
| Group | Total \(\chi^2\) | Top contributor | Top 2 contributors |
|---|---|---|---|
| Group I | 956 | Děčín (15.5²) — 25 % | + Chomutov — 38 % |
| Group II | 3,939 | Ústí (+38.1²) — 37 % | + Most — 44 % |
| Group III | 2,889 | Most (+38.9²) — 52 % | + Ústí — 66 % |
| Group 0 | 3,066 | Louny (+37.5²) — 46 % | + Ústí — 58 % |
For Groups II, III, and 0, a single district produces between 37 % and 52 % of the entire heterogeneity statistic. Only Group I has a distributed pattern, where no single district dominates.
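The top-contributor percentages can be recovered from the table by dividing the squared residual of a district's "in this group" cell by the group's total chi-square. A small sketch, using only the rounded values printed above (so the shares are themselves approximate):

```python
# (group, total chi-square, top contributor, its standardised residual),
# taken from the tables above.
TOP_CONTRIBUTORS = [
    ("Group I", 956.5, "Děčín", 15.5),
    ("Group II", 3938.6, "Ústí", 38.1),
    ("Group III", 2888.5, "Most", 38.9),
    ("Group 0", 3066.1, "Louny", 37.5),
]

# Share of the group's chi-square produced by the single largest cell.
shares = {
    group: residual ** 2 / chi2
    for group, chi2, _district, residual in TOP_CONTRIBUTORS
}

for group, share in shares.items():
    print(f"{group}: {100 * share:.0f} %")
```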
The contribution decomposition implies that V should fall sharply when the dominant outliers are removed. The test was rerun on contingency tables with the outlier district(s) excluded and expected values recomputed against the new marginal totals.
| Group | Full \(V\) | Drop top outlier | Drop top 2 outliers |
|---|---|---|---|
| Group I | 0.215 | 0.183 (drop Děčín) | 0.166 (drop Děčín, Chomutov) |
| Group II | 0.437 | 0.297 (drop Ústí) | 0.305 (drop Ústí, Most) |
| Group III | 0.374 | 0.214 (drop Most) | 0.162 (drop Most, Ústí) |
| Group 0 | 0.385 | 0.276 (drop Louny) | 0.239 (drop Louny, Ústí) |
Group III collapses from \(V = 0.374\) to \(V = 0.162\) when Most and Ústí are removed, well below the 0.30 benchmark for a moderate association. Group 0 falls from 0.385 to 0.276 when Louny alone is removed, and to 0.239 when Ústí is also dropped. Group II falls from 0.437 to 0.297 when Ústí is removed, and does not fall further when Most is also dropped (0.305). Group I moves least (0.215 to 0.166) and is the only group whose heterogeneity is not outlier-driven.
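The exclusion procedure can be illustrated for Group 0, the one group whose per-district counts appear in this report (§1 table). This is a sketch with illustrative helper names; expected values are recomputed against the reduced table each time, as described above.

```python
import math

# (district, sum of the four groups, Group 0 count), from the §1 table.
ROWS = [
    ("Česká Lípa", 556, 156),
    ("Děčín", 2167, 399),
    ("Chomutov", 1736, 110),
    ("Jablonec", 873, 122),
    ("Liberec", 2152, 300),
    ("Litoměřice", 860, 0),
    ("Louny", 3699, 1694),
    ("Most", 4038, 380),
    ("Teplice", 2597, 754),
    ("Ústí nad Labem", 1968, 0),
]


def cramers_v_two_column(rows):
    """Cramér's V for a districts x (in group, not in group) table."""
    n_total = sum(n for _, n, _ in rows)
    in_group = sum(o for _, _, o in rows)
    chi2 = 0.0
    for _, n, o in rows:
        for obs, col_total in ((o, in_group), (n - o, n_total - in_group)):
            expected = n * col_total / n_total
            chi2 += (obs - expected) ** 2 / expected
    return math.sqrt(chi2 / n_total)


def without(rows, *names):
    """Return the table with the named districts removed."""
    return [row for row in rows if row[0] not in names]


v_full = cramers_v_two_column(ROWS)
v_drop_louny = cramers_v_two_column(without(ROWS, "Louny"))
v_drop_louny_usti = cramers_v_two_column(without(ROWS, "Louny", "Ústí nad Labem"))

print(round(v_full, 3), round(v_drop_louny, 3), round(v_drop_louny_usti, 3))
```

Recomputing the marginals after each exclusion matters: keeping the kraj-wide Group 0 rate while dropping Louny would leave Louny's influence in every remaining expected value.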
The substantive conclusion is sharper than the headline V values suggested in §2.2. Eight or nine of the ten districts agreed on a baseline classification rate for each group. The kraj-level heterogeneity is produced by specific localised deviations in Most, Ústí, and Louny, not by broad cross-district variation. The classification system was applied with reasonable consistency where there was no specific local reason to deviate.
Plotting each district’s group composition makes the visual point compactly:
Most is the only district whose Group III bar (in dark red) approaches half its enumerated population. Ústí is the only district with no Group III bar at all. Louny is the only district whose Group 0 bar dominates. Each outlier is visible without computation; the statistics quantify what the form already shows.
If the typology was measuring what it claimed to measure, observable district-level indicators should predict the classification rates. This section tests fifty pairwise relationships.
The Pearson product-moment correlation:
\[r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \cdot \sum_{i=1}^{n}(y_i - \bar{y})^2}}\]
is computed alongside the Spearman rank correlation (which is the Pearson formula applied to ranks rather than raw values, and is much less sensitive to single-outlier leverage). Reporting both serves as a robustness check: agreement between Pearson and Spearman indicates the relationship is approximately linear and not driven by one observation; disagreement indicates the relationship is concentrated in an outlier or is non-linear.
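The difference between the two coefficients under single-outlier leverage can be shown with a small sketch. The data below are hypothetical and are not drawn from the 1966 return: nine points with no clear trend plus one extreme point. The function names are illustrative.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation, as in the formula above."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rho: Pearson applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical illustration only (not data from the return).
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
y = [5, 3, 6, 2, 7, 4, 6, 3, 5, 40]

r = pearson(x, y)
rho = spearman(x, y)
print(round(r, 2), round(rho, 2))
```

In this toy case the single extreme point carries almost all of the Pearson coefficient, while the rank-based coefficient stays small, which is the kind of disagreement the robustness check is designed to expose.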
For \(n = 10\) (df = 8), the critical \(|r|\) at \(\alpha = 0.05\) is approximately 0.632; at \(\alpha = 0.01\), 0.765; at \(\alpha = 0.001\), 0.872.
Bonferroni correction. Running fifty tests at \(\alpha = 0.05\) each gives a roughly \(1 - (1-0.05)^{50} \approx 92\%\) chance of finding at least one false positive by chance alone. The Bonferroni correction reduces the per-test threshold to \(\alpha^* = \alpha/k = 0.001\). For \(n = 10\) this corresponds to \(|r| > 0.872\).
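The correction arithmetic can be sketched as follows. The critical value 0.872 is the one quoted above for \(n = 10\), and the Pearson values are those reported in this section's tables; the dictionary labels are illustrative shorthand.

```python
ALPHA = 0.05
N_TESTS = 50

# Chance of at least one false positive across fifty independent tests.
family_wise_error = 1 - (1 - ALPHA) ** N_TESTS

# Bonferroni per-test threshold.
alpha_star = ALPHA / N_TESTS

# Critical |r| for n = 10 at alpha* = 0.001, as quoted in the text.
R_CRIT_BONFERRONI = 0.872

# Pearson r values reported in this section.
reported_r = {
    "family size -> G III share": 0.948,
    "family size -> G III count": 0.932,
    "total Roma -> G III count": 0.842,
    "Roma share -> G III count": 0.802,
    "employment -> G I share": 0.664,
}

survivors = [name for name, r in reported_r.items() if abs(r) > R_CRIT_BONFERRONI]

print(round(family_wise_error, 3), alpha_star)
print(survivors)
```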
Two of the ten predictors and several of the others are affected by missing data. The relevant audit:
The illness predictor’s reduced sample size has a methodologically interesting consequence: the two districts dropped from the illness analysis are exactly the two G III outliers. The illness-vs-Group-III correlation is therefore being computed on the eight-district remainder — the relatively homogeneous set — which is itself diagnostic.
Two tests survive correction, both involving the same predictor:
| Predictor | Outcome | \(n\) | Pearson \(r\) | \(r^2\) | Spearman \(\rho\) |
|---|---|---|---|---|---|
| Mean family size | Group III share | 10 | 0.948 | 0.898 | 0.891 |
| Mean family size | Group III count | 10 | 0.932 | 0.869 | 0.758 |
Mean family size, computed as
total_persons_total ÷ total_families (the as-reported
version, not the consistent-denominator version), captures 90 % of the
variance in Group III share across districts. The robustness of this
finding is examined in §5.
| Predictor | Outcome | Pearson \(r\) | Spearman \(\rho\) | Class |
|---|---|---|---|---|
| Roma share of district | Group III count | 0.802 | 0.782 | Structural |
| Total Roma enumerated | Group III count | 0.842 | 0.770 | Mechanical |
| Overall employment rate | Group I share | 0.664 | 0.648 | Endogenous |
The first two reduce to a near-trivial observation: districts with more Roma have more Roma in any given group. The third is endogenous because employment is itself part of the typology’s stated criteria, so a correlation between aggregate employment and Group I share cannot be cleanly read either as a predictor relationship or as a consequence relationship.
Five predictors fail to predict Group III share or count at any conventional significance level:
| Predictor | \(n\) | Max \(|r|\) | Substantive note |
|---|---|---|---|
| District total population | 10 | 0.388 | Conventional fiscal-population proxy fails |
| Overall regular attendance rate | 10 | 0.454 | Pillar of typology’s textual rationale |
| Overall never-attends rate | 10 | 0.504 | Complement of attendance |
| Overall serious illness rate | 8 | 0.537 (negative) | Explicit ground for G III in typology |
| Total Roma share of district | 10 | 0.388 | Population-density argument fails |
The two strongest theoretical candidates for predicting Group III classification — school attendance and serious illness — show no statistical relationship to Group III share at any conventional threshold. Notably, the illness correlation is negative (r = −0.537): in the eight districts that report illness data, higher illness is weakly associated with lower Group III share, the opposite of what the typology’s textual criteria predict. The relationship is not statistically significant, but its sign is the wrong one.
Group 0’s share varies from 0 % (Litoměřice and Ústí nad Labem) to 45.8 % in Louny on the consistent denominator (84.5 % of Louny’s as-reported total). Yet none of the ten predictors correlates with Group 0 share at any conventional threshold (max \(|r| = 0.40\)). The Group 0 classification is large in magnitude, internally consistent (Louny’s residual of +37.5 stands alone), and unpredicted by any structural or demographic feature of the underlying population.
The mean-family-size-to-Group-III correlation is the only finding that survives strict correction. Two checks address whether it reflects a stable cross-district pattern or one driven by Most.
Most has the highest mean family size (8.29 persons/family on the as-reported denominator) and the highest Group III share (46.9 %). Removing Most and recomputing gives the following.
| Test | Full sample (\(n=10\)) | Without Most (\(n=9\)) |
|---|---|---|
| As-reported family size → G III share | \(r = 0.948, p < 0.0001\) | \(r = 0.823, p = 0.007\) |
| As-reported family size → G III count | \(r = 0.932, p = 0.0001\) | \(r = 0.735, p = 0.024\) |
The correlation attenuates substantially when Most is removed but remains significant at the uncorrected \(\alpha = 0.05\) level.
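The report does not state how the \(p\)-values were obtained. Assuming the usual two-sided \(t\)-test on \(n - 2\) degrees of freedom, the share correlations in the first row convert as follows:

\[t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}, \qquad t_{n=10} = \frac{0.948\sqrt{8}}{\sqrt{1-0.948^2}} \approx 8.42, \qquad t_{n=9} = \frac{0.823\sqrt{7}}{\sqrt{1-0.823^2}} \approx 3.83\]

Both exceed the two-sided 5 % critical values of roughly 2.31 (df = 8) and 2.36 (df = 7), in line with the \(p\)-values in the table.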
Recall from §1 that the as-reported total_persons_total
includes Group 0 inconsistently across districts. Mean family size
computed with the consistent denominator (sum of the four group counts ÷
total families) gives a very different picture:
| Test | Full sample (\(n=10\)) | Without Most (\(n=9\)) |
|---|---|---|
| Consistent family size → G III share | \(r = 0.294, p = 0.41\) | \(r = -0.083, p = 0.83\) |
When the denominator is corrected, the correlation collapses to non-significance even in the full sample, and goes slightly negative without Most. The Bonferroni-passing finding was an artefact of the same Group 0 inclusion inconsistency identified in §1.
Panel (a) shows the as-reported family-size relationship that produces \(r = 0.948\). Panel (b) shows the same relationship after fixing the denominator. The dramatic change in slope and the dispersion of the cloud demonstrate that the strong correlation in (a) was generated jointly by Most’s anomalous reporting and by the Group 0 denominator inconsistency that pushes Louny down the family-size axis to 4.95 in (a) but up to 9.13 in (b).
Whether or not the cross-district correlation is real, Most’s reported family-size figures are statistically anomalous against the rest of the kraj.
Null hypothesis. Most’s persons-per-family rate equals the rate in the other nine districts.
Under \(H_0\), given Most’s 487 reported families and the kraj-without-Most rate of 14,368 ÷ 2,620 = 5.483 persons/family, the expected persons in Most is:
\[E_{\text{Most}} = 487 \times 5.483 = 2{,}670\]
Observed: 4,038. The chi-square goodness-of-fit statistic:
\[\chi^2 = \frac{(4{,}038 - 2{,}670)^2}{2{,}670} = \frac{1{,}871{,}424}{2{,}670} \approx 701\]
with 1 df, giving \(p < 10^{-150}\).
The same test for Group III alone (excluding Ústí, which has missing G III families):
\[E_{\text{Most, G III}} = 215 \times \frac{2{,}176}{325} = 215 \times 6.695 = 1{,}439\]
\[\chi^2 = \frac{(1{,}892 - 1{,}439)^2}{1{,}439} = \frac{205{,}209}{1{,}439} \approx 143\]
with 1 df, \(p < 10^{-32}\).
Caveat. This test treats family count as fixed and person count as the random variable; a more rigorous Poisson rate-comparison would produce slightly different statistics but the same qualitative conclusion. The order-of-magnitude rejection is robust to specification.
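The two single-cell tests can be rerun without rounding the intermediate rate. The sketch below uses the standard identity that, for 1 degree of freedom, the chi-square upper tail equals `erfc(sqrt(x / 2))`; the function name is illustrative. Carrying full precision gives statistics marginally below the hand-rounded 701 and 143 (about 700 and 142), with the same conclusion.

```python
import math


def single_cell_test(observed_persons, families, ref_persons, ref_families):
    """One-cell chi-square test of a persons-per-family rate.

    Family count is treated as fixed, as in the caveat above.
    """
    reference_rate = ref_persons / ref_families
    expected = families * reference_rate
    chi2 = (observed_persons - expected) ** 2 / expected
    # Upper-tail probability of chi-square with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return expected, chi2, p_value


# All groups: Most (4,038 persons, 487 families) against the other nine districts.
exp_all, chi2_all, p_all = single_cell_test(4038, 487, 14368, 2620)

# Group III only: Most (1,892 persons, 215 families) against the rest, Ústí excluded.
exp_g3, chi2_g3, p_g3 = single_cell_test(1892, 215, 2176, 325)

print(round(exp_all), round(chi2_all), p_all)
print(round(exp_g3), round(chi2_g3), p_g3)
```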
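The substantive interpretation: Most’s reported persons-per-family ratio differs from the rest of the kraj at probability levels that exclude any plausible random explanation. Three substantive possibilities remain: families in Most were genuinely larger, families were undercounted, or the headcount was inflated.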
The data alone cannot distinguish these. The first is substantively consistent with the historical record (multi-kin households in condemned residences), but it is also exactly the housing condition the demolition was meant to resolve.
The classification produces money, not simply description. The 1966 return allocated 699,996 Kčs across the kraj.
Most absorbed 27.6 % of the kraj envelope while housing 19.6 % of the kraj’s enumerated Roma — a funding-to-population ratio of 1.41×, the highest in the kraj. Within Most’s allocation, 142,335 Kčs (73.6 %) was attached to Group III alone. Most’s Group III headcount of 1,892 persons accounts for 46.5 % of the kraj’s entire Group III population.
The mechanism is the per-person rate:
\[\frac{\text{Rate G III}}{\text{Rate G I}} = \frac{75.23}{20.00} = 3.76\]
A district that classifies a person into Group III rather than Group I receives 3.76 times as much funding for that person. Most’s Group III rate of 47 % against a kraj average of approximately 20 % produces the over-allocation.
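The fiscal arithmetic can be traced in a short sketch. All inputs are figures quoted in this section; the 27.6 % funding share is taken as given rather than recomputed, and the final line is a hypothetical counterfactual added for illustration only.

```python
# Per-person rates in Kčs, as quoted in the Question paragraph.
RATE_GROUP_I = 20.00
RATE_GROUP_III = 75.23

MOST_G3_PERSONS = 1892
MOST_PERSONS = 4038
KRAJ_PERSONS = 20646        # consistent denominator
MOST_FUNDING_SHARE = 0.276  # quoted above, not recomputed here

# Funding attached to Most's Group III headcount.
most_g3_funding = MOST_G3_PERSONS * RATE_GROUP_III

# Rate differential between Group III and Group I.
rate_ratio = RATE_GROUP_III / RATE_GROUP_I

# Funding share relative to population share.
most_population_share = MOST_PERSONS / KRAJ_PERSONS
funding_to_population = MOST_FUNDING_SHARE / most_population_share

# Hypothetical counterfactual: the same persons funded at the Group I rate.
same_persons_at_group_i = MOST_G3_PERSONS * RATE_GROUP_I

print(round(most_g3_funding), round(rate_ratio, 2))
print(round(100 * most_population_share, 1), round(funding_to_population, 2))
print(same_persons_at_group_i)
```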
Within a closed kraj envelope, this transfer was financed by reduced allocations to the other nine districts — visible in Figure 4 as the green ratios (under-allocated districts: Teplice 0.85×, Děčín 0.80×, Louny 0.59×, Jablonec 0.86×, Česká Lípa 0.80×). Louny is particularly notable: Louny has a substantial enumerated Roma population (≈18 % of kraj, consistent denominator) but receives only ≈10 % of the envelope, because Louny’s high Group 0 rate produces a low effective per-person draw.
The missingness pattern in §4.2 is itself diagnostic. Most reports complete person-counts and family-counts by group, but misses every per-group breakdown of the demographic indicators (school attendance, employment) that constitute the classification’s textual criteria. Most also fails to report a total seriously ill count, although it reports zero in Group III. Liberec has a similar gap.
This is consistent with one of two readings. First, the per-group demographic data was simply not collected in Most and Liberec: the form was filled in the columns where data was available and left blank elsewhere. Second, the per-group demographic data was collected but not reported because it would have undercut the classification: if Most’s Group III population was disproportionately not in chronic poor health, not characterised by school non-attendance, and not unemployed, the classification’s textual rationale would not survive scrutiny. The data alone cannot distinguish these readings. What can be said is that Most, one of the two districts whose per-group demographics are missing, is also the district that drives the Group III heterogeneity; Liberec is closer to baseline. The verification trail that would have allowed a kraj or VVOCO auditor to check Most’s classification against its stated criteria does not exist in the form.
Four additional internal arithmetic inconsistencies are documented in the source data notes (Chomutov: families sum 295 vs total 289; Jablonec: illness sum 20 vs total 13; Louny: school-children sum 650 vs total 620; Ústí: persons sum 1,968 vs total 1,810). These are not large in magnitude but indicate that the kraj-level submission was not being checked for internal consistency before transmission.
Six findings follow from the calculations above.
District classifications were demonstrably non-uniform, but the non-uniformity is concentrated in three outlier districts. Chi-square tests reject uniform application for all four groups, with \(p\)-values that are zero to any practical precision. Cramér’s V values for the full sample using the consistent denominator (0.215, 0.437, 0.374, 0.385) are moderate to high, but they collapse sharply when outliers are removed: Group III from 0.374 to 0.162 (drop Most and Ústí), Group 0 from 0.385 to 0.239 (drop Louny and Ústí), Group II from 0.437 to 0.305. Only Group I is stable. Eight of ten districts agreed on a baseline classification rate for each group. The three outlier districts are Most (Group III), Ústí (Group II ↑, Group III ↓), and Louny (Group 0).
The headline Cramér’s V values reported in the predecessor analysis were partly an artefact of denominator inconsistency. The original analysis used the as-reported persons total, which mixes districts that include Group 0 in their summary with districts that exclude it. Recomputing on the consistent sum-of-groups denominator drops Group 0’s V from 0.586 to 0.385 and Group II’s from 0.467 to 0.437. The substantive conclusion is unchanged but the magnitudes and the Group 0 picture are more cautious.
The indicators the typology cited as criteria for the I/II/III ordering do not predict the classification across districts. School attendance, illness, and employment all fail to predict Group III share at any conventional threshold. The illness rate, which is the explicit ground for Group III in the typology’s textual rationale, has a negative (though not significant) correlation with Group III share in the eight districts that report it. The system’s stated criteria are statistically unrelated to its outputs, and where they correlate at all, the sign is the opposite of what the typology predicts.
The only finding that survived strict statistical correction is artefactual. Mean family size predicts Group III share at \(r = 0.948\) when computed with the as-reported denominator. With the consistent denominator from §1, the correlation falls to \(r = 0.294\) and ceases to be significant; without Most, it goes to \(r = -0.083\). The finding was generated jointly by Most’s anomalous family-size reporting and by the Group 0 denominator inconsistency. The substantive conclusion is now stronger than the original report’s: there is no surviving exogenous predictor of Group III classification.
Most’s family-size reporting is statistically anomalous against the rest of the kraj at \(p < 10^{-150}\) for the overall ratio and \(p < 10^{-32}\) for Group III specifically. The anomaly admits multiple substantive interpretations (genuinely larger families, undercounted families, inflated headcount), all of which point to Most’s reporting practice diverging from kraj norms.
Most received fiscal resources at a rate substantially above what its population share would imply. The over-allocation flowed mechanically through the differential per-person rates attached to Group III, financed within a closed kraj envelope by the other nine districts. Most’s funding-to-population ratio of 1.41× is the highest in the kraj.
The numerical evidence does not, on its own, establish the cause of Most’s anomaly. It identifies the cell of the kraj that any explanation has to account for — and shows that the classification system, applied with reasonable consistency in eight of ten districts, was applied in Most in a way that produced the headcount needed to extract maximum funding for the population the demolition decision required to be moved. The historical record (the 1964 commitment to demolish Old Most for brown-coal extraction, the housing-allocation decisions that had concentrated Roma in the demolition-targeted district, the documented failure of the dispersal effort the classification was meant to fund) supplies the substantive context. Whether Most’s commission was applying the classification strategically or merely responding to the city’s housing concentration is a question the data cannot resolve. What the data establish is that Most is the cell, that its anomaly is the engine of the kraj’s classification heterogeneity, and that no exogenous indicator of the underlying population accounts for it.
Sample size. With \(n = 10\) districts, statistical power to detect medium-sized correlations is limited. A medium effect (\(|r| = 0.5\)) has approximately 34 % power at \(\alpha = 0.05\). The absence of significant correlations should be read as “no strong relationship in the available data,” not “no relationship of any size.” The substantive argument rests on the contrast between the magnitude of the chi-square heterogeneity and the absence of any predictor that captures it, not on a precise zero correlation in any individual test.
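A rough cross-check of the power figure is possible with the Fisher \(z\) approximation. This is an assumption of the sketch, not necessarily the method behind the 34 % quoted above, and the approximation is coarse at \(n = 10\); it gives a value of about 0.3, in the same region.

```python
import math
from statistics import NormalDist


def approx_power(rho, n, alpha=0.05):
    """Approximate two-sided power to detect correlation rho with n pairs.

    Uses Fisher's z: atanh(r) is roughly normal with standard error
    1 / sqrt(n - 3). An approximation, rough for very small n.
    """
    normal = NormalDist()
    z_crit = normal.inv_cdf(1 - alpha / 2)
    shift = math.atanh(rho) * math.sqrt(n - 3)
    return normal.cdf(shift - z_crit) + normal.cdf(-shift - z_crit)


power_medium = approx_power(0.5, 10)
print(round(power_medium, 2))
```

The same function shows how quickly power improves with more units: a call such as `approx_power(0.5, 30)` returns a much larger value, which is why the ten-district design limits what the null results can establish.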
Endogeneity of Class B predictors. Aggregate demographic rates (school attendance, employment, illness) are partly produced by the same enumeration process whose outputs are the outcome variables. A correlation between an aggregate rate and a group share could reflect either the classification rule producing the aggregate, or the aggregate informing the classification. The analysis cannot distinguish these on the available data.
Bonferroni is conservative. It controls the family-wise error rate under any dependence structure, and it is especially conservative when tests are positively correlated; the fifty tests here share underlying observations and are not independent. Less conservative corrections (Holm-Bonferroni, Benjamini-Hochberg) would relax the threshold somewhat. Under any reasonable correction, the family-size finding fails once the denominator is corrected.
Most as a leverage point. Most is an outlier on essentially every variable. Whether Most should be analysed within the kraj-level statistics or treated as a separate case is itself an interpretive question; this report follows the convention of reporting both with-Most and without-Most results.
The denominator choice in §1 is an interpretive decision. Both the as-reported total and the consistent sum-of-groups total can be defended. The as-reported total is what the kraj submitted; the consistent total is what would result if all districts followed the same convention. This report uses the consistent total because the question being tested is about classification practice, not reporting practice. A different methodological choice would change some V values modestly but not the substantive conclusion.
Missing data was handled by listwise deletion (analyses with missing variables drop those districts). The illness predictor was therefore tested on \(n = 8\), not \(n = 10\). This affects the statistical power of the illness null but not its sign — which, notably, is the wrong sign for the typology’s stated rationale.