Statistical Analysis of the 1966 KPOCO Census Return for the Severočeský Kraj

Source document. 1966 statistical return submitted by the Severočeský kraj to the Vládní výbor pro otázky cikánského obyvatelstva (VVOCO), the government committee established by Resolution 502/1965.

Dataset. data_set_kpoco_1966_census_clean.csv — ten districts of the Severočeský kraj (Česká Lípa, Děčín, Chomutov, Jablonec nad Nisou, Liberec, Litoměřice, Louny, Most, Teplice, Ústí nad Labem) by forty-nine variables.

Question. The 1966 return classified each enumerated Roma person into one of four groups (0, I, II, III) according to assessed “level of social adaptability.” The classification carried direct fiscal consequences: per-person funding rates were 20 Kčs for Group I, approximately 40 Kčs for Group II, and 75.23 Kčs for Group III. This report tests whether the resulting classification reflects features of the underlying Roma population, as the form’s design implies, or whether it reflects district-level administrative practice.


1. Descriptive baseline and a critical denominator choice

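The kraj reported 18,406 enumerated Roma in the total_persons_total column. However, summing the four group counts (Group 0, I, II, III) across the ten districts gives 20,646. Most of the 2,240-person discrepancy is not a counting error but a systematic reporting inconsistency: 2,082 of it arises because four districts (Česká Lípa, Chomutov, Jablonec, Louny) left Group 0 out of their summary total, while four others (Děčín, Liberec, Most, Teplice) included it. The remaining 158 is an arithmetic discrepancy in the return for Ústí nad Labem, which reports no Group 0 at all (see §7). The district-level picture: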

District | Group 0 in summary? | Group 0 count | Reported total | Sum of 4 groups
Česká Lípa | No | 156 | 400 | 556
Děčín | Yes | 399 | 2,167 | 2,167
Chomutov | No | 110 | 1,626 | 1,736
Jablonec | No | 122 | 751 | 873
Liberec | Yes | 300 | 2,152 | 2,152
Litoměřice | (G 0 = 0) | 0 | 860 | 860
Louny | No | 1,694 | 2,005 | 3,699
Most | Yes | 380 | 4,038 | 4,038
Teplice | Yes | 754 | 2,597 | 2,597
Ústí nad Labem | (G 0 = 0) | 0 | 1,810 | 1,968

This is not a marginal issue. Louny’s reported total of 2,005 excludes its 1,694 Group 0 persons; the consistent enumeration is 3,699. All analyses that use a district share or rate are sensitive to which denominator is chosen. This report therefore uses the consistent denominator (sum of the four group counts, \(N = 20{,}646\)) throughout, except where explicitly stated otherwise. This differs from the original cramers_v_full_results.csv analysis, which used the as-reported total (\(N = 18{,}406\)); that choice partly conflated classification heterogeneity with reporting-convention heterogeneity. The substantive consequence of the corrected denominator is reported in §2 and §5 below.
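The two denominators can be reproduced directly from the table above. The following is a minimal sketch, with the figures typed in from that table and all variable names chosen for illustration; it is not the code behind the original analysis.

```python
# Per district, from the table in §1:
# (Group 0 count, reported total, sum of the four group counts).
districts = {
    "Česká Lípa": (156, 400, 556),
    "Děčín": (399, 2167, 2167),
    "Chomutov": (110, 1626, 1736),
    "Jablonec": (122, 751, 873),
    "Liberec": (300, 2152, 2152),
    "Litoměřice": (0, 860, 860),
    "Louny": (1694, 2005, 3699),
    "Most": (380, 4038, 4038),
    "Teplice": (754, 2597, 2597),
    "Ústí nad Labem": (0, 1810, 1968),
}

reported_total = sum(rep for _, rep, _ in districts.values())
consistent_total = sum(s for _, _, s in districts.values())
gap = consistent_total - reported_total

# Districts whose gap equals their Group 0 count left Group 0 out of the summary.
group0_omitted = {
    name: g0
    for name, (g0, rep, s) in districts.items()
    if g0 > 0 and s - rep == g0
}
# Any other gap is an arithmetic discrepancy unrelated to Group 0.
other_gaps = {
    name: s - rep
    for name, (g0, rep, s) in districts.items()
    if s != rep and name not in group0_omitted
}

print(reported_total, consistent_total, gap)
print(group0_omitted, other_gaps)
```

Splitting the gap this way shows how much of it comes from the Group 0 convention and how much from a single district's internal arithmetic.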

The kraj’s enumerated population, using the consistent denominator:

Group | Headcount | Share of kraj
Group 0 (residual / “not needing care”) | 3,915 | 19.0 %
Group I (most adapted) | 5,628 | 27.3 %
Group II (partially adapted) | 7,035 | 34.1 %
Group III (least adapted) | 4,068 | 19.7 %
Total enumerated | 20,646 | 100 %
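The shares follow from the headcounts by simple division. This is a short sketch using the four headcounts listed above; the dictionary name is illustrative.

```python
# Kraj-level headcounts per group on the consistent denominator (from §1).
group_counts = {
    "Group 0": 3915,
    "Group I": 5628,
    "Group II": 7035,
    "Group III": 4068,
}

total = sum(group_counts.values())
# Percentage share of each group, rounded to one decimal place.
shares = {group: round(100 * count / total, 1) for group, count in group_counts.items()}

print(total, shares)
```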

2. Test of classification uniformity across districts

2.1 Hypothesis and test

If the classification system was applied uniformly across the kraj, every district should distribute its enumerated Roma into the four groups in approximately the same proportions as the kraj as a whole. The chi-square test of independence evaluates whether two categorical variables are independent. For each group, we construct a 10×2 contingency table (district by “in this group” vs. “not in this group”) and compute:

\[\chi^2 = \sum_{i=1}^{10} \sum_{j=1}^{2} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \quad \text{where} \quad E_{ij} = \frac{(\text{row total}_i)(\text{column total}_j)}{N}\]

Cramér’s V rescales the chi-square statistic to a standardised range from 0 (perfect uniformity) to 1 (maximum divergence):

\[V = \sqrt{\frac{\chi^2}{N \cdot \min(r-1, c-1)}}\]

For a 10×2 table this simplifies to \(V = \sqrt{\chi^2/N}\). By convention, \(V < 0.10\) is low, \(V \approx 0.30\) is moderate, \(V > 0.40\) is high.
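As an illustration of the two formulas, the sketch below computes the 10×2 chi-square statistic and Cramér's V for Group 0, the one group whose district counts are printed in full in §1. It uses only the standard library; the function name `chi_square_two_column` is an assumption for this example, not part of the original analysis.

```python
import math

# (Group 0 count, sum of the four group counts) per district, from §1.
rows = [
    (156, 556), (399, 2167), (110, 1736), (122, 873), (300, 2152),
    (0, 860), (1694, 3699), (380, 4038), (754, 2597), (0, 1968),
]

def chi_square_two_column(rows):
    """Chi-square for a districts-by-(in group, not in group) table."""
    n_total = sum(n for _, n in rows)
    in_total = sum(k for k, _ in rows)
    chi2 = 0.0
    for k, n in rows:
        # Two cells per district: in the group, and not in the group.
        for observed, column_total in ((k, in_total), (n - k, n_total - in_total)):
            expected = n * column_total / n_total
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n_total

chi2, n_total = chi_square_two_column(rows)
# For a table with two columns, min(r - 1, c - 1) = 1.
cramers_v = math.sqrt(chi2 / n_total)

print(round(chi2, 1), round(cramers_v, 3))
```

Run on these counts, the sketch should land close to the Group 0 values reported in §2.2.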

2.2 Results (consistent denominator, \(N = 20{,}646\))

Group | \(\chi^2\) | df | \(p\)-value | Cramér’s V
Group I | 956.5 | 9 | \(< 10^{-200}\) | 0.215
Group II | 3,938.6 | 9 | \(\approx 0\) | 0.437
Group III | 2,888.5 | 9 | \(\approx 0\) | 0.374
Group 0 | 3,066.1 | 9 | \(\approx 0\) | 0.385

Comparison with the as-reported denominator. Using \(N = 18{,}406\) produces V values of 0.244, 0.467, 0.351, and 0.586 respectively — most notably, Group 0’s V drops from 0.586 to 0.385 once the denominator inconsistency is fixed. Roughly half of the apparent Group 0 heterogeneity in the original analysis is an artefact of districts disagreeing about whether to count Group 0 in their summary total, not about whether to classify people into Group 0. The corrected V values are the basis for everything that follows.

The null hypothesis of uniform classification is rejected for all four groups, with \(p < 10^{-200}\) for Group I and \(p\)-values too small to represent in double-precision floating point for the other three. The classification system was demonstrably not being applied uniformly.


3. Locating the source of the non-uniformity

A high V value can be produced in two structurally different ways: by all ten districts varying moderately, or by one or two districts varying extremely while the others sit close to the uniform baseline. The standardised residual distinguishes these cases.

For each district-by-group cell:

\[z_{ij} = \frac{O_{ij} - E_{ij}}{\sqrt{E_{ij}}}\]

Cells with \(|z_{ij}| > 2\) are statistically unusual; cells with \(|z_{ij}| > 5\) are extreme.
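A minimal sketch of the residual calculation for the Group 0 column, using the district counts printed in §1. Variable names are illustrative.

```python
import math

# (district, Group 0 count, sum of the four group counts), from §1.
rows = [
    ("Česká Lípa", 156, 556), ("Děčín", 399, 2167), ("Chomutov", 110, 1736),
    ("Jablonec", 122, 873), ("Liberec", 300, 2152), ("Litoměřice", 0, 860),
    ("Louny", 1694, 3699), ("Most", 380, 4038), ("Teplice", 754, 2597),
    ("Ústí nad Labem", 0, 1968),
]

n_total = sum(n for _, _, n in rows)
g0_total = sum(k for _, k, _ in rows)

residuals = {}
for name, observed, n in rows:
    # Expected Group 0 count if the district matched the kraj-wide share.
    expected = n * g0_total / n_total
    residuals[name] = (observed - expected) / math.sqrt(expected)

for name, z in sorted(residuals.items(), key=lambda item: -abs(item[1])):
    print(f"{name}: {z:+.1f}")
```

Sorting by absolute value puts the districts that depart most from the kraj-wide share at the top of the output.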

3.1 The picture in one figure

Figure 2

Three districts produce nearly all the heterogeneity in the kraj.

  • Most has a Group III residual of \(+38.9\), by far the largest single anomaly in the dataset. Most also carries large negative residuals on Groups 0 (\(-13.9\)) and II (\(-15.9\)), meaning Most has pulled people from the lower-funding categories into the highest-funding category.
  • Ústí nad Labem is the mirror image on Group III: residual \(-19.7\) (zero observed against roughly 388 expected). On Group II its residual is \(+38.1\), the second-largest single anomaly in the dataset. Ústí pulled people from Group III into Group II, the opposite of what Most did.
  • Louny has a Group 0 residual of \(+37.5\), the third-largest single anomaly. Louny placed 1,694 persons in the residual “not needing care” category, which is 45.8 % of its 3,699 persons on the consistent count. Its form-reported total of 2,005 leaves these persons out entirely.

The other seven districts have smaller residuals, generally with \(|z| < 15\) and many with \(|z| \le 5\).

3.2 Chi-square contribution decomposition

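Each cell’s squared residual is its contribution to the total chi-square statistic. The percentages below give the squared residual of the “in this group” cell as a share of the full 10×2 statistic; the complementary “not in this group” cells account for the remainder.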

Group | Total \(\chi^2\) | Top contributor | Top 2 contributors
Group I | 956 | Děčín (15.5²): 25 % | + Chomutov: 38 %
Group II | 3,939 | Ústí (+38.1²): 37 % | + Most: 44 %
Group III | 2,889 | Most (+38.9²): 52 % | + Ústí: 66 %
Group 0 | 3,066 | Louny (+37.5²): 46 % | + Ústí: 58 %

For Groups II, III, and 0, a single district produces between 37 % and 52 % of the entire heterogeneity statistic. Only Group I has a distributed pattern, where no single district dominates.
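The Group 0 row of the decomposition can be checked from the counts in §1. The sketch below divides each district's squared in-group residual by the full 10×2 chi-square; that pairing is what reproduces the percentages in the table. Names are illustrative.

```python
# (district, Group 0 count, sum of the four group counts), from §1.
rows = [
    ("Česká Lípa", 156, 556), ("Děčín", 399, 2167), ("Chomutov", 110, 1736),
    ("Jablonec", 122, 873), ("Liberec", 300, 2152), ("Litoměřice", 0, 860),
    ("Louny", 1694, 3699), ("Most", 380, 4038), ("Teplice", 754, 2597),
    ("Ústí nad Labem", 0, 1968),
]

n_total = sum(n for _, _, n in rows)
in_total = sum(k for _, k, _ in rows)
out_total = n_total - in_total

in_cell = {}   # squared residual of the "in Group 0" cell per district
chi2 = 0.0     # full statistic over both columns
for name, k, n in rows:
    e_in = n * in_total / n_total
    e_out = n * out_total / n_total
    in_cell[name] = (k - e_in) ** 2 / e_in
    chi2 += in_cell[name] + ((n - k) - e_out) ** 2 / e_out

share = {name: value / chi2 for name, value in in_cell.items()}
top_two = sorted(share, key=share.get, reverse=True)[:2]

print(top_two, [round(100 * share[name]) for name in top_two])
```

Because the "not in this group" cells are excluded from the numerators, the shares across all ten districts sum to less than 100 %.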

3.3 Cramér’s V after outlier removal

The contribution decomposition implies that V should fall sharply when the dominant outliers are removed. The test was rerun on contingency tables with the outlier district(s) excluded and expected values recomputed against the new marginal totals.

Figure 3

Group | Full \(V\) | Drop top outlier | Drop top 2 outliers
Group I | 0.215 | 0.183 (drop Děčín) | 0.166 (drop Děčín, Chomutov)
Group II | 0.437 | 0.297 (drop Ústí) | 0.305 (drop Ústí, Most)
Group III | 0.374 | 0.214 (drop Most) | 0.162 (drop Most, Ústí)
Group 0 | 0.385 | 0.276 (drop Louny) | 0.239 (drop Louny, Ústí)

Group III collapses from \(V = 0.374\) to \(V = 0.162\) when Most and Ústí are removed, well below the “moderate” benchmark of 0.30 given in §2.1. Group 0 falls from 0.385 to 0.276 when Louny alone is removed, and to 0.239 when Ústí is also dropped. Group II falls from 0.437 to 0.297 when Ústí is removed and does not fall further when Most is also dropped (0.305). Group I moves least across exclusions (0.215 to 0.166) and is the only group whose heterogeneity is not outlier-driven.

The substantive conclusion is sharper than the headline V values suggested in §2.2. Eight or nine of the ten districts agreed on a baseline classification rate for each group. The kraj-level heterogeneity is produced by specific localised deviations in Most, Ústí, and Louny, not by broad cross-district variation. The classification system was applied with reasonable consistency where there was no specific local reason to deviate.
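The outlier-removal procedure can be sketched for Group 0 with the counts from §1: drop the district, rebuild the marginal totals from the remaining rows, and recompute V. The helper name `cramers_v` and the data layout are assumptions made for this example.

```python
import math

# district -> (Group 0 count, sum of the four group counts), from §1.
data = {
    "Česká Lípa": (156, 556), "Děčín": (399, 2167), "Chomutov": (110, 1736),
    "Jablonec": (122, 873), "Liberec": (300, 2152), "Litoměřice": (0, 860),
    "Louny": (1694, 3699), "Most": (380, 4038), "Teplice": (754, 2597),
    "Ústí nad Labem": (0, 1968),
}

def cramers_v(data, exclude=()):
    """Cramér's V for the districts-by-(in, not in) table, minus excluded districts."""
    rows = [counts for name, counts in data.items() if name not in exclude]
    n_total = sum(n for _, n in rows)
    in_total = sum(k for k, _ in rows)
    chi2 = 0.0
    for k, n in rows:
        # Expected values use the marginal totals of the reduced table.
        for observed, column_total in ((k, in_total), (n - k, n_total - in_total)):
            expected = n * column_total / n_total
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / n_total)

v_full = cramers_v(data)
v_drop_one = cramers_v(data, exclude=("Louny",))
v_drop_two = cramers_v(data, exclude=("Louny", "Ústí nad Labem"))

print(round(v_full, 3), round(v_drop_one, 3), round(v_drop_two, 3))
```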

3.4 The same picture in a different lens

Plotting each district’s group composition makes the visual point compactly:

Figure 1

Most is the only district whose Group III bar (in dark red) approaches half its enumerated population. Ústí is the only district with no Group III bar at all. Louny is the only district whose Group 0 bar dominates. Each outlier is visible without computation; the statistics quantify what the form already shows.


4. What predicts the classification?

If the typology was measuring what it claimed to measure, observable district-level indicators should predict the classification rates. This section tests fifty pairwise relationships.

4.1 Method

The Pearson product-moment correlation:

\[r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \cdot \sum_{i=1}^{n}(y_i - \bar{y})^2}}\]

is computed alongside the Spearman rank correlation (which is the Pearson formula applied to ranks rather than raw values, and is much less sensitive to single-outlier leverage). Reporting both serves as a robustness check: agreement between Pearson and Spearman indicates the relationship is approximately linear and not driven by one observation; disagreement indicates the relationship is concentrated in an outlier or is non-linear.
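A standard-library sketch of the two coefficients follows. The district-level predictor values are not reproduced in this report, so the example data are hypothetical and serve only to show how the two measures can disagree on a monotone but non-linear relationship.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for position in range(i, j + 1):
            ranks[order[position]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: y rises with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
r_pearson = pearson(x, y)
r_spearman = spearman(x, y)

print(round(r_pearson, 3), round(r_spearman, 3))
```

On this toy input the rank correlation is exactly 1 while the Pearson value is below 1, which is the kind of gap the robustness check is meant to expose.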

For \(n = 10\) (df = 8), the critical \(|r|\) at \(\alpha = 0.05\) is approximately 0.632; at \(\alpha = 0.01\), 0.765; at \(\alpha = 0.001\), 0.872.

Bonferroni correction. Running fifty tests at \(\alpha = 0.05\) each gives a roughly \(1 - (1-0.05)^{50} \approx 92\%\) chance of finding at least one false positive by chance alone. The Bonferroni correction reduces the per-test threshold to \(\alpha^* = \alpha/k = 0.001\). For \(n = 10\) this corresponds to \(|r| > 0.872\).
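The arithmetic in this subsection can be reproduced in a few lines. The critical t-values below are taken from standard two-sided t-tables for 8 degrees of freedom and are entered by hand, which is an assumption of this sketch; the conversion \(r = t/\sqrt{t^2 + \text{df}}\) is the usual identity linking the two statistics.

```python
import math

alpha = 0.05
n_tests = 50

# Chance of at least one false positive, treating the tests as independent.
family_wise_error = 1 - (1 - alpha) ** n_tests
# Bonferroni per-test threshold.
alpha_star = alpha / n_tests

# Two-sided critical t-values for df = 8, copied from standard tables.
df = 8
critical_t = {0.05: 2.306, 0.01: 3.355, 0.001: 5.041}
critical_r = {
    level: t / math.sqrt(t ** 2 + df) for level, t in critical_t.items()
}

print(round(family_wise_error, 3), alpha_star)
print({level: round(r, 3) for level, r in critical_r.items()})
```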

4.2 Sample size with respect to missingness

Several of the ten predictors are affected by missing data. The relevant audit:

Figure 6

  • Total persons, total families, total school children, total persons aged 16+, regular and never-attends counts, total employed: complete (\(n = 10\)).
  • Total seriously ill persons: missing for Most and Ústí (\(n = 8\)).
  • All per-group demographic breakdowns: missing for Most (almost completely) and Liberec (substantially).

The illness predictor’s reduced sample size has a methodologically interesting consequence: the two districts dropped from the illness analysis are exactly the two G III outliers. The illness-vs-Group-III correlation is therefore being computed on the eight-district remainder — the relatively homogeneous set — which is itself diagnostic.

4.3 Findings significant at the strict (Bonferroni-corrected) threshold

Two tests survive correction, both involving the same predictor:

Predictor | Outcome | \(n\) | Pearson \(r\) | \(r^2\) | Spearman \(\rho\)
Mean family size | Group III share | 10 | 0.948 | 0.898 | 0.891
Mean family size | Group III count | 10 | 0.932 | 0.869 | 0.758

Mean family size, computed as total_persons_total ÷ total_families (the as-reported version, not the consistent-denominator version), captures 90 % of the variance in Group III share across districts. The robustness of this finding is examined in §5.

4.4 Findings significant at uncorrected \(\alpha = 0.05\) on both Pearson and Spearman

Predictor | Outcome | Pearson \(r\) | Spearman \(\rho\) | Class
Roma share of district | Group III count | 0.802 | 0.782 | Structural
Total Roma enumerated | Group III count | 0.842 | 0.770 | Mechanical
Overall employment rate | Group I share | 0.664 | 0.648 | Endogenous

The first two reduce to a near-trivial observation: districts with more Roma have more Roma in any given group. The third is endogenous because employment is itself part of the typology’s stated criteria, so a correlation between aggregate employment and Group I share cannot be cleanly read either as a predictor relationship or as a consequence relationship.

4.5 The substantively important findings: the nulls

Five predictors fail to predict Group III share or count at any conventional significance level:

Predictor | \(n\) | Max \(\lvert r \rvert\) | Substantive note
District total population | 10 | 0.388 | Conventional fiscal-population proxy fails
Overall regular attendance rate | 10 | 0.454 | Pillar of typology’s textual rationale
Overall never-attends rate | 10 | 0.504 | Complement of attendance
Overall serious illness rate | 8 | 0.537 (negative) | Explicit ground for G III in typology
Total Roma share of district | 10 | 0.388 | Population-density argument fails

The two strongest theoretical candidates for predicting Group III classification — school attendance and serious illness — show no statistical relationship to Group III share at any conventional threshold. Notably, the illness correlation is negative (r = −0.537): in the eight districts that report illness data, higher illness is weakly associated with lower Group III share, the opposite of what the typology’s textual criteria predict. The relationship is not statistically significant, but its sign is the wrong one.

4.6 Group 0 has no predictor

Group 0’s share varies from 0 % (two districts, Litoměřice and Ústí nad Labem) to 45.8 % in Louny (84.5 % if measured against Louny’s as-reported total). Yet none of the ten predictors correlates with Group 0 share at any conventional threshold (max \(|r| = 0.40\)). The Group 0 classification is large in magnitude, internally consistent (Louny’s residual of +37.5 stands alone), and unpredicted by any structural or demographic feature of the underlying population.


5. Robustness check on the surviving result

The mean-family-size-to-Group-III correlation is the only finding that survives strict correction. Two checks address whether it reflects a stable cross-district pattern or one driven by Most.

5.1 Leave-one-out

Most has the highest mean family size (8.29 persons/family on the as-reported denominator) and the highest Group III share (46.9 %). Removing Most and recomputing gives the following.

Test | Full sample (\(n=10\)) | Without Most (\(n=9\))
As-reported family size → G III share | \(r = 0.948, p < 0.0001\) | \(r = 0.823, p = 0.007\)
As-reported family size → G III count | \(r = 0.932, p = 0.0001\) | \(r = 0.735, p = 0.024\)

The correlation attenuates substantially when Most is removed but remains significant at the uncorrected \(\alpha = 0.05\) level.

5.2 Fixing the denominator inconsistency

Recall from §1 that the as-reported total_persons_total includes Group 0 inconsistently across districts. Mean family size computed with the consistent denominator (sum of the four group counts ÷ total families) gives a very different picture:

Test | Full sample (\(n=10\)) | Without Most (\(n=9\))
Consistent family size → G III share | \(r = 0.294, p = 0.41\) | \(r = -0.083, p = 0.83\)

When the denominator is corrected, the correlation collapses to non-significance even in the full sample, and goes slightly negative without Most. The Bonferroni-passing finding was an artefact of the same Group 0 inclusion inconsistency identified in §1.
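As a check on the significance statements in §5.1 and §5.2, a correlation can be converted to a t-statistic with \(n - 2\) degrees of freedom. This is a sketch that assumes the reported \(p\)-values come from the usual two-sided t-test; the source does not say how they were computed.

\[t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}\]

\[t_{r=0.294,\,n=10} = \frac{0.294\sqrt{8}}{\sqrt{1-0.294^2}} \approx 0.87, \qquad t_{r=0.823,\,n=9} = \frac{0.823\sqrt{7}}{\sqrt{1-0.823^2}} \approx 3.83\]

The two-sided 5 % critical values are 2.306 for 8 degrees of freedom and 2.365 for 7, so the first statistic falls well short and the second clears the threshold, in line with the reported \(p = 0.41\) and \(p = 0.007\).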

Figure 5

Panel (a) shows the as-reported family-size relationship that produces \(r = 0.948\). Panel (b) shows the same relationship after fixing the denominator. The dramatic change in slope and the dispersion of the cloud demonstrate that the strong correlation in (a) was generated jointly by Most’s anomalous reporting and by the Group 0 denominator inconsistency that pushes Louny down the family-size axis to 4.95 in (a) but up to 9.13 in (b).

5.3 Most’s family size is itself anomalous

Whether or not the cross-district correlation is real, Most’s reported family-size figures are statistically anomalous against the rest of the kraj.

Null hypothesis. Most’s persons-per-family rate equals the rate in the other nine districts.

Under \(H_0\), given Most’s 487 reported families and the kraj-without-Most rate of 14,368 ÷ 2,620 = 5.483 persons/family, the expected persons in Most is:

\[E_{\text{Most}} = 487 \times 5.483 = 2{,}670\]

Observed: 4,038. The chi-square goodness-of-fit statistic:

\[\chi^2 = \frac{(4{,}038 - 2{,}670)^2}{2{,}670} = \frac{1{,}871{,}424}{2{,}670} \approx 701\]

with 1 df, giving \(p < 10^{-150}\).

The same test for Group III alone (excluding Ústí, which has missing G III families):

\[E_{\text{Most, G III}} = 215 \times \frac{2{,}176}{325} = 215 \times 6.695 = 1{,}439\]

\[\chi^2 = \frac{(1{,}892 - 1{,}439)^2}{1{,}439} = \frac{205{,}209}{1{,}439} \approx 143\]

with 1 df, \(p < 10^{-32}\).

Caveat. This test treats family count as fixed and person count as the random variable; a more rigorous Poisson rate-comparison would produce slightly different statistics but the same qualitative conclusion. The order-of-magnitude rejection is robust to specification.
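The two goodness-of-fit calculations can be reproduced as follows. This is a minimal sketch under the same simplification as the text (family count fixed, one degree of freedom); the function name `rate_goodness_of_fit` is illustrative. For one degree of freedom the chi-square upper tail equals \(\operatorname{erfc}(\sqrt{\chi^2/2})\), which the standard library provides.

```python
import math

def rate_goodness_of_fit(observed_persons, families, ref_persons, ref_families):
    """Compare a district's persons with the count implied by a reference rate."""
    expected = families * ref_persons / ref_families
    chi2 = (observed_persons - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return expected, chi2, p_value

# All persons: Most (4,038 persons, 487 families) against the other nine districts.
exp_all, chi2_all, p_all = rate_goodness_of_fit(4038, 487, 14368, 2620)
# Group III only: Most (1,892 persons, 215 families) against the rest, excluding Ústí.
exp_g3, chi2_g3, p_g3 = rate_goodness_of_fit(1892, 215, 2176, 325)

print(round(exp_all), round(chi2_all), p_all)
print(round(exp_g3), round(chi2_g3), p_g3)
```

Because the sketch does not round the expected count before squaring, its statistics come out about one unit lower than the hand calculation above; the order of magnitude of the \(p\)-values is unaffected.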

The substantive interpretation: Most’s reported persons-per-family ratio differs from the rest of the kraj at probability levels that exclude any plausible random explanation. Three substantive possibilities:

  1. Most’s families really were larger — possible given the documented housing situation in Old Most.
  2. The family count is undercounted relative to the headcount.
  3. The headcount is inflated relative to the family count.

The data alone cannot distinguish these. The first is substantively consistent with the historical record (multi-kin households in condemned residences), but it is also exactly the housing condition the demolition was meant to resolve.


6. Fiscal arithmetic

The classification produces money, not simply description. The 1966 return allocated 699,996 Kčs across the kraj.

Figure 4

Most absorbed 27.6 % of the kraj envelope while housing 19.6 % of the kraj’s enumerated Roma — a funding-to-population ratio of 1.41×, the highest in the kraj. Within Most’s allocation, 142,335 Kčs (73.6 %) was attached to Group III alone. Most’s Group III headcount of 1,892 persons accounts for 46.5 % of the kraj’s entire Group III population.

The mechanism is the per-person rate:

\[\frac{\text{Rate G III}}{\text{Rate G I}} = \frac{75.23}{20.00} = 3.76\]

A district that classifies a person into Group III rather than Group I receives 3.76 times as much funding for that person. Most’s Group III rate of 47 % against a kraj average of approximately 20 % produces the over-allocation.
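The fiscal figures quoted in this section can be cross-checked with a few lines of arithmetic. The sketch uses only numbers stated in the report; Most's 27.6 % envelope share is taken as given rather than recomputed, since the full per-district allocation is not reproduced here.

```python
# Per-person rates in Kčs, from the introduction.
rate_group_1 = 20.00
rate_group_3 = 75.23
rate_ratio = rate_group_3 / rate_group_1

# Most's Group III headcount and the funding attached to it.
most_group_3_persons = 1892
most_group_3_funding = most_group_3_persons * rate_group_3

# Most's share of enumerated Roma on the consistent denominator.
most_population_share = 4038 / 20646
# Most's share of the kraj envelope, as stated in the text.
most_funding_share = 0.276
funding_to_population = most_funding_share / most_population_share

print(round(rate_ratio, 2), round(most_group_3_funding), round(funding_to_population, 2))
```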

Within a closed kraj envelope, this transfer was financed by reduced allocations to the other nine districts — visible in Figure 4 as the green ratios (under-allocated districts: Teplice 0.85×, Děčín 0.80×, Louny 0.59×, Jablonec 0.86×, Česká Lípa 0.80×). Louny is particularly notable: Louny has a substantial enumerated Roma population (≈18 % of kraj, consistent denominator) but receives only ≈10 % of the envelope, because Louny’s high Group 0 rate produces a low effective per-person draw.


7. Reporting completeness and what it implies

The missingness pattern in §4.2 is itself diagnostic. Most reports complete person-counts and family-counts by group, but misses every per-group breakdown of the demographic indicators (school attendance, employment) that constitute the classification’s textual criteria. Most also fails to report a total seriously ill count, although it reports zero in Group III. Liberec has a similar gap.

This is consistent with one of two readings. First, the per-group demographic data was simply not collected in Most and Liberec: the form was filled in the columns where data was available and left blank elsewhere. Second, the per-group demographic data was collected but not reported because it would have undercut the classification: if Most’s Group III population was disproportionately not in chronic poor health, not characterised by school non-attendance, and not unemployed, the classification’s textual rationale would not survive scrutiny. The data alone cannot distinguish these readings. What can be said is that Most, one of the two districts whose per-group demographics are missing, is also the district that drives the kraj-level Group III heterogeneity; Liberec, the other, is closer to baseline. The verification trail that would have allowed a kraj or VVOCO auditor to check Most’s classification against its stated criteria does not exist in the form.

Four additional internal arithmetic inconsistencies are documented in the source data notes (Chomutov: families sum 295 vs total 289; Jablonec: illness sum 20 vs total 13; Louny: school-children sum 650 vs total 620; Ústí: persons sum 1,968 vs total 1,810). These are not large in magnitude but indicate that the kraj-level submission was not being checked for internal consistency before transmission.


8. What the analysis establishes

Six findings follow from the calculations above.

  1. District classifications were demonstrably non-uniform, but the non-uniformity is concentrated in three outlier districts. Chi-square tests reject uniform application for all four groups, with \(p < 10^{-200}\) for Group I and \(p\)-values too small to represent in double-precision floating point for the other three. Cramér’s V values for the full sample using the consistent denominator (0.215, 0.437, 0.374, 0.385) are moderate to high, but they collapse sharply when outliers are removed: Group III from 0.374 to 0.162 (drop Most and Ústí), Group 0 from 0.385 to 0.239 (drop Louny and Ústí), Group II from 0.437 to 0.305 (drop Ústí and Most). Only Group I is stable. Eight or nine of the ten districts agreed on a baseline classification rate for each group. The three outlier districts are Most (Group III), Ústí (Group II ↑, Group III ↓), and Louny (Group 0).

  2. The headline Cramér’s V values reported in the predecessor analysis were partly an artefact of denominator inconsistency. The original analysis used the as-reported persons total, which mixes districts that include Group 0 in their summary with districts that exclude it. Recomputing on the consistent sum-of-groups denominator drops Group 0’s V from 0.586 to 0.385 and Group II’s from 0.467 to 0.437. The substantive conclusion is unchanged but the magnitudes and the Group 0 picture are more cautious.

  3. The indicators the typology cited as criteria for the I/II/III ordering do not predict the classification across districts. School attendance, illness, and employment all fail to predict Group III share at any conventional threshold. The illness rate, which is the explicit ground for Group III in the typology’s textual rationale, has a negative (though not significant) correlation with Group III share in the eight districts that report it. The system’s stated criteria are statistically unrelated to its outputs, and where they correlate at all, the sign is the opposite of what the typology predicts.

  4. The only finding that survived strict statistical correction is artefactual. Mean family size predicts Group III share at \(r = 0.948\) when computed with the as-reported denominator. With the consistent denominator from §1, the correlation falls to \(r = 0.294\) and ceases to be significant; without Most, it goes to \(r = -0.083\). The finding was generated jointly by Most’s anomalous family-size reporting and by the Group 0 denominator inconsistency. The substantive conclusion is now stronger than the original report’s: there is no surviving exogenous predictor of Group III classification.

  5. Most’s family-size reporting is statistically anomalous against the rest of the kraj at \(p < 10^{-150}\) for the overall ratio and \(p < 10^{-32}\) for Group III specifically. The anomaly admits multiple substantive interpretations (genuinely larger families, undercounted families, inflated headcount), each of which sets Most’s return apart from the rest of the kraj.

  6. Most received fiscal resources at a rate substantially above what its population share would imply. The over-allocation flowed mechanically through the differential per-person rates attached to Group III, financed within a closed kraj envelope by the other nine districts. Most’s funding-to-population ratio of 1.41× is the highest in the kraj.

The numerical evidence does not, on its own, establish the cause of Most’s anomaly. It identifies the cell of the kraj that any explanation has to account for — and shows that the classification system, applied with reasonable consistency in eight of ten districts, was applied in Most in a way that produced the headcount needed to extract maximum funding for the population the demolition decision required to be moved. The historical record (the 1964 commitment to demolish Old Most for brown-coal extraction, the housing-allocation decisions that had concentrated Roma in the demolition-targeted district, the documented failure of the dispersal effort the classification was meant to fund) supplies the substantive context. Whether Most’s commission was applying the classification strategically or merely responding to the city’s housing concentration is a question the data cannot resolve. What the data establish is that Most is the cell, that its anomaly is the engine of the kraj’s classification heterogeneity, and that no exogenous indicator of the underlying population accounts for it.


9. Methodological caveats

  1. Sample size. With \(n = 10\) districts, statistical power to detect medium-sized correlations is limited. A medium effect (\(|r| = 0.5\)) has approximately 34 % power at \(\alpha = 0.05\). The absence of significant correlations should be read as “no strong relationship in the available data,” not “no relationship of any size.” The substantive argument rests on the contrast between the magnitude of the chi-square heterogeneity and the absence of any predictor that captures it, not on a precise zero correlation in any individual test.

  2. Endogeneity of the aggregate-rate predictors. Aggregate demographic rates (school attendance, employment, illness) are partly produced by the same enumeration process whose outputs are the outcome variables. A correlation between an aggregate rate and a group share could reflect either the classification rule producing the aggregate, or the aggregate informing the classification. The analysis cannot distinguish these on the available data.

  3. Bonferroni is conservative. It controls the family-wise error rate under any dependence between tests, and it is most conservative when the tests are positively correlated, as they are here: the fifty tests share underlying observations and are not independent. Less conservative corrections (Holm-Bonferroni, Benjamini-Hochberg) would relax the threshold somewhat. Under any reasonable correction, the family-size finding fails once the denominator is corrected.

  4. Most as a leverage point. Most is an outlier on essentially every variable. Whether Most should be analysed within the kraj-level statistics or treated as a separate case is itself an interpretive question; this report follows the convention of reporting both with-Most and without-Most results.

  5. The denominator choice in §1 is an interpretive decision. Both the as-reported total and the consistent sum-of-groups total can be defended. The as-reported total is what the kraj submitted; the consistent total is what would result if all districts followed the same convention. This report uses the consistent total because the question being tested is about classification practice, not reporting practice. A different methodological choice would change some V values modestly but not the substantive conclusion.

  6. Missing data was handled by listwise deletion (analyses with missing variables drop those districts). The illness predictor was therefore tested on \(n = 8\), not \(n = 10\). This affects the statistical power of the illness null but not its sign — which, notably, is the wrong sign for the typology’s stated rationale.
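To illustrate the alternative corrections named in caveat 3, the sketch below counts rejections under Bonferroni, Holm-Bonferroni, and Benjamini-Hochberg. The ten \(p\)-values are hypothetical and chosen only to separate the three procedures; they are not drawn from this report's fifty tests. Note that Benjamini-Hochberg controls the false discovery rate rather than the family-wise error rate.

```python
def bonferroni_rejections(p_values, alpha=0.05):
    """Count tests with p at or below alpha / m."""
    m = len(p_values)
    return sum(p <= alpha / m for p in p_values)

def holm_rejections(p_values, alpha=0.05):
    """Step-down Holm procedure: stop at the first sorted p that fails."""
    m = len(p_values)
    rejected = 0
    for i, p in enumerate(sorted(p_values)):
        if p <= alpha / (m - i):
            rejected += 1
        else:
            break
    return rejected

def benjamini_hochberg_rejections(p_values, alpha=0.05):
    """Step-up BH procedure: reject up to the largest k with p_(k) <= k * alpha / m."""
    m = len(p_values)
    largest_k = 0
    for k, p in enumerate(sorted(p_values), start=1):
        if p <= k * alpha / m:
            largest_k = k
    return largest_k

# Hypothetical p-values for illustration only.
example_p = [0.001, 0.0055, 0.012, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

n_bonferroni = bonferroni_rejections(example_p)
n_holm = holm_rejections(example_p)
n_bh = benjamini_hochberg_rejections(example_p)

print(n_bonferroni, n_holm, n_bh)
```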