Kabinga Chanda
kabingachanda@ | LinkedIn
Cervical cancer is one of the most preventable cancers in the world. We have the vaccine. We have the screening tools. We have the treatment. And yet, in 2022, it remains the fourth most common cancer among women globally, claiming the lives of hundreds of thousands each year.
This report asks a simple but important question: where is the burden concentrated, and what is driving it?
Using data from the Global Cancer Observatory (WHO, 2022) and a series of spatial statistical methods, this analysis maps the global distribution of cervical cancer incidence, identifies significant geographic clusters, and examines the country-level factors associated with elevated risk.
Key finding: 42 countries in Sub-Saharan Africa form a single, statistically significant high-burden cluster, where women face nearly double the cervical cancer incidence expected from the global rate (SMR = 1.98, p = 0.001).
| Dataset | Source | Year |
|---|---|---|
| Cervical cancer incidence rate (per 100,000) | Global Cancer Observatory, WHO | 2022 |
| HPV vaccination coverage | WHO Immunization Data | 2022 |
| HIV prevalence | World Bank | 2022 |
| GDP per capita | World Bank | 2022 |
| Urbanisation rate | World Bank | 2022 |
| Mean years of female education | UNDP HDI | 2022 |
| Country boundaries | Natural Earth (rnaturalearth) | 2024 |
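The analysis followed a structured spatial epidemiology workflow: descriptive mapping of incidence, a global test for spatial autocorrelation (Global Moran’s I), local cluster detection (LISA), a Kulldorff spatial scan for the most likely cluster, and regression modelling of country-level covariates (OLS and multilevel).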
All spatial analyses were conducted in R using the spdep, SpatialEpi, lme4, and mice packages.
Figure 1. Cervical cancer incidence rate per 100,000 women, 2022. Grey indicates no data available.
The map reveals a stark geographic pattern. Sub-Saharan Africa carries the heaviest burden, with deep red shading across Eastern and Southern Africa. Europe, North America, and Australia show the lowest incidence rates. A large portion of Central Asia and parts of Oceania have no data available.
Figure 2. Top 10 countries by cervical cancer incidence rate per 100,000 women, 2022.
Every one of the top 10 countries is in Sub-Saharan Africa. Eswatini records the highest rate in the world at over 95 cases per 100,000 women, more than ten times the rate seen in many Western European countries.
Before identifying clusters, it is important to ask whether the geographic pattern in the map reflects genuine spatial clustering or could simply be random variation.
Global Moran’s I tests this formally. Values near +1 indicate that similar values cluster together, values near 0 suggest a spatially random pattern, and negative values indicate that dissimilar values sit next to each other.
Table 1. Global Moran’s I results

| Parameter | Value |
|---|---|
| Moran’s I statistic | 0.606 |
| Expectation | -0.007 |
| Variance | 0.004 |
| p-value | < 0.001 |
The Moran’s I statistic is positive and statistically significant (p < 0.001), confirming that cervical cancer incidence is not randomly distributed. High-burden countries cluster near other high-burden countries.
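As a rough check on Table 1, the statistic can be converted to a z-score from the reported expectation and variance. The sketch below uses Python purely for the arithmetic (the report's own analysis used spdep in R). The function names `morans_i` and `z_score`, the binary weights, and the four-unit toy data are illustrative assumptions, not the report's data or weighting scheme.

```python
import math


def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


def z_score(statistic, expectation, variance):
    """Standardise a statistic against its null expectation and variance."""
    return (statistic - expectation) / math.sqrt(variance)


# Hypothetical toy data: four units in a row, each bordering the next one.
toy_values = [1, 2, 3, 4]
toy_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
toy_i = morans_i(toy_values, toy_weights)

# Values taken from Table 1 (rounded as reported).
z = z_score(0.606, -0.007, 0.004)
print(round(toy_i, 3), round(z, 1))
```

With the Table 1 values this gives a z-score of roughly 9.7. Because the variance is reported to only three decimal places, that figure is approximate, but it is far enough from zero to be consistent with the reported p < 0.001.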
Global Moran’s I tells us clustering exists. Local Moran’s I (LISA) tells us where.
Figure 3. LISA cluster map showing local spatial autocorrelation in cervical cancer incidence, 2022.
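How to read this map: High-High marks countries with high incidence whose neighbours also have high incidence; Low-Low marks countries with low incidence whose neighbours also have low incidence. Countries outside these clusters either differ from their neighbours (High-Low or Low-High outliers) or show no statistically significant local pattern.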
The High-High cluster covers most of Sub-Saharan Africa. The Low-Low cluster covers the developed world. The contrast could not be more striking.
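The quadrant labels on a LISA map come from comparing each country's deviation from the mean with the average deviation of its neighbours. A minimal Python sketch of that classification follows, assuming row-standardised contiguity weights. The function name `lisa_quadrants`, the neighbour lists, and the six toy values are hypothetical, and the sketch omits the permutation test that decides which labels are statistically significant.

```python
def lisa_quadrants(values, neighbours):
    """Label each unit by its own deviation and its neighbours' mean deviation."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    labels = []
    for i in range(n):
        nbrs = neighbours[i]
        if not nbrs:
            # Units without neighbours cannot be classified.
            labels.append("No neighbours")
            continue
        lag = sum(dev[j] for j in nbrs) / len(nbrs)  # spatial lag of deviations
        own = "High" if dev[i] > 0 else "Low"
        around = "High" if lag > 0 else "Low"
        labels.append(f"{own}-{around}")
    return labels


# Hypothetical toy data: six units in a row, high rates on the left.
toy_rates = [10, 12, 11, 2, 3, 1]
toy_neighbours = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
toy_labels = lisa_quadrants(toy_rates, toy_neighbours)
print(toy_labels)
```

In the toy data the fourth unit has a low rate but sits next to a high-rate unit, so it is labelled Low-High, the kind of spatial outlier that can appear at the edge of a cluster.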
The LISA map identifies individual country-level clusters. The Kulldorff spatial scan statistic takes this further, identifying the single most likely disease cluster and testing its statistical significance using Monte Carlo simulation.
Table 2. Kulldorff spatial scan statistic: most likely cluster, cervical cancer incidence, 2022

| Parameter | Value |
|---|---|
| Number of countries in cluster | 42 |
| Countries included | Namibia, Botswana, Angola, South Africa, Lesotho, Zambia, Zimbabwe, eSwatini, Dem. Rep. Congo, Congo, Gabon, Malawi, Mozambique, Burundi, Eq. Guinea, Rwanda, São Tomé and Principe, Tanzania, Cameroon, Central African Rep., Uganda, Nigeria, S. Sudan, Kenya, Comoros, Togo, Benin, Chad, Madagascar, Ghana, Niger, Sudan, Côte d'Ivoire, Ethiopia, Burkina Faso, Liberia, Somalia, Eritrea, Djibouti, Mali, Sierra Leone, Guinea |
| Observed cases | 149,015 |
| Expected cases | 75,306 |
| SMR (relative risk) | 1.98 |
| Log-likelihood ratio | 39,953.54 |
| Monte Carlo rank | 1 |
| p-value | 0.001 |

Source: GCO, WHO, 2022. SpatialEpi Kulldorff scan, 999 Monte Carlo simulations.
Figure 4. Kulldorff spatial scan cluster map. The most likely cluster (red) covers 42 Sub-Saharan African countries.
The Standardised Morbidity Ratio (SMR) of 1.98 means that women within this 42-country cluster experience 98% higher cervical cancer incidence than would be expected if the global rate applied there. The finding is highly statistically significant: with 999 Monte Carlo simulations, p = 0.001 is the smallest p-value the test can return.
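The arithmetic behind the SMR and the Monte Carlo p-value can be reproduced from Table 2. The short Python sketch below does so. The variable names are illustrative, and the p-value line assumes the usual rank-based rule (rank divided by the number of simulations plus one).

```python
# Values copied from Table 2.
observed_cases = 149015
expected_cases = 75306
n_simulations = 999
monte_carlo_rank = 1  # the observed cluster ranked first

# SMR: observed cases divided by the cases expected under the null.
smr = observed_cases / expected_cases
excess_percent = (smr - 1) * 100

# Rank-based Monte Carlo p-value (assumed rule: rank / (simulations + 1)).
p_value = monte_carlo_rank / (n_simulations + 1)

print(f"SMR = {smr:.2f}, excess = {excess_percent:.0f}%, p = {p_value:.3f}")
```

The printed values match the table: an SMR of 1.98, an excess of about 98%, and p = 0.001.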
To understand what country-level factors are associated with higher incidence, two complementary models were fitted.
OLS regression was applied to complete cases (n = 77 countries) to establish baseline associations. Spatial diagnostics on the OLS residuals confirmed that Moran’s I was non-significant (I = 0.003, p = 0.44) and Lagrange Multiplier tests did not indicate the need for a spatial lag or error specification. This suggests the covariates largely explain the observed spatial clustering.
Multilevel regression nested countries within UN regions (random intercept) and used multiple imputation (MICE, m = 5, predictive mean matching) to handle missing covariate data. The Intraclass Correlation Coefficient (ICC = 0.13) confirmed that 13% of residual variance in incidence was attributable to between-region differences, justifying the multilevel structure.
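Estimates from multiply imputed datasets are conventionally combined with Rubin's rules. The Python sketch below illustrates the pooling step for one coefficient. It assumes this is how the five imputed fits were combined, which the report does not state explicitly. The function name `pool_rubin` and the example estimates are hypothetical, not values from the analysis.

```python
import math
import statistics


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed datasets using Rubin's rules."""
    m = len(estimates)
    pooled = statistics.mean(estimates)
    # Within-imputation variance: average of the squared standard errors.
    within = statistics.mean([se ** 2 for se in std_errors])
    # Between-imputation variance: sample variance of the m estimates.
    between = statistics.variance(estimates)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical coefficient estimates and standard errors from m = 5 imputations.
example_estimates = [8.2, 8.9, 8.6, 8.7, 8.8]
example_std_errors = [1.20, 1.30, 1.25, 1.30, 1.35]
pooled_estimate, pooled_se = pool_rubin(example_estimates, example_std_errors)
print(round(pooled_estimate, 2), round(pooled_se, 2))
```

The pooled standard error is never smaller than the average within-imputation error, because disagreement between the imputed datasets adds to the uncertainty. In R, the `pool()` function in mice applies rules of this kind.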
Outcome: Cervical Cancer Incidence Rate

| | OLS | Multilevel |
|---|---|---|
| Intercept | 38.010*** | 17.848*** |
| | (4.545) | (1.653) |
| HPV Vaccination Coverage | -0.043 | -1.432 |
| | (0.041) | (1.364) |
| HIV Prevalence | 2.301*** | 8.653*** |
| | (0.269) | (1.307) |
| GDP per Capita | -0.000 | 0.199 |
| | (0.000) | (0.985) |
| Urbanisation | -0.102 | -2.653 |
| | (0.076) | (1.337) |
| Female Education (mean years) | -1.296** | -3.112 |
| | (0.485) | (1.447) |
| Num.Obs. | 77 | |
| R2 | 0.707 | |
| R2 Adj. | 0.686 | |

Notes: * p < 0.05, ** p < 0.01, *** p < 0.001. Standard errors in parentheses.
OLS: complete cases only (n = 77). Multilevel: multiple imputation (m = 5, PMM).
Multilevel model includes random intercept for UN region (ICC = 0.13).
Multilevel predictors are standardised (mean = 0, SD = 1).
Source: GCO WHO 2022; UNDP HDI 2022; World Bank 2022.
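One way to read the Multilevel column is to divide each coefficient by its standard error. The Python sketch below computes these ratios from the values in the table. The ratio alone does not give the p-value: for multiply imputed models the reference distribution depends on pooled degrees of freedom, which the table does not report, so the ratios are only a rough guide to the significance stars. The names `multilevel` and `wald_ratio` are illustrative.

```python
# Coefficients and standard errors copied from the Multilevel column.
multilevel = {
    "HPV Vaccination Coverage": (-1.432, 1.364),
    "HIV Prevalence": (8.653, 1.307),
    "GDP per Capita": (0.199, 0.985),
    "Urbanisation": (-2.653, 1.337),
    "Female Education (mean years)": (-3.112, 1.447),
}


def wald_ratio(estimate, std_error):
    """Coefficient divided by its standard error."""
    return estimate / std_error


ratios = {name: wald_ratio(b, se) for name, (b, se) in multilevel.items()}
for name, ratio in ratios.items():
    print(f"{name:32s} {ratio:6.2f}")
```

HIV prevalence stands well apart from the other predictors, with a ratio above 6, while urbanisation and female education sit close to 2 in absolute value.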
HIV prevalence is the strongest predictor of cervical cancer incidence.
In both models, HIV prevalence is highly significant. In the multilevel model, each standard deviation increase in HIV prevalence is associated with 8.65 additional cervical cancer cases per 100,000 women (coefficient 8.653, p < 0.001). The biological mechanism is well established: HIV compromises immune function, allowing HPV to persist and progress to cancer rather than being cleared naturally.
Urbanisation is protective.
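In the multilevel model, a one standard deviation increase in urbanisation is associated with 2.65 fewer cases per 100,000 women (coefficient -2.653, SE 1.337). The estimate carries no significance star in the regression table, so the protective association should be read as suggestive. Urban women have greater access to cervical screening services, HPV vaccination programmes, health information, and trained healthcare workers. Rural women are systematically disadvantaged.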
HPV vaccination and GDP were not statistically significant in either model, and female education was significant only in the OLS model (-1.296, p < 0.01). These weak results likely reflect the high proportion of missing data for these variables (HPV vaccination was missing for 58% of countries) rather than a true null effect.
This analysis has several important limitations that should be considered when interpreting the findings.
Missing data is substantial. 58 countries had no cervical cancer incidence data, and HPV vaccination coverage was missing for over half of all countries. Multiple imputation was used to address this, but results for data-sparse regions should be interpreted cautiously.
Ecological fallacy. All analyses are conducted at the country level, so the association between HIV prevalence and cervical cancer incidence is a country-level association. On its own, it does not show that the women developing cervical cancer are the women living with HIV. Drawing individual-level conclusions from country-level data is a methodological error known as the ecological fallacy.
Cross-sectional design. This is a single time-point analysis (2022). Causal conclusions cannot be drawn. The associations identified are correlational.
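Spatial weights based on shared borders. Island nations and geographically isolated countries have no contiguity neighbours and were therefore excluded from the border-based analyses (Global Moran’s I and LISA). The Kulldorff scan is not affected in the same way: its most likely cluster includes island states such as Madagascar and Comoros. The exclusion may still underrepresent burden in Pacific and Caribbean countries.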
Surveillance bias. Countries with stronger health systems are more likely to have accurate and complete incidence data. High-burden, low-resource countries may underreport true incidence, meaning the disparities identified here could be conservative estimates.
The global burden of cervical cancer is not evenly distributed. It is concentrated, it is measurable, and it is linked to identifiable structural factors.
Forty-two Sub-Saharan African countries form a single statistically significant cluster where women face nearly double the expected cervical cancer burden. HIV prevalence and low urbanisation are the strongest country-level predictors of elevated incidence.
The tragedy is not a lack of solutions. HPV vaccination, cervical screening, and early treatment can prevent the vast majority of cervical cancer deaths. The challenge is ensuring that the women who carry the greatest burden have access to these tools.
The data tells us where the problem is. The response requires political commitment, sustained investment, and health systems that reach women where they live.
All analyses were conducted in R (version 4.6+). Key packages used:
- spdep for spatial weights and Moran’s I
- SpatialEpi for the Kulldorff scan statistic
- lme4 and lmerTest for multilevel regression
- mice for multiple imputation
- performance for ICC and R-squared
- modelsummary and gt for publication-ready tables
- ggplot2 and sf for all maps and figures
- rnaturalearth for country boundary data

OLS model: \[\text{inc\_rat}_i = \beta_0 + \beta_1\text{hpv\_vac} + \beta_2\text{hiv} + \beta_3\text{gdp} + \beta_4\text{urb} + \beta_5\text{educ} + \epsilon_i\]
Multilevel model (standardised predictors): \[\text{inc\_rat}_{ij} = \beta_0 + \beta_1\text{hpv\_vac}_s + \beta_2\text{hiv}_s + \beta_3\text{gdp}_s + \beta_4\text{urb}_s + \beta_5\text{educ}_s + u_{0j} + \epsilon_{ij}\]
Where \(i\) indexes countries, \(j\) indexes UN regions, and \(u_{0j}\) is the region-level random intercept.
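The reported ICC follows from the two variance components of the multilevel model. Writing \(\sigma^2_u\) for the variance of the region-level intercepts \(u_{0j}\) and \(\sigma^2_\epsilon\) for the variance of the country-level residuals \(\epsilon_{ij}\) (this variance notation is an assumption, not taken from the report), the standard definition is:

\[\text{ICC} = \frac{\sigma^2_u}{\sigma^2_u + \sigma^2_\epsilon}\]

With ICC = 0.13, the ratio of between-region to within-region variance is:

\[\frac{\sigma^2_u}{\sigma^2_\epsilon} = \frac{0.13}{1 - 0.13} \approx 0.15\]

So the region-level variance is roughly 15% of the country-level residual variance.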
Report generated using R Markdown. Data: Global Cancer Observatory (WHO), World Bank, UNDP, 2022.