Overview

Cervical cancer is one of the most preventable cancers in the world. We have the vaccine. We have the screening tools. We have the treatment. And yet, in 2022, it remains the fourth most common cancer among women globally, claiming the lives of hundreds of thousands each year.

This report asks a simple but important question: where is the burden concentrated, and what is driving it?

Using data from the Global Cancer Observatory (WHO, 2022) and a series of spatial statistical methods, this analysis maps the global distribution of cervical cancer incidence, identifies significant geographic clusters, and examines the country-level factors associated with elevated risk.

Key finding: 42 countries in Sub-Saharan Africa form a single, statistically significant high-burden cluster, where observed cervical cancer incidence is nearly double the level expected under the global null (SMR = 1.98, p = 0.001).


Data and Methods

Data Sources

Dataset                                        Source                           Year
Cervical cancer incidence rate (per 100,000)   Global Cancer Observatory, WHO   2022
HPV vaccination coverage                       WHO Immunization Data            2022
HIV prevalence                                 World Bank                       2022
GDP per capita                                 World Bank                       2022
Urbanisation rate                              World Bank                       2022
Mean years of female education                 UNDP HDI                         2022
Country boundaries                             Natural Earth (rnaturalearth)    2024

Analytical Approach

The analysis followed a structured spatial epidemiology workflow:

  1. Descriptive mapping using a choropleth map to visualise global incidence patterns
  2. Global Moran’s I to test for spatial autocorrelation across all countries
  3. Local Moran’s I (LISA) to identify specific clusters of high and low incidence
  4. Kulldorff spatial scan statistic to detect and formally test the most likely disease cluster
  5. Multilevel regression with countries nested within UN regions to identify predictors of incidence, using multiple imputation (MICE, m = 5) to handle missing data
  6. OLS regression with spatial diagnostics to confirm whether spatial regression was necessary

All analyses were conducted in R: spdep and SpatialEpi for the spatial statistics, lme4 for the multilevel model, and mice for multiple imputation.


Global Distribution

Choropleth Map

Figure 1. Cervical cancer incidence rate per 100,000 women, 2022. Grey indicates no data available.

The map reveals a stark geographic pattern. Sub-Saharan Africa carries the heaviest burden, with deep red shading across Eastern and Southern Africa. Europe, North America, and Australia show the lowest incidence rates. A large portion of Central Asia and parts of Oceania have no data available.

Highest Burden Countries

Figure 2. Top 10 countries by cervical cancer incidence rate per 100,000 women, 2022.

Every one of the top 10 countries is in Sub-Saharan Africa. Eswatini records the highest rate in the world at over 95 cases per 100,000 women, more than ten times the rate seen in many Western European countries.


Spatial Analysis

Is the Pattern Random?

Before identifying clusters, it is important to ask whether the geographic pattern in the map reflects genuine spatial clustering or could simply be random variation.

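Global Moran’s I tests this formally. A statistic close to +1 indicates that similar values cluster together. A value near its expectation under the null, -1/(n - 1), which is roughly 0 for large n, suggests a random pattern. Negative values indicate that dissimilar values tend to be neighbours.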

Table 1. Global Moran’s I Results
Parameter            Value
Moran I statistic    0.606
Expectation          -0.007
Variance             0.004
p-value              < 0.001

The Moran’s I statistic is positive and statistically significant (p < 0.001), confirming that cervical cancer incidence is not randomly distributed. High-burden countries cluster near other high-burden countries.
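As a rough illustration of what the statistic measures, the sketch below computes Global Moran’s I for a toy chain of four regions and then derives the z-score implied by the rounded values in Table 1. The toy values, the binary contiguity weights, and the function name `morans_i` are assumptions made for illustration. The report itself used the spdep package in R, whose weighting scheme and variance formula may differ.

```python
import math


def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


# Hypothetical toy data: four regions in a row; adjacent regions are neighbours.
toy_values = [10, 9, 2, 1]
toy_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
toy_i = morans_i(toy_values, toy_weights)

# z-score implied by the rounded figures reported in Table 1.
observed, expectation, variance = 0.606, -0.007, 0.004
z_score = (observed - expectation) / math.sqrt(variance)

print(f"toy Moran's I  = {toy_i:.3f}")
print(f"z from Table 1 = {z_score:.2f}")
```

Because the variance in Table 1 is rounded to three decimals, the z-score is approximate, but at roughly 9.7 it is consistent with the reported p < 0.001.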

Where are the Clusters? (LISA Map)

Global Moran’s I tells us clustering exists. Local Moran’s I (LISA) tells us where.

Figure 3. LISA cluster map showing local spatial autocorrelation in cervical cancer incidence, 2022.

How to read this map:

  • Red (High-High): Countries with high incidence surrounded by other high-incidence countries. The core cluster.
  • Blue (Low-Low): Countries with low incidence surrounded by low-incidence neighbours. Western Europe, North America, Australia.
  • Orange/Light blue: Spatial outliers (High-Low or Low-High), meaning countries whose incidence differs from that of their neighbours.
  • Grey: Not statistically significant or no data.

The High-High cluster covers most of Sub-Saharan Africa. The Low-Low cluster covers Western Europe, North America, and Australia. The contrast could not be more striking.
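The quadrant labels in the legend can be sketched as follows. This is a minimal illustration, assuming row-standardised contiguity weights and hypothetical toy values. It omits the permutation test that decides which countries are shaded as significant, so it shows only how the labels arise, not the full LISA procedure used in the report.

```python
def local_morans(values, neighbours):
    """Return (local I, quadrant label) per region, row-standardised weights."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    m2 = sum(d * d for d in dev) / n  # population variance of the values
    results = []
    for i, nbrs in enumerate(neighbours):
        if not nbrs:
            # Regions with no neighbours cannot be classified.
            results.append((None, "No neighbours"))
            continue
        lag = sum(dev[j] for j in nbrs) / len(nbrs)  # mean deviation of neighbours
        own = "High" if dev[i] > 0 else "Low"
        around = "High" if lag > 0 else "Low"
        results.append((dev[i] * lag / m2, f"{own}-{around}"))
    return results


# Hypothetical toy data: four regions in a row; adjacent regions are neighbours.
toy_values = [10, 9, 2, 1]
toy_neighbours = [[1], [0, 2], [1, 3], [2]]
toy_result = local_morans(toy_values, toy_neighbours)
toy_labels = [label for _, label in toy_result]
print(toy_labels)
```

In this toy chain the two high regions are labelled High-High and the two low regions Low-Low, which mirrors the red and blue categories on the map.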

Formal Cluster Detection (Kulldorff Scan)

The LISA map identifies individual country-level clusters. The Kulldorff spatial scan statistic takes this further, identifying the single most likely disease cluster and testing its statistical significance using Monte Carlo simulation.

Table 2. Kulldorff Spatial Scan Statistics
Most Likely Cluster, Cervical Cancer Incidence, 2022
Parameter                         Value
Number of Countries in Cluster    42
Observed Cases                    149,015
Expected Cases                    75,306
SMR (observed / expected)         1.98
Log-Likelihood Ratio              39,953.54
Monte Carlo Rank                  1
P-value                           0.001
Countries Included: Namibia, Botswana, Angola, South Africa, Lesotho, Zambia, Zimbabwe, Eswatini, Dem. Rep. Congo, Congo, Gabon, Malawi, Mozambique, Burundi, Eq. Guinea, Rwanda, São Tomé and Príncipe, Tanzania, Cameroon, Central African Rep., Uganda, Nigeria, S. Sudan, Kenya, Comoros, Togo, Benin, Chad, Madagascar, Ghana, Niger, Sudan, Côte d'Ivoire, Ethiopia, Burkina Faso, Liberia, Somalia, Eritrea, Djibouti, Mali, Sierra Leone, Guinea
Source: GCO, WHO, 2022 | SpatialEpi Kulldorff scan, 999 Monte Carlo simulations

Figure 4. Kulldorff spatial scan cluster map. The most likely cluster (red) covers 42 Sub-Saharan African countries.

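The Standardised Morbidity Ratio (SMR) of 1.98 is the ratio of observed to expected cases (149,015 / 75,306). It means that women within this 42-country cluster experience 98% higher cervical cancer incidence than would be expected under the global null. The finding is highly statistically significant: the observed cluster ranked first among the likelihood ratios from 999 Monte Carlo simulations, giving p = 0.001, the smallest value attainable with that number of simulations.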
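The two headline numbers in Table 2 can be reproduced with simple arithmetic. The sketch below uses only the figures reported in the table. The function names are illustrative, and the rank-based p-value is the standard Monte Carlo formula rather than a call into SpatialEpi.

```python
def smr(observed, expected):
    """Standardised morbidity ratio: observed cases over expected cases."""
    return observed / expected


def monte_carlo_p(rank, n_simulations):
    """p-value for a statistic with the given rank among the observed
    value plus n_simulations simulated values (rank 1 = largest)."""
    return rank / (n_simulations + 1)


cluster_smr = smr(observed=149015, expected=75306)
cluster_p = monte_carlo_p(rank=1, n_simulations=999)

print(f"SMR = {cluster_smr:.2f}")
print(f"p   = {cluster_p:.3f}")
```

With 999 simulations the p-value cannot fall below 1/1000, so p = 0.001 should be read as "no simulated cluster was as extreme as the observed one".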


What is Driving the Burden?

Regression Models

To understand what country-level factors are associated with higher incidence, two complementary models were fitted.

OLS regression was applied to complete cases (n = 77 countries) to establish baseline associations. Spatial diagnostics on the OLS residuals confirmed that Moran’s I was non-significant (I = 0.003, p = 0.44) and Lagrange Multiplier tests did not indicate the need for a spatial lag or error specification. This suggests the covariates largely explain the observed spatial clustering.

Multilevel regression nested countries within UN regions (random intercept) and used multiple imputation (MICE, m = 5, predictive mean matching) to handle missing covariate data. The Intraclass Correlation Coefficient (ICC = 0.13) confirmed that 13% of residual variance in incidence was attributable to between-region differences, justifying the multilevel structure.
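To illustrate how estimates from m = 5 imputed datasets are combined into a single coefficient and standard error, the sketch below applies Rubin's rules to hypothetical numbers. The per-imputation estimates and standard errors are invented for illustration and are not the report's values. In R, the pooling step is handled by the mice package.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across imputed datasets using Rubin's rules."""
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, math.sqrt(total)


# Hypothetical coefficient and standard error from each of five imputed datasets.
toy_estimates = [8.0, 9.0, 8.5, 9.5, 8.5]
toy_std_errors = [1.2, 1.2, 1.2, 1.2, 1.2]
pooled_estimate, pooled_se = pool_rubin(toy_estimates, toy_std_errors)

print(f"pooled estimate = {pooled_estimate:.2f}")
print(f"pooled SE       = {pooled_se:.3f}")
```

The pooled standard error is larger than any single-imputation standard error because the between-imputation variance carries the extra uncertainty due to missing data.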

Table 3. Determinants of Cervical Cancer Incidence Rate per 100,000 Women, 2022
Outcome: Cervical Cancer Incidence Rate
                                 OLS          Multilevel
Intercept                        38.010***    17.848***
                                 (4.545)      (1.653)
HPV Vaccination Coverage         -0.043       -1.432
                                 (0.041)      (1.364)
HIV Prevalence                   2.301***     8.653***
                                 (0.269)      (1.307)
GDP per Capita                   -0.000       0.199
                                 (0.000)      (0.985)
Urbanisation                     -0.102       -2.653
                                 (0.076)      (1.337)
Female Education (mean years)    -1.296**     -3.112
                                 (0.485)      (1.447)
Num.Obs.                         77
R2                               0.707
R2 Adj.                          0.686
* p < 0.05, ** p < 0.01, *** p < 0.001
OLS: complete cases only (n = 77). Multilevel: multiple imputation (m = 5, PMM).
Multilevel model includes random intercept for UN region (ICC = 0.13).
Multilevel predictors are standardised (mean = 0, SD = 1).
Source: GCO WHO 2022; UNDP HDI 2022; World Bank 2022.
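Because the multilevel predictors are standardised, its coefficients are read per standard deviation, whereas the OLS coefficients are read per raw unit. The sketch below shows the standardisation step on hypothetical values. The data and the function name `standardise` are assumptions for illustration, and the sample standard deviation (divisor n - 1) is assumed.

```python
from statistics import mean, stdev


def standardise(xs):
    """Centre on the mean and scale by the sample standard deviation."""
    centre, spread = mean(xs), stdev(xs)
    return [(x - centre) / spread for x in xs]


# Hypothetical HIV prevalence values (%) for six countries.
toy_hiv = [0.1, 0.5, 1.2, 4.8, 12.3, 19.6]
toy_z = standardise(toy_hiv)

print([round(z, 2) for z in toy_z])
print(f"mean = {mean(toy_z):.2f}, SD = {stdev(toy_z):.2f}")
```

After this transformation a coefficient describes the change in incidence per one-SD change in the predictor, which lets predictors measured in different units be compared within the multilevel column.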

Key Findings

HIV prevalence is the strongest driver of cervical cancer incidence.

In both models, HIV prevalence is highly significant. In the multilevel model, each standard deviation increase in HIV prevalence is associated with 8.65 additional cervical cancer cases per 100,000 women (p < 0.001; Table 3). The biological mechanism is well established: HIV compromises immune function, allowing HPV to persist and progress to cancer rather than being cleared naturally.

Urbanisation appears protective.

In the multilevel model, a one standard deviation increase in urbanisation is associated with 2.65 fewer cases per 100,000 women (SE = 1.34; Table 3), although the estimate does not reach the p < 0.05 threshold in that table. The direction is plausible: urban women have greater access to cervical screening services, HPV vaccination programmes, health information, and trained healthcare workers, while rural women are systematically disadvantaged.

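HPV vaccination and GDP per capita were not statistically significant in either model. Female education was significant only in the OLS model (1.30 fewer cases per 100,000 women for each additional year of schooling, p < 0.01). The null result for HPV vaccination likely reflects the high proportion of missing data (coverage was missing for 58% of countries) rather than a true null effect.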


Limitations

This analysis has several important limitations that should be considered when interpreting the findings.

Missing data is substantial. 58 countries had no cervical cancer incidence data, and HPV vaccination coverage was missing for over half of all countries. Multiple imputation was used to address this, but results for data-sparse regions should be interpreted cautiously.

Ecological fallacy. All analyses are conducted at the country level, so the associations found between HIV prevalence and cervical cancer incidence are country-level associations. On their own, these data do not show that the women developing cervical cancer are the women living with HIV. Drawing individual-level conclusions from country-level data is a methodological error known as the ecological fallacy.

Cross-sectional design. This is a single time-point analysis (2022). Causal conclusions cannot be drawn. The associations identified are correlational.

Spatial weights based on shared borders. Island nations and geographically isolated countries had no spatial neighbours and were excluded from the contiguity-based analyses (Global Moran’s I and LISA). This may underrepresent burden in Pacific and Caribbean countries.
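The effect of border-based weights on islands can be seen in a small sketch. The five countries and four land borders below are a hand-picked illustrative subset, not the report's neighbour list, and the function name `build_neighbours` is hypothetical.

```python
def build_neighbours(countries, borders):
    """Build symmetric neighbour sets from a list of shared land borders."""
    neighbours = {country: set() for country in countries}
    for a, b in borders:
        neighbours[a].add(b)
        neighbours[b].add(a)
    return neighbours


# Illustrative subset only: a few southern African countries and one island state.
countries = ["South Africa", "Lesotho", "Eswatini", "Mozambique", "Madagascar"]
borders = [
    ("South Africa", "Lesotho"),
    ("South Africa", "Eswatini"),
    ("South Africa", "Mozambique"),
    ("Eswatini", "Mozambique"),
]
neighbours = build_neighbours(countries, borders)

# Countries with an empty neighbour set drop out of contiguity-based statistics.
isolates = sorted(c for c, nbrs in neighbours.items() if not nbrs)
print(isolates)
```

Distance-based or k-nearest-neighbour weights are common alternatives that give island states neighbours, at the cost of a less intuitive definition of adjacency.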

Surveillance bias. Countries with stronger health systems are more likely to have accurate and complete incidence data. High-burden, low-resource countries may underreport true incidence, meaning the disparities identified here could be conservative estimates.


Conclusion

The global burden of cervical cancer is not evenly distributed. It is concentrated, it is measurable, and it is linked to identifiable structural factors.

Forty-two Sub-Saharan African countries form a single statistically significant cluster where women face nearly double the expected cervical cancer burden. HIV prevalence is the strongest country-level predictor of elevated incidence, and lower urbanisation is also associated with higher burden.

The tragedy is not a lack of solutions. HPV vaccination, cervical screening, and early treatment can prevent the vast majority of cervical cancer deaths. The challenge is ensuring that the women who carry the greatest burden have access to these tools.

The data tells us where the problem is. The response requires political commitment, sustained investment, and health systems that reach women where they live.


Technical Appendix

Software and Packages

All analyses were conducted in R (version 4.6+). Key packages used:

  • spdep for spatial weights and Moran’s I
  • SpatialEpi for Kulldorff scan statistic
  • lme4 and lmerTest for multilevel regression
  • mice for multiple imputation
  • performance for ICC and R-squared
  • modelsummary and gt for publication-ready tables
  • ggplot2 and sf for all maps and figures
  • rnaturalearth for country boundary data

Model Specifications

OLS model: \[\text{inc\_rat}_i = \beta_0 + \beta_1\,\text{hpv\_vac}_i + \beta_2\,\text{hiv}_i + \beta_3\,\text{gdp}_i + \beta_4\,\text{urb}_i + \beta_5\,\text{educ}_i + \epsilon_i\]

Multilevel model (standardised predictors): \[\text{inc\_rat}_{ij} = \beta_0 + \beta_1\,\text{hpv\_vac}^{s}_{ij} + \beta_2\,\text{hiv}^{s}_{ij} + \beta_3\,\text{gdp}^{s}_{ij} + \beta_4\,\text{urb}^{s}_{ij} + \beta_5\,\text{educ}^{s}_{ij} + u_{0j} + \epsilon_{ij}\]

Where \(i\) indexes countries, \(j\) indexes UN regions, the superscript \(s\) marks a standardised predictor (mean = 0, SD = 1), and \(u_{0j}\) is the region-level random intercept.
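
For reference, the ICC reported in the main text follows from the variance components of the multilevel model. The distributional assumptions and the symbols \(\tau^2\) and \(\sigma^2\) below are the conventional ones for a random-intercept model and are not stated in the report itself: \[u_{0j} \sim N(0, \tau^2), \qquad \epsilon_{ij} \sim N(0, \sigma^2), \qquad \text{ICC} = \frac{\tau^2}{\tau^2 + \sigma^2}\]

An ICC of 0.13 therefore corresponds to a between-region variance \(\tau^2\) that makes up 13% of the total residual variance \(\tau^2 + \sigma^2\).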


Report generated using R Markdown. Data: Global Cancer Observatory (WHO), World Bank, UNDP, 2022.