Demographic Disparities in Pancreatic Cancer Incidence in the United States: An Analysis of SEER Data

Abstract

Pancreatic cancer is one of the deadliest cancers in the United States, with few early screening options and poor survival rates. Understanding demographic inequalities in incidence is crucial to focused prevention and resource allocation.

Data were obtained from the Surveillance, Epidemiology, and End Results (SEER) Program. This study used a cross-sectional, population-level design with N = 18 stratum-level observations defined by race/ethnicity, age group, and sex. The outcome was the age-adjusted pancreatic cancer incidence rate per 100,000 persons. Multivariable linear regression was used to estimate the association between race/ethnicity and incidence while adjusting for age group and sex.

In the adjusted model, Non-Hispanic Black populations had higher incidence rates than Non-Hispanic White populations (β = 5.57; 95% CI: -1.30, 12.44; p = 0.103), while Hispanic populations had lower rates (β = -3.90; 95% CI: -10.77, 2.97; p = 0.240); neither difference was statistically significant. Age group was the strongest predictor (p < 0.001), with higher rates in older groups. Males had higher incidence rates than females (β = 6.46; 95% CI: 0.85, 12.06; p = 0.028). Age and sex were the key predictors of pancreatic cancer incidence in this sample; the racial differences observed were not statistically significant.

1 Introduction

Pancreatic cancer remains one of the most aggressive and deadly cancers worldwide, with a five-year survival rate of less than 15%. Despite advances in treatment, early detection remains limited, which contributes to poor outcomes. Pancreatic cancer is a significant public health concern in the United States, owing to its high incidence and mortality rate. Epidemiological studies have found that pancreatic cancer incidence varies by age, sex, and race/ethnicity. Older people have much higher incidence rates, and males are frequently reported to be at higher risk than females. Several studies have also reported that Non-Hispanic Black individuals may have higher incidence rates than other racial groups, possibly because of differences in socioeconomic status, access to healthcare, environmental exposures, and underlying comorbidities.

2 Methods

2.1 Data Source and Study Design

Data for this study were obtained from the Surveillance, Epidemiology, and End Results (SEER) Program, maintained by the National Cancer Institute. SEER provides population-based cancer incidence data across multiple U.S. regions. This study utilized a cross-sectional design based on aggregated incidence rates.

2.2 Study Population

The analytical sample consisted of N = 18 observations, representing all combinations of:

Race/Ethnicity: Non-Hispanic White, Non-Hispanic Black, Hispanic
Age Group: Under 50, 50–64, 65+
Sex: Female, Male
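
The stratum count follows directly from crossing the three factors. The sketch below is only an illustration of that arithmetic in Python (the study's own analysis was run in R); the variable and function names are hypothetical and do not come from the analysis code.

```python
from itertools import product

# Factor levels as listed above (labels are illustrative spellings).
RACE_ETHNICITY = ["Non-Hispanic White", "Non-Hispanic Black", "Hispanic"]
AGE_GROUP = ["Under 50", "50-64", "65+"]
SEX = ["Female", "Male"]


def build_strata():
    """Return one record per race/ethnicity x age group x sex combination."""
    return [
        {"race_ethnicity": r, "age_group": a, "sex": s}
        for r, a, s in product(RACE_ETHNICITY, AGE_GROUP, SEX)
    ]


strata = build_strata()
print(len(strata))  # 3 * 3 * 2 combinations
```

Each record corresponds to one row of the regression data set, so the design is balanced: every level of one factor appears equally often with every level of the others.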

2.3 Variables

Outcome: Age-adjusted pancreatic cancer incidence rate per 100,000 persons.

Primary Exposure: Race/ethnicity (reference: Non-Hispanic White).

Covariates:

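Age group (ordered: Under 50, 50–64, 65+; it appears in Tables 2 and 3 as linear and quadratic contrast terms, age_group.L and age_group.Q, rather than as differences from a reference category)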
Sex (reference: Female)

Age group and sex were included as covariates because both are strongly associated with pancreatic cancer incidence and could confound comparisons across racial/ethnic groups.

Of the 27 strata summarized in Table 1, 9 had no recorded incidence rate and were excluded, leaving N = 18 strata for complete-case analysis. Given the aggregated nature of the dataset, missingness was not expected to substantially bias results.

2.4 Statistical Analysis

Ordinary least squares (OLS) linear regression was used to estimate differences in incidence rates.

Two models were fitted:

Unadjusted model (race only)
Adjusted model (race + age + sex)
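
Written out, the two specifications take roughly the following form. The notation is introduced here for illustration and is not taken from the analysis code: $Y_i$ is the incidence rate in stratum $i$, $\mathrm{NHB}_i$, $\mathrm{Hisp}_i$, and $\mathrm{Male}_i$ are indicator variables, and $\mathrm{Age}^{L}_i$ and $\mathrm{Age}^{Q}_i$ stand for the linear and quadratic age-group terms labelled age_group.L and age_group.Q in Table 2.

```latex
\begin{align}
\text{Model 1:}\quad Y_i &= \beta_0 + \beta_1\,\mathrm{NHB}_i + \beta_2\,\mathrm{Hisp}_i + \varepsilon_i \\
\text{Model 2:}\quad Y_i &= \beta_0 + \beta_1\,\mathrm{NHB}_i + \beta_2\,\mathrm{Hisp}_i
  + \gamma_1\,\mathrm{Age}^{L}_i + \gamma_2\,\mathrm{Age}^{Q}_i
  + \delta\,\mathrm{Male}_i + \varepsilon_i
\end{align}
```

Under this reading, Model 2 estimates six coefficients from 18 observations, which leaves 12 residual degrees of freedom.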

All analyses were conducted in R using standard statistical packages. Statistical significance was defined as α = 0.05.

3 Results

3.1 Descriptive Statistics

The analytical sample included 18 demographic strata representing combinations of race, age group, and sex. Incidence rates were lowest in individuals under 50 and highest in the 65+ age group.

3.2 Regression Results

Unadjusted Model: No statistically significant differences were observed between racial groups. Relative to Non-Hispanic White populations, Non-Hispanic Black populations had higher rates (β = 5.57; 95% CI: -39.84, 50.97; p = 0.797) and Hispanic populations had lower rates (β = -3.90; 95% CI: -49.31, 41.51; p = 0.857); both confidence intervals were wide and included zero.

Adjusted Model: After adjusting for age and sex:

Non-Hispanic Black: β = 5.57 (95% CI: -1.30, 12.44; p = 0.103)
Hispanic: β = -3.90 (95% CI: -10.77, 2.97; p = 0.240)

Neither association reached statistical significance. Age group was the strongest predictor (p < 0.001), with substantially higher incidence in older groups. Males also had significantly higher rates than females (β = 6.46; 95% CI: 0.85, 12.06; p = 0.028).
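
The reported intervals and p-values can be cross-checked against each other. The sketch below, in Python rather than the R used for the analysis, backs out a standard error from each 95% CI and recomputes a two-sided p-value. It assumes 12 residual degrees of freedom (18 observations minus 6 coefficients), which is inferred here rather than stated in the source, and it works from the rounded values in Table 2, so small discrepancies are expected. The function names are illustrative.

```python
import math

# Two-sided 95% critical value of Student's t with 12 df (assumed df = 18 - 6).
T_CRIT_DF12 = 2.17881


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the t density over [0, |t|]."""
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central_mass = 2 * (h / 3) * total  # P(-|t| < T < |t|)
    return 1 - central_mass


def summarize(beta, ci_low, ci_high, df=12, t_crit=T_CRIT_DF12):
    """Recover SE, t statistic, and two-sided p from an estimate and its 95% CI."""
    se = (ci_high - ci_low) / (2 * t_crit)
    t_stat = beta / se
    return se, t_stat, two_sided_p(t_stat, df)


# Rounded adjusted-model estimates as reported above.
TERMS = {
    "Non-Hispanic Black": (5.57, -1.30, 12.44),
    "Hispanic": (-3.90, -10.77, 2.97),
    "Male": (6.46, 0.85, 12.06),
}

for name, (beta, low, high) in TERMS.items():
    se, t_stat, p = summarize(beta, low, high)
    print(f"{name}: SE={se:.2f}, t={t_stat:.2f}, p={p:.3f}")
```

Under these assumptions the recomputed p-values agree with the reported ones to within rounding, which suggests the intervals and p-values come from the same fitted model.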

3.3 Model Diagnostics

Model diagnostics indicated that key assumptions were met:

Residuals were approximately normally distributed
No strong heteroscedasticity was observed
No observation exceeded Cook’s D = 1, although several 65+ strata exceeded the stricter 4/N rule-of-thumb threshold (Figure 4)

3.4 Visualizations

Figures demonstrated:

Overlapping confidence intervals across racial groups
Strong effect sizes for age
Positive association for male sex

4 Discussion

This study examined demographic differences in pancreatic cancer incidence using SEER registry data. The primary finding is that age and sex are the strongest predictors of pancreatic cancer incidence, while racial differences were not statistically significant after adjustment.

The strong association between age and incidence is consistent with established literature, as pancreatic cancer risk increases substantially with aging. Similarly, the higher incidence observed among males aligns with prior epidemiological findings, potentially reflecting behavioral, hormonal, or environmental differences.

Although Non-Hispanic Black populations exhibited higher incidence rates and Hispanic populations exhibited lower rates relative to Non-Hispanic White populations, these differences were not statistically significant. This may be due to limited statistical power given the small sample size (N = 18) or the use of aggregated data, which may mask within-group variability.

Several limitations should be considered. First, the small sample size reduces statistical power and limits the ability to detect significant differences. Second, the use of aggregated stratum-level data prevents adjustment for individual-level confounders such as smoking, obesity, and access to healthcare. Third, potential residual confounding and measurement limitations inherent to registry data may influence results.

Despite these limitations, this study highlights the dominant role of age and sex in pancreatic cancer incidence and suggests that observed racial differences may require further investigation using larger, individual-level datasets.

From a public health perspective, these findings support the need for age-targeted screening strategies and continued investigation into structural determinants of cancer disparities.

Table 1. Descriptive Statistics for the Analytical Sample

**Table 1. Descriptive Statistics, Stratified by Race/Ethnicity (27 strata; 18 with an observed incidence rate)**

| Characteristic | Overall (N = 27) | Non-Hispanic White (N = 9) | Non-Hispanic Black (N = 9) | Hispanic (N = 9) |
|---|---|---|---|---|
| Age Group, n (%) | | | | |
| Under 50 | 9 (33%) | 3 (33%) | 3 (33%) | 3 (33%) |
| 50-64 | 9 (33%) | 3 (33%) | 3 (33%) | 3 (33%) |
| 65+ | 9 (33%) | 3 (33%) | 3 (33%) | 3 (33%) |
| Sex, n (%) | | | | |
| Female | 18 (67%) | 6 (67%) | 6 (67%) | 6 (67%) |
| Male | 9 (33%) | 3 (33%) | 3 (33%) | 3 (33%) |
| Incidence rate per 100,000, mean (SD) | 34.47 (34.89) | 33.92 (37.33) | 39.48 (40.41) | 30.02 (32.52) |
| Incidence rate unknown, n | 9 | 3 | 3 | 3 |

Data source: SEER*Explorer, 2018-2022, 21 registries. Continuous variables: mean (SD). Categorical variables: n (%). Incidence rates are age-adjusted per 100,000 persons. Counts refer to all 27 strata; the 9 strata with an unknown incidence rate are not part of the N = 18 regression sample.

Table 2. Unadjusted and Adjusted Linear Regression Results

**Table 2. Linear Regression of Pancreatic Cancer Incidence Rate on Race/Ethnicity, Unadjusted (Model 1) and Adjusted for Age Group and Sex (Model 2)**

| Characteristic | Model 1 β | Model 1 95% CI | Model 1 p-value | Model 2 β | Model 2 95% CI | Model 2 p-value |
|---|---|---|---|---|---|---|
| (Intercept) | 33.92 | 1.81, 66.02 | 0.040 | 30.69 | 25.08, 36.30 | <0.001 |
| Race/Ethnicity | | | | | | |
| Non-Hispanic White | Reference | | | Reference | | |
| Non-Hispanic Black | 5.57 | -39.84, 50.97 | 0.797 | 5.57 | -1.30, 12.44 | 0.103 |
| Hispanic | -3.90 | -49.31, 41.51 | 0.857 | -3.90 | -10.77, 2.97 | 0.240 |
| Age Group | | | | | | |
| age_group.L | | | | 55.51 | 50.65, 60.37 | <0.001 |
| age_group.Q | | | | 15.21 | 10.36, 20.07 | <0.001 |
| Sex | | | | | | |
| Female | | | | Reference | | |
| Male | | | | 6.46 | 0.85, 12.06 | 0.028 |

Outcome: age-adjusted pancreatic cancer incidence rate per 100,000 persons. Reference categories: Non-Hispanic White (race/ethnicity); Female (sex). Age group is an ordered factor; age_group.L and age_group.Q are its linear and quadratic contrast terms, not differences from the Under 50 group. β = regression coefficient (cases per 100,000). CI = confidence interval. N = 18 stratum-level observations.
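
The age_group.L and age_group.Q rows are not differences from a reference group, so they are harder to read than the race and sex rows. The Python sketch below converts them into differences relative to the Under 50 group. It assumes the terms are the orthogonal polynomial contrasts that R assigns by default to a three-level ordered factor; that coding is inferred from the term names and is not stated in the source, and the function names are illustrative.

```python
import math

AGE_LEVELS = ["Under 50", "50-64", "65+"]

# Orthonormal linear and quadratic contrast scores for three ordered levels
# (assumed coding; matches R's contr.poly(3)).
LINEAR = [-1 / math.sqrt(2), 0.0, 1 / math.sqrt(2)]
QUADRATIC = [1 / math.sqrt(6), -2 / math.sqrt(6), 1 / math.sqrt(6)]

# Rounded Model 2 coefficients from Table 2.
BETA_L, BETA_Q = 55.51, 15.21


def age_effect(level):
    """Contribution of the age terms to the fitted rate for one age group."""
    i = AGE_LEVELS.index(level)
    return BETA_L * LINEAR[i] + BETA_Q * QUADRATIC[i]


def difference_vs_under_50(level):
    """Fitted difference in rate (per 100,000) relative to the Under 50 group."""
    return age_effect(level) - age_effect("Under 50")


for level in AGE_LEVELS[1:]:
    print(f"{level} vs Under 50: {difference_vs_under_50(level):.2f}")
```

Under this assumed coding, the fitted differences are roughly 20.6 (50-64) and 78.5 (65+) cases per 100,000 relative to Under 50. Because the contrast scores sum to zero, the intercept in Model 2 corresponds to Non-Hispanic White females averaged over the three age groups rather than to the Under 50 group.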

Table 3. Sensitivity Analysis — Race × Age Interaction Model

**Table 3. Sensitivity Analysis: Adjusted Model with Race × Age Group Interaction (Model 3)**

| Characteristic | β | 95% CI | p-value |
|---|---|---|---|
| Race/Ethnicity | | | |
| Non-Hispanic White | Reference | | |
| Non-Hispanic Black | 5.57 | -0.95, 12.09 | 0.084 |
| Hispanic | -3.90 | -10.42, 2.62 | 0.205 |
| Age Group | | | |
| age_group.L | 55.58 | 47.59, 63.56 | <0.001 |
| age_group.Q | 16.37 | 8.39, 24.35 | 0.001 |
| Sex | | | |
| Female | Reference | | |
| Male | 6.46 | 1.13, 11.78 | 0.023 |
| Race/Ethnicity × Age Group | | | |
| Non-Hispanic Black × age_group.L | 6.26 | -5.03, 17.55 | 0.237 |
| Hispanic × age_group.L | -6.47 | -17.76, 4.82 | 0.223 |
| Non-Hispanic Black × age_group.Q | -1.82 | -13.11, 9.47 | 0.720 |
| Hispanic × age_group.Q | -1.65 | -12.94, 9.64 | 0.744 |

Interaction terms test whether the racial disparity in incidence varies by age group. Reference categories: Non-Hispanic White (race/ethnicity); Female (sex). Age group is an ordered factor entered as linear (age_group.L) and quadratic (age_group.Q) contrast terms. N = 18; interpret with caution (limited df). ANOVA F-test p-value for interaction terms reported in text. CI = confidence interval.

Figure 1. Adjusted Predicted Pancreatic Cancer Incidence Rates by Race/Ethnicity

Figure 1. Adjusted predicted pancreatic cancer incidence rates (per 100,000 persons) by race/ethnicity from the multivariable linear regression model (Model 2), holding age group at 50–64 and sex at Female. Points represent predicted values; error bars represent 95% confidence intervals.

Figure 2. Coefficient Forest Plot — Adjusted Model (Model 2)

Figure 2. Coefficient (forest) plot showing the estimated coefficients, in units of age-adjusted pancreatic cancer incidence rate (per 100,000 persons), for each predictor in the adjusted model (Model 2). Points represent point estimates; horizontal lines represent 95% confidence intervals. Dashed vertical line indicates the null (β = 0). Reference categories: Non-Hispanic White (race/ethnicity), Female (sex). Age group is represented by its linear and quadratic contrast terms (age_group.L, age_group.Q) as in Table 2.

Figure 3. Standard Diagnostic Plots — Adjusted Model (Model 2)

Figure 3. Standard linear regression diagnostic plots for the adjusted model (Model 2): Residuals vs. Fitted (linearity and homoscedasticity), Normal Q-Q (normality of residuals), Scale-Location (homoscedasticity), and Residuals vs. Leverage (influential observations). No observation exceeds the Cook’s distance threshold of 1.

Figure 4. Cook’s Distance — Influential Observation Check

Figure 4. Cook’s distance for each observation (stratum) in the adjusted model (Model 2). Red dashed line: Cook’s D = 1 (conventional influential threshold). Orange dashed line: 4/N = 0.22 rule-of-thumb threshold. No observation exceeds Cook’s D = 1; several 65+ strata exceed the 4/N threshold. Because the design is balanced, leverage is the same for every stratum, so these larger values reflect larger residuals in the 65+ strata rather than unusual leverage.
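
The two thresholds drawn in Figure 4 can be reproduced with a few lines of arithmetic. The Python sketch below computes both cut-offs for N = 18 and classifies a value against them; the Cook's distance values in the list are hypothetical placeholders for illustration, not values from the fitted model, and the function names are not from the analysis code.

```python
def cooks_thresholds(n_obs):
    """Return the two cut-offs used in Figure 4 for a sample of n_obs strata."""
    return {"conventional": 1.0, "four_over_n": 4 / n_obs}


def classify(cooks_d, n_obs):
    """Label one Cook's distance value against both thresholds."""
    limits = cooks_thresholds(n_obs)
    if cooks_d > limits["conventional"]:
        return "exceeds D = 1"
    if cooks_d > limits["four_over_n"]:
        return "exceeds 4/N only"
    return "below both"


N_STRATA = 18
print(round(cooks_thresholds(N_STRATA)["four_over_n"], 2))  # 4 / 18

# Hypothetical Cook's distances, for illustration only.
HYPOTHETICAL_D = [0.03, 0.15, 0.41]
for d in HYPOTHETICAL_D:
    print(d, classify(d, N_STRATA))
```

The 4/N rule is deliberately stricter than D = 1, so a stratum flagged only by 4/N warrants a closer look rather than automatic exclusion.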