## Final analytical sample N = 27
## Race/Ethnicity:
##
## Non-Hispanic White Non-Hispanic Black           Hispanic
##                  9                  9                  9
##
## Sex (male):
##
## Female   Male
##     18      9
##
## Age Group:
##
## Under 50    50-64      65+
##        9        9        9
Pancreatic cancer is one of the deadliest cancers in the United States, with few early screening options and poor survival rates. Understanding demographic inequalities in incidence is essential for targeted prevention and resource allocation.
Data were obtained from the Surveillance, Epidemiology, and End Results (SEER) Program. This study used a cross-sectional, population-level design with N = 18 stratum-level observations defined by race/ethnicity, age group, and sex. The outcome was the age-adjusted pancreatic cancer incidence rate per 100,000 persons. Multivariable linear regression was used to estimate the association between race/ethnicity and incidence while adjusting for age group and sex.
In the adjusted model, Non-Hispanic Black populations had higher incidence rates than Non-Hispanic White populations (β = 5.57; 95% CI: -1.30, 12.44; p = 0.103) and Hispanic populations had lower rates (β = -3.90; 95% CI: -10.77, 2.97; p = 0.240), but neither difference was statistically significant. Age group was the strongest predictor (p < 0.001), with higher rates in older groups. Males had significantly higher incidence rates than females (β = 6.46; p = 0.0275). Age and sex were the key predictors of pancreatic cancer incidence in this analysis; the estimated racial differences were not statistically significant.
Pancreatic cancer remains one of the most aggressive and deadly cancers worldwide, with a five-year survival rate of less than 15%. Despite advances in treatment, early detection is still limited, leading to poor outcomes. Pancreatic cancer is a significant public health concern in the United States, owing to its high incidence and mortality rate. Epidemiological studies have found that pancreatic cancer incidence varies by age, sex, and race/ethnicity. Older people have much higher incidence rates, and males are frequently reported to have a higher risk than females. Furthermore, several studies have shown that Non-Hispanic Black individuals may have higher incidence rates than other racial groups, possibly due to differences in socioeconomic status, access to healthcare, environmental exposures, and underlying comorbidities.
Data for this study were obtained from the Surveillance, Epidemiology, and End Results (SEER) Program, maintained by the National Cancer Institute. SEER provides population-based cancer incidence data across multiple U.S. regions. This study utilized a cross-sectional design based on aggregated incidence rates.
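The analytical sample consisted of N = 18 observations, representing all combinations of race/ethnicity (Non-Hispanic White, Non-Hispanic Black, Hispanic), age group (Under 50, 50-64, 65+), and sex (Female, Male).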
Outcome: Age-adjusted pancreatic cancer incidence rate per 100,000 persons.
Primary Exposure: Race/ethnicity (reference: Non-Hispanic White).
Covariates: Age group and sex, which were included as potential confounders because of their strong epidemiological associations with pancreatic cancer incidence.
Incidence rates were unavailable for 9 of the 27 strata (listed as "Unknown" in the descriptive table), leaving 18 complete cases for analysis. Given the aggregated nature of the dataset, missingness was not expected to substantially bias results.
Ordinary least squares (OLS) linear regression was used to estimate differences in incidence rates.
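Two models were fitted: an unadjusted model containing race/ethnicity only (Model 1) and an adjusted model that added age group and sex (Model 2).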
All analyses were conducted in R using standard statistical packages. Statistical significance was defined as α = 0.05.
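The two-model strategy described above can be sketched in R roughly as follows. This is an illustration under stated assumptions, not the original analysis script: the data frame `strata` and its `rate` values are simulated placeholders (not SEER data), and the variable names are hypothetical. Age group is assumed to be an ordered factor, which is what would produce the `age_group.L` and `age_group.Q` terms seen in the regression tables.

```r
# Hypothetical stratum-level data: 3 race/ethnicity x 3 age groups x 2 sexes = 18 rows.
# The rates are simulated placeholders, NOT SEER values.
set.seed(1)
strata <- expand.grid(
  race      = c("Non-Hispanic White", "Non-Hispanic Black", "Hispanic"),
  age_group = c("Under 50", "50-64", "65+"),
  sex       = c("Female", "Male")
)
# expand.grid() turns strings into factors with levels in the order given,
# so the first level of race and sex acts as the reference category.
strata$age_group <- as.ordered(strata$age_group)
strata$rate <- c(2, 20, 80)[as.integer(strata$age_group)] +
  5 * (strata$sex == "Male") + rnorm(nrow(strata), sd = 3)

# Model 1: unadjusted (race/ethnicity only)
m1 <- lm(rate ~ race, data = strata)

# Model 2: adjusted for age group and sex
m2 <- lm(rate ~ race + age_group + sex, data = strata)

# Coefficients with 95% confidence intervals for the adjusted model
cbind(estimate = coef(m2), confint(m2, level = 0.95))
```

With 18 observations and six coefficients, the adjusted model leaves 12 residual degrees of freedom, which is why its confidence intervals remain fairly wide.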
The analytical sample included 18 demographic strata representing combinations of race, age group, and sex. Incidence rates were lowest in individuals under 50 and highest in the 65+ age group.
Unadjusted Model: No statistically significant differences were observed between racial groups. Non-Hispanic Black populations had higher rates than Non-Hispanic White populations, while Hispanic populations had lower rates; however, confidence intervals were wide and included zero.
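Adjusted Model: After adjusting for age and sex, Non-Hispanic Black populations had higher incidence rates than Non-Hispanic White populations (β = 5.57; 95% CI: -1.30, 12.44; p = 0.103), and Hispanic populations had lower rates (β = -3.90; 95% CI: -10.77, 2.97; p = 0.240). Neither association reached statistical significance. Age was the strongest predictor (p < 0.001), with substantially higher incidence in older groups. Males also had significantly higher rates compared to females (β = 6.46; 95% CI: 0.85, 12.06; p = 0.0275).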
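As a quick consistency check, the reported p-values can be approximately recovered from the reported coefficients and confidence intervals. The sketch below assumes the adjusted model has 18 - 6 = 12 residual degrees of freedom (intercept, two race terms, two age terms, one sex term); the helper `p_from_ci` is a hypothetical name introduced only for this illustration.

```r
# Residual degrees of freedom assumed for the adjusted model:
# 18 observations minus 6 estimated coefficients.
df_resid <- 18 - 6

# Recover the standard error from a 95% CI, then the t statistic
# and the two-sided p-value.
p_from_ci <- function(beta, lower, upper, df) {
  se <- (upper - lower) / (2 * qt(0.975, df))
  t_stat <- beta / se
  2 * pt(-abs(t_stat), df)
}

p_black <- p_from_ci( 5.57,  -1.30, 12.44, df_resid)  # Non-Hispanic Black vs. White
p_hisp  <- p_from_ci(-3.90, -10.77,  2.97, df_resid)  # Hispanic vs. White
p_male  <- p_from_ci( 6.46,   0.85, 12.06, df_resid)  # Male vs. Female

round(c(black = p_black, hispanic = p_hisp, male = p_male), 3)
```

Because the inputs are rounded to two decimals, the recovered values should agree with the reported p-values only approximately.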
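Model diagnostics indicated that key assumptions were met: the residual plots (Figure 3) were used to assess linearity, homoscedasticity, normality of residuals, and influential observations, and no observation exceeded a Cook's distance of 1 (Figure 4).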
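Figures demonstrated: the adjusted predicted incidence rates by race/ethnicity (Figure 1) and the coefficient estimates with 95% confidence intervals for each predictor in the adjusted model (Figure 2).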
This study examined demographic differences in pancreatic cancer incidence using SEER registry data. The primary finding is that age and sex are the strongest predictors of pancreatic cancer incidence, while racial differences were not statistically significant after adjustment.
The strong association between age and incidence is consistent with established literature, as pancreatic cancer risk increases substantially with aging. Similarly, the higher incidence observed among males aligns with prior epidemiological findings, potentially reflecting behavioral, hormonal, or environmental differences.
Although Non-Hispanic Black populations exhibited higher incidence rates and Hispanic populations exhibited lower rates relative to Non-Hispanic White populations, these differences were not statistically significant. This may be due to limited statistical power given the small sample size (N = 18) or the use of aggregated data, which may mask within-group variability.
Several limitations should be considered. First, the small sample size reduces statistical power and limits the ability to detect significant differences. Second, the use of aggregated stratum-level data prevents adjustment for individual-level confounders such as smoking, obesity, and access to healthcare. Third, potential residual confounding and measurement limitations inherent to registry data may influence results.
Despite these limitations, this study highlights the dominant role of age and sex in pancreatic cancer incidence and suggests that observed racial differences may require further investigation using larger, individual-level datasets.
From a public health perspective, these findings support the need for age-targeted screening strategies and continued investigation into structural determinants of cancer disparities.
| Characteristic¹ | Overall, N = 27¹ | Non-Hispanic White, N = 9¹ | Non-Hispanic Black, N = 9¹ | Hispanic, N = 9¹ |
|---|---|---|---|---|
| Age Group | | | | |
| Under 50 | 9 (33%) | 3 (33%) | 3 (33%) | 3 (33%) |
| 50-64 | 9 (33%) | 3 (33%) | 3 (33%) | 3 (33%) |
| 65+ | 9 (33%) | 3 (33%) | 3 (33%) | 3 (33%) |
| Sex | | | | |
| Female | 18 (67%) | 6 (67%) | 6 (67%) | 6 (67%) |
| Male | 9 (33%) | 3 (33%) | 3 (33%) | 3 (33%) |
| Incidence Rate (per 100,000) | 34.47 (34.89) | 33.92 (37.33) | 39.48 (40.41) | 30.02 (32.52) |
| Unknown | 9 | 3 | 3 | 3 |

¹ Data source: SEER*Explorer, 2018-2022, 21 registries. Continuous variables: mean (SD). Categorical variables: n (%). Incidence rates are age-adjusted per 100,000 persons.
| Characteristic¹ | Model 1 (Unadjusted): β | Model 1: 95% CI | Model 1: p-value | Model 2 (Adjusted): β | Model 2: 95% CI | Model 2: p-value |
|---|---|---|---|---|---|---|
| (Intercept) | 33.92 | 1.81, 66.02 | 0.040 | 30.69 | 25.08, 36.30 | <0.001 |
| Race/Ethnicity | | | | | | |
| Non-Hispanic White | — | — | | — | — | |
| Non-Hispanic Black | 5.57 | -39.84, 50.97 | 0.797 | 5.57 | -1.30, 12.44 | 0.103 |
| Hispanic | -3.90 | -49.31, 41.51 | 0.857 | -3.90 | -10.77, 2.97 | 0.240 |
| Age Group | | | | | | |
| age_group.L | | | | 55.51 | 50.65, 60.37 | <0.001 |
| age_group.Q | | | | 15.21 | 10.36, 20.07 | <0.001 |
| Sex | | | | | | |
| Female | | | | — | — | |
| Male | | | | 6.46 | 0.85, 12.06 | 0.028 |

¹ Outcome: age-adjusted pancreatic cancer incidence rate per 100,000 persons. Reference categories: Non-Hispanic White (race/ethnicity); Female (sex). The age_group.L and age_group.Q rows are the linear and quadratic polynomial contrast terms that R reports for an ordered factor, so they are not differences from the Under 50 group. β = regression coefficient (cases per 100,000). CI = 95% confidence interval. N = 18 stratum-level observations.

Abbreviation: CI = Confidence Interval
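The `age_group.L` and `age_group.Q` rows are the labels R uses for orthogonal polynomial contrasts on an ordered factor, so they are not direct differences from the Under 50 group. A minimal sketch of how the reported Model 2 estimates could be translated into differences from the youngest group is shown below; it assumes the default `contr.poly` coding for a three-level ordered factor and uses only the two coefficients reported in the table.

```r
# Orthogonal polynomial contrasts for a 3-level ordered factor
# (rows: Under 50, 50-64, 65+; columns: linear, quadratic).
C <- contr.poly(3)

# Reported Model 2 coefficients for age group
b <- c(L = 55.51, Q = 15.21)

# Age contribution to the fitted rate at each level. These are deviations
# from the intercept, which sits at the average over the three age groups
# rather than at "Under 50".
age_effect <- setNames(as.vector(C %*% b), c("Under 50", "50-64", "65+"))

# Differences from the youngest group, comparable to treatment-coded estimates
diff_vs_under50 <- age_effect - age_effect["Under 50"]
round(diff_vs_under50, 2)
```

Read this way, the fitted rate rises with each successive age group, and the positive quadratic term indicates that the increase from 50-64 to 65+ is larger than the increase from Under 50 to 50-64.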
| Characteristic¹ | Beta | 95% CI | p-value |
|---|---|---|---|
| Race/Ethnicity | | | |
| Non-Hispanic White | — | — | |
| Non-Hispanic Black | 5.57 | -0.95, 12.09 | 0.084 |
| Hispanic | -3.90 | -10.42, 2.62 | 0.205 |
| Age Group | | | |
| age_group.L | 55.58 | 47.59, 63.56 | <0.001 |
| age_group.Q | 16.37 | 8.39, 24.35 | 0.001 |
| Sex | | | |
| Female | — | — | |
| Male | 6.46 | 1.13, 11.78 | 0.023 |
| Race/Ethnicity * Age Group | | | |
| Non-Hispanic Black * age_group.L | 6.26 | -5.03, 17.55 | 0.237 |
| Hispanic * age_group.L | -6.47 | -17.76, 4.82 | 0.223 |
| Non-Hispanic Black * age_group.Q | -1.82 | -13.11, 9.47 | 0.720 |
| Hispanic * age_group.Q | -1.65 | -12.94, 9.64 | 0.744 |

¹ Interaction terms test whether the racial disparity in incidence varies by age group. Reference categories: Non-Hispanic White; Female. The age_group.L and age_group.Q terms are linear and quadratic polynomial contrasts for the ordered age-group factor, not differences from the Under 50 group. N = 18; interpret with caution (limited df). ANOVA F-test p-value for interaction terms reported in text.

Abbreviation: CI = Confidence Interval
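The joint test of the four interaction terms mentioned in the footnote is typically obtained by comparing the additive and interaction models with a nested F-test. The sketch below shows one way this could be done with `anova()`; the data are simulated placeholders (not SEER values) and the object names are hypothetical.

```r
# Simulated 18-row stratum data (placeholder values, NOT SEER data)
set.seed(2)
strata <- expand.grid(
  race      = c("Non-Hispanic White", "Non-Hispanic Black", "Hispanic"),
  age_group = c("Under 50", "50-64", "65+"),
  sex       = c("Female", "Male")
)
strata$age_group <- as.ordered(strata$age_group)
strata$rate <- c(2, 20, 80)[as.integer(strata$age_group)] +
  5 * (strata$sex == "Male") + rnorm(nrow(strata), sd = 3)

# Additive model (race + age group + sex) and the race-by-age interaction model
m_add <- lm(rate ~ race + age_group + sex, data = strata)
m_int <- lm(rate ~ race * age_group + sex, data = strata)

# Nested F-test: do the four interaction terms jointly improve the fit?
f_test <- anova(m_add, m_int)
f_test
```

With N = 18, the interaction model estimates 10 coefficients and leaves only 8 residual degrees of freedom, so this test has little power and should be read cautiously.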
Figure 1. Adjusted predicted pancreatic cancer incidence rates (per 100,000 persons) by race/ethnicity from the multivariable linear regression model (Model 2), holding age group at 50–64 and sex at Female. Points represent predicted values; error bars represent 95% confidence intervals.
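Predicted values of the kind plotted in Figure 1 can be obtained from a fitted model with `predict()`, holding the covariates at fixed levels. The following is a sketch with simulated placeholder data (not SEER values) and hypothetical object names; only the choice of 50-64 and Female as the held-constant levels comes from the caption.

```r
# Simulated 18-row stratum data (placeholder values, NOT SEER data)
set.seed(3)
strata <- expand.grid(
  race      = c("Non-Hispanic White", "Non-Hispanic Black", "Hispanic"),
  age_group = c("Under 50", "50-64", "65+"),
  sex       = c("Female", "Male")
)
strata$age_group <- as.ordered(strata$age_group)
strata$rate <- c(2, 20, 80)[as.integer(strata$age_group)] +
  5 * (strata$sex == "Male") + rnorm(nrow(strata), sd = 3)

fit <- lm(rate ~ race + age_group + sex, data = strata)

# One row per race/ethnicity group, with age group fixed at 50-64 and sex at Female.
# Factor levels are copied from the fitted data so the coding matches the model.
newdata <- data.frame(
  race      = factor(levels(strata$race), levels = levels(strata$race)),
  age_group = factor("50-64", levels = levels(strata$age_group), ordered = TRUE),
  sex       = factor("Female", levels = levels(strata$sex))
)

# Predicted mean rate with a 95% confidence interval for each group
pred <- predict(fit, newdata = newdata, interval = "confidence", level = 0.95)
cbind(newdata["race"], pred)
```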
Figure 2. Coefficient (forest) plot showing estimated differences in age-adjusted pancreatic cancer incidence rate (per 100,000 persons) for each predictor in the adjusted model (Model 2) relative to reference categories. Points represent point estimates; horizontal lines represent 95% confidence intervals. Dashed vertical line indicates the null (β = 0). Reference categories: Non-Hispanic White (race/ethnicity), Under 50 (age group), Female (sex).
Figure 3. Standard linear regression diagnostic plots for the adjusted model (Model 2): Residuals vs. Fitted (linearity and homoscedasticity), Normal Q-Q (normality of residuals), Scale-Location (homoscedasticity), and Residuals vs. Leverage (influential observations). No observation exceeds a Cook’s distance of 1.
Figure 4. Cook’s distance for each observation (stratum) in the adjusted model (Model 2). Red dashed line: Cook’s D = 1 (conventional influential threshold). Orange dashed line: 4/N = 0.22 rule-of-thumb threshold. No observation exceeds Cook’s D = 1; several 65+ strata exceed the 4/N threshold, reflecting their high-leverage incidence values.
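The two thresholds in Figure 4 can be checked with `cooks.distance()`. The sketch below uses simulated placeholder data (not SEER values) and hypothetical column names; only the cutoffs of 1 and 4/N follow the caption.

```r
# Simulated 18-row stratum data (placeholder values, NOT SEER data)
set.seed(4)
strata <- expand.grid(
  race      = c("Non-Hispanic White", "Non-Hispanic Black", "Hispanic"),
  age_group = c("Under 50", "50-64", "65+"),
  sex       = c("Female", "Male")
)
strata$age_group <- as.ordered(strata$age_group)
strata$rate <- c(2, 20, 80)[as.integer(strata$age_group)] +
  5 * (strata$sex == "Male") + rnorm(nrow(strata), sd = 3)

fit <- lm(rate ~ race + age_group + sex, data = strata)

# Cook's distance for each stratum and the two thresholds from the caption
cooks <- cooks.distance(fit)
n <- nobs(fit)
flagged <- data.frame(
  strata,
  cooks_d        = cooks,
  above_4_over_n = cooks > 4 / n,  # rule of thumb: 4/18, about 0.22
  above_1        = cooks > 1       # conventional influence threshold
)

# Strata exceeding the more sensitive 4/N threshold, if any
flagged[flagged$above_4_over_n, ]
```

The 4/N rule is deliberately sensitive in small samples, so strata flagged by it are candidates for closer inspection rather than automatic exclusion.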