This study used a one-sample enrichment framework to evaluate whether the proportion of individuals carrying pathogenic TNFRSF13B (TACI) variants in a clinical cohort was greater than expected based on population-level allele frequency data from the Genome Aggregation Database (gnomAD).
The clinical cohort consisted of 194 individuals, of whom 18 were identified as carriers of at least one pathogenic TNFRSF13B variant. Carrier status was defined at the individual level as the presence of one or more pathogenic variants classified as disease-associated in the literature and curated variant databases.
The observed carrier proportion was calculated as the number of carriers divided by the total sample size.
Population-level variant frequencies were obtained from gnomAD. For each carrier in the clinical cohort, the allele count (AC) and allele number (AN) of that individual's pathogenic TNFRSF13B variant were retrieved, and both quantities were summed across carriers. The pooled allele frequency (AF) was calculated as the summed allele count divided by the summed allele number.
Because gnomAD reports allele-level rather than individual-level carrier data, the probability of carrying at least one pathogenic allele was approximated using a rare-variant assumption:
P(carrier)≈2×AF
This approximation assumes Hardy–Weinberg equilibrium and low allele frequency such that homozygous contributions are negligible.
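The step from allele frequency to carrier probability can be written out explicitly. The short derivation below is a sketch under the stated Hardy–Weinberg assumption; the symbol q is introduced here as shorthand for the pooled AF and does not appear in the original analysis.

```latex
\[
\begin{aligned}
P(\text{carrier}) &= \underbrace{2q(1-q)}_{\text{heterozygous}} + \underbrace{q^{2}}_{\text{homozygous}} \\
                  &= 1 - (1-q)^{2} \\
                  &= 2q - q^{2} \;\approx\; 2q \qquad \text{when } q \ll 1
\end{aligned}
\]
```

The term dropped by the approximation is q squared, so 2 × AF slightly overstates the exact carrier probability, and the overstatement shrinks as the allele becomes rarer.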
Analyses
Clinical Cohort Data
Number of patients in clinical sample with a pathogenic TACI variant
patho_TACI <- 18
Total number of patients in clinical sample (with and without a pathogenic TACI variant)
total_TACI <- 194
Number of patients in clinical sample without a pathogenic TACI variant
normal_TACI <- total_TACI - patho_TACI
normal_TACI
## [1] 176
Calculated proportion of patients in clinical sample with pathogenic TACI
patho_PROP <- patho_TACI / total_TACI
patho_PROP
## [1] 0.09278351
Pathogenic TACI Allele Population Frequency
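Estimated from gnomAD. For each carrier in the clinical sample, the gnomAD allele count and allele number were recorded based on that individual's pathogenic TACI variant, race, and gender. The gnomAD "allele count" and "allele number" columns were then summed across carriers.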
variants <- data.frame(
participant = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18"),
allele_count = c(4173,
4173,
3253,
3465,
3465,
4173,
4173,
4173,
3465,
348,
4173,
3465,
4173,
4173,
67,
348,
3465,
3465),
allele_number = c(611636,
611636,
568386,
611632,
611632,
611636,
611636,
611636,
611632,
611632,
611636,
611632,
611636,
611636,
568388,
611632,
611632,
611632)
)
variants
## participant allele_count allele_number
## 1 1 4173 611636
## 2 2 4173 611636
## 3 3 3253 568386
## 4 4 3465 611632
## 5 5 3465 611632
## 6 6 4173 611636
## 7 7 4173 611636
## 8 8 4173 611636
## 9 9 3465 611632
## 10 10 348 611632
## 11 11 4173 611636
## 12 12 3465 611632
## 13 13 4173 611636
## 14 14 4173 611636
## 15 15 67 568388
## 16 16 348 611632
## 17 17 3465 611632
## 18 18 3465 611632
Total pathogenic alleles
patho_al <- sum(variants$allele_count)
patho_al
## [1] 58190
Total alleles (pathogenic + non-pathogenic)
total_al <- sum(variants$allele_number)
total_al
## [1] 10922918
Pathogenic TACI allele frequency in gnomAD sample
freq_al <- patho_al / total_al
freq_al
## [1] 0.005327331
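Because the table has one row per carrier, the pooled frequency above weights each gnomAD entry by how many cohort carriers list it. As a sensitivity check, the sketch below contrasts that pooled value with a cumulative frequency obtained by summing per-variant frequencies once each. It assumes, hypothetically, that every distinct (allele_count, allele_number) pair in the table corresponds to one distinct variant; the names variant_table, n_carriers, pooled_af, and cumulative_af are illustrative and not part of the original analysis.

```r
# Sensitivity sketch. ASSUMPTION: each distinct (AC, AN) pair is one variant.
variant_table <- data.frame(
  allele_count  = c(4173, 3253, 3465, 348, 67),
  allele_number = c(611636, 568386, 611632, 611632, 568388),
  n_carriers    = c(8, 1, 6, 2, 1)  # cohort carriers listing each pair
)

# Per-variant allele frequency in gnomAD
variant_table$af <- variant_table$allele_count / variant_table$allele_number

# Pooled estimate as computed in the analysis (weighted by cohort carriers)
pooled_af <- sum(variant_table$allele_count * variant_table$n_carriers) /
  sum(variant_table$allele_number * variant_table$n_carriers)

# Cumulative frequency: each distinct variant contributes once
cumulative_af <- sum(variant_table$af)

c(pooled = pooled_af, cumulative = cumulative_af)
```

A sum of positive frequencies is necessarily larger than their weighted average, so the cumulative value gives a more conservative population baseline than the pooled value.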
Converted the allele frequency to a carrier probability by multiplying by 2, because each person has two copies of the gene. The goal is to estimate the proportion of people in the general population who are likely to carry at least one pathogenic TACI variant.
carry_prob <- 2 * freq_al
carry_prob
## [1] 0.01065466
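To gauge how much the rare-variant shortcut matters here, the following sketch compares 2 × AF with the exact Hardy–Weinberg carrier probability, 1 - (1 - AF)^2. The variable names carry_prob_approx, carry_prob_exact, and approx_error are illustrative; the allele totals are the ones computed above.

```r
# Pooled allele frequency from the summed gnomAD counts above
freq_al <- 58190 / 10922918

# Rare-variant approximation used in the analysis
carry_prob_approx <- 2 * freq_al

# Exact probability of at least one pathogenic allele under Hardy-Weinberg
carry_prob_exact <- 1 - (1 - freq_al)^2

# The difference equals freq_al^2, the homozygote term that the
# approximation counts twice
approx_error <- carry_prob_approx - carry_prob_exact

c(approx = carry_prob_approx, exact = carry_prob_exact, error = approx_error)
```

The approximation is slightly larger than the exact value, which makes the expected probability marginally higher and the enrichment test marginally more conservative.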
Conduct Binomial Test
Rationale: An exact binomial test compares one sample proportion to an expected probability. The gnomAD baseline enters as an expected proportion because gnomAD provides aggregate allele counts rather than individual-level carrier data.
binom_result <- binom.test(
x = patho_TACI,
n = total_TACI,
p = carry_prob,
alternative = "greater"
)
binom_result
##
## Exact binomial test
##
## data: patho_TACI and total_TACI
## number of successes = 18, number of trials = 194, p-value = 5.535e-12
## alternative hypothesis: true probability of success is greater than 0.01065466
## 95 percent confidence interval:
## 0.06082948 1.00000000
## sample estimates:
## probability of success
## 0.09278351
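The one-sided p-value can also be read as an upper-tail binomial probability: the chance of seeing 18 or more carriers among 194 people if each carried a pathogenic variant with the gnomAD-derived probability. The sketch below recomputes it with pbinom() as a cross-check; p_tail, p_test, and expected_carriers are illustrative names.

```r
patho_TACI <- 18
total_TACI <- 194
carry_prob <- 2 * 58190 / 10922918

# P(X >= 18) = 1 - P(X <= 17) for X ~ Binomial(194, carry_prob)
p_tail <- pbinom(patho_TACI - 1, size = total_TACI, prob = carry_prob,
                 lower.tail = FALSE)

# Same quantity as reported by the exact binomial test
p_test <- binom.test(patho_TACI, total_TACI, p = carry_prob,
                     alternative = "greater")$p.value

# Number of carriers expected in a cohort of this size under the baseline
expected_carriers <- total_TACI * carry_prob

c(p_tail = p_tail, p_test = p_test, expected_carriers = expected_carriers)
```

Comparing the observed count of 18 with the expected count makes the size of the excess easier to see than the p-value alone.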
Interpretation:
A binomial test was conducted to examine whether the proportion of individuals with pathogenic TACI (TNFRSF13B) variants in the clinical cohort was greater than the expected proportion in the general population estimated from gnomAD.
In the clinical sample, 18 of 194 individuals (9.28%) carried a pathogenic TACI variant. The expected carrier probability in the general population, derived from gnomAD allele frequency data, was 0.0107 (1.07%).
Results of the exact binomial test indicated that the proportion of pathogenic TACI carriers in the clinical cohort was significantly greater than expected under the gnomAD-derived baseline, p < .001. The lower bound of the one-sided 95% confidence interval for the observed proportion (95% CI [0.061, 1.000]) exceeded the expected population probability of 0.0107.
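The interval printed by the one-sided test has its upper limit fixed at 1, which can be awkward to report alongside a point estimate. If a two-sided interval is preferred for descriptive purposes, one option is the Clopper–Pearson interval that binom.test() returns by default; this is a sketch of an alternative presentation, not part of the original analysis, and ci_two_sided is an illustrative name.

```r
patho_TACI <- 18
total_TACI <- 194

# Two-sided exact (Clopper-Pearson) 95% interval for the carrier proportion.
# The null value p is irrelevant to the interval, so the default is left as is.
ci_two_sided <- as.numeric(
  binom.test(patho_TACI, total_TACI, conf.level = 0.95)$conf.int
)

ci_two_sided
```

The hypothesis test itself is unchanged by this choice; only the way uncertainty around the observed 9.28% is summarized differs.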
Effect size: Odds Ratio
Clinical odds
clinical_odds <- patho_TACI / (total_TACI - patho_TACI)
clinical_odds
## [1] 0.1022727
Population odds
pop_odds <- carry_prob / (1 - carry_prob)
pop_odds
## [1] 0.01076941
Odds ratio
OR <- clinical_odds / pop_odds
OR
## [1] 9.496598
The odds of carrying a pathogenic TNFRSF13B variant in the clinical cohort were compared with the expected population-level odds derived from gnomAD. Individuals in the clinical cohort had approximately 9.50 times the odds of carrying a pathogenic variant relative to population expectations (OR = 9.50), suggesting strong enrichment of pathogenic variants in the clinical sample.
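The odds ratio is a point estimate. One way to attach an interval to it, sketched here under the assumption that the gnomAD-derived population odds are treated as a fixed, known constant (consistent with the one-sample framework), is to convert the exact confidence limits for the cohort proportion to odds and divide by the population odds. The names ci_prop and or_ci are illustrative.

```r
patho_TACI <- 18
total_TACI <- 194
carry_prob <- 2 * 58190 / 10922918

# Population odds, treated as fixed (ASSUMPTION: no sampling error in gnomAD)
pop_odds <- carry_prob / (1 - carry_prob)

# Two-sided exact 95% limits for the cohort carrier proportion
ci_prop <- as.numeric(binom.test(patho_TACI, total_TACI)$conf.int)

# Odds is a monotone function of the proportion, so the limits carry over
or_ci <- (ci_prop / (1 - ci_prop)) / pop_odds

# Point estimate, matching the value computed above
OR <- (patho_TACI / (total_TACI - patho_TACI)) / pop_odds

c(OR = OR, lower = or_ci[1], upper = or_ci[2])
```

This interval reflects sampling variability in the clinical cohort only; uncertainty in the population baseline is not captured.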