This study used a one-sample enrichment framework to evaluate whether the proportion of individuals carrying pathogenic TNFRSF13B (TACI) variants in a clinical cohort was greater than expected based on population-level allele frequency data from the Genome Aggregation Database (gnomAD).

The clinical cohort consisted of 194 individuals, of whom 18 were identified as carriers of at least one pathogenic TNFRSF13B variant. Carrier status was defined at the individual level as the presence of one or more pathogenic variants classified as disease-associated in the literature and curated variant databases.

The observed carrier proportion was calculated as the number of carriers divided by the total sample size.

Population-level variant frequencies were obtained from gnomAD. Allele count (AC) and allele number (AN) were recorded for each carrier's pathogenic TNFRSF13B variant (one entry per carrier) and summed across the 18 entries. The pooled allele frequency (AF) was calculated as the summed allele count divided by the summed allele number.

Because gnomAD reports allele-level rather than individual-level carrier data, the probability of carrying at least one pathogenic allele was approximated using a rare-variant assumption:

P(carrier) ≈ 2 × AF

This approximation assumes Hardy–Weinberg equilibrium and low allele frequency such that homozygous contributions are negligible.
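
As a quick check on the rare-variant approximation, the sketch below compares 2 × AF with the exact Hardy–Weinberg carrier probability, 1 − (1 − AF)^2. The value of `af_demo` is the pooled allele frequency computed in the Analyses section; the variable names are illustrative and not part of the original script.

```r
# Illustrative check of the rare-variant approximation (names are hypothetical).
af_demo <- 58190 / 10922918              # pooled AF used in this analysis

carrier_approx <- 2 * af_demo            # rare-variant approximation
carrier_exact  <- 1 - (1 - af_demo)^2    # exact under Hardy-Weinberg equilibrium

# The two differ by AF^2, the expected homozygote frequency.
approx_minus_exact <- carrier_approx - carrier_exact
c(approx = carrier_approx, exact = carrier_exact, difference = approx_minus_exact)
```

At this allele frequency the difference is on the order of 1e-5, so the approximation has little practical effect on the expected carrier probability.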

Analyses

Clinical Cohort Data

Number of patients in clinical sample with pathogenic TACI

patho_TACI <- 18

Total number of patients in clinical sample (pathogenic and non-pathogenic TACI)

total_TACI <- 194

Number of patients in clinical sample with non-pathogenic TACI

normal_TACI <- total_TACI - patho_TACI
normal_TACI
## [1] 176

Calculated proportion of patients in clinical sample with pathogenic TACI

patho_PROP <- patho_TACI / total_TACI
patho_PROP
## [1] 0.09278351

Pathogenic TACI Allele Population Frequency

Estimated from gnomAD allele counts for pathogenic TACI. For each carrier, the gnomAD entry was selected based on that individual's pathogenic TACI variant, race, and gender. The gnomAD “allele count” column was then summed across carriers.

variants <- data.frame(
  participant = as.character(1:18),
  allele_count = c(4173, 
                   4173,
                   3253,
                   3465,
                   3465,
                   4173,
                   4173,
                   4173,
                   3465,
                   348,
                   4173,
                   3465,
                   4173,
                   4173,
                   67,
                   348,
                   3465,
                   3465),
  allele_number = c(611636,
                    611636,
                    568386,
                    611632,
                    611632,
                    611636,
                    611636,
                    611636,
                    611632,
                    611632,
                    611636,
                    611632,
                    611636,
                    611636,
                    568388,
                    611632,
                    611632,
                    611632)
)

variants
##    participant allele_count allele_number
## 1            1         4173        611636
## 2            2         4173        611636
## 3            3         3253        568386
## 4            4         3465        611632
## 5            5         3465        611632
## 6            6         4173        611636
## 7            7         4173        611636
## 8            8         4173        611636
## 9            9         3465        611632
## 10          10          348        611632
## 11          11         4173        611636
## 12          12         3465        611632
## 13          13         4173        611636
## 14          14         4173        611636
## 15          15           67        568388
## 16          16          348        611632
## 17          17         3465        611632
## 18          18         3465        611632

Total pathogenic alleles

patho_al <- sum(variants$allele_count)
patho_al
## [1] 58190

Total alleles (pathogenic + non-pathogenic)

total_al <- sum(variants$allele_number)
total_al
## [1] 10922918

Pathogenic TACI allele frequency in gnomAD sample

freq_al <- patho_al / total_al
freq_al
## [1] 0.005327331
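
Because the table has one row per carrier, any allele count that appears in several rows enters the sums several times, so the pooled value is approximately a carrier-weighted average of the per-row allele frequencies. A possible sensitivity check, sketched below, counts each distinct entry once and sums the per-variant frequencies instead. This is not the method used in this analysis. It assumes that rows with identical allele count and allele number refer to the same variant, which the table itself does not confirm, and `variants_demo`, `distinct_variants`, and `summed_af` are hypothetical names.

```r
# Sensitivity sketch, NOT the original method.
# Assumption: identical (allele_count, allele_number) pairs are the same variant.
variants_demo <- data.frame(
  allele_count  = c(4173, 4173, 3253, 3465, 3465, 4173, 4173, 4173, 3465,
                    348, 4173, 3465, 4173, 4173, 67, 348, 3465, 3465),
  allele_number = c(611636, 611636, 568386, 611632, 611632, 611636, 611636,
                    611636, 611632, 611632, 611636, 611632, 611636, 611636,
                    568388, 611632, 611632, 611632)
)

# Keep one row per distinct (AC, AN) pair.
distinct_variants <- unique(variants_demo)

# Per-variant allele frequency, then the sum across distinct variants.
distinct_variants$af <- distinct_variants$allele_count / distinct_variants$allele_number
summed_af <- sum(distinct_variants$af)
summed_af
```

Which pooling is the appropriate baseline depends on how the expected carrier probability is defined; the main analysis continues with the per-carrier pooled frequency `freq_al`.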

Converted the allele frequency to a carrier probability by multiplying by 2, because each person has two copies of the gene. The goal is to estimate the proportion of people in the general population who are likely to carry at least one pathogenic TACI variant.

carry_prob <- 2 * freq_al
carry_prob
## [1] 0.01065466

Conduct Binomial Test

Rationale: An exact binomial test compares one sample proportion with a fixed expected probability. The comparison is made on proportions because gnomAD supplies the population baseline as an allele frequency rather than as individual-level carrier data.

binom_result <- binom.test(
  x = patho_TACI,
  n = total_TACI,
  p = carry_prob,
  alternative = "greater"
)

binom_result
## 
##  Exact binomial test
## 
## data:  patho_TACI and total_TACI
## number of successes = 18, number of trials = 194, p-value = 5.535e-12
## alternative hypothesis: true probability of success is greater than 0.01065466
## 95 percent confidence interval:
##  0.06082948 1.00000000
## sample estimates:
## probability of success 
##             0.09278351
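
The one-sided p-value and confidence bound reported by `binom.test` can be reproduced from the binomial and beta distributions, which may help make explicit what the test computes. The sketch below is a cross-check under the same inputs; the `_demo` names are illustrative and not part of the original script.

```r
# Cross-check of the exact one-sided binomial test (illustrative names).
x_demo  <- 18                         # carriers in the clinical cohort
n_demo  <- 194                        # cohort size
p0_demo <- 2 * 58190 / 10922918       # expected carrier probability (2 * AF)

# P(X >= x) under Binomial(n, p0): the upper-tail p-value.
p_value_demo <- pbinom(x_demo - 1, size = n_demo, prob = p0_demo,
                       lower.tail = FALSE)

# One-sided 95% Clopper-Pearson lower bound for the true proportion.
ci_lower_demo <- qbeta(0.05, x_demo, n_demo - x_demo + 1)

c(p_value = p_value_demo, ci_lower = ci_lower_demo)
```

Both values should agree with the `binom.test` output above, up to rounding of the printed results.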

Interpretation:

A binomial test was conducted to examine whether the proportion of individuals with pathogenic TACI (TNFRSF13B) variants in the clinical cohort was greater than the expected proportion in the general population estimated from gnomAD.

In the clinical sample, 18 of 194 individuals (9.28%) carried a pathogenic TACI variant. The expected carrier probability in the general population, derived from gnomAD allele frequency data, was 0.0107 (1.07%).

Results of the exact binomial test indicated that the proportion of pathogenic TACI carriers in the clinical cohort was significantly greater than expected under the gnomAD-derived baseline, p < .001. The observed proportion (one-sided 95% CI [0.061, 1.000]) exceeded the expected population probability, and the lower confidence bound of 0.061 lay well above the expected value of 0.0107.

Effect size: Odds Ratio

Clinical odds

clinical_odds <- patho_TACI / (total_TACI - patho_TACI)
clinical_odds
## [1] 0.1022727

Population odds

pop_odds <- carry_prob / (1 - carry_prob)
pop_odds
## [1] 0.01076941

Odds ratio

OR <- clinical_odds / pop_odds
OR
## [1] 9.496598

The odds of carrying a pathogenic TNFRSF13B variant in the clinical cohort were compared with the expected population-level odds derived from gnomAD. The odds in the clinical cohort were approximately 9.50 times the population-expected odds (OR = 9.50), suggesting strong enrichment of pathogenic variants in the clinical sample.
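
As a complement to the odds ratio, the same two proportions can also be summarised as a ratio of proportions (a risk ratio), which some readers find easier to interpret. The sketch below recomputes the odds ratio with a small helper and adds the risk ratio. `odds_from_prob`, `or_demo`, and `rr_demo` are hypothetical names, and the population value is treated as a fixed expected probability rather than an estimate with its own uncertainty.

```r
# Illustrative effect-size sketch (names are hypothetical).
odds_from_prob <- function(p) p / (1 - p)   # convert a probability to odds

p_clinical_demo   <- 18 / 194               # observed carrier proportion
p_population_demo <- 2 * 58190 / 10922918   # expected carrier probability (2 * AF)

# Odds ratio: clinical odds relative to population-expected odds.
or_demo <- odds_from_prob(p_clinical_demo) / odds_from_prob(p_population_demo)

# Risk ratio: observed proportion relative to expected probability.
rr_demo <- p_clinical_demo / p_population_demo

c(odds_ratio = or_demo, risk_ratio = rr_demo)
```

Because the expected carrier probability is small, the two measures are of similar magnitude here.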