CRPS Mutation Analysis Report

This report analyzes the 488T>C variant in MT-RNR2/M-SHLP2 (c.11A>G, p.Lys4Arg) in haplogroups J, U and A, using data from mutect2.haplogroup1.tab and mutect2.03.merge.vcf.

UKBiobank

Data processing

First, import the data and merge the haplogroup information with the genetic variant data. Then extract the 488T>C (m.2158T>C) mutation status for all samples.

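# Load packages: tidyverse supplies read_tsv(), the dplyr verbs and str_split_i(); vcfR reads the VCF
library(tidyverse)
library(vcfR)

# Load clean haplogroup data
haplo_clean <- read_tsv("data/CRPS_summary/mutect2.haplogroup1.tab")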

# Load VCF data
vcf <- read.vcfR("data/CRPS_summary/mutect2.03.merge.vcf")

## Scanning file to determine attributes.
## File attributes:
##   meta lines: 47
##   header_line: 48
##   variant count: 3645
##   column count: 2506
## Meta line 47 read in.
## All meta lines processed.
## gt matrix initialized.
## Character matrix gt created.
##   Character matrix gt rows: 3645
##   Character matrix gt cols: 2506
##   skip: 0
##   nrows: 3645
##   row_num: 0
## Processed variant: 3645
## All variants processed

# Clean sample IDs
haplo_clean <- haplo_clean %>%
  mutate(IID_clean = str_split_i(Run, "_", 1))

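# Extract the 488T>C mutation (more frequently referred to as m.2158T>C)
m488_pos <- "2158"
m488_index <- which(vcf@fix[, "POS"] == m488_pos)
# Expect exactly one VCF record at this position; several records would turn
# gt[m488_index, ] into a matrix and break names() below
stopifnot(length(m488_index) == 1)
gt <- extract.gt(vcf, element = "GT")
m488_gt <- gt[m488_index, ]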

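# Code carriers as 1: any genotype string containing the ALT allele index "1".
# Missing genotypes (NA) do not match and are therefore coded 0.
mutation_data <- data.frame(
  IID_vcf = names(m488_gt),
  m488_mutation = ifelse(grepl("1", m488_gt), 1, 0)
) %>%
  mutate(IID_clean = str_split_i(IID_vcf, "_", 1))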

# Merge data
final_data <- haplo_clean %>%
  left_join(mutation_data, by = "IID_clean") %>%
  filter(!is.na(m488_mutation))
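
The merge above relies on the cleaned IDs matching between the two inputs. Samples present only in the haplogroup table receive NA and are removed by the filter, samples present only in the VCF never enter the left join, and duplicated IDs would add rows. A minimal sketch of how this could be checked is given below; it uses base R on toy ID vectors, and the values and the names haplo_ids, vcf_ids, only_in_haplo, only_in_vcf and dup_haplo are hypothetical stand-ins for the IID_clean columns, not part of the original analysis.

```r
# Toy stand-ins for the cleaned sample IDs (hypothetical values)
haplo_ids <- c("S1", "S2", "S3", "S3")  # IID_clean in the haplogroup table
vcf_ids   <- c("S1", "S2", "S4")        # IID_clean derived from the VCF columns

# IDs without a partner in the other source are lost in the merge
only_in_haplo <- setdiff(haplo_ids, vcf_ids)
only_in_vcf   <- setdiff(vcf_ids, haplo_ids)

# Duplicated IDs would create extra rows in the join and distort the counts
dup_haplo <- unique(haplo_ids[duplicated(haplo_ids)])

cat("Only in haplogroup table:", only_in_haplo, "\n")
cat("Only in VCF:", only_in_vcf, "\n")
cat("Duplicated in haplogroup table:", dup_haplo, "\n")
```

With the real data, the same calls could be applied to haplo_clean$IID_clean and mutation_data$IID_clean before the join.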

Summarize the merged data:

total_samples <- nrow(final_data)
mutated_samples <- sum(final_data$m488_mutation)
mutation_rate <- mutated_samples / total_samples * 100

cat("**Total samples analyzed:**", total_samples, "\n\n")

## **Total samples analyzed:** 2495

cat("**Samples with m.488 mutation:**", mutated_samples, "\n\n")

## **Samples with m.488 mutation:** 36

cat("**Overall mutation frequency:**", round(mutation_rate, 2), "%\n")

## **Overall mutation frequency:** 1.44 %
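
The overall frequency is a point estimate from 36 carriers among 2,495 samples. As a hedged illustration that is not part of the original analysis, an exact (Clopper-Pearson) binomial confidence interval can be obtained with base R's binom.test(). The names n_total, n_carriers and freq_ci are chosen here for illustration; the counts are taken from the output above.

```r
# Counts reported in the summary above
n_total    <- 2495
n_carriers <- 36

# Exact (Clopper-Pearson) 95% confidence interval for the carrier proportion
freq_ci <- binom.test(n_carriers, n_total)$conf.int

# Express the interval bounds as percentages
round(100 * freq_ci, 2)
```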

Calculate mutation frequencies across haplogroups:

haplo_summary <- final_data %>%
  group_by(haplogroup) %>%
  summarise(
    n_samples = n(),
    n_mutated = sum(m488_mutation),
    mutation_freq = n_mutated / n_samples * 100
  ) %>%
  arrange(desc(mutation_freq))

# Keep only the haplogroups that carry the mutation
haplo_summary_mutated <- haplo_summary %>%
  filter(n_mutated > 0)

# Display the full haplogroup summary table
haplo_summary %>%
  knitr::kable(
    col.names = c("Haplogroup", "Total Samples", "Mutated Samples", "Mutation Frequency (%)"),
    digits = 2
  )

Haplogroup	Total Samples	Mutated Samples	Mutation Frequency (%)
J	261	36	13.79
B	1	0	0.00
C	1	0	0.00
D	1	0	0.00
F	1	0	0.00
H	1097	0	0.00
HV	57	0	0.00
I	87	0	0.00
K	202	0	0.00
L1	3	0	0.00
L2	2	0	0.00
L3	4	0	0.00
M	3	0	0.00
N	6	0	0.00
R	9	0	0.00
T	267	0	0.00
U	347	0	0.00
V	77	0	0.00
W	42	0	0.00
X	27	0	0.00

Specific Enrichment in J Haplogroup

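The 488T>C (m.2158T>C) mutation shows a high frequency (13.79%, 36 of 261 samples) within the J haplogroup.

All 36 carriers belong to haplogroup J, and none of the 2,234 samples in the other 19 haplogroups carries the mutation. This is a marked enrichment relative to the overall frequency of 1.44%.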
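
The enrichment described above can be quantified with a 2x2 comparison of haplogroup J against all other haplogroups combined. The sketch below uses Fisher's exact test from base R; it is an illustration under the assumption that a simple J-versus-rest contrast is appropriate, not a test reported in the original analysis. The counts come from the haplogroup table (36 carriers and 225 non-carriers in J, 0 carriers and 2,234 non-carriers elsewhere), and the names tab and ft are illustrative.

```r
# 2x2 table: rows are carrier status, columns are haplogroup J vs all others
tab <- matrix(
  c(36, 225,    # haplogroup J: carriers, non-carriers
    0, 2234),   # other haplogroups: carriers, non-carriers
  nrow = 2,
  dimnames = list(
    status     = c("carrier", "non_carrier"),
    haplogroup = c("J", "other")
  )
)

# Fisher's exact test of association between carrier status and haplogroup
ft <- fisher.test(tab)
ft$p.value
```

Because haplogroup J was singled out after inspecting the table, and mitochondrial variants are inherited together with their haplogroup background, the resulting p-value is best read as descriptive.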

Hening Cui

March 11, 2026
