Neuroimaging Data Visualization & Statistical Analysis with ggseg

Author

Timothy Achala

Code
# ggseg 2.0.0 requires ggseg.formats for atlas data
# install.packages(c("ggseg", "ggseg.formats", "tidyverse",
#                    "effectsize", "ggcorrplot", "broom", "ggrepel"))

library(ggseg)          # 2.0.0: dk is a function, use dk() everywhere
library(ggseg.formats)  # provides atlas data and atlas_regions()
library(tidyverse)
library(broom)
library(effectsize)
library(ggcorrplot)
library(ggrepel)
library(patchwork)

1. Simulate Neuroimaging Data

Code
set.seed(42)

# In ggseg 2.0.0, dk() returns the brain_atlas object
# atlas_regions() from ggseg.formats extracts the region names
all_regions <- atlas_regions(dk())

regions <- tibble(
  region = rep(all_regions, 2),
  hemi   = rep(c("left", "right"), each = length(all_regions))
)

n_subjects <- 60

subject_data <- regions |>
  crossing(subject = 1:n_subjects) |>
  mutate(
    group     = ifelse(subject <= 30, "Patient", "Control"),
    age       = rnorm(n(), mean = 45, sd = 12),
    sex       = sample(c("Male", "Female"), n(), replace = TRUE),
    thickness = rnorm(n(), mean = 2.5 - 0.3 * (group == "Patient"), sd = 0.4),
    area      = rnorm(n(), mean = 2000, sd = 300),
    volume    = rnorm(n(), mean = 4000, sd = 600)
  )

glimpse(subject_data)
Rows: 4,200
Columns: 9
$ region    <chr> "banks of superior temporal sulcus", "banks of superior temp…
$ hemi      <chr> "left", "left", "left", "left", "left", "left", "left", "lef…
$ subject   <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 1…
$ group     <chr> "Patient", "Patient", "Patient", "Patient", "Patient", "Pati…
$ age       <dbl> 61.45150, 38.22362, 49.35754, 52.59435, 49.85122, 43.72651, …
$ sex       <chr> "Male", "Male", "Male", "Female", "Female", "Female", "Male"…
$ thickness <dbl> 2.520752, 1.781223, 2.464086, 2.495363, 2.927046, 1.868883, …
$ area      <dbl> 2275.760, 1718.285, 2788.369, 2339.058, 2370.998, 2546.012, …
$ volume    <dbl> 4029.431, 3129.679, 4633.523, 3267.092, 3728.216, 4119.132, …

2. Brain Atlas Visualization

Note: geom_brain() auto-joins your data to the atlas by the region column.
In ggseg 2.0.0, always pass atlas = dk() (with parentheses).

2a. Mean Cortical Thickness by Region

Code
mean_thick <- subject_data |>
  group_by(region) |>
  summarise(mean_thickness = mean(thickness), .groups = "drop")

ggplot(mean_thick) +
  geom_brain(
    atlas   = dk(),
    mapping = aes(fill = mean_thickness)
  ) +
  scale_fill_gradient2(
    low = "#2166ac", mid = "#f7f7f7", high = "#d6604d",
    midpoint = 2.5, name = "Thickness (mm)"
  ) +
  labs(
    title    = "Mean Cortical Thickness by Region (Desikan-Killiany Atlas)",
    subtitle = "Averaged across all subjects"
  ) +
  theme_void() +
  theme(legend.position = "bottom")

Interpretation: Warmer colors = thicker cortex; cooler = thinner. Primary sensorimotor regions typically show lower thickness than association cortices.


2b. Group Difference Map (Patient − Control)

Code
group_diff <- subject_data |>
  group_by(region, group) |>
  summarise(mean_t = mean(thickness), .groups = "drop") |>
  pivot_wider(names_from = group, values_from = mean_t) |>
  mutate(diff = Patient - Control)

ggplot(group_diff) +
  geom_brain(
    atlas   = dk(),
    mapping = aes(fill = diff)
  ) +
  scale_fill_gradient2(
    low = "#2166ac", mid = "white", high = "#d6604d",
    midpoint = 0, name = "Difference (mm)\n[Patient − Control]"
  ) +
  labs(
    title    = "Cortical Thickness: Patient vs Control",
    subtitle = "Red = thicker in patients | Blue = thinner in patients"
  ) +
  theme_void() +
  theme(legend.position = "bottom")

Interpretation: Blue = thinner in patients (potential atrophy); red = thicker. Clinically relevant regions (e.g., entorhinal, temporal) warrant follow-up.


3. Statistical Tests

3a. Per-Region t-test + FDR Correction

Code
ttest_results <- subject_data |>
  group_by(region) |>
  summarise(
    t_stat  = t.test(thickness ~ group)$statistic,
    p_value = t.test(thickness ~ group)$p.value,
    .groups = "drop"
  ) |>
  mutate(
    p_fdr       = p.adjust(p_value, method = "fdr"),
    significant = p_fdr < 0.05,
    neg_log_p   = -log10(p_fdr)
  )

ggplot(ttest_results) +
  geom_brain(
    atlas   = dk(),
    mapping = aes(fill = neg_log_p)
  ) +
  scale_fill_viridis_c(
    option = "magma", direction = -1,
    name   = "-log10(p)\nFDR-corrected"
  ) +
  labs(
    title    = "Significance Map (t-test, FDR corrected)",
    subtitle = "Brighter = more significant | threshold at -log10(0.05) = 1.3"
  ) +
  theme_void() +
  theme(legend.position = "bottom")

Interpretation: Regions with −log10(p) > 1.3 survive FDR correction, controlling the expected proportion of false discoveries across all region-wise tests.


3b. Effect Size Map (Cohen’s d)

Code
cohens_d_results <- subject_data |>
  group_by(region) |>
  summarise(
    d = {
      res <- cohens_d(thickness ~ group)
      # column name differs by effectsize version
      col <- intersect(c("Cohens_d", "d"), names(res))
      res[[col[1]]]
    },
    .groups = "drop"
  )

ggplot(cohens_d_results) +
  geom_brain(
    atlas   = dk(),
    mapping = aes(fill = d)
  ) +
  scale_fill_gradient2(
    low = "#2166ac", mid = "white", high = "#d6604d",
    midpoint = 0, name = "Cohen's d"
  ) +
  labs(
    title    = "Effect Size Map (Cohen's d)",
    subtitle = "|d|: 0.2 = small | 0.5 = medium | 0.8 = large"
  ) +
  theme_void() +
  theme(legend.position = "bottom")

Interpretation: Cohen’s d separates statistical from clinical significance. A region can be statistically significant but have trivially small d — the brain map reveals where effects are both real and meaningful.


3c. Linear Regression — Thickness ~ Age + Group + Sex

Code
lm_results <- subject_data |>
  group_by(region) |>
  reframe(tidy(lm(thickness ~ age + group + sex))) |>
  filter(term == "age") |>
  mutate(
    p_fdr     = p.adjust(p.value, method = "fdr"),
    neg_log_p = -log10(p_fdr)
  )

ggplot(lm_results) +
  geom_brain(
    atlas   = dk(),
    mapping = aes(fill = estimate)
  ) +
  scale_fill_gradient2(
    low = "#2166ac", mid = "white", high = "#d6604d",
    midpoint = 0, name = "β (Age)"
  ) +
  labs(
    title    = "Age Effect on Cortical Thickness (Regression β)",
    subtitle = "Controlling for group and sex | Negative β = thinning with age"
  ) +
  theme_void() +
  theme(legend.position = "bottom")

Interpretation: Negative β = cortex thins with age. Frontal and temporal regions classically show the steepest age-related decline, consistent with neuroimaging literature.


3d. Correlation Matrix Across Morphometric Measures

Code
corr_data <- subject_data |>
  group_by(region) |>
  summarise(
    thickness = mean(thickness),
    area      = mean(area),
    volume    = mean(volume),
    .groups   = "drop"
  ) |>
  select(thickness, area, volume)

corr_mat <- cor(corr_data, method = "spearman")

ggcorrplot(
  corr_mat,
  method = "circle", type = "lower", lab = TRUE,
  colors = c("#2166ac", "white", "#d6604d"),
  title  = "Spearman Correlations: Thickness vs Area vs Volume"
)

Interpretation: Area and volume correlate strongly (volume ≈ thickness × area). Low thickness–area correlation validates thickness as an independent morphometric marker.


3e. Volcano Plot — Effect Size vs Significance

Code
volcano_data <- ttest_results |>
  left_join(group_diff,      by = "region") |>
  left_join(cohens_d_results, by = "region") |>
  mutate(label = ifelse(p_fdr < 0.05 & abs(d) > 0.2, region, NA))

ggplot(volcano_data, aes(x = diff, y = neg_log_p, color = significant)) +
  geom_point(size = 2.5, alpha = 0.8) +
  geom_hline(yintercept = -log10(0.05), linetype = "dashed", color = "red") +
  geom_vline(xintercept = c(-0.1, 0.1), linetype = "dashed", color = "gray60") +
  geom_text_repel(aes(label = label), size = 3, max.overlaps = 15) +
  scale_color_manual(values = c("FALSE" = "gray70", "TRUE" = "#d6604d")) +
  labs(
    title    = "Volcano Plot: Group Difference vs Significance",
    x        = "Mean Difference (Patient − Control, mm)",
    y        = "-log10(p) FDR-corrected",
    subtitle = "Red dashed = FDR threshold | Labeled = significant & |d| > 0.2"
  ) +
  theme_minimal(base_size = 13)

Interpretation: Top-right = significantly thicker in patients; top-left = significantly thinner. Labeled regions pass both significance and minimum effect size filters simultaneously.


4. Summary Table of Significant Regions

Code
ttest_results |>
  filter(p_fdr < 0.05) |>
  left_join(group_diff,       by = "region") |>
  left_join(cohens_d_results, by = "region") |>
  select(region, t_stat, p_value, p_fdr, diff, d) |>
  arrange(p_fdr) |>
  mutate(across(where(is.numeric), \(x) round(x, 3))) |>
  knitr::kable(
    caption = "FDR-significant regions: t-stat, raw/corrected p, mean difference, Cohen's d"
  )
FDR-significant regions: t-stat, raw/corrected p, mean difference, Cohen’s d
region t_stat p_value p_fdr diff d
paracentral 6.463 0.000 0.000 -0.417 1.180
pars orbitalis 5.693 0.000 0.000 -0.439 1.039
caudal middle frontal 5.167 0.000 0.000 -0.383 0.943
pericalcarine 5.234 0.000 0.000 -0.413 0.956
rostral middle frontal 5.134 0.000 0.000 -0.320 0.937
isthmus cingulate 5.071 0.000 0.000 -0.359 0.926
frontal pole 5.016 0.000 0.000 -0.335 0.916
lateral orbitofrontal 4.991 0.000 0.000 -0.411 0.911
corpus callosum 4.734 0.000 0.000 -0.322 0.864
pars triangularis 4.726 0.000 0.000 -0.322 0.863
inferior temporal 4.632 0.000 0.000 -0.313 0.846
entorhinal 4.557 0.000 0.000 -0.308 0.832
fusiform 4.542 0.000 0.000 -0.318 0.829
supramarginal 4.496 0.000 0.000 -0.363 0.821
posterior cingulate 4.373 0.000 0.000 -0.319 0.798
parahippocampal 4.341 0.000 0.000 -0.328 0.793
caudal anterior cingulate 4.224 0.000 0.000 -0.286 0.771
middle temporal 4.017 0.000 0.000 -0.273 0.733
lingual 3.964 0.000 0.000 -0.297 0.724
superior frontal 3.902 0.000 0.000 -0.305 0.712
superior parietal 3.910 0.000 0.000 -0.316 0.714
lateral occipital 3.845 0.000 0.000 -0.285 0.702
insula 3.707 0.000 0.000 -0.284 0.677
medial orbitofrontal 3.500 0.001 0.001 -0.261 0.639
transverse temporal 3.508 0.001 0.001 -0.250 0.640
cuneus 3.363 0.001 0.001 -0.249 0.614
postcentral 3.379 0.001 0.001 -0.243 0.617
precentral 3.370 0.001 0.001 -0.225 0.615
pars opercularis 3.300 0.001 0.002 -0.269 0.603
precuneus 3.058 0.003 0.003 -0.239 0.558
superior temporal 2.868 0.005 0.006 -0.205 0.524
temporal pole 2.818 0.006 0.006 -0.217 0.515
inferior parietal 2.466 0.015 0.016 -0.195 0.450
rostral anterior cingulate 2.400 0.018 0.019 -0.170 0.438
banks of superior temporal sulcus 2.309 0.023 0.023 -0.167 0.422

5. Statistical Methods Reference

Method Purpose Key output
t-test per region 2-group mean difference t, p
FDR correction Multiple comparisons control p_fdr
Cohen’s d Effect size (clinical relevance) d
Linear regression Control confounders (age, sex) β, p
Spearman correlation Non-parametric morphometric association ρ
Volcano plot Joint effect size + significance view Visual