Overview

Cancer-associated fibroblasts (CAFs) are key stromal contributors to ovarian cancer (OC) progression and therapeutic resistance, primarily through dysregulation of the extracellular matrix (ECM). CAFs are a heterogeneous population derived from different cell types through activation and reprogramming by the tumor microenvironment. Current research models rely on either uncharacterized primary CAFs isolated from patient tumors, or normal fibroblasts that fail to fully recapitulate CAF-like behavior. A clinically relevant intermediate, normal fibroblasts reprogrammed by tumor-derived conditioned media, was recently characterized by Axemaker et al. (2024), who demonstrated that Kuramochi and SKOV-3 ovarian cancer cell line conditioned media is sufficient to reprogram normal human uterine fibroblasts (HUFs) into functional CAF-like cells.

This analysis performs independent differential expression analysis on the bulk RNA-seq dataset (GSE280564) deposited alongside Axemaker et al. (2024), comparing three CAF types against normal HUFs using a full upstream pipeline built with Nextflow DSL2, STAR alignment against GRCh38, and featureCounts for gene quantification. Beyond replicating the original transcriptional findings, this project introduces a novel analytical angle: identifying genes that primary (tumor-derived) CAFs change but conditioned media CAFs fail to recapitulate — the transcriptional “resisters” that reveal the limits of in vitro reprogramming models.

Dataset: GSE280564 — 3 HUF (normal), 3 Kuramochi-conditioned CAF, 3 SKOV3-conditioned CAF, 3 Primary OC-CAF (12 samples total)


Data Loading and Preprocessing

The count matrix was generated by the upstream Nextflow DSL2 pipeline using featureCounts on all 12 BAM files simultaneously, ensuring consistent gene rows across samples. Raw counts were loaded and annotation columns were removed, retaining only the 12 sample count columns.

counts <- read.table("~/rnaseq-pipeline/results/featurecounts/counts_matrix.txt",
                     header = TRUE,
                     skip = 1,
                     row.names = 1)

count_matrix <- counts[, 6:17]
colnames(count_matrix) <- gsub("\\.Aligned\\.sortedByCoord\\.out\\.bam", "", colnames(count_matrix))

dim(count_matrix)
## [1] 78932    12
colnames(count_matrix)
##  [1] "HUF_2"           "Kuramochi_CAF_2" "Primary_CAF_3"   "SKOV3_CAF_2"    
##  [5] "SKOV3_CAF_1"     "SKOV3_CAF_3"     "Kuramochi_CAF_1" "Primary_CAF_1"  
##  [9] "Primary_CAF_2"   "HUF_3"           "HUF_1"           "Kuramochi_CAF_3"

The count matrix contains 78932 genes across 12 samples. featureCounts was run on all BAM files simultaneously to ensure identical gene rows across samples, avoiding the row-mismatch problem that arises from per-sample counting followed by concatenation.


Differential Expression Analysis

Experimental Design

Sample metadata was constructed manually from the GEO record, with HUF set as the reference level so that all log2 fold changes represent expression relative to normal fibroblasts.

metadata <- data.frame(
    sample = colnames(count_matrix),
    condition = c("HUF", "Kuramochi_CAF", "Primary_CAF",
                  "SKOV3_CAF", "SKOV3_CAF", "SKOV3_CAF",
                  "Kuramochi_CAF", "Primary_CAF", "Primary_CAF",
                  "HUF", "HUF", "Kuramochi_CAF"),
    row.names = colnames(count_matrix)
)

metadata$condition <- factor(metadata$condition,
                             levels = c("HUF", "Kuramochi_CAF", "Primary_CAF", "SKOV3_CAF"))

table(metadata$condition)
## 
##           HUF Kuramochi_CAF   Primary_CAF     SKOV3_CAF 
##             3             3             3             3

DESeq2 Model

dds <- DESeqDataSetFromMatrix(countData = count_matrix,
                              colData = metadata,
                              design = ~ condition)
dds <- DESeq(dds)

Differential expression was performed using DESeq2 with a negative binomial model and empirical Bayes dispersion shrinkage. Three contrasts were extracted — each CAF type versus HUF as reference.

Results Extraction

kuramochi_res <- results(dds, name = "condition_Kuramochi_CAF_vs_HUF", alpha = 0.05)
primary_res   <- results(dds, name = "condition_Primary_CAF_vs_HUF", alpha = 0.05)
skov3_res     <- results(dds, name = "condition_SKOV3_CAF_vs_HUF", alpha = 0.05)

summary(kuramochi_res)
## 
## out of 38076 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up)       : 3965, 10%
## LFC < 0 (down)     : 3806, 10%
## outliers [1]       : 0, 0%
## low counts [2]     : 17413, 46%
## (mean count < 2)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
summary(primary_res)
## 
## out of 38076 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up)       : 5095, 13%
## LFC < 0 (down)     : 4895, 13%
## outliers [1]       : 0, 0%
## low counts [2]     : 15323, 40%
## (mean count < 1)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
summary(skov3_res)
## 
## out of 38076 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 0 (up)       : 4171, 11%
## LFC < 0 (down)     : 3650, 9.6%
## outliers [1]       : 0, 0%
## low counts [2]     : 16020, 42%
## (mean count < 1)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
kuramochi_df <- as.data.frame(kuramochi_res)
primary_df   <- as.data.frame(primary_res)
skov3_df     <- as.data.frame(skov3_res)

# Add gene symbols
for (df_name in c("kuramochi_df", "primary_df", "skov3_df")) {
    df <- get(df_name)
    df$symbol <- mapIds(org.Hs.eg.db,
                        keys = rownames(df),
                        column = "SYMBOL",
                        keytype = "ENSEMBL",
                        multiVals = "first")
    assign(df_name, df)
}

All three comparisons yield thousands of differentially expressed genes at padj < 0.05. Notably, Primary CAF vs HUF produces nearly twice as many DEGs (~10,000) compared to either conditioned CAF type (~7,800), reflecting the more extensive transcriptional reprogramming that occurs in genuine tumor-derived fibroblasts relative to in vitro conditioned models. No outlier samples were detected in any comparison, confirming pipeline and data quality.

The deeper sequencing depth of this dataset (16-25 million reads per sample) relative to the original publication results in greater statistical power to detect subtle transcriptional changes, explaining the larger DEG counts compared to the 748-1543 DEGs reported by Axemaker et al. (2024), who used a stricter fold change cutoff of 1.5x with shallower sequencing.


Exploratory Data Analysis

Variance Stabilized Counts

vsd <- vst(dds, blind = FALSE)

Sample Distance Heatmap

pheatmap(as.matrix(dist(t(assay(vsd)))),
         annotation_col = data.frame(condition = metadata$condition,
                                     row.names = colnames(assay(vsd))),
         show_rownames = FALSE,
         show_colnames = FALSE,
         main = "Sample Distances")

The sample distance heatmap confirms that HUF samples form a tight & distinct cluster completely separate from all CAF types, validating the experimental design and confirming that CAF reprogramming drives a clear and consistent transcriptional shift away from the normal fibroblast state. All three CAF types cluster together on the opposite side, with relatively similar within-group distances, suggesting that regardless of reprogramming stimulus, CAF identity converges on a shared transcriptional state that is fundamentally distinct from HUF.

PCA

pca_data <- plotPCA(vsd, intgroup = "condition", returnData = TRUE)
pct_var <- round(100 * attr(pca_data, "percentVar"))

ggplot(pca_data, aes(PC1, PC2, color = condition)) +
    geom_point(size = 4) +
    xlab(paste0("PC1: ", pct_var[1], "% variance")) +
    ylab(paste0("PC2: ", pct_var[2], "% variance")) +
    theme_bw() +
    ggtitle("PCA — All Samples") +
    theme(plot.title = element_text(hjust = 0.5))

PC1 explains 76% of total variance and cleanly separates HUF from all three CAF types, confirming that CAF reprogramming is the dominant source of transcriptional variation in this dataset. The clean separation on PC1 gives strong confidence that the differential expression results reflect true biological signal. Notably, all three CAF types (Kuramochi, SKOV3, and Primary) cluster tightly together on PC1, suggesting they share a broadly similar transcriptional state relative to HUF. PC2 captures 21% of variance and shows some within-CAF variation, though no clear separation between CAF types is visible, indicating the differences between CAF models are subtle relative to the HUF-to-CAF transition.


Volcano Plots

Volcano plots were generated for each comparison using EnhancedVolcano. Given the deeper sequencing depth of this dataset, a fold change cutoff of FC = 1 (log2FC threshold of 1) was applied alongside padj < 0.05. Most significant genes show modest fold changes (median |LFC| ~0.5-0.8 for significant genes) which is consistent with the subtle, broad transcriptional reprogramming expected when comparing fibroblast subtypes rather than tumor vs normal epithelium.

EnhancedVolcano(kuramochi_df,
    lab = kuramochi_df$symbol,
    x = 'log2FoldChange',
    y = 'padj',
    title = 'Kuramochi CAF (FC:1) vs HUF',
    pCutoff = 0.05,
    FCcutoff = 1,
    pointSize = 0.3,
    labSize = 3,
    colAlpha = 0.5,
    xlim = c(-8, 8),
    ylim = c(0, 200))

EnhancedVolcano(primary_df,
    lab = primary_df$symbol,
    x = 'log2FoldChange',
    y = 'padj',
    title = 'Primary CAF (FC:1) vs HUF',
    pCutoff = 0.05,
    FCcutoff = 1,
    pointSize = 0.3,
    labSize = 3,
    colAlpha = 0.5,
    xlim = c(-8, 8),
    ylim = c(0, 200))

EnhancedVolcano(skov3_df,
    lab = skov3_df$symbol,
    x = 'log2FoldChange',
    y = 'padj',
    title = 'SKOV3 CAF (FC:1) vs HUF',
    pCutoff = 0.05,
    FCcutoff = 1,
    pointSize = 0.3,
    labSize = 3,
    colAlpha = 0.5,
    xlim = c(-8, 8),
    ylim = c(0, 200))

The volcano plots across all three comparisons show similar patterns where statistically significant genes distributed on both sides of the FC threshold, with Primary CAF showing more genes with extreme fold changes consistent with its larger DEG count. Known CAF-associated genes including MMP1, MMP3, DKK1, PLAU, VTN, and CCDC80 appear prominently across all three comparisons, independently replicating the transcriptional signature reported by Axemaker et al. (2024).


Core CAF Signature

DEG Overlap — Venn Diagram

To identify the conserved CAF transcriptional program shared across all three reprogramming models, DEGs were extracted at padj < 0.05 and |LFC| > 1 for each comparison and their overlap computed.

kuramochi_sig <- rownames(kuramochi_df)[which(kuramochi_df$padj < 0.05 & abs(kuramochi_df$log2FoldChange) > 1)]
primary_sig   <- rownames(primary_df)[which(primary_df$padj < 0.05 & abs(primary_df$log2FoldChange) > 1)]
skov3_sig     <- rownames(skov3_df)[which(skov3_df$padj < 0.05 & abs(skov3_df$log2FoldChange) > 1)]

deg_list <- list(
    Kuramochi_CAF = kuramochi_sig,
    Primary_CAF   = primary_sig,
    SKOV3_CAF     = skov3_sig
)

ggvenn(deg_list,
    fill_color = c('#E69F00', '#56B4E9', '#009E73'),
    stroke_size = 0.5,
    set_name_size = 4)

cat("Kuramochi DEGs:", length(kuramochi_sig), "\n")
## Kuramochi DEGs: 2287
cat("Primary CAF DEGs:", length(primary_sig), "\n")
## Primary CAF DEGs: 4073
cat("SKOV3 CAF DEGs:", length(skov3_sig), "\n")
## SKOV3 CAF DEGs: 2460
core_caf_genes <- kuramochi_sig
for (sig in list(primary_sig, skov3_sig)) {
    core_caf_genes <- core_caf_genes[core_caf_genes %in% sig]
}
cat("Core CAF genes (all three):", length(core_caf_genes), "\n")
## Core CAF genes (all three): 864

864 genes are differentially expressed in all three CAF types relative to HUF, representing approximately 15% of total DEGs. These constitute the core CAF transcriptional signature wherein genes whose dysregulation is a conserved feature of CAF identity regardless of the reprogramming stimulus or whether the CAF is tumor-derived or in vitro conditioned. Primary CAF has nearly twice as many DEGs as either conditioned type, consistent with the more extensive reprogramming achievable by the full tumor microenvironment.

Core CAF Heatmap

# Score and rank core CAF genes
core_caf_df <- kuramochi_df[core_caf_genes, c("log2FoldChange", "padj", "symbol")]
min_padj <- min(core_caf_df$padj[core_caf_df$padj > 0], na.rm = TRUE)
core_caf_df$padj_adj <- ifelse(core_caf_df$padj == 0, min_padj, core_caf_df$padj)
core_caf_df$score <- -log10(core_caf_df$padj_adj) * abs(core_caf_df$log2FoldChange)
core_caf_df <- core_caf_df[order(core_caf_df$score, decreasing = TRUE), ]

# Top 50 for heatmap
top50_genes <- rownames(core_caf_df)[!is.na(core_caf_df$symbol)][1:50]
vsd_matrix <- assay(vsd)
heatmap_matrix <- vsd_matrix[top50_genes, ]
rownames(heatmap_matrix) <- core_caf_df[top50_genes, "symbol"]
heatmap_matrix_scaled <- t(scale(t(heatmap_matrix)))

pheatmap(heatmap_matrix_scaled,
    annotation_col = data.frame(condition = metadata$condition,
                                row.names = colnames(heatmap_matrix)),
    cluster_rows = TRUE,
    cluster_cols = TRUE,
    show_colnames = FALSE,
    fontsize_row = 8,
    color = colorRampPalette(c("steelblue", "white", "firebrick"))(100),
    main = "Core CAF Signature — Top 50 Genes")

The heatmap reveals two biologically coherent gene clusters. The lower cluster contains genes upregulated in Kuramochi and SKOV3 conditioned CAFs relative to HUF (including CCDC80, PLIN2, LMOD1, VTN, KRT7, and ISLR), representing activated ECM organization and contractility programs. The upper cluster contains genes consistently downregulated in conditioned CAFs relative to HUF (including MMP3, DKK1, PLAU, IL33, CD44, and SPP1), reflecting loss of the normal fibroblast secretory and signaling profile. HUF samples cluster completely separately from all CAF types, confirming that CAF reprogramming drives a fundamental and consistent transcriptional shift away from the normal fibroblast state. What is particularly striking is the divergent pattern of Primary CAF compared to both conditioned CAF types. Several genes that are strongly suppressed in Kuramochi and SKOV3 CAFs, including MMP1, CCND1, NTM, SAA1, COL11A1 and ABLIM3, remain expressed at high levels in Primary CAF. At the same time, ECM and metabolic reprogramming genes that are strongly activated in conditioned CAFs (such as PLIN2, LMOD1 and CCDC80) show near-neutral or negative values in Primary CAF. This pattern reflects a fundamental difference between in vitro conditioned media reprogramming and genuine tumor-derived CAF identity, and foreshadows the resister analysis findings presented below. In the column dendrogram, Kuramochi and SKOV3 conditioned CAFs cluster most closely together while Primary CAF forms a separate intermediate branch, suggesting the two conditioned CAF models are more transcriptionally similar to each other than either is to Primary CAF. Several of these core signature genes (DKK1, MMP3, VTN, IGFBP3, SPP1) are explicitly highlighted in the original publication (Axemaker et al., 2024), providing independent validation of this analysis.


Novel Analysis: Resister Genes

Rationale

Most CAF research asks what tumors turn on in fibroblasts. This analysis flips that question, asking what changes Primary CAF undergoes that conditioned media models cannot replicate. If the same transcriptional changes are missing across two biologically different reprogramming stimuli (Kuramochi and SKOV3), those programs may be fundamentally resistant to soluble factor-driven reprogramming — potentially reflecting epigenetically protected fibroblast identity.

Definition

Resister genes are defined as:

  • Significantly changed in Primary CAF vs HUF (padj < 0.05, |LFC| > 1) i.e. what full CAF conversion alters
  • NOT significantly changed in either Kuramochi OR SKOV3 conditioned CAF vs HUF i.e. what conditioned media fails to recapitulate

This captures both directions: genes that primary CAFs turn on and conditioned CAFs cannot, and genes that primary CAFs turn off and conditioned CAFs cannot silence.

Identification

primary_up      <- rownames(primary_df)[which(primary_df$padj < 0.05 & primary_df$log2FoldChange > 1)]
primary_down    <- rownames(primary_df)[which(primary_df$padj < 0.05 & primary_df$log2FoldChange < -1)]
primary_changed <- c(primary_up, primary_down)

kuramochi_changed <- rownames(kuramochi_df)[which(kuramochi_df$padj < 0.05 & abs(kuramochi_df$log2FoldChange) > 1)]
skov3_changed     <- rownames(skov3_df)[which(skov3_df$padj < 0.05 & abs(skov3_df$log2FoldChange) > 1)]

resisters <- primary_changed[
    !primary_changed %in% kuramochi_changed &
    !primary_changed %in% skov3_changed
]

cat("Primary CAF total DEGs:", length(primary_changed), "\n")
## Primary CAF total DEGs: 4073
cat("Resister genes:", length(resisters), "\n")
## Resister genes: 2517
cat("Percentage of Primary DEGs that resisted:", round(length(resisters)/length(primary_changed)*100, 1), "%\n")
## Percentage of Primary DEGs that resisted: 61.8 %

2517 resister genes were identified — representing 61.8% of Primary CAF DEGs that conditioned media models completely failed to recapitulate. This is a substantial proportion, suggesting that conditioned media captures the broad CAF activation signal but misses a meaningful and specific layer of transcriptional reprogramming that only occurs in the context of the real tumor microenvironment.

Top Resister Genes

resisters_df <- primary_df[resisters, c("log2FoldChange", "padj", "symbol")]
resisters_df <- resisters_df[order(abs(resisters_df$log2FoldChange), decreasing = TRUE), ]

resisters_up   <- resisters_df[resisters_df$log2FoldChange > 0, ]
resisters_down <- resisters_df[resisters_df$log2FoldChange < 0, ]

top_up   <- head(resisters_up[order(resisters_up$log2FoldChange, decreasing = TRUE), ], 10)
top_down <- head(resisters_down[order(resisters_down$log2FoldChange), ], 10)

top_up$direction   <- "Upregulated"
top_down$direction <- "Downregulated"

top_resisters_table <- rbind(top_up, top_down)
top_resisters_table$log2FoldChange <- round(top_resisters_table$log2FoldChange, 2)
top_resisters_table$padj <- formatC(top_resisters_table$padj, format = "e", digits = 2)

top_resisters_table[, c("symbol", "log2FoldChange", "padj", "direction")]
##                  symbol log2FoldChange     padj     direction
## ENSG00000174697     LEP          14.86 4.20e-35   Upregulated
## ENSG00000237515  SHISA9          12.41 1.36e-24   Upregulated
## ENSG00000131668   BARX1          12.29 1.64e-47   Upregulated
## ENSG00000154856  APCDD1          10.86 8.76e-19   Upregulated
## ENSG00000109906  ZBTB16          10.04 2.12e-21   Upregulated
## ENSG00000157542   KCNJ6          10.03 7.16e-16   Upregulated
## ENSG00000176887   SOX11           9.56 2.57e-14   Upregulated
## ENSG00000185652    NTF3           9.41 3.23e-14   Upregulated
## ENSG00000157782   CABP1           9.40 3.56e-14   Upregulated
## ENSG00000004948   CALCR           9.34 1.47e-13   Upregulated
## ENSG00000180818  HOXC10         -10.62 2.68e-18 Downregulated
## ENSG00000125872   LRRN4          -9.55 8.03e-15 Downregulated
## ENSG00000187955 COL14A1          -8.57 9.78e-66 Downregulated
## ENSG00000274512  TBC1D3          -8.55 5.08e-12 Downregulated
## ENSG00000078399   HOXA9          -8.49 1.06e-15 Downregulated
## ENSG00000294153    <NA>          -8.19 5.71e-11 Downregulated
## ENSG00000123388  HOXC11          -8.02 1.58e-10 Downregulated
## ENSG00000127241   MASP1          -7.99 2.72e-46 Downregulated
## ENSG00000287171    <NA>          -7.97 1.64e-10 Downregulated
## ENSG00000228630  HOTAIR          -7.95 2.02e-10 Downregulated

The top resister genes reveal a striking pattern. Upregulated resisters include LEP (leptin), BARX1, GATA3, SOX11, ZBTB16 and NTF3, master transcription factors for tissue and developmental identity. Downregulated resisters are dominated by HOX family genes (HOXC10, HOXA9, HOXC11, HOXD11, HOXD10) alongside HOTAIR, a long non-coding RNA that directly regulates HOX gene chromatin states. GATA4 and LHX9 further reinforce the developmental transcription factor theme.

The consistent appearance of HOX genes and developmental regulators as resisters is biologically meaningful: these genes encode the positional identity of uterine fibroblasts, established during embryonic development through Polycomb/Trithorax-mediated chromatin remodeling. Their resistance to conditioned media reprogramming suggests that tumor-secreted soluble factors, while sufficient to activate ECM and inflammatory programs, cannot overwrite the epigenetically locked developmental identity of fibroblasts without the sustained epigenetic pressure of the full tumor microenvironment, including direct cell contact, hypoxia, and mechanical stress over extended time periods.


Resister Gene Pathway Enrichment

GO Biological Process Enrichment

resisters_entrez <- bitr(resisters,
                         fromType = "ENSEMBL",
                         toType = "ENTREZID",
                         OrgDb = org.Hs.eg.db)

go_resisters <- enrichGO(gene = resisters_entrez$ENTREZID,
                         OrgDb = org.Hs.eg.db,
                         ont = "BP",
                         pAdjustMethod = "BH",
                         pvalueCutoff = 0.05,
                         qvalueCutoff = 0.05,
                         readable = TRUE)

cat("Significant GO terms:", nrow(go_resisters@result[go_resisters@result$p.adjust < 0.05, ]), "\n")
## Significant GO terms: 617

Enrichment Bar Plot

go_results <- go_resisters@result[1:20, ]

# Shorten long pathway name
go_results$Description <- gsub(
    "positive regulation of phosphatidylinositol 3-kinase/protein kinase B signal transduction",
    "PI3K/AKT signaling",
    go_results$Description)

go_results$Description <- factor(go_results$Description,
                                  levels = rev(go_results$Description))

ggplot(go_results, aes(x = Count, y = Description, fill = p.adjust)) +
    geom_bar(stat = "identity") +
    scale_fill_gradient(low = "firebrick", high = "steelblue") +
    theme_bw() +
    labs(title = "GO Enrichment — Resister Genes",
         x = "Gene Count",
         y = NULL,
         fill = "Adjusted p-value")

The GO enrichment results confirm and extend the top gene findings. The most significantly enriched pathways are developmental and positional identity programs — renal system development, kidney development, regionalization, pattern specification process, and anterior/posterior pattern formation. These are precisely the pathways governed by HOX genes and their regulatory networks.

Critically, these are not ECM or inflammatory pathways as those are the programs conditioned media successfully reprograms. The resisters are enriched for a completely orthogonal set of biological processes: the embryonic developmental programs that define fibroblast positional identity. Musculoskeletal and cardiac development terms likely reflect the shared HOX-regulatory network between mesenchymal lineages.

This pattern might suggest a mechanistic explanation: tumor-secreted factors operate through receptor-mediated signaling cascades that can rapidly remodel signal-responsive, chromatin-accessible gene programs. But HOX codes and positional identity are maintained by Polycomb repressive complexes and TAD-level 3D chromatin architecture that cannot be rewritten by extracellular signals alone as they require sustained epigenetic remodeling machinery delivered through direct cellular interaction, the kind only available in the real tumor microenvironment.


Summary and Conclusions

This analysis performed an independent differential expression analysis of GSE280564 and introduced a novel analytical framework i.e. resister gene analysis to characterize the limits of in vitro CAF reprogramming models.

Key findings:

  1. Core CAF signature (864 genes): A conserved transcriptional program is shared across all three CAF types, including upregulation of ECM remodeling genes (CCDC80, VTN, ISLR, LMOD1) and downregulation of normal fibroblast programs (MMP1, DKK1, PLAU, CD44). This independently replicates the findings of Axemaker et al. (2024).

  2. Primary CAF is transcriptionally distinct: Primary tumor-derived CAFs have nearly twice as many DEGs as conditioned CAFs (~10,000 vs ~7,800), with larger fold change magnitudes, confirming that the full tumor microenvironment drives more extensive reprogramming than soluble factors alone.

  3. 2,517 resister genes: A substantial proportion of Primary CAF’s transcriptional changes are not recapitulated by either Kuramochi or SKOV3 conditioned media. These resisters are enriched for developmental identity programs like HOX genes, positional patterning, embryonic morphogenesis, suggesting that the epigenetically locked developmental identity of fibroblasts is resistant to soluble factor-mediated reprogramming.

  4. Therapeutic implication: If HOX-driven positional identity represents an epigenetically protected program that tumor signals cannot overwrite, it opens two potential avenues. First, HOX expression patterns may be stable and consistent enough across patient samples to serve as molecular biomarkers that distinguish CAF subtypes or stratify patients by stromal composition. Second, the epigenetic machinery that locks these programs in place, particularly Polycomb repressive complexes such as PRC2, represents a tractable therapeutic target. Drugs that disrupt PRC2-mediated gene silencing, such as EZH2 inhibitors, could potentially unlock these developmental programs and push CAFs toward a less activated state, offering a stromal-targeting complement to existing cancer therapies.


Session Info

sessionInfo()
## R version 4.4.1 (2024-06-14)
## Platform: aarch64-apple-darwin20
## Running under: macOS 15.7.4
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: America/New_York
## tzcode source: internal
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] ggvenn_0.1.19               clusterProfiler_4.14.6     
##  [3] org.Hs.eg.db_3.20.0         AnnotationDbi_1.68.0       
##  [5] EnhancedVolcano_1.24.0      ggrepel_0.9.6              
##  [7] pheatmap_1.0.13             lubridate_1.9.4            
##  [9] forcats_1.0.1               stringr_1.5.2              
## [11] dplyr_1.1.4                 purrr_1.1.0                
## [13] readr_2.1.5                 tidyr_1.3.1                
## [15] tibble_3.3.0                ggplot2_4.0.0              
## [17] tidyverse_2.0.0             DESeq2_1.46.0              
## [19] SummarizedExperiment_1.36.0 Biobase_2.66.0             
## [21] MatrixGenerics_1.18.1       matrixStats_1.5.0          
## [23] GenomicRanges_1.58.0        GenomeInfoDb_1.42.3        
## [25] IRanges_2.40.1              S4Vectors_0.44.0           
## [27] BiocGenerics_0.52.0        
## 
## loaded via a namespace (and not attached):
##  [1] DBI_1.2.3               gson_0.1.0              rlang_1.1.6            
##  [4] magrittr_2.0.4          DOSE_4.0.1              compiler_4.4.1         
##  [7] RSQLite_2.4.3           reshape2_1.4.4          png_0.1-8              
## [10] vctrs_0.6.5             pkgconfig_2.0.3         crayon_1.5.3           
## [13] fastmap_1.2.0           XVector_0.46.0          labeling_0.4.3         
## [16] rmarkdown_2.30          enrichplot_1.26.6       tzdb_0.5.0             
## [19] UCSC.utils_1.2.0        bit_4.6.0               xfun_0.53              
## [22] zlibbioc_1.52.0         cachem_1.1.0            aplot_0.2.9            
## [25] jsonlite_2.0.0          blob_1.2.4              DelayedArray_0.32.0    
## [28] BiocParallel_1.40.2     parallel_4.4.1          R6_2.6.1               
## [31] bslib_0.9.0             stringi_1.8.7           RColorBrewer_1.1-3     
## [34] jquerylib_0.1.4         GOSemSim_2.32.0         Rcpp_1.1.0             
## [37] knitr_1.50              ggtangle_0.1.1          R.utils_2.13.0         
## [40] igraph_2.2.0            splines_4.4.1           Matrix_1.7-4           
## [43] timechange_0.3.0        tidyselect_1.2.1        qvalue_2.38.0          
## [46] rstudioapi_0.17.1       abind_1.4-8             yaml_2.3.10            
## [49] codetools_0.2-20        plyr_1.8.9              lattice_0.22-7         
## [52] treeio_1.30.0           withr_3.0.2             KEGGREST_1.46.0        
## [55] S7_0.2.0                evaluate_1.0.5          gridGraphics_0.5-1     
## [58] Biostrings_2.74.1       ggtree_3.14.0           pillar_1.11.1          
## [61] ggfun_0.2.0             generics_0.1.4          hms_1.1.4              
## [64] tidytree_0.4.7          scales_1.4.0            glue_1.8.0             
## [67] lazyeval_0.2.2          tools_4.4.1             data.table_1.17.8      
## [70] fgsea_1.32.4            locfit_1.5-9.12         fs_1.6.6               
## [73] fastmatch_1.1-8         cowplot_1.2.0           grid_4.4.1             
## [76] ape_5.8-1               colorspace_2.1-2        nlme_3.1-168           
## [79] patchwork_1.3.2         GenomeInfoDbData_1.2.13 cli_3.6.5              
## [82] rappdirs_0.3.3          S4Arrays_1.6.0          gtable_0.3.6           
## [85] R.methodsS3_1.8.2       yulab.utils_0.2.4       sass_0.4.10            
## [88] digest_0.6.37           ggplotify_0.1.3         SparseArray_1.6.2      
## [91] farver_2.1.2            memoise_2.0.1           htmltools_0.5.8.1      
## [94] R.oo_1.27.1             lifecycle_1.0.4         httr_1.4.7             
## [97] GO.db_3.20.0            bit64_4.6.0-1

References

  • Axemaker H, Plesselova S, Calar K, et al. Reprogramming of normal fibroblasts into ovarian cancer-associated fibroblasts via non-vesicular paracrine signaling induces an activated fibroblast phenotype. BBA - Molecular Cell Research. 2024;1871:119801.
  • Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology. 2014;15:550.
  • Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16(5):284-287.
  • Schuettengruber B, Bourbon HM, Di Croce L, Cavalli G. Genome regulation by Polycomb and Trithorax: 70 years and counting. Cell. 2017;171(1):34-57.
  • Duan R, Du W, Guo W. EZH2: a novel target for cancer treatment. Journal of Hematology & Oncology. 2020;13:104.