Psoriasis is a chronic inflammatory skin disease affecting approximately 2-3% of the world’s population. Its pathogenesis involves complex immune dysregulation, keratinocyte hyperproliferation, and chronic inflammation. While the inflammatory transcriptional signature of psoriasis is well established in the literature, the role of oxidative stress — defined as an imbalance between reactive oxygen species (ROS) production and antioxidant defense capacity — remains an important and often overlooked contributor to disease pathogenesis (Pleńkowska et al., 2020).
This analysis uses a large publicly available bulk RNA-seq dataset (GSE54456; Li et al., 2014) to perform differential expression analysis between lesional psoriatic and normal healthy skin. Beyond characterizing the global transcriptional landscape, this project specifically investigates the oxidative stress gene expression signature in psoriatic lesions — asking whether antioxidant and ROS-related genes show coordinated dysregulation that is distinct from the dominant inflammatory signal.
Dataset: GSE54456 — 90 lesional psoriatic skin biopsies vs 81 normal skin biopsies (171 samples total after QC filtering)
Raw integer counts were obtained from NCBI’s standardized recount pipeline applied to the SRA data deposited alongside GSE54456. Sample metadata was downloaded from the SRA Run Selector.
## [1] 39376 171
##
## lesional psoriatic skin normal skin
## 90 81
The count matrix contains 39376 genes across 171 samples. Five samples had multiple SRR runs mapped to the same GSM ID and were deduplicated. Three additional samples present in the metadata were absent from the raw count matrix, likely excluded during NCBI’s QC pipeline. The final dataset consists of 90 lesional psoriatic and 81 normal skin samples — a well-balanced case-control design.
Differential expression was performed using DESeq2, which models count data with a negative binomial distribution and applies empirical Bayes shrinkage to stabilize dispersion estimates across genes. Normal skin was set as the reference level so that positive log2 fold changes represent upregulation in psoriatic lesions.
##
## out of 37065 with nonzero total read count
## adjusted p-value < 0.05
## LFC > 1.00 (up) : 1290, 3.5%
## LFC < -1.00 (down) : 1419, 3.8%
## outliers [1] : 0, 0%
## low counts [2] : 5730, 15%
## (mean count < 0)
## [1] see 'cooksCutoff' argument of ?results
## [2] see 'independentFiltering' argument of ?results
## symbol log2FoldChange padj
## 6703 SPRR2D 5.881270 0
## 6700 SPRR2A 7.264776 0
## 6701 SPRR2B 6.336754 0
## 6283 S100A12 8.305059 0
## 56300 IL36G 5.649347 0
## 8942 KYNU 4.379899 0
Applying thresholds of padj < 0.05 and |log2FC| > 1 yields 1,290 upregulated and 1,419 downregulated genes in psoriatic skin. The top differentially expressed genes include well-established psoriasis markers — SPRR family proteins involved in skin barrier remodeling, S100A12 as a damage-associated inflammatory signal, and IL36G as a key driver of psoriatic inflammation — validating the analysis.
To further characterize the transcriptional landscape and assess data quality, a series of exploratory visualizations were performed.
PC1 explains 55% of total variance and cleanly separates psoriatic from normal samples, confirming that disease status is the dominant source of transcriptional variation in this dataset. PC2 captures 9% of variance and likely reflects residual donor-to-donor variability, which is expected in a human cohort study. The clean separation on PC1 gives strong confidence that the differential expression results reflect true biological signal rather than technical noise.
The dispersion plot shows the expected pattern for a well-modeled RNA-seq dataset — dispersion is highest for lowly expressed genes and decreases as mean expression increases. The fitted trend line runs smoothly through the cloud of per-gene estimates and the final shrunken estimates cluster tightly around it, indicating DESeq2’s empirical Bayes shrinkage is working correctly. Zero outlier samples were detected, confirming data quality.
annotation_col <- data.frame(Condition = metadata$tissue_type)
rownames(annotation_col) <- colnames(assay(vsd))
pheatmap(as.matrix(dist(t(assay(vsd)))),
annotation_col = annotation_col,
show_rownames = FALSE,
show_colnames = FALSE,
main = "Sample Distances")The sample distance heatmap confirms the PCA result — two distinct blocks are visible corresponding to psoriatic and normal samples, with low within-group distances and high between-group distances. A small number of psoriatic samples show slightly higher distances from the main psoriatic cluster, reflecting mild inter-patient variability in disease severity. No samples were identified as outliers requiring exclusion.
The volcano plot reveals an asymmetric distribution of differentially expressed genes. Upregulated genes in psoriatic skin show both greater fold change magnitudes and higher statistical significance compared to downregulated genes, suggesting a more pronounced transcriptional activation response. This pattern is consistent with the active inflammatory, proliferative, and stress response programs known to be induced in psoriatic lesions.
plot_gene_heatmap(vsd, res_df,
gene_ids = rownames(top50sig),
title = "Top 50 DEGs",
metadata = metadata,
cluster_rows = TRUE,
fontsize = 7)The top 50 DEGs heatmap shows near-perfect separation of psoriatic and normal samples by hierarchical clustering. The gene clustering reveals biologically coherent modules — SPRR family proteins reflect skin barrier disruption, S100 and interleukin genes represent the inflammatory signature, and DEFB antimicrobial peptides form a distinct group. The dominant pattern is upregulation in psoriatic skin, consistent with the volcano plot.
Having established confidence in data quality and the overall DE landscape, the analysis now turns to the specific biological question of interest: the oxidative stress transcriptional signature in psoriatic skin.
Oxidative stress — driven by excess reactive oxygen species (ROS) and insufficient antioxidant capacity — has been increasingly recognized as an important contributor to psoriasis pathogenesis (Pleńkowska et al., 2020). ROS overproduction in psoriatic skin activates NF-κB and MAPK signaling pathways, stimulates Th1 and Th17 immune responses, and drives keratinocyte hyperproliferation. Markers of oxidative stress including malondialdehyde (MDA) and nitric oxide (NO) correlate with disease severity (Gabr & Al-Ghadir, 2012), suggesting the oxidative stress burden is functionally relevant rather than merely coincidental.
A curated panel of 34 genes was assembled covering the major arms of oxidative stress biology: ROS-generating enzymes (NADPH oxidases, NOS isoforms, MPO), antioxidant defense (SOD family, CAT, GPX family, thioredoxins, peroxiredoxins), the master Nrf2/KEAP1 stress response pathway, mitochondrial stress sensors (CYCS, SESN family), and SIRT1 which has been specifically reported as downregulated in psoriatic fibroblasts.
## Gene Base Mean Log2FC SE Adjusted P-value
## 4843 NOS2 386.36 5.596 0.192 5.381e-124
## 6648 SOD2 10449.68 1.533 0.069 2.538e-13
## 2877 GPX2 405.15 1.530 0.092 1.008e-07
## 3162 HMOX1 2051.68 1.405 0.073 2.570e-07
## 54205 CYCS 2286.27 1.259 0.068 1.145e-03
From the 34-gene panel, 5 genes reached statistical significance (padj < 0.05, |log2FC| > 1):
Critically, none of these 5 genes appeared among the top 50 most significantly differentially expressed genes overall — underscoring the importance of targeted gene panel approaches. The oxidative stress signature represents a distinct and subtle transcriptional layer beneath the dominant inflammatory response.
plot_gene_heatmap(vsd, res_df,
gene_ids = sig_ox_genes,
title = "Oxidative Stress Genes",
metadata = metadata,
cluster_rows = FALSE,
fontsize = 10)All 5 significant oxidative stress genes are consistently upregulated across psoriatic samples. NOS2 shows the strongest and most uniform upregulation. SOD2 and CYCS show more variability within the psoriatic group, suggesting heterogeneity in mitochondrial stress response that may relate to differences in disease severity across patients.
The dot plot summarizes both the magnitude and statistical significance of the 5 significant oxidative stress genes simultaneously. NOS2 stands out as the most significantly and strongly upregulated gene. The pattern is consistent with simultaneous ROS overproduction (NOS2) and compensatory antioxidant upregulation (SOD2, GPX2, HMOX1), with CYCS indicating downstream mitochondrial consequences.
To contextualize these individual gene findings within broader biological pathways, pathway enrichment analyses were performed.
ORA tests whether genes from a predefined biological pathway are overrepresented in the significant DE gene list beyond what would be expected by chance. Analysis was performed against Gene Ontology (GO) terms across all three ontologies using clusterProfiler with Benjamini-Hochberg multiple testing correction.
barplot(ora_BP, showCategory = 20,
title = "Top 20 Enriched GO Biological Processes") +
theme(axis.text.y = element_text(size = 8),
plot.title = element_text(hjust = 0.5))The BP enrichment results reflect the well-established biology of psoriasis. The top enriched processes include cell killing, humoral immune response, antimicrobial peptide response, and defense response to bacteria — representing active innate and adaptive immune activation. Keratinocyte differentiation, keratinization, and epidermis development terms reflect aberrant skin barrier biology. Chemokine-mediated signaling and leukocyte chemotaxis are consistent with immune cell recruitment into psoriatic lesions. Retinoid and terpenoid metabolic processes likely reflect altered lipid metabolism linked to oxidative stress through lipid peroxidation.
barplot(ora_MF, showCategory = 20,
title = "Top 20 Enriched GO Molecular Functions") +
theme(axis.text.y = element_text(size = 8),
plot.title = element_text(hjust = 0.5))The MF results are dominated by cytokine activity, cytokine receptor binding, chemokine activity, and immune receptor activity — consistent with the cytokine-driven inflammation in psoriasis. Peptidase inhibitor and endopeptidase regulator activity terms suggest alterations in protease regulation relevant to skin barrier dysfunction and inflammatory mediator processing.
barplot(ora_CC, showCategory = 20,
title = "Top 20 Enriched GO Cellular Components") +
theme(axis.text.y = element_text(size = 8),
plot.title = element_text(hjust = 0.5))The CC results show enrichment of the cornified envelope consistent with aberrant keratinocyte differentiation. Synaptic membrane and ion channel complex terms likely reflect neuroimmune mediators and ion channels involved in itch signaling and keratinocyte calcium homeostasis, both disrupted in psoriasis. The collagen-containing extracellular matrix term reflects dermal remodeling.
While ORA provides a clear picture of pathway enrichment among the most significantly changed genes, it is limited by its reliance on a significance cutoff. GSEA was therefore performed to capture coordinated pathway-level signals across the entire transcriptome.
GSEA was performed using the fgsea package against the MSigDB Hallmark gene set collection — 50 well-curated biological signatures. Genes were ranked by the DESeq2 Wald statistic, which incorporates both fold change magnitude and its standard error, making it more robust than ranking by fold change alone.
## Total significant pathways: 30
## Upregulated: 26
## Downregulated: 4
30 out of 50 Hallmark pathways were significantly enriched (padj < 0.05), with 26 upregulated and only 4 downregulated in psoriatic skin — consistent with the predominantly activating transcriptional response observed throughout the analysis.
sig_pathways_split <- rbind(sig_up, sig_down)
sig_pathways_split$direction <- ifelse(sig_pathways_split$NES > 0, "Upregulated", "Downregulated")
ggplot(sig_pathways_split, aes(x = NES, y = reorder(pathway, NES), fill = direction)) +
geom_bar(stat = "identity") +
scale_fill_manual(values = c("Upregulated" = "firebrick", "Downregulated" = "steelblue")) +
labs(title = "Significant Hallmark Pathways in Psoriatic Skin",
x = "Normalized Enrichment Score",
y = "",
fill = "Direction") +
theme_bw() +
theme(axis.text.y = element_text(size = 7),
plot.title = element_text(hjust = 0.5))The top upregulated pathways — E2F targets, MYC targets, and G2M checkpoint — reflect keratinocyte hyperproliferation. Interferon gamma and alpha response pathways confirm the well-established role of interferons in psoriasis autoimmunity. IL6/JAK/STAT3 and TNFα/NF-κB signaling confirm the cytokine-driven inflammatory environment. mTORC1 signaling enrichment is particularly noteworthy given its reported hyperactivation in psoriatic skin and its mechanistic link to oxidative stress through sestrin regulation (Pleńkowska et al., 2020). The REACTIVE_OXYGEN_SPECIES_PATHWAY among significant upregulated pathways (NES = 1.54, padj = 0.025) provides direct pathway-level support for the oxidative stress hypothesis of this project.
The 4 downregulated pathways tell an equally meaningful story: MYOGENESIS downregulation reflects loss of normal differentiation programs; UV_RESPONSE_DN suggests overlap between UV stress and psoriatic gene programs; BILE_ACID_METABOLISM reflects broader metabolic dysregulation consistent with lipid peroxidation in psoriasis; and EPITHELIAL_MESENCHYMAL_TRANSITION downregulation may indicate psoriatic keratinocytes are locked in a hyperproliferative rather than differentiating state.
The HALLMARK_REACTIVE_OXYGEN_SPECIES_PATHWAY was the most directly relevant GSEA result to the central question of this project. To characterize which specific genes are driving this enrichment, the GSEA leading edge was examined.
plotEnrichment(hallmark_list[["HALLMARK_REACTIVE_OXYGEN_SPECIES_PATHWAY"]],
ranked_genes) +
labs(title = "GSEA: Hallmark Reactive Oxygen Species Pathway",
x = "Rank",
y = "Enrichment Score") +
theme_bw() +
theme(plot.title = element_text(hjust = 0.5))The enrichment plot shows a steep early climb in the enrichment score, indicating that ROS pathway genes are concentrated toward the top of the ranked gene list and are systematically upregulated in psoriatic skin. The curve peaks around rank 8,000-9,000 before gradually declining, producing a positive NES of 1.54. The dense clustering of tick marks on the left side confirms the non-random upregulated distribution of ROS pathway genes. The leading edge — genes to the left of the peak — represents those most responsible for driving the enrichment signal.
## Gene Base Mean Log2FC SE Adjusted P-value
## 6648 SOD2 10449.68 1.533 0.069 2.538e-13
## 22904 SBNO2 2430.26 1.479 0.090 9.624e-07
## 3726 JUNB 2986.17 1.401 0.106 1.240e-03
## 4363 ABCC1 2347.24 1.205 0.053 7.820e-04
## 140809 SRXN1 1091.40 1.204 0.067 1.456e-02
## 3163 HMOX2 1497.41 0.974 0.055 1.000e+00
## 475 ATOX1 960.46 0.854 0.057 1.000e+00
## 2539 G6PD 2288.68 0.749 0.082 1.000e+00
## 4353 MPO 6.05 0.724 0.167 1.000e+00
## 7296 TXNRD1 1374.01 0.721 0.045 1.000e+00
## 26034 IPCEF1 84.11 0.602 0.071 1.000e+00
## 7295 TXN 4140.75 0.570 0.061 1.000e+00
## 5052 PRDX1 5749.05 0.501 0.039 1.000e+00
## 4710 NDUFB4 1827.32 0.481 0.048 1.000e+00
## 9943 OXSR1 1318.98 0.449 0.040 1.000e+00
## 5214 PFKP 2264.37 0.380 0.063 1.000e+00
## 51022 GLRX2 110.51 0.362 0.067 1.000e+00
## 4700 NDUFA6 1693.53 0.330 0.052 1.000e+00
## 10542 LAMTOR5 1053.72 0.320 0.037 1.000e+00
## 9124 PDLIM1 3972.51 0.254 0.049 1.000e+00
## 4720 NDUFS2 2305.53 0.171 0.037 1.000e+00
## 9588 PRDX6 3535.55 0.171 0.037 1.000e+00
24 leading edge genes were identified. These fall into functionally coherent groups:
ROS-generating machinery: MPO reflects neutrophil-derived ROS production consistent with immune infiltration in psoriatic lesions. Mitochondrial complex I subunits NDUFB4, NDUFA6, and NDUFS2 represent upregulation of oxidative phosphorylation components — a primary intracellular source of superoxide.
Antioxidant defense: SOD2, TXNRD1, TXN, PRDX1, PRDX6, GLRX2, and SRXN1 collectively represent coordinated upregulation of the thioredoxin and peroxiredoxin antioxidant systems. G6PD upregulation ensures NADPH supply for antioxidant enzyme recycling. ATOX1 provides copper chaperone protection against metal-induced oxidative damage.
Stress response signaling: JUNB is a stress-responsive AP-1 transcription factor. OXSR1 is a kinase activated directly by oxidative stress. SBNO2 and LAMTOR5 represent broader stress and metabolic signaling responses.
The simultaneous upregulation of both ROS-generating (MPO, mitochondrial complex I) and antioxidant defense genes (thioredoxins, peroxiredoxins, SOD2) indicates active oxidative imbalance in psoriatic skin — the antioxidant system is being induced as a compensatory response but is operating against a sustained and elevated ROS burden.
plot_gene_heatmap(vsd, res_df,
gene_ids = leading_edge_ids,
title = "ROS Pathway Leading Edge Genes",
metadata = metadata,
cluster_rows = TRUE,
fontsize = 8)The leading edge heatmap confirms consistent upregulation of ROS pathway genes across psoriatic samples. Hierarchical clustering reveals two broad expression modules — a top cluster dominated by MPO, stress response genes, and metabolic regulators with stronger and more uniform upregulation, and a bottom cluster enriched for mitochondrial and thioredoxin system genes showing more modest but consistent upregulation.
## R version 4.4.1 (2024-06-14)
## Platform: aarch64-apple-darwin20
## Running under: macOS 15.7.4
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: America/New_York
## tzcode source: internal
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] msigdbr_25.1.1 fgsea_1.32.4
## [3] clusterProfiler_4.14.6 pheatmap_1.0.13
## [5] EnhancedVolcano_1.24.0 ggrepel_0.9.6
## [7] org.Hs.eg.db_3.20.0 AnnotationDbi_1.68.0
## [9] DESeq2_1.46.0 SummarizedExperiment_1.36.0
## [11] Biobase_2.66.0 MatrixGenerics_1.18.1
## [13] matrixStats_1.5.0 GenomicRanges_1.58.0
## [15] GenomeInfoDb_1.42.3 IRanges_2.40.1
## [17] S4Vectors_0.44.0 BiocGenerics_0.52.0
## [19] lubridate_1.9.4 forcats_1.0.1
## [21] stringr_1.5.2 dplyr_1.1.4
## [23] purrr_1.1.0 readr_2.1.5
## [25] tidyr_1.3.1 tibble_3.3.0
## [27] ggplot2_4.0.0 tidyverse_2.0.0
##
## loaded via a namespace (and not attached):
## [1] RColorBrewer_1.1-3 rstudioapi_0.17.1 jsonlite_2.0.0
## [4] magrittr_2.0.4 ggtangle_0.1.1 farver_2.1.2
## [7] rmarkdown_2.30 fs_1.6.6 zlibbioc_1.52.0
## [10] ragg_1.5.0 vctrs_0.6.5 memoise_2.0.1
## [13] ggtree_3.14.0 htmltools_0.5.8.1 S4Arrays_1.6.0
## [16] curl_7.0.0 SparseArray_1.6.2 gridGraphics_0.5-1
## [19] sass_0.4.10 bslib_0.9.0 plyr_1.8.9
## [22] cachem_1.1.0 igraph_2.2.0 lifecycle_1.0.4
## [25] pkgconfig_2.0.3 Matrix_1.7-4 R6_2.6.1
## [28] fastmap_1.2.0 gson_0.1.0 GenomeInfoDbData_1.2.13
## [31] digest_0.6.37 aplot_0.2.9 enrichplot_1.26.6
## [34] colorspace_2.1-2 patchwork_1.3.2 textshaping_1.0.4
## [37] RSQLite_2.4.3 labeling_0.4.3 timechange_0.3.0
## [40] httr_1.4.7 abind_1.4-8 compiler_4.4.1
## [43] bit64_4.6.0-1 withr_3.0.2 S7_0.2.0
## [46] BiocParallel_1.40.2 DBI_1.2.3 R.utils_2.13.0
## [49] rappdirs_0.3.3 DelayedArray_0.32.0 tools_4.4.1
## [52] ape_5.8-1 R.oo_1.27.1 glue_1.8.0
## [55] nlme_3.1-168 GOSemSim_2.32.0 grid_4.4.1
## [58] reshape2_1.4.4 generics_0.1.4 gtable_0.3.6
## [61] tzdb_0.5.0 R.methodsS3_1.8.2 data.table_1.17.8
## [64] hms_1.1.4 XVector_0.46.0 pillar_1.11.1
## [67] babelgene_22.9 yulab.utils_0.2.4 splines_4.4.1
## [70] treeio_1.30.0 lattice_0.22-7 bit_4.6.0
## [73] tidyselect_1.2.1 GO.db_3.20.0 locfit_1.5-9.12
## [76] Biostrings_2.74.1 knitr_1.50 xfun_0.53
## [79] stringi_1.8.7 UCSC.utils_1.2.0 lazyeval_0.2.2
## [82] ggfun_0.2.0 yaml_2.3.10 evaluate_1.0.5
## [85] codetools_0.2-20 qvalue_2.38.0 ggplotify_0.1.3
## [88] cli_3.6.5 systemfonts_1.3.1 jquerylib_0.1.4
## [91] Rcpp_1.1.0 png_0.1-8 parallel_4.4.1
## [94] assertthat_0.2.1 blob_1.2.4 DOSE_4.0.1
## [97] tidytree_0.4.7 scales_1.4.0 crayon_1.5.3
## [100] rlang_1.1.6 cowplot_1.2.0 fastmatch_1.1-8
## [103] KEGGREST_1.46.0