R Markdown

================================================================

1. Load Libraries Needed for Analysis

================================================================


================================================================

2. Acquire Dataset & Prepare for DE Analysis

================================================================

## 
##                                                
##                                             14 
##   smoking status: COPD, GOLD-I, 119 pack-years 
##                                              1 
##    smoking status: COPD, GOLD-I, 14 pack-years 
##                                              1 
##    smoking status: COPD, GOLD-I, 22 pack-years 
##                                              1 
##    smoking status: COPD, GOLD-I, 23 pack-years 
##                                              1 
##    smoking status: COPD, GOLD-I, 24 pack-years 
##                                              1 
##    smoking status: COPD, GOLD-I, 26 pack-years 
##                                              1 
##  smoking status: COPD, GOLD-I, 32.5 pack-years 
##                                              1 
##    smoking status: COPD, GOLD-I, 48 pack-years 
##                                              1 
##    smoking status: COPD, GOLD-I, 50 pack-years 
##                                              1 
##   smoking status: COPD, GOLD-II, 15 pack-years 
##                                              1 
##   smoking status: COPD, GOLD-II, 20 pack-years 
##                                              1 
##   smoking status: COPD, GOLD-II, 24 pack-years 
##                                              1 
##   smoking status: COPD, GOLD-II, 27 pack-years 
##                                              1 
## smoking status: COPD, GOLD-II, 27.5 pack-years 
##                                              1 
##   smoking status: COPD, GOLD-II, 29 pack-years 
##                                              1 
##   smoking status: COPD, GOLD-II, 33 pack-years 
##                                              1 
##   smoking status: COPD, GOLD-II, 34 pack-years 
##                                              1 
##   smoking status: COPD, GOLD-II, 35 pack-years 
##                                              1 
##   smoking status: COPD, GOLD-II, 60 pack-years 
##                                              1 
##   smoking status: COPD, GOLD-II, 75 pack-years 
##                                              1 
## smoking status: COPD, GOLD-III, 110 pack-years 
##                                              1 
##  smoking status: COPD, GOLD-III, 53 pack-years 
##                                              1 
##                     smoking status: non-smoker 
##                                             63 
##          smoking status: smoker,  1 pack-years 
##                                              1 
##         smoking status: smoker,  16 pack-years 
##                                              1 
##         smoking status: smoker,  23 pack-years 
##                                              1 
##         smoking status: smoker, 0.5 pack-years 
##                                              1 
##          smoking status: smoker, 10 pack-years 
##                                              1 
##        smoking status: smoker, 10.5 pack-years 
##                                              1 
##          smoking status: smoker, 11 pack-years 
##                                              2 
##          smoking status: smoker, 12 pack-years 
##                                              1 
##          smoking status: smoker, 13 pack-years 
##                                              1 
##          smoking status: smoker, 14 pack-years 
##                                              1 
##          smoking status: smoker, 15 pack-years 
##                                              2 
##          smoking status: smoker, 16 pack-years 
##                                              1 
##        smoking status: smoker, 17.5 pack-years 
##                                              1 
##          smoking status: smoker, 18 pack-years 
##                                              1 
##          smoking status: smoker, 19 pack-years 
##                                              2 
##        smoking status: smoker, 19.5 pack-years 
##                                              1 
##          smoking status: smoker, 20 pack-years 
##                                              4 
##          smoking status: smoker, 21 pack-years 
##                                              1 
##          smoking status: smoker, 22 pack-years 
##                                              2 
##        smoking status: smoker, 22.5 pack-years 
##                                              1 
##          smoking status: smoker, 23 pack-years 
##                                              3 
##          smoking status: smoker, 24 pack-years 
##                                              3 
##          smoking status: smoker, 26 pack-years 
##                                              3 
##        smoking status: smoker, 26.5 pack-years 
##                                              1 
##          smoking status: smoker, 27 pack-years 
##                                              2 
##          smoking status: smoker, 28 pack-years 
##                                              1 
##          smoking status: smoker, 29 pack-years 
##                                              2 
##           smoking status: smoker, 3 pack-years 
##                                              1 
##         smoking status: smoker, 3.8 pack-years 
##                                              1 
##          smoking status: smoker, 30 pack-years 
##                                              2 
##          smoking status: smoker, 32 pack-years 
##                                              1 
##          smoking status: smoker, 33 pack-years 
##                                              2 
##          smoking status: smoker, 35 pack-years 
##                                              1 
##          smoking status: smoker, 36 pack-years 
##                                              1 
##          smoking status: smoker, 38 pack-years 
##                                              3 
##          smoking status: smoker, 43 pack-years 
##                                              1 
##        smoking status: smoker, 44.3 pack-years 
##                                              1 
##          smoking status: smoker, 45 pack-years 
##                                              4 
##          smoking status: smoker, 46 pack-years 
##                                              2 
##          smoking status: smoker, 47 pack-years 
##                                              2 
##           smoking status: smoker, 5 pack-years 
##                                              1 
##          smoking status: smoker, 51 pack-years 
##                                              1 
##        smoking status: smoker, 56.5 pack-years 
##                                              1 
##          smoking status: smoker, 60 pack-years 
##                                              1 
##         smoking status: smoker, 7.6 pack-years 
##                                              1 
##          smoking status: smoker, 71 pack-years 
##                                              1 
##          smoking status: smoker, 80 pack-years 
##                                              1 
##         smoking status: smoker, 9.3 pack-years 
##                                              1
## ExpressionSet (storageMode: lockedEnvironment)
## assayData: 6 features, 171 samples 
##   element names: exprs 
## protocolData: none
## phenoData
##   sampleNames: GSM101096 GSM101097 ... GSM549782 (171 total)
##   varLabels: title geo_accession ... smoking status:ch1 (45 total)
##   varMetadata: labelDescription
## featureData
##   featureNames: 1007_s_at 1053_at ... 1294_at (6 total)
##   fvarLabels: ID GB_ACC ... Gene Ontology Molecular Function (16 total)
##   fvarMetadata: Column Description labelDescription
## experimentData: use 'experimentData(object)'
##   pubMedIds: 21829517
## 32277131 
## Annotation: GPL570
## 
## Non.smoker     Smoker 
##         63         94
##  [1] "ID"                               "GB_ACC"                          
##  [3] "SPOT_ID"                          "Species Scientific Name"         
##  [5] "Annotation Date"                  "Sequence Type"                   
##  [7] "Sequence Source"                  "Target Description"              
##  [9] "Representative Public ID"         "Gene Title"                      
## [11] "Gene Symbol"                      "ENTREZ_GENE_ID"                  
## [13] "RefSeq Transcript ID"             "Gene Ontology Biological Process"
## [15] "Gene Ontology Cellular Component" "Gene Ontology Molecular Function"

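Here we see that our data has 63 non-smokers and 94 smokers; the smoker group includes the 22 COPD study participants. The 14 samples with no smoking-status annotation fall into neither group, which is why the two groups sum to 157 of the 171 samples.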
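The grouping above can be sketched in base R. This is only an illustration, assuming the status strings look like the ones tabulated above; the helper name `parse_smoking_status` and its output columns are hypothetical and are not taken from the original analysis code.

```r
# Hypothetical helper: split a GEO "smoking status" string into a coarse
# group label, a COPD flag, and a numeric pack-year value.
parse_smoking_status <- function(x) {
  # Drop the "smoking status:" prefix and surrounding whitespace.
  status <- trimws(sub("^smoking status:\\s*", "", x))

  # Coarse group: non-smokers vs. everyone who smokes (COPD included).
  # Samples with an empty annotation become NA and drop out of the design.
  group <- ifelse(status == "", NA_character_,
           ifelse(status == "non-smoker", "Non.smoker", "Smoker"))

  # COPD flag, kept separately so it could be used as a covariate.
  copd <- grepl("^COPD", status)

  # Pack-years: the number immediately before "pack-years", if present.
  has_py <- grepl("pack-years", status, fixed = TRUE)
  pack_years <- rep(NA_real_, length(status))
  pack_years[has_py] <- as.numeric(
    sub(".*?([0-9.]+)\\s*pack-years.*", "\\1", status[has_py], perl = TRUE)
  )

  data.frame(group = factor(group, levels = c("Non.smoker", "Smoker")),
             copd = copd, pack_years = pack_years)
}

# Small made-up input in the same format as the table above.
example <- c("smoking status: non-smoker",
             "smoking status: smoker, 10.5 pack-years",
             "smoking status: COPD, GOLD-II, 27.5 pack-years",
             "")
parsed <- parse_smoking_status(example)
table(parsed$group)
```

Keeping the COPD flag and pack-years as separate columns leaves the option of modelling them as covariates instead of folding everything into a two-level factor.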

================================================================

3. Differential Expression (DE) Analysis

================================================================

##              Symbol                                                   Gene_Name
## 210505_at      ADH7 alcohol dehydrogenase 7 (class IV), mu or sigma polypeptide
## 205623_at   ALDH3A1                  aldehyde dehydrogenase 3 family, member A1
## 207469_s_at     PIR                        pirin (iron-binding nuclear protein)
## 210519_s_at    NQO1                            NAD(P)H dehydrogenase, quinone 1
## 204059_s_at     ME1                malic enzyme 1, NADP(+)-dependent, cytosolic
## 201467_s_at    NQO1                            NAD(P)H dehydrogenase, quinone 1
##             Entrez_ID     logFC   AveExpr        t      P.Value    adj.P.Val
## 210505_at         131  6411.741  5099.084 14.56583 7.018352e-31 3.837284e-26
## 205623_at         218 27204.698 21314.073 14.35497 2.596014e-30 7.096852e-26
## 207469_s_at      8544  3195.521  3347.489 14.13356 1.028288e-29 1.874054e-25
## 210519_s_at      1728 12922.702 14083.440 13.91535 4.004319e-29 5.473404e-25
## 204059_s_at      4199  2574.942  2294.907 13.65817 1.994186e-28 2.180642e-24
## 201467_s_at      1728  4765.171  4708.853 13.02344 1.061018e-26 9.668530e-23
##                    B
## 210505_at   7.894573
## 205623_at   7.725022
## 207469_s_at 7.543940
## 210519_s_at 7.362392
## 204059_s_at 7.144492
## 201467_s_at 6.588435
## Here we see that the top DE genes are: ADH7 (alcohol dehydrogenase 7 (class IV), mu or sigma polypeptide), ALDH3A1 (aldehyde dehydrogenase 3 family, member A1), PIR (pirin (iron-binding nuclear protein)), NQO1 (NAD(P)H dehydrogenase, quinone 1), ME1 (malic enzyme 1, NADP(+)-dependent, cytosolic), NQO1 (NAD(P)H dehydrogenase, quinone 1), AKR1C1 (aldo-keto reductase family 1, member C1), AKR1C2 /// LOC101930400 (aldo-keto reductase family 1, member C2 /// aldo-keto reductase family 1 member C2-like), NQO1 (NAD(P)H dehydrogenase, quinone 1)
## [1] "For context, ADH7 can affect how alcohol is metabolized; upregulation and/or variants have been associated with alcohol-related disease risk. It's important to note, however, that this risk can be mitigated or amplified by other factors. Thus, we will later review some system or pathway analysis."
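One caveat on the table above: limma's `logFC` is conventionally read as a log2 fold change, yet the `logFC` and `AveExpr` values here are in the thousands. That may indicate the expression matrix was on the raw intensity scale when the model was fitted. The sketch below shows a common quantile-based heuristic for deciding whether a matrix still needs a log2 transform; the function names `needs_log2` and `to_log2` are hypothetical, the thresholds are a rule of thumb, and the data are synthetic.

```r
# Heuristic: does this expression matrix look like raw (linear) intensities?
needs_log2 <- function(expr) {
  qx <- as.numeric(quantile(expr, c(0, 0.25, 0.5, 0.75, 0.99, 1), na.rm = TRUE))
  # Very large upper quantile, or a wide range with positive lower quartile,
  # both suggest the values have not been log-transformed.
  (qx[5] > 100) || (qx[6] - qx[1] > 50 && qx[2] > 0)
}

# Apply log2 only when the heuristic says it is needed.
to_log2 <- function(expr) {
  if (!needs_log2(expr)) return(expr)
  expr[which(expr <= 0)] <- NaN   # non-positive intensities have no logarithm
  log2(expr)
}

set.seed(1)
raw <- matrix(2^rnorm(60, mean = 10, sd = 2), nrow = 10)  # synthetic intensities
logged <- to_log2(raw)
range(logged)
```

On log2-scale data a `logFC` of 1 corresponds to a two-fold difference, which makes effect sizes comparable across probes with very different baseline intensities.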

================================================================

4. Visualization and Results Interpretation

================================================================

Here we can see that most genes are NOT significantly affected by smoking. However, we can still evaluate systemic changes, including the compounded effect of multiple “small gene changes”. This is where transcriptomics feeds into systems biology. Measuring such effects goes beyond basic DE analysis, although DE analysis can be the foundation for downstream analysis [3].

Let’s first do some additional analysis and visualization. We will first filter out low-intensity probes to improve the signal-to-noise ratio. Removing them reduces the number of tests, which lowers the false-discovery burden, increases confidence in the remaining calls, and concentrates the analysis on actively expressed transcripts.
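A minimal sketch of such an intensity filter follows, assuming a probes-by-samples matrix on the log2 scale. The function name `filter_low_intensity`, the cutoff, and the minimum number of samples are illustrative assumptions and are not the settings used in this report.

```r
# Keep probes whose intensity exceeds `cutoff` in at least `min_samples` samples.
filter_low_intensity <- function(expr,
                                 cutoff = median(expr, na.rm = TRUE),
                                 min_samples = 2) {
  keep <- rowSums(expr > cutoff, na.rm = TRUE) >= min_samples
  expr[keep, , drop = FALSE]
}

# Synthetic data: 5 low-intensity probes and 5 clearly expressed probes.
set.seed(42)
expr <- rbind(
  matrix(rnorm(5 * 6, mean = 4,  sd = 0.3), nrow = 5),
  matrix(rnorm(5 * 6, mean = 10, sd = 0.3), nrow = 5)
)
rownames(expr) <- paste0("probe_", 1:10)

filtered <- filter_low_intensity(expr, cutoff = 7, min_samples = 3)
nrow(filtered)       # number of probes retained
rownames(filtered)   # which probes survived the filter
```

A quick sanity check is to compare `nrow()` before and after filtering: a retained count equal to the platform's full probe count (54,675 for GPL570) would mean the cutoff removed nothing.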

================================================================

5a. Downstream Analysis: Filtering for Higher Confidence

================================================================

## [1] "Probes retained after intensity filtering: 54675"
##              Symbol                                                   Gene_Name
## 210505_at      ADH7 alcohol dehydrogenase 7 (class IV), mu or sigma polypeptide
## 205623_at   ALDH3A1                  aldehyde dehydrogenase 3 family, member A1
## 207469_s_at     PIR                        pirin (iron-binding nuclear protein)
## 210519_s_at    NQO1                            NAD(P)H dehydrogenase, quinone 1
## 204059_s_at     ME1                malic enzyme 1, NADP(+)-dependent, cytosolic
## 201467_s_at    NQO1                            NAD(P)H dehydrogenase, quinone 1
##             Entrez_ID     logFC   AveExpr        t      P.Value    adj.P.Val
## 210505_at         131  6411.741  5099.084 14.56583 7.018352e-31 3.837284e-26
## 205623_at         218 27204.698 21314.073 14.35497 2.596014e-30 7.096852e-26
## 207469_s_at      8544  3195.521  3347.489 14.13356 1.028288e-29 1.874054e-25
## 210519_s_at      1728 12922.702 14083.440 13.91535 4.004319e-29 5.473404e-25
## 204059_s_at      4199  2574.942  2294.907 13.65817 1.994186e-28 2.180642e-24
## 201467_s_at      1728  4765.171  4708.853 13.02344 1.061018e-26 9.668530e-23
##                    B
## 210505_at   7.894573
## 205623_at   7.725022
## 207469_s_at 7.543940
## 210519_s_at 7.362392
## 204059_s_at 7.144492
## 201467_s_at 6.588435
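The `adj.P.Val` column is a multiple-testing correction of `P.Value`; limma's `topTable` uses the Benjamini-Hochberg (BH) method by default. The sketch below re-implements BH in base R and compares it with `stats::p.adjust`. The p-values in `p` are made up for illustration, and `bh_adjust` is a hypothetical name.

```r
# Benjamini-Hochberg adjustment, written out step by step.
bh_adjust <- function(p) {
  n  <- length(p)
  o  <- order(p, decreasing = TRUE)   # largest p-value first
  ro <- order(o)                      # permutation that restores input order
  # Scale each p-value by n / rank, then enforce monotonicity from the top.
  pmin(1, cummin(n / (n:1) * p[o]))[ro]
}

p <- c(0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216)
adjusted <- bh_adjust(p)
all.equal(adjusted, p.adjust(p, method = "BH"))

# For the top-ranked probe the adjustment reduces to p * (number of tests).
# Using the values printed in the table above:
top_adj <- 7.018352e-31 * 54675
top_adj
```

Because the adjustment scales with the number of tests, removing probes before testing would generally change `adj.P.Val`. Identical adjusted values before and after filtering, together with 54675 probes retained, are consistent with the filter not having removed any probes.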

================================================================

5b. Downstream Analysis: New Plots with filtered probes

================================================================

================================================================

6. Pathway Analysis, Reactome

================================================================

================================================================

7. Pathway Analysis, GO and KEGG - Smokers, up-regulated ONLY

================================================================

House Keeping

Pathway Analysis Section: GO, KEGG, & Reactome

| Analysis Type | Function | Best For |
|---|---|---|
| GO (BP) | enrichGO | Broad biological mechanisms (e.g., “Cell Proliferation”) |
| KEGG | enrichKEGG | Well-defined metabolic/signaling maps (e.g., “Glycolysis”) |
| Reactome | enrichPathway | Detailed molecular reactions and hierarchies |

GO DETAIL:

| Category | Question it Answers | Level of Detail |
|---|---|---|
| BP (Biological Process) | What is the overall goal? | System-wide / Cellular program |
| MF (Molecular Function) | What is the chemical task? | Molecular / Biochemical |
| CC (Cellular Component) | Where is this happening? | Structural / Spatial |
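The enrichment functions named above perform over-representation analysis, which is typically based on the hypergeometric distribution: given a list of DE genes, how surprising is the overlap with a pathway's gene set? The sketch below computes that p-value in base R. The function name `ora_pvalue` and all counts are illustrative assumptions and are not results from this dataset.

```r
# Over-representation p-value: P(overlap >= k) under random sampling.
#   k: DE genes that belong to the pathway
#   n: total number of DE genes
#   M: pathway genes present in the background (universe)
#   N: size of the background (universe)
ora_pvalue <- function(k, n, M, N) {
  phyper(k - 1, M, N - M, n, lower.tail = FALSE)
}

# Made-up example: 8 of 100 DE genes fall in a 40-gene pathway,
# against a background of 10,000 genes.
p_ora <- ora_pvalue(k = 8, n = 100, M = 40, N = 10000)

# The same test expressed as a one-sided Fisher's exact test.
tab <- matrix(c(8, 100 - 8, 40 - 8, 10000 - 40 - 100 + 8), nrow = 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value
all.equal(p_ora, p_fisher)
```

The choice of background matters: restricting the universe to genes actually measured on the array usually gives more conservative p-values than using every annotated gene.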

Results Explained:

The observed transcriptional changes exert a compounded effect at the network level, whereby coordinated modulation of ER stress, proteasomal degradation, and NRF2-dependent antioxidant pathways converges on central hubs, producing a non-linear amplification of stress tolerance and cellular robustness.

References

## [1] "1. Rosebrock D, Vingron M, Arndt PF. Modeling gene expression cascades during cell state transitions. iScience. 2024 Mar 4;27(4):109386. doi: 10.1016/j.isci.2024.109386. PMID: 38500834; PMCID: PMC10946328."
## [1] "2. Tilley AE, O'Connor TP, Hackett NR, Strulovici-Barel Y et al. Biologic phenotyping of the human small airway epithelial response to cigarette smoking. PLoS One 2011;6(7):e22798. PMID: 21829517"
## [1] "Gindele JA, Kiechle T, Benediktus K, Birk G et al. Intermittent exposure to whole cigarette smoke alters the differentiation of primary small airway epithelial cells in the air-liquid interface culture. Sci Rep 2020 Apr 10;10(1):6257. PMID: 32277131"
## [1] "3. Software:"
## Please cite the following if utilizing the GEOquery software:
## 
##   Davis S, Meltzer P (2007). "GEOquery: a bridge between the Gene
##   Expression Omnibus (GEO) and BioConductor." _Bioinformatics_,
##   *23*(14), 1846-1847. doi:10.1093/bioinformatics/btm254
##   <https://doi.org/10.1093/bioinformatics/btm254>.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Article{,
##     author = {Sean Davis and Paul Meltzer},
##     title = {GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor},
##     journal = {Bioinformatics},
##     year = {2007},
##     volume = {23},
##     number = {14},
##     pages = {1846--1847},
##     doi = {10.1093/bioinformatics/btm254},
##   }
## To cite package 'limma' in publications use:
## 
##   Ritchie, M.E., Phipson, B., Wu, D., Hu, Y., Law, C.W., Shi, W., and
##   Smyth, G.K. (2015). limma powers differential expression analyses for
##   RNA-sequencing and microarray studies. Nucleic Acids Research 43(7),
##   e47.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Article{,
##     author = {Matthew E Ritchie and Belinda Phipson and Di Wu and Yifang Hu and Charity W Law and Wei Shi and Gordon K Smyth},
##     title = {{limma} powers differential expression analyses for {RNA}-sequencing and microarray studies},
##     journal = {Nucleic Acids Research},
##     year = {2015},
##     volume = {43},
##     number = {7},
##     pages = {e47},
##     doi = {10.1093/nar/gkv007},
##   }
## To cite package 'pheatmap' in publications use:
## 
##   Kolde R (2025). _pheatmap: Pretty Heatmaps_.
##   doi:10.32614/CRAN.package.pheatmap
##   <https://doi.org/10.32614/CRAN.package.pheatmap>, R package version
##   1.0.13, <https://CRAN.R-project.org/package=pheatmap>.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Manual{,
##     title = {pheatmap: Pretty Heatmaps},
##     author = {Raivo Kolde},
##     year = {2025},
##     note = {R package version 1.0.13},
##     url = {https://CRAN.R-project.org/package=pheatmap},
##     doi = {10.32614/CRAN.package.pheatmap},
##   }
## Please cite G. Yu (2015) for using ReactomePA. In addition, please cite
## G. Yu (2012) when using compareCluster in clusterProfiler package, G.
## Yu (2015) when applying enrichment analysis to NGS data by using
## ChIPseeker
## 
##   Guangchuang Yu, Qing-Yu He. ReactomePA: an R/Bioconductor package for
##   reactome pathway analysis and visualization. Molecular BioSystems
##   2016, 12(2):477-479
## 
## A BibTeX entry for LaTeX users is
## 
##   @Article{,
##     title = {ReactomePA: an R/Bioconductor package for reactome pathway analysis and visualization},
##     author = {Guangchuang Yu and Qing-Yu He},
##     journal = {Molecular BioSystems},
##     year = {2016},
##     volume = {12},
##     number = {2},
##     pages = {477-479},
##     pmid = {26661513},
##     url = {http://pubs.rsc.org/en/Content/ArticleLanding/2015/MB/C5MB00663E},
##     doi = {10.1039/C5MB00663E},
##   }
## Please cite S. Xu (2024) for using clusterProfiler. In addition, please
## cite G. Yu (2010) when using GOSemSim, G. Yu (2015) when using DOSE and
## G. Yu (2015) when using ChIPseeker.
## 
##   G Yu. Thirteen years of clusterProfiler. The Innovation. 2024,
##   5(6):100722
## 
##   S Xu, E Hu, Y Cai, Z Xie, X Luo, L Zhan, W Tang, Q Wang, B Liu, R
##   Wang, W Xie, T Wu, L Xie, G Yu. Using clusterProfiler to characterize
##   multiomics data. Nature Protocols. 2024, 19(11):3292-3320
## 
##   T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou, W Tang, L
##   Zhan, X Fu, S Liu, X Bo, and G Yu. clusterProfiler 4.0: A universal
##   enrichment tool for interpreting omics data. The Innovation. 2021,
##   2(3):100141
## 
##   Guangchuang Yu, Li-Gen Wang, Yanyan Han and Qing-Yu He.
##   clusterProfiler: an R package for comparing biological themes among
##   gene clusters. OMICS: A Journal of Integrative Biology 2012,
##   16(5):284-287
## 
## To cite package 'org.Hs.eg.db' in publications use:
## 
##   Carlson M (2025). _org.Hs.eg.db: Genome wide annotation for Human_. R
##   package version 3.22.0.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Manual{,
##     title = {org.Hs.eg.db: Genome wide annotation for Human},
##     author = {Marc Carlson},
##     year = {2025},
##     note = {R package version 3.22.0},
##   }
## To cite ggplot2 in publications, please use
## 
##   H. Wickham. ggplot2: Elegant Graphics for Data Analysis.
##   Springer-Verlag New York, 2016.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Book{,
##     author = {Hadley Wickham},
##     title = {ggplot2: Elegant Graphics for Data Analysis},
##     publisher = {Springer-Verlag New York},
##     year = {2016},
##     isbn = {978-3-319-24277-4},
##     url = {https://ggplot2.tidyverse.org},
##   }
## [1] "Jairam S, Edenberg HJ. An enhancer-blocking element regulates the cell-specific expression of alcohol dehydrogenase 7. Gene. 2014 Sep 1;547(2):239-44. doi: 10.1016/j.gene.2014.06.047. Epub 2014 Jun 24. PMID: 24971505; PMCID: PMC4136687."