R Markdown

================================================================

1. Load Libraries Needed for Analysis

================================================================

## 
## The downloaded binary packages are in
##  /var/folders/rg/x_7b05fn3sj3v_jq8q367xzm0000gn/T//Rtmpq3bK8k/downloaded_packages

================================================================

2. Acquire Dataset & Prepare for DE Analysis

================================================================

## 
## phenotype: alcoholic   phenotype: control 
##                   20                   19
## ExpressionSet (storageMode: lockedEnvironment)
## assayData: 6 features, 39 samples 
##   element names: exprs 
## protocolData: none
## phenoData
##   sampleNames: GSM1085665 GSM1085666 ... GSM1085703 (39 total)
##   varLabels: title geo_accession ... tissue:ch1 (51 total)
##   varMetadata: labelDescription
## featureData
##   featureNames: 7896736 7896738 ... 7896746 (6 total)
##   fvarLabels: ID GB_LIST ... category (12 total)
##   fvarMetadata: Column Description labelDescription
## experimentData: use 'experimentData(object)'
##   pubMedIds: 23981442 
## Annotation: GPL6244
## 
## Alcohol.abuse       Control 
##            20            19
##  [1] "ID"              "GB_LIST"         "SPOT_ID"         "seqname"        
##  [5] "RANGE_GB"        "RANGE_STRAND"    "RANGE_START"     "RANGE_STOP"     
##  [9] "total_probes"    "gene_assignment" "mrna_assignment" "category"

Here we see that our data has 19 controls and 20 alcohol abusers, for 39 study participants in total.
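The group counts above can be reproduced from the phenotype strings. The following is a minimal sketch, assuming the labels take the form `phenotype: alcoholic` / `phenotype: control` as printed in the first table. The vector `pheno` is a hand-built stand-in for the real sample annotation, and the recoded level names simply mirror the second table.

```r
# Stand-in for the sample annotation column (hypothetical; built by hand
# to match the counts printed above: 20 alcoholic, 19 control).
pheno <- rep(c("phenotype: alcoholic", "phenotype: control"), times = c(20, 19))

# Strip the "phenotype: " prefix, then recode to syntactically valid
# level names that can later be used as design-matrix column names.
label <- sub("^phenotype: ", "", pheno)
group <- factor(ifelse(label == "alcoholic", "Alcohol.abuse", "Control"),
                levels = c("Alcohol.abuse", "Control"))

# Tabulate samples per group.
counts <- table(group)
print(counts)
```

Checking these counts against the study description before fitting any model is a cheap way to catch mislabeled or dropped samples.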

================================================================

3. Differential Expression (DE Analysis)

================================================================
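The results table in this section has the columns of a limma `topTable()` result (`logFC`, `AveExpr`, `t`, `P.Value`, `adj.P.Val`, `B`). The code that produced it is not shown, so the following is only a sketch of how such a table could be obtained, assuming limma was used. The expression matrix here is simulated, and the names `expr`, `group`, and `top` are illustrative.

```r
# Simulated stand-in for the expression matrix: 200 probes x 39 samples.
set.seed(1)
group <- factor(rep(c("alcoholic", "control"), times = c(20, 19)))
expr <- matrix(rnorm(200 * 39, mean = 7, sd = 0.5), nrow = 200,
               dimnames = list(paste0("probe", 1:200), paste0("S", 1:39)))

# Shift the first five probes in the alcoholic group so there is
# something to detect in the simulated data.
expr[1:5, group == "alcoholic"] <- expr[1:5, group == "alcoholic"] + 1

top <- NULL
if (requireNamespace("limma", quietly = TRUE)) {
  # One column per group, no intercept.
  design <- model.matrix(~ 0 + group)
  colnames(design) <- levels(group)

  # Fit per-probe linear models, then test alcoholic vs control.
  fit <- limma::lmFit(expr, design)
  contrast <- limma::makeContrasts(alcoholic - control, levels = design)
  fit2 <- limma::eBayes(limma::contrasts.fit(fit, contrast))

  # Top six probes, with Benjamini-Hochberg adjusted p-values.
  top <- limma::topTable(fit2, number = 6, adjust.method = "BH")
  print(top)
} else {
  message("limma is not installed; skipping the model fit.")
}
```

A positive `logFC` in this setup means higher expression in the alcoholic group, because the contrast is written as `alcoholic - control`.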

##              ID
## 7927186 7927186
## 8125919 8125919
## 8021081 8021081
## 7961595 7961595
## 7995838 7995838
## 8130867 8130867
##         gene_assignment
## 7927186 NM_032023 // RASSF4 // Ras association (RalGDS/AF-6) domain family member 4 // 10q11.21 // 83937 /// XM_006718015 // RASSF4 // Ras association (RalGDS/AF-6) domain family member 4 // 10q11.21 // 83937 /// ENST00000427758 // RASSF4 // Ras association (RalGDS/AF-6) domain family member 4 // 10q11.21 // 83937 /// ENST00000471808 // RASSF4 // Ras association (RalGDS/AF-6) domain family member 4 // 10q11.21 // 83937 /// ENST00000472561 // RASSF4 // Ras association (RalGDS/AF-6) domain family member 4 // 10q11.21 // 83937 /// ENST00000483709 // RASSF4 // Ras association (RalGDS/AF-6) domain family member 4 // 10q11.21 // 83937 /// ENST00000489171 // RASSF4 // Ras association (RalGDS/AF-6) domain family member 4 // 10q11.21 // 83937 /// AB209446 // RASSF4 // Ras association (RalGDS/AF-6) domain family member 4 // 10q11.21 // 83937 /// AF260335 // RASSF4 // Ras association (RalGDS/AF-6) domain family member 4 // 10q11.21 // 83937 /// AY216713 // RASSF4 // Ras association (RalGDS/AF-6) domain family member 4 // 10q11.21 // 83937 /// AY216714 // RASSF4 // Ras association (RalGDS/AF-6) domain family member 4 // 10q11.21 // 83937 /// AY216715 // RASSF4 // Ras association (RalGDS/AF-6) domain family member 4 // 10q11.21 // 83937 /// AY216716 // RASSF4 // Ras association (RalGDS/AF-6) domain family member 4 // 10q11.21 // 83937 /// BC032593 // RASSF4 // Ras association (RalGDS/AF-6) domain family member 4 // 10q11.21 // 83937 /// ENST00000340258 // RASSF4 // Ras association (RalGDS/AF-6) domain family member 4 // 10q11.21 // 83937 /// ENST00000471941 // RASSF4 // Ras association (RalGDS/AF-6) domain family member 4 // 10q11.21 // 83937 /// AK097272 // RASSF4 // Ras association (RalGDS/AF-6) domain family member 4 // 10q11.21 // 83937 /// ENST00000465735 // RASSF4 // Ras association (RalGDS/AF-6) domain family member 4 // 10q11.21 // 83937 /// AK092770 // RASSF4 // Ras association (RalGDS/AF-6) domain family member 4 // 10q11.21 // 83937
## 8125919 NM_001145775 // FKBP5 // FK506 binding protein 5 // 6p21.31 // 2289 /// NM_001145776 // FKBP5 // FK506 binding protein 5 // 6p21.31 // 2289 /// NM_004117 // FKBP5 // FK506 binding protein 5 // 6p21.31 // 2289 /// ENST00000357266 // FKBP5 // FK506 binding protein 5 // 6p21.31 // 2289 /// ENST00000536438 // FKBP5 // FK506 binding protein 5 // 6p21.31 // 2289 /// ENST00000539068 // FKBP5 // FK506 binding protein 5 // 6p21.31 // 2289 /// AF194172 // FKBP5 // FK506 binding protein 5 // 6p21.31 // 2289 /// AK302704 // FKBP5 // FK506 binding protein 5 // 6p21.31 // 2289 /// AK303356 // FKBP5 // FK506 binding protein 5 // 6p21.31 // 2289 /// BC042605 // FKBP5 // FK506 binding protein 5 // 6p21.31 // 2289 /// BC111050 // FKBP5 // FK506 binding protein 5 // 6p21.31 // 2289 /// U71321 // FKBP5 // FK506 binding protein 5 // 6p21.31 // 2289 /// NM_001145777 // FKBP5 // FK506 binding protein 5 // 6p21.31 // 2289 /// ENST00000542713 // FKBP5 // FK506 binding protein 5 // 6p21.31 // 2289 /// HM245391 // FKBP5 // FK506 binding protein 5 // 6p21.31 // 2289
## 8021081 NM_001128588 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// NM_001146036 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// NM_001146037 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// NM_015865 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// ENST00000321925 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// ENST00000402943 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// ENST00000415427 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// ENST00000502059 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// ENST00000586056 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// ENST00000588179 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// ENST00000589700 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// ENST00000591541 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// ENST00000591943 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// AK289608 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// AK294129 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// BC050539 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// HQ709264 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// U35735 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// ENST00000436407 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// ENST00000586142 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// AK091064 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// AK123681 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// ENST00000535474 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563 /// AK295268 // SLC14A1 // solute carrier family 14 (urea transporter), member 1 (Kidd blood group) // 18q11-q12 // 6563
## 7961595 NM_001286201 // RERGL // RERG/RAS-like // 12p12.3 // 79785 /// NM_024730 // RERGL // RERG/RAS-like // 12p12.3 // 79785 /// NR_104413 // RERGL // RERG/RAS-like // 12p12.3 // 79785 /// ENST00000229002 // RERGL // RERG/RAS-like // 12p12.3 // 79785 /// ENST00000536890 // RERGL // RERG/RAS-like // 12p12.3 // 79785 /// ENST00000538724 // RERGL // RERG/RAS-like // 12p12.3 // 79785 /// ENST00000540148 // RERGL // RERG/RAS-like // 12p12.3 // 79785 /// BC042888 // RERGL // RERG/RAS-like // 12p12.3 // 79785 /// ENST00000541632 // RERGL // RERG/RAS-like // 12p12.3 // 79785
## 7995838 NM_005952 // MT1X // metallothionein 1X // 16q13 // 4501 /// ENST00000394485 // MT1X // metallothionein 1X // 16q13 // 4501 /// ENST00000564974 // MT1X // metallothionein 1X // 16q13 // 4501 /// ENST00000568370 // MT1X // metallothionein 1X // 16q13 // 4501 /// BC018190 // MT1X // metallothionein 1X // 16q13 // 4501 /// BC032131 // MT1X // metallothionein 1X // 16q13 // 4501 /// BC032338 // MT1X // metallothionein 1X // 16q13 // 4501 /// BC053882 // MT1X // metallothionein 1X // 16q13 // 4501 /// ENST00000562939 // MT1X // metallothionein 1X // 16q13 // 4501
## 8130867 NM_003247 // THBS2 // thrombospondin 2 // 6q27 // 7058 /// ENST00000366787 // THBS2 // thrombospondin 2 // 6q27 // 7058 /// AK292429 // THBS2 // thrombospondin 2 // 6q27 // 7058 /// BC146676 // THBS2 // thrombospondin 2 // 6q27 // 7058 /// BC150175 // THBS2 // thrombospondin 2 // 6q27 // 7058 /// L12350 // THBS2 // thrombospondin 2 // 6q27 // 7058
##          Symbol      logFC  AveExpr         t      P.Value  adj.P.Val        B
## 7927186  RASSF4  0.5454994 7.824115  5.960516 4.701672e-07 0.01357326 5.346430
## 8125919   FKBP5  1.1413625 8.306565  5.669950 1.223577e-06 0.01766172 4.589410
## 8021081 SLC14A1  1.2943641 8.589024  5.533925 1.912486e-06 0.01840385 4.234217
## 7961595   RERGL -0.5290057 4.155558 -5.439894 2.602681e-06 0.01878420 3.988572
## 7995838    MT1X  0.9896910 8.542903  5.158875 6.511221e-06 0.03759449 3.254934
## 8130867   THBS2 -0.5856362 7.563478 -5.098082 7.932425e-06 0.03816686 3.096501
## Here we see that the top DE genes are: RASSF4, FKBP5, SLC14A1, RERGL, MT1X, THBS2, NR3C1, IVNS1ABP, CNDP1
## [1] "For context, RASSF4 (Ras association domain family member 4) is a protein-coding gene that acts primarily as a tumor suppressor and is frequently inactivated in cancers such as non-small cell lung cancer, osteosarcoma, and hepatocellular carcinoma. It inhibits tumor cell proliferation, migration, invasion, and the epithelial-mesenchymal transition (EMT). RASSF4 is also required for skeletal muscle differentiation and regulates store-operated Ca2+ entry (SOCE). It is important to note, however, that its effect can be mitigated or amplified by other factors. Thus, we will later review system- and pathway-level analysis."
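The `Symbol` column in the results table corresponds to the second field of each `gene_assignment` string. In those strings, fields within a record are separated by ` // ` and records by ` /// `. The helper below is a minimal sketch of that extraction. The name `extract_symbol` is hypothetical, and treating `---` as an unassigned probe is an assumption about the annotation format.

```r
# Extract the gene symbol from gene_assignment strings.
# Layout of one record: accession // symbol // description // cytoband // Entrez ID
extract_symbol <- function(assignment) {
  vapply(assignment, function(x) {
    # Missing, empty, or "---" (assumed placeholder) entries have no symbol.
    if (is.na(x) || !nzchar(x) || x == "---") return(NA_character_)
    # Keep only the first record, then split it into its fields.
    first_record <- strsplit(x, " /// ", fixed = TRUE)[[1]][1]
    fields <- strsplit(first_record, " // ", fixed = TRUE)[[1]]
    if (length(fields) >= 2) trimws(fields[2]) else NA_character_
  }, character(1), USE.NAMES = FALSE)
}

# Two shortened strings taken from the table above, plus two edge cases.
examples <- c(
  "NM_005952 // MT1X // metallothionein 1X // 16q13 // 4501 /// ENST00000394485 // MT1X // metallothionein 1X // 16q13 // 4501",
  "NM_003247 // THBS2 // thrombospondin 2 // 6q27 // 7058",
  "---",
  NA
)
symbols <- extract_symbol(examples)
print(symbols)
```

Taking only the first record is a simplification: a probe set that maps to several genes would be reported under the first one listed.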

================================================================

4. Visualization and Results Interpretation

================================================================

Here we can see that most genes are NOT significantly affected by alcohol abuse. However, we can still evaluate systemic changes, or even the compounded effect of many “small gene changes”. This is where transcriptomics feeds into systems biology. Measuring such effects goes beyond basic DE analysis, although DE analysis can be the foundation for downstream analysis.
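Whether a gene counts as "significantly affected" depends on the cutoff applied to the adjusted p-value. The sketch below labels the six genes from the results table as up, down, or not significant. The values are copied from that table, while the cutoff `alpha = 0.05` is an assumption, since the threshold used for the plot is not stated.

```r
# Values copied from the DE results table in section 3.
top <- data.frame(
  Symbol    = c("RASSF4", "FKBP5", "SLC14A1", "RERGL", "MT1X", "THBS2"),
  logFC     = c(0.5454994, 1.1413625, 1.2943641, -0.5290057, 0.9896910, -0.5856362),
  adj.P.Val = c(0.01357326, 0.01766172, 0.01840385, 0.01878420, 0.03759449, 0.03816686)
)

alpha <- 0.05  # assumed significance cutoff on the adjusted p-value

# Label each gene by significance first, then by direction of change.
top$direction <- ifelse(top$adj.P.Val >= alpha, "not significant",
                        ifelse(top$logFC > 0, "up", "down"))

print(top[, c("Symbol", "logFC", "adj.P.Val", "direction")])
print(table(top$direction))
```

Applied to the full results rather than the top six rows, the same labeling gives the colour groups typically used in a volcano plot.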

Let’s first do some additional analysis and visualization. We will start by filtering out low-intensity probes to improve the signal-to-noise ratio. This reduces false discoveries, increases confidence in the remaining results, and concentrates the analysis on actively expressed transcripts.
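One common way to filter on intensity is to keep only probes that exceed a cutoff in a minimum number of samples. The sketch below uses simulated data. The cutoff (the overall median) and the minimum sample count (19, the size of the smaller group) are assumptions, not the settings used in this analysis.

```r
# Simulated log-scale intensities: 1000 probes x 39 samples,
# with probe-specific mean expression levels between 2 and 12.
set.seed(42)
n_probes <- 1000
n_samples <- 39
probe_mean <- runif(n_probes, min = 2, max = 12)
# matrix() fills column by column, so repeating probe_mean once per
# sample gives every column the same per-probe means.
expr <- matrix(rnorm(n_probes * n_samples,
                     mean = rep(probe_mean, times = n_samples), sd = 0.5),
               nrow = n_probes,
               dimnames = list(paste0("probe", seq_len(n_probes)),
                               paste0("S", seq_len(n_samples))))

cutoff <- median(expr)  # assumed intensity cutoff
min_samples <- 19       # assumed: size of the smaller group

# Keep probes expressed above the cutoff in at least min_samples samples.
keep <- rowSums(expr > cutoff) >= min_samples
expr_filtered <- expr[keep, , drop = FALSE]

cat(sprintf("Probes retained after intensity filtering: %d\n", sum(keep)))
```

Because the filter looks only at overall intensity and not at group labels, it does not bias the downstream test toward either group.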

================================================================

5a. Downstream Analysis: Filtering for Higher Confidence

================================================================

## [1] "Probes retained after intensity filtering: 23395"
##              ID
## 7927186 7921773
## 8125919 8085138
## 8021081 7998878
## 7961595 7948476
## 7995838 7976858
## 8114814 8075493
##         gene_assignment
## 7927186 NM_030916 // PVRL4 // poliovirus receptor-related 4 // 1q22-q23.2 // 81607 /// ENST00000368012 // PVRL4 // poliovirus receptor-related 4 // 1q22-q23.2 // 81607 /// AB755430 // PVRL4 // poliovirus receptor-related 4 // 1q22-q23.2 // 81607 /// AF160477 // PVRL4 // poliovirus receptor-related 4 // 1q22-q23.2 // 81607 /// AF426163 // PVRL4 // poliovirus receptor-related 4 // 1q22-q23.2 // 81607 /// AK298981 // PVRL4 // poliovirus receptor-related 4 // 1q22-q23.2 // 81607 /// AK301481 // PVRL4 // poliovirus receptor-related 4 // 1q22-q23.2 // 81607 /// BC010423 // PVRL4 // poliovirus receptor-related 4 // 1q22-q23.2 // 81607 /// BX641083 // PVRL4 // poliovirus receptor-related 4 // 1q22-q23.2 // 81607
## 8125919 NM_000916 // OXTR // oxytocin receptor // 3p25 // 5021 /// ENST00000316793 // OXTR // oxytocin receptor // 3p25 // 5021 /// ENST00000431493 // OXTR // oxytocin receptor // 3p25 // 5021 /// ENST00000449615 // OXTR // oxytocin receptor // 3p25 // 5021 /// AY389507 // OXTR // oxytocin receptor // 3p25 // 5021 /// BC137443 // OXTR // oxytocin receptor // 3p25 // 5021
## 8021081 NM_022119 // PRSS22 // protease, serine, 22 // 16p13.3 // 64063 /// XM_006720915 // PRSS22 // protease, serine, 22 // 16p13.3 // 64063 /// ENST00000161006 // PRSS22 // protease, serine, 22 // 16p13.3 // 64063 /// ENST00000574768 // PRSS22 // protease, serine, 22 // 16p13.3 // 64063 /// ENST00000575164 // PRSS22 // protease, serine, 22 // 16p13.3 // 64063 /// ENST00000577177 // PRSS22 // protease, serine, 22 // 16p13.3 // 64063 /// AB010779 // PRSS22 // protease, serine, 22 // 16p13.3 // 64063 /// AF321182 // PRSS22 // protease, serine, 22 // 16p13.3 // 64063 /// BC009726 // PRSS22 // protease, serine, 22 // 16p13.3 // 64063 /// ENST00000571228 // PRSS22 // protease, serine, 22 // 16p13.3 // 64063 /// ENST00000576381 // PRSS22 // protease, serine, 22 // 16p13.3 // 64063 /// AK309624 // PRSS22 // protease, serine, 22 // 16p13.3 // 64063 /// ENST00000570950 // PRSS22 // protease, serine, 22 // 16p13.3 // 64063
## 7961595 NM_014502 // PRPF19 // pre-mRNA processing factor 19 // 11q12.2 // 27339 /// ENST00000227524 // PRPF19 // pre-mRNA processing factor 19 // 11q12.2 // 27339 /// ENST00000535326 // PRPF19 // pre-mRNA processing factor 19 // 11q12.2 // 27339 /// ENST00000541371 // PRPF19 // pre-mRNA processing factor 19 // 11q12.2 // 27339 /// ENST00000546152 // PRPF19 // pre-mRNA processing factor 19 // 11q12.2 // 27339 /// BC008719 // PRPF19 // pre-mRNA processing factor 19 // 11q12.2 // 27339 /// BC018665 // PRPF19 // pre-mRNA processing factor 19 // 11q12.2 // 27339 /// BC018698 // PRPF19 // pre-mRNA processing factor 19 // 11q12.2 // 27339 /// ENST00000539960 // PRPF19 // pre-mRNA processing factor 19 // 11q12.2 // 27339
## 7995838 NM_001362 // DIO3 // deiodinase, iodothyronine, type III // 14q32 // 1735 /// ENST00000510508 // DIO3 // deiodinase, iodothyronine, type III // 14q32 // 1735 /// AK292310 // DIO3 // deiodinase, iodothyronine, type III // 14q32 // 1735 /// BC017717 // DIO3 // deiodinase, iodothyronine, type III // 14q32 // 1735 /// S79854 // DIO3 // deiodinase, iodothyronine, type III // 14q32 // 1735
## 8114814 NM_014323 // PATZ1 // POZ (BTB) and AT hook containing zinc finger 1 // 22q12.2 // 23598 /// NM_032050 // PATZ1 // POZ (BTB) and AT hook containing zinc finger 1 // 22q12.2 // 23598 /// NM_032052 // PATZ1 // POZ (BTB) and AT hook containing zinc finger 1 // 22q12.2 // 23598 /// ENST00000266269 // PATZ1 // POZ (BTB) and AT hook containing zinc finger 1 // 22q12.2 // 23598 /// ENST00000351933 // PATZ1 // POZ (BTB) and AT hook containing zinc finger 1 // 22q12.2 // 23598 /// ENST00000405309 // PATZ1 // POZ (BTB) and AT hook containing zinc finger 1 // 22q12.2 // 23598 /// AF119256 // PATZ1 // POZ (BTB) and AT hook containing zinc finger 1 // 22q12.2 // 23598 /// AF254083 // PATZ1 // POZ (BTB) and AT hook containing zinc finger 1 // 22q12.2 // 23598 /// AF254084 // PATZ1 // POZ (BTB) and AT hook containing zinc finger 1 // 22q12.2 // 23598 /// AF254085 // PATZ1 // POZ (BTB) and AT hook containing zinc finger 1 // 22q12.2 // 23598 /// AY028384 // PATZ1 // POZ (BTB) and AT hook containing zinc finger 1 // 22q12.2 // 23598 /// BC051357 // PATZ1 // POZ (BTB) and AT hook containing zinc finger 1 // 22q12.2 // 23598 /// CR456613 // PATZ1 // POZ (BTB) and AT hook containing zinc finger 1 // 22q12.2 // 23598 /// NM_032051 // PATZ1 // POZ (BTB) and AT hook containing zinc finger 1 // 22q12.2 // 23598 /// ENST00000215919 // PATZ1 // POZ (BTB) and AT hook containing zinc finger 1 // 22q12.2 // 23598 /// AF254082 // PATZ1 // POZ (BTB) and AT hook containing zinc finger 1 // 22q12.2 // 23598 /// AK291803 // PATZ1 // POZ (BTB) and AT hook containing zinc finger 1 // 22q12.2 // 23598 /// BC021091 // PATZ1 // POZ (BTB) and AT hook containing zinc finger 1 // 22q12.2 // 23598
##         Symbol      logFC  AveExpr         t      P.Value  adj.P.Val        B
## 7927186  PVRL4  0.5454994 7.824115  5.934935 5.119776e-07 0.01197772 5.424284
## 8125919   OXTR  1.1413625 8.306565  5.664309 1.247466e-06 0.01459223 4.703230
## 8021081 PRSS22  1.2943641 8.589024  5.529658 1.940884e-06 0.01513566 4.343924
## 7961595 PRPF19 -0.5290057 4.155558 -5.419120 2.787778e-06 0.01630502 4.048936
## 7995838   DIO3  0.9896910 8.542903  5.153292 6.634527e-06 0.03104295 3.340411
## 8114814  PATZ1 -0.3673555 7.691193 -4.996450 1.103076e-05 0.03600408 2.923675
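
The `adj.P.Val` column above is a multiple-testing-adjusted p-value. Assuming `topTable()` was called with its default `adjust.method = "BH"` (the call itself is not shown here), the adjustment is Benjamini-Hochberg. The following is a minimal sketch rather than the analysis code: it reproduces BH by hand on made-up p-values, compares the result with base R's `p.adjust()`, and then back-calculates the number of tests implied by the first three rows of the table.

```r
# Toy p-values (made up for illustration; not taken from this dataset)
p <- c(0.0005, 0.012, 0.03, 0.04, 0.2, 0.6)
m <- length(p)

# BH by hand: scale each p-value by m / rank, then enforce monotonicity
# by taking the running minimum from the largest p-value downwards.
o  <- order(p, decreasing = TRUE)   # positions from largest to smallest p
ro <- order(o)                      # permutation that restores input order
bh_manual <- pmin(1, cummin(m / (m:1) * p[o]))[ro]

# Reference implementation from base R
bh_builtin <- p.adjust(p, method = "BH")
print(all.equal(bh_manual, bh_builtin))

# P.Value and adj.P.Val copied from the first three rows of the table above
p_top   <- c(5.119776e-07, 1.247466e-06, 1.940884e-06)
adj_top <- c(0.01197772, 0.01459223, 0.01513566)

# If adj = p * m / rank (no monotonicity correction needed), then m = adj * rank / p
implied_m <- adj_top * seq_along(p_top) / p_top
print(round(implied_m))
```

The implied number of tests comes out at roughly 23,395 for each row, which is consistent with a BH correction over that many probes. The back-calculation only holds where the running-minimum step does not change the value, so it is a sanity check rather than a proof.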

================================================================

5b. Downstream Analysis: New Plots with filtered probes

================================================================

### Repeat plots from before with filtered data #####

###Volcano Plot
volcano_data <- res_limma
volcano_data$negLogPval <- -log10(volcano_data$adj.P.Val)

ggplot(volcano_data, aes(x = logFC, y = negLogPval)) +
  geom_point(alpha = 0.4, color = "grey") +
  # Highlight significant genes in blue
  geom_point(data = subset(volcano_data, adj.P.Val < 0.05 & abs(logFC) > 1), 
             aes(x = logFC, y = negLogPval), color = "steelblue", alpha = 0.8) +
  geom_hline(yintercept = -log10(0.05), linetype = "dashed", color = "red") +
  geom_vline(xintercept = c(-1, 1), linetype = "dashed", color = "red") +
  labs(title = "Volcano Plot: Alcohol Abuse vs Control", 
       x = "Log2 Fold Change", 
       y = "-Log10 Adjusted P-value") +
  theme_minimal()

# Standard limma MDS plot (on the filtered expression set, as used for the heatmap below)
plotMDS(exprs(eset_filtered), col = as.numeric(groups), labels = metadata$status)
legend("topright", legend = levels(groups), col = 1:2, pch = 15)

# 1. Identify the Top 10 Up and Top 10 Down Probes
top_up_probes <- rownames(head(res_limma[res_limma$logFC > 0, ], 10))
top_down_probes <- rownames(head(res_limma[res_limma$logFC < 0, ], 10))

# Set up the plotting area to show two plots side-by-side
par(mfrow = c(1, 2)) 

# --- Plot 1: Top 10 Upregulated Genes ---
plotMDS(exprs(eset)[top_up_probes, ], 
        col = as.numeric(factor(metadata$status)), 
        pch = 19,
        main = "Top 10 Upregulated Genes")
legend("topright", legend = levels(factor(metadata$status)), 
       col = 1:2, pch = 19, cex = 0.8)

# --- Plot 2: Top 10 Downregulated Genes ---
plotMDS(exprs(eset)[top_down_probes, ], 
        col = as.numeric(factor(metadata$status)), 
        pch = 19,
        main = "Top 10 Downregulated Genes")
legend("topright", legend = levels(factor(metadata$status)), 
       col = 1:2, pch = 19, cex = 0.8)

# Reset plotting layout to single plot
par(mfrow = c(1, 1))

### Heatmap

library(pheatmap)  # provides pheatmap(), used in step 7

# 1. Clean Gene Symbols (Remove "///" and long strings)
# This keeps only the first gene name listed for each probe
res_limma$CleanSymbol <- gsub(" ///.*", "", res_limma$Symbol)

# 2. Ensure each gene only appears once in the plot
# We sort by significance first so that we keep the "best" probe for each gene
res_unique <- res_limma[order(res_limma$adj.P.Val), ] 
res_unique <- res_unique[!duplicated(res_unique$CleanSymbol), ]

# 3. Identify and Sort Top 25 Up and Top 25 Down
# Upregulated: Sorted from highest logFC to lowest
top_up <- res_unique[res_unique$logFC > 0, ]
top_up <- head(top_up[order(top_up$logFC, decreasing = TRUE), ], 25)

# Downregulated: Sorted from most negative logFC to least negative
top_down <- res_unique[res_unique$logFC < 0, ]
top_down <- head(top_down[order(top_down$logFC, decreasing = FALSE), ], 25)

# 4. Combine IDs and Symbols for the Heatmap
hm_probes <- c(rownames(top_up), rownames(top_down))
hm_symbols <- c(top_up$CleanSymbol, top_down$CleanSymbol)

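# 5. Extract expression and sort SAMPLES by status
# (order() sorts alphabetically unless the factor levels say otherwise,
#  so the Alcohol.abuse block comes first, then the Control block)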
sample_order <- order(metadata$status)
plot_matrix <- exprs(eset_filtered)[hm_probes, sample_order]
rownames(plot_matrix) <- hm_symbols

# 6. Define the Annotations
# Sample Annotation (Columns)
anno_col <- data.frame(Status = metadata$status[sample_order])
rownames(anno_col) <- colnames(plot_matrix)

# Gene Annotation (Rows)
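anno_row <- data.frame(
  # Use the actual group sizes so this still works if fewer than 25 genes are found
  Regulation = rep(c("Upregulated", "Downregulated"),
                   times = c(nrow(top_up), nrow(top_down)))
)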
rownames(anno_row) <- hm_symbols

# 7. Generate the Heatmap
pheatmap(plot_matrix, 
         cluster_rows = FALSE,  # Keeps our custom LogFC sorting
         cluster_cols = FALSE,  # Keeps our alcohol-status sample grouping
         scale = "row",         # Standardizes colors to show relative change
         annotation_col = anno_col, 
         annotation_row = anno_row,
         fontsize_row = 8,      # Smaller font to prevent label overlapping
         cellheight = 8,
         main = "Top 50 DE Genes: Alcohol.abuse vs Control",
         color = colorRampPalette(c("blue", "white", "red"))(100))

print("An important question for any of these studies is how and why these genes were selected for interrogation before the study; in other words, what made them special in this analysis. We assume that not all genes in the genome were interrogated.")
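
The gene-selection logic used for the heatmap above (clean the symbol, keep the most significant probe per gene, then take the top N per direction by logFC) can be checked on a small table. This is a sketch only: the probe IDs, gene names, and values are hypothetical, and `n_top` stands in for the 25 used above.

```r
# Hypothetical results table; probe IDs p1..p5 and gene names are made up
demo <- data.frame(
  Symbol    = c("GENEA /// GENEA2", "GENEA", "GENEB", "GENEC", "GENED"),
  logFC     = c(1.2, 0.4, -0.9, 2.0, -0.3),
  adj.P.Val = c(0.01, 0.20, 0.02, 0.03, 0.04),
  row.names = c("p1", "p2", "p3", "p4", "p5"),
  stringsAsFactors = FALSE
)

# Keep only the first symbol listed for each probe
demo$CleanSymbol <- gsub(" ///.*", "", demo$Symbol)

# Sort by significance, then drop later (less significant) probes of the same gene
demo_unique <- demo[order(demo$adj.P.Val), ]
demo_unique <- demo_unique[!duplicated(demo_unique$CleanSymbol), ]

n_top <- 2  # stands in for the 25 used in the heatmap

# Upregulated: largest positive logFC first
up <- demo_unique[demo_unique$logFC > 0, ]
up <- head(up[order(up$logFC, decreasing = TRUE), ], n_top)

# Downregulated: most negative logFC first
down <- demo_unique[demo_unique$logFC < 0, ]
down <- head(down[order(down$logFC), ], n_top)

# Row annotation sized from the actual counts, not a fixed number
regulation <- rep(c("Upregulated", "Downregulated"),
                  times = c(nrow(up), nrow(down)))

print(rownames(up))
print(rownames(down))
print(regulation)
```

Sizing the annotation from `nrow()` avoids a length mismatch when fewer than `n_top` genes exist in one direction.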

================================================================

6. Pathway Analysis, Reactome

================================================================

## [1] "There seem to be only 2-3 genes in this pathway analysis. Curious. Pathway enrichment reflects single-gene associations due to the limited number of differentially expressed genes under stringent thresholds. Let's do some additional exploration:"
##                                                             Description geneID
## R-HSA-416572     Sema4D induced cell migration and growth-cone collapse   RHOC
## R-HSA-5627117                                RHO GTPases Activate ROCKs   RHOC
## R-HSA-5625900                                  RHO GTPases activate CIT   RHOC
## R-HSA-400685                             Sema4D in semaphorin signaling   RHOC
## R-HSA-5625740                                 RHO GTPases activate PKNs   RHOC
## R-HSA-6781823                  Formation of TC-NER Pre-Incision Complex PRPF19
## R-HSA-373755                                    Semaphorin interactions   RHOC
## R-HSA-6782135                                   Dual incision in TC-NER PRPF19
## R-HSA-6782210   Gap-filling DNA repair synthesis and ligation in TC-NER PRPF19
## R-HSA-9013106                                         RHOC GTPase cycle   RHOC
## R-HSA-416482                          G alpha (12/13) signalling events   RHOC
## R-HSA-6781827 Transcription-Coupled Nucleotide Excision Repair (TC-NER) PRPF19
## R-HSA-5696398                                Nucleotide Excision Repair PRPF19
## R-HSA-5663220                              RHO GTPases Activate Formins   RHOC
## R-HSA-72163                               mRNA Splicing - Major Pathway PRPF19
## R-HSA-72172                                               mRNA Splicing PRPF19
## R-HSA-9918481                            Dengue Virus-Host Interactions PRPF19
## R-HSA-195258                                       RHO GTPase Effectors   RHOC
## R-HSA-73894                                                  DNA Repair PRPF19
## R-HSA-72203             Processing of Capped Intron-Containing Pre-mRNA PRPF19
## R-HSA-9839923                                    Dengue Virus Infection PRPF19
## R-HSA-9012999                                          RHO GTPase cycle   RHOC
## R-HSA-422475                                              Axon guidance   RHOC
## R-HSA-9675108                                Nervous system development   RHOC

## [1] "Downregulation of RHOC implicates RHO GTPase signaling and axon guidance pathways, while PRPF19 downregulation suggests impairment of transcription-coupled nucleotide excision repair and mRNA splicing."
## [1] FALSE
## [1] FALSE
## [1] FALSE
## [1] 17842
## [1] FALSE
## [1] 0
## [1] TRUE
##                          ID
## R-HSA-211979   R-HSA-211979
## R-HSA-4090294 R-HSA-4090294
## R-HSA-446203   R-HSA-446203
## R-HSA-196791   R-HSA-196791
## R-HSA-9843743 R-HSA-9843743
## R-HSA-9844594 R-HSA-9844594
##                                                                                   Description
## R-HSA-211979                                                                      Eicosanoids
## R-HSA-4090294                                          SUMOylation of intracellular receptors
## R-HSA-446203                                                Asparagine N-linked glycosylation
## R-HSA-196791                                                Vitamin D (calciferol) metabolism
## R-HSA-9843743         Transcriptional regulation of brown and beige adipocyte differentiation
## R-HSA-9844594 Transcriptional regulation of brown and beige adipocyte differentiation by EBF2
##                     NES  p.adjust
## R-HSA-211979  -1.936015 0.2992477
## R-HSA-4090294  1.947662 0.6468599
## R-HSA-446203  -1.495309 0.6468599
## R-HSA-196791   1.960427 0.7049877
## R-HSA-9843743  1.807618 0.7049877
## R-HSA-9844594  1.807618 0.7049877
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        core_enrichment
## R-HSA-211979                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   5740/4051/126410/8529/284541/11283/1580
## R-HSA-4090294                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            7329/7421/2099/8856/7067/7376/5465/6613/51588/9971/9063/10401
## R-HSA-446203  54344/124872/10382/2530/10559/10121/79090/51035/10802/79644/116150/1798/9526/256435/378/254263/55343/1781/374/55768/10402/51272/6185/140735/51143/57731/2890/11196/8655/2804/6811/201595/5861/6487/55907/10113/55860/51332/80896/7311/113457/1650/9871/832/91869/7109/122553/91949/51005/89866/23423/55741/9486/129807/5589/10825/4248/10540/57511/29925/3630/5265/29880/79748/5476/7369/9761/79694/256281/10427/55808/7316/81876/1453/140838/80086/2348/372/377/64689/199857/10972/3703/8702/79053/4759/746/51399/23243/4121/81849/8615/6708/22818/6184
## R-HSA-196791                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            7329/7421/1594/8029/5641/4036/26119/6613/51588
## R-HSA-9843743                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     3192/8648/83401/1108/57504/7350/5465/1107/3065/63976
## R-HSA-9844594                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     3192/8648/83401/1108/57504/7350/5465/1107/3065/63976
## [1] "Gene set enrichment analysis using Reactome did not identify pathways that remained significant after multiple-testing correction. However, nominal enrichment trends were observed for lipid-related pathways (eicosanoids, fatty acids) and regulatory processes such as vitamin D metabolism and SUMOylation, suggesting subtle, coordinated transcriptional shifts."
## 
## Note that we are NOT switching analyses; we are simply expanding the approach to capture results that might have been missed. The key differences are:
## 
## | Feature                | `enrichPathway()` | `gsePathway()` |
## | ---------------------- | ----------------- | -------------- |
## | Analysis type          | ORA               | GSEA           |
## | Uses cutoffs           | ✅ Yes             | ❌ No           |
## | Uses all genes         | ❌ No              | ✅ Yes          |
## | Sensitive to list size | ❌ Very            | ✅ No           |
## | Single-gene artifacts  | ❌ Common          | ✅ Avoided      |
## | Best for               | Strong DE         | Subtle shifts  |
## | Your dataset           | ❌ Fragile         | ✅ Ideal        |
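
To make the ORA versus GSEA distinction in the table concrete, here is a small base-R sketch. The counts are hypothetical; the t-statistics and Entrez IDs are copied from the topTable output earlier in this report. Ranking by the moderated t-statistic is an assumption, since the report does not state which statistic was passed to `gsePathway()`.

```r
# --- ORA: a cutoff-based gene list tested with the hypergeometric distribution ---
# Hypothetical counts, chosen only for illustration
universe_size <- 1000  # genes tested (background)
pathway_size  <- 50    # background genes annotated to the pathway
n_de          <- 20    # differentially expressed genes
overlap       <- 5     # DE genes that fall in the pathway

# P(overlap >= observed) under the hypergeometric distribution
p_ora <- phyper(overlap - 1, pathway_size, universe_size - pathway_size,
                n_de, lower.tail = FALSE)

# The same test written as a one-sided Fisher exact test on the 2x2 table
tab <- matrix(c(overlap, n_de - overlap,
                pathway_size - overlap,
                universe_size - pathway_size - n_de + overlap), nrow = 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value
print(all.equal(p_ora, p_fisher))

# Single-gene artifact: with only 3 DE genes, one hit in a 5-gene pathway
# already yields a small nominal p-value.
p_single <- phyper(0, 5, universe_size - 5, 3, lower.tail = FALSE)
print(p_single)

# --- GSEA: no cutoff; the input is every gene, ranked by a statistic ---
# Names are Entrez IDs, values are t-statistics from the topTable output
t_stats <- c("5021" = 5.664309, "64063" = 5.529658, "1735" = 5.153292,
             "27339" = -5.419120, "23598" = -4.996450)
gene_list <- sort(t_stats, decreasing = TRUE)  # decreasing order is required
print(gene_list)
```

A real GSEA run would pass the full ranked vector for all tested genes, not just five; the short vector here only shows the expected shape of the input.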

================================================================

7. Pathway Analysis, GO and KEGG - Alcohol.abuse, up-regulated ONLY

================================================================

House Keeping

Pathway Analysis Section: GO, KEGG, & Reactome

| Analysis Type | Function        | Best For                                                    |
| ------------- | --------------- | ----------------------------------------------------------- |
| GO (BP)       | `enrichGO`      | Broad biological mechanisms (e.g., "Cell Proliferation")    |
| KEGG          | `enrichKEGG`    | Well-defined metabolic/signaling maps (e.g., "Glycolysis")  |
| Reactome      | `enrichPathway` | Detailed molecular reactions and hierarchies                |

GO DETAIL:

| Category                | Question it Answers        | Level of Detail                |
| ----------------------- | -------------------------- | ------------------------------ |
| BP (Biological Process) | What is the overall goal?  | System-wide / Cellular program |
| MF (Molecular Function) | What is the chemical task? | Molecular / Biochemical        |
| CC (Cellular Component) | Where is this happening?   | Structural / Spatial           |
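
As a hedged sketch of how the three GO categories could be queried in one pass, the snippet below loops `clusterProfiler::enrichGO()` over BP, MF, and CC. The helper name `run_go` is hypothetical, and the three-gene input list is an assumption for illustration: the Entrez IDs are those shown in the `gene_assignment` column for three upregulated probes, not the full list used in this report. The call is guarded so the snippet still runs where the Bioconductor packages are not installed.

```r
# Entrez IDs read from the gene_assignment column of the topTable output;
# using just these three as input is an assumption for illustration.
up_genes <- c(OXTR = "5021", PRSS22 = "64063", DIO3 = "1735")

# Hypothetical helper: run enrichGO once per ontology and collect the results
run_go <- function(genes, ontologies = c("BP", "MF", "CC")) {
  results <- lapply(ontologies, function(ont) {
    clusterProfiler::enrichGO(
      gene          = unname(genes),
      OrgDb         = org.Hs.eg.db::org.Hs.eg.db,
      keyType       = "ENTREZID",
      ont           = ont,        # "BP", "MF", or "CC"
      pAdjustMethod = "BH",
      pvalueCutoff  = 0.05,
      qvalueCutoff  = 0.2,
      readable      = TRUE        # map Entrez IDs back to gene symbols
    )
  })
  setNames(results, ontologies)
}

# Only run when both packages are available
if (requireNamespace("clusterProfiler", quietly = TRUE) &&
    requireNamespace("org.Hs.eg.db", quietly = TRUE)) {
  go_results <- run_go(up_genes)
  # Number of enriched terms per ontology (0 if nothing passed the cutoffs)
  print(vapply(go_results,
               function(x) if (is.null(x)) 0L else nrow(as.data.frame(x)),
               integer(1)))
} else {
  message("clusterProfiler / org.Hs.eg.db not installed; skipping enrichGO sketch")
}
```

With so few input genes, most terms are unlikely to survive the cutoffs, which mirrors the small-list fragility of ORA noted in the Reactome section.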

Results Explained:

The observed transcriptional changes exert a compounded effect at the network level, whereby coordinated modulation of ER stress, proteasomal degradation, and NRF2-dependent antioxidant pathways converges on central hubs, producing a non-linear amplification of stress tolerance and cellular robustness.

References

## [1] "1. Lin YT, Deel MD, Linardic CM. RASSF4 is required for skeletal muscle differentiation. Cell Biol Int. 2020 Feb;44(2):381-390. doi: 10.1002/cbin.11238. Epub 2019 Sep 25. PMID: 31508857; PMCID: PMC6980882."
## [1] "2. McClintick JN, Xuei X, Tischfield JA, Goate A et al. Stress-response pathways are altered in the hippocampus of chronic alcoholics. Alcohol 2013 Nov;47(7):505-15. PMID: 23981442"
## [1] "3. Software:"
## Please cite the following if utilizing the GEOquery software:
## 
##   Davis S, Meltzer P (2007). "GEOquery: a bridge between the Gene
##   Expression Omnibus (GEO) and BioConductor." _Bioinformatics_, *14*,
##   1846-1847. doi:10.1093/bioinformatics/btm254
##   <https://doi.org/10.1093/bioinformatics/btm254>.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Article{,
##     author = {Sean Davis and Paul Meltzer},
##     title = {GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor},
##     journal = {Bioinformatics},
##     year = {2007},
##     volume = {14},
##     pages = {1846--1847},
##     doi = {10.1093/bioinformatics/btm254},
##   }
## To cite package 'limma' in publications use:
## 
##   Ritchie, M.E., Phipson, B., Wu, D., Hu, Y., Law, C.W., Shi, W., and
##   Smyth, G.K. (2015). limma powers differential expression analyses for
##   RNA-sequencing and microarray studies. Nucleic Acids Research 43(7),
##   e47.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Article{,
##     author = {Matthew E Ritchie and Belinda Phipson and Di Wu and Yifang Hu and Charity W Law and Wei Shi and Gordon K Smyth},
##     title = {{limma} powers differential expression analyses for {RNA}-sequencing and microarray studies},
##     journal = {Nucleic Acids Research},
##     year = {2015},
##     volume = {43},
##     number = {7},
##     pages = {e47},
##     doi = {10.1093/nar/gkv007},
##   }
## To cite package 'pheatmap' in publications use:
## 
##   Kolde R (2025). _pheatmap: Pretty Heatmaps_.
##   doi:10.32614/CRAN.package.pheatmap
##   <https://doi.org/10.32614/CRAN.package.pheatmap>, R package version
##   1.0.13, <https://CRAN.R-project.org/package=pheatmap>.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Manual{,
##     title = {pheatmap: Pretty Heatmaps},
##     author = {Raivo Kolde},
##     year = {2025},
##     note = {R package version 1.0.13},
##     url = {https://CRAN.R-project.org/package=pheatmap},
##     doi = {10.32614/CRAN.package.pheatmap},
##   }
## Please cite G. Yu (2015) for using ReactomePA. In addition, please cite
## G. Yu (2012) when using compareCluster in clusterProfiler package, G.
## Yu (2015) when applying enrichment analysis to NGS data by using
## ChIPseeker
## 
##   Guangchuang Yu, Qing-Yu He. ReactomePA: an R/Bioconductor package for
##   reactome pathway analysis and visualization. Molecular BioSystems
##   2016, 12(2):477-479
## 
## A BibTeX entry for LaTeX users is
## 
##   @Article{,
##     title = {ReactomePA: an R/Bioconductor package for reactome pathway analysis and visualization},
##     author = {Guangchuang Yu and Qing-Yu He},
##     journal = {Molecular BioSystems},
##     year = {2016},
##     volume = {12},
##     number = {12},
##     pages = {477-479},
##     pmid = {26661513},
##     url = {http://pubs.rsc.org/en/Content/ArticleLanding/2015/MB/C5MB00663E},
##     doi = {10.1039/C5MB00663E},
##   }
## Please cite S. Xu (2024) for using clusterProfiler. In addition, please
## cite G. Yu (2010) when using GOSemSim, G. Yu (2015) when using DOSE and
## G. Yu (2015) when using ChIPseeker.
## 
##   G Yu. Thirteen years of clusterProfiler. The Innovation. 2024,
##   5(6):100722
## 
##   S Xu, E Hu, Y Cai, Z Xie, X Luo, L Zhan, W Tang, Q Wang, B Liu, R
##   Wang, W Xie, T Wu, L Xie, G Yu. Using clusterProfiler to characterize
##   multiomics data. Nature Protocols. 2024, 19(11):3292-3320
## 
##   T Wu, E Hu, S Xu, M Chen, P Guo, Z Dai, T Feng, L Zhou, W Tang, L
##   Zhan, X Fu, S Liu, X Bo, and G Yu. clusterProfiler 4.0: A universal
##   enrichment tool for interpreting omics data. The Innovation. 2021,
##   2(3):100141
## 
##   Guangchuang Yu, Li-Gen Wang, Yanyan Han and Qing-Yu He.
##   clusterProfiler: an R package for comparing biological themes among
##   gene clusters. OMICS: A Journal of Integrative Biology 2012,
##   16(5):284-287
## 
## To see these entries in BibTeX format, use 'print(<citation>,
## bibtex=TRUE)', 'toBibtex(.)', or set
## 'options(citation.bibtex.max=999)'.
## To cite package 'org.Hs.eg.db' in publications use:
## 
##   Carlson M (2025). _org.Hs.eg.db: Genome wide annotation for Human_. R
##   package version 3.22.0.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Manual{,
##     title = {org.Hs.eg.db: Genome wide annotation for Human},
##     author = {Marc Carlson},
##     year = {2025},
##     note = {R package version 3.22.0},
##   }
## 
## ATTENTION: This citation information has been auto-generated from the
## package DESCRIPTION file and may need manual editing, see
## 'help("citation")'.
## To cite ggplot2 in publications, please use
## 
##   H. Wickham. ggplot2: Elegant Graphics for Data Analysis.
##   Springer-Verlag New York, 2016.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Book{,
##     author = {Hadley Wickham},
##     title = {ggplot2: Elegant Graphics for Data Analysis},
##     publisher = {Springer-Verlag New York},
##     year = {2016},
##     isbn = {978-3-319-24277-4},
##     url = {https://ggplot2.tidyverse.org},
##   }
## [1] "Jairam S, Edenberg HJ. An enhancer-blocking element regulates the cell-specific expression of alcohol dehydrogenase 7. Gene. 2014 Sep 1;547(2):239-44. doi: 10.1016/j.gene.2014.06.047. Epub 2014 Jun 24. PMID: 24971505; PMCID: PMC4136687."