R Notebook: Provides reproducible analysis for metatranscriptome (MT) data in the following manuscript:
Citation: Romanowicz, KJ, Crump, BC, Kling, GW. (2021) Rainfall alters permafrost soil redox conditions, but meta-omics show divergent microbial community responses by tundra type in the Arctic. Soil Systems 5(1): 17. https://doi.org/10.3390/soilsystems5010017
GitHub Repository: https://github.com/kromanowicz/2021-Romanowicz-SoilSystems
NCBI BioProject: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA666429
Accepted for Publication: Soil Systems 10 March 2021
This R Notebook provides complete reproducibility of the data analysis in “Rainfall alters permafrost soil redox conditions, but meta-omics show divergent microbial community responses by tundra type in the Arctic” by Romanowicz, Crump, and Kling. In this experiment, mesocosms containing soil from the active layer of two dominant tundra types were subjected to simulated rainfall to alter redox conditions. The microbial functional potential (metagenomics) and gene expression (metatranscriptomics) patterns were measured during saturated anoxic redox conditions prior to rainfall and at multiple time points following the simulated rainfall event. Other measurements include soil properties as well as microbial respiration (CO2) and methane (CH4) production from soil subsamples collected at each sampling time point. The purpose was to determine if rainfall, as a form of soil oxidation, is sufficient to alter the anoxic redox conditions in arctic tundra and enhance the microbial degradation of organic carbon and CH4 to CO2.
Conceptual Figure. A total of 12 tundra mesocosms (3 replicates x 2 tundra types x 2 sets of response cores) were acclimated initially under anoxic redox conditions to mimic field conditions (T0). Dissolved oxygen was supplied to soils through the downward flow of oxygenated water during a simulated rainfall event. Dissolved oxygen will likely change the redox gradient directly following rainfall (T4) as a short-term effect. Anoxic conditions will likely be re-established after 24 hours (T24) as the pulse of oxygen is consumed through abiotic and biotic soil processes. Under anoxic redox conditions (T0), microorganisms likely degrade organic carbon through anaerobic and fermentation pathways, producing CH4 and reducing Fe(III) to Fe(II). Rainfall-induced soil oxidation (T4) should stimulate heterotrophic microorganisms that degrade organic carbon and CH4 through aerobic metabolic pathways, releasing CO2. Soil oxidation should also stimulate aerobic autotrophic iron oxidizing bacteria that oxidize Fe(II) to Fe(III) and convert CO2 into microbial biomass. The long-term response (T24) will likely be a combination of aerobic and anaerobic metabolism as well as a combination of reduction and oxidation iron reactions as dissolved oxygen is consumed. The predicted redox conditions and predicted redox reactions for coupled Fe(II)/Fe(III) cycling, as well as the microbial-induced release of CO2 or CH4 at each time point are based on the predicted availability of dissolved oxygen entering tundra soils through simulated rainfall.
Soil Sampling for Microbial Gene Expression
An initial soil sampling event for microbial activity was conducted at the end of the anoxic acclimation period (4-7 days) in all mesocosm replicates, representing sampling time point T0. Mesocosms were then flushed to simulate a rainfall event. Additional soil sampling events were conducted at T4 (4 h) and T24 (24 h) following the rainfall event to determine the temporal extent of microbial gene expression. Soil cores (2.54 cm diameter, 30 cm length) were extracted in duplicate from each mesocosm replicate at each sampling time point and homogenized by depth in 10-cm increments. The 10-20 cm soil increment, composed of organic soil in all mesocosm replicates, was chosen for microbial gene expression analysis and preserved in RNAlater Stabilization Reagent in sterile tubes at 4°C for 18 hours and then stored at -80°C until extraction.
Field Experiment. Tundra soil cores were collected from field sites in August 2017 (top left) and placed in buckets to establish the mesocosm experiment (bottom left). Tussock tundra cores were composed of an organic soil layer overlying a mineral soil layer (top middle) while wet sedge tundra cores were composed entirely of organic soil (bottom middle). Soil subsampling for microbial activity was taken from the 10-20 cm depth of duplicate soil cores in Tussock (top right) and Wet Sedge (bottom right).
# Make a vector of required packages
required.packages <- c("ape","cowplot","data.table","devtools","dplyr","DT","ggplot2","ggpubr","grid","gridExtra","kableExtra","knitr","pheatmap","png","RColorBrewer","reshape","rstatix","statmod","stringr","tibble","tidyr","tidyverse","vegan","yaml")
# Load required packages
lapply(required.packages, library, character.only = TRUE)
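Because `library()` stops with an error at the first package that is not installed, it can be useful to check availability before loading. The following is a minimal sketch using base R only; the helper name `find_missing_packages` and the toy package names are hypothetical and not part of the original notebook.

```r
# Hypothetical helper: return the names of packages that are not installed
find_missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Toy input: two packages that ship with R and one made-up name
pkgs_to_check <- c("stats", "utils", "notARealPackage12345")
missing_pkgs <- find_missing_packages(pkgs_to_check)
print(missing_pkgs)
```

In the notebook itself, the same helper could be called on `required.packages` before the `lapply()` call, so that all missing packages are reported at once.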
Total RNA was extracted from soil samples and treated with RiboZERO rRNA removal (leaving mRNA as the bulk of the total RNA pool). The mRNA pool was sequenced on the Illumina HiSeq 4000 platform (150bp paired-end reads) at the University of Michigan Advanced Genomics Core.
| ID | Sample | Concentration (ng/uL) | Vol (uL) | 260/280 | 260/230 | RIN |
|---|---|---|---|---|---|---|
| S108379 | WS1-T0 | 36.5 | 50 | 1.60 | 0.90 | 5.0 |
| S108381 | WS3-T0 | 33.1 | 50 | 1.78 | 0.97 | 4.4 |
| S108382 | Tuss1-T0 | 61.9 | 50 | 1.90 | 1.25 | 4.9 |
| S108383 | Tuss2-T0 | 52.0 | 50 | 1.84 | 1.13 | 4.7 |
| S108385 | WS1-T4 | 45.1 | 50 | 1.69 | 0.98 | 5.7 |
| S108386 | WS2-T4 | 58.6 | 50 | 1.70 | 1.10 | 6.2 |
| S108388 | WS1-T24 | 59.9 | 42 | 1.63 | 0.96 | 4.1 |
| S108390 | WS3-T24 | 79.3 | 43 | 1.97 | 1.61 | 3.3 |
| S108391 | Tuss1-T4 | 32.9 | 44 | 1.75 | 1.14 | 4.8 |
| S108392 | Tuss2-T4 | 68.7 | 50 | 1.88 | 1.41 | 4.7 |
| S108394 | Tuss1-T24 | 114.0 | 50 | 1.69 | 1.25 | 2.5 |
| S108396 | Tuss3-T24 | 143.0 | 50 | 1.84 | 1.47 | 1.8 |
| S108380 | WS2-T0 | 10.3 | 40 | 1.44 | 0.91 | 0.0 |
| S108384 | Tuss3-T0 | 93.3 | 50 | 1.94 | 1.48 | 3.4 |
| S108387 | WS3-T4 | 31.9 | 45 | 1.48 | 0.71 | 5.4 |
| S108389 | WS2-T24 | 122.0 | 50 | 1.76 | 1.26 | 3.2 |
| S108393 | Tuss3-T4 | 69.3 | 50 | 1.84 | 1.22 | 4.4 |
| S108395 | Tuss2-T24 | 103.0 | 50 | 1.90 | 1.58 | 1.7 |
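NOTE: The six RNA samples listed last in the table (S108380, S108384, S108387, S108389, S108393, and S108395; highlighted in red in the original notebook) were excluded from sequencing.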
Prep and QC the Raw Reads (Hein: Comics – Omics Prep)
Raw RNA sequence reads (metatranscriptomics) were processed initially with the same Geomicro Omics container commands as for the raw DNA sequence reads (metagenomics), with several exceptions noted below:
- This prep step assumes the raw data to be available in uncompressed FASTQ format, one file per read direction and each sample’s data in its own directory
- The omics prep script will decompress and (if needed) concatenate split fastq files and put them into appropriately named directories
The omics qc script was run for each sample, but this command differs from the metagenome workflow by NOT dereplicating the RNA reads.
{Terminal}
cd /RNA_data/work
comics -- omics prep /RNA_data/1083*
comics -- omics qc --no-dereplicate 1083*
RNA.data.prep.qc.pbs (“/RNA_data/work/”) 00:15 (hr:min)
Output: .fastq files within each sample folder in the “RNA_data/work” directory, named “Sample_108379_fwd.fastq”, “Sample_108379_rev.fastq”, etc.
Export the nucleotide sequences of the Metagenome Anvi’o gene calls (for mapping MT reads to the MG genes)
{Terminal}
cd /DNA_data/work/co_assembly
anvi-get-sequences-for-gene-calls -c DNA_data_bbnorm_anvio_contigs.db -o DNA_data_bbnorm_anvio_gene_calls.fna
DNA.data.bbnorm.anvio.export.gene.seqs.pbs (“/RNA_data/work/”) 0:16 (hr:min)
2.2.1 Add the prefix “genecall” to all nucleotide sequences in the “DNA_data_bbnorm_anvio_gene_calls.fna” file
{Terminal}
cd /RNA_data/work/mapping
sed 's/>/>genecall_/g' DNA_data_bbnorm_anvio_gene_calls.fna > DNA_data_bbnorm_anvio_gene_calls_edited.fna
mv DNA_data_bbnorm_anvio_gene_calls_edited.fna DNA_data_bbnorm_anvio_gene_calls.fna
2.2.2 Use the following script to remove all empty sequences from the “DNA_data_bbnorm_anvio_gene_calls.fna” file
{Terminal}
cd /RNA_data/work/mapping
sed '/^>/ {N; /\n$/d}' DNA_data_bbnorm_anvio_gene_calls.fna > DNA_data_bbnorm_anvio_gene_calls_clean.fna
mv DNA_data_bbnorm_anvio_gene_calls_clean.fna DNA_data_bbnorm_anvio_gene_calls.fna
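To check the two `sed` steps on a small example, the same logic can be sketched in R. This sketch assumes that each sequence occupies a single line after its header, which the `sed` command above also relies on; the toy records are invented for illustration.

```r
# Toy FASTA lines; record ">1" has an empty sequence line
fasta <- c(">0", "ATGAAA", ">1", "", ">2", "GGCTAA")

# Step 1: add the "genecall_" prefix to every header line
is_header <- startsWith(fasta, ">")
fasta[is_header] <- sub("^>", ">genecall_", fasta[is_header])

# Step 2: drop each header that is immediately followed by an empty line,
# together with that empty line
next_line <- c(fasta[-1], NA)   # the line that follows each position
empty_rec <- which(is_header & !is.na(next_line) & next_line == "")
drop_idx  <- c(empty_rec, empty_rec + 1)
fasta_clean <- if (length(drop_idx) > 0) fasta[-drop_idx] else fasta
print(fasta_clean)
```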
Index the MG Genes and Map Back the MT Reads (Hein: Omics Mapping)
{Terminal}
cd /RNA_data/work/mapping
comics -- omics mapping -a DNA_data_bbnorm_anvio_gene_calls.fna --index-only
DNA.data.bbnorm.index.MG.genes.pbs (“/RNA_data/work/”) 00:26 (hr:min)
2.3.1 Mapping MT QC’d Reads from Each Sample to the Indexed MG Genes
{Terminal}
cd /RNA_data/work/mapping
comics -- omics mapping --index-dir /RNA_data/work/mapping/bowtie2-index_bbnorm_MG_genes/ -a DNA_data_bbnorm_anvio_gene_calls.fna -f Sample_108379_fwd.good.fastq -r Sample_108379_rev.good.fastq -o 108379_MT_mapped_reads
comics -- omics mapping --index-dir /RNA_data/work/mapping/bowtie2-index_bbnorm_MG_genes/ -a DNA_data_bbnorm_anvio_gene_calls.fna -f Sample_108381_fwd.good.fastq -r Sample_108381_rev.good.fastq -o 108381_MT_mapped_reads
comics -- omics mapping --index-dir /RNA_data/work/mapping/bowtie2-index_bbnorm_MG_genes/ -a DNA_data_bbnorm_anvio_gene_calls.fna -f Sample_108382_fwd.good.fastq -r Sample_108382_rev.good.fastq -o 108382_MT_mapped_reads
comics -- omics mapping --index-dir /RNA_data/work/mapping/bowtie2-index_bbnorm_MG_genes/ -a DNA_data_bbnorm_anvio_gene_calls.fna -f Sample_108383_fwd.good.fastq -r Sample_108383_rev.good.fastq -o 108383_MT_mapped_reads
comics -- omics mapping --index-dir /RNA_data/work/mapping/bowtie2-index_bbnorm_MG_genes/ -a DNA_data_bbnorm_anvio_gene_calls.fna -f Sample_108385_fwd.good.fastq -r Sample_108385_rev.good.fastq -o 108385_MT_mapped_reads
comics -- omics mapping --index-dir /RNA_data/work/mapping/bowtie2-index_bbnorm_MG_genes/ -a DNA_data_bbnorm_anvio_gene_calls.fna -f Sample_108386_fwd.good.fastq -r Sample_108386_rev.good.fastq -o 108386_MT_mapped_reads
comics -- omics mapping --index-dir /RNA_data/work/mapping/bowtie2-index_bbnorm_MG_genes/ -a DNA_data_bbnorm_anvio_gene_calls.fna -f Sample_108388_fwd.good.fastq -r Sample_108388_rev.good.fastq -o 108388_MT_mapped_reads
comics -- omics mapping --index-dir /RNA_data/work/mapping/bowtie2-index_bbnorm_MG_genes/ -a DNA_data_bbnorm_anvio_gene_calls.fna -f Sample_108390_fwd.good.fastq -r Sample_108390_rev.good.fastq -o 108390_MT_mapped_reads
comics -- omics mapping --index-dir /RNA_data/work/mapping/bowtie2-index_bbnorm_MG_genes/ -a DNA_data_bbnorm_anvio_gene_calls.fna -f Sample_108391_fwd.good.fastq -r Sample_108391_rev.good.fastq -o 108391_MT_mapped_reads
comics -- omics mapping --index-dir /RNA_data/work/mapping/bowtie2-index_bbnorm_MG_genes/ -a DNA_data_bbnorm_anvio_gene_calls.fna -f Sample_108392_fwd.good.fastq -r Sample_108392_rev.good.fastq -o 108392_MT_mapped_reads
comics -- omics mapping --index-dir /RNA_data/work/mapping/bowtie2-index_bbnorm_MG_genes/ -a DNA_data_bbnorm_anvio_gene_calls.fna -f Sample_108394_fwd.good.fastq -r Sample_108394_rev.good.fastq -o 108394_MT_mapped_reads
comics -- omics mapping --index-dir /RNA_data/work/mapping/bowtie2-index_bbnorm_MG_genes/ -a DNA_data_bbnorm_anvio_gene_calls.fna -f Sample_108396_fwd.good.fastq -r Sample_108396_rev.good.fastq -o 108396_MT_mapped_reads
RNA.data.read.mapping.DNA.data.bbnorm.MG.genes.pbs (“/RNA_data/work/”) 12:12 (hr:min)
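The twelve mapping commands above differ only in the sample ID. As a sketch, the command strings could be generated from a vector of IDs rather than typed by hand; the flags and paths are copied from the commands above, while the variable names are hypothetical. The code only builds and prints the strings and does not run anything.

```r
# Sample IDs of the twelve sequenced metatranscriptomes
sample_ids <- c("108379", "108381", "108382", "108383", "108385", "108386",
                "108388", "108390", "108391", "108392", "108394", "108396")
index_dir <- "/RNA_data/work/mapping/bowtie2-index_bbnorm_MG_genes/"

# sprintf() is vectorized, so one template yields one command per sample
cmds <- sprintf(
  paste("comics -- omics mapping --index-dir %s",
        "-a DNA_data_bbnorm_anvio_gene_calls.fna",
        "-f Sample_%s_fwd.good.fastq -r Sample_%s_rev.good.fastq",
        "-o %s_MT_mapped_reads"),
  index_dir, sample_ids, sample_ids, sample_ids
)
writeLines(cmds)
```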
The results for each individual sample included a mapping directory ("/RNA_data/work/mapping/*_MT_mapped_reads"; replace * with SampleID) containing the following files:
- sorted.bam file, which contains the index
Convert the MT .bam Files to count tables to be processed statistically
{Terminal}
cd /RNA_data/work/mapping/108379_MT_mapped_reads
pileup.sh in=Sample_108379_RNA_data_bbnorm_anvio_gene_calls.sorted.bam out=Sample_108379_RNA_data_bbnorm_anvio_gene_calls.pileup
#Output
head Sample_108379_RNA_data_bbnorm_anvio_gene_calls.pileup
#ID Avg_fold Length Ref_GC Covered_percent Covered_bases Plus_reads Minus_reads Read_GC Median_fold Std_Dev
genecall_0 0.0000 300 0.0000 0.0000 0 0 0 0.0000 0 0.00
genecall_1 0.0000 318 0.0000 0.0000 0 0 0 0.0000 0 0.00
genecall_2 0.0000 648 0.0000 0.0000 0 0 0 0.0000 0 0.00
genecall_3 0.0000 375 0.0000 0.0000 0 0 0 0.0000 0 0.00
genecall_4 0.0000 339 0.0000 0.0000 0 0 0 0.0000 0 0.00
genecall_5 0.0000 141 0.0000 0.0000 0 0 0 0.0000 0 0.00
genecall_6 0.0000 132 0.0000 0.0000 0 0 0 0.0000 0 0.00
genecall_7 1.9642 363 0.0000 92.8375 337 4 1 0.7149 1 1.39
genecall_8 0.0000 402 0.0000 0.0000 0 0 0 0.0000 0 0.00
Run pileup.sh on the sorted.bam file of each remaining sample to get a table of genes and their counts
{Terminal}
cd /RNA_data/work/mapping/108381_MT_mapped_reads
pileup.sh in=Sample_108381_RNA_data_bbnorm_anvio_gene_calls.sorted.bam out=Sample_108381_RNA_data_bbnorm_anvio_gene_calls.pileup
cd /RNA_data/work/mapping/108382_MT_mapped_reads
pileup.sh in=Sample_108382_RNA_data_bbnorm_anvio_gene_calls.sorted.bam out=Sample_108382_RNA_data_bbnorm_anvio_gene_calls.pileup
cd /RNA_data/work/mapping/108383_MT_mapped_reads
pileup.sh in=Sample_108383_RNA_data_bbnorm_anvio_gene_calls.sorted.bam out=Sample_108383_RNA_data_bbnorm_anvio_gene_calls.pileup
cd /RNA_data/work/mapping/108385_MT_mapped_reads
pileup.sh in=Sample_108385_RNA_data_bbnorm_anvio_gene_calls.sorted.bam out=Sample_108385_RNA_data_bbnorm_anvio_gene_calls.pileup
cd /RNA_data/work/mapping/108386_MT_mapped_reads
pileup.sh in=Sample_108386_RNA_data_bbnorm_anvio_gene_calls.sorted.bam out=Sample_108386_RNA_data_bbnorm_anvio_gene_calls.pileup
cd /RNA_data/work/mapping/108388_MT_mapped_reads
pileup.sh in=Sample_108388_RNA_data_bbnorm_anvio_gene_calls.sorted.bam out=Sample_108388_RNA_data_bbnorm_anvio_gene_calls.pileup
cd /RNA_data/work/mapping/108390_MT_mapped_reads
pileup.sh in=Sample_108390_RNA_data_bbnorm_anvio_gene_calls.sorted.bam out=Sample_108390_RNA_data_bbnorm_anvio_gene_calls.pileup
cd /RNA_data/work/mapping/108391_MT_mapped_reads
pileup.sh in=Sample_108391_RNA_data_bbnorm_anvio_gene_calls.sorted.bam out=Sample_108391_RNA_data_bbnorm_anvio_gene_calls.pileup
cd /RNA_data/work/mapping/108392_MT_mapped_reads
pileup.sh in=Sample_108392_RNA_data_bbnorm_anvio_gene_calls.sorted.bam out=Sample_108392_RNA_data_bbnorm_anvio_gene_calls.pileup
cd /RNA_data/work/mapping/108394_MT_mapped_reads
pileup.sh in=Sample_108394_RNA_data_bbnorm_anvio_gene_calls.sorted.bam out=Sample_108394_RNA_data_bbnorm_anvio_gene_calls.pileup
cd /RNA_data/work/mapping/108396_MT_mapped_reads
pileup.sh in=Sample_108396_RNA_data_bbnorm_anvio_gene_calls.sorted.bam out=Sample_108396_RNA_data_bbnorm_anvio_gene_calls.pileup
RNA.data.bam.to.pile.up.pbs (“/RNA_data/work/”) 00:01 (hr:min)
The following table is a summary of the RNA reads by sample and includes QC Reads, Mapped Reads, and Average Read Length (bp). All values were derived from the “Sample_*.final.contigs.fixed.bbnorm.sorted.bam” files using samtools stats.
| ID | Tundra | Time | Replicate | QC Reads | Mapped Reads | Average Length (bp) |
|---|---|---|---|---|---|---|
| S108382 | Tussock | T0 | 1 | 58,295,400 | 9,229,059 | 127 |
| S108383 | Tussock | T0 | 2 | 50,118,466 | 6,982,871 | 127 |
| S108391 | Tussock | T4 | 1 | 53,553,208 | 8,338,151 | 125 |
| S108392 | Tussock | T4 | 2 | 50,890,114 | 7,387,851 | 128 |
| S108394 | Tussock | T24 | 1 | 57,109,114 | 7,271,424 | 127 |
| S108396 | Tussock | T24 | 3 | 41,558,870 | 5,280,460 | 129 |
| S108379 | Wet Sedge | T0 | 1 | 64,669,902 | 15,825,372 | 118 |
| S108381 | Wet Sedge | T0 | 3 | 57,835,184 | 10,696,764 | 129 |
| S108385 | Wet Sedge | T4 | 1 | 53,169,696 | 11,952,373 | 129 |
| S108386 | Wet Sedge | T4 | 2 | 53,512,234 | 12,152,652 | 130 |
| S108388 | Wet Sedge | T24 | 1 | 47,583,386 | 8,193,077 | 130 |
| S108390 | Wet Sedge | T24 | 3 | 43,063,128 | 6,787,536 | 129 |
Following the bioinformatics steps above, quality-controlled metatranscriptome reads were mapped to KEGG-annotated coding sequences (CDS) indexed from the metagenome assembly using BBMap to generate pile-up files (average read depth per gene), and SAMtools was used to extract counts and CDS lengths from the BBMap output.
In this section, all RNA.pileup files are imported into the R environment and combined into the first FULL dataset to be further formatted for downstream statistical analysis.
The following RNA.pileup files (average read depth per gene) for each metatranscriptome sample were generated via the BBMap module.
#S108379 -- WS_T0_R1
pileup_108379<-read.table("Pileup.Data/RNA.Pileup/Sample_108379_RNA_data_bbnorm_anvio_gene_calls.pileup", header = TRUE)
pileup_108379$S108379 <- pileup_108379$Plus_reads + pileup_108379$Minus_reads
#S108381 -- WS_T0_R3
pileup_108381<-read.table("Pileup.Data/RNA.Pileup/Sample_108381_RNA_data_bbnorm_anvio_gene_calls.pileup", header = TRUE)
pileup_108381$S108381 <- pileup_108381$Plus_reads + pileup_108381$Minus_reads
#S108385 -- WS_T4_R1
pileup_108385<-read.table("Pileup.Data/RNA.Pileup/Sample_108385_RNA_data_bbnorm_anvio_gene_calls.pileup", header = TRUE)
pileup_108385$S108385 <- pileup_108385$Plus_reads + pileup_108385$Minus_reads
#S108386 -- WS_T4_R2
pileup_108386<-read.table("Pileup.Data/RNA.Pileup/Sample_108386_RNA_data_bbnorm_anvio_gene_calls.pileup", header = TRUE)
pileup_108386$S108386 <- pileup_108386$Plus_reads + pileup_108386$Minus_reads
#S108388 -- WS_T24_R1
pileup_108388<-read.table("Pileup.Data/RNA.Pileup/Sample_108388_RNA_data_bbnorm_anvio_gene_calls.pileup", header = TRUE)
pileup_108388$S108388 <- pileup_108388$Plus_reads + pileup_108388$Minus_reads
#S108390 -- WS_T24_R3
pileup_108390<-read.table("Pileup.Data/RNA.Pileup/Sample_108390_RNA_data_bbnorm_anvio_gene_calls.pileup", header = TRUE)
pileup_108390$S108390 <- pileup_108390$Plus_reads + pileup_108390$Minus_reads
#S108382 -- TUSS_T0_R1
pileup_108382<-read.table("Pileup.Data/RNA.Pileup/Sample_108382_RNA_data_bbnorm_anvio_gene_calls.pileup", header = TRUE)
pileup_108382$S108382 <- pileup_108382$Plus_reads + pileup_108382$Minus_reads
#S108383 -- TUSS_T0_R2
pileup_108383<-read.table("Pileup.Data/RNA.Pileup/Sample_108383_RNA_data_bbnorm_anvio_gene_calls.pileup", header = TRUE)
pileup_108383$S108383 <- pileup_108383$Plus_reads + pileup_108383$Minus_reads
#S108391 -- TUSS_T4_R1
pileup_108391<-read.table("Pileup.Data/RNA.Pileup/Sample_108391_RNA_data_bbnorm_anvio_gene_calls.pileup", header = TRUE)
pileup_108391$S108391 <- pileup_108391$Plus_reads + pileup_108391$Minus_reads
#S108392 -- TUSS_T4_R2
pileup_108392<-read.table("Pileup.Data/RNA.Pileup/Sample_108392_RNA_data_bbnorm_anvio_gene_calls.pileup", header = TRUE)
pileup_108392$S108392 <- pileup_108392$Plus_reads + pileup_108392$Minus_reads
#S108394 -- TUSS_T24_R1
pileup_108394<-read.table("Pileup.Data/RNA.Pileup/Sample_108394_RNA_data_bbnorm_anvio_gene_calls.pileup", header = TRUE)
pileup_108394$S108394 <- pileup_108394$Plus_reads + pileup_108394$Minus_reads
#S108396 -- TUSS_T24_R3
pileup_108396<-read.table("Pileup.Data/RNA.Pileup/Sample_108396_RNA_data_bbnorm_anvio_gene_calls.pileup", header = TRUE)
pileup_108396$S108396 <- pileup_108396$Plus_reads + pileup_108396$Minus_reads
In the previous step, each RNA.pileup file was imported and a column was added that sums the “Plus_reads” and “Minus_reads” columns. The sum column in each RNA.pileup file was named after the sample (e.g., “S108379”). All RNA.pileup files contain identical genecalls in an identical order, which allows us to take the genecall ID and genecall Length columns from a single sample (S108379), along with the summed count column of each sample, and create a new dataframe that represents the initial, FULL RNA dataset for downstream analyses:
cts_rna_all <- data.frame(pileup_108379$ID, pileup_108379$Length, pileup_108379$S108379, pileup_108381$S108381, pileup_108382$S108382, pileup_108383$S108383, pileup_108385$S108385, pileup_108386$S108386, pileup_108388$S108388, pileup_108390$S108390, pileup_108391$S108391, pileup_108392$S108392, pileup_108394$S108394, pileup_108396$S108396)
names(cts_rna_all) <- c("ID", "Length", "S108379", "S108381", "S108382", "S108383", "S108385", "S108386", "S108388", "S108390", "S108391", "S108392", "S108394", "S108396")
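The import and combine steps can also be sketched as a loop that verifies the "identical genecalls in an identical order" assumption before columns are bound together. This is an illustrative sketch on invented toy files, not the notebook's own code; the helper `read_pileup` and the sample names `S1` and `S2` are hypothetical. The reader disables comment handling so that a header line beginning with `#` is kept as the header rather than skipped.

```r
# Write two toy pileup tables to temporary files (values invented)
toy_dir <- tempfile("pileup_")
dir.create(toy_dir)
sample_ids <- c("S1", "S2")
toy <- list(
  S1 = data.frame(ID = c("genecall_0", "genecall_1"), Length = c(300, 318),
                  Plus_reads = c(4, 0), Minus_reads = c(1, 0)),
  S2 = data.frame(ID = c("genecall_0", "genecall_1"), Length = c(300, 318),
                  Plus_reads = c(2, 3), Minus_reads = c(2, 0))
)
for (s in sample_ids) {
  path <- file.path(toy_dir, paste0(s, ".pileup"))
  # Header line starts with "#", as in the pileup output shown earlier
  writeLines(paste0("#", paste(names(toy[[s]]), collapse = "\t")), path)
  write.table(toy[[s]], path, append = TRUE, sep = "\t", quote = FALSE,
              row.names = FALSE, col.names = FALSE)
}

# Hypothetical reader: keep the "#"-prefixed header, then sum both strands
read_pileup <- function(path) {
  x <- read.table(path, header = TRUE, sep = "\t", comment.char = "",
                  check.names = FALSE, stringsAsFactors = FALSE)
  names(x)[1] <- sub("^#", "", names(x)[1])
  x$count <- x$Plus_reads + x$Minus_reads
  x
}
pileups <- lapply(sample_ids, function(s) {
  read_pileup(file.path(toy_dir, paste0(s, ".pileup")))
})
names(pileups) <- sample_ids

# Confirm every sample lists the same genecalls in the same order
same_ids <- all(vapply(pileups, function(x) identical(x$ID, pileups[[1]]$ID),
                       logical(1)))
stopifnot(same_ids)

# Combine ID, Length, and one count column per sample
cts_toy <- data.frame(ID = pileups[[1]]$ID, Length = pileups[[1]]$Length,
                      lapply(pileups, `[[`, "count"))
print(cts_toy)
```

The explicit identity check matters because binding columns by position silently produces wrong counts if any file is ordered differently.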
The KEGG functions file and KEGG taxonomy file (both from GhostKOALA) were imported, and the functional and taxonomic annotations were matched to their corresponding genecall ID.
First, a KEGG reference file was imported that contains the KEGG Tier pathway hierarchy, which provides useful functional categories for downstream analyses. The Tier II, III, and IV categories were merged with the GhostKOALA KEGG functional annotations in subsequent steps.
# Import KO_Orthology.txt reference file
KO_ref<-read.table(file='Annotations.Data/KO_Orthology.txt', sep='\t', quote="", fill=TRUE, header = FALSE)
# Rename column headers
names(KO_ref)<-c("Tier_II","Tier_III","Tier_IV","KEGG")
# Separate "KEGG" column into "KO" and "Annotation" columns
KO_ref<-KO_ref %>% separate(KEGG, c("KO_Accession","Annotation"), " ", extra="merge")
# Further separate "Annotation" column into "Symbol" and "Function" columns
KO_ref<-KO_ref %>% separate(Annotation, c("KO_Symbol","KO_Function"), "; ", extra="merge")
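To make the two splitting steps concrete, here is a small base R sketch of the same idea on two example strings. It assumes the reference entries have the form "accession, single space, symbol(s), semicolon, function", which is what the separators used above imply; base R `sub()` stands in for `separate()` here, and the variable names are hypothetical.

```r
# Two example KEGG entries in the assumed "KO symbol; function" layout
kegg <- c("K00844 HK; hexokinase [EC:2.7.1.1]",
          "K00001 E1.1.1.1, adh; alcohol dehydrogenase [EC:1.1.1.1]")

# Split at the first space: accession vs. the rest of the annotation
ko_accession <- sub(" .*$", "", kegg)
annotation   <- sub("^[^ ]+ ", "", kegg)

# Split the annotation at the first "; ": symbol vs. function
ko_symbol   <- sub("; .*$", "", annotation)
ko_function <- sub("^.*?; ", "", annotation, perl = TRUE)

print(data.frame(ko_accession, ko_symbol, ko_function))
```

Note that the symbol field can itself contain spaces and commas (as in the second entry), which is why the split is made at the first space and then at the first "; " rather than at every space.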
There are 3,917,616 total unique DNA-based genecalls in the metagenomic dataset. The GhostKOALA KEGG functional annotations file (“DNA_data_bbnorm_anvio_contigs_annotations.tsv”) contains annotations for 1,389,518 unique DNA-based genecalls. This indicates that 35.5% of the total unique DNA-based genecalls can be functionally annotated. That also means that 64.5% of unique DNA-based genecalls cannot be annotated and will not be included in downstream analyses. Here, the unique genecall functional annotations will be matched to the same unique genecalls in the DNA-based metagenomic dataset.
# Import the KEGG Functional Annotations file that was exported from the Anvi'o DNA contigs database
KeggAnvio<-read.table(file='Annotations.Data/DNA_data_bbnorm_anvio_contigs_annotations.tsv', sep='\t', quote = "", fill = TRUE, header = TRUE)
# Rename column headers
names(KeggAnvio)<-c("ID","Source","Accession","Function","e-value")
# Separate "Function" column into "Symbol" and "Function" columns
KeggAnvio<-KeggAnvio %>% separate(Function, c("Symbol","Function"), "; ", extra="merge")
# Add "genecall_" to the beginning of each gene ID value to match with downstream analyses
KeggAnvio$ID <- paste("genecall", KeggAnvio$ID, sep="_")
# Add "KO_Symbol" category to the dataset, matched by KO's between the KO_ref file and KeggAnvio
KeggAnvio$KO_Symbol = KO_ref[match(KeggAnvio$Accession, KO_ref$KO_Accession), "KO_Symbol"]
# Add "KO_Function" category to the dataset, matched by KO's between the KO_ref file and KeggAnvio
KeggAnvio$KO_Function = KO_ref[match(KeggAnvio$Accession, KO_ref$KO_Accession), "KO_Function"]
# Add "Tier_II" category to the dataset, matched by KO's between the KO_ref file and KeggAnvio
KeggAnvio$Tier_II = KO_ref[match(KeggAnvio$Accession, KO_ref$KO_Accession), "Tier_II"]
# Add "Tier_III" category to the dataset, matched by KO's between the KO_ref file and KeggAnvio
KeggAnvio$Tier_III = KO_ref[match(KeggAnvio$Accession, KO_ref$KO_Accession), "Tier_III"]
# Add "Tier_IV" category to the dataset, matched by KO's between the KO_ref file and KeggAnvio
KeggAnvio$Tier_IV = KO_ref[match(KeggAnvio$Accession, KO_ref$KO_Accession), "Tier_IV"]
# Keep the Function category from the KO Reference file to potentially fill in gaps in KeggAnvio annotations (many KO's labeled as "None", but the KO should match a known function from the reference file).
KeggFunction<-data.frame(KeggAnvio$ID,KeggAnvio$Accession,KeggAnvio$Symbol,KeggAnvio$KO_Function,KeggAnvio$Tier_II,KeggAnvio$Tier_III,KeggAnvio$Tier_IV)
names(KeggFunction)<-c("ID","KO","Symbol","Function","Tier_II","Tier_III","Tier_IV")
# Merge the KO, Symbol, Function, and all Tier columns into single "Combined" column for some downstream analyses
KeggFunction$Combined<-paste(KeggFunction$KO, ":", KeggFunction$Symbol, ":", KeggFunction$Function, ":", KeggFunction$Tier_II, ":", KeggFunction$Tier_III, ":", KeggFunction$Tier_IV)
# Keep just "ID" and "Combined" columns for use with some downstream analyses
KeggData<-data.frame(KeggFunction$ID, KeggFunction$Combined)
names(KeggData)<-c("ID","KEGG")
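The `match()`-based lookups above behave like a left join that keeps only the first matching reference row. The sketch below illustrates this on invented data; the accession `K99999` and the category labels are placeholders, not real KEGG content.

```r
# Toy reference table; K00001 appears twice (e.g., listed under two pathways)
ref <- data.frame(KO_Accession = c("K00001", "K00844", "K00001"),
                  Tier_IV = c("pathway_A", "pathway_B", "pathway_C"),
                  stringsAsFactors = FALSE)

# Toy gene annotations; K99999 has no entry in the reference
genes <- data.frame(ID = c("genecall_0", "genecall_1", "genecall_2"),
                    Accession = c("K00844", "K99999", "K00001"),
                    stringsAsFactors = FALSE)

# match() returns the row of the FIRST match, or NA when there is none
idx <- match(genes$Accession, ref$KO_Accession)
genes$Tier_IV <- ref$Tier_IV[idx]
print(genes)
```

Two consequences follow: accessions missing from the reference receive `NA`, and a KO that is listed more than once in the reference is assigned only the first listed category.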
There are 3,917,616 total unique DNA-based genecalls in the metagenomic dataset. The GhostKOALA KEGG taxonomy file (“KeggTaxonomy.txt”) contains annotations for 3,912,253 unique DNA-based genecalls. This indicates that 99.9% of the total unique DNA-based genecalls can be taxonomically annotated. Here, the unique genecall taxonomic annotations will be matched to the same unique genecalls in the DNA-based metagenomic dataset.
# Import KeggTaxonomy.txt from GhostKOALA
KeggTaxa<-read.table("Annotations.Data/KeggTaxonomy.txt", header = TRUE, fill = TRUE)
names(KeggTaxa)<-c("ID","Domain","Phylum","Class","Order","Family","Genus","Species")
# Merge the taxonomic classes into a single column
KeggTaxa$Taxonomy <- paste(KeggTaxa$Domain, ":", KeggTaxa$Phylum, ":", KeggTaxa$Class, ":", KeggTaxa$Order, ":", KeggTaxa$Family, ":", KeggTaxa$Genus, ":", KeggTaxa$Species)
# Now keep just "ID" and "Taxonomy"
KeggTaxa <- data.frame(KeggTaxa$ID, KeggTaxa$Taxonomy)
names(KeggTaxa)<-c("ID","Taxonomy")
# Add "genecall_" to the beginning of each gene ID value to match with downstream analyses
KeggTaxa$ID<-paste("genecall", KeggTaxa$ID, sep="_")
The FULL RNA CTS dataset was assembled as a single dataframe (columns: [1] genecall, [2] length, [3-14] metatranscriptome samples; rows: 3,917,616 unique “genecalls”), and subsequently subset into two new dataframes:
Subset the unique genecalls that have greater than zero raw read counts in at least one sample (“Expressed Genes”).
#cts_rna_expressed
cts_rna_expressed <- subset(cts_rna_all, S108379 > 0 | S108381 > 0 | S108382 > 0 | S108383 > 0 | S108385 > 0 | S108386 > 0 | S108388 > 0 | S108390 > 0 | S108391 > 0 | S108392 > 0 | S108394 > 0 | S108396 > 0, select=c(ID,Length,S108379,S108381,S108382,S108383,S108385,S108386,S108388,S108390,S108391,S108392,S108394,S108396))
Add functional and taxonomic annotations to each unique genecall within the “Expressed” subdata.
# Expressed Functional and Taxonomic Annotations
cts_rna_expressed$KEGG = KeggData[match(cts_rna_expressed$ID, KeggData$ID), "KEGG"]
cts_rna_expressed$Taxonomy = KeggTaxa[match(cts_rna_expressed$ID, KeggTaxa$ID), "Taxonomy"]
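# Expressed Annotated Genes
# Note: cts_rna_expressed is a plain data.frame, so na.omit() ignores `cols` and
# drops rows with NA in any column (KEGG or Taxonomy)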
cts_rna_exp_annotated<-na.omit(cts_rna_expressed, cols="KEGG")
Also subset the unique genecalls that have zero raw read counts in every sample (“Non-expressed Genes”).
#cts_rna_zero
cts_rna_zero <- subset(cts_rna_all, S108379 == 0 & S108381 == 0 & S108382 == 0 & S108383 == 0 & S108385 == 0 & S108386 == 0 & S108388 == 0 & S108390 == 0 & S108391 == 0 & S108392 == 0 & S108394 == 0 & S108396 == 0, select=c(ID,Length,S108379,S108381,S108382,S108383,S108385,S108386,S108388,S108390,S108391,S108392,S108394,S108396))
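The two `subset()` conditions (any sample above zero versus all samples equal to zero) can be written more compactly with row sums. The following sketch uses an invented toy table and hypothetical names; it is equivalent to the long conditions only when counts are non-negative and contain no `NA` values, which holds for raw read counts.

```r
# Toy count table with two samples (values invented)
cts_toy <- data.frame(ID = paste0("genecall_", 0:3),
                      Length = c(300, 318, 648, 375),
                      S1 = c(0, 5, 0, 2),
                      S2 = c(0, 0, 0, 7),
                      stringsAsFactors = FALSE)
sample_cols <- c("S1", "S2")

# Total reads per genecall across all samples
total_reads <- rowSums(cts_toy[, sample_cols])

expressed_toy <- cts_toy[total_reads > 0, ]   # at least one read somewhere
zero_toy      <- cts_toy[total_reads == 0, ]  # no reads in any sample
print(expressed_toy$ID)
print(zero_toy$ID)
```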
Add functional and taxonomic annotations to each unique genecall within the “Non-Expressed” subdata.
# Non-Expressed Functional and Taxonomic Annotations
cts_rna_zero$KEGG = KeggData[match(cts_rna_zero$ID, KeggData$ID), "KEGG"]
cts_rna_zero$Taxonomy = KeggTaxa[match(cts_rna_zero$ID, KeggTaxa$ID), "Taxonomy"]
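# Non-Expressed Annotated Genes
# Note: cts_rna_zero is a plain data.frame, so na.omit() ignores `cols` and
# drops rows with NA in any column (KEGG or Taxonomy)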
cts_rna_zero_annotated<-na.omit(cts_rna_zero, cols="KEGG")
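One detail of the `na.omit()` calls above is worth illustrating: the `cols` argument is honored by the data.table method, but the base R method for plain data frames accepts and ignores it. The toy example below (invented values, hypothetical names) contrasts that behavior with an explicit filter on the KEGG column alone.

```r
# Toy annotations: g1 lacks a KEGG annotation, g2 lacks a taxonomy
toy <- data.frame(ID = c("g0", "g1", "g2"),
                  KEGG = c("K00001", NA, "K00844"),
                  Taxonomy = c("Bacteria", "Bacteria", NA),
                  stringsAsFactors = FALSE)

# On a data.frame, `cols` is silently ignored: any NA removes the row
dropped_any_na <- na.omit(toy, cols = "KEGG")

# Explicit alternative that filters on the KEGG column only
dropped_kegg_na <- toy[!is.na(toy$KEGG), ]

print(dropped_any_na$ID)
print(dropped_kegg_na$ID)
```

In the notebook, this means the "annotated" tables contain genecalls that have both a KEGG annotation and a taxonomy; whether the stricter or the KEGG-only filter is preferable depends on the analysis.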
From this point, we retain only those genecalls that were expressed and have a KEGG functional annotation. All remaining genecalls (including the 64.5% of genecalls without a functional annotation) are excluded from downstream analyses. Start by separating the KEGG categories into unique columns. Remove any genecall whose symbol annotation is “None”. Further remove any genecall annotated as Tier II “Organismal Systems” or “Human Diseases”.
# Separate "KEGG" column into "KO", "Symbol", "Function", "Tier_II", "Tier_III", and "Tier_IV" columns
cts_rna_exp_annotated<-cts_rna_exp_annotated %>% separate(KEGG, c("KO","Symbol","Function","Tier_II","Tier_III","Tier_IV"), ": ", extra="merge")
# Remove KO's with "None" as the "Symbol" annotation
cts_rna_exp_annotated<-cts_rna_exp_annotated[!grepl("None", cts_rna_exp_annotated$Symbol),]
# Remove KO's with Tier II category "Organismal Systems"
cts_rna_exp_annotated<-cts_rna_exp_annotated[!grepl("Organismal Systems ", cts_rna_exp_annotated$Tier_II),]
# Remove KO's with Tier II category "Human Diseases"
cts_rna_exp_annotated<-cts_rna_exp_annotated[!grepl("Human Diseases ", cts_rna_exp_annotated$Tier_II),]
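The trailing space in patterns such as "Organismal Systems " follows from how the combined KEGG string was built: `paste()` with its default separator puts a space on both sides of each ":", and splitting on ": " removes only the right-hand one. A small base R sketch with an invented entry, where `strsplit()` stands in for `separate()`:

```r
# paste() with the default sep = " " yields "K00844 : HK : hexokinase : Metabolism"
combined <- paste("K00844", ":", "HK", ":", "hexokinase", ":", "Metabolism")

# Splitting on ": " leaves a trailing space on every field except the last
parts <- strsplit(combined, ": ", fixed = TRUE)[[1]]
print(parts)

# trimws() removes the leftover whitespace
parts_trimmed <- trimws(parts)

# Alternative: use ": " as the separator so the round trip is clean
combined_clean <- paste("K00844", "HK", "hexokinase", "Metabolism", sep = ": ")
parts_clean <- strsplit(combined_clean, ": ", fixed = TRUE)[[1]]
```

Either trimming the fields or pasting with `sep = ": "` would allow exact matches such as `Tier_II == "Human Diseases"` instead of substring patterns with a trailing space.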
Separate the Taxonomy categories into unique columns. Remove any genecall whose taxonomy is outside of “Archaea”, “Bacteria”, or “Fungi”.
# Separate "Taxonomy" column into "Kingdom", "Phylum", "Class", "Order", "Family", "Genus", and "Species" columns
cts_rna_exp_annotated<-cts_rna_exp_annotated %>% separate(Taxonomy, c("Kingdom","Phylum","Class","Order","Family","Genus","Species"), ": ", extra="merge")
# Remove KO's with "ag" as the "Kingdom" annotation
cts_rna_exp_annotated<-cts_rna_exp_annotated[!grepl("ag", cts_rna_exp_annotated$Kingdom),]
# Remove KO's with "Animals" as the "Kingdom" annotation
cts_rna_exp_annotated<-cts_rna_exp_annotated[!grepl("Animals", cts_rna_exp_annotated$Kingdom),]
# Remove KO's with "Plants" as the "Kingdom" annotation
cts_rna_exp_annotated<-cts_rna_exp_annotated[!grepl("Plants", cts_rna_exp_annotated$Kingdom),]
# Remove KO's with "Protists" as the "Kingdom" annotation
cts_rna_exp_annotated<-cts_rna_exp_annotated[!grepl("Protists", cts_rna_exp_annotated$Kingdom),]
#Re-merge the Taxonomy columns for downstream analysis
cts_rna_exp_annotated$Taxonomy <- paste(cts_rna_exp_annotated$Kingdom, ":", cts_rna_exp_annotated$Phylum, ":", cts_rna_exp_annotated$Class, ":", cts_rna_exp_annotated$Order, ":", cts_rna_exp_annotated$Family, ":", cts_rna_exp_annotated$Genus, ":", cts_rna_exp_annotated$Species)
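As an alternative to removing unwanted kingdoms one pattern at a time, the filter can be expressed as a keep-list. The sketch below uses the kingdom labels named in this section with trailing spaces as they would appear after splitting on ": "; the vector itself is invented for illustration.

```r
# Kingdom labels as they might look after splitting (note trailing spaces)
kingdom <- c("Bacteria ", "Archaea ", "Fungi ", "Plants ",
             "Animals ", "Protists ", "ag ")

# Keep only exact matches to the three kingdoms of interest
keep <- trimws(kingdom) %in% c("Archaea", "Bacteria", "Fungi")
print(trimws(kingdom[keep]))
```

A keep-list differs from the exclusion approach in one respect: labels that are missing or not anticipated (for example `NA`) are dropped rather than retained, so the two approaches may give different row counts on real data.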
At this point, we move forward with only the expressed annotated genecall data for downstream statistical analyses.
TPM Normalization: Converts raw gene counts to transcripts per million (TPM). TPM normalization follows the method described by Wagner et al. (2012) to normalize transcript abundance data and requires the following two steps:
Calculate Tg values by multiplying the read count of each genecall by the average read length within each sample, then divide by the gene length of each genecall.
cts_rna_exp_annotated$S108379_Tg<-(cts_rna_exp_annotated$S108379 * 118) / cts_rna_exp_annotated$Length
cts_rna_exp_annotated$S108381_Tg<-(cts_rna_exp_annotated$S108381 * 129) / cts_rna_exp_annotated$Length
cts_rna_exp_annotated$S108382_Tg<-(cts_rna_exp_annotated$S108382 * 127) / cts_rna_exp_annotated$Length
cts_rna_exp_annotated$S108383_Tg<-(cts_rna_exp_annotated$S108383 * 127) / cts_rna_exp_annotated$Length
cts_rna_exp_annotated$S108385_Tg<-(cts_rna_exp_annotated$S108385 * 129) / cts_rna_exp_annotated$Length
cts_rna_exp_annotated$S108386_Tg<-(cts_rna_exp_annotated$S108386 * 130) / cts_rna_exp_annotated$Length
cts_rna_exp_annotated$S108388_Tg<-(cts_rna_exp_annotated$S108388 * 130) / cts_rna_exp_annotated$Length
cts_rna_exp_annotated$S108390_Tg<-(cts_rna_exp_annotated$S108390 * 129) / cts_rna_exp_annotated$Length
cts_rna_exp_annotated$S108391_Tg<-(cts_rna_exp_annotated$S108391 * 125) / cts_rna_exp_annotated$Length
cts_rna_exp_annotated$S108392_Tg<-(cts_rna_exp_annotated$S108392 * 128) / cts_rna_exp_annotated$Length
cts_rna_exp_annotated$S108394_Tg<-(cts_rna_exp_annotated$S108394 * 127) / cts_rna_exp_annotated$Length
cts_rna_exp_annotated$S108396_Tg<-(cts_rna_exp_annotated$S108396 * 129) / cts_rna_exp_annotated$Length
Now multiply each Tg value within a sample by 1e+06 and divide by the sum of Tg values for that sample to determine TPM.
cts_rna_exp_annotated$S108379_TPM<-(cts_rna_exp_annotated$S108379_Tg * 1000000) / sum(cts_rna_exp_annotated$S108379_Tg,na.rm=TRUE)
cts_rna_exp_annotated$S108381_TPM<-(cts_rna_exp_annotated$S108381_Tg * 1000000) / sum(cts_rna_exp_annotated$S108381_Tg,na.rm=TRUE)
cts_rna_exp_annotated$S108382_TPM<-(cts_rna_exp_annotated$S108382_Tg * 1000000) / sum(cts_rna_exp_annotated$S108382_Tg,na.rm=TRUE)
cts_rna_exp_annotated$S108383_TPM<-(cts_rna_exp_annotated$S108383_Tg * 1000000) / sum(cts_rna_exp_annotated$S108383_Tg,na.rm=TRUE)
cts_rna_exp_annotated$S108385_TPM<-(cts_rna_exp_annotated$S108385_Tg * 1000000) / sum(cts_rna_exp_annotated$S108385_Tg,na.rm=TRUE)
cts_rna_exp_annotated$S108386_TPM<-(cts_rna_exp_annotated$S108386_Tg * 1000000) / sum(cts_rna_exp_annotated$S108386_Tg,na.rm=TRUE)
cts_rna_exp_annotated$S108388_TPM<-(cts_rna_exp_annotated$S108388_Tg * 1000000) / sum(cts_rna_exp_annotated$S108388_Tg,na.rm=TRUE)
cts_rna_exp_annotated$S108390_TPM<-(cts_rna_exp_annotated$S108390_Tg * 1000000) / sum(cts_rna_exp_annotated$S108390_Tg,na.rm=TRUE)
cts_rna_exp_annotated$S108391_TPM<-(cts_rna_exp_annotated$S108391_Tg * 1000000) / sum(cts_rna_exp_annotated$S108391_Tg,na.rm=TRUE)
cts_rna_exp_annotated$S108392_TPM<-(cts_rna_exp_annotated$S108392_Tg * 1000000) / sum(cts_rna_exp_annotated$S108392_Tg,na.rm=TRUE)
cts_rna_exp_annotated$S108394_TPM<-(cts_rna_exp_annotated$S108394_Tg * 1000000) / sum(cts_rna_exp_annotated$S108394_Tg,na.rm=TRUE)
cts_rna_exp_annotated$S108396_TPM<-(cts_rna_exp_annotated$S108396_Tg * 1000000) / sum(cts_rna_exp_annotated$S108396_Tg,na.rm=TRUE)
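The per-sample Tg and TPM lines above all follow one pattern, so they could also be written as a loop over a named vector of average read lengths. The following is a minimal sketch, not the notebook's code: the toy data frame `toy_counts`, the helper `add_tpm()`, and the read lengths in `toy_read_len` are hypothetical, and the helper assumes count columns are named by sample ID with a `Length` column holding gene length.

```r
# Hypothetical toy data: 3 genecalls, 2 samples (not values from the study)
toy_counts <- data.frame(
  ID     = c("g1", "g2", "g3"),
  Length = c(300, 600, 1200),
  S1     = c(10, 0, 40),
  S2     = c(5, 20, 0)
)

# Assumed average read length (bp) per sample, named by sample column
toy_read_len <- c(S1 = 120, S2 = 130)

# Add <sample>_Tg and <sample>_TPM columns for every sample in read_len
add_tpm <- function(df, read_len, length_col = "Length") {
  for (s in names(read_len)) {
    # Tg: read count x average read length / gene length
    tg <- (df[[s]] * read_len[[s]]) / df[[length_col]]
    df[[paste0(s, "_Tg")]] <- tg
    # TPM: scale Tg so that each sample sums to one million
    df[[paste0(s, "_TPM")]] <- (tg * 1e6) / sum(tg, na.rm = TRUE)
  }
  df
}

toy_tpm <- add_tpm(toy_counts, toy_read_len)

# Each TPM column should sum to 1e6 (up to floating-point error)
colSums(toy_tpm[, c("S1_TPM", "S2_TPM")])
```

Because the average read length is constant within a sample, it cancels in the TPM ratio; it only affects the intermediate Tg values.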
Create the TPM dataframe by subsetting columns of interest from the cts_rna_exp_annotated dataframe.
tpm_all<-data.frame(cts_rna_exp_annotated$ID,cts_rna_exp_annotated$S108379_TPM,cts_rna_exp_annotated$S108381_TPM,cts_rna_exp_annotated$S108382_TPM,cts_rna_exp_annotated$S108383_TPM,cts_rna_exp_annotated$S108385_TPM,cts_rna_exp_annotated$S108386_TPM,cts_rna_exp_annotated$S108388_TPM,cts_rna_exp_annotated$S108390_TPM,cts_rna_exp_annotated$S108391_TPM,cts_rna_exp_annotated$S108392_TPM,cts_rna_exp_annotated$S108394_TPM,cts_rna_exp_annotated$S108396_TPM,cts_rna_exp_annotated$KO,cts_rna_exp_annotated$Symbol,cts_rna_exp_annotated$Function,cts_rna_exp_annotated$Tier_II,cts_rna_exp_annotated$Tier_III,cts_rna_exp_annotated$Tier_IV,cts_rna_exp_annotated$Taxonomy)
names(tpm_all)<-c("ID","S108379_TPM","S108381_TPM","S108382_TPM","S108383_TPM","S108385_TPM","S108386_TPM","S108388_TPM","S108390_TPM","S108391_TPM","S108392_TPM","S108394_TPM","S108396_TPM","KO","Symbol","Function","Tier_II","Tier_III","Tier_IV","Taxonomy")
At this point, the FULL RNA dataset (cts_rna_all) was separated into Expressed (cts_rna_expressed) and Non-Expressed (cts_rna_zero) subdata. Then the functional and taxonomic annotations were imported and the Expressed and Non-Expressed subdata were further separated into Expressed Annotated (cts_rna_exp_annotated) and Non-Expressed Annotated (cts_rna_zero_annotated) subdata. The Non-Expressed Annotated subdata will be summarized as a list of genes that were NOT expressed at any point during our experiment. Then we moved forward with only the Expressed Annotated subdata and normalized the raw read counts using the Transcripts per Million (TPM) normalization parameters.
Now, we further separate the normalized expressed annotated subdata by experimental treatment in order to analyze gene expression patterns and differential gene expression between treatments. Pairwise similarities among metatranscriptomes will be calculated using Bray-Curtis similarity values, with differences between treatments assessed via PERMANOVA. Differential gene expression will be calculated using the edgeR package.
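As a reference for the Bray-Curtis step named above, the index can be written out in base R. This is a minimal sketch on made-up numbers, not the notebook's implementation; `bray_curtis_sim()` and the toy matrix `toy_expr` are hypothetical names, and a dedicated ecology package would normally be used for the full analysis and for the PERMANOVA.

```r
# Hypothetical TPM-like matrix: rows = samples, columns = genes
toy_expr <- rbind(
  A = c(10, 0, 5),
  B = c(10, 0, 5),
  C = c(0, 8, 0)
)

# Bray-Curtis similarity between two non-negative vectors:
# 1 - sum(|x - y|) / sum(x + y)
bray_curtis_sim <- function(x, y) {
  1 - sum(abs(x - y)) / sum(x + y)
}

# Pairwise similarity matrix across all samples
n <- nrow(toy_expr)
sim <- matrix(NA_real_, n, n,
              dimnames = list(rownames(toy_expr), rownames(toy_expr)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    sim[i, j] <- bray_curtis_sim(toy_expr[i, ], toy_expr[j, ])
  }
}
sim
```

In this toy case, samples with identical profiles have a similarity of 1 and samples that share no expressed genes have a similarity of 0.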
Separate the Tussock samples into their own dataset. Keep only those unique genecalls that have expression in at least 1 Tuss sample.
# Subset the TUSS TPM Normalized Expressed Annotated subdata
tpm_tuss <- subset(tpm_all, select=c(ID,S108382_TPM,S108383_TPM,S108391_TPM,S108392_TPM,S108394_TPM,S108396_TPM,KO,Symbol,Function,Tier_II,Tier_III,Tier_IV, Taxonomy))
names(tpm_tuss)<-c("ID","Tuss1_T0","Tuss2_T0","Tuss1_T4","Tuss2_T4","Tuss1_T24","Tuss3_T24","KO","Symbol","Function","Tier_II","Tier_III","Tier_IV","Taxonomy")
# Remove any non-expressed genes from the Tussock samples
tpm_tuss_expressed <- subset(tpm_tuss, Tuss1_T0 > 0 | Tuss2_T0 > 0 | Tuss1_T4 > 0 | Tuss2_T4 > 0 | Tuss1_T24 > 0 | Tuss3_T24 > 0, select=c(ID,Tuss1_T0,Tuss2_T0,Tuss1_T4,Tuss2_T4,Tuss1_T24,Tuss3_T24,KO,Symbol,Function,Tier_II,Tier_III,Tier_IV,Taxonomy))
Separate the Wet Sedge samples into their own dataset. Keep only those unique genecalls that have expression in at least 1 WS sample.
# Subset the WS TPM Normalized Expressed Annotated subdata (that's a mouthful...)
tpm_ws <- subset(tpm_all, select=c(ID,S108379_TPM,S108381_TPM,S108385_TPM,S108386_TPM,S108388_TPM,S108390_TPM,KO,Symbol,Function,Tier_II,Tier_III,Tier_IV,Taxonomy))
names(tpm_ws)<-c("ID","WS1_T0","WS3_T0","WS1_T4","WS2_T4","WS1_T24","WS3_T24","KO","Symbol","Function","Tier_II","Tier_III","Tier_IV","Taxonomy")
# Remove any non-expressed genes from the Wet Sedge samples
tpm_ws_expressed <- subset(tpm_ws, WS1_T0 > 0 | WS3_T0 > 0 | WS1_T4 > 0 | WS2_T4 > 0 | WS1_T24 > 0 | WS3_T24 > 0, select=c(ID,WS1_T0,WS3_T0,WS1_T4,WS2_T4,WS1_T24,WS3_T24,KO,Symbol,Function,Tier_II,Tier_III,Tier_IV,Taxonomy))
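The long `subset()` conditions above keep a genecall if any sample column is greater than zero. An equivalent row filter can be sketched with `rowSums()`; the toy table and column names below are hypothetical, and the sketch assumes the sample columns contain no `NA` values.

```r
# Hypothetical TPM table with two sample columns and one annotation column
toy_tpm <- data.frame(
  ID   = c("g1", "g2", "g3"),
  S_T0 = c(0, 12.5, 0),
  S_T4 = c(0, 0, 3.1),
  KO   = c("K00001", "K00002", "K00003"),
  stringsAsFactors = FALSE
)

sample_cols <- c("S_T0", "S_T4")

# TRUE where at least one sample has expression > 0
is_expressed <- rowSums(toy_tpm[, sample_cols] > 0) > 0

toy_expressed <- toy_tpm[is_expressed, ]
toy_expressed$ID
```

Listing the sample columns once in `sample_cols` avoids repeating each name in both the condition and the `select` argument.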
We also need the full TUSS and WS Expressed Annotated data together to make comparisons between ecosystems within each sampling timepoint. Copy the tpm_all table and rename the sample columns (dropping the “_TPM” suffix) for downstream use; the KEGG annotations are already held in the Tier_II, Tier_III, and Tier_IV columns.
# Make a new object for the tpm_all table so that you don't overwrite the original
tpm_all_exp_ann<-tpm_all
names(tpm_all_exp_ann)<-c("ID","S108379","S108381","S108382","S108383","S108385","S108386","S108388","S108390","S108391","S108392","S108394","S108396","KO","Symbol","Function","Tier_II","Tier_III","Tier_IV","Taxonomy")
Here, we determine the taxonomic composition of the microbial community by analyzing the relative abundance of gene expression for KEGG tier IV “Ribosomes” (03010 Ribosome [PATH:ko03010]). Within the “Ribosome” category, there are annotations for “small subunit ribosomal protein” (ssu) and “large subunit ribosomal protein” (lsu). For Bacteria and Archaea, we use only the ssu annotations (similar to 16S ssu rRNA targeted analysis). For Fungi, we use both the ssu (18S ssu rRNA) and lsu (28S lsu rRNA) annotations.
The first analysis pulls out all gene expression for “Ribosomes” and removes lsu annotations such that only ssu annotations are used for determining relative expression of Bacteria and Archaea at each sampling time point in both tussock and wet sedge tundra.
# Subset all unique genecalls whose annotation matches "Ribosomes"
taxa_ribo_all<- tpm_all_exp_ann[ which(tpm_all_exp_ann$Tier_IV=='03010 Ribosome [PATH:ko03010]'),]
# Make copy of Ribosome dataset to manipulate
taxa_ribo_ssu<-taxa_ribo_all
# Remove "large subunit ribosomal protein" from dataset
taxa_ribo_ssu<-taxa_ribo_ssu[!grepl("large subunit ribosomal protein", taxa_ribo_ssu$Function),]
# Write dataset to file
write.csv(taxa_ribo_ssu, 'Ribo.Results/ALPHA.taxa.rna.ssu.csv')
# Remove "Fungi" from dataset
taxa_ribo_ssu<-taxa_ribo_ssu[!grepl("Fungi", taxa_ribo_ssu$Taxonomy),]
# Separate "Taxonomy" column into all subdivisions
taxa_ribo_ssu<-taxa_ribo_ssu %>% separate(Taxonomy, c("Kingdom","Phylum","Class","Order","Family","Genus","Species"), ": ", extra="merge")
# Keep "Kingdom" and "Phylum"
taxa_ribo_ssu <- data.frame(taxa_ribo_ssu$Kingdom,taxa_ribo_ssu$Phylum, taxa_ribo_ssu$S108379, taxa_ribo_ssu$S108381,taxa_ribo_ssu$S108385,taxa_ribo_ssu$S108386,taxa_ribo_ssu$S108388,taxa_ribo_ssu$S108390,taxa_ribo_ssu$S108382,taxa_ribo_ssu$S108383,taxa_ribo_ssu$S108391,taxa_ribo_ssu$S108392,taxa_ribo_ssu$S108394,taxa_ribo_ssu$S108396)
names(taxa_ribo_ssu)<-c("Kingdom","Phylum","ws1-T0","ws3-T0","ws1-T4","ws2-T4","ws1-T24","ws3-T24","tuss1-T0","tuss2-T0","tuss1-T4","tuss2-T4","tuss1-T24","tuss3-T24")
# Remove "Unclassified" taxa otherwise it will clump unclassified Archaea with unclassified Bacteria (not what we want)
taxa_ribo_ssu<-taxa_ribo_ssu[!grepl("Unclassified", taxa_ribo_ssu$Phylum),]
# Remove "Other" taxa otherwise it will clump "other" Archaea with "other" Bacteria (not what we want)
taxa_ribo_ssu<-taxa_ribo_ssu[!grepl("Other", taxa_ribo_ssu$Phylum),]
# Sort data by Kingdom alphabetically
taxa_ribo_ssu<-taxa_ribo_ssu[order(taxa_ribo_ssu$Kingdom,taxa_ribo_ssu$Phylum),]
# Sum each column by unique Phylum
taxa_ribo_ssu_sum<-taxa_ribo_ssu %>% group_by(Phylum) %>% summarise_at(vars("ws1-T0","ws3-T0","ws1-T4","ws2-T4","ws1-T24","ws3-T24","tuss1-T0","tuss2-T0","tuss1-T4","tuss2-T4","tuss1-T24","tuss3-T24"), sum)
taxa_ribo_ssu_sum<-taxa_ribo_ssu_sum %>% mutate_at(vars('ws1-T0','ws3-T0','ws1-T4','ws2-T4','ws1-T24','ws3-T24','tuss1-T0','tuss2-T0','tuss1-T4','tuss2-T4','tuss1-T24','tuss3-T24'), ~ round(., 0))
taxa_ribo_ssu_sum<-as.data.frame(taxa_ribo_ssu_sum)
taxa_ribo_ssu_sum
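To make the split-and-sum logic above easier to follow, here is a small base-R sketch on a made-up table. It assumes taxonomy strings use ": " between ranks, as in the `separate()` call; the rows, the `toy_ribo` name, and the single sample column `S1` are hypothetical.

```r
# Hypothetical ribosomal-protein TPM values with taxonomy strings
toy_ribo <- data.frame(
  Taxonomy = c("Bacteria: Acidobacteria: Acidobacteriia",
               "Bacteria: Acidobacteria: Blastocatellia",
               "Bacteria: Firmicutes: Clostridia",
               "Archaea: Euryarchaeota: Methanomicrobia"),
  S1 = c(10.2, 5.4, 3.3, 1.1),
  stringsAsFactors = FALSE
)

# Split each taxonomy string on ": " and keep the first two ranks
ranks <- strsplit(toy_ribo$Taxonomy, ": ", fixed = TRUE)
toy_ribo$Kingdom <- vapply(ranks, function(r) r[1], character(1))
toy_ribo$Phylum  <- vapply(ranks, function(r) r[2], character(1))

# Sum TPM by phylum, then round to whole numbers as in the notebook
phylum_sum <- aggregate(S1 ~ Phylum, data = toy_ribo, FUN = sum)
phylum_sum$S1 <- round(phylum_sum$S1, 0)
phylum_sum
```

Grouping on `Phylum` alone merges rows that share a phylum label across kingdoms, which is why the notebook drops the "Unclassified" and "Other" labels before summing.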
Repeat the same “Ribosome” analysis as above (at Phylum level), but this time keep all the taxonomy information.
# Make copy of taxa_ribo_all to manipulate
taxa_ribo_ssu2<-taxa_ribo_all
# Keep "Taxonomy" column
taxa_ribo_ssu2 <- data.frame(taxa_ribo_ssu2$Function,taxa_ribo_ssu2$Taxonomy,taxa_ribo_ssu2$S108379, taxa_ribo_ssu2$S108381,taxa_ribo_ssu2$S108385,taxa_ribo_ssu2$S108386,taxa_ribo_ssu2$S108388,taxa_ribo_ssu2$S108390,taxa_ribo_ssu2$S108382,taxa_ribo_ssu2$S108383,taxa_ribo_ssu2$S108391,taxa_ribo_ssu2$S108392,taxa_ribo_ssu2$S108394,taxa_ribo_ssu2$S108396)
names(taxa_ribo_ssu2)<-c("Function","Taxonomy","ws1-T0","ws3-T0","ws1-T4","ws2-T4","ws1-T24","ws3-T24","tuss1-T0","tuss2-T0","tuss1-T4","tuss2-T4","tuss1-T24","tuss3-T24")
# Remove "large subunit ribosomal protein" from dataset
taxa_ribo_ssu2<-taxa_ribo_ssu2[!grepl("large subunit ribosomal protein", taxa_ribo_ssu2$Function),]
# Remove "Fungi" from dataset
taxa_ribo_ssu2<-taxa_ribo_ssu2[!grepl("Fungi", taxa_ribo_ssu2$Taxonomy),]
# Sort data by Function, then by Taxonomy
taxa_ribo_ssu2<-taxa_ribo_ssu2[order(taxa_ribo_ssu2$Function,taxa_ribo_ssu2$Taxonomy),]
# Sum each column by unique Taxonomy
taxa_ribo_ssu2_sum<-taxa_ribo_ssu2 %>% group_by(Taxonomy) %>% summarise_at(vars("ws1-T0","ws3-T0","ws1-T4","ws2-T4","ws1-T24","ws3-T24","tuss1-T0","tuss2-T0","tuss1-T4","tuss2-T4","tuss1-T24","tuss3-T24"), sum)
taxa_ribo_ssu2_sum<-taxa_ribo_ssu2_sum %>% mutate_at(vars('ws1-T0','ws3-T0','ws1-T4','ws2-T4','ws1-T24','ws3-T24','tuss1-T0','tuss2-T0','tuss1-T4','tuss2-T4','tuss1-T24','tuss3-T24'), ~ round(., 0))
taxa_ribo_ssu2_sum<-as.data.frame(taxa_ribo_ssu2_sum)
taxa_ribo_ssu2_sum
The second analysis pulls out all gene expression for “Ribosomes” and removes Bacteria and Archaea annotations while retaining just Fungi with both ssu and lsu annotations to determine relative expression of Fungi at each sampling time point in both tussock and wet sedge tundra.
# Make copy of Ribosome dataset to manipulate
taxa_ribo_fungi<-taxa_ribo_all
# Remove "Bacteria" from dataset
taxa_ribo_fungi<-taxa_ribo_fungi[!grepl("Bacteria", taxa_ribo_fungi$Taxonomy),]
# Remove "Archaea" from dataset
taxa_ribo_fungi<-taxa_ribo_fungi[!grepl("Archaea", taxa_ribo_fungi$Taxonomy),]
# Separate "Taxonomy" column into all subdivisions
taxa_ribo_fungi<-taxa_ribo_fungi %>% separate(Taxonomy, c("Kingdom","Phylum","Class","Order","Family","Genus","Species"), ": ", extra="merge")
# Keep "Kingdom" and "Phylum"
taxa_ribo_fungi <- data.frame(taxa_ribo_fungi$Kingdom,taxa_ribo_fungi$Phylum, taxa_ribo_fungi$S108379, taxa_ribo_fungi$S108381,taxa_ribo_fungi$S108385,taxa_ribo_fungi$S108386,taxa_ribo_fungi$S108388,taxa_ribo_fungi$S108390,taxa_ribo_fungi$S108382,taxa_ribo_fungi$S108383,taxa_ribo_fungi$S108391,taxa_ribo_fungi$S108392,taxa_ribo_fungi$S108394,taxa_ribo_fungi$S108396)
names(taxa_ribo_fungi)<-c("Kingdom","Phylum","ws1-T0","ws3-T0","ws1-T4","ws2-T4","ws1-T24","ws3-T24","tuss1-T0","tuss2-T0","tuss1-T4","tuss2-T4","tuss1-T24","tuss3-T24")
# Remove "Unclassified" taxa
taxa_ribo_fungi<-taxa_ribo_fungi[!grepl("Unclassified", taxa_ribo_fungi$Phylum),]
# Remove "Other" taxa
taxa_ribo_fungi<-taxa_ribo_fungi[!grepl("Other", taxa_ribo_fungi$Phylum),]
# Sort data by Kingdom alphabetically
taxa_ribo_fungi<-taxa_ribo_fungi[order(taxa_ribo_fungi$Kingdom,taxa_ribo_fungi$Phylum),]
# Sum each column by unique Phylum
taxa_ribo_fungi_sum<-taxa_ribo_fungi %>% group_by(Phylum) %>% summarise_at(vars("ws1-T0","ws3-T0","ws1-T4","ws2-T4","ws1-T24","ws3-T24","tuss1-T0","tuss2-T0","tuss1-T4","tuss2-T4","tuss1-T24","tuss3-T24"), sum)
taxa_ribo_fungi_sum<-taxa_ribo_fungi_sum %>% mutate_at(vars('ws1-T0','ws3-T0','ws1-T4','ws2-T4','ws1-T24','ws3-T24','tuss1-T0','tuss2-T0','tuss1-T4','tuss2-T4','tuss1-T24','tuss3-T24'), ~ round(., 0))
taxa_ribo_fungi_sum<-as.data.frame(taxa_ribo_fungi_sum)
taxa_ribo_fungi_sum
Repeat the same “Ribosome” analysis as above (at Phylum level), but this time keep all the taxonomy information.
# Make copy of taxa_ribo_all to manipulate
taxa_ribo_fungi2<-taxa_ribo_all
# Keep "Taxonomy" column
taxa_ribo_fungi2 <- data.frame(taxa_ribo_fungi2$Function,taxa_ribo_fungi2$Taxonomy,taxa_ribo_fungi2$S108379, taxa_ribo_fungi2$S108381,taxa_ribo_fungi2$S108385,taxa_ribo_fungi2$S108386,taxa_ribo_fungi2$S108388,taxa_ribo_fungi2$S108390,taxa_ribo_fungi2$S108382,taxa_ribo_fungi2$S108383,taxa_ribo_fungi2$S108391,taxa_ribo_fungi2$S108392,taxa_ribo_fungi2$S108394,taxa_ribo_fungi2$S108396)
names(taxa_ribo_fungi2)<-c("Function","Taxonomy","ws1-T0","ws3-T0","ws1-T4","ws2-T4","ws1-T24","ws3-T24","tuss1-T0","tuss2-T0","tuss1-T4","tuss2-T4","tuss1-T24","tuss3-T24")
# Remove "Bacteria" from dataset
taxa_ribo_fungi2<-taxa_ribo_fungi2[!grepl("Bacteria", taxa_ribo_fungi2$Taxonomy),]
# Remove "Archaea" from dataset
taxa_ribo_fungi2<-taxa_ribo_fungi2[!grepl("Archaea", taxa_ribo_fungi2$Taxonomy),]
# Sort data by Function, then by Taxonomy
taxa_ribo_fungi2<-taxa_ribo_fungi2[order(taxa_ribo_fungi2$Function,taxa_ribo_fungi2$Taxonomy),]
# Sum each column by unique Taxonomy
taxa_ribo_fungi2_sum<-taxa_ribo_fungi2 %>% group_by(Taxonomy) %>% summarise_at(vars("ws1-T0","ws3-T0","ws1-T4","ws2-T4","ws1-T24","ws3-T24","tuss1-T0","tuss2-T0","tuss1-T4","tuss2-T4","tuss1-T24","tuss3-T24"), sum)
taxa_ribo_fungi2_sum<-taxa_ribo_fungi2_sum %>% mutate_at(vars('ws1-T0','ws3-T0','ws1-T4','ws2-T4','ws1-T24','ws3-T24','tuss1-T0','tuss2-T0','tuss1-T4','tuss2-T4','tuss1-T24','tuss3-T24'), ~ round(., 0))
taxa_ribo_fungi2_sum<-as.data.frame(taxa_ribo_fungi2_sum)
taxa_ribo_fungi2_sum
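The plotting tables used in the next steps (for example `taxa.MT.all.chart`, with `Species`, `Sample`, and `Value` columns) are not built in this section. One way such a table could be derived from summed TPM values is sketched below; this is an assumption about the workflow, not the authors' code, and every name and number in it is hypothetical.

```r
# Hypothetical summed TPM by taxon (rows) and sample (columns)
toy_sum <- data.frame(
  Taxon = c("Acidobacteria", "Actinobacteria", "Fungi"),
  S1    = c(50, 30, 20),
  S2    = c(10, 80, 10),
  stringsAsFactors = FALSE
)

sample_cols <- c("S1", "S2")

# Convert each sample column to percent of that sample's total
pct <- sweep(as.matrix(toy_sum[, sample_cols]), 2,
             colSums(toy_sum[, sample_cols]), "/") * 100

# Reshape to long format: one row per taxon x sample
# (as.vector() reads the matrix column by column, matching the rep() calls)
toy_chart <- data.frame(
  Species = rep(toy_sum$Taxon, times = length(sample_cols)),
  Sample  = rep(sample_cols, each = nrow(toy_sum)),
  Value   = as.vector(pct),
  stringsAsFactors = FALSE
)
toy_chart
```

With values expressed as percentages, each sample's bars stack to 100 in a stacked bar plot.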
Plot the relative abundance of taxa by all replicates within tussock and wet sedge tundra as a stackplot.
# Place taxa in order for plotting
taxa.MT.all.chart$Species<-factor(taxa.MT.all.chart$Species,levels = c("Acidobacteria","Actinobacteria","Alphaproteobacteria","Betaproteobacteria","Deltaproteobacteria","Gammaproteobacteria","Proteobacteria Unclassified","Bacteroidetes","Chloroflexi","Firmicutes","Planctomycetes","Verrucomicrobia","Bacteria Unclassified","Bacteria Other","Archaea","Fungi"))
taxa.MT.all.chart$Species<-fct_rev(taxa.MT.all.chart$Species)
taxa.MT.all.chart$Sample<-factor(taxa.MT.all.chart$Sample,levels = c("Tuss1-MT-T0","Tuss2-MT-T0","Tuss1-MT-T4","Tuss2-MT-T4","Tuss1-MT-T24","Tuss3-MT-T24","WS1-MT-T0","WS3-MT-T0","WS1-MT-T4","WS2-MT-T4","WS1-MT-T24","WS3-MT-T24"))
colourCount = length(unique(taxa.MT.all.chart$Species))
getPalette = colorRampPalette(brewer.pal(12, "Paired"))
taxa.MT.all.plot<-ggplot(taxa.MT.all.chart, aes(fill=Species, y=Value, x=Sample)) + geom_bar(position = "stack", stat="identity", color="black") + ylab(expression(atop("Relative Taxon", paste("Expression (%)")))) + theme_minimal() + theme(axis.text=element_text(size=10),axis.title=element_text(size=12),axis.title.x=element_blank()) + theme(legend.position = "right", legend.title=element_blank(), legend.text=element_text(size=8), legend.key.size = unit(0.75,"line"), panel.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), axis.line = element_blank(), panel.border = element_blank()) + scale_size(guide="none") + scale_fill_manual(values = rev(getPalette(colourCount))) + scale_x_discrete(labels = function(Sample) str_wrap(Sample, width = 15)) + theme(axis.text.x = element_text(angle = 270, hjust=0))
taxa.MT.all.plot
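The `factor()` and `fct_rev()` calls above set the order in which taxa are stacked and listed in the legend. A base-R sketch of the same idea on a toy vector follows; `toy_species` is hypothetical, and `rev(levels(...))` is used here as a stand-in for `fct_rev()`.

```r
# Hypothetical taxon labels in arbitrary order
toy_species <- c("Fungi", "Acidobacteria", "Archaea", "Acidobacteria")

# Fix an explicit level order instead of relying on the default sort order
toy_factor <- factor(toy_species,
                     levels = c("Acidobacteria", "Archaea", "Fungi"))

# Reverse the level order; the values themselves are unchanged
toy_factor_rev <- factor(toy_factor, levels = rev(levels(toy_factor)))

levels(toy_factor)
levels(toy_factor_rev)
```

Only the level order changes, so the data are untouched while the plotting order is reversed.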
Plot the mean relative abundance of taxa by sampling time point within tussock and wet sedge tundra as a stackplot.
# Place taxa in order for plotting
taxa.MT.time.mean.chart$Species<-factor(taxa.MT.time.mean.chart$Species,levels = c("Acidobacteria","Actinobacteria","Alphaproteobacteria","Betaproteobacteria","Deltaproteobacteria","Gammaproteobacteria","Proteobacteria Unclassified","Bacteroidetes","Chloroflexi","Firmicutes","Planctomycetes","Verrucomicrobia","Bacteria Unclassified","Bacteria Other","Archaea","Fungi"))
taxa.MT.time.mean.chart$Species<-fct_rev(taxa.MT.time.mean.chart$Species)
taxa.MT.time.mean.chart$Sample<-factor(taxa.MT.time.mean.chart$Sample,levels = c("Tuss-MT-T0","Tuss-MT-T4","Tuss-MT-T24","WS-MT-T0","WS-MT-T4","WS-MT-T24"))
colourCount = length(unique(taxa.MT.time.mean.chart$Species))
getPalette = colorRampPalette(brewer.pal(12, "Paired"))
taxa.MT.time.mean.plot<-ggplot(taxa.MT.time.mean.chart, aes(fill=Species, y=Value, x=Sample)) + geom_bar(position = "stack", stat="identity", color="black") + ylab(expression(atop("Relative Taxon", paste("Expression (%)")))) + theme_minimal() + theme(axis.text=element_text(size=10),axis.title=element_text(size=12),axis.title.x=element_blank()) + theme(legend.position = "right", legend.title=element_blank(), legend.text=element_text(size=8), legend.key.size = unit(0.75,"line"), panel.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), axis.line = element_blank(), panel.border = element_blank()) + scale_size(guide="none") + scale_fill_manual(values = rev(getPalette(colourCount))) + scale_x_discrete(labels = function(Sample) str_wrap(Sample, width = 15)) + theme(axis.text.x = element_text(angle = 270, hjust=0))
taxa.MT.time.mean.plot
Plot the mean relative abundance of taxa within tussock and wet sedge tundra as a stackplot.
# Place taxa in order for plotting
taxa.MT.tundra.mean.chart$Species<-factor(taxa.MT.tundra.mean.chart$Species,levels = c("Acidobacteria","Actinobacteria","Alphaproteobacteria","Betaproteobacteria","Deltaproteobacteria","Gammaproteobacteria","Proteobacteria Unclassified","Bacteroidetes","Chloroflexi","Firmicutes","Planctomycetes","Verrucomicrobia","Bacteria Unclassified","Bacteria Other","Archaea","Fungi"))
taxa.MT.tundra.mean.chart$Species<-fct_rev(taxa.MT.tundra.mean.chart$Species)
taxa.MT.tundra.mean.chart$Sample<-factor(taxa.MT.tundra.mean.chart$Sample,levels = c("Tuss-MT","WS-MT"))
colourCount = length(unique(taxa.MT.tundra.mean.chart$Species))
getPalette = colorRampPalette(brewer.pal(12, "Paired"))
taxa.MT.tundra.mean.plot<-ggplot(taxa.MT.tundra.mean.chart, aes(fill=Species, y=Value, x=Sample)) + geom_bar(position = "stack", stat="identity", color="black") + ylab(expression(atop("Relative Taxon", paste("Expression (%)")))) + theme_minimal() + theme(axis.text=element_text(size=10),axis.title=element_text(size=12),axis.title.x=element_blank()) + theme(legend.position = "right", legend.title=element_blank(), legend.text=element_text(size=8), legend.key.size = unit(0.75,"line"), panel.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), axis.line = element_blank(), panel.border = element_blank()) + scale_size(guide="none") + scale_fill_manual(values = rev(getPalette(colourCount))) + scale_x_discrete(labels = function(Sample) str_wrap(Sample, width = 15)) + theme(axis.text.x = element_text(angle = 270, hjust=0))
taxa.MT.tundra.mean.plot
Plot the relative expression of ribosomes as a stackplot with the MEAN of samples by time point
# Place taxa in order for plotting
taxa.mean.ribo.chart$Species<-factor(taxa.mean.ribo.chart$Species,levels = c("Acidobacteria","Actinobacteria","Alphaproteobacteria","Betaproteobacteria","Deltaproteobacteria","Gammaproteobacteria","Proteobacteria Unclassified","Bacteroidetes","Chloroflexi","Firmicutes","Planctomycetes","Verrucomicrobia","Bacteria Other","Archaea","Fungi"))
taxa.mean.ribo.chart$Species<-fct_rev(taxa.mean.ribo.chart$Species)
taxa.mean.ribo.chart$Sample<-factor(taxa.mean.ribo.chart$Sample,levels = c("Tuss-16S","Tuss-MG","Tuss-MT-T0","Tuss-MT-T4","Tuss-MT-T24","WS-16S","WS-MG","WS-MT-T0","WS-MT-T4","WS-MT-T24"))
colourCount = length(unique(taxa.mean.ribo.chart$Species))
getPalette = colorRampPalette(brewer.pal(12, "Paired"))
taxa.mean.ribo.plot<-ggplot(taxa.mean.ribo.chart, aes(fill=Species, y=Value, x=Sample)) + geom_bar(position = "stack", stat="identity", color="black") + ylab(" ") + theme_minimal() + theme(axis.text=element_text(size=10),axis.title=element_text(size=12),axis.title.x=element_blank()) + theme(legend.position = "right", legend.title=element_blank(), legend.text=element_text(size=8), legend.key.size = unit(0.75,"line"), panel.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), axis.line = element_blank(), panel.border = element_blank()) + scale_size(guide="none") + guides(shape = guide_legend(override.aes = list(size = 1))) + scale_fill_manual(values = rev(getPalette(colourCount)), guide=guide_legend(reverse=FALSE)) + scale_x_discrete(labels = function(Sample) str_wrap(Sample, width = 15)) + theme(axis.text.x = element_text(angle = 270, hjust=0))
taxa.mean.ribo.plot
Plot the relative expression of TUSS ribosomes as a stackplot by MEAN of TIMEPOINT
# Place taxa in order for plotting
taxa.tuss.mean.ribo.chart$Species<-factor(taxa.tuss.mean.ribo.chart$Species,levels = c("Acidobacteria","Actinobacteria","Alphaproteobacteria","Betaproteobacteria","Deltaproteobacteria","Gammaproteobacteria","Proteobacteria Unclassified","Bacteroidetes","Chloroflexi","Firmicutes","Planctomycetes","Verrucomicrobia","Bacteria Other","Archaea","Fungi"))
taxa.tuss.mean.ribo.chart$Species<-fct_rev(taxa.tuss.mean.ribo.chart$Species)
taxa.tuss.mean.ribo.chart$Sample<-factor(taxa.tuss.mean.ribo.chart$Sample,levels = c("Tuss-16S","Tuss-MG","Tuss-MT-T0","Tuss-MT-T4","Tuss-MT-T24"))
colourCount = length(unique(taxa.tuss.mean.ribo.chart$Species))
getPalette = colorRampPalette(brewer.pal(12, "Paired"))
taxa.tuss.mean.ribo.plot<-ggplot(taxa.tuss.mean.ribo.chart, aes(fill=Species, y=Value, x=Sample)) + geom_bar(position = "stack", stat="identity", color="black") + ylab(expression(atop("Community Composition", paste("Relative Abundance (%)")))) + scale_fill_manual(values = rev(getPalette(colourCount)), guide=guide_legend(reverse=FALSE)) + theme_classic() + theme(axis.title.x=element_blank(), panel.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), panel.border = element_blank(), axis.ticks = element_blank(), axis.text.x = element_text(angle = 270, hjust=0), axis.text=element_text(size=10), axis.title=element_text(size=12), legend.position = "none", legend.title=element_blank(), legend.text=element_text(size=10), legend.key.size = unit(0.75,"line")) + scale_y_continuous(expand = c(0, 0), limits = c(0, 101))
taxa.tuss.mean.ribo.plot
Plot the relative expression of WS ribosomes as a stackplot by MEAN of TIMEPOINT
# Place taxa in order for plotting
taxa.ws.mean.ribo.chart$Species<-factor(taxa.ws.mean.ribo.chart$Species,levels = c("Acidobacteria","Actinobacteria","Alphaproteobacteria","Betaproteobacteria","Deltaproteobacteria","Gammaproteobacteria","Proteobacteria Unclassified","Bacteroidetes","Chloroflexi","Firmicutes","Planctomycetes","Verrucomicrobia","Bacteria Other","Archaea","Fungi"))
taxa.ws.mean.ribo.chart$Species<-fct_rev(taxa.ws.mean.ribo.chart$Species)
taxa.ws.mean.ribo.chart$Sample<-factor(taxa.ws.mean.ribo.chart$Sample,levels = c("WS-16S","WS-MG","WS-MT-T0","WS-MT-T4","WS-MT-T24"))
colourCount = length(unique(taxa.ws.mean.ribo.chart$Species))
getPalette = colorRampPalette(brewer.pal(12, "Paired"))
taxa.ws.mean.ribo.plot<-ggplot(taxa.ws.mean.ribo.chart, aes(fill=Species, y=Value, x=Sample)) + geom_bar(position = "stack", stat="identity", color="black") + ylab(expression(atop("Community Composition", paste("Relative Abundance (%)")))) + scale_fill_manual(values = rev(getPalette(colourCount)), guide=guide_legend(reverse=FALSE)) + theme_classic() + theme(axis.title.x=element_blank(), panel.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), panel.border = element_blank(), axis.ticks = element_blank(), axis.text.x = element_text(angle = 270, hjust=0), axis.text=element_text(size=10), axis.title=element_text(size=12), legend.position = "right", legend.title=element_blank(), legend.text=element_text(size=10), legend.key.size = unit(0.75,"line")) + scale_y_continuous(expand = c(0, 0), limits = c(0, 101))
taxa.ws.mean.ribo.plot
To determine if there are significant differences in mean relative abundance of dominant taxa between tundra ecosystems, we calculate the mean (SD) of each dominant phylum (>1% relative abundance) within tussock tundra and within wet sedge tundra across the experiment and compare the two.
| Taxonomy | Tuss Relative Abundance (%), mean (SD) | WS Relative Abundance (%), mean (SD) | Mean Difference (Tuss minus WS) | Paired t-test (p-value) |
|---|---|---|---|---|
| Acidobacteria | 41 (11.8) | 8 (1.4) | +33% | 0.001 |
| Actinobacteria | 42 (12.3) | 40 (3.3) | N.S. | 0.681 |
| Alphaproteobacteria | 4 (1.8) | 3 (1.9) | +1% | 0.004 |
| Betaproteobacteria | 1 (0.6) | 4 (1.8) | -3% | 0.008 |
| Deltaproteobacteria | 3 (0.8) | 13 (1.3) | -10% | < 0.001 |
| Gammaproteobacteria | 3 (0.9) | 1 (0.5) | +2% | 0.003 |
| Bacteroidetes | 1 (0.3) | 2 (0.8) | -1% | 0.024 |
| Chloroflexi | 0 (0.1) | 8 (2.1) | -8% | < 0.001 |
| Firmicutes | 2 (0.2) | 10 (2.0) | -9% | < 0.001 |
| Euryarchaeota | 0 (0.1) | 7 (5.7) | -6% | 0.039 |
| Ascomycetes | 1 (1.0) | 0 (0) | +1% | 0.017 |
Test for differences in mean relative abundance of dominant phyla between tundra ecosystems.
# Pairwise comparisons between tundra ecosystems for each dominant phylum
taxa.phylum.t.test.stats <- taxa.phylum.t.test %>%
  group_by(Phylum) %>%
  pairwise_t_test(
    Abundance ~ Tundra, paired = TRUE,
    p.adjust.method = "bonferroni"
  ) %>%
  select(-.y., -n1, -n2, -df, -statistic, -p) # Remove details
taxa.phylum.t.test.stats
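The per-phylum paired comparison above can also be approximated with base R alone. This is a sketch under stated assumptions: the long-format toy table mirrors the `Phylum`, `Tundra`, and `Abundance` columns used above, rows are assumed to be ordered so that the i-th tussock value pairs with the i-th wet sedge value, and all numbers are invented.

```r
# Hypothetical long-format abundances: 2 phyla x 2 tundra types x 4 pairs
toy_taxa <- data.frame(
  Phylum    = rep(c("Acidobacteria", "Chloroflexi"), each = 8),
  Tundra    = rep(rep(c("Tuss", "WS"), each = 4), times = 2),
  Abundance = c(40, 45, 38, 52,      8, 9, 7, 10,     # Acidobacteria
                0.1, 0.2, 0.1, 0.3,  6, 9, 8, 11),    # Chloroflexi
  stringsAsFactors = FALSE
)

# Paired t-test (Tuss vs WS) within each phylum
p_raw <- vapply(split(toy_taxa, toy_taxa$Phylum), function(d) {
  t.test(d$Abundance[d$Tundra == "Tuss"],
         d$Abundance[d$Tundra == "WS"],
         paired = TRUE)$p.value
}, numeric(1))

# Optional Bonferroni adjustment across phyla (a design choice for this
# sketch, not necessarily what the notebook does)
p_adj <- p.adjust(p_raw, method = "bonferroni")
data.frame(Phylum = names(p_raw), p = p_raw, p.adj = p_adj)
```

With only two tundra types there is a single comparison per phylum, so a Bonferroni adjustment applied within each phylum would leave that p-value unchanged; whether to also adjust across phyla is a separate decision.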
To determine if there are significant differences in mean relative abundance of dominant taxa among sampling time points within each tundra ecosystem, we calculate the mean (SD) of each dominant phylum (>1% relative abundance) at each time point within tussock tundra or within wet sedge tundra and compare the time points.
Tussock Tundra Taxonomy (Phylum) ANOVA
# Subset response variables for MANOVA
tuss_mt_taxa_stats$response <- as.matrix(tuss_mt_taxa_stats[, 2:12])
# MANOVA test
tuss_mt_taxa_stats_manova <- manova(response ~ Timepoint, data=tuss_mt_taxa_stats)
summary.aov(tuss_mt_taxa_stats_manova)
Response Acidobacteria :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 569.68 284.839 6.7411 0.07765 .
Residuals 3 126.76 42.254
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Actinobacteria :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 515.87 257.937 3.1701 0.182
Residuals 3 244.10 81.366
Response Alphaproteobacteria :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3.3093 1.6546 0.3964 0.7034
Residuals 3 12.5209 4.1736
Response Betaproteobacteria :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.49775 0.24887 0.5572 0.6226
Residuals 3 1.33995 0.44665
Response Deltaproteobacteria :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.0768 0.53840 0.7706 0.5369
Residuals 3 2.0960 0.69867
Response Gammaproteobacteria :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.9972 0.4986 0.4557 0.6717
Residuals 3 3.2823 1.0941
Response Bacteroidetes :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.28561 0.142805 2.7536 0.2094
Residuals 3 0.15558 0.051861
Response Chloroflexi :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.0110193 0.0055097 2.7275 0.2114
Residuals 3 0.0060601 0.0020200
Response Firmicutes :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.072658 0.036329 0.5627 0.6201
Residuals 3 0.193699 0.064566
Response Archaea :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.0187581 0.0093790 4.6782 0.1196
Residuals 3 0.0060146 0.0020049
Response Fungi :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3.7024 1.85119 4.4145 0.1277
Residuals 3 1.2580 0.41935
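As a quick check on how to read these tables, the F value is the ratio of the Timepoint mean square to the residual mean square, and the p-value is the upper tail of the F distribution with the listed degrees of freedom. The sketch below recomputes the Acidobacteria row of the tussock output from its printed sums of squares; only numbers shown in that output are used, and the variable names are illustrative.

```r
# Values copied from the tussock Acidobacteria table above
ss_time <- 569.68; df_time <- 2
ss_res  <- 126.76; df_res  <- 3

ms_time <- ss_time / df_time   # mean square for Timepoint
ms_res  <- ss_res / df_res     # residual mean square

f_value <- ms_time / ms_res
p_value <- pf(f_value, df_time, df_res, lower.tail = FALSE)

round(c(F = f_value, p = p_value), 4)
```

The result should agree with the printed F value (6.7411) and p-value (0.07765) up to rounding of the printed inputs.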
## Run ANOVA for each category of interest (significant in MANOVA)
# Acidobacteria
tuss_mt_taxa_stats1<-aov(Acidobacteria~Timepoint,data=tuss_mt_taxa_stats)
summary.aov(tuss_mt_taxa_stats1)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 569.7 284.84 6.741 0.0777 .
Residuals 3 126.8 42.25
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_mt_taxa_stats1)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Acidobacteria ~ Timepoint, data = tuss_mt_taxa_stats)
$Timepoint
diff lwr upr p adj
Tuss-T24-Tuss-T0 -18.71302 -45.87623 8.450194 0.1238258
Tuss-T4-Tuss-T0 -22.18715 -49.35037 4.976064 0.0831148
Tuss-T4-Tuss-T24 -3.47413 -30.63734 23.689084 0.8610312
# Actinobacteria
tuss_mt_taxa_stats2<-aov(Actinobacteria~Timepoint,data=tuss_mt_taxa_stats)
summary.aov(tuss_mt_taxa_stats2)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 515.9 257.94 3.17 0.182
Residuals 3 244.1 81.37
TukeyHSD(tuss_mt_taxa_stats2)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Actinobacteria ~ Timepoint, data = tuss_mt_taxa_stats)
$Timepoint
diff lwr upr p adj
Tuss-T24-Tuss-T0 17.371869 -20.32181 55.06555 0.2765246
Tuss-T4-Tuss-T0 21.357505 -16.33618 59.05119 0.1879342
Tuss-T4-Tuss-T24 3.985636 -33.70805 41.67932 0.9013272
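`summary.aov()` on a `manova` fit reports one univariate ANOVA per response column, which is why the Acidobacteria table from `aov()` matches the corresponding MANOVA block. A small sketch with invented data illustrates this; `toy`, `fit_multi`, and `fit_single` are hypothetical names.

```r
# Invented data: 3 time points x 2 replicates, two response variables
toy <- data.frame(
  Timepoint = factor(rep(c("T0", "T4", "T24"), each = 2)),
  TaxonA    = c(60, 55, 35, 40, 38, 42),
  TaxonB    = c(20, 25, 45, 40, 41, 39)
)

# Multivariate fit, then per-response univariate tables
fit_multi <- manova(cbind(TaxonA, TaxonB) ~ Timepoint, data = toy)
multi_tab <- summary.aov(fit_multi)

# Single-response ANOVA for the same variable
fit_single <- aov(TaxonA ~ Timepoint, data = toy)
single_tab <- summary(fit_single)[[1]]

# F values for TaxonA from both routes
c(manova_route = multi_tab[[1]][["F value"]][1],
  aov_route    = single_tab[["F value"]][1])
```

Both routes fit the same linear model for each response, so the univariate F values and p-values coincide.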
Wet Sedge Tundra Taxonomy (Phylum) ANOVA
# Subset response variables for MANOVA
ws_mt_taxa_stats$response <- as.matrix(ws_mt_taxa_stats[, 2:12])
# MANOVA test
ws_mt_taxa_stats_manova <- manova(response ~ Timepoint, data=ws_mt_taxa_stats)
summary.aov(ws_mt_taxa_stats_manova)
Response Acidobacteria :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 7.6994 3.8497 5.9254 0.09079 .
Residuals 3 1.9491 0.6497
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Actinobacteria :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 18.633 9.3167 0.7441 0.5465
Residuals 3 37.562 12.5206
Response Alphaproteobacteria :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 6.768 3.3840 0.8942 0.4959
Residuals 3 11.353 3.7845
Response Betaproteobacteria :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 9.6474 4.8237 1.9898 0.2818
Residuals 3 7.2726 2.4242
Response Deltaproteobacteria :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.5414 0.77068 0.3253 0.745
Residuals 3 7.1082 2.36941
Response Gammaproteobacteria :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.07757 0.038786 0.1239 0.8878
Residuals 3 0.93904 0.313013
Response Bacteroidetes :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.8059 0.90295 1.8932 0.2939
Residuals 3 1.4308 0.47694
Response Chloroflexi :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 7.0221 3.5111 0.752 0.5436
Residuals 3 14.0064 4.6688
Response Firmicutes :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 9.3700 4.6850 1.4066 0.3707
Residuals 3 9.9923 3.3308
Response Archaea :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 38.139 19.069 0.453 0.6731
Residuals 3 126.299 42.100
Response Fungi :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.0133818 0.0066909 3.5538 0.1617
Residuals 3 0.0056483 0.0018828
## Run ANOVA for each category of interest (significant in MANOVA)
# Acidobacteria
ws_mt_taxa_stats1<-aov(Acidobacteria~Timepoint,data=ws_mt_taxa_stats)
summary.aov(ws_mt_taxa_stats1)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 7.699 3.85 5.925 0.0908 .
Residuals 3 1.949 0.65
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(ws_mt_taxa_stats1)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Acidobacteria ~ Timepoint, data = ws_mt_taxa_stats)
$Timepoint
diff lwr upr p adj
WS-T24-WS-T0 2.761829 -0.6064279 6.130085 0.0823421
WS-T4-WS-T0 1.148958 -2.2192980 4.517215 0.4342459
WS-T4-WS-T24 -1.612870 -4.9811265 1.755386 0.2586085
We further investigate which classes of taxa are driving these significant differences in relative taxon abundance observed at the phylum level (see above). We retain only those classes of each dominant phylum where at least 1 sample (out of 12) had > 1.0% relative abundance (to make biologically-relevant statistical comparisons of the data).
Here, we test for differences in mean relative abundance of biologically-relevant microbial classes within each dominant phylum (identified above) between tundra ecosystems.
# Pairwise comparisons between tundra ecosystems for each microbial taxonomic class
taxa.class.t.test.stats <- taxa.class.t.test %>%
group_by(Class) %>%
pairwise_t_test(
Abundance ~ Tundra, paired = TRUE,
p.adjust.method = "bonferroni"
) %>%
select(-.y., -group1, -group2, -n1, -n2, -df, -statistic, -p)
taxa.class.t.test.stats
Here, we separate these taxa by tundra ecosystem and look for differences between sampling time point over the course of the study.
Tussock Tundra Taxonomy (All Levels) ANOVA
# Subset response variables for MANOVA
tuss_mt_all_taxa_stats$response <- as.matrix(tuss_mt_all_taxa_stats[, 2:44])
# MANOVA test
tuss_mt_all_taxa_stats_manova <- manova(response ~ Timepoint, data=tuss_mt_all_taxa_stats)
summary.aov(tuss_mt_all_taxa_stats_manova)
Response Archaea.Euryarchaeota.Methanosarcina. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.00041968 2.0984e-04 3.615 0.1588
Residuals 3 0.00017414 5.8047e-05
Response Archaea.Euryarchaeota.Methanothrix. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.0053763 0.00268814 9.1616 0.05277 .
Residuals 3 0.0008802 0.00029342
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Bacteria.Acidobacteria.Acidobacteriaceae.bacterium :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 94.457 47.229 14.2 0.02953 *
Residuals 3 9.978 3.326
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Bacteria.Acidobacteria.Acidobacterium. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 220.017 110.008 20.884 0.01735 *
Residuals 3 15.803 5.268
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Bacteria.Acidobacteria.Candidatus.Koribacter :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.35203 0.17602 0.2211 0.8136
Residuals 3 2.38840 0.79613
Response Bacteria.Acidobacteria.Candidatus.Solibacter :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 129.70 64.848 1.0011 0.4645
Residuals 3 194.34 64.778
Response Bacteria.Acidobacteria.Granulicella. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 12.1780 6.0890 22.003 0.01612 *
Residuals 3 0.8302 0.2767
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Bacteria.Acidobacteria.Luteitalea. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.49718 0.74859 3.1823 0.1813
Residuals 3 0.70571 0.23524
Response Bacteria.Acidobacteria.Terriglobus. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 7.2228 3.6114 29.342 0.01073 *
Residuals 3 0.3692 0.1231
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Bacteria.Actinobacteria.Blastococcus. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.9275e-04 9.6374e-05 5.0224 0.1103
Residuals 3 5.7566e-05 1.9189e-05
Response Bacteria.Actinobacteria.Conexibacter. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2.8091 1.40455 1.8927 0.294
Residuals 3 2.2263 0.74211
Response Bacteria.Actinobacteria.Geodermatophilus. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.0199709 0.0099855 22.676 0.01545 *
Residuals 3 0.0013211 0.0004404
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Bacteria.Actinobacteria.Kineococcus. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.0082458 0.0041229 2.853 0.2023
Residuals 3 0.0043354 0.0014451
Response Bacteria.Actinobacteria.Kribbella. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 26.2973 13.1487 10.732 0.04294 *
Residuals 3 3.6754 1.2251
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Bacteria.Actinobacteria.Microlunatus. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 4.3734 2.18670 5.9499 0.09035 .
Residuals 3 1.1026 0.36752
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Bacteria.Actinobacteria.Nakamurella. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.00037268 0.00018634 0.2557 0.7897
Residuals 3 0.00218594 0.00072865
Response Bacteria.Actinobacteria.Nonomuraea. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2.9087 1.4544 1.2228 0.4089
Residuals 3 3.5682 1.1894
Response Bacteria.Actinobacteria.Plantactinospora. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 4.0666 2.03330 3.0722 0.1879
Residuals 3 1.9855 0.66184
Response Bacteria.Actinobacteria.Streptacidiphilus. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.0173011 0.0086505 34.392 0.008544 **
Residuals 3 0.0007546 0.0002515
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Bacteria.Actinobacteria.Streptosporangium. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 13.0525 6.5262 2.1061 0.2683
Residuals 3 9.2963 3.0988
Response Bacteria.Actinobacteria.Thermobifida. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2.3270 1.1635 0.4948 0.6521
Residuals 3 7.0547 2.3516
Response Bacteria.Actinobacteria.Thermomonospora. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 14.363 7.1816 0.7836 0.5324
Residuals 3 27.496 9.1652
Response Bacteria.Alphaproteobacteria.Rhodoplanes. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.0051514 0.0025757 0.9459 0.4803
Residuals 3 0.0081691 0.0027230
Response Bacteria.Bacteroidetes.Alkalitalea. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 6.3370e-07 3.1684e-07 0.143 0.8723
Residuals 3 6.6457e-06 2.2153e-06
Response Bacteria.Bacteroidetes.Niastella. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.081536 0.040768 2.8037 0.2058
Residuals 3 0.043622 0.014541
Response Bacteria.Betaproteobacteria.Rhizobacter. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.24174 0.120871 1.475 0.358
Residuals 3 0.24584 0.081945
Response Bacteria.Chloroflexi.Anaerolinea. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.00293452 0.00146726 25.417 0.01315 *
Residuals 3 0.00017318 0.00005773
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Bacteria.Chloroflexi.Herpetosiphon. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.00194723 0.00097361 3.4429 0.1672
Residuals 3 0.00084837 0.00028279
Response Bacteria.Chloroflexi.Pelolinea. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.00014633 7.3166e-05 0.4076 0.6973
Residuals 3 0.00053852 1.7951e-04
Response Bacteria.Chloroflexi.Roseiflexus. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.00052666 0.00026333 0.866 0.5048
Residuals 3 0.00091221 0.00030407
Response Bacteria.Chloroflexi.Sphaerobacter. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.00011227 5.6135e-05 0.1874 0.8381
Residuals 3 0.00089878 2.9959e-04
Response Bacteria.Deltaproteobacteria.Desulfobacca. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.0034481 0.001724 0.27 0.7802
Residuals 3 0.0191579 0.006386
Response Bacteria.Deltaproteobacteria.Desulfobacula. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.29888 0.14944 0.868 0.5041
Residuals 3 0.51649 0.17216
Response Bacteria.Deltaproteobacteria.Geobacter. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.0076615 0.0038308 4.4611 0.1262
Residuals 3 0.0025761 0.0008587
Response Bacteria.Deltaproteobacteria.Labilithrix. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.078762 0.039381 1.4636 0.3601
Residuals 3 0.080719 0.026906
Response Bacteria.Deltaproteobacteria.Stigmatella. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.010184 0.0050920 0.7782 0.5343
Residuals 3 0.019631 0.0065436
Response Bacteria.Deltaproteobacteria.Syntrophobacter. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.0011929 0.00059646 0.243 0.7984
Residuals 3 0.0073643 0.00245476
Response Bacteria.Deltaproteobacteria.Syntrophus. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.00036673 0.00018336 0.5656 0.6188
Residuals 3 0.00097259 0.00032420
Response Bacteria.Deltaproteobacteria.Vulgatibacter. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.012272 0.0061362 1.0175 0.4599
Residuals 3 0.018091 0.0060304
Response Bacteria.Firmicutes.Clostridia.Anoxybacter :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.00028314 1.4157e-04 3.8365 0.149
Residuals 3 0.00011070 3.6901e-05
Response Bacteria.Firmicutes.Clostridia.Carboxydocella :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3.3545e-05 1.6773e-05 1.4618 0.3604
Residuals 3 3.4421e-05 1.1474e-05
Response Bacteria.Firmicutes.Clostridia.Pelotomaculum :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 5.5852e-05 2.7926e-05 0.7025 0.562
Residuals 3 1.1925e-04 3.9749e-05
Response Bacteria.Gammaproteobacteria.Others.Steroidobacter :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.57626 0.28813 0.4499 0.6747
Residuals 3 1.92116 0.64039
## Run ANOVA for each category of interest (significant in MANOVA)
# Acidobacteria
#tuss_mt_taxa_stats1<-aov(Acidobacteria~Timepoint,data=tuss_mt_taxa_stats)
#summary.aov(tuss_mt_taxa_stats1)
#TukeyHSD(tuss_mt_taxa_stats1)
# Actinobacteria
#tuss_mt_taxa_stats2<-aov(Actinobacteria~Timepoint,data=tuss_mt_taxa_stats)
#summary.aov(tuss_mt_taxa_stats2)
#TukeyHSD(tuss_mt_taxa_stats2)
Wet Sedge Tundra Taxonomy (All Levels) ANOVA
# Subset response variables for MANOVA
ws_mt_all_taxa_stats$response <- as.matrix(ws_mt_all_taxa_stats[, 2:44])
# MANOVA test
ws_mt_all_taxa_stats_manova <- manova(response ~ Timepoint, data=ws_mt_all_taxa_stats)
summary.aov(ws_mt_all_taxa_stats_manova)
Response Archaea.Euryarchaeota.Methanosarcina. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.8501 0.92505 1.0121 0.4614
Residuals 3 2.7419 0.91397
Response Archaea.Euryarchaeota.Methanothrix. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 14.176 7.0882 0.3716 0.7175
Residuals 3 57.220 19.0733
Response Bacteria.Acidobacteria.Acidobacteriaceae.bacterium :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.0043835 0.0021918 0.4829 0.6579
Residuals 3 0.0136150 0.0045383
Response Bacteria.Acidobacteria.Acidobacterium. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.048155 0.0240775 7.7823 0.06496 .
Residuals 3 0.009282 0.0030939
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Bacteria.Acidobacteria.Candidatus.Koribacter :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.011145 0.005573 0.1597 0.8592
Residuals 3 0.104671 0.034890
Response Bacteria.Acidobacteria.Candidatus.Solibacter :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2.69114 1.34557 13.316 0.03221 *
Residuals 3 0.30315 0.10105
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Bacteria.Acidobacteria.Granulicella. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.0015815 0.00079076 1.6458 0.3293
Residuals 3 0.0014414 0.00048046
Response Bacteria.Acidobacteria.Luteitalea. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2.9628 1.48141 2.8681 0.2012
Residuals 3 1.5495 0.51651
Response Bacteria.Acidobacteria.Terriglobus. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.0046043 0.0023022 1.3363 0.3846
Residuals 3 0.0051683 0.0017228
Response Bacteria.Actinobacteria.Blastococcus. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.1122 0.55613 1.1742 0.4201
Residuals 3 1.4208 0.47362
Response Bacteria.Actinobacteria.Conexibacter. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.4254 0.71268 0.5785 0.6131
Residuals 3 3.6960 1.23198
Response Bacteria.Actinobacteria.Geodermatophilus. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.47042 0.23521 0.5817 0.6116
Residuals 3 1.21299 0.40433
Response Bacteria.Actinobacteria.Kineococcus. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.74335 0.87167 20.501 0.0178 *
Residuals 3 0.12755 0.04252
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Bacteria.Actinobacteria.Kribbella. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.41550 0.20775 1.3161 0.3887
Residuals 3 0.47354 0.15785
Response Bacteria.Actinobacteria.Microlunatus. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.11411 0.057054 1.103 0.4374
Residuals 3 0.15518 0.051727
Response Bacteria.Actinobacteria.Nakamurella. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.6900 0.34502 0.2527 0.7917
Residuals 3 4.0954 1.36514
Response Bacteria.Actinobacteria.Nonomuraea. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.022272 0.011136 0.7401 0.5479
Residuals 3 0.045139 0.015046
Response Bacteria.Actinobacteria.Plantactinospora. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2.177e-03 0.00108850 35.329 0.00822 **
Residuals 3 9.243e-05 0.00003081
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Bacteria.Actinobacteria.Streptacidiphilus. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.004291 0.0021454 0.1442 0.8713
Residuals 3 0.044621 0.0148737
Response Bacteria.Actinobacteria.Streptosporangium. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.0137733 0.0068867 27.703 0.01164 *
Residuals 3 0.0007458 0.0002486
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Bacteria.Actinobacteria.Thermobifida. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.0087797 0.0043898 1.9496 0.2867
Residuals 3 0.0067552 0.0022517
Response Bacteria.Actinobacteria.Thermomonospora. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.00903 0.004515 0.1292 0.8834
Residuals 3 0.10485 0.034951
Response Bacteria.Alphaproteobacteria.Rhodoplanes. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.20097 0.100483 1.4023 0.3716
Residuals 3 0.21497 0.071656
Response Bacteria.Bacteroidetes.Alkalitalea. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.078465 0.039233 1.0123 0.4614
Residuals 3 0.116269 0.038756
Response Bacteria.Bacteroidetes.Niastella. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 7.7860e-05 3.8930e-05 1.3679 0.3783
Residuals 3 8.5376e-05 2.8459e-05
Response Bacteria.Betaproteobacteria.Rhizobacter. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.097226 0.048613 1.1257 0.4318
Residuals 3 0.129554 0.043185
Response Bacteria.Chloroflexi.Anaerolinea. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.1939 0.59694 0.5768 0.6138
Residuals 3 3.1047 1.03489
Response Bacteria.Chloroflexi.Herpetosiphon. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.073961 0.036980 0.5227 0.6386
Residuals 3 0.212246 0.070749
Response Bacteria.Chloroflexi.Pelolinea. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.011915 0.0059574 0.2224 0.8127
Residuals 3 0.080355 0.0267852
Response Bacteria.Chloroflexi.Roseiflexus. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.25570 0.12785 0.7113 0.5587
Residuals 3 0.53921 0.17974
Response Bacteria.Chloroflexi.Sphaerobacter. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.14094 0.070470 2.0791 0.2713
Residuals 3 0.10168 0.033895
Response Bacteria.Deltaproteobacteria.Desulfobacca. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.26282 0.131412 1.7055 0.3201
Residuals 3 0.23115 0.077051
Response Bacteria.Deltaproteobacteria.Desulfobacula. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.0042796 0.0021398 0.7241 0.5539
Residuals 3 0.0088652 0.0029551
Response Bacteria.Deltaproteobacteria.Geobacter. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2.0399 1.01997 1.0622 0.4479
Residuals 3 2.8807 0.96022
Response Bacteria.Deltaproteobacteria.Labilithrix. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.23870 0.119349 4.4329 0.1271
Residuals 3 0.08077 0.026923
Response Bacteria.Deltaproteobacteria.Stigmatella. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.269432 0.134716 9.4202 0.05091 .
Residuals 3 0.042902 0.014301
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Bacteria.Deltaproteobacteria.Syntrophobacter. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.049322 0.024661 0.3472 0.7318
Residuals 3 0.213107 0.071036
Response Bacteria.Deltaproteobacteria.Syntrophus. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.57317 0.28658 1.6773 0.3244
Residuals 3 0.51258 0.17086
Response Bacteria.Deltaproteobacteria.Vulgatibacter. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.100681 0.050340 3.2288 0.1787
Residuals 3 0.046773 0.015591
Response Bacteria.Firmicutes.Clostridia.Anoxybacter :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.055540 0.027770 1.5626 0.3428
Residuals 3 0.053316 0.017772
Response Bacteria.Firmicutes.Clostridia.Carboxydocella :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.023846 0.011923 0.6326 0.5899
Residuals 3 0.056540 0.018847
Response Bacteria.Firmicutes.Clostridia.Pelotomaculum :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.045009 0.022505 1.8354 0.3016
Residuals 3 0.036784 0.012261
Response Bacteria.Gammaproteobacteria.Others.Steroidobacter :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.010112 0.0050558 1.2535 0.4021
Residuals 3 0.012100 0.0040334
## Run ANOVA for each category of interest (significant in MANOVA)
# Acidobacteria
#ws_mt_taxa_stats1<-aov(Acidobacteria~Timepoint,data=ws_mt_taxa_stats)
#summary.aov(ws_mt_taxa_stats1)
#TukeyHSD(ws_mt_taxa_stats1)
Alpha Diversity
# Richness: Sum up the number of non-zero entries per row
alpha.rna.taxa.rich <- apply(alpha.rna.taxa>0,1,sum)
write.csv(alpha.rna.taxa.rich, 'Ribo.Results/alpha.rna.taxa.rich.csv')
# Abundance: sum the values in each row (margin 1)
alpha.rna.taxa.abund <- apply(alpha.rna.taxa,1,sum)
write.csv(alpha.rna.taxa.abund, 'Ribo.Results/alpha.rna.taxa.abund.csv')
# Diversity: Shannon-Wiener Index (H')
alpha.rna.taxa.div <- diversity(alpha.rna.taxa, index="shannon")
write.csv(alpha.rna.taxa.div, 'Ribo.Results/alpha.rna.taxa.div.csv')
Run paired t-test to determine significance.
t.test(alpha.rna.taxa.t.test$Tuss,alpha.rna.taxa.t.test$WS,paired=TRUE)
Paired t-test
data: alpha.rna.taxa.t.test$Tuss and alpha.rna.taxa.t.test$WS
t = -29.477, df = 5, p-value = 8.425e-07
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.8021059 -0.6734298
sample estimates:
mean of the differences
-0.7377679
Beta Diversity
Calculate the Bray-Curtis dissimilarity matrix to use with adonis
# Make beta.rna.taxa_groups object
beta.rna.taxa_sample<-c("ws1-T0","ws3-T0","tuss1-T0","tuss2-T0","ws1-T4","ws2-T4","ws1-T24","ws3-T24","tuss1-T4","tuss2-T4","tuss1-T24","tuss3-T24")
beta.rna.taxa_veg<-c("WS","WS","TUSS","TUSS","WS","WS","WS","WS","TUSS","TUSS","TUSS","TUSS")
beta.rna.taxa_time<-c("T0","T0","T0","T0","T4","T4","T24","T24","T4","T4","T24","T24")
beta.rna.taxa_groups<-data.frame(beta.rna.taxa_sample,beta.rna.taxa_veg,beta.rna.taxa_time)
# Convert the values in Column 1 ("beta.rna.taxa_sample") into row names
rownames(beta.rna.taxa_groups)<-beta.rna.taxa_groups[,1]
beta.rna.taxa_groups<-beta.rna.taxa_groups[,-1]
# Bray-Curtis Dissimilarity Matrix for beta.rna.taxa
rna.tpm.taxa.bc.dist<-vegdist(beta.rna.taxa, method = "bray")
Run PERMANOVA using the adonis function of the vegan package
PERMANOVA - MT-TPM-Taxa - Bray-Curtis
adonis(rna.tpm.taxa.bc.dist~beta.rna.taxa_veg*beta.rna.taxa_time, data=beta.rna.taxa_groups)
Call:
adonis(formula = rna.tpm.taxa.bc.dist ~ beta.rna.taxa_veg * beta.rna.taxa_time, data = beta.rna.taxa_groups)
Permutation: free
Number of permutations: 999
Terms added sequentially (first to last)
Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
beta.rna.taxa_veg 1 2.1184 2.11837 21.8967 0.63221 0.001 ***
beta.rna.taxa_time 2 0.3436 0.17179 1.7758 0.10254 0.154
beta.rna.taxa_veg:beta.rna.taxa_time 2 0.3083 0.15415 1.5934 0.09201 0.211
Residuals 6 0.5805 0.09674 0.17324
Total 11 3.3507 1.00000
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
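To make the F.Model and Pr(>F) columns less of a black box, the sketch below computes a one-factor PERMANOVA pseudo-F statistic and its permutation p-value directly from a distance matrix in base R. It is a simplified illustration under stated assumptions, not a reimplementation of adonis: it handles a single grouping factor only (no interaction or sequential terms), and pseudo_f, permanova_1way, and the toy data are hypothetical names introduced here.

```r
# One-factor PERMANOVA pseudo-F from a distance matrix (simplified sketch)
pseudo_f <- function(d, groups) {
  d2 <- as.matrix(d)^2
  n <- nrow(d2)
  groups <- factor(groups)
  a <- nlevels(groups)
  # Total sum of squares: squared distances over all pairs, divided by N
  ss_total <- sum(d2[upper.tri(d2)]) / n
  # Within-group sum of squares: same quantity computed inside each group
  ss_within <- sum(sapply(levels(groups), function(g) {
    idx <- which(groups == g)
    sub_d2 <- d2[idx, idx, drop = FALSE]
    sum(sub_d2[upper.tri(sub_d2)]) / length(idx)
  }))
  ss_among <- ss_total - ss_within
  (ss_among / (a - 1)) / (ss_within / (n - a))
}

permanova_1way <- function(d, groups, nperm = 999) {
  f_obs <- pseudo_f(d, groups)
  # Shuffle the group labels and recompute the statistic each time
  f_perm <- replicate(nperm, pseudo_f(d, sample(groups)))
  # Permutation p-value; the observed statistic counts as one permutation
  p <- (sum(f_perm >= f_obs - 1e-12) + 1) / (nperm + 1)
  list(F = f_obs, p = p)
}

# Toy example: two well-separated groups of six samples (not the study data)
set.seed(1)
toy_veg <- rep(c("WS", "TUSS"), each = 6)
toy_x <- rbind(matrix(rnorm(18, mean = 0), nrow = 6),
               matrix(rnorm(18, mean = 5), nrow = 6))
toy_res <- permanova_1way(dist(toy_x), toy_veg)
toy_res
```

With 999 permutations the smallest attainable p-value is 1/1000, which is why strongly separated groups are reported as 0.001 in the table above.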
Alpha diversity of each expressed, annotated metatranscriptome was assessed by the Shannon index. For beta diversity, pairwise dissimilarities among metatranscriptomes were calculated using the Bray-Curtis index and visualized with principal coordinates analysis. The difference between treatments was assessed with PERMANOVA.
We calculated alpha diversity based on raw gene counts. Measurements include Richness, Abundance, and Shannon-Wiener Diversity Index (H’).
# Subset the cts_rna_exp_annotated object to prepare for alpha diversity analysis
cts_rna_all_alpha<-subset(cts_rna_exp_annotated, select=c(S108379,S108381,S108382,S108383,S108385,S108386,S108388,S108390,S108391,S108392,S108394,S108396))
names(cts_rna_all_alpha)<-c("WS1-T0","WS3-T0","Tuss1-T0","Tuss2-T0","WS1-T4","WS2-T4","WS1-T24","WS3-T24","Tuss1-T4","Tuss2-T4","Tuss1-T24","Tuss3-T24")
# Transpose so that samples are rows and genecalls are columns
cts_rna_all_alpha<-as.data.frame(t(cts_rna_all_alpha))
Richness: Determine the number of unique genecalls that have at least one count recorded within each sample.
# Sum up the number of non-zero entries per row
apply(cts_rna_all_alpha>0,1,sum)
WS1-T0 WS3-T0 Tuss1-T0 Tuss2-T0 WS1-T4 WS2-T4 WS1-T24 WS3-T24 Tuss1-T4 Tuss2-T4
265016 238163 147445 130228 250164 236363 189178 152912 144608 123479
Tuss1-T24 Tuss3-T24
122748 108902
Abundance: Determine the total count summed over all genecalls within each sample.
# Sum the counts in each row (margin 1)
apply(cts_rna_all_alpha,1,sum)
WS1-T0 WS3-T0 Tuss1-T0 Tuss2-T0 WS1-T4 WS2-T4 WS1-T24 WS3-T24 Tuss1-T4 Tuss2-T4
3676101 2970625 2182064 1558718 3277106 3780598 1816840 1097027 2447254 2395566
Tuss1-T24 Tuss3-T24
1828558 1205803
Shannon-Wiener Diversity Index (H’): An information index that quantifies the uncertainty associated with predicting the identity of a new genecall given the total number of genecalls and the evenness in count abundances within each genecall.
diversity(cts_rna_all_alpha, index="shannon")
WS1-T0 WS3-T0 Tuss1-T0 Tuss2-T0 WS1-T4 WS2-T4 WS1-T24 WS3-T24 Tuss1-T4 Tuss2-T4
10.912618 10.786026 9.418039 9.541670 10.828631 10.545861 10.745632 10.759326 9.678131 9.297989
Tuss1-T24 Tuss3-T24
9.577669 9.371202
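The three alpha diversity measures can be reproduced by hand on a small matrix, which makes it easier to see what each one responds to. This is a minimal base R sketch on made-up counts; toy_counts and shannon are hypothetical names, and the natural logarithm is used because that is the default base of vegan's diversity() for the Shannon index.

```r
# Toy count matrix: rows are samples, columns are hypothetical genecalls
toy_counts <- rbind(
  even   = c(10, 10, 10, 10),
  skewed = c(37,  1,  1,  1),
  sparse = c(20, 20,  0,  0)
)

# Richness: number of columns with a non-zero count in each row
toy_rich <- apply(toy_counts > 0, 1, sum)

# Abundance: total counts in each row
toy_abund <- apply(toy_counts, 1, sum)

# Shannon-Wiener index: H' = -sum(p * ln(p)), skipping zero counts
shannon <- function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
}
toy_h <- apply(toy_counts, 1, shannon)

toy_rich
toy_abund
toy_h
```

All three toy samples have the same abundance, and the even and skewed samples have the same richness, yet H' is highest for the even sample: the index rewards evenness as well as the number of genecalls.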
The following table summarizes alpha diversity by sample, calculated on the raw counts of the expressed annotated genes used in downstream analyses.
| ID | Tundra | Time | Replicate | Gene Richness | Gene Abundance | Gene Diversity (H') |
|---|---|---|---|---|---|---|
| S108382 | Tussock | T0 | 1 | 147,445 | 2,182,064 | 9.418 |
| S108383 | Tussock | T0 | 2 | 130,228 | 1,558,718 | 9.542 |
| S108391 | Tussock | T4 | 1 | 144,608 | 2,447,254 | 9.678 |
| S108392 | Tussock | T4 | 2 | 123,479 | 2,395,566 | 9.298 |
| S108394 | Tussock | T24 | 1 | 122,748 | 1,828,558 | 9.578 |
| S108396 | Tussock | T24 | 3 | 108,902 | 1,205,803 | 9.371 |
| S108379 | Wet Sedge | T0 | 1 | 265,016 | 3,676,101 | 10.913 |
| S108381 | Wet Sedge | T0 | 3 | 238,163 | 2,970,625 | 10.786 |
| S108385 | Wet Sedge | T4 | 1 | 250,164 | 3,277,106 | 10.829 |
| S108386 | Wet Sedge | T4 | 2 | 236,363 | 3,780,598 | 10.546 |
| S108388 | Wet Sedge | T24 | 1 | 189,178 | 1,816,840 | 10.746 |
| S108390 | Wet Sedge | T24 | 3 | 152,912 | 1,097,027 | 10.759 |
Run paired t-test to determine significance.
t.test(alpha.t.test$Tuss,alpha.t.test$WS,paired=TRUE)
Paired t-test
data: alpha.t.test$Tuss and alpha.t.test$WS
t = -23.496, df = 5, p-value = 2.6e-06
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-1.422629 -1.142037
sample estimates:
mean of the differences
-1.282333
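The alpha.t.test object is not constructed in this section. The sketch below shows one way it might be laid out, assuming it holds the H' values rounded to three decimals and that Tussock and Wet Sedge samples are paired by time point and replicate order as listed in the summary table. The object name alpha_pairs is hypothetical; under these assumptions the result should come out close to the values printed above.

```r
# Assumed layout: one row per time point/replicate slot, one column per tundra
alpha_pairs <- data.frame(
  Time = c("T0", "T0", "T4", "T4", "T24", "T24"),
  Tuss = c(9.418, 9.542, 9.678, 9.298, 9.578, 9.371),
  WS   = c(10.913, 10.786, 10.829, 10.546, 10.746, 10.759)
)

# Paired t-test on the row-wise differences (Tuss - WS)
alpha_fit <- t.test(alpha_pairs$Tuss, alpha_pairs$WS, paired = TRUE)
alpha_fit$estimate
alpha_fit$statistic
alpha_fit$conf.int
```

Because the test operates on row-wise differences, the row order of the two columns determines the pairing; reordering one column changes the t statistic but not the mean difference.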
Plot the Alpha Diversity summary.
# Re-arrange samples for plotting the boxplot
alpha$Tundra <- factor(alpha$Tundra, levels=c("Tuss-MG-T0","Tuss-MT-T4","Tuss-MT-T24","WS-MT-T0","WS-MT-T4","WS-MT-T24")[c(1,2,3,4,5,6)] |> sub(pattern="Tuss-MG-T0", replacement="Tuss-MT-T0", fixed=TRUE))
alpha.boxplot<-ggplot(alpha, aes(x=Tundra, y=H, fill=Tundra)) +
geom_boxplot() + ylab(expression(atop("Shannon-Wiener", paste("Diversity Index (H')")))) + theme_classic() + theme(axis.title.x=element_blank(), axis.text=element_text(size=10), axis.text.x = element_text(angle = 270, hjust=0, vjust=0.5), axis.title=element_text(size=12), legend.position = "none", legend.title=element_blank(), legend.text=element_text(size=10)) + ylim(9,12) + annotate(geom="text", x=3.5, y=11.9, label="Paired t-test") + annotate("text", x=3.5, y=11.7, size=3, label = "Tundra~(italic(p) < 0.001)", parse = TRUE) + scale_fill_manual(values=c("palegreen", "green3", "darkgreen", "lightskyblue1", "dodgerblue1", "midnightblue"), breaks=c("Tuss-MT-T0", "Tuss-MT-T4", "Tuss-MT-T24", "WS-MT-T0", "WS-MT-T4", "WS-MT-T24"), labels=c("Tuss-MT-T0", "Tuss-MT-T4", "Tuss-MT-T24", "WS-MT-T0", "WS-MT-T4", "WS-MT-T24"))
alpha.boxplot
We calculated beta diversity as the variation in gene expression between our two tundra ecosystems using Bray-Curtis dissimilarity values.
First, subset the columns of interest from the tpm_all_exp_ann object (ID and all samples) to analyze the beta diversity of both ecosystems together. Create a tpm_all_groups object with columns for sample ID, vegetation type, and time point to be used for testing differences among samples via PERMANOVA (adonis from the vegan package).
# Subset the tpm_all_exp_ann file to prepare for beta diversity analysis
tpm_all_beta<-subset(tpm_all_exp_ann, select=c(ID,S108379,S108381,S108382,S108383,S108385,S108386,S108388,S108390,S108391,S108392,S108394,S108396))
names(tpm_all_beta)<-c("ID","WS1-T0","WS3-T0","Tuss1-T0","Tuss2-T0","WS1-T4","WS2-T4","WS1-T24","WS3-T24","Tuss1-T4","Tuss2-T4","Tuss1-T24","Tuss3-T24")
# Remember "ID" column as non-numeric values
tpm_all_beta_ID<-tpm_all_beta$ID
# Transpose all but the first column ("ID")
tpm_all_beta<-as.data.frame(t(tpm_all_beta[,-1]))
colnames(tpm_all_beta)<-tpm_all_beta_ID
# Make tpm_all_groups object
tpm_all_sample<-c("WS1-T0","WS3-T0","Tuss1-T0","Tuss2-T0","WS1-T4","WS2-T4","WS1-T24","WS3-T24","Tuss1-T4","Tuss2-T4","Tuss1-T24","Tuss3-T24")
tpm_all_veg<-c("WS","WS","TUSS","TUSS","WS","WS","WS","WS","TUSS","TUSS","TUSS","TUSS")
tpm_all_time<-c("T0","T0","T0","T0","T4","T4","T24","T24","T4","T4","T24","T24")
tpm_all_groups<-data.frame(tpm_all_sample,tpm_all_veg,tpm_all_time)
# Convert the values in Column 1 ("tpm_sample") into row names
rownames(tpm_all_groups)<-tpm_all_groups[,1]
tpm_all_groups<-tpm_all_groups[,-1]
Second, subset the Wet Sedge columns from the tpm_all_exp_ann object (ID and the six Wet Sedge samples) to analyze the beta diversity of the Wet Sedge ecosystem.
# Subset the Wet Sedge samples from tpm_all_exp_ann to prepare for beta diversity analysis
tpm_ws_beta<-subset(tpm_all_exp_ann, select=c(ID,S108379,S108381,S108385,S108386,S108388,S108390))
names(tpm_ws_beta)<-c("ID","WS1-T0","WS3-T0","WS1-T4","WS2-T4","WS1-T24","WS3-T24")
# Remember "ID" column as non-numeric values
tpm_ws_beta_ID<-tpm_ws_beta$ID
# Transpose all but the first column ("ID")
tpm_ws_beta<-as.data.frame(t(tpm_ws_beta[,-1]))
colnames(tpm_ws_beta)<-tpm_ws_beta_ID
# Make tpm_ws_groups object
tpm_ws_sample<-c("WS1-T0","WS3-T0","WS1-T4","WS2-T4","WS1-T24","WS3-T24")
tpm_ws_veg<-c("WS","WS","WS","WS","WS","WS")
tpm_ws_time<-c("T0","T0","T4","T4","T24","T24")
tpm_ws_groups<-data.frame(tpm_ws_sample,tpm_ws_veg,tpm_ws_time)
# Convert the values in Column 1 ("tpm_sample") into row names
rownames(tpm_ws_groups)<-tpm_ws_groups[,1]
tpm_ws_groups<-tpm_ws_groups[,-1]
Third, subset the Tussock columns from the tpm_all_exp_ann object (ID and the six Tussock samples) to analyze the beta diversity of the Tussock ecosystem.
# Subset the Tussock samples from tpm_all_exp_ann to prepare for beta diversity analysis
tpm_tuss_beta<-subset(tpm_all_exp_ann, select=c(ID,S108382,S108383,S108391,S108392,S108394,S108396))
names(tpm_tuss_beta)<-c("ID","Tuss1-T0","Tuss2-T0","Tuss1-T4","Tuss2-T4","Tuss1-T24","Tuss3-T24")
# Remember "ID" column as non-numeric values
tpm_tuss_beta_ID<-tpm_tuss_beta$ID
# Transpose all but the first column ("ID")
tpm_tuss_beta<-as.data.frame(t(tpm_tuss_beta[,-1]))
colnames(tpm_tuss_beta)<-tpm_tuss_beta_ID
# Make tpm_tuss_groups object
tpm_tuss_sample<-c("Tuss1-T0","Tuss2-T0","Tuss1-T4","Tuss2-T4","Tuss1-T24","Tuss3-T24")
tpm_tuss_veg<-c("TUSS","TUSS","TUSS","TUSS","TUSS","TUSS")
tpm_tuss_time<-c("T0","T0","T4","T4","T24","T24")
tpm_tuss_groups<-data.frame(tpm_tuss_sample,tpm_tuss_veg,tpm_tuss_time)
# Convert the values in Column 1 ("tpm_sample") into row names
rownames(tpm_tuss_groups)<-tpm_tuss_groups[,1]
tpm_tuss_groups<-tpm_tuss_groups[,-1]
Calculate the Bray-Curtis dissimilarity matrix to use with adonis
# Bray-Curtis Dissimilarity Matrix for tpm_all_beta
tpm.all.bc.dist<-vegdist(tpm_all_beta, method = "bray")
# Bray-Curtis Dissimilarity Matrix for tpm_ws_beta
tpm.ws.bc.dist<-vegdist(tpm_ws_beta, method = "bray")
# Bray-Curtis Dissimilarity Matrix for tpm_tuss_beta
tpm.tuss.bc.dist<-vegdist(tpm_tuss_beta, method = "bray")
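For reference, the Bray-Curtis dissimilarity between two samples is the sum of absolute differences across genes divided by the sum of all values in both samples. The base R sketch below applies that formula to a made-up matrix; bray_curtis and toy_tpm are hypothetical names, and the loop is written for clarity rather than speed.

```r
# Bray-Curtis dissimilarity between two abundance vectors
bray_curtis <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Toy matrix: rows are samples, columns are hypothetical genes
toy_tpm <- rbind(
  S1 = c(10, 0, 5),
  S2 = c(10, 0, 5),
  S3 = c(0, 8, 0),
  S4 = c(5, 5, 5)
)

# Fill a symmetric matrix with all pairwise dissimilarities
n <- nrow(toy_tpm)
toy_bc <- matrix(0, n, n, dimnames = list(rownames(toy_tpm), rownames(toy_tpm)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    toy_bc[i, j] <- bray_curtis(toy_tpm[i, ], toy_tpm[j, ])
  }
}

# Convert to a "dist" object, the same class that vegdist() returns
toy_bc_dist <- as.dist(toy_bc)
toy_bc_dist
```

Identical samples (S1, S2) score 0 and samples with no shared genes (S1, S3) score 1, so larger values mean less similar expression profiles.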
Principal coordinates analysis (PCoA, also known as metric multidimensional scaling) attempts to represent the distances between samples in a low-dimensional, Euclidean space. In particular, it maximizes the linear correlation between the distances in the distance matrix, and the distances in a space of low dimension (typically, 2 or 3 axes are selected).
The first step of a PCoA is the construction of a (dis)similarity matrix. While PCA (principal component analysis) is based on Euclidean distances, PCoA can handle (dis)similarity matrices calculated from quantitative, semi-quantitative, qualitative, and mixed variables. As always, the choice of (dis)similarity measure is critical and must be suitable to the data in question. For count data, Bray-Curtis distance is recommended. You can use Jaccard index for presence/absence data. When the distance metric is Euclidean, PCoA is equivalent to PCA.
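The equivalence between PCoA and PCA under Euclidean distance can be checked numerically. The following base R sketch uses random toy data (toy_env is a hypothetical name) and compares the first two PCoA axes from cmdscale() with the first two principal component scores from prcomp(); the axes agree up to an arbitrary sign flip, so absolute values are compared.

```r
# Random toy data: 6 samples by 4 variables (not the study data)
set.seed(7)
toy_env <- matrix(rnorm(24), nrow = 6)

# PCoA (classical multidimensional scaling) on Euclidean distances
pcoa_scores <- cmdscale(dist(toy_env), k = 2)

# PCA on the same data (centered, not scaled)
pca_scores <- prcomp(toy_env)$x[, 1:2]

# Largest absolute discrepancy between the two ordinations, ignoring sign
max_diff <- max(abs(abs(pcoa_scores) - abs(pca_scores)))
max_diff
```

The discrepancy should be at the level of floating-point error. With Bray-Curtis in place of Euclidean distance the two methods no longer coincide, which is why PCoA is used for the TPM data here.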
The first PCoA compares similarity between tundra ecosystems by incorporating the combined dataset.
# Run PCoA analysis on the Bray-Curtis dissimilarity matrix
tpm.all.bc.pcoa<-pcoa(tpm.all.bc.dist)
Extract the eigenvalues and calculate the variation explained by each axis.
pcoa.all<-cmdscale(tpm.all.bc.dist, eig=TRUE, k=2)
pcoa.all
$points
[,1] [,2]
WS1-T0 -0.4377394 0.030397881
WS3-T0 -0.4160635 0.023158270
Tuss1-T0 0.3951114 0.297624418
Tuss2-T0 0.3946852 0.305317523
WS1-T4 -0.4423050 0.012303942
WS2-T4 -0.4164105 0.028136246
WS1-T24 -0.4327351 0.009686452
WS3-T24 -0.3001615 -0.107702022
Tuss1-T4 0.4351273 -0.142104048
Tuss2-T4 0.4364641 -0.014062350
Tuss1-T24 0.4482202 -0.084366006
Tuss3-T24 0.3358067 -0.358390306
$eig
[1] 2.016506e+00 3.518486e-01 2.784335e-01 2.155332e-01 1.530410e-01 1.220625e-01 1.103878e-01
[8] 8.587274e-02 7.453914e-02 6.800370e-02 5.258858e-02 5.170945e-17
$x
NULL
$ac
[1] 0
$GOF
[1] 0.671147 0.671147
# Calculate the percent variation explained by Axis 1
pcoa.all.exp.var1<-round(pcoa.all$eig[1] / sum(pcoa.all$eig), 2) * 100
pcoa.all.exp.var1
[1] 57
# Calculate the percent variation explained by Axis 2
pcoa.all.exp.var2<-round(pcoa.all$eig[2] / sum(pcoa.all$eig), 2) * 100
pcoa.all.exp.var2
[1] 10
# Sum the percent variation explained by both axes
pcoa.all.sum.eig<-sum(pcoa.all.exp.var1, pcoa.all.exp.var2)
pcoa.all.sum.eig
[1] 67
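The axis percentages can also be turned into axis labels directly, which avoids retyping them in the plotting call. This sketch copies the eigenvalues printed above into a plain vector so that it runs on its own; in the notebook the same expressions could take pcoa.all$eig instead. The names pcoa_eig, pct, x_lab, and y_lab are hypothetical.

```r
# Eigenvalues copied from the cmdscale() output above
pcoa_eig <- c(2.016506e+00, 3.518486e-01, 2.784335e-01, 2.155332e-01,
              1.530410e-01, 1.220625e-01, 1.103878e-01, 8.587274e-02,
              7.453914e-02, 6.800370e-02, 5.258858e-02, 5.170945e-17)

# Percent of total variation per axis, rounded to whole numbers
pct <- round(100 * pcoa_eig / sum(pcoa_eig))

# Axis labels built from the computed values
x_lab <- paste0("PCoA 1 (", pct[1], "%)")
y_lab <- paste0("PCoA 2 (", pct[2], "%)")
x_lab
y_lab
```

These strings can be passed to xlab() and ylab() in ggplot2, so the labels stay in step with the ordination if the input data change.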
Plot the PCoA.
# Extract the plot scores from first two PCoA axes
tpm.all.bc.pcoa.axes <- tpm.all.bc.pcoa$vectors[,c(1,2)]
# Create new object with PCoA axes
all.bc.pcoa<-data.frame(tpm.all.bc.pcoa.axes)
# Add Timepoint variable to the dataframe
all.timepoint<-c('WS-MT-T0','WS-MT-T0','Tuss-MT-T0','Tuss-MT-T0','WS-MT-T4','WS-MT-T4','WS-MT-T24','WS-MT-T24','Tuss-MT-T4','Tuss-MT-T4','Tuss-MT-T24','Tuss-MT-T24')
all.pcoa.1<-all.bc.pcoa$Axis.1
all.pcoa.2<-all.bc.pcoa$Axis.2
all.bc.pcoa<-data.frame(all.timepoint,all.pcoa.1,all.pcoa.2)
# Rename column headings
colnames(all.bc.pcoa)<-c("Timepoint","PCoA_1","PCoA_2")
# Re-arrange samples for plotting PCoA
all.bc.pcoa$Timepoint <- factor(all.bc.pcoa$Timepoint, levels=c("Tuss-MT-T0","WS-MT-T0","Tuss-MT-T4","WS-MT-T4","Tuss-MT-T24","WS-MT-T24"))
# Plot the PCoA
all.pcoa<-ggplot(data=all.bc.pcoa, aes(x=PCoA_1, y=PCoA_2, group=Timepoint, color=Timepoint)) + geom_point(aes(shape=Timepoint, size=0.75, fill=Timepoint)) + scale_fill_manual(values=c("palegreen","lightskyblue1","green3","dodgerblue1","darkgreen","midnightblue")) + scale_shape_manual(values=c(21,21,22,22,24,24)) + scale_color_manual(values=c("black","black","black","black","black","black")) + xlab("PCoA 1 (57%)") + ylab("PCoA 2 (10%)") + theme_classic() + theme(axis.text=element_text(size=10), axis.title=element_text(size=12), legend.position = "bottom", legend.title=element_blank(), legend.text=element_text(size=10)) + guides(shape = guide_legend(override.aes = list(size = 3))) + scale_size(guide="none") + geom_hline(aes(yintercept = 0), color="gray", linetype="dashed") + geom_vline(aes(xintercept=0), color="gray", linetype="dashed") + ylim(-0.6,0.6) + xlim(-0.6,0.6) + annotate(geom="text", x=0.0, y=0.59, label="PERMANOVA") + annotate("text", x = 0, y = 0.52, size=3, label = "Tundra~(italic(p) == 0.001)", parse = TRUE) + annotate("text", x = 0, y = 0.45, size=3, label = "Time~Point~(italic(p) == 0.171)", parse = TRUE) + annotate("text", x = 0, y = 0.38, size=3, label = "Tundra:Time~Point~(italic(p) == 0.226)", parse = TRUE)
all.pcoa
The second PCoA compares similarity between sampling time points within the Tussock tundra ecosystem.
# Run PCoA analysis on the Bray-Curtis dissimilarity matrix
tpm.tuss.bc.pcoa<-pcoa(tpm.tuss.bc.dist)
Extract the eigenvalues and calculate the variation explained by each axis.
pcoa.tuss<-cmdscale(tpm.tuss.bc.dist, eig=TRUE, k=2)
pcoa.tuss
$points
[,1] [,2]
Tuss1-T0 -0.30365268 0.11026379
Tuss2-T0 -0.31292950 0.10833684
Tuss1-T4 0.15076219 -0.20670711
Tuss2-T4 0.01964313 -0.24828600
Tuss1-T24 0.08858266 -0.07022686
Tuss3-T24 0.35759420 0.30661935
$eig
[1] 3.489654e-01 2.272160e-01 1.534762e-01 8.675773e-02 6.818989e-02 2.949487e-17
$x
NULL
$ac
[1] 0
$GOF
[1] 0.651343 0.651343
# Calculate the percent variation explained by Axis 1
pcoa.tuss.exp.var1<-round(pcoa.tuss$eig[1] / sum(pcoa.tuss$eig), 2) * 100
pcoa.tuss.exp.var1
[1] 39
# Calculate the percent variation explained by Axis 2
pcoa.tuss.exp.var2<-round(pcoa.tuss$eig[2] / sum(pcoa.tuss$eig), 2) * 100
pcoa.tuss.exp.var2
[1] 26
# Sum the percent variation explained by both axes
pcoa.tuss.sum.eig<-sum(pcoa.tuss.exp.var1, pcoa.tuss.exp.var2)
pcoa.tuss.sum.eig
[1] 65
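As a quick cross-check, the percentages above can be recomputed in one step from the eigenvalues printed for pcoa.tuss. This is a minimal sketch that hard-codes those printed eigenvalues; the names `tuss_eig` and `tuss_pct` are illustrative and not part of the notebook, and the approach assumes no eigenvalue is meaningfully negative, which holds for the values shown.

```r
# Eigenvalues copied from the pcoa.tuss$eig output above
tuss_eig <- c(3.489654e-01, 2.272160e-01, 1.534762e-01,
              8.675773e-02, 6.818989e-02, 2.949487e-17)

# Percent of total variation carried by each axis, rounded to whole percent
tuss_pct <- round(100 * tuss_eig / sum(tuss_eig))

tuss_pct[1:2]       # Axis 1 and Axis 2
sum(tuss_pct[1:2])  # both axes combined
```

The eigenvalues sum to about 0.8846, which agrees with the total sum of squares reported by the Tussock PERMANOVA in this notebook, as expected when both are derived from the same Bray-Curtis matrix.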
Plot the PCoA.
# Extract the plot scores from first two PCoA axes
tpm.tuss.bc.pcoa.axes <- tpm.tuss.bc.pcoa$vectors[,c(1,2)]
# Create new object with PCoA axes
tuss.bc.pcoa<-data.frame(tpm.tuss.bc.pcoa.axes)
# Add Timepoint variable to the dataframe
tuss.timepoint<-c('Tuss-T0','Tuss-T0','Tuss-T4','Tuss-T4','Tuss-T24','Tuss-T24')
tuss.pcoa.1<-tuss.bc.pcoa$Axis.1
tuss.pcoa.2<-tuss.bc.pcoa$Axis.2
tuss.bc.pcoa<-data.frame(tuss.timepoint,tuss.pcoa.1,tuss.pcoa.2)
# Rename column headings
colnames(tuss.bc.pcoa)<-c("Timepoint","PCoA_1","PCoA_2")
#tuss.bc.pcoa
# Re-arrange samples for plotting PCoA
tuss.bc.pcoa$Timepoint<-factor(tuss.bc.pcoa$Timepoint, levels=c("Tuss-T0","Tuss-T4","Tuss-T24"))
# Plot the PCoA
tuss.pcoa<-ggplot(data=tuss.bc.pcoa, aes(x=PCoA_1, y=PCoA_2, group=Timepoint, color=Timepoint)) + geom_point(aes(shape=Timepoint, size=0.75, fill=Timepoint)) + scale_fill_manual(values=c("palegreen","green3","darkgreen")) + scale_shape_manual(values=c(21,24,22)) + scale_color_manual(values=c("black","black","black")) + xlab("PCoA 1 (39%)") + ylab("PCoA 2 (26%)") + theme_minimal() + theme(axis.text=element_text(size=12),axis.title=element_text(size=14))+ theme(legend.position = "bottom", legend.title=element_blank(), legend.text=element_text(size=12), panel.background = element_blank(), panel.grid.major=element_blank(), panel.grid.minor = element_blank(), axis.line = element_line(colour = "gray"), panel.border = element_rect(colour = "gray", fill=NA, size=1)) + scale_size(guide="none") + guides(shape = guide_legend(override.aes = list(size = 3))) + geom_hline(aes(yintercept = 0), color="gray", linetype="dashed") + geom_vline(aes(xintercept=0), color="gray", linetype="dashed") + ylim(-0.5,0.5) + xlim(-0.5,0.5) + annotate(geom="text", x=0.0, y=0.49, label="PERMANOVA") + annotate("text", x = 0, y = 0.42, size=3, label = "Time~Point~(italic(p) == 0.067)", parse = TRUE)
tuss.pcoa
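The time point labels in the plotting code are typed by hand in the same order as the rows of the distance matrix. One way to reduce the risk of a label landing on the wrong sample is to derive the labels from the sample names themselves. The sketch below assumes the sample names follow the pattern printed in the cmdscale() output (for example "Tuss1-T0"); `sample_ids` and `timepoint_labels` are hypothetical names used only for illustration.

```r
# Sample names as printed in the cmdscale() output for the Tussock samples
sample_ids <- c("Tuss1-T0", "Tuss2-T0", "Tuss1-T4",
                "Tuss2-T4", "Tuss1-T24", "Tuss3-T24")

# Drop the replicate number that precedes the hyphen: "Tuss1-T0" -> "Tuss-T0"
timepoint_labels <- sub("[0-9]+-", "-", sample_ids)

# Order the factor levels chronologically for plotting
timepoint_labels <- factor(timepoint_labels,
                           levels = c("Tuss-T0", "Tuss-T4", "Tuss-T24"))
table(timepoint_labels)
```

In practice the vector of sample names would be taken from the row names of the ordination scores rather than typed in, provided those row names are present.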
The third PCoA compares similarity between sampling time points within the Wet Sedge tundra ecosystem.
# Run PCoA analysis on the Bray-Curtis dissimilarity matrix
tpm.ws.bc.pcoa<-pcoa(tpm.ws.bc.dist)
Extract the eigenvalues and calculate the variation explained by each axis.
pcoa.ws<-cmdscale(tpm.ws.bc.dist, eig=TRUE, k=2)
pcoa.ws
$points
[,1] [,2]
WS1-T0 0.12542270 0.13122361
WS3-T0 0.08716127 -0.16515464
WS1-T4 0.06687703 0.16211299
WS2-T4 0.14921322 -0.20414083
WS1-T24 0.05526330 0.10073221
WS3-T24 -0.48393753 -0.02477334
$eig
[1] 2.873146e-01 1.232105e-01 1.105396e-01 7.457974e-02 5.269727e-02 -3.869650e-17
$x
NULL
$ac
[1] 0
$GOF
[1] 0.6331925 0.6331925
# Calculate the percent variation explained by Axis 1
pcoa.ws.exp.var1<-round(pcoa.ws$eig[1] / sum(pcoa.ws$eig), 2) * 100
pcoa.ws.exp.var1
[1] 44
# Calculate the percent variation explained by Axis 2
pcoa.ws.exp.var2<-round(pcoa.ws$eig[2] / sum(pcoa.ws$eig), 2) * 100
pcoa.ws.exp.var2
[1] 19
# Sum the percent variation explained by both axes
pcoa.ws.sum.eig<-sum(pcoa.ws.exp.var1, pcoa.ws.exp.var2)
pcoa.ws.sum.eig
[1] 63
Plot the PCoA.
# Extract the plot scores from first two PCoA axes
tpm.ws.bc.pcoa.axes <- tpm.ws.bc.pcoa$vectors[,c(1,2)]
# Create new object with PCoA axes
ws.bc.pcoa<-data.frame(tpm.ws.bc.pcoa.axes)
# Add Timepoint variable to the dataframe
ws.timepoint<-c('WS-T0','WS-T0','WS-T4','WS-T4','WS-T24','WS-T24')
ws.pcoa.1<-ws.bc.pcoa$Axis.1
ws.pcoa.2<-ws.bc.pcoa$Axis.2
ws.bc.pcoa<-data.frame(ws.timepoint,ws.pcoa.1,ws.pcoa.2)
# Rename column headings
colnames(ws.bc.pcoa)<-c("Timepoint","PCoA_1","PCoA_2")
#ws.bc.pcoa
# Re-arrange samples for plotting PCoA
ws.bc.pcoa$Timepoint<-factor(ws.bc.pcoa$Timepoint, levels=c("WS-T0","WS-T4","WS-T24"))
# Plot the PCoA
ws.pcoa<-ggplot(data=ws.bc.pcoa, aes(x=PCoA_1, y=PCoA_2, group=Timepoint, color=Timepoint)) + geom_point(aes(shape=Timepoint, size=0.75, fill=Timepoint)) + scale_fill_manual(values=c("lightskyblue1","dodgerblue1","midnightblue")) + scale_shape_manual(values=c(21,24,22)) + scale_color_manual(values=c("black","black","black")) + xlab("PCoA 1 (44%)") + ylab("PCoA 2 (19%)") + theme_minimal() + theme(axis.text=element_text(size=12),axis.title=element_text(size=14))+ theme(legend.position = "bottom", legend.title=element_blank(), legend.text=element_text(size=12), panel.background = element_blank(), panel.grid.major=element_blank(), panel.grid.minor = element_blank(), axis.line = element_line(colour = "gray"), panel.border = element_rect(colour = "gray", fill=NA, size=1)) + scale_size(guide="none") + guides(shape = guide_legend(override.aes = list(size = 3))) + geom_hline(aes(yintercept = 0), color="gray", linetype="dashed") + geom_vline(aes(xintercept=0), color="gray", linetype="dashed") + ylim(-0.5,0.5) + xlim(-0.5,0.5) + annotate(geom="text", x=0.0, y=0.49, label="PERMANOVA") + annotate("text", x = 0, y = 0.42, size=3, label = "Time~Point~(italic(p) == 0.333)", parse = TRUE)
ws.pcoa
Run PERMANOVA tests with the adonis function of the vegan package.
PERMANOVA - TPM-All - Bray-Curtis
adonis(tpm.all.bc.dist~tpm_all_veg*tpm_all_time, data=tpm_all_groups)
Call:
adonis(formula = tpm.all.bc.dist ~ tpm_all_veg * tpm_all_time, data = tpm_all_groups)
Permutation: free
Number of permutations: 999
Terms added sequentially (first to last)
Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
tpm_all_veg 1 1.9959 1.99587 15.5588 0.56559 0.001 ***
tpm_all_time 2 0.3920 0.19602 1.5281 0.11109 0.158
tpm_all_veg:tpm_all_time 2 0.3712 0.18562 1.4470 0.10520 0.225
Residuals 6 0.7697 0.12828 0.21811
Total 11 3.5288 1.00000
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
PERMANOVA - TPM-Tuss - Bray-Curtis
adonis(tpm.tuss.bc.dist~tpm_tuss_time, data=tpm_tuss_groups)
'nperm' >= set of all permutations: complete enumeration.
Set of permutations < 'minperm'. Generating entire set.
Call:
adonis(formula = tpm.tuss.bc.dist ~ tpm_tuss_time, data = tpm_tuss_groups)
Permutation: free
Number of permutations: 719
Terms added sequentially (first to last)
Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
tpm_tuss_time 2 0.49733 0.24867 1.9263 0.56221 0.06667 .
Residuals 3 0.38727 0.12909 0.43779
Total 5 0.88461 1.00000
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
PERMANOVA - TPM-WS - Bray-Curtis
adonis(tpm.ws.bc.dist~tpm_ws_time, data=tpm_ws_groups)
'nperm' >= set of all permutations: complete enumeration.
Set of permutations < 'minperm'. Generating entire set.
Call:
adonis(formula = tpm.ws.bc.dist ~ tpm_ws_time, data = tpm_ws_groups)
Permutation: free
Number of permutations: 719
Terms added sequentially (first to last)
Df SumsOfSqs MeanSqs F.Model R2 Pr(>F)
tpm_ws_time 2 0.26594 0.13297 1.0432 0.41018 0.3333
Residuals 3 0.38240 0.12747 0.58982
Total 5 0.64834 1.00000
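The two single-tundra tests report 719 permutations and p-values of 0.06667 and 0.3333. The arithmetic below is a sketch of where those numbers could come from. It assumes that the pseudo-F statistic depends only on which samples share a time point, and that the p-value is computed as (number of permuted F values at least as large as the observed one + 1) / (number of permutations + 1). The variable names are illustrative only.

```r
n_samples <- 6   # samples per tundra type
n_groups  <- 3   # time points (T0, T4, T24)
per_group <- 2   # replicates per time point

# Complete enumeration: all orderings of 6 samples, minus the observed one
n_perm <- factorial(n_samples) - 1

# Distinct ways to split 6 samples into 3 unlabeled pairs
n_partitions <- factorial(n_samples) /
  (factorial(per_group)^n_groups * factorial(n_groups))

# Smallest p-value reachable when the observed grouping is the most extreme
min_p <- 1 / n_partitions

c(permutations = n_perm, partitions = n_partitions, min_p = round(min_p, 5))
```

Under these assumptions, the Tussock p-value of 0.06667 is the lowest value this design can return, and the Wet Sedge value of 0.3333 corresponds to 5 of the 15 possible groupings. A p-value just above 0.05 in the Tussock test may therefore reflect the limited replication as much as the size of the time point effect.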
Look for differences in gene expression at multiple KEGG tier levels (II-IV) using TPM-normalized values.
Look at differences in gene expression patterns for multiple Tier levels within the Tussock tundra between sampling time points.
# Calculate TPM-normalized gene count sums for each Tuss sample in each Tier II category
tuss_tII_sum<-tpm_all_exp_ann %>% group_by(Tier_II) %>% summarise_at(vars("S108382","S108383","S108391","S108392","S108394","S108396"), sum)
# Round the summed values to whole TPM (formula syntax replaces the deprecated funs())
tuss_tII_sum<-tuss_tII_sum %>% mutate_at(vars(S108382,S108383,S108391,S108392,S108394,S108396), ~round(., 0))
tuss_tII_sum<-as.data.frame(tuss_tII_sum)
tuss_tII_sum<-tuss_tII_sum[order(-tuss_tII_sum$S108382),]
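The group_by() and summarise_at() step collapses gene-level TPM values into one row per Tier II category. The same idea can be illustrated in base R with rowsum(). This is only a sketch on a made-up table: the values, the object `toy_tpm`, and the sample columns `S1` and `S2` are hypothetical and do not come from the notebook data.

```r
# Hypothetical gene-level TPM table with a KEGG Tier II annotation
toy_tpm <- data.frame(
  Tier_II = c("Metabolism", "Metabolism", "Cellular Processes",
              "Genetic Information Processing"),
  S1 = c(10.4, 5.2, 3.1, 8.8),
  S2 = c(12.0, 4.1, 2.7, 9.9)
)

# Sum TPM within each Tier II category, one column per sample
toy_sum <- rowsum(toy_tpm[, c("S1", "S2")], group = toy_tpm$Tier_II)

# Round to whole TPM and sort by the first sample, largest first
toy_sum <- round(toy_sum, 0)
toy_sum <- toy_sum[order(-toy_sum$S1), ]
toy_sum
```

Unlike the dplyr version, rowsum() keeps the category names as row names rather than as a column, so a later step that moves Tier_II into the row names would not be needed.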
Heatmap for Tier II categories by sampling timepoints
# Create copy of "tuss_tII_sum" object to prep for heatmap
tuss_tII_heatmap<-tuss_tII_sum
# Convert the first column (Tier categories) into rownames
rownames(tuss_tII_heatmap) <- tuss_tII_heatmap$Tier_II
tuss_tII_heatmap<-as.data.frame(tuss_tII_heatmap[-1])
# Rename columns to sample IDs
colnames(tuss_tII_heatmap)<-c("Tuss1-T0","Tuss2-T0","Tuss1-T4","Tuss2-T4","Tuss1-T24","Tuss3-T24")
# Remove rows for "Organismal Systems" and "Human Diseases"
#tuss_tII_heatmap <- tuss_tII_heatmap[-c(5,6),]
# Convert dataframe into a matrix for heatmap
tuss_tII_heatmap<-as.matrix(tuss_tII_heatmap)
# Scale matrix values to generate Z-scores
tuss_tII_heatmap<-scale(t(tuss_tII_heatmap))
tuss_tII_heatmap<-t(tuss_tII_heatmap)
# Specify RColorBrewer custom color palette
col <- colorRampPalette(brewer.pal(10, "RdYlBu"))(256)
# Pheatmap
pheatmap(tuss_tII_heatmap, treeheight_row = 0, treeheight_col = 0,cluster_rows = FALSE, cluster_cols = FALSE)
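Because scale() standardizes columns, the double transpose above yields Z-scores for each KEGG category across samples rather than for each sample across categories. The sketch below shows this on two rows of summed Tier II values reported in this notebook; the object names `m` and `z` are illustrative.

```r
# Two Tier II categories (rows) by six Tussock samples (columns)
m <- rbind(
  Metabolism = c(674830, 630044, 400527, 364021, 412576, 389480),
  `Cellular Processes` = c(33722, 35007, 89589, 75876, 51484, 54909)
)
colnames(m) <- c("Tuss1-T0", "Tuss2-T0", "Tuss1-T4",
                 "Tuss2-T4", "Tuss1-T24", "Tuss3-T24")

# scale() centers and scales columns, so transpose, scale, transpose back
z <- t(scale(t(m)))

# Each row now has mean 0 and standard deviation 1
round(rowMeans(z), 10)
apply(z, 1, sd)
```

Note that the RdYlBu palette stored in `col` is not passed to pheatmap() in the call above, so the heatmap is drawn with the default colors; supplying it through the `color` argument would apply it.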
Summed values by replicate for each KEGG tier II category.
| | Tier_II | S108382 | S108383 | S108391 | S108392 | S108394 | S108396 |
|---|---|---|---|---|---|---|---|
| 4 | Metabolism | 674,830 | 630,044 | 400,527 | 364,021 | 412,576 | 389,480 |
| 3 | Genetic Information Processing | 236,019 | 273,627 | 464,898 | 522,501 | 494,641 | 481,859 |
| 2 | Environmental Information Processing | 55,429 | 61,322 | 44,986 | 37,602 | 41,299 | 73,752 |
| 1 | Cellular Processes | 33,722 | 35,007 | 89,589 | 75,876 | 51,484 | 54,909 |
Calculate mean and standard deviation values from the replicate sums.
# Calculate mean values from the replicate sums
tuss_tII_sum$meantuss_T0<-apply(tuss_tII_sum[,2:3], 1, mean)
tuss_tII_sum$meantuss_T4<-apply(tuss_tII_sum[,4:5], 1, mean)
tuss_tII_sum$meantuss_T24<-apply(tuss_tII_sum[,6:7], 1, mean)
tuss_tII_sum<-tuss_tII_sum %>% mutate_at(vars(meantuss_T0,meantuss_T4,meantuss_T24), ~round(., 0))
# Calculate sd values from the replicate sums
tuss_tII_sum$sdtuss_T0<-apply(tuss_tII_sum[,2:3], 1, sd)
tuss_tII_sum$sdtuss_T4<-apply(tuss_tII_sum[,4:5], 1, sd)
tuss_tII_sum$sdtuss_T24<-apply(tuss_tII_sum[,6:7], 1, sd)
tuss_tII_sum<-tuss_tII_sum %>% mutate_at(vars(sdtuss_T0,sdtuss_T4,sdtuss_T24), ~round(., 0))
# Subset Mean values
tuss_tII_mean<-subset(tuss_tII_sum, select=c(Tier_II,meantuss_T0,meantuss_T4,meantuss_T24))
names(tuss_tII_mean)<-c("Tier II", "Tuss-T0", "Tuss-T4", "Tuss-T24")
# Subset SD values
tuss_tII_sd<-subset(tuss_tII_sum, select=c(Tier_II,sdtuss_T0,sdtuss_T4,sdtuss_T24))
names(tuss_tII_sd)<-c("Tier II", "Tuss-T0", "Tuss-T4", "Tuss-T24")
# Remember "Tier II" as non-numeric values
tuss_tII_mean_TierII<-tuss_tII_mean$`Tier II`
tuss_tII_sd_TierII<-tuss_tII_sd$`Tier II`
# Transpose all but first column (Tier II)
tuss_tII_mean<-as.data.frame(t(tuss_tII_mean[,-1]))
colnames(tuss_tII_mean)<-tuss_tII_mean_TierII
tuss_tII_sd<-as.data.frame(t(tuss_tII_sd[,-1]))
colnames(tuss_tII_sd)<-tuss_tII_sd_TierII
# Combine mean and sd into single column with ± divider
tuss_tII_table<-as.data.frame(do.call(cbind, lapply(1:ncol(tuss_tII_mean), function(i) paste0(tuss_tII_mean[ , i], " ± ", tuss_tII_sd[ , i]))))
# Transpose the table back
tuss_tII_table<-t(tuss_tII_table)
# Rename columns to sampling time points
colnames(tuss_tII_table)<-c("Tussock (T0)", "Tussock (T4)", "Tussock (T24)")
rownames(tuss_tII_table)<-tuss_tII_mean_TierII
kable(tuss_tII_table, caption = "Tussock Tier II KEGG category averages (± standard deviation) by sample.", format.args = list(big.mark=","), align = "l") %>% kable_styling(bootstrap_options = c("striped","hover","condensed"))
| | Tussock (T0) | Tussock (T4) | Tussock (T24) |
|---|---|---|---|
| Metabolism | 652437 ± 31668 | 382274 ± 25814 | 401028 ± 16331 |
| Genetic Information Processing | 254823 ± 26593 | 493700 ± 40731 | 488250 ± 9038 |
| Environmental Information Processing | 58376 ± 4167 | 41294 ± 5221 | 57526 ± 22948 |
| Cellular Processes | 34364 ± 909 | 82732 ± 9697 | 53196 ± 2422 |
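Each cell in the table combines the mean and standard deviation of two replicate sums. As a small worked example, the sketch below recomputes one cell from the summed Tier II values reported earlier; the helper `mean_sd_label()` and the vector `metabolism_t0` are hypothetical names introduced here for illustration.

```r
# Summed TPM for the Metabolism category in the two T0 replicates
metabolism_t0 <- c(S108382 = 674830, S108383 = 630044)

# Format a replicate vector as "mean ± sd", rounded to whole numbers
mean_sd_label <- function(x) {
  paste0(round(mean(x), 0), " ± ", round(sd(x), 0))
}

mean_sd_label(metabolism_t0)
```

The result reproduces the Metabolism entry for Tussock (T0) in the table. With only two replicates per time point, the standard deviation is simply the absolute difference between the replicates divided by the square root of 2.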
Run statistical tests to determine if significant differences exist between sampling timepoints for each KEGG tier II category.
# Subset the "sum" values for each sample
tuss_tII_stats<-subset(tuss_tII_sum, select=c(Tier_II,S108382,S108383,S108391,S108392,S108394,S108396))
rownames(tuss_tII_stats) <- tuss_tII_stats$Tier_II
tuss_tII_stats<-as.data.frame(t(tuss_tII_stats[-1]))
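# Create a column with timepoint variable names
# (stored explicitly as a factor because data.frame() no longer converts strings to factors in R >= 4.0)
Timepoint<-c("T0","T0","T4","T4","T24","T24")
Timepoint<-data.frame(Timepoint, stringsAsFactors=TRUE)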
# Add the timepoint column to the sum table
tuss_tII_stats<-data.frame(Timepoint,tuss_tII_stats)
# Subset response variables for MANOVA
tuss_tII_stats$response <- as.matrix(tuss_tII_stats[, 2:5])
# MANOVA test
tuss_tII_manova <- manova(response ~ Timepoint, data=tuss_tII_stats)
summary.aov(tuss_tII_manova)
Response Metabolism. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 9.1031e+10 4.5515e+10 70.532 0.003005 **
Residuals 3 1.9359e+09 6.4532e+08
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Genetic.Information.Processing. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 7.4387e+10 3.7193e+10 45.581 0.005687 **
Residuals 3 2.4479e+09 8.1597e+08
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Environmental.Information.Processing. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 370641156 185320578 0.9733 0.4723
Residuals 3 571224057 190408019
Response Cellular.Processes. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2377655296 1188827648 35.412 0.008192 **
Residuals 3 100714109 33571370
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Run ANOVA for each category of interest (significant in MANOVA)
# Metabolism
tuss_tII_aov1<-aov(Metabolism.~Timepoint,data=tuss_tII_stats)
summary.aov(tuss_tII_aov1)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 9.103e+10 4.552e+10 70.53 0.00301 **
Residuals 3 1.936e+09 6.453e+08
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tII_aov1)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Metabolism. ~ Timepoint, data = tuss_tII_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -251409 -357562.5 -145255.55 0.0045100
T4-T0 -270163 -376316.5 -164009.55 0.0036586
T4-T24 -18754 -124907.5 87399.45 0.7607178
# Genetic Information Processing
tuss_tII_aov2<-aov(Genetic.Information.Processing.~Timepoint,data=tuss_tII_stats)
summary.aov(tuss_tII_aov2)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 7.439e+10 3.719e+10 45.58 0.00569 **
Residuals 3 2.448e+09 8.160e+08
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tII_aov2)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Genetic.Information.Processing. ~ Timepoint, data = tuss_tII_stats)
$Timepoint
diff lwr upr p adj
T24-T0 233427.0 114059.5 352794.5 0.0078490
T4-T0 238876.5 119509.0 358244.0 0.0073444
T4-T24 5449.5 -113918.0 124817.0 0.9802674
# Environmental Information Processing
tuss_tII_aov3<-aov(Environmental.Information.Processing.~Timepoint,data=tuss_tII_stats)
summary.aov(tuss_tII_aov3)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 370641156 185320578 0.973 0.472
Residuals 3 571224057 190408019
TukeyHSD(tuss_tII_aov3)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Environmental.Information.Processing. ~ Timepoint, data = tuss_tII_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -850.0 -58512.09 56812.09 0.9979117
T4-T0 -17081.5 -74743.59 40580.59 0.5125718
T4-T24 -16231.5 -73893.59 41430.59 0.5405595
# Cellular Processes
tuss_tII_aov4<-aov(Cellular.Processes.~Timepoint,data=tuss_tII_stats)
summary.aov(tuss_tII_aov4)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2.378e+09 1.189e+09 35.41 0.00819 **
Residuals 3 1.007e+08 3.357e+07
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tII_aov4)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Cellular.Processes. ~ Timepoint, data = tuss_tII_stats)
$Timepoint
diff lwr upr p adj
T24-T0 18832 -5380.089 43044.09 0.0934848
T4-T0 48368 24155.911 72580.09 0.0073816
T4-T24 29536 5323.911 53748.09 0.0295052
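The four ANOVA and Tukey blocks above repeat the same two calls with a different response each time. A loop over the response names is one way to express this more compactly. The sketch below is an illustration rather than the notebook's method: it rebuilds a small table from the summed Tier II values for two categories, and the names `tII`, `fits`, `tukey`, and `f_values` are hypothetical.

```r
# Summed TPM per sample for two Tier II categories (values from the table above)
tII <- data.frame(
  Timepoint = factor(c("T0", "T0", "T4", "T4", "T24", "T24")),
  Metabolism = c(674830, 630044, 400527, 364021, 412576, 389480),
  Cellular.Processes = c(33722, 35007, 89589, 75876, 51484, 54909)
)

# Fit one single-factor ANOVA per response column
responses <- setdiff(names(tII), "Timepoint")
fits <- lapply(setNames(responses, responses), function(v) {
  aov(reformulate("Timepoint", response = v), data = tII)
})

# Tukey post-hoc comparisons for every fitted model
tukey <- lapply(fits, TukeyHSD)

# Collect the F statistic for the Timepoint term of each model
f_values <- sapply(fits, function(fit) summary(fit)[[1]][["F value"]][1])
round(f_values, 2)
tukey$Cellular.Processes$Timepoint
```

Because the default factor levels are sorted alphabetically (T0, T24, T4), the Tukey comparisons come out in the same order as in the output above.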
Plot the KEGG tier II categories as a barchart.
# Place the KEGG tier II categories in the preferred order for plotting
tuss_tII_bardata$Tier.II <- factor(tuss_tII_bardata$Tier.II,levels = c("Metabolism", "Genetic Information Processing", "Environmental Information Processing", "Cellular Processes"))
tuss_tII_bardata$Sample <- factor(tuss_tII_bardata$Sample,levels=c("Tuss-T0","Tuss-T4","Tuss-T24"))
tuss_tII_barplot<-ggplot(tuss_tII_bardata, aes(x = Tier.II, y = Mean, fill = Sample)) + geom_bar(stat = "identity", position = "dodge", color="black") + geom_errorbar(aes(ymin = Mean - SD, ymax = Mean + SD), width = 0.2, position = position_dodge(0.9)) + ylab(expression(atop("KEGG Tier II Categories", paste("Transcript Counts (TPM)")))) + theme_classic() + theme(axis.text=element_text(size=10), axis.title=element_text(size=12), axis.title.x=element_blank()) + theme(legend.position = "bottom", legend.title=element_blank(), legend.text=element_text(size=8), panel.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), panel.border = element_blank()) + scale_size(guide="none") + scale_fill_manual(values=c("palegreen", "green3", "darkgreen")) + scale_x_discrete(labels = function(Tier.II) str_wrap(Tier.II, width = 10)) + scale_y_continuous(expand = c(0, 0), limits = c(0, 800000)) + annotate(geom="text", x=0.70, y=750000, label="a") + annotate(geom="text", x=1.0, y=475000, label="b") + annotate(geom="text", x=1.3, y=475000, label="b") + annotate(geom="text", x=1.7, y=350000, label="a") + annotate(geom="text", x=2.0, y=600000, label="b") + annotate(geom="text", x=2.3, y=600000, label="b") + annotate(geom="text", x=3.0, y=150000, label="italic(N.S.)", parse=TRUE) + annotate(geom="text", x=3.7, y=100000, label="a") + annotate(geom="text", x=4.0, y=150000, label="b") + annotate(geom="text", x=4.3, y=120000, label="a")
tuss_tII_barplot
Plot the combined METAGENOME and METATRANSCRIPTOME KEGG tier II categories as a barchart.
# Place the KEGG tier II categories in the preferred order for plotting
tuss_mg_mt_tII_bardata$Tier.II <- factor(tuss_mg_mt_tII_bardata$Tier.II,levels = c("Metabolism", "Genetic Information Processing", "Environmental Information Processing", "Cellular Processes"))
tuss_mg_mt_tII_bardata$Sample <- factor(tuss_mg_mt_tII_bardata$Sample,levels=c("Tuss-MG","Tuss-MT-T0","Tuss-MT-T4","Tuss-MT-T24"))
tuss_mg_mt_tII_barplot<-ggplot(tuss_mg_mt_tII_bardata, aes(x = Tier.II, y = Mean, fill = Sample)) + geom_bar(stat = "identity", position = "dodge", color="black") + geom_errorbar(aes(ymin = Mean - SD, ymax = Mean + SD), width = 0.2, position = position_dodge(0.9)) + ylab(expression(atop("KEGG Tier II Categories", paste("Gene Counts")))) + theme_classic() + theme(axis.text=element_text(size=10), axis.title=element_text(size=12), axis.title.x=element_blank()) + theme(legend.position = "bottom", legend.title=element_blank(), legend.text=element_text(size=8), panel.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), panel.border = element_blank()) + scale_size(guide="none") + scale_fill_manual(values=c("gray83","palegreen", "green3", "darkgreen")) + scale_x_discrete(labels = function(Tier.II) str_wrap(Tier.II, width = 10)) + scale_y_continuous(expand = c(0, 0), limits = c(0, 800000)) + annotate(geom="text", x=0.90, y=750000, label="a") + annotate(geom="text", x=1.12, y=475000, label="b") + annotate(geom="text", x=1.35, y=475000, label="b") + annotate(geom="text", x=1.9, y=350000, label="a") + annotate(geom="text", x=2.12, y=600000, label="b") + annotate(geom="text", x=2.35, y=600000, label="b") + annotate(geom="text", x=3.12, y=150000, label="italic(N.S.)", parse=TRUE) + annotate(geom="text", x=3.9, y=100000, label="a") + annotate(geom="text", x=4.12, y=150000, label="b") + annotate(geom="text", x=4.35, y=120000, label="a")
tuss_mg_mt_tII_barplot
Calculate TPM-normalized gene count sums for each Tuss sample in each Tier III category.
# Calculate TPM-normalized gene count sums for each Tuss sample in each Tier III category
tuss_tIII_sum<-tpm_all_exp_ann %>% group_by(Tier_III) %>% summarise_at(vars("S108382","S108383","S108391","S108392","S108394","S108396"), sum)
tuss_tIII_sum<-tuss_tIII_sum %>% mutate_at(vars(S108382,S108383,S108391,S108392,S108394,S108396), ~round(., 0))
tuss_tIII_sum<-as.data.frame(tuss_tIII_sum)
tuss_tIII_sum<-tuss_tIII_sum[order(-tuss_tIII_sum$S108382),]
# Calculate mean values from the replicate sums
tuss_tIII_sum$meantuss_T0<-apply(tuss_tIII_sum[,2:3], 1, mean)
tuss_tIII_sum$meantuss_T4<-apply(tuss_tIII_sum[,4:5], 1, mean)
tuss_tIII_sum$meantuss_T24<-apply(tuss_tIII_sum[,6:7], 1, mean)
tuss_tIII_sum<-tuss_tIII_sum %>% mutate_at(vars(meantuss_T0,meantuss_T4,meantuss_T24), ~round(., 0))
# Calculate sd values from the replicate sums
tuss_tIII_sum$sdtuss_T0<-apply(tuss_tIII_sum[,2:3], 1, sd)
tuss_tIII_sum$sdtuss_T4<-apply(tuss_tIII_sum[,4:5], 1, sd)
tuss_tIII_sum$sdtuss_T24<-apply(tuss_tIII_sum[,6:7], 1, sd)
tuss_tIII_sum<-tuss_tIII_sum %>% mutate_at(vars(sdtuss_T0,sdtuss_T4,sdtuss_T24), ~round(., 0))
# Write data to .csv file
write.csv(tuss_tIII_sum, 'Norm.Results/tuss.rna.tIII.mean.sd.csv')
# Subset Mean values
tuss_tIII_mean<-subset(tuss_tIII_sum, select=c(Tier_III,meantuss_T0,meantuss_T4,meantuss_T24))
names(tuss_tIII_mean)<-c("Tier III", "Tuss-T0", "Tuss-T4", "Tuss-T24")
# Subset SD values
tuss_tIII_sd<-subset(tuss_tIII_sum, select=c(Tier_III,sdtuss_T0,sdtuss_T4,sdtuss_T24))
names(tuss_tIII_sd)<-c("Tier III", "Tuss-T0", "Tuss-T4", "Tuss-T24")
# Remember "Tier III" as non-numeric values
tuss_tIII_mean_TierIII<-tuss_tIII_mean$`Tier III`
tuss_tIII_sd_TierIII<-tuss_tIII_sd$`Tier III`
# Transpose all but first column (Tier III)
tuss_tIII_mean<-as.data.frame(t(tuss_tIII_mean[,-1]))
colnames(tuss_tIII_mean)<-tuss_tIII_mean_TierIII
tuss_tIII_sd<-as.data.frame(t(tuss_tIII_sd[,-1]))
colnames(tuss_tIII_sd)<-tuss_tIII_sd_TierIII
# Combine mean and sd into single column with ± divider
tuss_tIII_table<-as.data.frame(do.call(cbind, lapply(1:ncol(tuss_tIII_mean), function(i) paste0(tuss_tIII_mean[ , i], " ± ", tuss_tIII_sd[ , i]))))
# Transpose the table back
tuss_tIII_table<-t(tuss_tIII_table)
# Rename columns to sampling time points
colnames(tuss_tIII_table)<-c("Tussock (T0)", "Tussock (T4)", "Tussock (T24)")
rownames(tuss_tIII_table)<-tuss_tIII_mean_TierIII
Run statistical tests to determine if significant differences exist between sampling timepoints for each KEGG tier III category.
# Subset the "sum" values for each sample
tuss_tIII_stats<-subset(tuss_tIII_sum, select=c(Tier_III,S108382,S108383,S108391,S108392,S108394,S108396))
rownames(tuss_tIII_stats) <- tuss_tIII_stats$Tier_III
tuss_tIII_stats<-as.data.frame(t(tuss_tIII_stats[-1]))
# Add the timepoint column to the sum table
tuss_tIII_stats<-data.frame(Timepoint,tuss_tIII_stats)
# Subset response variables for MANOVA
tuss_tIII_stats$response <- as.matrix(tuss_tIII_stats[, 2:26])
# MANOVA test
tuss_tIII_manova <- manova(response ~ Timepoint, data=tuss_tIII_stats)
summary.aov(tuss_tIII_manova)
Response Overview. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3.0956e+10 1.5478e+10 42.322 0.006333 **
Residuals 3 1.0971e+09 3.6571e+08
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Carbohydrate.metabolism. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 9211461944 4605730972 50.218 0.004939 **
Residuals 3 275146619 91715540
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Translation. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.3171e+10 6585386784 16.987 0.02311 *
Residuals 3 1.1630e+09 387668201
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Folding..sorting.and.degradation. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2.9725e+10 1.4863e+10 95.521 0.001922 **
Residuals 3 4.6678e+08 1.5559e+08
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Nucleotide.metabolism. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1001591105 500795553 4.8863 0.1138
Residuals 3 307466450 102488817
Response Energy.metabolism. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 334756291 167378146 5.7208 0.09468 .
Residuals 3 87772887 29257629
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Metabolism.of.cofactors.and.vitamins. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 340207504 170103752 20.105 0.01829 *
Residuals 3 25382865 8460955
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Signal.transduction. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 58507486 29253743 0.5767 0.6139
Residuals 3 152167277 50722426
Response Amino.acid.metabolism. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 83140879 41570440 14.132 0.02972 *
Residuals 3 8824805 2941602
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Membrane.transport. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 138011900 69005950 1.4682 0.3593
Residuals 3 141004838 47001613
Response Cell.growth.and.death. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2564348544 1282174272 44.207 0.005945 **
Residuals 3 87011512 29003837
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Replication.and.repair. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 9060508 4530254 2.5246 0.2275
Residuals 3 5383349 1794450
Response Lipid.metabolism. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 20559886 10279943 4.8843 0.1139
Residuals 3 6314115 2104705
Response Metabolism.of.terpenoids.and.polyketides. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3301072 1650536 2.6472 0.2175
Residuals 3 1870532 623511
Response Cellular.community...prokaryotes. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 11953342 5976671 0.5268 0.6367
Residuals 3 34034581 11344860
Response Xenobiotics.biodegradation.and.metabolism. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 37538224 18769112 165.82 0.0008488 ***
Residuals 3 339574 113191
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Metabolism.of.other.amino.acids. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 159070 79535 0.0646 0.9387
Residuals 3 3692780 1230927
Response Glycan.biosynthesis.and.metabolism. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2773070 1386535 1.5981 0.3369
Residuals 3 2602910 867637
Response Transport.and.catabolism. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3005564 1502782 2.8822 0.2003
Residuals 3 1564177 521392
Response Cell.motility. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 4645112 2322556 0.6376 0.5878
Residuals 3 10927437 3642479
Response Biosynthesis.of.other.secondary.metabolites. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 85154 42577 8.4733 0.05833 .
Residuals 3 15074 5025
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Transcription. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2665852 1332926 5.9012 0.09124 .
Residuals 3 677626 225875
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Cellular.community...eukaryotes. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 292483 146242 3.9738 0.1435
Residuals 3 110404 36801
Response Signaling.molecules.and.interaction. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1912.3 956.17 0.8034 0.5255
Residuals 3 3570.5 1190.17
Response Enzyme.families. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0 0 NaN NaN
Residuals 3 0 0
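The Enzyme.families response returns NaN for the F value because its sums of squares are all zero, meaning the category has the same value in every sample. One option is to drop such constant columns before fitting the models. The sketch below shows the idea on a made-up table: `toy_stats`, its values, and the other object names are hypothetical and are not taken from the notebook data.

```r
# Hypothetical summed-TPM table: one row per sample, one column per category
toy_stats <- data.frame(
  Timepoint = factor(c("T0", "T0", "T4", "T4", "T24", "T24")),
  Transcription = c(5, 7, 9, 8, 6, 4),
  Enzyme.families = c(0, 0, 0, 0, 0, 0)
)

# Identify response columns whose values never change across samples
response_cols <- setdiff(names(toy_stats), "Timepoint")
is_constant <- sapply(toy_stats[response_cols], function(x) var(x) == 0)

# Keep only categories with some variation before fitting the models
testable <- response_cols[!is_constant]
testable
```

Filtering this way leaves the results for the remaining categories unchanged, since each response is tested separately in the summary.aov() output.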
Run ANOVA for each category of interest (significant in MANOVA)
# Overview
tuss_tIII_aov1<-aov(Overview.~Timepoint,data=tuss_tIII_stats)
summary.aov(tuss_tIII_aov1)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3.096e+10 1.548e+10 42.32 0.00633 **
Residuals 3 1.097e+09 3.657e+08
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIII_aov1)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Overview. ~ Timepoint, data = tuss_tIII_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -146654.5 -226567.44 -66741.56 0.0094184
T4-T0 -157506.0 -237418.94 -77593.06 0.0076734
T4-T24 -10851.5 -90764.44 69061.44 0.8456760
# Carbohydrate Metabolism
tuss_tIII_aov2<-aov(Carbohydrate.metabolism.~Timepoint,data=tuss_tIII_stats)
summary.aov(tuss_tIII_aov2)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 9.211e+09 4.606e+09 50.22 0.00494 **
Residuals 3 2.751e+08 9.172e+07
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIII_aov2)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Carbohydrate.metabolism. ~ Timepoint, data = tuss_tIII_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -80739.5 -120758.77 -40720.23 0.0071744
T4-T0 -85308.0 -125327.27 -45288.73 0.0061201
T4-T24 -4568.5 -44587.77 35450.77 0.8865603
# Translation
tuss_tIII_aov3<-aov(Translation.~Timepoint,data=tuss_tIII_stats)
summary.aov(tuss_tIII_aov3)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.317e+10 6.585e+09 16.99 0.0231 *
Residuals 3 1.163e+09 3.877e+08
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIII_aov3)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Translation. ~ Timepoint, data = tuss_tIII_stats)
$Timepoint
diff lwr upr p adj
T24-T0 112973.5 30696.618 195250.38 0.0213363
T4-T0 73974.5 -8302.382 156251.38 0.0655840
T4-T24 -38999.0 -121275.882 43277.88 0.2633108
# Folding, Sorting, and Degradation
tuss_tIII_aov4<-aov(Folding..sorting.and.degradation.~Timepoint,data=tuss_tIII_stats)
summary.aov(tuss_tIII_aov4)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2.973e+10 1.486e+10 95.52 0.00192 **
Residuals 3 4.668e+08 1.556e+08
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIII_aov4)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Folding..sorting.and.degradation. ~ Timepoint, data = tuss_tIII_stats)
$Timepoint
diff lwr upr p adj
T24-T0 122733.5 70608.732 174858.27 0.0045869
T4-T0 166229.5 114104.732 218354.27 0.0019021
T4-T24 43496.0 -8628.768 95620.77 0.0788983
# Metabolism of Cofactors and Vitamins
tuss_tIII_aov5<-aov(Metabolism.of.cofactors.and.vitamins.~Timepoint,data=tuss_tIII_stats)
summary.aov(tuss_tIII_aov5)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 340207504 170103752 20.11 0.0183 *
Residuals 3 25382865 8460955
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIII_aov5)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Metabolism.of.cofactors.and.vitamins. ~ Timepoint, data = tuss_tIII_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -17234.5 -29389.569 -5079.431 0.0195185
T4-T0 -14308.0 -26463.069 -2152.931 0.0324916
T4-T24 2926.5 -9228.569 15081.569 0.6228689
# Amino Acid Metabolism
tuss_tIII_aov6<-aov(Amino.acid.metabolism.~Timepoint,data=tuss_tIII_stats)
summary.aov(tuss_tIII_aov6)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 83140879 41570440 14.13 0.0297 *
Residuals 3 8824805 2941602
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIII_aov6)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Amino.acid.metabolism. ~ Timepoint, data = tuss_tIII_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -9046.0 -16213.037 -1878.963 0.0268937
T4-T0 -5514.5 -12681.537 1652.537 0.0959158
T4-T24 3531.5 -3635.537 10698.537 0.2456345
# Cell Growth and Death
tuss_tIII_aov7<-aov(Cell.growth.and.death.~Timepoint,data=tuss_tIII_stats)
summary.aov(tuss_tIII_aov7)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2.564e+09 1.282e+09 44.21 0.00595 **
Residuals 3 8.701e+07 2.900e+07
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIII_aov7)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Cell.growth.and.death. ~ Timepoint, data = tuss_tIII_stats)
$Timepoint
diff lwr upr p adj
T24-T0 16444.5 -6060.316 38949.32 0.1082421
T4-T0 49700.5 27195.684 72205.32 0.0055243
T4-T24 33256.0 10751.184 55760.82 0.0173935
# Xenobiotics Biodegradation and Metabolism
tuss_tIII_aov8<-aov(Xenobiotics.biodegradation.and.metabolism.~Timepoint,data=tuss_tIII_stats)
summary.aov(tuss_tIII_aov8)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 37538224 18769112 165.8 0.000849 ***
Residuals 3 339574 113191
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIII_aov8)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Xenobiotics.biodegradation.and.metabolism. ~ Timepoint, data = tuss_tIII_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -5434 -6839.9 -4028.1 0.0011070
T4-T0 -5168 -6573.9 -3762.1 0.0012720
T4-T24 266 -1139.9 1671.9 0.7336445
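The per-category ANOVA and Tukey HSD calls above all follow the same pattern, so they could also be run in a loop. The following is a minimal sketch, assuming a data frame shaped like `tuss_tIII_stats` (a `Timepoint` factor plus one numeric column per category). The object `toy_stats`, its column names, and its values are hypothetical and are not part of the original analysis.

```r
# Toy stand-in for tuss_tIII_stats: 6 samples, 2 replicates per timepoint.
# Column names and values are made up for illustration only.
toy_stats <- data.frame(
  Timepoint  = factor(rep(c("T0", "T4", "T24"), each = 2)),
  Category.A = c(10, 12, 30, 33, 20, 21),
  Category.B = c(5, 6, 5, 7, 6, 5)
)

# Every column except Timepoint is treated as a response variable
responses <- setdiff(names(toy_stats), "Timepoint")

# Fit one ANOVA per response and keep the fits in a named list
fits <- lapply(responses, function(v) {
  aov(reformulate("Timepoint", response = v), data = toy_stats)
})
names(fits) <- responses

# Print the ANOVA table and the Tukey HSD comparisons for each response.
# Factor levels sort alphabetically (T0, T24, T4), which is why the
# comparisons are labelled T24-T0, T4-T0, and T4-T24.
for (v in responses) {
  cat("\n#", v, "\n")
  print(summary(fits[[v]]))
  print(TukeyHSD(fits[[v]]))
}
```

Keeping the fits in a named list means any single result can be retrieved later, for example `TukeyHSD(fits[["Category.A"]])`.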
Calculate TPM-normalized gene count sums for each Tuss sample in each Tier IV category.
# Calculate TPM-normalized gene count sums for each Tuss sample in each Tier IV category
tuss_tIV_sum<-tpm_all_exp_ann %>% group_by(Tier_IV) %>% summarise_at(vars("S108382","S108383","S108391","S108392","S108394","S108396"), sum)
tuss_tIV_sum<-tuss_tIV_sum %>% mutate_at(vars(S108382,S108383,S108391,S108392,S108394,S108396), ~round(., 0))
tuss_tIV_sum<-as.data.frame(tuss_tIV_sum)
tuss_tIV_sum<-tuss_tIV_sum[order(-tuss_tIV_sum$S108382),]
# Calculate mean values from the replicate sums
tuss_tIV_sum$meantuss_T0<-apply(tuss_tIV_sum[,2:3], 1, mean)
tuss_tIV_sum$meantuss_T4<-apply(tuss_tIV_sum[,4:5], 1, mean)
tuss_tIV_sum$meantuss_T24<-apply(tuss_tIV_sum[,6:7], 1, mean)
tuss_tIV_sum<-tuss_tIV_sum %>% mutate_at(vars(meantuss_T0,meantuss_T4,meantuss_T24), ~round(., 0))
# Calculate sd values from the replicate sums
tuss_tIV_sum$sdtuss_T0<-apply(tuss_tIV_sum[,2:3], 1, sd)
tuss_tIV_sum$sdtuss_T4<-apply(tuss_tIV_sum[,4:5], 1, sd)
tuss_tIV_sum$sdtuss_T24<-apply(tuss_tIV_sum[,6:7], 1, sd)
tuss_tIV_sum<-tuss_tIV_sum %>% mutate_at(vars(sdtuss_T0,sdtuss_T4,sdtuss_T24), ~round(., 0))
# Write data to .csv file
write.csv(tuss_tIV_sum, 'Norm.Results/tuss.rna.tIV.mean.sd.csv')
# Subset Mean values
tuss_tIV_mean<-subset(tuss_tIV_sum, select=c(Tier_IV,meantuss_T0,meantuss_T4,meantuss_T24))
names(tuss_tIV_mean)<-c("Tier IV", "Tuss-T0", "Tuss-T4", "Tuss-T24")
# Subset SD values
tuss_tIV_sd<-subset(tuss_tIV_sum, select=c(Tier_IV,sdtuss_T0,sdtuss_T4,sdtuss_T24))
names(tuss_tIV_sd)<-c("Tier IV", "Tuss-T0", "Tuss-T4", "Tuss-T24")
# Store the "Tier IV" category names (non-numeric) before transposing
tuss_tIV_mean_TierIV<-tuss_tIV_mean$`Tier IV`
tuss_tIV_sd_TierIV<-tuss_tIV_sd$`Tier IV`
# Transpose all but first column (Tier IV)
tuss_tIV_mean<-as.data.frame(t(tuss_tIV_mean[,-1]))
colnames(tuss_tIV_mean)<-tuss_tIV_mean_TierIV
tuss_tIV_sd<-as.data.frame(t(tuss_tIV_sd[,-1]))
colnames(tuss_tIV_sd)<-tuss_tIV_sd_TierIV
# Combine mean and sd into single column with ± divider
tuss_tIV_table<-as.data.frame(do.call(cbind, lapply(1:ncol(tuss_tIV_mean), function(i) paste0(tuss_tIV_mean[ , i], " ± ", tuss_tIV_sd[ , i]))))
# Transpose the table back
tuss_tIV_table<-t(tuss_tIV_table)
# Rename columns by tundra type and timepoint, and rows by Tier IV category
colnames(tuss_tIV_table)<-c("Tussock (T0)", "Tussock (T4)", "Tussock (T24)")
rownames(tuss_tIV_table)<-tuss_tIV_mean_TierIV
#tuss_tIV_table
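The "mean ± sd" table built above can also be assembled without the transpose steps. This is a minimal sketch of that idea, assuming two replicate columns per timepoint as in `tuss_tIV_sum`. The object `toy_sum` and its column names and values are hypothetical; selecting replicate columns by name rather than by position is a design choice of this sketch, not the notebook's method.

```r
# Toy stand-in for tuss_tIV_sum: one row per category, two replicate
# columns per timepoint. Names and values are made up for illustration.
toy_sum <- data.frame(
  Tier_IV  = c("Pathway A", "Pathway B"),
  T0_rep1  = c(100, 10), T0_rep2  = c(120, 14),
  T4_rep1  = c(200, 12), T4_rep2  = c(220, 12),
  T24_rep1 = c(150, 11), T24_rep2 = c(170, 15)
)

# Map each output column label to its replicate columns by name
reps <- list(
  "Tussock (T0)"  = c("T0_rep1", "T0_rep2"),
  "Tussock (T4)"  = c("T4_rep1", "T4_rep2"),
  "Tussock (T24)" = c("T24_rep1", "T24_rep2")
)

# Build one "mean ± sd" character column per timepoint
toy_table <- sapply(reps, function(cols) {
  m <- round(rowMeans(toy_sum[, cols]), 0)
  s <- round(apply(toy_sum[, cols], 1, sd), 0)
  paste0(m, " ± ", s)
})
rownames(toy_table) <- toy_sum$Tier_IV
toy_table
```

Selecting columns by name avoids the positional indices (`2:3`, `4:5`, `6:7`), which would silently point at the wrong samples if the column order ever changed.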
Run statistical tests to determine whether significant differences exist between sampling timepoints for each Tier IV KEGG category. Click on the Show/Hide button to see the statistics.
# Subset the "sum" values for each sample
tuss_tIV_stats<-subset(tuss_tIV_sum, select=c(Tier_IV,S108382,S108383,S108391,S108392,S108394,S108396))
rownames(tuss_tIV_stats) <- tuss_tIV_stats$Tier_IV
tuss_tIV_stats<-as.data.frame(t(tuss_tIV_stats[-1]))
# Add the timepoint column to the sum table
tuss_tIV_stats<-data.frame(Timepoint,tuss_tIV_stats)
# Subset response variables for MANOVA
tuss_tIV_stats$response <- as.matrix(tuss_tIV_stats[, 2:227])
# Fit the multi-response model; summary.aov() then reports a separate univariate ANOVA for each response column
tuss_tIV_manova <- manova(response ~ Timepoint, data=tuss_tIV_stats)
summary.aov(tuss_tIV_manova)
Response X01200.Carbon.metabolism..PATH.ko01200. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.7153e+10 8576404278 17.66 0.0219 *
Residuals 3 1.4569e+09 485639564
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X03010.Ribosome..PATH.ko03010. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.3250e+10 6625221176 17.078 0.02294 *
Residuals 3 1.1638e+09 387927821
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X03018.RNA.degradation..PATH.ko03018. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2.4580e+10 1.2290e+10 116.53 0.001433 **
Residuals 3 3.1639e+08 1.0546e+08
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00230.Purine.metabolism..PATH.ko00230. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1194233024 597116512 5.7324 0.09445 .
Residuals 3 312493039 104164346
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X01212.Fatty.acid.metabolism..PATH.ko01212. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1400417305 700208653 166.61 0.0008428 ***
Residuals 3 12607908 4202636
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00190.Oxidative.phosphorylation..PATH.ko00190. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 294927542 147463771 6.4145 0.08251 .
Residuals 3 68967675 22989225
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00650.Butanoate.metabolism..PATH.ko00650. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 916283691 458141846 11.478 0.03929 *
Residuals 3 119745259 39915086
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00010.Glycolysis...Gluconeogenesis..PATH.ko00010. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 646115961 323057980 29.342 0.01073 *
Residuals 3 33029853 11009951
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00640.Propanoate.metabolism..PATH.ko00640. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 276630866 138315433 8.5361 0.05778 .
Residuals 3 48610475 16203492
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X02020.Two.component.system..PATH.ko02020. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 100862431 50431216 0.878 0.501
Residuals 3 172323860 57441287
Response X02010.ABC.transporters..PATH.ko02010. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 136666543 68333271 1.9745 0.2837
Residuals 3 103823155 34607718
Response X01230.Biosynthesis.of.amino.acids..PATH.ko01230. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 14513158 7256579 0.4641 0.6674
Residuals 3 46906381 15635460
Response X04112.Cell.cycle...Caulobacter..PATH.ko04112. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2484296210 1242148105 41.97 0.00641 **
Residuals 3 88788203 29596068
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00760.Nicotinate.and.nicotinamide.metabolism..PATH.ko00760. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 140469145 70234573 14.447 0.02885 *
Residuals 3 14584946 4861649
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X01220.Degradation.of.aromatic.compounds..PATH.ko01220. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 92377264 46188632 27.224 0.01193 *
Residuals 3 5089767 1696589
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X03060.Protein.export..PATH.ko03060. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 80425936 40212968 2.1311 0.2655
Residuals 3 56609754 18869918
Response X00240.Pyrimidine.metabolism..PATH.ko00240. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 9523084 4761542 7.8297 0.06447 .
Residuals 3 1824413 608138
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X01210.2.Oxocarboxylic.acid.metabolism..PATH.ko01210. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 40780169 20390085 1.1247 0.432
Residuals 3 54389012 18129671
Response X00500.Starch.and.sucrose.metabolism..PATH.ko00500. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 28026154 14013077 37.804 0.007456 **
Residuals 3 1112040 370680
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00970.Aminoacyl.tRNA.biosynthesis..PATH.ko00970. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2042336 1021168 0.9365 0.483
Residuals 3 3271196 1090399
Response X00860.Porphyrin.and.chlorophyll.metabolism..PATH.ko00860. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 8406391 4203196 12.153 0.03642 *
Residuals 3 1037565 345855
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00360.Phenylalanine.metabolism..PATH.ko00360. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 6182910 3091455 6.9891 0.07428 .
Residuals 3 1326983 442328
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00620.Pyruvate.metabolism..PATH.ko00620. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 35362292 17681146 49.721 0.005011 **
Residuals 3 1066810 355603
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00900.Terpenoid.backbone.biosynthesis..PATH.ko00900. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2878816 1439408 8.8448 0.05521 .
Residuals 3 488223 162741
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00030.Pentose.phosphate.pathway..PATH.ko00030. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 41610937 20805469 9.3783 0.0512 .
Residuals 3 6655378 2218459
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X02024.Quorum.sensing..PATH.ko02024. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 16585121 8292561 0.9498 0.4791
Residuals 3 26191390 8730463
Response X00520.Amino.sugar.and.nucleotide.sugar.metabolism..PATH.ko00520. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1978704 989352 1.6001 0.3366
Residuals 3 1854922 618307
Response X00633.Nitrotoluene.degradation..PATH.ko00633. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 18660459 9330230 50.543 0.004893 **
Residuals 3 553800 184600
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00270.Cysteine.and.methionine.metabolism..PATH.ko00270. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 425956 212978 0.3666 0.7204
Residuals 3 1743004 581001
Response X04141.Protein.processing.in.endoplasmic.reticulum..PATH.ko04141. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 239695658 119847829 6.8827 0.07569 .
Residuals 3 52238944 17412982
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00920.Sulfur.metabolism..PATH.ko00920. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 406171 203085 0.1629 0.8567
Residuals 3 3740157 1246719
Response X03440.Homologous.recombination..PATH.ko03440. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1976508 988254 8.0369 0.06238 .
Residuals 3 368893 122964
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00790.Folate.biosynthesis..PATH.ko00790. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 614946 307473 3.6404 0.1576
Residuals 3 253385 84462
Response X00730.Thiamine.metabolism..PATH.ko00730. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 5096584 2548292 15.984 0.02513 *
Residuals 3 478274 159425
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00450.Selenocompound.metabolism..PATH.ko00450. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2495094 1247547 2.0526 0.2744
Residuals 3 1823373 607791
Response X00770.Pantothenate.and.CoA.biosynthesis..PATH.ko00770. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 4813879 2406940 27.17 0.01197 *
Residuals 3 265765 88588
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00561.Glycerolipid.metabolism..PATH.ko00561. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 6715964 3357982 33.389 0.008915 **
Residuals 3 301717 100572
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00350.Tyrosine.metabolism..PATH.ko00350. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3144017 1572009 3.076 0.1877
Residuals 3 1533150 511050
Response X00040.Pentose.and.glucuronate.interconversions..PATH.ko00040. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1252446 626223 1.1849 0.4176
Residuals 3 1585547 528516
Response X00052.Galactose.metabolism..PATH.ko00052. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2924805 1462403 8.4382 0.05864 .
Residuals 3 519922 173307
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X03420.Nucleotide.excision.repair..PATH.ko03420. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 389281 194640 1.0022 0.4641
Residuals 3 582640 194213
Response X00910.Nitrogen.metabolism..PATH.ko00910. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2357742 1178871 3.3036 0.1745
Residuals 3 1070537 356846
Response X00330.Arginine.and.proline.metabolism..PATH.ko00330. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3117270 1558635 11.93 0.03733 *
Residuals 3 391951 130650
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00562.Inositol.phosphate.metabolism..PATH.ko00562. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1228842 614421 2.3246 0.2456
Residuals 3 792937 264312
Response X03050.Proteasome..PATH.ko03050. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 210967 105484 0.164 0.8559
Residuals 3 1929687 643229
Response X00750.Vitamin.B6.metabolism..PATH.ko00750. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1285060 642530 1.1552 0.4246
Residuals 3 1668555 556185
Response X00630.Glyoxylate.and.dicarboxylate.metabolism..PATH.ko00630. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2325257 1162629 8.2637 0.06022 .
Residuals 3 422074 140691
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00564.Glycerophospholipid.metabolism..PATH.ko00564. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 833920 416960 5.8203 0.09276 .
Residuals 3 214915 71638
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X03030.DNA.replication..PATH.ko03030. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 83106 41553 0.9344 0.4837
Residuals 3 133406 44469
Response X00051.Fructose.and.mannose.metabolism..PATH.ko00051. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1024591 512296 7.9101 0.06364 .
Residuals 3 194295 64765
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00550.Peptidoglycan.biosynthesis..PATH.ko00550. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 845777 422889 1.8328 0.3019
Residuals 3 692217 230739
Response X02026.Biofilm.formation...Escherichia.coli..PATH.ko02026. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 822904 411452 1.3682 0.3782
Residuals 3 902181 300727
Response X00740.Riboflavin.metabolism..PATH.ko00740. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 698724 349362 4.3672 0.1293
Residuals 3 239990 79997
Response X03410.Base.excision.repair..PATH.ko03410. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 361081 180541 5.4685 0.09987 .
Residuals 3 99043 33014
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00540.Lipopolysaccharide.biosynthesis..PATH.ko00540. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 298510 149255 0.8302 0.5165
Residuals 3 539317 179772
Response X03430.Mismatch.repair..PATH.ko03430. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 919129 459565 3.2975 0.1748
Residuals 3 418102 139367
Response X00130.Ubiquinone.and.other.terpenoid.quinone.biosynthesis..PATH.ko00130. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 227911 113956 0.9201 0.488
Residuals 3 371551 123850
Response X04013.MAPK.signaling.pathway...fly..PATH.ko04013. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 632136 316068 2.8613 0.2017
Residuals 3 331384 110461
Response X04122.Sulfur.relay.system..PATH.ko04122. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3894102 1947051 53.011 0.004565 **
Residuals 3 110186 36729
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00250.Alanine..aspartate.and.glutamate.metabolism..PATH.ko00250. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 206572 103286 9.3945 0.05109 .
Residuals 3 32983 10994
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X03070.Bacterial.secretion.system..PATH.ko03070. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1125234 562617 0.6321 0.5901
Residuals 3 2670247 890082
Response X00260.Glycine..serine.and.threonine.metabolism..PATH.ko00260. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 561206 280603 3.5254 0.1631
Residuals 3 238786 79595
Response X00480.Glutathione.metabolism..PATH.ko00480. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 177609 88805 4.7798 0.1167
Residuals 3 55738 18579
Response X00340.Histidine.metabolism..PATH.ko00340. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 398749 199375 11.112 0.04101 *
Residuals 3 53825 17942
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X03450.Non.homologous.end.joining..PATH.ko03450. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 166980 83490 0.6421 0.586
Residuals 3 390054 130018
Response X00600.Sphingolipid.metabolism..PATH.ko00600. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1354564 677282 7.8244 0.06452 .
Residuals 3 259682 86561
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00280.Valine..leucine.and.isoleucine.degradation..PATH.ko00280. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 286025 143013 4.2494 0.1333
Residuals 3 100966 33655
Response X00061.Fatty.acid.biosynthesis..PATH.ko00061. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 614841 307421 1.4654 0.3598
Residuals 3 629357 209786
Response X00785.Lipoic.acid.metabolism..PATH.ko00785. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 246846 123423 5.728 0.09454 .
Residuals 3 64643 21548
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00364.Fluorobenzoate.degradation..PATH.ko00364. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 153292 76646 0.7662 0.5385
Residuals 3 300086 100029
Response X02040.Flagellar.assembly..PATH.ko02040. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3574937 1787469 0.6611 0.5783
Residuals 3 8111394 2703798
Response X00680.Methane.metabolism..PATH.ko00680. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 253881 126941 8.8822 0.05492 .
Residuals 3 42875 14292
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00627.Aminobenzoate.degradation..PATH.ko00627. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 203000 101500 1.4442 0.3637
Residuals 3 210847 70282
Response X00523.Polyketide.sugar.unit.biosynthesis..PATH.ko00523. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 120529 60265 3.0489 0.1894
Residuals 3 59298 19766
Response X00140.Steroid.hormone.biosynthesis..PATH.ko00140. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 305292 152646 78.14 0.002585 **
Residuals 3 5861 1954
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00511.Other.glycan.degradation..PATH.ko00511. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 339283 169642 3.9783 0.1433
Residuals 3 127924 42641
Response X00909.Sesquiterpenoid.and.triterpenoid.biosynthesis..PATH.ko00909. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 446971 223485 18.667 0.02029 *
Residuals 3 35917 11972
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X04120.Ubiquitin.mediated.proteolysis..PATH.ko04120. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 334361 167181 1.6172 0.3338
Residuals 3 310134 103378
Response X00310.Lysine.degradation..PATH.ko00310. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 36486 18243.2 2.83 0.2039
Residuals 3 19339 6446.3
Response X00362.Benzoate.degradation..PATH.ko00362. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 165859 82930 10.698 0.04312 *
Residuals 3 23257 7752
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X02025.Biofilm.formation...Pseudomonas.aeruginosa..PATH.ko02025. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 133802 66901 5.389 0.1016
Residuals 3 37243 12414
Response X00380.Tryptophan.metabolism..PATH.ko00380. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 123036 61518 4.9122 0.1131
Residuals 3 37570 12523
Response X03013.RNA.transport..PATH.ko03013. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3685232 1842616 5.8254 0.09266 .
Residuals 3 948927 316309
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X05111.Biofilm.formation...Vibrio.cholerae..PATH.ko05111. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 40287 20143 0.2678 0.7816
Residuals 3 225661 75220
Response X00625.Chloroalkane.and.chloroalkene.degradation..PATH.ko00625. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 163596 81798 80.457 0.002476 **
Residuals 3 3050 1017
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00430.Taurine.and.hypotaurine.metabolism..PATH.ko00430. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 95412 47706 1.2927 0.3936
Residuals 3 110709 36903
Response X00071.Fatty.acid.degradation..PATH.ko00071. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 33124 16562 0.6734 0.5733
Residuals 3 73781 24594
Response X00473.D.Alanine.metabolism..PATH.ko00473. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 87416 43708 3.3182 0.1737
Residuals 3 39516 13172
Response X00670.One.carbon.pool.by.folate..PATH.ko00670. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 80656 40328 7.7353 0.06546 .
Residuals 3 15641 5214
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00053.Ascorbate.and.aldarate.metabolism..PATH.ko00053. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 150033 75017 4.6531 0.1204
Residuals 3 48365 16122
Response X04217.Necroptosis..PATH.ko04217. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 114410 57205 1.5744 0.3408
Residuals 3 109005 36335
Response X00510.N.Glycan.biosynthesis..PATH.ko00510. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 85234 42617 0.7549 0.5426
Residuals 3 169371 56457
Response X04146.Peroxisome..PATH.ko04146. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 312706 156353 6.307 0.08422 .
Residuals 3 74371 24790
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X03008.Ribosome.biogenesis.in.eukaryotes..PATH.ko03008. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 923361 461681 8.901 0.05477 .
Residuals 3 155606 51868
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00590.Arachidonic.acid.metabolism..PATH.ko00590. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 142922 71461 0.9314 0.4846
Residuals 3 230167 76722
Response X04138.Autophagy...yeast..PATH.ko04138. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 92794 46397 0.9571 0.477
Residuals 3 145425 48475
Response X00780.Biotin.metabolism..PATH.ko00780. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 52292 26146 23.2 0.01497 *
Residuals 3 3381 1127
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00300.Lysine.biosynthesis..PATH.ko00300. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 41825 20913 28.491 0.01119 *
Residuals 3 2202 734
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X04016.MAPK.signaling.pathway...plant..PATH.ko04016. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 44417 22208.7 10.905 0.04205 *
Residuals 3 6110 2036.5
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X04011.MAPK.signaling.pathway...yeast..PATH.ko04011. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 275731 137865 1.7076 0.3198
Residuals 3 242203 80734
Response X00471.D.Glutamine.and.D.glutamate.metabolism..PATH.ko00471. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 26700 13350.2 3.2182 0.1793
Residuals 3 12445 4148.3
Response X03040.Spliceosome..PATH.ko03040. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2166601 1083301 6.9166 0.07524 .
Residuals 3 469867 156622
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00410.beta.Alanine.metabolism..PATH.ko00410. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 15905 7952.7 7.1818 0.07182 .
Residuals 3 3322 1107.3
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X02030.Bacterial.chemotaxis..PATH.ko02030. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 90307 45154 0.5377 0.6316
Residuals 3 251932 83977
Response X00906.Carotenoid.biosynthesis..PATH.ko00906. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 57100 28550 0.8548 0.5084
Residuals 3 100203 33401
Response X00440.Phosphonate.and.phosphinate.metabolism..PATH.ko00440. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 19605.3 9802.7 7.9022 0.06372 .
Residuals 3 3721.5 1240.5
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00660.C5.Branched.dibasic.acid.metabolism..PATH.ko00660. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 207499 103750 1.4874 0.3558
Residuals 3 209252 69751
Response X01059.Biosynthesis.of.enediyne.antibiotics..PATH.ko01059. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 12600 6300.2 1.2642 0.3997
Residuals 3 14950 4983.5
Response X00311.Penicillin.and.cephalosporin.biosynthesis..PATH.ko00311. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 13213.0 6606.5 8.9418 0.05445 .
Residuals 3 2216.5 738.8
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00984.Steroid.degradation..PATH.ko00984. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 8221 4110.5 3.6473 0.1573
Residuals 3 3381 1127.0
Response X00405.Phenazine.biosynthesis..PATH.ko00405. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 9930.3 4965.2 7.7219 0.0656 .
Residuals 3 1929.0 643.0
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00930.Caprolactam.degradation..PATH.ko00930. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 15745.3 7872.7 6.4716 0.08162 .
Residuals 3 3649.5 1216.5
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X03022.Basal.transcription.factors..PATH.ko03022. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 32493 16246.5 2.2043 0.2577
Residuals 3 22111 7370.3
Response X00220.Arginine.biosynthesis..PATH.ko00220. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 23081.3 11540.7 13.709 0.03097 *
Residuals 3 2525.5 841.8
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00908.Zeatin.biosynthesis..PATH.ko00908. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2622.3 1311.2 0.1432 0.8722
Residuals 3 27474.5 9158.2
Response X04144.Endocytosis..PATH.ko04144. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 359824 179912 4.4662 0.1261
Residuals 3 120849 40283
Response X04142.Lysosome..PATH.ko04142. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2892.3 1446.2 0.2646 0.7838
Residuals 3 16398.5 5466.2
Response X02060.Phosphotransferase.system..PTS...PATH.ko02060. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1304.3 652.17 0.3375 0.7376
Residuals 3 5797.0 1932.33
Response X04014.Ras.signaling.pathway..PATH.ko04014. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 139030 69515 7.21 0.07147 .
Residuals 3 28924 9642
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00565.Ether.lipid.metabolism..PATH.ko00565. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 12825.3 6412.7 7.841 0.06435 .
Residuals 3 2453.5 817.8
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00903.Limonene.and.pinene.degradation..PATH.ko00903. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3172.3 1586.2 1.5188 0.3503
Residuals 3 3133.0 1044.3
Response X04145.Phagosome..PATH.ko04145. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 9943 4971.5 1.4806 0.357
Residuals 3 10073 3357.7
Response X00100.Steroid.biosynthesis..PATH.ko00100. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2060.3 1030.2 0.5772 0.6137
Residuals 3 5354.5 1784.8
Response X03015.mRNA.surveillance.pathway..PATH.ko03015. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 60908 30454 2.388 0.2396
Residuals 3 38259 12753
Response X00515.Mannose.type.O.glycan.biosyntheis..PATH.ko00515. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 212.3 106.17 0.0634 0.9398
Residuals 3 5025.0 1675.00
Response X01055.Biosynthesis.of.vancomycin.group.antibiotics..PATH.ko01055. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2133.3 1066.7 0.7089 0.5596
Residuals 3 4514.0 1504.7
Response X04214.Apoptosis...fly..PATH.ko04214. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 635265 317633 24.822 0.0136 *
Residuals 3 38390 12797
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00120.Primary.bile.acid.biosynthesis..PATH.ko00120. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1456.0 728.0 1.7313 0.3163
Residuals 3 1261.5 420.5
Response X01040.Biosynthesis.of.unsaturated.fatty.acids..PATH.ko01040. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 7769.3 3884.7 2.5401 0.2262
Residuals 3 4588.0 1529.3
Response X00281.Geraniol.degradation..PATH.ko00281. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 114.3 57.17 0.0304 0.9704
Residuals 3 5644.5 1881.50
Response X00195.Photosynthesis..PATH.ko00195. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 4984 2492.2 0.1836 0.841
Residuals 3 40729 13576.3
Response X00514.Other.types.of.O.glycan.biosynthesis..PATH.ko00514. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 4863 2431.5 6.7355 0.07773 .
Residuals 3 1083 361.0
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X04371.Apelin.signaling.pathway..PATH.ko04371. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 38184 19092 1.242 0.4046
Residuals 3 46114 15372
Response X04139.Mitophagy...yeast..PATH.ko04139. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 21052 10526.0 2.0772 0.2715
Residuals 3 15202 5067.3
Response X04150.mTOR.signaling.pathway..PATH.ko04150. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3058.3 1529.17 3.2933 0.1751
Residuals 3 1393.0 464.33
Response X00261.Monobactam.biosynthesis..PATH.ko00261. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 977.3 488.7 0.0864 0.9194
Residuals 3 16969.5 5656.5
Response X04110.Cell.cycle..PATH.ko04110. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 13402 6701.2 0.3759 0.715
Residuals 3 53475 17825.0
Response X00965.Betalain.biosynthesis..PATH.ko00965. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2344.3 1172.17 2.662 0.2164
Residuals 3 1321.0 440.33
Response X04015.Rap1.signaling.pathway..PATH.ko04015. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 10108 5054.0 1.3021 0.3917
Residuals 3 11644 3881.3
Response X04530.Tight.junction..PATH.ko04530. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 300037 150019 4.512 0.1246
Residuals 3 99747 33249
Response X00960.Tropane..piperidine.and.pyridine.alkaloid.biosynthesis..PATH.ko00960. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 603 301.5 3.7222 0.1539
Residuals 3 243 81.0
Response X04024.cAMP.signaling.pathway..PATH.ko04024. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2004.3 1002.17 1.8166 0.3042
Residuals 3 1655.0 551.67
Response X03460.Fanconi.anemia.pathway..PATH.ko03460. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1047.0 523.5 0.3693 0.7188
Residuals 3 4252.5 1417.5
Response X04012.ErbB.signaling.pathway..PATH.ko04012. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 14481 7240.5 3.8061 0.1503
Residuals 3 5707 1902.3
Response X04152.AMPK.signaling.pathway..PATH.ko04152. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 10850 5425.2 0.9714 0.4728
Residuals 3 16754 5584.8
Response X04137.Mitophagy...animal..PATH.ko04137. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 45697 22848.5 5.4527 0.1002
Residuals 3 12571 4190.3
Response X04810.Regulation.of.actin.cytoskeleton..PATH.ko04810. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1202.3 601.17 0.6738 0.5732
Residuals 3 2676.5 892.17
Response X04151.PI3K.Akt.signaling.pathway..PATH.ko04151. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1002.3 501.17 0.4803 0.6592
Residuals 3 3130.5 1043.50
Response X04020.Calcium.signaling.pathway..PATH.ko04020. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1414.3 707.17 7.7853 0.06493 .
Residuals 3 272.5 90.83
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00591.Linoleic.acid.metabolism..PATH.ko00591. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 336.33 168.17 0.7101 0.5592
Residuals 3 710.50 236.83
Response X01056.Biosynthesis.of.type.II.polyketide.backbone..PATH.ko01056. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 481.0 240.50 1.2344 0.4063
Residuals 3 584.5 194.83
Response X00643.Styrene.degradation..PATH.ko00643. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1849.3 924.67 4.0467 0.1406
Residuals 3 685.5 228.50
Response X00623.Toluene.degradation..PATH.ko00623. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1484.3 742.17 1.1654 0.4222
Residuals 3 1910.5 636.83
Response X00830.Retinol.metabolism..PATH.ko00830. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1677.0 838.50 2.1345 0.2651
Residuals 3 1178.5 392.83
Response X00020.Citrate.cycle..TCA.cycle...PATH.ko00020. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 8881.3 4440.7 7.3521 0.06975 .
Residuals 3 1812.0 604.0
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00563.Glycosylphosphatidylinositol.GPI..anchor.biosynthesis..PATH.ko00563. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 10249.0 5124.5 2.4834 0.2311
Residuals 3 6190.5 2063.5
Response X04010.MAPK.signaling.pathway..PATH.ko04010. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 6974.3 3487.2 9.1848 0.0526 .
Residuals 3 1139.0 379.7
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X04068.FoxO.signaling.pathway..PATH.ko04068. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 732.0 366.00 0.6163 0.5967
Residuals 3 1781.5 593.83
Response X00363.Bisphenol.degradation..PATH.ko00363. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 566.33 283.167 6.0463 0.08862 .
Residuals 3 140.50 46.833
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X04113.Meiosis...yeast..PATH.ko04113. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1480.3 740.17 0.6777 0.5717
Residuals 3 3276.5 1092.17
Response X04111.Cell.cycle...yeast..PATH.ko04111. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 8233.3 4116.7 5.7995 0.09315 .
Residuals 3 2129.5 709.8
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X04140.Autophagy...animal..PATH.ko04140. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 13589.3 6794.7 2.7525 0.2095
Residuals 3 7405.5 2468.5
Response X01057.Biosynthesis.of.type.II.polyketide.products..PATH.ko01057. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 306.33 153.167 1.6237 0.3328
Residuals 3 283.00 94.333
Response X00365.Furfural.degradation..PATH.ko00365. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 273 136.50 0.9371 0.4829
Residuals 3 437 145.67
Response X04350.TGF.beta.signaling.pathway..PATH.ko04350. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 112.33 56.167 0.6673 0.5758
Residuals 3 252.50 84.167
Response X04391.Hippo.signaling.pathway..fly..PATH.ko04391. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1461 730.50 4.1427 0.1371
Residuals 3 529 176.33
Response X04080.Neuroactive.ligand.receptor.interaction..PATH.ko04080. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 13 6.5 0.4333 0.6834
Residuals 3 45 15.0
Response X04310.Wnt.signaling.pathway..PATH.ko04310. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1264.0 632.00 6.126 0.08724 .
Residuals 3 309.5 103.17
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00404.Staurosporine.biosynthesis..PATH.ko00404. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 357.33 178.667 16 0.02509 *
Residuals 3 33.50 11.167
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00943.Isoflavonoid.biosynthesis..PATH.ko00943. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 294.33 147.167 6.0897 0.08786 .
Residuals 3 72.50 24.167
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X04550.Signaling.pathways.regulating.pluripotency.of.stem.cells..PATH.ko04550. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 74.333 37.167 0.5138 0.6428
Residuals 3 217.000 72.333
Response X00073.Cutin..suberine.and.wax.biosynthesis..PATH.ko00073. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 86.333 43.167 0.9317 0.4845
Residuals 3 139.000 46.333
Response X04130.SNARE.interactions.in.vesicular.transport..PATH.ko04130. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1110.3 555.17 0.6444 0.585
Residuals 3 2584.5 861.50
Response X00531.Glycosaminoglycan.degradation..PATH.ko00531. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 826.33 413.17 0.9783 0.4709
Residuals 3 1267.00 422.33
Response X00791.Atrazine.degradation..PATH.ko00791. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 240.33 120.167 3.0944 0.1865
Residuals 3 116.50 38.833
Response X04070.Phosphatidylinositol.signaling.system..PATH.ko04070. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1842.3 921.17 1.2579 0.4011
Residuals 3 2197.0 732.33
Response X01053.Biosynthesis.of.siderophore.group.nonribosomal.peptides..PATH.ko01053. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 307 153.5 0.8977 0.4948
Residuals 3 513 171.0
Response X00400.Phenylalanine..tyrosine.and.tryptophan.biosynthesis..PATH.ko00400. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2.3333 1.1667 0.1556 0.8624
Residuals 3 22.5000 7.5000
Response X00624.Polycyclic.aromatic.hydrocarbon.degradation..PATH.ko00624. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 52.0 26.000 0.7393 0.5482
Residuals 3 105.5 35.167
Response X04072.Phospholipase.D.signaling.pathway..PATH.ko04072. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 90.33 45.167 0.2326 0.8055
Residuals 3 582.50 194.167
Response X00121.Secondary.bile.acid.biosynthesis..PATH.ko00121. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 44.333 22.167 7.3889 0.06932 .
Residuals 3 9.000 3.000
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00513.Various.types.of.N.glycan.biosynthesis..PATH.ko00513. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1290.3 645.17 3.7293 0.1536
Residuals 3 519.0 173.00
Response X00532.Glycosaminoglycan.biosynthesis...chondroitin.sulfate...dermatan.sulfate..PATH.ko00532. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 103 51.500 0.5578 0.6224
Residuals 3 277 92.333
Response X00472.D.Arginine.and.D.ornithine.metabolism..PATH.ko00472. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3 1.5 0.2143 0.8185
Residuals 3 21 7.0
Response X00361.Chlorocyclohexane.and.chlorobenzene.degradation..PATH.ko00361. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 16.333 8.1667 0.7903 0.53
Residuals 3 31.000 10.3333
Response X04066.HIF.1.signaling.pathway..PATH.ko04066. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 14.333 7.167 0.0964 0.9108
Residuals 3 223.000 74.333
Response X04668.TNF.signaling.pathway..PATH.ko04668. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 6489.0 3244.5 5.6442 0.09621 .
Residuals 3 1724.5 574.8
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00290.Valine..leucine.and.isoleucine.biosynthesis..PATH.ko00290. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 39.0 19.5000 9 0.05399 .
Residuals 3 6.5 2.1667
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00534.Glycosaminoglycan.biosynthesis...heparan.sulfate...heparin..PATH.ko00534. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 112.0 56.000 0.8337 0.5153
Residuals 3 201.5 67.167
Response X00626.Naphthalene.degradation..PATH.ko00626. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.3333 0.6667 0.1111 0.8984
Residuals 3 18.0000 6.0000
Response X00062.Fatty.acid.elongation..PATH.ko00062. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 26.333 13.167 1.1286 0.4311
Residuals 3 35.000 11.667
Response X00512.Mucin.type.O.glycan.biosynthesis..PATH.ko00512. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 156 78.000 2.34 0.2441
Residuals 3 100 33.333
Response X04022.cGMP...PKG.signaling.pathway..PATH.ko04022. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 4.3333 2.1667 1 0.4648
Residuals 3 6.5000 2.1667
Response X04218.Cellular.senescence..PATH.ko04218. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 69.333 34.667 0.7536 0.543
Residuals 3 138.000 46.000
Response X00196.Photosynthesis...antenna.proteins..PATH.ko00196. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2.3333 1.1667 0.2692 0.7807
Residuals 3 13.0000 4.3333
Response X00980.Metabolism.of.xenobiotics.by.cytochrome.P450..PATH.ko00980. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 5.3333 2.6667 2 0.2806
Residuals 3 4.0000 1.3333
Response X04210.Apoptosis..PATH.ko04210. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 43.0 21.5000 2.434 0.2354
Residuals 3 26.5 8.8333
Response X04216.Ferroptosis..PATH.ko04216. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 273 136.500 24.088 0.01419 *
Residuals 3 17 5.667
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X04630.Jak.STAT.signaling.pathway..PATH.ko04630. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 233.33 116.667 3.8043 0.1504
Residuals 3 92.00 30.667
Response X00981.Insect.hormone.biosynthesis..PATH.ko00981. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.3333 0.66667 1 0.4648
Residuals 3 2.0000 0.66667
Response X04390.Hippo.signaling.pathway..PATH.ko04390. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 13.0 6.5 1.8571 0.2987
Residuals 3 10.5 3.5
Response X04392.Hippo.signaling.pathway...multiple.species..PATH.ko04392. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2.3333 1.1667 1.1667 0.4219
Residuals 3 3.0000 1.0000
Response X04510.Focal.adhesion..PATH.ko04510. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 50.333 25.167 0.5945 0.6061
Residuals 3 127.000 42.333
Response X04514.Cell.adhesion.molecules..CAMs...PATH.ko04514. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2044.3 1022.2 0.8881 0.4978
Residuals 3 3453.0 1151.0
Response X04064.NF.kappa.B.signaling.pathway..PATH.ko04064. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 4.3333 2.16667 2.6 0.2213
Residuals 3 2.5000 0.83333
Response X04330.Notch.signaling.pathway..PATH.ko04330. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 972.33 486.17 3.8331 0.1492
Residuals 3 380.50 126.83
Response X04340.Hedgehog.signaling.pathway..PATH.ko04340. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2.3333 1.1667 0.3889 0.7077
Residuals 3 9.0000 3.0000
Response X04520.Adherens.junction..PATH.ko04520. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 181.0 90.50 0.6519 0.582
Residuals 3 416.5 138.83
Response X00004.KEGG.modules.in.global.maps.only :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0 0 NaN NaN
Residuals 3 0 0
Response X00460.Cyanoamino.acid.metabolism..PATH.ko00460. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0 0 NaN NaN
Residuals 3 0 0
Response X00525.Acarbose.and.validamycin.biosynthesis..PATH.ko00525. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0 0 NaN NaN
Residuals 3 0 0
Response X00533.Glycosaminoglycan.biosynthesis...keratan.sulfate..PATH.ko00533. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0 0 NaN NaN
Residuals 3 0 0
Response X00601.Glycosphingolipid.biosynthesis...lacto.and.neolacto.series..PATH.ko00601. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0 0 NaN NaN
Residuals 3 0 0
Response X00604.Glycosphingolipid.biosynthesis...ganglio.series..PATH.ko00604. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 21.333 10.667 1.7778 0.3096
Residuals 3 18.000 6.000
Response X00720.Carbon.fixation.pathways.in.prokaryotes..PATH.ko00720. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.33333 0.16667 0.5 0.6495
Residuals 3 1.00000 0.33333
Response X00904.Diterpenoid.biosynthesis..PATH.ko00904. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0 0 NaN NaN
Residuals 3 0 0
Response X00940.Phenylpropanoid.biosynthesis..PATH.ko00940. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.3333 0.66667 1 0.4648
Residuals 3 2.0000 0.66667
Response X00941.Flavonoid.biosynthesis..PATH.ko00941. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0 0 NaN NaN
Residuals 3 0 0
Response X00982.Drug.metabolism...cytochrome.P450..PATH.ko00982. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.33333 0.16667 1 0.4648
Residuals 3 0.50000 0.16667
Response X00983.Drug.metabolism...other.enzymes..PATH.ko00983. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.3333 0.66667 0.5 0.6495
Residuals 3 4.0000 1.33333
Response X01051.Biosynthesis.of.ansamycins..PATH.ko01051. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 8.3333 4.1667 1 0.4648
Residuals 3 12.5000 4.1667
Response X04075.Plant.hormone.signal.transduction..PATH.ko04075. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.33333 0.16667 1 0.4648
Residuals 3 0.50000 0.16667
Response X04114.Oocyte.meiosis..PATH.ko04114. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0 0 NaN NaN
Residuals 3 0 0
Response X04115.p53.signaling.pathway..PATH.ko04115. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0 0 NaN NaN
Residuals 3 0 0
Response X04341.Hedgehog.signaling.pathway...fly..PATH.ko04341. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 9.3333 4.6667 0.7778 0.5344
Residuals 3 18.0000 6.0000
Response X04512.ECM.receptor.interaction..PATH.ko04512. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0 0 NaN NaN
Residuals 3 0 0
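The per-response tables above can also be screened programmatically rather than by eye. The following is a minimal sketch on toy data, not the notebook's own code: the data frame `toy` and its `pathway_*` columns are hypothetical stand-ins for `tuss_tIV_stats`, and the 0.05 cutoff is an assumption inferred from which categories are tested below. It also drops all-zero responses first, since these are the ones that produce the `NaN` rows shown above.

```r
# Toy data standing in for tuss_tIV_stats: 3 timepoints x 2 replicates.
# Column names and values are hypothetical, not taken from the study.
toy <- data.frame(
  Timepoint = factor(rep(c("T0", "T4", "T24"), each = 2)),
  pathway_a = c(10, 12, 30, 31, 29, 33),  # shifts after T0
  pathway_b = c(5, 9, 6, 8, 7, 7),        # no difference between timepoints
  pathway_c = c(0, 0, 0, 0, 0, 0)         # all zeros, like the NaN rows above
)

responses <- setdiff(names(toy), "Timepoint")

# Drop responses with zero variance; their F statistic would be 0/0 = NaN.
has_variance <- vapply(toy[responses], function(x) var(x) > 0, logical(1))
kept <- responses[has_variance]

# Fit one multi-response model, then pull the Timepoint p-value per response
# from the list of univariate tables returned by summary.aov().
fit <- manova(as.matrix(toy[kept]) ~ Timepoint, data = toy)
tables <- summary.aov(fit)
p_values <- vapply(tables, function(tab) tab[["Pr(>F)"]][1], numeric(1))
names(p_values) <- kept

# Assumed cutoff of 0.05 for "significant".
significant <- names(p_values)[p_values < 0.05]
print(round(p_values, 4))
print(significant)
```

The resulting vector of names could then be passed to the follow-up ANOVA step instead of being copied by hand.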
Run a one-way ANOVA, followed by Tukey's HSD post hoc test, for each category of interest (those significant in the MANOVA output above).
# Carbon Metabolism
tuss_tIV_aov1<-aov(X01200.Carbon.metabolism..PATH.ko01200.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov1)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.715e+10 8.576e+09 17.66 0.0219 *
Residuals 3 1.457e+09 4.856e+08
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov1)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X01200.Carbon.metabolism..PATH.ko01200. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -111816 -203904.37 -19727.63 0.0298795
T4-T0 -114963 -207051.37 -22874.63 0.0277099
T4-T24 -3147 -95235.37 88941.37 0.9888620
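As a check on how these numbers relate, the F statistic, its p-value, and the Tukey interval for Carbon metabolism can be recomputed from the printed mean squares. This is a sketch under one stated assumption: two replicates per timepoint, which is inferred from the residual degrees of freedom (6 samples minus 3 groups) rather than stated in the output. Inputs are the rounded values as printed, so results agree only to rounding.

```r
# Values copied from the Carbon metabolism ANOVA table (rounded as printed).
ms_timepoint <- 8.576e+09
ms_residual  <- 4.856e+08
df_timepoint <- 2
df_residual  <- 3
n_per_group  <- 2   # assumption: 6 samples / 3 timepoints, balanced

# F statistic and its upper-tail p-value.
f_value <- ms_timepoint / ms_residual
p_value <- pf(f_value, df_timepoint, df_residual, lower.tail = FALSE)

# Tukey HSD half-width for a balanced design:
# studentized-range quantile (3 group means, residual df) * sqrt(MSE / n).
q_crit <- qtukey(0.95, nmeans = 3, df = df_residual)
half_width <- q_crit * sqrt(ms_residual / n_per_group)

# Interval around the T24-T0 difference reported above as -111816.
diff_T24_T0 <- -111816
ci <- c(lwr = diff_T24_T0 - half_width, upr = diff_T24_T0 + half_width)

print(c(F = f_value, p = p_value))
print(ci)
```

With only two replicates per timepoint the residual variance is estimated from 3 degrees of freedom, which is why the intervals are wide relative to the differences.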
# Ribosome
tuss_tIV_aov2<-aov(X03010.Ribosome..PATH.ko03010.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov2)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.325e+10 6.625e+09 17.08 0.0229 *
Residuals 3 1.164e+09 3.879e+08
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov2)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X03010.Ribosome..PATH.ko03010. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 113712.0 31407.572 196016.43 0.0209743
T4-T0 72348.5 -9955.928 154652.93 0.0693890
T4-T24 -41363.5 -123667.928 40940.93 0.2369013
# RNA Degradation
tuss_tIV_aov3<-aov(X03018.RNA.degradation..PATH.ko03018.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov3)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2.458e+10 1.229e+10 116.5 0.00143 **
Residuals 3 3.164e+08 1.055e+08
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov3)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X03018.RNA.degradation..PATH.ko03018. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 114012 71097.852 156926.15 0.0032286
T4-T0 150206 107291.852 193120.15 0.0014592
T4-T24 36194 -6720.148 79108.15 0.0768641
# Fatty Acid Metabolism
tuss_tIV_aov4<-aov(X01212.Fatty.acid.metabolism..PATH.ko01212.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov4)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.400e+09 700208653 166.6 0.000843 ***
Residuals 3 1.261e+07 4202636
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov4)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X01212.Fatty.acid.metabolism..PATH.ko01212. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -32382 -40948.603 -23815.397 0.0011771
T4-T0 -32435 -41001.603 -23868.397 0.0011718
T4-T24 -53 -8619.603 8513.603 0.9996316
# Butanoate Metabolism
tuss_tIV_aov5<-aov(X00650.Butanoate.metabolism..PATH.ko00650.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov5)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 916283691 458141846 11.48 0.0393 *
Residuals 3 119745259 39915086
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov5)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X00650.Butanoate.metabolism..PATH.ko00650. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -26410.5 -52811.25 -9.74956 0.0499522
T4-T0 -26014.5 -52415.25 386.25044 0.0519416
T4-T24 396.0 -26004.75 26796.75044 0.9978379
# Glycolysis Gluconeogenesis
tuss_tIV_aov6<-aov(X00010.Glycolysis...Gluconeogenesis..PATH.ko00010.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov6)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 646115961 323057980 29.34 0.0107 *
Residuals 3 33029853 11009951
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov6)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X00010.Glycolysis...Gluconeogenesis..PATH.ko00010. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -21193.5 -35059.16 -7327.844 0.0158228
T4-T0 -22750.5 -36616.16 -8884.844 0.0129554
T4-T24 -1557.0 -15422.66 12308.656 0.8898939
# Caulobacter Cell Cycle
tuss_tIV_aov7<-aov(X04112.Cell.cycle...Caulobacter..PATH.ko04112.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov7)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2.484e+09 1.242e+09 41.97 0.00641 **
Residuals 3 8.879e+07 2.960e+07
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov7)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X04112.Cell.cycle...Caulobacter..PATH.ko04112. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 16215.0 -6518.418 38948.42 0.1144336
T4-T0 48924.5 26191.082 71657.92 0.0059536
T4-T24 32709.5 9976.082 55442.92 0.0187381
# Nicotinate and Nicotinamide Metabolism
tuss_tIV_aov8<-aov(X00760.Nicotinate.and.nicotinamide.metabolism..PATH.ko00760.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov8)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 140469145 70234573 14.45 0.0288 *
Residuals 3 14584946 4861649
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov8)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X00760.Nicotinate.and.nicotinamide.metabolism..PATH.ko00760. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -10082 -19295.816 -868.1844 0.0394901
T4-T0 -10437 -19650.816 -1223.1844 0.0360172
T4-T24 -355 -9568.816 8858.8156 0.9858779
# Degradation of Aromatic Compounds
tuss_tIV_aov9<-aov(X01220.Degradation.of.aromatic.compounds..PATH.ko01220.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov9)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 92377264 46188632 27.22 0.0119 *
Residuals 3 5089767 1696589
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov9)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X01220.Degradation.of.aromatic.compounds..PATH.ko01220. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -8712.0 -14154.971 -3269.029 0.0138964
T4-T0 -7871.5 -13314.471 -2428.529 0.0184739
T4-T24 840.5 -4602.471 6283.471 0.8078666
# Starch and Sucrose Metabolism
tuss_tIV_aov10<-aov(X00500.Starch.and.sucrose.metabolism..PATH.ko00500.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov10)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 28026154 14013077 37.8 0.00746 **
Residuals 3 1112040 370680
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov10)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X00500.Starch.and.sucrose.metabolism..PATH.ko00500. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -4423.5 -6967.678 -1879.322 0.0109892
T4-T0 -4730.5 -7274.678 -2186.322 0.0090720
T4-T24 -307.0 -2851.178 2237.178 0.8746709
# Porphyrin and Chlorophyll Metabolism
tuss_tIV_aov11<-aov(X00860.Porphyrin.and.chlorophyll.metabolism..PATH.ko00860.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov11)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 8406391 4203196 12.15 0.0364 *
Residuals 3 1037565 345855
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov11)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X00860.Porphyrin.and.chlorophyll.metabolism..PATH.ko00860. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -1869.5 -4327.007 588.007 0.0985258
T4-T0 -2854.0 -5311.507 -396.493 0.0336923
T4-T24 -984.5 -3442.007 1473.007 0.3469692
# Pyruvate Metabolism
tuss_tIV_aov12<-aov(X00620.Pyruvate.metabolism..PATH.ko00620.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov12)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 35362292 17681146 49.72 0.00501 **
Residuals 3 1066810 355603
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov12)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X00620.Pyruvate.metabolism..PATH.ko00620. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -5278.0 -7769.901 -2786.099 0.0062345
T4-T0 -5011.5 -7503.401 -2519.599 0.0072404
T4-T24 266.5 -2225.401 2758.401 0.8992519
# Nitrotoluene Degradation
tuss_tIV_aov13<-aov(X00633.Nitrotoluene.degradation..PATH.ko00633.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov13)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 18660459 9330230 50.54 0.00489 **
Residuals 3 553800 184600
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov13)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X00633.Nitrotoluene.degradation..PATH.ko00633. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -3784.5 -5579.911 -1989.089 0.0063220
T4-T0 -3696.0 -5491.411 -1900.589 0.0067693
T4-T24 88.5 -1706.911 1883.911 0.9770593
# Thiamine Metabolism
tuss_tIV_aov14<-aov(X00730.Thiamine.metabolism..PATH.ko00730.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov14)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 5096584 2548292 15.98 0.0251 *
Residuals 3 478274 159425
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov14)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X00730.Thiamine.metabolism..PATH.ko00730. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 399.5 -1268.99812 2067.998 0.6256537
T4-T0 2124.0 455.50188 3792.498 0.0262734
T4-T24 1724.5 56.00188 3392.998 0.0458849
# Pantothenate and CoA Biosynthesis
tuss_tIV_aov15<-aov(X00770.Pantothenate.and.CoA.biosynthesis..PATH.ko00770.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov15)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 4813879 2406940 27.17 0.012 *
Residuals 3 265765 88588
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov15)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X00770.Pantothenate.and.CoA.biosynthesis..PATH.ko00770. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -2018.5 -3262.2583 -774.7417 0.0133633
T4-T0 -1754.0 -2997.7583 -510.2417 0.0198140
T4-T24 264.5 -979.2583 1508.2583 0.6827983
# Glycerolipid Metabolism
tuss_tIV_aov16<-aov(X00561.Glycerolipid.metabolism..PATH.ko00561.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov16)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 6715964 3357982 33.39 0.00891 **
Residuals 3 301717 100572
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov16)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X00561.Glycerolipid.metabolism..PATH.ko00561. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -2234.0 -3559.217 -908.783 0.0119989
T4-T0 -2254.5 -3579.717 -929.283 0.0116915
T4-T24 -20.5 -1345.717 1304.717 0.9977007
# Arginine and Proline Metabolism
tuss_tIV_aov17<-aov(X00330.Arginine.and.proline.metabolism..PATH.ko00330.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov17)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3117270 1558635 11.93 0.0373 *
Residuals 3 391951 130650
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov17)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X00330.Arginine.and.proline.metabolism..PATH.ko00330. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -1695.0 -3205.438 -184.5617 0.0369289
T4-T0 -1275.5 -2785.938 234.9383 0.0766292
T4-T24 419.5 -1090.938 1929.9383 0.5478638
# Sulfur Relay System
tuss_tIV_aov19<-aov(X04122.Sulfur.relay.system..PATH.ko04122.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov19)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3894102 1947051 53.01 0.00456 **
Residuals 3 110186 36729
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov19)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X04122.Sulfur.relay.system..PATH.ko04122. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 443.5 -357.3502 1244.35 0.1967214
T4-T0 1887.0 1086.1498 2687.85 0.0045776
T4-T24 1443.5 642.6498 2244.35 0.0099157
# Histidine Metabolism
tuss_tIV_aov20<-aov(X00340.Histidine.metabolism..PATH.ko00340.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov20)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 398749 199375 11.11 0.041 *
Residuals 3 53825 17942
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov20)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X00340.Histidine.metabolism..PATH.ko00340. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -631.0 -1190.7307 -71.2693 0.0364819
T4-T0 -336.5 -896.2307 223.2307 0.1664416
T4-T24 294.5 -265.2307 854.2307 0.2173393
# Steroid Hormone Biosynthesis
tuss_tIV_aov23<-aov(X00140.Steroid.hormone.biosynthesis..PATH.ko00140.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov23)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 305292 152646 78.14 0.00258 **
Residuals 3 5861 1954
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov23)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X00140.Steroid.hormone.biosynthesis..PATH.ko00140. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -409.5 -594.1947 -224.8053 0.0054615
T4-T0 -526.0 -710.6947 -341.3053 0.0026379
T4-T24 -116.5 -301.1947 68.1947 0.1503351
# Sesquiterpenoid and Triterpenoid Biosynthesis
tuss_tIV_aov24<-aov(X00909.Sesquiterpenoid.and.triterpenoid.biosynthesis..PATH.ko00909.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov24)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 446971 223485 18.67 0.0203 *
Residuals 3 35917 11972
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov24)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X00909.Sesquiterpenoid.and.triterpenoid.biosynthesis..PATH.ko00909. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 264.5 -192.73255 721.7325 0.1802036
T4-T0 664.0 206.76745 1121.2325 0.0182605
T4-T24 399.5 -57.73255 856.7325 0.0704410
# Benzoate Degradation
tuss_tIV_aov25<-aov(X00362.Benzoate.degradation..PATH.ko00362.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov25)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 165859 82930 10.7 0.0431 *
Residuals 3 23257 7752
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov25)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X00362.Benzoate.degradation..PATH.ko00362. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -384.5 -752.425 -16.57501 0.0445771
T4-T0 -308.5 -676.425 59.42499 0.0779744
T4-T24 76.0 -291.925 443.92499 0.6959900
# Chloroalkane and Chloroalkene Degradation
tuss_tIV_aov26<-aov(X00625.Chloroalkane.and.chloroalkene.degradation..PATH.ko00625.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov26)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 163596 81798 80.46 0.00248 **
Residuals 3 3050 1017
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov26)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X00625.Chloroalkane.and.chloroalkene.degradation..PATH.ko00625. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -336 -469.2407 -202.7593 0.0037577
T4-T0 -363 -496.2407 -229.7593 0.0030009
T4-T24 -27 -160.2407 106.2407 0.7044936
# Biotin Metabolism
tuss_tIV_aov27<-aov(X00780.Biotin.metabolism..PATH.ko00780.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov27)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 52292 26146 23.2 0.015 *
Residuals 3 3381 1127
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov27)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X00780.Biotin.metabolism..PATH.ko00780. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -174.0 -314.2845 -33.7155 0.0282028
T4-T0 -215.5 -355.7845 -75.2155 0.0156017
T4-T24 -41.5 -181.7845 98.7845 0.5133303
# Lysine Biosynthesis
tuss_tIV_aov28<-aov(X00300.Lysine.biosynthesis..PATH.ko00300.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov28)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 41825 20913 28.49 0.0112 *
Residuals 3 2202 734
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov28)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X00300.Lysine.biosynthesis..PATH.ko00300. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -123 -236.2129 -9.787139 0.0402429
T4-T0 -203 -316.2129 -89.787139 0.0100645
T4-T24 -80 -193.2129 33.212861 0.1169023
# MAPK Signaling Pathway - Plant
tuss_tIV_aov29<-aov(X04016.MAPK.signaling.pathway...plant..PATH.ko04016.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov29)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 44417 22209 10.9 0.042 *
Residuals 3 6110 2037
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov29)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X04016.MAPK.signaling.pathway...plant..PATH.ko04016. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -209 -397.5775 -20.42247 0.0381737
T4-T0 -128 -316.5775 60.57753 0.1280147
T4-T24 81 -107.5775 269.57753 0.3110593
# Arginine Biosynthesis
tuss_tIV_aov30<-aov(X00220.Arginine.biosynthesis..PATH.ko00220.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov30)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 23081 11541 13.71 0.031 *
Residuals 3 2525 842
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov30)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X00220.Arginine.biosynthesis..PATH.ko00220. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -129 -250.2442 -7.755849 0.0425255
T4-T0 -134 -255.2442 -12.755849 0.0384578
T4-T24 -5 -126.2442 116.244151 0.9838494
# Staurosporine Biosynthesis
tuss_tIV_aov32<-aov(X00404.Staurosporine.biosynthesis..PATH.ko00404.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov32)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 357.3 178.67 16 0.0251 *
Residuals 3 33.5 11.17
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov32)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X00404.Staurosporine.biosynthesis..PATH.ko00404. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -18 -31.963989 -4.03601054 0.0253892
T4-T0 -14 -27.963989 -0.03601054 0.0496675
T4-T24 4 -9.963989 17.96398946 0.5310290
# Ferroptosis
tuss_tIV_aov33<-aov(X04216.Ferroptosis..PATH.ko04216.~Timepoint,data=tuss_tIV_stats)
summary.aov(tuss_tIV_aov33)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 273 136.50 24.09 0.0142 *
Residuals 3 17 5.67
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_tIV_aov33)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = X04216.Ferroptosis..PATH.ko04216. ~ Timepoint, data = tuss_tIV_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -1.5 -11.447446 8.447446 0.8153751
T4-T0 13.5 3.552554 23.447446 0.0220370
T4-T24 15.0 5.052554 24.947446 0.0164358
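The pathway-by-pathway blocks above repeat the same aov() and TukeyHSD() pattern for each response column. As a minimal sketch (not the notebook's own code), the same pattern can be written as a loop. The data frame `toy_stats` and its column names `pathway_A` and `pathway_B` are made up for illustration and are not values from this study.

```r
# Toy data (hypothetical): 2 replicates x 3 timepoints, two response columns
toy_stats <- data.frame(
  Timepoint = factor(rep(c("T0", "T4", "T24"), each = 2)),
  pathway_A = c(100, 104, 60, 62, 58, 61),
  pathway_B = c(10, 12, 11, 9, 10, 12)
)

# Every column except the grouping factor is treated as a response
responses <- setdiff(names(toy_stats), "Timepoint")

# Fit one one-way ANOVA per response and keep its Tukey HSD comparison table
tukey_results <- lapply(responses, function(resp) {
  f <- reformulate("Timepoint", response = resp)  # e.g. pathway_A ~ Timepoint
  fit <- aov(f, data = toy_stats)
  TukeyHSD(fit)$Timepoint
})
names(tukey_results) <- responses

# Matrix with columns diff, lwr, upr and p adj, one row per pairwise comparison
tukey_results$pathway_A
```

Each list element has the same layout as the `$Timepoint` tables printed above, so individual comparisons can be looked up by row name (for example `"T4-T0"`).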
Plot the significant Tuss-MT KEGG tier IV categories by timepoint as a single barchart.
tuss_tIV_sub_bardata$Sample <- factor(tuss_tIV_sub_bardata$Sample, levels=c("Tuss-MT-T24","Tuss-MT-T4","Tuss-MT-T0"))
tuss_tIV_sub_bardata$Tier.IV <- factor (tuss_tIV_sub_bardata$Tier.IV, levels=c("Cell cycle - Caulobacter (K04112)","RNA degradation (K03018)","Ribosome (K03010)","Degradation of aromatic compounds (K01220)","Fatty acid metabolism (K01212)","Carbon metabolism (K01200)","Nicotinate and nicotinamide metabolism (K00760)","Butanoate metabolism (K006650)","Glycolysis-Gluconeogenesis (K00010)"))
tuss_tIV_sub_MT_barplot<-ggplot(tuss_tIV_sub_bardata, aes(x = Tier.IV, y=Mean, fill=Sample)) + geom_bar(stat = "identity", position = "dodge", color = "black") + geom_errorbar(aes(ymin = Mean - SD, ymax = Mean +SD), width = 0.2, position = position_dodge(0.9)) + coord_flip() + scale_y_continuous(expand = c(0, 0), limits = c(0, 30), position="bottom") + ylab ("Relative Abundance") + theme_classic() + theme(axis.title.y=element_blank(), axis.text.y=element_blank(), plot.margin = margin(10, 20, 10, 10), legend.position = "bottom", legend.title=element_blank(), legend.text=element_text(size=8)) + scale_fill_manual(values=c("darkgreen","green3","palegreen"), guide=guide_legend(reverse=TRUE))
### Saving as .eps file with R Viewer
# Re-size the viewer to the following dimensions before saving as .eps file
# In Console, type "tuss_tIV_sub_MT_barplot"
# In Plot Viewer, click "Export" --> "Save as Image"
# Width: 600 Height: 600
# Click "Update Preview" --> save as "EPS" file type --> rename as Fig.4.Bar
tuss_tIV_sub_MT_barplot
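The export steps above rely on the Plot Viewer. As a hedged alternative, an EPS file can also be written from code with the base `postscript()` device. In this sketch the output path, the 6 x 6 inch size, and the stand-in `barplot()` call are all assumptions for illustration; a ggplot object such as the one above would be drawn with `print()` in place of the `barplot()` line.

```r
# Hypothetical output path (the notebook itself exports through the Plot Viewer)
eps_file <- file.path(tempdir(), "example_barplot.eps")

setEPS()                                     # switch postscript() to EPS-friendly defaults
postscript(eps_file, width = 6, height = 6)  # size in inches (assumed, not from the notebook)
barplot(c(T0 = 3, T4 = 5, T24 = 4))          # stand-in plot; use print(<ggplot object>) for ggplot2
dev.off()                                    # close the device so the file is written

file.exists(eps_file)
```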
Plot the combined Tuss-MG and WS-MG KEGG tier III categories as a single barchart.
Heatmap for Tier IV categories by sampling timepoints
# Create copy of "tuss_tIV_sum" object to prep for heatmap
tuss_tIV_sub_heatmap<-tuss_tIV_sum
# Keep the category column and the three timepoint mean columns (columns 1, 8, 9, 10)
tuss_tIV_sub_heatmap <- tuss_tIV_sub_heatmap[,c(1,8,9,10)] #This code uses the means of timepoint
# Keep only those categories with significant differences (from ANOVA)
tuss_tIV_sub_heatmap <- subset(tuss_tIV_sub_heatmap, Tier_IV=="00010 Glycolysis / Gluconeogenesis [PATH:ko00010]" | Tier_IV=="00650 Butanoate metabolism [PATH:ko00650]" | Tier_IV=="00760 Nicotinate and nicotinamide metabolism [PATH:ko00760]" | Tier_IV=="01200 Carbon metabolism [PATH:ko01200]" | Tier_IV=="01212 Fatty acid metabolism [PATH:ko01212]" | Tier_IV=="01220 Degradation of aromatic compounds [PATH:ko01220]" | Tier_IV=="03018 RNA degradation [PATH:ko03018]" | Tier_IV=="03010 Ribosome [PATH:ko03010]" | Tier_IV=="04112 Cell cycle - Caulobacter [PATH:ko04112]", select=Tier_IV:meantuss_T24)
colnames(tuss_tIV_sub_heatmap)<-c("Tier_IV","Tuss-MT-T0","Tuss-MT-T4","Tuss-MT-T24")
# Create a vector with categories in the desired order
tuss_tIV_sub_x <- c("00010 Glycolysis / Gluconeogenesis [PATH:ko00010]",
"00650 Butanoate metabolism [PATH:ko00650]",
"00760 Nicotinate and nicotinamide metabolism [PATH:ko00760]",
"01200 Carbon metabolism [PATH:ko01200]",
"01212 Fatty acid metabolism [PATH:ko01212]",
"01220 Degradation of aromatic compounds [PATH:ko01220]",
"03010 Ribosome [PATH:ko03010]",
"03018 RNA degradation [PATH:ko03018]",
"04112 Cell cycle - Caulobacter [PATH:ko04112]")
# Re-sort the data in the desired order
tuss_tIV_sub_heatmap<-tuss_tIV_sub_heatmap %>%
slice(match(tuss_tIV_sub_x, Tier_IV))
# Convert the first column (Tier categories) into rownames
rownames(tuss_tIV_sub_heatmap) <- tuss_tIV_sub_heatmap$Tier_IV
tuss_tIV_sub_heatmap<-as.data.frame(tuss_tIV_sub_heatmap[-1])
# Rename columns to sample IDs
colnames(tuss_tIV_sub_heatmap)<-c("Tuss-MT-T0","Tuss-MT-T4","Tuss-MT-T24")
# Convert dataframe into a matrix for heatmap
tuss_tIV_sub_heatmap<-as.matrix(tuss_tIV_sub_heatmap)
# Scale matrix values to generate Z-scores
tuss_tIV_sub_heatmap<-scale(t(tuss_tIV_sub_heatmap))
tuss_tIV_sub_heatmap<-t(tuss_tIV_sub_heatmap)
# Specify RColorBrewer custom color palette
col <- colorRampPalette(brewer.pal(10, "RdYlBu"))(256)
### Saving as .eps file with R Viewer
# Re-size the viewer to the following dimensions before saving as .eps file
# In Console, type "tuss_tIV_sub_pheatmap"
# In Plot Viewer, click "Export" --> "Save as Image"
# Width: 600 Height: 600
# Click "Update Preview" --> save as "EPS" file type --> rename as Fig.2.2.Tuss.MT.Heatmap
# Pheatmap
tuss_tIV_sub_pheatmap<-pheatmap(tuss_tIV_sub_heatmap, treeheight_row = 0, treeheight_col = 0,cluster_rows = FALSE, cluster_cols = FALSE)
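The heatmap values are Z-scores computed per category (row) across the three timepoints. Because `scale()` standardizes columns, the matrix is transposed, scaled, and transposed back. The small sketch below illustrates that step on a made-up matrix; the row names and values are hypothetical.

```r
# Toy matrix (hypothetical values): rows are categories, columns are timepoints
toy <- matrix(c(10, 20, 30,
                 5,  5, 20),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("pathway_A", "pathway_B"),
                              c("T0", "T4", "T24")))

# scale() works column-wise, so transpose to standardize each row,
# then transpose back to the original orientation
toy_z <- t(scale(t(toy)))

rowMeans(toy_z)      # approximately 0 for every row
apply(toy_z, 1, sd)  # 1 for every row
```

Each row is therefore centered on its own mean, so colors in the heatmap compare timepoints within a category and not absolute abundance between categories.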
Compare gene expression patterns at multiple KEGG Tier levels within the Wet Sedge tundra between sampling time points.
Calculate TPM-normalized gene count sums for each WS sample in each Tier II category.
# Calculate TPM-normalized gene count sums for each WS sample in each Tier II category
ws_tII_sum<-tpm_all_exp_ann %>% group_by(Tier_II) %>% summarise_at(vars("S108379","S108381","S108385","S108386","S108388","S108390"), sum)
ws_tII_sum<-ws_tII_sum %>% mutate_at(vars(S108379,S108381,S108385,S108386,S108388,S108390), funs(round(., 0)))
ws_tII_sum<-as.data.frame(ws_tII_sum)
ws_tII_sum<-ws_tII_sum[order(-ws_tII_sum$S108379),]
| | Tier_II | S108379 | S108381 | S108385 | S108386 | S108388 | S108390 |
|---|---|---|---|---|---|---|---|
| 4 | Metabolism | 608,540 | 617,976 | 598,817 | 612,340 | 581,565 | 519,432 |
| 3 | Genetic Information Processing | 191,747 | 201,174 | 207,020 | 217,116 | 234,300 | 298,547 |
| 2 | Environmental Information Processing | 154,282 | 134,776 | 147,488 | 132,095 | 138,677 | 128,939 |
| 1 | Cellular Processes | 45,430 | 46,073 | 46,675 | 38,449 | 45,457 | 53,082 |
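The summing step above adds the TPM values of all genes that share a Tier II label, separately for each sample. A base-R sketch of the same idea with `rowsum()` is shown below. This is a swapped-in illustration and not the dplyr code used in the notebook, and the sample names `S1` and `S2` and all values are made up.

```r
# Toy annotated TPM table (hypothetical): one row per gene
toy_tpm <- data.frame(
  Tier_II = c("Metabolism", "Metabolism", "Cellular Processes", "Metabolism"),
  S1 = c(10.4, 5.2, 1.1, 0.3),
  S2 = c(8.0, 6.5, 2.4, 0.6)
)

# Sum each sample column within each Tier II category
toy_sum <- rowsum(toy_tpm[, c("S1", "S2")], group = toy_tpm$Tier_II)

# Round to whole numbers and sort by the first sample, largest first
toy_sum <- round(toy_sum, 0)
toy_sum <- toy_sum[order(-toy_sum$S1), ]
toy_sum
```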
Heatmap for Tier II categories by sampling timepoints
# Create copy of "ws_tII_sum" object to prep for heatmap
ws_tII_heatmap<-ws_tII_sum
# Convert the first column (Tier categories) into rownames
rownames(ws_tII_heatmap) <- ws_tII_heatmap$Tier_II
ws_tII_heatmap<-as.data.frame(ws_tII_heatmap[-1])
# Rename columns to sample IDs
colnames(ws_tII_heatmap)<-c("WS1-T0","WS3-T0","WS1-T4","WS2-T4","WS1-T24","WS3-T24")
# Remove rows for "Organismal Systems" and "Human Diseases"
#ws_tII_heatmap <- ws_tII_heatmap[-c(5,6),]
# Convert dataframe into a matrix for heatmap
ws_tII_heatmap<-as.matrix(ws_tII_heatmap)
# Scale matrix values to generate Z-scores
ws_tII_heatmap<-scale(t(ws_tII_heatmap))
ws_tII_heatmap<-t(ws_tII_heatmap)
# Specify RColorBrewer custom color palette
col <- colorRampPalette(brewer.pal(10, "RdYlBu"))(256)
# Pheatmap
pheatmap(ws_tII_heatmap, treeheight_row = 0, treeheight_col = 0,cluster_rows = FALSE, cluster_cols = FALSE)
Calculate mean values from the replicate sums.
# Calculate mean values from the replicate sums
ws_tII_sum$meanWS_T0<-apply(ws_tII_sum[,2:3], 1, mean)
ws_tII_sum$meanWS_T4<-apply(ws_tII_sum[,4:5], 1, mean)
ws_tII_sum$meanWS_T24<-apply(ws_tII_sum[,6:7], 1, mean)
ws_tII_sum<-ws_tII_sum %>% mutate_at(vars(meanWS_T0,meanWS_T4,meanWS_T24), funs(round(., 0)))
# Calculate sd values from the replicate sums
ws_tII_sum$sdWS_T0<-apply(ws_tII_sum[,2:3], 1, sd)
ws_tII_sum$sdWS_T4<-apply(ws_tII_sum[,4:5], 1, sd)
ws_tII_sum$sdWS_T24<-apply(ws_tII_sum[,6:7], 1, sd)
ws_tII_sum<-ws_tII_sum %>% mutate_at(vars(sdWS_T0,sdWS_T4,sdWS_T24), funs(round(., 0)))
# Subset Mean values
ws_tII_mean<-subset(ws_tII_sum, select=c(Tier_II,meanWS_T0,meanWS_T4,meanWS_T24))
names(ws_tII_mean)<-c("Tier II", "WS - T0", "WS - T4", "WS - T24")
# Subset SD values
ws_tII_sd<-subset(ws_tII_sum, select=c(Tier_II,sdWS_T0,sdWS_T4,sdWS_T24))
names(ws_tII_sd)<-c("Tier II", "WS - T0", "WS - T4", "WS - T24")
# Remember "Tier II" as non-numeric values
ws_tII_mean_TierII<-ws_tII_mean$`Tier II`
ws_tII_sd_TierII<-ws_tII_sd$`Tier II`
# Transpose all but first column (Tier II)
ws_tII_mean<-as.data.frame(t(ws_tII_mean[,-1]))
colnames(ws_tII_mean)<-ws_tII_mean_TierII
ws_tII_sd<-as.data.frame(t(ws_tII_sd[,-1]))
colnames(ws_tII_sd)<-ws_tII_sd_TierII
# Combine mean and sd into single column with ± divider
ws_tII_table<-as.data.frame(do.call(cbind, lapply(1:ncol(ws_tII_mean), function(i) paste0(ws_tII_mean[ , i], " ± ", ws_tII_sd[ , i]))))
# Transpose the table back
ws_tII_table<-t(ws_tII_table)
# Rename columns to timepoints and rows to Tier II categories
colnames(ws_tII_table)<-c("Wet Sedge (T0)", "Wet Sedge (T4)", "Wet Sedge (T24)")
rownames(ws_tII_table)<-ws_tII_mean_TierII
kable(ws_tII_table, caption = "Wet Sedge Tier II KEGG category averages (± standard deviation) by sample.", format.args = list(big.mark=","), align = "l") %>% kable_styling(bootstrap_options = c("striped","hover","condensed"))
| | Wet Sedge (T0) | Wet Sedge (T4) | Wet Sedge (T24) |
|---|---|---|---|
| Metabolism | 613258 ± 6672 | 605578 ± 9562 | 550498 ± 43935 |
| Genetic Information Processing | 196460 ± 6666 | 212068 ± 7139 | 266424 ± 45429 |
| Environmental Information Processing | 144529 ± 13793 | 139792 ± 10884 | 133808 ± 6886 |
| Cellular Processes | 45752 ± 455 | 42562 ± 5817 | 49270 ± 5392 |
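The table above pairs each timepoint mean with its standard deviation across the two replicates. A compact sketch of that calculation on made-up numbers follows; the column names (`T0_rep1`, etc.) and the `reps` list are hypothetical helpers, not objects from the notebook.

```r
# Toy replicate sums (hypothetical values)
toy_sum <- data.frame(
  Tier_II = c("Metabolism", "Cellular Processes"),
  T0_rep1 = c(600, 40), T0_rep2 = c(620, 50),
  T4_rep1 = c(580, 45), T4_rep2 = c(590, 45)
)

# Replicate columns belonging to each timepoint
reps <- list("T0" = c("T0_rep1", "T0_rep2"),
             "T4" = c("T4_rep1", "T4_rep2"))

# For each timepoint, compute the row-wise mean and SD and paste them together
toy_table <- sapply(reps, function(cols) {
  m <- round(rowMeans(toy_sum[, cols]), 0)
  s <- round(apply(toy_sum[, cols], 1, sd), 0)
  paste0(m, " ± ", s)
})
rownames(toy_table) <- toy_sum$Tier_II
toy_table
```

With only two replicates per timepoint, the standard deviation is based on a single degree of freedom, which is worth keeping in mind when reading the ± values.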
Run statistical tests to determine if significant differences exist between sampling timepoints for each KEGG Tier II category. Click on the Show/Hide button to see the statistics.
# Subset the "sum" values for each sample
ws_tII_stats<-subset(ws_tII_sum, select=c(Tier_II,S108379,S108381,S108385,S108386,S108388,S108390))
rownames(ws_tII_stats) <- ws_tII_stats$Tier_II
ws_tII_stats<-as.data.frame(t(ws_tII_stats[-1]))
# Add the timepoint column to the sum table
ws_tII_stats<-data.frame(Timepoint,ws_tII_stats)
# Subset response variables for MANOVA
ws_tII_stats$response <- as.matrix(ws_tII_stats[, 2:5])
# Fit the multivariate model; summary.aov() then reports a univariate ANOVA for each response
ws_tII_manova <- manova(response ~ Timepoint, data=ws_tII_stats)
summary.aov(ws_tII_manova)
Response Metabolism. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 4687690640 2343845320 3.4031 0.1692
Residuals 3 2066209657 688736552
Response Genetic.Information.Processing. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 5395290537 2697645268 3.7481 0.1528
Residuals 3 2159237277 719745759
Response Environmental.Information.Processing. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 115457346 57728673 0.4863 0.6562
Residuals 3 356128565 118709522
Response Cellular.Processes. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 45026527 22513264 1.0702 0.4459
Residuals 3 63110575 21036858
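Note that `summary.aov()` applied to a `manova()` fit prints one univariate ANOVA table per response column, which is what appears above. The sketch below checks this on made-up data by comparing one of those tables with a stand-alone `aov()` fit; the data frame `toy` and its columns are hypothetical.

```r
# Toy data (hypothetical): 2 replicates x 3 timepoints, two responses
toy <- data.frame(
  Timepoint = factor(rep(c("T0", "T4", "T24"), each = 2)),
  resp_A = c(100, 104, 60, 62, 58, 61),
  resp_B = c(10, 12, 11, 9, 10, 12)
)
toy$response <- as.matrix(toy[, c("resp_A", "resp_B")])

fit_multi <- manova(response ~ Timepoint, data = toy)
per_response <- summary.aov(fit_multi)  # list with one ANOVA table per response

# The same F test from a stand-alone one-way ANOVA on resp_A
fit_single <- summary(aov(resp_A ~ Timepoint, data = toy))[[1]]

p_multi  <- per_response[[1]][["Pr(>F)"]][1]
p_single <- fit_single[["Pr(>F)"]][1]
all.equal(p_multi, p_single)
```

A multivariate test statistic (for example Pillai's trace) would instead come from `summary()` on the `manova()` object, which is a different output from the per-response tables shown in this notebook.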
Plot a barchart for Tier II categories by sampling timepoints.
# Place Tier II categories in the preferred order for plotting
ws_tII_bardata$Tier.II <- factor(ws_tII_bardata$Tier.II,levels = c("Metabolism", "Genetic Information Processing", "Environmental Information Processing", "Cellular Processes"))
ws_tII_bardata$Sample <- factor(ws_tII_bardata$Sample,levels=c("WS-T0","WS-T4","WS-T24"))
ws_tII_barplot<-ggplot(ws_tII_bardata, aes(x = Tier.II, y = Mean, fill = Sample)) + geom_bar(stat = "identity", position = "dodge", color="black") + geom_errorbar(aes(ymin = Mean - SD, ymax = Mean + SD), width = 0.2, position = position_dodge(0.9)) + ylab(expression(atop("KEGG Tier II Categories", paste("Transcript Counts (TPM)")))) + theme_classic() + theme(axis.text=element_text(size=10), axis.title=element_text(size=12), axis.title.x=element_blank()) + theme(legend.position = "bottom", legend.title=element_blank(), legend.text=element_text(size=8), panel.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), panel.border = element_blank()) + scale_size(guide=FALSE) + scale_fill_manual(values=c("lightskyblue1", "dodgerblue1", "midnightblue")) + scale_x_discrete(labels = function(Tier.II) str_wrap(Tier.II, width = 10)) + scale_y_continuous(expand = c(0, 0), limits = c(0, 800000)) + annotate(geom="text", x=1.0, y=700000, label="italic(N.S.)", parse=TRUE) + annotate(geom="text", x=2.0, y=400000, label="italic(N.S.)", parse=TRUE) + annotate(geom="text", x=3.0, y=250000, label="italic(N.S.)", parse=TRUE) + annotate(geom="text", x=4.0, y=150000, label="italic(N.S.)", parse=TRUE)
ws_tII_barplot
# Place Tier II categories in the preferred order for plotting
ws_mg_mt_tII_bardata$Tier.II <- factor(ws_mg_mt_tII_bardata$Tier.II,levels = c("Metabolism", "Genetic Information Processing", "Environmental Information Processing", "Cellular Processes"))
# Place Samples in the preferred order for plotting
ws_mg_mt_tII_bardata$Sample <- factor(ws_mg_mt_tII_bardata$Sample,levels=c("WS-MG","WS-MT-T0","WS-MT-T4","WS-MT-T24"))
ws_mg_mt_tII_barplot<-ggplot(ws_mg_mt_tII_bardata, aes(x = Tier.II, y = Mean, fill = Sample)) + geom_bar(stat = "identity", position = "dodge", color="black") + geom_errorbar(aes(ymin = Mean - SD, ymax = Mean + SD), width = 0.2, position = position_dodge(0.9)) + ylab(expression(atop("KEGG Tier II Categories", paste("Gene Counts")))) + theme_classic() + theme(axis.text=element_text(size=10), axis.title=element_text(size=12), axis.title.x=element_blank()) + theme(legend.position = "bottom", legend.title=element_blank(), legend.text=element_text(size=8), panel.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), panel.border = element_blank()) + scale_size(guide=FALSE) + scale_fill_manual(values=c("black", "lightskyblue1", "dodgerblue1", "midnightblue")) + scale_x_discrete(labels = function(Tier.II) str_wrap(Tier.II, width = 10)) + scale_y_continuous(expand = c(0, 0), limits = c(0, 800000)) + annotate(geom="text", x=1.12, y=700000, label="italic(N.S.)", parse=TRUE) + annotate(geom="text", x=2.12, y=400000, label="italic(N.S.)", parse=TRUE) + annotate(geom="text", x=3.12, y=250000, label="italic(N.S.)", parse=TRUE) + annotate(geom="text", x=4.12, y=150000, label="italic(N.S.)", parse=TRUE)
ws_mg_mt_tII_barplot
Calculate TPM-normalized gene count sums for each WS sample in each Tier III category.
# Calculate TPM-normalized gene count sums for each WS sample in each Tier III category
ws_tIII_sum<-tpm_all_exp_ann %>% group_by(Tier_III) %>% summarise_at(vars("S108379","S108381","S108385","S108386","S108388","S108390"), sum)
ws_tIII_sum<-ws_tIII_sum %>% mutate_at(vars(S108379,S108381,S108385,S108386,S108388,S108390), funs(round(., 0)))
ws_tIII_sum<-as.data.frame(ws_tIII_sum)
ws_tIII_sum<-ws_tIII_sum[order(-ws_tIII_sum$S108379),]
# Calculate mean values from the replicate sums
ws_tIII_sum$meanWS_T0<-apply(ws_tIII_sum[,2:3], 1, mean)
ws_tIII_sum$meanWS_T4<-apply(ws_tIII_sum[,4:5], 1, mean)
ws_tIII_sum$meanWS_T24<-apply(ws_tIII_sum[,6:7], 1, mean)
ws_tIII_sum<-ws_tIII_sum %>% mutate_at(vars(meanWS_T0,meanWS_T4,meanWS_T24), funs(round(., 0)))
# Calculate sd values from the replicate sums
ws_tIII_sum$sdWS_T0<-apply(ws_tIII_sum[,2:3], 1, sd)
ws_tIII_sum$sdWS_T4<-apply(ws_tIII_sum[,4:5], 1, sd)
ws_tIII_sum$sdWS_T24<-apply(ws_tIII_sum[,6:7], 1, sd)
ws_tIII_sum<-ws_tIII_sum %>% mutate_at(vars(sdWS_T0,sdWS_T4,sdWS_T24), funs(round(., 0)))
# Write data to .csv file
write.csv(ws_tIII_sum, 'Norm.Results/ws.rna.tIII.mean.sd.csv')
# Subset Mean values
ws_tIII_mean<-subset(ws_tIII_sum, select=c(Tier_III,meanWS_T0,meanWS_T4,meanWS_T24))
names(ws_tIII_mean)<-c("Tier III", "WS - T0", "WS - T4", "WS - T24")
# Subset SD values
ws_tIII_sd<-subset(ws_tIII_sum, select=c(Tier_III,sdWS_T0,sdWS_T4,sdWS_T24))
names(ws_tIII_sd)<-c("Tier III", "WS - T0", "WS - T4", "WS - T24")
# Remember "Tier III" as non-numeric values
ws_tIII_mean_TierIII<-ws_tIII_mean$`Tier III`
ws_tIII_sd_TierIII<-ws_tIII_sd$`Tier III`
# Transpose all but first column (Tier III)
ws_tIII_mean<-as.data.frame(t(ws_tIII_mean[,-1]))
colnames(ws_tIII_mean)<-ws_tIII_mean_TierIII
ws_tIII_sd<-as.data.frame(t(ws_tIII_sd[,-1]))
colnames(ws_tIII_sd)<-ws_tIII_sd_TierIII
# Combine mean and sd into single column with ± divider
ws_tIII_table<-as.data.frame(do.call(cbind, lapply(1:ncol(ws_tIII_mean), function(i) paste0(ws_tIII_mean[ , i], " ± ", ws_tIII_sd[ , i]))))
# Transpose the table back
ws_tIII_table<-t(ws_tIII_table)
# Rename columns to timepoints and rows to Tier III categories
colnames(ws_tIII_table)<-c("Wet Sedge (T0)", "Wet Sedge (T4)", "Wet Sedge (T24)")
rownames(ws_tIII_table)<-ws_tIII_mean_TierIII
Run statistical tests to determine if significant differences exist between sampling timepoints for each KEGG Tier III category. Click on the Show/Hide button to see the statistics.
# Subset the "sum" values for each sample
ws_tIII_stats<-subset(ws_tIII_sum, select=c(Tier_III,S108379,S108381,S108385,S108386,S108388,S108390))
rownames(ws_tIII_stats) <- ws_tIII_stats$Tier_III
ws_tIII_stats<-as.data.frame(t(ws_tIII_stats[-1]))
# Add the timepoint column to the sum table
ws_tIII_stats<-data.frame(Timepoint,ws_tIII_stats)
# Subset response variables for MANOVA
ws_tIII_stats$response <- as.matrix(ws_tIII_stats[, 2:26])
# Fit the multivariate model; summary.aov() then reports a univariate ANOVA for each response
ws_tIII_manova <- manova(response ~ Timepoint, data=ws_tIII_stats)
summary.aov(ws_tIII_manova)
Response Overview. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3.229e+09 1614478885 3.4671 0.166
Residuals 3 1.397e+09 465658621
Response Translation. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3848660653 1924330326 4.1167 0.138
Residuals 3 1402321521 467440507
Response Membrane.transport. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 84881827 42440914 0.7249 0.5536
Residuals 3 175650477 58550159
Response Carbohydrate.metabolism. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 263005410 131502705 15.884 0.02535 *
Residuals 3 24836953 8278984
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Energy.metabolism. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 80135239 40067619 0.3088 0.7552
Residuals 3 389251884 129750628
Response Nucleotide.metabolism. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 744581304 372290652 1.8933 0.2939
Residuals 3 589917974 196639325
Response Folding..sorting.and.degradation. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 199525327 99762663 1.6966 0.3214
Residuals 3 176399661 58799887
Response Metabolism.of.cofactors.and.vitamins. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 17685344 8842672 0.5937 0.6064
Residuals 3 44681521 14893840
Response Signal.transduction. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 74629272 37314636 1.1404 0.4282
Residuals 3 98166071 32722024
Response Lipid.metabolism. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 47142823 23571412 3.2784 0.1759
Residuals 3 21569597 7189866
Response Cellular.community...prokaryotes. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 11370262 5685131 1.032 0.456
Residuals 3 16527253 5509084
Response Amino.acid.metabolism. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2557696 1278848 1.138 0.4288
Residuals 3 3371439 1123813
Response Replication.and.repair. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2159985 1079993 0.3516 0.7292
Residuals 3 9215090 3071697
Response Cell.growth.and.death. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 13096993 6548496 1.9885 0.282
Residuals 3 9879495 3293165
Response Xenobiotics.biodegradation.and.metabolism. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 44485707 22242854 3.9947 0.1426
Residuals 3 16704091 5568030
Response Glycan.biosynthesis.and.metabolism. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2557750 1278875 0.8335 0.5154
Residuals 3 4603221 1534407
Response Metabolism.of.other.amino.acids. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1212988 606494 1.0782 0.4438
Residuals 3 1687578 562526
Response Metabolism.of.terpenoids.and.polyketides. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 314281 157141 0.3303 0.7419
Residuals 3 1427130 475710
Response Transport.and.catabolism. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 11956880 5978440 4.8021 0.1161
Residuals 3 3734878 1244960
Response Transcription. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 347953 173977 0.9065 0.4921
Residuals 3 575736 191912
Response Cell.motility. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 47612 23806 2.1884 0.2593
Residuals 3 32635 10878
Response Biosynthesis.of.other.secondary.metabolites. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 18747 9373.5 1.2568 0.4014
Residuals 3 22374 7458.2
Response Cellular.community...eukaryotes. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2074.3 1037.17 1.9939 0.2813
Residuals 3 1560.5 520.17
Response Signaling.molecules.and.interaction. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1336.3 668.17 26.906 0.01213 *
Residuals 3 74.5 24.83
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Enzyme.families. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.33333 0.16667 1 0.4648
Residuals 3 0.50000 0.16667
Run a one-way ANOVA with Tukey HSD post-hoc comparisons for each category of interest (significant in the MANOVA output).
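With many response columns, the categories of interest can be picked out of the `summary.aov()` result programmatically. The following is a rough sketch on made-up data, assuming the Timepoint term is the first row of each table and using 0.05 as the cutoff; `toy`, `resp_A`, and `resp_B` are hypothetical.

```r
# Toy data (hypothetical): resp_A changes with timepoint, resp_B does not
toy <- data.frame(
  Timepoint = factor(rep(c("T0", "T4", "T24"), each = 2)),
  resp_A = c(100, 104, 60, 62, 58, 61),
  resp_B = c(10, 12, 11, 9, 10, 12)
)
toy$response <- as.matrix(toy[, c("resp_A", "resp_B")])
tables <- summary.aov(manova(response ~ Timepoint, data = toy))

# First row of each table is the Timepoint term; pull its p-value
p_values <- sapply(tables, function(tab) tab[["Pr(>F)"]][1])
names(p_values) <- trimws(sub("Response", "", names(p_values)))

# Responses below the cutoff are candidates for follow-up aov() and TukeyHSD()
significant <- names(p_values)[p_values < 0.05]
significant
```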
# Carbohydrate.metabolism.
ws_tIII_aov_carbs<-aov(Carbohydrate.metabolism.~Timepoint,data=ws_tIII_stats)
TukeyHSD(ws_tIII_aov_carbs)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Carbohydrate.metabolism. ~ Timepoint, data = ws_tIII_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -13645.5 -25669.148 -1621.852 0.0358367
T4-T0 767.0 -11256.648 12790.648 0.9620737
T4-T24 14412.5 2388.852 26436.148 0.0309395
# Signaling Molecules and Interactions
ws_tIII_aov_trans<-aov(Signaling.molecules.and.interaction.~Timepoint,data=ws_tIII_stats)
TukeyHSD(ws_tIII_aov_trans)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Signaling.molecules.and.interaction. ~ Timepoint, data = ws_tIII_stats)
$Timepoint
diff lwr upr p adj
T24-T0 -36.5 -57.324052 -15.6759477 0.0107391
T4-T0 -20.0 -40.824052 0.8240523 0.0554797
T4-T24 16.5 -4.324052 37.3240523 0.0894354
Calculate TPM-normalized gene count sums for each WS sample in each Tier IV category.
# Calculate TPM-normalized gene count sums for each WS sample in each Tier IV category
ws_tIV_sum<-tpm_all_exp_ann %>% group_by(Tier_IV) %>% summarise_at(vars("S108379","S108381","S108385","S108386","S108388","S108390"), sum)
ws_tIV_sum<-ws_tIV_sum %>% mutate_at(vars(S108379,S108381,S108385,S108386,S108388,S108390), funs(round(., 0)))
ws_tIV_sum<-as.data.frame(ws_tIV_sum)
ws_tIV_sum<-ws_tIV_sum[order(-ws_tIV_sum$S108379),]
# Calculate mean values from the replicate sums
ws_tIV_sum$meanWS_T0<-apply(ws_tIV_sum[,2:3], 1, mean)
ws_tIV_sum$meanWS_T4<-apply(ws_tIV_sum[,4:5], 1, mean)
ws_tIV_sum$meanWS_T24<-apply(ws_tIV_sum[,6:7], 1, mean)
ws_tIV_sum<-ws_tIV_sum %>% mutate_at(vars(meanWS_T0,meanWS_T4,meanWS_T24), funs(round(., 0)))
# Calculate sd values from the replicate sums
ws_tIV_sum$sdWS_T0<-apply(ws_tIV_sum[,2:3], 1, sd)
ws_tIV_sum$sdWS_T4<-apply(ws_tIV_sum[,4:5], 1, sd)
ws_tIV_sum$sdWS_T24<-apply(ws_tIV_sum[,6:7], 1, sd)
ws_tIV_sum<-ws_tIV_sum %>% mutate_at(vars(sdWS_T0,sdWS_T4,sdWS_T24), funs(round(., 0)))
# Write data to .csv file
write.csv(ws_tIV_sum, 'Norm.Results/ws.rna.tIV.mean.sd.csv')
# Subset Mean values
ws_tIV_mean<-subset(ws_tIV_sum, select=c(Tier_IV,meanWS_T0,meanWS_T4,meanWS_T24))
names(ws_tIV_mean)<-c("Tier IV", "WS - T0", "WS - T4", "WS - T24")
# Subset SD values
ws_tIV_sd<-subset(ws_tIV_sum, select=c(Tier_IV,sdWS_T0,sdWS_T4,sdWS_T24))
names(ws_tIV_sd)<-c("Tier IV", "WS - T0", "WS - T4", "WS - T24")
# Remember "Tier IV" as non-numeric values
ws_tIV_mean_TierIV<-ws_tIV_mean$`Tier IV`
ws_tIV_sd_TierIV<-ws_tIV_sd$`Tier IV`
# Transpose all but first column (Tier IV)
ws_tIV_mean<-as.data.frame(t(ws_tIV_mean[,-1]))
colnames(ws_tIV_mean)<-ws_tIV_mean_TierIV
ws_tIV_sd<-as.data.frame(t(ws_tIV_sd[,-1]))
colnames(ws_tIV_sd)<-ws_tIV_sd_TierIV
# Combine mean and sd into single column with ± divider
ws_tIV_table<-as.data.frame(do.call(cbind, lapply(1:ncol(ws_tIV_mean), function(i) paste0(ws_tIV_mean[ , i], " ± ", ws_tIV_sd[ , i]))))
# Transpose the table back
ws_tIV_table<-t(ws_tIV_table)
# Rename columns to timepoints and rows to Tier IV categories
colnames(ws_tIV_table)<-c("Wet Sedge (T0)", "Wet Sedge (T4)", "Wet Sedge (T24)")
rownames(ws_tIV_table)<-ws_tIV_mean_TierIV
#ws_tIV_table
Run statistical tests to determine if significant differences exist between sampling timepoints for each KEGG Tier IV category. Click on the Show/Hide button to see the statistics.
# Subset the "sum" values for each sample
ws_tIV_stats<-subset(ws_tIV_sum, select=c(Tier_IV,S108379,S108381,S108385,S108386,S108388,S108390))
rownames(ws_tIV_stats) <- ws_tIV_stats$Tier_IV
ws_tIV_stats<-as.data.frame(t(ws_tIV_stats[-1]))
# Add the timepoint column to the sum table
ws_tIV_stats<-data.frame(Timepoint,ws_tIV_stats)
# Subset response variables for MANOVA
ws_tIV_stats$response <- as.matrix(ws_tIV_stats[, 2:227])
# Fit the multivariate model; summary.aov() then reports a univariate ANOVA for each response
ws_tIV_manova <- manova(response ~ Timepoint, data=ws_tIV_stats)
summary.aov(ws_tIV_manova)
Response X01200.Carbon.metabolism..PATH.ko01200. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2709213926 1354606963 3.1096 0.1856
Residuals 3 1306856322 435618774
Response X02010.ABC.transporters..PATH.ko02010. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 88690296 44345148 0.9494 0.4792
Residuals 3 140132277 46710759
Response X03010.Ribosome..PATH.ko03010. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 4196962380 2098481190 3.7537 0.1526
Residuals 3 1677142543 559047514
Response X00230.Purine.metabolism..PATH.ko00230. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 721205942 360602971 1.891 0.2942
Residuals 3 572076112 190692037
Response X00190.Oxidative.phosphorylation..PATH.ko00190. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 26780164 13390082 0.6318 0.5902
Residuals 3 63579871 21193290
Response X01230.Biosynthesis.of.amino.acids..PATH.ko01230. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 64891137 32445568 1.6422 0.3298
Residuals 3 59273857 19757952
Response X02020.Two.component.system..PATH.ko02020. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 71470119 35735060 1.2758 0.3972
Residuals 3 84029147 28009716
Response X03018.RNA.degradation..PATH.ko03018. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 370553612 185276806 2.4979 0.2298
Residuals 3 222521614 74173871
Response X02024.Quorum.sensing..PATH.ko02024. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 17660261 8830131 1.8917 0.2941
Residuals 3 14003388 4667796
Response X00010.Glycolysis...Gluconeogenesis..PATH.ko00010. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 72839132 36419566 17.818 0.02164 *
Residuals 3 6131975 2043992
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00561.Glycerolipid.metabolism..PATH.ko00561. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 48735574 24367787 2.6402 0.2181
Residuals 3 27688564 9229521
Response X01210.2.Oxocarboxylic.acid.metabolism..PATH.ko01210. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 18602361 9301180 3.3251 0.1733
Residuals 3 8391785 2797262
Response X04112.Cell.cycle...Caulobacter..PATH.ko04112. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 11462476 5731238 1.6995 0.321
Residuals 3 10116639 3372213
Response X00970.Aminoacyl.tRNA.biosynthesis..PATH.ko00970. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1635331 817665 0.5034 0.6479
Residuals 3 4873087 1624362
Response X01212.Fatty.acid.metabolism..PATH.ko01212. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 4306430 2153215 16.403 0.02425 *
Residuals 3 393813 131271
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00240.Pyrimidine.metabolism..PATH.ko00240. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 903233 451617 0.4385 0.6807
Residuals 3 3090018 1030006
Response X03060.Protein.export..PATH.ko03060. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3896580 1948290 1.4619 0.3604
Residuals 3 3998117 1332706
Response X00500.Starch.and.sucrose.metabolism..PATH.ko00500. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1007050 503525 0.3963 0.7035
Residuals 3 3811757 1270586
Response X00860.Porphyrin.and.chlorophyll.metabolism..PATH.ko00860. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3057000 1528500 0.5861 0.6097
Residuals 3 7823557 2607852
Response X00640.Propanoate.metabolism..PATH.ko00640. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3036497 1518249 5.1179 0.1079
Residuals 3 889956 296652
Response X00920.Sulfur.metabolism..PATH.ko00920. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3352000 1676000 3.7358 0.1533
Residuals 3 1345903 448634
Response X04141.Protein.processing.in.endoplasmic.reticulum..PATH.ko04141. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 64395394 32197697 2.1084 0.268
Residuals 3 45814095 15271365
Response X00633.Nitrotoluene.degradation..PATH.ko00633. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 38234969 19117485 6.7296 0.07782 .
Residuals 3 8522406 2840802
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00520.Amino.sugar.and.nucleotide.sugar.metabolism..PATH.ko00520. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1936987 968494 1.5732 0.341
Residuals 3 1846855 615618
Response X00650.Butanoate.metabolism..PATH.ko00650. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2146224 1073112 1.6542 0.328
Residuals 3 1946188 648729
Response X00910.Nitrogen.metabolism..PATH.ko00910. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 63792093 31896046 1.024 0.4581
Residuals 3 93446501 31148834
Response X01220.Degradation.of.aromatic.compounds..PATH.ko01220. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 865365 432683 3.2303 0.1786
Residuals 3 401833 133944
Response X00760.Nicotinate.and.nicotinamide.metabolism..PATH.ko00760. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 947060 473530 0.8559 0.5081
Residuals 3 1659803 553268
Response X03440.Homologous.recombination..PATH.ko03440. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 128497 64248 0.1669 0.8536
Residuals 3 1154517 384839
Response X03030.DNA.replication..PATH.ko03030. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2533 1267 0.0103 0.9898
Residuals 3 369378 123126
Response X00270.Cysteine.and.methionine.metabolism..PATH.ko00270. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 384680 192340 1.2152 0.4106
Residuals 3 474837 158279
Response X00550.Peptidoglycan.biosynthesis..PATH.ko00550. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 441252 220626 0.6179 0.5961
Residuals 3 1071261 357087
Response X00730.Thiamine.metabolism..PATH.ko00730. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 504147 252073 1.2055 0.4128
Residuals 3 627333 209111
Response X00051.Fructose.and.mannose.metabolism..PATH.ko00051. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1539151 769576 3.1536 0.183
Residuals 3 732091 244030
Response X00790.Folate.biosynthesis..PATH.ko00790. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 206796 103398 0.8427 0.5123
Residuals 3 368078 122693
Response X00900.Terpenoid.backbone.biosynthesis..PATH.ko00900. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 146524 73262 0.4836 0.6576
Residuals 3 454491 151497
Response X00040.Pentose.and.glucuronate.interconversions..PATH.ko00040. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 828841 414421 10.344 0.04507 *
Residuals 3 120189 40063
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00360.Phenylalanine.metabolism..PATH.ko00360. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 91534 45767 1.1848 0.4176
Residuals 3 115889 38630
Response X03420.Nucleotide.excision.repair..PATH.ko03420. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 487881 243941 4.4508 0.1266
Residuals 3 164426 54809
Response X03050.Proteasome..PATH.ko03050. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1222808 611404 0.9031 0.4932
Residuals 3 2031067 677022
Response X00630.Glyoxylate.and.dicarboxylate.metabolism..PATH.ko00630. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 758757 379379 1.6011 0.3364
Residuals 3 710837 236946
Response X00330.Arginine.and.proline.metabolism..PATH.ko00330. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 98540 49270 0.3567 0.7261
Residuals 3 414365 138122
Response X00564.Glycerophospholipid.metabolism..PATH.ko00564. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 273100 136550 1.79 0.3078
Residuals 3 228851 76284
Response X00052.Galactose.metabolism..PATH.ko00052. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 377854 188927 3.2718 0.1762
Residuals 3 173232 57744
Response X00540.Lipopolysaccharide.biosynthesis..PATH.ko00540. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 97134 48567 0.1289 0.8837
Residuals 3 1130335 376778
Response X00450.Selenocompound.metabolism..PATH.ko00450. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1112960 556480 3.3508 0.172
Residuals 3 498217 166072
Response X03070.Bacterial.secretion.system..PATH.ko03070. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 119642 59821 0.1185 0.8922
Residuals 3 1513997 504666
Response X00130.Ubiquinone.and.other.terpenoid.quinone.biosynthesis..PATH.ko00130. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 447456 223728 1.1255 0.4318
Residuals 3 596365 198788
Response X00770.Pantothenate.and.CoA.biosynthesis..PATH.ko00770. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 98052 49026 2.2371 0.2543
Residuals 3 65745 21915
Response X00250.Alanine..aspartate.and.glutamate.metabolism..PATH.ko00250. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 557416 278708 2.3603 0.2422
Residuals 3 354249 118083
Response X00680.Methane.metabolism..PATH.ko00680. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3998761 1999381 0.261 0.7861
Residuals 3 22978387 7659462
Response X00620.Pyruvate.metabolism..PATH.ko00620. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 663644 331822 4.993 0.111
Residuals 3 199371 66457
Response X03410.Base.excision.repair..PATH.ko03410. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 177752 88876 1.5661 0.3422
Residuals 3 170249 56750
Response X04013.MAPK.signaling.pathway...fly..PATH.ko04013. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 179574 89787 0.364 0.7219
Residuals 3 739975 246658
Response X00562.Inositol.phosphate.metabolism..PATH.ko00562. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 229433 114717 1.3639 0.379
Residuals 3 252322 84107
Response X00030.Pentose.phosphate.pathway..PATH.ko00030. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 260780 130390 3.9724 0.1435
Residuals 3 98473 32824
Response X00627.Aminobenzoate.degradation..PATH.ko00627. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 272965 136483 0.2956 0.7635
Residuals 3 1384916 461639
Response X00740.Riboflavin.metabolism..PATH.ko00740. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 33408 16704 0.4729 0.6629
Residuals 3 105965 35322
Response X00523.Polyketide.sugar.unit.biosynthesis..PATH.ko00523. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 50890 25445 0.4981 0.6504
Residuals 3 153253 51084
Response X00061.Fatty.acid.biosynthesis..PATH.ko00061. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 529861 264931 1.1605 0.4234
Residuals 3 684894 228298
Response X00750.Vitamin.B6.metabolism..PATH.ko00750. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 37452 18726.0 2.1669 0.2616
Residuals 3 25926 8641.8
Response X04122.Sulfur.relay.system..PATH.ko04122. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 35877 17938 0.2862 0.7696
Residuals 3 188067 62689
Response X03430.Mismatch.repair..PATH.ko03430. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 88831 44416 0.5677 0.6179
Residuals 3 234697 78232
Response X03013.RNA.transport..PATH.ko03013. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1968627 984314 0.2746 0.7771
Residuals 3 10754348 3584783
Response X03450.Non.homologous.end.joining..PATH.ko03450. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1261 630 0.0115 0.9887
Residuals 3 165121 55040
Response X00473.D.Alanine.metabolism..PATH.ko00473. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 51103 25552 1.6643 0.3264
Residuals 3 46059 15353
Response X04016.MAPK.signaling.pathway...plant..PATH.ko04016. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 236853 118427 1.9585 0.2856
Residuals 3 181403 60468
Response X02026.Biofilm.formation...Escherichia.coli..PATH.ko02026. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 65821 32910 0.6617 0.578
Residuals 3 149211 49737
Response X00480.Glutathione.metabolism..PATH.ko00480. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 188211 94105 1.2248 0.4084
Residuals 3 230503 76834
Response X00785.Lipoic.acid.metabolism..PATH.ko00785. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 47701 23851 0.7615 0.5402
Residuals 3 93966 31322
Response X00471.D.Glutamine.and.D.glutamate.metabolism..PATH.ko00471. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 57802 28901 1.2402 0.405
Residuals 3 69913 23304
Response X00260.Glycine..serine.and.threonine.metabolism..PATH.ko00260. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 35407 17703.5 2.6704 0.2157
Residuals 3 19888 6629.5
Response X00364.Fluorobenzoate.degradation..PATH.ko00364. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 55921 27961 1.5364 0.3472
Residuals 3 54598 18199
Response X04145.Phagosome..PATH.ko04145. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 9613824 4806912 2.7485 0.2098
Residuals 3 5246671 1748890
Response X02025.Biofilm.formation...Pseudomonas.aeruginosa..PATH.ko02025. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 281524 140762 1.1975 0.4147
Residuals 3 352638 117546
Response X02060.Phosphotransferase.system..PTS...PATH.ko02060. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 29505 14753 0.9232 0.487
Residuals 3 47938 15979
Response X00780.Biotin.metabolism..PATH.ko00780. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 110557 55278 2.1687 0.2614
Residuals 3 76469 25490
Response X03008.Ribosome.biogenesis.in.eukaryotes..PATH.ko03008. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 39557 19779 0.1953 0.8323
Residuals 3 303884 101295
Response X03022.Basal.transcription.factors..PATH.ko03022. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 350532 175266 1.157 0.4242
Residuals 3 454441 151480
Response X00300.Lysine.biosynthesis..PATH.ko00300. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 35329 17664 1.083 0.4425
Residuals 3 48931 16310
Response X00340.Histidine.metabolism..PATH.ko00340. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 19927 9963.5 0.9461 0.4802
Residuals 3 31595 10531.7
Response X00280.Valine..leucine.and.isoleucine.degradation..PATH.ko00280. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 123456 61728 5.7053 0.09498 .
Residuals 3 32458 10819
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00350.Tyrosine.metabolism..PATH.ko00350. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 16405 8202.7 0.4015 0.7006
Residuals 3 61284 20428.0
Response X00511.Other.glycan.degradation..PATH.ko00511. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 164257 82129 2.6534 0.217
Residuals 3 92857 30952
Response X00310.Lysine.degradation..PATH.ko00310. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 8161 4080.5 0.8363 0.5145
Residuals 3 14638 4879.5
Response X00670.One.carbon.pool.by.folate..PATH.ko00670. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 26904 13452.2 1.7547 0.3129
Residuals 3 22999 7666.3
Response X00140.Steroid.hormone.biosynthesis..PATH.ko00140. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 329305 164653 1.1727 0.4204
Residuals 3 421210 140403
Response X00410.beta.Alanine.metabolism..PATH.ko00410. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 20149 10074.5 1.16 0.4235
Residuals 3 26054 8684.8
Response X00984.Steroid.degradation..PATH.ko00984. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 229724 114862 20.226 0.01814 *
Residuals 3 17037 5679
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00053.Ascorbate.and.aldarate.metabolism..PATH.ko00053. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 9417 4708.5 1.3673 0.3784
Residuals 3 10331 3443.7
Response X00430.Taurine.and.hypotaurine.metabolism..PATH.ko00430. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1236 618.0 0.1105 0.8989
Residuals 3 16780 5593.3
Response X04217.Necroptosis..PATH.ko04217. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 42853 21426 1.3352 0.3848
Residuals 3 48141 16047
Response X02040.Flagellar.assembly..PATH.ko02040. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 70724 35362 2.1096 0.2679
Residuals 3 50287 16762
Response X05111.Biofilm.formation...Vibrio.cholerae..PATH.ko05111. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 78082 39041 0.99 0.4676
Residuals 3 118308 39436
Response X00510.N.Glycan.biosynthesis..PATH.ko00510. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 26342 13171 1.1505 0.4257
Residuals 3 34344 11448
Response X00071.Fatty.acid.degradation..PATH.ko00071. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 4300.3 2150.2 0.2839 0.771
Residuals 3 22720.5 7573.5
Response X00380.Tryptophan.metabolism..PATH.ko00380. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 9997 4998.5 1.3769 0.3765
Residuals 3 10890 3630.2
Response X04214.Apoptosis...fly..PATH.ko04214. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 72441 36220 6.2013 0.08596 .
Residuals 3 17523 5841
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X04146.Peroxisome..PATH.ko04146. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 22770 11385.2 1.3664 0.3786
Residuals 3 24997 8332.3
Response X00195.Photosynthesis..PATH.ko00195. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 8814.3 4407.2 0.4961 0.6514
Residuals 3 26651.0 8883.7
Response X02030.Bacterial.chemotaxis..PATH.ko02030. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 15409 7704.7 10.574 0.04379 *
Residuals 3 2186 728.7
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X01040.Biosynthesis.of.unsaturated.fatty.acids..PATH.ko01040. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 31370 15685 3.0713 0.188
Residuals 3 15321 5107
Response X00600.Sphingolipid.metabolism..PATH.ko00600. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 15967 7983.5 5.0095 0.1106
Residuals 3 4781 1593.7
Response X00311.Penicillin.and.cephalosporin.biosynthesis..PATH.ko00311. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3661.0 1830.5 0.5576 0.6224
Residuals 3 9848.5 3282.8
Response X00220.Arginine.biosynthesis..PATH.ko00220. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 21097.3 10548.7 5.4586 0.1001
Residuals 3 5797.5 1932.5
Response X00362.Benzoate.degradation..PATH.ko00362. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3960.3 1980.2 1.1173 0.4339
Residuals 3 5317.0 1772.3
Response X00906.Carotenoid.biosynthesis..PATH.ko00906. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 6057 3028.5 0.9632 0.4752
Residuals 3 9433 3144.3
Response X00625.Chloroalkane.and.chloroalkene.degradation..PATH.ko00625. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2096.3 1048.17 2.325 0.2456
Residuals 3 1352.5 450.83
Response X00908.Zeatin.biosynthesis..PATH.ko00908. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2209.0 1104.50 1.4157 0.369
Residuals 3 2340.5 780.17
Response X00400.Phenylalanine..tyrosine.and.tryptophan.biosynthesis..PATH.ko00400. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 14583 7291.5 0.7523 0.5435
Residuals 3 29077 9692.3
Response X00440.Phosphonate.and.phosphinate.metabolism..PATH.ko00440. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 11575.0 5787.5 13.05 0.0331 *
Residuals 3 1330.5 443.5
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00965.Betalain.biosynthesis..PATH.ko00965. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 4243 2121.50 2.5799 0.2229
Residuals 3 2467 822.33
Response X03040.Spliceosome..PATH.ko03040. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 844 422 0.1375 0.8767
Residuals 3 9204 3068
Response X04152.AMPK.signaling.pathway..PATH.ko04152. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 44100 22050 0.3013 0.7599
Residuals 3 219555 73185
Response X00909.Sesquiterpenoid.and.triterpenoid.biosynthesis..PATH.ko00909. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 24.33 12.17 0.0158 0.9844
Residuals 3 2314.50 771.50
Response X04144.Endocytosis..PATH.ko04144. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 74892 37446 0.7119 0.5585
Residuals 3 157808 52603
Response X00120.Primary.bile.acid.biosynthesis..PATH.ko00120. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 54.33 27.167 0.2571 0.7888
Residuals 3 317.00 105.667
Response X04015.Rap1.signaling.pathway..PATH.ko04015. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 8220.3 4110.2 6.7141 0.07804 .
Residuals 3 1836.5 612.2
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00515.Mannose.type.O.glycan.biosyntheis..PATH.ko00515. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2524.0 1262.00 5.2185 0.1055
Residuals 3 725.5 241.83
Response X01055.Biosynthesis.of.vancomycin.group.antibiotics..PATH.ko01055. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 20443 10221.5 6.4354 0.08218 .
Residuals 3 4765 1588.3
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X03015.mRNA.surveillance.pathway..PATH.ko03015. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3752.3 1876.2 0.7245 0.5537
Residuals 3 7769.0 2589.7
Response X04510.Focal.adhesion..PATH.ko04510. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3067 1533.50 3.1148 0.1853
Residuals 3 1477 492.33
Response X00565.Ether.lipid.metabolism..PATH.ko00565. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2212.0 1106.00 7.1896 0.07172 .
Residuals 3 461.5 153.83
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X04142.Lysosome..PATH.ko04142. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1070.3 535.17 6.2593 0.085 .
Residuals 3 256.5 85.50
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00281.Geraniol.degradation..PATH.ko00281. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1577.3 788.67 1.1069 0.4365
Residuals 3 2137.5 712.50
Response X04138.Autophagy...yeast..PATH.ko04138. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1157.3 578.67 1.4991 0.3537
Residuals 3 1158.0 386.00
Response X00405.Phenazine.biosynthesis..PATH.ko00405. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 133 66.500 5.3919 0.1015
Residuals 3 37 12.333
Response X00590.Arachidonic.acid.metabolism..PATH.ko00590. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 4566.3 2283.2 2.8719 0.201
Residuals 3 2385.0 795.0
Response X00514.Other.types.of.O.glycan.biosynthesis..PATH.ko00514. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 763.0 381.5 1.8038 0.3059
Residuals 3 634.5 211.5
Response X01057.Biosynthesis.of.type.II.polyketide.products..PATH.ko01057. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1561.3 780.67 5.1191 0.1079
Residuals 3 457.5 152.50
Response X04120.Ubiquitin.mediated.proteolysis..PATH.ko04120. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 817.0 408.50 1.9624 0.2852
Residuals 3 624.5 208.17
Response X04151.PI3K.Akt.signaling.pathway..PATH.ko04151. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 234.33 117.17 0.2055 0.8248
Residuals 3 1710.50 570.17
Response X04530.Tight.junction..PATH.ko04530. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 89.333 44.667 0.4497 0.6748
Residuals 3 298.000 99.333
Response X00660.C5.Branched.dibasic.acid.metabolism..PATH.ko00660. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 67 33.50 0.0554 0.947
Residuals 3 1813 604.33
Response X03460.Fanconi.anemia.pathway..PATH.ko03460. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3153 1576.50 1.9698 0.2842
Residuals 3 2401 800.33
Response X01059.Biosynthesis.of.enediyne.antibiotics..PATH.ko01059. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 242.33 121.17 0.9811 0.4701
Residuals 3 370.50 123.50
Response X00960.Tropane..piperidine.and.pyridine.alkaloid.biosynthesis..PATH.ko00960. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 120.33 60.167 0.9093 0.4912
Residuals 3 198.50 66.167
Response X04140.Autophagy...animal..PATH.ko04140. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1456.0 728.00 1.2927 0.3936
Residuals 3 1689.5 563.17
Response X00830.Retinol.metabolism..PATH.ko00830. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 180.33 90.167 0.9927 0.4668
Residuals 3 272.50 90.833
Response X04080.Neuroactive.ligand.receptor.interaction..PATH.ko04080. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 394.33 197.167 3.1973 0.1805
Residuals 3 185.00 61.667
Response X04014.Ras.signaling.pathway..PATH.ko04014. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 177.33 88.667 4.972 0.1116
Residuals 3 53.50 17.833
Response X00100.Steroid.biosynthesis..PATH.ko00100. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.0 0.500 0.0073 0.9927
Residuals 3 204.5 68.167
Response X04371.Apelin.signaling.pathway..PATH.ko04371. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 444 222.000 2.664 0.2162
Residuals 3 250 83.333
Response X00591.Linoleic.acid.metabolism..PATH.ko00591. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2734.3 1367.17 4.3219 0.1308
Residuals 3 949.0 316.33
Response X04012.ErbB.signaling.pathway..PATH.ko04012. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1446.3 723.17 1.2023 0.4136
Residuals 3 1804.5 601.50
Response X00020.Citrate.cycle..TCA.cycle...PATH.ko00020. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 6416 3208.2 0.2767 0.7757
Residuals 3 34780 11593.5
Response X01053.Biosynthesis.of.siderophore.group.nonribosomal.peptides..PATH.ko01053. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 281.33 140.667 5.4452 0.1004
Residuals 3 77.50 25.833
Response X04514.Cell.adhesion.molecules..CAMs...PATH.ko04514. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 394.33 197.167 2.7258 0.2115
Residuals 3 217.00 72.333
Response X04010.MAPK.signaling.pathway..PATH.ko04010. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 185.33 92.667 3.5871 0.1601
Residuals 3 77.50 25.833
Response X04310.Wnt.signaling.pathway..PATH.ko04310. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 550.33 275.167 5.08 0.1088
Residuals 3 162.50 54.167
Response X04210.Apoptosis..PATH.ko04210. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 52.333 26.167 0.4198 0.6906
Residuals 3 187.000 62.333
Response X00365.Furfural.degradation..PATH.ko00365. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 169.0 84.5 4.3333 0.1304
Residuals 3 58.5 19.5
Response X00472.D.Arginine.and.D.ornithine.metabolism..PATH.ko00472. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 265.33 132.67 1.3007 0.392
Residuals 3 306.00 102.00
Response X00903.Limonene.and.pinene.degradation..PATH.ko00903. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 890.33 445.17 0.8658 0.5049
Residuals 3 1542.50 514.17
Response X00943.Isoflavonoid.biosynthesis..PATH.ko00943. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 120.33 60.167 1.8325 0.302
Residuals 3 98.50 32.833
Response X04810.Regulation.of.actin.cytoskeleton..PATH.ko04810. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 57.333 28.667 0.5911 0.6076
Residuals 3 145.500 48.500
Response X00532.Glycosaminoglycan.biosynthesis...chondroitin.sulfate...dermatan.sulfate..PATH.ko00532. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 617.33 308.67 2.8103 0.2053
Residuals 3 329.50 109.83
Response X00624.Polycyclic.aromatic.hydrocarbon.degradation..PATH.ko00624. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2.333 1.167 0.0246 0.9759
Residuals 3 142.500 47.500
Response X04330.Notch.signaling.pathway..PATH.ko04330. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 102.33 51.167 2.6239 0.2194
Residuals 3 58.50 19.500
Response X04350.TGF.beta.signaling.pathway..PATH.ko04350. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 26.333 13.167 0.2651 0.7834
Residuals 3 149.000 49.667
Response X04340.Hedgehog.signaling.pathway..PATH.ko04340. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 30.333 15.167 0.7054 0.5609
Residuals 3 64.500 21.500
Response X04630.Jak.STAT.signaling.pathway..PATH.ko04630. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 261.33 130.667 4.1925 0.1353
Residuals 3 93.50 31.167
Response X04020.Calcium.signaling.pathway..PATH.ko04020. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 133.33 66.667 5.9701 0.08998 .
Residuals 3 33.50 11.167
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X04137.Mitophagy...animal..PATH.ko04137. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 75.0 37.50 0.252 0.7922
Residuals 3 446.5 148.83
Response X00531.Glycosaminoglycan.degradation..PATH.ko00531. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 74.333 37.167 1.0619 0.448
Residuals 3 105.000 35.000
Response X04011.MAPK.signaling.pathway...yeast..PATH.ko04011. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 69.333 34.667 1.7333 0.316
Residuals 3 60.000 20.000
Response X04520.Adherens.junction..PATH.ko04520. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 54.333 27.167 9.0556 0.05357 .
Residuals 3 9.000 3.000
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00073.Cutin..suberine.and.wax.biosynthesis..PATH.ko00073. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 30.333 15.167 1.0581 0.449
Residuals 3 43.000 14.333
Response X00363.Bisphenol.degradation..PATH.ko00363. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 19 9.5 3.1667 0.1822
Residuals 3 9 3.0
Response X04024.cAMP.signaling.pathway..PATH.ko04024. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 14.333 7.1667 0.4257 0.6874
Residuals 3 50.500 16.8333
Response X00121.Secondary.bile.acid.biosynthesis..PATH.ko00121. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 772 386 1.1029 0.4375
Residuals 3 1050 350
Response X04150.mTOR.signaling.pathway..PATH.ko04150. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 31 15.5 5.1667 0.1067
Residuals 3 9 3.0
Response X00643.Styrene.degradation..PATH.ko00643. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 13 6.5000 2.7857 0.2071
Residuals 3 7 2.3333
Response X00930.Caprolactam.degradation..PATH.ko00930. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 16.0 8.000 0.5275 0.6364
Residuals 3 45.5 15.167
Response X04391.Hippo.signaling.pathway..fly..PATH.ko04391. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 127.0 63.500 1.0438 0.4528
Residuals 3 182.5 60.833
Response X00563.Glycosylphosphatidylinositol.GPI..anchor.biosynthesis..PATH.ko00563. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 19 9.500 0.5182 0.6408
Residuals 3 55 18.333
Response X00720.Carbon.fixation.pathways.in.prokaryotes..PATH.ko00720. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 49 24.5000 10.5 0.04419 *
Residuals 3 7 2.3333
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00261.Monobactam.biosynthesis..PATH.ko00261. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 25.333 12.6667 1.3571 0.3804
Residuals 3 28.000 9.3333
Response X00623.Toluene.degradation..PATH.ko00623. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 8.333 4.1667 0.2778 0.775
Residuals 3 45.000 15.0000
Response X00940.Phenylpropanoid.biosynthesis..PATH.ko00940. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 17.333 8.6667 0.7647 0.539
Residuals 3 34.000 11.3333
Response X04111.Cell.cycle...yeast..PATH.ko04111. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 36.333 18.1667 2.8684 0.2012
Residuals 3 19.000 6.3333
Response X00534.Glycosaminoglycan.biosynthesis...heparan.sulfate...heparin..PATH.ko00534. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 9 4.500 0.3649 0.7214
Residuals 3 37 12.333
Response X04068.FoxO.signaling.pathway..PATH.ko04068. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 42.333 21.167 1.7162 0.3185
Residuals 3 37.000 12.333
Response X04668.TNF.signaling.pathway..PATH.ko04668. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 5.3333 2.6667 0.3636 0.7221
Residuals 3 22.0000 7.3333
Response X00626.Naphthalene.degradation..PATH.ko00626. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 16.333 8.1667 3.7692 0.1519
Residuals 3 6.500 2.1667
Response X01056.Biosynthesis.of.type.II.polyketide.backbone..PATH.ko01056. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 30.333 15.167 1.3788 0.3761
Residuals 3 33.000 11.000
Response X04064.NF.kappa.B.signaling.pathway..PATH.ko04064. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 16.333 8.1667 1.6897 0.3225
Residuals 3 14.500 4.8333
Response X04070.Phosphatidylinositol.signaling.system..PATH.ko04070. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 6.3333 3.1667 1.4615 0.3605
Residuals 3 6.5000 2.1667
Response X04390.Hippo.signaling.pathway..PATH.ko04390. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 30.333 15.167 15.167 0.027 *
Residuals 3 3.000 1.000
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X04022.cGMP...PKG.signaling.pathway..PATH.ko04022. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 7 3.5 0.1 0.9077
Residuals 3 105 35.0
Response X04110.Cell.cycle..PATH.ko04110. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 76.0 38.0 8.4444 0.05858 .
Residuals 3 13.5 4.5
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X04550.Signaling.pathways.regulating.pluripotency.of.stem.cells..PATH.ko04550. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 19 9.5000 2.1923 0.2589
Residuals 3 13 4.3333
Response X00062.Fatty.acid.elongation..PATH.ko00062. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 44.333 22.167 1.3571 0.3804
Residuals 3 49.000 16.333
Response X00941.Flavonoid.biosynthesis..PATH.ko00941. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 4 2.0000 0.6 0.6037
Residuals 3 10 3.3333
Response X00983.Drug.metabolism...other.enzymes..PATH.ko00983. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.0 0.50000 0.6 0.6037
Residuals 3 2.5 0.83333
Response X04075.Plant.hormone.signal.transduction..PATH.ko04075. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 20.333 10.167 0.1584 0.8602
Residuals 3 192.500 64.167
Response X04114.Oocyte.meiosis..PATH.ko04114. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.3333 0.1667 0.0455 0.9562
Residuals 3 11.0000 3.6667
Response X04130.SNARE.interactions.in.vesicular.transport..PATH.ko04130. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 5.3333 2.6667 5.3333 0.1028
Residuals 3 1.5000 0.5000
Response X04139.Mitophagy...yeast..PATH.ko04139. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2.3333 1.1667 1.1667 0.4219
Residuals 3 3.0000 1.0000
Response X04512.ECM.receptor.interaction..PATH.ko04512. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.3333 0.16667 0.1111 0.8984
Residuals 3 4.5000 1.50000
Response X00004.KEGG.modules.in.global.maps.only :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.33333 0.16667 1 0.4648
Residuals 3 0.50000 0.16667
Response X00512.Mucin.type.O.glycan.biosynthesis..PATH.ko00512. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 40.333 20.167 20.167 0.01822 *
Residuals 3 3.000 1.000
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response X00513.Various.types.of.N.glycan.biosynthesis..PATH.ko00513. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 7 3.5000 0.8077 0.524
Residuals 3 13 4.3333
Response X00525.Acarbose.and.validamycin.biosynthesis..PATH.ko00525. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 4 2.00000 3 0.1925
Residuals 3 2 0.66667
Response X00791.Atrazine.degradation..PATH.ko00791. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 9.3333 4.6667 1.75 0.3136
Residuals 3 8.0000 2.6667
Response X04072.Phospholipase.D.signaling.pathway..PATH.ko04072. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 2.3333 1.1667 0.5 0.6495
Residuals 3 7.0000 2.3333
Response X04115.p53.signaling.pathway..PATH.ko04115. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 4 2.0000 0.75 0.5443
Residuals 3 8 2.6667
Response X04216.Ferroptosis..PATH.ko04216. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 4.0 2.0000 1.0909 0.4405
Residuals 3 5.5 1.8333
Response X00196.Photosynthesis...antenna.proteins..PATH.ko00196. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0 0 NaN NaN
Residuals 3 0 0
Response X00290.Valine..leucine.and.isoleucine.biosynthesis..PATH.ko00290. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.33333 0.16667 0.5 0.6495
Residuals 3 1.00000 0.33333
Response X00361.Chlorocyclohexane.and.chlorobenzene.degradation..PATH.ko00361. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0 0 NaN NaN
Residuals 3 0 0
Response X00404.Staurosporine.biosynthesis..PATH.ko00404. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 8.3333 4.1667 0.5435 0.6289
Residuals 3 23.0000 7.6667
Response X00460.Cyanoamino.acid.metabolism..PATH.ko00460. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.0 0.50000 3 0.1925
Residuals 3 0.5 0.16667
Response X00533.Glycosaminoglycan.biosynthesis...keratan.sulfate..PATH.ko00533. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.33333 0.16667 1 0.4648
Residuals 3 0.50000 0.16667
Response X00601.Glycosphingolipid.biosynthesis...lacto.and.neolacto.series..PATH.ko00601. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.33333 0.16667 1 0.4648
Residuals 3 0.50000 0.16667
Response X00604.Glycosphingolipid.biosynthesis...ganglio.series..PATH.ko00604. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0 0 NaN NaN
Residuals 3 0 0
Response X00904.Diterpenoid.biosynthesis..PATH.ko00904. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 3 1.5 0.5 0.6495
Residuals 3 9 3.0
Response X00980.Metabolism.of.xenobiotics.by.cytochrome.P450..PATH.ko00980. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.33333 0.16667 1 0.4648
Residuals 3 0.50000 0.16667
Response X00981.Insect.hormone.biosynthesis..PATH.ko00981. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 1.0 0.50000 0.6 0.6037
Residuals 3 2.5 0.83333
Response X00982.Drug.metabolism...cytochrome.P450..PATH.ko00982. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.33333 0.16667 1 0.4648
Residuals 3 0.50000 0.16667
Response X01051.Biosynthesis.of.ansamycins..PATH.ko01051. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.33333 0.16667 0.2 0.8288
Residuals 3 2.50000 0.83333
Response X04066.HIF.1.signaling.pathway..PATH.ko04066. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.33333 0.16667 1 0.4648
Residuals 3 0.50000 0.16667
Response X04113.Meiosis...yeast..PATH.ko04113. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0 0 NaN NaN
Residuals 3 0 0
Response X04218.Cellular.senescence..PATH.ko04218. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 21.333 10.667 2.6667 0.216
Residuals 3 12.000 4.000
Response X04341.Hedgehog.signaling.pathway...fly..PATH.ko04341. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0 0 NaN NaN
Residuals 3 0 0
Response X04392.Hippo.signaling.pathway...multiple.species..PATH.ko04392. :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.33333 0.16667 1 0.4648
Residuals 3 0.50000 0.16667
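With this many responses, the per-pathway tables above can also be screened programmatically rather than by eye. The following is a minimal sketch on hypothetical toy data: the data frame `toy` and the columns `pathA`, `pathB`, and `pathC` are invented for illustration and are not objects from this notebook. It assumes the tables come from `summary.aov()` applied to a multi-response fit, and it also shows why pathways with all-zero counts report `NaN` for the F value and p-value.

```r
# Hypothetical toy data: 3 timepoints x 2 replicates, 3 invented pathway columns
toy <- data.frame(
  Timepoint = factor(rep(c("T0", "T4", "T24"), each = 2)),
  pathA = c(10, 12, 30, 33, 11, 13),  # strong timepoint effect
  pathB = c(5, 9, 6, 8, 7, 8),        # little or no timepoint effect
  pathC = rep(0, 6)                   # never detected: zero variance
)

# Multi-response fit; summary.aov() returns one ANOVA table per response
fit  <- manova(cbind(pathA, pathB, pathC) ~ Timepoint, data = toy)
tabs <- summary.aov(fit)

# The first row of each table is the Timepoint term
pvals <- vapply(tabs, function(tab) tab[["Pr(>F)"]][1], numeric(1))
names(pvals) <- trimws(sub("Response", "", names(pvals)))

# Zero-variance responses give 0/0 = NaN, so exclude them before filtering
sig <- names(pvals)[!is.na(pvals) & pvals < 0.05]
print(sig)
```

The p-values collected this way are not adjusted for multiple comparisons; `p.adjust()` could be applied to `pvals` if an adjustment is desired.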
Run an ANOVA for each category of interest (those significant in the MANOVA)
## Glycolysis
ws_tIV_stat1<-aov(X00010.Glycolysis...Gluconeogenesis..PATH.ko00010.~Timepoint,data=ws_tIV_stats)
summary.aov(ws_tIV_stat1)
TukeyHSD(ws_tIV_stat1)
## Fatty Acid Metabolism
ws_tIV_stat2<-aov(X01212.Fatty.acid.metabolism..PATH.ko01212.~Timepoint,data=ws_tIV_stats)
summary.aov(ws_tIV_stat2)
TukeyHSD(ws_tIV_stat2)
## Pentose and Glucuronate Interconversions
ws_tIV_stat3<-aov(X00040.Pentose.and.glucuronate.interconversions..PATH.ko00040.~Timepoint,data=ws_tIV_stats)
summary.aov(ws_tIV_stat3)
TukeyHSD(ws_tIV_stat3)
## Steroid Degradation
ws_tIV_stat4<-aov(X00984.Steroid.degradation..PATH.ko00984.~Timepoint,data=ws_tIV_stats)
summary.aov(ws_tIV_stat4)
TukeyHSD(ws_tIV_stat4)
## Bacterial Chemotaxis
ws_tIV_stat5<-aov(X02030.Bacterial.chemotaxis..PATH.ko02030.~Timepoint,data=ws_tIV_stats)
summary.aov(ws_tIV_stat5)
TukeyHSD(ws_tIV_stat5)
## Phosphonate and Phosphinate Metabolism
ws_tIV_stat6<-aov(X00440.Phosphonate.and.phosphinate.metabolism..PATH.ko00440.~Timepoint,data=ws_tIV_stats)
summary.aov(ws_tIV_stat6)
TukeyHSD(ws_tIV_stat6)
## Carbon Fixation Pathways in Prokaryotes
ws_tIV_stat7<-aov(X00720.Carbon.fixation.pathways.in.prokaryotes..PATH.ko00720.~Timepoint,data=ws_tIV_stats)
summary.aov(ws_tIV_stat7)
TukeyHSD(ws_tIV_stat7)
## Mucin Type O Glycan Biosynthesis
ws_tIV_stat8<-aov(X00512.Mucin.type.O.glycan.biosynthesis..PATH.ko00512.~Timepoint,data=ws_tIV_stats)
summary.aov(ws_tIV_stat8)
TukeyHSD(ws_tIV_stat8)
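The blocks above repeat the same three calls for each pathway, so they could equally be written as a loop. The following is a minimal sketch, assuming a data frame with a `Timepoint` factor and one numeric column per pathway; the toy data and the names `toy`, `responses`, and `results` are hypothetical and are not objects from this notebook.

```r
# Hypothetical toy data: 3 timepoints x 2 replicates, 2 invented pathway columns
toy <- data.frame(
  Timepoint = factor(rep(c("T0", "T4", "T24"), each = 2),
                     levels = c("T0", "T4", "T24")),
  pathA = c(10, 12, 30, 33, 11, 13),
  pathB = c(5, 9, 6, 8, 7, 8)
)

# Column names of the responses to test (in practice, the significant pathways)
responses <- c("pathA", "pathB")

# Fit one ANOVA per response and keep both the table and the Tukey contrasts
results <- lapply(responses, function(resp) {
  fml <- reformulate("Timepoint", response = resp)  # e.g. pathA ~ Timepoint
  fit <- aov(fml, data = toy)
  list(anova = summary(fit), tukey = TukeyHSD(fit))
})
names(results) <- responses

# Inspect one pathway: pairwise timepoint differences with adjusted p-values
print(results$pathA$tukey)
```

Storing the results in a named list keeps each pathway's ANOVA table next to its post hoc comparisons, which makes it easier to export them together.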
Determine which genes were differentially expressed between experimental treatments using the Bioconductor package edgeR, applied here to TPM-normalized read counts. The edgeR package implements statistical methods based on generalized linear models (GLMs) that are suitable for multifactor experiments of any complexity. A particular strength of edgeR is its empirical Bayes methods, which permit the estimation of gene-specific biological variation even for experiments with minimal levels of biological replication.
We have two objectives for our differential expression analysis.
It is possible (and likely) that some unique genecall IDs share identical KO and Taxonomy annotations but are still treated as separate entries with separate count values, which would factor into the DE analysis improperly. Here, we aggregate (sum) genecall ID counts by shared KO and Taxonomy to reduce duplicates in the dataset and make the DE analysis results more biologically relevant.
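As a small illustration of the aggregation step, the sketch below sums counts for rows that share the same annotation. The table `toy_counts`, its sample columns, and the annotation values are made up for the example, and only two of the seven annotation columns used in the notebook are included.

```r
# Toy table: four genecall IDs, two of which share the same KO and Taxonomy
toy_counts <- data.frame(
  KO       = c("K00001", "K00001", "K00002", "K00003"),
  Taxonomy = c("Acidobacteria", "Acidobacteria", "Proteobacteria", "Acidobacteria"),
  S1_T0    = c(10, 5, 3, 0),
  S1_T4    = c(2, 1, 8, 4)
)

# Sum every remaining (numeric) column within each KO + Taxonomy combination
toy_agg <- aggregate(. ~ KO + Taxonomy, data = toy_counts, FUN = sum)
toy_agg
```

The two K00001 / Acidobacteria rows collapse into one row with summed values. Note that the formula interface of aggregate() uses `na.action = na.omit` by default, so any row with an NA in one of the grouping or count columns is dropped before summing.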
Our first set of DE analyses focus on determining significant differentially expressed genes between sampling timepoints within Tussock or Wet Sedge mesocosms.
Subset data from the full tpm_expressed dataset. Click on the Show/Hide button to see the steps.
# Subset data from the full tpm_expressed dataset
tuss_T4vsT0<-tpm_tuss_expressed[,c(2:5,8:14)]
# Aggregate (sum) genecall ID counts by shared KO and Taxonomy to reduce duplicates in the dataset and improve the DE analysis results to be biologically relevant.
tuss_T4vsT0<-aggregate(. ~ KO + Symbol + Function + Tier_II + Tier_III + Tier_IV + Taxonomy, data=tuss_T4vsT0,FUN=sum)
## Reading in the data
# Create DGEList for Tuss T4 vs T0 subdata
tuss_T4vsT0<-DGEList(counts=tuss_T4vsT0[,8:11], genes=tuss_T4vsT0[,1:7])
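# Note: the aggregated table has no ID column, so $genes$ID is NULL and this assignment clears the row names (rows print as [1,], [2,], ... in the tables below).
rownames(tuss_T4vsT0$counts)<-rownames(tuss_T4vsT0$genes)<-tuss_T4vsT0$genes$ID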
tuss_T4vsT0$genes$ID<-NULL
## Normalization
# Tell edgeR that you don't want to normalize the subdata (already TPM normalized)
tuss_T4vsT0<-calcNormFactors(tuss_T4vsT0, method="none")
tuss_T4vsT0$samples
## Data Exploration
# The plotMDS function produces a plot in which distances between samples correspond to leading biological coefficient of variation (BCV) between those samples
plotMDS(tuss_T4vsT0, col=rep(1:2, each=2))
## The Design Matrix
# Before we fit negative binomial GLMs, we need to define our design matrix based on the experimental design. Here, we want to test for differential expression between T4 and T0 timepoints within mesocosms, i.e., adjusting for differences between mesocosms. In statistics, this is an additive linear model with mesocosm as the blocking factor.
Mesocosm_tuss_T4vsT0<-factor(c(1,2,1,2))
Timepoint_tuss_T4vsT0<-factor(c("T0","T0","T4","T4"))
data.frame(Sample_tuss_T4vsT0=colnames(tuss_T4vsT0),Mesocosm_tuss_T4vsT0,Timepoint_tuss_T4vsT0)
tuss_T4vsT0_design<-model.matrix(~Mesocosm_tuss_T4vsT0+Timepoint_tuss_T4vsT0)
rownames(tuss_T4vsT0_design)<-colnames(tuss_T4vsT0)
tuss_T4vsT0_design
(Intercept) Mesocosm_tuss_T4vsT02 Timepoint_tuss_T4vsT0T4
Tuss1_T0 1 0 0
Tuss2_T0 1 1 0
Tuss1_T4 1 0 1
Tuss2_T4 1 1 1
attr(,"assign")
[1] 0 1 2
attr(,"contrasts")
attr(,"contrasts")$Mesocosm_tuss_T4vsT0
[1] "contr.treatment"
attr(,"contrasts")$Timepoint_tuss_T4vsT0
[1] "contr.treatment"
## Estimating Dispersion
# We estimate the NB dispersion for the dataset
tuss_T4vsT0<-estimateDisp(tuss_T4vsT0,tuss_T4vsT0_design,robust=TRUE)
tuss_T4vsT0$common.dispersion
[1] 0.236728
# The square root of the common dispersion gives the coefficient of variation of biological variation.
sqrt(tuss_T4vsT0$common.dispersion)
[1] 0.486547
# View the dispersion estimates in a BCV plot
plotBCV(tuss_T4vsT0)
## Differential Expression
# Now we proceed to determine differentially expressed genes by fitting genewise glms
tuss_T4vsT0_fit<-glmFit(tuss_T4vsT0,tuss_T4vsT0_design)
# Conduct likelihood ratio tests for T4 vs T0 differences and show the top genes. The genewise tests are for T4 vs T0 differential expression, adjusting for baseline differences between the two mesocosms. The tests can be viewed as analogous to paired t-tests. The top DE tags have tiny p-values and FDR values, as well as large fold changes.
tuss_T4vsT0_lrt<-glmLRT(tuss_T4vsT0_fit)
tuss_T4vsT0_topTags<-topTags(tuss_T4vsT0_lrt,n=25)
tuss_T4vsT0_topTags
Coefficient: Timepoint_tuss_T4vsT0T4
## Here are the TPM-normalized values in individual samples for the 25 top-ranked genes.
# We see that all the significant genes have consistent T4 vs T0 changes for the two mesocosms.
tuss_T4vsT0_pvalue<-order(tuss_T4vsT0_lrt$table$PValue)
cpm(tuss_T4vsT0)[tuss_T4vsT0_pvalue[1:25],]
Tuss1_T0 Tuss2_T0 Tuss1_T4 Tuss2_T4
[1,] 1048.152018 1774.305826 1.2689433 65.9270105
[2,] 491.062982 974.471374 3.6118199 20.2399628
[3,] 151.179465 96.610305 0.0000000 0.0000000
[4,] 472.155603 666.441032 3.7921111 22.4112348
[5,] 327.422538 480.914542 2.3328049 14.5438981
[6,] 50.649133 135.863538 0.0000000 0.0000000
[7,] 643.602116 384.955662 13.5353953 11.3111135
[8,] 2.581553 4.114011 173.7320520 257.2741995
[9,] 110.027568 104.546640 0.0000000 0.9614356
[10,] 141.029067 72.092588 0.4209765 0.4374304
[11,] 1444.859709 1919.924360 30.4546395 90.9792745
[12,] 689.459434 854.779215 22.5192757 24.6994152
[13,] 493.675976 276.418040 4.8411986 13.4155275
[14,] 89.129180 418.186558 4.4145468 1.0858566
[15,] 372.564861 858.111261 2.1339780 52.7189038
[16,] 182.135619 761.371698 12.1623908 4.6519282
[17,] 546.497859 303.658138 9.4792423 13.0406336
[18,] 282.348034 150.179706 2.1903303 6.0432310
[19,] 0.000000 0.000000 91.4427222 52.9218229
[20,] 585.713144 336.433517 5.1643042 24.6842993
[21,] 0.000000 7.137799 152.0557650 161.6181304
[22,] 241.604764 349.070365 3.7028266 16.1795628
[23,] 3.028716 6.272798 118.4347090 271.9708940
[24,] 17.789144 3.778794 238.6530733 464.0529017
[25,] 212.526367 375.417478 12.1126407 6.2930328
Plotting significant KOs in DGE analysis.
# The total number of differentially expressed genes at 5% FDR is given by:
tuss_T4vsT0_FDR<-p.adjust(tuss_T4vsT0_lrt$table$PValue, method="BH")
sum(tuss_T4vsT0_FDR<0.05)
[1] 1255
summary(decideTests(tuss_T4vsT0_lrt))
Timepoint_tuss_T4vsT0T4
Down 588
NotSig 100055
Up 667
# Plot the log-fold change against log-transcripts per million, with DE genes highlighted.
# plotMD() and abline() draw with base graphics and return nothing to store, so they are called in sequence rather than chained with `+`.
plotMD(tuss_T4vsT0_lrt, hl.col=c("green3","palegreen1"), main = NULL, xlab="Average log TPM")
abline(h=0,col="gray", lty="dashed")
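The dispersion step above reports a common dispersion of 0.236728 and its square root, 0.486547. The sketch below reproduces that arithmetic and shows what it implies under the negative binomial model, where the variance is mu + dispersion * mu^2. The mean abundances in `mu` are arbitrary illustrative values, not taken from the dataset.

```r
# Common dispersion reported above for Tussock T4 vs T0
common_dispersion <- 0.236728

# The biological coefficient of variation (BCV) is the square root of the dispersion
bcv <- sqrt(common_dispersion)
round(bcv, 6)

# Negative binomial variance for a few illustrative mean abundances (made-up values)
mu <- c(1, 10, 100, 1000)
nb_variance <- mu + common_dispersion * mu^2

# The total coefficient of variation approaches the BCV as the mean grows
data.frame(mu, nb_variance, total_cv = sqrt(nb_variance) / mu)
```

For low-abundance genes the Poisson-like term dominates, whereas for abundant genes the variability between replicates is governed almost entirely by the BCV, here close to 49%.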
Now analyze the differentially expressed genes in the Tussock dataset between timepoints T24 (mixed redox) and T4 (oxic redox). Click on the Show/Hide button to see the steps.
# Subset data from the full tpm_expressed dataset
tuss_T24vsT4<-tpm_tuss_expressed[,c(4:7,8:14)]
# Aggregate (sum) genecall ID counts by shared KO and Taxonomy to reduce duplicates in the dataset and improve the DE analysis results to be biologically relevant.
tuss_T24vsT4<-aggregate(. ~ KO + Symbol + Function + Tier_II + Tier_III + Tier_IV + Taxonomy, data=tuss_T24vsT4, FUN=sum)
## Reading in the data
# Create DGEList for Tuss T24 vs T4 subdata
tuss_T24vsT4<-DGEList(counts=tuss_T24vsT4[,8:11], genes=tuss_T24vsT4[,1:7])
rownames(tuss_T24vsT4$counts)<-rownames(tuss_T24vsT4$genes)<-tuss_T24vsT4$genes$ID
tuss_T24vsT4$genes$ID<-NULL
## Normalization
# Tell edgeR that you don't want to normalize the subdata (already TPM normalized)
tuss_T24vsT4<-calcNormFactors(tuss_T24vsT4, method="none")
tuss_T24vsT4$samples
## Data Exploration
# The plotMDS function produces a plot in which distances between samples correspond to leading biological coefficient of variation (BCV) between those samples
plotMDS(tuss_T24vsT4, col=rep(1:2, each=2))
## The Design Matrix
# Before we fit negative binomial GLMs, we need to define our design matrix based on the experimental design. Here, we want to test for differential expression between the T24 and T4 timepoints, adjusting for differences between mesocosms. In statistics, this is an additive linear model with mesocosm as the blocking factor. Note that factor() sorts levels alphabetically, so "T24" is the reference level and the fitted coefficient (Timepoint_tuss_T24vsT4T4) measures T4 relative to T24.
Mesocosm_tuss_T24vsT4<-factor(c(1,2,1,2))
Timepoint_tuss_T24vsT4<-factor(c("T4","T4","T24","T24"))
data.frame(Sample_tuss_T24vsT4=colnames(tuss_T24vsT4),Mesocosm_tuss_T24vsT4,Timepoint_tuss_T24vsT4)
tuss_T24vsT4_design<-model.matrix(~Mesocosm_tuss_T24vsT4+Timepoint_tuss_T24vsT4)
rownames(tuss_T24vsT4_design)<-colnames(tuss_T24vsT4)
tuss_T24vsT4_design
(Intercept) Mesocosm_tuss_T24vsT42 Timepoint_tuss_T24vsT4T4
Tuss1_T4 1 0 1
Tuss2_T4 1 1 1
Tuss1_T24 1 0 0
Tuss3_T24 1 1 0
attr(,"assign")
[1] 0 1 2
attr(,"contrasts")
attr(,"contrasts")$Mesocosm_tuss_T24vsT4
[1] "contr.treatment"
attr(,"contrasts")$Timepoint_tuss_T24vsT4
[1] "contr.treatment"
## Estimating Dispersion
# We estimate the NB dispersion for the dataset
tuss_T24vsT4<-estimateDisp(tuss_T24vsT4,tuss_T24vsT4_design,robust=TRUE)
tuss_T24vsT4$common.dispersion
[1] 0.5193338
# The square root of the common dispersion gives the coefficient of variation of biological variation.
sqrt(tuss_T24vsT4$common.dispersion)
[1] 0.7206482
# View the dispersion estimates in a BCV plot
plotBCV(tuss_T24vsT4)
## Differential Expression
# Now we proceed to determine differentially expressed genes by fitting genewise glms
tuss_T24vsT4_fit<-glmFit(tuss_T24vsT4,tuss_T24vsT4_design)
# Conduct likelihood ratio tests for T24 vs T4 differences and show the top genes. The genewise tests adjust for baseline differences between the two mesocosm blocks and can be viewed as analogous to paired t-tests. Because the tested coefficient is Timepoint_tuss_T24vsT4T4, a positive logFC means higher expression at T4 than at T24. Only a handful of tags pass the 5% FDR cutoff in this comparison.
tuss_T24vsT4_lrt<-glmLRT(tuss_T24vsT4_fit)
topTags(tuss_T24vsT4_lrt,n=25)
Coefficient: Timepoint_tuss_T24vsT4T4
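# Here are the TPM-normalized values in individual samples.
# Replicates agree in direction for most of these genes, but several rows are driven by a single sample (e.g., rows 2 and 15-17 are nonzero only in Tuss3_T24), so replicate consistency is weaker here than in the T4 vs T0 comparison.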
tuss_T24vsT4_pvalue<-order(tuss_T24vsT4_lrt$table$PValue)
cpm(tuss_T24vsT4)[tuss_T24vsT4_pvalue[1:25],]
Tuss1_T4 Tuss2_T4 Tuss1_T24 Tuss3_T24
[1,] 0.0000000 0.0000000 10.215879 722.38388
[2,] 0.0000000 0.0000000 0.000000 1963.01758
[3,] 35.3894190 73.2291623 0.000000 0.00000
[4,] 31.5396779 58.8565775 0.000000 0.00000
[5,] 0.0000000 1.9431119 3.958524 1163.51858
[6,] 0.0000000 3.9783541 14.748411 733.37655
[7,] 0.0000000 0.0000000 15.192393 88.88862
[8,] 0.0000000 0.5099327 21.189145 114.56599
[9,] 1.3062652 0.0000000 14.100129 1553.90747
[10,] 0.0000000 0.0000000 7.102287 104.18310
[11,] 14.5022093 56.5088661 0.000000 0.00000
[12,] 0.5843818 0.0000000 13.404399 215.90577
[13,] 0.0000000 0.0000000 8.125498 88.63034
[14,] 26.8856979 31.6024982 0.000000 0.00000
[15,] 0.0000000 0.0000000 0.000000 321.59614
[16,] 0.0000000 0.0000000 0.000000 288.87218
[17,] 0.0000000 0.0000000 0.000000 284.71091
[18,] 36.4821202 18.1299279 0.000000 0.00000
[19,] 0.0000000 0.0000000 6.784024 78.25073
[20,] 0.0000000 0.0000000 3.436922 114.02166
[21,] 11.1032540 51.0934331 0.000000 0.00000
[22,] 0.0000000 0.0000000 8.225075 63.64126
[23,] 31.1615468 20.0268843 0.000000 0.00000
[24,] 0.4332977 0.0000000 1.169279 745.12664
[25,] 0.0000000 5.3661520 25.967738 225.39614
Plotting significant KOs in DGE analysis.
# The total number of differentially expressed genes at 5% FDR is given by:
tuss_T24vsT4_FDR<-p.adjust(tuss_T24vsT4_lrt$table$PValue, method="BH")
sum(tuss_T24vsT4_FDR<0.05)
[1] 7
summary(decideTests(tuss_T24vsT4_lrt))
Timepoint_tuss_T24vsT4T4
Down 5
NotSig 101303
Up 2
# Plot the log-fold change against log-transcripts per million, with DE genes highlighted.
plotMD(tuss_T24vsT4_lrt, hl.col=c("darkgreen","green3"), main = NULL, xlab="Average log TPM")
abline(h=0,col="gray", lty="dashed")
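In the T24 vs T4 design matrix above, the timepoint coefficient is named with a trailing `T4`. This follows from how factor() orders its levels. The sketch below, using only base R, shows the default ordering and one optional alternative with explicit levels; the object names `tp_default` and `tp_explicit` are illustrative and do not appear in the notebook.

```r
# Default: levels are sorted alphabetically, so "T24" becomes the reference level
tp_default <- factor(c("T4", "T4", "T24", "T24"))
levels(tp_default)
colnames(model.matrix(~ tp_default))

# Explicit levels: "T4" is the reference, so the coefficient is T24 relative to T4
tp_explicit <- factor(c("T4", "T4", "T24", "T24"), levels = c("T4", "T24"))
levels(tp_explicit)
colnames(model.matrix(~ tp_explicit))
```

For a two-level factor, switching the reference level flips the sign of the log-fold change but leaves the likelihood ratio test p-values unchanged, so the choice mainly affects how "Up" and "Down" are read.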
Finally, analyze the differentially expressed genes in the Tussock dataset between timepoints T24 (mixed redox) and T0 (anoxic redox). Click on the Show/Hide button to see the steps.
# Subset data from the full tpm_expressed dataset
tuss_T24vsT0<-tpm_tuss_expressed[,c(2:3,6:14)]
# Aggregate (sum) genecall ID counts by shared KO and Taxonomy to reduce duplicates in the dataset and improve the DE analysis results to be biologically relevant.
tuss_T24vsT0<-aggregate(. ~ KO + Symbol + Function + Tier_II + Tier_III + Tier_IV + Taxonomy, data=tuss_T24vsT0, FUN=sum)
## Reading in the data
# Create DGEList for Tuss T24 vs T0 subdata
tuss_T24vsT0<-DGEList(counts=tuss_T24vsT0[,8:11], genes=tuss_T24vsT0[,1:7])
rownames(tuss_T24vsT0$counts)<-rownames(tuss_T24vsT0$genes)<-tuss_T24vsT0$genes$ID
tuss_T24vsT0$genes$ID<-NULL
## Normalization
# Tell edgeR that you don't want to normalize the subdata (already TPM normalized)
tuss_T24vsT0<-calcNormFactors(tuss_T24vsT0, method="none")
tuss_T24vsT0$samples
## Data Exploration
# The plotMDS function produces a plot in which distances between samples correspond to leading biological coefficient of variation (BCV) between those samples
plotMDS(tuss_T24vsT0, col=rep(1:2, each=2))
## The Design Matrix
# Before we fit negative binomial GLMs, we need to define our design matrix based on the experimental design. Here, we want to test for differential expression between the T24 and T0 timepoints, adjusting for differences between mesocosms. In statistics, this is an additive linear model with mesocosm as the blocking factor.
Mesocosm_tuss_T24vsT0<-factor(c(1,2,1,2))
Timepoint_tuss_T24vsT0<-factor(c("T0","T0","T24","T24"))
data.frame(Sample_tuss_T24vsT0=colnames(tuss_T24vsT0),Mesocosm_tuss_T24vsT0,Timepoint_tuss_T24vsT0)
tuss_T24vsT0_design<-model.matrix(~Mesocosm_tuss_T24vsT0+Timepoint_tuss_T24vsT0)
rownames(tuss_T24vsT0_design)<-colnames(tuss_T24vsT0)
tuss_T24vsT0_design
(Intercept) Mesocosm_tuss_T24vsT02 Timepoint_tuss_T24vsT0T24
Tuss1_T0 1 0 0
Tuss2_T0 1 1 0
Tuss1_T24 1 0 1
Tuss3_T24 1 1 1
attr(,"assign")
[1] 0 1 2
attr(,"contrasts")
attr(,"contrasts")$Mesocosm_tuss_T24vsT0
[1] "contr.treatment"
attr(,"contrasts")$Timepoint_tuss_T24vsT0
[1] "contr.treatment"
## Estimating Dispersion
# We estimate the NB dispersion for the dataset
tuss_T24vsT0<-estimateDisp(tuss_T24vsT0,tuss_T24vsT0_design,robust=TRUE)
tuss_T24vsT0$common.dispersion
[1] 0.4882395
# The square root of the common dispersion gives the coefficient of variation of biological variation.
sqrt(tuss_T24vsT0$common.dispersion)
[1] 0.6987414
# View the dispersion estimates in a BCV plot
plotBCV(tuss_T24vsT0)
## Differential Expression
# Now we proceed to determine differentially expressed genes by fitting genewise glms
tuss_T24vsT0_fit<-glmFit(tuss_T24vsT0,tuss_T24vsT0_design)
# Conduct likelihood ratio tests for T24 vs T0 differences and show the top genes. The genewise tests are for T24 vs T0 differential expression, adjusting for baseline differences between the two mesocosm blocks. The tests can be viewed as analogous to paired t-tests. The top DE tags have small p-values and FDR values, as well as large fold changes.
tuss_T24vsT0_lrt<-glmLRT(tuss_T24vsT0_fit)
topTags(tuss_T24vsT0_lrt,n=25)
Coefficient: Timepoint_tuss_T24vsT0T24
## Here are the TPM-normalized values in individual samples.
# Most of the top genes change in the same direction in both replicates, although a few (e.g., rows 17 and 19) are nonzero only in Tuss3_T24.
tuss_T24vsT0_pvalue<-order(tuss_T24vsT0_lrt$table$PValue)
cpm(tuss_T24vsT0)[tuss_T24vsT0_pvalue[1:25],]
Tuss1_T0 Tuss2_T0 Tuss1_T24 Tuss3_T24
[1,] 0.0000000 0.000000 92.4565621 56.6710283
[2,] 337.7932001 381.234729 7.2323939 0.0000000
[3,] 0.0000000 0.000000 108.7021596 23.0637906
[4,] 0.0000000 1.650736 3.9585235 1163.5185752
[5,] 0.0000000 0.000000 9.2108401 133.6818168
[6,] 15.2509805 86.918297 0.0000000 0.0000000
[7,] 585.7131436 336.433517 26.4787312 0.0000000
[8,] 49.9500257 191.303929 1.8824780 0.0000000
[9,] 96.7699731 162.191889 3.0698703 0.7513205
[10,] 0.0000000 0.000000 41.8527647 31.4839046
[11,] 126.2657034 433.131040 7.3045031 0.0000000
[12,] 0.0000000 0.000000 35.6699699 36.4927076
[13,] 0.0000000 1.438715 2.1991027 807.2903774
[14,] 315.6567717 155.948737 14.4759443 0.0000000
[15,] 0.0000000 0.000000 7.6305073 105.7030151
[16,] 0.0000000 0.000000 89.1200480 9.2470209
[17,] 0.0000000 0.000000 0.0000000 321.5961350
[18,] 0.0000000 0.000000 7.1022873 104.1831024
[19,] 0.0000000 0.000000 0.0000000 284.7109075
[20,] 0.7472161 0.000000 13.4043992 215.9057715
[21,] 0.0000000 0.000000 6.5671835 98.8037852
[22,] 151.1794651 96.610305 6.9144865 0.0000000
[23,] 0.0000000 0.000000 44.8298004 20.6469742
[24,] 119.5226404 15.710050 1.0995514 0.0000000
[25,] 57.6711436 29.251391 0.6209902 0.0000000
Plotting significant KOs in DGE analysis.
# The total number of differentially expressed genes at 5% FDR is given by:
tuss_T24vsT0_FDR<-p.adjust(tuss_T24vsT0_lrt$table$PValue, method="BH")
sum(tuss_T24vsT0_FDR<0.05)
[1] 142
summary(decideTests(tuss_T24vsT0_lrt))
Timepoint_tuss_T24vsT0T24
Down 67
NotSig 101168
Up 75
# Plot the log-fold change against log-transcripts per million, with DE genes highlighted.
plotMD(tuss_T24vsT0_lrt, hl.col=c("darkgreen","palegreen1"), main = NULL, xlab="Average log TPM")
abline(h=0,col="gray", lty="dashed")
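Each comparison above counts significant genes after a Benjamini-Hochberg (BH) adjustment. As a worked illustration of what `p.adjust(..., method="BH")` computes, the sketch below applies it to ten made-up p-values and repeats the calculation by hand. The vector `p_raw` is invented for the example.

```r
# Made-up raw p-values for ten genes (illustration only)
p_raw <- c(0.0001, 0.0004, 0.002, 0.009, 0.02, 0.04, 0.2, 0.5, 0.7, 0.9)

# Benjamini-Hochberg adjustment, as used above
p_bh <- p.adjust(p_raw, method = "BH")

# Same calculation by hand: p * n / rank, then enforce monotonicity
# working from the largest p-value down to the smallest
n <- length(p_raw)
ord <- order(p_raw, decreasing = TRUE)
p_manual <- numeric(n)
p_manual[ord] <- pmin(1, cummin(p_raw[ord] * n / (n:1)))

# The two versions agree, and the count mirrors sum(FDR < 0.05) above
all.equal(p_bh, p_manual)
sum(p_bh < 0.05)
```

Because decideTests() also defaults to the BH method with a 0.05 cutoff, its Down and Up counts add up to the same total as `sum(FDR<0.05)` in each comparison.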
Start by analyzing differentially expressed genes in the Wet Sedge dataset between timepoints T4 (oxic redox) and T0 (anoxic redox). Click on the Show/Hide button to see the steps.
# Subset data from the full tpm_expressed dataset
ws_T4vsT0<-tpm_ws_expressed[,c(2:5,8:14)]
# Aggregate (sum) genecall ID counts by shared KO and Taxonomy to reduce duplicates in the dataset and improve the DE analysis results to be biologically relevant.
ws_T4vsT0<-aggregate(. ~ KO + Symbol + Function + Tier_II + Tier_III + Tier_IV + Taxonomy, data=ws_T4vsT0, FUN=sum)
## Reading in the data
# Create DGEList for WS T4 vs T0 subdata
ws_T4vsT0<-DGEList(counts=ws_T4vsT0[,8:11], genes=ws_T4vsT0[,1:7])
rownames(ws_T4vsT0$counts)<-rownames(ws_T4vsT0$genes)<-ws_T4vsT0$genes$ID
ws_T4vsT0$genes$ID<-NULL
## Normalization
# Tell edgeR that you don't want to normalize the subdata (already TPM normalized)
ws_T4vsT0<-calcNormFactors(ws_T4vsT0, method="none")
ws_T4vsT0$samples
## Data Exploration
# The plotMDS function produces a plot in which distances between samples correspond to leading biological coefficient of variation (BCV) between those samples
plotMDS(ws_T4vsT0, col=rep(1:2, each=2))
## The Design Matrix
# Before we fit negative binomial GLMs, we need to define our design matrix based on the experimental design. Here, we want to test for differential expression between T4 and T0 timepoints within mesocosms, i.e., adjusting for differences between mesocosms. In statistics, this is an additive linear model with mesocosm as the blocking factor.
Mesocosm_ws_T4vsT0<-factor(c(1,2,1,2))
Timepoint_ws_T4vsT0<-factor(c("T0","T0","T4","T4"))
data.frame(Sample_ws_T4vsT0=colnames(ws_T4vsT0),Mesocosm_ws_T4vsT0,Timepoint_ws_T4vsT0)
ws_T4vsT0_design<-model.matrix(~Mesocosm_ws_T4vsT0+Timepoint_ws_T4vsT0)
rownames(ws_T4vsT0_design)<-colnames(ws_T4vsT0)
ws_T4vsT0_design
(Intercept) Mesocosm_ws_T4vsT02 Timepoint_ws_T4vsT0T4
WS1_T0 1 0 0
WS3_T0 1 1 0
WS1_T4 1 0 1
WS2_T4 1 1 1
attr(,"assign")
[1] 0 1 2
attr(,"contrasts")
attr(,"contrasts")$Mesocosm_ws_T4vsT0
[1] "contr.treatment"
attr(,"contrasts")$Timepoint_ws_T4vsT0
[1] "contr.treatment"
## Estimating Dispersion
# We estimate the NB dispersion for the dataset
ws_T4vsT0<-estimateDisp(ws_T4vsT0,ws_T4vsT0_design,robust=TRUE)
ws_T4vsT0$common.dispersion
[1] 0.1691577
# The square root of the common dispersion gives the coefficient of variation of biological variation.
sqrt(ws_T4vsT0$common.dispersion)
[1] 0.4112878
# View the dispersion estimates in a BCV plot
plotBCV(ws_T4vsT0)
## Differential Expression
# Now we proceed to determine differentially expressed genes by fitting genewise glms
ws_T4vsT0_fit<-glmFit(ws_T4vsT0,ws_T4vsT0_design)
# Conduct likelihood ratio tests for T4 vs T0 differences and show the top genes. The genewise tests are for T4 vs T0 differential expression, adjusting for baseline differences between the two mesocosms. The tests can be viewed as analogous to paired t-tests. The top DE tags have tiny p-values and FDR values, as well as large fold changes.
ws_T4vsT0_lrt<-glmLRT(ws_T4vsT0_fit)
ws_T4vsT0_topTags<-topTags(ws_T4vsT0_lrt,n=25)
ws_T4vsT0_topTags
Coefficient: Timepoint_ws_T4vsT0T4
## Here are the TPM-normalized values in individual samples.
# We see that all the significant genes have consistent T4 vs T0 changes for the two mesocosms.
ws_T4vsT0_pvalue<-order(ws_T4vsT0_lrt$table$PValue)
ws_T4vsT0_cpm<-cpm(ws_T4vsT0)[ws_T4vsT0_pvalue[1:25],]
ws_T4vsT0_cpm
WS1_T0 WS3_T0 WS1_T4 WS2_T4
[1,] 10.8192790 5.438542 850.46554 731.47651
[2,] 0.0000000 0.000000 118.55017 89.97818
[3,] 6.9056096 6.464740 475.57345 363.29120
[4,] 5.7694309 5.730429 422.62086 272.59995
[5,] 15.1291717 17.965751 874.04336 490.04976
[6,] 0.8681561 0.000000 105.27061 67.14878
[7,] 29.2836105 51.861640 1381.44762 995.31097
[8,] 10.5554705 4.117992 213.40063 288.12586
[9,] 7.4930143 15.995405 299.82503 298.78618
[10,] 7.1355300 5.585100 175.21433 172.58804
[11,] 0.0000000 0.000000 41.01244 42.20746
[12,] 24.6197776 18.997054 621.69712 336.14915
[13,] 3.7564449 1.690089 135.21129 70.08873
[14,] 2.5720938 1.941487 117.80967 65.52927
[15,] 1.1226157 0.401687 32.32747 106.03918
[16,] 2.8464136 3.119110 171.84005 46.07958
[17,] 23.4717067 23.143571 248.77496 519.14558
[18,] 1.4310266 110.216730 0.00000 0.00000
[19,] 0.2292666 1.122721 59.81417 27.64980
[20,] 3.6832384 5.816958 117.74292 77.78665
[21,] 5.1724528 6.607292 142.89813 83.19582
[22,] 0.4970360 1.867385 65.49113 31.31506
[23,] 6.3790717 6.574605 128.60327 95.40070
[24,] 0.4120994 1.548275 16.93296 82.74532
[25,] 26.1490232 14.909863 293.77502 245.33748
Plotting significant KOs in DGE analysis.
# The total number of differentially expressed genes at 5% FDR is given by:
ws_T4vsT0_FDR<-p.adjust(ws_T4vsT0_lrt$table$PValue, method="BH")
sum(ws_T4vsT0_FDR<0.05)
[1] 91
summary(decideTests(ws_T4vsT0_lrt))
Timepoint_ws_T4vsT0T4
Down 13
NotSig 139939
Up 78
# Plot the log-fold change against log-transcripts per million, with DE genes highlighted.
plotMD(ws_T4vsT0_lrt, hl.col=c("dodgerblue1","lightskyblue1"), main = NULL, xlab="Average log TPM")
abline(h=0,col="gray", lty="dashed")
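The per-sample tables above are produced with cpm(). Because the normalization factors were left at 1 (`method="none"`), cpm() simply rescales each sample column by its own column total so that it sums to one million. The base-R sketch below shows that arithmetic on a toy matrix; the object `toy` and its values are made up for the example.

```r
# Toy matrix standing in for the aggregated values (rows = genes, columns = samples)
toy <- matrix(c(100, 300, 600,
                 50,  50, 400),
              nrow = 3,
              dimnames = list(paste0("gene", 1:3), c("S1", "S2")))

# Library size = column total; with normalization factors of 1 it is used as-is
lib_size <- colSums(toy)

# Counts per million: divide each column by its library size, multiply by 1e6
toy_cpm <- t(t(toy) / lib_size) * 1e6
toy_cpm
colSums(toy_cpm)
```

If the input columns are TPM values that already sum to one million, this rescaling leaves them unchanged; after subsetting to expressed genes the column totals can be lower, in which case the displayed values are scaled up slightly relative to the original TPM.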
Now analyze the differentially expressed genes in the Wet Sedge dataset between timepoints T24 (mixed redox) and T4 (oxic redox). Click on the Show/Hide button to see the steps.
# Subset data from the full tpm_expressed dataset
ws_T24vsT4<-tpm_ws_expressed[,c(4:7,8:14)]
# Aggregate (sum) genecall ID counts by shared KO and Taxonomy to reduce duplicates in the dataset and improve the DE analysis results to be biologically relevant.
ws_T24vsT4<-aggregate(. ~ KO + Symbol + Function + Tier_II + Tier_III + Tier_IV + Taxonomy, data=ws_T24vsT4, FUN=sum)
## Reading in the data
# Create DGEList for WS T24 vs T4 subdata
ws_T24vsT4<-DGEList(counts=ws_T24vsT4[,8:11], genes=ws_T24vsT4[,1:7])
rownames(ws_T24vsT4$counts)<-rownames(ws_T24vsT4$genes)<-ws_T24vsT4$genes$ID
ws_T24vsT4$genes$ID<-NULL
## Normalization
# Tell edgeR that you don't want to normalize the subdata (already TPM normalized)
ws_T24vsT4<-calcNormFactors(ws_T24vsT4, method="none")
ws_T24vsT4$samples
## Data Exploration
# The plotMDS function produces a plot in which distances between samples correspond to leading biological coefficient of variation (BCV) between those samples
plotMDS(ws_T24vsT4, col=rep(1:2, each=2))
## The Design Matrix
# Before we fit negative binomial GLMs, we need to define our design matrix based on the experimental design. Here, we want to test for differential expression between the T24 and T4 timepoints, adjusting for differences between mesocosms. In statistics, this is an additive linear model with mesocosm as the blocking factor. Note that factor() sorts levels alphabetically, so "T24" is the reference level and the fitted coefficient (Timepoint_ws_T24vsT4T4) measures T4 relative to T24.
Mesocosm_ws_T24vsT4<-factor(c(1,2,1,2))
Timepoint_ws_T24vsT4<-factor(c("T4","T4","T24","T24"))
data.frame(Sample_ws_T24vsT4=colnames(ws_T24vsT4),Mesocosm_ws_T24vsT4,Timepoint_ws_T24vsT4)
ws_T24vsT4_design<-model.matrix(~Mesocosm_ws_T24vsT4+Timepoint_ws_T24vsT4)
rownames(ws_T24vsT4_design)<-colnames(ws_T24vsT4)
ws_T24vsT4_design
(Intercept) Mesocosm_ws_T24vsT42 Timepoint_ws_T24vsT4T4
WS1_T4 1 0 1
WS2_T4 1 1 1
WS1_T24 1 0 0
WS3_T24 1 1 0
attr(,"assign")
[1] 0 1 2
attr(,"contrasts")
attr(,"contrasts")$Mesocosm_ws_T24vsT4
[1] "contr.treatment"
attr(,"contrasts")$Timepoint_ws_T24vsT4
[1] "contr.treatment"
## Estimating Dispersion
# We estimate the NB dispersion for the dataset
ws_T24vsT4<-estimateDisp(ws_T24vsT4,ws_T24vsT4_design,robust=TRUE)
ws_T24vsT4$common.dispersion
[1] 0.6092723
# The square root of the common dispersion gives the coefficient of variation of biological variation.
sqrt(ws_T24vsT4$common.dispersion)
[1] 0.780559
# View the dispersion estimates in a BCV plot
plotBCV(ws_T24vsT4)
## Differential Expression
# Now we proceed to determine differentially expressed genes by fitting genewise glms
ws_T24vsT4_fit<-glmFit(ws_T24vsT4,ws_T24vsT4_design)
# Conduct likelihood ratio tests for T24 vs T4 differences and show the top genes. The genewise tests adjust for baseline differences between the two mesocosm blocks and can be viewed as analogous to paired t-tests. Because the tested coefficient is Timepoint_ws_T24vsT4T4, a positive logFC means higher expression at T4 than at T24. No genes pass the 5% FDR cutoff in this comparison, so the top tags are ranked by p-value only.
ws_T24vsT4_lrt<-glmLRT(ws_T24vsT4_fit)
ws_T24vsT4_topTags<-topTags(ws_T24vsT4_lrt,n=25)
ws_T24vsT4_topTags
Coefficient: Timepoint_ws_T24vsT4T4
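## Here are the TPM-normalized values in individual samples.
# Many of the top-ranked genes are driven by a single sample (e.g., rows 12, 17, 18 and 20 are nonzero only in WS3_T24), which is consistent with the high dispersion estimated for this comparison.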
ws_T24vsT4_pvalue<-order(ws_T24vsT4_lrt$table$PValue)
ws_T24vsT4_cpm<-cpm(ws_T24vsT4)[ws_T24vsT4_pvalue[1:25],]
ws_T24vsT4_cpm
WS1_T4 WS2_T4 WS1_T24 WS3_T24
[1,] 0.0000000 0.0000000 12.159331 82.77885
[2,] 0.0000000 0.0000000 4.111889 162.13060
[3,] 0.0000000 0.0000000 22.147203 37.69657
[4,] 0.0000000 0.0000000 18.978543 37.77894
[5,] 0.0000000 0.0000000 2.268628 151.83192
[6,] 1.8015202 161.9678928 0.000000 0.00000
[7,] 0.0000000 0.0000000 2.860444 122.43272
[8,] 0.0000000 0.0000000 2.436675 125.71781
[9,] 4.2492077 73.8630558 0.000000 0.00000
[10,] 0.3962099 0.0000000 15.973034 55.79849
[11,] 0.0000000 0.0000000 6.265735 52.01182
[12,] 0.0000000 0.0000000 0.000000 170.39072
[13,] 0.0000000 0.0000000 2.328857 82.99674
[14,] 0.0000000 0.0000000 4.090584 60.48395
[15,] 0.0000000 0.0000000 4.337817 56.26279
[16,] 12.1054398 28.3149820 0.000000 0.00000
[17,] 0.0000000 0.0000000 0.000000 144.85609
[18,] 0.0000000 0.0000000 0.000000 136.53103
[19,] 1.0779240 2.7133367 35.319373 135.32165
[20,] 0.0000000 0.0000000 0.000000 128.18747
[21,] 0.0000000 0.0000000 11.783323 24.45332
[22,] 37.4142554 6.2216186 0.000000 0.00000
[23,] 0.0000000 0.0000000 2.741259 55.46573
[24,] 0.0000000 0.6881651 1.430222 164.56122
[25,] 2.3518342 0.0000000 24.815653 142.61101
Plotting significant KOs in DGE analysis.
# The total number of differentially expressed genes at 5% FDR is given by:
ws_T24vsT4_FDR<-p.adjust(ws_T24vsT4_lrt$table$PValue, method="BH")
sum(ws_T24vsT4_FDR<0.05)
[1] 0
summary(decideTests(ws_T24vsT4_lrt))
Timepoint_ws_T24vsT4T4
Down 0
NotSig 140030
Up 0
# Plot the log-fold change against log-transcripts per million, with DE genes highlighted.
plotMD(ws_T24vsT4_lrt, hl.col=c("midnightblue","dodgerblue1"), main = NULL, xlab="Average log TPM")
abline(h=0,col="gray", lty="dashed")
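The Down / NotSig / Up summaries above classify each gene by the sign of its log-fold change and whether its adjusted p-value passes the cutoff. The sketch below reproduces that classification rule in base R on a hypothetical results table; `toy_res` and its values are invented, and the rule assumes the 5% FDR threshold with no minimum fold change.

```r
# Hypothetical results table with the two columns needed for classification
toy_res <- data.frame(
  logFC = c(-6.1, 5.3, 0.4, -0.2, 7.8),
  FDR   = c(0.001, 0.03, 0.60, 0.95, 0.20)
)

# -1 = Down, 0 = NotSig, 1 = Up at a 5% FDR cutoff
de_call <- ifelse(toy_res$FDR < 0.05, sign(toy_res$logFC), 0)

# Tabulate with all three categories shown, even when a count is zero
de_summary <- table(factor(de_call, levels = c(-1, 0, 1),
                           labels = c("Down", "NotSig", "Up")))
de_summary
```

A large fold change alone is not enough: the last toy gene has the largest logFC but is counted as NotSig because its FDR exceeds 0.05, which mirrors the Wet Sedge T24 vs T4 result where no gene is called despite large differences in individual samples.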
Finally, analyze the differentially expressed genes in the Wet Sedge dataset between timepoints T24 (mixed redox) and T0 (anoxic redox). Click on the Show/Hide button to see the steps.
# Subset data from the full tpm_expressed dataset
ws_T24vsT0<-tpm_ws_expressed[,c(2:3,6:14)]
# Aggregate (sum) genecall ID counts by shared KO and Taxonomy to reduce duplicates in the dataset and improve the DE analysis results to be biologically relevant.
ws_T24vsT0<-aggregate(. ~ KO + Symbol + Function + Tier_II + Tier_III + Tier_IV + Taxonomy, data=ws_T24vsT0, FUN=sum)
## Reading in the data
# Create DGEList for WS T24 vs T0 subdata
ws_T24vsT0<-DGEList(counts=ws_T24vsT0[,8:11], genes=ws_T24vsT0[,1:7])
rownames(ws_T24vsT0$counts)<-rownames(ws_T24vsT0$genes)<-ws_T24vsT0$genes$ID
ws_T24vsT0$genes$ID<-NULL
## Normalization
# Tell edgeR that you don't want to normalize the subdata (already TPM normalized)
ws_T24vsT0<-calcNormFactors(ws_T24vsT0, method="none")
ws_T24vsT0$samples
## Data Exploration
# The plotMDS function produces a plot in which distances between samples correspond to leading biological coefficient of variation (BCV) between those samples
plotMDS(ws_T24vsT0, col=rep(1:2, each=2))
## The Design Matrix
# Before we fit negative binomial GLMs, we need to define our design matrix based on the experimental design. Here, we want to test for differential expression between the T24 and T0 timepoints, adjusting for differences between mesocosms. In statistics, this is an additive linear model with mesocosm as the blocking factor.
Mesocosm_ws_T24vsT0<-factor(c(1,2,1,2))
Timepoint_ws_T24vsT0<-factor(c("T0","T0","T24","T24"))
data.frame(Sample_ws_T24vsT0=colnames(ws_T24vsT0),Mesocosm_ws_T24vsT0,Timepoint_ws_T24vsT0)
ws_T24vsT0_design<-model.matrix(~Mesocosm_ws_T24vsT0+Timepoint_ws_T24vsT0)
rownames(ws_T24vsT0_design)<-colnames(ws_T24vsT0)
ws_T24vsT0_design
(Intercept) Mesocosm_ws_T24vsT02 Timepoint_ws_T24vsT0T24
WS1_T0 1 0 0
WS3_T0 1 1 0
WS1_T24 1 0 1
WS3_T24 1 1 1
attr(,"assign")
[1] 0 1 2
attr(,"contrasts")
attr(,"contrasts")$Mesocosm_ws_T24vsT0
[1] "contr.treatment"
attr(,"contrasts")$Timepoint_ws_T24vsT0
[1] "contr.treatment"
## Estimating Dispersion
# We estimate the NB dispersion for the dataset
ws_T24vsT0<-estimateDisp(ws_T24vsT0,ws_T24vsT0_design,robust=TRUE)
ws_T24vsT0$common.dispersion
[1] 0.4789872
# The square root of the common dispersion gives the coefficient of variation of biological variation.
sqrt(ws_T24vsT0$common.dispersion)
[1] 0.692089
# View the dispersion estimates in a BCV plot
plotBCV(ws_T24vsT0)
## Differential Expression
# Now we proceed to determine differentially expressed genes by fitting genewise glms
ws_T24vsT0_fit<-glmFit(ws_T24vsT0,ws_T24vsT0_design)
# Conduct likelihood ratio tests for T24 vs T0 differences and show the top genes. The genewise tests are for T24 vs T0 differential expression, adjusting for baseline differences between the two mesocosm blocks. The tests can be viewed as analogous to paired t-tests. The top DE tags have small p-values and FDR values, as well as large fold changes.
ws_T24vsT0_lrt<-glmLRT(ws_T24vsT0_fit)
topTags(ws_T24vsT0_lrt,n=25)
Coefficient: Timepoint_ws_T24vsT0T24
## Here are the TPM-normalized values in individual samples.
# Most of the top genes change in the same direction in both replicates, although a few (e.g., rows 19 and 21) are nonzero only in WS3_T24.
ws_T24vsT0_pvalue<-order(ws_T24vsT0_lrt$table$PValue)
cpm(ws_T24vsT0)[ws_T24vsT0_pvalue[1:25],]
WS1_T0 WS3_T0 WS1_T24 WS3_T24
[1,] 6.9056096 6.4647402 475.223925 409.34166
[2,] 0.0000000 0.0000000 108.604055 12.79978
[3,] 0.0000000 0.0000000 16.945966 65.16254
[4,] 0.0000000 0.0000000 12.159331 82.77885
[5,] 36.8022714 29.0716593 0.000000 0.00000
[6,] 0.0000000 0.0000000 17.230772 57.06411
[7,] 0.0000000 0.0000000 87.413579 10.02501
[8,] 3.6818737 3.0701332 273.763785 121.50411
[9,] 0.7155133 0.0000000 143.576377 13.62810
[10,] 18.8100498 48.0194480 0.000000 0.00000
[11,] 0.0000000 0.0000000 1.445939 221.80304
[12,] 0.6576940 0.8236612 71.106601 39.30439
[13,] 0.0000000 0.0000000 14.446244 38.72049
[14,] 7.1733241 1.7275945 136.991697 185.44980
[15,] 0.0000000 0.0000000 7.926532 56.75085
[16,] 1.7337742 4.3296881 621.071942 52.58268
[17,] 24.6197776 18.9970545 1192.944780 428.00608
[18,] 0.0000000 1.1648923 18.797206 73.14162
[19,] 0.0000000 0.0000000 0.000000 170.39072
[20,] 0.0000000 0.8494006 4.111889 162.13060
[21,] 0.0000000 0.0000000 0.000000 153.59741
[22,] 14.7213851 30.0492822 0.000000 0.00000
[23,] 0.0000000 0.0000000 2.328857 82.99674
[24,] 0.0000000 0.0000000 19.493398 22.75517
[25,] 1.9533513 0.0000000 50.000567 36.86338
Plotting significant KOs in DGE analysis.
# The total number of differentially expressed genes at 5% FDR is given by:
ws_T24vsT0_FDR<-p.adjust(ws_T24vsT0_lrt$table$PValue, method="BH")
sum(ws_T24vsT0_FDR<0.05)
[1] 11
summary(decideTests(ws_T24vsT0_lrt))
Timepoint_ws_T24vsT0T24
Down 2
NotSig 140019
Up 9
# Plot the log-fold change against log-transcripts per million, with DE genes highlighted.
ws_T24vsT0_MDplot<-plotMD(ws_T24vsT0_lrt, hl.col=c("midnightblue","lightskyblue1"), main = NULL, xlab="Average log TPM") + abline(h=0,col="gray", lty="dashed")
ws_T24vsT0_MDplot
integer(0)
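The 5% FDR count above relies on the Benjamini-Hochberg (BH) adjustment. As a minimal sketch of what `p.adjust(..., method="BH")` computes, the example below applies it to a small set of hypothetical p-values (not taken from this analysis) and reproduces the result by hand. The variable names are introduced here for illustration only.

```r
# Hypothetical p-values for illustration only (not from the DE results)
p <- c(0.0001, 0.004, 0.019, 0.03, 0.2, 0.6)
m <- length(p)

# Manual BH: sort from largest to smallest, scale each p-value by m/rank,
# then take the running minimum so adjusted values stay monotone
o <- order(p, decreasing = TRUE)
bh_manual <- pmin(1, cummin(m / (m:1) * p[o]))[order(o)]

# Built-in equivalent used in this notebook
bh_builtin <- p.adjust(p, method = "BH")
all.equal(bh_manual, bh_builtin)

# Number of tests passing a 5% FDR threshold
n_sig <- sum(bh_builtin < 0.05)
n_sig
```

By default, `decideTests()` also uses the BH adjustment with a 0.05 threshold, which is why its Up and Down counts sum to the total given by `sum(... < 0.05)`.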
Now switch perspectives and determine which expressed genes were significantly differentially expressed between ecosystems (i.e., vegetation types) at the same sampling timepoint.
Compare gene expression under anoxic conditions (T0) between Wet Sedge and Tussocks. Click on the Show/Hide button to see the steps.
# Subset data from the full tpm_expressed dataset
ws_tuss_T0<-tpm_all_exp_ann[,c(2:5,14:20)]
# Aggregate (sum) gene-call counts that share the same KO and Taxonomy. This removes duplicate entries and makes the DE results easier to interpret biologically.
ws_tuss_T0<-aggregate(. ~ KO + Symbol + Function + Tier_II + Tier_III + Tier_IV + Taxonomy, data=ws_tuss_T0, FUN=sum)
## Reading in the data
# Create DGEList for Tuss vs WS at T0 subdata
ws_tuss_T0<-DGEList(counts=ws_tuss_T0[,8:11], genes=ws_tuss_T0[,1:7])
rownames(ws_tuss_T0$counts)<-rownames(ws_tuss_T0$genes)<-ws_tuss_T0$genes$ID
ws_tuss_T0$genes$ID<-NULL
## Normalization
# Tell edgeR that you don't want to normalize the subdata (already TPM normalized)
ws_tuss_T0<-calcNormFactors(ws_tuss_T0, method="none")
ws_tuss_T0$samples
## Data Exploration
# The plotMDS function produces a plot in which distances between samples correspond to leading biological coefficient of variation (BCV) between those samples
plotMDS(ws_tuss_T0, col=rep(1:2, each=2))
## The Design Matrix
# Before we fit negative binomial GLMs, we need to define our design matrix based on the experimental design. Here, we want to test for differential expression between vegetation types (WS vs. Tuss) at T0, adjusting for differences between replicate mesocosms. In statistics, this is an additive linear model with mesocosm as the blocking factor.
Mesocosm_ws_tuss_T0<-factor(c(1,2,1,2))
Vegetation_ws_tuss_T0<-factor(c("WS","WS","TUSS","TUSS"))
data.frame(Sample_ws_tuss_T0=colnames(ws_tuss_T0),Mesocosm_ws_tuss_T0,Vegetation_ws_tuss_T0)
ws_tuss_T0_design<-model.matrix(~Mesocosm_ws_tuss_T0+Vegetation_ws_tuss_T0)
rownames(ws_tuss_T0_design)<-colnames(ws_tuss_T0)
ws_tuss_T0_design
(Intercept) Mesocosm_ws_tuss_T02 Vegetation_ws_tuss_T0WS
S108379 1 0 1
S108381 1 1 1
S108382 1 0 0
S108383 1 1 0
attr(,"assign")
[1] 0 1 2
attr(,"contrasts")
attr(,"contrasts")$Mesocosm_ws_tuss_T0
[1] "contr.treatment"
attr(,"contrasts")$Vegetation_ws_tuss_T0
[1] "contr.treatment"
## Estimating Dispersion
# We estimate the NB dispersion for the dataset
ws_tuss_T0<-estimateDisp(ws_tuss_T0,ws_tuss_T0_design,robust=TRUE)
ws_tuss_T0$common.dispersion
[1] 0.2095678
# The square root of the common dispersion gives the coefficient of variation of biological variation.
sqrt(ws_tuss_T0$common.dispersion)
[1] 0.4577857
# View the dispersion estimates in a BCV plot
plotBCV(ws_tuss_T0)
## Differential Expression
# Now we proceed to determine differentially expressed genes by fitting genewise glms
ws_tuss_T0_fit<-glmFit(ws_tuss_T0,ws_tuss_T0_design)
# Conduct likelihood ratio tests for WS T0 vs TUSS T0 differences and show the top genes. The genewise tests are for WS T0 vs TUSS T0 differential expression, adjusting for baseline differences between the replicate mesocosms. The tests can be viewed as analogous to paired t-tests. The top DE tags have tiny p-values and FDR values, as well as large fold changes.
ws_tuss_T0_lrt<-glmLRT(ws_tuss_T0_fit)
ws_tuss_T0_topTags<-topTags(ws_tuss_T0_lrt,n=25)
ws_tuss_T0_topTags
Coefficient: Vegetation_ws_tuss_T0WS
## Here are the TPM-normalized values in individual samples.
# We see that all the genes have consistent changes between replicates within each vegetation type.
ws_tuss_T0_pvalue<-order(ws_tuss_T0_lrt$table$PValue)
cpm(ws_tuss_T0)[ws_tuss_T0_pvalue[1:25],]
S108379 S108381 S108382 S108383
[1,] 2.790502 2.1715893 1048.1520 1774.3058
[2,] 18.653991 23.3768187 3862.5422 9060.6678
[3,] 5.547236 10.5136483 1725.5008 3661.3731
[4,] 101.482342 81.4146821 19971.3520 20238.9159
[5,] 14.824705 5.9177545 3183.2604 2395.2434
[6,] 15.373237 15.9541956 3304.4782 3840.8938
[7,] 7.525094 9.3787376 1925.8803 2408.4338
[8,] 3.152842 1.8611792 683.5328 1448.8869
[9,] 1.834133 1.1484853 689.4594 854.7792
[10,] 17.040372 14.6767472 2870.1227 3780.1996
[11,] 3.263535 1.7826727 758.9130 1170.5083
[12,] 39.649128 47.6659745 8597.7390 6883.0791
[13,] 9.459767 7.6660223 1982.6558 1768.8644
[14,] 0.856733 2.4062734 491.0630 974.4714
[15,] 1.683924 0.7029522 616.0448 636.6969
[16,] 18.868062 23.7007072 5166.4870 2892.4947
[17,] 11.381785 15.3112141 2097.0478 2773.9525
[18,] 12.079508 11.6498504 1405.6115 3292.7357
[19,] 2.187497 3.0415660 711.5607 809.4788
[20,] 1.559618 0.9765914 372.5649 858.1113
[21,] 0.000000 0.0000000 222.6106 442.2323
[22,] 12.077353 7.3032488 1947.3882 1437.6220
[23,] 8.087159 9.5441367 1672.4565 1490.1581
[24,] 3.405617 3.7638208 721.9555 1044.8854
[25,] 32.748134 28.6056476 4359.9140 4150.2631
Plotting significant KOs in DGE analysis.
# The total number of differentially expressed genes at 5% FDR is given by:
ws_tuss_T0_FDR<-p.adjust(ws_tuss_T0_lrt$table$PValue, method="BH")
sum(ws_tuss_T0_FDR<0.05)
[1] 39971
summary(decideTests(ws_tuss_T0_lrt))
Vegetation_ws_tuss_T0WS
Down 12732
NotSig 126783
Up 27239
# Plot the log-fold change against log-transcripts per million, with DE genes highlighted.
ws_tuss_T0_MDplot<-plotMD(ws_tuss_T0_lrt, hl.col=c("palegreen2","lightskyblue2"), main = NULL, xlab="Average log TPM") + abline(h=0,col="gray", lty="dashed")
ws_tuss_T0_MDplot
integer(0)
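In the design above, R orders factor levels alphabetically, so `TUSS` becomes the reference level and the tested coefficient (`Vegetation_ws_tuss_T0WS`) is WS relative to Tuss. The sketch below rebuilds that design with base R only, using shortened, hypothetical variable names, to show which level is the baseline and how `relevel()` would flip the direction if that were preferred.

```r
# Same layout as above, with shortened (hypothetical) variable names
Mesocosm   <- factor(c(1, 2, 1, 2))
Vegetation <- factor(c("WS", "WS", "TUSS", "TUSS"))

design <- model.matrix(~ Mesocosm + Vegetation)
levels(Vegetation)[1]           # baseline level
colnames(design)[ncol(design)]  # last coefficient, tested by glmLRT() by default

# Making WS the baseline instead would test Tuss relative to WS
Vegetation_ws_ref <- relevel(Vegetation, ref = "WS")
design_ws_ref <- model.matrix(~ Mesocosm + Vegetation_ws_ref)
colnames(design_ws_ref)
```

With the original design, a positive logFC indicates higher expression in WS than in Tuss.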
Next, compare gene expression under oxic conditions (T4) between Wet Sedge and Tussocks. Click on the Show/Hide button to see the steps.
# Subset data from the full tpm_expressed dataset
ws_tuss_T4<-tpm_all_exp_ann[,c(6:7,10:11,14:20)]
# Aggregate (sum) gene-call counts that share the same KO and Taxonomy. This removes duplicate entries and makes the DE results easier to interpret biologically.
ws_tuss_T4<-aggregate(. ~ KO + Symbol + Function + Tier_II + Tier_III + Tier_IV + Taxonomy, data=ws_tuss_T4, FUN=sum)
## Reading in the data
# Create DGEList for Tuss vs WS at T4 subdata
ws_tuss_T4<-DGEList(counts=ws_tuss_T4[,8:11], genes=ws_tuss_T4[,1:7])
rownames(ws_tuss_T4$counts)<-rownames(ws_tuss_T4$genes)<-ws_tuss_T4$genes$ID
ws_tuss_T4$genes$ID<-NULL
## Normalization
# Tell edgeR that you don't want to normalize the subdata (already TPM normalized)
ws_tuss_T4<-calcNormFactors(ws_tuss_T4, method="none")
ws_tuss_T4$samples
## Data Exploration
# The plotMDS function produces a plot in which distances between samples correspond to leading biological coefficient of variation (BCV) between those samples
plotMDS(ws_tuss_T4, col=rep(1:2, each=2))
## The Design Matrix
# Before we fit negative binomial GLMs, we need to define our design matrix based on the experimental design. Here, we want to test for differential expression between vegetation types (WS vs. Tuss) at T4, adjusting for differences between replicate mesocosms. In statistics, this is an additive linear model with mesocosm as the blocking factor.
Mesocosm_ws_tuss_T4<-factor(c(1,2,1,2))
Vegetation_ws_tuss_T4<-factor(c("WS","WS","TUSS","TUSS"))
data.frame(Sample_ws_tuss_T4=colnames(ws_tuss_T4),Mesocosm_ws_tuss_T4,Vegetation_ws_tuss_T4)
ws_tuss_T4_design<-model.matrix(~Mesocosm_ws_tuss_T4+Vegetation_ws_tuss_T4)
rownames(ws_tuss_T4_design)<-colnames(ws_tuss_T4)
ws_tuss_T4_design
(Intercept) Mesocosm_ws_tuss_T42 Vegetation_ws_tuss_T4WS
S108385 1 0 1
S108386 1 1 1
S108391 1 0 0
S108392 1 1 0
attr(,"assign")
[1] 0 1 2
attr(,"contrasts")
attr(,"contrasts")$Mesocosm_ws_tuss_T4
[1] "contr.treatment"
attr(,"contrasts")$Vegetation_ws_tuss_T4
[1] "contr.treatment"
## Estimating Dispersion
# We estimate the NB dispersion for the dataset
ws_tuss_T4<-estimateDisp(ws_tuss_T4,ws_tuss_T4_design,robust=TRUE)
ws_tuss_T4$common.dispersion
[1] 0.1823877
# The square root of the common dispersion gives the coefficient of variation of biological variation.
sqrt(ws_tuss_T4$common.dispersion)
[1] 0.4270688
# View the dispersion estimates in a BCV plot
plotBCV(ws_tuss_T4)
## Differential Expression
# Now we proceed to determine differentially expressed genes by fitting genewise glms
ws_tuss_T4_fit<-glmFit(ws_tuss_T4,ws_tuss_T4_design)
# Conduct likelihood ratio tests for WS T4 vs TUSS T4 differences and show the top genes. The genewise tests are for WS T4 vs TUSS T4 differential expression, adjusting for baseline differences between the replicate mesocosms. The tests can be viewed as analogous to paired t-tests. The top DE tags have tiny p-values and FDR values, as well as large fold changes.
ws_tuss_T4_lrt<-glmLRT(ws_tuss_T4_fit)
topTags(ws_tuss_T4_lrt,n=25)
Coefficient: Vegetation_ws_tuss_T4WS
## Here are the TPM-normalized values in individual samples.
# We see that all the genes have consistent changes between replicates within each vegetation type.
ws_tuss_T4_pvalue<-order(ws_tuss_T4_lrt$table$PValue)
cpm(ws_tuss_T4)[ws_tuss_T4_pvalue[1:25],]
S108385 S108386 S108391 S108392
[1,] 8.0274233 11.651619 4175.2141 3827.0926
[2,] 2.8435944 3.267810 2432.0822 1366.7868
[3,] 5.7799711 8.972472 2681.1172 3047.4687
[4,] 8.1131950 9.058751 1997.5411 4226.8287
[5,] 61.5662670 104.047069 15987.8386 22081.8476
[6,] 21.1933255 24.981156 4987.9826 7030.0079
[7,] 13.0324201 28.054604 4664.0350 5636.6829
[8,] 4.6609588 2.597162 1415.1183 1593.5747
[9,] 2.0592362 2.492567 1311.6065 1042.3992
[10,] 0.0000000 1.546245 540.8269 1020.3616
[11,] 12.0780470 23.350111 6073.1910 3583.9212
[12,] 0.0000000 4.455884 636.2206 1453.9601
[13,] 5.6855278 4.648439 1648.6917 1757.2707
[14,] 15.7567944 24.984147 3412.5304 6198.6897
[15,] 2.6816646 5.463272 1435.1464 1415.1327
[16,] 3.4957521 15.027355 2096.7399 2581.4026
[17,] 9.3862380 19.792247 4673.4838 2830.0177
[18,] 1.4146780 2.082176 937.8244 910.3646
[19,] 6.0806606 4.212434 1488.4907 1718.9066
[20,] 0.8329413 4.461367 868.1466 1191.5025
[21,] 0.0000000 0.000000 331.1166 532.6308
[22,] 2.4356977 4.586270 1546.3077 951.6303
[23,] 8.6697544 10.885006 1721.0323 3019.0001
[24,] 5.1404376 8.433296 1438.6632 2061.2372
[25,] 5.9714493 7.283721 1646.3021 1654.2275
Plotting significant KOs in DGE analysis.
# The total number of differentially expressed genes at 5% FDR is given by:
ws_tuss_T4_FDR<-p.adjust(ws_tuss_T4_lrt$table$PValue, method="BH")
sum(ws_tuss_T4_FDR<0.05)
[1] 45878
summary(decideTests(ws_tuss_T4_lrt))
Vegetation_ws_tuss_T4WS
Down 13512
NotSig 120876
Up 32366
# Plot the log-fold change against log-transcripts per million, with DE genes highlighted.
ws_tuss_T4_MDplot<-plotMD(ws_tuss_T4_lrt, hl.col=c("green3","dodgerblue1"), main = NULL, xlab="Average log TPM") + abline(h=0,col="gray", lty="dashed")
ws_tuss_T4_MDplot
integer(0)
Finally, compare gene expression under mixed-redox conditions (T24) between Wet Sedge and Tussocks. Click on the Show/Hide button to see the steps.
# Subset data from the full tpm_expressed dataset
ws_tuss_T24<-tpm_all_exp_ann[,c(8:9,12:20)]
# Aggregate (sum) gene-call counts that share the same KO and Taxonomy. This removes duplicate entries and makes the DE results easier to interpret biologically.
ws_tuss_T24<-aggregate(. ~ KO + Symbol + Function + Tier_II + Tier_III + Tier_IV + Taxonomy, data=ws_tuss_T24, FUN=sum)
## Reading in the data
# Create DGEList for Tuss vs WS at T24 subdata
ws_tuss_T24<-DGEList(counts=ws_tuss_T24[,8:11], genes=ws_tuss_T24[,1:7])
rownames(ws_tuss_T24$counts)<-rownames(ws_tuss_T24$genes)<-ws_tuss_T24$genes$ID
ws_tuss_T24$genes$ID<-NULL
## Normalization
# Tell edgeR that you don't want to normalize the subdata (already TPM normalized)
ws_tuss_T24<-calcNormFactors(ws_tuss_T24, method="none")
ws_tuss_T24$samples
## Data Exploration
# The plotMDS function produces a plot in which distances between samples correspond to leading biological coefficient of variation (BCV) between those samples
plotMDS(ws_tuss_T24, col=rep(1:2, each=2))
## The Design Matrix
# Before we fit negative binomial GLMs, we need to define our design matrix based on the experimental design. Here, we want to test for differential expression between vegetation types (WS vs. Tuss) at T24, adjusting for differences between replicate mesocosms. In statistics, this is an additive linear model with mesocosm as the blocking factor.
Mesocosm_ws_tuss_T24<-factor(c(1,2,1,2))
Vegetation_ws_tuss_T24<-factor(c("WS","WS","TUSS","TUSS"))
data.frame(Sample_ws_tuss_T24=colnames(ws_tuss_T24),Mesocosm_ws_tuss_T24,Vegetation_ws_tuss_T24)
ws_tuss_T24_design<-model.matrix(~Mesocosm_ws_tuss_T24+Vegetation_ws_tuss_T24)
rownames(ws_tuss_T24_design)<-colnames(ws_tuss_T24)
ws_tuss_T24_design
(Intercept) Mesocosm_ws_tuss_T242 Vegetation_ws_tuss_T24WS
S108388 1 0 1
S108390 1 1 1
S108394 1 0 0
S108396 1 1 0
attr(,"assign")
[1] 0 1 2
attr(,"contrasts")
attr(,"contrasts")$Mesocosm_ws_tuss_T24
[1] "contr.treatment"
attr(,"contrasts")$Vegetation_ws_tuss_T24
[1] "contr.treatment"
## Estimating Dispersion
# We estimate the NB dispersion for the dataset
ws_tuss_T24<-estimateDisp(ws_tuss_T24,ws_tuss_T24_design,robust=TRUE)
ws_tuss_T24$common.dispersion
[1] 0.5676736
# The square root of the common dispersion gives the coefficient of variation of biological variation.
sqrt(ws_tuss_T24$common.dispersion)
[1] 0.7534412
# View the dispersion estimates in a BCV plot
plotBCV(ws_tuss_T24)
## Differential Expression
# Now we proceed to determine differentially expressed genes by fitting genewise glms
ws_tuss_T24_fit<-glmFit(ws_tuss_T24,ws_tuss_T24_design)
# Conduct likelihood ratio tests for WS T24 vs TUSS T24 differences and show the top genes. The genewise tests are for WS T24 vs TUSS T24 differential expression, adjusting for baseline differences between the replicate mesocosms. The tests can be viewed as analogous to paired t-tests. The top DE tags have tiny p-values and FDR values, as well as large fold changes.
ws_tuss_T24_lrt<-glmLRT(ws_tuss_T24_fit)
topTags(ws_tuss_T24_lrt,n=25)
Coefficient: Vegetation_ws_tuss_T24WS
## Here are the TPM-normalized values in individual samples.
# We see that all the genes have consistent changes between replicates within each vegetation type.
ws_tuss_T24_pvalue<-order(ws_tuss_T24_lrt$table$PValue)
cpm(ws_tuss_T24)[ws_tuss_T24_pvalue[1:25],]
S108388 S108390 S108394 S108396
[1,] 0.000000 0.000000 589.963445 417.5307
[2,] 371.013698 453.238248 0.000000 0.0000
[3,] 372.594535 214.590235 0.000000 0.0000
[4,] 1.920882 1.473356 1515.990843 629.6368
[5,] 1.775575 0.000000 1016.728192 250.2307
[6,] 687.952160 263.727529 1.216762 0.0000
[7,] 8.464239 0.000000 1238.236744 1321.0873
[8,] 165.797499 290.508099 0.000000 0.0000
[9,] 190.033341 240.415378 0.000000 0.0000
[10,] 0.000000 0.000000 183.228218 237.4590
[11,] 2.429625 1.845014 1412.637638 463.0927
[12,] 90.292033 413.736153 0.000000 0.0000
[13,] 8.489061 3.303170 2882.225623 878.3178
[14,] 0.000000 0.000000 310.311819 117.3602
[15,] 2.240023 1.576596 853.834799 573.3462
[16,] 188.544098 185.919975 0.000000 0.0000
[17,] 0.000000 0.000000 119.142851 278.1357
[18,] 0.000000 0.000000 167.062660 190.7137
[19,] 0.000000 0.000000 265.492941 118.6896
[20,] 6.723088 1.494865 916.987096 1529.7441
[21,] 0.000000 2.275517 894.839757 317.0247
[22,] 0.000000 0.000000 215.026006 136.0775
[23,] 0.000000 0.000000 239.182819 109.0986
[24,] 136.991697 185.449796 0.000000 0.0000
[25,] 3.556228 3.150716 810.327981 947.9758
Plotting significant KOs in DGE analysis.
# The total number of differentially expressed genes at 5% FDR is given by:
ws_tuss_T24_FDR<-p.adjust(ws_tuss_T24_lrt$table$PValue, method="BH")
sum(ws_tuss_T24_FDR<0.05)
[1] 16897
summary(decideTests(ws_tuss_T24_lrt))
Vegetation_ws_tuss_T24WS
Down 4774
NotSig 149857
Up 12123
# Plot the log-fold change against log-transcripts per million, with DE genes highlighted.
ws_tuss_T24_MDplot<-plotMD(ws_tuss_T24_lrt, hl.col=c("darkgreen","midnightblue"), main = NULL, xlab="Average log TPM") + abline(h=0,col="gray", lty="dashed")
ws_tuss_T24_MDplot
integer(0)
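The common dispersions reported for the three between-ecosystem comparisons can be placed side by side on the BCV scale. The values below are copied from the `estimateDisp()` output above; the vector and its names are introduced here only for illustration.

```r
# Common dispersions copied from the output above (T0, T4, T24 comparisons)
common_disp <- c(T0 = 0.2095678, T4 = 0.1823877, T24 = 0.5676736)

# BCV is the square root of the negative binomial dispersion
bcv <- sqrt(common_disp)
round(bcv, 3)

# Comparison with the largest replicate-to-replicate variability
names(which.max(bcv))
```

By this measure, replicate variability is highest in the T24 comparison, which may partly explain why that comparison yields the fewest genes at 5% FDR.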
Gene expression related to iron acquisition, iron redox cycling, and iron storage was determined using FeGenie (Garber et al. 2020). FeGenie provides a comprehensive database of hidden Markov models (HMMs) for genes involved in iron acquisition, storage, and oxidation/reduction in Bacteria and Archaea. These genes are generally not annotated as iron-related by established gene annotation pipelines such as GhostKOALA, which was used to annotate the metatranscriptomes presented in this study.
Here, the quality-controlled sequencing reads for each metatranscriptome sample were assembled separately using MEGAHIT to generate a sample-specific contigs file. The metatranscriptome contigs files were used as input for FeGenie, which first predicted open-reading frames (ORFs) using Prodigal and then queried them against a custom library of HMMs using hmmsearch, with custom bit score cutoffs for each HMM. Results from FeGenie included all identified putative iron-related genes, their functional category, bit score, number of canonical heme-binding motifs, amino acid sequence, and closest homolog. Counts within each iron gene category were summarized as their relative expression against all identified iron genes.
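As a minimal sketch of the final summary step, relative expression within a sample can be computed by dividing each category's count by the total across all iron gene categories. The counts below are hypothetical, and the category names simply reuse labels from the plots in this section; FeGenie's own output format is not reproduced here.

```r
# Hypothetical counts of iron-related genes per category for one sample
iron_counts <- c("Iron Oxidation" = 12, "Iron Reduction" = 3,
                 "Siderophore Transport" = 45, "Iron Storage" = 40)

# Relative expression (%) against all identified iron genes in that sample
iron_rel <- 100 * iron_counts / sum(iron_counts)
round(iron_rel, 1)
```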
Assemble each QC’d metatranscriptome sample for use in FeGenie
{Terminal}
module load Bioinformatics megahit/1.2.8
megahit --k-min 21 --k-max 141 --k-step 12 -1 Sample_108379_fwd.good.fastq -2 Sample_108379_rev.good.fastq -o ./megahit_108379_out
megahit --k-min 21 --k-max 141 --k-step 12 -1 Sample_108381_fwd.good.fastq -2 Sample_108381_rev.good.fastq -o ./megahit_108381_out
megahit --k-min 21 --k-max 141 --k-step 12 -1 Sample_108382_fwd.good.fastq -2 Sample_108382_rev.good.fastq -o ./megahit_108382_out
megahit --k-min 21 --k-max 141 --k-step 12 -1 Sample_108383_fwd.good.fastq -2 Sample_108383_rev.good.fastq -o ./megahit_108383_out
megahit --k-min 21 --k-max 141 --k-step 12 -1 Sample_108385_fwd.good.fastq -2 Sample_108385_rev.good.fastq -o ./megahit_108385_out
megahit --k-min 21 --k-max 141 --k-step 12 -1 Sample_108386_fwd.good.fastq -2 Sample_108386_rev.good.fastq -o ./megahit_108386_out
megahit --k-min 21 --k-max 141 --k-step 12 -1 Sample_108388_fwd.good.fastq -2 Sample_108388_rev.good.fastq -o ./megahit_108388_out
megahit --k-min 21 --k-max 141 --k-step 12 -1 Sample_108390_fwd.good.fastq -2 Sample_108390_rev.good.fastq -o ./megahit_108390_out
megahit --k-min 21 --k-max 141 --k-step 12 -1 Sample_108391_fwd.good.fastq -2 Sample_108391_rev.good.fastq -o ./megahit_108391_out
megahit --k-min 21 --k-max 141 --k-step 12 -1 Sample_108392_fwd.good.fastq -2 Sample_108392_rev.good.fastq -o ./megahit_108392_out
megahit --k-min 21 --k-max 141 --k-step 12 -1 Sample_108394_fwd.good.fastq -2 Sample_108394_rev.good.fastq -o ./megahit_108394_out
megahit --k-min 21 --k-max 141 --k-step 12 -1 Sample_108396_fwd.good.fastq -2 Sample_108396_rev.good.fastq -o ./megahit_108396_out
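The twelve assembly commands above differ only in the sample ID. As an optional sketch, the same command strings could be generated in R from a vector of IDs; `sample_ids` and `cmds` are names introduced here for illustration, and the commands would still be run in the terminal.

```r
# Sample IDs taken from the commands above
sample_ids <- c("108379", "108381", "108382", "108383", "108385", "108386",
                "108388", "108390", "108391", "108392", "108394", "108396")

# Fill the same MEGAHIT command template once per sample
cmds <- sprintf(
  "megahit --k-min 21 --k-max 141 --k-step 12 -1 Sample_%1$s_fwd.good.fastq -2 Sample_%1$s_rev.good.fastq -o ./megahit_%1$s_out",
  sample_ids
)
writeLines(cmds)
```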
FeGenie is a Python script, with dependencies, that can be downloaded from the Arkadiy-Garber GitHub repository.
Start by loading prerequisite modules
{Terminal}
module load python3.6-anaconda Bioinformatics hmmer prodigal ncbi-blast R
Run the FeGenie python script
{Terminal}
FeGenie_rna.sh
# This is what's inside the "FeGenie_rna.sh" script
FeGenie.py -bin_dir /Path/To/assembly_fasta -bin_ext fa -out FeGenie_rna_out
Results from FeGenie analysis using Garber et al. 2020 R-scripts.
l = length(FeGenie_heatmap_data_organized)
# Transpose columns 2 through l-1 (all but the first and last columns)
fegenie.t <- t(FeGenie_heatmap_data_organized[, 2:(l - 1)])
fegenie.matrix = as.matrix(fegenie.t)
colnames(fegenie.t) <- as.vector(FeGenie_heatmap_data_organized$V1)
FeGenie <- fegenie.t
#-------------data preparation
# convert to data frame
FeGenie.data <- as.data.frame(FeGenie)
#FeGenie.data<-FeGenie.data[1,]
#class(FeGenie.data)
#(FeGenie.data)
# ----------- melt data
FeGenie.data.melt <- melt(FeGenie.data, id.vars = 1:1)
# rename columns
colnames(FeGenie.data.melt)[colnames(FeGenie.data.melt)=="variable"] <- "Iron_category"
colnames(FeGenie.data.melt)[colnames(FeGenie.data.melt)=="value"] <- "Normalized_gene_abundance"
# ----------------- output files
FeGenie.data.melt$Normalized_gene_abundance = as.character(FeGenie.data.melt$Normalized_gene_abundance)
FeGenie.data.melt$Normalized_gene_abundance = as.numeric(FeGenie.data.melt$Normalized_gene_abundance)
FeGenie.meta.plot <- ggplot(FeGenie.data.melt, aes(x = X, y = Iron_category, size = Normalized_gene_abundance), alpha=0.7) +
geom_point(aes(color=Iron_category)) +
scale_size_area(max_size = 10) +
labs(x="", y="Iron Category") +
scale_y_discrete(labels=c("iron_aquisition-iron_transport" = "Iron Transport",
"iron_aquisition-heme_transport" = "Heme Transport",
"iron_aquisition-heme_oxygenase" = "Heme Oxygenase",
"iron_aquisition-siderophore_synthesis" = "Siderophore Synthesis",
"iron_aquisition-siderophore_transport" = "Siderophore Transport",
"iron_gene_regulation" = "Iron Gene Regulation",
"iron_oxidation" = "Iron Oxidation",
"possible_iron_oxidation_and_possible_iron_reduction" = "Probable Iron Oxidation",
"probable_iron_reduction" = "Probable Iron Reduction",
"iron_reduction" = "Iron Reduction",
"iron_storage" = "Iron Storage",
"magnetosome_formation" = "Magnetosome Formation",
"iron_aquisition-iron_uptake" = "Iron Uptake",
"iron_aquisition-heme_uptake" = "Heme Uptake",
"iron_aquisition-heme_lyase" = "Heme Lyase",
"iron_aquisition-siderophore_uptake" = "Siderophore Uptake")) +
theme(panel.background = element_rect(fill = "white", colour = "black", size = 1, linetype = "solid"),
panel.border = element_rect(colour="black", size=1, fill=NA),
strip.background=element_rect(fill='white', colour='white'),
strip.text = element_text(face="bold", size=10),
panel.grid.major = element_line(size = 0.1, colour = "gray"),
panel.grid.minor = element_line(size = 0.1, colour = "gray"),
axis.text = element_text(size=12, colour="black"),
axis.title.y = element_blank(),
axis.text.x = element_text(vjust = 1, angle = 270, color = "black", size = 12, hjust=1), legend.position="right") + guides(color = FALSE) + labs(size="Normalized\nGene Abundance")
FeGenie.meta.plot
Plot the relative iron gene abundance data from the Metagenomes (MG) and Metatranscriptomes (MT) within Tuss.
# Place categories in the preferred order for plotting
tuss_dna_fegenie_bardata$Iron <- factor(tuss_dna_fegenie_bardata$Iron,levels = c("Iron Oxidation", "Iron Reduction", "Siderophore Transport", "Iron Storage"))
tuss_dna_fegenie_bardata$Sample <- factor(tuss_dna_fegenie_bardata$Sample,levels=c("Tuss-MG","Tuss-MT-T0","Tuss-MT-T4","Tuss-MT-T24"))
tuss_dna_fegenie_barplot<-ggplot(tuss_dna_fegenie_bardata, aes(x = Iron, y = Mean, fill = Sample)) + geom_bar(stat = "identity", position = "dodge", color="black") + geom_errorbar(aes(ymin = Mean - SD, ymax = Mean + SD), width = 0.2, position = position_dodge(0.9)) + ylab(expression(atop("Microbial Iron Genes", paste("Relative Abundance (%)")))) + theme_classic() + theme(axis.text=element_text(size=10), axis.title=element_text(size=12), axis.title.x=element_blank()) + theme(legend.position = "bottom", legend.title=element_blank(), legend.text=element_text(size=8), panel.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), panel.border = element_blank()) + scale_size(guide=FALSE) + scale_fill_manual(values=c("black", "palegreen", "green3", "darkgreen")) + scale_x_discrete(labels = function(Iron) str_wrap(Iron, width = 10)) + scale_y_continuous(expand = c(0, 0), limits = c(0, 100)) + annotate(geom="text", x=0.9, y=12, label="a") + annotate(geom="text", x=1.12, y=22, label="b") + annotate(geom="text", x=1.35, y=18, label="ab") + annotate(geom="text", x=2.12, y=12, label="italic(N.S.)", parse=TRUE) + annotate(geom="text", x=3.12, y=70, label="italic(N.S.)", parse=TRUE) + annotate(geom="text", x=4.12, y=50, label="italic(N.S.)", parse=TRUE) + geom_segment(aes(x = 1.12, y = 35, xend = 1.12, yend = 50)) + geom_segment(aes(x = 1.12, y = 50, xend = 2.12, yend = 50)) + geom_segment(aes(x = 2.12, y = 50, xend = 2.12, yend = 20)) + annotate("text", x = 1.635, y = 54, size=3, label = "Paired~t-test~(italic(p) == 0.004)", parse = TRUE) + annotate("text", x = 1.635, y = 61, size=4, label = "Mean MT Difference (8%)")
tuss_dna_fegenie_barplot
# Place categories in the preferred order for plotting
tuss_dna_fegenie_redox_bardata$Iron <- factor(tuss_dna_fegenie_redox_bardata$Iron,levels = c("Iron Oxidation", "Iron Reduction"))
tuss_dna_fegenie_redox_bardata$Sample <- factor(tuss_dna_fegenie_redox_bardata$Sample,levels=c("Tuss-MG","Tuss-MT-T0","Tuss-MT-T4","Tuss-MT-T24"))
tuss_dna_fegenie_redox_barplot<-ggplot(tuss_dna_fegenie_redox_bardata, aes(x = Iron, y = Mean, fill = Sample)) + geom_bar(stat = "identity", position = "dodge", color="black") + geom_errorbar(aes(ymin = Mean - SD, ymax = Mean + SD), width = 0.2, position = position_dodge(0.9)) + ylab(expression(atop("Iron Redox Cycling Genes", paste("Relative Abundance (%)")))) + theme_classic() + theme(axis.text=element_text(size=10), axis.title=element_text(size=12), axis.title.x=element_blank()) + theme(legend.position = "bottom", legend.title=element_blank(), legend.text=element_text(size=8), panel.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), panel.border = element_blank()) + scale_size(guide=FALSE) + scale_fill_manual(values=c("gray83", "palegreen", "green3", "darkgreen")) + scale_x_discrete(labels = function(Iron) str_wrap(Iron, width = 10)) + scale_y_continuous(expand = c(0, 0), limits = c(0, 40)) + annotate(geom="text", x=0.89, y=8, label="a") + annotate(geom="text", x=1.12, y=18, label="b") + annotate(geom="text", x=1.35, y=13, label="ab") + annotate(geom="text", x=2.12, y=4, label="italic(N.S.)", parse=TRUE) + geom_segment(aes(x = 1.12, y = 22, xend = 1.12, yend = 30)) + geom_segment(aes(x = 1.12, y = 30, xend = 2.12, yend = 30)) + geom_segment(aes(x = 2.12, y = 30, xend = 2.12, yend = 7)) + annotate("text", x = 1.635, y = 32, size=3, label = "Paired~t-test~(italic(p) == 0.004)", parse = TRUE) + annotate("text", x = 1.635, y = 34, size=4, label = "Mean MT Difference (43x)")
tuss_dna_fegenie_redox_barplot
Run statistical tests on the METATRANSCRIPTOMES to determine if significant differences exist between sampling timepoints for each iron gene category.
# Subset response variables for MANOVA
tuss_fegenie_stats$response <- as.matrix(tuss_fegenie_stats[, 2:5])
# MANOVA test
tuss_fegenie_manova <- manova(response ~ Timepoint, data=tuss_fegenie_stats)
summary.aov(tuss_fegenie_manova)
Response Iron.Oxidation :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 67.60 33.800 11.317 0.04004 *
Residuals 3 8.96 2.987
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Response Iron.Reduction :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 0.031826 0.0159132 2.0863 0.2705
Residuals 3 0.022882 0.0076273
Response Siderophore.Transport :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 48.840 24.42 1.1401 0.4283
Residuals 3 64.259 21.42
Response Iron.Storage :
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 37.739 18.869 0.8012 0.5263
Residuals 3 70.655 23.552
## Run a one-way ANOVA with Tukey HSD for each category of interest (significant timepoint effect in the tests above)
# Iron Oxidation
tuss_fegenie_stat1<-aov(Iron.Oxidation~Timepoint,data=tuss_fegenie_stats)
summary.aov(tuss_fegenie_stat1)
Df Sum Sq Mean Sq F value Pr(>F)
Timepoint 2 67.60 33.80 11.32 0.04 *
Residuals 3 8.96 2.99
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(tuss_fegenie_stat1)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Iron.Oxidation ~ Timepoint, data = tuss_fegenie_stats)
$Timepoint
diff lwr upr p adj
T24-T0 2.001337 -5.2204727 9.223146 0.5490550
T4-T0 7.906886 0.6850767 15.128696 0.0394294
T4-T24 5.905549 -1.3162602 13.127359 0.0828858
# t-test for differences between reduction and oxidation averaged across experiment
t.test(tuss_fegenie_stats$Iron.Oxidation,tuss_fegenie_stats$Iron.Reduction,paired=TRUE)
Paired t-test
data: tuss_fegenie_stats$Iron.Oxidation and tuss_fegenie_stats$Iron.Reduction
t = 5.2033, df = 5, p-value = 0.003458
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
4.138735 12.220977
sample estimates:
mean of the differences
8.179856
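As a consistency check on the paired t-test above, the p-value and confidence interval can be approximately reconstructed from the reported t statistic, degrees of freedom, and mean difference. This is a sketch using the rounded values printed in the output, so small discrepancies in the last digits are expected; the variable names are introduced here for illustration.

```r
# Values copied from the paired t-test output above
t_stat    <- 5.2033
df        <- 5
mean_diff <- 8.179856

# Standard error of the mean difference, recovered from t = mean_diff / se
se <- mean_diff / t_stat

# Two-sided p-value and 95% confidence interval
p_value <- 2 * pt(-abs(t_stat), df)
ci <- mean_diff + c(-1, 1) * qt(0.975, df) * se
signif(p_value, 4)
round(ci, 3)
```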
Plot the relative iron gene abundance data from the Metagenomes (MG) and Metatranscriptomes (MT) within WS.
# Place categories in the preferred order for plotting
ws_dna_fegenie_bardata$Iron <- factor(ws_dna_fegenie_bardata$Iron,levels = c("Iron Oxidation", "Iron Reduction", "Siderophore Transport", "Iron Storage"))
ws_dna_fegenie_bardata$Sample <- factor(ws_dna_fegenie_bardata$Sample,levels=c("WS-MG","WS-MT-T0","WS-MT-T4","WS-MT-T24"))
ws_dna_fegenie_barplot<-ggplot(ws_dna_fegenie_bardata, aes(x = Iron, y = Mean, fill = Sample)) + geom_bar(stat = "identity", position = "dodge") + geom_errorbar(aes(ymin = Mean - SD, ymax = Mean + SD), width = 0.2, position = position_dodge(0.9)) + ylab(expression(atop("Microbial Iron Genes", paste("Relative Abundance (%)")))) + theme_classic() + theme(axis.text=element_text(size=10), axis.title=element_text(size=12), axis.title.x=element_blank()) + theme(legend.position = "bottom", legend.title=element_blank(), legend.text=element_text(size=8), panel.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), panel.border = element_blank()) + scale_size(guide=FALSE) + scale_fill_manual(values=c("black","lightskyblue1", "dodgerblue1", "midnightblue")) + scale_x_discrete(labels = function(Iron) str_wrap(Iron, width = 10)) + scale_y_continuous(expand = c(0, 0), limits = c(0, 100)) + annotate(geom="text", x=1.12, y=20, label="italic(N.S.)", parse=TRUE) + annotate(geom="text", x=2.12, y=30, label="italic(N.S.)", parse=TRUE) + annotate(geom="text", x=3.12, y=60, label="italic(N.S.)", parse=TRUE) + annotate(geom="text", x=4.12, y=60, label="italic(N.S.)", parse=TRUE) + geom_segment(aes(x = 1.12, y = 30, xend = 1.12, yend = 50)) + geom_segment(aes(x = 1.12, y = 50, xend = 2.12, yend = 50)) + geom_segment(aes(x = 2.12, y = 50, xend = 2.12, yend = 45)) + annotate("text", x = 1.635, y = 54, size=3, label = "Paired~t-test~(italic(p) == 0.005)", parse = TRUE) + annotate("text", x = 1.635, y = 61, size=4, label = "Mean MT Difference (14%)")
# Place categories in the preferred order for plotting
ws_dna_fegenie_redox_bardata$Iron <- factor(ws_dna_fegenie_redox_bardata$Iron,levels = c("Iron Oxidation", "Iron Reduction"))
ws_dna_fegenie_redox_bardata$Sample <- factor(ws_dna_fegenie_redox_bardata$Sample,levels=c("WS-MG","WS-MT-T0","WS-MT-T4","WS-MT-T24"))
ws_dna_fegenie_redox_barplot<-ggplot(ws_dna_fegenie_redox_bardata, aes(x = Iron, y = Mean, fill = Sample)) + geom_bar(stat = "identity", position = "dodge", color = "black") + geom_errorbar(aes(ymin = Mean - SD, ymax = Mean + SD), width = 0.2, position = position_dodge(0.9)) + ylab(expression(atop("Iron Redox Cycling Genes", paste("Relative Abundance (%)")))) + theme_classic() + theme(axis.text=element_text(size=10), axis.title=element_text(size=12), axis.title.x=element_blank()) + theme(legend.position = "bottom", legend.title=element_blank(), legend.text=element_text(size=8), panel.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), panel.border = element_blank()) + scale_size(guide=FALSE) + scale_fill_manual(values=c("black","lightskyblue1", "dodgerblue1", "midnightblue")) + scale_x_discrete(labels = function(Iron) str_wrap(Iron, width = 10)) + scale_y_continuous(expand = c(0, 0), limits = c(0, 40)) + annotate(geom="text", x=1.12, y=8.5, label="italic(N.S.)", parse=TRUE) + annotate(geom="text", x=2.12, y=25, label="italic(N.S.)", parse=TRUE) + geom_segment(aes(x = 1.12, y = 10, xend = 1.12, yend = 30)) + geom_segment(aes(x = 1.12, y = 30, xend = 2.12, yend = 30)) + geom_segment(aes(x = 2.12, y = 30, xend = 2.12, yend = 27)) + annotate("text", x = 1.635, y = 32, size=3, label = "Paired~t-test~(italic(p) == 0.005)", parse = TRUE) + annotate("text", x = 1.635, y = 34, size=4, label = "Mean MT Difference (4x)")
ws_dna_fegenie_redox_barplot
Run statistical tests on the METATRANSCRIPTOMES to determine if significant differences exist between sampling timepoints for each iron gene category.
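# Combine the four iron gene categories (columns 2-5) into a response matrix
ws_fegenie_stats$response <- as.matrix(ws_fegenie_stats[, 2:5])
# Fit the MANOVA model with Timepoint as the predictor
ws_fegenie_manova <- manova(response ~ Timepoint, data=ws_fegenie_stats)
# Report a separate univariate ANOVA for each response variable
summary.aov(ws_fegenie_manova)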
Response Iron.Oxidation :
            Df Sum Sq Mean Sq F value Pr(>F)
Timepoint    2 6.0375 3.01876  3.2344 0.1783
Residuals    3 2.8000 0.93333

Response Iron.Reduction :
            Df Sum Sq Mean Sq F value Pr(>F)
Timepoint    2 47.435 23.7177  3.8592 0.1481
Residuals    3 18.437  6.1457

Response Siderophore.Transport :
            Df  Sum Sq Mean Sq F value Pr(>F)
Timepoint    2  80.866  40.433  0.5203 0.6397
Residuals    3 233.115  77.705

Response Iron.Storage :
            Df Sum Sq Mean Sq F value Pr(>F)
Timepoint    2  11.59   5.795  0.1468 0.8693
Residuals    3 118.38  39.461
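Each F value and p-value in the univariate ANOVA output above can be checked by hand from the mean squares and degrees of freedom. The following is a minimal sketch of that arithmetic for the Iron.Oxidation response, using only the values printed above. The variable names are illustrative and are not part of the original analysis.

```r
# Values copied from the Iron.Oxidation ANOVA output above
ms_timepoint <- 3.01876   # mean square for Timepoint (Df = 2)
ms_residual  <- 0.93333   # mean square for Residuals (Df = 3)

# F statistic: ratio of the Timepoint mean square to the residual mean square
f_value <- ms_timepoint / ms_residual

# Upper-tail probability of the F distribution with 2 and 3 degrees of freedom
p_value <- pf(f_value, df1 = 2, df2 = 3, lower.tail = FALSE)

round(f_value, 4)
round(p_value, 4)
```

The two rounded values should agree with the F value and Pr(>F) reported for Iron.Oxidation. With only 3 residual degrees of freedom, a large F value is needed to reach significance, which is consistent with none of the four gene categories differing among timepoints.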
# Paired t-test comparing iron oxidation and iron reduction across all samples (timepoints pooled)
t.test(ws_fegenie_stats$Iron.Oxidation,ws_fegenie_stats$Iron.Reduction,paired=TRUE)
Paired t-test
data: ws_fegenie_stats$Iron.Oxidation and ws_fegenie_stats$Iron.Reduction
t = -9.9638, df = 5, p-value = 0.0001739
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-17.76968 -10.48120
sample estimates:
mean of the differences
-14.12544
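The confidence interval and p-value in the paired t-test output above follow directly from the mean difference, the t statistic, and the degrees of freedom. The sketch below reconstructs them from the printed values only. It assumes the standard paired t-test relationship t = mean difference / standard error, and the variable names are illustrative.

```r
# Values copied from the paired t-test output above
mean_diff <- -14.12544   # mean of the paired differences (oxidation minus reduction)
t_stat    <- -9.9638     # reported t statistic
deg_free  <- 5           # degrees of freedom (6 paired samples minus 1)

# Standard error of the mean difference, recovered from t = mean_diff / se
se_diff <- mean_diff / t_stat

# 95% confidence interval: mean difference +/- critical t value times the standard error
ci_95 <- mean_diff + c(-1, 1) * qt(0.975, deg_free) * se_diff

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), deg_free)

round(ci_95, 3)
signif(p_value, 3)
```

The interval excludes zero, so iron reduction genes were expressed at a higher relative abundance than iron oxidation genes by roughly 10 to 18 percentage points across the experiment.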
Ferroptosis is a form of regulated cell death that results from the iron-dependent accumulation of lipid peroxides associated with reactive oxygen species such as the hydroxyl radical. Here, gene expression for ferroptosis is likely linked directly to the overproduction of hydroxyl radical, given that hydroxyl radical production associated with the geochemical oxidation of Fe(II) has been shown to increase in tussock tundra following rainfall. However, little is known about how ferroptosis is regulated in soil environments, or within microbial cells in any environment. Further studies are needed to test for a direct link between hydroxyl radical production and ferroptosis gene expression following rainfall.
# Keep only genes annotated to the KEGG ferroptosis pathway (ko04216),
# retaining the columns from ID through Taxonomy
tuss_ferroptosis <- subset(tpm_all_exp_ann, Tier_IV == "04216 Ferroptosis [PATH:ko04216]", select = ID:Taxonomy)
tuss_ferroptosis
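The pathway filter above combines a row condition with a column range in a single subset() call. The sketch below shows the same pattern on a small, made-up annotation table. The toy data frame, its gene IDs, and the T0 and Notes columns are hypothetical and are used only to illustrate how select = ID:Taxonomy keeps every column positioned between ID and Taxonomy.

```r
# Hypothetical annotation table (not the real tpm_all_exp_ann object)
toy_annotations <- data.frame(
  ID       = c("gene_1", "gene_2", "gene_3"),
  T0       = c(1.2, 0.0, 3.4),
  Tier_IV  = c("04216 Ferroptosis [PATH:ko04216]",
               "00190 Oxidative phosphorylation [PATH:ko00190]",
               "04216 Ferroptosis [PATH:ko04216]"),
  Taxonomy = c("Bacteria", "Archaea", "Bacteria"),
  Notes    = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Rows: exact match on the pathway label; columns: ID through Taxonomy (Notes is dropped)
toy_ferroptosis <- subset(toy_annotations,
                          Tier_IV == "04216 Ferroptosis [PATH:ko04216]",
                          select = ID:Taxonomy)

toy_ferroptosis
```

Because the row condition is an exact string match, any change to the pathway label in the annotation table (for example, extra whitespace) would silently return zero rows, so it may be worth checking nrow() of the result.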
The session information is provided for full reproducibility.
devtools::session_info()
─ Session info ─────────────────────────────────────────────────────────────────────────────────────
setting value
version R version 4.1.2 (2021-11-01)
os macOS Big Sur 10.16
system x86_64, darwin17.0
ui RStudio
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz America/Los_Angeles
date 2022-01-24
rstudio 1.4.1106 Tiger Daylily (desktop)
pandoc 2.11.4 @ /Applications/RStudio.app/Contents/MacOS/pandoc/ (via rmarkdown)
─ Packages ─────────────────────────────────────────────────────────────────────────────────────────
! package * version date (UTC) lib source
abind 1.4-5 2016-07-21 [1] CRAN (R 4.1.0)
ape * 5.6-1 2022-01-07 [1] CRAN (R 4.1.2)
assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.1.0)
backports 1.4.1 2021-12-13 [1] CRAN (R 4.1.0)
BiocManager 1.30.16 2021-06-15 [1] CRAN (R 4.1.0)
brio 1.1.3 2021-11-30 [1] CRAN (R 4.1.0)
broom 0.7.11 2022-01-03 [1] CRAN (R 4.1.2)
cachem 1.0.6 2021-08-19 [1] CRAN (R 4.1.0)
callr 3.7.0 2021-04-20 [1] CRAN (R 4.1.0)
car 3.0-12 2021-11-06 [1] CRAN (R 4.1.0)
carData 3.0-5 2022-01-06 [1] CRAN (R 4.1.2)
cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.1.0)
cli 3.1.1 2022-01-20 [1] CRAN (R 4.1.2)
cluster 2.1.2 2021-04-17 [1] CRAN (R 4.1.2)
colorspace 2.0-2 2021-06-24 [1] CRAN (R 4.1.0)
cowplot * 1.1.1 2020-12-30 [1] CRAN (R 4.1.0)
crayon 1.4.2 2021-10-29 [1] CRAN (R 4.1.0)
data.table * 1.14.2 2021-09-27 [1] CRAN (R 4.1.0)
DBI 1.1.2 2021-12-20 [1] CRAN (R 4.1.0)
dbplyr 2.1.1 2021-04-06 [1] CRAN (R 4.1.0)
desc 1.4.0 2021-09-28 [1] CRAN (R 4.1.0)
devtools * 2.4.3 2021-11-30 [1] CRAN (R 4.1.0)
digest 0.6.29 2021-12-01 [1] CRAN (R 4.1.0)
dplyr * 1.0.7 2021-06-18 [1] CRAN (R 4.1.0)
DT * 0.20 2021-11-15 [1] CRAN (R 4.1.0)
edgeR * 3.36.0 2021-10-26 [1] Bioconductor
ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.1.0)
evaluate 0.14 2019-05-28 [1] CRAN (R 4.1.0)
fansi 1.0.2 2022-01-14 [1] CRAN (R 4.1.2)
farver 2.1.0 2021-02-28 [1] CRAN (R 4.1.0)
fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.1.0)
forcats * 0.5.1 2021-01-27 [1] CRAN (R 4.1.0)
fs 1.5.2 2021-12-08 [1] CRAN (R 4.1.0)
generics 0.1.1 2021-10-25 [1] CRAN (R 4.1.0)
ggplot2 * 3.3.5 2021-06-25 [1] CRAN (R 4.1.0)
ggpubr * 0.4.0 2020-06-27 [1] CRAN (R 4.1.0)
ggsignif 0.6.3 2021-09-09 [1] CRAN (R 4.1.0)
V glue 1.6.0 2022-01-22 [1] CRAN (R 4.1.2) (on disk 1.6.1)
gridExtra * 2.3 2017-09-09 [1] CRAN (R 4.1.0)
gtable 0.3.0 2019-03-25 [1] CRAN (R 4.1.0)
haven 2.4.3 2021-08-04 [1] CRAN (R 4.1.0)
highr 0.9 2021-04-16 [1] CRAN (R 4.1.0)
hms 1.1.1 2021-09-26 [1] CRAN (R 4.1.0)
htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.1.0)
htmlwidgets 1.5.4 2021-09-08 [1] CRAN (R 4.1.0)
httr 1.4.2 2020-07-20 [1] CRAN (R 4.1.0)
jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.1.0)
jsonlite 1.7.3 2022-01-17 [1] CRAN (R 4.1.2)
kableExtra * 1.3.4 2021-02-20 [1] CRAN (R 4.1.0)
knitr * 1.37 2021-12-16 [1] CRAN (R 4.1.0)
labeling 0.4.2 2020-10-20 [1] CRAN (R 4.1.0)
lattice * 0.20-45 2021-09-22 [1] CRAN (R 4.1.2)
lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.1.0)
limma * 3.50.0 2021-10-26 [1] Bioconductor
locfit 1.5-9.4 2020-03-25 [1] CRAN (R 4.1.0)
lubridate 1.8.0 2021-10-07 [1] CRAN (R 4.1.0)
magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.1.0)
MASS 7.3-55 2022-01-13 [1] CRAN (R 4.1.2)
Matrix 1.4-0 2021-12-08 [1] CRAN (R 4.1.0)
memoise 2.0.1 2021-11-26 [1] CRAN (R 4.1.0)
mgcv 1.8-38 2021-10-06 [1] CRAN (R 4.1.2)
modelr 0.1.8 2020-05-19 [1] CRAN (R 4.1.0)
munsell 0.5.0 2018-06-12 [1] CRAN (R 4.1.0)
nlme 3.1-155 2022-01-13 [1] CRAN (R 4.1.2)
permute * 0.9-5 2019-03-12 [1] CRAN (R 4.1.0)
pheatmap * 1.0.12 2019-01-04 [1] CRAN (R 4.1.0)
pillar 1.6.4 2021-10-18 [1] CRAN (R 4.1.0)
pkgbuild 1.3.1 2021-12-20 [1] CRAN (R 4.1.0)
pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.1.0)
pkgload 1.2.4 2021-11-30 [1] CRAN (R 4.1.0)
plyr 1.8.6 2020-03-03 [1] CRAN (R 4.1.0)
png * 0.1-7 2013-12-03 [1] CRAN (R 4.1.0)
prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.1.0)
processx 3.5.2 2021-04-30 [1] CRAN (R 4.1.0)
ps 1.6.0 2021-02-28 [1] CRAN (R 4.1.0)
purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.1.0)
R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.0)
RColorBrewer * 1.1-2 2014-12-07 [1] CRAN (R 4.1.0)
Rcpp 1.0.8 2022-01-13 [1] CRAN (R 4.1.2)
readr * 2.1.1 2021-11-30 [1] CRAN (R 4.1.0)
readxl 1.3.1 2019-03-13 [1] CRAN (R 4.1.0)
remotes 2.4.2 2021-11-30 [1] CRAN (R 4.1.0)
reprex 2.0.1 2021-08-05 [1] CRAN (R 4.1.0)
reshape * 0.8.8 2018-10-23 [1] CRAN (R 4.1.0)
rlang 0.4.12 2021-10-18 [1] CRAN (R 4.1.0)
rmarkdown 2.11 2021-09-14 [1] CRAN (R 4.1.0)
rprojroot 2.0.2 2020-11-15 [1] CRAN (R 4.1.0)
rsconnect 0.8.25 2021-11-19 [1] CRAN (R 4.1.0)
rstatix * 0.7.0 2021-02-13 [1] CRAN (R 4.1.0)
rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.1.0)
rvest 1.0.2 2021-10-16 [1] CRAN (R 4.1.0)
scales 1.1.1 2020-05-11 [1] CRAN (R 4.1.0)
sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.1.0)
statmod * 1.4.36 2021-05-10 [1] CRAN (R 4.1.0)
stringi 1.7.6 2021-11-29 [1] CRAN (R 4.1.0)
stringr * 1.4.0 2019-02-10 [1] CRAN (R 4.1.0)
svglite 2.0.0 2021-02-20 [1] CRAN (R 4.1.0)
systemfonts 1.0.3 2021-10-13 [1] CRAN (R 4.1.2)
testthat 3.1.2 2022-01-20 [1] CRAN (R 4.1.2)
tibble * 3.1.6 2021-11-07 [1] CRAN (R 4.1.0)
tidyr * 1.1.4 2021-09-27 [1] CRAN (R 4.1.0)
tidyselect 1.1.1 2021-04-30 [1] CRAN (R 4.1.0)
tidyverse * 1.3.1 2021-04-15 [1] CRAN (R 4.1.0)
tinytex 0.36 2021-12-19 [1] CRAN (R 4.1.0)
tzdb 0.2.0 2021-10-27 [1] CRAN (R 4.1.0)
usethis * 2.1.5 2021-12-09 [1] CRAN (R 4.1.0)
utf8 1.2.2 2021-07-24 [1] CRAN (R 4.1.0)
vctrs 0.3.8 2021-04-29 [1] CRAN (R 4.1.0)
vegan * 2.5-7 2020-11-28 [1] CRAN (R 4.1.0)
viridisLite 0.4.0 2021-04-13 [1] CRAN (R 4.1.0)
webshot 0.5.2 2019-11-22 [1] CRAN (R 4.1.0)
withr 2.4.3 2021-11-30 [1] CRAN (R 4.1.0)
xfun 0.29 2021-12-14 [1] CRAN (R 4.1.0)
xml2 1.3.3 2021-11-30 [1] CRAN (R 4.1.0)
yaml * 2.2.1 2020-02-01 [1] CRAN (R 4.1.0)
[1] /Library/Frameworks/R.framework/Versions/4.1/Resources/library
V ── Loaded and on-disk version mismatch.
────────────────────────────────────────────────────────────────────────────────────────────────────