This analysis pipe noted https://github.com/minsiksudo/BTE3207_Advanced_Biostatistics.
Roading the data
from previous lecture.. we generated a phyloseq object. it can be saved to hard disk using the below command.
saveRDS(ps, "phyloseq_example.rds")
And it can be loaded to a new R session like below!
getwd()
## [1] "/Volumes/macdrive/Dropbox"
library(phyloseq)
library(tidyverse)
library(microbiome)
##
## microbiome R package (microbiome.github.com)
##
##
##
## Copyright (C) 2011-2022 Leo Lahti,
## Sudarshan Shetty et al. <microbiome.github.io>
##
## Attaching package: 'microbiome'
## The following object is masked from 'package:ggplot2':
##
## alpha
## The following object is masked from 'package:base':
##
## transform
library(vegan)
## Loading required package: permute
##
## Attaching package: 'vegan'
## The following object is masked from 'package:microbiome':
##
## diversity
library(ggplot2)
#source("https://raw.githubusercontent.com/joey711/phyloseq/master/inst/scripts/installer.R",
# local = TRUE)
phyloseq <- readRDS("Inha/5_Lectures/2_IBS7048_Metagenomics/2025F/scripts/IBS7048_Advanced_metagenomics/phyloseq_examples_tree.rds")
phyloseq
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 225 taxa and 19 samples ]
## sample_data() Sample Data: [ 19 samples by 4 sample variables ]
## tax_table() Taxonomy Table: [ 225 taxa by 6 taxonomic ranks ]
## phy_tree() Phylogenetic Tree: [ 225 tips and 223 internal nodes ]
## refseq() DNAStringSet: [ 225 reference sequences ]
Basics of statistical tests
MPG dataset
MPG dataset is basic data installed in ggplot
head(mpg)
## # A tibble: 6 × 11
## manufacturer model displ year cyl trans drv cty hwy fl class
## <chr> <chr> <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
## 1 audi a4 1.8 1999 4 auto(l5) f 18 29 p compa…
## 2 audi a4 1.8 1999 4 manual(m5) f 21 29 p compa…
## 3 audi a4 2 2008 4 manual(m6) f 20 31 p compa…
## 4 audi a4 2 2008 4 auto(av) f 21 30 p compa…
## 5 audi a4 2.8 1999 6 auto(l5) f 16 26 p compa…
## 6 audi a4 2.8 1999 6 manual(m5) f 18 26 p compa…
Double-check the data type.
str(mpg)
## tibble [234 × 11] (S3: tbl_df/tbl/data.frame)
## $ manufacturer: chr [1:234] "audi" "audi" "audi" "audi" ...
## $ model : chr [1:234] "a4" "a4" "a4" "a4" ...
## $ displ : num [1:234] 1.8 1.8 2 2 2.8 2.8 3.1 1.8 1.8 2 ...
## $ year : int [1:234] 1999 1999 2008 2008 1999 1999 2008 1999 1999 2008 ...
## $ cyl : int [1:234] 4 4 4 4 6 6 6 4 4 4 ...
## $ trans : chr [1:234] "auto(l5)" "manual(m5)" "manual(m6)" "auto(av)" ...
## $ drv : chr [1:234] "f" "f" "f" "f" ...
## $ cty : int [1:234] 18 21 20 21 16 18 18 18 16 20 ...
## $ hwy : int [1:234] 29 29 31 30 26 26 27 26 25 28 ...
## $ fl : chr [1:234] "p" "p" "p" "p" ...
## $ class : chr [1:234] "compact" "compact" "compact" "compact" ...
t.test fucntion will conduct basic t-test
Paired t-test
t.test(mpg$cty, mpg$hwy, paired = T)
##
## Paired t-test
##
## data: mpg$cty and mpg$hwy
## t = -44.492, df = 233, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
## -6.872628 -6.289765
## sample estimates:
## mean difference
## -6.581197
Unpaired t-test
mpg$class
## [1] "compact" "compact" "compact" "compact" "compact"
## [6] "compact" "compact" "compact" "compact" "compact"
## [11] "compact" "compact" "compact" "compact" "compact"
## [16] "midsize" "midsize" "midsize" "suv" "suv"
## [21] "suv" "suv" "suv" "2seater" "2seater"
## [26] "2seater" "2seater" "2seater" "suv" "suv"
## [31] "suv" "suv" "midsize" "midsize" "midsize"
## [36] "midsize" "midsize" "minivan" "minivan" "minivan"
## [41] "minivan" "minivan" "minivan" "minivan" "minivan"
## [46] "minivan" "minivan" "minivan" "pickup" "pickup"
## [51] "pickup" "pickup" "pickup" "pickup" "pickup"
## [56] "pickup" "pickup" "suv" "suv" "suv"
## [61] "suv" "suv" "suv" "suv" "pickup"
## [66] "pickup" "pickup" "pickup" "pickup" "pickup"
## [71] "pickup" "pickup" "pickup" "pickup" "suv"
## [76] "suv" "suv" "suv" "suv" "suv"
## [81] "suv" "suv" "suv" "pickup" "pickup"
## [86] "pickup" "pickup" "pickup" "pickup" "pickup"
## [91] "subcompact" "subcompact" "subcompact" "subcompact" "subcompact"
## [96] "subcompact" "subcompact" "subcompact" "subcompact" "subcompact"
## [101] "subcompact" "subcompact" "subcompact" "subcompact" "subcompact"
## [106] "subcompact" "subcompact" "subcompact" "midsize" "midsize"
## [111] "midsize" "midsize" "midsize" "midsize" "midsize"
## [116] "subcompact" "subcompact" "subcompact" "subcompact" "subcompact"
## [121] "subcompact" "subcompact" "suv" "suv" "suv"
## [126] "suv" "suv" "suv" "suv" "suv"
## [131] "suv" "suv" "suv" "suv" "suv"
## [136] "suv" "suv" "suv" "suv" "suv"
## [141] "suv" "compact" "compact" "midsize" "midsize"
## [146] "midsize" "midsize" "midsize" "midsize" "midsize"
## [151] "suv" "suv" "suv" "suv" "midsize"
## [156] "midsize" "midsize" "midsize" "midsize" "suv"
## [161] "suv" "suv" "suv" "suv" "suv"
## [166] "subcompact" "subcompact" "subcompact" "subcompact" "compact"
## [171] "compact" "compact" "compact" "suv" "suv"
## [176] "suv" "suv" "suv" "suv" "midsize"
## [181] "midsize" "midsize" "midsize" "midsize" "midsize"
## [186] "midsize" "compact" "compact" "compact" "compact"
## [191] "compact" "compact" "compact" "compact" "compact"
## [196] "compact" "compact" "compact" "suv" "suv"
## [201] "pickup" "pickup" "pickup" "pickup" "pickup"
## [206] "pickup" "pickup" "compact" "compact" "compact"
## [211] "compact" "compact" "compact" "compact" "compact"
## [216] "compact" "compact" "compact" "compact" "compact"
## [221] "compact" "subcompact" "subcompact" "subcompact" "subcompact"
## [226] "subcompact" "subcompact" "midsize" "midsize" "midsize"
## [231] "midsize" "midsize" "midsize" "midsize"
mpg %>% filter(class == "2seater")%>% .$cty
## [1] 16 15 16 15 15
city_mpg_2seater <- mpg %>% filter(class == "2seater") %>% .$cty
city_mpg_midsize <- mpg %>% filter(class == "midsize") %>% .$cty
city_mpg_2seater
## [1] 16 15 16 15 15
city_mpg_midsize
## [1] 15 17 16 19 22 18 18 17 18 18 21 21 18 18 19 23 23 19 19 18 19 19 18 16 17
## [26] 18 16 21 21 21 21 18 18 19 21 18 19 21 16 18 17
t.test(city_mpg_2seater,
city_mpg_midsize,
paired = F,
var.equal = F)
##
## Welch Two Sample t-test
##
## data: city_mpg_2seater and city_mpg_midsize
## t = -8.5965, df = 20.862, p-value = 2.69e-08
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -4.168314 -2.543882
## sample estimates:
## mean of x mean of y
## 15.4000 18.7561
Linear model
Univariate linear model
Boxplot of categorical x, continuous y
boxplot(data = mpg, cty ~ class)
Linear model
lm(data = mpg,
cty ~ class)
##
## Call:
## lm(formula = cty ~ class, data = mpg)
##
## Coefficients:
## (Intercept) classcompact classmidsize classminivan
## 15.4000 4.7277 3.3561 0.4182
## classpickup classsubcompact classsuv
## -2.4000 4.9714 -1.9000
Linear model - the output can be stored
lm_result_mpg <-
lm(data = mpg,
cty ~ class)
lm_result_mpg
##
## Call:
## lm(formula = cty ~ class, data = mpg)
##
## Coefficients:
## (Intercept) classcompact classmidsize classminivan
## 15.4000 4.7277 3.3561 0.4182
## classpickup classsubcompact classsuv
## -2.4000 4.9714 -1.9000
The result of lm is a list having multiple caclulated factors
lm_result_mpg %>% names
## [1] "coefficients" "residuals" "effects" "rank"
## [5] "fitted.values" "assign" "qr" "df.residual"
## [9] "contrasts" "xlevels" "call" "terms"
## [13] "model"
Summary of a linear model
summary(lm_result_mpg)
##
## Call:
## lm(formula = cty ~ class, data = mpg)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6.3714 -1.7561 -0.1277 1.0000 14.6286
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 15.4000 1.3024 11.824 < 2e-16 ***
## classcompact 4.7277 1.3699 3.451 0.000666 ***
## classmidsize 3.3561 1.3796 2.433 0.015759 *
## classminivan 0.4182 1.5708 0.266 0.790307
## classpickup -2.4000 1.3976 -1.717 0.087304 .
## classsubcompact 4.9714 1.3923 3.571 0.000435 ***
## classsuv -1.9000 1.3539 -1.403 0.161884
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.912 on 227 degrees of freedom
## Multiple R-squared: 0.5438, Adjusted R-squared: 0.5317
## F-statistic: 45.1 on 6 and 227 DF, p-value: < 2.2e-16
This result will show you how different data is. (by the factor label
of class)
You have 1. Estimate (beta) 2. Standard errors 3. p-values 4. R-squared values
ANOVA
Analysis of variance for linear model can be done to this linear model result.
anova(lm_result_mpg)
## Analysis of Variance Table
##
## Response: cty
## Df Sum Sq Mean Sq F value Pr(>F)
## class 6 2295.0 382.51 45.099 < 2.2e-16 ***
## Residuals 227 1925.3 8.48
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
How can we test different multiple associations?
ggplot(mpg,
aes(y = cty, x = class, fill = as.factor(year))) +
geom_boxplot()
Anova - Multivariate
lm(data = mpg,
formula = hwy ~ class + as.factor(year)) %>%
anova
## Analysis of Variance Table
##
## Response: hwy
## Df Sum Sq Mean Sq F value Pr(>F)
## class 6 5683.2 947.21 83.4157 <2e-16 ***
## as.factor(year) 1 12.1 12.15 1.0699 0.3021
## Residuals 226 2566.3 11.36
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
linear model - Multivariate
Multiple linear regression can be don by just adding a new variable to formula
lm(data = mpg,
formula = hwy ~ class * as.factor(year)) %>%
summary
##
## Call:
## lm(formula = hwy ~ class * as.factor(year), data = mpg)
##
## Residuals:
## Min 1Q Median 3Q Max
## -8.000 -1.812 -0.500 1.341 16.080
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 24.5000 2.3822 10.285 < 2e-16 ***
## classcompact 3.4200 2.4756 1.381 0.16854
## classmidsize 2.0000 2.4984 0.800 0.42429
## classminivan -2.0000 2.7507 -0.727 0.46795
## classpickup -7.6875 2.5267 -3.043 0.00263 **
## classsubcompact 4.5000 2.5044 1.797 0.07373 .
## classsuv -6.9483 2.4629 -2.821 0.00522 **
## as.factor(year)2008 0.5000 3.0754 0.163 0.87100
## classcompact:as.factor(year)2008 0.3073 3.2292 0.095 0.92428
## classmidsize:as.factor(year)2008 1.0476 3.2505 0.322 0.74754
## classminivan:as.factor(year)2008 -0.8000 3.6904 -0.217 0.82858
## classpickup:as.factor(year)2008 -0.3713 3.2916 -0.113 0.91029
## classsubcompact:as.factor(year)2008 -2.3750 3.2809 -0.724 0.46991
## classsuv:as.factor(year)2008 0.5846 3.1927 0.183 0.85487
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.369 on 220 degrees of freedom
## Multiple R-squared: 0.6978, Adjusted R-squared: 0.6799
## F-statistic: 39.07 on 13 and 220 DF, p-value: < 2.2e-16
Statistical tests for alpha diversity
# microbiome package
microbiome::alpha(phyloseq)
## Observed richness
## Other forms of richness
## Diversity
## Evenness
## Dominance
## Rarity
## observed chao1 diversity_inverse_simpson diversity_gini_simpson
## F3D0 104 104 27.74707 0.9639602
## F3D1 94 94 32.96293 0.9696629
## F3D141 72 72 19.87039 0.9496739
## F3D142 47 47 16.13006 0.9380039
## F3D143 54 54 18.03920 0.9445652
## F3D144 45 45 14.43812 0.9307389
## F3D145 70 70 15.85773 0.9369393
## F3D146 82 82 22.71345 0.9559732
## F3D147 103 103 17.15604 0.9417115
## F3D148 94 94 20.20291 0.9505022
## F3D149 109 109 23.13346 0.9567726
## F3D150 75 75 21.86987 0.9542750
## F3D2 132 132 14.23904 0.9297705
## F3D3 64 64 12.67559 0.9211082
## F3D5 82 82 26.56809 0.9623609
## F3D6 87 87 16.13482 0.9380222
## F3D7 58 58 12.38949 0.9192864
## F3D8 97 97 22.16340 0.9548806
## F3D9 101 101 23.20831 0.9569120
## diversity_shannon diversity_fisher diversity_coverage evenness_camargo
## F3D0 3.845083 17.623177 10 0.3282552
## F3D1 3.939724 16.535173 13 0.3954679
## F3D141 3.416365 12.020521 7 0.2989099
## F3D142 3.100883 8.218271 6 0.3315311
## F3D143 3.256438 9.764408 6 0.3352886
## F3D144 2.985183 7.309790 6 0.3107364
## F3D145 3.114549 11.214612 6 0.2344288
## F3D146 3.609865 14.739937 8 0.3237746
## F3D147 3.338489 15.294818 6 0.2098778
## F3D148 3.424846 14.412322 7 0.2390079
## F3D149 3.631134 16.969707 8 0.2559155
## F3D150 3.557807 12.914216 8 0.3314166
## F3D2 3.553624 19.560660 6 0.2355371
## F3D3 3.017631 10.208471 5 0.2358849
## F3D5 3.737352 14.873300 9 0.3723447
## F3D6 3.399273 14.155266 6 0.2716844
## F3D7 2.965123 9.540154 5 0.2472467
## F3D8 3.701149 17.508733 8 0.3136044
## F3D9 3.737488 17.292247 8 0.3137205
## evenness_pielou evenness_simpson evenness_evar evenness_bulla
## F3D0 0.8278982 0.2667988 0.3715485 0.5085783
## F3D1 0.8671513 0.3506694 0.4058522 0.5706106
## F3D141 0.7988383 0.2759776 0.3439671 0.4407498
## F3D142 0.8053933 0.3431927 0.3020775 0.4582905
## F3D143 0.8163578 0.3340592 0.3757782 0.4666677
## F3D144 0.7841996 0.3208471 0.2944975 0.4213795
## F3D145 0.7330946 0.2265390 0.2382547 0.3366673
## F3D146 0.8191729 0.2769933 0.4100491 0.4723746
## F3D147 0.7203202 0.1665635 0.2780173 0.3305908
## F3D148 0.7538243 0.2149246 0.2752081 0.3582721
## F3D149 0.7740066 0.2122336 0.3033751 0.3963266
## F3D150 0.8240456 0.2915983 0.4177821 0.4879845
## F3D2 0.7277838 0.1078715 0.3021921 0.3973778
## F3D3 0.7255868 0.1980561 0.2818964 0.3534158
## F3D5 0.8481030 0.3240011 0.4235522 0.5441913
## F3D6 0.7611605 0.1854576 0.3893171 0.4116843
## F3D7 0.7302461 0.2136119 0.2977758 0.3614254
## F3D8 0.8090454 0.2284886 0.3993826 0.4895786
## F3D9 0.8098354 0.2297852 0.4161506 0.4837492
## dominance_dbp dominance_dmn dominance_absolute dominance_relative
## F3D0 0.08981943 0.1625156 577 0.08981943
## F3D1 0.08328180 0.1558442 404 0.08328180
## F3D141 0.10338346 0.1959064 495 0.10338346
## F3D142 0.12144289 0.2368737 303 0.12144289
## F3D143 0.09906237 0.1928251 243 0.09906237
## F3D144 0.12093023 0.2244186 416 0.12093023
## F3D145 0.11180664 0.2123109 643 0.11180664
## F3D146 0.10164620 0.1865691 389 0.10164620
## F3D147 0.11592962 0.2103706 1489 0.11592962
## F3D148 0.08899561 0.1766629 871 0.08899561
## F3D149 0.08576083 0.1699885 895 0.08576083
## F3D150 0.10898483 0.2016336 467 0.10898483
## F3D2 0.20886114 0.3034160 3479 0.20886114
## F3D3 0.18267980 0.2943691 983 0.18267980
## F3D5 0.08902804 0.1660768 327 0.08902804
## F3D6 0.15284306 0.2545868 1008 0.15284306
## F3D7 0.15488215 0.2768158 644 0.15488215
## F3D8 0.12449347 0.2030617 553 0.12449347
## F3D9 0.09979771 0.1857721 592 0.09979771
## dominance_simpson dominance_core_abundance dominance_gini
## F3D0 0.03603984 0.7693026 0.8351031
## F3D1 0.03033711 0.7179963 0.8227637
## F3D141 0.05032614 0.9043442 0.8958192
## F3D142 0.06199606 0.9482966 0.9261118
## F3D143 0.05543484 0.9168365 0.9123848
## F3D144 0.06926109 0.9526163 0.9334729
## F3D145 0.06306074 0.9238393 0.9229313
## F3D146 0.04402677 0.8228377 0.8689510
## F3D147 0.05828851 0.8864061 0.8975764
## F3D148 0.04949782 0.8900582 0.8934573
## F3D149 0.04322743 0.8462054 0.8668362
## F3D150 0.04572500 0.8686114 0.8779422
## F3D2 0.07022948 0.8013448 0.8536546
## F3D3 0.07889181 0.9384873 0.9278193
## F3D5 0.03763914 0.7919956 0.8538512
## F3D6 0.06197778 0.8356331 0.8849169
## F3D7 0.08071357 0.9220779 0.9313345
## F3D8 0.04511944 0.7593426 0.8515203
## F3D9 0.04308802 0.7434255 0.8447396
## rarity_log_modulo_skewness rarity_low_abundance rarity_rare_abundance
## F3D0 2.042400 0.036737235 0.082347447
## F3D1 2.025227 0.022675737 0.047825191
## F3D141 2.043099 0.019632414 0.011904762
## F3D142 1.995630 0.006813627 0.012024048
## F3D143 2.039543 0.006114961 0.011822258
## F3D144 2.020668 0.008139535 0.004941860
## F3D145 2.042709 0.032516084 0.017909929
## F3D146 2.047470 0.016461981 0.043114711
## F3D147 2.047934 0.056913734 0.029040797
## F3D148 2.050079 0.055481762 0.024522326
## F3D149 2.046433 0.046857033 0.044174013
## F3D150 2.043424 0.012602100 0.032438740
## F3D2 1.999751 0.050849493 0.056973044
## F3D3 2.045144 0.028619216 0.015424642
## F3D5 2.051674 0.019057991 0.035121154
## F3D6 2.055520 0.032297195 0.035178165
## F3D7 2.050004 0.021164021 0.008658009
## F3D8 2.056511 0.024313372 0.064835660
## F3D9 2.051201 0.032366824 0.074173972
sample_data_with_alpha <-
phyloseq %>%
sample_data() %>%
merge(., #Merge function will do the same thing as vloopup at excel
microbiome::alpha(phyloseq),
by = 0) %>%
column_to_rownames("Row.names") %>%
sample_data
## Observed richness
## Other forms of richness
## Diversity
## Evenness
## Dominance
## Rarity
sample_data(phyloseq) <- sample_data_with_alpha
sample_data_with_alpha
## Subject Gender Day When observed chao1 diversity_inverse_simpson
## F3D0 3 F 0 Early 104 104 27.74707
## F3D1 3 F 1 Early 94 94 32.96293
## F3D141 3 F 141 Late 72 72 19.87039
## F3D142 3 F 142 Late 47 47 16.13006
## F3D143 3 F 143 Late 54 54 18.03920
## F3D144 3 F 144 Late 45 45 14.43812
## F3D145 3 F 145 Late 70 70 15.85773
## F3D146 3 F 146 Late 82 82 22.71345
## F3D147 3 F 147 Late 103 103 17.15604
## F3D148 3 F 148 Late 94 94 20.20291
## F3D149 3 F 149 Late 109 109 23.13346
## F3D150 3 F 150 Late 75 75 21.86987
## F3D2 3 F 2 Early 132 132 14.23904
## F3D3 3 F 3 Early 64 64 12.67559
## F3D5 3 F 5 Early 82 82 26.56809
## F3D6 3 F 6 Early 87 87 16.13482
## F3D7 3 F 7 Early 58 58 12.38949
## F3D8 3 F 8 Early 97 97 22.16340
## F3D9 3 F 9 Early 101 101 23.20831
## diversity_gini_simpson diversity_shannon diversity_fisher
## F3D0 0.9639602 3.845083 17.623177
## F3D1 0.9696629 3.939724 16.535173
## F3D141 0.9496739 3.416365 12.020521
## F3D142 0.9380039 3.100883 8.218271
## F3D143 0.9445652 3.256438 9.764408
## F3D144 0.9307389 2.985183 7.309790
## F3D145 0.9369393 3.114549 11.214612
## F3D146 0.9559732 3.609865 14.739937
## F3D147 0.9417115 3.338489 15.294818
## F3D148 0.9505022 3.424846 14.412322
## F3D149 0.9567726 3.631134 16.969707
## F3D150 0.9542750 3.557807 12.914216
## F3D2 0.9297705 3.553624 19.560660
## F3D3 0.9211082 3.017631 10.208471
## F3D5 0.9623609 3.737352 14.873300
## F3D6 0.9380222 3.399273 14.155266
## F3D7 0.9192864 2.965123 9.540154
## F3D8 0.9548806 3.701149 17.508733
## F3D9 0.9569120 3.737488 17.292247
## diversity_coverage evenness_camargo evenness_pielou evenness_simpson
## F3D0 10 0.3282552 0.8278982 0.2667988
## F3D1 13 0.3954679 0.8671513 0.3506694
## F3D141 7 0.2989099 0.7988383 0.2759776
## F3D142 6 0.3315311 0.8053933 0.3431927
## F3D143 6 0.3352886 0.8163578 0.3340592
## F3D144 6 0.3107364 0.7841996 0.3208471
## F3D145 6 0.2344288 0.7330946 0.2265390
## F3D146 8 0.3237746 0.8191729 0.2769933
## F3D147 6 0.2098778 0.7203202 0.1665635
## F3D148 7 0.2390079 0.7538243 0.2149246
## F3D149 8 0.2559155 0.7740066 0.2122336
## F3D150 8 0.3314166 0.8240456 0.2915983
## F3D2 6 0.2355371 0.7277838 0.1078715
## F3D3 5 0.2358849 0.7255868 0.1980561
## F3D5 9 0.3723447 0.8481030 0.3240011
## F3D6 6 0.2716844 0.7611605 0.1854576
## F3D7 5 0.2472467 0.7302461 0.2136119
## F3D8 8 0.3136044 0.8090454 0.2284886
## F3D9 8 0.3137205 0.8098354 0.2297852
## evenness_evar evenness_bulla dominance_dbp dominance_dmn
## F3D0 0.3715485 0.5085783 0.08981943 0.1625156
## F3D1 0.4058522 0.5706106 0.08328180 0.1558442
## F3D141 0.3439671 0.4407498 0.10338346 0.1959064
## F3D142 0.3020775 0.4582905 0.12144289 0.2368737
## F3D143 0.3757782 0.4666677 0.09906237 0.1928251
## F3D144 0.2944975 0.4213795 0.12093023 0.2244186
## F3D145 0.2382547 0.3366673 0.11180664 0.2123109
## F3D146 0.4100491 0.4723746 0.10164620 0.1865691
## F3D147 0.2780173 0.3305908 0.11592962 0.2103706
## F3D148 0.2752081 0.3582721 0.08899561 0.1766629
## F3D149 0.3033751 0.3963266 0.08576083 0.1699885
## F3D150 0.4177821 0.4879845 0.10898483 0.2016336
## F3D2 0.3021921 0.3973778 0.20886114 0.3034160
## F3D3 0.2818964 0.3534158 0.18267980 0.2943691
## F3D5 0.4235522 0.5441913 0.08902804 0.1660768
## F3D6 0.3893171 0.4116843 0.15284306 0.2545868
## F3D7 0.2977758 0.3614254 0.15488215 0.2768158
## F3D8 0.3993826 0.4895786 0.12449347 0.2030617
## F3D9 0.4161506 0.4837492 0.09979771 0.1857721
## dominance_absolute dominance_relative dominance_simpson
## F3D0 577 0.08981943 0.03603984
## F3D1 404 0.08328180 0.03033711
## F3D141 495 0.10338346 0.05032614
## F3D142 303 0.12144289 0.06199606
## F3D143 243 0.09906237 0.05543484
## F3D144 416 0.12093023 0.06926109
## F3D145 643 0.11180664 0.06306074
## F3D146 389 0.10164620 0.04402677
## F3D147 1489 0.11592962 0.05828851
## F3D148 871 0.08899561 0.04949782
## F3D149 895 0.08576083 0.04322743
## F3D150 467 0.10898483 0.04572500
## F3D2 3479 0.20886114 0.07022948
## F3D3 983 0.18267980 0.07889181
## F3D5 327 0.08902804 0.03763914
## F3D6 1008 0.15284306 0.06197778
## F3D7 644 0.15488215 0.08071357
## F3D8 553 0.12449347 0.04511944
## F3D9 592 0.09979771 0.04308802
## dominance_core_abundance dominance_gini rarity_log_modulo_skewness
## F3D0 0.7693026 0.8351031 2.042400
## F3D1 0.7179963 0.8227637 2.025227
## F3D141 0.9043442 0.8958192 2.043099
## F3D142 0.9482966 0.9261118 1.995630
## F3D143 0.9168365 0.9123848 2.039543
## F3D144 0.9526163 0.9334729 2.020668
## F3D145 0.9238393 0.9229313 2.042709
## F3D146 0.8228377 0.8689510 2.047470
## F3D147 0.8864061 0.8975764 2.047934
## F3D148 0.8900582 0.8934573 2.050079
## F3D149 0.8462054 0.8668362 2.046433
## F3D150 0.8686114 0.8779422 2.043424
## F3D2 0.8013448 0.8536546 1.999751
## F3D3 0.9384873 0.9278193 2.045144
## F3D5 0.7919956 0.8538512 2.051674
## F3D6 0.8356331 0.8849169 2.055520
## F3D7 0.9220779 0.9313345 2.050004
## F3D8 0.7593426 0.8515203 2.056511
## F3D9 0.7434255 0.8447396 2.051201
## rarity_low_abundance rarity_rare_abundance
## F3D0 0.036737235 0.082347447
## F3D1 0.022675737 0.047825191
## F3D141 0.019632414 0.011904762
## F3D142 0.006813627 0.012024048
## F3D143 0.006114961 0.011822258
## F3D144 0.008139535 0.004941860
## F3D145 0.032516084 0.017909929
## F3D146 0.016461981 0.043114711
## F3D147 0.056913734 0.029040797
## F3D148 0.055481762 0.024522326
## F3D149 0.046857033 0.044174013
## F3D150 0.012602100 0.032438740
## F3D2 0.050849493 0.056973044
## F3D3 0.028619216 0.015424642
## F3D5 0.019057991 0.035121154
## F3D6 0.032297195 0.035178165
## F3D7 0.021164021 0.008658009
## F3D8 0.024313372 0.064835660
## F3D9 0.032366824 0.074173972
Alpha diversity boxplots
sample_data_with_alpha %>%
ggplot(aes(x = When, y = observed)) +
geom_boxplot() +
xlab("Genus richness") +
theme_classic(base_size = 15)
Statistical tests on alpha diversity (linear model)
sample_data_with_alpha %>%
data.frame() %>%
lm(data = ., observed ~ When) %>%
summary
##
## Call:
## lm(formula = observed ~ When, data = .)
##
## Residuals:
## Min 1Q Median 3Q Max
## -33.00 -15.05 -0.10 11.50 41.00
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 91.000 7.418 12.268 7.18e-10 ***
## WhenLate -15.900 10.225 -1.555 0.138
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 22.25 on 17 degrees of freedom
## Multiple R-squared: 0.1245, Adjusted R-squared: 0.07303
## F-statistic: 2.418 on 1 and 17 DF, p-value: 0.1384
The difference between those two group - is not statistically significant
How about other outcomes?
sample_data_with_alpha %>%
data.frame() %>%
lm(data = ., observed ~ evenness_simpson) %>%
summary
##
## Call:
## lm(formula = observed ~ evenness_simpson, data = .)
##
## Residuals:
## Min 1Q Median 3Q Max
## -32.386 -13.908 2.837 14.264 32.095
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 134.78 17.58 7.668 6.47e-07 ***
## evenness_simpson -207.81 67.83 -3.063 0.00703 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 19.09 on 17 degrees of freedom
## Multiple R-squared: 0.3557, Adjusted R-squared: 0.3178
## F-statistic: 9.385 on 1 and 17 DF, p-value: 0.007033
The evenness difference between those two group - is statistically significant
sample_data_with_alpha %>%
data.frame() %>%
lm(data = ., observed ~ dominance_core_abundance) %>%
summary
##
## Call:
## lm(formula = observed ~ dominance_core_abundance, data = .)
##
## Residuals:
## Min 1Q Median 3Q Max
## -19.181 -12.114 -4.527 2.548 37.443
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 273.61 45.13 6.062 1.27e-05 ***
## dominance_core_abundance -223.44 52.62 -4.247 0.000544 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 16.57 on 17 degrees of freedom
## Multiple R-squared: 0.5147, Adjusted R-squared: 0.4862
## F-statistic: 18.03 on 1 and 17 DF, p-value: 0.0005441
Core abundance was also statistically different.
How can we say this difference is not because of deeper sequencing?
sample_data_with_alpha$final_reads <- phyloseq %>% sample_sums
sample_data_with_alpha %>%
data.frame() %>%
ggplot(aes(x = final_reads, y = observed)) +
geom_point()
Linear model of this association
(As deeper sequencing identified more taxa)
lm_eqn <- function(m){
eq <- substitute(italic(y) == a + b %.% italic(x)*","~~italic(r)^2~"="~r2,
list(a = format(unname(coef(m)[1]), digits = 2),
b = format(unname(coef(m)[2]), digits = 2),
r2 = format(summary(m)$r.squared, digits = 3)))
as.character(as.expression(eq));
}
ggplot(data = sample_data_with_alpha, aes(x = final_reads, y = observed)) +
geom_smooth(method = "lm",
se = T,
color = "red",
linetype = "dashed"
) + # Plot regression slope
geom_point() +
theme_classic( base_family = "serif", base_size = 20) +
geom_label(x = 12000, y = 60,
label = lm_eqn(lm(data = data.frame(sample_data_with_alpha), observed ~ final_reads)),
parse = TRUE,
family='serif',
size = 8, label.size = 0, color = "red")#, fill = alpha(c("white"),1))
## Warning: The `label.size` argument of `geom_label()` is deprecated as of ggplot2 3.5.0.
## ℹ Please use the `linewidth` argument instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## `geom_smooth()` using formula = 'y ~ x'
lm(data = data.frame(sample_data_with_alpha), observed ~ final_reads) %>% summary
##
## Call:
## lm(formula = observed ~ final_reads, data = data.frame(sample_data_with_alpha))
##
## Residuals:
## Min 1Q Median 3Q Max
## -24.01 -11.19 -1.72 11.47 23.08
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.217e+01 6.728e+00 7.754 5.57e-07 ***
## final_reads 4.896e-03 9.353e-04 5.235 6.73e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 14.72 on 17 degrees of freedom
## Multiple R-squared: 0.6171, Adjusted R-squared: 0.5946
## F-statistic: 27.4 on 1 and 17 DF, p-value: 6.727e-05
There is significant correlation !
ggplot(data = sample_data_with_alpha, aes(x = final_reads, y = observed)) +
geom_smooth(method = "lm",
se = T,
color = "red",
linetype = "dashed"
) + # Plot regression slope
geom_point(aes(col = When)) +
theme_classic( base_family = "serif", base_size = 20) +
geom_label(x = 12000, y = 60,
label = lm_eqn(lm(data = data.frame(sample_data_with_alpha), observed ~ final_reads)),
parse = TRUE,
family='serif',
size = 8, label.size = 0, color = "red")#, fill = alpha(c("white"),1))
## `geom_smooth()` using formula = 'y ~ x'
Even for evenness, this association seem to have a significant term in general.
Multiple linear regression
lm(data = data.frame(sample_data_with_alpha),
evenness_simpson ~ final_reads + When) %>% summary
##
## Call:
## lm(formula = evenness_simpson ~ final_reads + When, data = data.frame(sample_data_with_alpha))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.05258 -0.02686 0.00104 0.01807 0.09423
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.247e-01 2.111e-02 15.378 5.26e-11 ***
## final_reads -1.406e-05 2.540e-06 -5.537 4.51e-05 ***
## WhenLate 2.616e-02 1.836e-02 1.424 0.174
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.03989 on 16 degrees of freedom
## Multiple R-squared: 0.6786, Adjusted R-squared: 0.6385
## F-statistic: 16.89 on 2 and 16 DF, p-value: 0.0001138
The association we ‘obsereved’ was not because of experimental condition, but due to different final reads.
glm(data = data.frame(sample_data_with_alpha),
as.factor(When) ~ final_reads, family = "binomial") %>% summary
##
## Call:
## glm(formula = as.factor(When) ~ final_reads, family = "binomial",
## data = data.frame(sample_data_with_alpha))
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 3.189e-01 9.189e-01 0.347 0.729
## final_reads -3.429e-05 1.278e-04 -0.268 0.788
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 26.287 on 18 degrees of freedom
## Residual deviance: 26.215 on 17 degrees of freedom
## AIC: 30.215
##
## Number of Fisher Scoring iterations: 3
However, the change of dominance was due to experimental condition.
So, the TRUE change between the samples from early and late
phase, was Dominance.
How would it change the association for beta diversity?
Ordination
Principal Coordinates Analysis with Bray-Curtis dist
phyloseq_rel <- phyloseq %>% transform_sample_counts(., function(x) {x/sum(x)})
sample_data(phyloseq_rel) <- sample_data_with_alpha
plot_ordination(physeq = phyloseq_rel,
ordination = ordinate(phyloseq_rel, "PCoA", "bray"),
col = "When"
) +
geom_point(size = 8)
## Warning: `aes_string()` was deprecated in ggplot2 3.0.0.
## ℹ Please use tidy evaluation idioms with `aes()`.
## ℹ See also `vignette("ggplot2-in-packages")` for more information.
## ℹ The deprecated feature was likely used in the phyloseq package.
## Please report the issue at <https://github.com/joey711/phyloseq/issues>.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
These two groups seems different. Is it really different, by sequencing depth or the sampling phase?
plot_ordination(physeq = phyloseq_rel,
ordination = ordinate(phyloseq_rel, "PCoA", "bray"),
col = "final_reads"
) +
geom_point(size = 8) +
scale_color_gradient(low = "blue", high = "red")
Symbols and colors at the sample time
plot_ordination(physeq = phyloseq_rel,
ordination = ordinate(phyloseq_rel, "PCoA", "bray"),
shape = "When",
col = "final_reads"
) +
geom_point(size = 8) +
scale_color_gradient(low = "blue", high = "red") +
stat_ellipse(type = "norm", linetype = 2) +
stat_ellipse(type = "t") +
theme_classic()
## Warning: The following aesthetics were dropped during statistical transformation:
## colour.
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
## The following aesthetics were dropped during statistical transformation:
## colour.
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
#scale_color_brewer(type = "qual")
PERMANOVA
As betadiversity indices have multiple distances at the same time, the calculation must accompany multiple analyses, at a random order (permutation).
This is permutational anova (PERMANOVA).
Univariate permutational anova (PERMANOVA).
Based on sampling point
distance(phyloseq, method = "bray")
## F3D0 F3D1 F3D141 F3D142 F3D143 F3D144 F3D145
## F3D1 0.3862528
## F3D141 0.3503389 0.4912335
## F3D142 0.4999439 0.5322625 0.3618015
## F3D143 0.5250648 0.5550383 0.3522994 0.2332255
## F3D144 0.4148418 0.5510795 0.2265435 0.3206403 0.3107076
## F3D145 0.3458727 0.5998868 0.2695702 0.4474897 0.4763530 0.2742901
## F3D146 0.4152766 0.4892832 0.2503772 0.3492566 0.2904459 0.2486583 0.3708499
## F3D147 0.5301017 0.6900819 0.5241606 0.6815959 0.6931424 0.5831491 0.4006991
## F3D148 0.4453149 0.6416177 0.3836021 0.6090213 0.6155229 0.4887730 0.2816321
## F3D149 0.4540925 0.6241251 0.4076458 0.6289537 0.6299170 0.5172961 0.3236548
## F3D150 0.4033056 0.5295534 0.2214262 0.3389381 0.3202731 0.3020065 0.3521323
## F3D2 0.5527923 0.5675098 0.6502681 0.7732874 0.7884877 0.7367766 0.6240628
## F3D3 0.4788649 0.4982408 0.4211820 0.5205688 0.5537401 0.4762499 0.3850162
## F3D5 0.4083391 0.3418583 0.4116535 0.4176394 0.4508652 0.4424294 0.4955433
## F3D6 0.3859743 0.4389306 0.4486515 0.5870187 0.6065429 0.5154958 0.3936498
## F3D7 0.4137214 0.4663115 0.3887771 0.4558846 0.5011345 0.4195841 0.3496821
## F3D8 0.4518682 0.3661896 0.4442037 0.4870982 0.5199420 0.5219487 0.5237908
## F3D9 0.4088702 0.3882964 0.4339552 0.5590364 0.5799642 0.5251814 0.4710263
## F3D146 F3D147 F3D148 F3D149 F3D150 F3D2 F3D3
## F3D1
## F3D141
## F3D142
## F3D143
## F3D144
## F3D145
## F3D146
## F3D147 0.5877872
## F3D148 0.4980167 0.2253988
## F3D149 0.5169319 0.2627148 0.1611531
## F3D150 0.2564103 0.5565415 0.4640421 0.4315604
## F3D2 0.7084554 0.4970340 0.5563455 0.5121987 0.6675580
## F3D3 0.5230235 0.5074897 0.4145570 0.4078523 0.4694807 0.5417915
## F3D5 0.4013333 0.6684628 0.5846954 0.5891984 0.3885398 0.6532218 0.4250055
## F3D6 0.5014393 0.4745614 0.4144793 0.4141272 0.4599265 0.4713573 0.2728791
## F3D7 0.4632436 0.5551112 0.4743636 0.4863643 0.4075566 0.6192169 0.2338820
## F3D8 0.4707945 0.6292954 0.5628646 0.5624412 0.4591498 0.6349590 0.3814517
## F3D9 0.4913413 0.5802088 0.5009861 0.5030547 0.4785162 0.5447342 0.3480067
## F3D5 F3D6 F3D7 F3D8
## F3D1
## F3D141
## F3D142
## F3D143
## F3D144
## F3D145
## F3D146
## F3D147
## F3D148
## F3D149
## F3D150
## F3D2
## F3D3
## F3D5
## F3D6 0.3807947
## F3D7 0.3472098 0.2496978
## F3D8 0.2958718 0.3353266 0.2932558
## F3D9 0.3442998 0.2718129 0.2755203 0.2111047
distance(phyloseq, method = "jaccard")
## F3D0 F3D1 F3D141 F3D142 F3D143 F3D144 F3D145
## F3D1 0.5572617
## F3D141 0.5188904 0.6588284
## F3D142 0.6666168 0.6947406 0.5313571
## F3D143 0.6885803 0.7138581 0.5210376 0.3782366
## F3D144 0.5864144 0.7105754 0.3694015 0.4855831 0.4741067
## F3D145 0.5139753 0.7499116 0.4246637 0.6182976 0.6453104 0.4304986
## F3D146 0.5868486 0.6570721 0.4004827 0.5177022 0.4501481 0.3982808 0.5410510
## F3D147 0.6928974 0.8166254 0.6878023 0.8106536 0.8187645 0.7366951 0.5721416
## F3D148 0.6162185 0.7816896 0.5544977 0.7570084 0.7620107 0.6566118 0.4394898
## F3D149 0.6245717 0.7685677 0.5791881 0.7722180 0.7729436 0.6818657 0.4890320
## F3D150 0.5747937 0.6924288 0.3625699 0.5062789 0.4851619 0.4639093 0.5208548
## F3D2 0.7119978 0.7240909 0.7880757 0.8721512 0.8817368 0.8484414 0.7685206
## F3D3 0.6476114 0.6651011 0.5927207 0.6847027 0.7127834 0.6452158 0.5559735
## F3D5 0.5798875 0.5095296 0.5832217 0.5892040 0.6215122 0.6134503 0.6626933
## F3D6 0.5569718 0.6100789 0.6194057 0.7397754 0.7550908 0.6802998 0.5649192
## F3D7 0.5852941 0.6360333 0.5598841 0.6262647 0.6676743 0.5911367 0.5181696
## F3D8 0.6224645 0.5360744 0.6151538 0.6550989 0.6841603 0.6858953 0.6874839
## F3D9 0.5804228 0.5593854 0.6052563 0.7171563 0.7341486 0.6886806 0.6404050
## F3D146 F3D147 F3D148 F3D149 F3D150 F3D2 F3D3
## F3D1
## F3D141
## F3D142
## F3D143
## F3D144
## F3D145
## F3D146
## F3D147 0.7403853
## F3D148 0.6649014 0.3678783
## F3D149 0.6815493 0.4161110 0.2775743
## F3D150 0.4081633 0.7151001 0.6339190 0.6029230
## F3D2 0.8293519 0.6640250 0.7149383 0.6774225 0.8006414
## F3D3 0.6868226 0.6732911 0.5861298 0.5793964 0.6389749 0.7028077
## F3D5 0.5727878 0.8012918 0.7379278 0.7415039 0.5596380 0.7902410 0.5964967
## F3D6 0.6679448 0.6436645 0.5860521 0.5857000 0.6300680 0.6407109 0.4287589
## F3D7 0.6331736 0.7139183 0.6434825 0.6544348 0.5790979 0.7648350 0.3790994
## F3D8 0.6401908 0.7724755 0.7202986 0.7199518 0.6293388 0.7767277 0.5522476
## F3D9 0.6589254 0.7343445 0.6675426 0.6693765 0.6472925 0.7052788 0.5163279
## F3D5 F3D6 F3D7 F3D8
## F3D1
## F3D141
## F3D142
## F3D143
## F3D144
## F3D145
## F3D146
## F3D147
## F3D148
## F3D149
## F3D150
## F3D2
## F3D3
## F3D5
## F3D6 0.5515588
## F3D7 0.5154502 0.3996130
## F3D8 0.4566375 0.5022391 0.4535156
## F3D9 0.5122367 0.4274416 0.4320124 0.3486151
data.frame(sample_data_with_alpha)
## Subject Gender Day When observed chao1 diversity_inverse_simpson
## F3D0 3 F 0 Early 104 104 27.74707
## F3D1 3 F 1 Early 94 94 32.96293
## F3D141 3 F 141 Late 72 72 19.87039
## F3D142 3 F 142 Late 47 47 16.13006
## F3D143 3 F 143 Late 54 54 18.03920
## F3D144 3 F 144 Late 45 45 14.43812
## F3D145 3 F 145 Late 70 70 15.85773
## F3D146 3 F 146 Late 82 82 22.71345
## F3D147 3 F 147 Late 103 103 17.15604
## F3D148 3 F 148 Late 94 94 20.20291
## F3D149 3 F 149 Late 109 109 23.13346
## F3D150 3 F 150 Late 75 75 21.86987
## F3D2 3 F 2 Early 132 132 14.23904
## F3D3 3 F 3 Early 64 64 12.67559
## F3D5 3 F 5 Early 82 82 26.56809
## F3D6 3 F 6 Early 87 87 16.13482
## F3D7 3 F 7 Early 58 58 12.38949
## F3D8 3 F 8 Early 97 97 22.16340
## F3D9 3 F 9 Early 101 101 23.20831
## diversity_gini_simpson diversity_shannon diversity_fisher
## F3D0 0.9639602 3.845083 17.623177
## F3D1 0.9696629 3.939724 16.535173
## F3D141 0.9496739 3.416365 12.020521
## F3D142 0.9380039 3.100883 8.218271
## F3D143 0.9445652 3.256438 9.764408
## F3D144 0.9307389 2.985183 7.309790
## F3D145 0.9369393 3.114549 11.214612
## F3D146 0.9559732 3.609865 14.739937
## F3D147 0.9417115 3.338489 15.294818
## F3D148 0.9505022 3.424846 14.412322
## F3D149 0.9567726 3.631134 16.969707
## F3D150 0.9542750 3.557807 12.914216
## F3D2 0.9297705 3.553624 19.560660
## F3D3 0.9211082 3.017631 10.208471
## F3D5 0.9623609 3.737352 14.873300
## F3D6 0.9380222 3.399273 14.155266
## F3D7 0.9192864 2.965123 9.540154
## F3D8 0.9548806 3.701149 17.508733
## F3D9 0.9569120 3.737488 17.292247
## diversity_coverage evenness_camargo evenness_pielou evenness_simpson
## F3D0 10 0.3282552 0.8278982 0.2667988
## F3D1 13 0.3954679 0.8671513 0.3506694
## F3D141 7 0.2989099 0.7988383 0.2759776
## F3D142 6 0.3315311 0.8053933 0.3431927
## F3D143 6 0.3352886 0.8163578 0.3340592
## F3D144 6 0.3107364 0.7841996 0.3208471
## F3D145 6 0.2344288 0.7330946 0.2265390
## F3D146 8 0.3237746 0.8191729 0.2769933
## F3D147 6 0.2098778 0.7203202 0.1665635
## F3D148 7 0.2390079 0.7538243 0.2149246
## F3D149 8 0.2559155 0.7740066 0.2122336
## F3D150 8 0.3314166 0.8240456 0.2915983
## F3D2 6 0.2355371 0.7277838 0.1078715
## F3D3 5 0.2358849 0.7255868 0.1980561
## F3D5 9 0.3723447 0.8481030 0.3240011
## F3D6 6 0.2716844 0.7611605 0.1854576
## F3D7 5 0.2472467 0.7302461 0.2136119
## F3D8 8 0.3136044 0.8090454 0.2284886
## F3D9 8 0.3137205 0.8098354 0.2297852
## evenness_evar evenness_bulla dominance_dbp dominance_dmn
## F3D0 0.3715485 0.5085783 0.08981943 0.1625156
## F3D1 0.4058522 0.5706106 0.08328180 0.1558442
## F3D141 0.3439671 0.4407498 0.10338346 0.1959064
## F3D142 0.3020775 0.4582905 0.12144289 0.2368737
## F3D143 0.3757782 0.4666677 0.09906237 0.1928251
## F3D144 0.2944975 0.4213795 0.12093023 0.2244186
## F3D145 0.2382547 0.3366673 0.11180664 0.2123109
## F3D146 0.4100491 0.4723746 0.10164620 0.1865691
## F3D147 0.2780173 0.3305908 0.11592962 0.2103706
## F3D148 0.2752081 0.3582721 0.08899561 0.1766629
## F3D149 0.3033751 0.3963266 0.08576083 0.1699885
## F3D150 0.4177821 0.4879845 0.10898483 0.2016336
## F3D2 0.3021921 0.3973778 0.20886114 0.3034160
## F3D3 0.2818964 0.3534158 0.18267980 0.2943691
## F3D5 0.4235522 0.5441913 0.08902804 0.1660768
## F3D6 0.3893171 0.4116843 0.15284306 0.2545868
## F3D7 0.2977758 0.3614254 0.15488215 0.2768158
## F3D8 0.3993826 0.4895786 0.12449347 0.2030617
## F3D9 0.4161506 0.4837492 0.09979771 0.1857721
## dominance_absolute dominance_relative dominance_simpson
## F3D0 577 0.08981943 0.03603984
## F3D1 404 0.08328180 0.03033711
## F3D141 495 0.10338346 0.05032614
## F3D142 303 0.12144289 0.06199606
## F3D143 243 0.09906237 0.05543484
## F3D144 416 0.12093023 0.06926109
## F3D145 643 0.11180664 0.06306074
## F3D146 389 0.10164620 0.04402677
## F3D147 1489 0.11592962 0.05828851
## F3D148 871 0.08899561 0.04949782
## F3D149 895 0.08576083 0.04322743
## F3D150 467 0.10898483 0.04572500
## F3D2 3479 0.20886114 0.07022948
## F3D3 983 0.18267980 0.07889181
## F3D5 327 0.08902804 0.03763914
## F3D6 1008 0.15284306 0.06197778
## F3D7 644 0.15488215 0.08071357
## F3D8 553 0.12449347 0.04511944
## F3D9 592 0.09979771 0.04308802
## dominance_core_abundance dominance_gini rarity_log_modulo_skewness
## F3D0 0.7693026 0.8351031 2.042400
## F3D1 0.7179963 0.8227637 2.025227
## F3D141 0.9043442 0.8958192 2.043099
## F3D142 0.9482966 0.9261118 1.995630
## F3D143 0.9168365 0.9123848 2.039543
## F3D144 0.9526163 0.9334729 2.020668
## F3D145 0.9238393 0.9229313 2.042709
## F3D146 0.8228377 0.8689510 2.047470
## F3D147 0.8864061 0.8975764 2.047934
## F3D148 0.8900582 0.8934573 2.050079
## F3D149 0.8462054 0.8668362 2.046433
## F3D150 0.8686114 0.8779422 2.043424
## F3D2 0.8013448 0.8536546 1.999751
## F3D3 0.9384873 0.9278193 2.045144
## F3D5 0.7919956 0.8538512 2.051674
## F3D6 0.8356331 0.8849169 2.055520
## F3D7 0.9220779 0.9313345 2.050004
## F3D8 0.7593426 0.8515203 2.056511
## F3D9 0.7434255 0.8447396 2.051201
## rarity_low_abundance rarity_rare_abundance final_reads
## F3D0 0.036737235 0.082347447 6424
## F3D1 0.022675737 0.047825191 4851
## F3D141 0.019632414 0.011904762 4788
## F3D142 0.006813627 0.012024048 2495
## F3D143 0.006114961 0.011822258 2453
## F3D144 0.008139535 0.004941860 3440
## F3D145 0.032516084 0.017909929 5751
## F3D146 0.016461981 0.043114711 3827
## F3D147 0.056913734 0.029040797 12844
## F3D148 0.055481762 0.024522326 9787
## F3D149 0.046857033 0.044174013 10436
## F3D150 0.012602100 0.032438740 4285
## F3D2 0.050849493 0.056973044 16657
## F3D3 0.028619216 0.015424642 5381
## F3D5 0.019057991 0.035121154 3673
## F3D6 0.032297195 0.035178165 6595
## F3D7 0.021164021 0.008658009 4158
## F3D8 0.024313372 0.064835660 4442
## F3D9 0.032366824 0.074173972 5932
set.seed(1)
vegan::adonis2(formula = distance(phyloseq, method = "bray") ~ When + Day,
data = data.frame(sample_data_with_alpha), by = "terms",
permutations = 10000)
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 10000
##
## vegan::adonis2(formula = distance(phyloseq, method = "bray") ~ When + Day, data = data.frame(sample_data_with_alpha), permutations = 10000, by = "terms")
## Df SumOfSqs R2 F Pr(>F)
## When 1 0.50229 0.24709 5.8595 9.999e-05 ***
## Day 1 0.15900 0.07822 1.8548 0.09719 .
## Residual 16 1.37157 0.67470
## Total 18 2.03286 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
There was difference. R-squared shows that the 25% of variance in
beta-diversity was explained by When variable.
Multivariate anlaysis
vegan::adonis2(formula = distance(phyloseq, method = "bray") ~ final_reads + When,
data = data.frame(sample_data_with_alpha), by = "terms",
permutations = 10000)
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 10000
##
## vegan::adonis2(formula = distance(phyloseq, method = "bray") ~ final_reads + When, data = data.frame(sample_data_with_alpha), permutations = 10000, by = "terms")
## Df SumOfSqs R2 F Pr(>F)
## final_reads 1 0.65112 0.32030 11.7547 9.999e-05 ***
## When 1 0.49546 0.24373 8.9445 9.999e-05 ***
## Residual 16 0.88628 0.43598
## Total 18 2.03286 1.00000
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
There was difference. R-squared shows that the 24% of variance in
beta-diversity was explained by When variable, and 32% was
explained by Fian reads.
–> 1% of variance in beta-diversity was ADJUSTED by sequencing depth.
New example phyloseq data from the host depletion study
download.file("https://github.com/minsiksudo/IBS7048_Advanced_metagenomics/raw/refs/heads/main/phyloseq_examples_host_depletion.rds", destfile = "phyloseq_ex_host_dep_deidentified.rds")
phyloseq2 <- readRDS("phyloseq_ex_host_dep_deidentified.rds")
decontam
#write your own code
rarefaction
#write your own code
Bibliography
## Computing. R Foundation for Statistical Computing, Vienna, Austria. <https://www.R-project.org/>. We have invested a lot of time and effort in creating R, please cite it when using it for data analysis. See also 'citation("pkgname")' for citing R packages.
## version 1.4.5, <https://CRAN.R-project.org/package=readxl>.
## graphics of microbiome census data. Paul J. McMurdie and Susan Holmes (2013) PLoS ONE 8(4):e61217.
## Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller E, Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Spinu V, Takahashi K, Vaughan D, Wilke C, Woo K, Yutani H (2019). "Welcome to the tidyverse." Journal of Open Source Software_, *4*(43), 1686. doi:10.21105/joss.01686 <https://doi.org/10.21105/joss.01686>.
## R. version 0.5.0. Buffalo, New York. http://github.com/trinker/pacman
## J, reikoch, Beasley W, O'Connor B, Warnes GR, Quinn M, Kamvar ZN, Gao C (2024). yaml: Methods to Convert R Data to YAML and Back_. R package version 2.3.10, <https://CRAN.R-project.org/package=yaml>. ATTENTION: This citation information has been auto-generated from the package DESCRIPTION file and may need manual editing, see 'help("citation")'.