COD_202511020_MGK_IBS7048_week12_statistical_microbiome_analysis

Minsik Kim

2025-11-03

This analysis pipe noted https://github.com/minsiksudo/BTE3207_Advanced_Biostatistics.

Roading the data

from previous lecture.. we generated a phyloseq object. it can be saved to hard disk using the below command.

saveRDS(ps, "phyloseq_example.rds")

And it can be loaded to a new R session like below!

getwd()
## [1] "/Volumes/macdrive/Dropbox"
library(phyloseq)
library(tidyverse)
library(microbiome)
## 
## microbiome R package (microbiome.github.com)
##     
## 
## 
##  Copyright (C) 2011-2022 Leo Lahti, 
##     Sudarshan Shetty et al. <microbiome.github.io>
## 
## Attaching package: 'microbiome'
## The following object is masked from 'package:ggplot2':
## 
##     alpha
## The following object is masked from 'package:base':
## 
##     transform
library(vegan)
## Loading required package: permute
## 
## Attaching package: 'vegan'
## The following object is masked from 'package:microbiome':
## 
##     diversity
library(ggplot2)
#source("https://raw.githubusercontent.com/joey711/phyloseq/master/inst/scripts/installer.R",
#       local = TRUE)


phyloseq <- readRDS("Inha/5_Lectures/2_IBS7048_Metagenomics/2025F/scripts/IBS7048_Advanced_metagenomics/phyloseq_examples_tree.rds")

phyloseq
## phyloseq-class experiment-level object
## otu_table()   OTU Table:         [ 225 taxa and 19 samples ]
## sample_data() Sample Data:       [ 19 samples by 4 sample variables ]
## tax_table()   Taxonomy Table:    [ 225 taxa by 6 taxonomic ranks ]
## phy_tree()    Phylogenetic Tree: [ 225 tips and 223 internal nodes ]
## refseq()      DNAStringSet:      [ 225 reference sequences ]

Basics of statistical tests

MPG dataset

MPG dataset is basic data installed in ggplot

head(mpg)
## # A tibble: 6 × 11
##   manufacturer model displ  year   cyl trans      drv     cty   hwy fl    class 
##   <chr>        <chr> <dbl> <int> <int> <chr>      <chr> <int> <int> <chr> <chr> 
## 1 audi         a4      1.8  1999     4 auto(l5)   f        18    29 p     compa…
## 2 audi         a4      1.8  1999     4 manual(m5) f        21    29 p     compa…
## 3 audi         a4      2    2008     4 manual(m6) f        20    31 p     compa…
## 4 audi         a4      2    2008     4 auto(av)   f        21    30 p     compa…
## 5 audi         a4      2.8  1999     6 auto(l5)   f        16    26 p     compa…
## 6 audi         a4      2.8  1999     6 manual(m5) f        18    26 p     compa…

Double-check the data type.

str(mpg)
## tibble [234 × 11] (S3: tbl_df/tbl/data.frame)
##  $ manufacturer: chr [1:234] "audi" "audi" "audi" "audi" ...
##  $ model       : chr [1:234] "a4" "a4" "a4" "a4" ...
##  $ displ       : num [1:234] 1.8 1.8 2 2 2.8 2.8 3.1 1.8 1.8 2 ...
##  $ year        : int [1:234] 1999 1999 2008 2008 1999 1999 2008 1999 1999 2008 ...
##  $ cyl         : int [1:234] 4 4 4 4 6 6 6 4 4 4 ...
##  $ trans       : chr [1:234] "auto(l5)" "manual(m5)" "manual(m6)" "auto(av)" ...
##  $ drv         : chr [1:234] "f" "f" "f" "f" ...
##  $ cty         : int [1:234] 18 21 20 21 16 18 18 18 16 20 ...
##  $ hwy         : int [1:234] 29 29 31 30 26 26 27 26 25 28 ...
##  $ fl          : chr [1:234] "p" "p" "p" "p" ...
##  $ class       : chr [1:234] "compact" "compact" "compact" "compact" ...

t.test fucntion will conduct basic t-test

Paired t-test

t.test(mpg$cty, mpg$hwy, paired = T)
## 
##  Paired t-test
## 
## data:  mpg$cty and mpg$hwy
## t = -44.492, df = 233, p-value < 2.2e-16
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -6.872628 -6.289765
## sample estimates:
## mean difference 
##       -6.581197

Unpaired t-test

mpg$class
##   [1] "compact"    "compact"    "compact"    "compact"    "compact"   
##   [6] "compact"    "compact"    "compact"    "compact"    "compact"   
##  [11] "compact"    "compact"    "compact"    "compact"    "compact"   
##  [16] "midsize"    "midsize"    "midsize"    "suv"        "suv"       
##  [21] "suv"        "suv"        "suv"        "2seater"    "2seater"   
##  [26] "2seater"    "2seater"    "2seater"    "suv"        "suv"       
##  [31] "suv"        "suv"        "midsize"    "midsize"    "midsize"   
##  [36] "midsize"    "midsize"    "minivan"    "minivan"    "minivan"   
##  [41] "minivan"    "minivan"    "minivan"    "minivan"    "minivan"   
##  [46] "minivan"    "minivan"    "minivan"    "pickup"     "pickup"    
##  [51] "pickup"     "pickup"     "pickup"     "pickup"     "pickup"    
##  [56] "pickup"     "pickup"     "suv"        "suv"        "suv"       
##  [61] "suv"        "suv"        "suv"        "suv"        "pickup"    
##  [66] "pickup"     "pickup"     "pickup"     "pickup"     "pickup"    
##  [71] "pickup"     "pickup"     "pickup"     "pickup"     "suv"       
##  [76] "suv"        "suv"        "suv"        "suv"        "suv"       
##  [81] "suv"        "suv"        "suv"        "pickup"     "pickup"    
##  [86] "pickup"     "pickup"     "pickup"     "pickup"     "pickup"    
##  [91] "subcompact" "subcompact" "subcompact" "subcompact" "subcompact"
##  [96] "subcompact" "subcompact" "subcompact" "subcompact" "subcompact"
## [101] "subcompact" "subcompact" "subcompact" "subcompact" "subcompact"
## [106] "subcompact" "subcompact" "subcompact" "midsize"    "midsize"   
## [111] "midsize"    "midsize"    "midsize"    "midsize"    "midsize"   
## [116] "subcompact" "subcompact" "subcompact" "subcompact" "subcompact"
## [121] "subcompact" "subcompact" "suv"        "suv"        "suv"       
## [126] "suv"        "suv"        "suv"        "suv"        "suv"       
## [131] "suv"        "suv"        "suv"        "suv"        "suv"       
## [136] "suv"        "suv"        "suv"        "suv"        "suv"       
## [141] "suv"        "compact"    "compact"    "midsize"    "midsize"   
## [146] "midsize"    "midsize"    "midsize"    "midsize"    "midsize"   
## [151] "suv"        "suv"        "suv"        "suv"        "midsize"   
## [156] "midsize"    "midsize"    "midsize"    "midsize"    "suv"       
## [161] "suv"        "suv"        "suv"        "suv"        "suv"       
## [166] "subcompact" "subcompact" "subcompact" "subcompact" "compact"   
## [171] "compact"    "compact"    "compact"    "suv"        "suv"       
## [176] "suv"        "suv"        "suv"        "suv"        "midsize"   
## [181] "midsize"    "midsize"    "midsize"    "midsize"    "midsize"   
## [186] "midsize"    "compact"    "compact"    "compact"    "compact"   
## [191] "compact"    "compact"    "compact"    "compact"    "compact"   
## [196] "compact"    "compact"    "compact"    "suv"        "suv"       
## [201] "pickup"     "pickup"     "pickup"     "pickup"     "pickup"    
## [206] "pickup"     "pickup"     "compact"    "compact"    "compact"   
## [211] "compact"    "compact"    "compact"    "compact"    "compact"   
## [216] "compact"    "compact"    "compact"    "compact"    "compact"   
## [221] "compact"    "subcompact" "subcompact" "subcompact" "subcompact"
## [226] "subcompact" "subcompact" "midsize"    "midsize"    "midsize"   
## [231] "midsize"    "midsize"    "midsize"    "midsize"
mpg %>% filter(class == "2seater")%>% .$cty
## [1] 16 15 16 15 15
city_mpg_2seater <- mpg %>% filter(class == "2seater") %>% .$cty

city_mpg_midsize <- mpg %>% filter(class == "midsize") %>% .$cty

city_mpg_2seater
## [1] 16 15 16 15 15
city_mpg_midsize
##  [1] 15 17 16 19 22 18 18 17 18 18 21 21 18 18 19 23 23 19 19 18 19 19 18 16 17
## [26] 18 16 21 21 21 21 18 18 19 21 18 19 21 16 18 17
t.test(city_mpg_2seater,
       city_mpg_midsize,
       paired = F,
       var.equal = F)
## 
##  Welch Two Sample t-test
## 
## data:  city_mpg_2seater and city_mpg_midsize
## t = -8.5965, df = 20.862, p-value = 2.69e-08
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -4.168314 -2.543882
## sample estimates:
## mean of x mean of y 
##   15.4000   18.7561

Linear model

Univariate linear model

Boxplot of categorical x, continuous y

boxplot(data = mpg, cty ~ class)

Linear model

lm(data = mpg, 
       cty ~ class)
## 
## Call:
## lm(formula = cty ~ class, data = mpg)
## 
## Coefficients:
##     (Intercept)     classcompact     classmidsize     classminivan  
##         15.4000           4.7277           3.3561           0.4182  
##     classpickup  classsubcompact         classsuv  
##         -2.4000           4.9714          -1.9000

Linear model - the output can be stored

lm_result_mpg <- 
        lm(data = mpg, 
           cty ~ class)

lm_result_mpg
## 
## Call:
## lm(formula = cty ~ class, data = mpg)
## 
## Coefficients:
##     (Intercept)     classcompact     classmidsize     classminivan  
##         15.4000           4.7277           3.3561           0.4182  
##     classpickup  classsubcompact         classsuv  
##         -2.4000           4.9714          -1.9000

The result of lm is a list having multiple caclulated factors

lm_result_mpg %>% names
##  [1] "coefficients"  "residuals"     "effects"       "rank"         
##  [5] "fitted.values" "assign"        "qr"            "df.residual"  
##  [9] "contrasts"     "xlevels"       "call"          "terms"        
## [13] "model"

Summary of a linear model

summary(lm_result_mpg)
## 
## Call:
## lm(formula = cty ~ class, data = mpg)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6.3714 -1.7561 -0.1277  1.0000 14.6286 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      15.4000     1.3024  11.824  < 2e-16 ***
## classcompact      4.7277     1.3699   3.451 0.000666 ***
## classmidsize      3.3561     1.3796   2.433 0.015759 *  
## classminivan      0.4182     1.5708   0.266 0.790307    
## classpickup      -2.4000     1.3976  -1.717 0.087304 .  
## classsubcompact   4.9714     1.3923   3.571 0.000435 ***
## classsuv         -1.9000     1.3539  -1.403 0.161884    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.912 on 227 degrees of freedom
## Multiple R-squared:  0.5438, Adjusted R-squared:  0.5317 
## F-statistic:  45.1 on 6 and 227 DF,  p-value: < 2.2e-16

This result will show you how different data is. (by the factor label of class)

You have 1. Estimate (beta) 2. Standard errors 3. p-values 4. R-squared values

ANOVA

Analysis of variance for linear model can be done to this linear model result.

anova(lm_result_mpg)
## Analysis of Variance Table
## 
## Response: cty
##            Df Sum Sq Mean Sq F value    Pr(>F)    
## class       6 2295.0  382.51  45.099 < 2.2e-16 ***
## Residuals 227 1925.3    8.48                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

How can we test different multiple associations?

ggplot(mpg,
       aes(y = cty, x = class, fill = as.factor(year))) +
        geom_boxplot()

Anova - Multivariate

lm(data = mpg,
        formula = hwy ~ class + as.factor(year)) %>%
        anova
## Analysis of Variance Table
## 
## Response: hwy
##                  Df Sum Sq Mean Sq F value Pr(>F)    
## class             6 5683.2  947.21 83.4157 <2e-16 ***
## as.factor(year)   1   12.1   12.15  1.0699 0.3021    
## Residuals       226 2566.3   11.36                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

linear model - Multivariate

Multiple linear regression can be don by just adding a new variable to formula

lm(data = mpg,
        formula = hwy ~ class * as.factor(year)) %>%
        summary
## 
## Call:
## lm(formula = hwy ~ class * as.factor(year), data = mpg)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -8.000 -1.812 -0.500  1.341 16.080 
## 
## Coefficients:
##                                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                          24.5000     2.3822  10.285  < 2e-16 ***
## classcompact                          3.4200     2.4756   1.381  0.16854    
## classmidsize                          2.0000     2.4984   0.800  0.42429    
## classminivan                         -2.0000     2.7507  -0.727  0.46795    
## classpickup                          -7.6875     2.5267  -3.043  0.00263 ** 
## classsubcompact                       4.5000     2.5044   1.797  0.07373 .  
## classsuv                             -6.9483     2.4629  -2.821  0.00522 ** 
## as.factor(year)2008                   0.5000     3.0754   0.163  0.87100    
## classcompact:as.factor(year)2008      0.3073     3.2292   0.095  0.92428    
## classmidsize:as.factor(year)2008      1.0476     3.2505   0.322  0.74754    
## classminivan:as.factor(year)2008     -0.8000     3.6904  -0.217  0.82858    
## classpickup:as.factor(year)2008      -0.3713     3.2916  -0.113  0.91029    
## classsubcompact:as.factor(year)2008  -2.3750     3.2809  -0.724  0.46991    
## classsuv:as.factor(year)2008          0.5846     3.1927   0.183  0.85487    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.369 on 220 degrees of freedom
## Multiple R-squared:  0.6978, Adjusted R-squared:  0.6799 
## F-statistic: 39.07 on 13 and 220 DF,  p-value: < 2.2e-16

Statistical tests for alpha diversity

# microbiome package

microbiome::alpha(phyloseq)
## Observed richness
## Other forms of richness
## Diversity
## Evenness
## Dominance
## Rarity
##        observed chao1 diversity_inverse_simpson diversity_gini_simpson
## F3D0        104   104                  27.74707              0.9639602
## F3D1         94    94                  32.96293              0.9696629
## F3D141       72    72                  19.87039              0.9496739
## F3D142       47    47                  16.13006              0.9380039
## F3D143       54    54                  18.03920              0.9445652
## F3D144       45    45                  14.43812              0.9307389
## F3D145       70    70                  15.85773              0.9369393
## F3D146       82    82                  22.71345              0.9559732
## F3D147      103   103                  17.15604              0.9417115
## F3D148       94    94                  20.20291              0.9505022
## F3D149      109   109                  23.13346              0.9567726
## F3D150       75    75                  21.86987              0.9542750
## F3D2        132   132                  14.23904              0.9297705
## F3D3         64    64                  12.67559              0.9211082
## F3D5         82    82                  26.56809              0.9623609
## F3D6         87    87                  16.13482              0.9380222
## F3D7         58    58                  12.38949              0.9192864
## F3D8         97    97                  22.16340              0.9548806
## F3D9        101   101                  23.20831              0.9569120
##        diversity_shannon diversity_fisher diversity_coverage evenness_camargo
## F3D0            3.845083        17.623177                 10        0.3282552
## F3D1            3.939724        16.535173                 13        0.3954679
## F3D141          3.416365        12.020521                  7        0.2989099
## F3D142          3.100883         8.218271                  6        0.3315311
## F3D143          3.256438         9.764408                  6        0.3352886
## F3D144          2.985183         7.309790                  6        0.3107364
## F3D145          3.114549        11.214612                  6        0.2344288
## F3D146          3.609865        14.739937                  8        0.3237746
## F3D147          3.338489        15.294818                  6        0.2098778
## F3D148          3.424846        14.412322                  7        0.2390079
## F3D149          3.631134        16.969707                  8        0.2559155
## F3D150          3.557807        12.914216                  8        0.3314166
## F3D2            3.553624        19.560660                  6        0.2355371
## F3D3            3.017631        10.208471                  5        0.2358849
## F3D5            3.737352        14.873300                  9        0.3723447
## F3D6            3.399273        14.155266                  6        0.2716844
## F3D7            2.965123         9.540154                  5        0.2472467
## F3D8            3.701149        17.508733                  8        0.3136044
## F3D9            3.737488        17.292247                  8        0.3137205
##        evenness_pielou evenness_simpson evenness_evar evenness_bulla
## F3D0         0.8278982        0.2667988     0.3715485      0.5085783
## F3D1         0.8671513        0.3506694     0.4058522      0.5706106
## F3D141       0.7988383        0.2759776     0.3439671      0.4407498
## F3D142       0.8053933        0.3431927     0.3020775      0.4582905
## F3D143       0.8163578        0.3340592     0.3757782      0.4666677
## F3D144       0.7841996        0.3208471     0.2944975      0.4213795
## F3D145       0.7330946        0.2265390     0.2382547      0.3366673
## F3D146       0.8191729        0.2769933     0.4100491      0.4723746
## F3D147       0.7203202        0.1665635     0.2780173      0.3305908
## F3D148       0.7538243        0.2149246     0.2752081      0.3582721
## F3D149       0.7740066        0.2122336     0.3033751      0.3963266
## F3D150       0.8240456        0.2915983     0.4177821      0.4879845
## F3D2         0.7277838        0.1078715     0.3021921      0.3973778
## F3D3         0.7255868        0.1980561     0.2818964      0.3534158
## F3D5         0.8481030        0.3240011     0.4235522      0.5441913
## F3D6         0.7611605        0.1854576     0.3893171      0.4116843
## F3D7         0.7302461        0.2136119     0.2977758      0.3614254
## F3D8         0.8090454        0.2284886     0.3993826      0.4895786
## F3D9         0.8098354        0.2297852     0.4161506      0.4837492
##        dominance_dbp dominance_dmn dominance_absolute dominance_relative
## F3D0      0.08981943     0.1625156                577         0.08981943
## F3D1      0.08328180     0.1558442                404         0.08328180
## F3D141    0.10338346     0.1959064                495         0.10338346
## F3D142    0.12144289     0.2368737                303         0.12144289
## F3D143    0.09906237     0.1928251                243         0.09906237
## F3D144    0.12093023     0.2244186                416         0.12093023
## F3D145    0.11180664     0.2123109                643         0.11180664
## F3D146    0.10164620     0.1865691                389         0.10164620
## F3D147    0.11592962     0.2103706               1489         0.11592962
## F3D148    0.08899561     0.1766629                871         0.08899561
## F3D149    0.08576083     0.1699885                895         0.08576083
## F3D150    0.10898483     0.2016336                467         0.10898483
## F3D2      0.20886114     0.3034160               3479         0.20886114
## F3D3      0.18267980     0.2943691                983         0.18267980
## F3D5      0.08902804     0.1660768                327         0.08902804
## F3D6      0.15284306     0.2545868               1008         0.15284306
## F3D7      0.15488215     0.2768158                644         0.15488215
## F3D8      0.12449347     0.2030617                553         0.12449347
## F3D9      0.09979771     0.1857721                592         0.09979771
##        dominance_simpson dominance_core_abundance dominance_gini
## F3D0          0.03603984                0.7693026      0.8351031
## F3D1          0.03033711                0.7179963      0.8227637
## F3D141        0.05032614                0.9043442      0.8958192
## F3D142        0.06199606                0.9482966      0.9261118
## F3D143        0.05543484                0.9168365      0.9123848
## F3D144        0.06926109                0.9526163      0.9334729
## F3D145        0.06306074                0.9238393      0.9229313
## F3D146        0.04402677                0.8228377      0.8689510
## F3D147        0.05828851                0.8864061      0.8975764
## F3D148        0.04949782                0.8900582      0.8934573
## F3D149        0.04322743                0.8462054      0.8668362
## F3D150        0.04572500                0.8686114      0.8779422
## F3D2          0.07022948                0.8013448      0.8536546
## F3D3          0.07889181                0.9384873      0.9278193
## F3D5          0.03763914                0.7919956      0.8538512
## F3D6          0.06197778                0.8356331      0.8849169
## F3D7          0.08071357                0.9220779      0.9313345
## F3D8          0.04511944                0.7593426      0.8515203
## F3D9          0.04308802                0.7434255      0.8447396
##        rarity_log_modulo_skewness rarity_low_abundance rarity_rare_abundance
## F3D0                     2.042400          0.036737235           0.082347447
## F3D1                     2.025227          0.022675737           0.047825191
## F3D141                   2.043099          0.019632414           0.011904762
## F3D142                   1.995630          0.006813627           0.012024048
## F3D143                   2.039543          0.006114961           0.011822258
## F3D144                   2.020668          0.008139535           0.004941860
## F3D145                   2.042709          0.032516084           0.017909929
## F3D146                   2.047470          0.016461981           0.043114711
## F3D147                   2.047934          0.056913734           0.029040797
## F3D148                   2.050079          0.055481762           0.024522326
## F3D149                   2.046433          0.046857033           0.044174013
## F3D150                   2.043424          0.012602100           0.032438740
## F3D2                     1.999751          0.050849493           0.056973044
## F3D3                     2.045144          0.028619216           0.015424642
## F3D5                     2.051674          0.019057991           0.035121154
## F3D6                     2.055520          0.032297195           0.035178165
## F3D7                     2.050004          0.021164021           0.008658009
## F3D8                     2.056511          0.024313372           0.064835660
## F3D9                     2.051201          0.032366824           0.074173972
sample_data_with_alpha <-
        phyloseq %>%
        sample_data() %>%
        merge(., #Merge function will do the same thing as vloopup at excel
              microbiome::alpha(phyloseq),
              by = 0) %>%
        column_to_rownames("Row.names") %>%
        sample_data
## Observed richness
## Other forms of richness
## Diversity
## Evenness
## Dominance
## Rarity
sample_data(phyloseq) <- sample_data_with_alpha
sample_data_with_alpha
##        Subject Gender Day  When observed chao1 diversity_inverse_simpson
## F3D0         3      F   0 Early      104   104                  27.74707
## F3D1         3      F   1 Early       94    94                  32.96293
## F3D141       3      F 141  Late       72    72                  19.87039
## F3D142       3      F 142  Late       47    47                  16.13006
## F3D143       3      F 143  Late       54    54                  18.03920
## F3D144       3      F 144  Late       45    45                  14.43812
## F3D145       3      F 145  Late       70    70                  15.85773
## F3D146       3      F 146  Late       82    82                  22.71345
## F3D147       3      F 147  Late      103   103                  17.15604
## F3D148       3      F 148  Late       94    94                  20.20291
## F3D149       3      F 149  Late      109   109                  23.13346
## F3D150       3      F 150  Late       75    75                  21.86987
## F3D2         3      F   2 Early      132   132                  14.23904
## F3D3         3      F   3 Early       64    64                  12.67559
## F3D5         3      F   5 Early       82    82                  26.56809
## F3D6         3      F   6 Early       87    87                  16.13482
## F3D7         3      F   7 Early       58    58                  12.38949
## F3D8         3      F   8 Early       97    97                  22.16340
## F3D9         3      F   9 Early      101   101                  23.20831
##        diversity_gini_simpson diversity_shannon diversity_fisher
## F3D0                0.9639602          3.845083        17.623177
## F3D1                0.9696629          3.939724        16.535173
## F3D141              0.9496739          3.416365        12.020521
## F3D142              0.9380039          3.100883         8.218271
## F3D143              0.9445652          3.256438         9.764408
## F3D144              0.9307389          2.985183         7.309790
## F3D145              0.9369393          3.114549        11.214612
## F3D146              0.9559732          3.609865        14.739937
## F3D147              0.9417115          3.338489        15.294818
## F3D148              0.9505022          3.424846        14.412322
## F3D149              0.9567726          3.631134        16.969707
## F3D150              0.9542750          3.557807        12.914216
## F3D2                0.9297705          3.553624        19.560660
## F3D3                0.9211082          3.017631        10.208471
## F3D5                0.9623609          3.737352        14.873300
## F3D6                0.9380222          3.399273        14.155266
## F3D7                0.9192864          2.965123         9.540154
## F3D8                0.9548806          3.701149        17.508733
## F3D9                0.9569120          3.737488        17.292247
##        diversity_coverage evenness_camargo evenness_pielou evenness_simpson
## F3D0                   10        0.3282552       0.8278982        0.2667988
## F3D1                   13        0.3954679       0.8671513        0.3506694
## F3D141                  7        0.2989099       0.7988383        0.2759776
## F3D142                  6        0.3315311       0.8053933        0.3431927
## F3D143                  6        0.3352886       0.8163578        0.3340592
## F3D144                  6        0.3107364       0.7841996        0.3208471
## F3D145                  6        0.2344288       0.7330946        0.2265390
## F3D146                  8        0.3237746       0.8191729        0.2769933
## F3D147                  6        0.2098778       0.7203202        0.1665635
## F3D148                  7        0.2390079       0.7538243        0.2149246
## F3D149                  8        0.2559155       0.7740066        0.2122336
## F3D150                  8        0.3314166       0.8240456        0.2915983
## F3D2                    6        0.2355371       0.7277838        0.1078715
## F3D3                    5        0.2358849       0.7255868        0.1980561
## F3D5                    9        0.3723447       0.8481030        0.3240011
## F3D6                    6        0.2716844       0.7611605        0.1854576
## F3D7                    5        0.2472467       0.7302461        0.2136119
## F3D8                    8        0.3136044       0.8090454        0.2284886
## F3D9                    8        0.3137205       0.8098354        0.2297852
##        evenness_evar evenness_bulla dominance_dbp dominance_dmn
## F3D0       0.3715485      0.5085783    0.08981943     0.1625156
## F3D1       0.4058522      0.5706106    0.08328180     0.1558442
## F3D141     0.3439671      0.4407498    0.10338346     0.1959064
## F3D142     0.3020775      0.4582905    0.12144289     0.2368737
## F3D143     0.3757782      0.4666677    0.09906237     0.1928251
## F3D144     0.2944975      0.4213795    0.12093023     0.2244186
## F3D145     0.2382547      0.3366673    0.11180664     0.2123109
## F3D146     0.4100491      0.4723746    0.10164620     0.1865691
## F3D147     0.2780173      0.3305908    0.11592962     0.2103706
## F3D148     0.2752081      0.3582721    0.08899561     0.1766629
## F3D149     0.3033751      0.3963266    0.08576083     0.1699885
## F3D150     0.4177821      0.4879845    0.10898483     0.2016336
## F3D2       0.3021921      0.3973778    0.20886114     0.3034160
## F3D3       0.2818964      0.3534158    0.18267980     0.2943691
## F3D5       0.4235522      0.5441913    0.08902804     0.1660768
## F3D6       0.3893171      0.4116843    0.15284306     0.2545868
## F3D7       0.2977758      0.3614254    0.15488215     0.2768158
## F3D8       0.3993826      0.4895786    0.12449347     0.2030617
## F3D9       0.4161506      0.4837492    0.09979771     0.1857721
##        dominance_absolute dominance_relative dominance_simpson
## F3D0                  577         0.08981943        0.03603984
## F3D1                  404         0.08328180        0.03033711
## F3D141                495         0.10338346        0.05032614
## F3D142                303         0.12144289        0.06199606
## F3D143                243         0.09906237        0.05543484
## F3D144                416         0.12093023        0.06926109
## F3D145                643         0.11180664        0.06306074
## F3D146                389         0.10164620        0.04402677
## F3D147               1489         0.11592962        0.05828851
## F3D148                871         0.08899561        0.04949782
## F3D149                895         0.08576083        0.04322743
## F3D150                467         0.10898483        0.04572500
## F3D2                 3479         0.20886114        0.07022948
## F3D3                  983         0.18267980        0.07889181
## F3D5                  327         0.08902804        0.03763914
## F3D6                 1008         0.15284306        0.06197778
## F3D7                  644         0.15488215        0.08071357
## F3D8                  553         0.12449347        0.04511944
## F3D9                  592         0.09979771        0.04308802
##        dominance_core_abundance dominance_gini rarity_log_modulo_skewness
## F3D0                  0.7693026      0.8351031                   2.042400
## F3D1                  0.7179963      0.8227637                   2.025227
## F3D141                0.9043442      0.8958192                   2.043099
## F3D142                0.9482966      0.9261118                   1.995630
## F3D143                0.9168365      0.9123848                   2.039543
## F3D144                0.9526163      0.9334729                   2.020668
## F3D145                0.9238393      0.9229313                   2.042709
## F3D146                0.8228377      0.8689510                   2.047470
## F3D147                0.8864061      0.8975764                   2.047934
## F3D148                0.8900582      0.8934573                   2.050079
## F3D149                0.8462054      0.8668362                   2.046433
## F3D150                0.8686114      0.8779422                   2.043424
## F3D2                  0.8013448      0.8536546                   1.999751
## F3D3                  0.9384873      0.9278193                   2.045144
## F3D5                  0.7919956      0.8538512                   2.051674
## F3D6                  0.8356331      0.8849169                   2.055520
## F3D7                  0.9220779      0.9313345                   2.050004
## F3D8                  0.7593426      0.8515203                   2.056511
## F3D9                  0.7434255      0.8447396                   2.051201
##        rarity_low_abundance rarity_rare_abundance
## F3D0            0.036737235           0.082347447
## F3D1            0.022675737           0.047825191
## F3D141          0.019632414           0.011904762
## F3D142          0.006813627           0.012024048
## F3D143          0.006114961           0.011822258
## F3D144          0.008139535           0.004941860
## F3D145          0.032516084           0.017909929
## F3D146          0.016461981           0.043114711
## F3D147          0.056913734           0.029040797
## F3D148          0.055481762           0.024522326
## F3D149          0.046857033           0.044174013
## F3D150          0.012602100           0.032438740
## F3D2            0.050849493           0.056973044
## F3D3            0.028619216           0.015424642
## F3D5            0.019057991           0.035121154
## F3D6            0.032297195           0.035178165
## F3D7            0.021164021           0.008658009
## F3D8            0.024313372           0.064835660
## F3D9            0.032366824           0.074173972

Alpha diversity boxplots

sample_data_with_alpha %>%
        ggplot(aes(x = When, y = observed)) +
        geom_boxplot() +
        xlab("Genus richness") +
        theme_classic(base_size = 15)

Statistical tests on alpha diversity (linear model)

sample_data_with_alpha %>%
        data.frame() %>% 
        lm(data = ., observed ~ When) %>%
        summary
## 
## Call:
## lm(formula = observed ~ When, data = .)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -33.00 -15.05  -0.10  11.50  41.00 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   91.000      7.418  12.268 7.18e-10 ***
## WhenLate     -15.900     10.225  -1.555    0.138    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 22.25 on 17 degrees of freedom
## Multiple R-squared:  0.1245, Adjusted R-squared:  0.07303 
## F-statistic: 2.418 on 1 and 17 DF,  p-value: 0.1384

The difference between those two group - is not statistically significant

How about other outcomes?

sample_data_with_alpha %>%
        data.frame() %>% 
        lm(data = ., observed ~ evenness_simpson) %>%
        summary
## 
## Call:
## lm(formula = observed ~ evenness_simpson, data = .)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -32.386 -13.908   2.837  14.264  32.095 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        134.78      17.58   7.668 6.47e-07 ***
## evenness_simpson  -207.81      67.83  -3.063  0.00703 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 19.09 on 17 degrees of freedom
## Multiple R-squared:  0.3557, Adjusted R-squared:  0.3178 
## F-statistic: 9.385 on 1 and 17 DF,  p-value: 0.007033

The evenness difference between those two group - is statistically significant

sample_data_with_alpha %>%
        data.frame() %>% 
        lm(data = ., observed ~ dominance_core_abundance) %>%
        summary
## 
## Call:
## lm(formula = observed ~ dominance_core_abundance, data = .)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -19.181 -12.114  -4.527   2.548  37.443 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                273.61      45.13   6.062 1.27e-05 ***
## dominance_core_abundance  -223.44      52.62  -4.247 0.000544 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 16.57 on 17 degrees of freedom
## Multiple R-squared:  0.5147, Adjusted R-squared:  0.4862 
## F-statistic: 18.03 on 1 and 17 DF,  p-value: 0.0005441

Core abundance was also statistically different.

How can we say this difference is not because of deeper sequencing?

sample_data_with_alpha$final_reads <- phyloseq %>% sample_sums

sample_data_with_alpha %>%
        data.frame() %>% 
        ggplot(aes(x = final_reads, y = observed)) +
        geom_point()

Linear model of this association

(As deeper sequencing identified more taxa)

lm_eqn <- function(m){
    eq <- substitute(italic(y) == a + b %.% italic(x)*","~~italic(r)^2~"="~r2, 
         list(a = format(unname(coef(m)[1]), digits = 2),
              b = format(unname(coef(m)[2]), digits = 2),
             r2 = format(summary(m)$r.squared, digits = 3)))
    as.character(as.expression(eq));
}


ggplot(data = sample_data_with_alpha, aes(x = final_reads, y = observed)) +
        geom_smooth(method = "lm", 
                    se = T, 
                    color = "red", 
                    linetype = "dashed"
                    ) +  # Plot regression slope
        geom_point() +
        theme_classic( base_family = "serif", base_size = 20) +
        geom_label(x = 12000, y = 60,
                  label = lm_eqn(lm(data = data.frame(sample_data_with_alpha), observed ~ final_reads)),
                  parse = TRUE, 
                  family='serif',
                  size = 8, label.size = 0, color = "red")#, fill = alpha(c("white"),1)) 
## Warning: The `label.size` argument of `geom_label()` is deprecated as of ggplot2 3.5.0.
## ℹ Please use the `linewidth` argument instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## `geom_smooth()` using formula = 'y ~ x'

lm(data = data.frame(sample_data_with_alpha), observed ~ final_reads) %>% summary
## 
## Call:
## lm(formula = observed ~ final_reads, data = data.frame(sample_data_with_alpha))
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -24.01 -11.19  -1.72  11.47  23.08 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 5.217e+01  6.728e+00   7.754 5.57e-07 ***
## final_reads 4.896e-03  9.353e-04   5.235 6.73e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 14.72 on 17 degrees of freedom
## Multiple R-squared:  0.6171, Adjusted R-squared:  0.5946 
## F-statistic:  27.4 on 1 and 17 DF,  p-value: 6.727e-05

There is significant correlation !

ggplot(data = sample_data_with_alpha, aes(x = final_reads, y = observed)) +
        geom_smooth(method = "lm", 
                    se = T, 
                    color = "red", 
                    linetype = "dashed"
                    ) +  # Plot regression slope
        geom_point(aes(col = When)) +
        theme_classic( base_family = "serif", base_size = 20) +
        geom_label(x = 12000, y = 60,
                  label = lm_eqn(lm(data = data.frame(sample_data_with_alpha), observed ~ final_reads)),
                  parse = TRUE, 
                  family='serif',
                  size = 8, label.size = 0, color = "red")#, fill = alpha(c("white"),1)) 
## `geom_smooth()` using formula = 'y ~ x'

Even for evenness, this association seem to have a significant term in general.

Multiple linear regression

lm(data = data.frame(sample_data_with_alpha),
   evenness_simpson ~ final_reads + When) %>% summary
## 
## Call:
## lm(formula = evenness_simpson ~ final_reads + When, data = data.frame(sample_data_with_alpha))
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.05258 -0.02686  0.00104  0.01807  0.09423 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  3.247e-01  2.111e-02  15.378 5.26e-11 ***
## final_reads -1.406e-05  2.540e-06  -5.537 4.51e-05 ***
## WhenLate     2.616e-02  1.836e-02   1.424    0.174    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.03989 on 16 degrees of freedom
## Multiple R-squared:  0.6786, Adjusted R-squared:  0.6385 
## F-statistic: 16.89 on 2 and 16 DF,  p-value: 0.0001138

The association we ‘obsereved’ was not because of experimental condition, but due to different final reads.

glm(data = data.frame(sample_data_with_alpha),
   as.factor(When) ~ final_reads, family = "binomial") %>% summary
## 
## Call:
## glm(formula = as.factor(When) ~ final_reads, family = "binomial", 
##     data = data.frame(sample_data_with_alpha))
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)
## (Intercept)  3.189e-01  9.189e-01   0.347    0.729
## final_reads -3.429e-05  1.278e-04  -0.268    0.788
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 26.287  on 18  degrees of freedom
## Residual deviance: 26.215  on 17  degrees of freedom
## AIC: 30.215
## 
## Number of Fisher Scoring iterations: 3

However, the change of dominance was due to experimental condition.

So, the TRUE change between the samples from early and late phase, was Dominance.

How would it change the association for beta diversity?

Ordination

Principal Coordinates Analysis with Bray-Curtis dist

phyloseq_rel <- phyloseq %>% transform_sample_counts(., function(x) {x/sum(x)})

sample_data(phyloseq_rel) <- sample_data_with_alpha

plot_ordination(physeq = phyloseq_rel,
                ordination = ordinate(phyloseq_rel, "PCoA", "bray"),
                col = "When"
                ) +
        geom_point(size = 8)
## Warning: `aes_string()` was deprecated in ggplot2 3.0.0.
## ℹ Please use tidy evaluation idioms with `aes()`.
## ℹ See also `vignette("ggplot2-in-packages")` for more information.
## ℹ The deprecated feature was likely used in the phyloseq package.
##   Please report the issue at <https://github.com/joey711/phyloseq/issues>.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

These two groups seems different. Is it really different, by sequencing depth or the sampling phase?

plot_ordination(physeq = phyloseq_rel,
                ordination = ordinate(phyloseq_rel, "PCoA", "bray"),
                col = "final_reads"
                ) +
        geom_point(size = 8) +
        scale_color_gradient(low = "blue", high = "red") 

Symbols and colors at the sample time

plot_ordination(physeq = phyloseq_rel,
                ordination = ordinate(phyloseq_rel, "PCoA", "bray"),
                shape = "When",
                col = "final_reads"
                ) +
        geom_point(size = 8) +
        scale_color_gradient(low = "blue", high = "red") +
        stat_ellipse(type = "norm", linetype = 2) +
        stat_ellipse(type = "t") +
        theme_classic() 
## Warning: The following aesthetics were dropped during statistical transformation:
## colour.
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
##   the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
##   variable into a factor?
## The following aesthetics were dropped during statistical transformation:
## colour.
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
##   the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
##   variable into a factor?

        #scale_color_brewer(type = "qual")

PERMANOVA

As betadiversity indices have multiple distances at the same time, the calculation must accompany multiple analyses, at a random order (permutation).

This is permutational anova (PERMANOVA).

Univariate permutational anova (PERMANOVA).

Based on sampling point

distance(phyloseq, method = "bray")
##             F3D0      F3D1    F3D141    F3D142    F3D143    F3D144    F3D145
## F3D1   0.3862528                                                            
## F3D141 0.3503389 0.4912335                                                  
## F3D142 0.4999439 0.5322625 0.3618015                                        
## F3D143 0.5250648 0.5550383 0.3522994 0.2332255                              
## F3D144 0.4148418 0.5510795 0.2265435 0.3206403 0.3107076                    
## F3D145 0.3458727 0.5998868 0.2695702 0.4474897 0.4763530 0.2742901          
## F3D146 0.4152766 0.4892832 0.2503772 0.3492566 0.2904459 0.2486583 0.3708499
## F3D147 0.5301017 0.6900819 0.5241606 0.6815959 0.6931424 0.5831491 0.4006991
## F3D148 0.4453149 0.6416177 0.3836021 0.6090213 0.6155229 0.4887730 0.2816321
## F3D149 0.4540925 0.6241251 0.4076458 0.6289537 0.6299170 0.5172961 0.3236548
## F3D150 0.4033056 0.5295534 0.2214262 0.3389381 0.3202731 0.3020065 0.3521323
## F3D2   0.5527923 0.5675098 0.6502681 0.7732874 0.7884877 0.7367766 0.6240628
## F3D3   0.4788649 0.4982408 0.4211820 0.5205688 0.5537401 0.4762499 0.3850162
## F3D5   0.4083391 0.3418583 0.4116535 0.4176394 0.4508652 0.4424294 0.4955433
## F3D6   0.3859743 0.4389306 0.4486515 0.5870187 0.6065429 0.5154958 0.3936498
## F3D7   0.4137214 0.4663115 0.3887771 0.4558846 0.5011345 0.4195841 0.3496821
## F3D8   0.4518682 0.3661896 0.4442037 0.4870982 0.5199420 0.5219487 0.5237908
## F3D9   0.4088702 0.3882964 0.4339552 0.5590364 0.5799642 0.5251814 0.4710263
##           F3D146    F3D147    F3D148    F3D149    F3D150      F3D2      F3D3
## F3D1                                                                        
## F3D141                                                                      
## F3D142                                                                      
## F3D143                                                                      
## F3D144                                                                      
## F3D145                                                                      
## F3D146                                                                      
## F3D147 0.5877872                                                            
## F3D148 0.4980167 0.2253988                                                  
## F3D149 0.5169319 0.2627148 0.1611531                                        
## F3D150 0.2564103 0.5565415 0.4640421 0.4315604                              
## F3D2   0.7084554 0.4970340 0.5563455 0.5121987 0.6675580                    
## F3D3   0.5230235 0.5074897 0.4145570 0.4078523 0.4694807 0.5417915          
## F3D5   0.4013333 0.6684628 0.5846954 0.5891984 0.3885398 0.6532218 0.4250055
## F3D6   0.5014393 0.4745614 0.4144793 0.4141272 0.4599265 0.4713573 0.2728791
## F3D7   0.4632436 0.5551112 0.4743636 0.4863643 0.4075566 0.6192169 0.2338820
## F3D8   0.4707945 0.6292954 0.5628646 0.5624412 0.4591498 0.6349590 0.3814517
## F3D9   0.4913413 0.5802088 0.5009861 0.5030547 0.4785162 0.5447342 0.3480067
##             F3D5      F3D6      F3D7      F3D8
## F3D1                                          
## F3D141                                        
## F3D142                                        
## F3D143                                        
## F3D144                                        
## F3D145                                        
## F3D146                                        
## F3D147                                        
## F3D148                                        
## F3D149                                        
## F3D150                                        
## F3D2                                          
## F3D3                                          
## F3D5                                          
## F3D6   0.3807947                              
## F3D7   0.3472098 0.2496978                    
## F3D8   0.2958718 0.3353266 0.2932558          
## F3D9   0.3442998 0.2718129 0.2755203 0.2111047
distance(phyloseq, method = "jaccard")
##             F3D0      F3D1    F3D141    F3D142    F3D143    F3D144    F3D145
## F3D1   0.5572617                                                            
## F3D141 0.5188904 0.6588284                                                  
## F3D142 0.6666168 0.6947406 0.5313571                                        
## F3D143 0.6885803 0.7138581 0.5210376 0.3782366                              
## F3D144 0.5864144 0.7105754 0.3694015 0.4855831 0.4741067                    
## F3D145 0.5139753 0.7499116 0.4246637 0.6182976 0.6453104 0.4304986          
## F3D146 0.5868486 0.6570721 0.4004827 0.5177022 0.4501481 0.3982808 0.5410510
## F3D147 0.6928974 0.8166254 0.6878023 0.8106536 0.8187645 0.7366951 0.5721416
## F3D148 0.6162185 0.7816896 0.5544977 0.7570084 0.7620107 0.6566118 0.4394898
## F3D149 0.6245717 0.7685677 0.5791881 0.7722180 0.7729436 0.6818657 0.4890320
## F3D150 0.5747937 0.6924288 0.3625699 0.5062789 0.4851619 0.4639093 0.5208548
## F3D2   0.7119978 0.7240909 0.7880757 0.8721512 0.8817368 0.8484414 0.7685206
## F3D3   0.6476114 0.6651011 0.5927207 0.6847027 0.7127834 0.6452158 0.5559735
## F3D5   0.5798875 0.5095296 0.5832217 0.5892040 0.6215122 0.6134503 0.6626933
## F3D6   0.5569718 0.6100789 0.6194057 0.7397754 0.7550908 0.6802998 0.5649192
## F3D7   0.5852941 0.6360333 0.5598841 0.6262647 0.6676743 0.5911367 0.5181696
## F3D8   0.6224645 0.5360744 0.6151538 0.6550989 0.6841603 0.6858953 0.6874839
## F3D9   0.5804228 0.5593854 0.6052563 0.7171563 0.7341486 0.6886806 0.6404050
##           F3D146    F3D147    F3D148    F3D149    F3D150      F3D2      F3D3
## F3D1                                                                        
## F3D141                                                                      
## F3D142                                                                      
## F3D143                                                                      
## F3D144                                                                      
## F3D145                                                                      
## F3D146                                                                      
## F3D147 0.7403853                                                            
## F3D148 0.6649014 0.3678783                                                  
## F3D149 0.6815493 0.4161110 0.2775743                                        
## F3D150 0.4081633 0.7151001 0.6339190 0.6029230                              
## F3D2   0.8293519 0.6640250 0.7149383 0.6774225 0.8006414                    
## F3D3   0.6868226 0.6732911 0.5861298 0.5793964 0.6389749 0.7028077          
## F3D5   0.5727878 0.8012918 0.7379278 0.7415039 0.5596380 0.7902410 0.5964967
## F3D6   0.6679448 0.6436645 0.5860521 0.5857000 0.6300680 0.6407109 0.4287589
## F3D7   0.6331736 0.7139183 0.6434825 0.6544348 0.5790979 0.7648350 0.3790994
## F3D8   0.6401908 0.7724755 0.7202986 0.7199518 0.6293388 0.7767277 0.5522476
## F3D9   0.6589254 0.7343445 0.6675426 0.6693765 0.6472925 0.7052788 0.5163279
##             F3D5      F3D6      F3D7      F3D8
## F3D1                                          
## F3D141                                        
## F3D142                                        
## F3D143                                        
## F3D144                                        
## F3D145                                        
## F3D146                                        
## F3D147                                        
## F3D148                                        
## F3D149                                        
## F3D150                                        
## F3D2                                          
## F3D3                                          
## F3D5                                          
## F3D6   0.5515588                              
## F3D7   0.5154502 0.3996130                    
## F3D8   0.4566375 0.5022391 0.4535156          
## F3D9   0.5122367 0.4274416 0.4320124 0.3486151
data.frame(sample_data_with_alpha)
##        Subject Gender Day  When observed chao1 diversity_inverse_simpson
## F3D0         3      F   0 Early      104   104                  27.74707
## F3D1         3      F   1 Early       94    94                  32.96293
## F3D141       3      F 141  Late       72    72                  19.87039
## F3D142       3      F 142  Late       47    47                  16.13006
## F3D143       3      F 143  Late       54    54                  18.03920
## F3D144       3      F 144  Late       45    45                  14.43812
## F3D145       3      F 145  Late       70    70                  15.85773
## F3D146       3      F 146  Late       82    82                  22.71345
## F3D147       3      F 147  Late      103   103                  17.15604
## F3D148       3      F 148  Late       94    94                  20.20291
## F3D149       3      F 149  Late      109   109                  23.13346
## F3D150       3      F 150  Late       75    75                  21.86987
## F3D2         3      F   2 Early      132   132                  14.23904
## F3D3         3      F   3 Early       64    64                  12.67559
## F3D5         3      F   5 Early       82    82                  26.56809
## F3D6         3      F   6 Early       87    87                  16.13482
## F3D7         3      F   7 Early       58    58                  12.38949
## F3D8         3      F   8 Early       97    97                  22.16340
## F3D9         3      F   9 Early      101   101                  23.20831
##        diversity_gini_simpson diversity_shannon diversity_fisher
## F3D0                0.9639602          3.845083        17.623177
## F3D1                0.9696629          3.939724        16.535173
## F3D141              0.9496739          3.416365        12.020521
## F3D142              0.9380039          3.100883         8.218271
## F3D143              0.9445652          3.256438         9.764408
## F3D144              0.9307389          2.985183         7.309790
## F3D145              0.9369393          3.114549        11.214612
## F3D146              0.9559732          3.609865        14.739937
## F3D147              0.9417115          3.338489        15.294818
## F3D148              0.9505022          3.424846        14.412322
## F3D149              0.9567726          3.631134        16.969707
## F3D150              0.9542750          3.557807        12.914216
## F3D2                0.9297705          3.553624        19.560660
## F3D3                0.9211082          3.017631        10.208471
## F3D5                0.9623609          3.737352        14.873300
## F3D6                0.9380222          3.399273        14.155266
## F3D7                0.9192864          2.965123         9.540154
## F3D8                0.9548806          3.701149        17.508733
## F3D9                0.9569120          3.737488        17.292247
##        diversity_coverage evenness_camargo evenness_pielou evenness_simpson
## F3D0                   10        0.3282552       0.8278982        0.2667988
## F3D1                   13        0.3954679       0.8671513        0.3506694
## F3D141                  7        0.2989099       0.7988383        0.2759776
## F3D142                  6        0.3315311       0.8053933        0.3431927
## F3D143                  6        0.3352886       0.8163578        0.3340592
## F3D144                  6        0.3107364       0.7841996        0.3208471
## F3D145                  6        0.2344288       0.7330946        0.2265390
## F3D146                  8        0.3237746       0.8191729        0.2769933
## F3D147                  6        0.2098778       0.7203202        0.1665635
## F3D148                  7        0.2390079       0.7538243        0.2149246
## F3D149                  8        0.2559155       0.7740066        0.2122336
## F3D150                  8        0.3314166       0.8240456        0.2915983
## F3D2                    6        0.2355371       0.7277838        0.1078715
## F3D3                    5        0.2358849       0.7255868        0.1980561
## F3D5                    9        0.3723447       0.8481030        0.3240011
## F3D6                    6        0.2716844       0.7611605        0.1854576
## F3D7                    5        0.2472467       0.7302461        0.2136119
## F3D8                    8        0.3136044       0.8090454        0.2284886
## F3D9                    8        0.3137205       0.8098354        0.2297852
##        evenness_evar evenness_bulla dominance_dbp dominance_dmn
## F3D0       0.3715485      0.5085783    0.08981943     0.1625156
## F3D1       0.4058522      0.5706106    0.08328180     0.1558442
## F3D141     0.3439671      0.4407498    0.10338346     0.1959064
## F3D142     0.3020775      0.4582905    0.12144289     0.2368737
## F3D143     0.3757782      0.4666677    0.09906237     0.1928251
## F3D144     0.2944975      0.4213795    0.12093023     0.2244186
## F3D145     0.2382547      0.3366673    0.11180664     0.2123109
## F3D146     0.4100491      0.4723746    0.10164620     0.1865691
## F3D147     0.2780173      0.3305908    0.11592962     0.2103706
## F3D148     0.2752081      0.3582721    0.08899561     0.1766629
## F3D149     0.3033751      0.3963266    0.08576083     0.1699885
## F3D150     0.4177821      0.4879845    0.10898483     0.2016336
## F3D2       0.3021921      0.3973778    0.20886114     0.3034160
## F3D3       0.2818964      0.3534158    0.18267980     0.2943691
## F3D5       0.4235522      0.5441913    0.08902804     0.1660768
## F3D6       0.3893171      0.4116843    0.15284306     0.2545868
## F3D7       0.2977758      0.3614254    0.15488215     0.2768158
## F3D8       0.3993826      0.4895786    0.12449347     0.2030617
## F3D9       0.4161506      0.4837492    0.09979771     0.1857721
##        dominance_absolute dominance_relative dominance_simpson
## F3D0                  577         0.08981943        0.03603984
## F3D1                  404         0.08328180        0.03033711
## F3D141                495         0.10338346        0.05032614
## F3D142                303         0.12144289        0.06199606
## F3D143                243         0.09906237        0.05543484
## F3D144                416         0.12093023        0.06926109
## F3D145                643         0.11180664        0.06306074
## F3D146                389         0.10164620        0.04402677
## F3D147               1489         0.11592962        0.05828851
## F3D148                871         0.08899561        0.04949782
## F3D149                895         0.08576083        0.04322743
## F3D150                467         0.10898483        0.04572500
## F3D2                 3479         0.20886114        0.07022948
## F3D3                  983         0.18267980        0.07889181
## F3D5                  327         0.08902804        0.03763914
## F3D6                 1008         0.15284306        0.06197778
## F3D7                  644         0.15488215        0.08071357
## F3D8                  553         0.12449347        0.04511944
## F3D9                  592         0.09979771        0.04308802
##        dominance_core_abundance dominance_gini rarity_log_modulo_skewness
## F3D0                  0.7693026      0.8351031                   2.042400
## F3D1                  0.7179963      0.8227637                   2.025227
## F3D141                0.9043442      0.8958192                   2.043099
## F3D142                0.9482966      0.9261118                   1.995630
## F3D143                0.9168365      0.9123848                   2.039543
## F3D144                0.9526163      0.9334729                   2.020668
## F3D145                0.9238393      0.9229313                   2.042709
## F3D146                0.8228377      0.8689510                   2.047470
## F3D147                0.8864061      0.8975764                   2.047934
## F3D148                0.8900582      0.8934573                   2.050079
## F3D149                0.8462054      0.8668362                   2.046433
## F3D150                0.8686114      0.8779422                   2.043424
## F3D2                  0.8013448      0.8536546                   1.999751
## F3D3                  0.9384873      0.9278193                   2.045144
## F3D5                  0.7919956      0.8538512                   2.051674
## F3D6                  0.8356331      0.8849169                   2.055520
## F3D7                  0.9220779      0.9313345                   2.050004
## F3D8                  0.7593426      0.8515203                   2.056511
## F3D9                  0.7434255      0.8447396                   2.051201
##        rarity_low_abundance rarity_rare_abundance final_reads
## F3D0            0.036737235           0.082347447        6424
## F3D1            0.022675737           0.047825191        4851
## F3D141          0.019632414           0.011904762        4788
## F3D142          0.006813627           0.012024048        2495
## F3D143          0.006114961           0.011822258        2453
## F3D144          0.008139535           0.004941860        3440
## F3D145          0.032516084           0.017909929        5751
## F3D146          0.016461981           0.043114711        3827
## F3D147          0.056913734           0.029040797       12844
## F3D148          0.055481762           0.024522326        9787
## F3D149          0.046857033           0.044174013       10436
## F3D150          0.012602100           0.032438740        4285
## F3D2            0.050849493           0.056973044       16657
## F3D3            0.028619216           0.015424642        5381
## F3D5            0.019057991           0.035121154        3673
## F3D6            0.032297195           0.035178165        6595
## F3D7            0.021164021           0.008658009        4158
## F3D8            0.024313372           0.064835660        4442
## F3D9            0.032366824           0.074173972        5932
set.seed(1)

vegan::adonis2(formula = distance(phyloseq, method = "bray") ~ When + Day,
       data = data.frame(sample_data_with_alpha), by = "terms",
       permutations = 10000) 
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 10000
## 
## vegan::adonis2(formula = distance(phyloseq, method = "bray") ~ When + Day, data = data.frame(sample_data_with_alpha), permutations = 10000, by = "terms")
##          Df SumOfSqs      R2      F    Pr(>F)    
## When      1  0.50229 0.24709 5.8595 9.999e-05 ***
## Day       1  0.15900 0.07822 1.8548   0.09719 .  
## Residual 16  1.37157 0.67470                     
## Total    18  2.03286 1.00000                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

There was difference. R-squared shows that the 25% of variance in beta-diversity was explained by When variable.

Multivariate anlaysis

vegan::adonis2(formula = distance(phyloseq, method = "bray") ~ final_reads + When,
       data = data.frame(sample_data_with_alpha), by = "terms",
       permutations = 10000) 
## Permutation test for adonis under reduced model
## Terms added sequentially (first to last)
## Permutation: free
## Number of permutations: 10000
## 
## vegan::adonis2(formula = distance(phyloseq, method = "bray") ~ final_reads + When, data = data.frame(sample_data_with_alpha), permutations = 10000, by = "terms")
##             Df SumOfSqs      R2       F    Pr(>F)    
## final_reads  1  0.65112 0.32030 11.7547 9.999e-05 ***
## When         1  0.49546 0.24373  8.9445 9.999e-05 ***
## Residual    16  0.88628 0.43598                      
## Total       18  2.03286 1.00000                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

There was difference. R-squared shows that the 24% of variance in beta-diversity was explained by When variable, and 32% was explained by Fian reads.

–> 1% of variance in beta-diversity was ADJUSTED by sequencing depth.

New example phyloseq data from the host depletion study

download.file("https://github.com/minsiksudo/IBS7048_Advanced_metagenomics/raw/refs/heads/main/phyloseq_examples_host_depletion.rds", destfile = "phyloseq_ex_host_dep_deidentified.rds")

phyloseq2 <- readRDS("phyloseq_ex_host_dep_deidentified.rds")

decontam

#write your own code

rarefaction

#write your own code

Bibliography

## Computing. R Foundation for Statistical Computing, Vienna, Austria. <https://www.R-project.org/>. We have invested a lot of time and effort in creating R, please cite it when using it for data analysis. See also 'citation("pkgname")' for citing R packages.
## version 1.4.5, <https://CRAN.R-project.org/package=readxl>.
## graphics of microbiome census data. Paul J. McMurdie and Susan Holmes (2013) PLoS ONE 8(4):e61217.
## Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller E, Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Spinu V, Takahashi K, Vaughan D, Wilke C, Woo K, Yutani H (2019). "Welcome to the tidyverse." Journal of Open Source Software_, *4*(43), 1686. doi:10.21105/joss.01686 <https://doi.org/10.21105/joss.01686>.
## R. version 0.5.0. Buffalo, New York. http://github.com/trinker/pacman
## J, reikoch, Beasley W, O'Connor B, Warnes GR, Quinn M, Kamvar ZN, Gao C (2024). yaml: Methods to Convert R Data to YAML and Back_. R package version 2.3.10, <https://CRAN.R-project.org/package=yaml>. ATTENTION: This citation information has been auto-generated from the package DESCRIPTION file and may need manual editing, see 'help("citation")'.