https://cran.r-project.org/web/packages/corncob/vignettes/corncob-intro-no-phyloseq.html
As you go through the tutorial, copy and paste code below to perform the tasks. Additionally, I have a few questions you need to answer.
remotes::install_github("statdivlab/corncob")
## Downloading GitHub repo statdivlab/corncob@HEAD
## glue (1.7.0 -> 1.8.0 ) [CRAN]
## ps (1.7.7 -> 1.8.0 ) [CRAN]
## slam (0.1-52 -> 0.1-54) [CRAN]
## VGAM (1.1-11 -> 1.1-12) [CRAN]
## Installing 4 packages: glue, ps, slam, VGAM
## Installing packages into 'C:/Users/JourdanHourican/AppData/Local/R/win-library/4.4'
## (as 'lib' is unspecified)
## Warning in download.file(url, destfile, method, mode = "wb", ...): downloaded
## length 0 != reported length 18143
## Warning in download.file(url, destfile, method, mode = "wb", ...): cannot open
## URL 'https://cloud.r-project.org/bin/windows/contrib/4.4/glue_1.8.0.zip': HTTP
## status was '403 Forbidden'
## Error in download.file(url, destfile, method, mode = "wb", ...) :
## cannot open URL 'https://cloud.r-project.org/bin/windows/contrib/4.4/glue_1.8.0.zip'
## Warning in download.packages(pkgs, destdir = tmpd, available = available, :
## download of package 'glue' failed
## Warning in download.file(url, destfile, method, mode = "wb", ...): downloaded
## length 0 != reported length 18135
## Warning in download.file(url, destfile, method, mode = "wb", ...): cannot open
## URL 'https://cloud.r-project.org/bin/windows/contrib/4.4/ps_1.8.0.zip': HTTP
## status was '403 Forbidden'
## Error in download.file(url, destfile, method, mode = "wb", ...) :
## cannot open URL 'https://cloud.r-project.org/bin/windows/contrib/4.4/ps_1.8.0.zip'
## Warning in download.packages(pkgs, destdir = tmpd, available = available, :
## download of package 'ps' failed
## Warning in download.file(url, destfile, method, mode = "wb", ...): downloaded
## length 0 != reported length 18147
## Warning in download.file(url, destfile, method, mode = "wb", ...): cannot open
## URL 'https://cloud.r-project.org/bin/windows/contrib/4.4/slam_0.1-54.zip': HTTP
## status was '403 Forbidden'
## Error in download.file(url, destfile, method, mode = "wb", ...) :
## cannot open URL 'https://cloud.r-project.org/bin/windows/contrib/4.4/slam_0.1-54.zip'
## Warning in download.packages(pkgs, destdir = tmpd, available = available, :
## download of package 'slam' failed
## Warning in download.file(url, destfile, method, mode = "wb", ...): downloaded
## length 0 != reported length 18147
## Warning in download.file(url, destfile, method, mode = "wb", ...): cannot open
## URL 'https://cloud.r-project.org/bin/windows/contrib/4.4/VGAM_1.1-12.zip': HTTP
## status was '403 Forbidden'
## Error in download.file(url, destfile, method, mode = "wb", ...) :
## cannot open URL 'https://cloud.r-project.org/bin/windows/contrib/4.4/VGAM_1.1-12.zip'
## Warning in download.packages(pkgs, destdir = tmpd, available = available, :
## download of package 'VGAM' failed
## ── R CMD build ─────────────────────────────────────────────────────────────────
## checking for file 'C:\Users\JourdanHourican\AppData\Local\Temp\RtmpCuAdvo\remotes22541bad355\statdivlab-corncob-a0309c5/DESCRIPTION' ... checking for file 'C:\Users\JourdanHourican\AppData\Local\Temp\RtmpCuAdvo\remotes22541bad355\statdivlab-corncob-a0309c5/DESCRIPTION' ... ✔ checking for file 'C:\Users\JourdanHourican\AppData\Local\Temp\RtmpCuAdvo\remotes22541bad355\statdivlab-corncob-a0309c5/DESCRIPTION' (968ms)
## ─ preparing 'corncob': (1.2s)
## checking DESCRIPTION meta-information ... checking DESCRIPTION meta-information ... ✔ checking DESCRIPTION meta-information
## ─ checking for LF line-endings in source and make files and shell scripts (1.1s)
## ─ checking for empty or unneeded directories
## ─ building 'corncob_0.4.1.tar.gz'
##
##
## Installing package into 'C:/Users/JourdanHourican/AppData/Local/R/win-library/4.4'
## (as 'lib' is unspecified)
library(corncob)
library(magrittr)
data(soil_phylo_sample)
data(soil_phylo_otu)
data(soil_phylo_taxa)
head(soil_phylo_sample)
## Plants DayAmdmt Amdmt ID Day
## S009 1 01 1 D 0
## S204 1 21 1 D 2
## S112 0 11 1 B 1
## S247 0 22 2 F 2
## S026 0 00 0 A 0
## S023 1 00 0 C 0
soil_phylo_otu[1:5, 1:5]
## S009 S204 S112 S247 S026
## OTU.43 350 74 300 70 43
## OTU.2 1796 4204 1752 695 945
## OTU.187 280 709 426 100 139
## OTU.150 33 151 18 13 28
## OTU.91 0 0 184 0 0
soil_phylo_taxa[1:3, ]
## Kingdom Phylum Class Order
## OTU.43 "Bacteria" "Nitrospirae" "Nitrospira" "Nitrospirales"
## OTU.2 "Bacteria" "Proteobacteria" "Alphaproteobacteria" "Rhizobiales"
## OTU.187 "Bacteria" "Acidobacteria" "Acidobacteriia" "Acidobacteriales"
## Family Genus Species
## OTU.43 "Nitrospiraceae" "Nitrospira" ""
## OTU.2 "Bradyrhizobiaceae" "Bradyrhizobium" ""
## OTU.187 "Koribacteraceae" "" ""
data(soil_phylum_small_sample)
sample_data <- soil_phylum_small_sample
data(soil_phylum_small_otu)
data <- soil_phylum_small_otu
pro_data <- cbind(sample_data,
W = unlist(data["Proteobacteria", ]),
M = colSums(data))
corncob <- bbdml(formula = cbind(W, M - W) ~ 1,
phi.formula = ~ 1,
data = pro_data)
plot(corncob, B = 50)
plot(corncob, total = TRUE, B = 50)
plot(corncob, total = TRUE, color = "DayAmdmt", B = 50)
plot(corncob, color = "DayAmdmt", B = 50)
corncob_da <- bbdml(formula = cbind(W, M - W) ~ DayAmdmt,
phi.formula = ~ DayAmdmt,
data = pro_data)
plot(corncob_da, color = "DayAmdmt", total = TRUE, B = 50)
plot(corncob_da, color = "DayAmdmt", B = 50)
lrtest(mod_null = corncob, mod = corncob_da)
## [1] 4.550571e-05
summary(corncob_da)
##
## Call:
## bbdml(formula = cbind(W, M - W) ~ DayAmdmt, phi.formula = ~DayAmdmt,
## data = pro_data)
##
##
## Coefficients associated with abundance:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.44595 0.03604 -12.375 7.18e-13 ***
## DayAmdmt21 -0.16791 0.04067 -4.129 0.000297 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Coefficients associated with dispersion:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.3077 0.3537 -15.008 6.44e-15 ***
## DayAmdmt21 -1.3518 0.5029 -2.688 0.012 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
##
## Log-likelihood: -286.53
set.seed(1)
da_analysis <- differentialTest(formula = ~ DayAmdmt,
phi.formula = ~ DayAmdmt,
formula_null = ~ 1,
phi.formula_null = ~ DayAmdmt,
test = "Wald", boot = FALSE,
data = data,
sample_data = sample_data,
taxa_are_rows = TRUE,
fdr_cutoff = 0.05)
da_analysis
## Object of class differentialTest
##
## $p: p-values
## $p_fdr: FDR-adjusted p-values
## $significant_taxa: taxa names of the statistically significant taxa
## $significant_models: model summaries of the statistically significant taxa
## $all_models: all model summaries
## $restrictions_DA: covariates tested for differential abundance
## $restrictions_DV: covariates tested for differential variability
## $discriminant_taxa_DA: taxa for which at least one covariate associated with the abundance was perfectly discriminant
## $discriminant_taxa_DV: taxa for which at least one covariate associated with the dispersion was perfectly discriminant
##
## plot( ) to see a plot of tested coefficients from significant taxa
da_analysis$significant_taxa
## [1] "Proteobacteria" "Gemmatimonadetes" "Bacteroidetes" "Cyanobacteria"
## [5] "Firmicutes" "Planctomycetes" "Armatimonadetes" "Spirochaetes"
## [9] "Elusimicrobia" "BRC1" "OP3" "FBP"
## [13] "Chlorobi" "TM6"
set.seed(1)
dv_analysis <- differentialTest(formula = ~ DayAmdmt,
phi.formula = ~ DayAmdmt,
formula_null = ~ DayAmdmt,
phi.formula_null = ~ 1,
test = "LRT", boot = FALSE,
data = data,
sample_data = sample_data,
taxa_are_rows = TRUE,
fdr_cutoff = 0.05)
dv_analysis$significant_taxa
## [1] "Acidobacteria" "Cyanobacteria" "Spirochaetes" "Elusimicrobia"
## [5] "FBP"
da_analysis$p[1:5]
## Acidobacteria Proteobacteria Gemmatimonadetes Actinobacteria
## 6.509417e-01 3.642734e-05 3.270448e-13 3.703096e-01
## [Thermi]
## 1.031419e-01
da_analysis$p_fdr[1:5]
## Acidobacteria Proteobacteria Gemmatimonadetes Actinobacteria
## 7.811301e-01 1.457094e-04 3.924537e-12 5.172744e-01
## [Thermi]
## 2.062838e-01
plot(da_analysis)
which(is.na(da_analysis$p)) %>% names
## [1] "GN04" "GN02" "MVP-21"
data["GN04", ]
## S204 S112 S134 S207 S202 S139 S122 S212 S117 S104 S214 S109 S217 S229 S132
## GN04 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## S209 S227 S107 S237 S224 S127 S137 S114 S124 S119 S219 S232 S129 S102 S234
## GN04 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
## S222 S239
## GN04 0 0
ex1 <- differentialTest(formula = ~ Day,
phi.formula = ~ 1,
formula_null = ~ 1,
phi.formula_null = ~ 1,
data = data,
taxa_are_rows = TRUE,
sample_data = sample_data,
test = "Wald", boot = FALSE,
fdr_cutoff = 0.05)
plot(ex1)
ex2 <- differentialTest(formula = ~ Day,
phi.formula = ~ Day,
formula_null = ~ 1,
phi.formula_null = ~ Day,
data = data,
taxa_are_rows = TRUE,
sample_data = sample_data,
test = "Wald", boot = FALSE,
fdr_cutoff = 0.05)
plot(ex2)
ex3 <- differentialTest(formula = ~ Day,
phi.formula = ~ Day,
formula_null = ~ 1,
phi.formula_null = ~ 1,
data = data,
taxa_are_rows = TRUE,
sample_data = sample_data,
test = "Wald", boot = FALSE,
fdr_cutoff = 0.05)
plot(ex3)
QUESTIONS: 1. How do you interpret the first relative abundance plot? (Black and White one)
This relative abundance plot shows relative abundance by sample with 95% prediction intervals. This includes 50 bootstrapping samples. The data points do not show a clear trend and relative abundance is quite similar between all of the samples.
We add covariates to increase precision by explaining some variability between the data. Covariates also help account for confounding variables. As a result, data points shift, error bars shorten (due to increased precision) and we see better fit to the data on the relative abundance plot.
These plots are useful when testing for differential abundance. In this case it is helpful when you want to compare original data with data controlling for different variables. These plots are interpreted by looking at the relative abundance (x-axis) for each individual Taxa (y-axis) and comparing it to the other Taxa.In this case the condition of the right tells us how significant differential abundance is being determined, and that if a taxon’s abundance data point or error bars cross zero it is considered not significantly different.