Logic
Background: Mangrove forests, pantropical coastal wetland ecosystems, provide important ecosystem services including fisheries nursery habitat provision, coastline protection, nutrient cycling, and other local benefits. Of global import is their ability to store carbon in organic peat belowground for hundreds to thousands of years, with more carbon per unit area than any other forest type. Mangrove carbon storage relies in part on the high primary productivity of these systems, but essential to the large and long-lived nature of this storage is the slow microbial decomposition of buried mangrove peat. In this study, we explore the relationship between carbon content and microbial community composition with sediment age and plot the slow course of buried mangrove detritus over time. At four mangrove sites with deep (< 1 m) deposits of peat in the area of La Paz, B.C.S., Mexico, we cored the sediments until rejection with a Russian peat corer, and from these cores obtained 5 cm samples at 20 cm intervals. In these samples we measured total carbon, organic carbon, nitrogen, and 14C age. We observed high percentage carbon by mass (14 ± 6%) and high C:N ratios (29 ± 7) in peat samples. Radiocarbon dates allowed us to reconstruct the accumulation and slow decomposition of organic matter over the last 5,000 years. Limitations on microbial decomposition by microbes, likely help explain this slow decomposition. [Summary of microbial results.] These results shed light on the microbial environment in which these peat deposits are preserved for long periods of time. Future work should examine the roles of distinct microbial taxa in the slow turnover of mangrove carbon. Mangrove forests, long considered detritus-based ecosystems, can only be understood when these belowground carbon cycling processes are captured. Furthermore, data on patterns of belowground carbon in these threatened systems can motivate their conservation, given the value of their ecosystem service of carbon storage, estimated to be worth on the order of 1 billion US$ in the Gulf of California’s mangroves alone.
Q2. Will peat age be correlated with a shift in the microbial community toward taxa that can consume this refractory remaining carbon and away from the community composition near the sediment surface with greater affinity to marine and terrestrial soil communities.
Outputs:
1. Determine the number of new OTUs identified at each depth per sampling event
2. NMDS of unweighted unifrac or jaccard dissimilarity. PERMANOVA with PERMDISP.
getwd()
## [1] "/Users/maltz/Documents/GitHub/MangroveSeqs2018"
Prepare data
Split taxonomy column into multiple variables
# bacteria otu
B16S<-read.csv("data/filtered_table_w_metadata.csv", header = TRUE, stringsAsFactors = FALSE)
# split taxa groupings into new columns
source('dust_functions.R')
B16S<-split_taxa(B16S)
Read in UniFrac distances
UniFrac is a measure of B-diversity that uses phylogenetic information to compare community samples. Use with multivariate methods like principal coordinates (PCoA), NMDS, etc to explain differences among communities. Measured as the amount of evolutionary history that is unique to each group - the fraction of branch length in a phylogenetic tree associated with ONLY the focal sample in pairwise comparison. With sequence data UniFrac values can be invfluencd by the number of sequences/sample and can use sequence jackknifing (Lozupone et al 2011. ISME).
Use to compare phylogenetic lineages between groups, to cluster samples.
unifrac<-read.table('data/unweighted_unifrac_dm.txt')
unifrac_wt<-read.table('data/weighted_unifrac_dm.txt')
Use heatmap to visualize sample dissimilarity based on UniFrac

Prep data for multivariate anaylsis
otu.taxa<-read.csv('data/OTU_taxa_id.csv') # otu ids
otu.comm<-read.csv('data/OTU_community_clean.csv')
#otu.grp<-otu.comm%>%left_join(map6, by='SampleID')
## the samples in the unifrq dist and community need to be in the same order
# need to order community based on unifraq
colnames(unifrac) # use this order
## [1] "X24" "X25" "X26" "X27" "X20" "X21" "X22" "X23" "X28" "X29" "X1"
## [12] "X3" "X2" "X5" "X4" "X7" "X6" "X9" "X8" "X11" "X10" "X13"
## [23] "X12" "X15" "X14" "X17" "X16" "X19" "X18" "X31" "X30" "X33" "X32"
# make new df based on unifraq
otu.ordered<-data.frame(SampleID = colnames(unifrac))%>%left_join(otu.comm, by='SampleID')
otu.ordered$SampleID
## [1] X24 X25 X26 X27 X20 X21 X22 X23 X28 X29 X1 X3 X2 X5 X4 X7 X6
## [18] X9 X8 X11 X10 X13 X12 X15 X14 X17 X16 X19 X18 X31 X30 X33 X32
## 33 Levels: X1 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X2 X20 X21 ... X9
# this works but is inefficient - there are multiple copies of the otu table which is large
# alternatively - match the order, rewrite over same name
otu.comm<-otu.comm[match(colnames(unifrac), otu.comm$SampleID),]
# this says - for the community df, order the rows to match the order of the column names in unifrac, that match to SampleID
rm(otu.ordered)
#combine with grouping variables - align mapping data with community data
#View(otu.comm)
str(otu.comm)
## 'data.frame': 33 obs. of 28594 variables:
## $ SampleID : Factor w/ 33 levels "X1","X10","X11",..: 17 18 19 20 13 14 15 16 21 22 ...
## $ X100167 : int NA 6 NA NA NA NA NA NA NA 3 ...
## $ X1003755 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1007180 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1011088 : int 6 6 NA 5 NA NA NA 5 NA 6 ...
## $ X1011502 : int NA 4 NA NA NA NA NA NA NA NA ...
## $ X1011514 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1012668 : int NA NA NA NA 3 3 NA NA NA NA ...
## $ X1014492 : int NA 7 NA NA 9 NA NA 5 NA 5 ...
## $ X1016465 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X101654 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1019161 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X101967 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1021233 : int NA NA NA NA 7 NA NA NA NA NA ...
## $ X102272 : int NA NA NA NA 7 NA NA NA NA NA ...
## $ X1024772 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X102556 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1025949 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X102653 : int NA NA NA NA 3 NA NA 3 NA 5 ...
## $ X1027649 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1027802 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X102886 : int NA NA NA NA 18 NA NA NA NA 3 ...
## $ X103408 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1035774 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1038403 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X103868 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1040290 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X104068 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1041464 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1044436 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1045589 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1045667 : int NA NA NA NA 44 17 NA NA NA 5 ...
## $ X104681 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1050284 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1051517 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X105224 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1053775 : int NA NA NA 6 NA NA NA NA NA NA ...
## $ X1054222 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1056339 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X106027 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1062089 : int 15 86 10 15 456 98 7 54 7 165 ...
## $ X1062748 : int NA NA NA NA 5 NA NA 4 NA NA ...
## $ X1067447 : int 5 NA NA NA NA NA NA NA NA NA ...
## $ X106754 : int NA NA 5 NA NA NA NA NA NA NA ...
## $ X10681 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X106877 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X10691 : int NA NA NA NA 5 NA NA NA NA 3 ...
## $ X1070610 : int NA NA NA NA 9 5 NA NA NA 3 ...
## $ X1070749 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1071670 : int NA NA NA NA NA NA NA NA NA 4 ...
## $ X1074625 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1076709 : int NA 6 NA NA 10 NA NA 3 NA 13 ...
## $ X1078881 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1079938 : int NA NA NA NA NA NA NA 5 NA NA ...
## $ X1082059 : int 6 27 NA NA 19 NA 6 26 NA 17 ...
## $ X1082294 : int NA NA NA NA NA NA NA NA 3 NA ...
## $ X108405 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X108426 : int NA NA NA NA 4 NA NA NA NA NA ...
## $ X1084486 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1084931 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1088265 : int NA NA 4 8 NA NA NA NA 5 14 ...
## $ X1088391 : int 8 15 NA 3 10 NA NA 3 NA 5 ...
## $ X109060 : int NA NA NA NA NA NA 3 NA NA NA ...
## $ X10923 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X109236 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X10926 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X109279 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X109288 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X109299 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X109302 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1093382 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X10940 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1095353 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1096010 : int NA NA NA NA NA NA NA NA NA 5 ...
## $ X1098922 : int NA NA NA NA 6 NA NA NA NA NA ...
## $ X1101451 : int NA NA NA NA NA NA NA NA NA 3 ...
## $ X1104662 : int 5 4 3 NA NA NA NA 56 8 16 ...
## $ X1104672 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1104691 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1104845 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1104911 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1105022 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X110514 : int NA NA NA NA 3 NA NA NA NA 12 ...
## $ X1105269 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1105285 : int 8 28 7 5 204 53 3 29 NA 30 ...
## $ X1105509 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1105584 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1105884 : int NA NA NA NA NA NA 10 NA 6 3 ...
## $ X1105927 : int 4 16 NA NA 106 24 NA 8 NA 10 ...
## $ X1106016 : int 3 25 NA 15 271 73 3 20 NA 41 ...
## $ X1106049 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1106130 : int NA NA NA NA 3 NA NA NA NA NA ...
## $ X1106146 : int NA NA NA NA NA NA NA NA NA 3 ...
## $ X1106162 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1106218 : int NA NA 3 NA NA NA NA NA NA NA ...
## $ X1106238 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1106251 : int NA 9 NA NA 21 4 NA 8 NA 9 ...
## $ X1106293 : int NA NA NA NA NA NA NA NA NA NA ...
## $ X1106406 : int NA NA NA NA NA NA NA NA NA 3 ...
## [list output truncated]
otu.grp<-otu.comm%>%left_join(map6, by='SampleID')
## Error in left_join_impl(x, y, by$x, by$y, suffix$x, suffix$y, check_na_matches(na_matches)): Can't join on 'SampleID' x 'SampleID' because of incompatible types (factor / logical)
#instead of making our df larger, can reorder mapping data to community like above
grps<-map6[match(colnames(unifrac), map6$SampleID),]
grps$SampleID
## [1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
## [24] NA NA NA NA NA NA NA NA NA NA
# now that all the data (community, distances, grouping variables) are ordered the same, we can use permanova etc
Permanova
Permanova tests for differences in composition among groups
Reminder - permanova is always based on pairwise distances/dissimilarities.
Month and elevation are significant in all tests. Year is not but that is probably because month is nested within year and that is not indicated in the model.
Include Core in model appropriately Or alternatively, if not interested in Core effect, constrain permutations within Core using ‘strata’
Manipulate code for test of interest
NMDS
The NMDS often pairs well with the PERMANOVA because both are more robust to properties of community data. To gain proper fit may need to try multiple dissimilarity methods or tranformations.
Plot ordination
Manipulate plot code to make figure as desired
Separates well by depth/C age, within each core grouped together for the most part. EMI is more different/distinct, and the most overlap in SJI and SJII
NMDS vs PCoA
NMDS is an unconstrained ordination method to visualize multivariate data in fewer dimensions. Depending on the properties of your data and your questions, different methods of ordination may be appropriate. PCoA (or MDS) is a metric version of NMDS, meaning that PCoA is a Euclidean representation of dissimilarities. So, like NMDS PCoA uses dissimilarities, but it is euclidean rather than rank-order which is used to preserve distances hierarchically. NMDS and distance-based ordinations/tests are often easier to use with community data because 0’s and 1’s, rare species are common and do not configure in euclidean space.
In general, ordination & multivaraite methods are defined by (1) whether or not using dissimilarity, (2) whether or not euclidean, (3) test for differences vs explaining differences …
PCoA, like PCA, returns a set of orthogonal axes with associated eignevalues that measure their importance. PCoA and PCA should produce the same results if the PCA is calculated on a covariance matrix with scaling of 1.
plot the pcoa