First, we load the library required for this analysis, SQMtools.
Since we ran this analysis in sequential mode (one sample at a time), we have a separate set of output files for each sample run. We now need to combine these samples into a single object using the ‘combineSQMlite’ command.
We’ll first combine the first 4 sample datasets (which represent a single biological sample).
These are the taxa identified for the ATS Deep Seep samples.
plotTaxonomy(meta, rank='superkingdom', count='percent', ignore_unmapped = TRUE, metadata_groups = list("Deep-Seep" = c('Sample1', 'Sample2', 'Sample3', 'Sample4', 'Sample5', 'Sample6', 'Sample7', 'Sample8', 'Sample9', 'Sample10', 'Sample11', 'Sample12'), "Avery Island" = c('Sample13', 'Sample14', 'Sample15', 'Sample16', 'Sample17', 'Sample18', 'Sample19', 'Sample20', 'Sample21', 'Sample22', 'Sample23', 'Sample24'), "Spore Enriched" = c('Sample25', 'Sample26', 'Sample27', 'Sample28', 'Sample29', 'Sample30', 'Sample31', 'Sample32', 'Sample33', 'Sample34', 'Sample35', 'Sample36')))
## Warning in mostAbundant(data, N = N, items = tax, others = others, rescale =
## rescale): N=15 but only 7 items exist. Returning 7 items
## Warning in mostAbundant(data, N = N + rr, items = tax, others = others, : N=16
## but only 7 items exist. Returning 7 items
plotTaxonomy(meta, rank='phylum', count='percent', ignore_unmapped = TRUE, metadata_groups = list("Deep-Seep" = c('Sample1', 'Sample2', 'Sample3', 'Sample4', 'Sample5', 'Sample6', 'Sample7', 'Sample8', 'Sample9', 'Sample10', 'Sample11', 'Sample12'), "Avery Island" = c('Sample13', 'Sample14', 'Sample15', 'Sample16', 'Sample17', 'Sample18', 'Sample19', 'Sample20', 'Sample21', 'Sample22', 'Sample23', 'Sample24'), "Spore Enriched" = c('Sample25', 'Sample26', 'Sample27', 'Sample28', 'Sample29', 'Sample30', 'Sample31', 'Sample32', 'Sample33', 'Sample34', 'Sample35', 'Sample36')))
plotTaxonomy(meta, rank='class', count='percent', ignore_unmapped = TRUE, metadata_groups = list("Deep-Seep" = c('Sample1', 'Sample2', 'Sample3', 'Sample4', 'Sample5', 'Sample6', 'Sample7', 'Sample8', 'Sample9', 'Sample10', 'Sample11', 'Sample12'), "Avery Island" = c('Sample13', 'Sample14', 'Sample15', 'Sample16', 'Sample17', 'Sample18', 'Sample19', 'Sample20', 'Sample21', 'Sample22', 'Sample23', 'Sample24'), "Spore Enriched" = c('Sample25', 'Sample26', 'Sample27', 'Sample28', 'Sample29', 'Sample30', 'Sample31', 'Sample32', 'Sample33', 'Sample34', 'Sample35', 'Sample36')))
plotTaxonomy(meta, rank='order', count='percent', ignore_unmapped = TRUE, metadata_groups = list("Deep-Seep" = c('Sample1', 'Sample2', 'Sample3', 'Sample4', 'Sample5', 'Sample6', 'Sample7', 'Sample8', 'Sample9', 'Sample10', 'Sample11', 'Sample12'), "Avery Island" = c('Sample13', 'Sample14', 'Sample15', 'Sample16', 'Sample17', 'Sample18', 'Sample19', 'Sample20', 'Sample21', 'Sample22', 'Sample23', 'Sample24'), "Spore Enriched" = c('Sample25', 'Sample26', 'Sample27', 'Sample28', 'Sample29', 'Sample30', 'Sample31', 'Sample32', 'Sample33', 'Sample34', 'Sample35', 'Sample36')))
plotTaxonomy(meta, rank='family', count='percent', ignore_unmapped = TRUE, metadata_groups = list("Deep-Seep" = c('Sample1', 'Sample2', 'Sample3', 'Sample4', 'Sample5', 'Sample6', 'Sample7', 'Sample8', 'Sample9', 'Sample10', 'Sample11', 'Sample12'), "Avery Island" = c('Sample13', 'Sample14', 'Sample15', 'Sample16', 'Sample17', 'Sample18', 'Sample19', 'Sample20', 'Sample21', 'Sample22', 'Sample23', 'Sample24'), "Spore Enriched" = c('Sample25', 'Sample26', 'Sample27', 'Sample28', 'Sample29', 'Sample30', 'Sample31', 'Sample32', 'Sample33', 'Sample34', 'Sample35', 'Sample36')))
plotTaxonomy(meta, rank='species', count='percent', ignore_unmapped = TRUE, metadata_groups = list("Deep-Seep" = c('Sample1', 'Sample2', 'Sample3', 'Sample4', 'Sample5', 'Sample6', 'Sample7', 'Sample8', 'Sample9', 'Sample10', 'Sample11', 'Sample12'), "Avery Island" = c('Sample13', 'Sample14', 'Sample15', 'Sample16', 'Sample17', 'Sample18', 'Sample19', 'Sample20', 'Sample21', 'Sample22', 'Sample23', 'Sample24'), "Spore Enriched" = c('Sample25', 'Sample26', 'Sample27', 'Sample28', 'Sample29', 'Sample30', 'Sample31', 'Sample32', 'Sample33', 'Sample34', 'Sample35', 'Sample36')))
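The same three sample groups are typed out in full in every plotTaxonomy call above. As a minimal sketch, the list could be built once and reused. The name `sample_groups` is hypothetical (it is not part of SQMtools), and this assumes the samples really are named Sample1 to Sample36 as in the calls above.

```r
# Build each group's sample names once instead of typing all 36 by hand.
# `sample_groups` is a hypothetical helper name, not an SQMtools object.
sample_groups <- list(
  "Deep-Seep"      = paste0("Sample", 1:12),
  "Avery Island"   = paste0("Sample", 13:24),
  "Spore Enriched" = paste0("Sample", 25:36)
)

# The list could then be passed to each call, for example:
# plotTaxonomy(meta, rank = 'phylum', count = 'percent',
#              ignore_unmapped = TRUE, metadata_groups = sample_groups)
```

Defining the groups in one place means a renamed or re-grouped sample only has to be changed once.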
These are the pathways identified.
To answer our previous discussion on why they are using TPM when we are using DNA, the authors posted this:
No, as TPM is not actually the number of counts from that gene that you observe per million counts.
See this excerpt from our recently published paper (https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-020-03703-2)
The TPM (transcripts per million) metric was introduced by Wagner et al. [14] as an improved way to account for gene length and sequencing depth in transcriptomic experiments: we find it equally useful in metagenomics. The TPM of a feature (be it a transcript, a gene or a functional category) is the number of times that we would find that feature when randomly sampling 1 million features, given the abundances of the different features in our sample. […]. For the sake of being consistent with previous works, we maintain the nomenclature “TPM”, even when use it to measure the abundances of features other than transcripts.
So strickly speaking we could use the more generic FPM - Features Per Million, but we find it less confusing to stick to the existing nomenclature.
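To make the definition in the quote concrete, here is a small worked example of the usual TPM calculation: divide each feature's count by its length, then rescale so the values sum to one million. The counts and lengths are made up for illustration, and this is only a sketch of the standard formula, not necessarily how SqueezeMeta computes it internally.

```r
# Toy example: three features with raw read counts and lengths in bp.
# All numbers here are invented for illustration.
counts  <- c(geneA = 100, geneB = 200, geneC = 300)
lengths <- c(geneA = 1000, geneB = 2000, geneC = 1000)

# Length-normalise first, then rescale so the values sum to one million.
rate <- counts / lengths
tpm  <- rate / sum(rate) * 1e6

print(round(tpm))
```

Here geneB has twice the reads of geneA but is also twice as long, so both end up with the same TPM. That is the sense in which TPM describes feature abundance rather than raw counts per million.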
plotFunctions(meta, fun_level = 'KEGG', count = 'copy_number', N = 8, base_size = 20)
plotFunctions(meta, fun_level = 'COG', count = 'copy_number', N = 8, base_size = 28)
plotFunctions(meta, fun_level = 'PFAM', count = 'copy_number', N = 8, base_size = 20)
transposon = subsetFun(meta, fun = 'transposon', rescale_copy_number = F)
plotTaxonomy(transposon, rank = 'genus', count = 'percent', base_size = 20, metadata_groups = list("Deep-Seep" = c('Sample1', 'Sample2', 'Sample3', 'Sample4', 'Sample5', 'Sample6', 'Sample7', 'Sample8', 'Sample9', 'Sample10', 'Sample11', 'Sample12'), "Avery Island" = c('Sample13', 'Sample14', 'Sample15', 'Sample16', 'Sample17', 'Sample18', 'Sample19', 'Sample20', 'Sample21', 'Sample22', 'Sample23', 'Sample24'), "Spore Enriched" = c('Sample25', 'Sample26', 'Sample27', 'Sample28', 'Sample29', 'Sample30', 'Sample31', 'Sample32', 'Sample33', 'Sample34', 'Sample35', 'Sample36')))
ABC = subsetFun(meta, fun = 'ABC', rescale_copy_number = F)
plotTaxonomy(ABC, rank = 'genus', count = 'percent', base_size = 10, metadata_groups = list("Deep-Seep" = c('Sample1', 'Sample2', 'Sample3', 'Sample4', 'Sample5', 'Sample6', 'Sample7', 'Sample8', 'Sample9', 'Sample10', 'Sample11', 'Sample12'), "Avery Island" = c('Sample13', 'Sample14', 'Sample15', 'Sample16', 'Sample17', 'Sample18', 'Sample19', 'Sample20', 'Sample21', 'Sample22', 'Sample23', 'Sample24'), "Spore Enriched" = c('Sample25', 'Sample26', 'Sample27', 'Sample28', 'Sample29', 'Sample30', 'Sample31', 'Sample32', 'Sample33', 'Sample34', 'Sample35', 'Sample36')))
This is a list of all the MAGs that were identified via the SqueezeMeta pipeline. I’ve only included MAGs with > 85% completeness (which seems to be the standard in the published literature).
library(kableExtra)
MAGS <- read.csv("MAGS.csv")
kbl(MAGS) %>%
kable_classic(full_width = F, html_font = "Cambria")
| Contig.File | Taxonomy.Kingdom | Phylum | Class | Order | Family | Genus | Species | Length | GC.perc | Num.contigs | Disparity | Completeness |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| concoct.2.fa.contigs | Bacteria | Rhodothermota | Rhodothermia | Rhodothermales | Rubricoccaceae | Rubrivirga |  | 6788719 | 63.70 | 185 | 0.000 | 100.00 |
| concoct.11.fa.contigs | Bacteria | Pseudomonadota | Betaproteobacteria | Burkholderiales | Burkholderiaceae | Burkholderia |  | 6727187 | 67.05 | 221 | 0.000 | 99.83 |
| concoct.120.fa_sub.contigs | Bacteria | Bacillota | Bacilli | Bacillales | Bacillaceae | Bacillus |  | 5225094 | 44.89 | 190 | 0.000 | 99.41 |
| concoct.128.fa_sub.contigs | Archaea | Euryarchaeota | Methanonatronarchaeia | Methanonatronarchaeales |  |  |  | 1850888 | 61.60 | 115 | 0.524 | 99.20 |
| concoct.60.fa.contigs | Bacteria | Chloroflexota |  |  |  |  |  | 3136764 | 69.24 | 172 | 0.012 | 99.14 |
| concoct.156.fa.contigs | Bacteria |  |  |  |  |  |  | 6095001 | 43.54 | 191 | 0.011 | 98.28 |
| concoct.3.fa_sub.contigs | Bacteria | Balneolota | Balneolia | Balneolales | Balneolaceae |  |  | 4274000 | 49.22 | 221 | 0.054 | 98.28 |
| concoct.74.fa.contigs | Bacteria | Gemmatimonadota |  |  |  |  |  | 5085934 | 71.21 | 92 | 0.000 | 98.28 |
| concoct.95.fa.contigs | Bacteria | Planctomycetota |  |  |  |  |  | 4888782 | 65.33 | 228 | 0.000 | 98.28 |
| concoct.124.fa.contigs | Bacteria | Planctomycetota | Phycisphaerae | Phycisphaerales | Phycisphaeraceae | Phycisphaera |  | 4196988 | 59.25 | 40 | 0.059 | 98.28 |
| concoct.26.fa_sub.contigs | Bacteria | Bacillota | Clostridia |  |  |  |  | 4160121 | 35.96 | 116 | 0.147 | 98.18 |
| concoct.142.fa_sub.contigs | Bacteria | Bacillota | Clostridia |  |  |  |  | 4821405 | 40.74 | 142 | 0.093 | 98.18 |
| concoct.43.fa.contigs | Bacteria | Chloroflexota | Anaerolineae | Anaerolineales | Anaerolineaceae |  |  | 4116655 | 53.37 | 96 | 0.000 | 98.12 |
| concoct.135.fa.contigs | Bacteria | Chloroflexota | Anaerolineae | Anaerolineales | Anaerolineaceae |  |  | 5666743 | 60.19 | 339 | 0.039 | 97.41 |
| concoct.136.fa.contigs | Bacteria | Planctomycetota |  |  |  |  |  | 5953379 | 62.85 | 363 | 0.014 | 96.55 |
| concoct.87.fa.contigs | Bacteria | Candidatus Hydrogenedentes |  |  |  |  |  | 4397254 | 63.18 | 347 | 0.013 | 96.55 |
| concoct.72.fa_sub.contigs | Bacteria | Acidobacteriota |  |  |  |  |  | 6992966 | 63.02 | 398 | 0.224 | 96.55 |
| metabat2.98.fa.contigs | Bacteria | Thermodesulfobacteriota | Desulfobacteria | Desulfobacterales | Desulfobacteraceae |  |  | 7292894 | 40.84 | 776 | 0.011 | 96.55 |
| concoct.146.fa_sub.contigs | Bacteria | Gemmatimonadota |  |  |  |  |  | 4266924 | 71.98 | 101 | 0.110 | 96.55 |
| concoct.42.fa_sub.contigs | Bacteria |  |  |  |  |  |  | 4818767 | 68.56 | 103 | 0.000 | 95.69 |
| concoct.127.fa_sub.contigs | Bacteria |  |  |  |  |  |  | 5468141 | 63.61 | 288 | 0.000 | 94.83 |
| concoct.150.fa_sub.contigs | Bacteria | Pseudomonadota | Alphaproteobacteria |  |  |  |  | 4958108 | 68.60 | 387 | 0.481 | 94.26 |
| concoct.101.fa_sub.contigs | Bacteria |  |  |  |  |  |  | 3839312 | 73.09 | 357 | 0.000 | 93.97 |
| metabat2.28.fa_sub.contigs | Bacteria | Gemmatimonadota |  |  |  |  |  | 4283236 | 72.45 | 247 | 0.024 | 93.10 |
| metabat2.39.fa_sub.contigs | Bacteria | Bacillota | Tissierellia | Tissierellales | Thermohalobacteraceae | Thermohalobacter | berrensis | 4466897 | 29.80 | 879 | 0.425 | 93.03 |
| metabat2.1.fa.contigs | Bacteria | Bacillota | Clostridia |  |  |  |  | 2422754 | 38.33 | 208 | 0.000 | 92.64 |
| metabat2.45.fa.contigs | Bacteria | Gemmatimonadota |  |  |  |  |  | 3981811 | 73.03 | 512 | 0.033 | 92.24 |
| concoct.126.fa.contigs | Bacteria | Chlamydiota | Chlamydiia | Parachlamydiales | Parachlamydiaceae |  |  | 1784283 | 39.41 | 554 | 0.077 | 90.20 |
| metabat2.34.fa.contigs | Archaea | Nitrososphaerota | Nitrososphaeria | Nitrososphaerales | Nitrososphaeraceae | Nitrososphaera |  | 2089720 | 40.85 | 229 | 0.356 | 89.95 |
| concoct.70.fa.contigs | Bacteria | Chloroflexota |  |  |  |  |  | 5773847 | 56.30 | 603 | 0.017 | 89.66 |
| metabat2.46.fa.contigs | Bacteria | Bacillota | Bacilli | Bacillales | Paenibacillaceae | Paenibacillus |  | 4410215 | 51.60 | 460 | 0.013 | 88.79 |
| concoct.63.fa_sub.contigs | Bacteria | Nitrospirota | Nitrospiria | Nitrospirales | Nitrospiraceae | Nitrospira |  | 4194091 | 59.23 | 700 | 0.047 | 87.93 |
| concoct.112.fa_sub.contigs | Bacteria | Candidatus Thermoplasmata |  |  |  |  |  | 4101401 | 38.53 | 421 | 0.107 | 87.93 |
| metabat2.32.fa.contigs | Bacteria | Bacillota | Bacilli | Bacillales | Bacillaceae |  |  | 2883534 | 33.86 | 195 | 0.039 | 87.90 |
| concoct.28.fa_sub.contigs | Bacteria |  |  |  |  |  |  | 1463234 | 36.95 | 28 | 0.000 | 87.77 |
| metabat2.60.fa.contigs | Bacteria | Candidatus Sumerlaeota |  |  |  |  |  | 2860697 | 50.90 | 500 | 0.018 | 86.60 |
| metabat2.77.fa.contigs | Bacteria | Atribacterota |  |  |  |  |  | 1457719 | 34.68 | 166 | 0.000 | 85.86 |
| metabat2.3.fa.contigs | Bacteria |  |  |  |  |  |  | 1954332 | 63.14 | 371 | 0.038 | 85.19 |
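The completeness cut-off and the genus-level selection could be applied in R roughly as follows. This is a sketch on a small stand-in data frame: three rows are copied from the table above, the fourth (`example.low.contigs`) is a made-up low-completeness bin, and it assumes empty taxonomy cells are read in as empty strings or NA.

```r
# Stand-in for the MAG table: three real rows from the table above plus one
# invented low-completeness bin (hypothetical) so the filter has work to do.
mags <- data.frame(
  Contig.File  = c("concoct.2.fa.contigs", "concoct.60.fa.contigs",
                   "metabat2.46.fa.contigs", "example.low.contigs"),
  Genus        = c("Rubrivirga", "", "Paenibacillus", ""),
  Completeness = c(100.00, 99.14, 88.79, 42.00),
  stringsAsFactors = FALSE
)

# Keep bins above the 85% completeness cut-off, most complete first.
kept <- mags[mags$Completeness > 85, ]
kept <- kept[order(-kept$Completeness), ]

# MAGs classified down to genus have a non-empty Genus field.
genus_level <- kept[!is.na(kept$Genus) & kept$Genus != "", ]
print(genus_level$Genus)
```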
For the 8 MAGs that were classified confidently down to the genus level, I’ve included some AI-generated summaries of each genus. I thought these would at least be a helpful starting point for the discussion.
The genus Rubrivirga, part of the family Rhodothermaceae, is characterized by red-pigmented bacteria that have been isolated from deep-sea water [1][2]. Two species within this genus, Rubrivirga marina and Rubrivirga profundi, are particularly noteworthy [1][2]. These bacteria are Gram-staining-negative, rod-shaped, facultatively anaerobic, non-motile, and exhibit a pale-red pigmentation [2].
Rubrivirga marina was first described in a 2013 study, where it was isolated from deep seawater [1]. Similarly, Rubrivirga profundi was isolated from deep-sea water, further highlighting the genus’s adaptation to deep-sea environments [2].
In addition to their unique environmental niche, members of the Rubrivirga genus have shown potential enzymatic properties. A study characterized the degradation patterns and enzymatic properties of a novel alkali-resistant alginate lyase, AlyRm1, from Rubrivirga marina [3]. This discovery suggests potential applications in biotechnology, particularly in the breakdown of alginate, a complex polysaccharide found in the cell walls of brown algae.
Overall, the Rubrivirga genus represents a fascinating area of study within microbiology, offering insights into deep-sea microbial diversity and potential biotechnological applications.
[1] Rubrivirga marina gen. nov., sp. nov., a member of the family Rhodothermaceae isolated from deep seawater
[2] Rubrivirga profundi sp. nov., isolated from deep-sea water
[3] Characterization of degradation patterns and enzymatic properties of a novel alkali-resistant alginate lyase AlyRm1 from Rubrivirga marina
The genus Burkholderia is a group of over 80 different Gram-negative species [1]. These bacteria are known for their diverse roles, acting as both beneficial and pathogenic strains [2]. Some species within this genus have been studied for their potential immune evasion mechanisms [1], while others are being explored for their potential in vaccine development [3].
Burkholderia is a genus of bacteria known for its versatility and adaptability in various environmental conditions. Sporulation, a process where bacteria form endospores, is a complex developmental process that some bacteria undergo under stressful conditions. However, the understanding of sporulation in Burkholderia is still an active area of research.
Studies have identified genes in Burkholderia pseudomallei, a species within the Burkholderia genus, that are regulated by σE in response to oxidative stress [1]. The σE, also known as AlgU, is part of an operon that has been found to regulate heat stress response in Burkholderia pseudomallei [2]. This operon is also suggested to have a role in sporulation-specific sigma factors [3].
The genus Bacillus is a diverse group of bacteria known for their ability to produce endospores, allowing them to survive in harsh environments. They are Gram-positive, rod-shaped bacteria that can be found in various environments including soil, water, and the human gut [1].
Some species of Bacillus, such as Bacillus cereus, are known to be pathogenic. Emerging strains of B. cereus have been associated with anthrax-like diseases [2]. On the other hand, many Bacillus species are known for their ability to produce biologically active substances, including antibiotics and bacteriocins [3].
Bacillus is a genus of bacteria known for its ability to form endospores, a process known as sporulation. This process is an adaptive response to nutritional stress and involves the differential development of two cells [1]. The decision to initiate sporulation, DNA translocation, and cell-cell communication are key aspects of this process [2]. The environment in which sporulation occurs can influence the properties of the spores, and controlling the fate of Bacillus spores is pivotal to controlling bacterial populations [3].
[1] Bacillus subtilis sporulation: regulation of gene expression
[2] Recent progress in Bacillus subtilis sporulation
[3] Sporulation environment influences spore properties in Bacillus
The genus Phycisphaera is a group of Gram-negative, motile bacteria that were first isolated from seaweed [1]. This genus belongs to the class Phycisphaerae in the phylum Planctomycetes [2]. One notable species within this genus is Phycisphaera mikurensis, which was discovered as a new marine isolate [3].
[1] Phycisphaera - Fukunaga - Major Reference Works
[2] Cultivation-Independent Analysis of the Bacterial …
[3] Phycisphaera mikurensis gen. nov., sp. nov., isolated from …
The genus Thermohalobacter is part of the Thermohalobacteraceae family [1]. Organisms in this genus are thermophilic anaerobes, which means they thrive in high-temperature environments and do not require oxygen for growth [2]. Unfortunately, there is limited literature available specifically on Thermohalobacter, and further research would be needed for a comprehensive review.
[1] Thermohalobacteraceae fam. nov.
[2] Diversity of thermophilic anaerobes
The genus Nitrososphaera comprises aerobic ammonia-oxidizing archaea. Nitrososphaera viennensis, a species within this genus, was isolated from garden soil in Vienna and is known to be mesophilic, neutrophilic, and aerobic [1]. This species can grow on ammonia or urea as an energy source [2]. Another species, Candidatus Nitrososphaera gargensis, has had its complete genome sequenced, providing insights into its evolutionary lineage [3].
The genus Paenibacillus is a group of bacteria that can be isolated from a wide range of sources and is relevant to humans, animals, and plants [1]. Some species within this genus are endospore-forming, Gram-stain-positive or variable, motile, rod-shaped, and can be aerobic or facultatively anaerobic [2]. One notable species is Paenibacillus paeoniae, an endophytic bacterium that was originally isolated from a surface-sterilized peony root [3].
[1] Current knowledge and perspectives of Paenibacillus
[2] Paenibacillus foliorum sp. nov., Paenibacillus …
[3] Paenibacillus paeoniae sp. nov., a novel endophytic …
The genus Nitrospira is known for its pivotal role in the nitrification process as an aerobic chemolithoautotrophic nitrite-oxidizing bacterium [1]. This genus is ubiquitous and equipped with molecular machineries for both ammonia and nitrite oxidation [2]. Numerous investigations have recognized representatives of the genus Nitrospira as key and predominant nitrite-oxidizing bacteria in biological nutrient removal [3].
[1] Nitrospira
[2] Nitrospira as versatile nitrifiers: Taxonomy, ecophysiology …
[3] The occurrence and role of Nitrospira in nitrogen removal …