First, we load the library required for this analysis, SQMtools.

Because we ran this analysis in sequential mode (one sample at a time), we have a separate set of output files for each sample run. We now need to combine these runs into a single object using the ‘combineSQMlite’ command.

We’ll first combine the first 4 sample datasets (which represent a single biological sample)
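As a rough sketch of the loading and combining step, assuming each sequential run left its tables in its own directory: the directory layout in `table_dirs` is hypothetical and should be replaced with the real paths, and the block only calls SQMtools when both the package and those directories are present.

```r
# Hypothetical layout: one tables directory per sequential run.
# Replace these paths with the real output locations.
table_dirs <- file.path("results", paste0("Sample", 1:4), "tables")

if (requireNamespace("SQMtools", quietly = TRUE) && all(dir.exists(table_dirs))) {
  library(SQMtools)
  # Load each run as an SQMlite object, then merge them into a single object.
  runs <- lapply(table_dirs, loadSQMlite)
  meta <- do.call(combineSQMlite, runs)
} else {
  message("SQMtools or the table directories are missing; nothing was combined.")
}
```

The combined object is named `meta` here only to match the plotting calls used in the rest of this report.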

ATS Deep Seep

These are the taxa identified for the ATS Deep Seep samples

Kingdom

plotTaxonomy(meta, rank = 'superkingdom', count = 'percent', ignore_unmapped = TRUE,
             metadata_groups = list("Deep-Seep" = paste0('Sample', 1:12),
                                    "Avery Island" = paste0('Sample', 13:24),
                                    "Spore Enriched" = paste0('Sample', 25:36)))
## Warning in mostAbundant(data, N = N, items = tax, others = others, rescale =
## rescale): N=15 but only 7 items exist. Returning 7 items
## Warning in mostAbundant(data, N = N + rr, items = tax, others = others, : N=16
## but only 7 items exist. Returning 7 items

Phylum

plotTaxonomy(meta, rank = 'phylum', count = 'percent', ignore_unmapped = TRUE,
             metadata_groups = list("Deep-Seep" = paste0('Sample', 1:12),
                                    "Avery Island" = paste0('Sample', 13:24),
                                    "Spore Enriched" = paste0('Sample', 25:36)))

Class

plotTaxonomy(meta, rank = 'class', count = 'percent', ignore_unmapped = TRUE,
             metadata_groups = list("Deep-Seep" = paste0('Sample', 1:12),
                                    "Avery Island" = paste0('Sample', 13:24),
                                    "Spore Enriched" = paste0('Sample', 25:36)))

Order

plotTaxonomy(meta, rank = 'order', count = 'percent', ignore_unmapped = TRUE,
             metadata_groups = list("Deep-Seep" = paste0('Sample', 1:12),
                                    "Avery Island" = paste0('Sample', 13:24),
                                    "Spore Enriched" = paste0('Sample', 25:36)))

Family

plotTaxonomy(meta, rank = 'family', count = 'percent', ignore_unmapped = TRUE,
             metadata_groups = list("Deep-Seep" = paste0('Sample', 1:12),
                                    "Avery Island" = paste0('Sample', 13:24),
                                    "Spore Enriched" = paste0('Sample', 25:36)))

Species

plotTaxonomy(meta, rank = 'species', count = 'percent', ignore_unmapped = TRUE,
             metadata_groups = list("Deep-Seep" = paste0('Sample', 1:12),
                                    "Avery Island" = paste0('Sample', 13:24),
                                    "Spore Enriched" = paste0('Sample', 25:36)))

Pathway Analysis

ATS Deep Seep

These are the pathways identified

Kegg Pathways

To answer our previous discussion on why they are using TPM when we are using DNA, the authors posted this:

No, as TPM is not actually the number of counts from that gene that you observe per million counts.

See this excerpt from our recently published paper (https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-020-03703-2)

The TPM (transcripts per million) metric was introduced by Wagner et al. [14] as an improved way to account for gene length and sequencing depth in transcriptomic experiments: we find it equally useful in metagenomics. The TPM of a feature (be it a transcript, a gene or a functional category) is the number of times that we would find that feature when randomly sampling 1 million features, given the abundances of the different features in our sample. […]. For the sake of being consistent with previous works, we maintain the nomenclature “TPM”, even when use it to measure the abundances of features other than transcripts.

So strickly speaking we could use the more generic FPM - Features Per Million, but we find it less confusing to stick to the existing nomenclature.
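To make the definition quoted above concrete, here is a small worked example in base R. The counts and lengths are toy numbers chosen only for illustration, not values from this dataset, and the calculation follows the general TPM definition rather than the exact code inside SQMtools.

```r
# Toy values for illustration only (not taken from this analysis).
counts  <- c(geneA = 100, geneB = 200, geneC = 300)     # reads mapped to each feature
lengths <- c(geneA = 1000, geneB = 2000, geneC = 1000)  # feature lengths in bp

rate <- counts / lengths        # reads per base: corrects for feature length
tpm  <- rate / sum(rate) * 1e6  # rescale so the whole sample sums to one million
tpm
```

geneA and geneB end up with the same TPM even though geneB has twice the reads, because geneB is also twice as long. This is why the metric applies to DNA-derived features as well as to transcripts.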

plotFunctions(meta, fun_level = 'KEGG', count = 'copy_number', N = 8, base_size = 20) 

COG Pathways

plotFunctions(meta, fun_level = 'COG', count = 'copy_number', N = 8, base_size = 28)

PFAM Pathways

plotFunctions(meta, fun_level = 'PFAM', count = 'copy_number', N = 8, base_size = 20)

Pathways of interest

Pathway Enrichment

Transposons

transposon = subsetFun(meta, fun = 'transposon', rescale_copy_number = FALSE)

plotTaxonomy(transposon, rank = 'genus', count = 'percent', base_size = 20,
             metadata_groups = list("Deep-Seep" = paste0('Sample', 1:12),
                                    "Avery Island" = paste0('Sample', 13:24),
                                    "Spore Enriched" = paste0('Sample', 25:36)))

ABC Proteins

ABC = subsetFun(meta, fun = 'ABC', rescale_copy_number = FALSE)

plotTaxonomy(ABC, rank = 'genus', count = 'percent', base_size = 10,
             metadata_groups = list("Deep-Seep" = paste0('Sample', 1:12),
                                    "Avery Island" = paste0('Sample', 13:24),
                                    "Spore Enriched" = paste0('Sample', 25:36)))

MAG Table

This is a list of all the MAGs identified by the SqueezeMeta pipeline. I’ve only included MAGs with > 85% completeness (which seems to be the standard in the published literature).
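The completeness cut-off can be applied in R before the table is rendered. The sketch below assumes a `Completeness` column like the one in the MAG table. The first three rows reuse values from this report, while `bin_x` is a hypothetical low-completeness bin added only so the filter has something to remove.

```r
# Three real rows from this report plus one hypothetical bin (bin_x).
bins <- data.frame(
  Contig.File  = c("concoct.2.fa.contigs", "metabat2.77.fa.contigs",
                   "metabat2.3.fa.contigs", "bin_x"),
  Completeness = c(100.00, 85.86, 85.19, 60.00),
  stringsAsFactors = FALSE
)

# Keep only bins above the 85% completeness threshold.
high_quality <- bins[bins$Completeness > 85, ]
high_quality$Contig.File
```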

library(kableExtra)

MAGS <- read.csv("MAGS.csv")

kbl(MAGS) %>%
    kable_classic(full_width = F, html_font = "Cambria")
| Contig.File | Taxonomy.Kingdom | Phylum | Class | Order | Family | Genus | Species | Length | GC.perc | Num.contigs | Disparity | Completeness |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| concoct.2.fa.contigs | Bacteria | Rhodothermota | Rhodothermia | Rhodothermales | Rubricoccaceae | Rubrivirga | | 6788719 | 63.70 | 185 | 0.000 | 100.00 |
| concoct.11.fa.contigs | Bacteria | Pseudomonadota | Betaproteobacteria | Burkholderiales | Burkholderiaceae | Burkholderia | | 6727187 | 67.05 | 221 | 0.000 | 99.83 |
| concoct.120.fa_sub.contigs | Bacteria | Bacillota | Bacilli | Bacillales | Bacillaceae | Bacillus | | 5225094 | 44.89 | 190 | 0.000 | 99.41 |
| concoct.128.fa_sub.contigs | Archaea | Euryarchaeota | Methanonatronarchaeia | Methanonatronarchaeales | | | | 1850888 | 61.60 | 115 | 0.524 | 99.20 |
| concoct.60.fa.contigs | Bacteria | Chloroflexota | | | | | | 3136764 | 69.24 | 172 | 0.012 | 99.14 |
| concoct.156.fa.contigs | Bacteria | | | | | | | 6095001 | 43.54 | 191 | 0.011 | 98.28 |
| concoct.3.fa_sub.contigs | Bacteria | Balneolota | Balneolia | Balneolales | Balneolaceae | | | 4274000 | 49.22 | 221 | 0.054 | 98.28 |
| concoct.74.fa.contigs | Bacteria | Gemmatimonadota | | | | | | 5085934 | 71.21 | 92 | 0.000 | 98.28 |
| concoct.95.fa.contigs | Bacteria | Planctomycetota | | | | | | 4888782 | 65.33 | 228 | 0.000 | 98.28 |
| concoct.124.fa.contigs | Bacteria | Planctomycetota | Phycisphaerae | Phycisphaerales | Phycisphaeraceae | Phycisphaera | | 4196988 | 59.25 | 40 | 0.059 | 98.28 |
| concoct.26.fa_sub.contigs | Bacteria | Bacillota | Clostridia | | | | | 4160121 | 35.96 | 116 | 0.147 | 98.18 |
| concoct.142.fa_sub.contigs | Bacteria | Bacillota | Clostridia | | | | | 4821405 | 40.74 | 142 | 0.093 | 98.18 |
| concoct.43.fa.contigs | Bacteria | Chloroflexota | Anaerolineae | Anaerolineales | Anaerolineaceae | | | 4116655 | 53.37 | 96 | 0.000 | 98.12 |
| concoct.135.fa.contigs | Bacteria | Chloroflexota | Anaerolineae | Anaerolineales | Anaerolineaceae | | | 5666743 | 60.19 | 339 | 0.039 | 97.41 |
| concoct.136.fa.contigs | Bacteria | Planctomycetota | | | | | | 5953379 | 62.85 | 363 | 0.014 | 96.55 |
| concoct.87.fa.contigs | Bacteria | Candidatus Hydrogenedentes | | | | | | 4397254 | 63.18 | 347 | 0.013 | 96.55 |
| concoct.72.fa_sub.contigs | Bacteria | Acidobacteriota | | | | | | 6992966 | 63.02 | 398 | 0.224 | 96.55 |
| metabat2.98.fa.contigs | Bacteria | Thermodesulfobacteriota | Desulfobacteria | Desulfobacterales | Desulfobacteraceae | | | 7292894 | 40.84 | 776 | 0.011 | 96.55 |
| concoct.146.fa_sub.contigs | Bacteria | Gemmatimonadota | | | | | | 4266924 | 71.98 | 101 | 0.110 | 96.55 |
| concoct.42.fa_sub.contigs | Bacteria | | | | | | | 4818767 | 68.56 | 103 | 0.000 | 95.69 |
| concoct.127.fa_sub.contigs | Bacteria | | | | | | | 5468141 | 63.61 | 288 | 0.000 | 94.83 |
| concoct.150.fa_sub.contigs | Bacteria | Pseudomonadota | Alphaproteobacteria | | | | | 4958108 | 68.60 | 387 | 0.481 | 94.26 |
| concoct.101.fa_sub.contigs | Bacteria | | | | | | | 3839312 | 73.09 | 357 | 0.000 | 93.97 |
| metabat2.28.fa_sub.contigs | Bacteria | Gemmatimonadota | | | | | | 4283236 | 72.45 | 247 | 0.024 | 93.10 |
| metabat2.39.fa_sub.contigs | Bacteria | Bacillota | Tissierellia | Tissierellales | Thermohalobacteraceae | Thermohalobacter | berrensis | 4466897 | 29.80 | 879 | 0.425 | 93.03 |
| metabat2.1.fa.contigs | Bacteria | Bacillota | Clostridia | | | | | 2422754 | 38.33 | 208 | 0.000 | 92.64 |
| metabat2.45.fa.contigs | Bacteria | Gemmatimonadota | | | | | | 3981811 | 73.03 | 512 | 0.033 | 92.24 |
| concoct.126.fa.contigs | Bacteria | Chlamydiota | Chlamydiia | Parachlamydiales | Parachlamydiaceae | | | 1784283 | 39.41 | 554 | 0.077 | 90.20 |
| metabat2.34.fa.contigs | Archaea | Nitrososphaerota | Nitrososphaeria | Nitrososphaerales | Nitrososphaeraceae | Nitrososphaera | | 2089720 | 40.85 | 229 | 0.356 | 89.95 |
| concoct.70.fa.contigs | Bacteria | Chloroflexota | | | | | | 5773847 | 56.30 | 603 | 0.017 | 89.66 |
| metabat2.46.fa.contigs | Bacteria | Bacillota | Bacilli | Bacillales | Paenibacillaceae | Paenibacillus | | 4410215 | 51.60 | 460 | 0.013 | 88.79 |
| concoct.63.fa_sub.contigs | Bacteria | Nitrospirota | Nitrospiria | Nitrospirales | Nitrospiraceae | Nitrospira | | 4194091 | 59.23 | 700 | 0.047 | 87.93 |
| concoct.112.fa_sub.contigs | Bacteria | Candidatus Thermoplasmata | | | | | | 4101401 | 38.53 | 421 | 0.107 | 87.93 |
| metabat2.32.fa.contigs | Bacteria | Bacillota | Bacilli | Bacillales | Bacillaceae | | | 2883534 | 33.86 | 195 | 0.039 | 87.90 |
| concoct.28.fa_sub.contigs | Bacteria | | | | | | | 1463234 | 36.95 | 28 | 0.000 | 87.77 |
| metabat2.60.fa.contigs | Bacteria | Candidatus Sumerlaeota | | | | | | 2860697 | 50.90 | 500 | 0.018 | 86.60 |
| metabat2.77.fa.contigs | Bacteria | Atribacterota | | | | | | 1457719 | 34.68 | 166 | 0.000 | 85.86 |
| metabat2.3.fa.contigs | Bacteria | | | | | | | 1954332 | 63.14 | 371 | 0.038 | 85.19 |

MAGs

For the 8 MAGs that were classified confidently down to the genus level, I’ve included some AI-generated summaries of each genus. I thought these would at least be a helpful starting point for the discussion.
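One way to pull the genus-level MAGs out of the table programmatically is sketched below. It assumes a `Genus` column in which unclassified bins are either empty strings or `NA` (how blanks are read depends on the `read.csv` settings), and it uses four rows from the MAG table rather than the full file.

```r
# Four rows taken from the MAG table; unclassified genera shown as "" or NA.
mags <- data.frame(
  Contig.File = c("concoct.2.fa.contigs", "concoct.60.fa.contigs",
                  "concoct.124.fa.contigs", "concoct.156.fa.contigs"),
  Genus       = c("Rubrivirga", "", "Phycisphaera", NA),
  stringsAsFactors = FALSE
)

# A bin counts as genus-level only if Genus is neither NA nor an empty string.
has_genus <- !is.na(mags$Genus) & nzchar(mags$Genus)
mags$Genus[has_genus]
```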

Rubrivirga

The genus Rubrivirga, part of the family Rhodothermaceae, is characterized by red-pigmented bacteria that have been isolated from deep-sea water [1, 2]. Two species within this genus, Rubrivirga marina and Rubrivirga profundi, are particularly noteworthy [1, 2]. These bacteria are Gram-staining-negative, rod-shaped, facultatively anaerobic, non-motile, and exhibit a pale-red pigmentation [2].

Rubrivirga marina was first described in a 2013 study, where it was isolated from deep seawater [1]. Similarly, Rubrivirga profundi was isolated from deep-sea water, further highlighting the genus’s adaptation to deep-sea environments [2].

In addition to their unique environmental niche, members of the Rubrivirga genus have shown potential enzymatic properties. A study characterized the degradation patterns and enzymatic properties of a novel alkali-resistant alginate lyase, AlyRm1, from Rubrivirga marina [3]. This discovery suggests potential applications in biotechnology, particularly in the breakdown of alginate, a complex polysaccharide found in the cell walls of brown algae.

Overall, the Rubrivirga genus represents a fascinating area of study within microbiology, offering insights into deep-sea microbial diversity and potential biotechnological applications.

  1. Rubrivirga marina gen. nov., sp. nov., a member of the family Rhodothermaceae isolated from deep seawater

  2. Rubrivirga profundi sp. nov., isolated from deep-sea water

  3. Characterization of degradation patterns and enzymatic properties of a novel alkali-resistant alginate lyase AlyRm1 from Rubrivirga marina

Burkholderia

The genus Burkholderia is a group of over 80 different Gram-negative species [1]. These bacteria are known for their diverse roles, acting as both beneficial and pathogenic strains [2]. Some species within this genus have been studied for their potential immune evasion mechanisms [1], while others are being explored for their potential in vaccine development [3].

  1. A Review of Potential Immune Evasion Mechanisms
  2. Members of the genus Burkholderia: good and bad guys
  3. Current Advances in Burkholderia Vaccines Development

Burkholderia is a genus of bacteria known for its versatility and adaptability in various environmental conditions. Sporulation, a process where bacteria form endospores, is a complex developmental process that some bacteria undergo under stressful conditions. However, the understanding of sporulation in Burkholderia is still an active area of research.

Studies have identified genes in Burkholderia pseudomallei, a species within the Burkholderia genus, that are regulated by σE in response to oxidative stress [1]. The σE, also known as AlgU, is part of an operon that has been found to regulate heat stress response in Burkholderia pseudomallei [2]. This operon is also suggested to have a role in sporulation-specific sigma factors [3].

  1. Transcriptional profiles of Burkholderia pseudomallei
  2. The rpoE operon regulates heat stress response in Burkholderia pseudomallei
  3. The Burkholderia pseudomallei RpoE (AlgU) operon

Bacillus

The genus Bacillus is a diverse group of bacteria known for their ability to produce endospores, allowing them to survive in harsh environments. They are Gram-positive, rod-shaped bacteria that can be found in various environments including soil, water, and the human gut [1].

Some species of Bacillus, such as Bacillus cereus, are known to be pathogenic. Emerging strains of B. cereus have been associated with anthrax-like diseases [2]. On the other hand, many Bacillus species are known for their ability to produce biologically active substances, including antibiotics and bacteriocins [3].

Bacillus is a genus of bacteria known for its ability to form endospores, a process known as sporulation. This process is an adaptive response to nutritional stress and involves the differential development of two cells [1]. The decision to initiate sporulation, DNA translocation, and cell-cell communication are key aspects of this process [2]. The environment in which sporulation occurs can influence the properties of the spores, and controlling the fate of Bacillus spores is pivotal to controlling bacterial populations [3].

  1. Bacillus subtilis sporulation: regulation of gene expression

  2. Recent progress in Bacillus subtilis sporulation

  3. Sporulation environment influences spore properties in Bacillus

Phycisphaera

The genus Phycisphaera is a group of Gram-negative, motile bacteria that were first isolated from seaweed [1]. This genus represents a novel species within the class Phycisphaerae in the phylum Planctomycetes [2]. One notable species within this genus is Phycisphaera mikurensis, which was discovered as a new marine isolate [3].

  1. Phycisphaera - Fukunaga - Major Reference Works

  2. Cultivation-Independent Analysis of the Bacterial …

  3. Phycisphaera mikurensis gen. nov., sp. nov., isolated from …

Thermohalobacter

The genus Thermohalobacter is part of the Thermohalobacteraceae family [1]. Organisms in this genus are thermophilic anaerobes, which means they thrive in high-temperature environments and do not require oxygen for growth [2]. Unfortunately, there is limited literature available specifically on Thermohalobacter, and further research would be needed for a comprehensive review.

  1. Thermohalobacteraceae fam. nov.

  2. Diversity of thermophilic anaerobes

Nitrososphaera

The genus Nitrososphaera comprises aerobic ammonia-oxidizing archaea. Nitrososphaera viennensis, a species within this genus, was isolated from garden soil in Vienna and is known to be mesophilic, neutrophilic, and aerobic [1]. This species can grow on ammonia or urea as an energy source [2]. Another species, Candidatus Nitrososphaera gargensis, has had its complete genome sequenced, providing insights into its evolutionary lineage [3].

Paenibacillus

The genus Paenibacillus is a group of bacteria that can be isolated from a wide range of sources and is relevant to humans, animals, and plants [1]. Some species within this genus are endospore-forming, Gram-stain-positive or variable, motile, rod-shaped, and can be aerobic or facultatively anaerobic [2]. One notable species is Paenibacillus paeoniae, an endophytic bacterium that was originally isolated from a surface-sterilized peony root [3].

  1. Current knowledge and perspectives of Paenibacillus

  2. Paenibacillus foliorum sp. nov., Paenibacillus …

  3. Paenibacillus paeoniae sp. nov., a novel endophytic …

Nitrospira

The genus Nitrospira is known for its pivotal role in the nitrification process as an aerobic chemolithoautotrophic nitrite-oxidizing bacterium [1]. This genus is ubiquitous and equipped with molecular machineries for both ammonia and nitrite oxidation [2]. Numerous investigations have recognized representatives of the genus Nitrospira as key and predominant nitrite-oxidizing bacteria in biological nutrient removal [3].

  1. Nitrospira

  2. Nitrospira as versatile nitrifiers: Taxonomy, ecophysiology …

  3. The occurrence and role of Nitrospira in nitrogen removal …