Taxonomic placement

SSU taxonomy

The longest SSU rRNA sequence is extracted via MiGA and placed in the SILVA SSU database:

000000F|quiver_6902612_6904145___
CTGAAGAGTTTGATCATGGCTCAGATTGAACGCTGGCGGCAGGCCTAACACATGCAAGTCGAGCGGATGAAGGGAGCTTG
CTCCCTGATTCAGCGGCGGACGGGTGAGTAATGCCTAGGAATCTGCCTGGTAGTGGGGGACAACGTTCCGAAAGGGACGC
TAATACCGCATACGTCCTACGGGAGAAAGTGGGGGATCTTCGGACCTCACGCTATCAGATGAGCCTAGGTCGGATTAGCT
AGTAGGTGGGGTAATGGCTCACCTAGGCGACGATCCGTAACTGGTCTGAGAGGATGATCAGTCACACTGGAACTGAGACA
CGGTCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGGACAATGGGCGAAAGCCTGATCCAGCCATGCCGCGTGTGT
GAAGAAGGTCTTCGGATTGTAAAGCACTTTAAGTTGGGAGGAAGGGCTGCAAGTTAATACCTTGTAGTTTTGACGTTACC
AACAGAATAAGCACCGGCTAACTTCGTGCCAGCAGCCGCGGTAATACGAAGGGTGCAAGCGTTAATCGGAATTACTGGGC
GTAAAGCGCGCGTAGGTGGTTTGGTAAGATGGATGTGAAATCCCCGGGCTCAACCTGGGAACTGCATCCATAACTGCCTG
ACTAGAGTACGGTAGAGGGTGGTGGAATTTCCTGTGTAGCGGTGAAATGCGTAGATATAGGAAGGAACACCAGTGGCGAA
GGCGACCACCTGGACTGATACTGACACTGAGGTGCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACG
CCGTAAACGATGTCGACTAGCCGTTGGGATCCTTGAGATCTTAGTGGCGCAGCTAACGCGATAAGTCGACCGCCTGGGGA
GTACGGCCGCAAGGTTAAAACTCAAATGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGAAGCAA
CGCGAAGAACCTTACCTGGCCTTGACATGTCCGGAATCCTGCAGAGATGTGGGAGTGCCTTCGGGAATCGGAACACAGGT
GCTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGTAACGAGCGCAACCCTTGTCCTTAGTTACC
AGCGGGTTATGCCGGGCACTCTAAGGAGACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAAGTCATCATGGC
CCTTACGGCCAGGGCTACACACGTGCTACAATGGTCGGTACAGAGGGTTGCCAAGCCGCGAGGTGGAGCTAATCCCACAA
AACCGATCGTAGTCCGGATCGCAGTCTGCAACTCGACTGCGTGAAGTCGGAATCGCTAGTAATCGTGAATCAGAATGTCA
CGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGTGGGTTGCTCCAGAAGTAGCTAGTCTAAC
CTTCGGGGGGACGGTTACCACGGAGTGATTCATGACTGGGGTGAAGTCGTAACAAGGTAGCCGTAGGGGAACCTGCGGCT
GGATCACCTCCTT

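# Read the SILVA classification exported from MiGA (semicolon-delimited)
silva.class <- read.csv("MiGA/P273_silva_classify.csv", sep = ";", stringsAsFactors = FALSE)
# Print the lowest-common-ancestor lineage as a level-4 heading
cat(paste("#### SilvaSSU classification:", silva.class$lca_tax_slv))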

SilvaSSU classification: Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Pseudomonadaceae;Pseudomonas;
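The semicolon-delimited lineage above can be split into named ranks for downstream filtering. The following is a minimal base R sketch; the rank labels (`domain` through `genus`) are an assumption about how the six levels of this lineage map to ranks, and the variable names are illustrative only.

```r
# Lineage string copied from the classification output above
lineage <- "Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Pseudomonadaceae;Pseudomonas;"

# Assumed rank labels for the six levels (not taken from the MiGA output)
rank_names <- c("domain", "phylum", "class", "order", "family", "genus")

# strsplit() drops the empty field produced by the trailing semicolon
ranks <- strsplit(lineage, ";", fixed = TRUE)[[1]]
names(ranks) <- rank_names[seq_along(ranks)]

print(ranks[["genus"]])
```

Naming the vector makes it possible to pull a single rank, such as `ranks[["family"]]`, without counting fields by position.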

Closest neighbors in the SILVA SSU database, as determined by RAxML:

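library(stringr)
library(knitr)  # provides kable()
silva.neighbors <- read.csv("MiGA/P273_silva_neighbors.csv", sep = ";", stringsAsFactors = FALSE)
# Split each lineage into 7 fields; the 7th field (X7) holds the organism name
silva.neighbors.tax <- data.frame(stringr::str_split_fixed(silva.neighbors$full_name, ";", 7), stringsAsFactors = FALSE)
kable(unique(silva.neighbors.tax$X7))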
x
uncultured Pseudomonas sp.
Pseudomonas citronellolis
Pseudomonas sp. wust-c
Pseudomonas sp. EGD-AKN5
Pseudomonas citronellolis NBRC 103043
Pseudomonas sp. ADP
Pseudomonas knackmussii B13
Pseudomonas sp. CCA 1

Analysis at MiGA

Genome quality assessment

103 of 111 typically essential genes are detected.

Whole genome taxonomic placement:

This is most likely an uncharacterized species of Pseudomonas.

Phylogeny based on average amino acid identity (AAI)

MiGA calculates AAI scores for the most closely related genomes in the chosen database. I used NCBI Prok, which is the largest database available at MiGA (~9000 genomes). AAI is an identity score computed among homologous proteins; it ignores non-homologous, non-shared proteins.
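To make the definition concrete, the sketch below summarizes a small table of homologous protein pairs into an AAI, its standard deviation, and a shared fraction. All values are made up, the data frame and function names are hypothetical, and dividing by the number of query proteins is an assumption; MiGA may define the shared fraction differently.

```r
# Hypothetical homologous protein pairs between a query and a reference genome
# (identities are made-up percentages, not MiGA output)
pairs <- data.frame(
  query_protein = c("q1", "q2", "q3", "q4"),
  ref_protein   = c("r7", "r2", "r9", "r4"),
  identity      = c(98.0, 91.5, 72.0, 88.5)
)
n_query_proteins <- 5  # hypothetical total number of proteins in the query

# Summarize only the shared (homologous) proteins; unshared ones do not
# contribute to the mean identity, only to the shared fraction
aai_summary <- function(pairs, n_query_proteins) {
  data.frame(
    aai             = mean(pairs$identity),
    sd              = sd(pairs$identity),
    fraction_shared = 100 * nrow(pairs) / n_query_proteins
  )
}

res <- aai_summary(pairs, n_query_proteins)
print(res)
```

The fifth query protein has no homolog in this toy example, so it lowers the shared fraction but leaves the AAI itself unchanged.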

kable(read.csv("MiGA/P273_aai.csv", header = TRUE))
| Dataset | AAI (%) | Standard deviation (AAI %) | Fraction of proteins shared (%) |
|---|---|---|---|
| Pseudomonas citronellolis NZ CP014158 | 94.55 | 9.33 | 85 |
| Pseudomonas knackmussii B13 NZ HG322950 | 85.88 | 13.14 | 75.79 |
| Pseudomonas sp. ATCC 13867 NC 020829 | 81.87 | 13.83 | 82.18 |
| Pseudomonas aeruginosa PAO581 NZ CP006705 | 76.80 | 15.25 | 59.79 |
| Pseudomonas aeruginosa NZ CP027169 | 76.45 | 15.84 | 61.93 |
| Pseudomonas sp. TCU HL1 NZ CP015992 | 74.00 | 15.75 | 67 |
| Pseudomonas stutzeri NZ CP027543 | 67.20 | 0.00 | (estimated) |
| Pseudomonas psychrotolerans NZ CP018758 | 67.11 | 0.00 | (estimated) |
| Pseudomonas sp. LPH1 NZ CP017290 | 66.67 | 0.00 | (estimated) |
| Pseudomonas fluorescens NZ CP028826 | 65.01 | 0.00 | (estimated) |
| Pseudomonas brassicacearum NZ LT629713 | 65.00 | 0.00 | (estimated) |
| Pseudomonas sp. DR 5 09 NZ CP011566 | 65.00 | 0.00 | (estimated) |
| Pseudomonas putida NBRC 14164 NC 021505 | 64.86 | 0.00 | (estimated) |
| Pseudomonas syringae pv. syringae NZ LT962480 | 64.82 | 0.00 | (estimated) |
| Pseudomonas frederiksbergensis NZ CP018319 | 64.46 | 0.00 | (estimated) |
| Pseudomonas fluorescens NZ CP010896 | 64.26 | 0.00 | (estimated) |
| Pseudomonas fragi NZ CP013861 | 64.06 | 0.00 | (estimated) |
| Pseudomonas sp. S 6 2 NZ CP020100 | 61.56 | 0.00 | (estimated) |
| Oblitimonas alkaliphila NZ CP012360 | 60.73 | 0.00 | (estimated) |
| Gammaproteobacteria bacterium ESL0073 NZ CP029478 | 58.51 | 0.00 | (estimated) |
| Microbulbifer thermotolerans NZ CP014864 | 51.52 | 0.00 | (estimated) |
| Cellvibrio sp. PSBB023 NZ CP019799 | 51.52 | 0.00 | (estimated) |
| Marinobacter sp. Arc7 DN 1 CP031848 | 51.44 | 0.00 | (estimated) |
| Halomonas sp. SF2003 NZ CP028367 | 51.44 | 0.00 | (estimated) |
| Halomonas aestuarii NZ CP018139 | 51.44 | 0.00 | (estimated) |
| Halioglobus japonicus NZ CP019450 | 51.41 | 0.00 | (estimated) |
| Halomonas campaniensis NZ CP007757 | 51.05 | 0.00 | (estimated) |
| Bacterioplanes sanyensis NZ CP022530 | 50.81 | 0.00 | (estimated) |
| Saccharospirillum mangrovi NZ CP031415 | 50.30 | 0.00 | (estimated) |
| Kushneria marisflavi NZ CP021358 | 50.17 | 0.00 | (estimated) |
| Marinomonas primoryensis CP016181 | 50.15 | 0.00 | (estimated) |
| Alcanivorax xenomutans NZ CP012331 | 49.68 | 0.00 | (estimated) |
| Citrobacter sp. FDAARGOS 156 NZ CP014030 | 48.79 | 0.00 | (estimated) |
| endosymbiont of unidentified scaly snail isolate Monju NZ AP012978 | 48.50 | 0.00 | (estimated) |
| Halorhodospira halochloris str. A CP007268 | 47.83 | 0.00 | (estimated) |
| Piscirickettsia salmonis LF 89 ATCC VR 1361 NZ CP011849 | 47.77 | 0.00 | (estimated) |
| Acinetobacter lactucae NZ CP020015 | 47.75 | 0.00 | (estimated) |
| Allochromatium vinosum DSM 180 NC 013851 | 47.68 | 0.00 | (estimated) |
| Acinetobacter schindleri NZ CP015615 | 47.63 | 0.00 | (estimated) |
| Acidihalobacter prosperus NZ CP017448 | 47.55 | 0.00 | (estimated) |
| Methylophaga nitratireducenticrescens NC 017857 | 47.13 | 0.00 | (estimated) |
| Methylomonas denitrificans NZ CP014476 | 46.85 | 0.00 | (estimated) |
| Thioploca ingrica NZ AP014633 | 46.72 | 0.00 | (estimated) |
| Moraxella bovoculi NZ CP011377 | 46.65 | 0.00 | (estimated) |
| Thioalkalivibrio paradoxus ARh 1 NZ CP007029 | 46.60 | 0.00 | (estimated) |
| Moraxella osloensis NZ CP024176 | 46.54 | 0.00 | (estimated) |
| Nitrosococcus oceani ATCC 19707 NC 007484 | 46.53 | 0.00 | (estimated) |
| Halothiobacillus sp. LS2 NZ CP016027 | 46.15 | 0.00 | (estimated) |
| Spiribacter curvatus NC 022664 | 46.15 | 0.00 | (estimated) |
| Thioalkalivibrio versutus NZ CP011367 | 46.14 | 0.00 | (estimated) |
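Rows marked "(estimated)" carry no measured standard deviation or shared fraction, so it can be useful to separate them from the directly measured comparisons before any further analysis. The sketch below does this in base R on three rows copied from the table; the short column names are hypothetical and do not match the headers of the MiGA CSV.

```r
# Three rows copied from the AAI table; column names are illustrative
aai <- data.frame(
  dataset = c("Pseudomonas citronellolis NZ CP014158",
              "Pseudomonas knackmussii B13 NZ HG322950",
              "Pseudomonas stutzeri NZ CP027543"),
  aai     = c(94.55, 85.88, 67.20),
  sd      = c(9.33, 13.14, 0.00),
  shared  = c("85", "75.79", "(estimated)"),
  stringsAsFactors = FALSE
)

# Flag estimated rows, then convert the shared fraction to numeric;
# "(estimated)" becomes NA (the coercion warning is expected and silenced)
aai$estimated  <- aai$shared == "(estimated)"
aai$shared_pct <- suppressWarnings(as.numeric(aai$shared))

# Keep only the comparisons with measured values
measured <- aai[!aai$estimated, ]
print(measured[, c("dataset", "aai", "sd", "shared_pct")])
```

Keeping the flag as its own column, rather than dropping rows immediately, leaves the estimated AAI values available for ranking distant genomes.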

MiGA also provides a tree based on the AAI. The methods are not explicit, but I have to assume that it aligns proteins shared across the dataset and builds a BioNJ tree (the program is provided here). BioNJ is not the best method, but it is better than neighbor joining (NJ) and much faster than maximum likelihood (ML).
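As a rough illustration of how a BioNJ tree could be built from AAI values, the sketch below converts a small AAI matrix to distances (100 minus AAI) and passes it to `bionj()` from the `ape` package if that package is installed. This is not MiGA's documented procedure. The query label `P273` is taken from the file names used in this report, the first row and column use values from the AAI table, and every other off-diagonal value is made up for illustration.

```r
genomes <- c("P273", "P_citronellolis", "P_knackmussii", "P_sp_ATCC_13867")

# Pairwise AAI (%). Only the P273 row/column comes from the table above;
# the comparisons among reference genomes are hypothetical placeholders.
aai <- matrix(c(
  100.00,  94.55,  85.88,  81.87,
   94.55, 100.00,  86.00,  82.00,
   85.88,  86.00, 100.00,  81.00,
   81.87,  82.00,  81.00, 100.00
), nrow = 4, byrow = TRUE, dimnames = list(genomes, genomes))

# Convert similarity to distance so identical genomes have distance 0
aai_dist <- 100 - aai

# Build the tree only when ape is available; bionj() accepts a distance matrix
if (requireNamespace("ape", quietly = TRUE)) {
  tree <- ape::bionj(aai_dist)
  print(tree)
}
```

A distance-based method like this needs only the pairwise matrix, which is why it runs much faster than an ML search over aligned sequences.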