The longest SSU rRNA gene is extracted with MiGA and classified against the SILVA SSU database:
000000F|quiver_6902612_6904145___ CTGAAGAGTTTGATCATGGCTCAGATTGAACGCTGGCGGCAGGCCTAACACATGCAAGTCGAGCGGATGAAGGGAGCTTG CTCCCTGATTCAGCGGCGGACGGGTGAGTAATGCCTAGGAATCTGCCTGGTAGTGGGGGACAACGTTCCGAAAGGGACGC TAATACCGCATACGTCCTACGGGAGAAAGTGGGGGATCTTCGGACCTCACGCTATCAGATGAGCCTAGGTCGGATTAGCT AGTAGGTGGGGTAATGGCTCACCTAGGCGACGATCCGTAACTGGTCTGAGAGGATGATCAGTCACACTGGAACTGAGACA CGGTCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGGACAATGGGCGAAAGCCTGATCCAGCCATGCCGCGTGTGT GAAGAAGGTCTTCGGATTGTAAAGCACTTTAAGTTGGGAGGAAGGGCTGCAAGTTAATACCTTGTAGTTTTGACGTTACC AACAGAATAAGCACCGGCTAACTTCGTGCCAGCAGCCGCGGTAATACGAAGGGTGCAAGCGTTAATCGGAATTACTGGGC GTAAAGCGCGCGTAGGTGGTTTGGTAAGATGGATGTGAAATCCCCGGGCTCAACCTGGGAACTGCATCCATAACTGCCTG ACTAGAGTACGGTAGAGGGTGGTGGAATTTCCTGTGTAGCGGTGAAATGCGTAGATATAGGAAGGAACACCAGTGGCGAA GGCGACCACCTGGACTGATACTGACACTGAGGTGCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACG CCGTAAACGATGTCGACTAGCCGTTGGGATCCTTGAGATCTTAGTGGCGCAGCTAACGCGATAAGTCGACCGCCTGGGGA GTACGGCCGCAAGGTTAAAACTCAAATGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGAAGCAA CGCGAAGAACCTTACCTGGCCTTGACATGTCCGGAATCCTGCAGAGATGTGGGAGTGCCTTCGGGAATCGGAACACAGGT GCTGCATGGCTGTCGTCAGCTCGTGTCGTGAGATGTTGGGTTAAGTCCCGTAACGAGCGCAACCCTTGTCCTTAGTTACC AGCGGGTTATGCCGGGCACTCTAAGGAGACTGCCGGTGACAAACCGGAGGAAGGTGGGGATGACGTCAAGTCATCATGGC CCTTACGGCCAGGGCTACACACGTGCTACAATGGTCGGTACAGAGGGTTGCCAAGCCGCGAGGTGGAGCTAATCCCACAA AACCGATCGTAGTCCGGATCGCAGTCTGCAACTCGACTGCGTGAAGTCGGAATCGCTAGTAATCGTGAATCAGAATGTCA CGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGTGGGTTGCTCCAGAAGTAGCTAGTCTAAC CTTCGGGGGGACGGTTACCACGGAGTGATTCATGACTGGGGTGAAGTCGTAACAAGGTAGCCGTAGGGGAACCTGCGGCT GGATCACCTCCTT
silva.class <- read.csv("MiGA/P273_silva_classify.csv", sep = ";", stringsAsFactors = FALSE)
cat(paste("#### SilvaSSU classification:", silva.class$lca_tax_slv))
library(stringr)
silva.neighbors <- read.csv("MiGA/P273_silva_neighbors.csv", sep = ";")
silva.neighbors.tax <- data.frame(stringr::str_split_fixed(silva.neighbors$full_name, ";", 7), stringsAsFactors = FALSE)
kable(unique(silva.neighbors.tax$X7))
| x |
|---|
| uncultured Pseudomonas sp. |
| Pseudomonas citronellolis |
| Pseudomonas sp. wust-c |
| Pseudomonas sp. EGD-AKN5 |
| Pseudomonas citronellolis NBRC 103043 |
| Pseudomonas sp. ADP |
| Pseudomonas knackmussii B13 |
| Pseudomonas sp. CCA 1 |
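To make the `X7` column above less opaque, here is a minimal sketch of what the split does to a single lineage string. The lineage and the rank labels are made up for illustration and are not taken from the MiGA output. Base R `strsplit` is used so the sketch has no dependencies; unlike `str_split_fixed`, it does not pad short lineages or merge extra fields into the last column.

```r
# Made-up lineage in the semicolon-separated style of the full_name column.
full_name <- "Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Pseudomonadaceae;Pseudomonas;Pseudomonas citronellolis"

# Split on ";" and label the seven fields (the rank labels are an assumption).
ranks <- c("domain", "phylum", "class", "order", "family", "genus", "species")
fields <- strsplit(full_name, ";", fixed = TRUE)[[1]]
lineage <- setNames(fields, ranks)

# The seventh field corresponds to what ends up in column X7.
lineage[["species"]]
```

If a lineage had fewer than seven levels, `str_split_fixed(..., 7)` would leave the trailing columns empty, so `X7` can contain empty strings for such neighbors.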
103 of 111 typically essential genes are detected:
MiGA calculates AAI (average amino acid identity) scores against the most closely related genomes in the chosen database. I used NCBI Prok, the largest database available in MiGA (~9000 genomes). AAI is the average identity among homologous proteins; it ignores non-homologous (non-shared) proteins.
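As a rough illustration of that definition, the sketch below averages identities over shared proteins only. The identity values are toy numbers, not MiGA output, and the homology search that produces them in a real run is not reproduced here.

```r
# Toy best-hit identities (%) between a query genome and one reference.
# NA marks a query protein with no homolog in the reference (hypothetical data).
identity <- c(98.2, 91.5, NA, 87.0, 95.3, NA, 99.1, 90.4)

# Proteins that have a homolog in the reference.
shared <- !is.na(identity)

aai         <- mean(identity[shared])  # mean identity over shared proteins only
aai_sd      <- sd(identity[shared])    # spread of the per-protein identities
frac_shared <- 100 * mean(shared)      # percent of query proteins with a homolog

round(c(AAI = aai, SD = aai_sd, shared_pct = frac_shared), 2)
```

The three numbers loosely mirror the AAI, standard deviation, and fraction-shared columns of the table that follows: two genomes can have a high AAI while sharing only a modest fraction of their proteins, so both columns are worth reading together.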
kable(read.csv("MiGA/P273_aai.csv", header = TRUE, check.names = FALSE))
| Dataset | AAI (%) | Standard deviation (AAI%) | Fraction of proteins shared (%) |
|---|---|---|---|
| Pseudomonas citronellolis NZ CP014158 | 94.55 | 9.33 | 85 |
| Pseudomonas knackmussii B13 NZ HG322950 | 85.88 | 13.14 | 75.79 |
| Pseudomonas sp. ATCC 13867 NC 020829 | 81.87 | 13.83 | 82.18 |
| Pseudomonas aeruginosa PAO581 NZ CP006705 | 76.80 | 15.25 | 59.79 |
| Pseudomonas aeruginosa NZ CP027169 | 76.45 | 15.84 | 61.93 |
| Pseudomonas sp. TCU HL1 NZ CP015992 | 74.00 | 15.75 | 67 |
| Pseudomonas stutzeri NZ CP027543 | 67.20 | 0.00 | (estimated) |
| Pseudomonas psychrotolerans NZ CP018758 | 67.11 | 0.00 | (estimated) |
| Pseudomonas sp. LPH1 NZ CP017290 | 66.67 | 0.00 | (estimated) |
| Pseudomonas fluorescens NZ CP028826 | 65.01 | 0.00 | (estimated) |
| Pseudomonas brassicacearum NZ LT629713 | 65.00 | 0.00 | (estimated) |
| Pseudomonas sp. DR 5 09 NZ CP011566 | 65.00 | 0.00 | (estimated) |
| Pseudomonas putida NBRC 14164 NC 021505 | 64.86 | 0.00 | (estimated) |
| Pseudomonas syringae pv. syringae NZ LT962480 | 64.82 | 0.00 | (estimated) |
| Pseudomonas frederiksbergensis NZ CP018319 | 64.46 | 0.00 | (estimated) |
| Pseudomonas fluorescens NZ CP010896 | 64.26 | 0.00 | (estimated) |
| Pseudomonas fragi NZ CP013861 | 64.06 | 0.00 | (estimated) |
| Pseudomonas sp. S 6 2 NZ CP020100 | 61.56 | 0.00 | (estimated) |
| Oblitimonas alkaliphila NZ CP012360 | 60.73 | 0.00 | (estimated) |
| Gammaproteobacteria bacterium ESL0073 NZ CP029478 | 58.51 | 0.00 | (estimated) |
| Microbulbifer thermotolerans NZ CP014864 | 51.52 | 0.00 | (estimated) |
| Cellvibrio sp. PSBB023 NZ CP019799 | 51.52 | 0.00 | (estimated) |
| Marinobacter sp. Arc7 DN 1 CP031848 | 51.44 | 0.00 | (estimated) |
| Halomonas sp. SF2003 NZ CP028367 | 51.44 | 0.00 | (estimated) |
| Halomonas aestuarii NZ CP018139 | 51.44 | 0.00 | (estimated) |
| Halioglobus japonicus NZ CP019450 | 51.41 | 0.00 | (estimated) |
| Halomonas campaniensis NZ CP007757 | 51.05 | 0.00 | (estimated) |
| Bacterioplanes sanyensis NZ CP022530 | 50.81 | 0.00 | (estimated) |
| Saccharospirillum mangrovi NZ CP031415 | 50.30 | 0.00 | (estimated) |
| Kushneria marisflavi NZ CP021358 | 50.17 | 0.00 | (estimated) |
| Marinomonas primoryensis CP016181 | 50.15 | 0.00 | (estimated) |
| Alcanivorax xenomutans NZ CP012331 | 49.68 | 0.00 | (estimated) |
| Citrobacter sp. FDAARGOS 156 NZ CP014030 | 48.79 | 0.00 | (estimated) |
| endosymbiont of unidentified scaly snail isolate Monju NZ AP012978 | 48.50 | 0.00 | (estimated) |
| Halorhodospira halochloris str. A CP007268 | 47.83 | 0.00 | (estimated) |
| Piscirickettsia salmonis LF 89 ATCC VR 1361 NZ CP011849 | 47.77 | 0.00 | (estimated) |
| Acinetobacter lactucae NZ CP020015 | 47.75 | 0.00 | (estimated) |
| Allochromatium vinosum DSM 180 NC 013851 | 47.68 | 0.00 | (estimated) |
| Acinetobacter schindleri NZ CP015615 | 47.63 | 0.00 | (estimated) |
| Acidihalobacter prosperus NZ CP017448 | 47.55 | 0.00 | (estimated) |
| Methylophaga nitratireducenticrescens NC 017857 | 47.13 | 0.00 | (estimated) |
| Methylomonas denitrificans NZ CP014476 | 46.85 | 0.00 | (estimated) |
| Thioploca ingrica NZ AP014633 | 46.72 | 0.00 | (estimated) |
| Moraxella bovoculi NZ CP011377 | 46.65 | 0.00 | (estimated) |
| Thioalkalivibrio paradoxus ARh 1 NZ CP007029 | 46.60 | 0.00 | (estimated) |
| Moraxella osloensis NZ CP024176 | 46.54 | 0.00 | (estimated) |
| Nitrosococcus oceani ATCC 19707 NC 007484 | 46.53 | 0.00 | (estimated) |
| Halothiobacillus sp. LS2 NZ CP016027 | 46.15 | 0.00 | (estimated) |
| Spiribacter curvatus NC 022664 | 46.15 | 0.00 | (estimated) |
| Thioalkalivibrio versutus NZ CP011367 | 46.14 | 0.00 | (estimated) |
MiGA also provides a tree based on AAI. The methods are not explicit, but I have to assume that it aligns proteins shared across the dataset and builds a BioNJ tree (the program is provided here). BioNJ is not the best method, but it is better than plain neighbor joining (NJ) and much faster than maximum likelihood (ML).
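Since BioNJ is a distance-based method, one way such a tree could be derived is directly from pairwise AAI values. The sketch below is my own assumption about the general idea, not MiGA's documented procedure: the genome names and AAI values are invented, and the conversion to a distance as 100 - AAI is a choice made for illustration. It uses `bionj()` from the `ape` package and skips the tree step if `ape` is not installed.

```r
# Hypothetical pairwise AAI values (%) for four made-up genomes; not MiGA output.
genomes <- c("query", "refA", "refB", "refC")
aai <- matrix(c(100,  95,  86,  77,
                 95, 100,  85,  76,
                 86,  85, 100,  78,
                 77,  76,  78, 100),
              nrow = 4, byrow = TRUE, dimnames = list(genomes, genomes))

# Assumption: turn the similarity into a distance as 100 - AAI.
aai_dist <- 100 - aai

# BioNJ is implemented in the 'ape' package; only build the tree if it is available.
if (requireNamespace("ape", quietly = TRUE)) {
  tree <- ape::bionj(aai_dist)
  print(ape::write.tree(tree))  # Newick string for the resulting tree
}
```

A tree built this way reflects only the pairwise distances, so it is best read as a quick clustering of genomes rather than as a carefully modeled phylogeny.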