| Standards | Run Information |
|---|---|
| Project Name | ASM NGS |
| Experimenter | C. Mena |
| Experiment Name (project_run#_YYYY_MM_DD) | 2024_07_03_cm_2 |
| Run Start Date/Time | 7/3/24 |
| Run Length (hrs): | 48 hours |
| Total Library Concentration (ng/ul) | 193 ng |
| Positive Control (5ng BC10 PBK) (yes/no) (If yes label it on BC10) | no |
| Sequencing Kit (RAB204, PBK004, RBK004, etc.) | RBK114 |
| Sequencing Kit Batch Number | NA |
| Flowcell ID | FAY83186 |
| QC Total Pores | 1570 |
| Run Start Total Pore | 1459 |
| Device Number (111633, 111632, 110540, 111583, etc.) | MN43813 |
| Standard Settings | Correct settings if different from below |
| Basecalling | ON (super accurate) |
| Basecalling Config: | Super accuracy |
| Barcoding | ON |
| Barcoding Options: Trim Barcodes | OFF |
| Barcoding Options: Barcodes both ends | OFF |
| Barcoding Options: Mid-read barcode filtering | OFF |
| Barcoding Options: Override minimum barcoding score | OFF (keep default at 60) |
| Filtering | OFF |
| Filtering Options: Qscore | 10 |
| Output format: Compression (.vbz, .gzip) | ON (default) |
| MinKNOW version | 24.02.16 |
| Nanopore Basecalling version (Guppy) | super accuracy |
| Notes | R.10 flow cell |
The bacteria that were selected for whole genome sequencing and their respective barcodes can be found below.
| Barcode | Sample Name |
|---|---|
| barcode01 | Paenibacillus sp. |
| barcode02 | Bacillus cereus |
| barcode03 | Klebsiella aerogenes |
| barcode04 | Staphylococcus aureus |
| barcode05 | Staphylococcus hominis |
| barcode06 | Bacillus subtilis |
| barcode07 | Staphylococcus epidermidis |
| barcode08 | Klebsiella aerogenes |
The nanopore flowcell performance can be determined using the following charts. They help to determine if the library chemistry and DNA template are going to provide high quality reads.
The nanopores through which DNA is passed, and from which signal is collected, are arrayed as a two-dimensional matrix. A heatmap of channel productivity against spatial position on the matrix helps identify spatial artifacts that could result from membrane damage, for example through the introduction of an air bubble. This heatmap representation of spatial activity shows only gross spatial aberrations. Because each channel can address four different pores (mux), the activity plot below shows the number of sequences produced per channel, not per pore.
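The per-channel counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to produce the report: the function name, the 512-channel default, and the row-major grid placement are assumptions, and the placement in particular is not the physical channel map of the flow cell.

```python
from collections import Counter


def channel_activity_grid(read_channels, n_channels=512, n_cols=32):
    """Count reads per channel and arrange the counts in a 2-D grid.

    read_channels: iterable of 1-based channel numbers, one entry per read.
    Reads from any of the four pores (mux) on a channel are counted
    together, so the result is per channel, not per pore.
    The row-major placement is a hypothetical layout for illustration only.
    """
    counts = Counter(read_channels)
    n_rows = -(-n_channels // n_cols)  # ceiling division
    grid = [[0] * n_cols for _ in range(n_rows)]
    for channel in range(1, n_channels + 1):
        row, col = divmod(channel - 1, n_cols)
        grid[row][col] = counts.get(channel, 0)
    return grid


# Toy input: channel 1 produced two reads, channels 2 and 33 one read each.
grid = channel_activity_grid([1, 1, 2, 33], n_channels=64, n_cols=32)
```

The resulting grid can be passed to any heatmap plotting routine; a cluster of zero-count cells in one region would be the kind of gross spatial aberration the plot is meant to reveal.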
The speed/time plot is used to observe any substantial changes in sequencing speed. A marked slow-down in sequencing speed can indicate challenges within the sequencing chemistry that could have been caused by the method of DNA isolation or an abundance of small DNA fragments.
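As a rough sketch of what such a plot summarizes, sequencing speed can be estimated per read as length divided by duration and then aggregated per hour of the run. The input tuple layout and the function name below are assumptions made for illustration; they do not correspond to a specific file format or to the tool that generated the report.

```python
from statistics import median


def median_speed_by_hour(reads):
    """Return {hour_of_run: median bases per second} for the given reads.

    reads: iterable of (start_time_s, duration_s, length_bases) tuples.
    Reads with a non-positive duration are skipped.
    """
    bins = {}
    for start, duration, length in reads:
        if duration <= 0:
            continue
        hour = int(start // 3600)
        bins.setdefault(hour, []).append(length / duration)
    return {hour: median(speeds) for hour, speeds in sorted(bins.items())}


# Toy input: two reads in the first hour, one read in the second hour.
speeds = median_speed_by_hour([
    (10, 10.0, 4000),
    (20, 5.0, 2000),
    (3700, 10.0, 3000),
])
```

A steady decline in the per-hour median across the run would correspond to the slow-down described above.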
The density plot of mean sequence quality against log10 sequence length is used to show patterns within the broader sequence collection. The density plot shown in the figure below has been de-speckled by omitting the rarer sequence bins, those containing 1 read or fewer. This is mainly aesthetic and masks some speckle around the periphery of the main density map.
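The de-speckling step can be illustrated with a small binning sketch: reads are grouped into (log10 length, mean quality) bins, and bins holding 1 read or fewer are dropped. The bin widths and function name are assumptions chosen for the example, not the values used for the figure.

```python
import math
from collections import Counter


def despeckled_density(reads, length_bin=0.1, quality_bin=0.5, min_reads=2):
    """Bin reads by log10(length) and mean quality, dropping sparse bins.

    reads: iterable of (length_bases, mean_qscore) tuples.
    Returns {(length_bin_index, quality_bin_index): read_count} for bins
    that contain at least min_reads reads.
    """
    counts = Counter()
    for length, mean_q in reads:
        if length <= 0:
            continue  # log10 is undefined for non-positive lengths
        key = (
            math.floor(math.log10(length) / length_bin),
            math.floor(mean_q / quality_bin),
        )
        counts[key] += 1
    return {key: n for key, n in counts.items() if n >= min_reads}


# Toy input: two similar reads share a bin; the lone long read is removed.
density = despeckled_density([(1200, 12.1), (1210, 12.2), (50000, 8.0)])
```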
| ID | Length of Assembly | No. of Contigs | Genome N50 | GC% |
|---|---|---|---|---|
| Paenibacillus sp. | 6,613,973 | 1 | 6,613,973 | 47.16 |
| B. cereus | 5,292,514 | 2 | 5,216,597 | 35.40 |
| K. aerogenes | 5,422,585 | 3 | 5,325,448 | 54.85 |
| S. aureus | 2,791,163 | 1 | 2,791,163 | 32.88 |
| S. hominis | 2,316,782 | 2 | 2,304,307 | 31.54 |
| B. subtilis | 4,197,947 | 2 | 4,159,298 | 43.60 |
| S. epidermidis | 2,518,551 | 3 | 2,428,929 | 32.08 |
| K. aerogenes | 5,423,897 | 3 | 5,325,436 | 54.85 |
The statistics from the whole genome assemblies were calculated using seqkit stats version 2.8.0.
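For readers unfamiliar with the N50 and GC% columns, the following sketch shows the standard definitions. It is not seqkit's implementation, and seqkit's handling of ambiguous bases may differ. The second contig length in the usage example is derived from the B. cereus row (total length minus N50, given two contigs) rather than read from the assembly.

```python
def n50(contig_lengths):
    """Length of the contig at which the running total of contigs, sorted
    longest first, reaches at least half of the total assembly length."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def gc_percent(sequence):
    """Percentage of G and C among the unambiguous (A, C, G, T) bases."""
    seq = sequence.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


# B. cereus row: 2 contigs, total 5,292,514, N50 5,216,597.
# The smaller contig length is inferred as 5,292,514 - 5,216,597 = 75,917.
b_cereus_contigs = [5_216_597, 75_917]
b_cereus_n50 = n50(b_cereus_contigs)
```

For a single-contig assembly such as Paenibacillus sp. or S. aureus, N50 equals the assembly length, which matches the table.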
| ID | ANI | ANI Best Match | Completeness | Contamination |
|---|---|---|---|---|
| Paenibacillus sp. | 98.15 | NZ_CP106740.1 Paenibacillus sp. CC-CFT742 chromosome, complete genome | 100.00 | 0.00 |
| B. cereus | 100.00 | NZ_CP018933.1 Bacillus cereus strain ISSFR-9F chromosome, complete genome | 100.00 | 0.04 |
| K. aerogenes | 99.99 | NZ_CP014029.2 Klebsiella aerogenes strain FDAARGOS_152 chromosome, complete genome | 100.00 | 0.81 |
| S. aureus | 99.98 | NZ_CP047815.1 Staphylococcus aureus strain UP_1484 chromosome, complete genome | 100.00 | 0.19 |
| S. hominis | 99.02 | NZ_CP080457.1 Staphylococcus hominis subsp. hominis strain WiKim0113 chromosome, complete genome | 99.99 | 0.07 |
| B. subtilis | 99.87 | NZ_CP072845.1 Bacillus subtilis strain XP chromosome, complete genome | 100.00 | 0.06 |
| S. epidermidis | 99.45 | NZ_CP090912.1 Staphylococcus epidermidis strain 44 chromosome, complete genome | 99.99 | 0.05 |
| K. aerogenes | 99.99 | NZ_CP024885.1 Klebsiella aerogenes strain AR_0009 chromosome, complete genome | 100.00 | 0.84 |
To identify the bacteria from their whole genome sequences, average nucleotide identity (ANI) was calculated using skani version 0.2.1 against reference assemblies classified as complete, downloaded from NCBI with ncbi-genome-download. Completeness and contamination were determined using CheckM2 version 1.0.2.
Nanopore whole genome sequencing assemblies were annotated with the Bakta annotation pipeline, version 1.9.4.
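Selecting the "ANI Best Match" for each sample amounts to keeping the reference with the highest ANI per query. The sketch below assumes the ANI results have already been parsed into (query, reference, ANI) tuples; that layout and the function name are illustrative assumptions, not skani's output format. The reference labelled `hypothetical_ref_A` and its ANI value are invented for the example.

```python
def best_ani_matches(rows):
    """Return {query_id: (reference_name, ani_percent)} with the highest ANI.

    rows: iterable of (query_id, reference_name, ani_percent) tuples.
    """
    best = {}
    for query, reference, ani in rows:
        if query not in best or ani > best[query][1]:
            best[query] = (reference, ani)
    return best


rows = [
    ("B. cereus", "hypothetical_ref_A", 96.50),  # invented comparison row
    ("B. cereus", "NZ_CP018933.1", 100.00),      # value from the table above
    ("S. aureus", "NZ_CP047815.1", 99.98),       # value from the table above
]
matches = best_ani_matches(rows)
```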
| ID | Coding Density | tRNAs | rRNAs | CDSs | pseudogenes | hypotheticals |
|---|---|---|---|---|---|---|
| Paenibacillus sp. | 87.2 | 110 | 39 | 5,944 | 13 | 392 |
| B. cereus | 85.9 | 104 | 42 | 5,326 | 26 | 98 |
| K. aerogenes | 89.5 | 88 | 25 | 5,047 | 10 | 142 |
| S. aureus | 85.4 | 61 | 19 | 2,568 | 7 | 43 |
| S. hominis | 88.1 | 63 | 19 | 2,205 | 6 | 92 |
| B. subtilis | 89.0 | 88 | 30 | 4,435 | 25 | 228 |
| S. epidermidis | 84.2 | 61 | 19 | 2,323 | 11 | 80 |
| K. aerogenes | 89.5 | 88 | 25 | 5,047 | 10 | 143 |
Circular genome maps were created using bakta_plot via Circos. The plots are created as part of the standard workflow and are saved as PNG and SVG files. The two default plot types, features and cog, were created following assembly. Both plot types share the two innermost rings, which show GC content and GC skew.
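The GC content and GC skew rings can be understood through the usual windowed definitions, where skew is (G - C) / (G + C) within each window. The sketch below is a generic illustration of those definitions; the window size and function name are assumptions, and it is not taken from Bakta's plotting code, which may scale or center the values differently.

```python
def gc_skew_windows(sequence, window=1000):
    """Return a list of (window_start, gc_fraction, gc_skew) tuples.

    gc_fraction = (G + C) / window length
    gc_skew     = (G - C) / (G + C), or 0.0 when the window has no G or C
    """
    seq = sequence.upper()
    results = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        gc_fraction = (g + c) / len(chunk)
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        results.append((start, gc_fraction, gc_skew))
    return results


# Toy sequence: a G-rich window followed by a C-rich window.
profile = gc_skew_windows("GGGCAATT" + "CCCGAAAA", window=8)
```

In bacterial chromosomes the sign of the skew typically flips near the origin and terminus of replication, which is why the ring is useful for orienting a circular map.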
All features are plotted on two rings representing the forward and reverse strands, from outer to inner respectively, using the following feature colors:

- CDS
- tRNA/tmRNA
- rRNA
- ncRNA
- ncRNA-region
- CRISPR
- GAP
- Misc
All protein-coding genes (CDS) are colored according to their assigned COG functional categories. To better distinguish non-coding genes, these are plotted on an additional third ring.
| No. | COG Functional Category |
|---|---|
| 1 | RNA processing and modification |
| 2 | Chromatin structure and dynamics |
| 3 | Energy production and conversion |
| 4 | Cell cycle control, cell division, chromosome partitioning |
| 5 | Amino acid transport and metabolism |
| 6 | Nucleotide transport and metabolism |
| 7 | Carbohydrate transport and metabolism |
| 8 | Coenzyme transport and metabolism |
| 9 | Lipid transport and metabolism |
| 10 | Translation, ribosomal structure and biogenesis |
| 11 | Transcription |
| 12 | Replication, recombination and repair |
| 13 | Cell wall/membrane/envelope biogenesis |
| 14 | Cell motility |
| 15 | Posttranslational modification, protein turnover, chaperones |
| 16 | Inorganic ion transport and metabolism |
| 17 | Secondary metabolites biosynthesis, transport and catabolism |
| 18 | General function prediction only |
| 19 | Function unknown |
| 20 | Signal transduction mechanisms |
| 21 | Intracellular trafficking, secretion, and vesicular transport |
| 22 | Defense mechanisms |
| 23 | Extracellular structures |
| 24 | Mobilome: prophages, transposons |
| 25 | Nuclear structure |
| 26 | Cytoskeleton |