pedbank

library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
library(readr)

megamap <- "/pita/pub/data/16S_DBs/maps/DB1-31_premap_v3.txt"
# Summarise read depth for pedbank samples with more than 4000 reads,
# counting each sample_ID once.
megamap |> read_tsv() |>
  filter(Cohort == "pedbank", reads_number > 4000) |>
  distinct(sample_ID, .keep_all = TRUE) |>
  pull(reads_number) |> summary()
Warning: One or more parsing issues, see `problems()` for details
Rows: 10860 Columns: 30
── Column specification ────────────────────────────────────────────────────────
Delimiter: "\t"
chr (29): SampleID, BarcodeSequence, LinkerPrimerSequence, Barcode, Database...
dbl  (1): reads_number

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   4003    7296   12208   15109   19567   85763
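
The parsing warning above is worth checking before trusting the summary, since values that fail to parse are read as `NA` and then silently dropped by the `reads_number > 4000` filter. The following is a minimal sketch of how that could be inspected with `problems()`. It runs on a small hypothetical table written to a temporary file, not on the real map, and the malformed `reads_number` value is an assumption made for illustration. Which column actually triggers the warning in the real file is not known from the output shown here.

```r
library(readr)

# Hypothetical toy table standing in for the real map file.
# The "unknown" entry is invented to provoke a parsing problem.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "sample_ID\tCohort\treads_number",
  "s1\tpedbank\t5000",
  "s2\tpedbank\tunknown",
  "s3\tother\t9000"
), tmp)

# Declaring column types up front makes the expected type explicit
# and suppresses the column specification message.
toy_map <- read_tsv(
  tmp,
  col_types = cols(.default = col_character(), reads_number = col_double())
)

# One row per value that could not be parsed, with expected and actual content.
parse_issues <- problems(toy_map)
print(parse_issues)

# Values that failed to parse are stored as NA.
print(toy_map)
```

On the real data, the same idea would be to assign the result of `read_tsv()` to an object and call `problems()` on it before filtering.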