curatedMetagenomicData
providescuratedMetagenomicData
provides 6 types of data for each dataset:
Types 1-3 are generated by MetaPhlAn2; 4-6 are generated by HUMAnN2.
Currently, curatedMetagenomicData
provides:
These datasets are documented in the reference manual.
Load the ExperimentHub library,
library(ExperimentHub)
and create the hub:
eh = ExperimentHub()
## snapshotDate(): 2016-06-08
The following queries ExperimentHub for all records matching “curatedMetagenomicData”:
myquery = query(eh, "curatedMetagenomicData")
This is an abbreviated list of what was found:
myquery
## ExperimentHub with 162 records
## # snapshotDate(): 2016-06-08
## # $dataprovider: Human Microbiome Project Consortium, BGI-Shenzhen, She...
## # $species: Homo sapiens
## # $rdataclass: ExpressionSet
## # additional mcols(): taxonomyid, genome, description,
## # preparerclass, tags, sourceurl, sourcetype
## # retrieve records with, e.g., 'object[["EH2"]]'
##
## title
## EH2 | Candela_Africa.stool.abundance.eset.rda
## EH3 | Candela_Africa.stool.genefamilies.eset.rda
## EH4 | Candela_Africa.stool.marker_ab.eset.rda
## EH5 | Candela_Africa.stool.marker_pres.eset.rda
## EH6 | Candela_Africa.stool.pathabundance.eset.rda
## ... ...
## EH159 | t2dmeta_short.stool.genefamilies.eset.rda
## EH160 | t2dmeta_short.stool.marker_ab.eset.rda
## EH161 | t2dmeta_short.stool.marker_pres.eset.rda
## EH162 | t2dmeta_short.stool.pathabundance.eset.rda
## EH163 | t2dmeta_short.stool.pathcoverage.eset.rda
Note that the first column, “EH2”, “EH3”, for example, are unique IDs for each dataset. The low numbers here are because curatedMetagenomicData
is the first major dataset to be distributed through ExperimentHub
. For a full list, mcols(myquery)
produces a data.frame
. The following is unevaluated in this script, but can be used to display an interactive spreadsheet of the results:
View(mcols(myquery))
Or to write the search results to a csv file:
write.csv(mcols(myquery), file="curatedMetagenomicData_allrecords.csv")
Here we choose “EH2” from the View(mcols(myquery)) above. This loads the “Candela_Africa” stool dataset. Note that this operation will occur much more quickly subsequent times that it is performed, due to local caching.
Please ignore the warning that this generates, it is harmless and will not appear in the release version.
taxabund = eh[["EH2"]]
## BioC_mirror: https://bioconductor.org
## Using Bioconductor 3.4 (BiocInstaller 1.23.5), R 3.3.1 (2016-06-21).
## Installing package(s) 'curatedMetagenomicData'
## see ?curatedMetagenomicData and browseVignettes('curatedMetagenomicData') for documentation
## loading from cache '/home/nar/.ExperimentHub/2'
taxabund
## ExpressionSet (storageMode: lockedEnvironment)
## assayData: 3302 features, 38 samples
## element names: exprs
## protocolData: none
## phenoData
## sampleNames: H10 H11 ... IT8 (38 total)
## varLabels: dataset_name sampleID ... group (211 total)
## varMetadata: labelDescription
## featureData: none
## experimentData: use 'experimentData(object)'
## pubMedIds: 25981789
## Annotation: NA
This ExpressionSet
object contains information about the experiment:
experimentData(taxabund)@name
## [1] "Rampelli S, Schnorr SL, Consolandi C, Turroni S, Severgnini M, Peano C, Brigidi P, Crittenden AN, Henry AG, Candela M"
experimentData(taxabund)@title
## [1] "Metagenome Sequencing of the Hadza Hunter-Gatherer Gut Microbiota"
experimentData(taxabund)@abstract
## [1] "Through human microbiome sequencing, we can better understand how host evolutionary and ontogenetic history is reflected in the microbial function. However, there has been no information on the gut metagenome configuration in hunter-gatherer populations, posing a gap in our knowledge of gut microbiota (GM)-host mutualism arising from a lifestyle that describes over 90% of human evolutionary history. Here, we present the first metagenomic analysis of GM from Hadza hunter-gatherers of Tanzania, showing a unique enrichment in metabolic pathways that aligns with the dietary and environmental factors characteristic of their foraging lifestyle. We found that the Hadza GM is adapted for broad-spectrum carbohydrate metabolism, reflecting the complex polysaccharides in their diet. Furthermore, the Hadza GM is equipped for branched-chain amino acid degradation and aromatic amino acid biosynthesis. Resistome functionality demonstrates the existence of antibiotic resistance genes in a population with little antibiotic exposure, indicating the ubiquitous presence of environmentally derived resistances. Our results demonstrate how the functional specificity of the GM correlates with certain environment and lifestyle factors and how complexity from the exogenous environment can be balanced by endogenous homeostasis. The Hadza gut metagenome structure allows us to appreciate the co-adaptive functional role of the GM in complementing the human physiology, providing a better understanding of the versatility of human life and subsistence."
experimentData(taxabund)@other
## $platform_summary
## [1] NA
##
## $platform_accession
## [1] NA
##
## $platform_title
## [1] "GAIIx"
##
## $platform_technology
## [1] "Illumina"
##
## $platform_distribution
## [1] NA
##
## $platform_manufacturer
## [1] "Illumina"
##
## $processor
## [1] "MetaPhlAn2 (marker & abundance) / HUMAnN2 (geneFam., pathAbun., & pathCov.)"
It also contains information about the study subjects. For example, here are the first 11 columns of data (all that are available for this dataset), for the first 5 subjects:
pData(taxabund)[1:5, 1:11]
## dataset_name sampleID subjectID bodysite disease age gender country
## H10 Candela_Africa H10 h10 stool n 40 female tanzania
## H11 Candela_Africa H11 h11 stool n 29 female tanzania
## H12 Candela_Africa H12 h12 stool n 8 female tanzania
## H13 Candela_Africa H13 h13 stool n 34 male tanzania
## H14 Candela_Africa H14 h14 stool n 30 male tanzania
## sequencing_technology pubmedid camp
## H10 Illumina 25981789 dedauko
## H11 Illumina 25981789 dedauko
## H12 Illumina 25981789 dedauko
## H13 Illumina 25981789 dedauko
## H14 Illumina 25981789 dedauko
Again, this could be explored more fully by (unevaluated):
View(pData(taxabund))
And of course, it contains the table of taxonomic abundances. Here are the relative abundances for the first two Operational Taxonomic Units (OTUs), for the first five subjects:
exprs(taxabund)[1:2, 1:5]
## H10 H11 H12 H13 H14
## k__Archaea 0.24169 0.50621 0.30522 0.40133 0.17479
## k__Archaea|p__Euryarchaeota 0.24169 0.50621 0.30522 0.40133 0.17479
We refer users to the MetaPhlAn2 documentation for description of what these values represent.
Recall that we looked up dataset codes such as “EH6” by doing View(mcols(myquery))
.
Let’s look, for example, at pathway coverage for the Candela Africa dataset:
pathcoverage = myquery[["EH7"]]
## Warning: package 'curatedMetagenomicData' is not available (for R version
## 3.3.1)
Coverage of the first two pathways, for the first two subjects, is:
exprs(pathcoverage)[1:2, 1:2]
## H1
## 12DICHLORETHDEG-PWY 0.25
## 12DICHLORETHDEG-PWY|g__Bacteroides.s__Bacteroides_cellulosilyticus NA
## H10
## 12DICHLORETHDEG-PWY 0.25
## 12DICHLORETHDEG-PWY|g__Bacteroides.s__Bacteroides_cellulosilyticus NA
We refer users to the HUMAnN2 documentation for description of what these values represent.
phyloseq
Bioconductor librarycuratedMetagenomicData
provides 6 types of data; of these, only the taxonomic profiles make sense to analyse as phylogenetic data. Here we use a convenience script from the Waldron lab to create the phyloseq
object
library(phyloseq)
source("https://raw.githubusercontent.com/waldronlab/presentations/master/Waldron_2016-06-07_EPIC/metaphlanToPhyloseq.R")
And create the phyloseq
object from the above-created object “taxabund”.
Candela = metaphlanToPhyloseq(tax = exprs(taxabund),
metadat = pData(taxabund),
simplenames = TRUE,
roundtointeger = TRUE)
Now we can do all sorts of taxonomy-aware analyses of this dataset…
summary(otu_table(Candela))
## H10 H11 H12 H13
## Min. : 0 Min. : 0 Min. : 0 Min. : 0
## 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0
## Median : 0 Median : 0 Median : 0 Median : 0
## Mean : 2415 Mean : 2413 Mean : 2412 Mean : 2409
## 3rd Qu.: 0 3rd Qu.: 0 3rd Qu.: 0 3rd Qu.: 0
## Max. :997583 Max. :994938 Max. :996948 Max. :995987
## H14 H15 H16 H17
## Min. : 0 Min. : 0 Min. : 0 Min. : 0
## 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0
## Median : 0 Median : 0 Median : 0 Median : 0
## Mean : 2409 Mean : 2413 Mean : 2411 Mean : 2412
## 3rd Qu.: 0 3rd Qu.: 0 3rd Qu.: 0 3rd Qu.: 0
## Max. :998252 Max. :980141 Max. :999055 Max. :992864
## H18 H19 H1 H20
## Min. : 0 Min. : 0 Min. : 0 Min. : 0
## 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0
## Median : 0 Median : 0 Median : 0 Median : 0
## Mean : 2418 Mean : 2418 Mean : 2370 Mean : 2382
## 3rd Qu.: 0 3rd Qu.: 0 3rd Qu.: 0 3rd Qu.: 0
## Max. :1000000 Max. :999642 Max. :988258 Max. :969768
## H21 H22 H23 H24
## Min. : 0 Min. : 0 Min. : 0 Min. : 0
## 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0
## Median : 0 Median : 0 Median : 0 Median : 0
## Mean : 2384 Mean : 2402 Mean : 2412 Mean : 2365
## 3rd Qu.: 0 3rd Qu.: 0 3rd Qu.: 0 3rd Qu.: 0
## Max. :989730 Max. :989674 Max. :1000000 Max. :999042
## H25 H26 H27 H2
## Min. : 0 Min. : 0 Min. : 0 Min. : 0
## 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0
## Median : 0 Median : 0 Median : 0 Median : 0
## Mean : 2405 Mean : 2401 Mean : 2396 Mean : 2390
## 3rd Qu.: 0 3rd Qu.: 0 3rd Qu.: 0 3rd Qu.: 0
## Max. :994856 Max. :982371 Max. :985848 Max. :1000000
## H3 H4 H5 H6
## Min. : 0 Min. : 0 Min. : 0 Min. : 0
## 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0
## Median : 0 Median : 0 Median : 0 Median : 0
## Mean : 2418 Mean : 2392 Mean : 2381 Mean : 2420
## 3rd Qu.: 0 3rd Qu.: 0 3rd Qu.: 0 3rd Qu.: 0
## Max. :999013 Max. :994328 Max. :985114 Max. :1000000
## H7 H8 H9 IT11
## Min. : 0 Min. : 0 Min. : 0 Min. : 0
## 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0
## Median : 0 Median : 0 Median : 0 Median : 0
## Mean : 2416 Mean : 2402 Mean : 2405 Mean : 2417
## 3rd Qu.: 0 3rd Qu.: 0 3rd Qu.: 0 3rd Qu.: 0
## Max. :999459 Max. :991263 Max. :996313 Max. :1000000
## IT13 IT14 IT1 IT2
## Min. : 0 Min. : 0 Min. : 0 Min. : 0
## 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0
## Median : 0 Median : 0 Median : 0 Median : 0
## Mean : 2411 Mean : 2410 Mean : 2402 Mean : 2392
## 3rd Qu.: 0 3rd Qu.: 0 3rd Qu.: 0 3rd Qu.: 0
## Max. :999760 Max. :1000000 Max. :1000000 Max. :972812
## IT3 IT4 IT5 IT6
## Min. : 0 Min. : 0 Min. : 0 Min. : 0
## 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0 1st Qu.: 0
## Median : 0 Median : 0 Median : 0 Median : 0
## Mean : 2351 Mean : 2390 Mean : 2402 Mean : 2381
## 3rd Qu.: 0 3rd Qu.: 0 3rd Qu.: 0 3rd Qu.: 0
## Max. :1000000 Max. :997734 Max. :1000000 Max. :999404
## IT7 IT8
## Min. : 0 Min. : 0
## 1st Qu.: 0 1st Qu.: 0
## Median : 0 Median : 0
## Mean : 2328 Mean : 2367
## 3rd Qu.: 0 3rd Qu.: 0
## Max. :1000000 Max. :1000000
summary(sample_data(Candela))
## dataset_name sampleID subjectID
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## bodysite disease age
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## gender country sequencing_technology
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## pubmedid camp paired_end_insert_size
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## read_length total_reads matched_reads
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## uniquely_matching_reads uniquely_matched_reads gene_number
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## gene_number_for_11m_uniquely_matched_reads hitchip_probe_number
## Length:38 Length:38
## Class :character Class :character
## Mode :character Mode :character
## bmi gene_count_class hitchip_probe_class
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## X.SampleID rna_sampleid infant_gender
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## infant_gestation_weeks cohort less_than_29weeks
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## sample_collection_days collectionweek samplecollectionwindow
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## gut_sample_id_ncbipublic gut_sample_id_corrected projectid
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## flowcell comment mlst_project
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## mlst_ec st_ec mlst_kp
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## st_kp extractionprotocolid site_id_cincinnati
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## delivery_mode infant_ethnicity infant_race
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## birth_year necrotizing_enterocolitis dol_firstnecors
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## sepsis infant_birthweight_kg infant_birth_length_cm
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## maternal_abx_given postnatal_antimicrobial_use postnatal_abx_2window
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## perinatal_antimicrobial_use death nursing_status
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## maternal_age_at_delivery_years mat_age_bin parity
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## infant_fut2 mat_fut2 lowh
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## lowlea highslea married
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## gravida preeclampsia mult_birth
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## hypertension hypertension_prepreg chorioamnionitis
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## primipar daysonform3daysprior days_on_abx_14
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## visit_number snprnt wmsphase
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## first repeat. stooltexture
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## daysafteronset hus stec_count
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## shigatoxin2elisa readsmillions nonhuman
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## stec_coverage stxab_detected stx_ratio
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## typingdata c_difficile_frequency ibd
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## sampling_day
## Length:38
## Class :character
## Mode :character
## known_consumers_of_a_defined_fermented_milk_product_.dfmp.
## Length:38
## Class :character
## Mode :character
## mgs_richness mgs_profile_matched_sample_pairs ethnicity
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## sample_name index_2_primer index_1_primer
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## sample_name_repeat1 extraction_conc sampleid_with_barcode
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## initial_nreads quality_control screensensloc_hg19
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## screensensloc_phix174 screenhiv percent_initial_reads
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## percent_qc_reads tipo_prelievo prima_dopo_cura
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## codice_provetta data
## Length:38 Length:38
## Class :character Class :character
## Mode :character Mode :character
## prelievo_efemalefemaleettuato_presso patient_number
## Length:38 Length:38
## Class :character Class :character
## Mode :character Mode :character
## pasi bsa age_ofemale_onset
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## arthritis type antibiotic_usage
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## stage cirrhotic hbv_related
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## alcohol_related other_causes_related inr
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## crea alb tb
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## pt ascites he
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## ctp meld antivirus
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## X..blocker designation age_range
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## site_symmetry affected.unaffected method
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## site_characteristic total_intital_reads estimated_median_insert_size
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## reported_as_failed_qc uniquely_align_to_human
## Length:38 Length:38
## Class :character Class :character
## Mode :character Mode :character
## non_uniquely_align_to_human_with_0_2_mismatches
## Length:38
## Class :character
## Mode :character
## reads_removed_because_of_read_pair_trimming_discrepancy
## Length:38
## Class :character
## Mode :character
## too_short_after_quality_trimming..50bp. final_._paired_reads
## Length:38 Length:38
## Class :character Class :character
## Mode :character Mode :character
## final_._singletons total_final_reads
## Length:38 Length:38
## Class :character Class :character
## Mode :character Mode :character
## X._reads_kept_by_mean_slope_of_last_5_points_.coverage.
## Length:38
## Class :character
## Mode :character
## population bmi_class X16s_rrna
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## shotgun_metagenome read_depth_16s paired
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## single height weight
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## diabetic fbg sbp
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## dbp fins fcp
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## hbalc tg tcho
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## hdl ldl kit
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## sample_group illumina_run classification
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## gad.antibodies whr wc
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## cholesterol triglycerides creatinine
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## y.gt fasting_glucose fasting_insulin
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## hba1c adiponectin leptin
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## glp.1 fgf.19 hscrp
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## c.peptide tnfa il.1
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## cd163 statins insulin
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## oral_anti.diabetic_medication years_in_sweden tnm_stage
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## ajcc_stage localization fobt
## Length:38 Length:38 Length:38
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
## wif.1_gene_methylation_test group
## Length:38 Length:38
## Class :character Class :character
## Mode :character Mode :character
phyloseq
help vignettes here.
Candela
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 3302 taxa and 38 samples ]
## sample_data() Sample Data: [ 38 samples by 211 sample variables ]
## tax_table() Taxonomy Table: [ 3302 taxa by 8 taxonomic ranks ]
Candela = prune_taxa(taxa_sums(Candela) > 0, Candela)
Candela
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 648 taxa and 38 samples ]
## sample_data() Sample Data: [ 38 samples by 211 sample variables ]
## tax_table() Taxonomy Table: [ 648 taxa by 8 taxonomic ranks ]
rank_names(Candela)
## [1] "Kingdom" "Phylum" "Class" "Order" "Family" "Genus" "Species"
## [8] "Strain"
subset_taxa(Candela, !is.na(Strain))
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 207 taxa and 38 samples ]
## sample_data() Sample Data: [ 38 samples by 211 sample variables ]
## tax_table() Taxonomy Table: [ 207 taxa by 8 taxonomic ranks ]
(Candela.sp_strain = subset_taxa(Candela, !is.na(Species)))
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 444 taxa and 38 samples ]
## sample_data() Sample Data: [ 38 samples by 211 sample variables ]
## tax_table() Taxonomy Table: [ 444 taxa by 8 taxonomic ranks ]
taxonomy.level = apply(tax_table(Candela), 1, function(x) sum(!is.na(x)))
Candela.phy = prune_taxa(taxonomy.level==2, Candela)
taxa_names(Candela.phy)
## [1] "p__Euryarchaeota" "p__Acidobacteria" "p__Actinobacteria"
## [4] "p__Bacteroidetes" "p__Cyanobacteria" "p__Firmicutes"
## [7] "p__Fusobacteria" "p__Proteobacteria" "p__Spirochaetes"
## [10] "p__Tenericutes" "p__Verrucomicrobia" "p__Eukaryota_noname"
## [13] "p__Microsporidia"
Keep taxa only if they are in the most abundant 10% of taxa in at least two samples:
f1<- filterfun_sample(topp(0.1))
pru <- genefilter_sample(Candela, f1, A=2)
summary(pru)
## Mode FALSE TRUE NA's
## logical 455 193 0
subset_taxa(Candela, pru)
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 193 taxa and 38 samples ]
## sample_data() Sample Data: [ 38 samples by 211 sample variables ]
## tax_table() Taxonomy Table: [ 193 taxa by 8 taxonomic ranks ]
More help here.
plot_heatmap(Candela.sp_strain, method="PCoA", distance="bray")
par(mar = c(20, 4, 0, 0) + 0.1) # make more room on bottom margin
barplot(sort(taxa_sums(Candela.sp_strain), TRUE)[1:30]/nsamples(Candela.sp_strain), las=2)
?phyloseq::estimate_richness
vegan
packageNote, you can ignore warning about singletons:
alpha_meas = c("Shannon", "Simpson", "InvSimpson")
(p <- plot_richness(Candela, "gender", "camp", measures=alpha_meas))
alphas = estimate_richness(Candela, measures=alpha_meas)
pairs(alphas)
E.g. Bray-Curtis dissimilarity between all pairs of samples:
plot(hclust(phyloseq::distance(Candela, method="bray")),
main="Bray-Curtis Dissimilarity", xlab="", sub = "")
?phyloseq::distance
and ?phyloseq::distanceMethodList
ord = ordinate(Candela, method="PCoA", distance="bray")
plot_ordination(Candela, ord, color="camp", shape="camp") +
ggplot2::ggtitle("Bray-Curtis Principal Coordinates Analysis")
Screeplot:
plot_scree(ord) + ggplot2::ggtitle("Screeplot")