The MWAS package provides several visualization methods: beeswarm, gradient, scatterplot, violin*, boxplot*, and heatmap*.


Preprocessing Raw Input Data

The raw input data may need preprocessing before the relationships can be shown appropriately. The function preprocess.mwas provides the following options (Table 1: Preprocessing options):

Table 1: Preprocessing options
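
As a rough illustration of the kind of preprocessing involved, the sketch below converts a count table to relative abundances and then filters taxa by prevalence, two steps suggested by the option names suppress_relative_abundance_conversion and min_prevalence used later in this tutorial. It is a minimal base-R sketch, not the preprocess.mwas implementation; the toy table, the rows-as-samples orientation, and the 0.5 threshold are all assumptions made for the example.

```r
# Hypothetical toy table: rows are samples, columns are taxa (raw counts).
counts <- matrix(
  c(10, 0, 90,
     5, 0, 95,
     0, 2, 98,
    20, 0, 80),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("sample", 1:4), c("taxonA", "taxonB", "taxonC"))
)

# Convert counts to relative abundances so that each sample (row) sums to 1.
rel_abund <- sweep(counts, 1, rowSums(counts), "/")

# Prevalence: fraction of samples in which a taxon is present (abundance > 0).
prevalence <- colMeans(rel_abund > 0)

# Keep taxa present in at least min_prev of the samples (assumed threshold).
min_prev <- 0.5
filtered <- rel_abund[, prevalence >= min_prev, drop = FALSE]

print(prevalence)
print(colnames(filtered))
```

In this toy table, taxonB appears in only one of four samples, so it falls below the assumed threshold and is dropped.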


Visualization Options and Parameters

Command-line version: all visualization-specific options use capital letters; the input options shared with other modules (shown in blue in Table 2) do not. The examples in the following sections use the data in the directory test/data/.

Table 2: Visualization options

(* Not available in the current version.)


Plot a gradient effect across samples for each individual OTU/taxon

Rscript $MWAS_DIR/bin/mwas_analysis.R -w plot -M gradient -i test/data/taxa/GG_100nt_even10k-adults_L7.biom -o example/plot_otu_gradient -S

-w: plot mode
-M: gradient plot
-i: input file; either a .biom format table or a .txt format OTU or taxon table
-o: output directory; the gradient plot is saved as a .pdf file
-S: shorten the taxonomy names in order to show only the lowest taxon level name (removing k__ etc. to simplify the taxon names on the plots)
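
To make the effect of -S concrete, here is a minimal base-R sketch of shortening a full taxonomy string to its lowest named level. The helper shorten_taxon is hypothetical and written only for illustration; it is not the MWAS implementation, and it assumes Greengenes-style names with levels separated by ";" and rank prefixes such as k__.

```r
# Shorten a full taxonomy string to its lowest informative level.
# An illustrative helper, not the MWAS implementation.
shorten_taxon <- function(taxon, sep = ";") {
  levels <- trimws(strsplit(taxon, sep, fixed = TRUE)[[1]])
  # Strip rank prefixes such as "k__", "p__", ..., "s__".
  stripped <- sub("^[a-z]__", "", levels)
  # Drop levels that are empty once the prefix is removed (e.g. "s__").
  named <- stripped[nzchar(stripped)]
  # Fall back to the original string if no level carries a name.
  if (length(named) == 0) return(taxon)
  named[length(named)]
}

full_name <- "k__Bacteria;p__Firmicutes;c__Clostridia;o__Clostridiales;f__Ruminococcaceae;g__Faecalibacterium;s__"
print(shorten_taxon(full_name))
```

Here the species level is empty, so the sketch falls back to the genus name, Faecalibacterium.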

If you are familiar with R, you can manipulate your data more flexibly. Here is the same example as in the command-line version.

  1. Set the working directory
setwd("~/Documents/LabProjects/mwas_git/")
  2. Load MWAS functions
file.sources = list.files("lib", pattern="\\.R$", full.names=TRUE, ignore.case=TRUE)
invisible(sapply(file.sources, source, .GlobalEnv))
  3. Set visualization parameters
opts <- list()
opts$mode <- "plot"
opts$method <- "gradient"
opts$input_fp <- "test/data/taxa/GG_100nt_even10k-adults_L7.biom"
opts$transform_type <- "none"
opts$suppress_relative_abundance_conversion <- FALSE
opts$min_prevalence <- NULL
opts$collapse_table <- FALSE
opts$outdir <- "example/plot_otu_gradient"
opts$shorten_taxa <- TRUE
opts$multiple_axes <- FALSE
opts$filter_kegg <- FALSE
  4. Create the output directory if needed
if(opts$outdir != ".") dir.create(opts$outdir,showWarnings=FALSE, recursive=TRUE)
  5. Parse input parameters and plot the corresponding figure type
mwas.obj <- import.plot.params(opts)
plot(mwas.obj)

The steps above are equivalent to the command-line version. Alternatively, you can call the inner functions directly rather than the wrapper functions. More details on other functions can be found in the learn module tutorial.


Draw a beeswarm plot for each individual OTU/taxon

The command format is very similar to that of the gradient plot, apart from a few options. A beeswarm plot is similar to a scatter plot, but the points are packed closely together without overlapping. It is another way to visualize the distribution of samples.

Rscript $MWAS_DIR/bin/mwas_analysis.R -w plot -M beeswarm -i $MWAS_DIR/test/data/taxa/GG_100nt_even10k-adults_L7.biom -o example/beeswarm2 -m $MWAS_DIR/test/data/gg-map-adults.txt -c COUNTRY -A 0.05 -N 20 -S

-w: plot mode
-M: beeswarm plot
-i: input file; either a .biom format table or a .txt format OTU or taxon table
-o: output directory; the beeswarm plot is saved as a .pdf file
-m: mapping file; a category name (-c) must be given as well
-c: category name
-F: taxon statistical test result table, including p-values and q-values (adjusted p-values; false discovery rate control); required if the -i option is empty (not used in the example above)
-A: false discovery rate control cutoff
-N: number of taxa to be considered; if omitted, all selected taxa are plotted
-S: shorten the taxonomy names in order to show only the lowest taxon level name (removing k__ etc. to simplify the taxon names on the plots)
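
The roles of -A and -N can be sketched in base R as follows. This is an illustration only: it assumes the q-values are Benjamini-Hochberg adjusted p-values and that -N keeps the taxa with the smallest q-values, neither of which is confirmed here for MWAS. The p-values and taxon names are made up.

```r
# Hypothetical raw p-values from a per-taxon statistical test.
pvals <- c(taxonA = 0.001, taxonB = 0.004, taxonC = 0.012,
           taxonD = 0.5, taxonE = 0.9)

# Benjamini-Hochberg adjustment gives q-values (FDR-adjusted p-values).
qvals <- p.adjust(pvals, method = "BH")

fdr_cutoff <- 0.05  # plays the role of -A
n_plot <- 2         # plays the role of -N

# Keep taxa under the cutoff, order by q-value, and take at most n_plot.
significant <- sort(qvals[qvals <= fdr_cutoff])
selected <- names(head(significant, n_plot))
print(selected)
```

With these made-up values, three taxa pass the cutoff and the two with the smallest q-values (taxonA and taxonB) are kept for plotting.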

If you are familiar with R, you can manipulate your data more flexibly. Here is the same example as in the command-line version.

  1. Set the working directory (This step is needed only once.)
setwd("~/Documents/LabProjects/mwas_git/")
  2. Load MWAS functions (This step is needed only once.)
file.sources = list.files("lib", pattern="\\.R$", full.names=TRUE, ignore.case=TRUE)
invisible(sapply(file.sources, source, .GlobalEnv))
  3. Set visualization parameters
opts <- list()
opts$mode <- "plot"
opts$method <- "beeswarm"
opts$input_fp <- "test/data/taxa/GG_100nt_even10k-adults_L7.biom"
opts$map_fp <- "test/data/gg-map-adults.txt"
opts$category <- "COUNTRY"
opts$outdir <- "example/plot-beeswarm"
opts$shorten_taxa <- TRUE
opts$fdr <- 0.05
opts$nplot <- 20
  4. Create the output directory if needed
if(opts$outdir != ".") dir.create(opts$outdir,showWarnings=FALSE, recursive=TRUE)
  5. Parse input parameters and plot the corresponding figure type
mwas.obj <- import.plot.params(opts)
plot(mwas.obj)

Draw a scatter plot for each individual OTU/taxon

The command format is similar to that of the beeswarm plot, apart from a few options. The scatter plot shows the correlation between two categories for each selected taxon. If there are more than two categories, the function outputs pairwise comparison results; the example below has three categories.
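
To see why three categories lead to three pairwise comparisons, the following base-R sketch enumerates all unordered pairs of category labels. The labels are hypothetical placeholders, and the sketch only illustrates the counting; it does not reproduce how MWAS builds its plots.

```r
# Hypothetical category labels; a real mapping file column would supply these.
categories <- c("groupA", "groupB", "groupC")

# All unordered pairs of categories: choose(3, 2) = 3 comparisons.
pairs <- combn(categories, 2)

# Label each comparison as "x vs y".
pair_labels <- apply(pairs, 2, paste, collapse = " vs ")
print(pair_labels)
```

In general, k categories give choose(k, 2) pairs, so the number of comparisons grows quickly as categories are added.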


Rscript $MWAS_DIR/bin/mwas_analysis.R -w plot -M scatterplot -i test/data/taxa/merged-taxa.txt -o example/scatterplot -m test/data/gg-map-adults.txt -c COUNTRY -A 0.01 -N 20 -S

-w: plot mode
-M: scatterplot
-i: input file; either a .biom format table or a .txt format OTU or taxon table
-o: output directory; the scatter plot is saved as a .pdf file
-m: mapping file; a category name (-c) must be given as well
-c: category name
-A: false discovery rate control cutoff
-N: number of taxa to be considered; if omitted, all selected taxa are plotted
-S: shorten the taxonomy names in order to show only the lowest taxon level name (removing k__ etc. to simplify the taxon names on the plots)

If you are familiar with R, you can manipulate your data more flexibly. Here is the same example as in the command-line version.

  1. Set the working directory (This step is needed only once.)
setwd("~/Documents/LabProjects/mwas_git/")
  2. Load MWAS functions (This step is needed only once.)
file.sources = list.files("lib", pattern="\\.R$", full.names=TRUE, ignore.case=TRUE)
invisible(sapply(file.sources, source, .GlobalEnv))
  3. Set visualization parameters
opts <- list()
opts$mode <- "plot"
opts$method <- "scatterplot"
opts$input_fp <- "test/data/taxa/merged-taxa.txt"
opts$map_fp <- "test/data/gg-map-adults.txt"
opts$category <- "COUNTRY"
opts$outdir <- "example/scatter-plot"
opts$shorten_taxa <- TRUE
opts$fdr <- 0.01
opts$nplot <- 20
  4. Create the output directory if needed
if(opts$outdir != ".") dir.create(opts$outdir,showWarnings=FALSE, recursive=TRUE)
  5. Parse input parameters and plot the corresponding figure type
mwas.obj <- import.plot.params(opts)
plot(mwas.obj)

Reference

Hu Huang, Emmanuel Montassier, Pajau Vangay, Gabe Al Ghalith, Dan Knights. “Robust statistical models for microbiome phenotype prediction with the MWAS package” (in preparation)