class: center, middle, inverse, title-slide

# Install PICRUSt on Windows
### Hsiu J. Ho
### 2017/08/18

---

### PICRUSt

- PICRUSt: Phylogenetic Investigation of Communities by Reconstruction of Unobserved States
- It performs functional prediction: Greengenes OTU IDs are converted into the abundance of each KEGG or COG function, and the KEGG results can be summarized further into pathways.
- For more information, see the [official website](http://picrust.github.io/picrust/index.html)
- [the PICRUSt GitHub repository](https://github.com/picrust/picrust)

---

### Installing Python

- Install Python 2.7. It comes in 32-bit and 64-bit builds, and the choice determines which builds of the packages must be installed later. The default download on the [official website](https://www.python.org/downloads/) is 32-bit; for the 64-bit build, look for the download link on the [release page](https://www.python.org/downloads/release/python-2713/). This example uses 32-bit Python.
- Ways to install Python packages:

```yaml
pip install foo
pip install foo.whl
pip install -U foo
pip uninstall foo
```

Here `pip` and `foo.whl` must be given with their full paths. `foo.whl` is a package downloaded from [Unofficial Windows Binaries for Python Extension Packages](http://www.lfd.uci.edu/~gohlke/pythonlibs/), `-U` upgrades a package, and `uninstall` removes it.

---

### Installing PICRUSt

- The [Installing PICRUSt](http://picrust.github.io/picrust/install.html#install) page gives a brief installation procedure.
- Download [VCForPython27.msi](https://www.microsoft.com/en-us/download/confirmation.aspx?id=44266) and install it.
- Download the 32-bit builds of [numpy+mkl.whl](http://www.lfd.uci.edu/~gohlke/pythonlibs/#numpy) and [scipy.whl](http://www.lfd.uci.edu/~gohlke/pythonlibs/#scipy).
- [Download the PICRUSt Python code](https://github.com/picrust/picrust/releases/download/1.1.1/picrust-1.1.1.tar.gz) and extract it.
- Download PICRUSt’s Precalculated Files from [this page](http://picrust.github.io/picrust/picrust_precalculated_files.html#id1). Download every file listed under Greengenes v13.5 (and IMG 4) and under Greengenes 18may2012, and before installing PICRUSt move them into the picrust/data folder of the path extracted in the previous step.
- Open the Command Prompt (cmd) and enter the following commands, adding the path to each file yourself:

```yaml
pip install numpy-1.13.1+mkl-cp27-cp27m-win32.whl
pip install scipy-0.19.1-cp27-cp27m-win32.whl
pip install cogent
pip install biom-format
pip install h5py
```

- Install PICRUSt. This step copies PICRUSt’s Precalculated Files into the system path. The downloaded code and files are already enough to run PICRUSt, but PICRUSt reads the Precalculated Files from the system path. The `...` below stands for an omitted part of the path; the command needs the full path to run.

```yaml
C:\Python27\Scripts\pip install E:\...\picrust-1.1.1\.
```

---

### Example 1

- The PICRUSt archive includes the official tutorial data:

```yaml
.\picrust-1.1.1\tutorials\hmp_mock_16S.biom
```

- The OTU IDs in the tutorial data refer to Greengenes 18may2012, so every command below passes `-g 18may2012`.
- Run the following in cmd; KEGG is predicted by default:

```yaml
cd E:\fastq\picrust\Precalculated\v1.1.1\picrust-1.1.1
e:
c:\python27\python .\scripts\normalize_by_copy_number.py -g 18may2012 -i .\tutorials\hmp_mock_16S.biom -o .\tutorials\normalized_otus.biom
c:\python27\python .\scripts\predict_metagenomes.py -g 18may2012 -i .\tutorials\normalized_otus.biom -o .\tutorials\metagenome_predictions.biom
c:\python27\python .\scripts\predict_metagenomes.py -g 18may2012 -f -i .\tutorials\normalized_otus.biom -o .\tutorials\metagenome_predictions.tab
c:\python27\python .\scripts\metagenome_contributions.py -g 18may2012 -i .\tutorials\normalized_otus.biom -l K00001,K00002,K00004 -o .\tutorials\ko_metagenome_contributions.tab
```

---

### Example 1

- The prediction can also be switched to COG:

```yaml
c:\python27\python .\scripts\predict_metagenomes.py --type_of_prediction cog -g 18may2012 -i .\tutorials\normalized_otus.biom -o .\tutorials\metagenome_predictions_COG.biom
c:\python27\python .\scripts\predict_metagenomes.py --type_of_prediction cog -g 18may2012 -f -i .\tutorials\normalized_otus.biom -o .\tutorials\metagenome_predictions_COG.tab
dir .\tutorials
```

- The output comes in two formats, biom and tab; the biom file carries more information than the tab file.
- See [Analyzing PICRUSt predicted metagenomes](http://picrust.github.io/picrust/tutorials/downstream_analysis.html#downstream-analysis-guide) for how to analyze the results further with other software.

---

### Importing and exporting BIOM files in R

```r
install.packages("devtools") # if not already installed
library("devtools")
install_github("joey711/biom")
```

```r
path="E:/fastq/picrust/Precalculated/v1.1.1/picrust-1.1.1/tutorials"
a1=biom::read_biom(file.path(path,"metagenome_predictions.biom"))
a1
```

```
# biom object. 
# type: OTU table 
# matrix_type: dense 
# 6885 rows and 2 columns 
```

---

### Importing and exporting BIOM files in R

```r
a2=readr::read_tsv(file.path(path,"metagenome_predictions.tab"),skip = 1)
dim(a2)
```

```
# [1] 6885    4
```

```r
head(a2,3)
```

```
# # A tibble: 3 x 4
#   `#OTU ID` staggered   even KEGG_Description
#   <chr>         <dbl>  <dbl> <chr>
# 1 K00001       431689 416486 alcohol dehydrogenase [EC:1.1.1.1]
# 2 K00002         4465  16404 alcohol dehydrogenase (NADP+) [EC:1.1.1.2]
# 3 K00003       107292 141415 homoserine dehydrogenase [EC:1.1.1.3]
```

---

### Example 2: Running PICRUSt from R

```r
# phylo is assumed to be an existing phyloseq object holding the OTU table
b1=biom::make_biom(phylo@otu_table@.Data)
setwd("E:/fastq/picrust/Precalculated/v1.1.1/picrust-1.1.1")
#export
biom::write_biom(b1,"./tutorials/b1.biom")
#gene copy number
system("c:/python27/python ./scripts/normalize_by_copy_number.py -i ./tutorials/b1.biom -o ./tutorials/b1_normalized_otus.biom")
#KEGG functional prediction
system("c:/python27/python ./scripts/predict_metagenomes.py -i ./tutorials/b1_normalized_otus.biom -o ./tutorials/b1_metagenome_predictions.biom")
#summarize into pathways
system("c:/python27/python ./scripts/categorize_by_function.py -i ./tutorials/b1_metagenome_predictions.biom -c KEGG_Pathways -l 3 -o ./tutorials/b1_metagenome_predictions.L3.biom")
#COG functional prediction
system("c:/python27/python ./scripts/predict_metagenomes.py --type_of_prediction cog -i ./tutorials/b1_normalized_otus.biom -o ./tutorials/b1_cog_metagenome_predictions.biom")
#COG_Category
system("c:/python27/python ./scripts/categorize_by_function.py -i ./tutorials/b1_cog_metagenome_predictions.biom -c COG_Category -l 1 -o ./tutorials/b1_cog_metagenome_predictions.L1.biom")
system("c:/python27/python ./scripts/categorize_by_function.py -i ./tutorials/b1_cog_metagenome_predictions.biom -c COG_Category -l 2 -o ./tutorials/b1_cog_metagenome_predictions.L2.biom")
```

---

### Example 2: Running PICRUSt from R

- Import the PICRUSt results

```r
setwd("E:/fastq/picrust/Precalculated/v1.1.1/picrust-1.1.1/tutorials")
b2=biom::read_biom("b1_metagenome_predictions.biom")
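b3=biom::read_biom("b1_metagenome_predictions.L3.biom")
b4=biom::read_biom("b1_cog_metagenome_predictions.biom")
b41=biom::read_biom("b1_cog_metagenome_predictions.L1.biom")
b42=biom::read_biom("b1_cog_metagenome_predictions.L2.biom")
```

- Data wrangling: biom to matrix

```r
require(pipeR)
b5=do.call(rbind,b4$data)
rownames(b5)=biom::rownames(b4)
b6=biom::observation_metadata(b4)
# tabulate how many bracketed category codes, e.g. [A], each COG carries
b6$COG_Category %>>%
  (regmatches(., gregexpr(pattern="\\[[A-Z]\\]", .))) %>>%
  sapply(length) %>>%
  table()
```

---

### Debug

- Errors may occur while installing the packages or running PICRUSt; when they do, search Google for the error message to find a solution.
- A package may need to be updated.
- A package may be missing; install it, then reinstall the earlier packages.
- The bit versions may not match; download the correct build, then reinstall the earlier packages.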
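
---

### Debug

A bit-version mismatch can be checked before reinstalling anything. The sketch below is not part of PICRUSt; it is a minimal illustration that reports whether the running Python is 32-bit or 64-bit and reads the tags out of a wheel file name. The helper names `interpreter_bits`, `wheel_tags` and `wheel_bits` are hypothetical, and the code only assumes the standard wheel naming scheme, in which the last three dash-separated fields are the Python, ABI and platform tags.

```python
import struct
import sys


def interpreter_bits():
    """Return 32 or 64 depending on the pointer size of the running Python."""
    return struct.calcsize("P") * 8


def wheel_tags(filename):
    """Split a wheel file name into (python_tag, abi_tag, platform_tag)."""
    stem = filename[:-len(".whl")]
    # The last three dash-separated fields of a wheel name are its tags.
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def wheel_bits(filename):
    """Return 32 or 64 for a Windows wheel, or None for any other platform tag."""
    platform_tag = wheel_tags(filename)[2]
    if platform_tag == "win32":
        return 32
    if platform_tag == "win_amd64":
        return 64
    return None


if __name__ == "__main__":
    # File name taken from the installation slide.
    wheel = "numpy-1.13.1+mkl-cp27-cp27m-win32.whl"
    print("Python %d.%d, %d-bit" % (sys.version_info[0], sys.version_info[1], interpreter_bits()))
    print("wheel tags: %s" % (wheel_tags(wheel),))
    if wheel_bits(wheel) != interpreter_bits():
        print("bit version mismatch: download the matching build")
```

Run it with the same interpreter that runs PICRUSt (here `c:\python27\python`); if a mismatch is reported, download the build whose platform tag matches the interpreter.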