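This document describes the database structure, features, and build details of the EvidenCellMarker Database, organized page by page.
1. Data overview
All paths below are given relative to the EvidenCellMarker/ root directory.
Path: data/mydata.RDS
Structure of data/mydata.RDS: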
> mydata <- readRDS("./EvidenCellMarker/data/mydata.RDS")
> str(mydata)
'data.frame': 890296 obs. of 29 variables:
 $ top_level             : chr "brain" "lung" "blood" "breast" ...
 $ top_level_uberon_id   : chr "UBERON:0000955" "UBERON:0002048" "UBERON:0000178" "UBERON:0000310" ...
 $ top_level_mixed_name  : chr "NULL" "NULL" "NULL" "NULL" ...
 $ tissue_class          : chr "Nervous System" "Lung" "Blood" "Breast" ...
 $ cell_name_standardized: chr "neuronal cell" "Nucleocapsid-specific B cell" "N-specific B cell" "neoplastic cell" ...
 $ cell_name_cl_id       : chr "NULL" "NULL" "NULL" "CL:0001063" ...
 $ cell_name_orig        : chr "neuronal" "N-specific B cells" "N-specific B cells" "tumour cell" ...
 $ disease_type_do       : chr "glioblastoma" "COVID-19" "COVID-19" "breast cancer" ...
 $ disease_type_doid     : chr "DOID:3068" "DOID:0080600" "DOID:0080600" "DOID:1612" ...
 $ disease_type          : chr "Glioblastoma" "COVID-19 (SARS-CoV-2 infection)" "COVID-19 (SARS-CoV-2 infection)" "Breast cancer" ...
 $ marker                : chr "Tuj 1" "N protein" "N" "N-cadherin" ...
 $ marker_polarity       : chr "positive" "positive" "positive" "positive" ...
 $ marker_corrected      : chr "NULL" "NULL" "NULL" "NULL" ...
 $ gene_symbol           : chr "A1BG" "NAT2" "NAT2" "CDH2" ...
 $ gene_type             : chr "protein-coding" "protein-coding" "protein-coding" "protein-coding" ...
 $ gene_full_name        : chr "alpha-1-B glycoprotein" "N-acetyltransferase 2" "N-acetyltransferase 2" "cadherin 2" ...
 $ gene_aliases          : chr "A1B, ABG, GAB, HYST2477" "AAC2, NAT-2, PNAT" "AAC2, NAT-2, PNAT" "ACOGS, ADHD8, ARVD14, CD325, CDHN, CDw325, NCAD" ...
 $ protein_id            : chr "unknown" "unknown" "unknown" "P19022" ...
 $ protein_name          : chr "unknown" "unknown" "unknown" "Cadherin-2" ...
 $ entrez_id             : chr "1" "10" "10" "1000" ...
 $ ensembl_id            : chr "ENSG00000121410" "ENSG00000156006" "ENSG00000156006" "ENSG00000170558" ...
 $ species               : chr "Human" "Human" "Human" "Human" ...
 $ pmid                  : chr "25871395" "37884506" "36560669" "33801519" ...
 $ pmcid                 : chr "PMC4496184" "PMC10603102" "PMC9785906" "PMC7958863" ...
 $ journal               : chr "Oncotarget" "Nature communications" "Viruses" "International journal of molecular sciences" ...
 $ year                  : chr "2015" "2023" "2022" "2021" ...
 $ title                 : chr "Targeted therapy of glioblastoma stem-like cells and tumor non-stem cells using cetuximab-conjugated iron-oxide nanoparticles." "Respiratory mucosal immune memory to SARS-CoV-2 after infection and vaccination." "CXCL12 and CXCL13 Cytokine Serum Levels Are Associated with the Magnitude and the Quality of SARS-CoV-2 Humoral Responses." "Heterogeneous Manifestations of Epithelial-Mesenchymal Plasticity of Circulating Tumor Cells in Breast Cancer Patients." ...
 $ section               : chr "RESULTS; Multilineage differentiation and tumorigenicity of human GBM neurospheres" "Methods; B cells immunophenotyping and detection of SARS-CoV-2 specific B cells" "3. Results; 3.3. Frequencies of Spike, RBD, and N-Specific B Cell Responses" "4. Materials and Methods; 4.3. Multiplex Immunofluorescence (mIF) Staining" ...
 $ source                : chr "N08-74, N08-30, and N08-1002 neurospheres formed invasive tumors in athymic nude mice brains within 4-11 months"| __truncated__ "Cryopreserved BAL cells and PBMCs were used for detection of SARS-CoV-2 specific B cells in lower airways and b"| __truncated__ "Briefly, blood SARS-CoV-2-specific B cells were identified using the biotinylated Spike, RBD, or N proteins (Fi"| __truncated__ "Heterogeneous Manifestations of Epithelial-Mesenchymal Plasticity of Circulating Tumor Cells in Breast Cancer P"| __truncated__ ...
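The columns are:
- top_level: final tissue classification, i.e. the standardized tissue name.
- top_level_uberon_id: UBERON ID of the standardized tissue.
- top_level_mixed_name: when top_level is mixed_tissue (a mixture of several tissue types), the unified name of that mixture, used to avoid confusion.
- tissue_class: original tissue name.
- cell_name_standardized: standardized cell name, usually following the Cell Ontology.
- cell_name_cl_id: corresponding Cell Ontology ID.
- cell_name_orig: original cell name as written in the paper.
- disease_type_do: disease type name standardized against the Disease Ontology; Normal means healthy.
- disease_type_doid: corresponding Disease Ontology ID.
- disease_type: original, unstandardized disease type.
- marker: marker as mentioned in the paper.
- marker_polarity: polarity of the marker, either positive or negative.
- gene_symbol: gene symbol of the marker gene.
- gene_type: type of the marker gene.
- gene_full_name: full name of the marker gene.
- gene_aliases: aliases of the marker gene.
- protein_id: ID of the protein encoded by the marker gene.
- protein_name: name of that protein.
- entrez_id: Entrez ID of the gene.
- ensembl_id: Ensembl ID of the gene.
- species: species.
- pmid: PubMed ID.
- pmcid: PubMed Central ID.
- journal: journal name.
- year: publication year.
- title: article title.
- section: name of the article section.
- source: original text that supports the annotation.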
Full-text database of the articles (stored in the texts table):
(base) server@server-MS03-CE0:./EvidenCellMarker/data$ sqlite3 pmc_texts.sqlite
SQLite version 3.50.2 2025-06-28 14:00:48
Enter ".help" for usage hints.
sqlite> .tables
status texts
sqlite> SELECT * FROM texts LIMIT 10;
PMC12085826|Title|1|1|Navigating single-cell RNA-sequencing: protocols, tools, databases, and applications
PMC12085826|Abstract|1|1|Single-cell RNA-sequencing (scRNA-seq) technology brought about a revolutionary change in the transcriptomic world, paving the way for comprehensive analysis of cellular heterogeneity in complex biological systems.
PMC12085826|Abstract|1|2|It enabled researchers to see how different cells behaved at single-cell levels, providing new insights into the process.
PMC12085826|Abstract|1|3|However, despite all these advancements, scRNA-seq also experiences challenges related to the complexity of data analysis, interpretation, and multi-omics data integration.
PMC12085826|Abstract|1|4|In this review, these complications were discussed in detail, directly pointing at the optimization of scRNA-seq approaches and understanding the world of single-cell and its dynamics.
PMC12085826|Abstract|1|5|Different protocols and currently functional single-cell databases were also covered.
PMC12085826|Abstract|1|6|This review highlights different tools for the analysis of scRNA-seq and their methodologies, emphasizing innovative techniques that enhance resolution and accuracy at a single-cell level.
PMC12085826|Abstract|1|7|Various applications were explored across domains including drug discovery, tumor microenvironment (TME), biomarker discovery, and microbial profiling, and case studies were discussed to explain the importance of scRNA-seq by uncovering novel and rare cell types and their identification.
PMC12085826|Abstract|1|8|This review underlines a crucial aspect of scRNA-seq in the advancement of personalized medicine and highlights its potential to understand the complexity of biological systems.
PMC12085826|Introduction|1|1|Two centuries after Robert Hooke and Antonie van Leeuwenhoek, cells were redefined as the fundamental functional unit of life [1].
Columns:
sqlite> PRAGMA table_info(texts);
0|pmcid|TEXT|1||0
1|section|TEXT|0||0
2|paragraph|INTEGER|0||0
3|sentence|INTEGER|0||0
4|text|TEXT|0||0
sqlite>
- pmcid: PMC ID.
- section: section name.
- paragraph: paragraph index.
- sentence: sentence index (in reading order).
- text: body text, one sentence per row.
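As a rough illustration of how the Full text view could be assembled from this table, the sketch below groups the sentences of one article by section and paragraph. It is written in Python only for illustration; the function name load_full_text is hypothetical, and it assumes that rows were inserted in reading order, so that rowid order gives the order of the sections.

```python
import sqlite3


def load_full_text(conn, pmcid):
    """Return {section: [paragraph_text, ...]} for one article.

    Sections keep the order in which they first appear (assumption: rowid
    order equals reading order). Within a section, sentences are ordered by
    the paragraph and sentence indices.
    """
    rows = conn.execute(
        "SELECT section, paragraph, sentence, text FROM texts "
        "WHERE pmcid = ? ORDER BY rowid",
        (pmcid,),
    ).fetchall()
    sections = {}  # dicts preserve insertion order
    for section, paragraph, sentence, text in rows:
        sections.setdefault(section, {}).setdefault(paragraph, []).append(
            (sentence, text)
        )
    return {
        section: [
            " ".join(text for _, text in sorted(sentences))
            for _, sentences in sorted(paragraphs.items())
        ]
        for section, paragraphs in sections.items()
    }


# Tiny in-memory stand-in for pmc_texts.sqlite (illustrative rows only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE texts (pmcid TEXT NOT NULL, section TEXT, "
    "paragraph INTEGER, sentence INTEGER, text TEXT)"
)
conn.executemany(
    "INSERT INTO texts VALUES (?, ?, ?, ?, ?)",
    [
        ("PMC1", "Title", 1, 1, "A title"),
        ("PMC1", "Abstract", 1, 2, "Second sentence."),
        ("PMC1", "Abstract", 1, 1, "First sentence."),
        ("PMC1", "Abstract", 2, 1, "New paragraph."),
        ("PMC2", "Title", 1, 1, "Other article"),
    ],
)
demo = load_full_text(conn, "PMC1")
```

With 890296 annotation rows pointing into this table, an index on pmcid would probably be worth adding if one does not exist yet.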
These fields are further grouped into blocks for displaying the annotation details. In the lists below, the name in parentheses is the label shown in the database interface (the column names above are not very consistent). All fields are displayed in the order listed.
- Annotation information: species (Species)
- Tissue information:
  - top_level (Standardized tissue name)
  - top_level_uberon_id (Uberon ID): show Not mapped if NA. Needs a hyperlink, with ":" replaced by "_"; for example, UBERON:0000178 links to http://purl.obolibrary.org/obo/UBERON_0000178
  - top_level_mixed_name (Mixed tissue name): not shown if NA
  - tissue_class (Original tissue name)
- Cell information:
  - cell_name_standardized (Standardized cell name)
  - cell_name_cl_id (Cell Ontology ID): show Not mapped if NA. Needs a hyperlink, with ":" replaced by "_"; for example, CL:0002326 links to http://purl.obolibrary.org/obo/CL_0002326
  - cell_name_orig (Original cell name)
- Disease information:
  - disease_type_do (Standardized Disease type)
  - disease_type_doid (Disease Ontology ID): always link to https://disease-ontology.org/do/ (no ID suffix, because that page does not support queries)
  - disease_type (Original Disease type)
- Gene & Protein information:
  - marker (Original marker name)
  - marker_polarity (Marker polarity)
  - gene_symbol (HGNC gene symbol): add a check indicator, a green check mark when gene_qc_pass is TRUE and a red cross when it is FALSE. On hover, show the content of gene_qc_note. Always link to GeneCards; for example, FLT3 links to https://www.genecards.org/cgi-bin/carddisp.pl?gene=FLT3
  - gene_type (Gene type)
  - gene_full_name (Gene full name)
  - gene_aliases (Gene aliases)
  - protein_id (Protein id): add a check indicator, a green check mark when protein_qc_pass is TRUE and a red cross when it is FALSE. On hover, show the content of protein_qc_note. For example, P12830 links to https://www.uniprot.org/uniprotkb/P12830/entry
  - protein_name (Protein name)
  - entrez_id (Entrez ID)
  - ensembl_id (Ensembl ID)
- Literature source information:
  - pmid (Pubmed ID): for example, 36708705 links to https://pubmed.ncbi.nlm.nih.gov/36708705/
  - pmcid (Pubmed Central ID): for example, PMC10014032 links to https://pmc.ncbi.nlm.nih.gov/articles/PMC10014032/
  - journal (Journal)
  - year (Year)
  - title (Article title)
  - section (Article section)
  - source (Omitted evidence): add a button named Full text that shows the full text of the article section by section, in order; the text can be retrieved by reading pmc_texts.sqlite
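The hyperlink rules listed above can be collected in one helper. The following Python sketch is only an illustration: the names MISSING, obo_link, and detail_links are hypothetical, and treating the strings "NULL" and "unknown" as missing values is an assumption based on the str() output, since the rules themselves only mention NA.

```python
from urllib.parse import quote

# Placeholders treated as "no value". Assumption: the data stores missing
# values as the strings "NULL" or "unknown", while the rules speak of NA.
MISSING = {None, "", "NA", "NULL", "unknown"}


def obo_link(curie):
    """Turn e.g. 'UBERON:0000178' into its OBO PURL, or None if unmapped."""
    if curie in MISSING:
        return None
    return "http://purl.obolibrary.org/obo/" + curie.replace(":", "_", 1)


def detail_links(record):
    """Return {field: URL or None} for the linked fields of one record."""

    def value(key):
        v = record.get(key)
        return None if v in MISSING else v

    gene, protein = value("gene_symbol"), value("protein_id")
    pmid, pmcid = value("pmid"), value("pmcid")
    return {
        "top_level_uberon_id": obo_link(record.get("top_level_uberon_id")),
        "cell_name_cl_id": obo_link(record.get("cell_name_cl_id")),
        # Fixed landing page: the rule above says no ID suffix is used.
        "disease_type_doid": "https://disease-ontology.org/do/",
        "gene_symbol": gene
        and f"https://www.genecards.org/cgi-bin/carddisp.pl?gene={quote(gene)}",
        "protein_id": protein
        and f"https://www.uniprot.org/uniprotkb/{protein}/entry",
        "pmid": pmid and f"https://pubmed.ncbi.nlm.nih.gov/{pmid}/",
        "pmcid": pmcid and f"https://pmc.ncbi.nlm.nih.gov/articles/{pmcid}/",
    }


# Illustrative record assembled from the example IDs quoted above.
example = {
    "top_level_uberon_id": "UBERON:0000178",
    "cell_name_cl_id": "NULL",
    "gene_symbol": "FLT3",
    "protein_id": "P12830",
    "pmid": "36708705",
    "pmcid": "PMC10014032",
}
links = detail_links(example)
```

A None link for top_level_uberon_id or cell_name_cl_id is where the interface would show Not mapped.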
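2. Home page
Summary statistics to display:
- Number of cells: unique cell_name_standardized
- Number of tissues: unique top_level
- Number of markers: unique gene_symbol
- Number of diseases: unique disease_type_do
- Number of articles: unique pmcid
Species distribution: species (as a pie chart or a bar chart).
An organ-tissue-cell interaction graphic could also be added.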
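The counts above could be computed as in the following Python sketch, which works on rows represented as plain dictionaries. The names STAT_COLUMNS and home_statistics are hypothetical, and skipping placeholder values such as "NULL" when counting is an assumption, not something the specification states.

```python
from collections import Counter

# Statistic label -> column of mydata, as listed above.
STAT_COLUMNS = {
    "cells": "cell_name_standardized",
    "tissues": "top_level",
    "markers": "gene_symbol",
    "diseases": "disease_type_do",
    "articles": "pmcid",
}
PLACEHOLDERS = (None, "", "NA", "NULL")  # assumption: not counted


def home_statistics(rows):
    """Return (unique counts per statistic, row counts per species)."""
    stats = {
        label: len(
            {row.get(col) for row in rows if row.get(col) not in PLACEHOLDERS}
        )
        for label, col in STAT_COLUMNS.items()
    }
    species = Counter(row["species"] for row in rows)
    return stats, species


# Illustrative rows, not taken from the database.
demo_rows = [
    {"cell_name_standardized": "B cell", "top_level": "blood",
     "gene_symbol": "CD19", "disease_type_do": "Normal",
     "pmcid": "PMC1", "species": "Human"},
    {"cell_name_standardized": "B cell", "top_level": "blood",
     "gene_symbol": "MS4A1", "disease_type_do": "Normal",
     "pmcid": "PMC1", "species": "Human"},
    {"cell_name_standardized": "T cell", "top_level": "spleen",
     "gene_symbol": "NULL", "disease_type_do": "Normal",
     "pmcid": "PMC2", "species": "Mouse"},
]
stats, species_counts = home_statistics(demo_rows)
```

Because the table is static between releases, these numbers could be computed once at build time rather than on every page load.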
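3. Search page
3.1 Direct Search
Search for a term and return every row of the database in which it appears (broad search).
Provide some examples (clicking one writes it into the search box):
- Species: Human
- Tissue: brain
- Cell: activated CD4-positive, alpha-beta T cell
- Disease: type 1 diabetes mellitus
- Gene: CD19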
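A minimal Python sketch of such a broad search is shown below. The function name direct_search is hypothetical, and matching case-insensitively on substrings across all columns is an assumption about what "broad" should mean here.

```python
def direct_search(rows, term):
    """Return the rows in which any field contains the term.

    Matching is a case-insensitive substring test (an assumption).
    """
    needle = term.strip().casefold()
    if not needle:
        return []
    return [
        row
        for row in rows
        if any(needle in str(value).casefold() for value in row.values())
    ]


# Illustrative rows, not taken from the database.
search_rows = [
    {"species": "Human", "top_level": "blood", "gene_symbol": "CD19"},
    {"species": "Human", "top_level": "brain", "gene_symbol": "TUBB3"},
    {"species": "Mouse", "top_level": "blood", "gene_symbol": "Cd3e"},
]
hits = direct_search(search_rows, "cd19")
```

A linear scan like this is only meant to show the matching rule; at the size of mydata, a full-text index in the backing store would be the more realistic choice.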
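3.2 Advanced
Filter fields (if possible, make the filters interact in real time, so that each filter only offers options that still exist under the current selection and a "not found" result cannot occur):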
- Species
- Tissue Type
- Cell Type
- Disease Type
- Marker (Gene Symbol)
- Marker Polarity
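One way to get this interactive behaviour is to recompute, for every filter, the values that remain once all the other filters are applied. The Python sketch below illustrates the idea; the mapping of filter labels to mydata columns in FACETS is an assumption, and the function names are hypothetical.

```python
# Filter label -> column of mydata (assumed mapping).
FACETS = {
    "Species": "species",
    "Tissue Type": "top_level",
    "Cell Type": "cell_name_standardized",
    "Disease Type": "disease_type_do",
    "Marker (Gene Symbol)": "gene_symbol",
    "Marker Polarity": "marker_polarity",
}


def apply_filters(rows, selected):
    """Keep rows matching every non-empty selection {column: set of values}."""
    return [
        row
        for row in rows
        if all(row.get(col) in values for col, values in selected.items() if values)
    ]


def available_options(rows, selected):
    """For each filter, list the values still reachable given the other filters."""
    options = {}
    for label, col in FACETS.items():
        others = {c: v for c, v in selected.items() if c != col}
        options[label] = sorted({row[col] for row in apply_filters(rows, others)})
    return options


# Illustrative rows, not taken from the database.
facet_rows = [
    {"species": "Human", "top_level": "blood", "cell_name_standardized": "B cell",
     "disease_type_do": "Normal", "gene_symbol": "CD19", "marker_polarity": "positive"},
    {"species": "Human", "top_level": "brain", "cell_name_standardized": "neuronal cell",
     "disease_type_do": "glioblastoma", "gene_symbol": "TUBB3", "marker_polarity": "positive"},
    {"species": "Mouse", "top_level": "blood", "cell_name_standardized": "T cell",
     "disease_type_do": "Normal", "gene_symbol": "Cd3e", "marker_polarity": "negative"},
]
options = available_options(facet_rows, {"species": {"Human"}})
```

Excluding a filter's own selection when computing its options keeps the alternatives visible, so a user who picked Human can still switch to Mouse without clearing the form.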
Because the table has too many columns, the initial result list shows only Species, Tissue Type, Cell Type, Disease Type, Gene Symbol, Protein name, and Evidence, plus a details button.
Clicking details shows the complete record:
- Species information:
  - Species
- Tissue information:
  - top_level (Standardized tissue name)
  - top_level_uberon_id (Uberon ID): show Not mapped if NA. Needs a hyperlink, with ":" replaced by "_"; for example, UBERON:0000178 links to http://purl.obolibrary.org/obo/UBERON_0000178
  - top_level_mixed_name (Mixed tissue name): not shown if NA
  - tissue_class (Original tissue name)
- Cell information:
  - cell_name_standardized (Standardized cell name)
  - cell_name_cl_id (Cell Ontology ID): show Not mapped if NA. Needs a hyperlink, with ":" replaced by "_"; for example, CL:0002326 links to http://purl.obolibrary.org/obo/CL_0002326
  - cell_name_orig (Original cell name)
- Disease information:
  - disease_type_do (Standardized Disease type)
  - disease_type_doid (Disease Ontology ID): always link to https://disease-ontology.org/do/ (no ID suffix, because that page does not support queries)
  - disease_type (Original Disease type)
- Gene & Protein information:
  - marker (Original marker name)
  - marker_polarity (Marker polarity)
  - gene_symbol (HGNC gene symbol): add a check indicator, a green check mark when gene_qc_pass is TRUE and a red cross when it is FALSE. On hover, show the content of gene_qc_note. Always link to GeneCards; for example, FLT3 links to https://www.genecards.org/cgi-bin/carddisp.pl?gene=FLT3
  - gene_type (Gene type)
  - gene_full_name (Gene full name)
  - gene_aliases (Gene aliases)
  - protein_id (Protein id): add a check indicator, a green check mark when protein_qc_pass is TRUE and a red cross when it is FALSE. On hover, show the content of protein_qc_note. For example, P12830 links to https://www.uniprot.org/uniprotkb/P12830/entry
  - protein_name (Protein name)
  - entrez_id (Entrez ID)
  - ensembl_id (Ensembl ID)
- Literature source information:
  - pmid (Pubmed ID): for example, 36708705 links to https://pubmed.ncbi.nlm.nih.gov/36708705/
  - pmcid (Pubmed Central ID): for example, PMC10014032 links to https://pmc.ncbi.nlm.nih.gov/articles/PMC10014032/. Add a button named Full text that shows the full text of the article section by section, in order; the text can be retrieved from pmc_texts.sqlite by PMC ID.
  - journal (Journal)
  - year (Year)
  - title (Article title)
  - section (Article section)
  - source (Omitted evidence): highlight the values of cell_name_orig and marker within the source text
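The highlighting of cell_name_orig and marker inside source could look like the Python sketch below. The function name highlight_source is hypothetical, and the choice of the HTML mark element and of case-insensitive matching are assumptions; the text is HTML-escaped piece by piece so that the evidence sentence cannot inject markup.

```python
import html
import re


def highlight_source(source, terms):
    """Wrap every occurrence of the given terms in <mark>...</mark>.

    Matching is case-insensitive (an assumption). Longer terms are tried
    first so that a short term cannot split a longer one.
    """
    terms = sorted({t for t in terms if t}, key=len, reverse=True)
    if not terms:
        return html.escape(source)
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    parts, last = [], 0
    for match in pattern.finditer(source):
        parts.append(html.escape(source[last:match.start()]))
        parts.append("<mark>" + html.escape(match.group(0)) + "</mark>")
        last = match.end()
    parts.append(html.escape(source[last:]))
    return "".join(parts)


# Illustrative sentence; the terms stand for cell_name_orig and marker.
highlighted = highlight_source(
    "N-cadherin marks tumour cell <x>", ["tumour cell", "N-cadherin"]
)
```

Very short markers such as "N" would match inside many unrelated words, so some form of word-boundary matching may be needed in practice.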
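4. Browse page
Browse is organized as a tree: the user first selects a tissue category and then a tissue within it. The categories come from classifications; the tissues are the top_level values.
mydata has been updated with this column. When a tissue belongs to several categories, they are separated by semicolons: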
> str(mydata)
> ......
$ classifications : chr "body regions" "body regions" "body regions" "circulatory system; body fluids" ...
The layout is similar to http://www.bio-bigdata.center/CellMarkerBrowse.jsp
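The two-level tree can be derived directly from the classifications and top_level columns. The Python sketch below shows one way to do it; the function name build_browse_tree is hypothetical and the demo rows are illustrative, not taken from the database.

```python
def build_browse_tree(rows):
    """Return {classification: sorted list of top_level tissues}.

    A tissue listed under several semicolon-separated classifications
    appears under each of them.
    """
    tree = {}
    for row in rows:
        for classification in row["classifications"].split(";"):
            classification = classification.strip()
            if classification:
                tree.setdefault(classification, set()).add(row["top_level"])
    return {name: sorted(tissues) for name, tissues in sorted(tree.items())}


# Illustrative rows only.
browse_rows = [
    {"classifications": "circulatory system; body fluids", "top_level": "blood"},
    {"classifications": "body regions", "top_level": "brain"},
    {"classifications": "body regions", "top_level": "brain"},
]
browse_tree = build_browse_tree(browse_rows)
```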
5. Cell Annotation page
5.1 Backend code
The user uploads a Seurat object (example_pbmc_input.rds in the commands below), and the analysis produces the annotation files.
Wrapped commands (bash):
cd /mnt/workdir/cellmarker/EvidenCellMarker
# Disease + Normal
Rscript ./cell_annotation/annotate_cells.R \
--input ./data/example_pbmc_input.rds \
--species Human \
--tissue "blood" \
--n_variable_features 2000 \
--dims 30 \
--cluster_resolutions "0.3,0.5,1.0" \
--min_markers_per_cell 2 \
--disease_type "Normal,acute myeloid leukemia,aplastic anemia" \
--output_dir ./results/example_full \
--n_threads 8 \
--random_seed 1234
# Normal only
Rscript ./cell_annotation/annotate_cells.R \
--input ./data/example_pbmc_input.rds \
--species Human \
--tissue "blood" \
--n_variable_features 2000 \
--dims 30 \
--cluster_resolutions "0.3,0.5,1.0" \
--min_markers_per_cell 2 \
--disease_type "Normal" \
--output_dir ./results/example_full \
--n_threads 32 \
--random_seed 1234
# Disease only
Rscript ./cell_annotation/annotate_cells.R \
--input ./data/example_pbmc_input.rds \
--species Human \
--tissue "blood" \
--n_variable_features 2000 \
--dims 30 \
--cluster_resolutions "0.3,0.5,0.8" \
--min_markers_per_cell 2 \
--disease_type "acute myeloid leukemia" \
--output_dir ./results/example_full \
--n_threads 8 \
--random_seed 1234
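Parameter settings:
- input: in the web app, the path of the file uploaded by the user.
- species: one of Human, Mouse, or Rat; selected online by the user.
- tissue: values come from mydata$top_level; selected online by the user.
- n_variable_features: default 2000, maximum 3000, minimum 1000, in steps of 500; selected online by the user.
- dims: default 30, maximum 40, minimum 10, in steps of 5; must be an integer; selected online by the user.
- cluster_resolutions: multiple values allowed; default 0.8; at most 3 values; the largest allowed value is 1.5.
- min_markers_per_cell: minimum number of markers per cell; default 2, minimum 1; must be an integer.
- disease_type: values come from mydata$disease_type_do; the default is Normal, which selects normal cells. Several values, for example "Normal,acute myeloid leukemia,aplastic anemia", select cells from several disease states.
- output_dir: in the web app, use one fixed output root and generate a unique random link for each submission.
- n_threads: fixed at 8; mainly used for multi-threaded annotation in sctype.
- random_seed: default 1234; the seed for dimensionality reduction and clustering, so that results are reproducible.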
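The constraints above can be checked on the server before the job is launched. The Python sketch below validates the form values and builds the argument list for annotate_cells.R. The function name build_annotation_job, the results_root default, and the use of a random token as the per-submission directory name are assumptions; the lower bound of 0 for a resolution is also assumed, since only the maximum of 1.5 is specified.

```python
import secrets

ALLOWED_SPECIES = ("Human", "Mouse", "Rat")


def build_annotation_job(input_path, species, tissue, disease_types,
                         n_variable_features=2000, dims=30,
                         cluster_resolutions=(0.8,), min_markers_per_cell=2,
                         random_seed=1234, results_root="./results"):
    """Validate the form values and return (job_id, argv)."""
    if species not in ALLOWED_SPECIES:
        raise ValueError("species must be Human, Mouse or Rat")
    if n_variable_features not in range(1000, 3001, 500):
        raise ValueError("n_variable_features must be 1000..3000 in steps of 500")
    if dims not in range(10, 41, 5):
        raise ValueError("dims must be 10..40 in steps of 5")
    if not 1 <= len(cluster_resolutions) <= 3:
        raise ValueError("choose between 1 and 3 cluster resolutions")
    if any(not 0 < r <= 1.5 for r in cluster_resolutions):
        raise ValueError("each resolution must be in (0, 1.5]")
    if not isinstance(min_markers_per_cell, int) or min_markers_per_cell < 1:
        raise ValueError("min_markers_per_cell must be an integer >= 1")
    if not disease_types:
        raise ValueError("at least one disease_type is required")

    job_id = secrets.token_urlsafe(16)  # hypothetical unique-link scheme
    argv = [
        "Rscript", "./cell_annotation/annotate_cells.R",
        "--input", input_path,
        "--species", species,
        "--tissue", tissue,
        "--n_variable_features", str(n_variable_features),
        "--dims", str(dims),
        "--cluster_resolutions",
        ",".join(str(float(r)) for r in cluster_resolutions),
        "--min_markers_per_cell", str(min_markers_per_cell),
        "--disease_type", ",".join(disease_types),
        "--output_dir", f"{results_root}/{job_id}",
        "--n_threads", "8",  # fixed, as stated above
        "--random_seed", str(random_seed),
    ]
    return job_id, argv


job_id, argv = build_annotation_job(
    "./data/example_pbmc_input.rds", "Human", "blood",
    ["Normal", "acute myeloid leukemia"], cluster_resolutions=[0.3, 0.5, 1.0],
)
```

Passing argv as a list to a process launcher, instead of joining it into one shell string, avoids quoting problems with values such as "acute myeloid leukemia". Checking tissue and disease_types against the values present in mydata is left out here.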
5.2 Results display
Example command:
Rscript ./cell_annotation/annotate_cells.R \
--input ./data/example_pbmc_input.rds \
--species Human \
--tissue "blood" \
--n_variable_features 2000 \
--dims 30 \
--cluster_resolutions "0.3,0.5,1.0" \
--min_markers_per_cell 2 \
--disease_type "Normal" \
--output_dir ./results/example_full \
--n_threads 32 \
--random_seed 1234
Structure of the results:
(base) server@server-MS03-CE0:/mnt/workdir/cellmarker/EvidenCellMarker/results/example_full$ tree
.
├── annotated_seurat.RDS
├── resolution_0.30
│ ├── cluster_markers_all.csv
│ ├── cluster_markers_significant.csv
│ ├── llm_decisions.csv
│ ├── llm_prompts_log.csv
│ ├── umap_cluster_annotation.pdf
│ ├── umap_cluster_annotation.png
│ ├── umap_marker_dotplot.pdf
│ └── umap_marker_dotplot.png
├── resolution_0.50
│ ├── cluster_markers_all.csv
│ ├── cluster_markers_significant.csv
│ ├── llm_decisions.csv
│ ├── llm_prompts_log.csv
│ ├── umap_cluster_annotation.pdf
│ ├── umap_cluster_annotation.png
│ ├── umap_marker_dotplot.pdf
│ └── umap_marker_dotplot.png
├── resolution_1.00
│ ├── cluster_markers_all.csv
│ ├── cluster_markers_significant.csv
│ ├── llm_decisions.csv
│ ├── llm_prompts_log.csv
│ ├── umap_cluster_annotation.pdf
│ ├── umap_cluster_annotation.png
│ ├── umap_marker_dotplot.pdf
│ └── umap_marker_dotplot.png
└── summary.txt
4 directories, 26 files
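The results are split into 3 panels, one per resolution; every resolution contains the same set of outputs.
5.2.1 Figures
Each panel shows the following PNG images (relative paths) and offers a download button for the PDF file with the same prefix:
- Wrapped in a box titled UMAP plot: ./results/example_full/resolution_0.50/umap_cluster_annotation.png
- Wrapped in a box titled Dot plot: ./results/example_full/resolution_0.50/umap_marker_dotplot.png
5.2.2 Tables
- Wrapped in a box titled LLM decision process: ./results/example_full/resolution_0.50/llm_decisions.csv
The CSV has the following columns:
- "cluster": stored as a string; convert it to numeric and sort in ascending numeric order.
- "selected_celltype": the cell name finally selected by the LLM.
- "confidence": one of "high", "medium", "low"; these could be shown directly as green ("high"), yellow ("medium"), and red ("low").
- "reasoning": the reasoning process, a long passage of text.
- "key_markers_validated": the marker genes that were key to the annotation.
- "overlap_genes_all": this column does not need to be displayed.
- "overlap_genes_pos": intersection of the cluster's marker genes with the cell's positive marker genes.
- "overlap_genes_neg": intersection of the cluster's marker genes with the cell's negative marker genes.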
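The column handling described above could be done as in the following Python sketch before the table is rendered. The function name prepare_decisions, the extra confidence_color field, and the grey fallback for unexpected confidence values are assumptions; the demo CSV is illustrative and contains only some of the columns.

```python
import csv
import io

CONFIDENCE_COLORS = {"high": "green", "medium": "yellow", "low": "red"}
HIDDEN_COLUMNS = ("overlap_genes_all",)


def prepare_decisions(csv_text):
    """Parse llm_decisions.csv content into display-ready rows."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for row in rows:
        row["cluster"] = int(row["cluster"])  # string -> number
        row["confidence_color"] = CONFIDENCE_COLORS.get(
            row["confidence"].strip().lower(), "grey"
        )
        for column in HIDDEN_COLUMNS:
            row.pop(column, None)
    return sorted(rows, key=lambda row: row["cluster"])


# Illustrative content only.
demo_csv = (
    "cluster,selected_celltype,confidence,overlap_genes_all\n"
    "10,B cell,low,CD19\n"
    "2,T cell,high,CD3E\n"
)
decisions = prepare_decisions(demo_csv)
```

Sorting after the conversion matters: as strings, "10" would sort before "2".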
Provide two download buttons: (1) Download all cluster markers, for ./results/example_full/resolution_0.50/cluster_markers_all.csv; (2) Download significant cluster markers, for ./results/example_full/resolution_0.50/cluster_markers_significant.csv
6. Cell Score page
6.1 Backend code
The user uploads a Seurat object, and cell scores are computed according to the selected tissue and cell.
Wrapped command (bash):
Rscript ./cell_score/ucell_demo.R \
--species Human \
--tissue brain \
--disease_name glioblastoma \
--cell_name "malignant cell" \
--polarity positive \
--n_threads 4 \
--outdir ./results/cell_score_positive_example_6threads \
--input_seurat ./data/example_glioblastoma_input.rds \
--marker_rds ./data/mydata.RDS \
--type_in_metadata cell_type \
--target_cells "malignant cell"
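Parameter settings:
- species: Human, Mouse, or Rat; selected online by the user (required).
- tissue: values come from mydata$top_level; selected online by the user (required).
- disease_name: values come from mydata$disease_type_do; selected online by the user (required).
- cell_name: values come from mydata$cell_name_standardized; selected online by the user (required).
- polarity: values come from mydata$marker_polarity; selected online by the user (required).
- n_threads: fixed at 4 (to be decided according to the configuration of the final deployment server; with 4 threads, memory usage is approximately ...).
- outdir: in the web app, use one fixed output root and generate a unique random link for each submission.
- input_seurat: in the web app, the path of the file uploaded by the user (required).
- marker_rds: fixed to ./data/mydata.RDS.
- type_in_metadata: the name of the metadata column of input_seurat that holds the cell type annotation (optional).
- target_cells: the value to display from that column, i.e. the name of the target cells among the annotated cells; it must be a value present in the column named by type_in_metadata (optional).
6.2 Results display
Example command: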
Rscript ./cell_score/ucell_demo.R \
--species Human \
--tissue brain \
--disease_name glioblastoma \
--cell_name "malignant cell" \
--polarity positive \
--n_threads 6 \
--outdir ./results/cell_score_positive_example \
--input_seurat ./data/example_glioblastoma_input.rds \
--marker_rds ./data/mydata.RDS \
--type_in_metadata cell_type \
--target_cells "malignant cell"
Tree of the result directory:
(base) server@server-MS03-CE0:/mnt/workdir/cellmarker/EvidenCellMarker$ tree results/cell_score_positive_example
results/cell_score_positive_example
├── Boxplot_cell_signatures_by_target_cell_type_malignant_cell.pdf
├── Boxplot_cell_signatures_by_target_cell_type_malignant_cell.png
├── CustomPlot_cell_type_umap.pdf
├── CustomPlot_cell_type_umap.png
├── FeaturePlot_cell_signatures_umap.pdf
├── FeaturePlot_cell_signatures_umap.png
├── meta_with_ucell_scores.csv
├── TargetCellsPlot_cell_type_malignant_cell_umap.pdf
└── TargetCellsPlot_cell_type_malignant_cell_umap.png
1 directory, 9 files
6.2.1 Figures
Which figures are available depends on the parameters:
- Case 1: both type_in_metadata and target_cells are set. All files are produced; display them in this order:
(1) Web title: Cell type plot
./results/cell_score_positive_example/CustomPlot_cell_type_umap.png
(2) Web title: Cell score plot
./results/cell_score_positive_example/FeaturePlot_cell_signatures_umap.png
(3) Web title: Target cells distribution plot
./results/cell_score_positive_example/TargetCellsPlot_cell_type_malignant_cell_umap.png
(4) Web title: Boxplot of scores
./results/cell_score_positive_example/Boxplot_cell_signatures_by_target_cell_type_malignant_cell.png
- Case 2: type_in_metadata is set but target_cells is not. Figures (3) and (4) are not produced.
- Case 3: neither type_in_metadata nor target_cells is set. Figures (1), (3), and (4) are not produced.
Note: every figure is provided in PNG format together with a PDF file of the same name.
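The three cases can be reduced to a small lookup, as in the Python sketch below. It selects figures by file-name prefix rather than by full name, which is an assumption based on the example directory listing; the names PLOT_SPECS and select_plots are hypothetical.

```python
# (file prefix, web title, needs type_in_metadata, needs target_cells)
PLOT_SPECS = [
    ("CustomPlot_", "Cell type plot", True, False),
    ("FeaturePlot_", "Cell score plot", False, False),
    ("TargetCellsPlot_", "Target cells distribution plot", True, True),
    ("Boxplot_", "Boxplot of scores", True, True),
]


def select_plots(filenames, has_type, has_target):
    """Return [(title, png_name, pdf_name or None)] in display order."""
    if has_target and not has_type:
        raise ValueError("target_cells requires type_in_metadata")
    shown = []
    for prefix, title, needs_type, needs_target in PLOT_SPECS:
        if (needs_type and not has_type) or (needs_target and not has_target):
            continue
        for name in sorted(filenames):
            if name.startswith(prefix) and name.endswith(".png"):
                pdf = name[:-4] + ".pdf"
                shown.append((title, name, pdf if pdf in filenames else None))
    return shown


# File names copied from the example result directory.
example_files = {
    "Boxplot_cell_signatures_by_target_cell_type_malignant_cell.pdf",
    "Boxplot_cell_signatures_by_target_cell_type_malignant_cell.png",
    "CustomPlot_cell_type_umap.pdf",
    "CustomPlot_cell_type_umap.png",
    "FeaturePlot_cell_signatures_umap.pdf",
    "FeaturePlot_cell_signatures_umap.png",
    "meta_with_ucell_scores.csv",
    "TargetCellsPlot_cell_type_malignant_cell_umap.pdf",
    "TargetCellsPlot_cell_type_malignant_cell_umap.png",
}
case1 = select_plots(example_files, has_type=True, has_target=True)
```

Matching on prefixes avoids having to reconstruct the exact file names, which embed the metadata column and the target cell name.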
6.2.2 Tables
In every case, display the table results/cell_score_positive_example/meta_with_ucell_scores.csv and provide a download button for it.
head results/cell_score_positive_example/meta_with_ucell_scores.csv
"","nCount_RNA","nFeature_RNA","cell_type","cell_signatures","umap_1","umap_2"
"PJ017_0",14882,4710,"malignant cell",0.382159413787947,-1.72728063129287,4.9318121515492
"PJ017_1",13889,4951,"malignant cell",0.300994466876028,-0.43518288396697,5.81114612448011
"PJ017_2",12877,4017,"macrophage",0.0835327251881761,-6.45049966834884,-6.57972015512194
"PJ017_3",12742,3993,"malignant cell",0.377698021035841,-1.36464430354934,4.81031928884779
"PJ017_4",12775,4280,"malignant cell",0.368226907930811,-1.43739284538131,4.84666381704603
"PJ017_5",12530,3919,"malignant cell",0.353559144608943,-1.76326598190169,4.60113368856703
"PJ017_6",11919,3874,"mural cell",0.182717710981506,-10.1987182047258,4.58634935247694
"PJ017_7",12183,4103,"malignant cell",0.380389811076217,-1.38707925342422,4.72182260381971
"PJ017_8",12079,3693,"malignant cell",0.334280444643836,-1.86855424903731,4.69542299139295
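Supplement. Other features
1. Email notification
Because the analyses take a long time, a unique URL is generated as soon as a job is submitted. The user can open this URL to access the results, and is notified by email when the analysis has finished.
2. Job Status
Lets users query the status of current jobs (so that they can see the server load and plan when to run their analyses).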
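The unique link, the status query, and the notification could share one small job registry. The Python sketch below keeps everything in memory and only builds the email message without sending it; the class and method names, the four state names, and the base URL are all hypothetical.

```python
import secrets
from collections import Counter
from dataclasses import dataclass, field
from email.message import EmailMessage

STATES = ("queued", "running", "finished", "failed")  # assumed state names


@dataclass
class Job:
    email: str
    token: str = field(default_factory=lambda: secrets.token_urlsafe(16))
    state: str = "queued"


class JobRegistry:
    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")
        self.jobs = {}

    def submit(self, email):
        """Register a job and return it; its token forms the unique URL."""
        job = Job(email=email)
        self.jobs[job.token] = job
        return job

    def result_url(self, job):
        return f"{self.base_url}/{job.token}"

    def set_state(self, token, state):
        if state not in STATES:
            raise ValueError(f"unknown state: {state}")
        self.jobs[token].state = state

    def load(self):
        """Number of jobs per state, for the Job Status page."""
        counts = Counter(job.state for job in self.jobs.values())
        return {state: counts.get(state, 0) for state in STATES}

    def completion_email(self, job):
        """Build (but do not send) the notification message."""
        message = EmailMessage()
        message["To"] = job.email
        message["Subject"] = "EvidenCellMarker analysis finished"
        message.set_content(
            f"Your analysis has finished.\nResults: {self.result_url(job)}\n"
        )
        return message


registry = JobRegistry("https://example.org/jobs/")
job = registry.submit("user@example.org")
```

A deployed version would persist the jobs (the status table in pmc_texts.sqlite may or may not be related) and hand the message to a mail server, for example with smtplib.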
3. Download page
To be written
4. Help page
To be written