1 Introduction

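Advances in high-throughput sequencing of the 16S rRNA gene have made direct analysis of microbial communities in the environment and in the human body more convenient and reliable. Inferring correlations between community members is an important part of genomic survey studies. Standard Pearson correlation analysis, which treats the observed data as absolute abundances, can produce spurious results because the data only represent relative abundances. Correlation analysis of such compositional data therefore requires particular care and methods designed for it.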
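The compositional effect described above can be reproduced with a few lines of simulation. The sketch below is only an illustration under stated assumptions: the function names, the log-normal abundance model, and all parameter values are hypothetical and do not come from SparCC. It draws independent absolute abundances for three taxa, normalises each sample to fractions, and compares the Pearson correlation before and after normalisation. With three exchangeable taxa the fractions are expected to correlate at about -0.5 even though the underlying abundances are independent.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def closure_demo(n_samples=2000, n_taxa=3, seed=0):
    """Correlation of taxa 0 and 1 before and after normalising to fractions."""
    rng = random.Random(seed)
    # Independent log-normal "absolute" abundances: the true correlation is zero.
    absolute = [[rng.lognormvariate(0.0, 0.5) for _ in range(n_taxa)]
                for _ in range(n_samples)]
    # Closure: divide each sample by its total, so every row sums to 1.
    fractions = [[v / sum(row) for v in row] for row in absolute]
    r_abs = pearson([r[0] for r in absolute], [r[1] for r in absolute])
    r_frac = pearson([r[0] for r in fractions], [r[1] for r in fractions])
    return r_abs, r_frac


r_abs, r_frac = closure_demo()
print(f"absolute: {r_abs:+.3f}  relative: {r_frac:+.3f}")
```

Increasing `n_taxa` weakens the induced negative correlation, which is in line with the role the text assigns to community diversity.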

SparCC aims to recover correlations between the underlying absolute abundances rather than between the observed fractions. According to the paper, "community diversity is the key factor that modulates the acuteness of such compositional effects".

2 Two assumptions

The number of different OTUs is large;
The true correlation network among taxa is sparse.
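How the two assumptions are used can be sketched in code. The block below is a simplified single-pass version of the basic SparCC approximation, not the SparCC.py implementation: it omits the iterative exclusion of strongly correlated pairs and the resampling of counts, and the names `basic_sparcc` and `simulate` as well as all parameter values are hypothetical. It relies on the identity Var[log(x_i/x_j)] = w_i^2 + w_j^2 - 2 rho_ij w_i w_j, where w_i^2 is the variance of the log abundance of taxon i, and drops the correlation terms when summing over many taxa.

```python
import math
import random


def variance(values):
    """Unbiased sample variance."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def basic_sparcc(fractions):
    """Basis correlations from rows of fractions (samples x taxa, taxa > 2)."""
    d = len(fractions[0])
    logs = [[math.log(v) for v in row] for row in fractions]
    # t[i][j] = Var[log(x_i / x_j)]; the per-sample total cancels in the ratio.
    t = [[variance([row[i] - row[j] for row in logs]) for j in range(d)]
         for i in range(d)]
    # Sparse network, many taxa: sum_j t[i][j] ~= (d - 2) * w_i^2 + sum_k w_k^2.
    t_sum = [sum(row) for row in t]
    total = sum(t_sum) / (2 * d - 2)                # estimate of sum_k w_k^2
    var = [(ti - total) / (d - 2) for ti in t_sum]  # estimates of w_i^2

    def rho(i, j):
        if i == j:
            return 1.0
        if var[i] <= 0 or var[j] <= 0:
            return float("nan")  # the approximation broke down for this pair
        return (var[i] + var[j] - t[i][j]) / (2 * math.sqrt(var[i] * var[j]))

    return [[rho(i, j) for j in range(d)] for i in range(d)]


def simulate(n_samples=1000, n_taxa=20, rho01=0.9, seed=1):
    """Fractions where only taxa 0 and 1 are correlated (hypothetical data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(n_taxa)]
        z[1] = rho01 * z[0] + math.sqrt(1 - rho01 ** 2) * z[1]
        absolute = [math.exp(0.5 * v) for v in z]
        total = sum(absolute)
        rows.append([a / total for a in absolute])
    return rows


rho = basic_sparcc(simulate())
print(round(rho[0][1], 2), round(rho[2][3], 2))
```

In this simulated setting the estimate for the correlated pair should land close to the simulated value of 0.9, while an unrelated pair stays near zero. Nothing in the formula forces the result into [-1, 1], which is why the sketch returns NaN when an estimated variance is not positive.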

3 Script implementation

3.1 Test data

3.2 Python script

sparcc

3.3 Python command

python SparCC.py network.abundance -i 10 --cor_file=correlation.matrix --cov_file=covariance.matrix

3.4 Python results

One correlation coefficient greater than 1 appeared: 25911.5752878

Two null values appeared
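Problems like these are easy to miss in a large matrix, so a quick validity check helps. The sketch below is a hypothetical helper, not part of SparCC; it assumes the correlation matrix has already been loaded into a nested list. It reports every entry that is missing or falls outside [-1, 1]. The example matrix is made up, apart from the value 25911.5752878 reported above.

```python
import math


def find_invalid(corr, tol=1e-9):
    """Return (row, col, value) for entries that are missing or outside [-1, 1]."""
    bad = []
    for i, row in enumerate(corr):
        for j, value in enumerate(row):
            missing = value is None or (
                isinstance(value, float) and math.isnan(value))
            if missing or abs(value) > 1 + tol:
                bad.append((i, j, value))
    return bad


# Hypothetical 3 x 3 matrix; only 25911.5752878 is taken from the run above.
example = [
    [1.0, 25911.5752878, 0.2],
    [25911.5752878, 1.0, float("nan")],
    [0.2, float("nan"), 1.0],
]
for i, j, value in find_invalid(example):
    print(i, j, value)
```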

3.5 R script

SparCC.count <- function(x, imax = 10, kmax = 10, alpha = 0.1, Vmin = 1e-4) {
  # x: count matrix with samples in rows and OTUs in columns
  p <- ncol(x)  # number of OTUs
  # add a pseudocount of 1 so that every Dirichlet parameter is positive
  x <- x + 1
  # matrices that will hold the final covariance and correlation estimates
  cov.w <- cor.w <- matrix(0, p, p)
  indLow <- lower.tri(cov.w, diag = TRUE)
  # lower-triangle entries of the estimates, one column per posterior draw
  covs <- cors <- matrix(0, p * (p + 1) / 2, imax)
  for (i in 1:imax) {
    # draw the fractions of each sample from its Dirichlet posterior
    y <- t(apply(x, 1, function(counts)
      gtools::rdirichlet(n = 1, alpha = counts)))
    # estimate the covariance and correlation matrices
    # (SparCC.frac is not shown in this document)
    cov_cor <- SparCC.frac(x = y, kmax = kmax, alpha = alpha, Vmin = Vmin)
    # keep the lower triangle, including the diagonal
    covs[, i] <- cov_cor$cov.w[indLow]
    cors[, i] <- cov_cor$cor.w[indLow]
  }
  # element-wise median across the posterior draws
  cov.w[indLow] <- apply(covs, 1, median)
  cor.w[indLow] <- apply(cors, 1, median)
  # rebuild the symmetric matrices from the lower triangle
  cov.w <- cov.w + t(cov.w)
  diag(cov.w) <- diag(cov.w) / 2
  cor.w <- cor.w + t(cor.w)
  diag(cor.w) <- 1
  return(list(cov.w = cov.w, cor.w = cor.w))
}
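The resampling scheme in the R function (add a pseudocount, draw fractions from a Dirichlet distribution, estimate, then take the element-wise median) can also be sketched in Python. This is an illustration only. All function names are hypothetical, the Dirichlet draw is built from normalised gamma variates using the standard library, and `second_moments` is a stand-in estimator used purely to make the example runnable; it is not SparCC.frac.

```python
import random
import statistics


def dirichlet_fractions(counts, rng):
    """One Dirichlet(counts + 1) draw, built from independent gamma variates."""
    gammas = [rng.gammavariate(c + 1.0, 1.0) for c in counts]
    total = sum(gammas)
    return [g / total for g in gammas]


def median_over_draws(count_table, estimator, n_draws=10, seed=0):
    """Apply `estimator` to n_draws resampled tables; take element-wise medians."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_draws):
        fractions = [dirichlet_fractions(row, rng) for row in count_table]
        results.append(estimator(fractions))
    d = len(results[0])
    return [[statistics.median(r[i][j] for r in results) for j in range(d)]
            for i in range(d)]


def second_moments(fractions):
    """Stand-in estimator (hypothetical): mean of f_i * f_j over samples."""
    n, d = len(fractions), len(fractions[0])
    return [[sum(row[i] * row[j] for row in fractions) / n for j in range(d)]
            for i in range(d)]


# Hypothetical count table: 3 samples (rows) x 3 OTUs (columns).
counts = [[3, 0, 7], [1, 4, 5], [2, 2, 6]]
summary = median_over_draws(counts, second_moments, n_draws=5)
print(len(summary), len(summary[0]))
```

Taking the median rather than the mean keeps a single extreme draw from dominating the summary, which matters when individual estimates can be far out of range.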

3.6 R command

abu <- read.table("~/Rmarkdown/cor/sparcc/abundance", sep = "\t", header = TRUE)

res_spa_count <- SparCC.count(x = t(abu[-1]))

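3.7 R results

3.8 Discussion of the script results

According to equation (14) of Inferring Correlation Networks from Genomic Survey Data, the correlation coefficients should be smaller than 1:

\[1 \gg \{\rho_{ij}\}_i\]

For small datasets, and for data whose values span several orders of magnitude, the computation easily produces anomalous values.
The Python script can also produce null values.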
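A short derivation suggests why out-of-range values are possible. The notation here is an assumption for illustration: \(\omega_i^2\) denotes the variance of the log abundance of OTU \(i\) and \(t_{ij}\) the variance of the log ratio of OTUs \(i\) and \(j\). The correlation is obtained by inverting an exact identity:

\[t_{ij} \equiv \operatorname{Var}\left[\log\frac{x_i}{x_j}\right] = \omega_i^2 + \omega_j^2 - 2\rho_{ij}\,\omega_i\,\omega_j \quad\Longrightarrow\quad \rho_{ij} = \frac{\omega_i^2 + \omega_j^2 - t_{ij}}{2\,\omega_i\,\omega_j}\]

With the true variances the right-hand side always lies in \([-1, 1]\). In practice the \(\omega_i^2\) are themselves estimates from an approximate linear system, so the bound is no longer guaranteed: when an estimated variance is close to zero the denominator becomes tiny and the magnitude of \(\rho_{ij}\) is unbounded. A non-positive variance estimate makes \(\omega_i\) undefined, which is one plausible (unverified here) source of the null values and may be what the `Vmin` floor in the R function guards against.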

Sparcc

shuchengren

2017-09-07

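1 Introduction

2 Two assumptions

3 Script implementation

3.1 Test data

3.2 Python script

3.3 Python command

3.4 Python results

3.5 R script

3.6 R command

3.7 R results

3.8 Discussion of the script results

4 Discussion

4.1 Advantages

4.2 Disadvantages

5 References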
