在Cox比例风险模型中,当需要进行中介分析时,选择正确的统计方法至关重要。传统的中介分析包(如mediation)主要为线性或广义线性模型设计,直接应用于生存数据会产生错误的结果,因为它无法正确处理删失数据。regmedint包是专门为处理生存数据中介分析而设计的,能够正确地整合Cox比例风险模型。
本报告旨在比较regmedint和mediation包在生存分析中的应用,并明确指出为何应选择regmedint。
首先,我们加载所需的R包,并读入已经创建好的模拟生存数据。
# 加载必要的库
library(survival)
library(regmedint) # 专门用于生存分析的中介分析
library(mediation) # 传统中介分析包
## Loading required package: MASS
## Loading required package: Matrix
## Loading required package: mvtnorm
## Loading required package: sandwich
## mediation: Causal Mediation Analysis
## Version: 4.5.1
library(dplyr)
##
## Attaching package: 'dplyr'
## The following object is masked from 'package:MASS':
##
## select
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
# 读入数据
data_surv <- read.csv("data_surv.csv")
# 显示数据前几行
cat("数据概览:\n")
## 数据概览:
head(data_surv)
## id age rpl_status glucose time event
## 1 1 31 0 4.956355 3.2376689 0
## 2 2 33 0 5.248053 0.5566793 1
## 3 3 47 0 3.795454 7.4540971 0
## 4 4 36 0 5.489934 3.6700368 1
## 5 5 36 0 4.664446 3.1650059 0
## 6 6 49 0 5.534679 1.7811348 0
mediation包的局限性:
regmedint包的优势:
mediation包为了说明问题,我们首先演示如果强行使用mediation包分析生存数据会发生什么。
med_model_wrong1 <- lm(glucose ~ rpl_status + age, data = data_surv)
out_model_wrong1 <- lm(time ~ rpl_status + glucose + age, data = data_surv)
med_result_wrong1 <- mediate(med_model_wrong1, out_model_wrong1,
treat = "rpl_status", mediator = "glucose",
boot = TRUE, sims = 500)
summary(med_result_wrong1)
##
## Causal Mediation Analysis
##
## Nonparametric Bootstrap Confidence Intervals with the Percentile Method
##
## Estimate 95% CI Lower 95% CI Upper p-value
## ACME -0.29743 -0.42627 -0.17082 < 2.2e-16 ***
## ADE -1.03899 -1.37329 -0.71114 < 2.2e-16 ***
## Total Effect -1.33642 -1.65360 -1.01474 < 2.2e-16 ***
## Prop. Mediated 0.22256 0.12393 0.33273 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Sample Size Used: 1000
##
##
## Simulations: 500
# This model is already created in the previous chunk
# med_model_wrong1 <- lm(glucose ~ rpl_status + age, data = data_surv)
out_model_wrong2 <- glm(event ~ rpl_status + glucose + age,
family = binomial, data = data_surv)
med_result_wrong2 <- mediate(med_model_wrong1, out_model_wrong2,
treat = "rpl_status", mediator = "glucose",
boot = TRUE, sims = 500)
summary(med_result_wrong2)
##
## Causal Mediation Analysis
##
## Nonparametric Bootstrap Confidence Intervals with the Percentile Method
##
## Estimate 95% CI Lower 95% CI Upper p-value
## ACME (control) 0.079430 0.054594 0.103483 < 2.2e-16 ***
## ACME (treated) 0.070718 0.048109 0.093792 < 2.2e-16 ***
## ADE (control) 0.234405 0.172936 0.306690 < 2.2e-16 ***
## ADE (treated) 0.225693 0.166900 0.295641 < 2.2e-16 ***
## Total Effect 0.305122 0.243315 0.369255 < 2.2e-16 ***
## Prop. Mediated (control) 0.260321 0.174207 0.350966 < 2.2e-16 ***
## Prop. Mediated (treated) 0.231768 0.150057 0.329700 < 2.2e-16 ***
## ACME (average) 0.075074 0.051013 0.097887 < 2.2e-16 ***
## ADE (average) 0.230049 0.169465 0.300519 < 2.2e-16 ***
## Prop. Mediated (average) 0.246045 0.162183 0.338963 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Sample Size Used: 1000
##
##
## Simulations: 500
这些方法都忽略了生存数据中的删失信息,因此结果是不可靠的。
regmedint包现在我们使用regmedint包进行正确的分析。
# regmedint分析
regmed_result <- regmedint(data = data_surv,
yvar = "time",
avar = "rpl_status",
mvar = "glucose",
cvar = c("age"),
eventvar = "event",
a0 = 0, # 参考组(无RPL)
a1 = 1, # 处理组(有RPL)
m_cde = mean(data_surv$glucose), # 中介变量固定值
c_cond = mean(data_surv$age), # 协变量固定值
mreg = "linear", # 中介变量回归类型
yreg = "survCox") # 结局回归类型(Cox回归)
# 获取详细结果
summary_regmed <- summary(regmed_result)
regmedint结果解释## 主要效应指标 (风险比 HR) 及置信区间:
## - 总效应 (Total Effect, te): HR = 2.720 (95% CI: 2.256-3.281), p = 0.0000
## - 直接效应 (Pure Natural Direct Effect, pnde): HR = 2.043 (95% CI: 1.683-2.481), p = 0.0000
## - 间接效应 (Pure Natural Indirect Effect, pnie): HR = 1.236 (95% CI: 1.120-1.366), p = 0.0000
## - 中介比例: 21.2%
有RPL的人CVD发病风险显著升高,其中约五分之一的风险提升是通过血糖水平升高间接介导的,其余为直接作用。血糖在RPL与CVD之间起到显著中介作用。
对于不同类型的数据,应选择相应的方法:
regmedint:
mediation:
mediation:
对于生存数据,必须使用regmedint包进行中介分析,而不能使用传统的mediation包,以确保结果的准确性和可靠性。