R语言使用ipwtm函数对生存数据进行时点加权

本文链接：https://blog.csdn.net/dege857/article/details/130718913

文章介绍了如何使用R语言中的ipwtm函数处理生存数据，特别是针对HIV患者的数据集haartdat。该函数用于进行生存数据的加权，包括逆概率加权和逆概率删减权重，以评估HAART治疗对患者生存的影响。通过对比加权和未加权的Cox比例风险模型，展示了加权分析如何改变结果。最后，展示了如何绘制加权后的生存函数曲线。

摘要由CSDN通过智能技术生成

后台粉丝希望我介绍一下能进行生存数据时点加权的ipwtm函数，今天简单分享一下。ipwtm函数属于ipw包，我们先导入包和数据。

library(ipw)
data(haartdat)
head(haartdat,6)
##   patient tstart fuptime haartind event sex age cd4.sqrt endtime dropout
## 1       1   -100       0        0     0   1  22 23.83275    2900       0
## 2       1      0     100        0     0   1  22 25.59297    2900       0
## 3       1    100     200        0     0   1  22 23.47339    2900       0
## 4       1    200     300        0     0   1  22 24.16609    2900       0
## 5       1    300     400        0     0   1  22 23.23790    2900       0
## 6       1    400     500        0     0   1  22 24.85961    2900       0

这是一个HIV患者的生存数据，比较的是一种HAART治疗治疗方法对患者生存的影响。patient患者的ID，tstart开始随访的时间，fuptime随访终止的时间，haartind是否采用HAART治疗，event结局指标，是否死亡。sex性别，age随访时候的年龄，cd4.sqrt：CD4细胞的平方根。 ipwtm函数对生存数据加权很简单，其实就是一句话代码，exposure就是放入你的研究变量，numerator这里填入权重分子，如果按照Robins的算法这里填入1，属于不稳定权重，denominator这里属于权重分母，填入我们的协变量，它对权重的计算有点类似ipwpoint函数，不过更复杂，但是大概的原理是分子这个地方的变量生成一个模型，分母这个地方的变量生成一个模型，然后对预测值相除，当然过程中还有复杂的处理。tstart开始时间，timevar结束时间，data就是你的数据。type的类型有”first”, “cens” and “all”. “first”是权重由最低值进行转换。

temp <- ipwtm(
  exposure = haartind,
  family = "survival",
  numerator = ~ sex + age,
  denominator = ~ sex + age + cd4.sqrt,
  id = patient,
  tstart = tstart,
  timevar = fuptime,
  type = "first",
  data = haartdat)

跑出结果后权重就是在temp中
在这里插入图片描述
我们可以把权重提取出来绘图

ipwplot(weights = temp$ipw.weights, timevar = haartdat$fuptime,
        binwidth = 100, ylim = c(-1.5, 1.5), main = "Stabilized inverse probability weights")

在这里插入图片描述
这个图画法其实很简单，X就是我们的随访时间，Y轴就是权重取了个对数，设定一段宽度，每隔一段画一个箱线图。如果把ype设置为”cens”,进行的是逆概率删减权重（IPCW），我也不是很了解。

temp2 <- ipwtm(
  exposure = dropout,
  family = "survival",
  numerator = ~ sex + age,
  denominator = ~ sex + age + cd4.sqrt,
  id = patient,
  tstart = tstart,
  timevar = fuptime,
  type = "cens",
  data = haartdat)

ipwplot(weights = temp2$ipw.weights, timevar = haartdat$fuptime,
        binwidth = 100, ylim = c(-1.5, 1.5), main = "Stabilized inverse probability of censoring weights")

在这里插入图片描述
下面是加权和不加权对结果的影响
加权的

require(survival)
## 载入需要的程辑包：survival
summary(coxph(Surv(tstart, fuptime, event) ~ haartind + cluster(patient),
              data = haartdat, weights = temp$ipw.weights*temp2$ipw.weights))
## Call:
## coxph(formula = Surv(tstart, fuptime, event) ~ haartind, data = haartdat, 
##     weights = temp$ipw.weights * temp2$ipw.weights, cluster = patient)
## 
##   n= 19175, number of events= 31 
## 
##             coef exp(coef) se(coef) robust se      z Pr(>|z|)  
## haartind -0.9363    0.3921   0.4299    0.4527 -2.068   0.0386 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##          exp(coef) exp(-coef) lower .95 upper .95
## haartind    0.3921      2.551    0.1614    0.9521
## 
## Concordance= 0.597  (se = 0.029 )
## Likelihood ratio test= 5.57  on 1 df,   p=0.02
## Wald test            = 4.28  on 1 df,   p=0.04
## Score (logrank) test = 5.07  on 1 df,   p=0.02,   Robust = 4.83  p=0.03
## 
##   (Note: the likelihood ratio and score tests assume independence of
##      observations within a cluster, the Wald and robust score tests do not).

不加权的

summary(coxph(Surv(tstart, fuptime, event) ~ haartind, data = haartdat))
## Call:
## coxph(formula = Surv(tstart, fuptime, event) ~ haartind, data = haartdat)
## 
##   n= 19175, number of events= 31 
## 
##             coef exp(coef) se(coef)      z Pr(>|z|)
## haartind -0.5862    0.5565   0.4368 -1.342     0.18
## 
##          exp(coef) exp(-coef) lower .95 upper .95
## haartind    0.5565      1.797    0.2364      1.31
## 
## Concordance= 0.569  (se = 0.032 )
## Likelihood ratio test= 1.97  on 1 df,   p=0.2
## Wald test            = 1.8  on 1 df,   p=0.2
## Score (logrank) test = 1.85  on 1 df,   p=0.2

加权的haartind是0.3921，不加权的haartind是0.5565，加权后影响变小。
绘制加权后的生存函数曲线

f1<-survfit(Surv(tstart, fuptime, event) ~ haartind + cluster(patient),
            data = haartdat, weights = temp$ipw.weights*temp2$ipw.weights)
ggsurvplot(f1, data = haartdat)