Paper title
Non-linear Mediation Analysis with High-dimensional Mediators whose Causal Structure is Unknown
Paper authors
Paper abstract
With multiple potential mediators on the causal pathway from a treatment to an outcome, we consider the problem of decomposing the effects along the multiple possible causal paths through each distinct mediator. Under Pearl's path-specific effects framework (Pearl, 2001; Avin et al., 2005), such fine-grained decompositions necessitate stringent assumptions, such as correct specification of the causal structure among the mediators and the absence of unobserved confounding among the mediators. In contrast, interventional direct and indirect effects for multiple mediators (Vansteelandt and Daniel, 2017) can be identified under much weaker conditions, while providing scientifically relevant causal interpretations. Nonetheless, current estimation approaches require (correctly) specifying a model for the joint mediator distribution, which can be difficult when there is a high-dimensional set of possibly continuous and non-continuous mediators. In this article, we avoid the need to model this distribution by developing a definition of interventional effects previously suggested by VanderWeele and Tchetgen Tchetgen (2017) for longitudinal mediation. We propose a novel estimation strategy that uses non-parametric estimates of the (counterfactual) mediator distributions. Non-continuous outcomes can be accommodated using non-linear outcome models. Estimation proceeds via Monte Carlo integration. The procedure is illustrated using publicly available genomic data (Huang and Pan, 2016) to assess the causal effect of the expression of a microRNA on the three-month mortality of brain cancer patients, an effect that is potentially mediated by the expression values of multiple genes.
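The general idea of combining a non-parametric mediator distribution with Monte Carlo integration can be illustrated with a minimal sketch. It is not the estimator proposed in the paper: it assumes a randomised binary treatment with no covariates, takes the empirical joint distribution of the mediators within each treatment arm as the non-parametric estimate, and computes only the joint indirect effect through all mediators rather than a mediator-by-mediator decomposition. The function names, the toy data, and the logistic outcome model `toy_outcome_prob` are all hypothetical.

```python
import numpy as np


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-x))


def mc_mean_outcome(outcome_prob, a, mediator_pool, n_draws, rng):
    """Monte Carlo estimate of the mean outcome under treatment `a` when the
    mediators are drawn from the empirical distribution in `mediator_pool`.

    Whole rows are resampled, so dependence among the mediators is kept
    without modelling their joint distribution.
    """
    idx = rng.integers(0, mediator_pool.shape[0], size=n_draws)
    draws = mediator_pool[idx]
    return float(np.mean(outcome_prob(a, draws)))


def interventional_effects(outcome_prob, A, M, n_draws=10_000, seed=0):
    """Direct, joint indirect, and total effects on the risk-difference scale.

    Assumption: treatment is randomised and there are no covariates, so the
    mediator distribution under treatment level a is estimated by the
    empirical distribution of M among subjects with A == a.
    """
    rng = np.random.default_rng(seed)
    M1, M0 = M[A == 1], M[A == 0]
    y11 = mc_mean_outcome(outcome_prob, 1, M1, n_draws, rng)
    y10 = mc_mean_outcome(outcome_prob, 1, M0, n_draws, rng)
    y00 = mc_mean_outcome(outcome_prob, 0, M0, n_draws, rng)
    return {"direct": y10 - y00, "indirect": y11 - y10, "total": y11 - y00}


# Toy data: one continuous and one binary mediator, both shifted by treatment.
data_rng = np.random.default_rng(1)
n = 2000
A = data_rng.integers(0, 2, size=n)
M = np.column_stack([
    data_rng.normal(0.5 * A, 1.0),
    data_rng.binomial(1, expit(-0.5 + A)),
])


def toy_outcome_prob(a, m):
    """Hypothetical non-linear (logistic) outcome model, treated as known."""
    return expit(-1.0 + 0.4 * a + 0.6 * m[:, 0] + 0.8 * m[:, 1])


effects = interventional_effects(toy_outcome_prob, A, M)
print(effects)
```

In practice the outcome model would be fitted to data rather than assumed known, and covariates would need to be handled when treatment is not randomised. The sketch only shows why resampling observed mediator vectors removes the need for a parametric joint mediator model.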