Paper Title
Conformal Inference of Counterfactuals and Individual Treatment Effects
Paper Authors
Paper Abstract
Evaluating treatment effect heterogeneity widely informs treatment decision making. At the moment, much emphasis is placed on the estimation of the conditional average treatment effect via flexible machine learning algorithms. While these methods enjoy some theoretical appeal in terms of consistency and convergence rates, they generally perform poorly in terms of uncertainty quantification. This is troubling since assessing risk is crucial for reliable decision-making in sensitive and uncertain environments. In this work, we propose a conformal inference-based approach that can produce reliable interval estimates for counterfactuals and individual treatment effects under the potential outcome framework. For completely randomized or stratified randomized experiments with perfect compliance, the intervals have guaranteed average coverage in finite samples regardless of the unknown data generating mechanism. For randomized experiments with ignorable compliance and general observational studies obeying the strong ignorability assumption, the intervals satisfy a doubly robust property which states the following: the average coverage is approximately controlled if either the propensity score or the conditional quantiles of potential outcomes can be estimated accurately. Numerical studies on both synthetic and real datasets empirically demonstrate that existing methods suffer from a significant coverage deficit even in simple models. In contrast, our methods achieve the desired coverage with reasonably short intervals.
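To make the finite-sample coverage claim for randomized experiments concrete, the following is a minimal sketch of plain split conformal prediction applied to the counterfactual Y(1) of control units. It is not the authors' procedure: it assumes a constant propensity score (so no weighting is needed), uses absolute residuals around a least-squares fit as the conformity score rather than conditional quantiles, and runs on synthetic data. The function name `split_conformal_interval`, the data-generating model, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def split_conformal_interval(x_train, y_train, x_calib, y_calib, x_new, alpha=0.1):
    """Split conformal interval around a least-squares fit.

    Illustrative simplification: absolute-residual scores, no weighting.
    """
    def design(x):
        # Add an intercept column; works for 1-D or 2-D covariates.
        return np.column_stack([np.ones(len(x)), x])

    # Fit a linear model on the training fold only.
    beta, *_ = np.linalg.lstsq(design(x_train), y_train, rcond=None)

    # Conformity scores on the held-out calibration fold.
    scores = np.abs(y_calib - design(x_calib) @ beta)
    n = len(scores)

    # Finite-sample-corrected quantile: the ceil((n + 1)(1 - alpha))-th
    # smallest score, or +inf if the calibration fold is too small.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    q = np.sort(scores)[k - 1] if k <= n else np.inf

    center = design(x_new) @ beta
    return center - q, center + q


# Synthetic randomized experiment (assumed setup): treatment is assigned by a
# fair coin flip that is independent of the covariate, so treated and control
# units share the same distribution of (X, Y(1)).
rng = np.random.default_rng(0)
n_units = 4000
x = rng.uniform(-1.0, 1.0, size=n_units)
treated_mask = rng.random(n_units) < 0.5
y1 = 1.0 + 2.0 * x + rng.normal(size=n_units)  # potential outcome under treatment

# Y(1) is observed only for treated units; split them into two folds.
treated = np.flatnonzero(treated_mask)
control = np.flatnonzero(~treated_mask)
half = len(treated) // 2
train_idx, calib_idx = treated[:half], treated[half:]

lo, hi = split_conformal_interval(
    x[train_idx], y1[train_idx], x[calib_idx], y1[calib_idx], x[control], alpha=0.1
)

# In a simulation the unobserved Y(1) of control units is known, so the
# empirical coverage of the counterfactual intervals can be checked directly.
coverage = np.mean((y1[control] >= lo) & (y1[control] <= hi))
print(f"empirical coverage of Y(1) intervals on control units: {coverage:.3f}")
```

Because the intervals are calibrated on held-out treated units that are exchangeable with the control units under randomization, the average coverage should sit near the nominal 1 - alpha level even if the linear fit is misspecified. Handling observational data would additionally require reweighting the calibration scores by an estimated propensity score, which this sketch omits.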