论文标题
有效的干预设计,用于与潜伏期的因果发现
Efficient Intervention Design for Causal Discovery with Latents
论文作者
论文摘要
我们考虑在存在潜在变量的情况下恢复因果图,我们试图最大程度地减少恢复过程中使用的干预措施的成本。我们考虑两种干预成本模型:(1)一个线性成本模型,其中变量子集的干预成本具有线性形式,以及(2)身份成本模型,其中干预成本相同,无论其启动什么变量,即目标是最小的干预措施。在线性成本模型下,我们给出了一种算法来确定基本因果图的祖先关系,并在最佳干预成本的$ 2 $因素范围内实现。在某些温和的限制下,对于任何$ε> 0 $的近似因素,可以提高到$ 1+ε$。在身份成本模型下,我们使用因果图的参数化通过特殊类型的围墙来绑定恢复整个因果图所需的干预措施,包括潜在变量。特别是,我们介绍了$ p $ - 胶合器的概念,这些概念是由因果图中特定类型的调节类型引起的一对节点之间的cool,并在Causal图中任何两个节点之间的最大$ p $ - collive量的最大数量函数上提供了干预量的上限。
We consider recovering a causal graph in presence of latent variables, where we seek to minimize the cost of interventions used in the recovery process. We consider two intervention cost models: (1) a linear cost model where the cost of an intervention on a subset of variables has a linear form, and (2) an identity cost model where the cost of an intervention is the same, regardless of what variables it is on, i.e., the goal is just to minimize the number of interventions. Under the linear cost model, we give an algorithm to identify the ancestral relations of the underlying causal graph, achieving within a $2$-factor of the optimal intervention cost. This approximation factor can be improved to $1+ε$ for any $ε> 0$ under some mild restrictions. Under the identity cost model, we bound the number of interventions needed to recover the entire causal graph, including the latent variables, using a parameterization of the causal graph through a special type of colliders. In particular, we introduce the notion of $p$-colliders, that are colliders between pair of nodes arising from a specific type of conditioning in the causal graph, and provide an upper bound on the number of interventions as a function of the maximum number of $p$-colliders between any two nodes in the causal graph.