Paper Title

Inferring Signaling Pathways with Probabilistic Programming

Authors

David Merrell, Anthony Gitter

Abstract

Cells regulate themselves via dizzyingly complex biochemical processes called signaling pathways. These are usually depicted as a network, where nodes represent proteins and edges indicate their influence on each other. In order to understand diseases and therapies at the cellular level, it is crucial to have an accurate understanding of the signaling pathways at work. Since signaling pathways can be modified by disease, the ability to infer signaling pathways from condition- or patient-specific data is highly valuable. A variety of techniques exist for inferring signaling pathways. We build on past works that formulate signaling pathway inference as a Dynamic Bayesian Network structure estimation problem on phosphoproteomic time course data. We take a Bayesian approach, using Markov Chain Monte Carlo to estimate a posterior distribution over possible Dynamic Bayesian Network structures. Our primary contributions are (i) a novel proposal distribution that efficiently samples sparse graphs and (ii) the relaxation of common restrictive modeling assumptions. We implement our method, named Sparse Signaling Pathway Sampling, in Julia using the Gen probabilistic programming language. Probabilistic programming is a powerful methodology for building statistical models. The resulting code is modular, extensible, and legible. The Gen language, in particular, allows us to customize our inference procedure for biological graphs and ensure efficient sampling. We evaluate our algorithm on simulated data and the HPN-DREAM pathway reconstruction challenge, comparing our performance against a variety of baseline methods. Our results demonstrate the vast potential for probabilistic programming, and Gen specifically, for biological network inference. The full codebase is available at https://github.com/gitter-lab/ssps.
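To make the idea of "MCMC over network structures" concrete, the following is a minimal Python sketch. It is not the authors' SSPS implementation and does not use Gen or Julia. Everything in it is an illustrative assumption: the function names (`lagged_corr`, `log_score`, `sample_structures`), the toy score built from lagged correlations, the `scale` and `penalty` parameters, and the uniform single-edge toggle, which is a simple stand-in for the sparse-graph proposal the abstract mentions. It only shows the general shape of the approach: propose a small change to the edge set, accept or reject it with the Metropolis rule, and report how often each edge appears across samples.

```python
import math
import random


def lagged_corr(xs, ys):
    """Pearson correlation between xs[t] and ys[t+1] (lag of one time step)."""
    a, b = xs[:-1], ys[1:]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def log_score(edges, strength, penalty):
    """Toy unnormalized log-posterior (an assumption, not the paper's model):
    evidence for each included edge minus a per-edge sparsity penalty."""
    return sum(strength[e] for e in edges) - penalty * len(edges)


def sample_structures(series, n_steps=5000, scale=10.0, penalty=3.0, seed=0):
    """Metropolis sampling over edge sets; returns edge inclusion frequencies."""
    rng = random.Random(seed)
    names = sorted(series)
    # In a dynamic Bayesian network an edge (parent, child) links the parent
    # at time t to the child at time t+1, so self-edges and cycles are allowed.
    pairs = [(p, c) for p in names for c in names]
    strength = {
        (p, c): scale * abs(lagged_corr(series[p], series[c])) for p, c in pairs
    }
    edges = set()
    current = log_score(edges, strength, penalty)
    counts = {e: 0 for e in pairs}
    for _ in range(n_steps):
        # Symmetric proposal: toggle one uniformly chosen edge on or off.
        toggled = rng.choice(pairs)
        proposal = edges ^ {toggled}
        proposed = log_score(proposal, strength, penalty)
        # Metropolis acceptance rule for a symmetric proposal.
        if proposed >= current or rng.random() < math.exp(proposed - current):
            edges, current = proposal, proposed
        for e in edges:
            counts[e] += 1
    return {e: counts[e] / n_steps for e in pairs}
```

Calling `sample_structures({"A": a, "B": b})` with two equal-length lists of measurements returns a dictionary mapping each `(parent, child)` pair to the fraction of samples that contained that edge, which serves as a rough estimate of its posterior probability under the toy score. A real application would replace `log_score` with a proper marginal likelihood and prior.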
