Paper Title

Data-Driven Learning of Boolean Networks and Functions by Optimal Causation Entropy Principle (BoCSE)

Authors

Jie Sun, Abd AlRahman AlMomani, Erik Bollt

Abstract

Boolean functions and networks are commonly used in the modeling and analysis of complex biological systems, and this paradigm is highly relevant in other important areas of data science and decision making, such as the medical field and the finance industry. Automated learning of a Boolean network and its Boolean functions from data is a challenging task, due in part to the large number of unknowns to be estimated (including both the structure of the network and the functions), for which a brute-force approach would be exponentially complex. In this paper we develop a new information-theoretic methodology that we show to be significantly more efficient than previous approaches. Building on the recently developed optimal causation entropy principle (oCSE), which we proved can correctly infer networks while distinguishing direct from indirect connections, we develop an efficient algorithm that also infers a Boolean network (including both its structure and its functions) from data observed from the evolving states at the nodes. We call this new inference method Boolean optimal causation entropy (BoCSE), and we show that it is both computationally efficient and resilient to noise. Furthermore, it allows for the selection of a set of features that best explains the process, a result that can be described as a networked Boolean function reduced-order model. We apply our method to feature selection in several real-world examples: (1) diagnosis of urinary diseases, (2) cardiac SPECT diagnosis, (3) informative positions in the game Tic-Tac-Toe, and (4) risk causality analysis of loans in default status. Our proposed method is effective and efficient in all examples.
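
To make the idea of entropy-based feature selection for a Boolean function concrete, here is a minimal sketch. It is not the authors' implementation: it assumes a greedy forward pass that adds the feature with the largest estimated conditional mutual information, a backward pass that removes redundant features, and a majority-vote truth table over the selected features. The function names, the synthetic data, and the fixed threshold `tol` (used in place of a proper statistical significance test) are all illustrative assumptions.

```python
import math
import random
from collections import Counter


def entropy(rows):
    """Shannon entropy (in bits) of the empirical distribution over hashable rows."""
    n = len(rows)
    return -sum(c / n * math.log2(c / n) for c in Counter(rows).values())


def cond_mutual_info(X, y, j, S):
    """Plug-in estimate of I(y; X_j | X_S) = H(y,S) + H(j,S) - H(y,j,S) - H(S)."""
    s = [tuple(r[k] for k in S) for r in X]
    ys = [(yi,) + si for yi, si in zip(y, s)]
    js = [(r[j],) + si for r, si in zip(X, s)]
    yjs = [(yi, r[j]) + si for yi, r, si in zip(y, X, s)]
    return entropy(ys) + entropy(js) - entropy(yjs) - entropy(s)


def select_features(X, y, tol=0.01):
    """Greedy forward selection followed by backward elimination.

    `tol` is a hypothetical fixed threshold, assumed here for simplicity.
    """
    d = len(X[0])
    S = []
    # Forward pass: add the most informative remaining feature, given S.
    while len(S) < d:
        gains = {j: cond_mutual_info(X, y, j, S) for j in range(d) if j not in S}
        best = max(gains, key=gains.get)
        if gains[best] <= tol:
            break
        S.append(best)
    # Backward pass: drop features that add nothing given the others.
    for j in list(S):
        rest = [k for k in S if k != j]
        if cond_mutual_info(X, y, j, rest) <= tol:
            S = rest
    return sorted(S)


def fit_truth_table(X, y, S):
    """Majority-vote output for each observed input pattern on features S."""
    votes = {}
    for r, yi in zip(X, y):
        key = tuple(r[k] for k in S)
        votes.setdefault(key, Counter())[yi] += 1
    return {key: c.most_common(1)[0][0] for key, c in votes.items()}


# Synthetic example: 5 Boolean features, output depends only on x0 AND x2.
rng = random.Random(0)
X = [tuple(rng.randint(0, 1) for _ in range(5)) for _ in range(2000)]
y = [r[0] & r[2] for r in X]

selected = select_features(X, y)
table = fit_truth_table(X, y, selected)
print(selected)
print(table)
```

On this synthetic data the sketch should recover features 0 and 2 and the truth table of AND. A greedy criterion of this kind can miss functions such as XOR, where no single input is informative on its own, so the example should be read as an illustration of the selection principle rather than a general-purpose learner.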
