Learning Probabilistic Temporal Safety Properties from Examples in Relational Domains

Paper title

Learning Probabilistic Temporal Safety Properties from Examples in Relational Domains

Authors

Gavin Rens, Wen-Chi Yang, Jean-François Raskin, Luc De Raedt

Abstract

We propose a framework for learning formulae in a fragment of probabilistic computation tree logic (pCTL) from a set of states that are labeled as safe or unsafe. We work in a relational setting and combine ideas from relational Markov Decision Processes with pCTL model checking. More specifically, we assume that there is an unknown relational pCTL target formula that is satisfied only by safe states, and that has a horizon of at most $k$ steps and a threshold probability $\alpha$. The task then consists of learning this unknown formula from states that are labeled as safe or unsafe by a domain expert. We apply principles of relational learning to induce a pCTL formula that is satisfied by all safe states and none of the unsafe ones. This formula can then be used as a safety specification for the domain, so that the system can avoid getting into dangerous situations in the future. Following relational learning principles, we introduce a candidate formula generation process, as well as a method for deciding which candidate formula is a satisfactory specification for the given labeled states. We treat both the case where the expert knows the system policy and the case where the expert does not; much of the learning process is the same in both. We evaluate our approach on a synthetic relational domain.
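
To make the acceptance criterion in the abstract concrete, the following is a minimal sketch of checking one candidate formula against labeled states. It is not the authors' method: it assumes a small propositional Markov chain (the policy is already fixed) instead of a relational MDP, and it assumes the candidate has the shape "the probability of reaching a bad state within $k$ steps is at most $\alpha$". The names `reach_prob_within`, `is_consistent`, and `CHAIN` are hypothetical.

```python
from typing import Dict, Iterable, Set

# Hypothetical toy Markov chain (policy already fixed):
# state -> {successor: transition probability}.
Chain = Dict[str, Dict[str, float]]


def reach_prob_within(chain: Chain, bad: Set[str], k: int) -> Dict[str, float]:
    """For every state, the probability of reaching `bad` within at most k steps."""
    # With zero steps left, only states that are already bad count as reached.
    prob = {s: (1.0 if s in bad else 0.0) for s in chain}
    for _ in range(k):
        # One step of bounded-reachability value iteration.
        prob = {
            s: 1.0 if s in bad
            else sum(p * prob[t] for t, p in chain[s].items())
            for s in chain
        }
    return prob


def is_consistent(chain: Chain, bad: Set[str], k: int, alpha: float,
                  safe: Iterable[str], unsafe: Iterable[str]) -> bool:
    """True if every safe state satisfies the assumed candidate
    (reach probability <= alpha) and no unsafe state does."""
    prob = reach_prob_within(chain, bad, k)
    return (all(prob[s] <= alpha for s in safe)
            and all(prob[s] > alpha for s in unsafe))


# Illustrative data only, not taken from the paper.
CHAIN: Chain = {
    "s0": {"s0": 0.9, "s1": 0.1},
    "s1": {"s0": 0.5, "crash": 0.5},
    "crash": {"crash": 1.0},
}

if __name__ == "__main__":
    # A candidate with k=2 and alpha=0.1 separates s0 (safe) from s1 (unsafe).
    print(is_consistent(CHAIN, {"crash"}, 2, 0.1, ["s0"], ["s1"]))
```

In a learning loop of the kind the abstract describes, a check like this would be applied to each generated candidate, and a candidate would be kept only if it accepts all safe states and rejects all unsafe ones.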
