Modelling Agent Policies with Interpretable Imitation Learning

Paper title

Modelling Agent Policies with Interpretable Imitation Learning

Authors

Tom Bewley, Jonathan Lawry, Arthur Richards

Abstract

As we deploy autonomous agents in safety-critical domains, it becomes important to develop an understanding of their internal mechanisms and representations. We outline an approach to imitation learning for reverse-engineering black box agent policies in MDP environments, yielding simplified, interpretable models in the form of decision trees. As part of this process, we explicitly model and learn agents' latent state representations by selecting from a large space of candidate features constructed from the Markov state. We present initial promising results from an implementation in a multi-agent traffic environment.
