Continuous Neural Algorithmic Planners

Paper title

Continuous Neural Algorithmic Planners

Authors

Yu He, Petar Veličković, Pietro Liò, Andreea Deac

Abstract

Neural algorithmic reasoning studies the problem of learning algorithms with neural networks, especially with graph architectures. A recent proposal, XLVIN, reaps the benefits of using a graph neural network that simulates the value iteration algorithm in deep reinforcement learning agents. It allows model-free planning without access to privileged information about the environment, which is usually unavailable. However, XLVIN only supports discrete action spaces, and is hence not trivially applicable to most tasks of real-world interest. We expand XLVIN to continuous action spaces by discretization, and evaluate several selective expansion policies to deal with the large planning graphs. Our proposal, CNAP, demonstrates how neural algorithmic reasoning can make a measurable impact in higher-dimensional continuous control settings, such as MuJoCo, bringing gains in low-data settings and outperforming model-free baselines.
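
The abstract mentions extending to continuous action spaces by discretization. As a rough illustration only, the sketch below shows one common way to discretize a bounded continuous action space: split each dimension into equal-width bins and use the bin centers as discrete actions. The function names, the uniform binning scheme, and the example bounds are assumptions made for this illustration; the abstract does not specify the scheme CNAP uses.

```python
from itertools import product


def bin_centers(low, high, n_bins):
    """Return the centers of n_bins equal-width bins covering [low, high]."""
    width = (high - low) / n_bins
    return [low + (i + 0.5) * width for i in range(n_bins)]


def discretize_action_space(lows, highs, n_bins):
    """Enumerate joint discrete actions: one bin center per action dimension.

    Hypothetical helper for illustration; not taken from the paper.
    """
    per_dim = [bin_centers(lo, hi, n_bins) for lo, hi in zip(lows, highs)]
    return [tuple(action) for action in product(*per_dim)]


# Hypothetical 2-dimensional action space, each dimension bounded in [-1, 1].
actions = discretize_action_space([-1.0, -1.0], [1.0, 1.0], n_bins=4)
print(len(actions))  # number of joint actions: n_bins ** number of dimensions
```

Under this scheme the number of joint actions grows as `n_bins ** dims`, so 4 bins over 6 dimensions already gives 4096 actions. This kind of growth is one plausible reason a planner would expand only a selected subset of actions rather than all of them.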
