Title

Multi-Task Option Learning and Discovery for Stochastic Path Planning

Authors

Naman Shah, Siddharth Srivastava

Abstract

This paper addresses the problem of reliably and efficiently solving broad classes of long-horizon stochastic path planning problems. Starting with a vanilla RL formulation with a stochastic dynamics simulator and an occupancy matrix of the environment, our approach computes useful options with policies as well as high-level paths that compose the discovered options. Our main contributions are (1) data-driven methods for creating abstract states that serve as endpoints for helpful options, (2) methods for computing option policies using auto-generated option guides in the form of dense pseudo-reward functions, and (3) an overarching algorithm for composing the computed options. We show that this approach yields strong guarantees of executability and solvability: under fairly general conditions, the computed option guides lead to composable option policies and consequently ensure downward refinability. Empirical evaluation on a range of robots, environments, and tasks shows that this approach effectively transfers knowledge across related tasks and that it outperforms existing approaches by a significant margin.
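The abstract describes option guides as dense pseudo-reward functions that steer each computed option toward an abstract-state endpoint. The minimal sketch below is only an illustration of that idea, not the paper's construction: it assumes (hypothetically) that an endpoint can be summarized by a target point in the occupancy grid, and that progress toward it yields the dense signal. The class name `OptionGuide` and all parameters are illustrative assumptions.

```python
import numpy as np

class OptionGuide:
    """Hypothetical sketch: a dense pseudo-reward guiding one option to its endpoint."""

    def __init__(self, target_point, goal_radius=0.5, step_cost=0.01):
        self.target = np.asarray(target_point, dtype=float)
        self.goal_radius = goal_radius  # distance at which the option terminates
        self.step_cost = step_cost      # small per-step penalty to favor short paths

    def pseudo_reward(self, state, next_state):
        """Reward the per-step decrease in distance to the option's endpoint."""
        d_prev = np.linalg.norm(np.asarray(state, dtype=float) - self.target)
        d_next = np.linalg.norm(np.asarray(next_state, dtype=float) - self.target)
        return (d_prev - d_next) - self.step_cost

    def terminated(self, state):
        """Terminate the option once the agent is inside the endpoint region."""
        return np.linalg.norm(np.asarray(state, dtype=float) - self.target) <= self.goal_radius
```

Rewarding the decrease in distance keeps the training signal dense at every step, which is the property the abstract attributes to option guides; a high-level path would then chain several such options, each with its own endpoint.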
