Paper Title

Multi-task Reinforcement Learning with a Planning Quasi-Metric

Authors

Vincent Micheli, Karthigan Sinnathamby, François Fleuret

Abstract

We introduce a new reinforcement learning approach combining a planning quasi-metric (PQM) that estimates the number of steps required to go from any state to another, with task-specific "aimers" that compute a target state to reach a given goal. This decomposition allows the sharing across tasks of a task-agnostic model of the quasi-metric that captures the environment's dynamics and can be learned in a dense and unsupervised manner. We achieve multiple-fold training speed-up compared to recently published methods on the standard bit-flip problem and in the MuJoCo robotic arm simulator.
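To make the idea of a quasi-metric over states concrete, here is a minimal sketch for the bit-flip problem, assuming the usual formulation in which each action flips exactly one bit. Under that assumption the true number of steps between two states is their Hamming distance, so the sketch uses that exact quantity in place of the learned PQM described in the abstract. The names `bitflip_steps`, `flip`, `greedy_action`, and `rollout` are hypothetical and are not taken from the paper; the target state is passed in directly, standing in for what a task-specific aimer would output.

```python
from typing import Callable, Tuple

State = Tuple[int, ...]


def bitflip_steps(state: State, target: State) -> int:
    """Exact step count between two bit vectors (Hamming distance).

    Illustrative stand-in for a learned planning quasi-metric; it assumes
    one action flips exactly one bit.
    """
    if len(state) != len(target):
        raise ValueError("states must have the same length")
    return sum(a != b for a, b in zip(state, target))


def flip(state: State, i: int) -> State:
    """Return a copy of `state` with bit `i` flipped."""
    return state[:i] + (1 - state[i],) + state[i + 1:]


def greedy_action(
    state: State,
    target: State,
    quasi_metric: Callable[[State, State], int] = bitflip_steps,
) -> int:
    """Pick the bit whose flip minimises the estimated remaining steps."""
    return min(
        range(len(state)),
        key=lambda i: quasi_metric(flip(state, i), target),
    )


def rollout(state: State, target: State) -> Tuple[State, int]:
    """Act greedily on the quasi-metric until the target is reached."""
    steps = 0
    # With an exact quasi-metric, len(state) steps always suffice.
    while state != target and steps < len(state):
        state = flip(state, greedy_action(state, target))
        steps += 1
    return state, steps


if __name__ == "__main__":
    final_state, n_steps = rollout((0, 0, 0, 0), (1, 0, 1, 1))
    print(final_state, n_steps)
```

Because the step estimate depends only on the environment's dynamics and not on any particular task, the same function can be reused for every target state, which is the kind of sharing across tasks the abstract describes.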
