Paper title
Planning in Learned Latent Action Spaces for Generalizable Legged Locomotion
Paper authors
Paper abstract
Hierarchical learning has been successful at learning generalizable locomotion skills on walking robots in a sample-efficient manner. However, the low-dimensional "latent" action used to communicate between the two layers of the hierarchy is typically user-designed. In this work, we present a fully learned hierarchical framework that is capable of jointly learning the low-level controller and the high-level latent action space. Once this latent space is learned, we plan over continuous latent actions in a model-predictive control fashion, using a learned high-level dynamics model. This framework generalizes to multiple robots, and we present results on a Daisy hexapod simulation, an A1 quadruped simulation, and Daisy robot hardware. We compare against a range of learned hierarchical approaches from the literature and show that our framework outperforms these baselines on multiple tasks in both simulations. In addition to learning approaches, we also compare to inverse kinematics (IK) acting on desired robot motion, and show that our fully learned framework outperforms IK in adverse settings on both the A1 and Daisy simulations. On hardware, we show the Daisy hexapod achieving multiple locomotion tasks in an unstructured outdoor setting with only 2000 hardware samples, reinforcing the robustness and sample efficiency of our approach.
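To make the planning step in the abstract concrete, the following is a minimal sketch of model-predictive control over continuous latent actions. Everything in it is an assumption for illustration: the abstract does not say which planner the authors use, so random shooting is swapped in as one simple choice, and `toy_dynamics`, the 2-D latent action, the 2-D base-position state, and the distance-to-goal cost are hypothetical stand-ins for the learned high-level dynamics model and the task objective.

```python
import math
import random


def toy_dynamics(state, z):
    """Hypothetical stand-in for a learned high-level dynamics model.

    Assumption: the high-level state is a 2-D base position and the
    latent action z displaces it by 0.1 * z per high-level step.
    """
    x, y = state
    return (x + 0.1 * z[0], y + 0.1 * z[1])


def rollout_cost(state, latent_sequence, goal, dynamics):
    """Sum of distances to the goal along a rollout predicted by the model."""
    cost = 0.0
    for z in latent_sequence:
        state = dynamics(state, z)
        cost += math.dist(state, goal)
    return cost


def plan_latent_action(state, goal, dynamics, rng,
                       horizon=5, num_candidates=200, latent_dim=2):
    """Random-shooting MPC: sample latent action sequences, score each one
    with the dynamics model, and return the first action of the best one."""
    best_cost = math.inf
    best_sequence = None
    for _ in range(num_candidates):
        sequence = [
            tuple(rng.uniform(-1.0, 1.0) for _ in range(latent_dim))
            for _ in range(horizon)
        ]
        cost = rollout_cost(state, sequence, goal, dynamics)
        if cost < best_cost:
            best_cost, best_sequence = cost, sequence
    return best_sequence[0]


def run_mpc(start, goal, steps=30, seed=0):
    """Receding-horizon loop: replan at every step, apply only the first action."""
    rng = random.Random(seed)
    state = start
    for _ in range(steps):
        z = plan_latent_action(state, goal, toy_dynamics, rng)
        # On a robot, a low-level controller would turn z into joint commands;
        # here the toy model also plays the role of the environment.
        state = toy_dynamics(state, z)
    return state
```

Calling `run_mpc((0.0, 0.0), (1.0, 0.5))` moves the toy state toward the goal. In a setting like the one the abstract describes, the stand-in model would presumably be replaced by a dynamics model fitted to data, and the sampled latent actions would be passed to the learned low-level controller rather than applied directly.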