Stackelberg Meta-Learning for Strategic Guidance in Multi-Robot Trajectory Planning

Paper Title

Stackelberg Meta-Learning for Strategic Guidance in Multi-Robot Trajectory Planning

Authors

Yuhan Zhao, Quanyan Zhu

Abstract

Trajectory guidance requires a leader robotic agent to assist a follower robotic agent in cooperatively reaching the target destination. However, planning cooperation becomes difficult when the leader serves a family of different followers and has incomplete information about the followers. There is a need for learning and fast adaptation of different cooperation plans. We develop a Stackelberg meta-learning approach to address this challenge. We first formulate the guided trajectory planning problem as a dynamic Stackelberg game to capture the leader-follower interactions. Then, we leverage meta-learning to develop cooperative strategies for different followers. The leader learns a meta-best-response model from a prescribed set of followers. When a specific follower initiates a guidance query, the leader quickly adapts to the follower-specific model with a small amount of learning data and uses it to perform trajectory guidance. We use simulations to show that our method provides better generalization and adaptation performance in learning followers' behavior than other learning approaches. The value and effectiveness of guidance are also demonstrated by comparison with zero-guidance scenarios.
