Distributed Deep Reinforcement Learning: A Survey and A Multi-Player Multi-Agent Learning Toolbox

Paper Title

Distributed Deep Reinforcement Learning: A Survey and A Multi-Player Multi-Agent Learning Toolbox

Authors

Yin, Qiyue; Yu, Tongtong; Shen, Shengqi; Yang, Jun; Zhao, Meijing; Huang, Kaiqi; Liang, Bin; Wang, Liang

Abstract

With the breakthrough of AlphaGo, deep reinforcement learning has become a recognized technique for solving sequential decision-making problems. Despite its reputation, the data inefficiency caused by its trial-and-error learning mechanism makes deep reinforcement learning hard to apply in a wide range of areas. Many methods have been developed for sample-efficient deep reinforcement learning, such as environment modeling, experience transfer, and distributed modifications, among which distributed deep reinforcement learning has shown its potential in various applications, such as human-computer gaming and intelligent transportation. In this paper, we summarize the state of this exciting field by comparing the classical distributed deep reinforcement learning methods and studying the components important for efficient distributed learning, covering settings from single-player single-agent distributed deep reinforcement learning to the most complex multi-player multi-agent distributed deep reinforcement learning. Furthermore, we review recently released toolboxes that help to realize distributed deep reinforcement learning without many modifications to their non-distributed versions. By analyzing their strengths and weaknesses, we develop and release a multi-player multi-agent distributed deep reinforcement learning toolbox, which is further validated on Wargame, a complex environment, showing the usability of the proposed toolbox for multi-player multi-agent distributed deep reinforcement learning in complex games. Finally, we try to point out challenges and future trends, hoping this brief review can provide a guide or a spark for researchers who are interested in distributed deep reinforcement learning.
