Paper Title

Emergent Social Learning via Multi-agent Reinforcement Learning

Paper Authors

Kamal Ndousse, Douglas Eck, Sergey Levine, Natasha Jaques

Paper Abstract

Social learning is a key component of human and animal intelligence. By taking cues from the behavior of experts in their environment, social learners can acquire sophisticated behavior and rapidly adapt to new circumstances. This paper investigates whether independent reinforcement learning (RL) agents in a multi-agent environment can learn to use social learning to improve their performance. We find that in most circumstances, vanilla model-free RL agents do not use social learning. We analyze the reasons for this deficiency, and show that by imposing constraints on the training environment and introducing a model-based auxiliary loss we are able to obtain generalized social learning policies which enable agents to: i) discover complex skills that are not learned from single-agent training, and ii) adapt online to novel environments by taking cues from experts present in the new environment. In contrast, agents trained with model-free RL or imitation learning generalize poorly and do not succeed in the transfer tasks. By mixing multi-agent and solo training, we can obtain agents that use social learning to gain skills that they can deploy when alone, even outperforming agents trained alone from the start.
