Paper title
General policy mapping: online continual reinforcement learning inspired on the insect brain
Authors
Abstract
We have developed a model for online continual, or lifelong, reinforcement learning (RL) inspired by the insect brain. Our model leverages offline training of a feature extractor and a common general policy layer to enable the convergence of RL algorithms in online settings. Sharing a common policy layer across tasks leads to positive backward transfer: the agent continues to improve on older tasks that share the same underlying general policy. Biologically inspired restrictions on the agent's network are key to the convergence of RL algorithms. This provides a pathway towards efficient online RL in resource-constrained scenarios.