Optimal Options for Multi-Task Reinforcement Learning Under Time Constraints

Paper title

Optimal Options for Multi-Task Reinforcement Learning Under Time Constraints

Authors

Manuel Del Verme, Bruno Castro da Silva, Gianluca Baldassarre

Abstract

Reinforcement learning can benefit greatly from options, which encode recurring behaviours and foster exploration. An important open problem is how an agent can autonomously learn useful options when solving particular distributions of related tasks. We investigate some of the conditions that influence the optimality of options in settings where agents have a limited time budget for learning each task and the task distribution might involve problems with different levels of similarity. We directly search for optimal option sets and show that the discovered options differ significantly depending on factors such as the available learning time budget, and that they outperform popular option-generation heuristics.
