Paper Title

FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs

Authors

Alekh Agarwal, Sham Kakade, Akshay Krishnamurthy, Wen Sun

Abstract

In order to deal with the curse of dimensionality in reinforcement learning (RL), it is common practice to make parametric assumptions where values or policies are functions of some low dimensional feature space. This work focuses on the representation learning question: how can we learn such features? Under the assumption that the underlying (unknown) dynamics correspond to a low rank transition matrix, we show how the representation learning question is related to a particular non-linear matrix decomposition problem. Structurally, we make precise connections between these low rank MDPs and latent variable models, showing how they significantly generalize prior formulations for representation learning in RL. Algorithmically, we develop FLAMBE, which engages in exploration and representation learning for provably efficient RL in low rank transition models.
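
To make the low rank assumption concrete, the following is a minimal sketch, not the paper's algorithm, of a small tabular transition matrix that factors through a latent variable. The function name `make_low_rank_transition`, the sizes, and the Dirichlet sampling are all assumptions chosen for illustration. Each state-action pair maps to a distribution over d latent variables (the rows of `phi`), and each latent variable maps to a distribution over next states (the rows of `mu`), so the transition matrix `phi @ mu` has rank at most d.

```python
import numpy as np


def make_low_rank_transition(n_states, n_actions, d, seed=0):
    """Build a random transition matrix that factors as phi @ mu.

    Hypothetical helper for illustration only; it does not learn anything.
    """
    rng = np.random.default_rng(seed)
    # phi: one row per (state, action) pair, a distribution over d latent variables.
    phi = rng.dirichlet(np.ones(d), size=n_states * n_actions)
    # mu: one row per latent variable, a distribution over next states.
    mu = rng.dirichlet(np.ones(n_states), size=d)
    # T[(x, a), x'] = sum_z phi[(x, a), z] * mu[z, x'], the inner product
    # of a d-dimensional embedding of (x, a) with a d-dimensional embedding of x'.
    transition = phi @ mu
    return phi, mu, transition


phi, mu, T = make_low_rank_transition(n_states=20, n_actions=3, d=4)
row_sums = T.sum(axis=1)          # each row should be a probability distribution
rank = np.linalg.matrix_rank(T)   # bounded by d by construction
```

In this toy setting both factors are known by construction. The representation learning question in the abstract is the reverse direction: the factors are unknown and must be recovered from data, which in the general case involves non-linear function classes rather than a plain matrix factorization.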