Paper title

Online non-convex optimization with imperfect feedback

Authors

Amélie Héliou, Matthieu Martin, Panayotis Mertikopoulos, Thibaud Rahier

Abstract

We consider the problem of online learning with non-convex losses. In terms of feedback, we assume that the learner observes - or otherwise constructs - an inexact model for the loss function encountered at each stage, and we propose a mixed-strategy learning policy based on dual averaging. In this general context, we derive a series of tight regret minimization guarantees, both for the learner's static (external) regret, as well as the regret incurred against the best dynamic policy in hindsight. Subsequently, we apply this general template to the case where the learner only has access to the actual loss incurred at each stage of the process. This is achieved by means of a kernel-based estimator which generates an inexact model for each round's loss function using only the learner's realized losses as input.
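
To make the idea of a mixed-strategy policy based on dual averaging more concrete, here is a minimal sketch. It is not the authors' algorithm: it assumes the action space has been discretized to a finite grid, uses an entropic regularizer (so the mixed strategy is a softmax of accumulated scores), and takes a fixed step size. The function names `softmax` and `dual_averaging_mixed_strategy` are hypothetical, and the kernel-based estimator mentioned in the abstract is not implemented; each round's (possibly inexact) loss model is simply passed in as a callable.

```python
import math
import random


def softmax(scores):
    """Map accumulated scores to a probability vector (entropic choice map)."""
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def dual_averaging_mixed_strategy(loss_models, grid, step_size):
    """Entropic dual averaging over a finite grid of candidate actions.

    Assumptions (illustrative only, not taken from the paper):
      - `grid` is a finite discretization of the action space;
      - each element of `loss_models` is a callable x -> estimated loss,
        standing in for the inexact model observed at that round;
      - `step_size` is a fixed positive learning rate.

    Returns the sampled actions and the final mixed strategy over `grid`.
    """
    scores = [0.0] * len(grid)
    actions = []
    for model in loss_models:
        # Play: sample an action from the current mixed strategy.
        probs = softmax(scores)
        actions.append(random.choices(grid, weights=probs, k=1)[0])
        # Update: aggregate the negative of the modelled loss at every grid point.
        scores = [s - step_size * model(x) for s, x in zip(scores, grid)]
    return actions, softmax(scores)


if __name__ == "__main__":
    random.seed(0)
    grid = [i / 10 for i in range(-20, 21)]

    def noisy_model(x):
        # A non-convex loss plus noise, standing in for an inexact model.
        return math.sin(5 * x) + x * x + random.gauss(0.0, 0.1)

    actions, probs = dual_averaging_mixed_strategy([noisy_model] * 200, grid, 0.5)
    print("most likely action:", grid[probs.index(max(probs))])
```

Because the strategy is a probability distribution over the whole grid rather than a single point, no convexity of the loss is needed for the update to make sense; the distribution simply concentrates on regions where the accumulated modelled loss is low.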
