Instance-Dependent Regret Analysis of Kernelized Bandits

Paper Title

Instance-Dependent Regret Analysis of Kernelized Bandits

Authors

Shubhanshu Shekhar, Tara Javidi

Abstract

We study the kernelized bandit problem, which involves designing an adaptive strategy for querying a noisy zeroth-order oracle to efficiently learn about the optimizer of an unknown function $f$ with norm bounded by $M<\infty$ in a Reproducing Kernel Hilbert Space (RKHS) associated with a positive definite kernel $K$. Prior results, working in a minimax framework, have characterized the worst-case (over all functions in the problem class) limits on the regret achievable by any algorithm, and have constructed algorithms with matching (modulo polylogarithmic factors) worst-case performance for the Matérn family of kernels. These results suffer from two drawbacks. First, the minimax lower bound gives no information about the limits of regret achievable by commonly used algorithms on specific problem instances. Second, due to their worst-case nature, the existing upper bound analyses fail to adapt to easier problem instances within the function class. Our work takes steps to address both issues. First, we derive instance-dependent regret lower bounds for algorithms with uniformly (over the function class) vanishing normalized cumulative regret. Our result, valid for all the practically relevant kernelized bandit algorithms, such as GP-UCB, GP-TS and SupKernelUCB, identifies a fundamental complexity measure associated with every problem instance. We then address the second issue by proposing a new minimax near-optimal algorithm that also adapts to easier problem instances.
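For readers new to the setting, the query loop and the notion of cumulative regret mentioned in the abstract can be illustrated with a minimal sketch. This is a generic GP-UCB-style loop on a finite grid, not the algorithm proposed in the paper. The function names (`rbf_kernel`, `gp_ucb`), the squared-exponential kernel, the fixed confidence multiplier `beta`, the noise level, and the toy objective are all assumptions made for illustration; regret is measured against the best grid point rather than the continuous optimizer.

```python
import math
import random


def rbf_kernel(x, y, lengthscale=0.2):
    """Squared-exponential kernel on the real line (illustrative choice)."""
    return math.exp(-((x - y) ** 2) / (2.0 * lengthscale ** 2))


def gp_ucb(f, grid, n_rounds, noise_std=0.1, beta=2.0, seed=0):
    """Run a GP-UCB-style loop on a finite grid.

    Returns the list of queried points and the cumulative regret
    sum_t (max_grid f - f(x_t)), measured against the best grid point.
    """
    rng = random.Random(seed)
    m = len(grid)
    # Prior over the grid: zero mean, kernel covariance.
    mu = [0.0] * m
    cov = [[rbf_kernel(a, b) for b in grid] for a in grid]
    f_star = max(f(x) for x in grid)
    queries, regret = [], 0.0

    for _ in range(n_rounds):
        # Upper confidence bound: posterior mean + beta * posterior std.
        ucb = [mu[j] + beta * math.sqrt(max(cov[j][j], 0.0)) for j in range(m)]
        i = max(range(m), key=ucb.__getitem__)

        # Noisy zeroth-order oracle: only a noisy function value is observed.
        y = f(grid[i]) + rng.gauss(0.0, noise_std)

        # Rank-one Gaussian posterior update after observing y at grid[i].
        denom = cov[i][i] + noise_std ** 2
        gain = [cov[j][i] / denom for j in range(m)]
        row = cov[i][:]  # copy before the in-place update below
        resid = y - mu[i]
        for j in range(m):
            mu[j] += gain[j] * resid
            for k in range(m):
                cov[j][k] -= gain[j] * row[k]

        queries.append(grid[i])
        regret += f_star - f(grid[i])

    return queries, regret


if __name__ == "__main__":
    grid = [i / 20 for i in range(21)]
    queries, regret = gp_ucb(lambda x: -(x - 0.3) ** 2, grid, n_rounds=30)
    print("last query:", queries[-1], "cumulative regret:", regret)
```

An instance on which the objective has a sharply separated maximizer would typically accumulate less regret in such a loop than a flat or multi-peaked one, which is the kind of instance-dependent behavior that worst-case bounds do not capture.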
