Minimax Optimal Algorithms for Fixed-Budget Best Arm Identification

Paper Title

Minimax Optimal Algorithms for Fixed-Budget Best Arm Identification

Authors

Junpei Komiyama, Taira Tsuchiya, Junya Honda

Abstract

We consider the fixed-budget best arm identification problem where the goal is to find the arm of the largest mean with a fixed number of samples. It is known that the probability of misidentifying the best arm is exponentially small to the number of rounds. However, limited characterizations have been discussed on the rate (exponent) of this value. In this paper, we characterize the minimax optimal rate as a result of an optimization over all possible parameters. We introduce two rates, $R^{\mathrm{go}}$ and $R^{\mathrm{go}}_{\infty}$, corresponding to lower bounds on the probability of misidentification, each of which is associated with a proposed algorithm. The rate $R^{\mathrm{go}}$ is associated with $R^{\mathrm{go}}$-tracking, which can be efficiently implemented by a neural network and is shown to outperform existing algorithms. However, this rate requires a nontrivial condition to be achievable. To address this issue, we introduce the second rate $R^{\mathrm{go}}_\infty$. We show that this rate is indeed achievable by introducing a conceptual algorithm called delayed optimal tracking (DOT).

Paper Title