Title
Stochastic Multi-level Composition Optimization Algorithms with Level-Independent Convergence Rates
Authors
Abstract
In this paper, we study smooth stochastic multi-level composition optimization problems, where the objective function is a nested composition of $T$ functions. We assume access to noisy evaluations of the functions and their gradients through a stochastic first-order oracle. To solve this class of problems, we propose two algorithms using moving-average stochastic estimates and analyze their convergence to an $\epsilon$-stationary point of the problem. We show that the first algorithm, which is a generalization of \cite{GhaRuswan20} to the $T$-level case, can achieve a sample complexity of $\mathcal{O}(1/\epsilon^6)$ by using mini-batches of samples in each iteration. By modifying this algorithm using linearized stochastic estimates of the function values, we improve the sample complexity to $\mathcal{O}(1/\epsilon^4)$. This modification not only removes the requirement of having a mini-batch of samples in each iteration, but also makes the algorithm parameter-free and easy to implement. To the best of our knowledge, this is the first time that an online algorithm designed for the (un)constrained multi-level setting obtains the same sample complexity as the smooth single-level setting, under standard assumptions (unbiasedness and boundedness of the second moments) on the stochastic first-order oracle.
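For concreteness, the problem class described in the abstract is commonly written in the following nested form (a notational sketch inferred from the abstract; the symbols $F$, $f_i$, $u_i^k$, $\tau_k$, $G_i$, and $\xi_i^k$ are illustrative and not taken from the paper):

\[
\min_{x \in \mathbb{R}^d} \; F(x) := f_1\bigl(f_2(\cdots f_T(x)\cdots)\bigr),
\]

and a moving-average stochastic estimate of the $i$-th inner function value, of the kind the abstract refers to, typically takes the recursive form

\[
u_i^{k+1} = (1 - \tau_k)\, u_i^k + \tau_k\, G_i\bigl(u_{i+1}^k;\, \xi_i^k\bigr), \qquad u_{T+1}^k := x_k,
\]

where $G_i(\cdot;\xi_i^k)$ denotes the noisy evaluation of $f_i$ returned by the stochastic first-order oracle and $\tau_k \in (0,1]$ is an averaging weight. Averaging of this kind damps the oracle noise in the inner function-value estimates across iterations, which is what enables convergence guarantees without large per-iteration mini-batches.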