Paper Title
Robust estimation via generalized quasi-gradients
Paper Authors
Paper Abstract
We explore why many recently proposed robust estimation problems are efficiently solvable, even though the underlying optimization problems are non-convex. We study the loss landscape of these robust estimation problems, and identify the existence of "generalized quasi-gradients". Whenever these quasi-gradients exist, a large family of low-regret algorithms is guaranteed to approximate the global minimum; this includes the commonly-used filtering algorithm. For robust mean estimation of distributions under bounded covariance, we show that any first-order stationary point of the associated optimization problem is an approximate global minimum if and only if the corruption level $ε < 1/3$. Consequently, any optimization algorithm that approaches a stationary point yields an efficient robust estimator with breakdown point $1/3$. With careful initialization and step size, we improve this to $1/2$, which is optimal. For other tasks, including linear regression and joint mean and covariance estimation, the loss landscape is more rugged: there are stationary points arbitrarily far from the global minimum. Nevertheless, we show that generalized quasi-gradients exist and construct efficient algorithms. These algorithms are simpler than previous ones in the literature, and for linear regression we improve the estimation error from $O(\sqrt{ε})$ to the optimal rate of $O(ε)$ for small $ε$, assuming certified hypercontractivity. For mean estimation with near-identity covariance, we show that a simple gradient descent algorithm achieves breakdown point $1/3$ and iteration complexity $\tilde{O}(d/ε^2)$.
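For readers who want a concrete picture of the filtering algorithm the abstract refers to, below is a minimal Python sketch of robust mean estimation under bounded covariance. It is an illustrative sketch only, not the exact procedure from the paper: the function name `filter_robust_mean`, the multiplicative down-weighting rule, and the stopping threshold `cov_threshold` are assumptions made for this example.

```python
import numpy as np

def filter_robust_mean(X, max_iter=100, cov_threshold=2.0):
    """Illustrative filtering sketch (not the paper's exact algorithm):
    repeatedly down-weight points that contribute most to the top
    eigenvalue of the weighted covariance, then return the weighted mean.

    X : (n, d) array of samples, a small fraction of which may be corrupted.
    cov_threshold : assumed stopping level for the top eigenvalue.
    """
    n, d = X.shape
    w = np.ones(n) / n                              # start from uniform weights
    for _ in range(max_iter):
        mu = w @ X                                  # weighted mean (w sums to 1)
        centered = X - mu
        cov = (centered * w[:, None]).T @ centered  # weighted covariance
        eigvals, eigvecs = np.linalg.eigh(cov)
        lam, v = eigvals[-1], eigvecs[:, -1]        # top eigenpair
        if lam <= cov_threshold:                    # weighted covariance already small
            break
        tau = (centered @ v) ** 2                   # outlier scores along the top direction
        w = w * (1.0 - tau / tau.max())             # soft filter: zero out the worst point
        w = w / w.sum()
    return w @ X

# Example usage: 5% of the points are shifted far from the true mean (zero).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
X[:50] += 10.0
print(filter_robust_mean(X))                        # close to the zero vector
```

The multiplicative down-weighting step is one instance of the low-regret updates mentioned in the abstract; hard removal of the highest-scoring points is another common variant.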