A Non-Asymptotic Moreau Envelope Theory for High-Dimensional Generalized Linear Models

Paper Title

A Non-Asymptotic Moreau Envelope Theory for High-Dimensional Generalized Linear Models

Authors

Lijia Zhou, Frederic Koehler, Pragya Sur, Danica J. Sutherland, Nathan Srebro

Abstract

We prove a new generalization bound that shows for any class of linear predictors in Gaussian space, the Rademacher complexity of the class and the training error under any continuous loss $\ell$ can control the test error under all Moreau envelopes of the loss $\ell$. We use our finite-sample bound to directly recover the "optimistic rate" of Zhou et al. (2021) for linear regression with the square loss, which is known to be tight for minimal $\ell_2$-norm interpolation, but we also handle more general settings where the label is generated by a potentially misspecified multi-index model. The same argument can analyze noisy interpolation of max-margin classifiers through the squared hinge loss, and establishes consistency results in spiked-covariance settings. More generally, when the loss is only assumed to be Lipschitz, our bound effectively improves Talagrand's well-known contraction lemma by a factor of two, and we prove uniform convergence of interpolators (Koehler et al. 2021) for all smooth, non-negative losses. Finally, we show that application of our generalization bound using localized Gaussian width will generally be sharp for empirical risk minimizers, establishing a non-asymptotic Moreau envelope theory for generalization that applies outside of proportional scaling regimes, handles model misspecification, and complements existing asymptotic Moreau envelope theories for M-estimation.
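
For readers unfamiliar with the central object in the abstract, the following is a minimal numerical sketch of a Moreau envelope. It assumes the textbook parameterization $\ell_\lambda(x) = \inf_u \{\ell(u) + (u - x)^2 / (2\lambda)\}$, which may differ from the paper's own scaling by a constant. The function name `moreau_envelope` and the grid-search approach are illustrative choices, not taken from the paper. The example uses the standard fact that, under this parameterization, the Moreau envelope of the absolute loss is the Huber loss.

```python
import numpy as np


def moreau_envelope(loss, x, lam, grid):
    """Approximate inf_u { loss(u) + (u - x)^2 / (2 * lam) } by a grid search.

    `loss` is a vectorized function of one real argument; `grid` is the set of
    candidate values of u. This is a brute-force illustration, not an efficient
    or exact method.
    """
    return float(np.min(loss(grid) + (grid - x) ** 2 / (2.0 * lam)))


def huber(x, lam):
    """Closed-form Moreau envelope of |x| under the parameterization above."""
    if abs(x) <= lam:
        return x ** 2 / (2.0 * lam)
    return abs(x) - lam / 2.0


# Fine grid of candidate minimizers u (assumed wide enough for the inputs below).
grid = np.linspace(-5.0, 5.0, 100001)

for x in (0.5, 3.0):
    numeric = moreau_envelope(np.abs, x, lam=1.0, grid=grid)
    print(f"x={x}: numeric={numeric:.6f}, huber={huber(x, 1.0):.6f}")
```

The envelope never exceeds the original loss (take $u = x$ in the infimum) and is smooth even where the loss is not. This is why a bound on the test error under the envelope is a relaxed but still informative version of a bound under $\ell$ itself.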
