Paper Title
Sharpened Generalization Bounds based on Conditional Mutual Information and an Application to Noisy, Iterative Algorithms
Paper Authors
Paper Abstract
The information-theoretic framework of Russo and J. Zou (2016) and Xu and Raginsky (2017) provides bounds on the generalization error of a learning algorithm in terms of the mutual information between the algorithm's output and the training sample. In this work, we study the proposal, by Steinke and Zakynthinou (2020), to reason about the generalization error of a learning algorithm by introducing a super sample that contains the training sample as a random subset and computing mutual information conditional on the super sample. We first show that these new bounds based on the conditional mutual information are tighter than those based on the unconditional mutual information. We then introduce yet tighter bounds, building on the "individual sample" idea of Bu, S. Zou, and Veeravalli (2019) and the "data-dependent" ideas of Negrea et al. (2019), using disintegrated mutual information. Finally, we apply these bounds to the study of the Langevin dynamics algorithm, showing that conditioning on the super sample allows us to exploit information in the optimization trajectory to obtain tighter bounds based on hypothesis tests.
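For context, here is a minimal sketch of the two quantities being compared, in our own notation; the precise statements and constants are in the cited papers. The first bound assumes a σ-sub-Gaussian loss, the second a loss bounded in [0, 1]:

\[
\bigl|\mathbb{E}[\mathrm{gen}(W, S)]\bigr| \;\le\; \sqrt{\frac{2\sigma^{2}}{n}\, I(W; S)}
\qquad \text{(Xu and Raginsky, 2017)}
\]
\[
\bigl|\mathbb{E}[\mathrm{gen}(W, S)]\bigr| \;\le\; \sqrt{\frac{2}{n}\, I(W; U \mid \tilde{S})}
\qquad \text{(Steinke and Zakynthinou, 2020)}
\]

Here \(\tilde{S}\) is the super sample of \(2n\) i.i.d. points, \(U \in \{0,1\}^{n}\) is the random subset selector that picks the training sample \(S\) out of \(\tilde{S}\), \(W\) is the algorithm's output, and \(\mathrm{gen}(W, S)\) is the generalization error. Note that the conditional mutual information \(I(W; U \mid \tilde{S})\) is at most \(n \log 2\), whereas \(I(W; S)\) can be infinite, which is one way to see why conditioning on the super sample can yield tighter bounds.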