Title
Recent advances in deep learning theory
Author
Abstract
Deep learning is usually described as an experiment-driven field under continuous criticism for lacking theoretical foundations. This problem has been partially addressed by a large volume of literature that has so far not been well organized. This paper reviews and organizes the recent advances in deep learning theory. The literature is categorized into six groups: (1) complexity and capacity-based approaches for analyzing the generalizability of deep learning; (2) stochastic differential equations and their dynamic systems for modelling stochastic gradient descent and its variants, which characterize the optimization and generalization of deep learning, partially inspired by Bayesian inference; (3) the geometrical structures of the loss landscape that drive the trajectories of the dynamic systems; (4) the roles of over-parameterization of deep neural networks from both positive and negative perspectives; (5) theoretical foundations of several special structures in network architectures; and (6) the increasingly intensive concerns in ethics and security and their relationships with generalizability.
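The second group treats SGD as a discretization of a stochastic differential equation. As a minimal sketch of this viewpoint (not from the paper; the quadratic loss, step size, and Gaussian noise model are assumptions for illustration), SGD on a one-dimensional loss can be viewed as an Euler-Maruyama discretization of the SDE dw = -f'(w) dt + sigma dW, where the minibatch gradient noise is modelled as Gaussian:

```python
import random

def sgd_as_sde(w0, eta=0.1, sigma=0.05, steps=1000, seed=0):
    """SGD on f(w) = w^2 / 2, with the step size eta playing the role of
    the SDE time step dt and Gaussian noise standing in for minibatch
    gradient noise (a modelling assumption)."""
    rng = random.Random(seed)
    w = w0
    for _ in range(steps):
        grad = w                       # f'(w) for f(w) = w^2 / 2
        noise = rng.gauss(0.0, sigma)  # surrogate for minibatch noise
        w = w - eta * (grad + noise)   # Euler-Maruyama-style update
    return w

final_w = sgd_as_sde(w0=2.0)
```

Under this model, the iterate contracts toward the minimizer at w = 0 and then fluctuates in a stationary distribution whose spread is set by eta and sigma, which is the kind of behaviour the SDE analyses in this group make precise.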