Title
Generalization Through the Lens of Learning Dynamics
Author
Abstract
A machine learning (ML) system must learn not only to match the output of a target function on a training set, but also to generalize to novel situations in order to yield accurate predictions at deployment. In most practical applications, the user cannot exhaustively enumerate every possible input to the model; strong generalization performance is therefore crucial to the development of ML systems that are performant and reliable enough to be deployed in the real world. While generalization is well understood theoretically in a number of hypothesis classes, the impressive generalization performance of deep neural networks has stymied theoreticians. In deep reinforcement learning (RL), our understanding of generalization is further complicated by the conflict between generalization and stability in widely used RL algorithms. This thesis will provide insight into generalization by studying the learning dynamics of deep neural networks in both supervised and reinforcement learning tasks.