Paper Title
Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks
Paper Authors
Paper Abstract
We derive analytical expressions for the generalization performance of kernel regression as a function of the number of training samples using theoretical methods from Gaussian processes and statistical physics. Our expressions apply to wide neural networks due to an equivalence between training them and kernel regression with the Neural Tangent Kernel (NTK). By computing the decomposition of the total generalization error due to different spectral components of the kernel, we identify a new spectral principle: as the size of the training set grows, kernel machines and neural networks fit successively higher spectral modes of the target function. When data are sampled from a uniform distribution on a high-dimensional hypersphere, dot product kernels, including NTK, exhibit learning stages where different frequency modes of the target function are learned. We verify our theory with simulations on synthetic data and the MNIST dataset.
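The spectral principle described in the abstract can be illustrated with a small numerical experiment. Below is a minimal sketch, not the paper's actual setup: it assumes a Gaussian kernel on the unit circle (whose eigenvalues decay with Fourier frequency) and a target built from two Fourier modes, then shows that as the training-set size n grows, the residual error in the low-frequency mode shrinks before the residual in the high-frequency mode. The kernel choice, length scale, mode frequencies, and ridge term are all illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

def kernel(x, y, ell=0.5):
    # Translation-invariant Gaussian kernel using the periodic distance on the circle.
    d = np.abs(x[:, None] - y[None, :])
    d = np.minimum(d, 2 * np.pi - d)
    return np.exp(-d ** 2 / (2 * ell ** 2))

def target(x):
    # Target function: a low-frequency (k=1) plus a high-frequency (k=4) mode.
    return np.cos(x) + np.cos(4 * x)

x_test = np.linspace(0, 2 * np.pi, 512, endpoint=False)

for n in (4, 16, 64):
    x_train = rng.uniform(0, 2 * np.pi, n)
    y_train = target(x_train)
    K = kernel(x_train, x_train) + 1e-8 * np.eye(n)  # small ridge for numerical stability
    alpha = np.linalg.solve(K, y_train)
    f_hat = kernel(x_test, x_train) @ alpha
    # Fourier-project the residual to read off the error remaining in each mode.
    coeffs = np.fft.rfft(target(x_test) - f_hat) / len(x_test)
    print(f"n={n:3d}  residual in mode k=1: {abs(coeffs[1]):.3f}  k=4: {abs(coeffs[4]):.3f}")

In this toy setting the Fourier modes play the role of the kernel's eigenmodes, ordered by eigenvalue; the printed per-mode residuals decrease in that order as n grows, mirroring the staged learning the abstract describes for dot product kernels on the hypersphere.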