Paper Title
A Neural Network Perturbation Theory Based on the Born Series
Paper Authors
Paper Abstract
Deep Learning using the eponymous deep neural networks (DNNs) has become an attractive approach towards various data-based problems of theoretical physics in the past decade. There has been a clear trend towards deeper architectures containing increasingly more powerful and involved layers. Contrarily, Taylor coefficients of DNNs still appear mainly in the light of interpretability studies, where they are computed at most to first order. However, especially in theoretical physics, numerous problems benefit from accessing higher orders as well. This gap motivates a general formulation of neural network (NN) Taylor expansions. Restricting our analysis to multilayer perceptrons (MLPs) and introducing quantities we refer to as propagators and vertices, both depending on the MLP's weights and biases, we establish a graph-theoretical approach. Similarly to Feynman rules in quantum field theories, we can systematically assign diagrams containing propagators and vertices to the corresponding partial derivatives. Examining this approach for S-wave scattering lengths of shallow potentials, we observe that NNs adapt their derivatives mainly to the leading order of the target function's Taylor expansion. To circumvent this problem, we propose an iterative NN perturbation theory. During each iteration we eliminate the leading order, such that the next-to-leading order can be faithfully learned during the subsequent iteration. After performing two iterations, we find that the first- and second-order Born terms are correctly adapted during the respective iterations. Finally, we combine both results to find a proxy that acts as a machine-learned second-order Born approximation.
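For context only, the Born series behind the first- and second-order terms mentioned in the abstract can be written schematically as below. This is a standard textbook statement, not taken from the paper, and the proportionality signs hide convention-dependent prefactors (reduced mass, hbar, normalisation).

```latex
% Schematic Born series for the S-wave scattering length a of a potential V.
% Proportionality signs absorb convention-dependent prefactors.
\begin{align}
  a &= a^{(1)} + a^{(2)} + \mathcal{O}(V^3), \\
  a^{(1)} &\propto \int \mathrm{d}^3 r \, V(\mathbf{r}), \qquad
  a^{(2)} \propto \int \mathrm{d}^3 r \, \mathrm{d}^3 r' \,
      V(\mathbf{r}) \, G_0(\mathbf{r}, \mathbf{r}') \, V(\mathbf{r}'),
\end{align}
% where G_0 denotes the free Green's function at zero energy.
```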
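The two ingredients the abstract refers to, reading off low-order Taylor coefficients of an MLP and then removing the already-learned leading order before the next training iteration, could look roughly as in the following minimal sketch using JAX automatic differentiation. All names (`mlp_apply`, `taylor_coefficients`, the placeholder targets) are illustrative assumptions, and the sketch does not reproduce the paper's graph-theoretical propagator/vertex construction.

```python
# Minimal, hypothetical sketch: (i) Taylor coefficients of a scalar MLP via
# nested jax.grad, (ii) subtracting the learned leading-order behaviour to
# obtain residual targets for the next iteration. Not the paper's method.

import jax
import jax.numpy as jnp

def mlp_apply(params, x):
    """Scalar-input, scalar-output multilayer perceptron."""
    h = jnp.atleast_1d(x)
    for W, b in params[:-1]:
        h = jnp.tanh(W @ h + b)
    W, b = params[-1]
    return (W @ h + b)[0]

def init_params(key, sizes=(1, 16, 16, 1)):
    """Random MLP parameters as a list of (weight, bias) pairs."""
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        key, k_w, k_b = jax.random.split(key, 3)
        params.append((jax.random.normal(k_w, (n_out, n_in)) / jnp.sqrt(n_in),
                       0.01 * jax.random.normal(k_b, (n_out,))))
    return params

def taylor_coefficients(params, x0, order=2):
    """Taylor coefficients f(x0), f'(x0), f''(x0)/2!, ... via nested jax.grad."""
    f = lambda x: mlp_apply(params, x)
    coeffs, factorial = [], 1.0
    for n in range(order + 1):
        coeffs.append(f(x0) / factorial)
        f = jax.grad(f)            # move on to the next derivative order
        factorial *= (n + 1)
    return jnp.stack(coeffs)

# Iterative step (schematic): once a first network has adapted mainly to the
# leading order of the target, subtract its prediction and let a second
# network fit the residual, i.e. the next-to-leading order.
params_1 = init_params(jax.random.PRNGKey(0))
x_train = jnp.linspace(-1.0, 1.0, 64)
y_train = jnp.sin(x_train)                    # placeholder targets, not the paper's data
leading = jax.vmap(lambda x: mlp_apply(params_1, x))(x_train)
residual_targets = y_train - leading          # targets for the next iteration
print(taylor_coefficients(params_1, 0.0, order=2))
```

In the paper the leading order is identified through the diagrammatic expansion rather than through a first fit alone; the residual construction above only mirrors the general idea of eliminating the leading order before the subsequent iteration.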