Paper Title

The gap between theory and practice in function approximation with deep neural networks

Paper Authors

Ben Adcock, Nick Dexter

Paper Abstract

Deep learning (DL) is transforming industry as decision-making processes are being automated by deep neural networks (DNNs) trained on real-world data. Driven partly by rapidly expanding literature on DNN approximation theory showing they can approximate a rich variety of functions, such tools are increasingly being considered for problems in scientific computing. Yet, unlike traditional algorithms in this field, little is known about DNNs from the principles of numerical analysis, e.g., stability, accuracy, computational efficiency and sample complexity. In this paper we introduce a computational framework for examining DNNs in practice, and use it to study empirical performance with regard to these issues. We study performance of DNNs of different widths and depths on test functions in various dimensions, including smooth and piecewise smooth functions. We also compare DL against best-in-class methods for smooth function approximation based on compressed sensing (CS). Our main conclusion from these experiments is that there is a crucial gap between the approximation theory of DNNs and their practical performance, with trained DNNs performing relatively poorly on functions for which there are strong approximation results (e.g., smooth functions), yet performing well in comparison to best-in-class methods for other functions. To analyze this gap further, we provide some theoretical insights. We establish a practical existence theorem, asserting existence of a DNN architecture and training procedure that offers the same performance as CS. This establishes a key theoretical benchmark, showing the gap can be closed, albeit via a strategy guaranteed to perform as well as, but no better than, current best-in-class schemes. Nevertheless, it demonstrates the promise of practical DNN approximation by highlighting potential for better schemes through careful design of DNN architectures and training strategies.
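To make the experimental setup concrete, below is a minimal, hypothetical sketch of the two approaches being compared: training a fully connected DNN on point samples of a smooth one-dimensional test function, and a CS-style baseline that recovers Legendre coefficients from the same kind of random samples via an l1-regularized fit. The test function, the width/depth, sample-size, and penalty choices are all illustrative assumptions, not the authors' exact framework (the paper works with various functions and much higher dimensions).

```python
# Hedged sketch (not the authors' framework): fit a small fully connected
# ReLU network to point samples of a smooth 1-D function and report the
# relative L2 test error. All hyperparameters are illustrative.
import torch
import torch.nn as nn

torch.manual_seed(0)

def f(x):
    # Illustrative smooth test function on [-1, 1]
    return torch.exp(-x**2) * torch.sin(torch.pi * x)

x_train = 2 * torch.rand(500, 1) - 1          # random training samples
y_train = f(x_train)
x_test = torch.linspace(-1, 1, 1000).unsqueeze(1)
y_test = f(x_test)

width, depth = 50, 4                          # hypothetical width/depth
layers = [nn.Linear(1, width), nn.ReLU()]
for _ in range(depth - 1):
    layers += [nn.Linear(width, width), nn.ReLU()]
layers.append(nn.Linear(width, 1))
model = nn.Sequential(*layers)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
for step in range(5000):
    optimizer.zero_grad()
    loss = loss_fn(model(x_train), y_train)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    rel_err = torch.linalg.norm(model(x_test) - y_test) / torch.linalg.norm(y_test)
print(f"DNN relative L2 test error: {rel_err.item():.3e}")
```

For comparison, a CS-style baseline in the same spirit: because a smooth function has rapidly decaying (hence approximately sparse) Legendre coefficients, an l1-penalized least-squares fit from m random samples against N >> m basis functions can recover a good approximation. This uses an off-the-shelf lasso solver rather than the paper's specific recovery scheme.

```python
# Hedged CS-style sketch: sparse recovery of Legendre coefficients from
# random samples via lasso. Basis size and penalty are illustrative.
import numpy as np
from numpy.polynomial import legendre
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
f_np = lambda x: np.exp(-x**2) * np.sin(np.pi * x)

m, N = 100, 200                       # m samples, N basis functions (m << N)
x = rng.uniform(-1, 1, m)
A = legendre.legvander(x, N - 1)      # A[i, j] = j-th Legendre poly at sample i
lasso = Lasso(alpha=1e-5, fit_intercept=False, max_iter=100_000)
coef = lasso.fit(A, f_np(x)).coef_

x_grid = np.linspace(-1, 1, 1000)
approx = legendre.legvander(x_grid, N - 1) @ coef
rel_err = np.linalg.norm(approx - f_np(x_grid)) / np.linalg.norm(f_np(x_grid))
print(f"CS relative L2 test error: {rel_err:.3e}")
```

Running both and comparing the printed errors mirrors, in miniature, the kind of head-to-head comparison the abstract describes between trained DNNs and CS-based best-in-class methods on smooth functions.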
