Synthetic Petri Dish: A Novel Surrogate Model for Rapid Architecture Search

Paper Title

Synthetic Petri Dish: A Novel Surrogate Model for Rapid Architecture Search

Authors

Aditya Rawal, Joel Lehman, Felipe Petroski Such, Jeff Clune, Kenneth O. Stanley

Abstract

Neural Architecture Search (NAS) explores a large space of architectural motifs -- a compute-intensive process that often involves ground-truth evaluation of each motif by instantiating it within a large network, and training and evaluating the network with thousands of domain-specific data samples. Inspired by how biological motifs such as cells are sometimes extracted from their natural environment and studied in an artificial Petri dish setting, this paper proposes the Synthetic Petri Dish model for evaluating architectural motifs. In the Synthetic Petri Dish, architectural motifs are instantiated in very small networks and evaluated using very few learned synthetic data samples (to effectively approximate performance in the full problem). The relative performance of motifs in the Synthetic Petri Dish can substitute for their ground-truth performance, thus accelerating the most expensive step of NAS. Unlike other neural network-based prediction models that parse the structure of the motif to estimate its performance, the Synthetic Petri Dish predicts motif performance by training the actual motif in an artificial setting, thus deriving predictions from its true intrinsic properties. Experiments in this paper demonstrate that the Synthetic Petri Dish can therefore predict the performance of new motifs with significantly higher accuracy, especially when insufficient ground truth data is available. Our hope is that this work can inspire a new research direction in studying the performance of extracted components of models in an alternative controlled setting.
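
The surrogate idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes that a "motif" is a sigmoid activation with a given slope, that the tiny network has two hidden units, and that the synthetic samples are fixed random placeholders. The abstract says the synthetic samples are learned; that learning step is omitted here. All names (`train_and_score`, `rank_motifs`, `slopes`) are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_and_score(slope, train_x, train_y, val_x, val_y,
                    hidden=2, steps=200, lr=0.05, seed=0):
    """Train a tiny one-hidden-layer network whose activation is
    sigmoid(slope * z), then return its validation MSE (lower is better)."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 1.0, size=(train_x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 1.0, size=(hidden, 1))
    b2 = np.zeros(1)
    n = len(train_x)
    for _ in range(steps):
        # Forward pass with the motif (activation) under evaluation.
        h = sigmoid(slope * (train_x @ w1 + b1))
        pred = h @ w2 + b2
        # Backward pass for the mean squared error.
        grad_pred = 2.0 * (pred - train_y) / n
        grad_w2 = h.T @ grad_pred
        grad_b2 = grad_pred.sum(axis=0)
        grad_z = (grad_pred @ w2.T) * slope * h * (1.0 - h)
        grad_w1 = train_x.T @ grad_z
        grad_b1 = grad_z.sum(axis=0)
        # Plain gradient-descent update.
        w1 -= lr * grad_w1
        b1 -= lr * grad_b1
        w2 -= lr * grad_w2
        b2 -= lr * grad_b2
    val_pred = sigmoid(slope * (val_x @ w1 + b1)) @ w2 + b2
    return float(np.mean((val_pred - val_y) ** 2))


def rank_motifs(slopes, train_x, train_y, val_x, val_y):
    """Score every motif in the small setting and sort best-first."""
    scores = {s: train_and_score(s, train_x, train_y, val_x, val_y)
              for s in slopes}
    return sorted(slopes, key=scores.get), scores


# Placeholder "synthetic" samples (assumption: random, not learned).
data_rng = np.random.default_rng(42)
train_x = data_rng.uniform(-1.0, 1.0, size=(8, 1))
train_y = np.sin(3.0 * train_x)
val_x = data_rng.uniform(-1.0, 1.0, size=(8, 1))
val_y = np.sin(3.0 * val_x)

slopes = [0.1, 0.5, 1.0, 2.0, 5.0]
ranking, scores = rank_motifs(slopes, train_x, train_y, val_x, val_y)
print("motifs ranked best-first:", ranking)
```

Only the relative ordering in `ranking` is meant to be used, which matches the abstract's claim that relative performance in the small setting can stand in for ground-truth evaluation. Whether the ordering transfers to the full problem depends on how the synthetic samples are obtained, and this sketch does not address that.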
