Learning Where To Look -- Generative NAS is Surprisingly Efficient

Paper Title

Learning Where To Look -- Generative NAS is Surprisingly Efficient

Authors

Jovita Lukasik, Steffen Jung, Margret Keuper

Abstract

The efficient, automated search for well-performing neural architectures (NAS) has drawn increasing attention in the recent past. The predominant research objective is to reduce the need for costly evaluations of neural architectures while efficiently exploring large search spaces. To this end, surrogate models embed architectures in a latent space and predict their performance, while generative models for neural architectures enable optimization-based search within the latent space the generator draws from. Both surrogate and generative models aim to facilitate query-efficient search in a well-structured latent space. In this paper, we further improve the trade-off between query efficiency and promising architecture generation by leveraging the advantages of both efficient surrogate models and generative design. To this end, we propose a generative model, paired with a surrogate predictor, that iteratively learns to generate samples from increasingly promising latent subspaces. This approach leads to very effective and efficient architecture search while keeping the number of queries low. In addition, our approach allows joint optimization of multiple objectives, such as accuracy and hardware latency, in a straightforward manner. We show the benefit of this approach not only w.r.t. the optimization of architectures for highest classification accuracy but also in the context of hardware constraints, and we outperform state-of-the-art methods on several NAS benchmarks for single and multiple objectives. We also achieve state-of-the-art performance on ImageNet. The code is available at http://github.com/jovitalukasik/AG-Net.
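
The loop the abstract describes (generate candidates, rank them with a surrogate, query only a few, then shift the generator toward the promising region) can be illustrated with a toy sketch. This is not the AG-Net implementation. It assumes a two-dimensional continuous vector as a stand-in for an architecture encoding, a diagonal Gaussian as the "generator", ridge regression on quadratic features as the "surrogate", and a made-up objective `toy_objective` in place of real training runs. All function and parameter names are hypothetical.

```python
import numpy as np


def toy_objective(x):
    """Hypothetical stand-in for an expensive architecture evaluation (higher is better)."""
    target = np.array([2.0, -1.0])
    return -np.sum((x - target) ** 2, axis=-1)


def features(x):
    # Quadratic features, so a linear model can represent the toy objective.
    return np.concatenate([np.ones((len(x), 1)), x, x ** 2], axis=1)


def fit_surrogate(x, y, ridge=1e-3):
    # Ridge regression: an assumed, deliberately simple surrogate predictor.
    phi = features(x)
    gram = phi.T @ phi + ridge * np.eye(phi.shape[1])
    return np.linalg.solve(gram, phi.T @ y)


def search(rounds=10, candidates=200, queries_per_round=8, seed=0):
    rng = np.random.default_rng(seed)
    # The "generator" here is just a diagonal Gaussian over the encoding space.
    mean, std = np.zeros(2), np.full(2, 3.0)

    # Initial queries drawn from the untrained generator.
    xs = rng.normal(mean, std, size=(queries_per_round, 2))
    ys = toy_objective(xs)

    for _ in range(rounds):
        # 1. Fit the surrogate on everything queried so far.
        weights_lin = fit_surrogate(xs, ys)

        # 2. Generate many cheap candidates and rank them with the surrogate.
        cand = rng.normal(mean, std, size=(candidates, 2))
        predicted = features(cand) @ weights_lin
        top = cand[np.argsort(predicted)[-queries_per_round:]]

        # 3. Spend real queries only on the top-ranked candidates.
        xs = np.vstack([xs, top])
        ys = np.concatenate([ys, toy_objective(top)])

        # 4. Refit the generator on the best queried points, weighted by score,
        #    so later samples come from a more promising subspace.
        elite_idx = np.argsort(ys)[-2 * queries_per_round:]
        elite, scores = xs[elite_idx], ys[elite_idx]
        w = np.exp(scores - scores.max())
        w /= w.sum()
        mean = w @ elite
        std = np.sqrt(w @ (elite - mean) ** 2) + 0.05  # floor keeps some exploration

    best = int(np.argmax(ys))
    return xs[best], ys[best], len(ys)


if __name__ == "__main__":
    best_x, best_y, n_queries = search()
    print("best point:", best_x, "score:", best_y, "queries:", n_queries)
```

With the default arguments the sketch spends 88 queries in total (8 initial plus 8 in each of 10 rounds). The score-weighted refit in step 4 is only a loose analogue of training a generator toward better-performing samples; the paper and the linked repository are the reference for the actual model, encoding, and training objective.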
