Paper Title


AgEBO-Tabular: Joint Neural Architecture and Hyperparameter Search with Autotuned Data-Parallel Training for Tabular Data

Authors

Romain Egele, Prasanna Balaprakash, Venkatram Vishwanath, Isabelle Guyon, Zhengying Liu

Abstract


Developing high-performing predictive models for large tabular data sets is a challenging task. The state-of-the-art methods are based on expert-developed model ensembles from different supervised learning methods. Recently, automated machine learning (AutoML) is emerging as a promising approach to automate predictive model development. Neural architecture search (NAS) is an AutoML approach that generates and evaluates multiple neural network architectures concurrently and improves the accuracy of the generated models iteratively. A key issue in NAS, particularly for large data sets, is the large computation time required to evaluate each generated architecture. While data-parallel training is a promising approach that can address this issue, its use within NAS is difficult. For different data sets, the data-parallel training settings such as the number of parallel processes, learning rate, and batch size need to be adapted to achieve high accuracy and reduction in training time. To that end, we have developed AgEBO-Tabular, an approach to combine aging evolution (AgE), a parallel NAS method that searches over neural architecture space, and an asynchronous Bayesian optimization method for tuning the hyperparameters of the data-parallel training simultaneously. We demonstrate the efficacy of the proposed method to generate high-performing neural network models for large tabular benchmark data sets. Furthermore, we demonstrate that the automatically discovered neural network models using our method outperform the state-of-the-art AutoML ensemble models in inference speed by two orders of magnitude while reaching similar accuracy values.
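The aging evolution (AgE) loop that AgEBO-Tabular builds on can be sketched in a few lines: a fixed-size population is kept in a queue, a random sample is drawn, the best member of the sample is mutated to produce a child, and the oldest member (not the worst) is evicted. The sketch below is a minimal illustration of that pattern only, not the authors' implementation; the `evaluate`, `random_arch`, and `mutate` callables and the default parameter values are placeholders, and the asynchronous Bayesian optimization of data-parallel hyperparameters is omitted.

```python
import collections
import random

def aging_evolution(evaluate, random_arch, mutate,
                    population_size=10, sample_size=3, cycles=50):
    """Minimal aging-evolution (AgE) loop: the oldest model is always
    evicted, so good architectures persist only by producing children."""
    population = collections.deque()
    history = []

    # Seed the population with randomly generated architectures.
    while len(population) < population_size:
        arch = random_arch()
        model = (evaluate(arch), arch)
        population.append(model)
        history.append(model)

    # Evolve: sample, mutate the best of the sample, evict the oldest.
    for _ in range(cycles):
        sample = random.sample(list(population), sample_size)
        _, parent = max(sample, key=lambda m: m[0])
        child_arch = mutate(parent)
        child = (evaluate(child_arch), child_arch)
        population.append(child)
        history.append(child)
        population.popleft()  # aging: remove the oldest, not the worst

    # Return the best model seen over the whole search.
    return max(history, key=lambda m: m[0])
```

In the full AgEBO method described in the abstract, each child evaluation would additionally receive data-parallel training hyperparameters (number of parallel processes, learning rate, batch size) suggested by an asynchronous Bayesian optimizer; evaluating children concurrently is what makes the aging queue a natural fit for parallel search.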
