AVATAR - Machine Learning Pipeline Evaluation Using Surrogate Model

Paper Title

AVATAR -- Machine Learning Pipeline Evaluation Using Surrogate Model

Authors

Tien-Dung Nguyen, Tomasz Maszczyk, Katarzyna Musial, Marc-Andre Zöller, Bogdan Gabrys

Abstract

The evaluation of machine learning (ML) pipelines is essential during automatic ML pipeline composition and optimisation. Previous methods, such as the Bayesian-based and genetic-based optimisation implemented in Auto-Weka, Auto-sklearn and TPOT, evaluate pipelines by executing them. As a result, pipeline composition and optimisation with these methods requires a tremendous amount of time, which prevents them from exploring complex pipelines to find better predictive models. To further explore this research challenge, we have conducted experiments showing that many of the generated pipelines are invalid, and that it is unnecessary to execute them to find out whether they are good pipelines. To address this issue, we propose a novel method, AVATAR, that evaluates the validity of ML pipelines using a surrogate model. AVATAR accelerates automatic ML pipeline composition and optimisation by quickly ignoring invalid pipelines. Our experiments show that AVATAR is more efficient at evaluating complex pipelines than traditional evaluation approaches that require executing them.
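The abstract does not describe how the surrogate model works internally, so the following is only a simplified illustration of the general idea of checking pipeline validity without executing it. It is not the authors' implementation. The `Component` class, the `is_valid` function, and the property names such as `"missing_values"` are all hypothetical and chosen for this sketch. Each component declares which data properties it cannot handle and how it changes them, and the check propagates those properties through the pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """Hypothetical description of a pipeline step (not from the paper)."""
    name: str
    rejects: frozenset = frozenset()  # data properties the step cannot handle
    removes: frozenset = frozenset()  # data properties the step eliminates
    adds: frozenset = frozenset()     # data properties the step introduces


def is_valid(pipeline, data_properties):
    """Propagate data properties through the steps without running them."""
    props = set(data_properties)
    for step in pipeline:
        if props & step.rejects:
            # The step would receive data it cannot process.
            return False
        props = (props - step.removes) | step.adds
    return True


# Assumed example components and dataset description.
imputer = Component("imputer", removes=frozenset({"missing_values"}))
classifier = Component("classifier", rejects=frozenset({"missing_values"}))
dataset = {"missing_values", "numeric"}

print(is_valid([imputer, classifier], dataset))  # True
print(is_valid([classifier], dataset))           # False
```

Under these assumptions the check costs one pass over the pipeline steps, whereas an execution-based evaluation would have to fit every step on the data before discovering that the pipeline fails.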
