Smoke Testing for Machine Learning: Simple Tests to Discover Severe Defects

Paper Title

Smoke Testing for Machine Learning: Simple Tests to Discover Severe Defects

Authors

Steffen Herbold, Tobias Haar

Abstract

Machine learning is now a standard technique for data analysis within software applications. Software engineers need quality assurance techniques that are suitable for these new kinds of systems. In this article, we discuss whether standard software testing techniques that have been part of textbooks for decades are also useful for testing machine learning software. Concretely, we try to determine generic and simple smoke tests that can be used to assert that basic functions can be executed without crashing. We found that we can derive such tests using techniques similar to equivalence classes and boundary value analysis. Moreover, we found that these concepts can also be applied to hyperparameters to further improve the quality of the smoke tests. Even though our approach is almost trivial, we were able to find bugs in all three machine learning libraries that we tested and severe bugs in two of the three libraries. This demonstrates that common software testing techniques are still valid in the age of machine learning and that considering how they can be adapted to this new context can help to find and prevent severe bugs, even in mature machine learning libraries.
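To make the idea in the abstract concrete, the following is a minimal sketch of a smoke test built from boundary values, not the authors' actual test suite. Everything in it is an assumption made for illustration: the toy NearestCentroid class stands in for an estimator from a real library, and the choice of boundary datasets (all zeros, smallest positive float, largest finite float, a single class) is only one plausible reading of "equivalence classes and boundary value analysis". The test asserts only that training and prediction run without raising and return one prediction per row.

```python
import sys


class NearestCentroid:
    """Toy classifier standing in for a library estimator (hypothetical)."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for j, value in enumerate(row):
                acc[j] += value
            counts[label] = counts.get(label, 0) + 1
        # Per-class mean of every feature.
        self.centroids_ = {
            label: [s / counts[label] for s in acc]
            for label, acc in sums.items()
        }
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((u - v) * (u - v) for u, v in zip(a, b))

        return [
            min(self.centroids_, key=lambda c: sq_dist(row, self.centroids_[c]))
            for row in X
        ]


def boundary_datasets(n_rows=10, n_features=3):
    """One dataset per boundary value of the features (illustrative choice)."""
    fills = {
        "zeros": 0.0,
        "smallest_positive": sys.float_info.min,
        "largest_finite": sys.float_info.max,
        "negative_largest": -sys.float_info.max,
    }
    labels = [i % 2 for i in range(n_rows)]
    data = {
        name: ([[value] * n_features for _ in range(n_rows)], labels)
        for name, value in fills.items()
    }
    # Boundary case on the labels: only one class is present.
    data["single_class"] = (
        [[float(i)] * n_features for i in range(n_rows)],
        [0] * n_rows,
    )
    return data


def smoke_test(make_model):
    """Fit and predict on each dataset; record 'ok' or the exception type name."""
    results = {}
    for name, (X, y) in boundary_datasets().items():
        try:
            predictions = make_model().fit(X, y).predict(X)
            if len(predictions) != len(X):
                raise AssertionError("wrong number of predictions")
            results[name] = "ok"
        except Exception as exc:
            results[name] = type(exc).__name__
    return results


if __name__ == "__main__":
    for case, outcome in smoke_test(NearestCentroid).items():
        print(f"{case}: {outcome}")
```

Passing a factory (make_model) rather than an instance means each dataset gets a fresh model. The same loop could be repeated over boundary values of hyperparameters, which is the extension the abstract mentions.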
