Paper title
Empirical Study of Restarted and Flaky Builds on Travis CI
Paper authors
Paper abstract
Continuous Integration (CI) is a development practice in which developers frequently integrate code into a common codebase. After the code is integrated, the CI server runs a test suite and other tools to produce a set of reports (e.g., the output of linters and tests). If the result of a CI test run is unexpected, developers can manually restart the build, re-running the same test suite on the same code; if the outcome of the restarted build differs from that of the original build, this reveals build flakiness. In this study, we analyze restarted builds, flaky builds, and their impact on the development workflow. We observe that developers restart at least 1.72% of builds, amounting to 56,522 restarted builds in our Travis CI dataset. We observe that more mature and more complex projects are more likely to include restarted builds. The restarted builds are mostly builds that initially fail because of a test, a network problem, or a Travis CI limitation such as an execution timeout. Finally, we observe that restarted builds have a major impact on the development workflow. Indeed, in 54.42% of the restarted builds, the developers analyze and restart the build within an hour of the initial failure. This suggests that developers wait for CI results, interrupting their workflow to address the issue. Restarted builds also slow down the merging of pull requests by a factor of three, raising the median merging time from 16h to 48h.
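
The abstract's working definition of flakiness (a restarted build on the same code whose outcome differs from the original) can be illustrated with a minimal sketch. The `BuildRun` record, its field names, and the status strings below are hypothetical and chosen only for illustration; they are not the schema of the study's dataset or of the Travis CI API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildRun:
    """One execution of a CI build (hypothetical record, not a Travis CI type)."""
    commit_sha: str  # the code revision that was built
    status: str      # assumed labels such as "passed", "failed", "errored"


def is_flaky(original: BuildRun, restarted: BuildRun) -> bool:
    """Return True when a restart on the same code changes the outcome."""
    if original.commit_sha != restarted.commit_sha:
        # A restart re-runs the same code; different commits are not comparable.
        raise ValueError("both runs must build the same commit")
    return original.status != restarted.status


def slowdown_factor(baseline_hours: float, restarted_hours: float) -> float:
    """Ratio between two median merge times, e.g. 16h versus 48h."""
    return restarted_hours / baseline_hours
```

With the medians quoted in the abstract, `slowdown_factor(16, 48)` gives the reported factor of three; `is_flaky` flags a pair of runs such as a failure followed by a pass on the same commit.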