Paper Title
Characterizing Python Library Migrations
Paper Authors
Paper Abstract
Developers rely heavily on Application Programming Interfaces (APIs) from libraries to build their software. As software evolves, developers may need to replace the libraries they use with alternative libraries, a process known as library migration. Doing this manually can be tedious, time-consuming, and error-prone. Automated migration techniques can help alleviate some of this burden. However, designing effective automated migration techniques requires understanding the types of code changes needed to transform client code that uses the old library into code that uses the new library. This paper contributes an empirical study that provides a holistic view of Python library migrations, both in terms of the code changes required in a migration and the typical development effort involved. We manually label 3,096 migration-related code changes in 335 Python library migrations from 311 client repositories spanning 141 library pairs from 35 domains. Based on our labeled data, we derive a taxonomy for describing migration-related code changes, PyMigTax. Leveraging PyMigTax and our labeled data, we investigate various characteristics of Python library migrations, such as the types of program elements and properties of API mappings, the combinations of types of migration-related code changes in a migration, and the typical development effort required for a migration. Our findings highlight various potential shortcomings of current library migration tools. For example, we find that 40% of library pairs have API mappings that involve non-function program elements, while most library migration techniques typically assume that function calls from the source library will map to (one or more) function calls from the target library. As an approximation for the development effort involved, we find that, on average, a developer needs to learn about 4 APIs and 2 API mappings to perform a migration, and ... (truncated)
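To make the idea of an API mapping that involves non-function program elements concrete, here is a minimal sketch of a small migration. The library pair (the standard-library `posixpath` module, i.e. the POSIX flavor of `os.path`, migrated to `pathlib`) and the helper names `describe_before` and `describe_after` are assumptions chosen for illustration; they are not taken from the paper's dataset or from PyMigTax.

```python
import posixpath                    # POSIX flavor of os.path, used so results are platform-independent
from pathlib import PurePosixPath


def describe_before(directory, filename):
    """Hypothetical client code before migration: every API use is a function call."""
    full = posixpath.join(directory, filename)   # function call
    name = posixpath.basename(full)              # function call
    ext = posixpath.splitext(full)[1]            # function call, then indexing
    return full, name, ext


def describe_after(directory, filename):
    """The same client code after migrating to pathlib."""
    full = PurePosixPath(directory) / filename   # function call -> constructor call plus "/" operator
    name = full.name                             # function call -> attribute access
    ext = full.suffix                            # function call -> attribute access
    return str(full), name, ext


if __name__ == "__main__":
    print(describe_before("data", "report.csv"))
    print(describe_after("data", "report.csv"))
```

In this sketch, none of the three source function calls maps to a plain function call in the target library, which is the kind of case a tool that assumes function-to-function mappings would likely miss.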