Paper title
An Alternative Issue Tracking Dataset of Public Jira Repositories
Paper authors
Paper abstract
Organisations use issue tracking systems (ITSs) to track and document their projects' work in units called issues. This style of documentation encourages evolutionary refinement, as each issue can be independently improved, commented on, linked to other issues, and progressed through the organisational workflow. Commonly studied ITSs so far include GitHub, GitLab, and Bugzilla, while Jira, one of the most popular ITSs in practice and one that holds a wealth of additional information, has yet to receive similar attention. Unfortunately, diverse public Jira datasets are rare, likely due to the difficulty of finding and accessing these repositories. With this paper, we release a dataset of 16 public Jiras with 1822 projects, spanning 2.7 million issues with a combined total of 32 million changes, 9 million comments, and 1 million issue links. We believe this Jira dataset will lead to many fruitful research projects investigating issue evolution, issue linking, and cross-project analysis, as well as cross-tool analysis when combined with existing well-studied ITS datasets.